Loudness Differences for Voice-Over-Voice Audio in TV and Streaming
Citation & Abstract
D. Geary, M. Torcoli, J. Paulus, C. Simon, D. Straninger, A. Travaglini, and B. Shirley, "Loudness Differences for Voice-Over-Voice Audio in TV and Streaming," J. Audio Eng. Soc., vol. 68, no. 11, pp. 810-818 (Nov. 2020). doi: https://doi.org/10.17743/jaes.2020.0022
Abstract: Voice-over-Voice (VoV) is a common mixing practice observed in news reports and documentaries, where a foreground voice is mixed on top of a background voice, e.g., to translate an interview. This is achieved by ducking the background voice so that the foreground voice is more intelligible, while still allowing the listener to perceive the presence and tone of the background voice. Currently there is little published research on ducking practices for VoV or on technical details such as the Loudness Difference (LD) between foreground and background speech. This paper investigates the ducking practices of nine expert audio engineers and the preferred LDs of 13 non-expert listeners of ages 57 years and older. Results highlight a clear difference between the LDs used by the experts and those preferred by the non-expert listeners. Experts tended toward LDs of 11.5–17 LU, while non-experts preferred a range of 20–30 LU. Based on these results, a minimum LD of 20 LU is recommended for VoV. High inter-subject variance due to personal preference was observed. This variance makes a substantial case for the introduction of personalization in broadcast and streaming. The audiovisual material used for the tests is provided at https://www.audiolabs-erlangen.de/resources/2020-VoV-DB.
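The relationship between ducking and LD described in the abstract can be illustrated with a little arithmetic. The following is a minimal sketch, not the authors' method: it assumes the foreground and background stems have each been measured separately for integrated loudness (for example with an ITU-R BS.1770 meter), and that a gain change of x dB shifts the measured loudness by x LU. The function names and the example loudness values are hypothetical; the 20 LU default is the minimum LD recommended in the abstract.

```python
def ducking_gain_db(fg_lufs: float, bg_lufs: float, target_ld_lu: float = 20.0) -> float:
    """Gain in dB to apply to the background so that LD = fg - bg reaches the target.

    Assumes both loudness values were measured on the separate stems.
    The result is clamped to 0 dB so the background is only ever attenuated.
    """
    current_ld = fg_lufs - bg_lufs          # loudness difference before ducking, in LU
    return min(0.0, current_ld - target_ld_lu)


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    # Hypothetical stems: foreground at -23 LUFS, background at -26 LUFS (LD = 3 LU).
    gain = ducking_gain_db(fg_lufs=-23.0, bg_lufs=-26.0)
    print(gain)                 # -17.0 dB of attenuation gives LD = 20 LU
    print(db_to_linear(gain))   # amplitude factor to multiply the background samples by
```

Because the abstract reports high inter-subject variance, `target_ld_lu` is left as a parameter: a personalized system could expose it to the listener rather than fixing it at 20 LU.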
@article{geary2020loudness,
author={Geary, David and Torcoli, Matteo and Paulus, Jouni and Simon, Christian and Straninger, Davide and Travaglini, Alessandro and Shirley, Ben},
journal={Journal of the Audio Engineering Society},
title={Loudness Differences for Voice-Over-Voice Audio in {TV} and Streaming},
year={2020},
volume={68},
number={11},
pages={810--818},
doi={10.17743/jaes.2020.0022},
month={November},
}
TY  - JOUR
TI  - Loudness Differences for Voice-Over-Voice Audio in TV and Streaming
SP  - 810
EP  - 818
AU  - Geary, David
AU  - Torcoli, Matteo
AU  - Paulus, Jouni
AU  - Simon, Christian
AU  - Straninger, Davide
AU  - Travaglini, Alessandro
AU  - Shirley, Ben
PY  - 2020
JO  - Journal of the Audio Engineering Society
IS  - 11
VL  - 68
Y1  - 2020/11//
ER  - 
Authors:
Geary, David; Torcoli, Matteo; Paulus, Jouni; Simon, Christian; Straninger, Davide; Travaglini, Alessandro; Shirley, Ben
Affiliations:
Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany; Acoustics Research Centre, University of Salford, UK; International Audio Laboratories Erlangen, Germany (See document for exact affiliation information.)
JAES Volume 68 Issue 11 pp. 810-818; November 2020
Publication Date:
December 21, 2020
Permalink:
http://www.aes.org/e-lib/browse.cfm?elib=20995