AES Warsaw 2015
Paper Session P1

P1 - (Lecture) Spatial Audio—Part 1


Thursday, May 7, 10:00 — 12:30 (Room: Belweder)

Chair:
Richard King, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada

P1-1 Subjective Loudness of 22.2 Multichannel Programs
Tomoyasu Komori, NHK Science & Technology Research Laboratories - Setagaya-ku, Tokyo, Japan; Waseda University - Shinjuku-ku, Tokyo, Japan; Satoshi Oode, NHK Science & Technology Research Laboratories - Setagaya-ku, Tokyo, Japan; Kazuho Ono, NHK Engineering System Inc. - Setagaya-ku, Tokyo, Japan; Kensuke Irie, NHK Science & Technology Research Laboratories - Setagaya-ku, Tokyo, Japan; Yo Sasaki, NHK Science & Technology Research Laboratories - Kinuta, Setagaya-ku, Tokyo, Japan; Tomomi Hasegawa, NHK Science & Technology Research Laboratories - Kinuta, Setagaya-ku, Tokyo, Japan; Ikuko Sawaya, NHK Science & Technology Research Laboratories - Setagaya-ku, Tokyo, Japan
NHK is planning 8K Super Hi-Vision (SHV) broadcasting with 22.2 multichannel sound as a new broadcasting service. The current loudness measurement algorithm, however, is only standardized up to 5.1 channels in Recommendation ITU-R BS.1770. To extend the algorithm beyond 5.1 channels, we conducted a subjective loudness evaluation of various program materials and formats. The results showed that subjective loudness differed only slightly between formats. Furthermore, we measured objective loudness values using an algorithm compatible with the current one and found that the objective values correlated well with the subjective loudness values.
Convention Paper 9219
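The BS.1770 measurement the paper extends is, at its core, a weighted power sum over K-weighted channels, converted to loudness in LKFS. A minimal sketch of that summation step, assuming the per-channel mean-square values have already been K-filtered and gated (the function name and the default unity weights are illustrative, not from the recommendation's text):

```python
import math

def bs1770_loudness(channel_mean_squares, weights=None):
    """Combine per-channel mean-square power (already K-weighted)
    into one loudness value in LKFS, in the style of ITU-R BS.1770:
    L = -0.691 + 10 * log10( sum_i G_i * z_i )."""
    if weights is None:
        # BS.1770 assigns specific weights per channel position
        # (e.g., 1.41 for surround channels); unity is a placeholder.
        weights = [1.0] * len(channel_mean_squares)
    total = sum(g * z for g, z in zip(weights, channel_mean_squares))
    return -0.691 + 10.0 * math.log10(total)
```

Extending the measurement to 22.2 channels then amounts to defining the weight set for the additional loudspeaker positions, which is what the subjective data above is meant to validate.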

P1-2 MPEG-D Spatial Audio Object Coding for Dialogue Enhancement (SAOC-DE)
Jouni Paulus, Fraunhofer IIS - Erlangen, Germany; International Audio Laboratories Erlangen - Erlangen, Germany; Jürgen Herre, International Audio Laboratories Erlangen - Erlangen, Germany; Fraunhofer IIS - Erlangen, Germany; Adrian Murtaza, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Leon Terentiv, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Harald Fuchs, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Sascha Disch, International Audio Laboratories Erlangen - Erlangen, Germany; Falko Ridderbusch, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany
The topic of dialogue enhancement and personalization of audio has recently received increased attention. Both hearing-impaired and normal-hearing audiences benefit, for example, from the possibility of boosting the commentator's speech to minimize listening effort, or of attenuating the speech in favor of the sports stadium atmosphere to enhance the feeling of being there. In late 2014 the ISO/MPEG standardization group made available a new specification, Spatial Audio Object Coding for Dialogue Enhancement (SAOC-DE), closely derived from the well-known MPEG-D Spatial Audio Object Coding (SAOC). This paper describes the architecture and features of the new system, discusses the envisioned applications, and demonstrates the performance of the new technology in subjective listening tests.
Convention Paper 9220
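The personalization described above ultimately comes down to applying a user-controlled gain to the dialogue object before mixing it with the remaining content. A toy sketch of that final rendering step (not the SAOC-DE decoder itself, which operates on parametric object metadata in the bitstream; the function and parameter names are illustrative):

```python
def render_with_dialogue_gain(dialogue, background, gain_db):
    """Mix a dialogue object and a background object with a
    user-chosen dialogue gain in dB -- the kind of personalization
    SAOC-DE enables at the receiver side."""
    g = 10.0 ** (gain_db / 20.0)   # dB -> linear amplitude
    return [g * d + b for d, b in zip(dialogue, background)]
```

With `gain_db = 0` the default mix is unchanged; positive values boost the commentator to reduce listening effort, negative values favor the stadium atmosphere.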

P1-3 Multichannel Systems: Listeners Choose Separate Reproduction of Direct and Reflected Sounds
Piotr Kleczkowski, AGH University of Science and Technology - Krakow, Poland; Aleksandra Król, AGH University of Science and Technology - Krakow, Poland; Pawel Malecki, AGH University of Science and Technology - Krakow, Poland
Arguments can be made for separating the direct and reflected components of the sound field and reproducing them through appropriate transducers, but no consensus has been reached on the matter. In this work the perceptual effect of such separation in the commonly used 5.0 and 7.0 multichannel systems was investigated. Four listening experiments were performed, covering several separation schemes and a variety of experimental conditions. The listeners consistently preferred some of the schemes involving separation over schemes without separation.
Convention Paper 9221

P1-4 On the Influence of Headphone Quality in the Spatial Immersion Produced by Binaural Recordings
Pablo Gutierrez-Parera, Universitat de Valencia - Valencia, Spain; Jose J. Lopez, Universidad Politecnica de Valencia - Valencia, Spain; Emanuel Aguilera, Universidad Politecnica de Valencia - Valencia, Spain
Binaural recordings made with an acoustic manikin produce realistic sound immersion when played through high-quality headphones. However, most people use headphones of inferior quality, such as those supplied with smartphones or music players. Factors such as frequency response, distortion, and the disparity between the left and right transducers may all degrade the result. This work lays the foundation for a strategy to study which factors affect the end result and to what degree. A first experiment analyzes how a level disparity between the two transducers affects the final result. A second test studies the influence of the frequency response. A third test analyzes the effects of distortion, simulated by convolution using a Volterra-kernel scheme. The results reveal how disparity between the transducers can affect the perception of direction. For frequency response the results are harder to quantify and further work will be necessary. Finally, the study reveals that the distortion produced by the range of headphones tested does not affect the perception of binaural sound.
Convention Paper 9222
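The distortion simulation used in the third test can be sketched with diagonal Volterra kernels: each order's kernel is convolved with the corresponding power of the input signal and the results are summed (a common simplification of the full Volterra series; the kernels below are illustrative placeholders, not measured headphone responses):

```python
import numpy as np

def diagonal_volterra(x, kernels):
    """Simulate a weakly nonlinear system with diagonal Volterra
    kernels: y = sum_n ( h_n * x**n ), where '*' is linear
    convolution and kernels[0] is the 1st-order (linear) kernel."""
    out_len = len(x) + max(len(h) for h in kernels) - 1
    y = np.zeros(out_len)
    for n, h in enumerate(kernels, start=1):
        yn = np.convolve(x ** n, h)   # nth-order branch
        y[:len(yn)] += yn
    return y
```

In the measurement-based version, each kernel would be identified from the headphone under test (e.g., with swept-sine methods) so that the convolution reproduces its harmonic distortion.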

P1-5 Binaural Audio with Relative and Pseudo Head Tracking
Christof Faller, Illusonic GmbH - Zurich, Switzerland; EPFL - Lausanne, Switzerland; Fritz Menzer, Technische Universität München - Munich, Germany; Christophe Tournery, Illusonic GmbH - Lausanne, Switzerland
While it has been known for years that head tracking can significantly improve binaural rendering, it has not been widely used in consumer applications. The goal of the proposed techniques is to leverage head tracking while making it more usable for mobile applications, where the sound image should not have an absolute position in space. Relative head tracking keeps the sound image in front while reducing the effect of head movements to small fluctuations; it can be implemented with only a gyroscope, with no need for an absolute direction reference. An even more economical technique for improving binaural rendering is pseudo head tracking, which generates small head movements with a random process, without any gyroscope at all. The results of a subjective test indicate that both relative and pseudo head tracking can contribute to spaciousness and front/back differentiation.
Convention Paper 9223
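Relative head tracking as described above can be modeled as a leaky integration of the gyroscope's yaw rate: the angle follows fast head movements but decays back toward zero, so the image re-centers in front after a sustained turn. The paper does not disclose its exact filter; the leaky-integrator form and the leak constant below are assumptions for illustration:

```python
def relative_yaw_update(yaw, gyro_rate, dt, leak=0.5):
    """One update step of relative head tracking.

    yaw       -- current relative yaw angle (rad)
    gyro_rate -- gyroscope yaw rate (rad/s); no absolute reference
    dt        -- time step (s)
    leak      -- assumed re-centering rate (1/s), not from the paper
    """
    yaw += gyro_rate * dt   # integrate angular velocity
    yaw -= leak * yaw * dt  # exponential decay toward "front"
    return yaw
```

The renderer would then rotate the binaural scene by the (small, self-centering) relative yaw; pseudo head tracking replaces `gyro_rate` with a low-amplitude random process.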



EXHIBITION HOURS
May 7th   10:00 – 18:00
May 8th   09:00 – 18:00
May 9th   09:00 – 18:00

REGISTRATION DESK
May 6th   15:00 – 18:00
May 7th   09:30 – 18:30
May 8th   08:30 – 18:30
May 9th   08:30 – 18:30
May 10th   08:30 – 16:30

TECHNICAL PROGRAM
May 7th   10:00 – 18:00
May 8th   09:00 – 18:00
May 9th   09:00 – 18:00
May 10th   09:00 – 17:00
AES - Audio Engineering Society