AES New York 2019
Paper Session P10
P10 - Spatial Audio, Part 1
Thursday, October 17, 1:15 pm — 4:15 pm
Chair: Sungyoung Kim, Rochester Institute of Technology - Rochester, NY, USA
P10-1 Use of the Magnitude Estimation Technique in Reference-Free Assessments of Spatial Audio Technology—Alex Brandmeyer, Dolby Laboratories - San Francisco, CA, USA; Dan Darcy, Dolby Laboratories, Inc. - San Francisco, CA, USA; Lie Lu, Dolby Laboratories - San Francisco, CA, USA; Richard Graff, Dolby Laboratories, Inc. - San Francisco, CA, USA; Nathan Swedlow, Dolby Laboratories - San Francisco, CA, USA; Poppy Crum, Dolby Laboratories - San Francisco, CA, USA
Magnitude estimation is a technique developed in psychophysics research in which participants numerically estimate the relative strengths of a sequence of stimuli along a relevant dimension. Traditionally, the method has been used to measure basic perceptual phenomena in different sensory modalities (e.g., "brightness," "loudness"). We present two examples of using magnitude estimation in the domain of audio rendering for different categories of consumer electronics devices. Importantly, magnitude estimation doesn’t require a reference stimulus and can be used to assess general ("audio quality") and domain-specific (e.g., "spaciousness") attributes. Additionally, we show how this data can be used together with objective measurements of the tested systems in a model that can predict performance of systems not included in the original assessment.
Convention Paper 10273
P10-2 Subjective Assessment of the Versatility of Three-Dimensional Near-Field Microphone Arrays for Vertical and Three-Dimensional Imaging—Bryan Martin, McGill University - Montreal, QC, Canada; Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT) - Montreal, QC, Canada; Jack Kelly, McGill University - Montreal, QC, Canada; Brett Leonard, University of Indianapolis - Indianapolis, IN, USA; The Chelsea Music Festival - New York, NY, USA
This investigation examines the operational size-range of audio images recorded with advanced close-capture microphone arrays for three-dimensional imaging. It employs a 3D panning tool to manipulate audio images. The 3D microphone arrays used in this study were: Coincident-XYZ, M/S-XYZ, and Non-coincident-XYZ/five-point. Instruments of the orchestral string, woodwind, and brass sections were recorded. The objective of the test was to determine the point of three-dimensional expansion onset, preferred imaging, and image breakdown point. Subjects were presented with a continuous dial to manipulate the three-dimensional spread of the arrays, allowing them to expand or contract the microphone signals from 0° to 90° azimuth/elevation. The results showed that the M/S-XYZ array is the perceptually “biggest” of the capture systems under test and displayed the fastest sense of expansion onset. The coincident and non-coincident arrays are much less agreed upon by subjects in terms of preference in particular, and also in expansion onset.
Convention Paper 10274
P10-3 Defining Immersion: Literature Review and Implications for Research on Immersive Audiovisual Experiences—Sarvesh Agrawal, Bang & Olufsen a/s - Struer, Denmark; Adèle Simon, Bang & Olufsen a/s - Struer, Denmark; Søren Bech, Bang & Olufsen a/s - Struer, Denmark; Aalborg University - Aalborg, Denmark; Klaus Bærentsen, Aarhus University - Aarhus, Denmark; Søren Forchhammer, Technical University of Denmark - Lyngby, Denmark
The use of the term “immersion” to describe a multitude of varying experiences in the absence of a definitional consensus has obfuscated and diluted the term. This paper presents a non-exhaustive review of previous work on immersion on the basis of which a definition of immersion is proposed: a state of deep mental involvement in which the subject may experience disassociation from the awareness of the physical world due to a shift in their attentional state. This definition is used to contrast and differentiate interchangeably used terms such as presence and envelopment from immersion. Additionally, an overview of prevailing measurement techniques, implications for research on immersive audiovisual experiences, and avenues for future work are discussed briefly.
Convention Paper 10275
P10-4 Evaluation on the Perceptual Influence of Floor Level Loudspeakers for Immersive Audio Reproduction—Yannik Grewe, Fraunhofer IIS - Erlangen, Germany; Andreas Walther, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Julian Klapp, Fraunhofer IIS - Erlangen, Germany
Listening tests were conducted to evaluate the perceptual influence of adding a lower layer of loudspeakers to a setup that is commonly used for immersive audio reproduction. Three setups using horizontally arranged loudspeakers (1M, 2M, 5M), one with added height loudspeakers (5M+4H), and one with additional floor level loudspeakers (5M+4H+3L) were compared. Basic Audio Quality was evaluated in a sweet-spot test with explicit reference, and two preference tests (sweet-spot and off sweet-spot) were performed to evaluate the Overall Audio Quality. The stimuli, e.g., ambient recordings and sound design material, made dedicated use of the lower loudspeaker layer. The results show that reproduction comprising a lower loudspeaker layer is preferred compared to reproduction using the other loudspeaker setups included in the test.
Convention Paper 10276
P10-5 Investigating Room-Induced Influences on Immersive Experience Part II: Effects Associated with Listener Groups and Musical Excerpts—Sungyoung Kim, Rochester Institute of Technology - Rochester, NY, USA; Shuichi Sakamoto, Tohoku University - Sendai, Japan
The authors previously compared four distinct multichannel playback rooms and showed that perceived spatial attributes of program material (width, depth, and envelopment) were similar across all four rooms when reproduced through a 22-channel loudspeaker array. The present study further investigated perceived auditory immersion from two additional variables: listener group and musical style. We found a three-way interaction of variables, MUSIC x (playback) ROOM x GROUP for 22-channel reproduced music. The interaction between musical material and playback room acoustics differentiates perceived auditory immersion across listener groups. However, in the 2-channel reproductions, the room and music interaction is prominent enough to flatten inter-group differences. The 22-channel reproduced sound fields may have shaped idiosyncratic cognitive bases for each listener group.
Convention Paper 10277
P10-6 Comparison Study of Listeners’ Perception of 5.1 and Dolby Atmos—Tomas Oramus, Academy of Performing Arts in Prague - Prague, Czech Republic; Petr Neubauer, Academy of Performing Arts in Prague - Prague, Czech Republic
Surround sound reproduction has been a common technology in almost every theater room for several decades. In 2012 Dolby Laboratories, Inc. announced a new spatial 3D audio format – Dolby Atmos – that (due to its object-based rendering) pushes the possibilities of spatial reproduction and supposedly listeners' experience forward. This paper examines listeners' perception of this format in comparison with today's unwritten standard for cinema reproduction – 5.1. Two sample groups were chosen for the experiment - experienced listeners (sound designers and sound design students) and inexperienced listeners; the objective was to examine how these two groups perceive selected formats and whether there is any difference between these two groups. We aimed at five aspects – Spatial Immersion (Envelopment), Localization, Dynamics, Audio Quality, and Format Preference. The results show mostly an insignificant difference between these two groups while both of them slightly leaned towards Dolby Atmos over 5.1.
Convention Paper 10278