AES New York 2018
Paper Session P18
P18 - Spatial Audio-Part 2 (Evaluation)
Saturday, October 20, 1:30 pm — 4:00 pm (1E12)
Chair: Jonas Braasch, Rensselaer Polytechnic Institute - Troy, NY, USA
P18-1 Prediction of Binaural Lateralization Percepts from the Coherence Properties of the Acoustic Wavefield—Mark F. Bocko, University of Rochester - Rochester, NY, USA; Steven Crawford, University of Rochester - Rochester, NY, USA; Michael Heilemann, University of Rochester - Rochester, NY, USA
A framework is presented that employs the space-time coherence properties of acoustic wavefields to compute features corresponding to listener percepts in binaural localization. The model employs a short-time windowed cross-correlator to compute a sequence of interaural time differences (ITDs) from the binaurally-sampled acoustic wavefield. The centroid of the distribution of this sequence of measurements indicates the location of the virtual acoustic source and the width determines the perceived spatial extent of the source. The framework provides a quantitative method to objectively assess the performance of various free-space and headphone-based spatial audio rendering schemes and thus may serve as a useful tool for the analysis and design of spatial audio experiences in VR/AR and other spatial audio systems.
Convention Paper 10127 (Purchase now)
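The short-time cross-correlation model described in the abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, window and hop sizes, and the ±1 ms lag search limit are all choices made for the sketch.

```python
import numpy as np

def itd_sequence(left, right, fs, win=1024, hop=512, max_itd=1e-3):
    """Short-time windowed cross-correlator: one ITD estimate per window.

    Returns ITDs in seconds; positive values mean the right-ear
    signal leads the left-ear signal.
    """
    max_lag = int(max_itd * fs)          # restrict search to plausible ITDs
    lags = np.arange(-win + 1, win)      # lag axis of the full correlation
    keep = np.abs(lags) <= max_lag
    itds = []
    for start in range(0, len(left) - win + 1, hop):
        l = left[start:start + win]
        r = right[start:start + win]
        xc = np.correlate(l, r, mode="full")
        best = lags[keep][np.argmax(xc[keep])]   # lag of the correlation peak
        itds.append(best / fs)
    return np.array(itds)

def lateralization_features(itds):
    """Centroid of the ITD distribution -> perceived source position;
    spread of the distribution -> perceived spatial extent."""
    return float(np.mean(itds)), float(np.std(itds))
```

Feeding in a binaurally sampled stereo signal yields a sequence of ITD estimates whose centroid and spread play the roles the abstract assigns to source location and source width, respectively.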
P18-2 Influence of Visual Content on the Perceived Audio Quality in Virtual Reality—Olli Rummukainen, Fraunhofer IIS - Erlangen, Germany; Jing Wang, Beijing Institute of Technology - Beijing, China; Zhitong Li, Beijing Institute of Technology - Beijing, China; Thomas Robotham, International Audio Laboratories Erlangen - Erlangen, Germany; Zhaoyu Yan, Beijing Institute of Technology - Beijing, China; Zhuoran Li, Beijing Institute of Technology - Beijing, China; Xiang Xie, Beijing Institute of Technology - Beijing, China; Frederik Nagel, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; International Audio Laboratories - Erlangen, Germany; Emanuël A. P. Habets, International Audio Laboratories Erlangen - Erlangen, Germany
To evoke a place illusion, virtual reality builds upon the integration of coherent sensory information from multiple modalities. This integrative view of perception may be contradicted when quality evaluation of virtual reality is divided into multiple uni-modal tests. We show that the type and cross-modal consistency of visual content affect overall audio quality in a six-degrees-of-freedom virtual environment with expert and naïve participants. The effect is observed both in participants' movement patterns and in the direct quality scores given to three real-time binaural audio rendering technologies. Our experiments show that the visual content has a statistically significant effect on the perceived audio quality.
Convention Paper 10128 (Purchase now)
P18-3 HRTF Individualization: A Survey—Corentin Guezenoc, Centrale-Supélec - Rennes, France; 3D Sound Labs - Rennes, France; Renaud Seguier, Centrale-Supélec - Rennes, France
The individuality of head-related transfer functions (HRTFs) is a key issue for binaural synthesis. Although much work has been done over the years to propose end-user-friendly solutions for HRTF personalization, it remains a challenge. In this article we survey the state of the art: we classify the various proposed methods, review their respective advantages and disadvantages, and, above all, methodically check whether and how the perceptual validity of the resulting HRTFs was assessed.
Convention Paper 10129 (Purchase now)
P18-4 Spatial Auditory-Visual Integration: The Case of Binaural Sound on a Smartphone—Julian Moreira, Cnam (CEDRIC) - Paris, France; Orange Labs - Lannion, France; Laetitia Gros, Orange - Lannion, France; Rozenn Nicol, Orange Labs - Lannion, France; Isabelle Viaud-Delmon, IRCAM - CNRS - Paris, France
Binaural rendering is a spatialized-sound technology that can be advantageously coupled with the visual display of a mobile phone. By rendering the auditory scene outside the screen, all around the user, it is a potentially powerful tool for immersion. However, this audio-visual association may lead to specific perceptual artifacts. One of them is the ventriloquist effect, i.e., the perception of a sound and an image as if they came from the same location while they are actually in different places. We investigate the conditions for this effect to occur using an experimental method called Point of Subjective Spatial Alignment (PSSA). Given the position of a visual stimulus, we determine the integration window, i.e., the range of locations in the horizontal plane where an auditory stimulus is perceived as matching the visual stimulus location. Several parameters are varied: the semantic type of the stimuli (neutral or meaningful) and the sound elevation (same elevation as the visual stimulus or above the subject's head). Results reveal the existence of an integration window in all cases. Surprisingly, however, the sound is attracted toward the visual stimulus's location in the virtual scene rather than its actual location on the screen, which we interpret as a mark of immersion. We also observe that the integration window is not altered by elevation, provided that the stimuli are semantically meaningful.
Convention Paper 10130 (Purchase now)
P18-5 Online vs. Offline Multiple Stimulus Audio Quality Evaluation for Virtual Reality—Thomas Robotham, International Audio Laboratories Erlangen - Erlangen, Germany; Olli Rummukainen, Fraunhofer IIS - Erlangen, Germany; Jürgen Herre, International Audio Laboratories Erlangen - Erlangen, Germany; Fraunhofer IIS - Erlangen, Germany; Emanuël A. P. Habets, International Audio Laboratories Erlangen - Erlangen, Germany
Virtual reality technology incorporating six degrees of freedom introduces new challenges for the evaluation of audio quality. Here, a real-time "online" evaluation platform is proposed that allows multiple-stimulus comparison of binaural renderers within the virtual environment for the perceptual evaluation of audio quality. To assess the sensitivity of the platform, tests were conducted using the online platform with audiovisual content and using two traditional platforms with pre-rendered "offline" audiovisual content; the conditions employed had known relative levels of impairment. A comparison of the results across platforms indicates that only the proposed online platform produced results representative of the known impaired audio conditions. The offline platforms were found to be insufficient for detecting the tested impairments in audio presented as part of a multi-modal virtual reality environment.
Convention Paper 10131 (Purchase now)