AES Los Angeles 2016
Paper Session

Saturday, October 1, 9:30 am — 11:30 am (Rm 409A)

Paper Session: AVAR Paper Session 6: Perceptual Consideration for VR/AR

Spatial Auditory Feedback in Response to Tracked Eye Position—Durand R. Begault, NASA Ames Research Center - Moffett Field, CA, USA; Charles M. Salter Associates - Audio Forensic Center - San Francisco, CA, USA
Fixation of eye gaze toward one or more specific positions or regions of visual space is a desirable feature within several types of high-stress human interfaces, including vehicular operation, flight deck control, target acquisition, etc. It is therefore desirable to have a means to give spatial auditory feedback to a human in such a system about whether or not the gaze is specifically directed towards a desired position. Alternatively, it is desirable to use eye position as a means of controlling a device that provides auditory feedback so that there is a correspondence between eye position and control voltages that manipulate aspects of an auditory cue that includes spatial position, pitch and/or timbre. This session is part of the co-located AVAR Conference which is not included in the normal convention All Access badge.
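The correspondence between eye position and auditory cue parameters described in this abstract could look roughly like the following sketch. It is not the author's method: the function name gaze_to_cue, the linear pan mapping, the one-octave pitch range, and the 2-degree on-target tolerance are all assumptions chosen for illustration.

```python
import math


def gaze_to_cue(gaze_az_deg, gaze_el_deg,
                target_az_deg=0.0, target_el_deg=0.0,
                base_hz=440.0, max_error_deg=30.0, tolerance_deg=2.0):
    """Map gaze error to hypothetical auditory cue parameters.

    All parameter names and ranges are illustrative assumptions,
    not values taken from the paper.
    """
    d_az = gaze_az_deg - target_az_deg
    d_el = gaze_el_deg - target_el_deg
    error_deg = math.hypot(d_az, d_el)  # angular distance (small-angle approximation)

    # Pan in [-1, 1] points from the current gaze toward the target:
    # gaze to the right of the target gives a cue on the left.
    pan = max(-1.0, min(1.0, -d_az / max_error_deg))

    # Pitch rises by up to one octave as the gaze error grows.
    frac = min(error_deg / max_error_deg, 1.0)
    pitch_hz = base_hz * 2.0 ** frac

    return {
        "pan": pan,
        "pitch_hz": pitch_hz,
        "on_target": error_deg <= tolerance_deg,
    }


if __name__ == "__main__":
    print(gaze_to_cue(10.0, -5.0))
```

A real system would feed these values to a spatial audio renderer; the mapping shape (linear, logarithmic, thresholded) is a design choice the abstract does not specify.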

Perceptual Weighting of Binaural Information: Toward an Auditory Perceptual "Spatial Codec" for Auditory Augmented Reality—G. Christopher Stecker, Vanderbilt University School of Medicine - Nashville, TN, USA; Anna Diedesch, Vanderbilt University School of Medicine - Nashville, TN, USA
Auditory augmented reality (AR) requires accurate estimation of spatial information conveyed in the natural scene, coupled with accurate spatial synthesis of virtual sounds to be integrated within it. Solutions to both problems should consider the capabilities and limitations of the human binaural system, in order to maximize relevant over distracting acoustic information and enhance perceptual integration across AR layers. Recent studies have measured how human listeners integrate spatial information across multiple conflicting cues, revealing patterns of “perceptual weighting” that sample the auditory scene in a robust but spectrotemporally sparse manner. Such patterns can be exploited for binaural analysis and synthesis, much as time-frequency masking patterns are exploited by perceptual audio codecs, to improve efficiency and enhance perceptual integration.

DeepEarNet: Individualizing Spatial Audio with Photography, Ear Shape Modeling, and Neural Networks—Shoken Kaneko, Yamaha Corporation - Iwata-shi, Japan; Tsukasa Suenaga, Yamaha Corporation - Iwata-shi, Japan; Satoshi Sekine, Yamaha Corporation - Iwata-shi, Japan
Individualizing spatial audio is of crucial importance for high-quality virtual and augmented reality audio. In this paper we propose a method for individualizing spatial audio by combining the recently proposed ear shape modeling technique with computer vision and machine learning. We use a convolutional neural network to obtain estimates of the ear shape model parameters from stereo photographs of the user's ear. The individualized ear shape and its associated individualized head-related transfer function (HRTF) can be calculated from the obtained parameters based on the ear shape model and numerical acoustic simulations. Preliminary experiments evaluating the shapes of the estimated individual ears demonstrated the effect of individualization.

Adjustment of the Direct-to-Reverberant Energy Ratio to Reach Externalization within a Binaural Synthesis System—Thomas Sporer, Fraunhofer Institute for Digital Media Technology IDMT - Ilmenau, Germany; Stephan Werner, Technische Universität Ilmenau - Ilmenau, Germany; Florian Klein, Technische Universität Ilmenau - Ilmenau, Germany
The contribution presents a study that investigates the perception of spatial audio reproduced by a binaural synthesis system. The quality features externalization and room congruence are measured in a listening test. Earlier studies imply that externalization in particular decreases when the synthesized room and the listening room diverge acoustically. Other studies show that adjusting the Direct-to-Reverberant Energy Ratio (DRR) can increase the perceived congruence between the synthesized and the listening room. In this experiment, test persons adjust the DRR of the synthesis until it is perceptually congruent with their internal reference for the listening room. The ratings show that the test persons are able to adjust the DRR to match the listening room, and that externalization increases as a result.
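For readers unfamiliar with the quantity being adjusted, the DRR of a room impulse response is commonly computed as the energy in a short window around the direct-sound peak divided by the energy of everything after it, expressed in dB. The sketch below follows that common definition and is not the authors' procedure: the 2.5 ms window default and the function names drr_db and set_drr are assumptions.

```python
import math


def _split_index(ir, sample_rate_hz, direct_window_ms):
    """Return (start, end) sample indices of the direct-sound window."""
    peak = max(range(len(ir)), key=lambda i: abs(ir[i]))
    half = int(round(direct_window_ms * 1e-3 * sample_rate_hz))
    return max(0, peak - half), min(len(ir), peak + half + 1)


def drr_db(ir, sample_rate_hz, direct_window_ms=2.5):
    """Direct-to-reverberant energy ratio in dB.

    Samples before the direct window are ignored (treated as pre-delay).
    The window length is an assumed, commonly used value.
    """
    if not ir or not any(ir):
        raise ValueError("impulse response is empty or all zeros")
    start, end = _split_index(ir, sample_rate_hz, direct_window_ms)
    direct = sum(x * x for x in ir[start:end])
    reverb = sum(x * x for x in ir[end:])
    if reverb == 0.0:
        return math.inf  # no reverberant energy after the direct window
    return 10.0 * math.log10(direct / reverb)


def set_drr(ir, sample_rate_hz, target_db, direct_window_ms=2.5):
    """Return a copy of ir whose reverberant tail is scaled to reach target_db.

    Assumes the scaled tail does not exceed the direct peak; otherwise a
    later call to drr_db would pick a different peak.
    """
    current = drr_db(ir, sample_rate_hz, direct_window_ms)
    if math.isinf(current):
        raise ValueError("no reverberant tail to scale")
    _, end = _split_index(ir, sample_rate_hz, direct_window_ms)
    gain = 10.0 ** ((current - target_db) / 20.0)  # amplitude gain for the tail
    return list(ir[:end]) + [x * gain for x in ir[end:]]


if __name__ == "__main__":
    toy_ir = [0.0, 1.0, 0.0, 0.0, 0.1, 0.1]
    print(drr_db(toy_ir, 1000, direct_window_ms=1.0))
```

Scaling only the tail leaves the direct sound untouched, which is one simple way to expose DRR as a single control; other schemes (for example, scaling the direct part instead) would reach the same ratio with a different overall level.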
