AES New York 2015
Poster Session P11

P11 - Spatial Audio

Friday, October 30, 2:00 pm — 3:30 pm (S-Foyer 1)

P11-1 Comparison of Techniques for Binaural Navigation of Higher-Order Ambisonic Soundfields—Joseph G. Tylka, 3D3A Lab, Princeton University - Princeton, NJ, USA; Edgar Choueiri, Princeton University - Princeton, NJ, USA
Soundfields that have been decomposed into spherical harmonics (i.e., encoded into higher-order ambisonics—HOA) can be rendered binaurally for off-center listening positions, but doing so requires additional processing to translate the listener and necessarily leads to increased reproduction errors as the listener navigates further away from the original expansion center. Three techniques for performing this navigation (simulating HOA playback and listener movement within a virtual loudspeaker array, computing and translating along plane-waves, and re-expanding the soundfield about the listener) are compared through numerical simulations of simple incident soundfields and evaluated in terms of both overall soundfield reconstruction accuracy and predicted localization. Results show that soundfield re-expansion achieves arbitrarily low reconstruction errors (relative to the original expansion) in the vicinity of the listener, whereas errors generated by virtual-HOA and plane-wave techniques necessarily impose additional restrictions on the navigable range. Results also suggest that soundfield re-expansion is the only technique capable of accurately generating high-frequency localization cues for off-center listening positions, although the frequencies and translation distances over which this is possible are strictly limited by the original expansion order.
Convention Paper 9421

P11-2 Estimation of Individual HRIRs Based on SPCA from Impulse Responses Acquired in Ordinary Sound Fields—Shouichi Takane, Akita Prefectural University - Yurihonjo, Akita, Japan
In this paper, a method is proposed for estimating individual Head-Related Impulse Responses (HRIRs) from impulse responses acquired in an ordinary sound field, based on Spatial Principal Components Analysis (SPCA) of the HRIRs. The average vector and the principal components matrix are assumed to be obtained by applying SPCA to a set of HRIRs from multiple subjects covering all directions. A part of the impulse response from the sound source to one ear of a given subject, regarded as one of that subject's HRIRs, is used to estimate the weight coefficients of the principal components. Applying the method with a dataset of HRIRs from multiple subjects covering all sound source directions showed that acceptable estimation accuracy is obtained for HRIRs in ipsilateral directions.
Convention Paper 9422

P11-3 Height Perception in Ambisonic Based Binaural Decoding—Gavin Kearney, University of York - York, UK; Tony Doyle, University of York - York, UK
This paper presents an investigation into the perception of height in Ambisonic decoding schemes for binaural reproduction. We compare the spatial resolution of first, third, and fifth order Ambisonic decoders to that of real-world monophonic sources presented in the vertical plane. Spatial preservation of the spectral cues required for rendering sources with height is investigated and cross-referenced to binaural models of the rendered systems. The results presented address the applicability of higher order Ambisonics to the rendering of sound source elevation given the high frequency distortion of pinnae cues.
Convention Paper 9423

P11-4 An HRTF Database for Virtual Loudspeaker Rendering—Gavin Kearney, University of York - York, UK; Tony Doyle, University of York - York, UK
This paper presents a database of Head Related Transfer Functions (HRTFs), collected from 20 subjects for use in virtual loudspeaker reproduction systems. The paper documents the measurement procedure and format of the HRTFs. The database accommodates Ambisonic rendering up to 5th Order and includes loudspeaker configurations derived from platonic, convex polyhedra and other spherical distributions. The datasets are also presented with matching acoustic responses to assist externalization and decode matrices for higher order Ambisonic rendering.
Convention Paper 9424

P11-5 Influence of Energy Distribution on Elevation Judgments—Taku Nagasaka, University of Aizu - Aizu-Wakamatsu, Japan; Shunsuke Nogami, University of Aizu - Aizu-Wakamatsu, Japan; Julian Villegas, University of Aizu - Aizu Wakamatsu, Fukushima, Japan; Jie Huang, University of Aizu - Aizuwakamatsu City, Japan
The relative influence of spectral cues on elevation localization of virtual sources was investigated by comparing judgments of loudspeaker reproduced stimuli spatialized with three methods, two of them based on vector-based amplitude panning: 3D vector-based amplitude panning (3D-VBAP), and 2D-VBAP in conjunction with HRIR convolution; and a third method that filtered the stimuli to simulate spectral peaks and troughs naturally occurring at different angles (equalizing filters). For the last two methods a single horizontal loudspeaker array was used. Smallest absolute errors were observed for the 3D-VBAP judgments regardless of azimuth; no significant difference in the mean absolute error was found between the other two methods. However, for most presentation azimuths, the equalizing filter method yielded the least dispersed results. These results could be used for improving elevation localization in two-dimensional VBAP reproduction systems.
Convention Paper 9425

P11-6 Influence of Spectral Energy Distribution on Subjective Azimuth Judgments—Shunsuke Nogami, University of Aizu - Aizu-Wakamatsu, Japan; Taku Nagasaka, University of Aizu - Aizu-Wakamatsu, Japan; Julian Villegas, University of Aizu - Aizu Wakamatsu, Fukushima, Japan; Jie Huang, University of Aizu - Aizuwakamatsu City, Japan
In this research we compare subjective judgments of azimuth obtained by three methods: Vector-Based Amplitude Panning (VBAP), VBAP mixed with binaural rendition over loudspeakers (VBAP + HRTF), and a newly proposed method based on equalizing spectral energy. In our results, significantly smaller errors were found for the stimuli treated with VBAP and HRTFs; differences between the other two treatments were not significant. Regarding spherical dispersion of the judgments, the VBAP results have the greatest dispersion, whereas the dispersion of the results of the other two methods was significantly smaller and similar between the two. These results suggest that horizontal localization using VBAP methods can be improved by applying a frequency-dependent panning factor as opposed to the constant scalar commonly used.
Convention Paper 9426

P11-7 Subjective Diffuseness in Layer-Based Loudspeaker Systems with Height—Michael P. Cousins, University of Southampton - Southampton, UK; Filippo Maria Fazi, University of Southampton - Southampton, Hampshire, UK; Stefan Bleeck, University of Southampton - Southampton, UK; Frank Melchior, BBC Research and Development - Salford, UK
Loudspeaker systems with more channels and with elevated loudspeakers are becoming more common. There is an opportunity for greater spatial impression with listeners surrounded in three dimensions. Research has shown the advantages of more loudspeakers and of 3D layouts over 2D layouts, although it is not clear whether the cause of these improvements is the greater number of loudspeakers, their positions, or both. In this paper two listening tests are presented that investigate the subjective diffuseness of a range of loudspeaker layouts. The first experiment was used to optimize the distribution of loudness between horizontal layers of loudspeakers to allow fair comparison between different layouts. The second experiment investigated the perceived diffuseness of a range of loudspeaker layouts chosen to critically assess parameters of layer-based loudspeaker systems as well as to validate the results of the first experiment. The number of loudspeakers at head-height, the number of loudspeakers not at head-height, and the relative level between head-height and non-head-height layers were all found to be statistically significant in terms of perceived diffuseness. It was also confirmed that 3D loudspeaker layouts can have statistically greater perceived diffuseness than 2D layouts.
Convention Paper 9427

P11-8 Echo Canceler for Real-Time Audio Communication with Wave Field Reconstruction—Satoru Emura, NTT Media Intelligence Laboratories - Tokyo, Japan; Sachiko Kurihara, NTT Media Intelligence Laboratories - Tokyo, Japan
For immersive sharing of a sound field between two remote sites, wave field synthesis (WFS) and echo cancellation are essential. Though both technologies have been studied for more than a decade, it was not clear whether a real-time system for full-duplex audio communication with WFS could be built. We show in this paper that such a system can be built.
Convention Paper 9428
