AES San Francisco 2008
Paper Session P19
Spatial Audio Processing
Saturday, October 4, 2:30 pm — 5:30 pm
Chair: Agnieszka Roginska, New York University - New York, NY, USA

P19-1 Head-Related Transfer Functions Reconstruction from Sparse Measurements Considering a Priori Knowledge from Database Analysis: A Pattern Recognition Approach
—Pierre Guillon, Rozenn Nicol, Orange Labs - Lannion, France; Laurent Simon, Laboratoire d’Acoustique de l’Université du Maine - Le Mans, France
Individualized Head-Related Transfer Functions (HRTFs) are required to achieve high-quality Virtual Auditory Spaces. This paper proposes to decrease the total number of measured directions in order to make acoustic measurements more comfortable. Below a certain measurement density, classical interpolation techniques fail to reconstruct HRTFs properly, so additional knowledge has to be injected. Focusing on the spatial structure of HRTFs, the analysis of a large HRTF database makes it possible to introduce spatial prototypes. After a pattern recognition process, these prototypes serve as a well-informed background for the reconstruction of any sparsely measured set of individual HRTFs. This technique shows better spatial fidelity than blind interpolation techniques.
Convention Paper 7610

P19-2 Near-Field Compensation for HRTF Processing
—David Romblom, Bryan Cook, Sennheiser DSP Research Lab - Palo Alto, CA, USA
It is difficult to present near-field virtual audio displays using available HRTF filters, as most existing databases are measured at a single distance in the far field of the listener’s head. Measuring near-field data is possible, but would quickly become tiresome due to the large number of distances required to simulate sources moving close to the head. For applications requiring a compelling near-field virtual audio display, one could compensate the far-field HRTF filters with a scheme based on 1/r spreading roll-off. However, this would not account for spectral differences that occur in the near field. Using difference filters based on a spherical head model, as well as a geometrically accurate HRTF lookup scheme, we are able to compensate existing data and present a convincing virtual audio display for near-field distances.
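As a rough illustration of the 1/r spreading baseline mentioned in the abstract (the scheme the authors describe as insufficient on its own, not their difference-filter method), the gain applied to a far-field HRTF could be sketched as follows. The function names and the 1 m reference distance are assumptions made for this example.

```python
import math

def spreading_gain(distance_m, reference_m=1.0):
    """Linear amplitude gain under 1/r spreading, relative to reference_m.

    reference_m stands in for the (assumed) distance at which the
    far-field HRTF set was measured.
    """
    if distance_m <= 0 or reference_m <= 0:
        raise ValueError("distances must be positive")
    return reference_m / distance_m

def spreading_gain_db(distance_m, reference_m=1.0):
    """The same gain expressed in decibels."""
    return 20.0 * math.log10(spreading_gain(distance_m, reference_m))
```

Halving the distance doubles the amplitude (about +6 dB) at every frequency, which is why this scheme alone cannot reproduce the spectral changes that occur close to the head.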
Convention Paper 7611

P19-3 A Method for Estimating Interaural Time Difference for Binaural Synthesis
—Juhan Nam, Jonathan S. Abel, Julius O. Smith III, Stanford University - Stanford, CA, USA
A method for estimating interaural time difference (ITD) from measured head-related transfer functions (HRTFs) is presented. The method forms ITD as the difference in left-ear and right-ear arrival times, estimated as the times of maximum cross-correlation between measured HRTFs and their minimum-phase counterparts. The arrival time estimate is related to a least-squares fit to the measured excess phase, emphasizing those frequencies having large HRTF magnitude and deweighting large phase delay errors. As HRTFs are nearly minimum-phase, this method is robust compared to the conventional approach of cross-correlating left-ear and right-ear HRTFs, which can be very different. The method also performs slightly better than techniques averaging phase delay over a limited frequency range.
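The core idea in the abstract (arrival time as the lag of maximum cross-correlation between a measured response and its minimum-phase counterpart, with ITD as the left-minus-right difference) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it works on even-length time-domain impulse responses, uses a naive DFT, builds the minimum-phase version with the standard real-cepstrum folding method, picks integer-sample peaks only, and omits the least-squares weighting discussed in the paper. All function names are hypothetical.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) DFT; adequate for short illustrative responses."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out

def minimum_phase(h):
    """Minimum-phase counterpart of h via real-cepstrum folding (len(h) even)."""
    n = len(h)
    log_mag = [math.log(max(abs(v), 1e-12)) for v in dft(h)]
    cep = [v.real for v in dft(log_mag, inverse=True)]
    # Fold the anti-causal half of the cepstrum onto the causal half.
    folded = [0.0] * n
    folded[0] = cep[0]
    for i in range(1, n // 2):
        folded[i] = 2.0 * cep[i]
    folded[n // 2] = cep[n // 2]
    spectrum = [cmath.exp(v) for v in dft(folded)]
    return [v.real for v in dft(spectrum, inverse=True)]

def arrival_time(h):
    """Non-negative lag (samples) maximizing correlation of h with its minimum-phase version."""
    h_min = minimum_phase(h)
    n = len(h)
    best_lag, best_val = 0, float("-inf")
    for lag in range(n):
        val = sum(h[m + lag] * h_min[m] for m in range(n - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag

def itd_samples(h_left, h_right):
    """ITD in samples: left-ear arrival time minus right-ear arrival time."""
    return arrival_time(h_left) - arrival_time(h_right)
```

Dividing the result by the sampling rate gives the ITD in seconds. A practical version would likely use an FFT and interpolate around the correlation peak for sub-sample resolution.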
Convention Paper 7612

P19-4 Efficient Delay Interpolation for Wave Field Synthesis
—Andreas Franck, Karlheinz Brandenburg, Fraunhofer Institute for Digital Media Technology - Ilmenau, Germany; Ulf Richter, Fraunhofer Institute for Digital Media Technology - Ilmenau, Germany, and HTWK Leipzig, Leipzig, Germany
Wave Field Synthesis (WFS) enables the reproduction of complex auditory scenes and moving sound sources. Moving sound sources induce time-variant delays of the source signals. To avoid severe distortions, sophisticated delay interpolation techniques must be applied. The typically large numbers of both virtual sources and loudspeakers in a WFS system result in a very high number of simultaneous delay operations, which makes delay interpolation one of the most performance-critical aspects of a WFS rendering system. In this paper we investigate suitable delay interpolation algorithms for WFS. To overcome the prohibitive computational cost induced by high-quality algorithms, we propose a computational structure that achieves a significant complexity reduction through a novel algorithm partitioning and efficient data reuse.
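To make the notion of delay interpolation concrete, the sketch below reads a signal at a fractional delay using third-order Lagrange interpolation, one common family of fractional-delay filters. It is a generic illustration, not the partitioned structure the authors propose, and the function names are assumptions made for this example.

```python
import math

def lagrange_coeffs(total_delay, order=3):
    """FIR coefficients of a Lagrange fractional-delay filter.

    total_delay is the filter delay in samples, 0 <= total_delay <= order.
    """
    coeffs = []
    for k in range(order + 1):
        c = 1.0
        for m in range(order + 1):
            if m != k:
                c *= (total_delay - m) / (k - m)
        coeffs.append(c)
    return coeffs

def read_delayed(signal, n, delay, order=3):
    """Approximate signal[n - delay] for a non-negative, possibly fractional delay."""
    int_part = int(math.floor(delay))
    frac = delay - int_part
    # Shift the window so the fractional point lies near its centre,
    # where Lagrange interpolation is most accurate.
    offset = (order - 1) // 2
    coeffs = lagrange_coeffs(frac + offset, order)
    base = n - int_part + offset
    total = 0.0
    for k, c in enumerate(coeffs):
        idx = base - k
        if 0 <= idx < len(signal):  # samples outside the buffer count as zero
            total += c * signal[idx]
    return total
```

For a moving source, `delay` would change from one output sample to the next, so the coefficients are recomputed per sample for every source and loudspeaker pair. That per-sample cost is the kind of load the abstract describes as performance-critical.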
Convention Paper 7613

P19-5 Obtaining Binaural Room Impulse Responses from B-Format Impulse Responses
—Fritz Menzer, Christof Faller, Ecole Polytechnique Fédérale de Lausanne - Lausanne, Switzerland
Given a set of head-related transfer functions (HRTFs) and a room impulse response measured with a Soundfield microphone, the proposed technique computes binaural room impulse responses (BRIRs) similar to those that would be measured if the dummy head used for the HRTF set were recording in place of the Soundfield microphone. From a single set of HRTFs, the technique thus yields corresponding BRIRs for different rooms without the dummy head or person having to be present for the measurement.
Convention Paper 7614

P19-6 A New Audio Postproduction Tool for Speech Dereverberation
—Keisuke Kinoshita, Tomohiro Nakatani, Masato Miyoshi, NTT Communication Science Laboratories - Kyoto, Japan; Toshiyuki Kubota, NTT Media Lab. - Tokyo, Japan
This paper proposes a new audio postproduction tool for speech dereverberation based on our previously proposed method. In previous studies we proposed a single-channel dereverberation method as a preprocessing step for automatic speech recognition and reported good performance. This paper focuses on improving the audible quality of the dereverberated signals. To achieve good dereverberation with fewer audible artifacts, the previously proposed dereverberation method is combined with post-processing that implicitly takes the perceptual masking property into account. The system has three adjustable parameters for controlling the audible quality. In an informal evaluation, we found that the proposed tool allows professional audio engineers to dereverberate a set of reverberant recordings efficiently.
Convention Paper 7615