AES Munich 2009
Paper Session P13
P13 - Spatial Audio and Spatial Perception
Friday, May 8, 14:00 — 18:30
Chair: Tapio Lokki
P13-1 Evaluation of Equalization Methods for Binaural Signals—Zora Schärer, Alexander Lindau, TU Berlin - Berlin, Germany
The most demanding test criterion for the quality of binaural simulations of acoustical environments is whether they can be perceptually distinguished from a real sound field or not. If the simulation provides a natural interaction and sufficient spatial resolution, differences are predominantly perceived in terms of spectral distortions due to a non-perfect equalization of the transfer functions of the recording and reproduction systems (dummy head microphones, headphones). In order to evaluate different compensation methods, several headphone transfer functions were measured on a dummy head. Based upon these measurements, the performance of different inverse filtering techniques re-implemented from literature was evaluated using auditory measures for spectral differences. Additionally, an ABC/HR listening test was conducted, using two different headphones and two different audio stimuli (pink noise, acoustical guitar). In the listening test, a real loudspeaker was directly compared to a binaural simulation with high spatial resolution, which was compensated using seven different equalization methods.
Convention Paper 7721
P13-2 Crosstalk Cancellation between Phantom Sources—Florian Völk, Thomas Musialik, Hugo Fastl, Technical University of München - München, Germany
This paper presents an approach using phantom sources (resulting from the so-called summing localization of two loudspeakers) as sources for crosstalk cancellation (CTC). The phantom sources can be rotated synchronously with the listener’s head, thus demanding significantly less processing power than traditional approaches using fixed CTC loudspeakers, as an online re-computation of the CTC filters is (under certain circumstances) not necessary. First results of localization experiments show the general applicability of this procedure.
Convention Paper 7722
P13-3 Preliminary Evaluation of Sweet Spot Size in Virtual Sound Reproduction Using Dipoles—Yesenia Lacouture Parodi, Per Rubak, Aalborg University - Aalborg, Denmark
In a previous study, three crosstalk cancellation techniques were evaluated and compared under different conditions. Least-squares approximations in the frequency and time domains were evaluated along with a method based on minimum-phase approximation and a frequency-independent delay. In general, the least-squares methods outperformed the method based on minimum-phase approximation. However, the evaluation was only done for the best-case scenario, where the transfer functions used to design the filters correspond to the listener’s transfer functions and his/her location and orientation relative to the loudspeakers. In this paper we present a follow-up evaluation of the performance of the three inversion techniques when these conditions are violated. A setup to measure the sweet spot of different loudspeaker arrangements is described. Preliminary measurement results are presented for loudspeakers placed in the horizontal plane and at an elevated position, where a typical 60-degree stereo setup is compared with two closely spaced loudspeakers. Additionally, two- and four-channel arrangements are evaluated.
Convention Paper 7723
P13-4 The Importance of the Direct to Reverberant Ratio in the Perception of Distance, Localization, Clarity, and Envelopment—David Griesinger, Consultant - Cambridge, MA, USA
The Direct to Reverberant ratio (D/R)—the ratio of the energy in the first wave front to the reflected sound energy—is absent from most discussions of room acoustics. Yet only the direct sound (DS) provides information about the localization and distance of a sound source. This paper discusses how the perception of DS in a reverberant field depends on the D/R and the time delay between the DS and the reverberant energy. Threshold data for DS perception will be presented, and the implications for listening rooms, hall design, and electronic enhancement will be discussed. We find that both clarity and envelopment depend on DS detection. In listening rooms the direct sound must be at least equal to the total reflected energy for accurate imaging. As the room becomes larger (and the time delay increases) the threshold goes down. Some conclusions: typical listening rooms benefit from directional loudspeakers, small concert halls should not have a shoe-box shape, early reflections need not be lateral, and electroacoustic enhancement of late reverberation may be vital in small halls.
Convention Paper 7724
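The D/R definition in the abstract of P13-4 (energy in the first wave front divided by the reflected sound energy) can be illustrated with a minimal sketch that estimates D/R in dB from a sampled impulse response. The function name, the 5 ms direct-sound window, and the toy impulse response are assumptions made for illustration; they are not taken from the paper.

```python
import math

def direct_to_reverberant_db(ir, fs, direct_window_ms=5.0):
    """Estimate D/R in dB from an impulse response `ir` sampled at `fs` Hz.

    Assumption: everything up to `direct_window_ms` after the strongest
    peak counts as direct sound; everything later counts as reflected energy.
    """
    peak = max(range(len(ir)), key=lambda n: abs(ir[n]))
    split = peak + int(round(direct_window_ms * 1e-3 * fs))
    direct = sum(x * x for x in ir[:split])
    reverb = sum(x * x for x in ir[split:])
    if reverb == 0.0:
        return math.inf  # no reflected energy at all
    return 10.0 * math.log10(direct / reverb)

# Toy impulse response: unit direct sound plus two reflections of amplitude 0.5,
# so the reflected energy is 0.5 and D/R = 10*log10(1 / 0.5).
fs = 1000
ir = [0.0] * 100
ir[0] = 1.0
ir[20] = 0.5
ir[40] = 0.5
print(round(direct_to_reverberant_db(ir, fs), 2))  # 3.01
```

Under this definition, the abstract's criterion for listening rooms (direct sound at least equal to the total reflected energy) corresponds to a D/R of 0 dB or more.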
P13-5 Frequency-Domain Interpolation of Empirical HRTF Data—Brian Carty, Victor Lazzarini, National University of Ireland - Maynooth, Ireland
This paper discusses Head Related Transfer Function (HRTF)-based artificial spatialization of audio. Two alternatives to the minimum phase method of HRTF interpolation are suggested, offering novel approaches to the challenge of phase interpolation. A phase truncation, magnitude interpolation technique aims to avoid complex preparation, manipulation or transformation of empirical HRTF data, and any inaccuracies that may be introduced by these operations. A second technique adds low frequency nonlinear frequency scaling to a functionally based phase model. This approach aims to provide a low frequency spectrum more closely aligned to the empirical HRTF data. Test results indicate favorable performance of the new techniques.
Convention Paper 7725
P13-6 Analysis and Implementation of a Stereophonic Play Back System for Adjusting the “Sweet Spot” to the Listener’s Position—Sebastian Merchel, Stephan Groth, Dresden University of Technology - Dresden, Germany
This paper focuses on a stereophonic play back system designed to adjust the “sweet spot” to the listener’s position. The system includes an optical face tracker that provides information about the listener’s x-y position. Accordingly, the loudspeaker signals are manipulated in real-time in order to move the “sweet spot.” The stereophonic perception with an adjusted “sweet spot” is theoretically investigated on the basis of several models of binaural hearing. The results indicate that an adjustment of signals corresponding to the center of the listener’s head does improve the localization over the whole listening area. However, some localization error remains due to asymmetric signal paths for off-center listening positions; this error can be estimated and compensated for.
Convention Paper 7726
P13-7 Issues on Dummy-Head HRTFs Measurements—Daniela Toledo, Henrik Møller, Aalborg University - Aalborg, Denmark
The dimensions of a person are small compared to the wavelength at low frequencies. Therefore, at these frequencies head-related transfer functions (HRTFs) should converge asymptotically to 0 dB (i.e., unity gain) at DC. This is not the case in measured HRTFs: the limitations of the equipment used result in a wrong, and random, value at DC, and the effect can be seen well within the audio frequencies. We have measured HRTFs on a commercially available dummy head, the Neumann KU-100, and analyzed issues associated with calibration, DC correction, and low-frequency response. Informal listening tests suggest that the ripples seen in HRTFs with a wrong DC value affect the sound quality in binaural synthesis.
Convention Paper 7727
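The abstract of P13-7 states that an HRTF should reach unity gain at DC. One simple way to enforce this on a measured impulse response is sketched below; it is not necessarily the correction procedure used in the paper. The sketch assumes the head-related impulse response is already scaled so that 0 dB is the free-field reference, and the function name is hypothetical. It relies on the fact that the DC bin of the DFT equals the sum of the samples, so adding the same constant to every sample changes only that bin.

```python
def set_dc_gain_to_unity(hrir):
    """Return a copy of `hrir` whose DC gain (the sum of its samples) is 1.

    Adding a constant offset to all N samples shifts DFT bin 0 by
    N * offset and leaves every other DFT bin unchanged.
    """
    n = len(hrir)
    offset = (1.0 - sum(hrir)) / n
    return [x + offset for x in hrir]

# Hypothetical short impulse response with a wrong DC value (sum = 0.65).
measured = [0.2, 0.5, -0.1, 0.05]
corrected = set_dc_gain_to_unity(measured)
```

Because the offset is spread evenly over the whole response, longer impulse responses receive a smaller per-sample change for the same DC error.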
P13-8 Binaural Processing Algorithms: Importance of Clustering Analysis for Preference Tests—Andreas Silzle, Bernhard Neugebauer, Sunish George, Jan Plogsties, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany
The acceptability of a newly proposed technology for commercial application is often assumed if the sound quality reached in a listening test surpasses a certain target threshold. As an example, it is a well-established procedure for decisions on the deployment of audio codecs to run a listening test comparing the coded/decoded signal with the uncoded reference signal. For other technologies, e.g., upmix or binaural processing, however, the unprocessed signal can act only as a "comparison signal." Here, the goal is to achieve a significant preference for the processed signal over the comparison signal. For such preference listening tests, we underline the importance of clustering the test results to obtain additional valuable information, as opposed to using only standard statistical metrics such as the mean and confidence interval. This approach allows determining the size of the user group that would significantly prefer to use the proposed algorithm if it were available in a consumer device. As an example, listening test data for binaural processing algorithms are analyzed in this investigation.
Convention Paper 7728
P13-9 Perception of Head-Position-Dependent Variations in Interaural Cross-Correlation Coefficient—Russell Mason, Chungeun Kim, Tim Brookes, University of Surrey - Guildford, Surrey, UK
Experiments were undertaken to elicit the perceived effects of head-position-dependent variations in the interaural cross-correlation coefficient of a range of signals. A graphical elicitation experiment showed that the variations in the IACC strongly affected the perceived width and depth of the reverberant environment, as well as the perceived width and distance of the source. A verbal experiment gave similar results and also indicated that the head-position-dependent IACC variations caused changes in the perceived spaciousness and envelopment of the stimuli.
Convention Paper 7729
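For readers unfamiliar with the quantity studied in P13-9, the interaural cross-correlation coefficient (IACC) is commonly defined as the maximum absolute value of the normalized cross-correlation between the two ear signals for lags within about ±1 ms. The following is a minimal sketch of that common definition; the function name and the impulse test signals are assumptions for illustration, and the paper's own measurement details (signals, windowing, head positions) are not reproduced here.

```python
import math

def iacc(left, right, fs, max_lag_ms=1.0):
    """Peak of the normalized cross-correlation between two ear signals,
    searched over lags within +/- `max_lag_ms` milliseconds."""
    n = min(len(left), len(right))
    norm = math.sqrt(sum(x * x for x in left[:n]) *
                     sum(y * y for y in right[:n]))
    if norm == 0.0:
        return 0.0  # at least one channel is silent
    max_lag = int(round(max_lag_ms * 1e-3 * fs))
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[i] with right[i + lag] over the overlapping samples.
        acc = sum(left[i] * right[i + lag]
                  for i in range(max(0, -lag), min(n, n - lag)))
        best = max(best, abs(acc) / norm)
    return best

fs = 8000  # +/- 1 ms corresponds to +/- 8 samples at this rate
left = [0.0] * 64
left[10] = 1.0
right_near = [0.0] * 64
right_near[13] = 1.0   # 3-sample interaural delay, inside the lag window
right_far = [0.0] * 64
right_far[40] = 1.0    # 30-sample offset, outside the lag window
print(iacc(left, right_near, fs), iacc(left, right_far, fs))  # 1.0 0.0
```

A value near 1 indicates nearly identical ear signals apart from a small delay, while lower values indicate decorrelated ear signals.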