AES Rome 2013
Paper Session P12
Monday, May 6, 09:00 — 13:00 (Sala Carducci)
Paper Session: P12 - Spatial Audio—Part 1: Binaural, HRTF
Chair: Michele Geronazzo, University of Padova - Padova, Italy
P12-1 Binaural Ambisonic Decoding with Enhanced Lateral Localization—Tim Collins, University of Birmingham - Birmingham, UK
When rendering an ambisonic recording, a uniform speaker array is often preferred with the number of speakers chosen to suit the ambisonic order. Using this arrangement, localization in the lateral regions can be poor but can be improved by increasing the number of speakers. However, in practice this can lead to undesirable spectral impairment. In this paper a time-domain analysis of the ambisonic decoding problem is presented that highlights how a non-uniform speaker distribution can be used to improve localization without incurring perceptual spectral impairment. This is especially relevant to binaural decoders, where the locations of the virtual speakers are fixed with respect to the head, meaning that the interaction between speakers can be reliably predicted.
Convention Paper 8878
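As a rough illustration of the uniform-array decoding that the P12-1 abstract takes as its starting point, the following sketch decodes a first-order, horizontal-only ambisonic signal to a ring of equally spaced virtual speakers. It is not the decoder proposed in the paper. The function names, the unnormalized component convention (W = 1, X = cos θ, Y = sin θ), and the basic sampling decoder are assumptions made for this example, and the gain formula only holds for a uniform ring of at least three speakers.

```python
import math


def encode_first_order_2d(source_azimuth_deg):
    """Encode a unit-amplitude plane wave into horizontal first-order components.

    Assumed convention for this sketch: W = 1, X = cos(theta), Y = sin(theta).
    """
    theta = math.radians(source_azimuth_deg)
    return (1.0, math.cos(theta), math.sin(theta))


def uniform_ring(num_speakers):
    """Azimuths (degrees) of equally spaced virtual speakers on a circle."""
    return [360.0 * i / num_speakers for i in range(num_speakers)]


def decode_to_speakers(components, speaker_azimuths_deg):
    """Basic sampling decoder: g_i = (W + 2 * (X cos(phi_i) + Y sin(phi_i))) / N.

    Valid for a uniform ring with N >= 3 speakers under the convention above.
    """
    w, x, y = components
    n = len(speaker_azimuths_deg)
    gains = []
    for azimuth in speaker_azimuths_deg:
        phi = math.radians(azimuth)
        gains.append((w + 2.0 * (x * math.cos(phi) + y * math.sin(phi))) / n)
    return gains


if __name__ == "__main__":
    # A source at 90 degrees decoded to four virtual speakers at 0, 90, 180, 270.
    gains = decode_to_speakers(encode_first_order_2d(90.0), uniform_ring(4))
    for azimuth, gain in zip(uniform_ring(4), gains):
        print(f"speaker at {azimuth:5.1f} deg: gain {gain:+.3f}")
```

In a binaural decoder each speaker gain would then drive the HRTF pair for that fixed virtual speaker direction; that filtering stage is omitted here.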
P12-2 A Cluster and Subjective Selection-Based HRTF Customization Scheme for Improving Binaural Reproduction of 5.1 Channel Surround Sound—Bosun Xie, South China University of Technology - Guangzhou, China; Chengyun Zhang, South China University of Technology - Guangzhou, China; Xiaoli Zhong, South China University of Technology - Guangzhou, China
This work proposes a cluster and subjective selection-based HRTF customization scheme for improving binaural reproduction of 5.1 channel surround sound. Based on the similarity of HRTFs from an HRTF database with 52 subjects, a cluster analysis is applied to the HRTF magnitudes. Results indicate that the HRTFs of most subjects can be classified into seven clusters and represented by the corresponding cluster centers. Subsequently, the HRTFs used in binaural 5.1 channel reproduction are customized from the seven cluster centers by means of subjective selection, i.e., a subjective selection-based customization scheme. Psychoacoustic experiments indicate that the proposed scheme partly improves localization performance in binaural 5.1 channel surround sound reproduction.
Convention Paper 8879
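The clustering step described in the P12-2 abstract can be sketched as follows. The abstract does not name the clustering algorithm, so plain k-means is swapped in here as an assumption. The data are synthetic toy vectors standing in for per-subject HRTF magnitudes in dB, not measurements, and all function names are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def kmeans(vectors, k, iterations=50, seed=0):
    """Plain k-means (an assumed stand-in for the paper's cluster analysis).

    Returns (centers, labels), where labels[i] is the cluster of vectors[i].
    """
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centers[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centers[c] = mean_vector(members)
    return centers, labels


def make_toy_magnitudes(num_subjects=52, num_bins=8, num_prototypes=7, seed=1):
    """Synthetic magnitude vectors (dB) scattered around a few prototypes."""
    rng = random.Random(seed)
    prototypes = [
        [rng.uniform(-20.0, 0.0) for _ in range(num_bins)]
        for _ in range(num_prototypes)
    ]
    return [
        [value + rng.gauss(0.0, 1.0) for value in prototypes[i % num_prototypes]]
        for i in range(num_subjects)
    ]


if __name__ == "__main__":
    subjects = make_toy_magnitudes()
    centers, labels = kmeans(subjects, k=7)
    for cluster in range(7):
        print(f"cluster {cluster}: {labels.count(cluster)} subjects")
```

In the scheme the abstract describes, a listener would then choose among the seven cluster centers by subjective comparison; that selection stage is a listening test and is not modeled here.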
P12-3 Spatially Oriented Format for Acoustics: A Data Exchange Format Representing Head-Related Transfer Functions—Piotr Majdak, Austrian Academy of Sciences - Vienna, Austria; Yukio Iwaya, Tohoku Gakuin University - Tagajo, Japan; Thibaut Carpentier, UMR STMS IRCAM-CNRS-UPMC - Paris, France; Rozenn Nicol, Orange Labs, France Telecom - Lannion, France; Matthieu Parmentier, France Television - Paris, France; Agnieszka Roginska, New York University - New York, NY, USA; Yoiti Suzuki, Tohoku University - Sendai, Japan; Kanji Watanabe, Akita Prefectural University - Yuri-Honjo, Japan; Hagen Wierstorf, Technische Universität Berlin - Berlin, Germany; Harald Ziegelwanger, Austrian Academy of Sciences - Vienna, Austria; Markus Noisternig, UMR STMS IRCAM-CNRS-UPMC - Paris, France
Head-related transfer functions (HRTFs) describe the spatial filtering of the incoming sound. The HRTFs available so far are stored in various formats, and incompatibilities between these formats make exchanging HRTFs difficult. We propose a format for storing HRTFs with a focus on interchangeability and extendability. The spatially oriented format for acoustics (SOFA) aims at representing HRTFs in a general way, which also allows it to store data such as directional room impulse responses (DRIRs) measured with a microphone array excited by a loudspeaker array. The SOFA specifications consider data compression, network transfer, and a link to complex room geometries, and aim at simplifying the development of programming interfaces for Matlab, Octave, and C++. SOFA conventions for a consistent description of measurement setups are provided for future HRTF and DRIR databases.
Convention Paper 8880
P12-4 Head Movements in Three-Dimensional Localization—Tommy Ashby, University of Surrey - Guildford, Surrey, UK; Russell Mason, University of Surrey - Guildford, Surrey, UK; Tim Brookes, University of Surrey - Guildford, Surrey, UK
Previous studies give contradictory evidence as to the importance of head movements in localization. In this study head movements were shown to increase localization response accuracy in elevation and azimuth. For elevation, it was found that head movement improved localization accuracy in some cases and that when pinna cues were impeded the significance of head movement cues was increased. For azimuth localization, head movement reduced front-back confusions. There was also evidence that head movement can be used to enhance static cues for azimuth localization. Finally, it appears that head movement can increase the accuracy of listeners’ responses by enabling an interaction between auditory and visual cues.
Convention Paper 8881
P12-5 A Modular Framework for the Analysis and Synthesis of Head-Related Transfer Functions—Michele Geronazzo, University of Padova - Padova, Italy; Simone Spagnol, University of Padova - Padova, Italy; Federico Avanzini, University of Padova - Padova, Italy
The paper gives an overview of a number of tools for the analysis and synthesis of head-related transfer functions (HRTFs) that we have developed in the past four years at the Department of Information Engineering, University of Padova, Italy. The main objective of our study in this context is the progressive development of a collection of algorithms for the construction of a totally synthetic personal HRTF set replacing both cumbersome and tedious individual HRTF measurements and the exploitation of inaccurate non-individual HRTF sets. Our research methodology is highlighted, along with the multiple possibilities of present and future research offered by such tools.
Convention Paper 8882
P12-6 Measuring Directional Characteristics of In-Ear Recording Devices—Flemming Christensen, Aalborg University - Aalborg, Denmark; Pablo F. Hoffmann, Aalborg University - Aalborg, Denmark; Dorte Hammershøi, Aalborg University - Aalborg, Denmark
With the availability of small in-ear headphones and miniature microphones it is possible to construct combined in-ear devices for binaural recording and playback. When a microphone is mounted on the outside of an insert earphone, the microphone position deviates from the ideal positions in the ear canal. The pinna, and thereby also the natural sound transmission, is altered by the inserted device. This paper presents a methodology for accurately measuring the direction-dependent transfer functions of such in-ear devices. Pilot measurements on a commercially available device are presented and possibilities for electronic compensation of the non-ideal characteristics are considered.
Convention Paper 8883
P12-7 Modeling the Broadband Time-of-Arrival of the Head-Related Transfer Functions for Binaural Audio—Harald Ziegelwanger, Austrian Academy of Sciences - Vienna, Austria; Piotr Majdak, Austrian Academy of Sciences - Vienna, Austria
Binaural audio is based on the head-related transfer functions (HRTFs) that provide directional cues for the localization of virtual sound sources. HRTFs incorporate the time-of-arrival (TOA), the monaural timing information yielding the interaural time difference, essential for the rendering of multiple virtual sound sources. In this study we propose a method to robustly estimate spatially continuous TOA from an HRTF set. The method is based on a directional outlier remover and a geometrical model of the HRTF measurement setup. We show results for HRTFs of human listeners from three HRTF databases. The benefits of calculating the TOA for HRTF analysis, modification, and synthesis are discussed.
Convention Paper 8884
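The relation stated in the P12-7 abstract, that the monaural TOA of each ear yields the interaural time difference, can be illustrated with a small sketch. The onset-threshold estimator below is a common simple technique assumed for this example. It is not the paper's method, which relies on a directional outlier remover and a geometrical model. The function names, the -10 dB threshold, and the toy impulse responses are hypothetical.

```python
def onset_toa(impulse_response, sample_rate, threshold_db=-10.0):
    """Estimate the time-of-arrival (seconds) as the first sample whose
    magnitude reaches threshold_db relative to the peak magnitude.

    Assumes threshold_db <= 0 so that the peak itself always qualifies.
    """
    peak = max(abs(sample) for sample in impulse_response)
    if peak == 0.0:
        raise ValueError("impulse response is all zeros")
    threshold = peak * 10.0 ** (threshold_db / 20.0)
    for index, sample in enumerate(impulse_response):
        if abs(sample) >= threshold:
            return index / sample_rate
    raise ValueError("threshold_db must not exceed 0 dB")


def interaural_time_difference(left_ir, right_ir, sample_rate):
    """ITD as TOA(left) - TOA(right); negative when the left ear leads."""
    return onset_toa(left_ir, sample_rate) - onset_toa(right_ir, sample_rate)


if __name__ == "__main__":
    sample_rate = 48000
    # Toy impulse responses: ideal impulses at sample 10 (left) and 40 (right).
    left = [0.0] * 64
    right = [0.0] * 64
    left[10] = 1.0
    right[40] = 0.8
    itd = interaural_time_difference(left, right, sample_rate)
    print(f"ITD: {itd * 1e6:.1f} microseconds")
```

Per-direction estimates of this kind are noisy on measured data, which is the motivation the abstract gives for removing outliers and fitting a spatially continuous model.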
P12-8 Multichannel Ring Upmix—Christof Faller, Illusonic GmbH - Uster, Switzerland; Lutz Altmann, IOSONO GmbH - Erfurt, Germany; Jeff Levison, IOSONO GmbH - Erfurt, Germany; Markus Schmidt, Illusonic GmbH - Uster, Switzerland
Multichannel spatial decompositions and upmixes have been proposed, but these are usually based on an unrealistically simple direct/ambient sound model that does not capture the full diversity offered by N discrete audio channels, where in an extreme case each channel can contain an independent sound source. While it has been argued that a simple direct/ambient model is sufficient, in practice it limits the achievable audio quality. To circumvent the problem of capturing multichannel signals with a single model, the proposed “ring upmix” applies a cascade of 2-channel upmixes to surround signals to generate channels for setups with more loudspeakers, featuring full support for 360-degree panning with high channel separation.
Convention Paper 8908