AES New York 2015
Paper Session P13

P13 - Spatial Audio—Part 1

Saturday, October 31, 9:00 am — 12:30 pm (Room 1A08)

Chair:
Francis Rumsey, Logophon Ltd. - Oxfordshire, UK

P13-1 On the Performance of Acoustic Intensity-Based Source Localization with an Open Spherical Microphone Array—Mert Burkay Cöteli, METU Middle East Technical University - Ankara, Turkey; ASELSAN A.S. - Ankara, Turkey; Hüseyin Hacihabiboglu, Middle East Technical University (METU) - Ankara, Turkey
Sound source localization is important in a variety of contexts. A notable example is acoustic scene analysis for parametric spatial audio, where it is necessary not only to record the sound source but also to deduce its direction. Sound source localization methods based on acoustic intensity provide a viable alternative to more traditional, delay-based techniques. However, they require special sound intensity probes or microphone arrays. This paper evaluates the sound source localization performance of an icosahedral open spherical microphone array using a method based on intensity vector distributions in the time-frequency domain.
Convention Paper 9429
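
As a rough illustration of the general idea behind intensity-based localization (not the evaluated method, which works on intensity vector distributions measured with an open spherical array), the following sketch estimates a direction of arrival from simulated pressure and particle velocity bins. The function names, the noise model, and the characteristic impedance value `rho_c` are assumptions made for this example only.

```python
import numpy as np

def doa_from_intensity(p, u):
    """Estimate a unit direction-of-arrival vector from time-frequency bins.

    p: complex array, shape (n_bins,), sound pressure per bin.
    u: complex array, shape (n_bins, 3), particle velocity per bin.
    """
    # Active intensity per bin, I = 0.5 * Re{p * conj(u)}; it points along
    # the direction in which the wave propagates.
    intensity = 0.5 * np.real(p[:, None] * np.conj(u))
    # Pool the bins by summation (a simple stand-in for analysing the
    # distribution of intensity vectors).
    pooled = intensity.sum(axis=0)
    # The source lies opposite to the direction of energy flow.
    return -pooled / np.linalg.norm(pooled)

def simulate_plane_wave(source_dir, n_bins=256, rho_c=413.0, noise=0.05, seed=0):
    """Hypothetical test signal: one plane wave plus velocity noise."""
    rng = np.random.default_rng(seed)
    source_dir = np.asarray(source_dir, dtype=float)
    source_dir = source_dir / np.linalg.norm(source_dir)
    p = rng.standard_normal(n_bins) + 1j * rng.standard_normal(n_bins)
    # The wave travels from the source toward the array, i.e. along -source_dir.
    u = (p[:, None] / rho_c) * (-source_dir)
    u = u + (noise / rho_c) * (rng.standard_normal((n_bins, 3))
                               + 1j * rng.standard_normal((n_bins, 3)))
    return p, u

true_dir = np.array([1.0, 1.0, 0.5])
true_dir = true_dir / np.linalg.norm(true_dir)
p, u = simulate_plane_wave(true_dir)
estimate = doa_from_intensity(p, u)
angle_error_deg = np.degrees(np.arccos(np.clip(estimate @ true_dir, -1.0, 1.0)))
```

With a single plane wave and mild noise, `angle_error_deg` stays small. Reverberation and competing sources, which this toy model ignores, are where intensity-based estimates become harder to interpret.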

P13-2 A Microphone Array for Recording Music in Surround-Sound with Height Channels—David Bowles, Swineshead Productions LLC - Berkeley, CA, USA
In the past few years, sound recordings with spatial audio have moved from the realm of theoretical research to the actuality of physical and digital releases in the market. At present, three Blu-ray disc formats utilize a traditional 5.1 surround-sound recording with an added 4-channel layer of height channels. The topic of this paper is how to capture vertical localization effectively within this release format, utilizing existing research on hearing localization and techniques learned in the field. The proposed microphone array has time-of-arrival differences between all microphones, yet mixes down to 5.1 and stereo without excessive comb-filtering or other artifacts.
Convention Paper 9430

P13-3 Exploring 3D: A Subjective Evaluation of Surround Microphone Arrays Catered for Auro-3D Reproduction Systems—Alex Ryaboy, New York University - New York, NY, USA
As multichannel systems grow in popularity, audio professionals must make an informed decision when choosing a capture method to deliver their vision. Many of today’s microphone arrays designed for surround sound with height employ traditional spaced surround techniques aided by an additional array in the upper plane, and are widely used to capture performances in large spaces. This paper uses a perceptual study to evaluate a fully coincident microphone array, Double-MSZ, and a semi-coincident array, Twins Square, on Envelopment, Localization, and Spatial Impression in a small recording studio environment. The study revealed overall lower widths, better localization, and more stable vertical imaging for Double-MSZ, while the Twins Square technique exhibited higher ensemble envelopment and a more spacious perceived environment.
Convention Paper 9431

P13-4 Three Dimensional Spatial Techniques in 22.2 Multichannel Surround Sound for Popular Music Mixing—Bryan Martin, McGill University - Montreal, QC, Canada; Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT) - Montreal, QC, Canada; Richard King, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada
Current multichannel spatial mixing practices are largely limited to the construction of three-dimensional space using two-dimensional panning tools (meant for 5.1, 7.1, etc.) and those designed for common stereo production. A great deal of research is currently underway in spatial sound reproduction through computer modeling and signal processing, with little focus on actual recording and mixing practices. This investigation examines the design and implementation of early and late reflections and reverberant fields in 22.2 multichannel sound system mixing, based upon research in listener envelopment. The techniques discussed include the expansion of spatial elements into three dimensions using conventional tools and the implementation of multichannel impulse responses for reverberant fields. Listening tests were conducted on the final music mix, with positive results reported for listener immersion.
Convention Paper 9432

P13-5 On the Use of a Lebedev Grid for Ambisonics—Pierre Lecomte, Conservatoire National des Arts et Métiers - Paris, France; Université de Sherbrooke - Sherbrooke, Canada; Philippe-Aubert Gauthier, Université de Sherbrooke - Sherbrooke, Quebec, Canada; McGill University - Montreal, Quebec, Canada; Christophe Langrenne, Conservatoire National des Arts et Métiers - Paris, France; Alexandre Garcia, Conservatoire National des Arts et Métiers - Paris, France; Alain Berry, Université de Sherbrooke - Sherbrooke, Quebec, Canada; McGill University - Montreal, Quebec, Canada
Ambisonics provides tools for three-dimensional sound field analysis and synthesis. The theory is based on sound field decomposition using a truncated basis of spherical harmonics. For the three-dimensional problem, both the decomposition and the synthesis of the sound field involve an integration over the sphere that respects the orthonormality of the spherical harmonics. In practice, this integration is carried out with discrete angular samples over the sphere. This paper investigates spherical sampling using a Lebedev grid for practical applications of Ambisonics. The paper presents the underlying theory, simulations of reconstructed sound fields, and examples of actual prototypes using a 50-node grid capable of recording and reconstruction up to order 5. Orthonormality errors are provided up to sixth order and compared for two grids: (1) the Lebedev grid with 50 nodes and (2) the Pentakis-Dodecahedron with 32 nodes. Finally, the paper presents some practical advantages of using Lebedev grids for Ambisonics, in particular the use of sub-grids that work up to order 1 or 3 and share nodes with the 50-node grid.
Convention Paper 9433
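
To make the link between the 50-node grid and order 5 concrete, the sketch below builds the 50-node Lebedev grid from the standard published quadrature values (they are not taken from the paper) and checks which polynomial degrees it integrates exactly over the sphere. A product of two order-N spherical harmonics is a polynomial of degree 2N, so exactness through degree 10 is what order-5 orthonormality needs. All function names are hypothetical, and monomials stand in for spherical harmonics to keep the example dependency-free.

```python
import itertools
import numpy as np

def lebedev_50():
    """Return (points, weights) of the 50-node Lebedev grid; weights sum to 1."""
    pts, wts = [], []
    # 6 octahedron vertices: (+-1, 0, 0) and permutations.
    for axis in range(3):
        for s in (1.0, -1.0):
            v = [0.0, 0.0, 0.0]
            v[axis] = s
            pts.append(v)
            wts.append(4.0 / 315.0)
    # 12 edge midpoints: (+-1, +-1, 0) / sqrt(2) and permutations.
    for zero_axis in range(3):
        for s1, s2 in itertools.product((1.0, -1.0), repeat=2):
            v = [s1 / np.sqrt(2.0), s2 / np.sqrt(2.0)]
            v.insert(zero_axis, 0.0)
            pts.append(v)
            wts.append(64.0 / 2835.0)
    # 8 cube vertices: (+-1, +-1, +-1) / sqrt(3).
    for signs in itertools.product((1.0, -1.0), repeat=3):
        pts.append([s / np.sqrt(3.0) for s in signs])
        wts.append(27.0 / 1280.0)
    # 24 points (+-l, +-l, +-m) and permutations, l = 1/sqrt(11), m = 3/sqrt(11).
    l, m = 1.0 / np.sqrt(11.0), 3.0 / np.sqrt(11.0)
    for m_axis in range(3):
        for signs in itertools.product((1.0, -1.0), repeat=3):
            v = [l, l, l]
            v[m_axis] = m
            pts.append([s * c for s, c in zip(signs, v)])
            wts.append(14641.0 / 725760.0)
    return np.array(pts), np.array(wts)

def double_factorial(n):
    return 1 if n <= 0 else n * double_factorial(n - 2)

def exact_moment(a, b, c):
    """Mean of x^a y^b z^c over the unit sphere."""
    if a % 2 or b % 2 or c % 2:
        return 0.0
    num = double_factorial(a - 1) * double_factorial(b - 1) * double_factorial(c - 1)
    return num / double_factorial(a + b + c + 1)

def max_quadrature_error(points, weights, degree):
    """Largest quadrature error over all monomials of the given total degree."""
    x, y, z = points.T
    worst = 0.0
    for a in range(degree + 1):
        for b in range(degree + 1 - a):
            c = degree - a - b
            approx = np.sum(weights * x**a * y**b * z**c)
            worst = max(worst, abs(approx - exact_moment(a, b, c)))
    return worst

points, weights = lebedev_50()
errors = {d: max_quadrature_error(points, weights, d) for d in range(13)}
```

Under these assumptions, `errors[d]` is at rounding level for every degree up to 11 and clearly nonzero at degree 12, which is consistent with the abstract's statement that the 50-node grid supports orders up to 5 while orthonormality errors appear at sixth order.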

P13-6 ISO/MPEG-H 3D Audio: SAOC 3D Decoding and Rendering—Adrian Murtaza, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Jürgen Herre, International Audio Laboratories Erlangen - Erlangen, Germany; Fraunhofer IIS - Erlangen, Germany; Jouni Paulus, Fraunhofer IIS - Erlangen, Germany; International Audio Laboratories Erlangen - Erlangen, Germany; Leon Terentiv, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Harald Fuchs, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Sascha Disch, Fraunhofer IIS, Erlangen - Erlangen, Germany
The ISO/MPEG standardization group recently finalized the MPEG-H 3D Audio standard for the universal carriage of encoded 3D sound from channel-based, object-based, and HOA-based input. To achieve efficient low-bitrate coding of a high number of channels and objects, an advanced version of the well-known MPEG-D Spatial Audio Object Coding (SAOC) has been developed under the name SAOC 3D. The new SAOC 3D system supports direct reproduction to any output format, from 22.2 and beyond down to 5.1 and stereo. This paper describes the SAOC 3D technology as part of the MPEG-H 3D Audio (phase one) International Standard and provides an overview of its features, capabilities, and performance.
Convention Paper 9434

P13-7 Auditory Distance Rendering Using a Standard 5.1 Loudspeaker Layout—Mikko-Ville Laitinen, Aalto University - Espoo, Finland; Andreas Walther, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Jan Plogsties, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Ville Pulkki, Aalto University - Espoo, Finland
Human hearing is known to be sensitive to the distances of sound sources. However, spatial-sound rendering systems typically do not allow control of the distance of the auditory objects. This paper proposes a distance-rendering method that uses standard 5.1 loudspeaker layouts. The proposed method applies an input signal to multiple loudspeakers and controls the gains and the coherence of the loudspeaker signals. In addition, the method is combined with amplitude panning, thus allowing continuous control of both the distance and the direction of the auditory objects. Based on listening tests, the proposed method was found to provide the ability to realistically manipulate the perception of both direction and distance.
Convention Paper 9435
