AES Munich 2009 Friday, May 8, 16:30 — 18:30
Paper Session P14
P14 - Multichannel Coding
Chair: Ville Pulkki
P14-1 Further EBU Tests of Multichannel Audio Codecs—David Marston, BBC R&D - Tadworth, Surrey, UK; Franc Kozamernik, EBU - Geneva, Switzerland; Gerhard Stoll, Gerhard Spikofski, IRT - Munich, Germany
The European Broadcasting Union technical group D/MAE has been assessing the quality of multichannel audio codecs in a series of subjective tests. The two most recent tests and their results are described in this paper. The first set of tests covered 5.1 multichannel audio emission codecs at bit-rates ranging from 128 kbit/s to 448 kbit/s. The second set covered cascaded contribution codecs followed by the most prominent emission codecs. The codecs under test include offerings from Dolby, DTS, MPEG, Apt, and Linear Acoustics. The conclusions observe that while high quality is achievable at lower bit-rates, some caveats remain. The results from cascading codecs show that the emission codec is usually the quality bottleneck.
Convention Paper 7730
P14-2 Spatial Parameter Decision by Least Squared Error in Parametric Stereo Coding and MPEG Surround—Chi-Min Liu, Han-Wen Hsu, Yung-Hsuan Kao, Wen-Chieh Lee, National Chiao Tung University - Hsinchu, Taiwan
Parametric stereo coding (PS) and MPEG Surround (MPS) reconstruct stereo or multichannel signals from down-mixed signals plus a few spatial parameters. In extracting spatial parameters, the first issue is deciding the time-frequency (T-F) tiling, which controls the resolution of the reconstructed spatial scene and largely determines the number of bits consumed. In addition, according to the standard syntax, the decoder reconstructs the up-mixing matrices for time slots that do not lie on time borders by interpolation. The second issue is therefore deciding the transmitted parameter values on the time borders so that the reconstruction error of the matrices is minimized. For both PS and MPS, this paper proposes a generic dynamic programming method, based on a least-squared-error criterion, that resolves both issues under the trade-off between audio quality and a limited bit budget.
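The border-value decision described above can be illustrated with a minimal least-squares sketch, assuming the decoder linearly interpolates a single scalar parameter between the two borders of a parameter frame (a simplified model, not the exact MPS syntax; the per-slot targets `g` are hypothetical):

```python
def border_values_ls(g):
    """Return the two border values (a, b) minimizing
    sum_t (a*(1 - w_t) + b*w_t - g[t])^2 with w_t = t/(N-1),
    i.e., the least-squares fit when slots between the borders
    are reconstructed by linear interpolation (assumed model)."""
    n = len(g)
    w = [t / (n - 1) for t in range(n)]
    # Normal equations for the 2-parameter linear model.
    s11 = sum((1 - wi) ** 2 for wi in w)
    s12 = sum((1 - wi) * wi for wi in w)
    s22 = sum(wi * wi for wi in w)
    r1 = sum((1 - wi) * gi for wi, gi in zip(w, g))
    r2 = sum(wi * gi for wi, gi in zip(w, g))
    det = s11 * s22 - s12 * s12
    a = (r1 * s22 - r2 * s12) / det
    b = (s11 * r2 - s12 * r1) / det
    return a, b

# Targets that already lie on a line are matched exactly.
a, b = border_values_ls([0.0, 0.5, 1.0])
```

In the paper's setting such a per-segment error would be one cost term inside the dynamic program that also chooses the T-F tiling.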
Convention Paper 7731
P14-3 The Potential of High Performance Computing in Audio Engineering—David Moore, Jonathan Wakefield, University of Huddersfield - West Yorkshire, UK
High Performance Computing (HPC) resources are fast becoming more readily available. HPC hardware now exists for use in conjunction with standard desktop computers. This paper looks at what impact this could have on the audio engineering industry. Several potential applications of HPC within audio engineering research are discussed. A case study is also presented that highlights the benefits of using the Single Instruction, Multiple Data (SIMD) architecture when employing a search algorithm to produce surround sound decoders for the standard 5-speaker surround sound layout.
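The SIMD benefit mentioned in the case study can be sketched with a small data-parallel example: scoring an entire population of candidate decoder coefficient sets in one vectorized operation rather than looping over them. The fitness function below is a made-up placeholder, not the paper's objective:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-channel target gains for a 5-speaker layout.
target = np.array([1.0, 0.7, 0.7, 0.5, 0.5])
# A population of candidate coefficient sets, one row per candidate.
candidates = rng.uniform(0.0, 1.0, size=(10000, 5))

# One vectorized expression scores all candidates at once (SIMD-style
# data parallelism), instead of a Python loop over each candidate.
fitness = np.sum((candidates - target) ** 2, axis=1)
best = candidates[np.argmin(fitness)]
```

A search algorithm of the kind the paper describes would iterate this evaluate-and-refine step; the point of the sketch is that the per-candidate work maps naturally onto SIMD hardware.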
Convention Paper 7732
P14-4 Efficient Methods for High Quality Merging of Spatial Audio Streams in Directional Audio Coding—Giovanni Del Galdo, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Ville Pulkki, Helsinki University of Technology - Espoo, Finland; Fabian Kuech, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Mikko-Ville Laitinen, Helsinki University of Technology - Espoo, Finland; Richard Schultz-Amling, Markus Kallinger, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany
Directional Audio Coding (DirAC) is an efficient technique for capturing and reproducing spatial sound. The analysis step outputs a mono DirAC stream comprising an omnidirectional microphone pressure signal and side information, namely the direction of arrival and the diffuseness of the sound field, expressed in the time-frequency domain. This paper proposes efficient methods for merging multiple mono DirAC streams to allow joint playback at the reproduction side. The problem of merging two or more streams arises in applications such as immersive spatial audio teleconferencing, virtual reality, and online gaming. Compared to a trivial direct merging of the decoder outputs, the proposed methods are more efficient because they do not require the synthesis step. A further benefit is that the loudspeaker setup at the reproduction side need not be known in advance. Simulations and listening tests confirm that the proposed methods introduce no audible artifacts and are practically indistinguishable from ideal merging.
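The idea of merging in the parameter domain, without running the synthesis step, can be sketched as follows for a single time-frequency tile. The estimators here are illustrative assumptions only, not the paper's exact formulas:

```python
import math

def merge_dirac_tiles(p1, e1, az1, psi1, p2, e2, az2, psi2):
    """Toy sketch of merging two mono DirAC streams within one
    time-frequency tile, in the parameter domain (no synthesis step).
    p*: pressure samples, e*: tile energies, az*: directions of
    arrival in radians, psi*: diffuseness in [0, 1]. Assumed
    estimators, not the paper's."""
    p = p1 + p2                                   # merged pressure signal
    # Weight each direction by its non-diffuse (directional) energy.
    d1, d2 = e1 * (1.0 - psi1), e2 * (1.0 - psi2)
    x = d1 * math.cos(az1) + d2 * math.cos(az2)
    y = d1 * math.sin(az1) + d2 * math.sin(az2)
    az = math.atan2(y, x)                         # merged direction of arrival
    psi = (e1 * psi1 + e2 * psi2) / (e1 + e2)     # energy-weighted diffuseness
    return p, az, psi
```

Because only the pressure signals and side information are combined, the result is again a mono DirAC stream, which is why the loudspeaker setup need not be known at merge time.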
Convention Paper 7733