AES San Francisco 2008
Paper Session P1

P1 - Audio Coding

Thursday, October 2, 9:00 am — 12:30 pm
Chair: Marina Bosi, Stanford University - Stanford, CA, USA

P1-1 A Parametric Instrument Codec for Very Low Bit Rates—Mirko Arnold, Gerald Schuller, Fraunhofer Institute for Digital Media Technology - Ilmenau, Germany
A technique for the compression of guitar signals is presented that utilizes a simple model of the guitar. The goal for the codec is to obtain acceptable quality at significantly lower bit rates compared to universal audio codecs. This instrument codec achieves its data compression by transmitting an excitation function and model parameters to the receiver instead of the waveform. The parameters are extracted from the signal using weighted least squares approximation in the frequency domain. For evaluation, a listening test has been conducted and the results are presented. They show that this compression technique provides a quality level comparable to recent universal audio codecs. At this stage, however, the application is limited to very simple guitar melody lines. [This paper is being presented by Gerald Schuller.]
Convention Paper 7501

P1-2 Stereo ACC Real-Time Audio Communication—Anibal Ferreira, University of Porto - Porto, Portugal, ATC Labs, Chatham, NJ, USA; Filipe Abreu, SEEGNAL Research - Portugal; Deepen Sinha, ATC Labs - Chatham, NJ, USA
Audio Communication Coder (ACC) is a codec that has been optimized for monophonic encoding of mixed speech/audio material while minimizing codec delay and improving intrinsic error robustness. In this paper we describe two major recent algorithmic improvements to ACC: on-the-fly bit rate switching and coding of stereo. A combination of source, parametric, and perceptual coding techniques allows very graceful switching between different bit rates with minimal impact on the subjective quality. A real-time GUI demonstration platform is available that illustrates ACC operation from 16 kbit/s mono to 256 kbit/s stereo. A real-time two-way stereo communication platform over Bluetooth has been implemented that illustrates the operational flexibility of ACC and its robustness in error-prone environments.
Convention Paper 7502

P1-3 MPEG-4 Enhanced Low Delay AAC—A New Standard for High Quality Communication—Markus Schnell, Markus Schmidt, Manuel Jander, Tobias Albert, Ralf Geiger, Fraunhofer IIS - Erlangen, Germany; Vesa Ruoppila, Per Ekstrand, Dolby Stockholm/Sweden, Nuremberg/Germany; Bernhard Grill, Fraunhofer IIS - Erlangen, Germany
The MPEG Audio standardization group has recently concluded the standardization process for the MPEG-4 ER Enhanced Low Delay AAC (AAC-ELD) codec. This codec is a new member of the MPEG Advanced Audio Coding family. It represents the efficient combination of the AAC Low Delay codec and the Spectral Band Replication (SBR) technique known from HE-AAC. This paper provides a complete overview of the underlying technology, presents points of operation as well as applications, and discusses MPEG verification test results.
Convention Paper 7503

P1-4 Efficient Detection of Exact Redundancies in Audio Signals—José R. Zapata G., Universidad Pontificia Bolivariana - Medellín, Antioquia, Colombia; Ricardo A. Garcia, Kurzweil Music Systems - Waltham, MA, USA
An efficient method to identify bitwise identical, long-time redundant segments in audio signals is presented. It uses audio segmentation with simple time-domain features to identify long-term candidates for similar segments, and low-level, sample-accurate metrics for the final matching. Applications in compression (lossy and lossless) of music signals (monophonic and multichannel) are discussed.
Convention Paper 7504

P1-5 An Improved Distortion Measure for Audio Coding and a Corresponding Two-Layered Trellis Approach for its Optimization—Vinay Melkote, Kenneth Rose, University of California - Santa Barbara, CA, USA
The efficacy of rate-distortion optimization in audio coding is constrained by the quality of the distortion measure. The proposed approach is motivated by the observation that the Noise-to-Mask Ratio (NMR) measure, as it is widely used, is only well adapted to evaluate relative distortion of audio bands of equal width on the Bark scale. We propose a modification of the distortion measure to explicitly account for Bark bandwidth differences across audio coding bands. Substantial subjective gains are observed when this new measure is utilized instead of NMR in the Two Loop Search, for quantization and coding parameters of scalefactor bands in an AAC encoder. Comprehensive optimization of the new measure, over the entire audio file, is then performed using a two-layered trellis approach, and yields nearly artifact-free audio even at low bit-rates.
Convention Paper 7505

P1-6 Spatial Audio Scene Coding—Michael M. Goodwin, Jean-Marc Jot, Creative Advanced Technology Center - Scotts Valley, CA, USA
This paper provides an overview of a framework for generalized multichannel audio processing. In this Spatial Audio Scene Coding (SASC) framework, the central idea is to represent an input audio scene in a way that is independent of any assumed or intended reproduction format. This format-agnostic parameterization enables optimal reproduction over any given playback system as well as flexible scene modification. The signal analysis and synthesis tools needed for SASC are described, including a presentation of new approaches for multichannel primary-ambient decomposition. Applications of SASC to spatial audio coding, upmix, phase-amplitude matrix decoding, multichannel format conversion, and binaural reproduction are discussed.
Convention Paper 7507

P1-7 Microphone Front-Ends for Spatial Audio Coders—Christof Faller, Illusonic LLC - Lausanne, Switzerland
Spatial audio coders, such as MPEG Surround, have enabled low bit-rate and stereo-backwards-compatible coding of multichannel surround audio. Directional audio coding (DirAC) can be viewed as spatial audio coding designed around specific microphone front-ends. DirAC is based on B-format spatial sound analysis and has no direct stereo backwards compatibility. We present a number of two-capsule-based, stereo-compatible microphone front-ends and corresponding spatial audio encoder modifications that enable the use of spatial audio coders to directly capture and code surround sound.
Convention Paper 7508
