AES Barcelona 2005
Paper Session N - Low Bit Rate Audio Coding, Part 2 (Research)

Last Updated: 20050401, mei

Monday, May 30, 16:30 — 18:30

Chair: Anibal Ferreira, University of Porto/ATC Labs - Porto, Portugal

N-1 Scalable Noise Coder for Parametric Sound Coding
Steven van de Par, Valery Kot, Nicolle van Schijndel, Philips Research - Eindhoven, The Netherlands
In many state-of-the-art parametric audio codecs, the signal model is composed of sinusoids combined with synthetic noise. The sinusoids represent the perceptually most relevant signal part, while the remaining part is represented with noise. This structure poses a problem for developing a scalable sinusoidal coder where certain layers of the bit stream can be dropped to lower the bit rate. When dropping a layer containing a significant portion of sinusoids, the accompanying synthetic noise should be adapted to fit the remaining sinusoidal components. A scheme is proposed here that determines the noise signal at the decoder instead of the encoder side. Therefore, there is no need to send any information about the adaptation of the noise coder in the bit stream. It will be shown that this scheme can also be used to increase efficiency of the sinusoidal parameter encoding.
Convention Paper 6465

N-2 Efficient Coding of Excitation Patterns Combined with a Transform Audio Coder
Oliver Niemeyer, Bernd Edler, University of Hannover - Hannover, Germany
An efficient encoding of excitation patterns designed for bit rates between 4 and 10 kbit/s is presented. It is based on a two-dimensional transform of the excitation patterns in the frequency and time directions, followed by a bit plane encoding of the resulting transform coefficients. The bit plane encoding ensures that all significant coefficients are captured, for excitation patterns resulting from both more tonal and transient-like audio signals. In this way, coded excitation patterns can be used in a scalable noise coder and can also substitute for the scale factors that usually control the quantization in subband/transform audio coders. A subband/transform audio coding scheme that combines these two applications in one system is presented.
Convention Paper 6466

N-3 A Fractal Self-Similarity Model for the Spectral Representation of Audio Signals
Deepen Sinha, ATC Labs - Chatham, NJ, USA; Anibal Ferreira, ATC Labs - Chatham, NJ, USA, and University of Porto, Porto, Portugal; Deep Sen, ATC Labs - Chatham, NJ, USA, and University of New South Wales, Sydney, Australia
In the application of conventional audio compression algorithms to low bit-rate audio coding, one is faced with an unsatisfactory tradeoff between coarser quantization and audio bandwidth reduction. Frequency Extension has therefore emerged as an important tool for the satisfactory performance of low bit-rate audio codecs. In this paper we describe one of a newer class of Frequency Extension techniques that are applied directly to the high frequency resolution representation of the signal (e.g., MDCT). This particular technique is based on a Fractal Self-Similarity Model (FSSM) for the short-term frequency representation of the signal. The FSSM, which may include multiple dilation and translation terms, has been found to be effective for a wide variety of speech and music signals and provides a compact description of long-term correlation that may exist in the frequency domain. The high frequency resolution of the MDCT aids in accurate parameter estimation for the model, which in turn has shown promise as a Frequency Extension tool that offers a detailed and natural-sounding quality at low bit rates. The structure of the FSSM, issues related to parameter estimation, and its application to audio coding for bit rates of 8 to 48 kbps are discussed. Audio demos are available at
Convention Paper 6467

N-4 Improved Quantization and Lossless Coding for Subband Audio Coding
Nikolaus Meine, Bernd Edler, University of Hannover - Hannover, Germany
A source coding algorithm based on the classic Markov model is presented that uses vector quantization and arithmetic coding in conjunction with a dynamically adapted context of previously coded vector indices. The core of this algorithm is the numerically optimized mapping from a large number of source states to a small number of different code tables. This enables its application to audio coding, where it provides higher efficiency than the quantization and lossless coding used in MPEG-AAC.
Convention Paper 6468

©2005 Audio Engineering Society, Inc.