AES Conventions and Conferences

Session C Saturday, May 12 13:30 - 18:00 hr Room B

Low-Bit Rate Audio Coding

Chair: Marina Bosi, MPEG LA, Denver, CO, USA

13:30 hr C-1
Processor-Efficient Implementation of a High Quality MPEG-2 AAC Encoder
Toshiyuki Nomura & Yuichiro Takamizawa
NEC Corporation, Kawasaki, Japan

This paper describes a processor-efficient implementation of a high quality MPEG-2 AAC encoder employing fast psycho-acoustic analysis, efficient encoding of side information, and SIMD instructions. A psycho-acoustic analysis in the MDCT domain reduces computational costs. Smoothing of scale factors and optimized selection of Huffman tables are introduced to efficiently encode the side information. SIMD instructions are heavily used in the MDCT and quantization processes to improve the encoding speed. Seven-grade comparison MOS test results show that the AAC encoder at 96 kbps/stereo achieves sound quality equivalent to that of MP3 at 128 kbps/stereo. The encoder works 13 times faster than real-time for stereo encoding on an 800 MHz Pentium III processor.
Paper 5294

14:00 hr C-2
A Multichannel Audio Coding Algorithm for Inter-Channel Redundancy Removal
Ye Wang (1), Leonid Yaroslavsky (2), Miikka Vilermo (1) & Mauri Vaananen (1)
(1) Nokia Research Center, Tampere, Finland
(2) Tel Aviv University, Ramat Aviv, Israel

This paper presents a novel lossless multichannel audio coding algorithm to remove inter-channel redundancy. We employ an Integer-to-Integer Discrete Cosine Transform (INT-DCT) to perform inter-channel decorrelation after quantization of the Modified Discrete Cosine Transform (MDCT) coefficients of individual channels. Compared with a Karhunen-Loeve Transform (KLT) based approach, our new method has three major advantages while offering a similar decorrelation capability: 1) it avoids spreading quantization noise to other channels; 2) it is computationally simple; 3) it uses less overhead information (a quantized covariance matrix or eigenvector is not needed in our algorithm).
Paper 5295
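
As an illustration of the integer-to-integer idea mentioned in the abstract above, the following is a minimal sketch of a reversible two-channel rotation built from three lifting steps with rounding. It is not the authors' INT-DCT; the function names (`int_rotate`, `int_rotate_inverse`) and the choice of a 45-degree rotation are assumptions made only for this example. It shows why such a transform, applied to already-quantized integer coefficients, can be undone exactly and therefore adds no noise of its own.

```python
import math


def _round(v):
    # Deterministic round-half-up; any fixed rounding rule works,
    # as long as forward and inverse use the same one.
    return math.floor(v + 0.5)


def int_rotate(a, b, theta):
    """Integer approximation of a plane rotation by theta (hypothetical helper).

    The rotation matrix is factored into three shear (lifting) steps.
    Each step adds a rounded function of the other channel, so each
    step can be undone exactly by subtracting the same rounded value.
    """
    p = (math.cos(theta) - 1.0) / math.sin(theta)
    u = math.sin(theta)
    a = a + _round(p * b)
    b = b + _round(u * a)
    a = a + _round(p * b)
    return a, b


def int_rotate_inverse(a, b, theta):
    """Exact inverse of int_rotate: the same steps, reversed and subtracted."""
    p = (math.cos(theta) - 1.0) / math.sin(theta)
    u = math.sin(theta)
    a = a - _round(p * b)
    b = b - _round(u * a)
    a = a - _round(p * b)
    return a, b


if __name__ == "__main__":
    theta = math.pi / 4  # assumed angle, chosen for two equally loud channels
    left, right = 100, 100  # two identical (fully correlated) quantized values
    s, d = int_rotate(left, right, theta)
    print("transformed:", s, d)  # energy is concentrated in one output
    print("restored:", int_rotate_inverse(s, d, theta))
```

For two strongly correlated channels, almost all of the energy ends up in one of the two outputs, which is the decorrelation effect the abstract refers to; the round trip returns the original integers exactly.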

14:30 hr C-3
Compander Domain Approach to Scalable AAC
Ashish Aggarwal, Kenneth Rose & Shankar Regunathan
University of California, Santa Barbara, CA, USA

We propose a new approach to achieve efficient scalability in audio coders, and demonstrate its performance using the MPEG-4 Advanced Audio Coder (AAC). In conventional scalable coding, the enhancement layer performs straightforward re-quantization of the base-layer reconstruction error. This coding scheme implicitly discards useful information from the base layer, and does not truly minimize a perceptually meaningful distortion criterion such as the noise-to-mask ratio. We reformulate the problem of scalable coding within a companding framework, and show that re-quantization in the compander's compressed domain achieves, in the asymptotic sense, optimal scalability. Based on this observation, we develop a scalable AAC coder which performs enhancement-layer quantization while exploiting all the information available at that layer. Simulation results of a two-layer scalable coder on the standard test database of audio sampled at 44.1 kHz show that the proposed approach yields substantial savings in bit rate for a given reproduction quality.
Paper 5296

15:00 hr C-4
An Embedding Codec for Multiple Generations Compression Based on MPEG-1 Layer III
Frank Kurth & Andreas Ribbrock
University of Bonn, Bonn, Germany

We propose an MPEG-1 Layer III conforming audio codec for multiple generations (cascaded) compression without loss of perceptual quality. Previous research addressing this topic mainly focused on less complex coding schemes such as MPEG-1 Layer II. In our paper, the techniques proposed in those approaches are extended to comply with Layer III's advanced features such as hybrid filtering, block switching, and bit reservoir based processing. A prototype implementation, together with extensive listening tests, shows the feasibility of perceptually stable cascaded Layer III compression.
Paper 5297

15:30 hr C-5
Audio Coding with a Masking Threshold Adapted Wavelet Packet based on Run-Time Reconfigurable Processor Architecture
Alexey Petrovsky, Belarusian State University, Minsk, Belarus
Alexander Petrovsky, Technical University of Bialystok, Bialystok, Poland

An adaptive wavelet packet (WP) tree derived via dynamic algorithm transforms (DATs) is presented. The DAT defines parameters of the input audio signals (subband entropy) and of the output coded sequences (subband rate) for the given embedded system architecture. A DAT-based pipeline processor implementing the WP tree analysis (encoder) and synthesis (decoder) algorithms on reconfigurable hardware (such as SRAM-FPGA plus distributed arithmetic) is described.
Paper 5298

16:00 hr C-6
Low Delay Audio Coding Incorporating Psycho-Acoustic Information
Manuel Zurera, Francisco Lopez-Ferreras, Damian Martinez Muñoz & Fernando Cruz Roldan
Universidad de Alcala, Madrid, Spain

This paper deals with the design and implementation of a scheme for CD quality audio coding that introduces a delay as low as 6 ms and provides near transparent coding. It is implemented with a uniform filter bank that decomposes the audio signal into thirty-two bands. For such a delay, the filters in the filter bank must have a short impulse response, and so the filters' non-ideal frequency responses should be taken into account. Near transparent coding is achieved at 96 kbps, which is a very good result for such a low delay.
Paper 5299

16:30 hr C-7
Error Protection and Concealment for HILN MPEG-4 Parametric Audio Coding
Heiko Purnhagen, Bernd Edler & Nikolaus Meine
University of Hannover, Hannover, Germany

The HILN (Harmonic and Individual Lines plus Noise) MPEG-4 parametric audio coding tool allows efficient representation of general audio signals at very low bit rates. Therefore possible applications include transmission over IP or wireless channels which are both characterized by specific transmission error models. On the other hand, since parametric audio coding is a relatively new technique compared to transform coding and CELP speech coding, there have been only very limited investigations on HILN's behavior in error prone environments. In this paper we present an analysis of error sensitivities and approaches to error protection and concealment.
Paper 5300

17:00 hr C-8
Avoiding Overlapping in a Time-Varying Wavelet-Packet Based Audio Coder
Manuel Zurera (1), Nicolas Ruiz-Reyes (2), Pedro Vera-Candeas (2) & Francisco Lopez-Ferreras (1)
(1) Universidad de Alcala, Madrid, Spain
(2) Universidad de Jaen, Linares (Jaen), Spain

A new algorithm that avoids overlapping of adjacent frames while still reducing the block effect is presented. The algorithm is based on forward and backward prediction at the frame borders and has been applied successfully to an audio coder based on time-varying wavelet-packet decompositions that uses symmetrical and periodic extension as the method for processing frames in isolation.
Paper 5301

17:30 hr C-9
Realtime Implementations of MPEG-2 and MPEG-4 Natural Audio Coders
Alberto Duenas (1), Begoña Rivas (1), Antonio Pena (2), Rafael Perez (1) & Enrique Alexandre (2)
(1) Prodys, Madrid, Spain
(2) Universidade de Vigo, Vigo, Spain

MPEG natural audio coders, such as MPEG-1/2 Layer III and MPEG-2/4 AAC, require a great amount of calculation, mainly due to the iterative bit allocation processes proposed by the ISO/IEC technical documentation. This complexity makes a real-time implementation of the normative algorithms difficult. To address this difficulty, this paper discusses a set of non-optimal solutions that reduce the computing load and, based on these solutions, presents a real-time implementation of an MPEG-1/2 Layer III coder using a single fixed-point DSP. In addition, techniques to achieve good audio performance and methods for the adaptation of parameters are discussed.
Paper 5302



(C) 2001, Audio Engineering Society, Inc.