AES Conventions and Conferences

Session Q Monday, December 3 2:00 pm-5:00 pm
Chair: Michael J. Smithers, Dolby Laboratories, Inc., San Francisco, CA, USA

2:00 pm

Q-1 Perceptual Audio Coders: What to Listen For (Invited)

Markus Erne, Scopein Research, Aarau, Switzerland and AES Technical Committee on Audio Coding

Low-bit-rate audio coding has become a widely used technology in recent years. Because these coders use sophisticated signal processing techniques that exploit psychoacoustic phenomena, nontransparent coding results in artifacts that sound very different from traditional distortions and that are frequently not at all obvious to the untrained listener. The AES Technical Committee on Audio Coding has therefore started an activity to produce a CD-ROM that presents some of the most common coding artifacts in more detail. The CD-ROM not only explains and comments on each coding artifact separately but also presents audio examples for each artifact, using different degrees of distortion ranging from "subtle" to "obvious".

Convention Paper 5489


2:30 pm

Q-2 Increased Efficiency MPEG-2 AAC Encoding

Michael J. Smithers and Matt C. Fellers, Dolby Laboratories, Inc., San Francisco, CA, USA

Presented are modifications to the MPEG-2 AAC encoder that significantly increase computational efficiency while maintaining high sound quality. These modifications include changes to the perceptual model, block-switching control, pre-estimation of quantizer scale-factors, and changes to the quantizer rate/distortion loop. These changes result in an overall speed-up (when combined with processor-specific optimizations) of approximately 250% compared to the reference low-complexity professional MPEG-2 AAC encoder. Tests show a mean degradation of 0.2 on the ITU-R 5-point audio impairment scale.

Convention Paper 5490


3:00 pm

Q-3 Fine-Grain Scalability in MPEG-4 Audio

Sang-Wook Kim, Sung-Hee Park and Yeon-Bae Kim, Samsung Advanced Institute of Technology, Suwon, Korea

The earlier MPEG-1 and MPEG-2 Audio standards provided a single-bitrate, single-bandwidth tool set, with different configurations of that tool set specified for use in various applications. MPEG-4 provides several bitrate and bandwidth options within a single bitstream. This scalability permits a given bitstream to scale to the requirements of different channels and applications, or to respond to a channel that has dynamic throughput characteristics. Many of the tools specified in MPEG-4 are state-of-the-art tools that provide scalable compression of speech and audio signals. In this paper, we present the fine-grain scalability tool in MPEG-4 Audio.

Convention Paper 5491


3:30 pm

Q-4 Efficient Audio Coding with Fine-Grain Scalability

Chris Dunn, Scala Technology, London, UK

A comparison of audio coder quantisation schemes that offer fine-grain bitrate scalability is made with reference to fixed-rate quantisation. Coding efficiency is assessed in terms of the number of bits allocated to significant transform coefficients, and the average number of significant coefficients coded. A new method of arranging the transform hierarchy for SPIHT zero tree algorithms is shown to result in significantly improved performance relative to previously reported SPIHT implementations. Results for a new quantisation algorithm are presented which suggest low-complexity fine-grain scalable coding is possible with no coding efficiency penalty relative to fixed-rate coding.

Convention Paper 5492


4:00 pm

Q-5 Low-Latency Encoding for Consumer Applications

Michael Truman, John White and Michael Smithers, Dolby Laboratories, Inc., San Francisco, CA, USA

A key requirement of interactive applications is that they respond quickly to user input. This demands that the audio signal processing be performed with minimal delay. Generally, perceptual audio coders are in conflict with this requirement because they process data in long blocks to improve compression performance. This paper describes a real-time, multichannel audio encoder that is designed to minimize delay and is compatible with current consumer home theatre decoding technology.

Convention Paper 5493


4:30 pm

Q-6 A Study of Why Cross-Channel Prediction Is Not Applicable to Perceptual Audio Coding

Shyh-shiaw Kuo and James D. Johnston, AT&T Labs - Research, Florham Park, NJ, USA

We have studied time-domain cross-channel prediction and concluded that it is generally not applicable to perceptual audio coding.

Convention Paper 5494

(C) 2001, Audio Engineering Society, Inc.