v3.1, 20040329, ME
Session G Sunday, May 9 10:00 h - 12:30 h
LOW BIT-RATE AUDIO CODING, PART 1
(focus on standards)
Chair: Jürgen Herre, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany
G-1 MPEG-4 Audio Lossless Coding
Tilman Liebchen1, Yuriy Reznik2, Takehiro Moriya3, Dai Tracy Yang4
1 Technical University of Berlin, Berlin, Germany
2 RealNetworks, Inc., Seattle, WA, USA
3 NTT Human and Information Science Lab, Atsugi, Japan
4 University of Southern California, Los Angeles, CA, USA
Lossless coding will become the latest extension of the MPEG-4 audio standard. The lossless audio codec of the Technical University of Berlin was chosen as the reference model for MPEG-4 Audio Lossless Coding (ALS). The MPEG-4 ALS encoder is based on linear prediction, which enables high compression even with moderate complexity, while the corresponding decoder is straightforward. The paper describes the basic elements of the codec as well as some additional features, gives compression results, and points out envisaged applications.
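As a rough illustration of why linear prediction suits lossless coding, the sketch below predicts each integer sample from its two predecessors and keeps only the integer residual, which the decoder adds back to recover the input exactly. It is not the ALS codec: the fixed second-order predictor and the names `encode_residual` and `decode_residual` are assumptions chosen for brevity, whereas an encoder of the kind described would adapt its predictor to the signal and entropy-code the residual.

```python
import math

def predict(history):
    """Fixed order-2 predictor (an assumption): extrapolate a straight line."""
    prev1 = history[-1] if len(history) >= 1 else 0
    prev2 = history[-2] if len(history) >= 2 else 0
    return 2 * prev1 - prev2

def encode_residual(samples):
    """Return integer prediction residuals for a list of integer PCM samples."""
    residual = []
    for n, x in enumerate(samples):
        residual.append(x - predict(samples[:n]))
    return residual

def decode_residual(residual):
    """Rebuild the original samples by running the same predictor."""
    samples = []
    for e in residual:
        samples.append(e + predict(samples))
    return samples

# Hypothetical test signal: a 16-bit-style sine with a 50-sample period.
pcm = [round(1000 * math.sin(2 * math.pi * n / 50)) for n in range(200)]
residual = encode_residual(pcm)
restored = decode_residual(residual)
```

Because encoder and decoder apply the identical integer predictor, `restored` equals `pcm` sample for sample, while the residual values are much smaller in magnitude than the samples and therefore cheaper to entropy-code.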
G-2 A Low Power SBR Algorithm for the MPEG-4 Audio Standard and Its DSP Implementation
Osamu Shimada1, Toshiyuki Nomura1, Yuichiro Takamizawa1, Masahiro Serizawa1, Naoya Tanaka2, Mineo Tsushima2, Takeshi Norimatsu2, Chong Kok Seng3, Kuah Kim Hann3, Neo Sua Hong3
1 NEC Corporation, Kanagawa, Japan
2 Matsushita Electric Industrial Co., Ltd., Osaka, Japan
3 Panasonic Singapore Laboratories Pte. Ltd., Singapore
This paper proposes a Low Power Spectral Band Replication algorithm (LP-SBR) adopted in the MPEG-4 audio standard. It operates with lower computational complexity than the conventional SBR algorithm, called the High Quality SBR algorithm (HQ-SBR). To reduce complexity, LP-SBR uses real-valued processing instead of the complex-valued processing used in HQ-SBR. To minimize the sound quality degradation caused by this reduction, LP-SBR employs aliasing reduction techniques and a gain compensation technique. Subjective quality test results show no statistically significant difference between LP-SBR and HQ-SBR when they are incorporated into AAC decoders. A complexity comparison of both SBR decoders implemented on 16-bit fixed-point DSPs shows that an AAC decoder with LP-SBR requires 30 percent less computation than one with HQ-SBR.
G-3 MP3 Surround: Efficient and Compatible Coding of Multichannel Audio
Jürgen Herre1, Christof Faller2, Christian Ertel1, Johannes Hilpert1, Andreas Hoelzer1, Claus Spenger1
1 Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany
2 Agere Systems, Allentown, PA, USA
Finalized in 1992, the MP3 compression format has become a synonym for personalized music enjoyment for millions of users. The paper presents a novel extension of this popular format which adds support for the coding of multichannel signals, including the widely used 5.1 surround sound. As a prominent feature of the extended format, complete backward compatibility with existing stereo MP3 decoders is retained, i.e., standard decoders reproduce a full stereo downmix of the multichannel sound image. The paper discusses the underlying advanced technology enabling the representation of multichannel sound at bit rates that are comparable to what is currently used to encode stereo material. Results for subjective sound quality are presented; related activities of the MPEG standardization group are reported.
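To make the idea of a backward-compatible stereo downmix concrete, the sketch below folds five main channels into two. The abstract does not state which downmix the extended format uses; the -3 dB weights for the center and surround channels follow the common ITU-R BS.775 convention and are an assumption here, as are the function name `stereo_downmix` and the omission of the LFE channel.

```python
import math

def stereo_downmix(left, right, center, left_sur, right_sur,
                   gain=math.sqrt(0.5)):
    """Fold five channels into two (assumed BS.775-style weights, no LFE)."""
    out_left = [l + gain * c + gain * s
                for l, c, s in zip(left, center, left_sur)]
    out_right = [r + gain * c + gain * s
                 for r, c, s in zip(right, center, right_sur)]
    return out_left, out_right

# One-sample example: signal only in the front-left and center channels.
lo, ro = stereo_downmix([1.0], [0.0], [1.0], [0.0], [0.0])
```

A legacy stereo decoder would play only such a two-channel signal, while a multichannel decoder would additionally use side information to rebuild the surround image.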
G-4 Reduction of Artifacts in MPEG-AAC with MDCT Spectrum Regularization
Olivier Derrien1, Laurent Daudet2
1 Université de Toulon et du Var, La Garde, France
2 Université Pierre et Marie Curie, Paris, France
In the context of lossy audio coding, the power spectral density of stationary tones can be over/underestimated in some windows due to the time-shift sensitivity of the Modified Discrete Cosine Transform (MDCT), which leads to potentially audible coding artifacts. This paper discusses the advantages of using a nearly time-shift invariant regularized MDCT spectrum for the bit allocation in MPEG-AAC coders. We show how this modification applies to the standard iterative algorithm, as well as to a more efficient model-based framework. Objective and subjective results indicate that the overall quality is significantly improved when rich stationary sounds are encoded at low bit-rates or when the coder operates in a variable bit-rate mode.
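The time-shift sensitivity mentioned above can be observed with a small experiment: a perfectly stationary tone is cut into overlapping frames, each frame is transformed with a sine-windowed MDCT, and the energy of the coefficients is compared across frames. This is only a sketch of the effect, not the regularization proposed in the paper; the frame size, the tone frequency (5.7 bins), and the names `mdct` and `frame_energies` are arbitrary assumptions.

```python
import math

def mdct(frame):
    """Sine-windowed MDCT of a 2N-sample frame; returns N coefficients."""
    two_n = len(frame)
    n_half = two_n // 2
    windowed = [frame[n] * math.sin(math.pi * (n + 0.5) / two_n)
                for n in range(two_n)]
    coeffs = []
    for k in range(n_half):
        acc = 0.0
        for n in range(two_n):
            phase = math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5)
            acc += windowed[n] * math.cos(phase)
        coeffs.append(acc)
    return coeffs

def frame_energies(signal, n_half):
    """Energy of the MDCT coefficients for each 50%-overlapped frame."""
    energies = []
    for start in range(0, len(signal) - 2 * n_half + 1, n_half):
        coeffs = mdct(signal[start:start + 2 * n_half])
        energies.append(sum(c * c for c in coeffs))
    return energies

N = 32  # coefficients per frame (assumed)
tone = [math.cos(math.pi * 5.7 / N * n) for n in range(12 * N)]
energies = frame_energies(tone, N)
spread = max(energies) / min(energies)
```

Although the tone has constant amplitude, `spread` is well above 1, because the energy captured by each frame depends on the phase of the tone relative to the frame start. A bit allocation driven by this raw per-frame spectrum would therefore see the tone as alternately louder and quieter than it is.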
G-5 The Efficient Temporal Noise Shaping Method
Chi-Min Liu, Wen-Chieh Lee, Tzu-Wen Chang, National Chiao Tung University, Hsinchu, Taiwan
Temporal noise shaping (TNS) is defined in MPEG-4 AAC to control pre-echo noise in attack signals. The module, which is especially important for MPEG-4 Low Delay AAC because that coder lacks a window switching mechanism, shapes and controls the spread of quantization noise to improve quality under a bit-rate constraint. However, this paper shows that TNS introduces three artifacts. The first resembles the Gibbs phenomenon: a high noise level at the edge of the attack signal. The second is time-domain aliasing noise, which appears as unusual noise at a distance from the attack time. The third is noise spreading that depends on the TNS filter order. The paper proposes an efficient TNS method that shapes the noise while taking all three artifacts into account, together with an efficient computation for deciding when to activate TNS. Both subjective and objective tests illustrate the improvement over existing TNS methods.
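TNS works by running linear prediction along the frequency axis of the spectral coefficients, and a conventional way to decide whether to activate it is to check the prediction gain. The sketch below computes that gain with the autocorrelation method and the Levinson-Durbin recursion. It illustrates the conventional criterion only, not the efficient activation method of the paper; the filter order, the threshold value, and all function names are assumptions.

```python
import math

def autocorrelation(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag."""
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag))
            for lag in range(max_lag + 1)]

def levinson_error(r, order):
    """Residual energy after optimal linear prediction of the given order."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return err

def prediction_gain(spectrum, order=8):
    """Ratio of spectrum energy to prediction residual energy."""
    r = autocorrelation(spectrum, order)
    if r[0] == 0.0:
        return 1.0
    err = levinson_error(r, order)
    return r[0] / err if err > 0.0 else float("inf")

def tns_active(spectrum, threshold=1.4):
    """Activate TNS when the gain exceeds an assumed threshold."""
    return prediction_gain(spectrum) > threshold

# A sharp attack in time gives a spectrum that oscillates along frequency.
attack_spectrum = [math.cos(0.3 * k) for k in range(64)]
# A single spectral line corresponds to a stationary tone.
tone_spectrum = [1.0 if k == 20 else 0.0 for k in range(64)]
```

With these inputs, `tns_active(attack_spectrum)` is true and `tns_active(tone_spectrum)` is false: the oscillating spectrum of the attack is highly predictable along frequency, whereas the single spectral line offers no prediction gain.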