AES Amsterdam 2008
P26 - Low Bit-Rate Audio Coding
Poster Session P26
Tuesday, May 20, 11:30 — 13:00
P26-1 Autoregressive Modeling of Hilbert Envelopes for Wide-Band Audio Coding—Sriram Ganapathy, IDIAP Research Institute - Martigny, Switzerland, and Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland; Petr Motlicek, IDIAP Research Institute - Martigny, Switzerland; Hynek Hermansky, IDIAP Research Institute - Martigny, Switzerland, and Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland; Harinath Garudadri, Qualcomm Inc. - San Diego, CA, USA
Frequency Domain Linear Prediction (FDLP) is a technique for approximating the temporal envelopes of a signal using autoregressive models. In this paper we propose a wide-band audio coding system exploiting FDLP. Specifically, FDLP is applied to critically sampled sub-bands to model the Hilbert envelopes. The residual of the linear prediction forms the Hilbert carrier, which is transmitted along with the envelope parameters. This process is reversed at the decoder to reconstruct the signal. In the objective and subjective quality evaluations, the FDLP-based audio codec at 66 kbps provides competitive results compared to state-of-the-art codecs at similar bit-rates.
Convention Paper 7481
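As a rough illustration of the FDLP idea described in P26-1 (not the authors' codec), the sketch below applies linear prediction to the DCT of a full-band signal and evaluates the resulting autoregressive model along the time axis, which approximates the squared Hilbert envelope. The function names, the unscaled DCT-II, the autocorrelation method, and the model order of 2 are all assumptions made for this example; the paper works on critically sampled sub-bands, and practical systems would likely use higher model orders.

```python
import cmath
import math


def dct_ii(x):
    """Unscaled DCT-II, computed directly (O(N^2)); fine for short demos."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * (2 * t + 1) * k / (2 * n)) for t in range(n))
        for k in range(n)
    ]


def levinson(r, order):
    """Levinson-Durbin recursion: returns A(z) coefficients and error power."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def fdlp_envelope(x, order):
    """Hypothetical helper: AR approximation of the squared temporal envelope."""
    n = len(x)
    y = dct_ii(x)
    # Autocorrelation of the DCT sequence (prediction runs along frequency).
    r = [sum(y[k] * y[k + lag] for k in range(n - lag)) for lag in range(order + 1)]
    a, err = levinson(r, order)
    env = []
    for t in range(n):
        # The model "spectrum" is read along time: angle w maps to sample t.
        w = math.pi * (2 * t + 1) / (2 * n)
        a_w = sum(a[j] * cmath.exp(-1j * w * j) for j in range(order + 1))
        env.append(err / abs(a_w) ** 2)
    return env


# Demo: a single click at sample 20 should give an envelope peaking nearby.
signal = [0.0] * 64
signal[20] = 1.0
envelope = fdlp_envelope(signal, order=2)
peak = max(range(len(envelope)), key=envelope.__getitem__)
print("envelope peak near sample", peak)
```

The key design point is that prediction is run over frequency-domain coefficients rather than time samples, so the all-pole fit describes where energy sits in time instead of in frequency.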
P26-2 On Locality of Spectral Oriented Tree for Bit-Plane Based Low-Bit Rate Audio Coding—Yu-Lin Wang, Alvin W. Y. Su, National Cheng-Kung University - Tainan, Taiwan
For Spectral Oriented Tree (SOT) based coders such as SPIHT and CEIHT, locality usually refers to the locations of coefficients within an SOT and their effect on coding efficiency. How to construct an SOT to achieve better locality is therefore very important. This paper examines the locality of different ordering techniques for low bit-rate audio coding. We used several coefficient ordering schemes to construct SOTs from the same set of MDCT coefficients and observed their effects. Both objective and subjective results are presented.
Convention Paper 7482
P26-3 Perceptual Matching Pursuit for Audio Coding—Hossein Najaf-Zadeh, Ramin Pichevar, Hassan Lahdili, Louis Thibault, Communications Research Centre Canada - Ottawa, Ontario, Canada
This paper introduces a Perceptual Matching Pursuit (PMP) algorithm for audio coding. A masking model has been developed and integrated into the matching pursuit algorithm to account for the characteristics of the hearing system. As a result, only an audible kernel is extracted at each iteration. Moreover, unlike the standard matching pursuit algorithm, PMP stops decomposing an audio signal once there is no audible part left in the residual. We have used ITU-R PEAQ to compare audio materials decomposed by PMP and by matching pursuit. Objective scores for PMP increase by up to 1 unit. A semi-formal listening test has verified the objective scores and shown the perceptual superiority of PMP over the matching pursuit algorithm.
Convention Paper 7483
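The stopping behavior described in P26-3 can be sketched with a toy matching pursuit loop. This is only an illustration under strong assumptions: the flat `audibility_threshold` stands in for the paper's masking model (which is not specified here), the dictionary is a tiny set of unit-norm atoms, and all names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, atoms, audibility_threshold, max_iters):
    """Greedy decomposition over unit-norm atoms.

    Stops early once the best remaining atom falls below the threshold,
    a crude placeholder for "no audible part left in the residual".
    Setting the threshold to 0.0 gives plain matching pursuit.
    """
    residual = list(signal)
    picks = []
    for _ in range(max_iters):
        # Select the atom most correlated with the current residual.
        coefs = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(coefs[i]))
        if abs(coefs[best]) < audibility_threshold:
            break  # placeholder for the perceptual stopping rule
        picks.append((best, coefs[best]))
        residual = [r - coefs[best] * a for r, a in zip(residual, atoms[best])]
    return picks, residual


# Toy dictionary: the standard basis of R^4 (unit-norm, orthogonal atoms).
atoms = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
signal = [3.0, 0.01, 0.0, 0.0]

plain, _ = matching_pursuit(signal, atoms, audibility_threshold=0.0, max_iters=2)
perceptual, _ = matching_pursuit(signal, atoms, audibility_threshold=0.1, max_iters=2)
print(len(plain), "atoms without threshold;", len(perceptual), "with threshold")
```

In this toy case the weak second component is extracted by the plain loop but skipped once a threshold is in place, which mirrors the abstract's point that fewer, audible-only kernels are kept.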
P26-4 A Unifying Approach to Transform and Sinusoidal Coding of Audio—Maciej Bartkowiak, Poznan University of Technology - Poznan, Poland
The paper describes a new scenario for low bit rate audio compression that combines two classical techniques: transform coding and sinusoidal coding into a united framework. The main idea is to adaptively decompose the audio signal into subbands whose central frequencies follow continuously the local instantaneous frequencies of certain signal components (formants or individual harmonic partials). The content in each subband is encoded in the baseband after frequency shift toward DC. The technique may be considered either as modified transform coding, i.e., coding along instantaneous frequencies or as extended sinusoidal coding, i.e., modeling with partial envelopes that are represented by transform coefficients. In other words, it is a hybrid scheme offering a continuous operating mode between purely transform and purely sinusoidal compression.
Convention Paper 7484
P26-5 Low Bit Rate Audio Coding for Digital Wireless Systems—Stephen Wray, APT Ltd. - Belfast, Northern Ireland, UK
With the transition from analog to digital television, the available spectrum for wireless microphones, in-ear monitors, and other wireless devices could be under threat. Spectrum is a valuable commodity, and it is the responsibility of governments to manage it appropriately. Much has been made recently of the Spectrum Squeeze on both sides of the Atlantic, with discussions on White Spaces and the Digital Dividend. With bandwidth at such a premium, the audio industry has been forced to consider new technologies that make efficient use of spectrum without sacrificing quality or service. Within this context, we need a revolutionary new approach to maximizing bandwidth efficiency. The author will present a novel coding solution that addresses the prevailing technical limitations and industry requirements for wireless applications.
Convention Paper 7485
P26-6 Bit Allocation for Linear Prediction Coefficients with Application to Lossless Audio Compression—Florin Ghido, Ioan Tabus, Tampere University of Technology - Tampere, Finland
We propose a novel technique of using bit allocation for linear prediction coefficients in asymmetric lossless audio compression. We show how to determine the optimal bit allocation using a new closed-form formula for the excess error from quantization, and describe a recently introduced algorithm (Optimization-Quantization Least Squares), which computes the optimal quantized prediction coefficients for the allocation. The proposed method, implemented as a modified asymmetrical OptimFROG, obtains small (but consistent) signal-dependent compression improvements with virtually no decoder complexity increase (on an 847 MB audio corpus, up to 0.27%, on average 0.06%). Compared to MPEG-4 ALS, it obtained 0.38% better compression, while at the same time being approximately 5 times faster at decoding.
Convention Paper 7486
P26-7 Design of Framing in MPEG Surround Based on Dynamic Programming Algorithm—Chi-Min Liu, Chung-Han Yang, Han-Wen Hsu, Wen-Chieh Lee, National Chiao Tung University - Hsinchu, Taiwan
MPEG Surround (MPS), defined by ISO/IEC, is an audio coding standard for multichannel signals based on a down-mixed signal and spatial parameters. In MPEG Surround, the time-frequency tiles determine the units that share the same spatial parameters among the multichannel signals. Hence, the choice of tiles is the critical module determining the resulting quality and bit count. However, the large number of combinations of time regions, frequency bands, and multichannel signal statistics spans a huge search space for choosing the tiles. Our previous work at AES 119 proposed a dynamic programming method to efficiently decide the time-frequency units for parametric stereo coding in HE-AAC. This paper extends the dynamic programming method to MPS coding.
Convention Paper 7487
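To give a feel for why dynamic programming helps with the framing problem in P26-7, the sketch below solves a much simplified, one-dimensional version: splitting a sequence of per-slot parameter values into contiguous segments that each share one value. The cost model (squared deviation from the segment mean plus a fixed per-segment `penalty` standing in for side-information bits) and all names are assumptions for illustration, not the cost function or tile structure used in the paper.

```python
def segment_cost(values, start, end):
    """Distortion if slots [start, end) share one parameter (their mean)."""
    seg = values[start:end]
    mean = sum(seg) / len(seg)
    return sum((v - mean) ** 2 for v in seg)


def best_segmentation(values, penalty):
    """Minimize total distortion + penalty per segment via dynamic programming.

    best[end] holds the lowest cost of covering slots [0, end); each state
    is built from every earlier boundary, so the search is O(n^2) states
    instead of enumerating all 2^(n-1) possible framings.
    """
    n = len(values)
    best = [0.0] + [float("inf")] * n
    prev = [0] * (n + 1)
    for end in range(1, n + 1):
        for start in range(end):
            cost = best[start] + segment_cost(values, start, end) + penalty
            if cost < best[end]:
                best[end] = cost
                prev[end] = start
    # Walk the back-pointers to recover the chosen boundaries.
    segments = []
    end = n
    while end > 0:
        segments.append((prev[end], end))
        end = prev[end]
    segments.reverse()
    return segments, best[n]


# Toy example: a parameter that jumps once halfway through the frame.
segments, total = best_segmentation([1.0, 1.0, 1.0, 5.0, 5.0, 5.0], penalty=0.5)
print(segments, total)
```

Raising `penalty` in this sketch favors fewer, longer segments (fewer bits, more distortion), which is the same quality-versus-bits trade-off the abstract attributes to the tile decision.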
P26-8 New Enhancements to the Audio Bandwidth Extension Toolkit (ABET)—Harinarayanan E. V., Raghuram Annadana, ATC Labs - Noida, India; Deepen Sinha, ATC Labs - Chatham, NJ, USA; Anibal Ferreira, University of Porto - Porto, Portugal, and ATC Labs, Chatham, NJ, USA
Audio bandwidth extension has emerged as a key low bit rate coding tool. Continuing our ongoing research on audio bandwidth extension, this paper presents new enhancements to the Audio Bandwidth Extension Toolkit (ABET). ABET consists of three primary tools: Accurate Spectral Replacement (ASR), Fractal Self Similarity Model (FSSM), and Multi-band Temporal Envelope Amplitude Coding (MBTAC). Additionally, we have introduced a blind bandwidth extension mode into ABET. We discuss several new ideas and improvements to ABET. Specifically, enhancements to the blind bandwidth extension architecture that allow it to work with signals with only 3.5–4.0 kHz audio bandwidth are described. We also elaborate on a new tool for efficient coding of the time-frequency envelope that cuts the overhead by 0.75–1.0 kbps/channel. Finally, we address a practical issue, computational complexity, and describe a new low decoder complexity mode of ABET.
Convention Paper 7488
Last Updated: 20080612, tendeloo