AES New York 2011
Paper Session P16

P16 - Low Bit-Rate Coding—Part 2


Friday, October 21, 4:30 pm — 6:30 pm (Room: 1E07)

Chair:
Christof Faller, Illusonic LLC - St-Sulpice, Switzerland

P16-1 A Subband Analysis and Coding Method for Downmixing Based Multichannel Audio Codec
Shi Dong, Ruimin Hu, Weiping Tu, Xiang Zheng, Wuhan University - Wuhan, Hubei, China
In the current downmixing-based multichannel coding standard, the downmixing process causes a “tone leakage” problem by mixing different channels into one channel. This paper proposes a novel multichannel analysis method that reduces tone leakage using additional side information. The basic idea is to find the subbands with the largest spectral difference and code their spectral envelope information. By analyzing the decoded signals, leakage tones are identified and attenuated and the original tones are reconstructed, while the original interchannel level difference (ICLD) of the subbands is left unchanged. Results show that the method improves subjective quality compared with the HE-AAC (v2) codec at a slightly higher bit rate.
Convention Paper 8530
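
The ICLD mentioned in the abstract for P16-1 is commonly defined as the ratio of the subband powers of two channels, expressed in dB. The minimal Python sketch below illustrates only that measure under this common definition; it is not the authors' analysis method, and the function name `icld_db` and the sample coefficients are hypothetical.

```python
import math

def icld_db(band_a, band_b, eps=1e-12):
    """Level difference in dB between two channels for one subband.

    band_a, band_b: spectral coefficients of the same subband in each channel.
    eps guards against division by zero and log of zero for silent bands.
    """
    power_a = sum(x * x for x in band_a)
    power_b = sum(x * x for x in band_b)
    return 10.0 * math.log10((power_a + eps) / (power_b + eps))

# Hypothetical subband coefficients: channel 2 is channel 1 at half amplitude,
# so the power ratio is 4 and the ICLD is 10*log10(4), roughly 6 dB.
left = [0.2, -0.4, 0.4]
right = [0.1, -0.2, 0.2]
print(round(icld_db(left, right), 2))
```

A codec that preserves ICLD keeps this per-subband value the same before and after processing, even when individual tones within the band are attenuated or reconstructed.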

P16-2 Characterizing the Perceptual Effects Introduced by Low Bit Rate Spatial Audio Codecs
Paulo Marins, Universidade de Brasília - Brasília, Brazil
This paper describes a series of experiments carried out to characterize the perceptual effects introduced by low bit rate spatial audio codecs. An initial study investigated the contribution of selected attributes to the basic audio quality of low bit rate spatial codecs. Two further experiments were performed to identify the perceptually salient dimensions, or independent perceptual attributes, related to the artifacts introduced by low bit rate spatial audio coding systems.
Convention Paper 8531

P16-3 Error Robust Low Delay Audio Coding Based on Subband-ADPCM
Stephan Preihs, Jörn Ostermann, Institute for Information Processing, Leibniz Universität Hannover - Hannover, Germany
In this paper we present an approach for error robust audio coding at a medium data rate of about 176 kbps (mono, 44.1 kHz sampling rate). By combining a delay-free Adaptive Differential Pulse Code Modulation (ADPCM) coding scheme with a numerically optimized low delay filter bank, we achieve a very low algorithmic coding delay of only about 0.5 ms. The structure of the codec also allows for high robustness against random single bit errors and even supports error resilience. The implementation structure, results of a listening test, a PEAQ (Perceptual Evaluation of Audio Quality) based objective audio quality evaluation, and tests of random single bit error performance are given. The presented coding scheme provides very good audio quality for vocals and speech. For most of the critical signals the audio quality can still be considered acceptable. Tests of random single bit error performance show good results for error rates up to 10^-4.
Convention Paper 8532
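
As a rough check of the figures quoted in the abstract for P16-3, the short Python sketch below converts them into per-sample terms. It assumes the bit rate is exactly 176 kbps and the delay exactly 0.5 ms (the abstract says "about" in both cases); the variable names are illustrative only and nothing here comes from the paper itself.

```python
SAMPLE_RATE_HZ = 44_100   # mono, from the abstract
BIT_RATE_BPS = 176_000    # assumption: "about 176 kbps" taken as exact
DELAY_S = 0.5e-3          # assumption: "about 0.5 ms" taken as exact
BIT_ERROR_RATE = 1e-4     # highest error rate reported as giving good results

# Average number of coded bits spent on each audio sample (about 4).
bits_per_sample = BIT_RATE_BPS / SAMPLE_RATE_HZ

# Algorithmic delay expressed in samples (about 22).
delay_samples = DELAY_S * SAMPLE_RATE_HZ

# Expected number of flipped bits per second of audio at that error rate.
expected_errors_per_second = BIT_RATE_BPS * BIT_ERROR_RATE

print(f"{bits_per_sample:.2f} bits/sample")
print(f"{delay_samples:.1f} samples of delay")
print(f"{expected_errors_per_second:.1f} bit errors per second")
```

In other words, the quoted rate corresponds to roughly 4 bits per sample, the delay to roughly 22 samples, and an error rate of 10^-4 to roughly 18 corrupted bits in every second of audio.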

P16-4 The Transient Steering Decorrelator Tool in the Upcoming MPEG Unified Speech and Audio Coding Standard
Achim Kuntz, Sascha Disch, Tom Bäckström, Julien Robillard, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany
Applause signals are still challenging to code with good perceptual quality using parametric multichannel audio coding techniques. To improve codec performance for these particular items, the Transient Steering Decorrelator (TSD) tool has been adopted into the upcoming Moving Picture Experts Group (MPEG) standard on Unified Speech and Audio Coding (USAC) as an amendment to the MPEG Surround 2-1-2 module (MPS). TSD improves the perceptual quality of signals that contain rather dense, spatially distributed transient auditory events, as in applause-type signals. Within TSD, transient events are separated from the core decoder output, and a dedicated decorrelator algorithm distributes the transients in the spatial image according to parametric guiding information transmitted in the bitstream. Listening tests show a substantial improvement in subjective quality.
Convention Paper 8533

Information Last Updated: 20111005, mei

