AES New York 2009
Paper Session P8
P8 - Data Compression
Saturday, October 10, 9:00 am — 12:30 pm
Chair: Christof Faller
P8-1 Wireless Transmission of Audio Using Adaptive Lossless Coding—David Trainor, APTX (APT Licensing Ltd.) - Belfast, Northern Ireland, UK
In audio devices such as smartphones, media players, and wireless headsets, designers face the conflicting requirements of raising coded audio quality while reducing algorithm complexity, device power dissipation, and transmission bit rates. As a result, there are significant challenges in providing the highest-quality real-time audio streaming between devices over wireless networks. Mathematically lossless audio coding algorithms are an attractive means of maximizing coded audio quality. However, in the context of wireless audio transmission between portable devices, characteristics of such algorithms, including their modest levels of bandwidth reduction, their encoding complexity, and their robustness to data loss, need to be carefully controlled. Such control can be elegantly engineered by incorporating real-time adaptation and scaling into the audio coding algorithm itself. This paper describes a lossless audio coding algorithm called apt-X Lossless, which has been designed with scalability and automated adaptation as its principal characteristics.
Convention Paper 7873
P8-2 Quantization with Constrained Relative Entropy and Its Application to Audio Coding—Minyue Li, W. Bastiaan Kleijn, KTH - Royal Institute of Technology - Stockholm, Sweden
Conventional quantization distorts the probability density of the source. In scenarios such as low bit rate audio coding, this leads to perceived distortion that is not well characterized by commonly used distortion criteria. We propose the relative entropy between the probability densities of the original and reconstructed signals as an additional fidelity measure. Quantization with a constraint on relative entropy ensures that the probability density of the signal is preserved to a controllable extent. When it is included in an audio coder, the new quantization facilitates a continuous transition between the underlying concepts of the vocoder, the bandwidth extension, and a rate-distortion optimized coder. Experiments confirm the effectiveness of the new quantization scheme.
Convention Paper 7874
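As a rough illustration of the fidelity measure named in the P8-2 abstract, the sketch below estimates the relative entropy between the amplitude distributions of a signal and its quantized reconstruction. It is not the authors' method: the histogram estimator, the additive smoothing, the direction D(original || reconstructed), the Gaussian test source, and all function names are assumptions made for this example only.

```python
import math
import random


def histogram(samples, lo, hi, bins):
    """Count samples into equal-width bins on [lo, hi]; outliers are clamped."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in samples:
        idx = int((x - lo) / width)
        counts[min(bins - 1, max(0, idx))] += 1
    return counts


def relative_entropy_bits(p_counts, q_counts, smoothing=0.5):
    """Estimate D(p || q) in bits from two histograms.

    Additive smoothing (an assumption of this sketch) keeps the result
    finite when the reconstruction leaves some bins empty.
    """
    p_total = sum(p_counts) + smoothing * len(p_counts)
    q_total = sum(q_counts) + smoothing * len(q_counts)
    total = 0.0
    for pc, qc in zip(p_counts, q_counts):
        p = (pc + smoothing) / p_total
        q = (qc + smoothing) / q_total
        total += p * math.log2(p / q)
    return total


def uniform_quantize(samples, step):
    """Plain uniform scalar quantizer: round each sample to a multiple of step."""
    return [step * round(x / step) for x in samples]


def demo(seed=0, n=20000):
    """Return {step: relative entropy} for a fine and a coarse quantizer."""
    rng = random.Random(seed)
    source = [rng.gauss(0.0, 1.0) for _ in range(n)]
    reference = histogram(source, -4.0, 4.0, 64)
    result = {}
    for step in (0.05, 1.0):
        recon = uniform_quantize(source, step)
        result[step] = relative_entropy_bits(
            reference, histogram(recon, -4.0, 4.0, 64)
        )
    return result
```

Calling demo() compares a fine and a coarse step size. The coarse quantizer collapses the reconstruction onto a few amplitude values, so its estimated relative entropy is much larger, which is the kind of density distortion the abstract says conventional distortion criteria do not capture well.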
P8-3 Enhanced Stereo Coding with Phase Parameters for MPEG Unified Speech and Audio Coding—JungHoe Kim, Eunmi Oh, Samsung Electronics Co. Ltd. - Gyeonggi-do, Korea; Julien Robilliard, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany
The proposed technology is a bit-efficient way to deliver phase information. It encodes only the interchannel phase difference (IPD) parameter and estimates the overall phase difference (OPD) parameter at the decoder from the transmitted interchannel phase difference and channel level difference. This reduces the bit rate for phase parameters compared with transmitting both IPD and OPD parameters, as specified in MPEG Parametric Stereo. The entropy coding scheme for phase parameters is improved by exploiting the wrapping property of the phase parameters. We introduce phase smoothing at the decoder and adaptive control of quantization precision for phase parameters to minimize annoying artifacts caused by abrupt changes in quantized phase parameters. The proposed phase coding can improve stereo sound quality significantly, and it was accepted as part of the MPEG-D USAC (Unified Speech and Audio Coding) standard.
Convention Paper 7875
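The idea in the P8-3 abstract of deriving the OPD at the decoder from the IPD and the channel level difference can be sketched geometrically. The example assumes, purely for illustration, that the downmix is the plain sum of the two channels and that the level difference is an amplitude ratio expressed in dB; the abstract does not state the exact estimator or its edge-case handling, so the function name and formula below are a hypothetical sketch, not the standardized procedure.

```python
import cmath
import math


def estimate_opd(ipd, cld_db):
    """Estimate the phase of channel 1 relative to a sum downmix.

    ipd    : interchannel phase difference in radians (channel 1 minus channel 2)
    cld_db : channel level difference in dB, taken here as 20*log10(a1/a2)
             (an assumption of this sketch)

    With L = a1*exp(j*p1), R = a2*exp(j*p2) and S = L + R, the phase of
    L relative to S equals arg(a1 + a2*exp(j*ipd)).
    """
    a1 = 10.0 ** (cld_db / 20.0)  # amplitude of channel 1 relative to channel 2
    a2 = 1.0
    re = a1 + a2 * math.cos(ipd)
    im = a2 * math.sin(ipd)
    if abs(re) < 1e-12 and abs(im) < 1e-12:
        # The channels cancel in the downmix, so the phase is undefined;
        # returning 0 is an arbitrary choice made for this sketch.
        return 0.0
    return math.atan2(im, re)


def direct_opd(left, right):
    """Reference value computed from the complex channel coefficients."""
    downmix = left + right
    return cmath.phase(left * downmix.conjugate())
```

Under these assumptions only the IPD and the level difference need to be transmitted: for equal channel levels the estimate reduces to half the IPD, and as one channel dominates the estimate moves toward zero for that channel.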
P8-4 An Enhanced SBR Tool for Low-Delay Applications—Michael Werner, Ilmenau University of Technology - Ilmenau, Germany; Gerald Schuller, Ilmenau Technical University - Ilmenau, Germany, Fraunhofer IDMT, Ilmenau, Germany
An established technique to reduce the data rate of an audio coder is Spectral Band Replication (SBR). The standard SBR tool is designed for applications where encoding/decoding delay is of no concern. Our goal is to obtain an SBR system with little algorithmic delay for use in real-time applications, such as wireless microphones or video conferencing. We previously developed a low-delay SBR tool (LD-SBR), but it produces a relatively large amount of side information. This paper presents an enhanced SBR tool for low-delay applications that uses techniques from LD-SBR in combination with Codebook Mapping (CBM). The result is a low-delay SBR tool with a reduced amount of side information and no degradation in audio quality.
Convention Paper 7876
P8-5 Audio Codec Employing Frequency-Derived Tonality Measure—Maciej Kulesza, Andrzej Czyzewski, Gdansk University of Technology - Gdansk, Poland
A transform codec employing an efficient algorithm for the detection of spectral tonal components is presented. The tonality measure used in the MPEG psychoacoustic model is replaced with a method that provides adequate tonality estimates even if the tonal components are deeply frequency modulated. The reliability of the hearing threshold estimated using a psychoacoustic model with the standardized tonality measure and with the proposed one is investigated using objective quality testing methods. The proposed tonality estimator is also used as the basis for a detector of noise-like signal bands. Instead of quantizing the noise-like signal components according to the usual transform coding scheme, the signal bands containing only noise-like components are filled with locally generated noise in the decoder. The results of listening tests that reveal the usefulness of the employed tonality estimation method for such a coding scenario are presented.
Convention Paper 7877
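The noise-filling step described in the P8-5 abstract can be sketched as follows. The paper's own frequency-derived tonality estimator is not given in the abstract, so this example swaps in the spectral flatness measure as a stand-in detector of noise-like bands. The threshold value, the payload layout, and all function names are hypothetical, and quantization of the tonal bands is omitted.

```python
import math
import random


def spectral_flatness(coeffs, floor=1e-12):
    """Geometric mean over arithmetic mean of the band's power values.

    Values near 1 suggest a noise-like band, values near 0 a tonal band.
    This is a stand-in for the paper's tonality measure, not that measure.
    """
    powers = [max(c * c, floor) for c in coeffs]
    log_mean = sum(math.log(p) for p in powers) / len(powers)
    return math.exp(log_mean) / (sum(powers) / len(powers))


def encode_band(coeffs, threshold=0.2):
    """Send only the band energy for noise-like bands (threshold is illustrative)."""
    if spectral_flatness(coeffs) >= threshold:
        return ("noise", sum(c * c for c in coeffs))
    # A real coder would quantize here; the sketch passes coefficients through.
    return ("coeffs", list(coeffs))


def decode_band(payload, width, rng):
    """Rebuild a band, filling noise-like bands with locally generated noise."""
    kind, data = payload
    if kind == "coeffs":
        return list(data)
    noise = [rng.gauss(0.0, 1.0) for _ in range(width)]
    norm = math.sqrt(sum(n * n for n in noise))
    scale = math.sqrt(data) / norm if norm > 0.0 else 0.0
    # Scale the noise so the decoded band carries the transmitted energy.
    return [n * scale for n in noise]
```

A band decoded this way matches the original only in energy, not in waveform, which is why the reliability of the noise-like/tonal decision matters for perceived quality.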
P8-6 New Approaches to Statistical Multiplexing for Perceptual Audio Coders Used in Multi-Program Audio Broadcasting—Deepen Sinha, ATC Labs - Chatham, NJ, USA; Harinarayanan E.V., Ranjeeta Sharma, ATC Labs - Noida, India
In multi-program audio broadcasting or transmission, joint encoding is an attractive proposition. It has previously been reported that joint encoding benefits conventional perceptual audio coders in this scenario. However, previous attempts at such statistical multiplexing have focused primarily on joint bit allocation. Here we show that such an approach is not sufficient to realize the promise of statistical multiplexing. Rather, a successful strategy has two essential ingredients: a highly accurate psychoacoustic model and a coordination mechanism that goes beyond joint allocation. We describe methods to achieve these objectives and also present objective and subjective coding results.
Convention Paper 7878
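For orientation, the joint bit allocation that the P8-6 abstract treats as the baseline (and argues is insufficient on its own) might look like the sketch below. It splits a shared per-frame bit pool among programs in proportion to a per-program demand figure. The demand values, the guaranteed floor, and the function name are assumptions for illustration; the coordination mechanism the paper proposes is not described in the abstract and is not modeled here.

```python
def joint_allocate(demands, pool_bits, floor_bits):
    """Split pool_bits among programs in proportion to their demands.

    demands    : non-negative demand per program (for example, a perceptual
                 entropy estimate; what the number means is an assumption here)
    pool_bits  : total bits available for this frame across all programs
    floor_bits : minimum number of bits guaranteed to every program
    """
    n = len(demands)
    if n == 0:
        return []
    if pool_bits < n * floor_bits:
        raise ValueError("pool too small for the guaranteed floor")
    spare = pool_bits - n * floor_bits
    total = sum(demands)
    if total <= 0:
        shares = [spare / n] * n  # no demand information: split evenly
    else:
        shares = [spare * d / total for d in demands]
    alloc = [floor_bits + int(s) for s in shares]
    # Hand out bits lost to truncation, largest fractional part first.
    leftover = max(0, pool_bits - sum(alloc))
    order = sorted(range(n), key=lambda i: shares[i] - int(shares[i]), reverse=True)
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc
```

In this sketch a program with an easy-to-code frame releases bits to one with a demanding frame, which is the statistical-multiplexing gain. Its quality depends entirely on how well the demand figures reflect audibility, which is consistent with the abstract's emphasis on an accurate psychoacoustic model.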
P8-7 Subjective Evaluation of mp3 Compression for Different Musical Genres—Amandine Pras, Rachel Zimmerman, Daniel Levitin, Catherine Guastavino, McGill University - Montreal, Quebec, Canada
Mp3 compression is commonly used to reduce the size of digital music files but introduces a number of artifacts, especially at low bit rates. We investigated whether listeners prefer CD quality to mp3 files at various bit rates (96 kb/s to 320 kb/s), and whether this preference is affected by musical genre. Thirteen trained listeners completed an AB comparison task judging CD-quality and compressed versions. Listeners significantly preferred CD quality to mp3 files up to 192 kb/s for all musical genres. In addition, we observed a significant effect of expertise (sound engineers vs. musicians) and musical genre (electric vs. acoustic music).
Convention Paper 7879