AES Show: Make the Right Connections Audio Engineering Society

AES San Francisco 2008
Poster Session P21

Saturday, October 4, 5:00 pm — 6:30 pm

P21 - Low Bit-Rate Audio Coding


P21-1 A Framework for a Near-Optimal Excitation Based Rate-Distortion Algorithm for Audio Coding
Miikka Vilermo, Nokia Research Center - Tampere, Finland
An optimal excitation-based rate-distortion algorithm remains an elusive target in audio coding. The typical complexity of the problem for a single frame is on the order of 60^50. This paper presents a framework for reducing the complexity. Excitation is calculated using cochlear filters that have relatively steep slopes above and below the central frequency of the filter. An approximation of the excitation can be calculated by limiting the cochlear filters to a small frequency region. For example, the cochlear filters may span 15 subbands. In this way, the complexity can be reduced to approximately the order of 60^15 · 50.
Convention Paper 7621
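
To give a feel for the size of the reduction described in P21-1, the following minimal sketch compares the two search-space sizes, taking the complexity figures in the abstract above to mean 60 raised to the 50th power and 60 raised to the 15th power times 50. Reading 60 as the number of candidate settings per subband and 50 as the number of subbands is an assumption made for illustration; the abstract does not spell this out, and all names below are hypothetical.

```python
# Assumed reading of the abstract (not confirmed by the paper listing):
# 50 subbands per frame, 60 candidate settings per subband,
# cochlear filters truncated to a span of 15 subbands.
NUM_SUBBANDS = 50
CHOICES_PER_BAND = 60
FILTER_SPAN = 15


def order_of_magnitude(n: int) -> int:
    """Return the base-10 exponent of a positive integer (digit count minus one)."""
    return len(str(n)) - 1


# Exhaustive search: every subband choice interacts with every other one.
full_search = CHOICES_PER_BAND ** NUM_SUBBANDS

# Truncated filters: each of the 50 subbands only interacts with a
# 15-subband neighborhood.
windowed_search = CHOICES_PER_BAND ** FILTER_SPAN * NUM_SUBBANDS

full_exp = order_of_magnitude(full_search)
windowed_exp = order_of_magnitude(windowed_search)

print(f"full search:     about 10^{full_exp}")
print(f"windowed search: about 10^{windowed_exp}")
print(f"reduction:       about {full_exp - windowed_exp} orders of magnitude")
```

Under these assumptions the truncated problem is still large, which is consistent with the abstract describing a framework for a near-optimal rather than an exhaustive algorithm.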

P21-2 Audio Bandwidth Extension by Frequency Scaling of Sinusoidal Partials
Tomasz Zernicki, Maciej Bartkowiak, Poznan University of Technology - Poznan, Poland
This paper describes a new technique for efficient coding of high-frequency signal components as an alternative to Spectral Band Replication. The main idea is to reconstruct the trajectories of the high-frequency harmonic structure by using fundamental frequencies obtained at the encoder side. The audio signal is decomposed into narrow subbands by demodulation based on the local instantaneous fundamental frequency of individual partials. High-frequency components are reconstructed by modulating the baseband signals with appropriately scaled instantaneous frequencies. Such an approach offers correct synthesis of rapidly changing sinusoids as well as proper reconstruction of the harmonic structure in the high-frequency band. The technique also allows correct energy adjustment over sinusoidal partials. The high efficiency of the proposed technique has been confirmed by listening tests.
Convention Paper 7622

P21-3 Robustness Issues in Multi-View Audio Coding
Mauri Väänänen, Nokia Research Center - Tampere, Finland
This paper studies the problem of noise unmasking when multiple spatial filtering options (multiple views) are required from multi-microphone recordings compressed with lossy coding. The envisaged application is the re-use and postprocessing of user-created content. A potential solution based on inter-channel prediction is outlined that would also allow subtractive downmix options without excessive noise unmasking. The simple case of two relatively closely spaced omnidirectional microphones and a mono downmix is used as an example, with experiments on real-world recordings and MPEG-1 Layer 3 coding.
Convention Paper 7623
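
The noise unmasking problem mentioned in P21-3 can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's method: coding noise is modeled as independent white Gaussian noise added to each channel (a real MPEG-1 Layer 3 codec behaves differently), the two microphones see the same 500 Hz tone with a hypothetical 2-sample delay, and all names and constants are made up for the example.

```python
import math
import random


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB from separate signal and noise sequences."""
    p_signal = sum(x * x for x in signal) / len(signal)
    p_noise = sum(x * x for x in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)


random.seed(0)          # fixed seed so the run is repeatable
FS = 48000.0            # sample rate in Hz (assumed)
FREQ = 500.0            # test tone frequency in Hz (assumed)
DELAY = 2               # inter-microphone delay in samples (assumed)
N = 4800                # 0.1 s of audio, an integer number of tone periods

# Two closely spaced microphones: same tone, slightly delayed at mic 2.
mic1 = [math.sin(2 * math.pi * FREQ * n / FS) for n in range(N)]
mic2 = [math.sin(2 * math.pi * FREQ * (n - DELAY) / FS) for n in range(N)]

# Stand-in for coding noise: independent in each channel.
noise1 = [random.gauss(0.0, 0.01) for _ in range(N)]
noise2 = [random.gauss(0.0, 0.01) for _ in range(N)]

# Additive and subtractive mono downmixes of the decoded channels.
sum_signal = [a + b for a, b in zip(mic1, mic2)]
diff_signal = [a - b for a, b in zip(mic1, mic2)]
sum_noise = [a + b for a, b in zip(noise1, noise2)]
diff_noise = [a - b for a, b in zip(noise1, noise2)]

snr_single = snr_db(mic1, noise1)
snr_sum = snr_db(sum_signal, sum_noise)
snr_diff = snr_db(diff_signal, diff_noise)

print(f"single channel SNR:       {snr_single:.1f} dB")
print(f"additive downmix SNR:     {snr_sum:.1f} dB")
print(f"subtractive downmix SNR:  {snr_diff:.1f} dB")
```

In this toy setup the subtraction cancels most of the highly correlated signal while the uncorrelated noise powers add, so the noise that was acceptable in each channel becomes prominent in the subtractive downmix.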

P21-4 Quality Improvement of Very Low Bit Rate HE-AAC Using Linear Prediction Module
GunWoo Lee, JaeSeong Lee, University of Yonsei - Seoul, Korea; YoungCheol Park, University of Yonsei - Wonju-city, Korea; DaeHee Youn, University of Yonsei - Seoul, Korea
This paper proposes a new method of improving the quality of High Efficiency Advanced Audio Coding (HE-AAC) at very low bit rates under 16 kbps. Because of the limited number of available bits, low bit rate HE-AAC often produces obvious spectral holes in low-energy frequency bands, which induce musical noise. In the proposed system, a linear prediction module is combined with HE-AAC as a pre-processor to reduce the spectral holes. For an efficient implementation, the masking threshold of the psychoacoustic model is normalized with the LPC spectral envelope so that the LPC residual signal is quantized with an appropriate masking threshold. To reduce pre-echo, the block switching module is also modified. Experimental results show that, in very low bit rate modes, the linear prediction module effectively reduces the spectral holes, which results in less musical noise than conventional HE-AAC.
Convention Paper 7624

P21-5 An Implementation of MPEG-4 ALS Standard Compliant Decoder on ARM Core CPUs
Noboru Harada, Takehiro Moriya, Yutaka Kamamoto, NTT Communication Science Labs. - Kanagawa, Japan
MPEG-4 Audio Lossless Coding (ALS) is a standard that losslessly compresses audio signals in an efficient manner. MPEG-4 ALS is a suitable compression scheme for high-sound-quality portable music players. We have implemented a decoder compliant with the MPEG-4 ALS standard on the ARM platform. In this paper the CPU resources required for the MPEG-4 ALS tools on ARM9E are characterized by using an ARM CPU emulator, called ARMulator, as a simulation platform. It is shown that the CPU clock rate required for decoding MPEG-4 ALS standard compliant bit streams is less than 20 MHz for 44.1-kHz/16-bit stereo signals on ARM9E when the combination of the MPEG-4 ALS tools is properly selected and the coding parameters are properly restricted.
Convention Paper 7625
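
As a rough back-of-the-envelope reading of the 20 MHz figure in P21-5, the sketch below converts the clock budget into cycles available per audio sample, assuming real-time decoding with no other load on the CPU. The constant names are hypothetical, and 20 MHz is the upper bound quoted in the abstract, not a measured value.

```python
# Figures taken from the abstract: under 20 MHz for 44.1 kHz stereo.
CLOCK_HZ = 20_000_000      # upper bound on required clock rate
SAMPLE_RATE_HZ = 44_100    # sampling frequency per channel
CHANNELS = 2               # stereo

# Cycles available for each stereo sample frame (one sample per channel).
cycles_per_frame = CLOCK_HZ / SAMPLE_RATE_HZ

# Cycles available for each individual channel sample.
cycles_per_channel_sample = cycles_per_frame / CHANNELS

print(f"cycles per stereo frame:   {cycles_per_frame:.1f}")
print(f"cycles per channel sample: {cycles_per_channel_sample:.1f}")
```

In other words, a real-time decoder running within this bound has fewer than about 454 cycles per stereo frame for all decoding work.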