Monday, October 7, 9:00 am - 12:00 noon
SESSION I: LOW BIT-RATE CODING, PART 2
Chair: Marina Bosi, MPEG LA,
I-1 Perceptually-Based Joint-Program Audio Coding
Christof Faller (1), Raziel Haimi-Cohen (2), Peter Kroon (1), Joseph Rothweiler (1) - (1) Agere Systems, Murray Hill, NJ, USA; (2) Lucent Technologies, Murray Hill, NJ, USA
Some digital audio broadcasting systems, such as Satellite Digital Audio Radio Services (SDARS), transmit many audio programs over the same transmission channel. Instead of splitting the channel into fixed bit-rate subchannels, each carrying one audio program, one can dynamically distribute the channel capacity among the audio programs. We describe an algorithm that implements this concept, taking into account both perception and the statistics of the bit-rate variation of audio coders. The result is a dynamic distribution of the channel capacity among the coders depending on the perceptual entropy of the individual programs. This solution provides improved audio quality compared with fixed bit-rate subchannels for the same total transmission capacity. The proposed scheme is non-iterative and has low computational complexity.
Convention Paper 5684
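The idea in abstract I-1 of sharing one channel among several coders according to perceptual entropy can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify. The function name `allocate_bits`, the `min_bits` floor, and the simple proportional rule are all assumptions made for illustration.

```python
def allocate_bits(perceptual_entropies, total_bits, min_bits=0):
    """Split total_bits among programs in proportion to perceptual entropy.

    Hypothetical illustration only: every program first receives min_bits,
    then the remaining capacity is shared proportionally in a single pass.
    """
    n = len(perceptual_entropies)
    if n == 0:
        return []
    if total_bits < n * min_bits:
        raise ValueError("total_bits cannot cover the per-program minimum")

    spare = total_bits - n * min_bits
    total_pe = sum(perceptual_entropies)
    if total_pe == 0:
        # No program needs more than another: share the spare bits equally.
        shares = [spare / n] * n
    else:
        shares = [spare * pe / total_pe for pe in perceptual_entropies]

    # Round down, then hand the few leftover bits to the largest remainders
    # so that the allocation sums exactly to total_bits.
    alloc = [min_bits + int(s) for s in shares]
    leftover = total_bits - sum(alloc)
    by_remainder = sorted(range(n), key=lambda i: shares[i] - int(shares[i]),
                          reverse=True)
    for i in by_remainder[:leftover]:
        alloc[i] += 1
    return alloc


if __name__ == "__main__":
    # Two programs, the second three times as demanding as the first.
    print(allocate_bits([1.0, 3.0], total_bits=4000, min_bits=500))
```

A real system would also have to respect coder bit-rate limits and buffering constraints, which this sketch ignores.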
I-2 Incorporation of Inharmonicity Effects into Auditory Masking Models
Hossein Najaf-Zadeh, Hassan Lahdili, Louis Thibault, Communications Research Centre, Ottawa, Ontario, Canada
This paper addresses the effect of the inharmonic structure of audio maskers on the masking pattern they produce. In most auditory models, the tonal structure of the masker is analyzed to determine the masking threshold. According to psychoacoustic data, masking thresholds caused by tonal signals are lower than those produced by noise-like maskers. However, the relationship between spectral components has not been considered. It has been found that for two different multitonal maskers with the same power, the one with a harmonic structure produces a lower masking threshold. This paper proposes a modification to MPEG psychoacoustic model 2 that takes into account the inharmonic structure of the input signal. Informal listening tests have shown that the bit rate required for transparent coding of inharmonic (multitonal) audio material can be reduced by 10 percent if the modified psychoacoustic model 2 is used in the MPEG-1 Layer II encoder.
Convention Paper 5685
I-3 Binaural Cue Coding Applied to Audio Compression with Flexible Rendering
Christof Faller, Frank Baumgarte, Agere Systems, Murray Hill, NJ, USA
In this paper we describe an efficient scheme for compression and flexible spatial rendering of audio signals. The method is based on binaural cue coding (BCC) which was recently introduced for efficient compression of multichannel audio signals. The encoder input consists of separate signals without directional spatial cues, such as separate sound source signals, i.e., several monophonic signals. The signal transmitted to the decoder consists of the mono sum-signal of all input signals plus a low bit rate (e.g., 2 kb/s) set of BCC parameters. The mono signal can be encoded with any conventional audio or speech coder. Using the BCC parameters and the mono signal, the BCC synthesizer can flexibly render a spatial image by determining the perceived direction of the audio content of each of the encoder input signals. We provide the results of an audio quality assessment using headphones, which is a more critical scenario than loudspeaker playback.
Convention Paper 5686
I-4 Two-Pass Encoding of Audio Material Using MP3 Compression
Martin Weishart (1), Ralf Göbel (2), Jürgen Herre (1) - (1) Fraunhofer Institute for Integrated Circuits, Erlangen, Germany; (2) Fachhochschule Koblenz, Koblenz, Germany
Perceptual audio coding has become a widely used technique for economical transmission and storage of high-quality audio signals. Audio compression schemes, as known from MPEG-1, -2, and -4, allow encoding with either a constant or a variable bit rate over time. While many applications demand a constant bit rate due to channel characteristics, variable bit-rate encoding is becoming increasingly attractive, e.g., for Internet audio and portable audio players. Such an approach can lead to significant improvements in audio quality compared to traditional constant bit-rate encoding, but the average bit rate consumed will generally depend on the compressed audio material. This paper presents investigations into two-pass encoding, which combines the flexibility of variable bit-rate encoding with a predictable target bit consumption.
Convention Paper 5687
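The two-pass principle named in abstract I-4 can be illustrated with a minimal Python sketch: a first pass records how many bits each frame would need at constant quality, and a second pass rescales those demands so that the average meets a target. This is not the method of the paper, whose details the abstract does not give. The names `first_pass`, `second_pass_budgets`, and `estimate_demand` are hypothetical, and the uniform scaling rule is an assumption.

```python
def first_pass(frames, estimate_demand):
    """Pass 1: record the bit demand of every frame at constant quality.

    estimate_demand is a hypothetical callable supplied by the caller; it
    stands in for whatever analysis a real encoder would run.
    """
    return [estimate_demand(frame) for frame in frames]


def second_pass_budgets(demands, target_bits_per_frame):
    """Pass 2: scale the recorded demands so their mean hits the target.

    Assumption: a single global scale factor. Per-frame bit-rate limits of
    the real MP3 format are ignored here.
    """
    if not demands:
        return []
    total_target = target_bits_per_frame * len(demands)
    total_demand = sum(demands)
    if total_demand == 0:
        # Nothing to distinguish the frames: give each the target budget.
        return [float(target_bits_per_frame)] * len(demands)
    scale = total_target / total_demand
    return [d * scale for d in demands]


if __name__ == "__main__":
    # Toy "frames": the demand estimate is just the length of each list.
    frames = [[0] * 100, [0] * 300]
    demands = first_pass(frames, estimate_demand=len)
    print(second_pass_budgets(demands, target_bits_per_frame=150))
```

Frames that demand more bits still receive proportionally more, which keeps the variable bit-rate behavior, while the total consumption is known before the second pass starts.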
I-5 Technical Aspects of Digital Rights Management Systems
Christian Neubauer (1), Karlheinz Brandenburg (2), Frank Siebenhaar (1) - (1) Fraunhofer Institute for Integrated Circuits IIS-A, Erlangen, Germany; (2) Fraunhofer Arbeitsgruppe Elektronische Medientechnologie AEMT and Ilmenau University, Ilmenau, Germany
In today's multimedia world, digital content is easily available to and widely used by end consumers. On the one hand, high quality, the ability to be copied without loss of quality, and the existence of portable players make digital content, in particular digital music, very attractive to consumers. On the other hand, the music industry is facing increasing revenue loss due to illegal copying. To cope with this problem, so-called Digital Rights Management (DRM) systems have been developed to control the usage of content. However, no vendor and no DRM system is currently widely accepted by the market. This is due to the incompatibility of different systems, the lack of open standards, and other reasons. This paper analyzes the current situation of DRM systems, derives requirements for DRM systems, and presents technological building blocks to meet these requirements. Finally, an alternative approach for a DRM system is presented that better respects the rights of consumers.
Convention Paper 5688
I-6 Configurable Microprocessor Implementation of Low Bit-Rate Audio Decoding
Gary S. Brown, Tensilica, Inc., Santa Clara, CA, USA
Using a configurable microprocessor to implement low bit-rate audio applications by tailoring the instruction set reduces algorithm complexity and implementation cost. As an example, this paper describes a Dolby Digital (AC-3) decoder implementation that uses a commercially available configurable microprocessor to achieve 32-bit floating-point precision while minimizing the required processor clock rate and die size. The paper focuses on how the audio quality and features of the reference decoder algorithm dictate the customization of the microprocessor. It shows examples of audio-specific extensions to the processor's instruction set that create a family of AC-3 decoder implementations meeting multiple performance and cost points. How this approach benefits other audio applications is also discussed.
Convention Paper 5689