AES Technical Committee

Coding of Audio Signals

Chair:    Juergen Herre    (Send Email)
Chair:    Schuyler Quackenbush    (Send Email)

Mission Statement:

This committee has as its scope the theory, study, research, practice, engineering, and other standing issues involving the coding (i.e. bit rate reduction) of audio signals, especially but not limited to methods that use knowledge of human perception in the coding process.

The goal of this committee is to further work, knowledge and interest in subjects within its scope, and to increase awareness of the methods, problems, and standards within the AES. Within its scope, the committee will provide an informal technical forum for discussion of scientific and engineering problems as well as non-technical matters, participate in convention programs at the invitation of Convention Chairs, and work with the editor of the JAES, allowing it to distribute accurate and timely information to the membership.


Areas of Concentration

  1. Coding Research and Engineering
  2. Bit Rate Reduction Theory and Practice
  3. Human Perception

Current Areas Of Work

  • Parametric Multi-Channel Audio Coding
  • Parametric Multi-Object Audio Coding
  • Low Delay General Audio Coding

Recent/Planned Activities

  • 105th AES Convention: Workshop on Perceptual Audio Coding
  • 106th AES Convention: Workshop on MPEG-4 Audio
  • 17th AES International Conference on High Quality Audio Coding
  • 107th AES Convention: Workshop on Internet Audio
  • 108th AES Convention: Workshop on MPEG-4 Audio, Version 2
  • CD-ROM Project on Coding Distortions. Listener training and education on coding distortions
  • 109th AES Convention: Workshop on CD-ROM Project
  • 110th AES Convention: Introduction & Introductory Paper on CD-ROM "Perceptual Audio Coders - What To Listen For"
  • 111th AES Convention: Release of CD-ROM "Perceptual Audio Coders - What To Listen For"
  • 112th AES Convention: Workshop on CD-ROM, Workshop on Tandem Coding (together with AES TC on Perception)
  • 113th AES Convention: Workshop on recent MPEG-4 Audio Extensions, Workshop on "Coding of Spatial Audio", 2nd Release of CD-ROM
  • 114th AES Convention: Workshop on MPEG-4 Audio Extensions, Workshop on "Coding of Spatial Audio"
  • 115th AES Convention: Workshop on MPEG-4 AudioBIFS
  • 117th AES Convention: Workshop on Lossless Audio Coding
  • 117th AES Convention: Workshop on Spatial Coding of Surround Sound
  • 119th AES Convention: Workshop on Next Generation Audio Communications
  • 120th AES Convention: Workshop on MPEG Surround Parametric Multi-Channel Audio Coding
  • 121st AES Convention: Workshop on Lossless Audio Coding
  • 121st AES Convention: Workshop on MPEG Surround
  • 123rd AES Convention: Workshop on MPEG Spatial Audio Object Coding (SAOC)
  • 125th AES Convention: Workshop on MPEG Spatial Audio Object Coding (SAOC)
  • 126th AES Convention: Workshop on MPEG Spatial Audio Object Coding (SAOC)
  • 127th AES Convention: Workshop "What Will Perceptual Audio Coding Stand for 20 Years from Now?"
  • 127th AES Convention: Workshop on Scalable Audio Coding
  • 128th AES Convention: Workshop on MPEG Unified Speech and Audio Coding (USAC)
  • 129th AES Convention: Workshop on Wireless Audio Streaming
  • 131st AES Convention: Tutorial 'mp3 can sound good'
  • 131st AES Convention: Workshop on Sound Quality Evaluation
  • 131st AES Convention: Workshop on Low-delay Audio Coding for High-Quality Communications
  • 133rd AES Convention: Workshop on Error-Tolerant Audio Coding

Audio Coding CD-ROM

A CD-ROM on audio coding artifacts was prepared by the members of the AES Technical Committee on Coding of Audio Signals. This is the first educational/tutorial CD-ROM presented by the AES Technical Council on a particular topic, combining background information with specific audio examples. To facilitate the use of high-quality home playback equipment for the reproduction of the audio excerpts, the disk can also be played back on all standard audio CD players. Available via AES publications!

Emerging Technology Trends

Audio coding has emerged as a critical technology in numerous audio applications. In particular, it is a key component of mobile multimedia applications in the consumer market. Examples include wireless audio broadcast, Internet radio and streaming music, music download, storage and playback, mobile audio recording and Internet-based teleconferencing. Example platforms include digital audio broadcast radio receivers, portable music players, mobile phones and personal computers. From this, a variety of implications and trends can be discerned:

• Digital distribution of content is offered to the consumer in many formats with varying quality / bitrate trade-off, depending on application context. This ranges from very compact formats (e.g. MPEG HE-AACv2 and MPEG USAC) for wireless mobile distribution to perceptually transparent, scalable-to-lossless and lossless formats for regular IP-based distribution (e.g. MPEG AAC, HD-AAC and ALS).

• The frontiers of compression have been pushed further, allowing carriage of full-bandwidth signals at very low bit rates to the point where recent coding systems are considered appropriate for some broadcasting applications, particularly over relatively expensive wireless communication channels such as satellite or cellular channels. While such technologies predominantly make use of parametric approaches (at least in part) to achieve the highest possible quality at the lowest bitrates, they are typically not designed to deliver “transparent” audio quality (i.e. that the original and the encoded/decoded audio signal cannot be perceptually distinguished even under the most rigorous circumstances). Nevertheless, “entertainment quality” services over wireless channels have been very successful. Examples of audio coding that facilitates these new markets include MPEG HE-AACv2 and MPEG USAC.

• Transform-based audio coding schemes have been exploited to their full potential (quality vs. bitrate). As such, new paradigms will be exploited to gain further compression efficiency.

• There is a consistent trend toward hybrid coding techniques that employ parametric modeling to represent aspects of a signal, where the parametric coding techniques typically are motivated by aspects of human perception. The core of most successful audio coders is still largely based on a classic filterbank based coding paradigm, in which the quantization noise is shaped in the time/frequency domain to exploit (primarily) simultaneous masking in the human auditory system. However, the recent success of parametric extensions to the core audio codec, in both market deployment and standardization, illustrates this tendency:

o Audio bandwidth extension technology replaces the explicit transmission of the signal’s high-frequency part (e.g. by sending quantized spectral coefficients) with a parametric synthesis of the high-frequency spectrum at the decoder side, based on the transmitted low-frequency part and some parametric side information that captures the most relevant aspects of the original high-frequency spectrum. This exploits the lower perceptual acuity in the high-frequency region of the human auditory system. An example is MPEG HE-AAC.

o Parametric stereo techniques enable rendering of several output channels at very low bitrates. Instead of a full transmission of all channel signals, the stereo / multi-channel sound image is re-synthesized at the decoder side based on a transmitted downmix signal and parametric side information that describes the perceptual properties (cues) of the original stereo / multi-channel sound scene. Examples are MPEG Parametric Stereo (for coding of two channels) and MPEG Surround (for full surround representation).

o Parametric coding of audio object signals provides, similarly to parametric coding of multi-channel audio, a very compact representation of a scene consisting of several audio objects (e.g. musical instruments, talkers etc.). Rather than transmitting discrete object signals, the (downmixed) scene is transmitted, plus parametric side information describing the properties of the individual objects. At the decoder side, the scene can be modified by the user according to his/her preference, e.g. the level of a particular object can be attenuated or boosted. A recent example of such a technology is MPEG Spatial Audio Object Coding (SAOC).
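
As a rough illustration of the parametric stereo idea in the list above (a transmitted downmix plus side information describing a perceptual cue), the following Python sketch encodes one frame of a two-channel signal as a mono downmix and a single inter-channel level difference, then re-synthesizes two channels from them. It is a toy under stated assumptions, not the MPEG Parametric Stereo algorithm: the function names `encode_frame` and `decode_frame` are hypothetical, only one broadband level cue per frame is used, and everything is done in the time domain, whereas real systems typically work per frequency band and use further cues.

```python
import math

def encode_frame(left, right):
    """Return (mono downmix, level difference in dB) for one frame.

    Hypothetical helper for illustration only: a single broadband
    level cue stands in for the per-band cues a real coder would send.
    """
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    # Small offset avoids division by zero / log of zero on silent frames.
    e_left = sum(l * l for l in left) + 1e-12
    e_right = sum(r * r for r in right) + 1e-12
    ild_db = 10.0 * math.log10(e_left / e_right)
    return mono, ild_db

def decode_frame(mono, ild_db):
    """Re-synthesize two channels from the downmix and the level cue."""
    ratio = 10.0 ** (ild_db / 20.0)  # amplitude ratio left/right
    # Gains chosen so that g_left / g_right == ratio and the average of
    # the two output channels equals the transmitted downmix.
    g_right = 2.0 / (1.0 + ratio)
    g_left = ratio * g_right
    return [g_left * m for m in mono], [g_right * m for m in mono]

# Usage: a source panned toward the left (right channel at half amplitude).
left = [1.0, -1.0, 0.5, -0.5]
right = [0.5 * x for x in left]
mono, ild_db = encode_frame(left, right)
out_left, out_right = decode_frame(mono, ild_db)
```

For a single amplitude-panned source like the one in the usage lines, the level cue alone is enough to recover both channels. For signals with several sources or decorrelated content the reconstruction is only an approximation of the original sound image, which is why practical schemes add per-band processing and additional cues.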

• Audio coding has successfully entered the world of telecommunication, providing low-delay high-quality codecs that enable natural sound for teleconferencing and video-conferencing. Such codecs deliver full bandwidth and high quality, not only for speech material but also for any type of music and environmental sound, enabling applications such as tele-teaching for music. They support spatial reproduction of sound (stereo or even surround), which can greatly increase the ease of communication in conferences between several partners.

• For broadcast-only applications where delay is not a constraint, there is the possibility to gain further compression efficiency by exploiting large algorithmic delays or even multi-pass algorithms in the case of “off-line” audio coding.

• There has been significant progress in the challenge of developing a truly universal coder which can deliver state-of-the-art performance for all kinds of input signals, including music and speech. Hybrid coders, such as MPEG USAC (Unified Speech and Audio Coding), have a structure combining elements from the speech and the audio coding architectures and, over a wide range of bitrates, perform better than coders designed for only speech or only audio.

• Also, the role of higher-level psychoacoustics and perception is becoming increasingly important in audio coding. Detection of auditory objects in an audio stream, separation into auditory (as opposed to acoustic) objects, and storage and manipulation as auditory objects is beginning to play a role. This will be an important and ongoing area of research.

• Solid-state and hard drive-based storage for audio has become extremely inexpensive and consumer Internet connection speeds reach into the Mb/s range. When such resources are available, music streaming, download and storage applications no longer require state of the art audio compression. Instead, what is occurring in the marketplace is that consumers are operating well-known perceptual coders at higher bit rates (lower compression) to achieve “perceptually transparent” compression of music, since the additional increment in resources required for such operating points is relatively inexpensive. For example, consumers are opting to use MPEG Layer III (MP3) or MPEG AAC at rates of 256 kb/s or higher to code their music libraries for their portable music players.
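
To put the 256 kb/s operating point mentioned above in perspective, here is a small back-of-the-envelope calculation in Python. It assumes 1 kb = 1000 bits and compares against uncompressed CD audio at 44.1 kHz, 16 bits, two channels. The CD comparison and the helper names are additions for illustration and do not come from the text.

```python
# Uncompressed CD audio: 44,100 samples/s x 16 bits x 2 channels, in kb/s.
CD_PCM_KBPS = 44100 * 16 * 2 / 1000  # 1411.2 kb/s

def megabytes_per_minute(bitrate_kbps):
    """Storage needed for one minute of audio at a constant bit rate."""
    bits = bitrate_kbps * 1000 * 60
    return bits / 8 / 1e6  # bytes -> megabytes (10^6 bytes)

def compression_ratio_vs_cd(bitrate_kbps):
    """How many times smaller the coded stream is than CD PCM."""
    return CD_PCM_KBPS / bitrate_kbps

print(megabytes_per_minute(256))     # about 1.92 MB per minute
print(compression_ratio_vs_cd(256))  # about 5.5 times smaller than CD PCM
```

At this rate an hour of music takes roughly 115 MB, which shows why the extra storage for such high-rate operating points is no longer a practical concern.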

• Processor speed has continued to increase at a tremendous pace. Even with the low-power restrictions imposed by battery powered portable devices, the quantity of CPU cycles potentially available for audio processing is large. Present audio coders work in a fraction of available CPU capacity, even for multichannel coding, and new research may be needed to discover how to use the additional CPU cycles and memory space. Some possibilities are improved psychoacoustic models and sophisticated acoustic scene analysis. Seen overall, the research in audio coding is moving to the extremes, both toward lowest bit rates (very lossy compression using parametric coding extensions) and highest bitrates (noiseless/lossless coding for high resolution audio at high sampling rates/resolutions), as well as the more complex high-level processing (scene analysis and sound field synthesis of various sorts).

• There is considerable research activity exploring audio presentation that is more immersive than the pervasive consumer 5.1 channel audio systems. One might apply the label of "3D Audio" to such explorations, since the common thread is the use of many loudspeakers positioned around, above and below the listener.


Meeting Report:

These documents do not necessarily express the official position of the AES on the issues discussed at these meetings, and only represent the views of committee members participating in the discussion. Any unauthorized use of these publications is prohibited. Authorization must be obtained from the Executive Director of the AES: Email, Tel: +1 212 661 8528, Address: 60 East 42nd Street, Room 2520, New York, New York 10165-2520, USA.

2014-10-27     137th_Convention_TC_CAS_Minutes
Description: Minutes of TC-CAS meeting at 137th AES Convention, Los Angeles, Oct 2014

2014-5-21     TC_CAS_Minutes_Berlin_2014
Description: Minutes of the TC on Audio Coding at the 136th AES Convention, Berlin, April 2014

2013-11-26     135th_AES_Convention_TC_CAS_Minutes
Description: 135th AES Convention TC_CAS Meeting Minutes

2013-5-13     134th AES Convention TC-CAS Meeting Report
Description: TC-CAS meeting minutes

2012-11-26     TC_CAS_Minutes_San Francisco_2012
Description: Minutes of the TC on Audio Coding at the 133rd AES Convention, San Francisco, October 2012

2012-6-29     2012_04_TC_CAS_Minutes_Budapest
Description: Minutes of the TC on Audio Coding at the 132nd AES Convention, Budapest, April 2012

2011-11-10     2011_10_TC_CAS_Minutes_New_York
Description: Minutes of the TC on Audio Coding at the 131st AES Convention, New York, Oct 2011

2011-6-3     TC_CAS_Minutes_London_2011
Description: Minutes of TC on Coding of Audio Signals from the 130th AES Convention, May 2011, London.

2010-11-23     2010_11_TC_CAS_Minutes_San_Francisco
Description: Minutes of TC on Coding of Audio Signals from the 129th AES Convention, November, 2010, San Francisco.

2010-5-27     2010_05_TC_CAS_Minutes_London
Description: Minutes of TC on Coding of Audio Signals from the 128th AES Convention, May 22, 2010, London.

2009-12-14     2009_10_TC_CAS_Minutes_NewYork
Description: Minutes of TC on Coding of Audio Signals from the 127th AES Convention, October 8-12, 2009, New York.

2009-6-23     TC_CAS Minutes Munich '09
Description: Minutes of TC on Audio Coding at 126th AES Convention, Munich, 05/09

2008-10-16     TC_CAS Minutes San Francisco '08
Description: Minutes of TC on Audio Coding at 125th AES Convention, San Francisco, 10/08

2008-5-26     TC_CAS Minutes Amsterdam '08
Description: Minutes of TC on Audio Coding at 124th AES Convention, Amsterdam, 05/08

2007-10-29     TC_CAS Minutes New York '07
Description: Minutes of TC on Audio Coding at 123rd AES Convention, New York, 10/07

2007-6-18     TC_CAS Minutes Vienna '07
Description: Minutes of TC on Audio Coding at 122nd AES Convention, Vienna, 5/07

2006-11-24     TC_CAS Minutes San Francisco '06
Description: Minutes of TC on Audio Coding at 121st AES Convention, San Francisco, 10/06

2006-9-24     TC_CAS Minutes Paris '06
Description: Minutes of TC on Audio Coding at 120th AES Convention, Paris, 5/06

2005-12-15     TC_CAS Minutes New York '05
Description: Minutes of TC on Audio Coding at 119th AES Convention, New York, 12/05

2005-6-23     TC_CAS Minutes Barcelona '05
Description: Minutes of TC on Audio Coding at 118th AES Convention, Barcelona, 5/05

2004-11-4     TC_CAS Minutes San Francisco '04
Description: Minutes of TC on Audio Coding at 117th AES Convention, San Francisco, 10/04

2004-6-23     TC_CAS Minutes Berlin '04
Description: Minutes of TC on Audio Coding at 116th AES Convention, Berlin, 5/04

2003-11-14     TC_CAS Minutes New York '03
Description: Minutes of TC on Audio Coding at 115th AES Convention, New York, 10/03

2003-4-8     TC_CAS Minutes Amsterdam '03
Description: Minutes of TC on Audio Coding at 114th AES Convention, Amsterdam, 3/03

2002-10-22     TC_CAS Minutes Los Angeles '02
Description: Minutes of TC on Audio Coding at 113th AES Convention, Los Angeles, 10/02

2002-9-22     TC_CAS Minutes Munich '02
Description: Minutes of TC on Audio Coding at 112th AES Convention, Munich, 5/02

2001-12-11     TC_CAS Minutes New York '01
Description: Minutes of TC on Audio Coding at 111th AES Convention, New York, 12/01

2001-5-29     TC_CAS Minutes Amsterdam '01
Description: Minutes of TC on Audio Coding at 110th AES Convention, Amsterdam, 5/01

2000-10-17     TC_CAS Minutes Los Angeles '00
Description: Minutes of TC on Audio Coding at 109th AES Convention, Los Angeles 9/00

2000-8-2     TC_CAS Minutes Paris '00
Description: Minutes of TC on Audio Coding at 108th AES Convention, Paris, 2/00

1999-11-4     TC_CAS Minutes New York '99
Description: Minutes of TC on Audio Coding at 107th AES Convention, New York, 9/99

1999-9-29     TC_CAS Minutes New York '97
Description: Minutes of the TC on Audio Coding at the 103rd AES Convention in NY, 9/97

1999-9-29     TC_CAS Minutes Amsterdam '98
Description: Minutes of TC on Audio Coding at 104th AES Convention, Amsterdam, 5/98

1999-5-26     TC_CAS Minutes Munich '99
Description: Minutes of TC on Audio Coding at 106th AES convention, Munich, 5/99

1998-12-22     TC_CAS Minutes San Francisco '98
Description: Minutes of TC on Audio Coding at 105th AES Convention, San Francisco 10/98


Other:

2009-11-3     Workshop: What Will Perceptual Audio Coding Stand for 20 Years from Now?
Description: Slide presentations from workshop W5 chaired by Anibal Ferreira at the 127th AES Convention, October 2009

2009-5-10     Workshop: MPEG SAOC: Interactive Audio and Broadcasting ...
Description: Slide presentations from workshop W22 chaired by Oliver Hellmuth at the 126th AES Convention, May 2009

2008-10-13     Workshop: Upcoming MPEG Standard for Efficient Parametric Coding and Rendering of Audio Objects
Description: Slide presentations from workshop W11 chaired by Oliver Hellmuth at the 125th AES Convention, October 2008

2007-10-6     Workshop: FROM SAC TO SAOC - RECENT DEVELOPMENTS IN PARAMETRIC CODING OF SPATIAL AUDIO
Description: Slide presentations from workshop W10 chaired by Juergen Herre at the AES 123rd Convention, October 2007

2006-10-29     Audio Coding Tutorial
Description: Tutorial on Perceptual Audio Coding, presented at 121st AES Convention, October 2006 (Part by Marina Bosi)

2006-10-20     Workshop: LOSSLESS AUDIO COMPRESSION — TECHNOLOGY AND FORMATS
Description: Slide presentations from workshop W14 chaired by Tilman Liebchen at the AES 121st Convention, October 2006

2006-10-16     Workshop: MPEG SURROUND — THE MPEG STANDARD FOR PARAMETRIC MULTICHANNEL AUDIO CODING
Description: Slide presentations from workshop W16 chaired by Juergen Herre at the AES 121st Convention, October 2006

2006-6-5     Workshop: MPEG SURROUND—RECENT PROGRESS IN PARAMETRIC CODING OF MULTICHANNEL AUDIO
Description: Slide presentations from workshop W9 chaired by Juergen Herre at the AES 120th Convention, May 2006


Committee Members

 Bernd Edler  Bernhard Grill  Christof Faller 
 Gerald Schuller  Gerhard Stoll  Grant Davidson 
 Heiko Purnhagen  James Johnston  Jens Spille 
 Juergen Herre  Karlheinz Brandenburg  Kenzo Akagiri 
 Louis Fielder  Marina Bosi  Mark Sandler 
 Martin Dietz  Mike Goodwin  Noel McKenna 
 Schuyler Quackenbush  Takehiro Moriya  Thomas Sporer 
 Werner Oomen  Michael Kelly  Andreas Ehret 
 Hossein Najaf-Zadeh  Miikka Vilermo  Ricardo Garcia 
 Frank Baumgarte  Tilman Liebchen  Anibal Ferreira 
 Oliver Hellmuth  Deepen Sinha  Regis Rossi Alves Faria 
 Ralf Geiger  David Trainor  Gary Spittle 
 Manfred Lutzky  Arijit Biswas 

To request membership in this Technical Committee please email the Chair by using the link above.

 