Publications

Digital Audio Educational CD

This new educational CD, produced by the AES Technical Committee on Signal Processing (TC_SP), is intended primarily to address issues relevant to digital signal processing algorithm designers and implementers.

Digital audio systems have become ubiquitous in recent years, due in large part to the ingenuity and effort of Audio Engineering Society members. The maturity of the digital audio signal processing field can present a serious challenge for new students, researchers, software engineers, product testers, and other newcomers, who must judge the relevance and relative importance of various signal processing parameters without being able to hear and see their effects. Simply reading a written description is not the best way to learn and understand audio processing issues.

Because audio digital signal processing algorithms typically require sustained, uninterrupted processing, even a single sample dropout or parameter update error can result in an audible artifact. Detecting, diagnosing, and correcting this sort of implementation error often requires experience listening for the defects, and examples of this type are included on this CD. The examples on this disc are intended to demonstrate a variety of the effects, both good and bad, that digital audio signal processing engineers are likely to come across in their work.

Note: this disc is "dual session": it contains CD-Audio tracks in the audio session and WAV files with HTML pages in the CD-ROM session, so it can be used in a computer or in a normal audio CD player.

AES Technical Committee on Signal Processing - Digital Audio Educational CD

Thank you for exploring this educational CD-ROM on digital audio signal processing, prepared by members of the AES Technical Committee on Signal Processing.

There are several notable audio demonstration CDs already available from commercial publishers and professional societies. These include the AES Technical Committee on Coding of Audio Signals' "What to Listen For" CD.

This new AES Technical Committee on Signal Processing (TC_SP) CD is intended primarily to address issues relevant to digital signal processing algorithm designers and implementers.

This is the second educational/tutorial CD-ROM presented by the AES Technical Council on a particular topic, combining background information with specific audio examples. Be aware that playing back the audio examples through small computer loudspeakers or with a noisy computer sound card may not reproduce the full frequency range and subtle nuances intended by the authors. To allow high-quality home playback equipment to be used for the audio excerpts, the disc can also be played on any standard audio CD player. We hope you find this a useful feature.

While at one time it was only possible to implement audio signal processing algorithms using special purpose hardware or arcane assembly language with a fast DSP microprocessor, it is now quite common to do audio processing with general-purpose microprocessors and high-level languages. This has made audio processing accessible to programmers who might not have had extensive experience with real-time audio systems. We believe that an awareness of numerical issues and proper algorithmic implementation can help avoid audible distortion and computational instability.

This project has actually been underway for many years. The AES TC_SP began initial discussions in 2001, and technical contributions and suggestions have come from many TC_SP members and friends since that time. The AES TC_SP thanks the AES Technical Council and the AES headquarters staff for making this project possible.

We also thank the TC_SP members who volunteered their time and talent to create the tutorial material and examples. We acknowledge the organizational contributions since 2001 of the TC_SP leadership, Ronald Aarts, James D. Johnston, and Christoph M. Musialik, and the work of Rob Maher, CD project coordinator. Many of the examples were prepared from a special recording provided by Stanley Lipshitz. The entire project team also acknowledges the work of Nermin Osmanovic in assembling the final CD layout.

We hope that you will find this CD useful, and thank you for your support.
- Christoph M. Musialik and James D. Johnston, co-chairs, AES Technical Committee on Signal Processing
- Robert C. Maher, TC_SP CD project coordinator

SECTION 1: SAMPLING, RECONSTRUCTION, AND AUDIO DSP

1.1 Sampling/Aliasing
1.2 Quantization and Dithering
1.3 Hard Limiting (Clipping) and Wrap-Around
1.4 Audible Effects of Frequency Filtering
1.5 Audibility of Interchannel Phase and Timing Differences
1.6 The Decibel Scale and Frequency Weightings

SECTION 2: BASIC PSYCHOACOUSTICS OF DIGITAL AUDIO

2.0 Section Introduction
2.1 Tone Masking Noise, and Noise Masking Tone Demonstrations
2.2 FM versus AM Modulation Inside a Critical Band
2.3 Level Differences

SECTION 3: DSP APPLIED IN PRACTICAL DIGITAL AUDIO SYSTEMS

3.1 Envelopes and Parameter Update Rate
3.2 Wavetable Signal Synthesis
3.3 Broadband Denoising for Audio Signals
3.4 Effects of Implementation Errors and Error Concealment
3.5 Effects of Cascaded Sample Rate Conversion

Audio Track  Section  Description

1.1 Sampling and Aliasing
1   1.1.1  Original piano.
2   1.1.2  Piano with 20 kHz bandwidth sampled at 10 kHz without a proper anti-aliasing filter.
3   1.1.3  Piano with 20 kHz bandwidth sampled at 8 kHz without a proper anti-aliasing filter.
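
To get a feel for where aliased components land in tracks like these, the folding of a frequency into the baseband can be sketched as follows. This is a minimal illustration, not the procedure used to prepare the tracks; the function name `alias_frequency` and the test frequencies are assumptions made for the example.

```python
def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Return the baseband frequency at which a tone at f_hz appears
    when it is sampled at fs_hz without an anti-aliasing filter."""
    f_mod = f_hz % fs_hz                 # sampling cannot tell f from f + k*fs
    return min(f_mod, fs_hz - f_mod)     # fold the upper half of the band down


# Illustrative tones (assumed values) sampled at 10 kHz.
for f in (3000.0, 6000.0, 9000.0, 12000.0):
    print(f, "Hz appears at", alias_frequency(f, 10000.0), "Hz")
```

Any component above half the sampling rate (5 kHz here) is mirrored to a lower frequency, which is why the unfiltered piano acquires inharmonic tones.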
  
1.2 Quantization, Dither & Noise Shaping Tracks
4   1.2.1  Original 16-bit piano music excerpt. Call it U ("Unattenuated").
5   1.2.2  20-dB attenuated 16-bit piano music excerpt. Call it A ("Attenuated").
6   1.2.3  Faded 16-bit piano music excerpt (0 to -60 dB at -3 dB/s). Call it F ("Faded").
7   1.2.4  Undithered mid-tread requantization of U to n bits, where n goes from 16 to 2 and back to 16 again, one bit at a time.
8   1.2.5  Undithered mid-riser requantization of U to n bits, where n goes from 16 to 2 and back to 16 again, one bit at a time.
9   1.2.6  Undithered mid-tread 8-bit requantization of A.
10  1.2.7  Undithered mid-riser 8-bit requantization of A.
11  1.2.8  Undithered mid-tread 8-bit requantization of F.
12  1.2.9  Quantization error of Track 11 (1.2.8).
13  1.2.10  Undithered mid-riser 8-bit requantization of F.
14  1.2.11  Quantization error of Track 13 (1.2.10).
15  1.2.12  Fractional RPDF-dithered mid-tread 8-bit quantization of a 300-Hz sine-wave of 9 (8-bit) LSBs peak-to-peak amplitude, where the peak-to-peak width of the RPDF dither is varied linearly from zero to one (8-bit) LSB and back to zero again.
16  1.2.13  Total error of Track 15 (1.2.12).
17  1.2.14  Fractional RPDF-dithered mid-riser 8-bit quantization of a 300-Hz sine-wave of 9 (8-bit) LSBs peak-to-peak amplitude, where the peak-to-peak width of the RPDF dither is varied linearly from zero to one (8-bit) LSB and back to zero again.
18  1.2.15  Total error of Track 17 (1.2.14).
19  1.2.16  Fractional RPDF-dithered mid-tread 8-bit requantization of A, where the peak-peak width of the RPDF dither is varied linearly from zero to one (8-bit) LSB and back to zero again.
20  1.2.17  Fractional RPDF-dithered mid-riser 8-bit requantization of A, where the peak-peak width of the RPDF dither is varied linearly from zero to one (8-bit) LSB and back to zero again.
21  1.2.18  RPDF-dithered mid-tread 8-bit requantization of F.
22  1.2.19  Total error of Track 21 (1.2.18).
23  1.2.20  RPDF-dithered mid-tread 8-bit requantization of {F + triangle-wave dc ramp varying linearly up and down between 0 and +1 (8-bit) LSB with a period of 2 s}.
24  1.2.21  Total error of Track 23 (1.2.20).
25  1.2.22  Undithered mid-tread 8-bit requantization of F with RPDF dither added after the requantization.
26  1.2.23  TPDF-dithered mid-tread 8-bit requantization of F.
27  1.2.24  Total error of Track 26 (1.2.23).
28  1.2.25  TPDF-dithered mid-tread 8-bit requantization of {F + triangle-wave dc ramp varying linearly up and down between 0 and +1 LSB with a period of 2 s}.
29  1.2.26  Total error of Track 28 (1.2.25).
30  1.2.27  Subtractive RPDF-dithered mid-tread 8-bit requantization of F.
31  1.2.28  High-pass TPDF-dithered mid-tread 8-bit requantization of F.
32  1.2.29  Undithered 1st-order mid-tread 8-bit noise-shaper requantization of F.
33  1.2.30  Undithered 1st-order mid-tread 8-bit noise-shaper requantization of {F + triangle-wave dc ramp varying linearly up and down between 0 and +1 LSB with a period of 2 s}.
34  1.2.31  RPDF-dithered 1st-order mid-tread 8-bit noise-shaper requantization of F.
35  1.2.32  TPDF-dithered 1st-order mid-tread 8-bit noise-shaper requantization of F.
36  1.2.33  Undithered tilted-F-weighted 9th-order mid-tread 8-bit noise-shaper requantization of F.
37  1.2.34  Undithered tilted-F-weighted 9th-order mid-tread 8-bit noise-shaper requantization of {F + triangle-wave dc ramp varying linearly up and down between 0 and +1 LSB with a period of 2 s}.
38  1.2.35  RPDF-dithered tilted-F-weighted 9th-order mid-tread 8-bit noise-shaper requantization of F.
39  1.2.36  TPDF-dithered tilted-F-weighted 9th-order mid-tread 8-bit noise-shaper requantization of F.
40  1.2.37  Undithered 12th-order mid-tread 8-bit noise-shaper requantization of F.
41  1.2.38  Undithered 12th-order mid-tread 8-bit noise-shaper requantization of {F + triangle-wave dc ramp varying linearly up and down between 0 and +1 LSB with a period of 2 s}.
42  1.2.39  Total error of Track 41 (1.2.38).
43  1.2.40  RPDF-dithered 12th-order mid-tread 8-bit noise-shaper requantization of F.
44  1.2.41  RPDF-dithered 12th-order mid-tread 8-bit noise-shaper requantization of {F + triangle-wave dc ramp varying linearly up and down between 0 and +1 LSB with a period of 2 s}.
45  1.2.42  TPDF-dithered 12th-order mid-tread 8-bit noise-shaper requantization of F.
46  1.2.43  Concatenation of 5 s segments of dithered "silence" from each of the systems of Tracks 26 (1.2.23, TPDF-dithered rounding), 31 (1.2.28, high-pass TPDF-dithered rounding), 35 (1.2.32, TPDF-dithered 1st-order noise-shaped rounding), and 39 (1.2.36, TPDF-dithered tilted-F-weighted 9th-order noise-shaped rounding), repeated twice.
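
The terms used in this group (mid-tread, mid-riser, RPDF, TPDF) can be illustrated with a short sketch. It is not the code used to produce the tracks: the function names, the assumed signal range of -1 to +1, and the test input are all choices made for this example, and noise shaping is left out.

```python
import math
import random


def mid_tread(x: float, lsb: float) -> float:
    """Round to the nearest multiple of lsb; zero is an output level."""
    return lsb * math.floor(x / lsb + 0.5)


def mid_riser(x: float, lsb: float) -> float:
    """Quantize to odd multiples of lsb/2; zero is a decision threshold."""
    return lsb * (math.floor(x / lsb) + 0.5)


def rpdf(lsb: float, rng: random.Random) -> float:
    """Rectangular-PDF dither, 1 LSB peak-to-peak."""
    return rng.uniform(-0.5 * lsb, 0.5 * lsb)


def tpdf(lsb: float, rng: random.Random) -> float:
    """Triangular-PDF dither, 2 LSB peak-to-peak (sum of two RPDF values)."""
    return rpdf(lsb, rng) + rpdf(lsb, rng)


rng = random.Random(0)
lsb = 2.0 / 256          # 8-bit step size for an assumed range of -1..+1
x = 0.3 * lsb            # a constant input well below one LSB

# Without dither the mid-tread quantizer outputs zero; the mid-riser
# quantizer outputs +lsb/2 because it has no level at zero.
print(mid_tread(x, lsb), mid_riser(x, lsb))

# With TPDF dither the average output approaches the input itself.
n = 20000
mean = sum(mid_tread(x + tpdf(lsb, rng), lsb) for _ in range(n)) / n
print("mean output in LSBs:", mean / lsb)
```

The averaged result shows why dither matters for low-level signals such as the faded excerpt F: the information below one LSB survives as a noisy but unbiased signal instead of being truncated away.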
  
1.3 Hard Limiting (Clipping), Analog and Digital
47  1.3.1  Hard digital clipping.
48  1.3.2  Wrap around.
49  1.3.3  Hard 'analog' clipping.
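
The difference between digital clipping and wrap-around can be sketched for 16-bit samples as follows. This is an illustrative example only; the function names are hypothetical and the input values are arbitrary.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def hard_clip(v: int) -> int:
    """Saturate an out-of-range value to the 16-bit limits."""
    return max(INT16_MIN, min(INT16_MAX, v))


def wrap_around(v: int) -> int:
    """Keep only the low 16 bits, as two's-complement overflow would."""
    return (v + 32768) % 65536 - 32768


for v in (30000, 32767, 32768, 40000, -40000):
    print(v, hard_clip(v), wrap_around(v))
```

Clipping holds an overloaded sample at the nearest limit, while wrap-around flips it to the opposite polarity, which produces a much larger error and a far more objectionable sound.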
  
1.4 Audible Effects of Frequency Filtering
50  1.4.1  Original piano.
    1.4.2  Piano with 100 Hz highpass filter (signal frequency content attenuated below 100 Hz).
    1.4.3  Piano with 500 Hz highpass filter (signal frequency content attenuated below 500 Hz).
    1.4.4  Piano with 1000 Hz highpass filter (signal frequency content attenuated below 1000 Hz).
51  1.4.5  Piano with 1000 Hz lowpass filter (signal frequency content attenuated above 1000 Hz).
    1.4.6  Piano with 500 Hz lowpass filter (signal frequency content attenuated above 500 Hz).
    1.4.7  Piano with 100 Hz lowpass filter (signal frequency content attenuated above 100 Hz). Note: this example may be inaudible when using small loudspeakers that are unable to reproduce frequency content below 100 Hz.
52  1.4.8  Piano with 440 Hz - 880 Hz bandpass filter (signal frequency content attenuated below 440 Hz and above 880 Hz).
  
1.5 Audibility of Interchannel Phase and Timing Differences
53  1.5.1  Interchannel delay sequence, 441 Hz tones. Phase difference between channels: 0 samples (in phase), 1 sample, 2 samples, 5 samples, 10 samples, 20 samples, 50 samples. Sequence presented three times.
    1.5.2  Interchannel delay sequence, 1 ms clicks. Phase difference between channels: 0 samples (in phase), 1 sample, 2 samples, 5 samples, 10 samples, 20 samples, 50 samples. Sequence presented three times.
    1.5.3  Interchannel delay sequence, 1 ms clicks. Time difference between channels: 0 ms (synchronized), 100 µs, 200 µs, 500 µs, 1 ms, 2 ms, 5 ms, 10 ms. Sequence presented three times.
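
The sample delays listed for this group can be converted to time and phase with a few lines of arithmetic. The sketch below assumes the standard CD-Audio sampling rate of 44.1 kHz, which the track listing does not state explicitly; the function names are hypothetical.

```python
FS_HZ = 44100.0     # assumed: standard CD-Audio sampling rate
TONE_HZ = 441.0     # tone frequency given for section 1.5.1


def delay_to_time_us(delay_samples: int) -> float:
    """Convert a delay in samples to microseconds."""
    return delay_samples / FS_HZ * 1e6


def delay_to_phase_deg(delay_samples: int, tone_hz: float = TONE_HZ) -> float:
    """Phase shift, in degrees, that the delay causes at tone_hz."""
    return 360.0 * tone_hz * delay_samples / FS_HZ


for n in (0, 1, 2, 5, 10, 20, 50):
    print(n, "samples:",
          round(delay_to_time_us(n), 1), "us,",
          round(delay_to_phase_deg(n), 1), "degrees at 441 Hz")
```

Under this assumption one period of the 441 Hz tone is exactly 100 samples, so the 50-sample delay puts the two channels in opposite phase.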
  
1.6 The Decibel Scale
54  1.6.1  Original broadband noise scaled for -12 dB with respect to full scale.
    1.6.2  Broadband noise with A-weighting applied.
    1.6.3  Broadband noise with C-weighting applied.
  
2.1 Tone Masking Noise and Noise Masking Tone
55  2.1.1  Tone masking noise sequence.
56  2.1.2  Noise masking tone sequence.

2.2 Audibility of Phase: AM vs. FM Within a Critical Band
57  2.2.1  AM and FM sequence.

2.3 Level Differences
58  2.3.1  Level comparison by windowing between the levels.
59  2.3.2  Level comparison by forcing a time interval between the two pulses at different levels.
  
3.1 Envelopes and Parameter Update Rate
60  3.1.1  Gain changes abruptly from unity to 8 and back to unity (0 dB to +18 dB and back to 0 dB), presented twice.
61  3.1.2  Gain changes linearly from unity to 8 and back to unity (0 dB to +18 dB and back to 0 dB), presented twice.
62  3.1.3  Gain changes linearly from x1 to x8 in a set of 4 abrupt steps.
    3.1.4  Gain changes linearly from x1 to x8 in a set of 16 abrupt steps.
    3.1.5  Gain changes linearly from x1 to x8 in a set of 64 abrupt steps.
63  3.1.6  Gain changes linearly from x1 to x8 in a set of 4 linearly ramped steps.
    3.1.7  Gain changes linearly from x1 to x8 in a set of 16 linearly ramped steps.
    3.1.8  Gain changes linearly from x1 to x8 in a set of 64 linearly ramped steps.
64  3.1.9  Gain changes logarithmically from x1 to x8 in a set of 4 abrupt steps.
    3.1.10  Gain changes logarithmically from x1 to x8 in a set of 16 abrupt steps.
    3.1.11  Gain changes logarithmically from x1 to x8 in a set of 64 abrupt steps.
65  3.1.12  Gain changes logarithmically from x1 to x8 in a set of 4 linearly ramped steps.
    3.1.13  Gain changes logarithmically from x1 to x8 in a set of 16 linearly ramped steps.
    3.1.14  Gain changes logarithmically from x1 to x8 in a set of 64 linearly ramped steps.
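
The distinction between linear and logarithmic gain steps can be sketched as below. This is one plausible reading of the track descriptions, offered as an assumption rather than the authors' exact procedure: "linear" is taken to mean equal increments of the gain factor, and "logarithmic" equal increments in decibels. The function names are hypothetical.

```python
import math


def to_db(gain: float) -> float:
    """Convert a linear amplitude gain factor to decibels."""
    return 20.0 * math.log10(gain)


def linear_steps(g0: float, g1: float, n: int):
    """n equal increments of the gain factor from g0 to g1."""
    return [g0 + (g1 - g0) * k / n for k in range(1, n + 1)]


def log_steps(g0: float, g1: float, n: int):
    """n equal increments in dB (equal gain ratios) from g0 to g1."""
    ratio = (g1 / g0) ** (1.0 / n)
    return [g0 * ratio ** k for k in range(1, n + 1)]


print("x8 in dB:", round(to_db(8.0), 2))
for name, steps in (("linear", linear_steps(1.0, 8.0, 4)),
                    ("log", log_steps(1.0, 8.0, 4))):
    print(name, [round(g, 3) for g in steps],
          [round(to_db(g), 2) for g in steps])
```

With linear steps the first jump is the largest in decibel terms and the last the smallest, while logarithmic steps are all the same size in decibels; using more steps, or ramping between them, reduces the size of each discontinuity.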
  
3.2 Signal Generation Via Wave Tables
66  3.2.1  Sample Increment (SI) = 1 (each sample of the wave table is played sequentially).
67  3.2.2  SI = 2 (every other sample of the wave table is played).
68  3.2.3  SI = 3 (every third sample of the wave table is played).
69  3.2.4  SI = 1.49 (non-integer SI means that the table lookup index is generally not an integer, so round-to-nearest sample introduces a waveform error).
70  3.2.5  SI = 1.71 (non-integer SI means that the table lookup index is generally not an integer, so round-to-nearest sample introduces a waveform error).
71  3.2.6  Sample Increment (SI) swept from 0.5 to 2.
72  3.2.7  Difference between the wave table sweep and a "perfect" (no lookup rounding) sweep.
73  3.2.8  Difference boosted by 20 dB to make it more audible.
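
The lookup rounding error described for non-integer sample increments can be reproduced with a small wavetable oscillator. The sketch below is an illustration under stated assumptions: the 64-entry sine table, the function names, and the number of samples are hypothetical and are not taken from the CD material.

```python
import math

TABLE_SIZE = 64   # assumed table length holding one cycle of a sine wave
TABLE = [math.sin(2.0 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]


def wavetable_osc(si: float, n_samples: int):
    """Read TABLE with sample increment si, rounding to the nearest entry."""
    out = []
    phase = 0.0
    for _ in range(n_samples):
        index = int(math.floor(phase + 0.5)) % TABLE_SIZE  # round, then wrap
        out.append(TABLE[index])
        phase = (phase + si) % TABLE_SIZE
    return out


def ideal_osc(si: float, n_samples: int):
    """Reference oscillator evaluated at the exact (unrounded) phase."""
    return [math.sin(2.0 * math.pi * k * si / TABLE_SIZE)
            for k in range(n_samples)]


def max_error(si: float, n_samples: int = 200) -> float:
    """Largest absolute difference between the lookup and the reference."""
    pairs = zip(wavetable_osc(si, n_samples), ideal_osc(si, n_samples))
    return max(abs(a - b) for a, b in pairs)


for si in (1.0, 2.0, 3.0, 1.49, 1.71):
    print("SI =", si, "max lookup error =", max_error(si))
```

Integer increments always land on a table entry, so the error is at floating-point level, while the non-integer increments produce an error that depends on the table length. A longer table or interpolation between entries would reduce it.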
  
3.3 Denoising to Remove Broadband Noise
74  3.3.1  Original.
75  3.3.2  Strong (-20 dBFS peak) tape-like noise.
76  3.3.3  Distorted file resulting from adding the original and the noise.
77  3.3.4  File denoised with DeNoiser from Sound Laundry, without attack and release parameters: note an obtrusive chirping.
78  3.3.5  File denoised with NoiseFree denoiser, short attack, short release: better, but still with a lot of phasy noise.
79  3.3.6  File denoised with NoiseFree denoiser, medium attack and release: less noise, but note the "fuzzy" onsets due to large block size leading to transient "smearing".
80  3.3.7  File denoised with NoiseFree denoiser, short attack and long release: more denoising is still possible, but at the cost of transient smoothing.
81  3.3.8  Original file optimally denoised with Algorithmix NoiseFree denoiser.
  
3.4 Effects of Implementation Errors and Error Concealment
82  3.4.1  Every eighth sample is left out (missed).
83  3.4.2  Every 32nd sample is left out (missed).
84  3.4.3  Every eighth sample is set to zero (skipped).
85  3.4.4  Every 32nd sample is set to zero (skipped).
86  3.4.5  Comb filter (delay length 8) for comparison to miss and skip examples.
87  3.4.6  Comb filter (delay length 32) for comparison to miss and skip examples.
88  3.4.7  Filtered source without overflows (correct output).
89  3.4.8  Filtered source with internal filter overflow compensation by clipping.
90  3.4.9  Filtered source with internal filter overflow compensation by wrap-around.
91  3.4.10  Lattice filtered source without overflows (correct output).
92  3.4.11  Lattice filtered source with overflow compensation by clipping.
93  3.4.12  Lattice filtered source with overflow compensation by wrap-around.
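
The "missed" and "skipped" sample errors, and the comb filter offered for comparison, can be sketched as follows. The function names are hypothetical, and the feedforward form of the comb filter is an assumption, since the listing gives only its delay length.

```python
def drop_every_nth(x, n):
    """'Missed' samples: every n-th sample is removed, shortening the signal."""
    return [v for i, v in enumerate(x, start=1) if i % n != 0]


def zero_every_nth(x, n):
    """'Skipped' samples: every n-th sample is replaced by zero."""
    return [0 if i % n == 0 else v for i, v in enumerate(x, start=1)]


def comb_filter(x, delay):
    """Feedforward comb (assumed form): y[k] = x[k] + x[k - delay]."""
    return [v + (x[k - delay] if k >= delay else 0) for k, v in enumerate(x)]


signal = list(range(1, 17))          # 16 placeholder sample values
print(drop_every_nth(signal, 8))     # 14 samples remain
print(zero_every_nth(signal, 8))     # 16 samples, two of them zeroed
print(comb_filter(signal, 8))
```

Both errors repeat with a period of n samples, which is presumably why a comb filter with the same delay length is offered as a reference point for listening.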
  
3.5 Cascading Sample Rate Conversions
94  3.5.1  Original source (44.1 kHz/16-bit stereo).
95  3.5.2  High quality, cascaded 1x (44.1 kHz/16-bit stereo).
96  3.5.3  High quality, cascaded 2x (44.1 kHz/16-bit stereo).
97  3.5.4  Consumer quality, cascaded 1x (44.1 kHz/16-bit stereo).
98  3.5.5  Consumer quality, cascaded 2x (44.1 kHz/16-bit stereo).