Meeting Topic: Issues in Digital Production That Remain a Problem
Moderator Name: Bob Moses
Speaker Name: James D. (JJ) Johnston, Chief Scientist, Immersion Networks and Bob Smith, SoundSmith Labs
Other business or activities at the meeting: Held in conjunction with the IEEE Signal Processing Society, Seattle IEEE Section
Meeting Location: Digipen Institute of Technology, Redmond WA
For October, the PNW Section held a joint meeting with the Signal Processing Society of the IEEE Seattle Section. James D. Johnston (JJ) from Immersion Networks and Bob Smith from SoundSmith Labs spoke about problems in digital audio production that persist in current practice. The meeting was held October 3, 2018 at Digipen Institute of Technology in Redmond, WA, with about 36 people attending (20 AES members and 5 IEEE members).
PNW Chair Bob Moses welcomed IEEE Signal Processing Society Chair Adam Loper, who hopes to have an active IEEE program.
JJ is Chief Scientist (or Factotum, as he puts it) at Redmond-based Immersion Networks. He worked for 26 years at AT&T Bell Labs and its successor, AT&T Labs Research. One of the first investigators in the field of perceptual audio coding, he was one of the inventors and standardizers of MPEG 1/2 audio Layer 3 and MPEG-2 AAC, as well as of the AT&T Labs-Research PXFM (perceptual transform coding), PAC (perceptual audio coding), and ASPEC.
At Immersion, he is working in the area of auditory perception of soundfields, electronic soundfield correction, ways to capture soundfield cues and represent them, and ways to expand the limited sense of realism available in standard audio playback for both captured and synthetic performances.
JJ is an IEEE Fellow, an AES Fellow, a NJ Inventor of the Year, an AT&T Technical Medalist and Standards Awardee, and a co-recipient of the IEEE Donald Fink Paper Award. In 2006, he received the James L. Flanagan Signal Processing Award from the IEEE Signal Processing Society. He presented the 2012 Heyser Lecture at the AES 133rd Convention.
JJ began his talk, "Make it Loud and Other Disasters," by pointing out known technical issues with digital audio: digital clipping creates aliased harmonics in the audible range; digital-to-analog converters (DACs) can generate all manner of artifacts with full-scale signals; and full-scale signals put through a codec (e.g. MP3) will generate still more clipping. In addition, DAC designs vary, and some implementations omit some codes, dither, or otherwise behave differently and oddly at 0 dBFS.
Spectral analysis graphs showed that even a signal clipped below full scale produces strong, unintended harmonics in the audible range. He therefore recommended oversampling enough to keep the aliases out of the audible band, then distorting the signal, then downsampling.
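The oversample, distort, downsample recommendation can be illustrated with a small numerical sketch. This is not code from the talk: the 48 kHz sample rate, the 5 kHz test tone, the clip level of 0.5, the 8x factor, and the Blackman-windowed sinc filter are all assumptions chosen for illustration. It clips the tone once directly and once at the higher rate, then measures the level at 13 kHz, which is where the 35 kHz seventh harmonic folds to at 48 kHz.

```python
import math

def lowpass_taps(cutoff, num_taps):
    """Blackman-windowed sinc low-pass; cutoff in cycles per sample."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        phase = 2 * math.pi * n / (num_taps - 1)
        taps.append(ideal * (0.42 - 0.5 * math.cos(phase) + 0.08 * math.cos(2 * phase)))
    return taps

def upsample(x, factor, taps):
    """Zero-stuff by `factor`, low-pass, and scale by `factor` to restore level."""
    delay = (len(taps) - 1) // 2
    out = []
    for m in range(len(x) * factor):
        first = m + delay - len(taps) + 1
        j_lo = max(0, -(-first // factor))          # ceil(first / factor)
        j_hi = min(len(x) - 1, (m + delay) // factor)
        acc = 0.0
        for j in range(j_lo, j_hi + 1):             # only original samples are non-zero
            acc += x[j] * taps[m + delay - j * factor]
        out.append(factor * acc)
    return out

def decimate(x, factor, taps):
    """Low-pass, keeping only every `factor`-th output sample."""
    delay = (len(taps) - 1) // 2
    out = []
    for m in range(0, len(x), factor):
        acc = 0.0
        for t, h in enumerate(taps):
            i = m + delay - t
            if 0 <= i < len(x):
                acc += x[i] * h
        out.append(acc)
    return out

def hard_clip(x, limit):
    return [max(-limit, min(limit, v)) for v in x]

def tone_level(x, freq, fs):
    """Amplitude of the component at `freq` (Hann-windowed single-bin DFT)."""
    re = im = wsum = 0.0
    for i, v in enumerate(x):
        w = 0.5 - 0.5 * math.cos(2 * math.pi * i / len(x))
        re += w * v * math.cos(2 * math.pi * freq * i / fs)
        im += w * v * math.sin(2 * math.pi * freq * i / fs)
        wsum += w
    return 2 * math.hypot(re, im) / wsum

# All values below are illustrative assumptions, not figures from the talk.
FS, F0, FACTOR, CLIP = 48_000, 5_000, 8, 0.5
tone = [math.sin(2 * math.pi * F0 * i / FS) for i in range(960)]
taps = lowpass_taps(20_000 / (FS * FACTOR), 257)

naive = hard_clip(tone, CLIP)
oversampled = decimate(hard_clip(upsample(tone, FACTOR, taps), CLIP), FACTOR, taps)

def steady(x):
    return x[96:-96]        # drop filter edge transients before measuring

ALIAS_HZ = 13_000           # 35 kHz (7th harmonic) folded around 48 kHz
naive_alias = tone_level(steady(naive), ALIAS_HZ, FS)
clean_alias = tone_level(steady(oversampled), ALIAS_HZ, FS)
print("alias level, clipped at 48 kHz:", naive_alias)
print("alias level, clipped at 8x rate:", clean_alias)
```

In this sketch the fundamental comes through both paths at essentially the same level, while the folded component at 13 kHz is far weaker when the clipping is done at the higher rate. A production implementation would use a faster filter structure; the direct loops here are only meant to keep each step visible.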
"Intersample overs" are another real problem. Two adjacent clipped digital values can produce an output waveform of much higher amplitude between them, and how a given DAC handles that is unpredictable. Processing such as EQ can easily create such clipping and overs, as can perceptual coding (e.g. MP3).
Graphs from several unnamed commercial CD music examples were shown, with histograms of the actual and 5X oversampled levels, perceived loudness (using a model he likes), and a Spectral Flatness Measure (SFM).
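For readers unfamiliar with the Spectral Flatness Measure, the sketch below uses the common textbook definition: the geometric mean of the power spectrum divided by its arithmetic mean. The report does not say which variant, frame size, or window JJ used, so the 256-sample Hann-windowed frame and the small numerical floor here are assumptions.

```python
import cmath
import math
import random

def power_spectrum(frame):
    """Power in DFT bins 1 .. N/2-1 of a Hann-windowed frame (direct O(N^2) DFT)."""
    n = len(frame)
    windowed = [v * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, v in enumerate(frame)]
    spectrum = []
    for k in range(1, n // 2):
        acc = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                  for i, v in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def spectral_flatness(frame, floor=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum (0 .. 1)."""
    power = [p + floor for p in power_spectrum(frame)]   # floor avoids log(0)
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))

# Illustrative inputs: a pure tone (very peaky spectrum) and white noise (flat).
N = 256
sine = [math.sin(2 * math.pi * 8 * i / N) for i in range(N)]
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(N)]

sfm_sine = spectral_flatness(sine)
sfm_noise = spectral_flatness(noise)
print("SFM of a pure tone: ", sfm_sine)
print("SFM of white noise:", sfm_noise)
```

A value near 0 indicates a tonal, peaky spectrum, and values approaching 1 indicate a noise-like, flat one.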
All of these DAC behaviors, combined with the pursuit of loudness and a certain distorted tone in many modern recordings, mean that what a particular end user hears can be wildly unpredictable. To keep the audible end result consistent, JJ suggests keeping levels below -3 dBFS, as well as checking for intersample overs and peak/RMS level. Checking the behavior of your DACs won't hurt, either. Artistic distortion should be created by means other than pushing the converter chain.
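The level checks suggested above are simple to compute. The following is a minimal sketch, assuming floating-point samples normalized so that 1.0 is full scale; the function names are hypothetical. It reports the sample peak (not the intersample peak), and it uses the convention in which a full-scale sine has an RMS level of about -3 dBFS; some meters instead reference RMS to a full-scale sine.

```python
import math

def peak_dbfs(x):
    """Sample peak in dBFS (1.0 = full scale); does not see intersample peaks."""
    return 20 * math.log10(max(max(abs(v) for v in x), 1e-12))

def rms_dbfs(x):
    """RMS level in dBFS, referenced to a full-scale square wave."""
    mean_square = sum(v * v for v in x) / len(x)
    return 20 * math.log10(max(math.sqrt(mean_square), 1e-12))

def crest_factor_db(x):
    """Peak-to-RMS ratio in dB."""
    return peak_dbfs(x) - rms_dbfs(x)

def gain_for_peak(x, target_dbfs=-3.0):
    """Linear gain that moves the sample peak to `target_dbfs`."""
    return 10 ** ((target_dbfs - peak_dbfs(x)) / 20)

# Illustrative input: a sine whose samples reach exactly full scale.
sine = [math.sin(2 * math.pi * i / 8) for i in range(800)]
gain = gain_for_peak(sine, -3.0)
scaled = [gain * v for v in sine]

print("peak before:", peak_dbfs(sine), "dBFS")
print("crest factor:", crest_factor_db(sine), "dB")
print("peak after:", peak_dbfs(scaled), "dBFS")
```

Heavily limited material shows a small crest factor, so a low peak/RMS figure together with peaks at 0 dBFS is a hint that the problems described in the talk may apply.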
Bob Smith earned his BSEE at the University of Washington and is principal of SoundSmith Labs. He has also worked in the biomedical field for over 40 years, notably on speech systems for medical devices.
Bob demonstrated several real-world DAC behaviors at their full-scale limits. Five commercial audio DAC interfaces with different chips, as well as an iPhone 4S, were tested with square waves at -3 dB and at full scale. Their behaviors at or near full scale were examined on a PicoScope and differed vastly. Square wave testing predicts which DACs will have trouble with intersample overs. Testing with high-level audio signals then confirmed different resulting waveforms and spectra from clipping and intersample overs. Each manufacturer makes design tradeoffs in both digital anti-alias filter algorithms and analog output gain staging. Some of those implementations faithfully reproduce the signal near full scale; others fail in various ways, some spectacularly. All of them, however, performed similarly and much more faithfully when the source material stayed below -3 dBFS.
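Why a full-scale square wave stresses a DAC, and why 3 dB of headroom helps, can be sketched numerically. The report does not give the test frequencies Bob used, so the example assumes a square wave at one quarter of the sample rate, whose samples are simply 1, 1, -1, -1. The band-limited waveform passing through those samples is a sine about 3 dB larger than the samples themselves. The function name, the 4x interpolation factor, and the Hann-windowed sinc interpolator are illustrative choices, not a description of any particular DAC.

```python
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def true_peak(samples, factor=4, half_width=32):
    """Estimate the reconstructed peak with Hann-windowed sinc interpolation."""
    peak = 0.0
    # Skip the edges so every interpolation uses a full window of samples.
    for i in range(half_width, len(samples) - half_width):
        for k in range(factor):
            t = i + k / factor                      # time in sample periods
            acc = 0.0
            for j in range(i - half_width + 1, i + half_width + 1):
                d = t - j
                window = 0.5 + 0.5 * math.cos(math.pi * d / half_width)
                acc += samples[j] * sinc(d) * window
            peak = max(peak, abs(acc))
    return peak

# Assumed test signal: full-scale square wave at fs/4.
full_scale = [1.0, 1.0, -1.0, -1.0] * 64
minus_3_db = [v * 10 ** (-3 / 20) for v in full_scale]

peak_full = true_peak(full_scale)
peak_m3 = true_peak(minus_3_db)
print("sample peak 0 dBFS,  reconstructed peak (dBFS):", 20 * math.log10(peak_full))
print("sample peak -3 dBFS, reconstructed peak (dBFS):", 20 * math.log10(peak_m3))
```

With the samples at full scale, the reconstructed peak lands about 3 dB over, which a converter must either have headroom for or clip. With the same signal at -3 dB, the reconstructed peak sits close to full scale, consistent with the faithful reproduction observed below -3 dBFS.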