Audio Engineering Society

Chicago Section

Meeting Review, April 1998

AES member Matt Nelson reviews the April 1998 meeting presentation, "Perception Based Audio Signal Processing," given by William Sethares of the University of Wisconsin-Madison.

Bill's presentation focused mainly on the human perception of sounds in music, specifically the consonance and dissonance of sounds. Certain pairs of notes sound "good" together while others do not. Why is this? Bill theorizes that "good" sounding combinations often have a simple integer ratio between the frequencies played, whereas "bad" sounding combinations have some other ratio, such as 1:1.41.

This can be backed up with theories by Pythagoras and Helmholtz. Specifically:

-Pythagoras held that humans generally prefer intervals with small integer frequency ratios, which correspond to a short period of repetition.

-Helmholtz states that partials of a sound that are close in frequency cause beats that are perceived as roughness, or dissonance.
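
The link between small integer ratios and a short period of repetition can be checked with a little arithmetic. The sketch below is only an illustration and was not part of the presentation; the function names and the 220 Hz example tone are assumptions. For two tones whose frequencies are in the ratio p:q (in lowest terms), the combined waveform repeats after q cycles of the lower tone.

```python
from fractions import Fraction


def cycles_until_repeat(ratio):
    """Cycles of the lower tone before the two-tone waveform repeats.

    `ratio` is the upper/lower frequency ratio. Fraction reduces it to
    lowest terms p/q, and the combined waveform repeats after q cycles
    of the lower tone.
    """
    return Fraction(ratio).denominator


def combined_period(f_low_hz, ratio):
    """Exact period, in seconds, of the combined waveform (hypothetical helper)."""
    return Fraction(cycles_until_repeat(ratio)) / Fraction(f_low_hz)


if __name__ == "__main__":
    # 220 Hz is an arbitrary example frequency for the lower tone.
    for label, ratio in [("3:2", Fraction(3, 2)), ("1.41:1", Fraction("1.41"))]:
        print(label, cycles_until_repeat(ratio), float(combined_period(220, ratio)))
```

A 3:2 interval repeats every 2 cycles of the lower tone, while a ratio of exactly 1.41 (141:100) needs 100 cycles before the pattern repeats.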

Following from this, the absence of beats is called consonance. As two tones played simultaneously move farther apart in frequency, the beats speed up until they reach a maximally fast rate, at which point we distinctly perceive two different tones.
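
The beating described here follows from a standard trigonometric identity: the sum of two equal-amplitude sine tones equals a tone at their mean frequency multiplied by a slow envelope, and that envelope rises and falls at the difference frequency. The sketch below is a minimal illustration of this, with hypothetical function names, and is not taken from the presentation.

```python
import math


def beat_frequency(f1_hz, f2_hz):
    """Beat rate in Hz when two pure tones of similar frequency sound together."""
    return abs(f1_hz - f2_hz)


def two_tone_sum(f1_hz, f2_hz, t):
    """Direct sum of two unit-amplitude sine tones at time t (seconds)."""
    return math.sin(2 * math.pi * f1_hz * t) + math.sin(2 * math.pi * f2_hz * t)


def carrier_times_envelope(f1_hz, f2_hz, t):
    """The same signal written as a mean-frequency tone times a slow envelope.

    Uses sin(a) + sin(b) = 2 * sin((a + b) / 2) * cos((a - b) / 2).
    The envelope magnitude repeats |f1 - f2| times per second.
    """
    mean = (f1_hz + f2_hz) / 2
    half_diff = (f1_hz - f2_hz) / 2
    envelope = 2 * math.cos(2 * math.pi * half_diff * t)
    return envelope * math.sin(2 * math.pi * mean * t)


if __name__ == "__main__":
    # Example values only: tones at 440 Hz and 444 Hz beat 4 times per second.
    print(beat_frequency(440, 444))
```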

Bill then explored the concept of examining dissonance curves to obtain "good sounding" notes in any scale. Many musical sounds have harmonic partials: f, 2f, 3f, and so on, which is approximately the case in the 12-tet (12-tone equal temperament) scale. A dissonance curve is drawn by taking the spectrum of a sound and summing the dissonance between all pairs of partials over a range of frequencies. Upon examination of dissonance curves for harmonic tones, there are many minima that occur at simple integer ratios. These sharp minima occur where partials of the two tones overlap, which gives low dissonance.
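
The procedure described above can be sketched in a few lines. The review does not give the formula used for the dissonance between two partials, so the code below assumes a Plomp-Levelt style roughness curve with commonly quoted constants. The six-partial harmonic timbre, the 0.88 amplitude rolloff, the 500 Hz base frequency, and the amplitude-product weighting are also assumptions chosen only for illustration.

```python
import math

# Assumed constants for a Plomp-Levelt style roughness curve (not from the review).
A, B = 3.5, 5.75
D_STAR, S1, S2 = 0.24, 0.0207, 18.96


def pair_dissonance(f1, f2, a1=1.0, a2=1.0):
    """Roughness contributed by one pair of partials (assumed model)."""
    f_lo, f_hi = min(f1, f2), max(f1, f2)
    # Scale the frequency gap so the curve peaks at a wider gap for higher tones.
    s = D_STAR / (S1 * f_lo + S2)
    x = s * (f_hi - f_lo)
    return a1 * a2 * (math.exp(-A * x) - math.exp(-B * x))


def total_dissonance(partials):
    """Sum the pairwise dissonance over all (frequency, amplitude) partials."""
    total = 0.0
    for i in range(len(partials)):
        for j in range(i + 1, len(partials)):
            (f1, a1), (f2, a2) = partials[i], partials[j]
            total += pair_dissonance(f1, f2, a1, a2)
    return total


def harmonic_timbre(f0, n_partials=6, rolloff=0.88):
    """Partials at f0, 2*f0, 3*f0, ... with geometrically decaying amplitudes."""
    return [(f0 * k, rolloff ** (k - 1)) for k in range(1, n_partials + 1)]


def dissonance_at_ratio(ratio, f0=500.0):
    """Dissonance of two harmonic tones sounded at f0 and ratio * f0."""
    return total_dissonance(harmonic_timbre(f0) + harmonic_timbre(f0 * ratio))


def dissonance_curve(ratios, f0=500.0):
    """Evaluate the dissonance at each interval ratio in `ratios`."""
    return [dissonance_at_ratio(r, f0) for r in ratios]


if __name__ == "__main__":
    ratios = [1 + i / 100 for i in range(101)]  # unison up to one octave
    curve = dissonance_curve(ratios)
    # Report grid points that are lower than both neighbours.
    for i in range(1, len(ratios) - 1):
        if curve[i] < curve[i - 1] and curve[i] < curve[i + 1]:
            print(f"local minimum near ratio {ratios[i]:.2f}")
```

Under these assumptions the curve dips at intervals such as 3:2 and 2:1, where several partials of the two tones coincide and contribute no roughness.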

Bill provided samples of music consisting of nonharmonic partials, created by stretching and compressing tones. In this way, a timbre can be chosen that has minima at desired places in its dissonance curve. Bill played an interesting example of how 10-tone intonation can be used to create "good" musical sounds. Through sampling and manipulation, a temperament with any number of tones can be used, in effect by moving the harmonics.
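
The review does not say how the tones were stretched or compressed, so the sketch below shows just one plausible mapping, stated as an assumption: partial k is placed at f0 multiplied by a "pseudo-octave" raised to the power log2(k). A pseudo-octave of 2 reproduces the ordinary harmonic series. The second function lists the step ratios of an equal temperament with any number of tones, such as the 10-tone case mentioned above. All names and parameters are hypothetical.

```python
import math


def stretched_partials(f0_hz, n_partials, pseudo_octave=2.0):
    """Place partial k at f0 * pseudo_octave ** log2(k) (assumed mapping).

    pseudo_octave=2.0 gives the harmonic series f, 2f, 3f, ...;
    values above or below 2.0 stretch or compress the spectrum.
    """
    return [f0_hz * pseudo_octave ** math.log2(k) for k in range(1, n_partials + 1)]


def equal_temperament_ratios(steps_per_octave, octave=2.0):
    """Frequency ratios of an equal temperament, from unison up to one octave."""
    return [octave ** (k / steps_per_octave) for k in range(steps_per_octave + 1)]


if __name__ == "__main__":
    # Example values only: a 100 Hz tone with a pseudo-octave of 2.1.
    print(stretched_partials(100.0, 4, pseudo_octave=2.1))
    print(equal_temperament_ratios(10))
```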

Thus, dissonance curves really do capture something crucial about our perceptions of desirable and undesirable sounds. Bill showed that the intervals heard as good or bad sounding are consistent with the calculated predictions. This is useful because the minima of dissonance curves are good indicators of scales and chords. Likewise, we can use the notion of consonance as a basis for audio signal processing devices. We can also build knowledge of how our perceptions work into our machines, synthesizing artificial scales that sound at least listenable, if not natural! In conclusion, the Chicago Section thanks Bill for presenting his compelling theories.