Audio Engineering Society

Chicago Section

Meeting Review, October 22, 2009


10/22/09 Meeting Highlights
by Giles Davis

Audio vs. Video

By: James (JJ) Johnston


James Johnston (JJ) presented “Audio vs Video” to 61 people, half of whom were online (see special thanks below). JJ’s slide deck is available separately; the following is meant to complement, not replace, that slide deck.


JJ discussed fundamental differences in how we process vision and hearing: The ear is very good at frequency/time analysis, and the eye is very good at spatial analysis. When comparing the two perceptual systems, it was noted that the cochlea is a mechanical filter bank, and there is no analog in vision. Both systems then go on to loudness analysis (brightness for vision), then feature analysis, then feature extraction. When considering the time component of our audio system’s analysis, JJ observed that loudness analysis incorporates ~200ms integration. It is not linear, and it is a “peak catcher,” which allows the Haas / precedence effect. The Haas effect is possible because the hair cells of the cochlea depolarize, making them less sensitive.
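As a purely illustrative signal-processing analogy for the “peak catcher” idea, and not a model of the auditory system or anything shown in the talk, the sketch below tracks a signal's envelope by jumping to each new peak at once and then releasing slowly. The 200 ms time constant is borrowed from the integration figure quoted above. The exponential release shape, the function name `peak_catcher`, and the 1 kHz sample rate are assumptions made for the example.

```python
import math


def peak_catcher(samples, sample_rate, release_ms=200.0):
    """Toy envelope follower: instant attack, exponential release.

    release_ms is the release time constant (assumed, not from the talk).
    """
    # Per-sample decay factor for an exponential release.
    decay = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    envelope = 0.0
    out = []
    for x in samples:
        level = abs(x)
        # Jump up to a new peak immediately, otherwise decay slowly.
        envelope = level if level > envelope else envelope * decay
        out.append(envelope)
    return out


# A click at t = 0 followed by a quieter copy 20 ms later (1 kHz sample rate).
sample_rate = 1000
signal = [0.0] * 50
signal[0] = 1.0    # direct sound
signal[20] = 0.5   # later, quieter arrival
env = peak_catcher(signal, sample_rate)
```

In this toy example the held envelope is still well above 0.5 when the second arrival occurs, so the later, quieter arrival never registers as a new peak. That loosely mirrors how a peak-holding stage lets the first arrival dominate.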


JJ discussed the difference between loudness memory, feature memory and long-term memory, and introduced the concept of steering – listen to a passage of music, then have someone point out some element of the passage, and when you listen a second time (listening for that element), you listen differently – and you hear it differently. The concept of steering the listener can explain many supposed differences when auditioning audio gear that aren’t really there.


The eye does not share the ear’s sensitivity to time: if one repeats a frame in video, “it’s not a great thing,” but it’s not a big deal. The equivalent in audio “creates an event that is very ‘big’… (and) pushes a great deal of different information down the perceptual chain.”


When comparing dynamic range, both audio and vision are capable of ~120dB. However, the ear adapts much more quickly, and can do something the eye cannot: handle a vast dynamic range at once, perceiving both a loud and a quiet signal at the same time. One example given was that tones at 18kHz and 4kHz presented 90dB apart could both be audible simultaneously. JJ also observed that the eye is more linear, at least in the short term.
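To give a sense of scale for the decibel figures above, here is a minimal sketch that converts a level difference in dB into a linear ratio. It uses the standard definitions (20·log10 for amplitude quantities, 10·log10 for power quantities). The function names are hypothetical, and the report does not say whether the visual figure was meant as an amplitude or a power ratio, so both are shown.

```python
def db_to_amplitude_ratio(db):
    """Linear amplitude (e.g. sound pressure) ratio for a level difference in dB."""
    return 10 ** (db / 20)


def db_to_power_ratio(db):
    """Linear power (or intensity) ratio for a level difference in dB."""
    return 10 ** (db / 10)


for db in (90, 120):
    print(db, "dB:",
          "amplitude ratio", db_to_amplitude_ratio(db),
          "power ratio", db_to_power_ratio(db))
```

Under these definitions, 120dB corresponds to an amplitude ratio of about one million to one, and two tones 90dB apart differ in amplitude by a factor of roughly 31,600.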


When discussing color vision, JJ observed that the eye adjusts to the same firing rate after adapting to whatever the ambient light condition is, making it inherently linear. He also observed that peripheral vision actually gets better in the dark.


In the Q&A, it was discussed that when you lose articulation in audio, you have to work hard to understand, and you fatigue. Yet with vision, you can still see the “message” even if the contrast / color is off. Amplitude distortion can be huge and still work. A person walking across the screen is perceived as such almost no matter the color distortion.


Thank you to James Johnston for an engaging presentation and to Shure for hosting. Special thanks to RentCom for providing the first ever webcast of a Chicago Section presentation. Ron Steinburg, Bill Larsen, Glen Steinburg and Tony McGrath’s efforts made it possible for 31 people to view the presentation live.