Camera zooming would be more compelling if the audio was subjected to a corresponding zoom that matched the video. Psychophysical and neuroimaging results suggest that a cross-modal approach to zooming facilitates multisensory integration. Because auditory distance perception is primarily determined by sound intensity, an audiovisual zoom effect can be obtained by matching the levels of different sources in a sound scene with their visually perceived distance. The authors propose a general theory for independent sound source level control that can be used to attain an acoustic zoom effect. The theory does not require sound source separation, which reduces computational load. An efficient implementation using fixed and adaptive spatial and spectral noise-reduction algorithms is proposed and evaluated. Experimental results using an array of a small number of low-cost microphones confirm that the proposed approach is particularly suited for consumer audiovisual capture applications.
Click to purchase paper as a non-member or login as an AES member. If your company or school subscribes to the E-Library then switch to the institutional version. If you are not an AES member and would like to subscribe to the E-Library then Join the AES!
This paper costs $33 for non-members and is free for AES members and E-Library subscribers.