Last Updated: 20050901, mei
P4 - Spatial Perception and Processing
Friday, October 7, 1:30 pm — 4:30 pm
Chair: Richard Stroud
P4-1 Audibility of Spectral Switching in Head-Related Transfer Functions—Pablo Faundez Hoffmann, Henrik Møller, Aalborg University - Aalborg, Denmark
Binaural synthesis of a time-varying sound field is performed by updating head-related transfer functions (HRTFs). The updating reflects the changes in the sound transmission to the listener’s ears that occur as a result of sound-source movement. Unless the differences between HRTFs are sufficiently small, direct switching between them will cause an audible artifact heard as a click. By modeling HRTFs as minimum-phase filters combined with pure delays, the effects of spectral switching and time switching can be studied separately. Time switching was studied in a previous investigation. This work presents preliminary results on minimum audible spectral switching (MASS).
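The decomposition described in the abstract can be illustrated with a short numerical sketch: the minimum-phase part of a head-related impulse response (HRIR) is recovered from its magnitude response via the real cepstrum, and the residual pure delay is estimated by cross-correlation. The function name and the correlation-based delay estimator are illustrative assumptions, not necessarily the method used in the paper.

```python
import numpy as np

def minimum_phase_and_delay(hrir):
    """Split an HRIR into a minimum-phase part and a pure delay (in samples).

    Illustrative sketch: the minimum-phase filter is obtained by folding
    the real cepstrum of the magnitude response; the delay is the lag
    that best aligns the original HRIR with its minimum-phase version.
    """
    n = len(hrir)
    spectrum = np.fft.fft(hrir, n)
    log_mag = np.log(np.maximum(np.abs(spectrum), 1e-12))
    cepstrum = np.fft.ifft(log_mag).real
    # Fold the cepstrum onto the causal part (standard homomorphic method)
    folded = np.zeros(n)
    folded[0] = cepstrum[0]
    half = (n + 1) // 2
    folded[1:half] = 2.0 * cepstrum[1:half]
    if n % 2 == 0:
        folded[n // 2] = cepstrum[n // 2]
    min_phase = np.fft.ifft(np.exp(np.fft.fft(folded))).real
    # Pure delay: lag maximizing the cross-correlation with the HRIR
    xcorr = np.correlate(hrir, min_phase, mode='full')
    delay_samples = int(np.argmax(xcorr)) - (n - 1)
    return min_phase, delay_samples
```

With the two parts separated, spectral switching can be studied by cross-fading only the minimum-phase filters while the delay is handled independently.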
Convention Paper 6537 (Purchase now)
P4-2 Virtual Source Location Information for Binaural Cue Coding—Sang Bae Chon, In Yong Choi, Han-gil Moon, Seoul National University - Seoul, Korea; Jeongil Seo, Electronics and Telecommunications Research Institute - Daejeon, Korea; Koeng-Mo Sung, Seoul National University - Seoul, Korea
Binaural Cue Coding (BCC) is an audio coding technology that expresses multichannel audio signals with mono or stereo downmixed audio signal(s) and side information consisting of the Inter-Channel Level Difference (ICLD), Inter-Channel Time Delay (ICTD), and Inter-Channel Correlation (ICC). Among these, the ICLD describes the level difference between the signal of one channel and the reference downmixed signal of the BCC system. The ICLD plays the most important role in lateralizing the spatial image. However, the fact that the spatial image of a sound is created by the location of its source in nature raises the question of whether there is a more direct way than ICLD to describe the source location for the spatial image. Virtual Source Location Information (VSLI), the new side information proposed in this paper, provides an answer to this question.
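The ICLD side information can be illustrated with a minimal sketch that compares band-wise power between one channel and the downmix reference. The equal-width band split and the function name are simplifying assumptions made here for brevity; a real BCC coder operates on perceptually partitioned time-frequency tiles.

```python
import numpy as np

def icld_db(channel, reference, n_bands=20, eps=1e-12):
    """Inter-Channel Level Difference (ICLD) in dB, per frequency band.

    Sketch only: power spectra are split into equal-width bands, whereas
    an actual BCC encoder uses perceptually motivated (ERB-like)
    partitions evaluated per time frame.
    """
    spec_c = np.abs(np.fft.rfft(channel)) ** 2
    spec_r = np.abs(np.fft.rfft(reference)) ** 2
    bands_c = np.array_split(spec_c, n_bands)
    bands_r = np.array_split(spec_r, n_bands)
    return np.array([10.0 * np.log10((bc.sum() + eps) / (br.sum() + eps))
                     for bc, br in zip(bands_c, bands_r)])
```

A channel that is simply twice the reference yields about +6 dB in every band, which is the kind of level cue the decoder uses to re-pan the downmix.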
Convention Paper 6538 (Purchase now)
P4-3 Perceptual Movement of Auditory Images Fed through Multiway Loudspeakers Perpendicularly Set Up—Yu Agatsuma, Eiichi Miyasaka, Musashi Institute of Technology - Yokohama, Kanagawa, Japan
Vertical localization and perceptual movement of auditory images were investigated using eight loudspeakers set up vertically with 30-cm spacing between adjacent loudspeakers. The results obtained from more than 20 observers show that localization was fairly well identified for one-octave-band noises with center frequencies from 2 to 8 kHz, and that smooth movement of auditory images from low to high was perceived when a stimulus consisting of one-octave-band noise was moved linearly upward at various speeds.
Convention Paper 6539 (Purchase now)
P4-4 High Order Spatial Audio Capture and its Binaural Head-Tracked Playback Over Headphones with HRTF Cues—Ramani Duraiswami, Dmitry Zotkin, Zhiyun Li, Elena Grassi, Nail Gumerov, Larry Davis, University of Maryland - College Park, MD, USA
A theory and a system for capturing an audio scene and then rendering it remotely are developed and presented. The sound capture is performed with a spherical microphone array. The sound field at the location is deduced from the captured sound and is represented using either spherical wave-functions or plane-wave expansions. The sound field representation is then transmitted to a remote location for immediate rendering or stored for later use. The sound renderer, coupled with the head tracker, reconstructs the acoustic field using individualized head-related transfer functions to preserve the perceptual spatial structure of the audio scene. Rigorous error bounds and a Nyquist-like sampling criterion for the representation of the sound field are presented and verified.
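The spherical wave-function representation rests on expanding the pressure sampled on a sphere in spherical harmonics. A minimal least-squares sketch, assuming ideal open-sphere sampling and omitting the rigid-sphere mode-strength (radial) terms that the full capture system must account for:

```python
import numpy as np
from scipy.special import sph_harm

def sh_decompose(pressure, azimuth, colatitude, order):
    """Least-squares spherical harmonic coefficients of pressure samples.

    Hypothetical helper: builds the basis matrix Y of spherical harmonics
    up to the given order at each sample direction, then solves
    Y @ coeffs = pressure in the least-squares sense.
    """
    cols = []
    for n in range(order + 1):
        for m in range(-n, n + 1):
            # sph_harm(m, n, theta, phi): theta = azimuth, phi = colatitude
            cols.append(sph_harm(m, n, azimuth, colatitude))
    Y = np.column_stack(cols)
    coeffs, *_ = np.linalg.lstsq(Y, np.asarray(pressure, complex), rcond=None)
    return coeffs
```

Given enough well-spread sample directions relative to the expansion order (the Nyquist-like criterion the paper makes rigorous), a field that lies in the basis is recovered exactly.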
Convention Paper 6540 (Purchase now)
P4-5 Performance of Spatial Audio Using Dynamic Cross-Talk Cancellation—Tobias Lentz, Ingo Assenmacher, Jan Sokoll, Aachen University (RWTH) - Aachen, Germany
Creating a virtual sound scene with spatially distributed sources requires a technique to introduce spatial cues into audio signals and an appropriate reproduction system. In our case a complete binaural approach is used, consisting of binaural synthesis and head-tracked dynamic cross-talk cancellation (CTC) for the reproduction of the binaural signal at the ears of the listener. In this paper the performance and limitations of the complete system, as well as of the various subsystems, are investigated and discussed. The channel separation of the dynamic CTC system is measured for various positions in the listening area, and subjective localization performance is examined in listening tests.
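Cross-talk cancellation itself reduces, per frequency bin, to inverting the 2×2 matrix of loudspeaker-to-ear transfer functions. A generic textbook sketch of the filter computation (regularized pseudo-inverse; an illustration of the principle, not the paper's exact filter design):

```python
import numpy as np

def ctc_filters(H, beta=0.0):
    """Frequency-domain cross-talk cancellation filters.

    H: array of shape (n_freqs, 2, 2); H[f, i, j] is the transfer
    function from loudspeaker j to ear i at frequency bin f.
    Returns C such that H @ C ~= I per bin. The Tikhonov term beta > 0
    trades channel separation for bounded filter gain.
    """
    Hh = np.conj(np.swapaxes(H, -1, -2))   # Hermitian transpose per bin
    # C = (H^H H + beta I)^{-1} H^H, solved for each frequency bin
    return np.linalg.solve(Hh @ H + beta * np.eye(2), Hh)
```

In a dynamic CTC system the matrices H change as the tracked head moves, so these filters must be recomputed or interpolated continuously; the achievable channel separation then depends on how well H matches the listener's actual transfer functions.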
Convention Paper 6541 (Purchase now)
P4-6 An Application of Lined-Up Loudspeaker Array System for Mixed Reality Audio-Visual Reproduction System—Hiroyuki Okubo, Yasushige Nakayama, NHK Science & Technical Research Laboratories - Setagaya, Tokyo, Japan; Yoichi Naito, NHK Engineering Administration Department - Shibuya, Tokyo, Japan; Toshikazu Ikenaga, NHK Nagasaki Station - Nagasaki City, Nagasaki, Japan; Setsu Komiyama, NHK Broadcasting Engineering Department - Shibuya, Tokyo, Japan
An interactive audio-visual system called the Mixed Reality Audio-Visual (MRAV) reproduction system has been developed. The MRAV system employs a method of stereoscopic image projection and a technique of multichannel sound field reproduction in which a loudspeaker array focuses the sound pressure in front of the listener to coincide with the three-dimensional (3-D) visual image. A new sound screen with a silver metallic coating has also been developed; it helps maintain sound quality when the sound is radiated from the loudspeaker array set up behind the screen. This paper describes the design of the loudspeaker array system and discusses our examination of the sound field generated to correspond with 3-D CG images.
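Focusing a loudspeaker line array at a point, as described in the abstract, is commonly achieved by delaying each driver so that all wavefronts arrive at the focal point simultaneously. A minimal delay-and-sum sketch under that assumption (the MRAV system's actual drive-signal design is detailed in the paper):

```python
import numpy as np

def focusing_delays(speaker_xy, focus_xy, c=343.0):
    """Per-loudspeaker delays (seconds) that focus a line array at a point.

    Each driver is delayed by the travel-time difference to the focal
    point relative to the farthest driver, so all contributions arrive
    in phase at the focus. c is the speed of sound in m/s.
    """
    speaker_xy = np.asarray(speaker_xy, dtype=float)
    dist = np.linalg.norm(speaker_xy - np.asarray(focus_xy, dtype=float),
                          axis=1)
    # Closer drivers "wait" for the farther ones
    return (dist.max() - dist) / c
```

For a symmetric array the outermost drivers get zero delay and the center driver the largest, producing a converging wavefront at the focal point in front of the listener.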
Convention Paper 6542 (Purchase now)