v3.1, 20040329, ME
Session D Saturday, May 8 13:30 h–16:00 h
SPATIAL PERCEPTION AND PROCESSING, PART 2
(focus on novel rendering and control techniques)
Chair: Francis Rumsey, University of Surrey, Guildford, Surrey, UK
D-1 Motion-Tracked Binaural Sound
V. Ralph Algazi, Richard Duda, Dennis Thompson, University of California, Davis, CA, USA
A new method is presented for capturing, recording, and reproducing spatial sound. The method generalizes binaural recording, preserving the information needed for dynamic head-motion cues. These dynamic cues stabilize the perceived sound field, largely eliminate front/back confusion, and greatly reduce the need for customization to the listener. During either capture or recording, the sound field in the vicinity of the head is sampled with a microphone array. During reproduction, a head tracker is used to determine the microphones that are closest to the positions of the listener's ears. Interpolation procedures are used to produce the headphone signals. The properties of different methods for interpolating the microphone signals are presented and analyzed.
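The reproduction step described in the D-1 abstract (find the microphones nearest each ear, then interpolate) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the authors' method: the microphones are taken to be equally spaced on a horizontal circle, only head yaw is tracked, the ears are placed 90 degrees either side of the facing direction, and plain linear interpolation stands in for the interpolation methods the paper compares. The function names are hypothetical.

```python
import math


def ear_signal(mic_signals, ear_azimuth_deg):
    """Linearly interpolate between the two microphones adjacent to an ear.

    mic_signals: list of N equal-length sample lists. Microphone k is
    ASSUMED to sit at azimuth k * 360 / N degrees on a circle.
    """
    n = len(mic_signals)
    spacing = 360.0 / n
    pos = (ear_azimuth_deg % 360.0) / spacing  # fractional microphone index
    lower = int(math.floor(pos)) % n           # nearest microphone below
    upper = (lower + 1) % n                    # nearest microphone above
    w = pos - math.floor(pos)                  # weight of the upper microphone
    return [(1.0 - w) * a + w * b
            for a, b in zip(mic_signals[lower], mic_signals[upper])]


def headphone_signals(mic_signals, head_yaw_deg):
    """Return (left, right) headphone signals for a tracked head yaw.

    ASSUMPTION: azimuth increases counterclockwise, so the left ear is at
    yaw + 90 degrees and the right ear at yaw - 90 degrees.
    """
    left = ear_signal(mic_signals, head_yaw_deg + 90.0)
    right = ear_signal(mic_signals, head_yaw_deg - 90.0)
    return left, right
```

With four microphones carrying the constant one-sample signals 0, 1, 2, and 3, a yaw of 0 degrees selects microphones 1 and 3 directly, while a yaw of 45 degrees places each ear halfway between two microphones and averages them.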
D-2 IKA-SIM: A System to Generate Auditory Virtual Environments
Andreas Silzle (1), Pedro Novo (1), Holger Strauss (2)
(1) Ruhr-Universität Bochum, Bochum, Germany
(2) VCS Aktiengesellschaft, Bochum, Germany
The basic requirements for an Auditory Virtual Environment (AVE) are presented and a system based on a physical approach (IKA-SIM), employing the mirror-image model to generate the early reflections, is described. The static and dynamic structure of the IKA-SIM software (written in C++) is shown in diagrams and the computational requirements for real-time performance are delineated. IKA-SIM is able to render rooms of arbitrary shape, to account for frequency-dependent absorption factors, and to calculate high-order reflections in real time on a standard PC. The different interfaces for real-time interaction are presented. IKA-SIM supports headphone and loudspeaker reproduction. A new elevation panning algorithm for loudspeaker reproduction is introduced. Design aspects relevant to a real-time AVE system are presented.
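The mirror-image model mentioned in the D-2 abstract can be sketched for the simplest case. This is not IKA-SIM code: it assumes an axis-aligned box room with one corner at the origin, first-order reflections only, frequency-independent walls, and a speed of sound of 343 m/s. The function names are hypothetical. Each wall contributes one image source, obtained by mirroring the real source across that wall; the distance from the image to the listener gives the delay and the 1/r attenuation of that reflection.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def first_order_images(source, room):
    """Mirror the source across the six walls of a box room.

    source: (x, y, z) position in metres.
    room:   (Lx, Ly, Lz); walls are ASSUMED at 0 and L along each axis.
    """
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            img = list(source)
            img[axis] = 2.0 * wall - source[axis]  # reflection across the wall
            images.append(tuple(img))
    return images


def reflection_paths(source, listener, room):
    """Return (delay in seconds, 1/r gain) for each first-order reflection."""
    paths = []
    for img in first_order_images(source, room):
        dist = math.dist(img, listener)  # straight path from image to listener
        paths.append((dist / SPEED_OF_SOUND, 1.0 / dist))
    return paths
```

Higher-order reflections follow by mirroring the image sources again, which is where the computational cost for real-time use grows quickly.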
D-3 Further Study of Sound Field Coding with Higher Order Ambisonics
Jérôme Daniel, Sébastien Moreau, France Telecom R&D, Lannion, France
Higher Order Ambisonics (HOA) is a spatialization technology based on the spherical harmonic decomposition of a sound field. This technology provides a flexible way to represent and render 3-D sound scenes. Nevertheless, it is only recently that the problem of representing near-field sources and recording natural sound fields (infinite bass boost) has been addressed and partially solved. This paper proposes a further study on the frequency-dependent amplitude of the spherical harmonic components for finite distance source encoding, by connecting it with several parameters: the source distance, the microphone array size (case of natural recording), the size of the targeted reproduction area, and the distance of the reproduction loudspeakers. A solution is investigated to limit excessive low-frequency amplification of high-order ambisonic components while still achieving a correct reproduction of wave fronts. As a particular result, it leads to improved distance coding tools for virtual sources, especially when these are simulated inside the listening area.
D-4 Sound-Source Radiation Synthesis: From Stage Performance to Domestic Rendering
Olivier Warusfel, Nicolas Misdariis, IRCAM, Paris, France
A diffusion device based on a digitally controlled 3-D array of loudspeakers, La Timée, was developed in order to synthesize a given radiation pattern from the combination of a set of elementary directivities. This radiation synthesis method, designed for musical and performance constraints (real-time control, musical vocabulary associated with different directivity patterns, etc.), has been used for stage performances and sound installations. In order to translate the sound experience for domestic reproduction, the paper addresses the postproduction step where the spatial image associated with the radiation synthesis is transcoded to conventional formats such as transaural, ambisonic, or 5.1. The method is based on the characterization of the performance room with the different elementary directivities, which are then superimposed according to the musical score.
D-5 Surround Sound: Relations of Listening and Viewing Configurations – The Useful Assignment of Loudspeaker Basis Width to Video Picture Dimension
Gerhard Steinke, Audio Consultant, Berlin, Germany
The growing penetration of the DVD into today's marketplace also stimulates a more intimate association of sophisticated multichannel sound and larger high-quality images with the ideal TV format 16:9 (1.78:1). Nevertheless, different geometrical assignments may exist between image size and loudspeaker basis width in production studios, multimedia rooms, and home living rooms, besides varying room-acoustical and qualitative conditions. For best possible use of program essences, the locations of sound and picture sources should be matched as closely as possible, i.e., with corresponding horizontal listening angles and viewing angles, to avoid disturbing discrepancies between acoustical and optical perspective. Essential connections are considered, and the recommendation is derived to adjust the optimum viewing distance 2H with regard to an appropriately large loudspeaker basis width and image size for high-quality home theater experiences.
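The relation between viewing angle, listening angle, and loudspeaker basis width in the D-5 abstract comes down to simple trigonometry, sketched below. The sketch assumes that H denotes picture height (the abstract does not define it), that the picture is flat and viewed on axis, and that the loudspeaker basis is centered on the picture at the same distance. The helper names are hypothetical, and the 60-degree listening angle in the usage note is only an illustrative value, not one taken from the paper.

```python
import math


def subtended_angle_deg(width, distance):
    """Horizontal angle (degrees) subtended by a centered width at a distance."""
    return math.degrees(2.0 * math.atan(width / (2.0 * distance)))


def width_for_angle(angle_deg, distance):
    """Width that subtends the given horizontal angle at the given distance."""
    return 2.0 * distance * math.tan(math.radians(angle_deg) / 2.0)


# Work in units of picture height: H = 1, so a 16:9 picture is 16/9 wide
# and the 2H viewing distance is 2.
picture_width = 16.0 / 9.0
viewing_angle = subtended_angle_deg(picture_width, 2.0)
```

Under these assumptions a 16:9 picture seen from 2H subtends a horizontal angle of roughly 48 degrees, whereas a loudspeaker basis subtending 60 degrees at the same distance would be about 2.31H wide, that is, wider than the picture. This is the kind of mismatch between acoustical and optical perspective that the abstract discusses.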