AES New York 2007
P22 - Signal Processing for 3-D Audio, Part 2
Paper Session P22
Monday, October 8, 1:00 pm — 4:30 pm
Chair: Søren Bech, Bang & Olufsen a/s - Struer, Denmark
P22-1 Real-Time Auralization Employing a Not-Linear, Not-Time-Invariant Convolver—Angelo Farina, University of Parma - Parma, Italy; Adriano Farina, Liceo Ginnasio Statale G. D. Romagnosi - Parma, Italy
The paper reports the first results of listening tests performed with a new software tool capable of not-linear convolution (employing the Diagonal Volterra Kernel approach) and of time variation (employing efficient morphing among a number of kernels). The listening tests were performed in a special listening room, employing a menu-driven playback system capable of blind presentation of sound samples recorded from real-world devices, samples simulated with the new software tool, and, for comparison, samples obtained by traditional linear, time-invariant convolution. The listener answered a questionnaire for each sound sample and could switch among the samples for closer comparison. The results show that this new device-emulation tool provides much better results than existing convolution plug-ins (which emulate only the linear, time-invariant behavior), while requiring little computational load, causing short latency, and reacting promptly to user actions.
Convention Paper 7295
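The diagonal Volterra approach mentioned in the abstract models a not-linear device as a bank of kernels, one per order: the output is the sum of each input power convolved with its own kernel, so a standard fast convolver can be reused per order. A minimal sketch of that idea, with purely illustrative kernel values:

```python
import numpy as np

def diagonal_volterra(x, kernels):
    """Diagonal Volterra convolution: kernels[k-1] is the k-th order kernel.

    Each order contributes conv(x**k, h_k); the orders are summed."""
    n = len(x) + max(len(h) for h in kernels) - 1
    y = np.zeros(n)
    for k, h in enumerate(kernels, start=1):
        yk = np.convolve(x ** k, h)   # raise input to the k-th power, then convolve
        y[:len(yk)] += yk
    return y

# Two-order example: a linear kernel plus a quadratic one.
y = diagonal_volterra(np.array([2.0]), [np.array([1.0]), np.array([0.5])])
```

The time-variant behavior described in the abstract could then be approximated by crossfading (morphing) between several such kernel sets over time.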
P22-2 Real-Time Panning Convolution Reverberation—Rebecca Stewart, Mark Sandler, Queen Mary University of London - London, UK
Convolution reverberation is an excellent method for generating high-quality artificial reverberation that accurately portrays a specific space, but a single impulse response can represent only the static listener and source positions at which it was measured. In this paper, multiple measured impulse responses, along with impulse responses interpolated between the measured locations, are convolved with dry input audio to create the illusion of a moving source. The computational cost is reduced by a hybrid approach to reverberation: the early reflections are recreated through convolution with a truncated impulse response, while the late reverberation is simulated with a feedback delay network.
Convention Paper 7296
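The hybrid structure in the abstract can be sketched as a truncated-IR convolution for the early reflections plus a small feedback delay network (FDN) for the late tail. All parameter values below (delay lengths, gain, crossover point) are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def early_reflections(x, ir, cutoff):
    """Early part: direct convolution with the truncated measured IR."""
    return np.convolve(x, ir[:cutoff])

def fdn_tail(x, delays=(1031, 1327, 1523, 1871), g=0.7):
    """Late part: a 4-line feedback delay network."""
    # Scaled 4x4 Hadamard matrix: orthogonal, so the loop is lossless
    # before the per-loop gain g (< 1) makes the tail decay.
    H = 0.5 * np.array([[1, 1, 1, 1],
                        [1, -1, 1, -1],
                        [1, 1, -1, -1],
                        [1, -1, -1, 1]], dtype=float)
    n = len(x) + 8 * max(delays)          # room for the decaying tail
    xin = np.zeros(n)
    xin[:len(x)] = x
    lines = np.zeros((4, n))              # delay-line inputs over time
    out = np.zeros(n)
    for t in range(n):
        # Delayed output of each line (zero before the line fills up).
        d = np.array([lines[i, t - delays[i]] if t >= delays[i] else 0.0
                      for i in range(4)])
        lines[:, t] = xin[t] + g * (H @ d)  # mix outputs back into inputs
        out[t] = d.sum()
    return out

def hybrid_reverb(x, ir, cutoff=2048, wet=0.3):
    """Truncated-IR early reflections plus FDN late reverberation."""
    early = early_reflections(x, ir, cutoff)
    late = fdn_tail(x)
    y = np.zeros(max(len(early), len(late)))
    y[:len(early)] += early
    y[:len(late)] += wet * late
    return y
```

Because the expensive measured IR is truncated to the early part, panning between interpolated early-reflection IRs stays cheap while the FDN tail is shared.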
P22-3 Ambisonic Panning—Martin Neukom, Zurich University of the Arts - Zurich, Switzerland
Ambisonics is a surround sound system for encoding and rendering a 3-D sound field. Sound is encoded and stored in multichannel sound files and decoded for playback. In this paper a panning function equivalent to the result of ambisonic encoding and so-called in-phase decoding is presented. In this function the order of ambisonic resolution is simply a variable: it can be an arbitrary positive number, not restricted to integers, and it can be changed during playback. The equivalence is shown, limitations and advantages of the technique are discussed, and real-time applications are described.
Convention Paper 7297
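A commonly cited closed form for the 2-D in-phase panning law of order m is g(φ) = cos^(2m)(φ/2), where φ is the angle between the virtual source and a loudspeaker; the sketch below assumes that form (the paper's exact function may differ) and shows how a fractional, continuously variable order fits naturally:

```python
import numpy as np

def inphase_gains(source_az, speaker_az, order):
    """Per-loudspeaker gains for a 2-D in-phase panning law.

    `order` may be any positive real number (not just an integer)
    and can be varied smoothly during playback."""
    phi = np.asarray(speaker_az, dtype=float) - source_az
    # cos^(2m)(phi/2) written as ((1 + cos phi)/2)^m so that
    # fractional orders stay real-valued at all angles.
    return ((1.0 + np.cos(phi)) / 2.0) ** order

# Eight loudspeakers on a circle; a higher order narrows the source.
speakers = np.linspace(0, 2 * np.pi, 8, endpoint=False)
g_wide = inphase_gains(0.0, speakers, order=1.5)
g_narrow = inphase_gains(0.0, speakers, order=6.0)
```

Since `order` enters only as an exponent, it can be automated as a continuous "focus" control during playback, which is the property the abstract highlights.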
P22-4 Adaptive Karhunen-Loève Transform for Multichannel Audio—Yu Jiao, Slawomir Zielinski, Francis Rumsey, University of Surrey - Guildford, Surrey, UK
In previous work, the authors proposed a hierarchical bandwidth limitation technique based on the Karhunen-Loève Transform (KLT) to reduce the bandwidth needed for multichannel audio transmission. The subjective results showed that this technique could reduce the overall bandwidth without significant audio quality degradation. Further study found that the transform matrix varied considerably over time for many recordings. In this paper the KLT matrix was calculated from short-term signals and updated adaptively over time. The perceptual effects of the adaptive KLT process were studied using a series of listening tests. The results showed that adaptive KLT yielded better spatial quality than nonadaptive KLT but introduced some other artifacts.
Convention Paper 7298
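The adaptive scheme in the abstract amounts to re-estimating the inter-channel covariance on short blocks and using its eigenvectors as the transform, so the KLT matrix tracks the signal over time. A minimal sketch, with an illustrative block length (the paper's update scheme may differ):

```python
import numpy as np

def adaptive_klt(x, block=4096):
    """Block-adaptive KLT of multichannel audio.

    x: (n_samples, n_channels) array. Returns the decorrelated signal
    and the per-block transform matrices."""
    n, ch = x.shape
    y = np.empty_like(x)
    matrices = []
    for start in range(0, n, block):
        seg = x[start:start + block]
        cov = np.cov(seg, rowvar=False)        # inter-channel covariance
        # Eigenvectors of the covariance form the KLT basis; sort
        # columns by descending eigenvalue (descending signal energy).
        w, v = np.linalg.eigh(cov)
        v = v[:, np.argsort(w)[::-1]]
        y[start:start + block] = seg @ v       # decorrelate this block
        matrices.append(v)
    return y, matrices
```

Discontinuities at block boundaries (the matrix jumps from one block to the next) are one plausible origin of the "other artifacts" the abstract reports, and smoothing the matrix update would be the obvious mitigation.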
P22-5 Extension of an Analytic Secondary Source Selection Criterion for Wave Field Synthesis—Sascha Spors, Berlin University of Technology - Berlin, Germany
Wave field synthesis (WFS) is a spatial sound reproduction technique that employs a large number of loudspeakers (secondary sources) to create a virtual auditory scene over a large listening area. It requires a sensible selection of the loudspeakers that are active for the reproduction of a particular virtual source. For virtual point sources and plane waves, suitable intuitively derived selection criteria are used in practical implementations. However, for more complex virtual source models and loudspeaker array contours the selection might not be straightforward. In a previous publication the author proposed a secondary source selection criterion based on the sound intensity vector. This contribution extends this criterion to data-based rendering and focused sources and discusses truncation effects.
Convention Paper 7299
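For a virtual point source, the intensity-based criterion the abstract builds on reduces to a dot product: a secondary source is active when the local propagation direction of the virtual source has a positive component along the array's inward normal. A sketch for a circular array (geometry and names are illustrative assumptions):

```python
import numpy as np

def active_sources(speaker_pos, speaker_normals, source_pos):
    """Boolean mask of loudspeakers selected for a virtual point source.

    For a point source the time-averaged intensity points along the
    propagation direction (x0 - xs), so the criterion is a dot
    product of that direction with each loudspeaker's normal."""
    prop = speaker_pos - source_pos
    prop /= np.linalg.norm(prop, axis=1, keepdims=True)
    return np.einsum('ij,ij->i', prop, speaker_normals) > 0

# 32 loudspeakers on a 1.5 m circle, normals pointing inward,
# virtual point source outside the array.
az = np.linspace(0, 2 * np.pi, 32, endpoint=False)
pos = 1.5 * np.column_stack([np.cos(az), np.sin(az)])
normals = -pos / 1.5
mask = active_sources(pos, normals, np.array([3.0, 0.0]))
```

Only the loudspeakers between the virtual source and the listening area end up active, matching the intuitive selection rule used in practice; the paper's extensions (data-based rendering, focused sources) generalize the intensity estimate beyond this analytic point-source case.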
P22-6 Adaptive Wave Field Synthesis for Sound Field Reproduction: Theory, Experiments, and Future Perspectives—Philippe-Aubert Gauthier, Alain Berry, Université de Sherbrooke - Sherbrooke, Quebec, Canada
Wave field synthesis is a sound field reproduction technology that assumes the reproduction environment is anechoic. A real reproduction space thus reduces the objective accuracy of wave field synthesis. Adaptive wave field synthesis is defined as a combination of wave field synthesis and active compensation: the reproduction errors are minimized along with a penalty on the departure from the wave field synthesis solution. An analysis based on the singular value decomposition connects wave field synthesis, active compensation, and Ambisonics, and the decomposition allows a practical implementation of adaptive wave field synthesis based on independent radiation mode control. Results of experiments in different rooms support the theoretical propositions and show the efficiency of adaptive wave field synthesis for sound field reproduction.
Convention Paper 7300
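Under a quadratic cost J = |Gq - p|² + λ|q - q_wfs|² (reproduction error at the sensors plus a weighted penalty on departing from the plain WFS driving signals q_wfs), the optimal driving signals have a closed form, and an SVD of the plant matrix G exposes the independent radiation modes the abstract mentions. A sketch under those assumptions; symbol names are illustrative:

```python
import numpy as np

def awfs_solution(G, p_des, q_wfs, lam=0.1):
    """Adaptive-WFS driving signals for a quadratic cost.

    G: (sensors x speakers) plant matrix, p_des: desired sound field
    at the sensors, q_wfs: plain WFS driving signals, lam: weight of
    the departure penalty. Setting the gradient of J to zero gives
    (G^H G + lam I) q = G^H p_des + lam q_wfs."""
    m = G.shape[1]
    A = G.conj().T @ G + lam * np.eye(m)
    b = G.conj().T @ p_des + lam * q_wfs
    return np.linalg.solve(A, b)

def radiation_modes(G):
    """SVD of the plant: the right singular vectors are independent
    radiation modes that can each be controlled separately."""
    U, s, Vh = np.linalg.svd(G, full_matrices=False)
    return U, s, Vh.conj().T
```

The two limits behave as expected: a large λ recovers plain WFS, while λ → 0 recovers pure active compensation (least-squares error minimization).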
P22-7 360° Localization via 4.x RACE Processing—Ralph Glasgal, Ambiophonics Institute - Rockleigh, NJ, USA
Recursive Ambiophonic Crosstalk Elimination (RACE), implemented as a VST plug-in, convolved from an impulse response, or purchased as part of a TacT Audio or other home audiophile product, properly reproduces all the ITD and ILD data sequestered in most standard two-channel or multichannel media. Ambiophonics is so named because it is intended to replace 75-year-old stereophonics and 5.1 in the home, car, or monitoring studio, but not in theaters. The response curves show that RACE produces a loudspeaker binaural sound field with no audible colorations, much like Ambisonics or Wave Field Synthesis. RACE can do this starting from most standard CD/LP/DVD two-, four-, or five-channel media or, even better, from 2- or 4-channel recordings made with an Ambiophone, using one or two pairs of closely spaced loudspeakers. The RACE stage can easily span up to 170° for two-channel orchestral recordings or 360° for movie/electronic-music surround sources. RACE is not sensitive to head rotation, and listeners can nod, recline, stand up, lean sideways, move forward and back, or sit one behind the other. As in 5.1, off-center listeners can easily localize the center dialog even though no center speaker is needed.
Convention Paper 7301
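The recursive crosstalk cancellation behind RACE can be sketched in a few lines: each output channel receives an attenuated, delayed, polarity-inverted copy of the opposite output channel, so each cancellation term is itself re-cancelled, producing a decaying train of correction terms. The delay and attenuation values below are illustrative assumptions (they depend on loudspeaker span and listener geometry), not product settings:

```python
import numpy as np

def race(left, right, delay=3, atten=0.85):
    """Recursive Ambiophonic Crosstalk Elimination, sample by sample.

    delay: crosstalk path delay in samples (a few samples at 44.1 kHz
    for a closely spaced pair); atten: head-shadow attenuation, kept
    below 1 so the recursion stays stable."""
    n = len(left)
    out_l = np.zeros(n)
    out_r = np.zeros(n)
    for t in range(n):
        xl = out_r[t - delay] if t >= delay else 0.0
        xr = out_l[t - delay] if t >= delay else 0.0
        out_l[t] = left[t] - atten * xl   # cancel the right speaker's crosstalk
        out_r[t] = right[t] - atten * xr  # cancel the left speaker's crosstalk
    return out_l, out_r
```

Feeding an impulse into one channel shows the mechanism: the opposite channel emits an inverted copy after `delay` samples, which is in turn corrected after `2 * delay` samples, and so on, with each term scaled down by `atten`.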
Last Updated: 20070821, mei