AES San Francisco 2010
Paper Session P26
Sunday, November 7, 2:30 pm — 5:00 pm (Room 220)
Paper Session: P26 - Auditory Perception
P26-1 Progress in Auditory Perception Research Laboratories—Multimodal Measurement Laboratory of Dresden University of Technology—M. Ercan Altinsoy, Ute Jekosch, Sebastian Merchel, Jürgen Landgraf, Dresden University of Technology - Dresden, Germany
This paper presents the general ideas and implementation details of the MultiModal Measurement Laboratory (MMM Lab) of Dresden University of Technology. The lab combines VR equipment for multiple modalities (auditory, tactile, vestibular, visual) and can present high-performance, interactive simulations. The goals are to discuss recent progress in auditory perception research laboratories and the technical parameters that should be considered when implementing reproduction systems for different modalities.
Convention Paper 8305
P26-2 Families of Sound Attributes for the Assessment of Spatial Audio—Sarah Le Bagousse, Orange Labs - France Télécom R&D - Cesson Sévigné, France; Mathieu Paquier, LISyC - Université de Bretagne Occidentale - Brest, France; Catherine Colomes, Orange Labs - France Télécom R&D - Cesson Sévigné, France
In recent years, studies using several elicitation methods have highlighted many features that can be used to characterize sounds. These experiments have produced a long list of sound attributes. But because assessors and listeners do not attach the same meaning and weight to each attribute, analyzing the results of a listening test based on sound criteria remains complex and difficult. The experiments reported in this paper aimed to shorten the list of attributes by clustering them into sound families, using the results of two semantic tests based on either (i) free categorization or (ii) a multidimensional scaling method.
Convention Paper 8306
P26-3 Listening Tests for the Effect of Loudspeaker Directivity and Positioning on Auditory Scene Perception—David Clark, DLC Design - Northville, MI, USA
Using stereo playback in a typical living room, subjects were exposed to six loudspeaker configurations under double-blind conditions and asked whether the auditory scene was better or worse than that presented by a reference stereo system. For all configurations, the auditory scene was judged to be plausible, but mean scores were lower than those for the reference. The reference comprised symmetrically placed conventional box loudspeakers with subwoofers.
Convention Paper 8307
P26-4 Modeling Tempo of Human Response to a Sudden Tempo Change Using Damped Harmonic Oscillators—Nima Darabi, Peter Svensson, Jon Forbord, Norwegian University of Science and Technology - Trondheim, Norway
In an interactive human-computer subjective test, 12 users hand-clapped and finger-tapped along with a metronome whose tempo changed suddenly. The recorded trials, up-sampled with different interpolation methods, were used to measure the tempo of each user's internal timekeeper in response to each tempo step. An iterative prediction-error minimization method was applied to the step-response signals to identify the underlying tempo-changing system of the human users in this sensorimotor synchronization task. Experimental data indicated that the system is fairly LTI (linear time-invariant) and would most likely resemble a second-order damped harmonic oscillator. A comparison of fit ratios showed that a delayed two-pole one-zero underdamped oscillator (P2DUZ) could be the trade-off between complexity and efficiency of the model. The related parameters for each user (as a set of their memory-related built-in factors) were also extracted and shown to be slightly individual-dependent.
Convention Paper 8308
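As an illustration of the model class named in the P26-4 abstract, the sketch below evaluates the unit-step response of a delayed, underdamped, two-pole one-zero system. It assumes the process-model form usually associated with the P2DUZ label, G(s) = K (1 + Tz s) exp(-Td s) / (1 + 2 zeta Tw s + (Tw s)^2). The function name, parameter names, and all numeric values are hypothetical and are not taken from the paper.

```python
import numpy as np

def p2duz_step(t, K=1.0, Tw=1.0, zeta=0.5, Tz=0.0, Td=0.0):
    """Unit-step response of an assumed P2DUZ-style model (0 < zeta < 1).

    K: static gain, Tw: time constant (1 / natural frequency),
    zeta: damping ratio, Tz: zero time constant, Td: pure delay.
    """
    if not 0.0 < zeta < 1.0:
        raise ValueError("the underdamped case requires 0 < zeta < 1")
    tau = np.asarray(t, dtype=float) - Td      # time since the delayed onset
    active = tau >= 0.0                        # output is zero before the delay
    tau = np.where(active, tau, 0.0)
    wn = 1.0 / Tw                              # natural frequency
    root = np.sqrt(1.0 - zeta ** 2)
    wd = wn * root                             # damped frequency
    decay = np.exp(-zeta * wn * tau)
    # Step response of the pure two-pole part (no zero).
    y0 = 1.0 - decay * (np.cos(wd * tau) + (zeta / root) * np.sin(wd * tau))
    # Its time derivative; the zero contributes Tz times this term.
    dy0 = (wn / root) * decay * np.sin(wd * tau)
    return np.where(active, K * (y0 + Tz * dy0), 0.0)

# Hypothetical example: metronome steps from 100 to 120 BPM at t = 0 s.
t = np.linspace(0.0, 20.0, 2001)
tapped_tempo = 100.0 + 20.0 * p2duz_step(t, Tw=1.0, zeta=0.5, Tz=0.2, Td=0.3)
```

With these made-up parameters the simulated tapping tempo stays at 100 BPM during the delay, overshoots 120 BPM, and then settles; fitting such parameters to recorded taps would need a separate estimation step that is not shown here.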
P26-5 Increasing Intelligibility of Multiple Talkers by Selective Mixing—Piotr Kleczkowski, Magdalena Plewa, Marek Pluta, AGH University of Science and Technology - Kraków, Poland
Five speech tracks were recorded. One of them, the target track, consisted of spoken numbers, so that the degree of comprehension of the target talker could be quantified in each trial by counting the number of correctly heard words. Two types of mixes of all five tracks were produced: a simple mix and a selective mix. The latter is a development of the processing technique known as binary masking. A large group of subjects (54) listened to both types of mixes, and selective mixing was found to slightly increase the intelligibility of the target talker.
Convention Paper 8309
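To make the binary-masking idea mentioned in the P26-5 abstract concrete, here is a minimal sketch that contrasts a simple mix with a winner-take-all mix in the time-frequency domain. It assumes each track is already available as a complex STFT of identical shape, and it keeps in every bin only the track with the largest magnitude. This is an illustrative assumption about how a binary mask can be applied across tracks, not the authors' selective-mixing algorithm; the function names are hypothetical.

```python
import numpy as np

def simple_mix(stfts):
    """Plain sum of all tracks; stfts has shape (tracks, freq_bins, frames)."""
    return np.sum(np.asarray(stfts), axis=0)

def winner_take_all_mix(stfts):
    """Keep, in each time-frequency bin, only the strongest track.

    Returns the mixed STFT and the boolean mask of each track.
    """
    stfts = np.asarray(stfts)
    # Index of the dominant track in every (freq, frame) bin.
    dominant = np.argmax(np.abs(stfts), axis=0)
    track_ids = np.arange(stfts.shape[0])[:, None, None]
    masks = dominant[None, :, :] == track_ids   # one binary mask per track
    return np.sum(stfts * masks, axis=0), masks

# Toy example: two tracks, one frequency bin, two frames.
toy = np.array([[[1.0 + 0j, 0.1 + 0j]],
                [[0.5 + 0j, 2.0 + 0j]]])
mixed, masks = winner_take_all_mix(toy)
```

In the toy input, the first track dominates the first frame and the second track dominates the second, so each bin of the masked mix carries exactly one talker, whereas the simple mix sums both.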