Events of the AES: AES 112th Convention: Session I: PSYCHOACOUSTICS, PERCEPTION AND LISTENING TESTS

Session I: PSYCHOACOUSTICS, PERCEPTION AND LISTENING TESTS

Sunday, May 12, 09:00 – 13:00 h
Chair: Matthew Watson, Dolby Laboratories, San Francisco, CA, USA

09:00 h
I1 Smart Acoustic Volume Controller for Mobile Phones – Suthikshn Kumar, Infineon Technologies India Pvt Ltd, Bangalore, India

The quality of speech (QoS) deteriorates in a noisy environment due to the fixed volume setting available in current mobile phones. This paper proposes a smart acoustic volume controller for mobile phones, based on fuzzy logic, for improving the QoS in stationary or non-stationary noise. The fuzzy volume controller makes use of the noise level and class information generated by a system for fuzzy pattern classification of background noise. The smart acoustic volume controller is extended to serve hearing-impaired users by using the audiogram to design the fuzzy rule base. The design and simulation of the fuzzy volume controller are discussed along with the implementation details. We report on the use of two tools, ECANSE and FuzzyControl++, for the simulation of the fuzzy volume controller for mobile phones.
Convention Paper 5545
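
As an illustration of the kind of fuzzy controller abstract I1 describes, the following is a minimal sketch that maps a measured noise level to a volume gain. It is not the paper's design: the membership breakpoints, the rule outputs, and the function names `tri` and `volume_gain_db` are all assumptions made for this example, and the noise class and audiogram inputs mentioned in the abstract are left out.

```python
def tri(x, a, b, c):
    """Triangular membership function: 0 at a and c, 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets for background noise level in dB SPL.
# The breakpoints are illustrative, not taken from the paper.
NOISE_SETS = {
    "quiet": (20.0, 40.0, 60.0),
    "moderate": (50.0, 65.0, 80.0),
    "loud": (70.0, 90.0, 110.0),
}

# Hypothetical rule base: each noise set recommends one gain in dB.
RULE_GAIN_DB = {"quiet": 0.0, "moderate": 6.0, "loud": 12.0}


def volume_gain_db(noise_db):
    """Return a volume gain by weighted-average defuzzification."""
    # Clamp to the range in which at least one rule always fires.
    x = min(max(noise_db, 40.0), 90.0)
    weights = {name: tri(x, *abc) for name, abc in NOISE_SETS.items()}
    total = sum(weights.values())
    return sum(weights[n] * RULE_GAIN_DB[n] for n in weights) / total


if __name__ == "__main__":
    for level in (30, 55, 65, 75, 100):
        print(level, "dB noise ->", round(volume_gain_db(level), 2), "dB gain")
```

Where two sets overlap, both rules fire and the gain is a blend of their outputs, so the volume changes smoothly as the noise level rises instead of jumping between fixed steps.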

09:30 h
I2 Ideal Point Modeling of Speech Quality in Mobile Communications Based on Multi-Dimensional Scaling (MDS) – Ville-Veikko Mattila, Nokia Research Center, Tampere, Finland

Multi-dimensional scaling and preference mapping were used for the perceptual analysis of speech quality in mobile communications. Forty-one processing chains, representing, e.g., transmission of speech over mobile networks, were studied. Thirty screened subjects were used in the quality test and 15 screened and trained subjects in the MDS test. Based on an external profiling of the auditory characteristics, the dimensions appeared to relate to the naturalness of speech, limitation of the frequency band, smoothness of speech, a bubbling sound alternating with the signal and noisiness of speech. The Phase I ideal point model was used to predict the quality with an average error of about 6%, and to study the interaction between the attributes and the linearity of the attributes.
Convention Paper 5546
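
To make the ideal point idea in abstract I2 concrete, here is a minimal sketch in which predicted quality falls off with the weighted squared distance between a stimulus and an ideal point in the perceptual space. This is a simplified form with per-dimension weights and no rotation, which may differ from the Phase I variant used in the paper. The function name, the coordinates, and the weights are hypothetical.

```python
def ideal_point_quality(coords, ideal, weights, q_max):
    """Predict quality as q_max minus the weighted squared distance
    from the stimulus coordinates to the ideal point.

    coords, ideal and weights are equal-length sequences, one entry
    per perceptual dimension (for example naturalness or noisiness).
    """
    if not (len(coords) == len(ideal) == len(weights)):
        raise ValueError("coords, ideal and weights must match in length")
    distance_sq = sum(
        w * (x - i) ** 2 for x, i, w in zip(coords, ideal, weights)
    )
    return q_max - distance_sq


if __name__ == "__main__":
    # Hypothetical two-dimensional perceptual space.
    ideal = (1.0, 0.0)
    weights = (0.5, 2.0)
    for stimulus in [(1.0, 0.0), (0.0, 0.0), (1.0, 1.0)]:
        print(stimulus, ideal_point_quality(stimulus, ideal, weights, 5.0))
```

A larger weight on a dimension means that departures from the ideal along that dimension cost more quality, which is one way to express that attributes differ in importance.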

10:00 h
I3 Subjective Evaluation of Perceived Spatial Differences in Car Audio Systems Using a Graphical Assessment Language – Natanya Ford¹, Francis Rumsey¹, Tim Nind² - ¹University of Surrey, Guildford, UK; ²Harman/Becker Automotive Systems, Bridgend, UK

Results from preliminary investigations studying graphical elicitation techniques suggest that a graphical assessment language, whereby listeners use their own non-verbal descriptors to depict spatial attributes of a reproduced sound, may be effective for demonstrating differences in perceived image skew and scene width. This paper investigates the use of a graphical assessment language for evaluating subjective differences in car audio systems with respect to their distortion of stereo images from sub-optimal listening locations. The study compares the image obtained from a surround processing system and conventional two-channel stereo reproduction, analyzing the graphical depictions obtained using conventional statistical methods. Source material for the investigation employs both time and amplitude variation to position instruments within the reproduced stereo scene.
Convention Paper 5547

10:30 h
I4 The Continuity Illusion in Virtual Auditory Space – Michael Kelly, Anthony Tew, University of York, York, UK

The cocktail party effect describes the human ability to direct attention to a single sound source amongst a mixture of competing sources. Under certain conditions signal fragments from these sources can be removed without creating a perceptible effect (often referred to as the continuity illusion). In this study we evaluate the impact of removing fine spectral detail in the regions of spectro-temporal overlap between a pure tone and a white noise source using Fourier spectral gating. We go on to discuss the use of spectral gating in the treatment of natural signals in relation to the discontinuities that are introduced and their effect on the continuity illusion.
Convention Paper 5548

11:00 h
I5 The Perception of Multiple Broadband Noise Sources Presented Concurrently in Virtual Auditory Space – Andre van Schaik, Virginia Best, Simon Carlile, University of Sydney, Sydney, Australia

The effect of spatial separation on the ability of subjects to hear both sounds in a pair of concurrent broadband sounds presented in virtual auditory space was examined. Results suggested that this ability relied strongly on differences in the binaural cues delivered by the sounds, and stimulus pairs could not be separated if they delivered the same binaural cues. A relatively simple model was developed to explain these data using a combination of computational tools relevant to auditory processing: (a) a 128-channel cochlea model, (b) spike generation representing the temporal structure of the energy in each channel, (c) within-channel cross-correlation of left and right ear spike patterns, (d) exclusion of low-energy channels and (e) aggregation of cross-correlation results over remaining channels.
Convention Paper 5549
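
Steps (c) to (e) of the model in abstract I5 can be sketched as follows. This is only an illustration under stated assumptions: it works on raw per-channel sample lists and does not include the cochlea model or the spike generation of steps (a) and (b). The function names, the energy threshold, and the test signals are hypothetical.

```python
def cross_correlation(left, right, max_lag):
    """Return {lag: sum over n of left[n] * right[n + lag]}."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for n, sample in enumerate(left):
            m = n + lag
            if 0 <= m < len(right):
                total += sample * right[m]
        result[lag] = total
    return result


def aggregate_across_channels(channels, max_lag, energy_floor):
    """Sum within-channel cross-correlations, skipping weak channels.

    channels is a list of (left, right) sample lists, one pair per
    frequency channel. Returns (best_lag, summed_correlation).
    """
    summed = {lag: 0.0 for lag in range(-max_lag, max_lag + 1)}
    for left, right in channels:
        energy = sum(x * x for x in left) + sum(x * x for x in right)
        if energy < energy_floor:
            continue  # step (d): exclude low-energy channels
        for lag, value in cross_correlation(left, right, max_lag).items():
            summed[lag] += value  # step (e): aggregate over channels
    best_lag = max(summed, key=summed.get)
    return best_lag, summed


if __name__ == "__main__":
    # Strong channel: right ear lags the left ear by two samples.
    strong = ([0, 0, 1, 0, 0, 0, 0, 0], [0, 0, 0, 0, 1, 0, 0, 0])
    # Weak channel pointing the other way; it falls below the floor.
    weak = ([0, 0, 0, 0, 0.01, 0, 0, 0], [0, 0, 0.01, 0, 0, 0, 0, 0])
    lag, _ = aggregate_across_channels([strong, weak], 3, 0.01)
    print("peak interaural lag (samples):", lag)
```

The position of the peak in the summed correlation acts as an estimate of the interaural delay, and two concurrent sources that deliver the same delay would produce a single peak in this sketch.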

11:30 h
I6 Binaural Modeling of Multiple Sound Source Perception: Coloration of Wideband Sound – Kazuho Ono¹, Ville Pulkki², Matti Karjalainen² - ¹NHK Science and Technical Research Laboratories, Tokyo, Japan; ²Helsinki University of Technology, Espoo, Finland

Binaural modeling of coloration of sound perceived due to multiple coherent sources is studied under the condition in which sounds arrive at a listener successively within a certain time delay. Our former work showed that a binaural model of timbre perception describes the coloration almost independently of the directional perception, based on listening tests using 1-Bark-width band-limited noise and pulse trains. The present work extends that study to verify the timbre model, including listening experiments with broadband noise and pulse trains. The results show that our modeling still agrees well with the listening test results for broadband signals in general, but also show clear deviation from the band-limited noise case, especially at low frequencies.
Convention Paper 5550

12:00 h
I7 Determination of the Relative Hierarchy of Audible Cues in Conflict – Koray Ozcan¹, Simon Busbridge², Peter Fryer³ - ¹University of Brighton, Brighton, UK; ²University of Brighton, Moulsecoomb, UK; ³B&W Loudspeakers Ltd, Steyning, UK

The reproduction of sounds in rooms generates multiple (auditory) cues, the relative importance of which is still not clearly understood. This paper presents an experimental investigation to determine the relative hierarchy of conflicting cues. Subjects were asked to auralize different stimuli when two cues were put into conflict. The relative importance of interaural time, phase and intensity differences, the effect of pinnae, and motion and reverberation cues has been determined. The use of windowed tone bursts allowed multiple variables to be controlled, simulating real signals. The results show the change in auralization as one cue is varied in opposition to another. The aim of this work is to identify the major and minor roles of different auditory cues in acoustic virtual reality.
Convention Paper 5551

12:30 h
I8 Multidimensional Perceptual Scaling of Tone Color Variation in Three Modeled Guitar Amplifiers – William Martens, Atsushi Marui, University of Aizu, Aizuwakamatsu-shi, Japan

Multidimensional perceptual scaling analyses were performed for a set of stimuli that were generated by submitting two pre-recorded guitar performances to a popular effects processor designed to model a variety of guitar amplifiers. Within three characteristic types of amplifier distortion (British Crunch, Combo 335, and Twin Drive), the tone color of the output was varied using three nominal output character settings (Normal, Edge, and Punch). As it was only the variation in timbre and tone coloration that was of interest, the loudness of the processor outputs was equalized prior to listening sessions in order to determine the most salient perceptual attributes of these amplifier models as their output character was varied. This analysis separated out two salient tone-coloration dimensions from a third dimension of timbral variation. This third dimension corresponded to a timbral characteristic particular to the three modeled amplifier types. Interpretation of the meanings of the three dimensions was aided by the results of a semantic differential analysis for the same sounds using bipolar adjective scales. The timbral quality distinguishing the three modeled amplifiers was well described by the verbal attributes “wildness” and “hardness.” The tone coloration variation introduced particularly by the Punch output character settings was most highly correlated with ratings on “thickness” and “heaviness” scales. A straightforward relation was also found for the use of the “sharpness” and “muddiness” scales in describing tone coloration variation introduced by the Edge output character setting, though interpretation was complicated somewhat by the correlation of these ratings with the timbral quality that differed between the amplifiers. The results of this study provided the basis for a graphical user interface to computer-controlled musical effects processing that is more immediately accessible to a wide range of users.
Convention Paper 5552