AES Amsterdam 2008: Paper Session P14

P14 - Psychoacoustics, Perception, and Listening Tests - 1

Monday, May 19, 09:00 — 12:00
Chair: Jan de Laat, LUMC - Leiden, The Netherlands

P14-1 Speech Quality Measurement for the Hearing Impaired on the Basis of PESQ—John G. Beerends, TNO Information and Communication Technology - Delft, The Netherlands; Jan Krebber, Technische Universität Dresden - Dresden, Germany, now with Sysopendigia PLC, Helsinki, Finland; Rainer Huber, HörTech GmbH - Oldenburg, Germany; Koen Eneman, Heleen Luts, Katholieke Universiteit Leuven - Leuven, Belgium
One of the research topics within the HearCom project, a European project that studies the impact of hearing loss on communication, is to find methods with which the speech quality as perceived by the hearing impaired can be measured objectively. ITU-T Recommendation P.862 (PESQ) and its wideband extension P.862.2 are obvious candidates for this, despite the fact that they were developed for normal-hearing subjects. This paper investigates the extent to which PESQ and possible simple extensions can be used to measure the quality of speech signals as perceived by hearing-impaired subjects.
Convention Paper 7404

P14-2 Subjective Evaluation of Speech Quality in a Conversational Context—Emilie Geissner, Valérie Gautier-Turbin, France Télécom R&D - Lannion, France; Marie Guéguin, Laboratoire Traitement du Signal et de l'Image - Rennes, France, and Université de Rennes, Rennes, France; Laetitia Gros, France Télécom R&D - Lannion, France
Within the framework of ITU-T, an objective conversational model is developed to predict the impact of network impairments on the conversational quality experienced by an end user. To train and validate such a model, subjective scores are required. Assuming that a conversation is made up of talking, listening, and interaction activities, a subjective test protocol is specially designed to take into account these multidimensional aspects of speech quality in a conversation. Subjects are asked to evaluate speech quality in talking, listening, and conversational contexts separately during three successive tasks. The analyses of several tests show that this method is valid for the assessment of listening, talking, and conversational quality.
Convention Paper 7405

P14-3 Contribution of Interaural Difference to Obstacle Sense of the Blind While Walking—Takahiro Miura, Teruo Muraoka, Shuichi Ino, Tohru Ifukube, University of Tokyo - Tokyo, Japan
Most blind people can recognize, to some extent, objects around them by hearing alone. This ability is called "obstacle sense" or "obstacle perception." It is known that this ability is facilitated while the subjects are moving; however, the exact reason for the facilitation has been unknown. The differences between the sounds reaching the two ears evidently change significantly as a subject approaches an obstacle. We focused on this phenomenon, called interaural difference, in order to analyze the facilitation mechanism of the obstacle sense. We investigated how the interaural differences change depending on head rotation while walking and then measured the DL (Difference Limen) of the interaural difference. Furthermore, we compared the measurement data and the DL in relation to the subject-to-obstacle distance and then discussed one of the factors facilitating the obstacle sense.
Convention Paper 7406

P14-4 The Accuracy of Localizing Virtual Sound Sources: Effects of Pointing Method and Visual Environment—Piotr Majdak, Bernhard Laback, Matthew Goupell, Michael Mihocic, Austrian Academy of Sciences - Vienna, Austria
The ability to localize sound sources in 3-D space was tested in humans. The subjects listened to noises filtered with subject-specific head-related transfer functions. In the experiment using naïve subjects, the conditions included the type of visual environment (darkness or structured virtual world) presented via a head-mounted display and the pointing method (head or manual pointing). The results show that the errors in the horizontal dimension were smaller when head pointing was used. Manual pointing showed smaller errors in the vertical dimension. Generally, the effect of pointing method was significant but small. The presence of a structured virtual visual environment significantly improved the localization accuracy in all conditions. This supports the benefit of using a visual virtual environment in acoustic tasks like sound localization.
Convention Paper 7407

P14-5 Perceived Spatial Distribution and Width of Horizontal Ensemble of Independent Noise Signals as Function of Waveform and Sample Length—Toni Hirvonen, Ville Pulkki, Helsinki University of Technology - Espoo, Finland
This paper investigates the perceived sound distribution and width of a horizontal loudspeaker ensemble as a function of signal length when all loudspeakers simultaneously emit white Gaussian noise bursts. In Experiment 1, subjects indicated the perceived distribution of 10 frozen cases where the signal length was 2.5 ms. In Experiment 2, two cases from the previous test were investigated with signal lengths of 5 to 640 ms. The results indicate that (1) ensembles consisting of different short noise bursts vary in perceived distribution between cases and (2) when the length of the signal is increased, the produced sound event is generally perceived as wider. In perceiving such cases, the hearing system possibly utilizes some temporal integration and/or adaptive processes.
Convention Paper 7408

P14-6 Effect of Minimizing Spatial Separation and Melodic Variations in Simultaneously Presented Two-Syllable Words—Jon Allan, Jan Berg, Luleå University of Technology - Piteå, Sweden
This paper will examine two important factors in the concept of auditory streaming as defined by Bregman: pitch and localization. By removing one or both of these factors as possible identifiers for separating sound sources, the importance of each of them and the effect of reducing both of them will be studied. Stimuli with combinations of two-syllable words will be presented simultaneously to subjects over loudspeakers, and the number of correct identifications will be measured. In one category of stimuli, speech melody will be removed and replaced with a monotonous pitch, equal for all words. One category will have all words presented from one loudspeaker only. Conclusions will be related to earlier studies and common theories, the cocktail party effect among others.
Convention Paper 7409

Last Updated: 20080612, tendeloo
