145th AES CONVENTION Poster Session P14: Perception

AES New York 2018

Friday, October 19, 2:45 pm — 4:15 pm (Poster Area)

P14-1 Perception of Stereo Noise Bursts with Controlled Interchannel Coherence
Steven Crawford, University of Rochester - Rochester, NY, USA; Michael Heilemann, University of Rochester - Rochester, NY, USA; Mark F. Bocko, University of Rochester - Rochester, NY, USA
Lateralization in stereo-rendered acoustic fields with controlled interchannel cross-correlation properties was explored in subjective listening tests. Participants indicated the perceived lateral locations of a series of 2 ms stereo white noise bursts with specified interchannel cross-correlation properties. Participants were also asked to indicate the spatial location and apparent source width of a series of 2 s white noise bursts, each composed of one thousand 2 ms bursts with specified interchannel cross-correlation. The distribution of peak locations in the signals’ cross-correlation corresponds to the perceived spatial extent of the auditory image. This illustrates the role of the averaging time in the short-time windowed cross-correlation model of binaural hearing and how the coherence properties of audio signals determine source image properties in spatial audio rendering.
Convention Paper 10110
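
The idea of a stereo noise burst with a specified interchannel correlation (P14-1) can be illustrated with a minimal sketch. It is not the authors' stimulus-generation method: it only mixes two independent Gaussian noise sequences so that the expected zero-lag correlation between channels equals a chosen value `rho`. The function names, the 48 kHz sample rate, and the seed parameter are assumptions made for illustration.

```python
import math
import random

SAMPLE_RATE = 48000                         # assumed sample rate in Hz
BURST_SAMPLES = round(0.002 * SAMPLE_RATE)  # a 2 ms burst is 96 samples here


def correlated_noise_burst(rho, n_samples, seed=0):
    """Return (left, right) Gaussian noise whose expected zero-lag
    interchannel correlation is rho (hypothetical helper)."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    rng = random.Random(seed)
    mix = math.sqrt(1.0 - rho * rho)  # weight of the independent component
    left, right = [], []
    for _ in range(n_samples):
        a = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, 1.0)
        left.append(a)
        right.append(rho * a + mix * b)
    return left, right


def zero_lag_correlation(x, y):
    """Normalized cross-correlation of two equal-length sequences at lag 0."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den
```

A single 2 ms burst contains too few samples for its measured correlation to match `rho` closely, so the estimate from one short burst scatters around the target value, while a long sequence converges to it.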

P14-2 Analysis of the performance of Evolved Frequency Log-Energy Coefficients in Hearing Aids for different Cost Constraints and Scenarios
Joaquín García-Gómez, University of Alcalá - Alcalá de Henares, Madrid, Spain; Inma Mohíno-Herranz, University of Alcalá - Alcalá de Henares, Madrid, Spain; César Clares-Crespo, University of Alcalá - Alcalá de Henares, Madrid, Spain; Alfredo Fernández-Toloba, University of Alcalá - Alcalá de Henares, Madrid, Spain; Roberto Gil-Pita, University of Alcalá - Alcalá de Henares, Madrid, Spain
Hearing loss is a common problem among older people. Hearing aids compensate for these losses and improve quality of life, but they impose important constraints (limited battery life, the requirement of real-time processing). Because of that, the algorithms implemented in these devices must work at low clock rates. Voice Activity Detection (VAD) is one of the main algorithms used in hearing aids, since it is useful for reducing environmental noise and enhancing speech intelligibility. In this paper a VAD algorithm is tested using the QUT-NOISE-TIMIT Corpus, with different computational cost constraints and at different locations.
Convention Paper 10111
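
For readers unfamiliar with voice activity detection (P14-2), the basic frame-by-frame decision can be sketched as below. This is a generic log-energy threshold detector, chosen because it is cheap to compute; it is not the Evolved Frequency Log-Energy Coefficients approach evaluated in the paper. The function names, frame length, and threshold are assumptions made for illustration.

```python
import math


def frame_log_energy(frame, floor=1e-12):
    """Mean-square energy of one frame in dB (relative to full scale 1.0)."""
    energy = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(max(energy, floor))  # floor avoids log(0)


def energy_vad(samples, frame_len, threshold_db):
    """Return one True/False speech decision per non-overlapping frame.

    A frame is flagged as active when its log energy exceeds threshold_db.
    Trailing samples that do not fill a whole frame are ignored.
    """
    decisions = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        decisions.append(frame_log_energy(frame) > threshold_db)
    return decisions
```

A fixed threshold like this degrades quickly in noise, which is one reason practical detectors rely on richer features and trained classifiers.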

P14-3 Evaluation of Additional Virtual Sound Sources in a 9.1 Loudspeaker Configuration
Sungsoo Kim, New York University - New York, NY, USA; Sripathi Sridhar, New York University - New York, NY, USA
This study aims to evaluate the addition of virtual sound sources to a 9.1 loudspeaker configuration in terms of spatial attributes such as envelopment and sound image width. It is the second part of a study whose first part examined different upmixing algorithms for converting stereo to a 9.1 mix. Four virtual sound sources (VSS) are added to a 9.1 configuration to simulate virtual loudspeakers in the height layer with the help of Vector-Based Amplitude Panning (VBAP). A subjective test is conducted to determine whether listeners perceive an improvement due to the addition of VSS channels in the height layer.
Convention Paper 10112
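
The VBAP step mentioned in P14-3 can be sketched for a single loudspeaker triplet. In three-dimensional VBAP the virtual source direction is written as a weighted sum of three loudspeaker direction vectors, and the weights become the loudspeaker gains after power normalization. The sketch below solves for those weights with Cramer's rule. The loudspeaker angles in the usage note are hypothetical and are not the layout used in the study; the function names are likewise assumptions.

```python
import math


def unit_vector(azimuth_deg, elevation_deg):
    """Cartesian unit vector for a direction given in degrees."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))


def det3(a, b, c):
    """Scalar triple product a . (b x c)."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def vbap_gains(source, speakers, tol=1e-9):
    """Gains g such that source is proportional to g1*l1 + g2*l2 + g3*l3.

    Gains are normalized so that their squares sum to 1 (constant power).
    """
    l1, l2, l3 = speakers
    d = det3(l1, l2, l3)
    if abs(d) < tol:
        raise ValueError("loudspeaker directions are linearly dependent")
    raw = (det3(source, l2, l3) / d,
           det3(l1, source, l3) / d,
           det3(l1, l2, source) / d)
    if min(raw) < -tol:
        raise ValueError("source lies outside the loudspeaker triplet")
    clipped = [max(0.0, g) for g in raw]  # remove tiny negative rounding errors
    norm = math.sqrt(sum(g * g for g in clipped))
    return [g / norm for g in clipped]
```

As a usage example under the assumed layout, `vbap_gains(unit_vector(0, 20), [unit_vector(30, 0), unit_vector(-30, 0), unit_vector(0, 45)])` places a virtual source between two ear-level loudspeakers and one elevated loudspeaker; the two ear-level gains come out equal because the source lies on the plane of symmetry.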

P14-4 Noticeable Rate of Continuous Change of Intensity for Naturalistic Music Listening in Attentive and Inattentive Audiences
Yuval Adler, McGill University - Montreal, Quebec, Canada; CCRMA - Stanford, CA, USA
This study investigated the threshold of noticeability for a continuous rate of change in intensity and how listener attention affects that threshold rate. Results suggest listener attention has a strong effect on the threshold. Much previous work has sought the intensity discrimination threshold of human hearing by comparing consecutive stimuli differing in intensity, but not by applying a constant change over a long time to a continuous stimulus. Exposure to high-intensity sounds over time can damage hearing, so the driving goal behind this investigation is to inform the development of a mechanism that lowers the intensity of sound listeners are subjected to when consuming music while having minimal effect on perceived loudness.
Convention Paper 10113
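
A continuous change of intensity at a constant rate, as studied in P14-4, corresponds to a gain that is linear in decibels over time. The sketch below shows that relationship only; it is not the mechanism the paper aims to inform, and the function names and the rates used in the usage note are assumptions, not values from the study.

```python
def ramp_gain(rate_db_per_s, t_seconds):
    """Linear amplitude gain after t_seconds of a constant dB-per-second ramp."""
    return 10.0 ** (rate_db_per_s * t_seconds / 20.0)


def apply_intensity_ramp(samples, sample_rate, rate_db_per_s):
    """Scale each sample by the ramp gain at that sample's time offset."""
    return [s * ramp_gain(rate_db_per_s, n / sample_rate)
            for n, s in enumerate(samples)]
```

For example, a hypothetical rate of -0.5 dB/s lowers the level by 6 dB after 12 seconds, which roughly halves the signal amplitude.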

P14-5 Environment Replication with Binaural Recording: Three-Dimensional (3-D) Quadrant and Elevation Localization Accuracy
Joseph Erichsen, Belmont University - Nashville, TN, USA; Wesley Bulla, Belmont University - Nashville, TN, USA
The purpose of this experiment was to examine the spatial accuracy of an environmental image created via binaural recording. Listeners were asked to localize 10 sources, each positioned in one of four horizontal quadrants in three vertical planes. Binaural recordings were made in both anechoic and reverberant environments, and subjective tests were conducted. The experiment yielded data for a comparative study of the effectiveness of binaural recording in recreating the perceived source locations in “3-D” space around a specified listening position. ANOVA for overall accuracy and target hit/miss binomial measures, in the free field and with binaural recordings reproduced over headphones, revealed areas of concern for future investigation as well as measures of relative accuracy for the experimental environments.
Convention Paper 10114

P14-6 Localization of Elevated Virtual Sources Using Four HRTF Datasets
Patrick Flanagan, THX Ltd. - San Francisco, CA, USA; Juan Simon Calle Benitez, THX Ltd. - San Francisco, CA, USA
At the core of spatial audio renderers are the HRTF filters that are used to virtually place sounds in space. There are different ways to calculate these filters, from acoustical measurements to digital calculations using images. In this paper we evaluate the localization of elevated sources using four different HRTF datasets. The datasets used are SADIE (University of York), KEMAR (MIT), CIPIC (UC Davis), and a personalized dataset that uses an image-capturing technique in which features are extracted from the pinnae. Twenty subjects were asked to determine the location of randomly placed sounds by selecting the azimuth and the elevation from which they felt the sound was coming. It was found that elevation accuracy is better for HRTFs that are located near elevation = 0°. There was a tendency to under-aim and over-aim towards the area between 0° and 20° in elevation. Elevation had a strong effect on azimuth localization for sounds placed above 60°.
Convention Paper 10115
