Chair: Tony Tew
Spatial audio rendering techniques often make assumptions about head-related transfer functions (HRTFs) to simplify the reproduction process; state-of-the-art techniques, however, require critical consideration of transfer-function accuracy. This paper uses boundary-element-method HRTFs to investigate the perceptual spectral difference (PSD) that arises from distance and angle discrepancies. PSD is calculated between HRTFs at various radial distances, and between HRTFs in head-centred and ear-centred systems as a function of radial distance. The distance at which the average and maximum PSD fall below the threshold of perception is determined. Average HRTF variation reaches perceptual limits by 2-3 m; perceivable PSD values, however, occur up to and beyond 10 m, indicating that care must be taken when approximating either the distance or the angular location of HRTFs.
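The spectral comparison described above can be illustrated with a minimal sketch. The function below computes a plain mean absolute magnitude difference in dB between two head-related impulse responses; this is an assumption-laden simplification, not the paper's PSD model, which additionally applies perceptual (auditory-filter) weighting.

```python
import numpy as np

def spectral_difference_db(hrir_a, hrir_b, n_fft=512):
    """Mean absolute magnitude difference (dB) between two HRIRs.

    Simplified stand-in for a perceptual spectral difference metric;
    the auditory-filter weighting used in a true PSD model is omitted.
    """
    mag_a = np.abs(np.fft.rfft(hrir_a, n_fft))
    mag_b = np.abs(np.fft.rfft(hrir_b, n_fft))
    eps = 1e-12  # guard against log of zero
    diff_db = 20.0 * np.log10((mag_a + eps) / (mag_b + eps))
    return float(np.mean(np.abs(diff_db)))

# Identical responses yield zero difference; a uniform 2x gain
# yields a constant 20*log10(2) ~ 6.02 dB difference.
h = np.zeros(256)
h[0] = 1.0
print(spectral_difference_db(h, h))
print(spectral_difference_db(2 * h, h))
```

Comparing such a metric against a perceptual threshold, as a function of source distance, mirrors the analysis the abstract describes.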
Binaural rendering allows us to reproduce auditory scenes through headphones while preserving spatial cues. The best results are achieved if the headphone effect is compensated with an individualized filter, which depends on the headphone transfer function, ear morphology and fit. However, because remeasuring a new filter every time the user repositions the headphones is impractical, generic compensation may be of interest. In this study, the effects of generic headphone equalization in binaural rendering are evaluated objectively and subjectively, with respect to unequalized and individually equalized cases. Results show that generic headphone equalization yields perceptual benefits similar to individual equalization for non-individual binaural renderings, and that it increases overall quality, reduces coloration, and improves distance perception compared to unequalized renderings.
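A headphone equalization filter of the kind discussed above is commonly obtained by inverting a measured headphone transfer function. The sketch below shows one standard, assumed approach (regularized frequency-domain inversion); it is not the specific method of the study, and real designs would also smooth and band-limit the inverse.

```python
import numpy as np

def inverse_eq_filter(hp_ir, n_fft=1024, beta=0.01):
    """Regularized inverse of a headphone impulse response.

    Uses H_eq = conj(H) / (|H|^2 + beta), where beta limits gain
    at frequencies where the headphone response has deep notches.
    A minimal sketch, not the study's equalization procedure.
    """
    H = np.fft.rfft(hp_ir, n_fft)
    H_inv = np.conj(H) / (np.abs(H) ** 2 + beta)
    return np.fft.irfft(H_inv, n_fft)

# Usage: convolve the binaural signal with the returned filter
# (e.g. np.convolve) before headphone playback.
```

For a flat (delta) headphone response, the regularization simply scales the passthrough gain by 1/(1 + beta), which is the price paid for robustness at spectral notches.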
Traditional sound localization studies are often performed in anechoic chambers and in complete darkness. In our daily life, however, we are exposed to rich auditory scenes with multiple sound sources and complementary visual information. Although it is understood that the presence of maskers hinders auditory spatial awareness, it is not known whether competing sound sources can provide spatial information that helps in localizing a target stimulus. In this study, we explore the effect of presenting controlled auditory scenes with different amounts of visual and spatial cues during a sound localization task. A novel, gamified localization task is also presented. Preliminary results suggest that subjects who are exposed to audio-visual anchors show faster improvements than those who are not.
Verbal descriptors used by participants when describing audio-visual distance match and mismatch conditions in cinematic VR are analyzed to expose underlying similarities and structures. The participants are analyzed from two perspectives: accuracy in auditory distance discrimination and audio expertise. Similarities are found in the verbal descriptors used within accurate groups and within inaccurate groups, but not within groups split by expertise. We propose that the use of descriptors can be explained by internalized certainty. Audio experts and non-experts were equally likely to be accurate or consistent; thus audio expertise is not synonymous with spatial audio expertise, which demands unique consideration.
An experiment was conducted to examine the ability of human listeners to segregate two sound sources presented concurrently from different directions in the median plane. High-pass filtered pink noise was used as the stimulus in a free-field condition and presented either as a pair of incoherent sound sources or as a single source. Subjects were tested with both monaural and binaural hearing, and responded whether they perceived sound from one or two directions. The proportion of "two directions" responses for pairwise stimuli exceeded 50% above 33.75° separation and reached above 70% at 67.5° in both hearing sessions. The ability to segregate sources in the median plane did not differ prominently between binaural and monaural hearing.