AES Journal

Journal of the AES

2023 October - Volume 71 Number 10

Review Papers

Tracing Distortion on Vinyl LPs

Open Access

Author: Jovanovic, Vladan
Affiliation: Consultant, Bloomington, IN
Page: 616

Tracing errors arise because the reproducing stylus is of a different shape than the cutting chisel used to create the original acetate lacquer master for vinyl Long Play (LP) records. Tracing errors are typically the most significant source of distortion in vinyl reproduction and probably the main reason manufacturers of pickup cartridges seldom specified distortion figures for their products. In this paper, a historical overview of harmonic distortion results due to tracing errors is provided. In many cases, these results are 70 to 80 years old and, at least in some cases, seem largely forgotten by now. Some new simulation results are provided to verify various approximations proposed and used in the past.


Papers

Six-Degrees-of-Freedom Binaural Reproduction of Head-Worn Microphone Array Capture

Open Access

Authors: McCormack, Leo; Meyer-Kahlen, Nils; Lou Alon, David; Ben-Hur, Zamir; V. Amengual Garí, Sebastià; Robinson, Philip
Affiliation: Reality Labs Research, Meta, Redmond, WA; Reality Labs Research, Meta, Redmond, WA; Reality Labs Research, Meta, Redmond, WA; Reality Labs Research, Meta, Redmond, WA; Reality Labs Research, Meta, Redmond, WA; Reality Labs Research, Meta, Redmond, WA
Page: 638

This article formulates and evaluates four different methods for six-degrees-of-freedom binaural reproduction of head-worn microphone array recordings, which may find application within future augmented reality contexts. Three of the explored methods are signal-independent, utilizing least-squares, magnitude least-squares, or plane-wave-decomposition-based solutions. Rotations and translations are realized by applying directional transformations to the employed spherical rendering or optimization grid. The fourth considered approach is a parametric signal-dependent alternative, which decomposes the array signals into directional and ambient components using beamformers. The directional components are then spatialized by applying binaural filters corresponding to the transformed directions, whereas the ambient sounds are reproduced using the magnitude least-squares solution. Formal perceptual studies were conducted, whereby test participants rated the perceived relative quality of the four binaural rendering methods being evaluated. Of the three signal-independent approaches, the magnitude least-squares solution was rated the highest. The parametric approach was then rated higher than the magnitude least-squares solution when the listeners were permitted to move away from the recording point.


Effects of Head-Tracking Artefacts on Externalization and Localization in Azimuth With Binaural Wearable Devices

Open Access

Authors: Grimaldi, Vincent; S.R. Simon, Laurent; Courtois, Gilles; Lissek, Hervé
Affiliation: LTS2 - Groupe Acoustique, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland; Sonova AG, Stäfa, Switzerland; Sonova AG, Stäfa, Switzerland; LTS2 - Groupe Acoustique, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Page: 650

Head tracking combined with head movements has been shown to improve auditory externalization of a virtual sound source and to contribute to localization performance. With certain technically constrained head-tracking algorithms, as can be found in wearable devices, artefacts can be encountered. Typical artefacts could consist of an estimation mismatch or a tracking latency. The experiments reported in this article aim to evaluate the effect of such artefacts on the spatial perception of a non-individualized binaural synthesis algorithm. The first experiment focused on auditory externalization of a frontal source while the listener was performing a large head movement. The results showed that degraded head tracking combined with head movement yields a higher degree of externalization compared with head movements with no head tracking. This suggests that the listeners could still take advantage of spatial cues provided by the head movement. The second experiment consisted of a localization task in azimuth with the same simulated head-tracking artefacts. The results showed that a large latency (400 ms) did not affect the ability of the listeners to locate virtual sound sources compared with reference head tracking. However, the estimation mismatch artefact reduced the localization performance in azimuth.


Spatio-Temporal Windowing for Encoding Perceptually Salient Early Reflections in Parametric Spatial Audio Rendering

Open Access

Authors: Jüterbock, Tobias; Brinkmann, Fabian; Gamper, Hannes; Raghuvanshi, Nikunj; Weinzierl, Stefan
Affiliation: Audio Communication Group, Technical University of Berlin, Berlin, Germany; Audio Communication Group, Technical University of Berlin, Berlin, Germany; Microsoft Research Redmond, Redmond, WA; Microsoft Research Redmond, Redmond, WA; Audio Communication Group, Technical University of Berlin, Berlin, Germany
Page: 664

Parametric spatial audio rendering aims to provide perceptually convincing audio cues that are agnostic to the playback system to enable the acoustic design of games and virtual reality. The authors propose an algorithm for detecting perceptually important reflections from spatial room impulse responses. First, a parametric representation of the sound field is derived based on perceptually motivated spatio-temporal windowing, followed by a second step that estimates the perceptual salience of the detected reflections by means of a masking threshold. In this work, a vertical dependency is incorporated into both these components. This was inspired by recent research revealing that two sound sources in the median plane can evoke two independent auditory events if their spatial separation is sufficiently large. The proposed algorithm is evaluated in nine simulated shoebox rooms with a wide range of sizes and reverberation times. Evaluation results show improved selection of early reflections by accounting for source elevation and suggest that for speech signals, the perceptual quality increases with an increasing number of rendered early reflections.


When Spatial Sounds Affect the Ability to Apprehend Visual Information: A Physiological Approach

Authors: Mendonça, Catarina; Wang, Heng; Pulkki, Ville
Affiliation: University of the Azores, Ponta Delgada, Portugal, Porto University, Porto, Portugal; Aalto University, Espoo, Finland; Aalto University, Espoo, Finland
Page: 679

The current technological solutions for spatial audio provide realistic auditory impressions but rarely account for multisensory interactions. The intent of this study was to discover if and when spatial sounds could lower the accuracy of visual perception. Sequences of light and sound events were presented, and different sound parameters were tested: spatial and temporal congruency, horizontal and vertical spatial distribution, and source broadness. Participants were asked to report the location of the last visual event in a left-right discrimination task. During the task, cognitive effort was monitored through pupil size measurements. It was found that both spatial and temporal congruence are important for higher accuracy levels and lower cognitive effort levels. However, spatial congruence was found not to be crucial if sounds occur within the same spatial region as the visual events. Sounds hindered the visual accuracy levels and increased effort when they occurred within a field narrower or wider than, but not too discrepant from, that of the visual events. These effects were replicated with vertical sound distributions. Broad sounds made the task more effortful and limited negative effects of spatially mismatched audiovisual events. When creating spatial sound for audiovisual reproductions, source distribution and broadness should be intentionally controlled.


A Perceptual Model of Spatial Quality for Automotive Audio Systems

Open Access

Authors: Koya, Daisuke; Mason, Russell; Dewhirst, Martin; Bech, Søren
Affiliation: University of Surrey, Guildford Surrey GU2 7XH, United Kingdom; University of Surrey, Guildford Surrey GU2 7XH, United Kingdom; Focusrite Audio Engineering Ltd., Artisan Hillbottom Road, High Wycombe Buckinghamshire HP12 4HJ, United Kingdom; Bang & Olufsen, Peter Bangs Vej 15 Struer, 7600, Denmark
Page: 689

A perceptual model was developed to evaluate the spatial quality of automotive audio systems by adapting the Quality Evaluation of Spatial Transmission and Reproduction by an Artificial Listener (QESTRAL) model of spatial quality developed for domestic audio systems. The QESTRAL model was modified to use a combination of existing and newly created metrics, based on, in order of importance, the interaural cross-correlation, reproduced source angle, scene width, level, entropy, and spectral roll-off. The resulting model predicts the overall spatial quality of two-channel and five-channel automotive audio systems with a cross-validation R² of 0.85 and root-mean-square error (RMSE) of 11.03%. The performance of the modified model improved considerably for automotive applications compared with that of the original model, which had a prediction R² of 0.72 and RMSE of 29.39%. Modifying the model for automotive audio systems did not invalidate its use for domestic audio systems, which were predicted with an R² of 0.77 and RMSE of 11.90%.


Engineering Reports

Listener Preferences for High-Frequency Response of Insert Headphones

Open Access

Authors: Miller, Thomas; Downey, Cristina
Affiliation: Knowles Electronics, LLC, Itasca, IL, USA; Knowles Electronics, LLC, Itasca, IL, USA
Page: 707

The frequency response of a headphone is very important for listener satisfaction. Listener preferences have been well studied for frequencies below 10 kHz, but preferences above that frequency are less well known. Recent improvements in the high-frequency performance of ear simulators make it more practical to study this frequency region now. The goal of this study was to determine the preferred headphone response for insert headphones for the audible range above 10 kHz. A new target response is proposed, based on listener preference ratings in a blind listening test. The results show a clear preference for significantly more high-frequency energy than was proposed in a previous popular headphone target curve. The preferred response is also affected by the listener's hearing thresholds, with additional high-frequency boost being preferred for listeners with age-related hearing loss.


On-Device Intelligence for Real-Time Audio Classification and Enhancement

Authors: Hwang, Inwoo; Kim, Kibeom; Kim, Sunmin
Affiliation: Sound Laboratory, Visual Display Division, Samsung Electronics, Suwon, South Korea; Sound Laboratory, Visual Display Division, Samsung Electronics, Suwon, South Korea; Sound Laboratory, Visual Display Division, Samsung Electronics, Suwon, South Korea
Page: 719

Audio enhancement is a signal processing method that improves the listening experience. Although most audio devices provide a variety of sound-enhancing effects, it is reported that very few people are active users of this feature. This lack of usability comes from insufficient sound improvement because of concerns about scene-rendering mismatch, which means that processing applied to an unintended target may even damage the sound quality. The key solution to this problem is sound intelligence that provides an optimal sound effect with very low latency. The authors propose a real-time audio enhancement system based on a highly precise audio scene classifier using convolutional neural networks. The entire computation, including convolutions, is optimized for implementation at the digital signal processing level, resulting in enhanced audio outputs for every audio frame.


Standards and Information Documents

AES Standards Committee News

Page: 729


Features

Call For Papers: Special Issue on Spatial and Immersive Audio

Page: 731


Departments

Obituaries

Page: 730


Conv&Conf

Page: 732

