AES Journal

Journal of the AES

2020 September - Volume 68 Number 9


A Letter From the Outgoing Editor

Author: Bozena Kostek

Page: 612

Download: PDF (39KB)

Papers

Wavelet-Based Spatial Audio Format


Ambisonics is a spatial audio technique covering all steps of the audio production chain, from encoding and recording to transmission and decoding, whose building blocks are the spherical harmonics. Some of the drawbacks of low-order Ambisonics, such as large source spread and a small sweet spot, are directly related to the fact that spherical harmonics do not have compact support on the sphere. In this paper we propose a novel spatial audio format similar in spirit to Ambisonics but which replaces the spherical harmonics with an alternative set of functions with compact support, the spherical wavelets. We develop a complete audio chain from encoding to decoding, using discrete spherical wavelets built on a multiresolution mesh, and illustrate it with an example implementation of the format. We present a decoding algorithm that optimizes acoustic and psychoacoustic parameters and can generate decoding matrices for irregular layouts, for both Ambisonics and the new wavelet format. This audio workflow is directly compared with Ambisonics. For an industry-standard loudspeaker layout, we show how we can reach well-localized sound sources with almost no negative gains (which are a common issue in most Ambisonics decoder designs). The approach is very flexible: there are different possible incarnations of the wavelet-based audio format, depending on the specific multiresolution mesh and the wavelet family, making it possible to customize the format, for example by adapting it to meshes that closely resemble the distribution of loudspeakers in standard layouts.
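The source spread and negative gains mentioned in the abstract can be illustrated with a minimal sketch. It is not the decoder proposed in the paper: it assumes an idealised "basic" Ambisonic decode over a continuous loudspeaker distribution, where the gain at angle theta from the source is the truncated spherical-harmonic expansion sum of (2n + 1) P_n(cos theta). The function name `basic_panning_gain` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def basic_panning_gain(angle_rad, order):
    """Idealised basic-decoder gain at a given angle from the source.

    Assumption: g(theta) = sum_{n=0}^{N} (2n + 1) P_n(cos theta),
    normalised so that the gain in the source direction is 1.
    """
    weights = 2 * np.arange(order + 1) + 1            # (2n + 1) for n = 0..N
    gain = legendre.legval(np.cos(angle_rad), weights)
    return gain / np.sum(weights)                     # P_n(1) = 1, so g(0) = sum of weights


# Angles between the source and a loudspeaker direction, in radians.
angles = np.radians([0, 45, 90, 135, 180])
first_order = basic_panning_gain(angles, order=1)
third_order = basic_panning_gain(angles, order=3)

for deg, g1, g3 in zip([0, 45, 90, 135, 180], first_order, third_order):
    print(f"{deg:3d} deg  order 1: {g1:+.3f}  order 3: {g3:+.3f}")
```

In this idealised model the first-order gain only falls to a quarter of its peak at 90 degrees and turns negative behind the source, which is one way to see why functions without compact support lead to wide sources and out-of-phase loudspeaker feeds.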

  Download: PDF (HIGH Res) (7.2MB)

  Download: PDF (LOW Res) (1.2MB)


Effect of Skill Level on Listener Performance in 3D Audio Evaluation


A previous experiment (Part 1) found that, within the context of 3D audio evaluation, both audio production experience and musical training were significant predictors of listener consistency in making preference or attribute rating judgments of stimuli. In that study, 72 subjects ranging from highly experienced to naïve listeners evaluated an excerpt of orchestral music captured by three different 3D music-recording techniques. Using the same data set from Part 1, the current study (Part 2) examines whether the results of skilled listeners can be generalized to the larger population of unskilled listeners within the context of 3D audio evaluation. Results show no significant changes in the rank order of recording technique attribute ratings or preferences as a function of listener skill. Results also show that using highly skilled participants will result in gains in statistical power. This allows for the detection of subtler differences between stimuli or greater efficiency in the number of trials needed to achieve a significant result.

  Download: PDF (HIGH Res) (707KB)

  Download: PDF (LOW Res) (375KB)


Comparison of Pairwise Dissimilarity and Projective Mapping Tasks With Auditory Stimuli

Open Access



Two methods for undertaking subjective evaluation were compared: a pairwise dissimilarity task (PDT) and a projective mapping task (PMT). For a set of unambiguous, synthetic, auditory stimuli, the aim was to determine the following: whether the PMT limits the recovered dimensionality to two dimensions; how subjects respond using PMT’s two-dimensional response format; the relative time required for PDT and PMT; and hence, whether PMT is an appropriate alternative to PDT for experiments involving auditory stimuli. The results of both Multi-Dimensional Scaling (MDS) analyses and Multiple Factor Analyses (MFA) indicate that, with multiple participants, PMT allows for the recovery of three meaningful dimensions. The results from the MDS and MFA analyses of the PDT data, on the other hand, were ambiguous and did not enable recovery of more than two meaningful dimensions. This result was unexpected given that PDT is generally considered not to limit the dimensionality that can be recovered. Participants took less time to complete the experiment using PMT compared to PDT (a median ratio of approximately 1:4), and employed a range of strategies to express three perceptual dimensions using PMT’s two-dimensional response format. PMT may provide a viable and efficient means to elicit up to 3-dimensional responses from listeners.
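As a rough illustration of how an MDS analysis recovers dimensions from pairwise dissimilarities, the sketch below uses classical (Torgerson) MDS. The abstract does not say which MDS variant the authors used, so this choice is an assumption, and the stimuli are synthetic points invented for the example.

```python
import numpy as np


def classical_mds(dissimilarity, n_dims):
    """Classical (Torgerson) MDS: embed items so distances match the dissimilarities."""
    d2 = np.asarray(dissimilarity, dtype=float) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n       # double-centering matrix
    gram = -0.5 * centering @ d2 @ centering          # inner-product (Gram) matrix
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1]                 # largest eigenvalues first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scale = np.sqrt(np.clip(eigvals[:n_dims], 0, None))
    return eigvecs[:, :n_dims] * scale, eigvals


# Hypothetical example: 8 stimuli that truly differ along 3 perceptual dimensions.
rng = np.random.default_rng(0)
true_points = rng.normal(size=(8, 3))
dissim = np.linalg.norm(true_points[:, None] - true_points[None, :], axis=-1)

coords, eigvals = classical_mds(dissim, n_dims=3)
recovered = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
print("largest eigenvalues:", np.round(eigvals[:5], 3))
```

With noise-free dissimilarities only three eigenvalues are meaningfully non-zero, which is the kind of evidence used to decide how many dimensions are recoverable. Real listener data are noisy, so the cut-off is rarely this clear.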

  Download: PDF (HIGH Res) (1.2MB)

  Download: PDF (LOW Res) (780KB)


A Method for Spatial Upsampling of Voice Directivity by Directional Equalization


To describe the sound radiation of the human voice in all directions, measurements need to be performed on a spherical grid. However, the resolution of such captured directivity patterns is limited, and methods for spatial upsampling are required, for example interpolation in the spherical harmonics (SH) domain. As the number of measurement directions limits the resolvable SH order, the directivity pattern suffers from spatial aliasing and order-truncation errors. We present an approach for spatial upsampling of voice directivity by spatial equalization. It is based on preprocessing, which equalizes the sparse directivity pattern by spectral division with corresponding directional rigid sphere transfer functions, resulting in a time-aligned and spectrally matched directivity pattern that has a significantly reduced spatial complexity. The directivity pattern is then transformed into the SH domain, interpolated to a dense grid by an inverse spherical Fourier transform, and subsequently de-equalized by spectral multiplication with corresponding rigid sphere transfer functions. Based on measurements of a dummy head with an integrated mouth simulator, we compare this approach to reference measurements on a dense grid. The results show that the method significantly decreases spatial undersampling errors, which allows a meaningful high-resolution voice directivity to be determined from sparse measurements.
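The processing chain described in this abstract can be written compactly as follows. The notation is an assumption made for illustration and is not taken from the paper: D is the measured directivity at the K sparse directions, H_rs the rigid sphere transfer function, Y_n^m the spherical harmonics up to order N, w_k quadrature weights of the sparse grid (a least-squares fit could be used instead), and the dense-grid directions are indexed by j.

```latex
\begin{align}
D_{\mathrm{eq}}(\omega,\Omega_k) &= \frac{D(\omega,\Omega_k)}{H_{\mathrm{rs}}(\omega,\Omega_k)},
  \qquad k = 1,\dots,K
  && \text{(equalization)} \\
d_{nm}(\omega) &= \sum_{k=1}^{K} w_k \, D_{\mathrm{eq}}(\omega,\Omega_k)\, \overline{Y_n^m(\Omega_k)},
  \qquad 0 \le n \le N,\ |m| \le n
  && \text{(SH transform)} \\
\hat{D}_{\mathrm{eq}}(\omega,\Omega_j) &= \sum_{n=0}^{N} \sum_{m=-n}^{n} d_{nm}(\omega)\, Y_n^m(\Omega_j)
  && \text{(interpolation to dense grid)} \\
\hat{D}(\omega,\Omega_j) &= \hat{D}_{\mathrm{eq}}(\omega,\Omega_j)\, H_{\mathrm{rs}}(\omega,\Omega_j)
  && \text{(de-equalization)}
\end{align}
```

The division in the first step removes direction-dependent delay and spectral shape that the rigid sphere model already explains, so the equalized pattern needs fewer SH orders and the truncation at order N discards less energy.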

  Download: PDF (HIGH Res) (3.8MB)

  Download: PDF (LOW Res) (2.3MB)


Generating Continuous Deterministic Band-Limited Test Signals With Nearly Laplace Distribution


Many natural and man-made signals, including speech and music, are well modeled by Laplace distributions. Yet testing, evaluation, design, and simulation of devices and systems are often performed with sine or noise signals whose distributions are very different. Such practice, while generally useful, can lead to erroneous estimates of system performance. Three novel methods, each with several optimized variations, are presented to generate continuous signals, computable at arbitrary instants of time, with nearly Laplace distributions. Further, each method produces signals that are band-limited and thus do not require a low-pass filter when used with sampled systems or limited-bandwidth channels. In addition, some distribution functions are presented that might not be widely known. Implementations are summarized in readily accessible form. Other distributions can also model the same signal types well, and the methods described can all be adapted to generate signals with these density distributions, which are strongly peaked around the origin.
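To make the mismatch between common test signals and a Laplace distribution concrete, the sketch below compares the kurtosis (fourth standardized moment) of a sine, Gaussian noise, and Laplace-distributed samples. It only illustrates the motivation and does not implement any of the paper's three generation methods: the Laplace samples here are random draws, not a deterministic band-limited signal.

```python
import numpy as np


def kurtosis(x):
    """Fourth standardized moment (not excess kurtosis)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return np.mean(centered**4) / np.mean(centered**2) ** 2


n = 200_000
t = np.arange(n) / n
sine = np.sin(2 * np.pi * 50 * t)        # exactly 50 whole cycles

rng = np.random.default_rng(0)
gaussian = rng.normal(size=n)
laplace = rng.laplace(size=n)

# Theoretical values: sine 1.5, Gaussian 3, Laplace 6.
for name, signal in [("sine", sine), ("gaussian", gaussian), ("laplace", laplace)]:
    print(f"{name:8s} kurtosis = {kurtosis(signal):.2f}")
```

A higher kurtosis means the signal spends more time near zero with occasional large peaks, so a device tested only with a sine or Gaussian noise sees a quite different amplitude distribution from the one speech or music would present.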

  Download: PDF (HIGH Res) (973KB)

  Download: PDF (LOW Res) (749KB)


Standards and Information Documents

AES Standards Committee News

Page: 680

Download: PDF (242KB)

Features

Engineering XR


A number of significant challenges arise when attempting to engineer audio systems and processes for extended reality applications. Authors of papers presented at the recent AVAR conference have begun to find ways of representing the acoustics of virtual environments more accurately, such that objects, characters, and participants within them perceive sounds in a more believable way. There is interesting evidence that the more accurately one renders the acoustics, the less bothered people are about the differences between real and virtual sounds. There is also the problem of competition for attention in mixed reality environments crowded with stimuli that the user may need to know about.

  Download: PDF (355KB)


Call for Papers Special Issue on Internet of Sounds

Page: 686

Download: PDF (77KB)

Audio Engineering Society Educational Foundation 2020 Awardees

Page: 687

Download: PDF (47KB)

Departments

Obituaries

Page: 687

Download: PDF (47KB)

New Products

Page: 694

Download: PDF (151KB)

Section News

Page: 688

Download: PDF (120KB)

Book Reviews

Page: 690

Download: PDF (115KB)

AES Conventions and Conferences

Page: 696

Download: PDF (87KB)

Extras

Table of Contents

Download: PDF (38KB)

Cover & Sustaining Members List

Download: PDF (77KB)

AES Officers, Committees, Offices & Journal Staff

Download: PDF (51KB)

AES - Audio Engineering Society