Journal of the AES

2019 December - Volume 67 Number 12

Presidents' Message

Page: 936

Download: PDF

Editor's Note and JAES Reviewers

Author: Bozena Kostek

Page: 937

Download: PDF (51KB)


A Comparative Analysis of Classifiers and Feature Sets for Acoustic Environment Classification


When an audio recording is used as evidence in litigation and forensic investigations, it needs to be checked thoroughly for authenticity and integrity in order to be admissible, compelling, and decisive evidence in a court of law. An audio recording can be subject to tampering attacks with easy-to-use editing and signal processing tools, thereby undermining its legal value. Artifacts embedded in an audio recording can provide valuable clues about the acoustic environment in which the audio was recorded and allow for the detection of tampering. This paper presents findings from two parallel methodologies: (1) extracting features from the room impulse response and (2) extracting features directly from the reverberated recordings. These methods focus on extracting parameters from audio recordings that help distinguish different auditory scenes. Experiments employing an exhaustive set of machine learning classifiers along with different acoustic features were conducted for the classification of auditory environments. A comparative analysis has been carried out to assess the performance of each classifier and the relative performance impact of each feature set in terms of classification accuracy. A two-layer Artificial Neural Network (ANN) provided an accuracy of 98.7% using room impulse responses and an accuracy of 99.5% when trained on the reverberated audio recordings.

  Download: PDF (HIGH Res) (6.3MB)

  Download: PDF (LOW Res) (394KB)

Rapid Prediction of Polyester Magnetic Tape Playability Using Water Contact Angles

Open Access



The playability and degradation of polyester magnetic media have been an ongoing concern for decades for audio curators, technicians, and hobbyists. As these collections continue to age, users increasingly desire to transfer their contents. However, such a task can be daunting. This report presents a new, rapid, nontechnical tool for evaluating the playability and physical surface of polyester magnetic tapes without needing to place them on playback equipment or use expensive technical instrumentation. Water contact angle, measured using a microliter-sized droplet, was found to accurately predict the physical playback condition of the vast majority of tapes in a sampling of test tapes from the Library of Congress testing labs. This tool provides an appealingly simple and powerful method to directly probe a tape's physical surface. Results could frequently be interpreted by eye, without needing technical processing equipment or software.

  Download: PDF (HIGH Res) (9.0MB)

  Download: PDF (LOW Res) (384KB)

Simulation of the Ondes Martenot Ribbon-Controlled Oscillator Using Energy-Balanced Modeling of Nonlinear Time-Varying Electronic Components


Even though digital technology now dominates the audio industry, there is still the need to preserve historic analog machines and instruments. The Ondes Martenot, invented in 1928, is an example of a classic electronic musical instrument based on heterodyne processing. This paper describes a simulation of that instrument. In the Ondes Martenot, two oscillators generate high-frequency quasi-sinusoidal signals, one of which is fixed and the other is controlled by a player using a sliding ribbon. The sum of these two oscillators is an amplitude-modulated signal whose envelope is detected using a triode vacuum tube. That produces an audible sound with a frequency that is the difference of the two oscillator frequencies. The triode vacuum tube in the detector is a nonlinear component that adds harmonics to the signal. This paper focuses on a power-balanced simulation of the instrument's ribbon-controlled oscillator, which is composed of linear, nonlinear, and time-varying components. Numerical experiments on the nonlinear time-varying circuit lead to expected observations: (1) the combination of the triode amplification and the LC-resonator produces a quasi-sinusoidal oscillation with a stable amplitude for a static configuration; (2) the mechanical force produced by the variable capacitor due to the ribbon displacement is undetectable by the musician for over-speed movement; and (3) the latency between the instantaneous frequency and the ribbon position is also undetectable. This corroborates that the Martenot's ribbon-controlled circuit is close to an ideal oscillator.

  Download: PDF (HIGH Res) (4.9MB)

  Download: PDF (LOW Res) (490KB)

Management of Sound Levels in Live Music Venues


With the increased recognition of potential damage to listeners when subjected to excessively loud sound, software-based sound level management systems can be viewed as a component of a strategy for reducing sound exposure to patrons and staff in live music venues. However, the use of level management tools in small indoor music venues, which represent a unique environment, has not been systematically explored. In an experimental approach for sound level management, a software system was tried in six indoor live-music venues in Melbourne. Comparing a control (without sound level management software) and the experimental condition (using the software), there was no reduction in mean LAeq,T, although there was a reduction in the number of events with extreme volume levels. Subjective questionnaires indicated that one-fifth of the patrons preferred lower sound levels than they experienced. The findings suggest that modifications to the software system may be necessary if the aim of the system is to reduce patron and staff sound exposure rather than simply to avoid exceeding legislative sound level limits. Recommended alterations could include greater flexibility in choice of target, matching with context of the performance, or changes to the system's visual display so that staying below, not at target, is positively reinforced.

  Download: PDF (HIGH Res) (3.4MB)

  Download: PDF (LOW Res) (415KB)

A Generative Adversarial Net-Based Bandwidth Extension Method for Audio Compression


To reduce the burden of storing and transmitting audio signals, they are often compressed with a lossy single-channel codec. Because the high-frequency components are effectively truncated when using a low-bitrate encoder, listeners may experience the sound as being uncomfortable, muffled, or dull. To compensate for the perceived degradation, bandwidth extension technology can be used to regenerate the missing high frequencies from the low-frequency components during the decoding process. In this paper the authors propose a bandwidth extension method based on Generative Adversarial Networks (GAN), which is used to estimate the relationship between the MDCT spectrum in the high-frequency part and the low-frequency part. The estimate is evaluated by a discriminant network in the GAN to get a more natural result. A complete audio coding system was built by using AAC Low Complexity as the single-channel core encoder with the proposed bandwidth extension method. To evaluate the quality of the audio decoded by the new system, a subjective evaluation experiment was carried out using HE-AAC as the baseline system with the MUSHRA experimental method.

  Download: PDF (HIGH Res) (8.3MB)

  Download: PDF (LOW Res) (437KB)

Modeling the Perception of System Errors in Spherical Microphone Array Auralizations


A prominent trend in spatial audio research is the realization of virtual acoustic environments based on binaural technology. This study estimates the perceptual influence of system errors on the binaural reproduction of spherical microphone array data for room simulation applications. Specifically, the impact of spatial aliasing, system noise, and microphone positioning errors is perceptually analyzed in a listening experiment using an auditory model. Perceptual and technical data are related by various predictive modeling techniques, which enable estimating the perceptual strength of system errors. The experimental data comprises spherical array simulations under free-field conditions and in two reflective environments, a dry and a reverberant shoebox-shaped room, using five different audio signals for auralization. Results show that error prediction is possible with high accuracy and low errors using nonlinear modeling techniques such as artificial neural networks.

  Download: PDF (HIGH Res) (358KB)

  Download: PDF (LOW Res) (167KB)

Preferred Levels for Background Ducking to Produce Esthetically Pleasing Audio for TV with Clear Speech

Open Access



In audio production, background ducking facilitates speech intelligibility while allowing the background to fulfill its purpose, e.g., to create ambiance, set the mood, or convey semantic cues. Technical details for recommended ducking practices are not currently documented in the literature. This report first analyzes the common practices found in TV documentaries, and it then describes a listening test that investigated the preferences of 22 normal-hearing participants on the Loudness Difference (LD) between commentary and background during ducking. Highly personal preferences were observed, highlighting the importance of object-based personalization. A statistically significant difference was found between nonexpert and expert listeners. On average, nonexperts preferred LDs that were 4 LU higher than the ones preferred by experts. A statistically significant difference was also found between Commentary over Music (CoM) and Commentary over Ambiance (CoA). Based on the test results, the authors recommend at least 10 LU difference for CoM and at least 15 LU for CoA. Moreover, a computational method based on the Binaural Distortion-Weighted Glimpse Proportion (BiDWGP) was found to match the median preferred LD for each item with good accuracy.

  Download: PDF (HIGH Res) (2.0MB)

  Download: PDF (LOW Res) (491KB)

Standards and Information Documents

AES Standards Committee News

Page: 1012

Download: PDF (146KB)


147th Convention Report, New York

Page: 1014

Download: PDF (1.9MB)

Exhibitors and Sponsors

Page: 1028

Download: PDF (503KB)

Room Acoustics: Old Wine in New Skins?


Room acoustics affect many of the things that audio engineers make or do. Scale models for simulating building acoustics have been given a new look in a potential role as echo chambers for live performance or recording. Modal decay at low frequencies can be evaluated using a clever wavelet transform technique. Finite element models may be usable to support measurements of reverberation time in small rooms. Music performances may vary in tempo when they're done in different reverberation conditions, but the effects are not entirely predictable. Finally, it may be possible to make a desk screen that is both visually transparent and performs well acoustically.

  Download: PDF (436KB)

Call for Nominations for the Board of Governors

Page: 1039

Download: PDF (33KB)

147th Convention Papers Abstracts, New York

Page: 1042

Download: PDF (504KB)

Call for Awards Nominations

Page: 1067

Download: PDF (78KB)

Index to Volume 66

Page: 1068

Download: PDF (222KB)

AES Bylaws

Page: 1092

Download: PDF (75KB)


Section News

Page: 1037

Download: PDF (85KB)

Book Reviews

Page: 1038

Download: PDF (50KB)


Page: 1041

Download: PDF (84KB)

2018 Statement of Financial Position

Page: 1090

Download: PDF (70KB)

AES Conventions and Conferences

Page: 1096

Download: PDF (106KB)


Table of Contents

Download: PDF (42KB)

Cover & Sustaining Members List

Download: PDF (78KB)

AES Officers, Committees, Offices & Journal Staff

Download: PDF (101KB)

AES - Audio Engineering Society