AES E-Library

AES E-Library Search Results

Audio for Cinema

[Feature] Dialog recording, editing, and replacement are probably among the most important aspects of movie sound production. Dialog is the basis of good storytelling, as poor dialog is the “best way to ruin a movie.” Panelists during Audio for Cinema sessions at the 143rd Convention tackled this fascinating topic. Afterwards, a panel chaired by Nuno Fonseca debated the future challenges of audio for cinema.

Author: Rumsey, Francis
JAES Volume 66 Issue 3 pp. 182-185; March 2018
Publication Date: March 19, 2018

Design of a Wideband Linear Microphone Array for High-Quality Audio Recording

Because of their superior directional properties, shotgun microphones are preferable choices for high-quality speech and audio recording in environments with intense ambient noise. However, their low- and middle-frequency directivities are usually not sufficiently high for practical usage. Alternatively, linear microphone arrays with unequally spaced elements enable high wideband directivity with a small number of microphones. A general procedure for designing such arrays steered at the endfire direction with unequally spaced elements is proposed for high-quality audio recording from 20 Hz to 16 kHz. A simulated annealing method is used to iteratively optimize the spatial distribution for microphones that do not have matched amplitude and phase. The challenge in the optimization process arises because large bandwidth requires large matrices, which produces large accumulation error. Therefore, the optimization process needs to be carefully regularized with the error being restricted to a relatively small tolerable level. The proposed method can produce microphone arrays with higher directivity than the corresponding shotgun microphones of the same length with comparable low self-noise level.

Authors: Zhou, Haoran; Lu, Jing; Qiu, Xiaojun
Affiliations: Key Lab of Modern Acoustics, Institute of Acoustics, Nanjing University, Nanjing, China; Centre for Audio, Acoustics and Vibration, Faculty of Engineering and IT, University of Technology Sydney, Ultimo, Australia (See document for exact affiliation information.)
JAES Volume 66 Issue 3 pp. 154-166; March 2018
Publication Date: March 19, 2018
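
The abstract above describes endfire-steered linear arrays with unequally spaced elements. As a rough illustration of what "directivity" means for such an array, the sketch below computes the beam pattern and directivity index of a plain delay-and-sum endfire array in a free field. This is not the authors' method: their design uses optimized weights, regularization, and simulated annealing, none of which are reproduced here. The function names, the element positions, and the speed of sound (343 m/s) are assumptions chosen for illustration only.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def endfire_response(positions_m, freq_hz, theta_rad):
    """Delay-and-sum response of a line array steered to endfire (theta = 0).

    positions_m are element positions along the array axis (hypothetical values).
    theta_rad is the arrival angle of a plane wave measured from the array axis.
    """
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND  # wavenumber
    total = sum(
        cmath.exp(1j * k * x * (math.cos(theta_rad) - 1.0)) for x in positions_m
    )
    return total / len(positions_m)


def directivity_index_db(positions_m, freq_hz, steps=1800):
    """Directivity index: on-axis power over the power averaged on the sphere.

    The pattern is rotationally symmetric about the array axis, so the
    spherical average reduces to 0.5 * integral of |B|^2 sin(theta) d(theta),
    evaluated here with the midpoint rule.
    """
    d_theta = math.pi / steps
    mean_power = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        gain = abs(endfire_response(positions_m, freq_hz, theta)) ** 2
        mean_power += gain * math.sin(theta) * d_theta
    mean_power *= 0.5
    # On-axis power is 1 by construction, so DI = -10 log10(mean_power).
    return 10.0 * math.log10(1.0 / mean_power)


if __name__ == "__main__":
    # Hypothetical unequal spacing; not taken from the paper.
    positions = [0.0, 0.02, 0.05, 0.11, 0.23]
    for freq in (250.0, 1000.0, 4000.0):
        print(freq, round(directivity_index_db(positions, freq), 2))
```

A single element gives a directivity index of 0 dB, and any multi-element array gives a positive value. Comparing the values across frequency shows why wideband directivity with few microphones is a design problem rather than a given.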

Determination and Validation of Mix Parameters for Modifying Envelopment in Object-Based Audio

This work is licensed under a Creative Commons Attribution 4.0 International License.

With object-based audio transmission, a scene is distributed as a set of audio objects, as opposed to discrete audio channels. An object comprises an audio stream for a particular aspect of the scene, accompanied by some metadata, such as the desired level and spatial position of the object. Object-based audio offers the possibility of altering the rendering of an audio scene in order to modify or maintain perceptual attributes if the relationships between attributes and mix parameters are known. This research aims to determine the relationship between parameters of an object-based mix and the perception of envelopment (an important attribute of spatial audio reproduction systems), and to develop and test a system for manipulating envelopment in object-based audio in a perceptually relevant manner. An experiment was performed in which mixing engineers were asked to create mixes of object-based content at three levels of envelopment (low, medium, and high) while keeping the overall mix quality at an acceptable level. This enabled analysis of parameter values in order to assess how participants created different levels of envelopment. It was shown in a validation experiment that these parameters could be used to adjust envelopment to a target level.

Open Access

Authors: Francombe, Jon; Brookes, Tim; Mason, Russell
Affiliation: Institute of Sound Recording, University of Surrey, Guildford, UK
JAES Volume 66 Issue 3 pp. 127-145; March 2018
Publication Date: March 19, 2018

Reference Equivalent Threshold Sound Pressure Levels for Nonaudiometric Headphones

For a given type of headphone, the sound pressure levels of pure tones at which an adequate number of young listeners without hearing loss just perceive the tones are called Reference Equivalent Threshold Sound Pressure Levels (RETSPLs). Current standards and norms have established RETSPL values for specific headphones; other headphones do not have such values. Although the Sennheiser HD 650 circumaural headphone is often used in behavioral experiments and listening tests, its RETSPL values have not yet been published. The HD 650 circumaural headphone was measured at frequencies between 125 Hz and 16 kHz for 25 young listeners aged 19 to 28 years in order to establish RETSPL values. In addition, the paper compares the newly measured RETSPLs with those of several other circumaural and supra-aural headphones: the Sennheiser HDA 300, the Sennheiser HDA 200, the Beyer DT 48, and the Telephonics TDH 39. The results showed significant differences between the data for circumaural and supra-aural headphones at frequencies below 500 Hz.

Authors: Vencovsky, Vaclav; Rund, Frantisek; Slegl, David
Affiliation: Czech Technical University in Prague, Department of Radioelectronics, Czech Republic
JAES Volume 66 Issue 3 pp. 167-171; March 2018
Publication Date: March 19, 2018

Removing Reflections in Semianechoic Impulse Responses by Frequency-Dependent Truncation

Acoustic reflections in impulse responses can be eliminated by truncation to short observation times that exclude the reflections. However, truncating the response tail distorts the corresponding low-frequency transfer function. When reflections in “semianechoic” data originate from moderate-sized objects, e.g., equipment in anechoic chambers, they consist mostly of high frequencies. Consequently, truncation need only be performed in the mid to high frequencies, where the information is contained in a brief time interval; the impulse response tail is anechoic for the low frequencies and can be retained. The authors present a frequency-dependent truncation approach that exploits this property by adapting the truncation length in each band. This avoids low-frequency errors while disturbing reflections are windowed out. Among several tested formulations, a novel Short Time Fourier Transform-based formulation generated the fewest artifacts while the anechoic impulse response was well preserved in both simulated and measured semianechoic data.

Authors: Denk, Florian; Kollmeier, Birger; Ewert, Stephan D.
Affiliation: Medizinische Physik and Cluster of Excellence
JAES Volume 66 Issue 3 pp. 146-153; March 2018
Publication Date: March 19, 2018

Required Measurement Accuracy of Head Dimensions for Modeling the Interaural Time Difference

Numerous studies have shown that the interaural time difference (ITD) depends on the angle of incidence of the sound wave as well as the individual’s anthropometric dimensions. When a geometric model is used for determining ITD, exact anthropometric head dimensions are desirable. However, measuring anthropometric dimensions always introduces uncertainties that are primarily caused by the projection from three-dimensional head shape to one-dimensional measures. This paper describes a listening experiment to derive the direction-dependent just-noticeable time deviation from an individual ITD. The determined threshold is then utilized to calculate the required measurement accuracy of the input dimensions of an exemplary ITD model. A feasibility study is presented in which four required head dimensions of 17 subjects are automatically measured on three-dimensional head models. These models are determined using an RGBD sensor and software for three-dimensional surface reconstruction. The accuracy of the automatically determined anthropometric head dimensions is evaluated by comparing them to dimensions obtained from magnetic resonance imaging scans.

Authors: Bomhardt, Ramona; Mejía, Isabel C. Patiño; Zell, Andreas; Fels, Janina
Affiliations: RWTH Aachen University, Institute of Technical Acoustics, Medical Acoustics Group, Aachen, Germany; University of Tübingen, Tübingen, Germany (See document for exact affiliation information.)
JAES Volume 66 Issue 3 pp. 114-126; March 2018
Publication Date: March 19, 2018
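
The abstract above links a just-noticeable ITD deviation to the required accuracy of head measurements. The sketch below shows that reasoning with a deliberately simpler stand-in: Woodworth's classic spherical-head formula, ITD = (a / c)(θ + sin θ), which has a single input dimension (head radius a). The paper's own model takes four head dimensions and is not reproduced here. The function names, the 20 µs threshold, and the 8.75 cm radius are hypothetical values used only for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def woodworth_itd(head_radius_m, azimuth_rad):
    """ITD in seconds from Woodworth's spherical-head model.

    Valid for a distant source at azimuth 0 (front) to pi/2 (side).
    """
    if not 0.0 <= azimuth_rad <= math.pi / 2.0:
        raise ValueError("azimuth must lie in [0, pi/2] for this formula")
    return head_radius_m / SPEED_OF_SOUND * (azimuth_rad + math.sin(azimuth_rad))


def radius_tolerance(jnd_s, azimuth_rad):
    """Largest head-radius error (m) that keeps the ITD error below jnd_s.

    The model is linear in the radius, so d(ITD)/d(a) = (theta + sin theta) / c
    and the tolerance follows by dividing the threshold by that slope.
    """
    if not 0.0 < azimuth_rad <= math.pi / 2.0:
        raise ValueError("azimuth must lie in (0, pi/2]")
    return jnd_s * SPEED_OF_SOUND / (azimuth_rad + math.sin(azimuth_rad))


if __name__ == "__main__":
    radius = 0.0875   # hypothetical head radius in metres
    jnd = 20e-6       # hypothetical just-noticeable ITD deviation in seconds
    for deg in (15, 45, 90):
        az = math.radians(deg)
        itd_us = woodworth_itd(radius, az) * 1e6
        tol_mm = radius_tolerance(jnd, az) * 1e3
        print(f"{deg:3d} deg  ITD = {itd_us:6.1f} us  radius tolerance = {tol_mm:.2f} mm")
```

Under these assumptions the tolerance is tightest for lateral sources, where the ITD is most sensitive to head size. A direction-dependent threshold, as measured in the paper, would replace the single jnd value used here.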

Single-Channel Speech Enhancement Based on Subband Spectral Entropy

The goal of speech enhancement is to make speech more pleasant and understandable, improving one or more perceptual aspects of speech, such as quality or intelligibility. This paper addresses single-channel speech enhancement. The authors explore improved multiband spectral subtraction based on the equivalent rectangular bandwidth (ERB) scale. In the proposed algorithm, the full speech spectrum is divided into different nonuniform frequency bands, and spectral subtraction is performed separately in each band. Moreover, subband spectral entropy is used directly for noise estimation rather than relying on speech endpoint detection. The ERB scale is adopted in the subband spectral entropy instead of the traditional linear scale or the Bark scale. The subband spectral entropy based on the ERB scale can obtain a more accurate noise estimate, which can achieve better single-channel speech enhancement. The speech spectrograms, objective measures, and informal subjective listening tests show that the remnant noise is suppressed more by the proposed algorithm than by Upadhyay’s algorithm.

Authors: Wei, Yi; Zeng, Yumin; Li, Chen
Affiliations: School of Physics and Technology, Nanjing Normal University, Nanjing, China; Key Laboratory of Virtual Geographic Environment (Nanjing Normal University), Ministry of Education, Nanjing, China; State Key Laboratory Cultivation Base of Geographical Environment Evolution (Jiangsu Province), Nanjing, China; Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing, China (See document for exact affiliation information.)
JAES Volume 66 Issue 3 pp. 100-113; March 2018
Publication Date: March 19, 2018
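
The abstract above mentions nonuniform bands on the ERB scale and a spectral entropy computed per band. The sketch below illustrates those two ingredients only. It uses the widely cited Glasberg and Moore ERB-rate mapping and the Shannon entropy of the normalized power within each band. How the paper turns entropy into a noise estimate, and how it performs the subtraction, is not stated in the abstract and is not attempted here. All function names and the example spectrum are assumptions.

```python
import math


def hz_to_erb_rate(freq_hz):
    """Glasberg and Moore ERB-rate scale: 21.4 * log10(4.37 f / 1000 + 1)."""
    return 21.4 * math.log10(4.37 * freq_hz / 1000.0 + 1.0)


def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) * 1000.0 / 4.37


def erb_band_edges(f_low_hz, f_high_hz, num_bands):
    """Band edges in Hz, equally spaced on the ERB-rate scale."""
    lo, hi = hz_to_erb_rate(f_low_hz), hz_to_erb_rate(f_high_hz)
    step = (hi - lo) / num_bands
    return [erb_rate_to_hz(lo + i * step) for i in range(num_bands + 1)]


def spectral_entropy(power_bins):
    """Shannon entropy (nats) of the power distribution across bins."""
    total = sum(power_bins)
    if total <= 0.0:
        return 0.0
    entropy = 0.0
    for p in power_bins:
        if p > 0.0:
            q = p / total
            entropy -= q * math.log(q)
    return entropy


def subband_entropies(power_spectrum, bin_width_hz, edges_hz):
    """Entropy of each band; bin i is centred at i * bin_width_hz."""
    result = []
    for lo, hi in zip(edges_hz[:-1], edges_hz[1:]):
        bins = [p for i, p in enumerate(power_spectrum) if lo <= i * bin_width_hz < hi]
        result.append(spectral_entropy(bins))
    return result


if __name__ == "__main__":
    edges = erb_band_edges(100.0, 8000.0, 4)
    print([round(e, 1) for e in edges])
    # Hypothetical power spectrum: flat floor with one strong peak at bin 30.
    spectrum = [1.0] * 128
    spectrum[30] = 200.0
    print([round(h, 3) for h in subband_entropies(spectrum, 62.5, edges)])
```

A band with flat, noise-like energy has high entropy, while a band dominated by a few strong components has low entropy. That contrast is what makes entropy a plausible cue for separating noise-dominated frames from speech.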
