AES Dublin 2019
Engineering Brief EB04
EB04 - E-Brief Poster Session 2
Friday, March 22, 10:00 — 12:00 (The Liffey B)
EB04-1 A Study in Machine Learning Applications for Sound Source Localization with Regards to Distance—Hugh O'Dwyer, Trinity College - Dublin, Ireland; Sebastian Csadi, Trinity College Dublin - Dublin, Ireland; Enda Bates, Trinity College Dublin - Dublin, Ireland; Francis M. Boland, Trinity College Dublin - Dublin, Ireland
This engineering brief outlines how Machine Learning (ML) can be used to estimate objective sound source distance by examining both the temporal and spectral content of binaural signals. A simple ML algorithm is presented that is capable of predicting source distance to within half a meter in a previously unseen environment. This algorithm is trained using a selection of features extracted from synthesized binaural speech. This enables us to determine which of a selection of cues can be best used to predict sound source distance in binaural audio. The research presented can be seen not only as an exercise in ML but also as a means of investigating how binaural hearing works.
Engineering Brief 509
EB04-2 Setup and First Experimentation Over an AES67 Over 802.11 Network—Mickaël Henry, Digigram - Montbonnot, France; University of Grenoble - Grenoble, France; Willy Aubry, Digigram - Montbonnot, France
The AES67 standard addresses the transport of audio data over IP networks. It was originally created for audio transmission over local area Ethernet networks, but other types of IP links exist whose behavior differs from Ethernet. This paper investigates the setup of, and the constraints to place on, WiFi links so that they support AES67 transmission. We show the limits of carrying an AES67 stereo audio stream over a Wireless Local Area Network and detail the difficulties encountered when setting up an AES67 wireless network. We then analyze the disruptions that wireless networks introduce in device PTP offset, audio packets, and audio PSNR, compared with an AES67 Ethernet network. Finally, we show the gains brought by the SMPTE 2022-7 redundancy technique.
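The SMPTE 2022-7 redundancy mentioned in this abstract sends the same packets over two paths so that the receiver can rebuild one stream from whichever copy arrives. The following is a minimal Python sketch of that merge idea only. The function name, the (sequence number, payload) tuple representation, and the omission of sequence-number wraparound and receive buffering are simplifying assumptions for illustration; they are not taken from the brief.

```python
def merge_redundant_streams(path_a, path_b):
    """Merge two lists of (sequence_number, payload) packets.

    Hypothetical helper: keeps one copy of each sequence number seen on
    either path and returns the packets in sequence order. It ignores
    16-bit RTP sequence wraparound and arrival timing for brevity.
    """
    merged = {}
    for seq, payload in list(path_a) + list(path_b):
        # The first copy wins; a later duplicate is discarded.
        merged.setdefault(seq, payload)
    return [(seq, merged[seq]) for seq in sorted(merged)]


# Path A lost packet 2 and path B lost packet 3; together nothing is missing.
path_a = [(1, b"p1"), (3, b"p3"), (4, b"p4")]
path_b = [(1, b"p1"), (2, b"p2"), (4, b"p4")]
recovered = merge_redundant_streams(path_a, path_b)
```

A packet is lost in the merged stream only when both paths drop the same sequence number, which is why the technique helps on lossy wireless links.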
Engineering Brief 510
EB04-3 Computational Complexity of a Nonuniform Orthogonal Lapped Filterbank Based on MDCT and Time Domain Aliasing Reduction—Nils Werner, International Audio Laboratories Erlangen - Erlangen, Germany; Bernd Edler, International Audio Laboratories Erlangen - Erlangen, Germany
In this brief we investigate the computational complexity of a non-uniform lapped orthogonal filterbank with time domain aliasing reduction. The computational complexity of such a filterbank is crucial for its usability in real-time systems, as well as in embedded and mobile devices. Due to the signal-adaptive nature of the filterbank, the actual real-world complexity lies between two theoretical bounds and has to be estimated experimentally by processing real-world signals using a coder-decoder pipeline. Both the bounds and the real-world complexity were analyzed in this brief, and a median 14–22% increase in complexity over an adaptive uniform MDCT filterbank was found.
Engineering Brief 511
EB04-4 Research on Reference-Processed Pair Differential Signal Objective Parameters in Relationship with Subjective Listening Tests Results in Digital Audio Coding Domain—Krzysztof Goliasz, Dolby Poland - Wroclaw, Poland; Wroclaw University of Science and Technology - Wroclaw, Poland; Sylwia Prygon, Dolby Poland - Wroclaw, Poland; Wroclaw University of Science and Technology - Wroclaw, Poland; Mikolaj Zwarycz, Dolby Poland - Wroclaw, Dolnoslaskie, Poland
This engineering brief presents research on how objective parameters of the differential signal of a reference-processed pair relate to subjective listening test results in the digital audio coding domain. The authors encoded a set of encoder-stressful test signals using two different digital audio encoders. Various bitrates were used in order to cover many types of audio coding artifacts. Listening tests were conducted against the reference signals. Differential signals were created from pairs of reference and processed signals. A set of objective parameters of the differential signals was analyzed in order to find relationships between those parameters and the listening test results. Conclusions were drawn from the dependencies found.
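The differential signal described in this abstract can be illustrated with a short Python sketch. The abstract does not name the objective parameters that were analyzed, so the RMS level used here is only an illustrative choice. The function names are hypothetical, and the signals are assumed to be time-aligned and of equal length.

```python
import math


def differential_signal(reference, processed):
    """Sample-by-sample difference of two equal-length, time-aligned signals."""
    if len(reference) != len(processed):
        raise ValueError("signals must have the same length")
    return [r - p for r, p in zip(reference, processed)]


def rms_dbfs(signal):
    """RMS level in dB relative to full scale (1.0); -inf for silence."""
    mean_square = sum(x * x for x in signal) / len(signal)
    return 10.0 * math.log10(mean_square) if mean_square > 0 else float("-inf")


# Toy data: the processed signal is a slightly attenuated copy of the reference.
reference = [0.5, -0.5, 0.5, -0.5]
processed = [0.4, -0.4, 0.4, -0.4]
diff = differential_signal(reference, processed)
level = rms_dbfs(diff)  # level of the coding error, about -20 dBFS here
```

In a study of this kind, a parameter such as `level` would be computed for every test item and compared against the corresponding listening test score.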
Engineering Brief 512
EB04-5 360° Binaural Room Impulse Response (BRIR) Database for 6DOF Spatial Perception Research—Bogdan Ioan Bacila, University of Huddersfield - Huddersfield, West Yorkshire, UK; Hyunkook Lee, University of Huddersfield - Huddersfield, UK
This engineering brief presents an open-access database of 360° binaural room impulse responses (BRIR) captured in a reverberant concert hall. Head-rotated BRIRs were acquired with 3.6° angular resolution for each of 13 different receiver positions, using a custom-made head-rotation system that was automated and integrated with the Huddersfield Acoustical Analysis Research Toolbox. The BRIRs are provided in the SOFA format. The library also contains impulse responses captured using a first-order Ambisonic microphone and an omnidirectional microphone. The database can be downloaded through the Resource section of the APL website: http://www.hud.ac.uk/apl . It is expected that the database will be useful for studying the perception of spatial attributes in a six degrees-of-freedom context.
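The figures in this abstract give a rough sense of the size of the database. The short Python calculation below assumes that every receiver position covers a full 360° rotation in 3.6° steps. The abstract does not state the resulting count, so the totals are an estimate under that assumption.

```python
ANGULAR_RESOLUTION_DEG = 3.6  # from the abstract
RECEIVER_POSITIONS = 13       # from the abstract

# Assumption: each receiver position covers a full 360° head rotation.
orientations_per_position = round(360 / ANGULAR_RESOLUTION_DEG)
total_head_rotated_brirs = orientations_per_position * RECEIVER_POSITIONS
```

Under that assumption, there are 100 head orientations per position and 1300 head-rotated BRIRs in total.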
Engineering Brief 513
EB04-6 Vertical Localization of Noise Bands Pairs by Time-Separation and Frequency Separation—Tao Zhang, Tokyo University of the Arts - Tokyo, Japan; Toru Kamekawa, Tokyo University of the Arts - Adachi-ku, Tokyo, Japan; Atsushi Marui, Tokyo University of the Arts - Tokyo, Japan
This study investigates how the vertical localization of low-frequency sound is released from the dominance of high-frequency sound. Four different noise band pairs were used. Each pair consists of a low-frequency band and a high-frequency band with different cutoff frequencies, and the low-frequency band was reproduced with a delay of 25 ms to 200 ms relative to the high-frequency band. These stimuli were presented from one of five full-range speakers set at different elevations in the median plane. The speakers cover an upper vertical angle of 30 degrees and a lower vertical angle of 30 degrees. The subjects reported both the vertical sound image localization and the sound image width for the low-frequency band and the high-frequency band separately. As a result, while the high-frequency band was localized at the actual reproduction position, the low-frequency band tended to be localized at a different vertical position depending on the reproduction position. This tendency of the low-frequency band was shown in 30°, 15°, and 30°. As the delay time increased, the offset (deviation from the reproduction position) also increased, but the direction of deviation differed.
Engineering Brief 514
EB04-7 Development of a 4-pi Sampling Reverberator, VSVerb—Application to In-Game Sounds—Masataka Nakahara, ONFUTURE Ltd. - Tokyo, Japan; SONA Corp. - Tokyo, Japan; Akira Omoto, Kyushu University - Fukuoka, Japan; Onfuture Ltd. - Tokyo, Japan; Yasuhiko Nagatomo, Evixar Inc. - Tokyo, Japan
The authors develop a 4-pi sampling reverberator named “VSVerb.” The VSVerb restores a 4-pi reverberant field by using information about dominant reflections, i.e., dominant virtual sound sources, captured in a target space. The distances, amplitudes, and locations of the virtual sound sources are detected from x, y, z sound intensities measured at the site and are translated into time responses. The reverb effect generated by the VSVerb provides high flexibility in sound design for post-production work. To verify its practicability, reverb effects at several positions in a virtual room of a video game are regenerated from VSVerb data sampled at one position in a real room. The regenerated reverbs for in-game sounds are implemented in a Dolby Atmos compliant production flow, and their spatial impressions and sound qualities are checked by listening.
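The step of translating virtual sound sources into a time response can be pictured with a minimal Python sketch. This is not the VSVerb algorithm, which the abstract does not detail. It only shows the general idea that a source at a given distance contributes an impulse delayed by the distance divided by the speed of sound. The function name, the 343 m/s speed of sound, the 48 kHz sample rate, and the omission of source direction are all assumptions.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 °C


def sources_to_impulse_response(sources, sample_rate=48000):
    """Build a sparse impulse response from (distance_m, amplitude) pairs.

    Hypothetical helper: each virtual source becomes one impulse delayed
    by distance / speed of sound. Source direction is ignored here.
    """
    if not sources:
        return []
    delays = [round(d / SPEED_OF_SOUND_M_S * sample_rate) for d, _ in sources]
    response = [0.0] * (max(delays) + 1)
    for delay, (_, amplitude) in zip(delays, sources):
        # Sources that land on the same sample are summed.
        response[delay] += amplitude
    return response


# Two illustrative virtual sources at 3.43 m and 6.86 m (10 ms and 20 ms away).
sources = [(3.43, 1.0), (6.86, 0.5)]
ir = sources_to_impulse_response(sources)
```

Because each source keeps its own distance, moving the listener in a virtual room only requires recomputing those distances. That is one way to picture how reverbs at several positions could be regenerated from data sampled at a single position.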
Engineering Brief 515