AES Milan 2018
Engineering Brief EB04
EB04 - Spatial Audio
Friday, May 25, 14:15 — 15:45 (Scala 2)
Frank Schultz, University of Music and Performing Arts Graz - Graz, Austria
EB04-1 Ambilibrium—A User-Friendly Ambisonics Encoder/Decoder-Matrix Designer Tool—Michael Romanov, University of Music and Performing Arts Graz - Graz, Austria
The name "Ambilibrium" combines the prefix "Ambi" (Latin: around) with "aequilibrium" (Latin: balance). This engineering brief discusses the implementation of a tool that creates the spatial balance from any spherical microphone array to any loudspeaker array in the form of custom Ambisonics encoder/decoder matrices, packaged in a user-friendly interface. A technique for automatic loudspeaker position estimation using an approach similar to GPS and the optimization of AllRAD are also discussed.
Engineering Brief 428 (Download now)
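The Ambilibrium abstract above mentions loudspeaker position estimation "similar to GPS" without giving details. As a rough illustration only, the sketch below assumes this means distance-based multilateration: a loudspeaker position is recovered from its distances to four reference points at known, non-coplanar positions. The function names, the four-anchor setup, and the example geometry are hypothetical and are not taken from the brief.

```python
import math


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("reference points are degenerate (e.g. coplanar)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def trilaterate(anchors, distances):
    """Estimate a 3D position from distances to exactly four known anchors.

    Subtracting the first sphere equation |x - p0|^2 = d0^2 from the others
    removes the quadratic term and leaves a 3x3 linear system:
        2 (p_i - p0) . x = |p_i|^2 - |p0|^2 - d_i^2 + d0^2
    """
    if len(anchors) != 4 or len(distances) != 4:
        raise ValueError("this sketch expects exactly four anchors")
    p0, d0 = anchors[0], distances[0]
    n0 = sum(c * c for c in p0)
    a, b = [], []
    for p, d in zip(anchors[1:], distances[1:]):
        a.append([2.0 * (pc - p0c) for pc, p0c in zip(p, p0)])
        b.append(sum(c * c for c in p) - n0 - d * d + d0 * d0)
    return solve_linear(a, b)


# Hypothetical geometry in metres: four reference points and one loudspeaker.
anchors = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
speaker = (2.0, 1.5, 0.5)
distances = [math.dist(speaker, p) for p in anchors]  # e.g. from time of flight
estimate = trilaterate(anchors, distances)
```

In practice the distances would come from measured propagation delays and would be noisy, so more than four reference points and a least-squares fit would likely be preferable to this exact solve.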
EB04-2 Distant Speech Beamforming Improving Multi-User ASR—Adam Kupryjanow, Intel Technology Poland - Gdansk, Poland; Raghavendra Rao R, Intel Technology - Devarabeesanahalli, KA, India; Przemek Maziewski, Intel Technology Poland - Gdansk, Poland; Lukasz Kurylo, Intel Technology Poland - Gdansk, Poland
This paper presents an algorithm that improves side-speaker attenuation for superdirective beamformers such as MVDR (Minimum Variance Distortionless Response). The technique can be used when multiple people in a room intend to interact with an ASR (automatic speech recognition) enabled device, e.g., a smart speaker. The experiments show that the proposed solution reduces WER (word error rate) by up to 23.93%, calculated for commands uttered by one user while a second user was treated as the side speaker.
Engineering Brief 429 (Download now)
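For readers unfamiliar with the MVDR baseline named in the abstract above, the following is a minimal sketch of the textbook narrowband MVDR weights, w = R^-1 d / (d^H R^-1 d). It does not implement the improvement proposed in the brief. The uniform linear array, microphone spacing, frequency, angles, and power levels are all assumptions chosen for illustration.

```python
import cmath
import math


def solve_linear(a, b):
    """Solve a x = b (complex entries allowed) by Gaussian elimination."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def steering_vector(num_mics, spacing_m, angle_rad, freq_hz, c=343.0):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    delay = spacing_m * math.sin(angle_rad) / c
    return [cmath.exp(-2j * math.pi * freq_hz * k * delay) for k in range(num_mics)]


def mvdr_weights(r, d):
    """Textbook MVDR: w = R^-1 d / (d^H R^-1 d)."""
    r_inv_d = solve_linear(r, d)
    denom = sum(dc.conjugate() * x for dc, x in zip(d, r_inv_d))
    return [x / denom for x in r_inv_d]


def response(w, v):
    """Beamformer response w^H v toward a source with steering vector v."""
    return sum(wc.conjugate() * vc for wc, vc in zip(w, v))


# Hypothetical scene: 4 mics, 5 cm spacing, 1 kHz, target at 0 deg, side speaker at 60 deg.
n_mics, spacing, freq = 4, 0.05, 1000.0
d_target = steering_vector(n_mics, spacing, 0.0, freq)
d_side = steering_vector(n_mics, spacing, math.radians(60.0), freq)

# Interference-plus-noise covariance: side speaker (power 10) plus sensor noise (power 0.01).
cov = [[10.0 * d_side[i] * d_side[j].conjugate() + (0.01 if i == j else 0.0)
        for j in range(n_mics)] for i in range(n_mics)]

w_mvdr = mvdr_weights(cov, d_target)
w_das = [x / n_mics for x in d_target]  # delay-and-sum reference

target_gain = abs(response(w_mvdr, d_target))
side_gain_mvdr = abs(response(w_mvdr, d_side))
side_gain_das = abs(response(w_das, d_side))
```

Under these assumptions the MVDR weights keep unit gain toward the target while attenuating the side speaker far more than delay-and-sum does. This covers only the generic beamformer; the WER figures quoted in the abstract relate to the authors' own method and data.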
EB04-3 An Efficient Method for Producing Binaural Mixes of Classical Music from a Primary Stereo Mix—Tom Parnell, BBC Research & Development - Salford, UK; Chris Pike, BBC R&D - Salford, UK; University of York - York, UK
Radio audiences in the UK are increasingly listening using headphones, and binaural mixes are likely to offer more natural and immersive classical musical experiences than stereo broadcasts. However, the stereo mix is currently a priority for broadcasters, and producers have limited resources to create an additional, binaural mix. This engineering brief describes the semi-automated workflow used to produce binaural mixes of performances from the BBC Proms. Spatial audio mixes were created by repositioning the individual microphone signals from the stereo broadcast in three dimensions, and adding ambient signals captured using a 3D microphone array. A commercial mixing application was used for spatial panning and binaural rendering, and the resulting binaural audio was streamed live online. Comments on the production workflow were collected from the music balancers, and audience responses were surveyed.
Engineering Brief 430 (Download now)
EB04-4 The Anaglyph Binaural Audio Engine—David Poirier-Quinot, Sorbonne Université, CNRS - Paris, France; Brian F. G. Katz, Sorbonne Université, CNRS, Institut Jean Le Rond d'Alembert - Paris, France
Anaglyph is part of an ongoing research effort into the perceptual and technical capabilities of binaural rendering. The Anaglyph binaural audio engine is a VST audio plugin for binaural spatialization integrating the results of over a decade of spatial hearing research. Anaglyph has been designed as an audio plugin both to support ongoing research efforts and to make the fruits of this research accessible to audio engineers through traditional DAW environments. Among its features, Anaglyph includes a personalizable morphological ITD model, near-field ILD and HRTF parallax corrections, a Localization Enhancer, an Externalization Booster, and SOFA HRIR file support. The basic architecture and implementation of each audio-related component are presented here.
Engineering Brief 431 (Download now)
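The Anaglyph abstract above lists a personalizable morphological ITD model but does not describe it. To give a sense of what a head-size-dependent ITD model can look like, the sketch below uses the classic Woodworth spherical-head approximation, ITD = (a / c) (theta + sin theta), which is a stand-in and not Anaglyph's model. The default head radius, the speed of sound, and the idea of personalizing via head radius are assumptions.

```python
import math


def woodworth_itd(azimuth_rad, head_radius_m=0.0875, speed_of_sound=343.0):
    """Interaural time difference in seconds from the Woodworth formula.

    Valid for a far-field source in the horizontal plane with azimuth in
    [-pi/2, pi/2], measured from straight ahead. The sign follows the azimuth.
    """
    theta = abs(azimuth_rad)
    if theta > math.pi / 2:
        raise ValueError("this sketch only covers azimuths within +/- 90 degrees")
    itd = head_radius_m / speed_of_sound * (theta + math.sin(theta))
    return math.copysign(itd, azimuth_rad)


# Hypothetical personalization: a larger head radius yields a larger ITD.
itd_default = woodworth_itd(math.radians(90.0))
itd_large_head = woodworth_itd(math.radians(90.0), head_radius_m=0.095)
```

With the assumed default radius, a source at 90 degrees gives an ITD of roughly 0.66 ms, and the value scales linearly with head radius.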
Engineering Brief 432 (Download now)
EB04-6 HOBA-VR: HRTF On Demand for Binaural Audio in Immersive Virtual Reality Environments—Michele Geronazzo, Aalborg University - Copenhagen, Denmark; Jari Kleimola, Hefio Ltd. - Espoo, Finland; Erik Sikström, Virsabi ApS - Copenhagen, Denmark; Amalia de Götzen, Aalborg University - Copenhagen, Denmark; Stefania Serafin, Aalborg University - Copenhagen, Denmark; Federico Avanzini, University of Milan, Dept. of Computer Science - Milan, Italy
One of the main challenges of spatial audio rendering over headphones is the personalization of the so-called head-related transfer functions (HRTFs). HRTFs capture the listener's acoustic effects, supporting immersive and realistic virtual reality (VR) contexts. This e-brief presents the HOBA-VR framework, which provides a full-body VR experience with personalized HRTFs that are individually selected on demand based on anthropometric data (pinnae shapes). The proposed WAVH transfer format allows flexible management of this customization process. A screening test that evaluates user localization performance with the selected HRTFs for a non-visible spatialized audio source is also provided. Accordingly, it might be possible to create a user profile that also contains individual non-acoustic factors such as localizability, satisfaction, and confidence.
Engineering Brief 433 (Download now)