AES New York 2019
Immersive & Spatial Audio Track Event Details

Wednesday, October 16, 9:00 am — 10:30 am (1E06)

Game Audio & XR: GA01 - 4-Pi Reverb Effects for In-Game Sounds

Tomoya Kishi, CAPCOM Co., Ltd. - Japan
Steve Martz, THX Ltd. - San Rafael, CA, USA
Masataka Nakahara, ONFUTURE Ltd. - Tokyo, Japan; SONA Corp. - Tokyo, Japan
Kazutaka Someya, beBlue Co., Ltd. - Tokyo, Japan

In a video game, the sound field of a virtual space is created in a 4-pi field that is free from channel restrictions. That is, a video game has no picture frame to cut out part of the sound field, and no channel borders to divide it into a finite number of areas. This workshop introduces how to create 4-pi, channel-free reverberation for in-game sounds from both current and future technical points of view. Demonstrations will also be provided: reverberation generated by the proposed methods will be played and compared with conventional reverb created by a skilled mixing engineer.

AES Technical Council This session is presented in association with the AES Technical Committee on Audio for Games and AES Technical Committee on Spatial Audio


Wednesday, October 16, 9:15 am — 10:45 am (1E08)

Game Audio & XR: GA02 - Abbey Road Spatial Audio Forum—Music Production in VR and AR

Gavin Kearney, University of York - York, UK
Stephen Barton, Respawn Entertainment/EA - Los Angeles, CA, USA; Afterlight Inc. - Los Angeles, CA, USA
Etienne Corteel, L-Acoustics - Marcoussis, France
Oliver Kadel, 1.618 Digital - London, UK; University Of West London - London, UK
Muki Kulhan, Muki International - UK
Hyunkook Lee, University of Huddersfield - Huddersfield, UK
Mirek Stiles, Abbey Road Studios - London, UK

Virtual and augmented reality offer a new platform for the creation, production, and consumption of immersive music experiences. Immersive technologies now have the power to create experiences that transform how we experience music, from transporting the listener to the original recording studio in VR to bringing the musicians into the listener's own living room in an AR scenario. However, the creation of such audio experiences has many challenges. With different parts of the immersive audio production chain being developed by various third parties, there is a danger of confusing the producer/musician and perhaps scaring off talent before we even get off the ground. Can we do better? What are the barriers and how can they be broken down? What are the strengths and weaknesses of existing tools? Can we achieve better clarity in the different formats that are available, and should we move towards standardization? In this open panel discussion of the Abbey Road Spatial Audio Forum, we will be looking at workflow challenges for recording, mixing, and distributing music for VR and AR.

AES Technical Council This session is presented in association with the AES Technical Committee on Audio for Games


Wednesday, October 16, 10:30 am — 12:00 pm (1E17)


Immersive & Spatial Audio: IS01 - ISSP: Immersive Sound System Panning. An Interactive Software Application and Tools for Live Performances

Ianina Canalis, National University of Lanús - Buenos Aires, Argentina

Spatial audio has been gaining popularity in the area of commercial live performances, and immersive audio systems are now available for large as well as small concerts. There are a number of challenges in developing and implementing immersive and spatial audio systems, in particular the use of dedicated hardware interfaces. This workshop introduces the Immersive Sound System Panning (ISSP) software application, which allows a free choice in the position of the speakers and sound sources. ISSP comes in two versions: the first processes the audio in a DiGiCo mixer, the second in a computer.

The software was designed to be user-friendly and to make immersive systems more intuitive for mixing engineers, artists, and the public. The tools were built around the expressed needs of sound engineers and artists, since spatial sound offers a new way of expressing music. The idea is that artists and the public can also take part in what happens with the audio, as involving them intensifies the experience for everyone.

This workshop showcases the Immersive Sound System Panning (ISSP) application, the main features of the software, and the tools developed to spatialize sound within a space. Audience members will be encouraged to try the app hands-on, demonstrating the intuitive nature of its design and the ease with which sounds can be panned around the space.


Wednesday, October 16, 10:45 am — 12:00 pm (1E06)

Immersive & Spatial Audio: IS02 - Music Production in Immersive Formats: Alternative Perspectives

Thomas Aichinger, scopeaudio - Austria
Zachary Bresler, University of Agder - Kristiansand S, Vest-Agder, Norway
Sally Kellaway, Microsoft - Seattle, WA, USA
Jo Lord, University of West London - London, UK

Evidenced by the large volume of presentations on immersive audio at the previous AES conventions in Dublin and New York and the AES conferences on Immersive and Interactive Audio, Virtual and Augmented Reality, and Spatial Reproduction, 3D audio is of growing interest within our field. As Anastasia Devana of Magic Leap stated in her keynote at the IIA conference, it is the “bleeding edge” of our industry. In spatial audio, there are many competing ideas and technologies from delivery formats to production standards and aesthetics. From the perspective of music creators, the norms of music production and the terms used to describe practice are often clumsy or not helpful when applied in 3D. It is in this context that we propose this workshop on immersive music production. We will discuss several questions from the perspective of creatives in immersive and interactive music content. What are the changing ways that creators use and exploit 3D technologies? How do we describe the way that we make content for such systems? What are the practices, standards, and formats that creators use, and which ones should they use in the future? What are the interesting use-cases for 3D audio that challenge the way we think about music and audio production?

Zachary Bresler, Ph.D. fellow, University of Agder. Research explores music production in immersive formats and the staging of listeners in immersive compositional design.

Jo Lord, Ph.D. student, University of West London. Research investigates the development, practical application, and aesthetic suitability of 3D mix technique for record production.

Dr. Eve Klein, senior lecturer in music technology and popular music, University of Queensland. Currently researches within the VR/AR space, creating large-scale immersive festival experiences.

Thomas Aichinger, founder, scopeaudio. Studio specialized in sound design and post-production, focusing on spatial audio productions for VR and 360° videos.


Wednesday, October 16, 12:00 pm — 1:00 pm (1E17)

Immersive & Spatial Audio: IS11 - Genelec Play Immersive

A wide collection of excellent immersive music, nature, and film recordings, conveyed via a point-source 7.1.4 reproduction system.


Wednesday, October 16, 2:00 pm — 3:00 pm (1E17)


Immersive & Spatial Audio: IS12 - Florian Camerer Talk&Play

Florian Camerer, ORF - Austrian TV - Vienna, Austria; EBU - European Broadcasting Union

Having finally arrived where human beings are all the time, immersive recording and reproduction of sound is here to stay. Besides the ubiquitous 3D-audio bombardment of action movies, music and sound effects provide potentially subtler but certainly no less compelling listening experiences. In the latter realm (atmosphere recording), Florian Camerer has gone the extra mile to explore the frontiers of quality for location sound setups in 3D audio. "Nothing is more practical than a good theory" is the foundation of his immersive outdoor rig. The thinking behind its dimensions will be covered, and the audience can judge for themselves whether the practical result holds up to the theory through the examples that will be played.


Wednesday, October 16, 2:45 pm — 4:15 pm (1E08)

Game Audio & XR: GA05 - Spatial Storytelling in Games

Rob Bridgett, Eidos Montreal - Montreal, Canada
Cedric Diaz, Senior Sound Designer, People Can Fly - New York, NY, USA
Jason Kanter, Audio Director, Avalanche Studios - New York, NY, USA
Phillip Kovats, WWS Sound, Sony Interactive Entertainment
Mark Petty, Gearbox Software

Join several industry experts, all with deep experience authoring narrative content for spatial audio entertainment platforms, in this panel discussion of the incredible opportunities and challenges of bringing stories to life using spatial elements. Our goal is to discuss the techniques, thinking, approaches, and execution of how 3D spaces interface with and infiltrate our storytelling practices. As audio directors, sound designers, mixers, and storytellers, we will focus on how spatial audio can be leveraged to bring a greater level of engagement, spectacle, and immersion to audiences inside our story worlds.


Wednesday, October 16, 3:00 pm — 4:30 pm (1E12)

Sound Reinforcement: SR02 - Canceled


Wednesday, October 16, 3:15 pm — 4:45 pm (1E21)

Recording & Production: RP04 - Spatial Audio Microphones

Helmut Wittek, SCHOEPS Mikrofone GmbH - Karlsruhe, Germany
Svein Berge, Harpex Ltd - Berlin, Germany
Gary Elko, mh acoustics - Summit, NJ USA
Len Moskowitz, Core Sound LLC - Teaneck, NJ, USA
Tomasz Zernicki, Zylia sp. z o.o. - Poznan, Poland

Multichannel loudspeaker setups as well as virtual reality applications enable spatial sound to be reproduced at high resolution. On the recording side, however, capturing high spatial resolution is more complicated. Various concepts for microphone arrays exist in theory and practice. In this workshop the different concepts are presented by corresponding experts, and their differences, applications, pros, and cons are discussed. The array solutions covered include coincident and spaced Ambisonics arrays as well as stereophonic multi-microphone (one-point) arrays.

AES Technical Council This session is presented in association with the AES Technical Committee on Microphones and Applications


Wednesday, October 16, 4:15 pm — 5:45 pm (1E17)

Immersive & Spatial Audio: IS03 - Reproduction and Evaluation of Spatial Audio through Speakers

Juan Simon Calle Benitez, THX Ltd. - San Francisco, CA, USA
Patrick Flanagan, THX Ltd.
Gavin Kearney, University of York - York, UK
Nils Peters, Qualcomm, Advanced Tech R&D - San Diego, CA, USA
Marcos Simon, AudioScenic - Southampton, UK; University of Southampton - Southampton, UK

Much has been discussed about reproducing spatial audio over headphones, since they provide a controlled environment for generating the signals that trick our brains into believing there are sources outside of our heads. Loudspeakers are another way to reproduce spatial audio, but less development and evaluation has been done there, as it is harder to reproduce binaural audio without coloring the signal. In this panel we will discuss the benefits, the challenges, and the ways we can evaluate spatial audio reproduction over loudspeakers. We will cover topics such as crosstalk cancellation, wave field synthesis, and multichannel arrays applied to real-life applications like virtual surround, virtual reality, and augmented reality, and how to test the different characteristics of 3D audio both subjectively and objectively.

AES Technical Council This session is presented in association with the AES Technical Committee on Audio for Games and AES Technical Committee on Spatial Audio


Wednesday, October 16, 4:30 pm — 6:00 pm (1E06)


Recording & Production: RP05 - 3D Microphone Technique Shootout—9.1 Demo and Discussion

Hyunkook Lee, University of Huddersfield - Huddersfield, UK

3D audio is rapidly becoming an industry standard for film, music, virtual reality, etc. Various types of 3D microphone arrays have been proposed in recent years, and it is important to understand the perceptual differences among different techniques. This workshop presents a large-scale 3D mic array shootout recording session conducted in St. Paul's concert hall in Huddersfield, UK. An open-access database for research and education has been created from this session. A total of 104 channels of audio were recorded simultaneously for 7 different main array configurations for 9.1 (a.k.a. 5.1.4) reproduction (OCT, 2L-Cube, PCMA v1&2, Decca Tree, Hamasaki-Cube v1&2), a 32-channel spherical microphone array, a First-Order Ambisonics microphone, and a dummy head. The microphones used for the main arrays were from the same manufacturer to minimize spectral differences. Various sound sources including string quartet, piano trio, a cappella singers, organ, clarinet, piano, etc., were recorded. This workshop will explain the basic psychoacoustic principles of the arrays used, with accompanying 9.1 demos of the recordings and pictures. The pros and cons of each technique depending on the type of sound source will be discussed in depth.


Wednesday, October 16, 4:30 pm — 5:30 pm (1E08)

Game Audio & XR: GA06 - Simulating Real World Acoustic Phenomena: From Graphics to Audio

Christophe Tornieri, Audiokinetic

Simulating real-world acoustic phenomena in virtual environments drastically enhances immersion. However, computing high-order reflections and convincing diffraction for more dynamic and realistic audio scenes is challenging to achieve in real time. In this talk we propose an approach derived from modern 3D graphics rendering techniques used in CGI. We will first introduce the general concepts behind ray tracing and stochastic methods, then present the adaptation of these techniques to spatial audio, focusing on how to compute reflections and diffraction while maintaining temporal and spatial coherence.
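The stochastic ray-tracing idea the talk describes can be sketched in miniature. The following is a simplified 2D illustration, not Audiokinetic's implementation: rays are emitted in random directions from a source inside a shoebox room, reflected specularly off the walls, and an arrival delay is recorded whenever a ray segment passes near the listener. Frequency-dependent absorption, diffraction, and coherence handling are all omitted, and only the first arrival per ray is kept.

```python
import math
import random

def trace_reflections(room, src, lis, radius=0.5, n_rays=2000,
                      max_bounces=3, c=343.0):
    """Stochastic 2D ray tracing in an axis-aligned (shoebox) room.

    room: (width, height) in meters; src, lis: (x, y) positions.
    Returns a list of arrival delays in seconds.
    """
    w, h = room
    delays = []
    for _ in range(n_rays):
        theta = random.uniform(0.0, 2.0 * math.pi)
        px, py = src
        vx, vy = math.cos(theta), math.sin(theta)
        travelled = 0.0
        for _ in range(max_bounces + 1):
            # Distance along (vx, vy) to the nearest wall.
            tx = (w - px) / vx if vx > 0 else (0.0 - px) / vx if vx < 0 else math.inf
            ty = (h - py) / vy if vy > 0 else (0.0 - py) / vy if vy < 0 else math.inf
            t_hit = min(tx, ty)
            # Closest approach of this segment to the listener.
            proj = max(0.0, min(t_hit, (lis[0] - px) * vx + (lis[1] - py) * vy))
            qx, qy = px + proj * vx, py + proj * vy
            if math.hypot(lis[0] - qx, lis[1] - qy) < radius:
                delays.append((travelled + proj) / c)
                break  # keep only the first arrival of this ray
            # Advance to the wall and mirror the velocity component(s).
            px, py = px + t_hit * vx, py + t_hit * vy
            travelled += t_hit
            if tx <= ty:
                vx = -vx
            if ty <= tx:
                vy = -vy
    return delays
```

With a source 2 m from the listener, the earliest recorded delay converges on the direct-path delay of roughly 2 m / 343 m/s; later arrivals correspond to wall reflections.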

AES Technical Council This session is presented in association with the AES Technical Committee on Audio for Games and AES Technical Committee on Spatial Audio


Thursday, October 17, 9:00 am — 10:30 am (1E21)

Game Audio & XR: GA07 - MPEG-H 3D Audio Goes VR

Jürgen Herre, International Audio Laboratories Erlangen - Erlangen, Germany; Fraunhofer IIS - Erlangen, Germany
Adrian Murtaza, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany
Nils Peters, Qualcomm, Advanced Tech R&D - San Diego, CA, USA

MPEG-H 3D Audio is a recent MPEG standard designed to represent and render 3D audio experiences while supporting all known production paradigms (channel-based, object-based, and Higher Order Ambisonics audio) and reproduction setups (loudspeaker, headphone/binaural). As the audio production world moves to embrace virtual and augmented reality (VR/AR), MPEG-H has found considerable adoption and re-use in recently finalized VR standards, such as MPEG-I OMAF (Omnidirectional Media Format), the VR Industry Forum (VR-IF) Guidelines, and 3GPP "VRStream" (Virtual Reality profiles for streaming applications), where it was selected as the audio standard for VR content delivered over 5G networks.

AES Technical Council This session is presented in association with the AES Technical Committee on Coding of Audio Signals


Thursday, October 17, 10:30 am — 12:00 pm (1E06)

Recording & Production: RP06 - Immersive Music Listening Session: Critical Listening and Recording Techniques

David Bowles, Swineshead Productions LLC - Berkeley, CA, USA
Paul Geluso, New York University - New York, NY, USA

In this workshop, new immersive music recordings will be presented, followed by a brief technical discussion by their creators. Paul Geluso and David Bowles will host the session, presenting their recent work, and invite other recording engineers and music producers working in immersive formats to present recent works as well. Tracks will be played in their entirety to preserve their artistic impact and create an environment for critical listening. Playback of each work will be followed by a brief presentation and Q&A session. New immersive recording techniques designed specifically to optimize Dolby Atmos compatibility will be presented by Geluso and Bowles as well.


Thursday, October 17, 12:00 pm — 1:00 pm (1E17)


Immersive & Spatial Audio: IS13 - Morten Lindberg Talk&Play

Morten Lindberg, 2L (Lindberg Lyd AS) - Oslo, Norway

Morten: “We should create the sonic experience that emotionally moves the listener to a better place. Immersive Audio is a completely new conception of the musical experience.” Listen to Morten and some of his most legendary recordings.


Thursday, October 17, 1:15 pm — 4:15 pm (1E10)

Paper Session: P10 - Spatial Audio, Part 1

Sungyoung Kim, Rochester Institute of Technology - Rochester, NY, USA

P10-1 Use of the Magnitude Estimation Technique in Reference-Free Assessments of Spatial Audio Technology
Alex Brandmeyer, Dolby Laboratories - San Francisco, CA, USA; Dan Darcy, Dolby Laboratories, Inc. - San Francisco, CA, USA; Lie Lu, Dolby Laboratories - San Francisco, CA, USA; Richard Graff, Dolby Laboratories, Inc. - San Francisco, CA, USA; Nathan Swedlow, Dolby Laboratories - San Francisco, CA, USA; Poppy Crum, Dolby Laboratories - San Francisco, CA, USA
Magnitude estimation is a technique developed in psychophysics research in which participants numerically estimate the relative strengths of a sequence of stimuli along a relevant dimension. Traditionally, the method has been used to measure basic perceptual phenomena in different sensory modalities (e.g., "brightness," "loudness"). We present two examples of using magnitude estimation in the domain of audio rendering for different categories of consumer electronics devices. Importantly, magnitude estimation doesn’t require a reference stimulus and can be used to assess general ("audio quality") and domain-specific (e.g., "spaciousness") attributes. Additionally, we show how this data can be used together with objective measurements of the tested systems in a model that can predict performance of systems not included in the original assessment.
Convention Paper 10273 (Purchase now)
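A practical consequence of the reference-free design described in P10-1 is that each participant invents their own numeric scale, so responses must be normalized before averaging across participants. The sketch below shows the geometric-mean normalization commonly used in magnitude-estimation studies; it is a generic illustration, not the authors' analysis pipeline.

```python
import math

def normalize_estimates(ratings_by_subject):
    """Geometric-mean normalization of open-ended magnitude estimates.

    ratings_by_subject: one list of per-stimulus ratings per participant.
    Each participant's ratings are divided by that participant's geometric
    mean (removing their arbitrary scale), then averaged per stimulus.
    """
    normalized = []
    for ratings in ratings_by_subject:
        g = math.exp(sum(math.log(r) for r in ratings) / len(ratings))
        normalized.append([r / g for r in ratings])
    n_stimuli = len(normalized[0])
    return [sum(subj[i] for subj in normalized) / len(normalized)
            for i in range(n_stimuli)]
```

Two participants whose ratings differ only by an overall scale factor produce identical normalized profiles, which is exactly the property the method relies on.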

P10-2 Subjective Assessment of the Versatility of Three-Dimensional Near-Field Microphone Arrays for Vertical and Three-Dimensional Imaging
Bryan Martin, McGill University - Montreal, QC, Canada; Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT) - Montreal, QC, Canada; Jack Kelly, McGill University - Montreal, QC, Canada; Brett Leonard, University of Indianapolis - Indianapolis, IN, USA; The Chelsea Music Festival - New York, NY, USA
This investigation examines the operational size range of audio images recorded with advanced close-capture microphone arrays for three-dimensional imaging. It employs a 3D panning tool to manipulate audio images. The 3D microphone arrays used in this study were: Coincident-XYZ, M/S-XYZ, and Non-coincident-XYZ/five-point. Instruments of the orchestral string, woodwind, and brass sections were recorded. The objective of the test was to determine the point of three-dimensional expansion onset, preferred imaging, and image breakdown point. Subjects were presented with a continuous dial to manipulate the three-dimensional spread of the arrays, allowing them to expand or contract the microphone signals from 0° to 90° azimuth/elevation. The results showed that the M/S-XYZ array is the perceptually “biggest” of the capture systems under test and displayed the fastest expansion onset. Subjects agreed far less about the coincident and non-coincident arrays, in preference in particular but also in expansion onset.
Convention Paper 10274 (Purchase now)

P10-3 Defining Immersion: Literature Review and Implications for Research on Immersive Audiovisual Experiences
Sarvesh Agrawal, Bang & Olufsen a/s - Struer, Denmark; Department of Photonics Engineering, Technical University of Denmark; Adèle Simon, Bang & Olufsen a/s - Struer, Denmark; Søren Bech, Bang & Olufsen a/s - Struer, Denmark; Aalborg University - Aalborg, Denmark; Klaus Bærentsen, Aarhus University - Aarhus, Denmark; Søren Forchhammer, Technical University of Denmark - Lyngby, Denmark
The use of the term “immersion” to describe a multitude of varying experiences in the absence of a definitional consensus has obfuscated and diluted the term. This paper presents a non-exhaustive review of previous work on immersion on the basis of which a definition of immersion is proposed: a state of deep mental involvement in which the subject may experience disassociation from the awareness of the physical world due to a shift in their attentional state. This definition is used to contrast and differentiate interchangeably used terms such as presence and envelopment from immersion. Additionally, an overview of prevailing measurement techniques, implications for research on immersive audiovisual experiences, and avenues for future work are discussed briefly.
Convention Paper 10275 (Purchase now)

P10-4 Evaluation on the Perceptual Influence of Floor Level Loudspeakers for Immersive Audio Reproduction
Yannik Grewe, Fraunhofer IIS - Erlangen, Germany; Andreas Walther, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Julian Klapp, Fraunhofer IIS - Erlangen, Germany
Listening tests were conducted to evaluate the perceptual influence of adding a lower layer of loudspeakers to a setup that is commonly used for immersive audio reproduction. Three setups using horizontally arranged loudspeakers (1M, 2M, 5M), one with added height loudspeakers (5M+4H), and one with additional floor level loudspeakers (5M+4H+3L) were compared. Basic Audio Quality was evaluated in a sweet-spot test with explicit reference, and two preference tests (sweet-spot and off sweet-spot) were performed to evaluate the Overall Audio Quality. The stimuli, e.g., ambient recordings and sound design material, made dedicated use of the lower loudspeaker layer. The results show that reproduction comprising a lower loudspeaker layer is preferred compared to reproduction using the other loudspeaker setups included in the test.
Convention Paper 10276 (Purchase now)

P10-5 Investigating Room-Induced Influences on Immersive Experience Part II: Effects Associated with Listener Groups and Musical Excerpts
Sungyoung Kim, Rochester Institute of Technology - Rochester, NY, USA; Shuichi Sakamoto, Tohoku University - Sendai, Japan
The authors previously compared four distinct multichannel playback rooms and showed that perceived spatial attributes of program material (width, depth, and envelopment) were similar across all four rooms when reproduced through a 22-channel loudspeaker array. The present study further investigated perceived auditory immersion from two additional variables: listener group and musical style. We found a three-way interaction of variables, MUSIC x (playback) ROOM x GROUP for 22-channel reproduced music. The interaction between musical material and playback room acoustics differentiates perceived auditory immersion across listener groups. However, in the 2-channel reproductions, the room and music interaction is prominent enough to flatten inter-group differences. The 22-channel reproduced sound fields may have shaped idiosyncratic cognitive bases for each listener group.
Convention Paper 10277 (Purchase now)

P10-6 Comparison Study of Listeners’ Perception of 5.1 and Dolby Atmos
Tomas Oramus, Academy of Performing Arts in Prague - Prague, Czech Republic; Petr Neubauer, Academy of Performing Arts in Prague - Prague, Czech Republic
Surround sound reproduction has been a common technology in almost every theater room for several decades. In 2012 Dolby Laboratories, Inc. announced a new spatial 3D audio format, Dolby Atmos [1], whose object-based rendering pushes forward the possibilities of spatial reproduction and, supposedly, the listener's experience. This paper examines listeners' perception of this format in comparison with today's unwritten standard for cinema reproduction, 5.1. Two sample groups were chosen for the experiment: experienced listeners (sound designers and sound design students) and inexperienced listeners. The objective was to examine how these two groups perceive the selected formats and whether there is any difference between them. We examined five aspects: Spatial Immersion (Envelopment), Localization, Dynamics, Audio Quality, and Format Preference. The results show mostly insignificant differences between the two groups, while both slightly leaned towards Dolby Atmos over 5.1.
Convention Paper 10278 (Purchase now)


Thursday, October 17, 4:30 pm — 5:30 pm (1E08)


Immersive & Spatial Audio: IS04 - 3D Audio Philosophies & Techniques for Commercial Music

Bt Gibbs, Skyline Entertainment and Publishing - Morgan Hill, CA, USA; Tool Shed Studios - Morgan Hill, CA, USA

3D audio (360 spatial) for immersive content has made massive strides in just the first five months of 2019. However, the majority of content remains in the animated VR world, while commercial audio (in all genres) continues to be delivered across streaming and download platforms in L+R stereo. With binaural (headphone) delivery of spatial audio now under discussion at many major hi-res audio delivery platforms, commercial spatial music delivery is coming very soon: commercial artists will soon be able to deliver studio-quality audio (if not MQA) with an "in-the-studio" experience to consumers in ambisonic formats.
This presentation will demonstrate studio sessions delivered in 360 video, with stereo mixes translated to static (non-HRTF) 360 audio, originally captured for standard stereo delivery through traditional streaming and download sites. All of this audio is prepared for a simultaneous (and rapid) turnaround from pre-production to final masters delivered on both 360 and stereo platforms. Doing so requires planning even in the earliest (pre-production) stages, prior to actual recording.


Friday, October 18, 9:00 am — 10:00 am (1E08)

Immersive & Spatial Audio: IS05 - Building Listening Tests in VR

Gavin Kearney, University of York - York, UK
Tomasz Rudzki, University of York - York, UK
Benjamin Tsui, University of York - York, UK

In this workshop we will demonstrate how to prepare and conduct various listening tests in VR easily with the tools we have created. Our open-source toolbox consists of Unity objects, VST plugins, and a MATLAB-based data analysis app, providing an end-to-end workflow from creating a test to visualizing the results. Researchers can create their own listening tests, which can be run on different VR headsets, e.g., Oculus Rift or HTC Vive. We will dive into some actual use cases to show the practicality and robustness of using audio-visual VR presentation for perceptual tests. We encourage researchers to use the toolbox, give feedback, and contribute to the project's development.

AES Technical Council This session is presented in association with the AES Technical Committee on Audio for Games


Friday, October 18, 9:00 am — 10:15 am (1E06)

Recording & Production: RP12 - Recording and Realizing Immersive Classical Music For, and with, Dolby Atmos

John Loose, Dolby Laboratories, Inc. - San Francisco, CA, USA
David Bowles, Swineshead Productions LLC - Berkeley, CA, USA
Morten Lindberg, 2L (Lindberg Lyd AS) - Oslo, Norway
Jack Vad, San Francisco Symphony - San Francisco, CA, USA

Producing classical releases in immersive formats like Dolby Atmos has considerations unique to the genre. Dolby's John Loose will moderate this lively panel of Grammy-nominated and Grammy-winning classical engineers and producers, who will discuss the translation from microphones to immersive playback environments, including binaural Dolby Atmos playback.


Friday, October 18, 9:00 am — 11:00 am (1E10)

Paper Session: P13 - Spatial Audio, Part 2

Doyuen Ko, Belmont University - Nashville, TN, USA

P13-1 Simplified Source Directivity Rendering in Acoustic Virtual Reality Using the Directivity Sample Combination
Georg Götz, Aalto University - Espoo, Finland; Ville Pulkki, Aalto University - Espoo, Finland
This contribution proposes a simplified rendering of source directivity patterns for the simulation and auralization of auditory scenes consisting of multiple listeners or sources. It is based on applying directivity filters of arbitrary directivity patterns at multiple, supposedly important directions, and approximating the filter outputs of intermediate directions by interpolation. This reduces the amount of required filtering operations considerably and thus increases the computational efficiency of the auralization. As a proof of concept, the simplification is evaluated from a technical as well as from a perceptual point of view for one specific use case. The promising results suggest further studies of the proposed simplification in the future to assess its applicability to more complex scenarios.
Convention Paper 10286 (Purchase now)
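The core simplification in P13-1, applying directivity filters only at a few key directions and approximating intermediate directions by interpolating the filter outputs, can be sketched generically. The linear cross-fade below is an illustrative assumption, not the authors' exact interpolation scheme:

```python
def interpolate_directivity(outputs, angles, angle):
    """Approximate a directivity-filtered output at an arbitrary angle.

    outputs: filter output signals (lists of samples) precomputed at the
    sparse, sorted `angles` (degrees). Intermediate directions are
    linearly cross-faded between the two nearest precomputed directions,
    so only a handful of filtering operations is needed per source.
    """
    for i in range(len(angles) - 1):
        if angles[i] <= angle <= angles[i + 1]:
            w = (angle - angles[i]) / (angles[i + 1] - angles[i])
            lo, hi = outputs[i], outputs[i + 1]
            # Sample-wise cross-fade between the two filtered signals.
            return [(1.0 - w) * a + w * b for a, b in zip(lo, hi)]
    raise ValueError("angle outside precomputed range")
```

At a precomputed direction the interpolation returns that filter output exactly; halfway between two directions it returns their average, which is the cheap approximation traded against running a full filter per direction.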

P13-2 Classification of HRTFs Using Perceptually Meaningful Frequency Arrays
Nolan Eley, New York University - New York, NY, USA
Head-related transfer functions (HRTFs) are essential in binaural audio. Because HRTFs are highly individualized and difficult to acquire, much research has been devoted towards improving HRTF performance for the general population. Such research requires a valid and robust method for classifying and comparing HRTFs. This study used a k-nearest neighbor (KNN) classifier to evaluate the ability of several different frequency arrays to characterize HRTFs. The perceptual impact of these frequency arrays was evaluated through a subjective test. Mel-frequency arrays showed the best results in the KNN classification tests while the subjective test results were inconclusive.
Convention Paper 10288 (Purchase now)
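The pipeline P13-2 evaluates, mel-spaced spectral features fed to a k-nearest-neighbor classifier, can be sketched generically. The filterbank layout and Euclidean distance below are illustrative assumptions, not the paper's exact configuration:

```python
import math

def mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def inv_mel(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_band_energies(mag, sr=48000, n_bands=12):
    """Log energies of a magnitude spectrum summed in mel-spaced bands.

    mag: magnitude bins covering 0..sr/2 (the topmost bin is excluded
    by the half-open band intervals, which is harmless here).
    """
    n = len(mag)
    freqs = [i * (sr / 2.0) / (n - 1) for i in range(n)]
    edges = [inv_mel(k * mel(sr / 2.0) / n_bands) for k in range(n_bands + 1)]
    energies = []
    for b in range(n_bands):
        e = sum(m * m for f, m in zip(freqs, mag) if edges[b] <= f < edges[b + 1])
        energies.append(math.log10(e + 1e-12))
    return energies

def knn_predict(train, query, k=3):
    """Majority vote among the k nearest training items (Euclidean)."""
    dists = sorted((sum((a - b) ** 2 for a, b in zip(feat, query)), label)
                   for feat, label in train)
    top = [label for _, label in dists[:k]]
    return max(set(top), key=top.count)
```

For example, spectra with a flat shape and spectra with a rising tilt form two cleanly separable classes in this feature space, so a flat-ish query is assigned the flat label.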

P13-3 An HRTF Based Approach towards Binaural Sound Source Localization
Kaushik Sunder, Embody VR - Mountain View, CA, USA; Yuxiang Wang, University of Rochester - Rochester, NY, USA
With the evolution of smart headphones, hearables, and hearing aids, there is a need for technologies to improve situational awareness. The device needs to constantly monitor real-world events and cue the listener to stay aware of the outside world. In this paper we develop a technique to identify the exact location of the dominant sound source using the unique spectral and temporal features of the listener’s head-related transfer functions (HRTFs). Unlike most state-of-the-art beamforming technologies, this method localizes the sound source using just two microphones, thereby reducing the cost and complexity of the technology. An experimental framework is set up in the EmbodyVR anechoic chamber, and hearing-aid recordings are carried out for several different trajectories, SNRs, and turn rates. Results indicate that the source localization algorithms perform well for dynamically moving sources at different SNR levels.
Convention Paper 10289 (Purchase now)

P13-4 Physical Controllers vs. Hand-and-Gesture Tracking: Control Scheme Evaluation for VR Audio Mixing
Justin Bennington, Belmont University - Nashville, TN, USA; Doyuen Ko, Belmont University - Nashville, TN, USA
This paper investigates potential differences in performance for both physical and hand-and-gesture control within a Virtual Reality (VR) audio mixing environment. The test was designed to draw upon prior evaluations of control schemes for audio mixing while presenting sound sources to the user for both controller schemes within VR. A VR audio mixing interface was developed in order to facilitate a subjective evaluation of the two control schemes. Response data were analyzed with t-tests and ANOVA. Physical controllers were generally rated higher than the hand-and-gesture controls in terms of perceived accuracy, efficiency, and satisfaction. No significant difference in task completion time for either control scheme was found. The test participants largely preferred the physical controllers over the hand-and-gesture control scheme. There were no significant differences in the ability to make adjustments in general when comparing groups of more experienced and less experienced audio engineers.
Convention Paper 10290 (Purchase now)


Friday, October 18, 10:00 am — 11:00 am (1E17)

Immersive & Spatial Audio: IS14 - Mick Sawaguchi Immersive Preview

Mick Sawaguchi, Mick Sound Lab - Tokyo, Japan

Mick Sawaguchi recently made unique 7.1.4 music recordings in Finland, drawing on his vast immersive experience and new microphone techniques. This session is an appetizer for the full suite, to be presented at InterBEE in November.


Friday, October 18, 10:15 am — 11:45 am (1E08)

Recording & Production: RP13 - Platinum Mastering: Mastering Immersive Audio

Michael Romanowski, Coast Mastering - Berkeley, CA, USA; The Tape Project
Stefan Bock, msm-studios GmbH - Munich, Germany
Gavin Lurssen, Lurssen Mastering - Los Angeles, CA, USA
Andres A. Mayo, 360 Music Lab - Buenos Aires, Argentina
Darcy Proper, Darcy Proper Mastering at Valhalla Studios - Auburn, NY, USA
Mark Wilder

The long-running Platinum Mastering panel will focus on mastering in an immersive environment. Mastering experts from around the world discuss the challenges and expectations of mastering for release and distribution. The focus will be on the many aspects of the emerging popularity of immersive music and the challenges mastering engineers face with delivery, formats, workflow, and many other details involved in providing the best combination of artistic and technical qualities for the artist and the consumer. Many different formats, with different requirements, are vying for prominence, and mastering engineers need to be prepared for each.


Friday, October 18, 10:30 am — 12:00 pm (1E06)

Game Audio & XR: GA13 - Borderlands 3 - The Wild West of Atmos

Brian Fieser, Gearbox Software
Julian Kwasneski, Bay Area Sound
Mark Petty, Gearbox Software
William Storkson, Bay Area Sound

This event will cover: emitter/object-based in-game design vs. rendered 7.1.4 assets; how these two approaches differ in spatialization from the player's perspective; understanding the end-user environment and mixing for Atmos virtualization; Atmos for headphones; linear post/cinematic design for Atmos; mix perspectives, and how aggressive we should be with height information (Atmos is the wild west); and using discrete 7.1.4 as a tool to better understand inconsistencies in in-game/run-time spatial information.

AES Technical Council This session is presented in association with the AES Technical Committee on Audio for Games


Friday, October 18, 11:00 am — 12:00 pm (1E17)

Immersive & Spatial Audio: IS15 - Hyunkook Lee Talk&Play

This session will demonstrate various 7.1.4 and 4.0.4 immersive 3D recordings made using PCMA-3D and ESMA-3D microphone techniques. The demos will include an orchestral concert recorded at Victoria Hall, Geneva; choral performances recorded at Merton College Chapel, Oxford, and York Minster; an organ performance at Huddersfield Town Hall; and soundscapes of New York City. The psychoacoustic principles of the microphone arrays used will also be explained.


Friday, October 18, 1:45 pm — 4:15 pm (1E10)

Paper Session: P14 - Spatial Audio, Part 3

Christof Faller, Illusonic GmbH - Uster, Zürich, Switzerland; EPFL - Lausanne, Switzerland

P14-1 Measurement of Oral-Binaural Room Impulse Response by Singing Scales
Munhum Park, King Mongkut's Institute of Technology Ladkrabang - Bangkok, Thailand
Oral-binaural room impulse responses (OBRIRs) are the transfer functions from mouth to ears measured in a room. Modulated by many factors, OBRIRs contain information for the study of stage acoustics from the performer’s perspective and can be used for auralization. Measuring OBRIRs on a human is, however, a cumbersome and time-consuming process. In the current study some issues of OBRIR measurement on humans were addressed in a series of measurements. With in-ear and mouth microphones, volunteers sang scales, and a simple post-processing scheme was used to refine the transfer functions. The results suggest that OBRIRs may be measured consistently using the proposed protocol, where only 4–8 diatonic scales need to be sung depending on the target signal-to-noise ratio.
Convention Paper 10291 (Purchase now)

P14-2 Effects of Capsule Coincidence in FOA Using MEMS: Objective Experiment
Gabriel Zalles, University of California, San Diego - La Jolla, CA, USA
This paper describes an experiment attempting to determine the effects of capsule coincidence in First Order Ambisonic (FOA) capture. While the spatial audio technique of ambisonics has been widely researched, it continues to grow in interest with the proliferation of AR and VR devices and services. Specifically, this paper attempts to determine whether the increased capsule coincidence afforded by Micro-Electronic Mechanical Systems (MEMS) capsules can help increase the impression of realism in spatial audio recordings via objective and subjective analysis. This is the first of a two-part paper.
Convention Paper 10292 (Purchase now)

P14-3 Spatial B-Format Equalization
Alexis Favrot, Illusonic GmbH - Uster, Switzerland; Christof Faller, Illusonic GmbH - Uster, Zürich, Switzerland; EPFL - Lausanne, Switzerland
Audio corresponding to the moving picture of a virtual reality (VR) camera can be recorded using a VR microphone. The resulting A- or B-format channels are decoded with respect to the look direction to generate binaural or multichannel audio that follows the visual scene. Existing post-production tools are limited to linear matrixing and filtering of the recorded channels when only the signal of a VR microphone is available. A time-frequency adaptive method is presented that provides native B-format manipulations, such as equalization, which can be applied to sound arriving from a specific direction with high spatial resolution, yielding a backwards-compatible modified B-format signal. Both the linear and adaptive approaches are compared to the ideal case of truly equalized sources.
Convention Paper 10293 (Purchase now)

P14-4 Exploratory Research into the Suitability of Various 3D Input Devices for an Immersive Mixing Task
Diego I Quiroz Orozco, McGill University - Montreal, QC, Canada; Denis Martin, McGill University - Montreal, QC, Canada; CIRMMT - Montreal, QC, Canada
This study evaluates the suitability of one 2D (mouse and fader) and three 3D (Leap Motion, Space Mouse, Novint Falcon) input devices for an immersive mixing task. A test, in which subjects were asked to pan a monophonic sound object (probe) to the location of a pink noise burst (target), was conducted in a custom 3D loudspeaker array. The objectives were to determine how quickly the subjects were able to perform the task using each input device, which of the four was most appropriate for the task, and which was most preferred overall. Results show significant differences in response time between 2D and 3D input devices. Furthermore, it was found that localization blur had a significant influence on the subjects’ response times, as did “corner” locations.
Convention Paper 10294 (Purchase now)

P14-5 The 3DCC Microphone Technique: A Native B-format Approach to Recording Musical Performance
Kathleen "Ying-Ying" Zhang, New York University - New York, NY, USA; McGill University - Montreal, QC, Canada; Paul Geluso, New York University - New York, NY, USA
In this paper we propose a “native” B-format recording technique that uses dual-capsule microphone technology. The three dual coincident capsule (3DCC) microphone array is a compact soundfield capturing system. 3DCC’s advantage is that it requires minimal matrix processing during post-production to create either a B-format signal or a multi-pattern, discrete six-channel output with high stereo compatibility. Given its versatility, the system is also capable of producing a number of different primary and secondary signals that are either natively available or derived in post-production. A case study of the system’s matrixing technique has resulted in robust immersive imaging in a multichannel listening environment, leading to the possibility of future development of the system as a single six-channel soundfield microphone.
Convention Paper 10295 (Purchase now)


Friday, October 18, 2:15 pm — 3:15 pm (1E08)

Immersive & Spatial Audio: IS06 - Capturing Reality with the Use of Spatial Sound and High Order Ambisonics – Ethnographic and Six Degrees of Freedom (6DoF) Case Studies

Tomasz Zernicki, Zylia sp. z o.o. - Poznan, Poland
Florian Grond, McGill University - Montreal, Canada
Eduardo Patricio, Zylia Sp. z o.o. - Poznan, Poland
Zack Settel, University of Montreal - Montreal, Quebec, Canada

This workshop will present spatial sound works from a practical perspective. Professional audio engineers and musicians will discuss their 360°, 3D, and ambient productions combining sound, image, and written text. The speakers will address the use of spatial audio and ambisonics for creating immersive representations of reality, including six-degrees-of-freedom live recorded sound. The need for new thinking and specific tools will be discussed, and examples created via experimental 6DoF end-to-end workflows will be demonstrated. The workshop will focus especially on the use of spherical microphone arrays, which enable the recording of entire 3D sound scenes as well as six-degrees-of-freedom (6DoF VR) experiences. Additionally, the workshop will address Ambisonics and the separation of individual sound sources in post-production, which gives creators access to unique sonic possibilities.


Friday, October 18, 3:00 pm — 4:00 pm (1E17)


Immersive & Spatial Audio: IS16 - Florian Camerer Talk&Play

Florian Camerer, ORF - Austrian TV - Vienna, Austria; EBU - European Broadcasting Union

Having finally arrived where human beings are all the time, immersive recording and reproduction of sound is here to stay. Besides the ubiquitous 3D-audio bombardment of action movies, music and sound effects provide potentially more subtle but certainly no less compelling listening experiences. In the latter realm (atmosphere recording), Florian Camerer has gone the extra mile to explore the frontiers of quality for location sound setups in 3D audio. "Nothing is more practical than a good theory" is the foundation of his immersive outdoor rig. The thinking behind its dimensions will be covered, and the audience can judge for themselves whether the practical result holds up to the theory through the examples that will be played.


Friday, October 18, 3:15 pm — 5:15 pm (1E21)


Sound Reinforcement: SR07 - Psychoacoustics for Sound Engineers

Peter Mapp, Peter Mapp Associates - Colchester, Essex, UK

The tutorial will discuss and illustrate a number of psychoacoustic phenomena and effects that we use every day when designing and setting up sound systems. Understanding how we hear and discriminate sounds can lead to improved system design, alignment, and optimization. Topics that will be discussed include how we integrate and perceive sounds arriving from different directions and at different times and levels; perception of frequency and frequency balance; the audible effects of latency in systems, IEMs, and sound/video synchronization; the Lombard speech and level effect; and binaural listening versus monaural measurement.

AES Technical Council This session is presented in association with the AES Technical Committee on Acoustics and Sound Reinforcement


Friday, October 18, 3:30 pm — 4:30 pm (1E08)

Immersive & Spatial Audio: IS07 - Six-Degrees-of-Freedom (6DoF) Sound Capture and Playback Using Multiple Higher Order Ambisonics (HOA) Microphones

Lukasz Januszkiewicz, Zylia Sp. z o.o. - Poznan, Poland
Eduardo Patricio, Zylia Sp. z o.o. - Poznan, Poland
Tomasz Zernicki, Zylia sp. z o.o. - Poznan, Poland

This workshop is addressed to all who wish to learn about capturing immersive audio scenes for the purpose of virtual reality or music production. More specifically, this workshop will focus on a strategy for recording sound and enabling six-degrees-of-freedom playback, making use of multiple simultaneous and synchronized Higher Order Ambisonics (HOA) recordings. Such a strategy enables users to navigate a simulated 3D space and listen to the six-degrees-of-freedom recordings from different perspectives. Additionally, during this workshop we will describe challenges related to creating a Unity-based navigable 3D audiovisual playback system.


Saturday, October 19, 9:00 am — 12:00 pm (1E13)

Immersive & Spatial Audio: IS08 - Ambisonics Tools for Immersive Audio Capture and Post-Production

Ianina Canalis, National University of Lanús - Buenos Aires, Argentina
Brian Glasscock, Sennheiser
Andres A. Mayo, 360 Music Lab - Buenos Aires, Argentina
Martin Muscatello, 360 Music Lab

Over the past three years, immersive audio production tools have evolved considerably, allowing producers to combine them in many different ways. In this workshop we will provide an in-depth explanation of how Ambisonics works and how it can be a central piece of an immersive audio production workflow. Attendees will be able to experiment with dedicated hardware and software tools during the entire workshop. Bring your own laptop and headphones for a unique learning session!

Preregistration is required for this event. Tickets are $75 (member) and $125 (non-member) and can be purchased on-line when you register for the convention. Seating is limited. For more information and to register click here.


Saturday, October 19, 9:00 am — 10:30 am (1E08)

Immersive & Spatial Audio: IS09 - Producing High-Quality 360/3D VR Concert Videos with 3D Immersive Audio

Ming-Lun Lee, University of Rochester - Rochester, NY, USA
Steve Philbert, University of Rochester - Rochester, NY, USA

Our 3D Audio Research Laboratory at the University of Rochester has recorded over 40 concerts at the Eastman School of Music since Fall 2017. We have used an Orah 4i 4K 360 VR Camera and a Kandao Obsidian R 8K 3D 360 VR Camera to make 360/3D video recordings, as well as two Neumann KU100 Binaural Microphones, a Sennheiser Ambeo Smart Headset, a 32-element mh acoustics em32 Eigenmike microphone array, a Sennheiser Ambeo VR Microphone, a Zoom H3-VR Handy Recorder, a Core Sound TetraMic, and a Core Sound OctoMic to make 3D immersive audio recordings. With Adobe Premiere, we have been able to edit and render high-quality 8K 360/3D concert videos mixed with binaural recordings for head-locked binaural audio or Ambisonic recordings for head-tracking binaural audio.

This workshop aims to show our optimized workflows for making high-quality VR concert videos, from recording and editing to rendering and finally publishing on YouTube and Facebook. We plan to demonstrate some essential recording and editing techniques with practical examples that allow attendees to hear binaural audio over headphones. Making long concert VR videos is much more challenging than making short VR music videos. We have encountered and investigated many technical issues, including stitching, video/audio drift, synchronization, and equalization. Therefore, we also want to share our experiences in resolving some critical A/V issues and improving audio quality. We welcome the audience to join discussions and share their experiences.


Saturday, October 19, 10:30 am — 11:30 am (1E06)


Immersive & Spatial Audio: IS10 - Music Production for Dolby Atmos

Lasse Nipkow, Silent Work LLC - Zurich, Switzerland

Multichannel music productions increasingly use Dolby Atmos, the audio format originally conceived for cinema. The speaker setup includes ceiling speakers, which may sit in very different positions on the ceiling depending on their number. This requires special strategies for recording and mixing, so that the result works well regardless of the speaker configuration in the listening room. On the one hand, caution is advised when reproducing direct sound vertically from the ceiling, as musical instruments are almost never located directly above the listener during live performances. On the other hand, the ceiling speakers should be used, as they make a significant contribution to the immersive effect.

For larger speaker configurations, additional speakers are added to the sides. These make it possible to localize stable sound sources to the side. Instruments positioned to the side, however, largely produce undesirable results for music, because the interaural level differences can become very large and thus uncomfortable for the listener. It is therefore recommended to feed those speakers signals that do not produce such large level differences. One way to avoid this unwanted effect is to reproduce signals from both side speakers simultaneously that are short enough that their localization to the side does not unbalance the mix.

During the workshop Lasse Nipkow will explain the most important phenomena that can be exploited for music in Dolby Atmos and how they can best be applied to a loudspeaker setup in a large auditorium. Various sound and video examples will be shown during the presentation.


Saturday, October 19, 10:30 am — 12:00 pm (South Concourse A)

Poster: P16 - Posters: Spatial Audio

P16-1 Calibration Approaches for Higher Order Ambisonic Microphone Arrays
Charles Middlicott, University of Derby - Derby, UK; Sky Labs Brentwood - Essex, UK; Bruce Wiggins, University of Derby - Derby, Derbyshire, UK
Recent years have seen an increase in the capture and production of ambisonic material, as companies such as YouTube and Facebook utilize ambisonics for spatial audio playback. Consequently, there is now a greater need for affordable higher order microphone arrays. This work details the development of a five-channel circular horizontal ambisonic microphone intended as a tool to explore various optimization techniques, focusing on capsule calibration and pre-processing approaches for unmatched capsules.
Convention Paper 10301 (Purchase now)

P16-2 A Qualitative Investigation of Soundbar Theory
Julia Perla, Belmont University - Nashville, TN, USA; Wesley Bulla, Belmont University - Nashville, TN, USA
This study investigated basic acoustic principles and assumptions that form the foundation of soundbar technology. A qualitative listening test compared 12 original soundscape scenes, each comprising five stationary and two moving auditory elements. Subjects listened to a 5.1 reference scene and were asked to rate “spectral clarity and richness of sound,” “width and height,” and “immersion and envelopment” of stereophonic, soundbar, and 5.1 versions of each scene. ANOVA revealed a significant effect for all three systems. In all three attribute groups, stereophonic was rated lowest, followed by soundbar, then surround. Results suggest waveguide-based “soundbar technology” might provide a more immersive experience than stereo but will not likely be as immersive as true surround reproduction.
Convention Paper 10302 (Purchase now)

P16-3 The Effect of the Grid Resolution of Binaural Room Acoustic Auralization on Spatial and Timbral Fidelity
Dale Johnson, University of Huddersfield - Huddersfield, UK; Hyunkook Lee, University of Huddersfield - Huddersfield, UK
This paper investigates the effect of the grid resolution of binaural room acoustic auralization on spatial and timbral fidelity. Binaural concert hall stimuli were generated using a virtual acoustics program utilizing image source and ray tracing techniques. Each image source and ray was binaurally synthesized using Lebedev grids of increasing resolution from 6 to 5810 (reference) points. A MUSHRA test was performed in which subjects rated the magnitudes of spatial and timbral differences of each stimulus relative to the reference. Overall, it was found that on the MUSHRA scale, 6 points were perceived as "Fair," 14 points "Good," and 26 points and above all "Excellent," for both spatial and timbral fidelity.
Convention Paper 10303 (Purchase now)

P16-4 A Compact Loudspeaker Matrix System to Create 3D Sounds for Personal Uses
Aya Saito, University of Aizu - Aizuwakamatsu City, Japan; Takahiro Nemoto, University of Aizu - Aizuwakamatsu, Japan; Akira Saji, University of Aizu - Aizuwakamatsu City, Japan; Jie Huang, University of Aizu - Aizuwakamatsu City, Japan
In this paper we propose a new two-layer 3D sound system arranged as a matrix with five loudspeakers on each side of the listener. The system is effective for sound localization and compact enough for personal use. Sound images are created by an extended amplitude panning method, incorporating the effect of head-related transfer functions (HRTFs). The system's localization performance was evaluated in auditory experiments with listeners. As a result, listeners could distinguish sound image directions localized at any azimuth and at high elevations with only small biases.
Convention Paper 10304 (Purchase now)

P16-5 Evaluation of Spatial Audio Quality of the Synthesis of Binaural Room Impulse Responses for New Object Positions
Stephan Werner, Technische Universität Ilmenau - Ilmenau, Germany; Florian Klein, Technische Universität Ilmenau - Ilmenau, Germany; Clemens Müller, Technical University of Ilmenau - Ilmenau, Germany
The aim of auditory augmented reality is to create an auditory illusion combining virtual audio objects and scenarios with the perceived real acoustic surroundings. A suitable system, such as position-dynamic binaural synthesis, is needed to minimize perceptual conflicts with the perceived real world. The binaural room impulse responses (BRIRs) required have to fit the acoustics of the listening room. One approach to minimizing the large number of BRIRs needed for all source-receiver relations is to synthesize BRIRs from only one measurement in the listening room. The focus of this paper is the evaluation of spatial audio quality. In most conditions, differences in direct-to-reverberant energy ratio between a reference and the synthesis are below the just-noticeable difference. Furthermore, only small differences are found for perceived overall difference, distance, and direction perception. Perceived externalization is comparable to that achieved with measured BRIRs. Challenges remain in synthesizing more distant sources from a measured source position close to the listening position.
Convention Paper 10305 (Purchase now)

P16-6 Withdrawn

P16-7 An Adaptive Crosstalk Cancellation System Using Microphones at the Ears
Tobias Kabzinski, RWTH Aachen University - Aachen, Germany; Peter Jax, RWTH Aachen University - Aachen, Germany
For the reproduction of binaural signals via loudspeakers, crosstalk cancellation systems are necessary. To compute the crosstalk cancellation filters, the transfer functions between loudspeakers and ears must be known. If the listener moves, the filters are usually updated based on a model or previously measured transfer functions. We propose a novel architecture: microphones are placed close to the listener’s ears to continuously estimate the true transfer functions, which are then used to adapt the crosstalk cancellation filters. A fast frequency-domain state-space approach is employed for multichannel system tracking. For simulations of slow listener rotations, it is demonstrated by objective and subjective means that the proposed system successfully attenuates crosstalk of the direct sound components.
Convention Paper 10307 (Purchase now)

P16-8 Immersive Sound Reproduction in Real Environments Using a Linear Loudspeaker Array
Valeria Bruschi, Università Politecnica delle Marche - Ancona, Italy; Nicola Ortolani, Università Politecnica delle Marche - Ancona, Italy; Stefania Cecchi, Università Politecnica delle Marche - Ancona, Italy; Francesco Piazza, Università Politecnica delle Marche - Ancona, Italy
In this paper an immersive sound reproduction system capable of improving the overall listening experience is presented and tested using a linear loudspeaker array. The system aims to provide channel separation over a broadband spectrum by implementing the RACE (Recursive Ambiophonic Crosstalk Elimination) algorithm and a beamforming algorithm based on a pressure-matching approach. A real-time implementation of the algorithm has been carried out, and its performance has been evaluated against the state of the art. Objective and subjective measurements have confirmed the effectiveness of the proposed approach.
Convention Paper 10308 (Purchase now)

P16-9 The Influences of Microphone System, Video, and Listening Position on the Perceived Quality of Surround Recording for Sport Content
Aimee Moulson, University of Huddersfield - Huddersfield, UK; Hyunkook Lee, University of Huddersfield - Huddersfield, UK
This paper investigates the influences of the recording/reproduction format, video, and listening position on the quality perception of surround ambience recordings for sporting events. Two microphone systems—First Order Ambisonics (FOA) and Equal Segment Microphone Array (ESMA)—were compared in both 4-channel (2D) and 8-channel (3D) loudspeaker reproductions. One subject group tested audio-only conditions while the other group was presented with video as well as audio. Overall, the ESMA was rated significantly higher than the FOA for all quality attributes tested regardless of the presence of video. The 2D and 3D reproductions did not have a significant difference within each microphone system. Video had a significant interaction with the microphone system and listening position depending on the attribute.
Convention Paper 10309 (Purchase now)

P16-10 Sound Design and Reproduction Techniques for Co-Located Narrative VR Experiences
Marta Gospodarek, New York University - New York, NY, USA; Andrea Genovese, New York University - New York, NY, USA; Dennis Dembeck, New York University - New York, NY, USA; Flavorlab; Corinne Brenner, New York University - New York, NY, USA; Agnieszka Roginska, New York University - New York, NY, USA; Ken Perlin, New York University - New York, NY, USA
Immersive co-located theatre aims to bring the social aspects of traditional cinematic and theatrical experience into Virtual Reality (VR). Within these VR environments, participants can see and hear each other, while their virtual seating location corresponds to their actual position in the physical space. These elements create a realistic sense of presence and communication, which enables an audience to create a cognitive impression of a shared virtual space. This article presents a theoretical framework behind the design principles, challenges, and factors involved in the sound production of co-located VR cinematic productions, followed by a case-study discussion examining the implementation of an example system for a 6-minute cinematic experience for 30 simultaneous users. A hybrid reproduction system is proposed for the delivery of an effective sound design for shared cinematic VR.
Winner of the 147th AES Convention Best Peer-Reviewed Paper Award
Convention Paper 10287 (Purchase now)


Saturday, October 19, 11:00 am — 12:00 pm (1E17)

Immersive & Spatial Audio: IS17 - Genelec Play Immersive

A wide collection of excellent immersive music, nature, and film recordings, conveyed via a point-source 7.1.4 reproduction system.


Saturday, October 19, 1:30 pm — 4:30 pm (1E13)

Archiving & Restoration: AR09 - Audio Repair and Restoration for Music and Post: Build Your Skills

David Barber, Juniper Post, Inc. - Burbank, CA, USA
Alexey Lukin, iZotope, Inc. - Cambridge, MA, USA
Jessica Thompson, Jessica Thompson Audio - Berkeley, CA, USA
Jonathan Wyner, M Works Studios/iZotope/Berklee College of Music - Boston, MA, USA; M Works Mastering

Single-ended noise reduction and audio repair tools have evolved over the past 35 years to the point that they have become an integral part of workflows across audio disciplines. During this workshop attendees will be led through an overview of the various technologies, techniques, and strategies used to solve audio challenges in music and audio post. Attendees will be guided through exercises that will help them develop their skills in audio repair and restoration.

Preregistration is required for this event. Tickets are $75 (member) and $125 (non-member) and can be purchased on-line when you register for the convention. Seating is limited. For more information and to register click here.


Saturday, October 19, 3:00 pm — 4:00 pm (1E08)

Audio for Cinema: AC04 - Ambisonics in Cinema

John Escobar, Berklee College of Music - Boston, MA, USA

Berklee Professor and cinema audio maven John Escobar explores the uses of Ambisonics in cinema post-production to enhance existing audio as well as audio captured by soundfield microphones. Ambisonics can be used to “spatialize” non-soundfield recordings for better localization and potential further use in interactive media. Mr. Escobar will also demonstrate the use of Ambisonics in film score production. Audio examples using the technology will be played.


Saturday, October 19, 3:30 pm — 5:00 pm (1E12)


Sound Reinforcement: SR10 - Seven Steps to a Successful Sound System Design

Josh Loar, Michigan Technological University - Houghton, MI, USA; The Producers Group - Burbank, CA, USA

Sound systems are getting more complex every day. In the era of digital audio, network-enabled devices, and complex interdependent show control, developing a sound system design can feel like a daunting task. Josh Loar (author of The Sound System Design Primer, available from Focal Press/Routledge) presents a systematic, seven-step process for designing any sound system that demystifies the process and allows the prospective designer to make logical choices and maximize the elegance and efficiency of the chosen technical solutions. Loar introduces a question-and-answer process that helps designers parse the system needs in any gear category, identifying which specs are most relevant to the designer's work and which questions a designer must ask and answer along the way. Additionally, Loar discusses the differing needs of various classes of system, from live theater and concert systems, to studio and production systems, to theme parks, casinos, and other installed systems.


Return to Immersive & Spatial Audio Track Events