AES New York 2019
Paper Session P14
P14 - Spatial Audio, Part 3
Friday, October 18, 1:45 pm — 4:15 pm
Christof Faller, Illusonic GmbH - Uster, Zürich, Switzerland; EPFL - Lausanne, Switzerland
P14-1 Measurement of Oral-Binaural Room Impulse Response by Singing Scales—Munhum Park, King Mongkut's Institute of Technology Ladkrabang - Bangkok, Thailand
Oral-binaural room impulse responses (OBRIRs) are the transfer functions from mouth to ears measured in a room. Modulated by many factors, OBRIRs contain information for the study of stage acoustics from the performer’s perspective and can be used for auralization. Measuring OBRIRs on a human is, however, a cumbersome and time-consuming process. In the current study some issues of the OBRIR measurement on humans were addressed in a series of measurements. With in-ear and mouth microphones volunteers sang scales, and a simple post-processing scheme was used to re?ne the transfer functions. The results suggest that OBRIRs may be measured consistently by using the proposed protocol, where only 4~8 diatonic scales need to be sung depending on the target signal-to-noise ratio.
Convention Paper 10291
P14-2 Effects of Capsule Coincidence in FOA Using MEMS: Objective Experiment—Gabriel Zalles, University of California, San Diego - La Jolla, CA, USA
This paper describes an experiment attempting to determine the effects of capsule coincidence in First Order Ambisonic (FOA) capture. While the spatial audio technique of ambisonics has been widely researched, it continues to grow in interest with the proliferation of AR and VR devices and services. Specifically, this paper attempts to determine whether the increased capsule coincidence afforded by Micro-Electronic Mechanical Systems (MEMS) capsules can help increase the impression of realism in spatial audio recordings via objective and subjective analysis. This is the first of a two-part paper.
Convention Paper 10292
P14-3 Spatial B-Format Equalization—Alexis Favrot, Illusonic GmbH - Uster, Switzerland; Christof Faller, Illusonic GmbH - Uster, Zürich, Switzerland; EPFL - Lausanne, Switzerland
Audio corresponding to the moving picture of a virtual reality (VR) camera can be recorded using a VR microphone. The resulting A or B-format channels are decoded with respect to the look-direction for generating binaural or multichannel audio following the visual scene. Existing post-production tools are limited to only linear matrixing and filtering of the recorded channels when only the signal of a VR microphone is available. A time-frequency adaptive method is presented: providing native B-format manipulations, such as equalization, which can be applied to sound arriving from a specific direction with a high spatial resolution, yielding a backwards compatible modified B-format signal. Both linear and adaptive approaches are compared to the ideal case of truly equalized sources.
Convention Paper 10293
P14-4 Exploratory Research into the Suitability of Various 3D Input Devices for an Immersive Mixing Task—Diego I Quiroz Orozco, McGill University - Montreal, QC, Canada; Denis Martin, McGill University - Montreal, QC, Canada; CIRMMT - Montreal, QC, Canada
This study evaluates the suitability of one 2D (mouse and fader) and three 3D (Leap Motion, Space Mouse, Novint Falcon) input devices for an immersive mixing task. A test, in which subjects were asked to pan a monophonic sound object (probe) to the location of a pink noise burst (target), was conducted in a custom 3D loudspeaker array. The objectives were to determine how quickly the subjects were able to perform the task using each input device, which of the four was most appropriate for the task, and which was most preferred overall. Results show significant differences in response time between 2D and 3D input devices. Furthermore, it was found that localization blur had a significant influence over the subject’s response time, as well as “corner” locations.
Convention Paper 10294
P14-5 The 3DCC Microphone Technique: A Native B-format Approach to Recording Musical Performance—Kathleen "Ying-Ying" Zhang, New York University - New York, NY, USA; Paul Geluso, New York University - New York, NY, USA
In this paper we propose a “native” B-format recording technique that uses dual-capsule microphone technology. The three dual coincident capsule (3DCC) microphone array is a compact sound?eld capturing system. 3DCC’s advantage is that it requires minimal matrix processing during post-production to create either a B-format signal or a multi-pattern, discrete six-channel output with high stereo compatibility. Given its versatility, the system is also capable of producing a number of different primary and secondary signals that are either natively available or derived in post-production. A case study of the system’s matrixing technique has resulted in robust immersive imaging in a multichannel listening environment, leading to the possibility of future development of the system as a single six-channel soundfield microphone.
Convention Paper 10295