144th AES CONVENTION Engineering Brief Details

AES Milan 2018

EB01 - e-Brief Posters—1


Wednesday, May 23, 11:15 — 12:45 (Arena 2)

EB01-1 Experimental Study on Sound Quality of Various Audio Fade Lengths
Jedrzej Borowski, Dolby Poland - Wroclaw, Poland; Krzysztof Bulawski, Dolby Poland - Wroclaw, Poland; Krzysztof Goliasz, Dolby Poland - Wroclaw, Poland
The aim of this paper is to determine the shortest audio fade and crossfade lengths that are not audible as artifacts. The determined lengths can be utilized in adaptive streaming scenarios during pauses and content switches. Subjective and objective tests were performed, utilizing speech and music signals with various fade-out and fade-in lengths. Subjective evaluation was performed by critical listening tests where listeners were asked to grade the quality of the fade-out or fade-in and listen for unwanted audio artifacts. Based on the subjective test results, optimal ranges of fade-out and fade-in times were selected: 50 to 100 ms for fade-in and 100 to 200 ms for fade-out. Objective tests were conducted using the optimal times chosen by the listening tests. The results confirm that the selected ranges of fade-in and fade-out lengths do not introduce significant harmonic distortion and noise into the signal.
Engineering Brief 404
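
For orientation, the fade ranges reported in EB01-1 can be tried out with a minimal sketch like the one below. It is not the authors' test procedure: the abstract does not state the fade shape, so a linear ramp is assumed here, and the function name `apply_fades`, the 48 kHz sample rate, and the chosen 50 ms / 100 ms values (the lower ends of the reported ranges) are illustrative assumptions.

```python
def apply_fades(samples, sample_rate, fade_in_ms=50.0, fade_out_ms=100.0):
    """Return a copy of `samples` with a linear fade-in and fade-out applied.

    The linear ramp shape is an assumption; the brief only reports lengths.
    """
    out = list(samples)
    n = len(out)
    # Convert fade lengths from milliseconds to a sample count.
    n_in = min(n, int(sample_rate * fade_in_ms / 1000.0))
    n_out = min(n, int(sample_rate * fade_out_ms / 1000.0))
    # Fade-in: gain rises from 0 at the first sample towards 1.
    for i in range(n_in):
        out[i] *= i / n_in
    # Fade-out: gain falls towards 0 at the last sample.
    for i in range(n_out):
        out[n - 1 - i] *= i / n_out
    return out


# One second of a constant signal at an assumed 48 kHz sample rate.
signal = [1.0] * 48000
faded = apply_fades(signal, 48000)
```

With these values the fade-in spans 2400 samples and the fade-out 4800 samples; the middle of the signal is left untouched.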

EB01-2 3D Sound Intensity Measurement of 1241 Sound Objects on Fine Panning Grids by Using a Virtual Source Visualizer
Takashi Mikami, SONA Co. - Tokyo, Japan; Masataka Nakahara, ONFUTURE Ltd. - Tokyo, Japan; SONA Corp. - Tokyo, Japan; Kazutaka Someya, beBlue Co., Ltd. - Tokyo, Japan; Akira Omoto, Kyushu University - Fukuoka, Japan; Onfuture Ltd. - Tokyo, Japan
3D sound intensity measurement of 1241 sound objects rendered by Dolby Atmos on fine panning grids was carried out by using a Virtual Source Visualizer (VSV). The obtained sound localizations were visualized as a 3D panning map. To evaluate properties of reproduced sound fields for several kinds of rendering systems in various rooms, the authors have previously carried out VSV measurements of sound objects at some main panning positions. The results roughly illustrated each acoustic feature of the rendered sound fields. This measurement was carried out to find the relationship between the 3D panner’s indications and physical sound localizations on a finer scale. The visualized sound localizations formed a “3D panning position map” that clearly shows the relationship between them.
Engineering Brief 405

EB01-3 SOFA Native Spatializer Plugin for Unity—Exchangeable HRTFs in Virtual Reality
Claudia Jenny, University of Vienna - Vienna, Austria; Austrian Academy of Sciences - Vienna, Austria; Piotr Majdak, Austrian Academy of Sciences - Vienna, Austria; Christoph Reuter, University of Vienna - Vienna, Austria
In order to present three-dimensional virtual sound sources via headphones, head-related transfer functions (HRTFs) can be integrated in a spatialization algorithm. However, the spatial perception in binaural virtual acoustics may be limited if the applied HRTFs differ from those of the actual listener. Thus, SOFAlizer, a spatialization engine that allows using and switching on the fly between listener-specific HRTFs stored in the spatially oriented format for acoustics (SOFA), was implemented for the Unity game engine. With that plugin, virtual-reality headsets can benefit from individual HRTF-based spatial sound reproduction.
Engineering Brief 406

EB01-4 PR-VR: Approaching Sound Field Recording in Multi-Reality Environments
Tiernan Cross, University of Sydney - Sydney, Australia
This brief will communicate the author’s recent research that questions what it is to expand the horizon of field recording beyond the physical sense of sonic immediacy into the simultaneous recording of mixed physical, technological, and network-based realities. In doing so this work proposes a reconstruction of what constitutes a modern sound recordist’s immediate sonic environment in today’s technologically inundated atmospheres. This brief will discuss the architecture of PR-VR, a software-based audio device capable of recording real-time, multichannel inputs and field recordings from physical, technological, and virtual acoustic spaces concurrently. By blending multi-reality input streams algorithmically, this research looks to explore how modern technology can sculpt new variegated sound field recordings and formulate new hybrid soundscapes.
Engineering Brief 407

EB01-5 A Stand for Measurement and Prediction of Scattering Properties of Diffusers
Adam Kurowski, Gdansk University of Technology - Gdansk, Poland; Damian Koszewski, Gdansk University of Technology - Gdansk, Poland; Józef Kotus, Gdansk University of Technology - Gdansk, Poland; Bozena Kostek, Gdansk University of Technology - Gdansk, Poland; Audio Acoustics Lab.
In this paper we present a set of solutions that may be used for prototyping and simulation of acoustic scattering devices. The proposed system is capable of measuring a sound field. We also show a way to use an open source solution for simulating the scattering phenomena that occur in proximity to acoustic diffusers. The results of our work are a measurement procedure and a prototype of a simulation script based on FEniCS, an open source computing platform for the FEM-based solution of differential equations. A visualization and comparison of data obtained from measurement and from an example simulation scenario are presented and discussed.
Engineering Brief 408

EB01-6 The Immersive Media Laboratory: Installation of a Novel Multichannel Audio Laboratory for Immersive Media Applications
Robert Hupke, Leibniz Universität Hannover - Hannover, Germany; Marcel Nophut, Leibniz Universität Hannover - Hannover, Germany; Song Li, Leibniz Universität Hannover - Hannover, Germany; Roman Schlieper, Leibniz Universität Hannover - Hannover, Germany; Stephan Preihs, Leibniz Universität Hannover - Hannover, Germany; Jürgen Peissig, Leibniz Universität Hannover - Hannover, Germany
This engineering brief presents the novel multichannel audio laboratory for immersive media applications of the Institut für Kommunikationstechnik (IKT) with its varying multichannel loudspeaker arrangements and acoustically transparent projection screens covering nearly 270°. We address the construction process and setup of the laboratory, called the Immersive Media Lab (IML). It was designed in compliance with the strict recommendations of ITU-R BS.1116-3 in order to conduct research in 3D audio reproduction. Our brief will first address issues of space and room dimensions as well as the acoustical design of the new listening room. Furthermore, the flexible loudspeaker arrangement consisting of 28 active loudspeakers and the projection setup consisting of 3 high-definition ultra-short-throw video projectors are described.
Engineering Brief 409

EB01-7 Can Visual Priming Affect the Perceived Sound Quality of a Voice Signal in Voice over Internet Protocol (VoIP) Applications?
Jack Haigh, University of Limerick - Limerick, Limerick, Ireland; Chris Exton, University of Limerick - Limerick, Ireland; Malachy Ronan, Limerick Institute of Technology - Limerick, Ireland
Verbal suggestions of loudness changes have been reported to result in significantly higher loudness ratings than those of a control group [1]. This study seeks to extend these results to VoIP applications by implementing visual priming cues within a VoIP interface and assessing their effect on audio quality ratings. A list of common visual priming cues was compiled and cross-referenced with prevalent design features found in popular mobile VoIP applications. Fourteen participants were divided into two groups: one received embedded priming cues and one did not. Quality ratings were gathered using a MOS rating scale. The results are presented and their relevance discussed.
Engineering Brief 410

EB01-8 Communication Through Timbral Manipulation: Using Equalization to Communicate Warmth—Part 1
Alejandro Aspinwall, McGill University - Montreal, QC, Canada; Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT) - Montreal, Quebec, Canada
With the advent of new technologies that allow virtually any modern computer to process high quality audio, many musicians and amateur players are presented with a plethora of sound sculpting tools. Some of these display subjective attributes such as warmth, punch, and shimmer. If engineers are able to manipulate the timbre of a recorded sound using equalization, are they then able to use this ability to convey specific perceptual intentions (to make a sound “crunchy,” “bright,” or “warm,” for instance)? Using Juslin’s standard paradigm, this study explores the question: How effective are audio engineers in communicating warmth when applying equalization?
Engineering Brief 411

EB01-9 An Open Realtime Binaural Synthesis Toolkit for Audio Research
Andreas Franck, University of Southampton - Southampton, Hampshire, UK; Giacomo Costantini, University of Southampton - Southampton, UK; Chris Pike, BBC R&D - Salford, UK; University of York - York, UK; Filippo Maria Fazi, University of Southampton - Southampton, Hampshire, UK
Binaural synthesis has gained fundamental importance both as a practical sound reproduction method and as a tool in audio research. Binaural rendering requires significant implementation effort, especially if head movement tracking or dynamic sound scenes are required, thus impeding audio research. For this reason we propose the Binaural Synthesis Toolkit (BST), a portable, open source, and extensible software package for binaural synthesis. In this paper we present the design of the BST and the three rendering approaches currently implemented. In contrast to most other software, the BST can easily be adapted and extended by users. The Binaural Synthesis Toolkit is released as an open software package as a flexible solution for binaural reproduction and to foster reproducible research in this field.
Engineering Brief 412

EB01-10 Measurement of Latency in the Android Audio Path
Szymon Zaporowski, Gdansk University of Technology - Gdansk, Poland; Maciej Blaszke, Gdansk University of Technology - Gdansk, Poland; Dawid Weber, Gdansk University of Technology - Gdansk, Poland
This paper describes experimental investigations comparing the audio path characteristics of various Android versions. First, the changes in each system version are presented in terms of the latency they cause. Then, a measurement procedure employing available latency-measurement applications is described, and its results are compared with results published on the Internet. Finally, a comparison of the tested systems and the test results are presented, along with conclusions on possible audio processing implementations on the Android platform.
Engineering Brief 413


EB02 - Applications & Audio Education


Thursday, May 24, 14:15 — 16:00 (Scala 2)

Chair:
Nyssim Lefford, Luleå University of Technology - Luleå, Sweden

EB02-1 New Packet Routing for 5G to Replace TCP/IP
John Grant, Nine Tiles - Cambridge, UK
While most of the attention has been focused on new radio and getting more bits over the wireless interface, operators also need 5G to have new packet routing technology that will make better use of those bits and support new services including low-latency live media. The new technology is being developed in ETSI ISG NGP, which the author chairs, and is expected to be standardized by 2020. This paper outlines the likely main features of the new technology, which is partly developed from AES47 and AES51, and discusses how they will make it more appropriate than IP for audio networking.
Engineering Brief 414

EB02-2 Miniaturized Noise Generation System—A Simulation of a Simulation
Jan Banas, Intel Technology Poland - Gdansk, Poland; Przemek Maziewski, Intel Technology Poland - Gdansk, Poland; Sebastian Rosenkiewicz, Intel Technology Poland - Gdansk, Poland
In the speech recognition industry, there is an everlasting need for evaluation of products in environments imitating real use cases. A widespread solution is to build a setup compliant with the ETSI EG 202 396-1 standard, which defines a unified artificial laboratory environment to simulate real use scenarios of products to be tested. For space and cost reduction, a method is being developed to miniaturize the standard setup and simulate its behavior in a soundproof enclosure. In order to achieve high fidelity, a number of spectral and temporal qualities of sound are measured in a laboratory and replicated in a box. The performance is evaluated using metrics specific to speech recognition.
Engineering Brief 415

EB02-3 FXive: A Web Platform for Procedural Sound Synthesis
Parham Bahadoran, Queen Mary University London - London, UK; FXive.com - London, UK; Adan Benito, Queen Mary University London - London, UK; FXive.com - London, UK; Thomas Vassallo, Queen Mary University London - London, UK; Joshua D. Reiss, Queen Mary University of London - London, UK
FXive is a real-time sound effect synthesis framework in the browser. The system comprises a library of synthesis models, audio effects, post-processing tools, and temporal and spatial placement functionality for the user to create a scene from scratch. The real-time nature allows the user to manipulate multiple parameters to shape the sound at the point of creation. Semantic descriptors are mapped to low level parameters in order to provide an intuitive means of user manipulation. Post-processing features allow for the auditory, temporal, and spatial manipulation of these individual sound effects.
Engineering Brief 416

EB02-4 Auto-EQ: Can Algorithms Replace a Sound Engineer?
Daniil Sinev, ARKAMYS - Paris, France; Le Mans University - Le Mans, France; Guillaume Rossi-Ferrari, ARKAMYS - Paris, France
This brief’s aim is to present a work in progress on an automatic equalization algorithm. The algorithm’s particular design, based on a parametric equalizer rather than inverse filtering, presents certain advantages as well as certain challenges. Being conceived and developed together with sound engineers, it is meant to mimic human decisions in filter choices. This necessitates a careful analysis of a sound engineer’s workflow and a search for algorithmic solutions that correspond to decisions based on listening, experience and personal preference.
Engineering Brief 417

EB02-5 Challenging Changes for Live NGA Immersive Audio Production
Peter Poers, Junger Audio GmbH - Berlin, Germany
The world of broadcast audio is on the verge of a major revolution. Numerous "3D Immersive" formats are developing and will find their way into the mainstream of broadcast production and distribution in the near future. Along with Immersive Audio, another category comes into play: Object Based Audio (OBA). Together they make up the Next Generation Audio (NGA) formats. What does this mean, and what challenges do we face? OBA will give end users the option to personalize their experience by selecting personalized audio mixes. In object based audio, an "object" is essentially an audio stream with accompanying descriptive metadata. One of the major challenges for the production side of the industry will be to start OBA production. This means completely rethinking how we perform the final mix because, with OBA, it will be performed at home by the viewer rather than by a mixer in a post-production facility. What does this all mean for the broadcaster? A complete rebuild of existing facilities and a total rethink of the audio processing equipment required for outside broadcast vehicles? Well, if we get it right, there will be some changes to overall workflow and hardware. And working with metadata for live streams and in files will become a major challenge. There will be new technical tools and new standards that will help to meet these new requirements. Some ideas and facts will be presented.
Engineering Brief 418

EB02-6 Mutebook.me—Interactive Online Tools for Teaching Music Technology
Thilo Schaller, SUNY Buffalo State - Buffalo, NY, USA; Jan Burle, School of Creative and Performing Arts, University of Calgary - Calgary, AB, Canada
Mutebook is a project that aims to help audio arts students comprehend scientific concepts related to music technology, such as basic mathematics, fundamentals of acoustics, and digital audio theory. By using online interactive course material with visual and aural feedback, students of various audio arts disciplines can intuitively explore and understand relevant scientific concepts. Mutebook was in part funded by the Innovative Instruction Technology Grant (IITG) of the State University of New York (SUNY), and its first phase—i.e., the creation of an initial collection of interactive lecture notes with integrated applets—will be completed in May 2018.
Engineering Brief 419

EB02-7 CAE Support to Woofer Installation in a Car
Andrzej Pietrzyk, Volvo Car Corporation - Torslanda, Sweden
CAE support for the development of the audio system is discussed from an automotive OEM perspective, using the example of developing the installation details of a door woofer. For this application a loudspeaker model driven with electric voltage has to be integrated in the vibro-acoustic simulation. An example implementation of such a model is discussed, and the accuracy of simulations is presented for the case of a woofer in a free-hanging door. Accuracy is further discussed for car models with different trim levels, from a plain metal body with trimmed closures to a fully trimmed body. Finally, the results obtained in a car model with different variants of the woofer installation details are presented and discussed.
Engineering Brief 420


EB03 - Signal Processing/Audio Effects & Instrumentation/Measurements/Forensics


Friday, May 25, 09:30 — 11:15 (Scala 2)

Chair:
Aki Mäkivirta, Genelec Oy - Iisalmi, Finland

EB03-1 SysId—A System Identification Measurement Package
Jont Allen, University of Illinois - Urbana, IL, USA
SysID is a computer program that was developed c. 1980 at Bell Labs for the measurement of linear and nonlinear systems. At that time it was ported to the IBM-PC and was sold by the Ariel Corp., Highland Park, NJ. Today it is a Matlab/Octave script that works with special hardware manufactured by Mimosa Acoustics of Champaign, IL. SysID can measure the complex frequency response and impulse response of any linear system, such as loudspeakers, earphones, and rooms. When coupled with Matlab/Octave, it may be used as an audio-band network analyzer providing a pole-zero analysis of measured complex impedance, either electrical or acoustical. By using synchronous analysis, SysID complements spectrum analyzer functions and, in many cases, can extend the function of a two-channel spectrum analyzer, quickly producing highly accurate magnitude and phase results, limited only by the bit-accuracy of the codecs. It can more accurately characterize harmonic distortion and intermodulation distortion, group delay, phase, impedance, and many other important system features, in near real time, along with a pole-zero analysis. In this presentation I will describe the theory behind SysID and give a demonstration on the hand-held portable system. There is a long history behind such systems. For example, a number-theory method called MLS is believed by many to be superior; however, this is a topic that needs clarification. MLS uses binary sequences and therefore cannot directly measure THD+N or intermodulation distortion. These issues are easily overcome by using int-32 sequences having power-of-2 sequence lengths. SysID has been used to measure auditoriums, conference rooms, loudspeaker impulse responses, cochlear potentials, ear canal impedance and reflectance, and many other two-port measurements such as a detailed loudspeaker analysis.
The system is used in ECE-403 by students to analyze loudspeaker characteristics, and was used to characterize and then model hearing aid receivers [1, 2].
Engineering Brief 421

EB03-2 Capacitor Distortion in High-Order Active Filters
Douglas Self, The Signal Transfer Company - London, UK
Some non-electrolytic capacitor types such as polyester generate distortion when they have a significant signal voltage across them. This can be avoided by using polypropylene or polystyrene types, but they are larger and more expensive. I have previously shown that in 2nd-order Sallen & Key filters, both lowpass and highpass, only one of the two capacitors has to be of the expensive sort to obtain the same freedom from distortion achieved with two. The Geffe configuration allows 3rd and 4th-order filters to be realized economically in one stage, and it is shown that here too only one linear capacitor is required in both lowpass and highpass cases, saving a lot of money.
Engineering Brief 422

EB03-3 Graphical Development Design for a Heterogeneous DSP Core Architecture
Miguel Chavez, Analog Devices - Wilmington, MA, USA
For over 15 years, Analog Devices has continued improving its graphical programming environment to support several audio-specific and general-purpose digital signal processors (DSPs). As of 2016, all supported processors contained either a single DSP core or dual DSP cores of the same architecture. With the need for a heterogeneous DSP architecture, the team faced the challenge of programming both cores within the same environment. This paper describes the challenges, trade-offs, and design decisions made when programming a new heterogeneous DSP core architecture.
Engineering Brief 423

EB03-4 Room Acoustic Measurements with Logarithmic Sine Sweeps on Android Phones
Lorenzo Rizzi, Suono e Vita - Acoustic Engineering - Lecco, Italy; Giulio Scotti, Suonoevita Ingegneria - Lecco, Italy; Gabriele Ghelfi, Suono e Vita - Acoustic Engineering - Lecco, Italy
An application for room acoustics measurements has been developed for Android devices. Its novelty is the use of the logarithmic sine sweep method, which is better than the typical direct methods used so far. The article describes the main points of the design phase and emphasizes the results of on-site testing of the instrument in commonly used rooms. The testing of this Android instrument gives insights into small-room acoustics and the measurement quality of acoustical parameters.
Engineering Brief 424
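
As background to EB03-4, the excitation signal of the logarithmic (exponential) sine sweep method can be sketched as follows. This uses the commonly cited exponential sweep formulation and is not taken from the authors' Android application; the function names, the 20 Hz to 20 kHz range, the one-second duration, and the 48 kHz sample rate are all illustrative assumptions. Only signal generation is shown, not the deconvolution that recovers the impulse response.

```python
import math


def exponential_sweep(f_start, f_end, duration_s, sample_rate):
    """Generate a sine sweep whose frequency rises exponentially with time."""
    n = int(duration_s * sample_rate)
    rate = math.log(f_end / f_start)
    # Phase(t) = k * (exp(rate * t / T) - 1), so frequency starts at f_start.
    k = 2.0 * math.pi * f_start * duration_s / rate
    return [
        math.sin(k * (math.exp(rate * (i / sample_rate) / duration_s) - 1.0))
        for i in range(n)
    ]


def instantaneous_frequency(f_start, f_end, duration_s, t):
    """Frequency of the sweep at time t (derivative of the phase over 2*pi)."""
    return f_start * (f_end / f_start) ** (t / duration_s)


# Hypothetical measurement settings, chosen only for illustration.
sweep = exponential_sweep(20.0, 20000.0, 1.0, 48000)
```

Because the frequency grows by a constant ratio per unit time, the sweep spends equal time in each octave, which is why the method is described as logarithmic.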

EB03-5 DirPat—Database and Viewer of 2D/3D Directivity Patterns of Sound Sources and Receivers
Manuel Brandner, University of Music and Performing Arts Graz - Graz, Austria; Matthias Frank, University of Music and Performing Arts Graz - Graz, Austria; Daniel Rudrich, Institute of Electronic Music and Acoustics Graz - Graz, Austria
A measurement repository (DirPat) has been set up to archive all 3D and 2D directivity patterns measured at the Institute of Electronic Music and Acoustics, University of Music and Performing Arts in Graz. Directivity measurements have been made of various loudspeakers, microphones, and also of human speakers/singers for specific phonemes. The repository holds time domain impulse responses for each direction of the radiating or incident sound path. The data can be visualized with the provided 2D and 3D visualization scripts programmed in MATLAB. The repository is used for ongoing scientific research in the field of directivity evaluation of sources or receivers regarding localization, auditory perception, and room acoustic modeling.
Engineering Brief 425

EB03-6 The Anatomy, Physiology, and Diagnostics of Smart Audio Devices
Xinhui Zhou, Audio Precision - Beaverton, OR, USA; Mark Martin, Audio Precision - Beaverton, OR, USA; Jayant Datta, Audio Precision - Beaverton, OR, USA; Vijay Badawadagi, Audio Precision - Beaverton, OR, USA
Smart audio devices are becoming ubiquitous and their popularity has been skyrocketing. By current standards, a smart audio device is voice-controlled through interaction with an Internet-based intelligent virtual assistant and usually provides access to remote repositories of music or information. This paper focuses on smart speakers (e.g., Amazon Echo, Google Home), the most popular type of smart audio device. Though they are usually composed of relatively simple audio components, these devices incorporate very sophisticated audio signal processing, a plethora of audio pathways, and functional audio subsystems, which pose a significant challenge in testing. This paper explores the audio subsystems and pathways found on this type of device and suggests ways to test and validate their functionality and performance.
Engineering Brief 426

EB03-7 Proposed AES Test Standard for Specifying Continuous and Short-Term Maximum Power and SPL for Electronic and Electro-Acoustic Systems: Part 1
D. B. (Don) Keele, Jr., DBK Associates and Labs - Bloomington, IN, USA; Steven Hutt, Equity Sound Investments - Bloomington, IN, USA; Marshall Kay, Keysight Technologies - Apex, NC, USA; Hugh Sarvis, Presonus Audio Electronics-Worx Audio Technologies - Baton Rouge, LA, USA
This paper describes a proposed test method that allows both the continuous and short-term maximum peak output and SPL for electronic and electroacoustic systems to be measured. Systems such as power amplifiers, loudspeakers, and sound reinforcement/cinema systems can be measured and specified over the complete audible range. The test is divided into two parts that individually assess: (1) the system’s broadband continuous maximum output using various steady-state low-crest-factor test signals and (2) the system’s short-term narrow-band maximum output using high-crest-factor test signals. The combination of both tests completely specifies the system’s maximum output on a continuous and short-term narrow-band basis. A future Part 2 paper will go into detail concerning the tests and will illustrate with measured test results.
Engineering Brief 427
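
EB03-7 distinguishes low-crest-factor from high-crest-factor test signals. As a reminder of what that quantity measures, the sketch below computes crest factor (peak level relative to RMS level, in dB) for two illustrative signals. These are not the test signals of the proposed standard, which the abstract does not specify; the function name, the 1 kHz tone, the 10 ms burst length, and the 48 kHz sample rate are assumptions made for the example.

```python
import math


def crest_factor_db(samples):
    """Crest factor in dB: ratio of peak magnitude to RMS value."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)


# Steady-state example: one second of a 1 kHz sine at an assumed 48 kHz rate.
sine = [math.sin(2.0 * math.pi * 1000.0 * n / 48000.0) for n in range(48000)]

# Short-term example: the first 10 ms of that sine followed by silence.
burst = sine[:480] + [0.0] * (48000 - 480)

sine_cf = crest_factor_db(sine)    # about 3 dB for a continuous sine
burst_cf = crest_factor_db(burst)  # about 23 dB for the 1% duty-cycle burst
```

The burst has the same peak as the continuous tone but a much lower RMS value, which is what makes it a high-crest-factor signal.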


EB04 - Spatial Audio


Friday, May 25, 14:15 — 15:45 (Scala 2)

Chair:
Frank Schultz, University of Music and Performing Arts Graz - Graz, Austria

EB04-1 Ambilibrium—A User-Friendly Ambisonics Encoder/Decoder-Matrix Designer Tool
Michael Romanov, University of Music and Performing Arts Graz - Graz, Austria
The name "Ambilibrium" is composed of the Latin prefix "ambi" (around) and the Latin word "aequilibrium" (balance). This engineering brief discusses the implementation of a tool that creates the spatial balance from any spherical microphone array to any loudspeaker array in the form of custom Ambisonics encoder/decoder matrices, packaged in a user-friendly interface. A technique for automatic loudspeaker position estimation using an approach similar to GPS and the optimization of AllRAD are also discussed in this engineering brief.
Engineering Brief 428

EB04-2 Distant Speech Beamforming Improving Multi-User ASR
Adam Kupryjanow, Intel Technology Poland - Gdansk, Poland; Raghavendra Rao R, Intel Technology - Devarabeesanahalli, KA, India; Przemek Maziewski, Intel Technology Poland - Gdansk, Poland; Lukasz Kurylo, Intel Technology Poland - Gdansk, Poland
In this paper an algorithm that improves side speaker attenuation for superdirective beamformers such as MVDR (Minimum Variance Distortionless Response) is presented. This technique can be utilized in a scenario where there are multiple people in a room intending to interact with an ASR- (automatic speech recognition) enabled device, e.g., a smart speaker. The experiments show that the proposed solution reduces WER (word error rate) by up to 23.93%, calculated for commands uttered by one user when a second user was treated as the side speaker.
Engineering Brief 429
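
The WER metric quoted in EB04-2 is conventionally defined as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below shows that standard definition only; it is not the authors' evaluation pipeline, and the function name and the example command strings are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # Levenshtein distance over words, keeping only the previous DP row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical command and recognizer output: one deletion, one substitution.
wer = word_error_rate("turn on the kitchen light", "turn on kitchen lights")
```

Here two of the five reference words are in error, so `wer` evaluates to 0.4.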

EB04-3 An Efficient Method for Producing Binaural Mixes of Classical Music from a Primary Stereo Mix
Tom Parnell, BBC Research & Development - Salford, UK; Chris Pike, BBC R&D - Salford, UK; University of York - York, UK
Radio audiences in the UK are increasingly listening using headphones, and binaural mixes are likely to offer more natural and immersive classical musical experiences than stereo broadcasts. However, the stereo mix is currently a priority for broadcasters, and producers have limited resources to create an additional, binaural mix. This engineering brief describes the semi-automated workflow used to produce binaural mixes of performances from the BBC Proms. Spatial audio mixes were created by repositioning the individual microphone signals from the stereo broadcast in three dimensions, and adding ambient signals captured using a 3D microphone array. A commercial mixing application was used for spatial panning and binaural rendering, and the resulting binaural audio was streamed live online. Comments on the production workflow were collected from the music balancers, and audience responses were surveyed.
Engineering Brief 430

EB04-4 The Anaglyph Binaural Audio Engine
David Poirier-Quinot, Sorbonne Université, CNRS - Paris, France; Brian F. G. Katz, Sorbonne Université, CNRS, Institut Jean Le Rond d'Alembert - Paris, France
Anaglyph is part of an ongoing research effort into the perceptual and technical capabilities of binaural rendering. The Anaglyph binaural audio engine is a VST audio plugin for binaural spatialization integrating the results of over a decade of spatial hearing research. Anaglyph has been designed as an audio plugin both to support ongoing research efforts and to make the fruits of this research accessible to audio engineers through existing DAW environments. Among its features, Anaglyph includes a personalizable morphological ITD model, near-field ILD and HRTF parallax corrections, a Localization Enhancer, an Externalization Booster, and SOFA HRIR file support. The basic architecture and implementation of each audio-related component is presented here.
Engineering Brief 431

EB04-5 Withdrawn

Engineering Brief 432

EB04-6 HOBA-VR: HRTF On Demand for Binaural Audio in Immersive Virtual Reality Environments
Michele Geronazzo, Aalborg University - Copenhagen, Denmark; Jari Kleimola, Hefio Ltd., - Espoo, Finland; Erik Sikström, Virsabi ApS - Copenhagen, Denmark; Amalia de Götzen, Aalborg University - Copenhagen, Denmark; Stefania Serafin, Aalborg University - Copenhagen, Denmark; Federico Avanzini, University of Milan - Milan, Italy; Dept. of Computer Science
One of the main challenges of spatial audio rendering in headphones is the personalization of the so-called head-related transfer functions (HRTFs). HRTFs capture the listener’s acoustic effects supporting immersive and realistic virtual reality (VR) contexts. This e-brief presents the HOBA-VR framework that provides a full-body VR experience with personalized HRTFs that were individually selected on demand based on anthropometric data (pinnae shapes). The proposed WAVH transfer format allows a flexible management of this customization process. A screening test aiming to evaluate user localization performance with selected HRTFs for a non-visible spatialized audio source is also provided. Accordingly, it might be possible to create a user profile that also contains individual non-acoustic factors such as localizability, satisfaction, and confidence.
Engineering Brief 433


EB06 - Transducers & Psychoacoustics


Friday, May 25, 16:15 — 17:30 (Scala 2)

Chair:
Hyunkook Lee, University of Huddersfield - Huddersfield, UK

EB06-1 Woofer Performance Variance Due to Components and Assembly Process
Maria Costanza Bellini, University of Parma - Parma, Italy; Angelo Farina, Università di Parma - Parma, Italy
This paper presents an experimental study of the main causes of scrap during the production of a woofer loudspeaker. After analyzing the most critical components of a transducer, samples with reference and modified components have been built and characterized in terms of frequency-response and linear distortion curves and electrical, mechanical, and acoustical parameters. In addition, a second set of samples has been built using reference components but varying the assembly process parameters; these samples have been characterized in the same way as the previous ones. Measurements have been performed in an anechoic chamber, along a production line, and inside a car. By analyzing the acquired data, the authors have identified the components and assembly parameters that most influence the required performance.
Engineering Brief 446

EB06-2 Design and Measurement of a First-Order, Horizontally Beam-Controlling Loudspeaker Cube
Nils Meyer-Kahlen, University of Technology Graz - Graz, Austria; University of Music and Performing Arts Graz; Franz Zotter, IEM, University of Music and Performing Arts - Graz, Austria; Katharina Pollack, TU Graz - Graz, Austria; University of Music and Performing Arts Graz - Graz, Austria
This paper describes a loudspeaker cube with four transducers on its horizontal facets, designed to enable sound radiation with adjustable first-order beam control. The design and equipping of the cubical loudspeaker are presented along with two open data sets containing multiple-input-multiple-output impulse responses (MIMO-IRs) of our measurements. The first one contains 648x4 MIMO-IRs from input voltages to a grid of microphones at a fixed distance. The second set contains 4x4 MIMO-IRs from input voltages to loudspeaker cone velocities, and it characterizes the active and passive transducer coupling through the enclosure that we aim to equalize/decouple. Based on these measurements, we present a simple FIR filter design required for beam control, and we discuss its operation range and limitations.
Engineering Brief 447 (Download now)

EB06-3 Ambisonics Directional Room Impulse Response as a New Convention of the Spatially Oriented Format for Acoustics
Andrés Pérez-López, Eurecat - Barcelona, Spain; Pompeu Fabra University - Barcelona, Spain; Julien De Muynke, Fundacio Eurecat - Barcelona, Spain
Room Impulse Response (RIR) measurements are one of the most common ways to capture acoustic characteristics of a given space. When performed with microphone arrays, the RIRs inherently contain directional information. Due to the growing interest in Ambisonics and audio for Virtual Reality, new spherical microphone arrays recently hit the market. Accordingly, several databases of Directional RIRs (DRIRs) measured with such arrays, referred to as Ambisonics DRIRs, have been publicly released. However, there is no format consensus among databases. With the aim of improving interoperability, we propose an exchange format for Ambisonics DRIRs, as a new Spatially Oriented Format for Acoustics (SOFA) convention. As a use-case, some existing databases have been converted and released following our proposal.
Engineering Brief 448 (Download now)

EB06-4 Fidelity of Low Frequency Reproduction in Cars in a Sound Field Control Context
Hans Lahti, Harman - Gothenburg, Sweden; Anders Löfgren, Volvo Cars - Torslanda, Sweden; Adrian Bahne, Dirac Research AB - München, Germany
The overall sound quality of factory-delivered automotive sound systems has reached a very high standard. Branded high-end systems in particular comprise great components and are well tuned. The low frequency reproduction in automotive sound systems is, however, typically flawed. The most prominent flaw consists of resonant bass reproduction and undesirable spectral decay characteristics with strong ringing in wide frequency bands. To overcome this challenge, we adapt an algorithm allowing for simultaneous equalization of multiple channels, assuring full exploitation of the acoustic degrees of freedom inherent to a multichannel system. The sound field can be spatially controlled, yielding a uniform and tight reproduction of low frequencies in all regions of interest throughout the car compartment, with controlled and improved spectral decay characteristics.
Engineering Brief 449 (Download now)

EB06-5 A Distributed Audio System for Automotive Applications
Johannes Boehm, Paragon AG - Delbrück, Germany; Dirk Olszewski, paragon AG - Delbrück, Germany; Zafar Baig Mirza, paragon AG - Delbrück, Germany; Philipp Rathmann, paragon AG - Delbrück, Germany; Antonio Prados-Vilchez, paragon AG - Delbrück, Germany; Vitalie Botan, paragon AG - Delbrück, Germany; Juergen Binder, paragon AG - Delbrück, Germany; Klaus Rodemer, paragon AG - Delbrück, Germany
With the trend toward higher levels of drivetrain electrification and autonomous driving, technology that increases audio performance is increasingly in demand. Instead of centralizing related signal processing in a single powerful hardware platform, distributing it in a more intelligent way can lead to several advantages such as optimized cabling, reduced weight, and improved system scalability, performance, and costs. The distributed audio system proposed in this work is connected to an automobile head unit that serves as human-machine interface and media source. Portions of the data acquisition, signal processing, and amplification are placed within distributed processing nodes. We present a realization with 34 loudspeakers and 16 microphones featuring seat-individual 3D audio rendition, in-car communication, and further innovative use cases.
Engineering Brief 450 (Download now)

 
 

EB05 - e-Brief Posters—2


Friday, May 25, 16:30 — 18:00 (Arena 2)

EB05-1 Musical Polyphony Estimation
Saarish Kareer, University of Rochester - Rochester, NY, USA; Sattwik Basu, University of Rochester - Rochester, NY, USA
Knowing the number of sources present in a mixture is useful for many computer audition problems such as polyphonic music transcription, source separation, and speech enhancement. Most existing algorithms for these applications require the user to provide this number, thereby limiting the possibility of complete automation. In this paper we explore a few probabilistic and machine learning approaches for autonomous source number estimation. We then propose an implementation of a multi-class classification method using convolutional neural networks for musical polyphony estimation. In addition, we use these results to improve the performance of an instrument classifier based on the same dataset. Our final classification results for both networks show that this method is a promising starting point for further advancements in unsupervised source counting and separation algorithms for music and speech.
Engineering Brief 434 (Download now)

EB05-2 Variability of Speech to Reverberation Modulation Energy Ratio
Przemek Maziewski, Intel Technology Poland - Gdansk, Poland; Adam Kupryjanow, Intel Technology Poland - Gdansk, Poland
This paper illustrates the variability of the speech to reverberation modulation energy ratio (SRMR). The presented experiments indicate SRMR inconsistencies per user, per utterance, and per microphone position. Additionally, the results show that the normalization available in the reference SRMR implementation does not limit the variability to an acceptable range. Further, the paper presents a study of the correlation between SRMR and reverberation time (RT). Experiments suggest that a precise relation between SRMR and RT can only be obtained for a specific utterance coming from a known user.
Engineering Brief 435 (Download now)

EB05-3 Introducing a Dataset of Guitar Sounds for Electric Guitars Model Recognition
Renato Profeta, Ilmenau University of Technology - Ilmenau, Germany; Gerald Schuller, Ilmenau University of Technology - Ilmenau, Germany; Fraunhofer Institute for Digital Media Technology (IDMT) - Ilmenau, Germany
This engineering brief introduces a dataset of electric guitar sounds. The main goal of the dataset is to provide a set of electric guitar recordings that can be used for research in identification and/or classification of different electric guitar types. The dataset, at its current stage, consists of recordings from 30 guitars of different manufacturers and types, with around 3500 music events. All audio files are acquired in one-channel, 16-bit waveform audio file format with a sampling rate of 44100 Hz and are accompanied by parameter annotations in XML format. The dataset is planned to include recordings of over 50 guitars and will be released in uncompressed WAV file format under a Creative Commons license.
Engineering Brief 436 (Download now)

EB05-4 Development of the 4-pi Sampling Reverberator, VSVerb—Preliminary Experiments
Masataka Nakahara, ONFUTURE Ltd. - Tokyo, Japan; SONA Corp. - Tokyo, Japan; Akira Omoto, Kyushu University - Fukuoka, Japan; Onfuture Ltd. - Tokyo, Japan; Yasuhiko Nagatomo, Evixar Inc. - Tokyo, Japan
The strategy for capturing and restoring acoustic properties of a 4-pi sound field is one of the important issues for immersive-sound productions. Especially for post-production work, such strategies are required to have a lot of flexibility. While some strategies have already been proposed, such as Ambisonics, WFS, and BoSC, they require much effort and have less flexibility. Therefore, the authors developed a 4-pi sampling reverberator, VSVerb, which restores 4-pi reverberation by using virtual sound sources captured on site. Moreover, it allows various acoustic parameters to be adjusted at the post-production stage. In order to verify the feasibility of the proposed method, some preliminary experiments were conducted. As a result, the effectiveness of the VSVerb was confirmed, and some subjects for future work were also found.
Engineering Brief 437 (Download now)

EB05-5 Practical Evaluation of Sweet Spot in Current Noise Reproduction Systems
Piotr Klinke, Intel Technology Poland - Gdansk, Poland; Roksana Kostyk, Intel Technology Poland - Gdansk, Poland; Jan Banas, Intel Technology Poland - Gdansk, Poland; Przemek Maziewski, Intel Technology Poland - Gdansk, Poland; Dominik Stanczak, Intel Technology Poland - Gdansk, Poland
Quality assessment for speech recognition systems is a complex problem involving many factors, but it all starts and ends with a proper synthesis of user scenarios in a controlled acoustic environment. In this paper, industry-leading background noise reproduction methods are presented. Implementations of two ETSI standards for noise reproduction, ES 202 396-1 and TS 103 224, are measured and compared to determine their vulnerability to sound pressure level and frequency response deviation around the setup center.
Engineering Brief 438 (Download now)

EB05-6 Open Hardware Mobile Wireless Serial Audio Transmission Unit for Acoustical Ecological Momentary Assessment Using Bluetooth RFCOMM
Sven Franz, Jade University of Applied Sciences - Oldenburg, Germany; Holger Groenewold, Jade University of Applied Sciences - Oldenburg, Germany; Inga Holube, Jade University of Applied Sciences - Oldenburg, Germany; Jörg Bitzer, Jade Hochschule Oldenburg - Oldenburg, Germany
Acoustical Ecological Momentary Assessment (EMA) is a necessary step towards understanding noise exposure in everyday life and can also help identify obstacles in the speech understanding of hearing-impaired people. Smartphones would represent a desirable tool for this task, but no simple solution has been proposed yet. Stereo audio transmission between mobile devices via Bluetooth using the Advanced Audio Distribution Profile (A2DP) is a common technique. However, due to software restrictions in Android, this profile is limited to act as a source and is prohibited from receiving a stereo audio stream. In this contribution we present a solution for transmitting uncompressed stereo audio data via Radio Frequency Communication (RFCOMM), enabling an Android smartphone to act as a receiver. Although, in contrast to A2DP, this solution is limited to stereo, 16 kHz, and 16 bit, the resulting audio quality is sufficient for speech signals and acoustical measurements.
Engineering Brief 439 (Download now)

EB05-7 Sound Masking on iOS Devices. Masking Everyday Noises and Tinnitus
Lorenzo Rizzi, Suono e Vita - Acoustic Engineering - Lecco, Italy; Nadir Bertolasi, Suonoevita Ingegneria - Lecco, Italy; Gabriele Ghelfi, Suono e Vita - Acoustic Engineering - Lecco, Italy
The research began by studying the abilities and limits of mobile devices in performing environmental noise analysis using recordings from the internal microphone. Perceptual masking algorithms have been implemented to modify natural sounds specifically for the analyzed noise. Sound pleasantness algorithms have then been applied to obtain better masking sounds. This set-up is being extended to the masking of tinnitus: tinnitus frequency selection is offered to the user through sine-sweep listening.
Engineering Brief 440 (Download now)

EB05-8 Wearable Mobile Bluetooth Device for Stereo Audio Transmission to a Modified Android Smartphone
Holger Groenewold, Jade University of Applied Sciences - Oldenburg, Germany; Sven Franz, Jade University of Applied Sciences - Oldenburg, Germany; Inga Holube, Jade University of Applied Sciences - Oldenburg, Germany; Jörg Bitzer, Jade Hochschule Oldenburg - Oldenburg, Germany
For acoustical long-term measurements based on ecological momentary assessment (EMA), we needed a solution for sending A2DP audio data to an Android smartphone. Due to Android’s software limitations, common smartphones cannot act as the necessary sink for stereo audio. Our innovation is a wearable Bluetooth device containing a wireless transmission unit with two microphones connected to a Nexus 5 smartphone. The Nexus 5 platform was chosen since Google performed tests featuring an Android-powered car audio system, which enables a stereo Bluetooth sink. The proposed open hardware and software solution can be used in several audio application areas where non-intrusive long-term observations are necessary, e.g., noise exposure dosimeters.
Engineering Brief 441 (Download now)

EB05-9 Theoretical and Experimental Study of an Acoustic Monopole Source
Pierluigi A. Argenta, Politecnico di Bari - Bari, Italy; Francesco Martellotta, Politecnico di Bari - Bari, Italy; Leonardo Soria, Politecnico di Bari - Bari, Italy
As is well known, the acoustic monopole source plays a fundamental role in the fields of theoretical and experimental acoustics. In this work, a sound source that closely approximates the dynamic response of a theoretical monopole is designed, and the parameters affecting its operational behavior are highlighted and optimized in terms of sound power and range of reproducible frequencies. We develop a hybrid lumped-parameter-finite-element model for describing the source operation. The model is first experimentally tested to validate its predictive effectiveness. Then, we perform a non-dimensional parametric optimization of the source directionality, obtaining relationships from which general design guidelines are identified to minimize the far-field directionality and thus achieve nearly omnidirectional behavior.
Engineering Brief 442 (Download now)

EB05-10 Development and Validation of a Full Range Acoustic Impedance Tube
Roman Schlieper, Leibniz Universität Hannover - Hannover, Germany; Song Li, Leibniz Universität Hannover - Hannover, Germany; Jürgen Peissig, Leibniz Universität Hannover - Hannover, Germany
Knowledge of the physical properties of materials is of high importance for research and development in acoustics. A standardized method for the determination of acoustic impedances is the impedance measuring tube based on the transfer-function method according to ISO 10534-2. This engineering brief presents the development of an impedance measuring tube with an internal diameter of 8 mm for acoustical impedance measurements in the range of 60 Hz to 20 kHz. The impedance tube was validated by comparing the measurement results with analytical results for the rigid termination, the open-ended tube, and the empty sample holder.
Engineering Brief 443 (Download now)

EB05-11 Directivity and Electro-Acoustic Measurements of the IKO
Frank Schultz, University of Music and Performing Arts Graz - Graz, Austria; Markus Zaunschirm, University of Music and Performing Arts - Graz, Austria; IEM; Franz Zotter, IEM, University of Music and Performing Arts - Graz, Austria
The icosahedral loudspeaker (IKO) as a compact spherical array is capable of 3rd order Ambisonics (TOA) beamforming, and it is used as a musical and technical instrument. To develop and verify beamforming with its 20 loudspeakers flush-mounted into the faces of the regular icosahedron, electroacoustic properties must be measured. We offer a collection of measurement data of IEM’s IKO1, IKO2, and IKO3 along with analysis tools to inspect these properties. Multiple-input-multiple-output (MIMO) data comprises: (i) laser vibrometry measurements of the 20x20 transfer functions from driving voltages to loudspeaker velocities, (ii) 20x16 finite impulse responses (FIR) of the TOA decoding filters, and (iii) 648x20 directional impulse responses from driving voltages to radiated sound pressure. With the open data sets, open source code, and resulting directivity patterns, we intend to support reproducible research about beamforming with spherical loudspeaker arrays.
Engineering Brief 444 (Download now)

 
 

