AES Dublin 2019
Engineering Brief Details

EB01 - Spatial Audio and Acoustics

Wednesday, March 20, 16:15 — 17:45 (Meeting Room 3)

Chair: Piotr Majdak, Austrian Academy of Sciences - Vienna, Austria

EB01-1 Extracting Directional Sound for Ambisonics Mix
Pei-Lun Hsieh, Ambidio - Los Angeles, CA, USA; Tsai-Yi Wu, Ambidio
Ambisonics audio has become the primary format for transmitting and reproducing audio in immersive or interactive content, including 360 video and virtual reality, because it can be decoded flexibly to various speaker configurations and listener orientations. However, one drawback of encoding a sound field to Ambisonics audio is the loss of spatial precision. Higher-order Ambisonics was developed to use more channels in exchange for better precision. In this brief we present a method to detect and extract directional sound from an encoded Ambisonics mix. Improved extraction of the directional signal can improve the performance of other systems, for example the spatial precision during reconstruction.
Engineering Brief 491

EB01-2 Height Channel Signal Level in Immersive Audio—How Much Is Enough?
Richard King, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada; Brett Leonard, BLP Audio - Richmond Heights, MO, USA; Jack Kelly, McGill University - Montreal, QC, Canada
Is there an appropriate level for the height channels in an immersive/3D presentation of recorded music when those channels are used specifically for ambience or spatial information? This paper describes an interactive listening test in which expert listeners were directed to manipulate and set the level for four height channels in the upper ring of a 9.1 channel 3D mix (traditional 5.1 surround sound with the addition of four height channels: front L/R and rear L/R). Stimuli consisted of three musical excerpts: solo piano, string trio, and orchestra. Results were analyzed for mean level and overall variance as a measure of consistency of level set over multiple trials.
Engineering Brief 492

EB01-3 MPEG Surround Encoder with Steganography Feature for Data Hiding Based on LSB Method
Ikhwana Elfitri, Andalas University - Padang, Sumatera Barat, Indonesia; Doni Nursyam, Universitas Andalas - Padang, Indonesia; Ministry of Information and Communication - Banda Aceh Branch Office, Indonesia; Rahmadi Kurnia, Universitas Andalas - Padang, Indonesia; Baharuddin, Universitas Andalas - Padang, Indonesia
Spatial audio coding is becoming more important for future audio technology as Ultra High Definition TV (UHDTV), pioneered in particular by engineers in Japan, is ready to enter the market. In this work a method of audio data hiding (steganography) is proposed for integration into an MPEG Surround (MPS) encoder, a standard based on the principle of spatial audio coding. A new Reverse One-To-Two (R-OTT) module is introduced that is capable of hiding a short piece of important, secret data. This information is embedded in the least significant bit (LSB) of the spatial parameters of the MPS bit stream. The experiments show that the embedded data do not significantly decrease the quality of the transmitted audio signals.
Engineering Brief 493
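The LSB idea in EB01-3 can be illustrated with a minimal sketch. It assumes the spatial parameters are already available as non-negative integer quantization indices; the index values, the payload, and the function names `embed_lsb` and `extract_lsb` are hypothetical and do not reflect the actual MPS bit-stream syntax or the authors' R-OTT module.

```python
def embed_lsb(indices, payload_bits):
    """Overwrite the least significant bit of each index with one payload bit."""
    if len(payload_bits) > len(indices):
        raise ValueError("payload is longer than the carrier")
    stego = list(indices)
    for i, bit in enumerate(payload_bits):
        # Clear the lowest bit, then set it to the payload bit (0 or 1).
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(indices, n_bits):
    """Read back the lowest bit of the first n_bits indices."""
    return [value & 1 for value in indices[:n_bits]]


# Hypothetical quantized spatial-parameter indices (not real MPS data).
carrier = [12, 7, 30, 18, 5, 22, 9, 14]
secret = [1, 0, 1, 1, 0]

stego = embed_lsb(carrier, secret)
recovered = extract_lsb(stego, len(secret))
# Each index moves by at most one quantization step.
max_change = max(abs(a - b) for a, b in zip(carrier, stego))
```

Because only the lowest bit changes, each carrier value shifts by at most one quantization step, which is the usual argument for why LSB embedding has a small effect on quality.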

EB01-4 Consideration on the Design of Multi-Zone Control System in a Vehicle Cabin
Wan-Ho Cho, Korea Research Institute of Standards and Science (KRISS) - Daejeon, Korea, Republic of; Ji-Ho Chang, Korea Research Institute of Standards and Science (KRISS) - Daejeon, Korea
A personal audio system that generates different sound conditions for each seat in a vehicle cabin is a representative application of multi-zone sound field control. Here, the effectiveness of source positions and the robustness of the estimated solutions are investigated for the design of a multi-zone control system in a vehicle cabin. To quantify the efficiency of a source position, a linear independence test is conducted on the transfer matrix between the candidate source positions and the listener, and an efficient position is selected using the value estimated by the effective independence method. A dummy head source system is applied to measure the transfer matrix efficiently. With properly selected source positions, the control performance is observed to be prominent and robust.
Engineering Brief 494

EB01-5 Violin Sound Characteristics by its Predominant Formant Frequency Changes
Ewa Lukasik, Poznan University of Technology - Poznan, Poland
The goal of this Engineering Brief is to gain insight into how violin resonance frequencies change while the instrument is played. It was inspired by the experiments of Tai and Chung, who analyzed the individual violin sounds of a scale and associated them with formants of human singing voices. An excerpt of the “Sarabande from Partita d-minor” BWV 1004 by Johann Sebastian Bach has been analyzed in terms of its predominant formants within the 0–5 kHz band. Violin sounds from the AMATI database have been used in the experiments.
Engineering Brief 495

EB01-6 Room & Architectural Acoustics—A New Approach to the Design & Delivery of Critical Acoustic Facilities
Jim Dunne, Smart Studio - Dublin, Ireland
There has never been more demand for high quality audio studio facilities than there is today. Music recording and production, sound for picture, and gaming are among the areas showing increasing demand for accurate studio acoustics. While equipment has improved in performance with significant reductions in cost, the approach to designing and building a professional recording/mixing room is unchanged since the 1970s. However, the demands on facilities to provide fully calibrated and accurate acoustic environments do not allow for the outdated, traditional methods of designing and building critical audio facilities. Time to move on!
Engineering Brief 496


EB02 - E-Brief Poster Session 1

Thursday, March 21, 10:45 — 12:45 (The Liffey B)

EB02-1 Automatic Mixing Level Balancing Enhanced through Source Interference Identification
Dave Moffat, Queen Mary University London - London, UK; Mark Sandler, Queen Mary University of London - London, UK
It has been well established that equal loudness normalization can produce a perceptually appropriate level balance in an automated mix. Previous work assumes that each captured track represents an individual sound source. In the context of a live drum recording this assumption is incorrect. This paper will demonstrate an approach to identify the source interference and adjust the source gains accordingly, to ensure that tracks are all set to equal perceptual loudness. The impact of this interference on the selected gain parameters and resultant mixture is highlighted.
Engineering Brief 497

EB02-2 Binaural Rendering of Phantom Image Elevation Using VHAP
Hyunkook Lee, University of Huddersfield - Huddersfield, UK; Maksims Mironovs, University of Huddersfield - Huddersfield, West Yorkshire, UK; Dale Johnson, University of Huddersfield - Huddersfield, UK
VHAP (virtual hemispherical amplitude panning) is a method developed to create an elevated phantom source on a virtual upper-hemisphere with only four ear-height loudspeakers. This engineering brief introduces a new VST plug-in for VHAP and evaluates the performance of the binaural rendering of VHAP with a simple but effective distance control method. Listening test results indicate that the binaural mode achieves the externalization of elevated phantom images in various degrees of perceived distance. VHAP is considered to be a cost-efficient and effective method for 3D panning in virtual reality applications as well as in horizontal loudspeaker reproduction. The plugin is available for free download in the Resources section at
Engineering Brief 498

EB02-3 A Web-Based Tool for Microphone Array Design and Phantom Image Prediction Using the Web Audio API
Nikita Goddard, University of Huddersfield - Huddersfield, UK; Hyunkook Lee, University of Huddersfield - Huddersfield, UK
A web-based interactive tool that facilitates microphone array design and phantom image prediction is presented within this brief. Originally a mobile app, this web version of MARRS (Microphone Array Recording and Reproduction Simulator) provides greater accessibility through most web browsers and further functionality for establishing the optimal microphone array for a desired spatial scene. In addition to its novel psychoacoustic algorithm based on interchannel time-level trade-offs for arbitrary loudspeaker angles, another main feature allows demonstration of the phantom image scene through virtual loudspeaker rendering and room simulation via the Web Audio API. The current version of the MARRS web app is available through the Resources section of the APL Website:
Engineering Brief 499

EB02-4 CityTones: A Repository of Crowdsourced Annotated Soundfield Soundscapes
Agnieszka Roginska, New York University - New York, NY, USA; Hyunkook Lee, University of Huddersfield - Huddersfield, UK; Ana Elisa Mendez Mendez, New York University - New York, NY, USA; Scott Murakami, New York University - New York, NY, USA; Andrea Genovese, New York University - New York, NY, USA
Immersive environmental soundscape capture and annotation is a growing area of audio engineering and research with applications in the reproduction of immersive sound experiences in AR and VR, sound classification, and environmental sound archiving. This Engineering Brief introduces CityTones as a crowdsourced repository of soundscapes captured using immersive sound capture methods that the audio community can contribute to. The database will include descriptors containing information about the technical details of the recording, physical information, subjective quality attributes, as well as sound content information.
Engineering Brief 500

EB02-5 Recovering Sound Produced by Wind Turbine Structures Employing Video Motion Magnification
Sebastian Cygert, Gdansk University of Technology - Gdansk, Poland; Andrzej Czyzewski, Gdansk University of Technology - Gdansk, Poland; Marta Stefaniak, Gdansk University of Technology - Gdansk, Poland; Bozena Kostek, Gdansk University of Technology - Gdansk, Poland; Audio Acoustics Lab.
The recordings were made with a fast video camera and with a microphone. Using fast cameras allowed the micro vibrations of the object structure to be observed. Motion-magnified video recordings of wind turbines on a wind farm were made for the purpose of building a damage prediction system. The idea was to use video to recover sound and vibrations in order to obtain a contactless diagnostic method for wind turbines. The recovered signals can be analyzed in a way similar to accelerometer signals, employing spectral analysis. They can also be played back through headphones and compared with sounds recorded by microphones.
Engineering Brief 501

EB02-6 Modelling the Effects of Spectator Distribution and Capacity on Speech Intelligibility in a Typical Soccer Stadium
Ross Hammond, University of Derby - Derby, Derbyshire, UK; Peter Mapp Associates - Colchester, UK; Peter Mapp, Peter Mapp Associates - Colchester, Essex, UK; Adam J. Hill, University of Derby - Derby, Derbyshire, UK
Public address system performance is frequently simulated using acoustic computer models to assess coverage and predict potential intelligibility. When the typical 0.5 speech transmission index (STI) criterion cannot be achieved in voice alarm systems under unoccupied conditions, justification must be made to allow contractual obligations to be met. An expected increase in STI with occupancy can be used as an explanation, though the associated increase in noise levels must also be considered. This work demonstrates typical changes in STI for different spectator distributions in a calibrated stadium computer model. The effects of ambient noise are also considered. The results can be used to approximate expected changes in STI caused by different spectator occupation rates.
Engineering Brief 502

EB02-7 Influence of the Delay in Monitor System on the Motor Coordination of Musicians while Performing
Szymon Zaporowski, Gdansk University of Technology - Gdansk, Poland; Maciej Blaszke, Gdansk University of Technology - Gdansk, Poland; Dawid Weber, Gdansk University of Technology - Gdansk, Poland; Marta Stefaniak, Gdansk University of Technology - Gdansk, Poland
This paper describes measurements, and their results, of the maximum delay that a musician can tolerate while playing an instrument without de-synchronization or discomfort. First, the measurement methodology, comprising audio recording and a fast camera, is described. Then the procedure for determining the maximum delay that still allows comfortable playing is presented. Results are shown for musicians' responses while playing an instrument along with a delayed signal reproduced from the monitor system. Finally, the highest delay values for musicians playing different instruments are given, along with a detailed discussion of the methodology used.
Engineering Brief 503


EB03 - Microphones and Circuits

Thursday, March 21, 16:00 — 17:15 (Meeting Room 3)

Chair: Joerg Panzer, R&D Team - Salgen, Germany

EB03-1 Analysis of Beam Patterns of Super Directive Acoustic Beamformer
Adam Kupryjanow, Intel Technology Poland - Gdansk, Poland
In this brief the beam patterns of a super directive acoustic beamformer are presented. The analysis is based on recordings made in a diffuse far-field environment. A state-of-the-art MVDR (Minimum Variance Distortionless Response) beamformer was used as a representative super directive beamformer. Two types of uniform microphone arrays were investigated: linear and circular. Experiments were performed for various numbers of microphones in the arrays, i.e., two, four, six, and eight.
Engineering Brief 504
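For readers unfamiliar with the MVDR beamformer named in EB03-1, the following is a minimal sketch of the textbook weight formula w = Γ⁻¹d / (dᴴΓ⁻¹d) for a uniform linear array, using the standard sin(x)/x coherence model of a spherically isotropic noise field. The array geometry, the frequency, the speed of sound, and the diagonal loading value are illustrative assumptions, and the code works from a model rather than from recordings, so it is not the brief's measurement procedure.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def steering_vector(positions, freq, angle_deg):
    """Far-field steering vector; angle is measured from the array axis."""
    k = 2 * math.pi * freq / SPEED_OF_SOUND
    cos_a = math.cos(math.radians(angle_deg))
    return [cmath.exp(-1j * k * x * cos_a) for x in positions]


def diffuse_coherence(positions, freq, loading=1e-2):
    """sin(x)/x coherence of an isotropic noise field, plus diagonal loading."""
    k = 2 * math.pi * freq / SPEED_OF_SOUND
    n = len(positions)
    gamma = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            x = k * abs(positions[i] - positions[j])
            gamma[i][j] = complex(1.0 if x == 0 else math.sin(x) / x)
        gamma[i][i] += loading  # regularization, assumed value
    return gamma


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def mvdr_weights(positions, freq, look_deg):
    """w = inv(Gamma) d / (d^H inv(Gamma) d)."""
    d = steering_vector(positions, freq, look_deg)
    g_inv_d = solve(diffuse_coherence(positions, freq), d)
    denom = sum(di.conjugate() * gi for di, gi in zip(d, g_inv_d))
    return [gi / denom for gi in g_inv_d]


def response(weights, positions, freq, angle_deg):
    """Magnitude of the beam pattern w^H d(angle)."""
    d = steering_vector(positions, freq, angle_deg)
    return abs(sum(w.conjugate() * di for w, di in zip(weights, d)))


mics = [0.0, 0.05, 0.10, 0.15]  # four microphones, 5 cm spacing (assumed)
w = mvdr_weights(mics, 1000.0, look_deg=0.0)
look_gain = response(w, mics, 1000.0, 0.0)
pattern = [(a, response(w, mics, 1000.0, a)) for a in range(0, 181, 30)]
```

The distortionless constraint means `look_gain` stays at unity regardless of the noise model; `pattern` samples the beam pattern at 30 degree steps and could be plotted for different microphone counts.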

EB03-2 Configuration for Testing Intermodulation of Ultrasonic Signals in the Microphone Path
Dominik Stanczak, Intel Technology Poland - Gdansk, Poland; Jan Banas, Intel Technology Poland - Gdansk, Poland; Jedrzej Prysko, Intel - Gdansk, Poland; Pawel Trella, Intel - Gdansk, Poland; Przemek Maziewski, Intel Technology Poland - Gdansk, Poland
The paper presents a comparison of five configurations used to test intermodulation of ultrasonic signals. All five require the use of ultrasonic loudspeakers in an anechoic, low-noise environment. The configurations vary in the number and type of digital-to-analog converters and loudspeakers, as well as in connection types. The best configuration, which introduces only a small amount of self-intermodulation at high ultrasonic signal power, is identified.
Engineering Brief 505

EB03-3 Considerations for the Next Generation of Singing Tutor Systems
Behnam Faghih, Maynooth University - Maynooth, Kildare, Ireland; Joseph Timoney, Maynooth University - Maynooth, Kildare, Ireland
Recently, software systems have been proposed to accelerate the progress of beginner singers. These systems work as follows: the pitch of the sung notes is detected and algorithmic errors are removed. Then an alignment is made with a melodic ground truth, often a MIDI representation, using techniques including Dynamic Time Warping and Hidden Markov Models. Although results have been reasonable, significant drawbacks to these alignment schemes include how a “musically acceptable” alignment can be identified, dynamic singer behavior, multiple repeated notes, and dealing with omitted or extra notes. To this end an improved singing analysis system structure is proposed that includes psychoacoustic models and intelligent decision making. Justification is given along with a description of a structured evaluation procedure.
Engineering Brief 506
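The alignment step mentioned in EB03-3 can be illustrated with a minimal Dynamic Time Warping sketch. It assumes the sung pitch track has already been converted to fractional MIDI note numbers, one value per frame, and that the reference melody is a list of MIDI notes; the example data and the function name `dtw` are hypothetical, and the absolute pitch difference is only one possible local cost.

```python
def dtw(sung, reference):
    """Return (total cost, warping path) aligning two pitch sequences."""
    n, m = len(sung), len(reference)
    inf = float("inf")
    # cost[i][j] is the best cumulative cost of aligning sung[:i] with reference[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(sung[i - 1] - reference[j - 1])  # local cost in semitones
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end, always stepping to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


# Hypothetical data: four reference notes and six sung frames.
reference = [60, 62, 64, 65]
sung = [60.2, 60.1, 61.8, 64.3, 64.2, 65.1]

total, path = dtw(sung, reference)
# path pairs each sung frame index with a reference note index.
```

This plain formulation also shows the drawbacks the abstract lists: repeated notes of the same pitch are indistinguishable to the local cost, and an omitted or extra note is forced onto a neighbouring reference note rather than flagged.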

EB03-4 Control Techniques for Audio Envelope Tracking
Robert Bakker, NUI Galway - Galway, Ireland; Maeve Duffy, NUI Galway - Galway, Ireland
One of the main applications for class-D audio amplifiers is portable or battery-operated devices such as Bluetooth speakers, smartphones, car stereos, etc. These devices often use a boost converter to increase the battery voltage to a suitable level to achieve the desired power output. The use of envelope tracking (ET) has been shown to significantly improve the efficiency of a class-D audio amplifier, particularly at lower power levels. However, modulating a boost converter to provide envelope tracking at a high bandwidth is complicated due to the right half-plane zero in its transfer function. This paper discusses the effect of envelope bandwidth on the overall system performance, and how it affects the control of the boost converter. It also discusses the different control methods for boost converters, and variations of this type of DC/DC converter.
Engineering Brief 507
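The right half-plane zero mentioned in EB03-4 can be made concrete with the textbook result for an ideal boost converter in continuous conduction mode, where the zero sits at f = R(1 − D)² / (2πL). The sketch below evaluates it at two output voltages; the battery voltage, load resistance, and inductance are illustrative assumptions, not values from the brief.

```python
import math


def boost_duty_cycle(v_in, v_out):
    """Ideal CCM boost relation: V_out = V_in / (1 - D)."""
    if v_out < v_in:
        raise ValueError("a boost converter cannot step the voltage down")
    return 1.0 - v_in / v_out


def rhp_zero_hz(load_ohms, inductance_h, duty):
    """Right half-plane zero of the ideal CCM boost control-to-output response."""
    return load_ohms * (1.0 - duty) ** 2 / (2.0 * math.pi * inductance_h)


# Assumed example values: single Li-ion cell, 8 ohm equivalent load, 4.7 uH inductor.
V_BATTERY = 3.7
LOAD_OHMS = 8.0
INDUCTANCE_H = 4.7e-6

f_at_6v = rhp_zero_hz(LOAD_OHMS, INDUCTANCE_H, boost_duty_cycle(V_BATTERY, 6.0))
f_at_12v = rhp_zero_hz(LOAD_OHMS, INDUCTANCE_H, boost_duty_cycle(V_BATTERY, 12.0))
```

As the tracked envelope pushes the output voltage up, the duty cycle rises and the zero moves down in frequency, so the worst case for loop bandwidth is the highest output voltage at the heaviest load.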

EB03-5 Sound Synthesis Using Programmable System-on-Chip Devices
Larry Fitzgerald, Maynooth University - Maynooth, Kildare, Ireland; Joseph Timoney, Maynooth University - Maynooth, Kildare, Ireland
An approach to building analog synthesizers may be found by exploiting a new mixed-signal technology called the Programmable System-on-Chip (PSoC), which includes a CPU core and mixed-signal arrays of configurable integrated analog and digital peripherals. Another approach is to exploit a System on Chip (SoC) comprising an ARM-based processor and an FPGA. Two synthesizers were built and evaluated for sound quality and difficulty of implementation. Each of the approaches produced a synthesizer of good sound quality. The mixed-signal approach was cheaper in both component costs and development time compared to the FPGA-based approach.
Engineering Brief 508


EB04 - E-Brief Poster Session 2

Friday, March 22, 10:00 — 12:00 (The Liffey B)

EB04-1 A Study in Machine Learning Applications for Sound Source Localization with Regards to Distance
Hugh O'Dwyer, Trinity College - Dublin, Ireland; Sebastian Csadi, Trinity College Dublin - Dublin, Ireland; Enda Bates, Trinity College Dublin - Dublin, Ireland; Francis M. Boland, Trinity College Dublin - Dublin, Ireland
This engineering brief outlines how Machine Learning (ML) can be used to estimate objective sound source distance by examining both the temporal and spectral content of binaural signals. A simple ML algorithm is presented that is capable of predicting source distance to within half a meter in a previously unseen environment. This algorithm is trained using a selection of features extracted from synthesized binaural speech. This enables us to determine which of a selection of cues can be best used to predict sound source distance in binaural audio. The research presented can be seen not only as an exercise in ML but also as a means of investigating how binaural hearing works.
Engineering Brief 509

EB04-2 Setup and First Experimentation Over an AES67 Over 802.11 Network
Mickaël Henry, Digigram - Montbonnot, France; University of Grenoble - Grenoble, France; Willy Aubry, Digigram - Montbonnot, France
The AES67 standard tackles the transport of audio data over IP technology. This standard was originally created for audio transmission over local area Ethernet networks. However, other types of IP links exist whose behavior differs from Ethernet. This paper investigates the setup of, and the constraints to place on, WiFi links to support AES67 transmission. We will show the limits of carrying an AES67 stereo audio stream on a Wireless Local Area Network. We will detail the difficulties encountered when setting up an AES67 wireless network. Then we will analyze the disruptions that wireless networks introduce in device PTP offset, audio packets, and audio PSNR compared to an AES67 Ethernet network. We will show the gains brought by the SMPTE 2022-7 redundancy technique.
Engineering Brief 510

EB04-3 Computational Complexity of a Nonuniform Orthogonal Lapped Filterbank Based on MDCT and Time Domain Aliasing Reduction
Nils Werner, International Audio Laboratories Erlangen - Erlangen, Germany; Bernd Edler, International Audio Laboratories Erlangen - Erlangen, Germany
In this brief we investigate the computational complexity of a non-uniform lapped orthogonal filterbank with time domain aliasing reduction. The computational complexity of such a filterbank is crucial for its usability in real-time systems, as well as in embedded and mobile devices. Due to the signal-adaptive nature of the filterbank, the actual real-world complexity will be situated between two theoretical bounds and has to be estimated experimentally by processing real-world signals using a coder-decoder pipeline. Both the bounds and the real-world complexity were analyzed in this brief, and a median 14–22% increase in complexity over an adaptive uniform MDCT filterbank was found.
Engineering Brief 511

EB04-4 Research on Reference-Processed Pair Differential Signal Objective Parameters in Relationship with Subjective Listening Tests Results in Digital Audio Coding Domain
Krzysztof Goliasz, Dolby Poland - Wroclaw, Poland; Wroclaw University of Science and Technology - Wroclaw, Poland; Sylwia Prygon, Dolby Poland - Wroclaw, Poland; Wroclaw University of Science and Technology - Wroclaw, Poland; Mikolaj Zwarycz, Dolby Poland - Wroclaw, Dolnoslaskie, Poland
This engineering brief presents research on how objective parameters of the differential signal of a reference-processed pair relate to subjective listening test results in the digital audio coding domain. The authors encoded a set of encoder-stressful test signals using two different digital audio encoders. Various bitrates were used in order to cover many types of audio coding artifacts. Listening tests were conducted against the reference signals. Differential signals were created from pairs of reference and processed signals. A set of objective parameters of the differential signals was analyzed in order to find relationships between those parameters and the listening test results. Conclusions were drawn from the dependencies found.
Engineering Brief 512

EB04-5 360° Binaural Room Impulse Response (BRIR) Database for 6DOF Spatial Perception Research
Bogdan Ioan Bacila, University of Huddersfield - Huddersfield, West Yorkshire, UK; Hyunkook Lee, University of Huddersfield - Huddersfield, UK
This engineering brief presents an open-access database of 360° binaural room impulse responses (BRIRs) captured in a reverberant concert hall. Head-rotated BRIRs were acquired with 3.6° angular resolution for each of 13 different receiver positions, using a custom-made head-rotation system that was automated and integrated with the Huddersfield Acoustical Analysis Research Toolbox. The BRIRs are provided in the SOFA format. The library also contains impulse responses captured using a first-order Ambisonic microphone and an omnidirectional microphone. The database can be downloaded through the Resource section of the APL website. It is expected that the database will be useful for studying the perception of spatial attributes in a six degrees-of-freedom context.
Engineering Brief 513

EB04-6 Vertical Localization of Noise Bands Pairs by Time-Separation and Frequency Separation
Tao Zhang, Tokyo University of the Arts - Tokyo, Japan; Toru Kamekawa, Tokyo University of the Arts - Adachi-ku, Tokyo, Japan; Atsushi Marui, Tokyo University of the Arts - Tokyo, Japan
This study investigates how the vertical localization of low-frequency sound is released from the dominance of high-frequency sound. Four different noise band pairs were used. Each consists of a low-frequency band and a high-frequency band with different cutoff frequencies, and the low-frequency band was reproduced with a delay of 25 ms to 200 ms relative to the high-frequency band. These stimuli were presented from one of five full-range speakers set at different elevations in the median plane. The speakers cover an upper vertical angle of 30 degrees and a lower vertical angle of 30 degrees. The subjects reported both the vertical sound image localization and the sound image width of the low-frequency band and the high-frequency band, respectively. As a result, while the high-frequency band was localized at the actual reproduction position, the low-frequency band tended to be localized differently depending on the reproduction position. This tendency of the low-frequency band depending on the reproduction position was shown: in 30°, 15°, and 30°. As the delay time increased, the offset (deviation from the reproducing position) also increased, but the direction of deviation was different.
Engineering Brief 514

EB04-7 Development of a 4-pi Sampling Reverberator, VSVerb—Application to In-Game Sounds
Masataka Nakahara, ONFUTURE Ltd. - Tokyo, Japan; SONA Corp. - Tokyo, Japan; Akira Omoto, Kyushu University - Fukuoka, Japan; Onfuture Ltd. - Tokyo, Japan; Yasuhiko Nagatomo, Evixar Inc. - Tokyo, Japan
The authors develop a 4-pi sampling reverberator named “VSVerb.” The VSVerb restores a 4-pi reverberant field by using information about dominant reflections, i.e., dominant virtual sound sources, captured in a target space. The distances, amplitudes, and locations of the virtual sound sources are detected from x, y, z sound intensities measured at the site, and they are translated into time responses. The reverb effect generated by the VSVerb provides high flexibility in sound design for post-production work. In order to verify its practicability, reverb effects at several positions in a virtual room of a video game are re-generated from VSVerb data sampled at one position in a real room. The re-generated reverbs for in-game sounds are implemented in a Dolby Atmos compliant production flow, and their spatial impressions and sound qualities are checked by listening.
Engineering Brief 515


EB05 - Loudspeakers and Assistive Technologies

Friday, March 22, 16:00 — 17:00 (Meeting Room 3)

Chair: Kirsten Hermes, University of Westminster - London, UK

EB05-1 Comparison of Horn Drivers’ Nonlinear Distortion Measured by Different Methods
Alexander Voishvillo, JBL/Harman Professional Solutions - Northridge, CA, USA; Balazs Kakonyi, Harman Professional Solutions - Northridge, CA, USA; Brian McLaughlin, Harman Professional Solutions - Northridge, CA, USA
Multitone and log-sweep testing signals at progressively increasing levels were applied to a horn driver to obtain a nonlinear response. Musical signals were also applied to the driver. The acoustical signal was received at the throat of the horn and at a 1-meter distance from the horn in a 2-Pi anechoic chamber. The levels of the applied signals were incremented in 3 dB voltage steps. The initial horn driver response was corrected to provide maximum flatness and passed through a high-pass filter. Auralization examples and graphic material are demonstrated. The next stage of the research will involve subjective listening tests with signals obtained from measurements and from nonlinear models of the horn driver.
Engineering Brief 516

EB05-2 New Engineering Method for Design and Optimization of Phasing Plug and Dome-Shaped Compression Chamber of Horn Drivers
Alexander Voishvillo, JBL/Harman Professional Solutions - Northridge, CA, USA
In this work an accurate analytical solution is found for the sound field in a dome-shaped compression chamber. This simplifies the design and optimization of the compression chamber’s annular exits to suppress high-frequency air resonances. In earlier works by other authors, the solution is also found in spherical coordinates. For low-curvature chambers, an approximation in the form of a Bessel function summation was used. For high-curvature compression chambers an analytical approximation did not work and FEA had to be used. The newly proposed method is based on the Mehler-Dirichlet integral representation of Legendre functions. This approach handles high-curvature dome chambers and does not require numerical methods. The applicability of the new method to chambers with various curvatures was evaluated.
Engineering Brief 517

EB05-3 Conception of an IP-based Broadcast Infrastructure Considering SMPTE 2110 Using the Example of an Transmedia Teleshopping TV Station
Norbert Wilinski, HSE24 - Ismaning, Germany
A future-proof broadcast infrastructure is critical to the success of a live TV channel. In addition to the widespread TV formats, content must also be produced for e-commerce and social media; this is especially true for teleshopping. The production environment has to be designed flexibly in order to adapt it quickly to changing market conditions. The simultaneous production for TV and social media must be just as possible as the successive work for various media, if possible without loss of time due to technical modifications. In addition, it is particularly important in teleshopping to make the course of the programs very variable and to adapt to the interaction of the audience. There is no way around an IP-based solution, but for reasons of transmission security, conventional approaches must continue to be taken into account. SMPTE 2110, Professional Media Over Managed IP Networks, introduces a solution specifically designed for live broadcasting. A concept is presented that further develops a classically designed broadcast infrastructure based on IP.
Engineering Brief 518

EB05-4 Education and Assistive Technology for Blind and Visually Impaired Musicians, Presenters or Authors of Radio Plays and Broadcast Programs
Marcus C. Diess, StudioGuard - Salzburg, Austria
Many visually impaired people are predestined to work in any field of audio production, in particular musicians with technical ambitions, authors, and presenters. Blind people often have outstanding analytic hearing, which they use to orient themselves in space, but they require special teaching methods and assistive aids because they cannot read the screens, displays, and level meters of studio hardware and software. Taking first steps into audio work of any kind is difficult for young blind people due to missing opportunities and professional support. Audio hardware and software is far from barrier-free, but there are first successful attempts to remove barriers. Such valuable efforts and an approach to teaching methods are presented in this paper, which reports existing possibilities and appeals for increased efforts in education opportunities for the visually impaired. There is a quite remarkable number of highly professional blind musicians and sound producers, who prove that a blind person can have a career in the performing arts or music industry and take part in society as everybody else does. Methods for barrier-free audio lectures for v.i. and blind people will be presented after an introduction with some general facts about visual impairment that the reader needs for better comprehension of the topic. The intention is to encourage and empower visually impaired and blind people to claim participation in audio production, broadcast, or even the movie industry's new departments for audio description for the visually impaired. I question the status quo and offer first solutions, improvements, and research based on 30 years of practical audio work and 14 years of partly awarded work with visually impaired music and radio enthusiasts. This paper is an excerpt and cannot claim that the solutions offered are perfect, as this field of research has only just begun.
My intention is to share personal experience and ideas to encourage colleagues and audio professionals to go further and improve what's already here in that context. To keep the text fluent and comprehensible, the word “blind” stands for every kind of visual impairment (v.i.).
Engineering Brief 519


EB06 - Production and Simulation

Saturday, March 23, 11:15 — 12:30 (Meeting Room 3)

Ajin Tom, McGill University - Montreal, Quebec, Canada

EB06-1 The Effect of HRTF Individualization and Head-Tracking on Localization and Source Width Perception in VR
Hengwei Su, Tokyo University of the Arts - Tokyo, Japan; Atsushi Marui, Tokyo University of the Arts - Tokyo, Japan; Toru Kamekawa, Tokyo University of the Arts - Adachi-ku, Tokyo, Japan
In this study, the effects of head-tracking and of HRTF individualization by subjective selection on the localization and perceived width of sources processed with a widening method in VR were investigated. A localization test and a perceived-width evaluation were conducted with and without head-tracking, using individualized or non-individualized HRTFs. For the perceived-width evaluation, monophonic signals were processed by a method proposed in previous studies, which aims to create spatial extent for sound objects in binaural synthesis. According to the results, head-tracking not only improved localization accuracy in the localization test but also helped synthesized source widths to be localized more accurately. No difference in perceived width was found between the conditions.
Engineering Brief 520 (Download now)

EB06-2 Does Spectral Flatness Affect the Difficulty of the Peak Frequency Identification Task in Technical Ear Training? (Part 2)
Atsushi Marui, Tokyo University of the Arts - Tokyo, Japan; Toru Kamekawa, Tokyo University of the Arts - Adachi-ku, Tokyo, Japan
Technical ear training is a method to improve the ability to focus on a specific sound attribute and to communicate using the vocabulary and units shared in the industry. In designing a successful course at an educational institution for sound engineers, it is essential to increase the task difficulty gradually. In this e-Brief the authors report the relation between spectral envelope modifications of the music excerpts and the resulting objective scores during training, with variations in the music excerpts and difficulty levels.
Engineering Brief 521 (Download now)
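
The title of EB06-2 refers to spectral flatness, but the abstract does not say how the authors compute it. As a hedged illustration only, the sketch below assumes the common definition (ratio of the geometric mean to the arithmetic mean of the power spectrum); the function name `spectral_flatness` and the `eps` guard are hypothetical choices, not taken from the brief.

```python
import numpy as np

def spectral_flatness(x, eps=1e-12):
    """Geometric mean / arithmetic mean of the power spectrum of x.

    Assumed textbook definition, not necessarily the one used in the brief.
    eps avoids log(0) for bins with no energy.
    """
    power = np.abs(np.fft.rfft(x)) ** 2 + eps
    geometric_mean = np.exp(np.mean(np.log(power)))
    arithmetic_mean = np.mean(power)
    return geometric_mean / arithmetic_mean
```

Under this definition a unit impulse (flat spectrum) scores close to 1, while a pure tone (a single spectral peak) scores close to 0.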

EB06-3 Generalized Image Source Method as a Region-to-Region Transfer Function
Thushara Abhayapala, Australian National University - Canberra, ACT, Australia; Prasanga Samarasinghe, Australian National University - Canberra, Australia
Sound propagation inside reverberant rooms is an important research topic because of its impact on a plethora of applications, especially in spatial audio. Room transfer function (RTF) simulation methods play a major role in testing new algorithms before implementing them. Allen and Berkley’s image source method for simulating room transfer functions has recently been generalized [1] to incorporate arbitrary source and receiver directivity patterns. In this paper we further illustrate potential applications of the generalized model for recording, reproducing, and manipulating spatial audio within reverberant rooms. We provide (i) detailed mathematical equations of the generalized image source method, (ii) its interpretation as a model for region-to-region transfer functions, and (iii) an outline of potential applications of the model.
Engineering Brief 522 (Download now)
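
For orientation, the classical image source idea that EB06-3 builds on can be sketched as follows. This is a minimal sketch of the basic shoebox-room method with an omnidirectional point source and receiver and one uniform wall reflection coefficient; it is not the generalized directivity model of the brief. The function name `image_source_rir` and all parameters (`beta`, `max_index`, `fs`, `c`, `n_taps`) are hypothetical choices for illustration.

```python
import itertools
import numpy as np

def image_source_rir(room, src, rcv, beta=0.8, max_index=2,
                     fs=16000, c=343.0, n_taps=2048):
    """Room impulse response of a shoebox room via mirrored image sources.

    room: (Lx, Ly, Lz) in meters; src, rcv: positions inside the room.
    beta: reflection coefficient shared by all six walls (assumption).
    """
    room, src, rcv = (np.asarray(v, dtype=float) for v in (room, src, rcv))
    h = np.zeros(n_taps)
    indices = range(-max_index, max_index + 1)
    for n in itertools.product(indices, repeat=3):
        for q in itertools.product((0, 1), repeat=3):
            n_arr, q_arr = np.array(n), np.array(q)
            # Image position: mirror the source (q) and shift by whole rooms (n).
            image = (1 - 2 * q_arr) * src + 2 * n_arr * room
            # Number of wall reflections along the path from this image.
            reflections = np.sum(np.abs(n_arr - q_arr) + np.abs(n_arr))
            dist = np.linalg.norm(image - rcv)
            tap = int(round(dist / c * fs))
            if tap < n_taps:
                # Spherical spreading plus loss per reflection.
                h[tap] += beta ** reflections / (4 * np.pi * dist)
    return h

h = image_source_rir(room=(5.0, 4.0, 3.0), src=(1.0, 1.0, 1.0), rcv=(3.0, 2.0, 1.5))
```

The first nonzero tap of `h` corresponds to the direct path; later taps are reflections, each attenuated by distance and by one factor of `beta` per wall bounce. Delays are rounded to whole samples here, which a more careful implementation would replace with fractional-delay interpolation.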

EB06-4 Experimenting with Lapped Transforms in Numerical Computation Libraries Using Polyphase Matrices and Strided Memory Views
Nils Werner, International Audio Laboratories Erlangen - Erlangen, Germany; Bernd Edler, International Audio Laboratories Erlangen - Erlangen, Germany
In this brief we present a framework for experimenting with lapped linear transforms in modern numerical computation libraries such as NumPy and Julia. We make use of the fact that these transforms can be represented as matrices (and oftentimes as sparse factorizations thereof), and that numerical computation libraries often support strided memory views. This strided memory view very elegantly solves the problem of processing several overlapping frames at once, while simultaneously allowing vectorization.
Engineering Brief 523 (Download now)
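
The strided-view idea described in EB06-4 can be illustrated with a short NumPy sketch. This is not the authors' framework: the helper names `overlapping_frames` and `mdct_matrix` are hypothetical, the MDCT is used only as one familiar example of a lapped transform, and the transform is written as a dense matrix rather than a sparse polyphase factorization.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def overlapping_frames(signal, frame_len, hop):
    """Return an (n_frames, frame_len) strided view of overlapping frames.

    No samples are copied: each row points into the original buffer.
    """
    windows = sliding_window_view(signal, frame_len)  # one window per sample offset
    return windows[::hop]                             # keep every hop-th window

def mdct_matrix(n):
    """Dense MDCT analysis matrix of shape (n, 2n)."""
    k = np.arange(n)[:, None]
    t = np.arange(2 * n)[None, :]
    return np.cos(np.pi / n * (t + 0.5 + n / 2) * (k + 0.5))

n = 4                                     # number of coefficients per frame
x = np.arange(16, dtype=float)            # toy input signal
frames = overlapping_frames(x, 2 * n, n)  # 50% overlap, shape (3, 8)
window = np.sin(np.pi * (np.arange(2 * n) + 0.5) / (2 * n))
# One matrix product transforms all overlapping frames at once.
coeffs = (frames * window) @ mdct_matrix(n).T
```

Because `frames` is a view, building it costs no extra memory regardless of the overlap, and the single matrix product replaces an explicit loop over frames.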

EB06-5 Withdrawn

