AES San Francisco 2012
Product Design Track Event Details

Friday, October 26, 9:00 am — 11:00 am (Room 121)

Paper Session: P1 - Amplifiers and Equipment

Chair:
Jayant Datta, THX - San Francisco, CA, USA; Syracuse University - Syracuse, NY, USA

P1-1 A Low-Voltage Low-Power Output Stage for Class-G Headphone Amplifiers—Alexandre Huffenus, EASii IC - Grenoble, France
This paper proposes a new headphone amplifier circuit architecture whose output stage can be powered from very low supply rails, from ±1.8 V down to ±0.2 V. When used inside a Class-G amplifier, with the switched-mode power supply powering the output stage, the power consumption can be significantly reduced. For a typical listening level of 2 × 100 µW, the increase in power consumption compared to idle is only 0.7 mW, instead of 2.5 mW to 3 mW for existing amplifiers. In battery-powered devices like smartphones or portable music players, this can increase battery life by more than 15% during audio playback. Theory of operation, electrical performance, and a comparison with the current state of the art will be detailed.
Convention Paper 8684
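
The power figures quoted in the P1-1 abstract can be put side by side with a few lines of arithmetic. The sketch below only uses the numbers stated in the abstract; the "incremental efficiency" ratio (delivered audio power divided by the extra supply power drawn above idle) is an illustrative metric chosen here, not one the paper is known to use.

```python
def incremental_efficiency(audio_power_mw, added_supply_power_mw):
    """Delivered audio power divided by the extra supply power drawn above idle."""
    return audio_power_mw / added_supply_power_mw


AUDIO_POWER_MW = 2 * 0.100   # 2 x 100 µW listening level, from the abstract
PROPOSED_MW = 0.7            # added consumption quoted for the proposed stage
EXISTING_MW = (2.5, 3.0)     # added consumption range quoted for existing amplifiers

print(f"proposed: {incremental_efficiency(AUDIO_POWER_MW, PROPOSED_MW):.1%}")
for p in EXISTING_MW:
    print(f"existing ({p} mW): {incremental_efficiency(AUDIO_POWER_MW, p):.1%}, "
          f"saving {p - PROPOSED_MW:.1f} mW")
```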

P1-2 Switching/Linear Hybrid Audio Power Amplifiers for Domestic Applications, Part 2: The Class-B+D Amplifier—Harry Dymond, University of Bristol - Bristol, UK; Phil Mellor, University of Bristol - Bristol, UK
The analysis and design of a series switching/linear hybrid audio power amplifier rated at 100 W into 8 Ω are presented. A high-fidelity linear stage controls the output, while the floating mid-point of the power supply for this linear stage is driven by a switching stage. This keeps the voltage across the linear stage output transistors low, enhancing efficiency. Analysis shows that the frequency responses of the linear and switching stages must be tightly matched to avoid saturation of the linear stage output transistors. The switching stage employs separate DC and AC feedback loops in order to minimize the adverse effects of the floating-supply reservoir capacitors, through which the switching stage output current must flow.
Convention Paper 8685

P1-3 Investigating the Benefit of Silicon Carbide for a Class D Power Stage—Verena Grifone Fuchs, University of Siegen - Siegen, Germany; CAMCO GmbH - Wenden, Germany; Carsten Wegner, University of Siegen - Siegen, Germany; CAMCO GmbH - Wenden, Germany; Sebastian Neuser, University of Siegen - Siegen, Germany; Dietmar Ehrhardt, University of Siegen - Siegen, Germany
This paper analyzes how silicon carbide transistors reduce the switching errors and losses associated with the power stage. A silicon carbide power stage and a conventional power stage with super-junction devices are compared in terms of switching behavior. Experimental results for switching transitions, delay times, and harmonic distortion, as well as a theoretical evaluation, are presented. By correcting imperfections of the power stage, silicon carbide transistors show high potential for Class D audio amplification.
Convention Paper 8686

P1-4 Efficiency Optimization of Class G Amplifiers: Impact of the Input Signals—Patrice Russo, Lyon Institute of Nanotechnology - Lyon, France; Gael Pillonnet, University of Lyon - Lyon, France; CPE dept; Nacer Abouchi, Lyon Institute of Nanotechnology - Lyon, France; Sophie Taupin, STMicroelectronics, Inc. - Grenoble, France; Frederic Goutti, STMicroelectronics, Inc. - Grenoble, France
Class G amplifiers are an effective solution to increase the audio efficiency for headphone applications, but realistic operating conditions have to be taken into account to predict and optimize power efficiency. In fact, power supply tracking, which is a key factor for high efficiency, is poorly optimized with the classical design method because the stimulus used is very different from a real audio signal. Here, a methodology is proposed to find class G nominal conditions. By using relevant stimuli and nominal output power, the simulation and test of the class G amplifier are closer to real conditions. Moreover, a novel simulator is used to quickly evaluate the efficiency with these long-duration stimuli, i.e., ten seconds instead of a few milliseconds. This allows longer transient simulation for an accurate efficiency and audio quality evaluation by averaging the class G behavior. Based on this simulator, this paper indicates the limitations of the well-established test setup. Real efficiencies vary by up to ±50% from those predicted by the classical methods. Finally, the study underlines the need to use real audio signals to optimize the supply voltage tracking of class G amplifiers in order to achieve maximal efficiency in nominal operation.
Convention Paper 8687

Friday, October 26, 9:00 am — 11:00 am (Room 123)

Product Design: PD1 - Audio DSP Requirements for Tomorrow's Mobile & Portable Devices

Presenters:
Bob Adams, ADI
Juha Backman, Nokia Corporation - Espoo, Finland
Howard Brown, IDT
Peter Eastty, Oxford Digital Limited - Oxford, UK
Alan Kramer, SRS Labs
Cyril Martin, Research in Motion - Germany

Abstract:
As the convergence of communications, entertainment, and computing races ahead, largely centered on portable and mobile devices where form factors are shrinking and style wins out over practicality of design in some instances, the challenges in delivering the audio DSP to provide good sound and differentiated effects are discussed by a panel of experts representing semiconductor manufacturers, mobile/portable device manufacturers, and DSP IP providers.

Friday, October 26, 9:00 am — 10:30 am (Room 122)

Paper Session: P2 - Networked Audio

Chair:
Ellen Juhlin, Meyer Sound - Berkeley, CA, USA; AVnu Alliance

P2-1 Audio Latency Masking in Music Telepresence Using Artificial Reverberation—Ren Gang, University of Rochester - Rochester, NY, USA; Samarth Shivaswamy, University of Rochester - Rochester, NY, USA; Stephen Roessner, University of Rochester - Rochester, NY, USA; Akshay Rao, University of Rochester - Rochester, NY, USA; Dave Headlam, University of Rochester - Rochester, NY, USA; Mark F. Bocko, University of Rochester - Rochester, NY, USA
Network latency poses significant challenges in music telepresence systems designed to enable multiple musicians at different locations to perform together in real-time. Since each musician hears a delayed version of the performance from the other musicians it is difficult to maintain synchronization and there is a natural tendency for the musicians to slow their tempo while awaiting response from their fellow performers. We asked if the introduction of artificial reverberation can enable musicians to better tolerate latency by conducting experiments with performers where the degree of latency was controllable and for which artificial reverberation could be added or not. Both objective and subjective evaluation of ensemble performances were conducted to evaluate the perceptual responses at different experimental settings.
Convention Paper 8688

P2-2 Service Discovery Using Open Sound Control—Andrew Eales, Wellington Institute of Technology - Wellington, New Zealand; Rhodes University - Grahamstown, South Africa; Richard Foss, Rhodes University - Grahamstown, Eastern Cape, South Africa
The Open Sound Control (OSC) control protocol does not have service discovery capabilities. The approach to adding service discovery to OSC proposed in this paper uses the OSC address space to represent services within the context of a logical device model. This model allows services to be represented in a context-sensitive manner by relating parameters representing services to the logical organization of a device. Implementation of service discovery is done using standard OSC messages and requires that the OSC address space be designed to support these messages. This paper illustrates how these enhancements to OSC allow a device to advertise its services. Controller applications can then explore the device’s address space to discover services and retrieve the services required by the application.
Convention Paper 8689

P2-3 Flexilink: A Unified Low Latency Network Architecture for Multichannel Live Audio—Yonghao Wang, Birmingham City University - Birmingham, UK; John Grant, Nine Tiles Networks Ltd. - Cambridge, UK; Jeremy Foss, Birmingham City University - Birmingham, UK
The networking of live audio for professional applications typically uses layer 2-based solutions such as AES50 and MADI utilizing fixed time slots similar to Time Division Multiplexing (TDM). However, these solutions are not effective for best effort traffic where data traffic utilizes available bandwidth and is consequently subject to variations in QoS. There are audio networking methods such as AES47, which is based on asynchronous transfer mode (ATM), but ATM equipment is rarely available. Audio can also be sent over Internet Protocol (IP), but the size of the packet headers and the difficulty of keeping latency within acceptable limits make it unsuitable for many applications. In this paper we propose a new unified low latency network architecture that supports both time deterministic and best effort traffic toward full bandwidth utilization with high performance routing/switching. For live audio, this network architecture allows low latency as well as the flexibility to support multiplexing multiple channels with different sampling rates and word lengths.
Convention Paper 8690

Friday, October 26, 10:00 am — 11:30 am (Foyer)

Poster: P3 - Audio Effects and Physical Modeling

P3-1 Luciverb: Iterated Convolution for the Impatient—Jonathan S. Abel, Stanford University - Stanford, CA, USA; Michael J. Wilson, Stanford University - Stanford, CA, USA
An analysis of iteratively applied room acoustics used by Alvin Lucier to create his piece "I'm Sitting in a Room" is presented, and a real-time system allowing interactive control over the number of rooms in the processing chain is described. Lucier anticipated that repeated application of a room response would bring out room resonances and smear the input sound over time. What was unexpected was the character of the smearing, turning a transient input into a sequence of crescendos at the room modes, ordered from high-frequency to low-frequency. Here, a room impulse response convolved with itself L times is shown to have energy at the room modes, each with a roughly Gaussian envelope, peaking at the observed L/2 times the frequency-dependent decay time.
Convention Paper 8691
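
The iterated convolution described in P3-1 can be illustrated with a toy example. The sketch below is not the authors' real-time system: it assumes a hypothetical two-mode "room" impulse response with made-up frequencies and decay factors, convolves it with itself in plain Python, and compares the spectral magnitude at the two modes after each pass. Since each pass multiplies the spectrum by the room response once more, the mode with the larger gain should dominate increasingly.

```python
import cmath
import math


def convolve(a, b):
    """Direct-form linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def iterate_room(h, passes):
    """Return h convolved with itself so that the room is applied `passes` times."""
    y = list(h)
    for _ in range(passes - 1):
        y = convolve(y, h)
    return y


def magnitude_at(x, freq_hz, fs):
    """Magnitude of the discrete-time Fourier transform of x at one frequency."""
    w = 2.0 * math.pi * freq_hz / fs
    return abs(sum(v * cmath.exp(-1j * w * n) for n, v in enumerate(x)))


# Hypothetical toy "room": two modes, values chosen only for illustration.
FS = 8000.0
N = 400
MODE_A = (500.0, 0.99)   # (frequency in Hz, per-sample decay factor): slow decay
MODE_B = (1500.0, 0.97)  # faster decay
h = [sum(a ** n * math.cos(2.0 * math.pi * f * n / FS) for f, a in (MODE_A, MODE_B))
     for n in range(N)]

for passes in (1, 2, 3):
    y = iterate_room(h, passes)
    ratio = magnitude_at(y, MODE_A[0], FS) / magnitude_at(y, MODE_B[0], FS)
    print(f"{passes} pass(es): slow-mode / fast-mode magnitude ratio = {ratio:.1f}")
```

A practical system would use FFT-based convolution; the direct form is used here only to keep the example dependency-free.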

P3-2 A Tilt Filter in a Servo Loop—John Lazzaro, University of California, Berkeley - Berkeley, CA, USA; John Wawrzynek, University of California, Berkeley - Berkeley, CA, USA
Tone controls based on the tilt filter first appeared in 1982, in the Quad 34 Hi-Fi preamp. More recently, tilt filters have found a home in specialist audio processors such as the Elysia mpressor. This paper describes a novel dynamic filter design based on a tilt filter. A control system sets the tilt slope of the filter, in order to servo the spectral median of the filter output to a user-specified target. Users also specify a tracking time. Potential applications include single-instrument processing (in the spirit of envelope filters) and mastering (for subtle control of tonal balance). Although we have prototyped the design as an AudioUnit plug-in, the architecture is also a good match for analog circuit implementation.
Convention Paper 8692

P3-3 Multitrack Mixing Using a Model of Loudness and Partial Loudness—Dominic Ward, Birmingham City University - Birmingham, UK; Joshua D. Reiss, Queen Mary University of London - London, UK; Cham Athwal, Birmingham City University - Birmingham, UK
A method for generating a mix of multitrack recordings using an auditory model has been developed. The proposed method is based on the concept that a balanced mix is one in which the loudness of all instruments is equal. A sophisticated psychoacoustic loudness model is used to measure the loudness of each track both in quiet and when mixed with any combination of the remaining tracks. Such measures are used to control the track gains in a time-varying manner. Finally, we demonstrate how model predictions of partial loudness can be used to counteract energetic masking for any track, allowing the user to achieve better channel intelligibility in complex music mixtures.
Convention Paper 8693

P3-4 Predicting the Fluctuation Strength of the Output of a Spatial Chorus Effects Processor—William L. Martens, University of Sydney - Sydney, NSW, Australia; Robert W. Taylor, University of Sydney - Sydney, NSW, Australia; Luis Miranda, University of Sydney - Sydney, NSW, Australia
The experimental study reported in this paper was motivated by an exploration of a set of related audio effects comprising what has been called “spatial chorus.” In contrast to a single-output, delay-modulation-based effects processor that produces a limited range of results, complex spatial imagery is produced when parallel processing channels are subjected to incoherent delay modulation. In order to develop a more adequate user interface for control of such “spatial chorus” effects processing, a systematic investigation of the relationship between algorithmic parameters and perceptual attributes was undertaken. The starting point for this investigation was to perceptually scale the amount of modulation present in a set of characteristic stimuli in terms of the auditory attribute that Fastl and Zwicker called “fluctuation strength.”
Convention Paper 8694

P3-5 Computer-Aided Estimation of the Athenian Agora Aulos Scales Based on Physical Modeling—Areti Andreopoulou, New York University - New York, NY, USA; Agnieszka Roginska, New York University - New York, NY, USA
This paper presents an approach to scale estimation for the ancient Greek Aulos with the use of physical modeling. The system is based on manipulation of a parameter set that is known to affect the sound of woodwind instruments, such as the reed type, the active length of the pipe, its inner and outer diameters, and the placement and size of the tone-holes. The method is applied to a single Aulos pipe reconstructed from the Athenian Agora fragments. A discussion follows on the resulting scales and the system’s advantages and limitations.
Convention Paper 8695

P3-6 A Computational Acoustic Model of the Coupled Interior Architecture of Ancient Chavín—Regina E. Collecchia, Stanford University - Stanford, CA, USA; Miriam A. Kolar, Stanford University - Stanford, CA, USA; Jonathan S. Abel, Stanford University - Stanford, CA, USA
We present a physical, modular computational acoustic model of the well-preserved interior architecture at the 3,000-year-old Andean ceremonial center Chavín de Huántar. Our previous model prototype [Kolar et al. 2010] translated the acoustically coupled topology of Chavín gallery forms to a model based on digital waveguides (bi-directional by definition), representing passageways, connected through reverberant scattering junctions, representing the larger room-like areas. Our new approach treats all architectural units as “reverberant” digital waveguides, with scattering junctions at the discrete planes defining the unit boundaries. In this extensible and efficient lumped-element model, we combine architectural dimensional and material data with sparsely measured impulse responses to simulate multiple and circulating arrival paths between sound sources and listeners.
Convention Paper 8696

P3-7 Simulating an Asymmetrically Saturated Nonlinearity Using an LNLNL Cascade—Keun Sup Lee, DTS, Inc. - Los Gatos, CA, USA; Jonathan S. Abel, Stanford University - Stanford, CA, USA
The modeling of a weakly nonlinear system having an asymmetric saturating nonlinearity is considered, and a computationally efficient model is proposed. The nonlinear model is the cascade of linear filters and memoryless nonlinearities, an LNLNL system. The two nonlinearities are upward and downward saturators, limiting, respectively, the amplitude of their input for either positive or negative excursions. In this way, distortion noted in each half of an input sinusoid can be separately controlled. This simple model is applied to simulating the signal chain of the Echoplex EP-4 tape delay, where informal listening tests showed excellent agreement between recorded and simulated program material.
Convention Paper 8697
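
The LNLNL structure in P3-7 (linear filter, nonlinearity, linear filter, nonlinearity, linear filter) can be sketched as follows. This is a minimal illustration under stated assumptions, not the model from the paper: the one-pole lowpass filters, the tanh-shaped saturators, and all parameter values are hypothetical stand-ins chosen for brevity.

```python
import math


def one_pole_lowpass(x, coeff):
    """Linear stage (L): y[n] = (1 - coeff) * x[n] + coeff * y[n-1]."""
    y, state = [], 0.0
    for v in x:
        state = (1.0 - coeff) * v + coeff * state
        y.append(state)
    return y


def upward_saturator(x, limit):
    """Memoryless stage (N): soft-limits positive excursions, passes negative ones."""
    return [limit * math.tanh(v / limit) if v > 0.0 else v for v in x]


def downward_saturator(x, limit):
    """Memoryless stage (N): soft-limits negative excursions, passes positive ones."""
    return [limit * math.tanh(v / limit) if v < 0.0 else v for v in x]


def lnlnl(x, pos_limit, neg_limit, coeffs=(0.1, 0.1, 0.1)):
    """Cascade L -> N(up) -> L -> N(down) -> L with assumed filter coefficients."""
    y = one_pole_lowpass(x, coeffs[0])
    y = upward_saturator(y, pos_limit)
    y = one_pole_lowpass(y, coeffs[1])
    y = downward_saturator(y, neg_limit)
    return one_pole_lowpass(y, coeffs[2])


# A 100 Hz sinusoid at an assumed 8 kHz sample rate, clipped harder on top.
FS = 8000.0
sine = [math.sin(2.0 * math.pi * 100.0 * n / FS) for n in range(800)]
out = lnlnl(sine, pos_limit=0.3, neg_limit=2.0)
print(f"positive peak {max(out):.3f}, negative peak {min(out):.3f}")
```

Because each saturator acts on only one polarity, the two limits set the distortion of the positive and negative half-cycles independently.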

P3-8 Coefficient Interpolation for the Max Mathews Phasor Filter—Dana Massie, Audience, Inc. - Mountain View, CA, USA
Max Mathews described what he named the “phasor filter,” which is a flexible building block for computer music, with many desirable properties. It can be used as an oscillator or a filter, or a hybrid of both. There exist analysis methods to derive synthesis parameters for filter banks based on the phasor filter, for percussive sounds. The phasor filter can be viewed as a complex multiply, or as a rotation and scaling of a 2-element vector, or as a real valued MIMO (multiple-input, multiple-output) 2nd order filter with excellent numeric properties (low noise gain). In addition, it has been proven that the phasor filter is unconditionally stable under time varying parameter modifications, which is not true of many common filter topologies. A disadvantage of the phasor filter is the cost of calculating the coefficients, which requires a sine and cosine in the general case. If pre-calculated coefficients are interpolated using linear interpolation, then the poles follow a trajectory that causes the filter to lose resonance. A method is described to interpolate coefficients using a complex multiplication that preserves the filter resonance.
Convention Paper 8698
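
The resonance loss mentioned in P3-8 can be seen numerically. In the sketch below, the phasor filter is taken to be the one-pole complex recursion y[n] = c·y[n-1] + x[n], and the "multiplicative" path is one plausible reading of interpolating with a complex multiplication; it is an assumption for illustration and may differ from the method in the paper. The pole radius, frequencies, and sample rate are hypothetical.

```python
import cmath
import math


def pole(radius, freq_hz, fs):
    """Complex coefficient with the given radius and angle 2*pi*f/fs."""
    return cmath.rect(radius, 2.0 * math.pi * freq_hz / fs)


def phasor_filter(x, coeffs):
    """Run y[n] = c[n] * y[n-1] + x[n] with a possibly time-varying coefficient."""
    y, state = [], 0j
    for v, c in zip(x, coeffs):
        state = c * state + v
        y.append(state)
    return y


def linear_path(c0, c1, steps):
    """Linearly interpolated coefficients: the chord between c0 and c1."""
    return [c0 + (c1 - c0) * k / steps for k in range(steps + 1)]


def multiplicative_path(c0, c1, steps):
    """Step from c0 to c1 by repeated multiplication with a fixed complex ratio."""
    ratio = (c1 / c0) ** (1.0 / steps)  # one root per segment (principal branch)
    path, c = [], c0
    for _ in range(steps + 1):
        path.append(c)
        c *= ratio                      # one complex multiply per coefficient
    return path


c0 = pole(0.999, 500.0, 48000.0)
c1 = pole(0.999, 4000.0, 48000.0)
lin = linear_path(c0, c1, 64)
mul = multiplicative_path(c0, c1, 64)
print(f"smallest pole radius, linear path:         {min(abs(c) for c in lin):.4f}")
print(f"smallest pole radius, multiplicative path: {min(abs(c) for c in mul):.4f}")
```

The linear path cuts across the inside of the circle, so the pole radius dips mid-way and the filter rings for a shorter time; the multiplicative path stays on the circle when both endpoints share a radius. The principal-branch root assumes the two angles differ by less than π.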

P3-9 The Dynamic Redistribution of Spectral Energies for Upmixing and Re-Animation of Recorded Audio—Christopher J. Keyes, Hong Kong Baptist University - Kowloon, Hong Kong
This paper details a novel approach to upmixing any n channels of audio to any arbitrary n+ channels of audio using frequency-domain processing to dynamically redistribute spectral energies across however many channels of audio are available. Although primarily an upmixing technique, the process may also help the recorded audio regain the sense of “liveliness” that one encounters in concerts of acoustic music, partially mimicking the effects of sound spectra being redistributed throughout a hall due to the dynamically changing radiation patterns of the instruments and the movements of the instruments themselves, during performance and recording. Preliminary listening tests reveal listeners prefer this technique 3 to 1 over a more standard upmixing technique.
Convention Paper 8699

P3-10 Matching Artificial Reverb Settings to Unknown Room Recordings: A Recommendation System for Reverb Plugins—Nils Peters, International Computer Science Institute - Berkeley, CA, USA; University of California Berkeley - Berkeley, CA, USA; Jaeyoung Choi, International Computer Science Institute - Berkeley, CA, USA; Howard Lei, International Computer Science Institute - Berkeley, CA, USA
For creating artificial room impressions, numerous reverb plugins exist and are often controllable by many parameters. To efficiently create a desired room impression, the sound engineer must be familiar with all the available reverb setting possibilities. Although plugins are usually equipped with many factory presets for exploring available reverb options, it is a time-consuming learning process to find the ideal reverb settings to create the desired room impression, especially if various reverberation plugins are available. For creating a desired room impression based on a reference audio sample, we present a method to automatically determine the best matching reverb preset across different reverb plugins. Our method uses a supervised machine-learning approach and can dramatically reduce the time spent on the reverb selection process.
Convention Paper 8700

Friday, October 26, 10:45 am — 12:30 pm (Room 132)

Product Design: PD2 - AVB Networking for Product Designers

Chair:
Rob Silfvast, Avid - Mountain View, CA, USA
Panelists:
John Bergen, Marvell
Jeff Koftinoff, Meyer Sound Canada - Vernon, BC, Canada
Morten Lave, TC Applied Technologies - Toronto, ON, Canada
Lee Minich, Lab X Technologies - Rochester, NY, USA
Matthew Mora, Chair IEEE 1722.1 - Pleasanton, CA, USA
Dave Olsen, Harman International
Michael Johas Teener, Broadcom - Santa Cruz, CA, USA

Abstract:
This session will cover the essential technical aspects of Audio Video Bridging technology and how it can be deployed in products to support standards-based networked connectivity. AVB is an open IEEE standard and therefore promises low cost and wide interoperability among products that leverage the technology. Speakers from several different companies will share insights on their experiences deploying AVB in real products. The panelists will also compare and contrast the open-standards approach of AVB with proprietary audio-over-Ethernet technologies.

Friday, October 26, 11:00 am — 12:30 pm (Room 123)

Game Audio: G1 - A Whole World in Your Hands: New Techniques in Generative Audio Bring Entire Game Worlds into the Realms of Mobile Platforms

Presenter:
Stephan Schütze

Abstract:
"We can't have good audio; there is not enough memory on our target platform." This is a comment heard far too often, especially considering it's incorrect. Current technology already allows for complex and effective audio environments to be made with limited platform resources when developed correctly, but we are just around the corner from an audio revolution.

The next generation of tools being developed for audio creation and implementation will allow large and complex audio environments to be created using minimal amounts of resources. While the new software apps being developed are obviously an important part of this coming revolution it is the techniques, designs, and overall attitudes to audio production that will be the critical factors in successfully creating the next era of sound environments.

This presentation will break down and discuss this new methodology independent of the technology and demonstrate some simple concepts that can be used to develop a new approach to sound design. All the material presented in this talk will benefit development on current and next gen consoles as much as development for mobile devices.

Friday, October 26, 11:15 am — 12:15 pm (Room 121)

Historical: H1 - Popular Misconceptions about Magnetic Recording History and Theory—Things You May Have Missed over the Past 85 Years

Presenters:
Jay McKnight, Magnetic Reference Lab - Cupertino, CA, USA
Jeffrey McKnight, Creativity, Inc. - San Carlos, CA, USA

Abstract:
• Who really discovered AC Bias? The four groups that re-discovered it around 1940, including one person you've probably never heard of.
• How does AC bias actually work?
• Is a recording properly described by "surface induction"?
• The story of the "effective" gap length of a reproducing head, and a correction to Westmijze’s Gap Loss Theory.
• Does Wallace’s “Thickness Loss” properly describe the wavelength response?
• Does disconnecting the erasing head make a quieter recording?

Friday, October 26, 12:30 pm — 2:00 pm (Room 134)

Special Event: Opening Ceremonies
Awards
Keynote Speech

Presenter:
Bob Moses, AES - Vashon Island, WA, USA
Keynote Speaker:
Steve Lillywhite, CBE, UK

Abstract:
Listen With Your Ears, And Not Your Eyes
Multi-Platinum record producer (and Commander of The British Empire/CBE recipient) Steve Lillywhite, has collaborated with stars ranging from The Rolling Stones, U2, Peter Gabriel, Morrissey, Counting Crows, and The Pogues to The Killers, Dave Matthews Band, and Thirty Seconds to Mars. Over the past thirty years, he has made an indelible impact on contemporary music, and he continues to hone the razor edge with innovative new projects. His AES Keynote address will focus on the importance of “studio culture,” and on inspiring and managing the creative process. He will also stress the importance of embracing new technology while avoiding the trap of becoming enslaved by it. Steve Lillywhite's studio philosophy emphasizes the axiom “Listen with your ears and not your eyes.”

Friday, October 26, 2:00 pm — 4:00 pm (Room 133)

Workshop: W3 - What Every Sound Engineer Should Know about the Voice

Chair:
Eddy B. Brixen, EBB-consult - Smorum, Denmark
Panelists:
Henrik Kjelin, Complete Vocal Institute - Denmark
Cathrine Sadolin, Complete Vocal Institute - Denmark

Abstract:
The purpose of this workshop is to teach sound engineers how to listen to the voice before they even think of microphone selection and knob-turning. The presentation and demonstrations are based on the "Complete Vocal Technique" (CVT), whose foundation is the classification of all human voice sounds into one of four vocal modes named Neutral, Curbing, Overdrive, and Edge. The classification is used by professional singers within all musical styles and has over a period of 20 years proved easy to grasp both in real-life situations and in auditive and visual tests (sound examples and laryngeal images/Laryngograph waveforms). These vocal modes are found in the speaking voice as well. Cathrine Sadolin, the developer of CVT, will involve the audience in this workshop, while explaining and demonstrating how to work with the modes in practice to achieve any sound and solve many different voice problems like unintentional vocal breaks, too much or too little volume, hoarseness, and much more. The physical aspects of the voice will be explained and laryngograph waveforms and analyses will be demonstrated by Henrik Kjelin. Eddy Brixen will demonstrate measurements for the detection of the vocal modes and explain essential parameters in the recording chain, especially the microphone, to ensure reliable and natural recordings.

Friday, October 26, 2:00 pm — 6:00 pm (Room 121)

Paper Session: P5 - Measurement and Models

Chair:
Louis Fielder, Dolby - San Francisco, CA, USA

P5-1 Measurement of Harmonic Distortion Audibility Using a Simplified Psychoacoustic Model—Steve Temme, Listen, Inc. - Boston, MA, USA; Pascal Brunet, Listen, Inc. - Boston, MA, USA; Parastoo Qarabaqi, Listen, Inc. - Boston, MA, USA
A perceptual method is proposed for measuring harmonic distortion audibility. This method is similar to the CLEAR (Cepstral Loudness Enhanced Algorithm for Rub & buzz) algorithm previously proposed by the authors as a means of detecting audible Rub & Buzz, which is an extreme type of distortion [1,2]. Both methods are based on the Perceptual Evaluation of Audio Quality (PEAQ) standard [3]. In the present work, in order to estimate the audibility of regular harmonic distortion, additional psychoacoustic variables are added to the CLEAR algorithm. These variables are then combined using an artificial neural network approach to derive a metric that is indicative of the overall audible harmonic distortion. Experimental results on headphones are presented to justify the accuracy of the model.
Convention Paper 8704

P5-2 Overview and Comparison of and Guide to Audio Measurement Methods—Gregor Schmidle, NTi Audio AG - Schaan, Liechtenstein; Danilo Zanatta, NTi Audio AG - Schaan, Liechtenstein
Modern audio analyzers offer a large number of measurement functions using various measurement methods. This paper categorizes measurement methods from several perspectives. The underlying signal processing concepts, as well as strengths and weaknesses of the most popular methods are listed and assessed for various aspects. The reader is offered guidance for choosing the optimal measurement method based on the specific requirements and application.
Convention Paper 8705

P5-3 Spherical Sound Source for Acoustic Measurements—Plamen Valtchev, Univox - Sofia, Bulgaria; Dimitar Dimitrov, BMS Production; Rumen Artarski, Thrax - Sofia, Bulgaria
A spherical sound source for acoustic measurements is proposed, consisting of a pair of coaxial loudspeakers and a pair of compression drivers radiating into a common radially expanding horn over the full 360-degree horizontal plane. This horn’s vertical radiation pattern is defined by the enclosures of the LF arrangement. The LF membranes radiate the 50 to 500 Hz band spherically, whereas their HF components complete the horn’s ellipsoid-like horizontal reference diagram to a spherical one in both vertical directions. The assembly has axial symmetry and thus a perfect horizontal polar pattern. The vertical pattern is well within ISO 3382 specifications, even without any “gliding.” Comparative measurements against a purpose-built typical dodecahedron revealed superior directivity, sound power capability, and distortion performance.
Convention Paper 8706

P5-4 Low Frequency Noise Reduction by Synchronous Averaging under Asynchronous Measurement System in Real Sound Field—Takuma Suzuki, Etani Electronics Co., Ltd. - Ohta-ku, Tokyo, Japan; Hiroshi Koide, Etani Electronics Co., Ltd. - Ohta-ku, Tokyo, Japan; Akihiko Shoji, Etani Electronics Co., Ltd. - Ohta-ku, Tokyo, Japan; Kouichi Tsuchiya, Etani Electronics Co., Ltd. - Ohta-ku, Tokyo, Japan; Tomohiko Endo, Etani Electronics Co., Ltd. - Ohta-ku, Tokyo, Japan; Shokichiro Hino, Etani Electronics Co., Ltd. - Ohta-ku, Tokyo, Japan
An important requirement in synchronous averaging is the synchronization of the sampling clocks between the transmitting and receiving devices (e.g., D/A and A/D converters). However, when the devices are placed apart, synchronization becomes difficult to achieve. For such circumstances, an effective method is proposed that enables synchronization in an asynchronous measurement environment. Normally, a swept-sine is employed as a measuring signal, but because its power spectrum is flat, the signal-to-noise ratio (SNR) is decreased in a real environment with high levels of low frequency noise. To solve this, the devised method adopts the means of “enhancing the signal source power in low frequencies” and “placing random fluctuations in the repetitive period of signal source.” Subsequently, its practicability was verified.
Convention Paper 8707
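
The basic benefit of synchronous averaging referred to in P5-4 can be sketched in a few lines. The example below covers only the idealized, sample-synchronous case with white Gaussian noise; it does not implement the paper's asynchronous synchronization scheme, its low-frequency emphasis, or its randomized repetition period. The period length, repetition count, and noise level are hypothetical.

```python
import math
import random


def synchronous_average(recording, period, repetitions):
    """Average `repetitions` consecutive periods of a recording, sample by sample."""
    return [sum(recording[r * period + n] for r in range(repetitions)) / repetitions
            for n in range(period)]


def rms(x):
    """Root-mean-square value of a sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))


random.seed(0)  # fixed seed so the run is repeatable
PERIOD, REPS, NOISE_STD = 256, 64, 1.0
signal = [math.sin(2.0 * math.pi * 4 * n / PERIOD) for n in range(PERIOD)]
recording = [signal[n % PERIOD] + random.gauss(0.0, NOISE_STD)
             for n in range(PERIOD * REPS)]

averaged = synchronous_average(recording, PERIOD, REPS)
noise_single = rms([recording[n] - signal[n] for n in range(PERIOD)])
noise_avg = rms([averaged[n] - signal[n] for n in range(PERIOD)])
print(f"SNR gain: {20.0 * math.log10(noise_single / noise_avg):.1f} dB "
      f"(ideal for uncorrelated noise: {10.0 * math.log10(REPS):.1f} dB)")
```

The repeated signal adds coherently while uncorrelated noise does not, so averaging N periods improves the SNR by about 10·log10(N) dB. If the playback and capture clocks drift, the periods no longer line up sample by sample and this gain is lost, which is the situation the abstract addresses.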

P5-5 Measurement and Analysis of the Spectral Directivity of an Electric Guitar Amplifier: Vertical Plane—Agnieszka Roginska, New York University - New York, NY, USA; Justin Mathew, New York University - New York, NY, USA; Andrew Madden, New York University - New York, NY, USA; Jim Anderson, New York University - New York, NY, USA; Alex U. Case, fermata audio + acoustics - Portsmouth, NH, USA; University of Massachusetts—Lowell - Lowell, MA, USA
Previous work presented the radiation pattern measurement of an electric guitar amplifier densely sampled spatially on a 3-D grid. Results were presented of the directionally dependent spectral features on-axis with the driver, as a function of left/right position, and distance. This paper examines the directionally dependent features of the amplifier measured at the center of the amplifier, in relationship to the height and distance placement of the microphone. Differences between acoustically measured and estimated frequency responses are used to study the change in the acoustic field. This work results in a better understanding of the spectral directivity of the electric guitar amplifier in all three planes.
Convention Paper 8708

P5-6 The Radiation Characteristics of a Horizontally Asymmetrical Waveguide that Utilizes a Continuous Arc Diffraction Slot—Soichiro Hayashi, Bose Corporation - Framingham, MA, USA; Akira Mochimaru, Bose Corporation - Framingham, MA, USA; Paul F. Fidlin, Bose Corporation - Framingham, MA, USA
One of the unique requirements for sound reinforcement speaker systems is the need for flexible coverage control—sometimes this requires an asymmetrical pattern. Vertical control can be achieved by arraying sound sources, but in the horizontal plane, a horizontally asymmetrical waveguide may be the best solution. In this paper the radiation characteristics of horizontally asymmetrical waveguides with continuous arc diffraction slots are discussed. Waveguides with several different angular variations are developed and their radiation characteristics are measured. Symmetrical and asymmetrical waveguides are compared, and the controllable frequency range and limitations are discussed.
Convention Paper 8709 (Purchase now)

P5-7 Analysis on Multiple Scattering between the Rigid-Spherical Microphone Array and Nearby Surface in Sound Field Recording—Guangzheng Yu, South China University of Technology - Guangzhou, Guangdong, China; Bo-sun Xie, South China University of Technology - Guangzhou, China; Yu Liu, South China University of Technology - Guangzhou, China
Sound field recording with a rigid spherical microphone array (RSMA) is a newly developed technique. In room sound field recording, when an RSMA is close to a reflective surface, such as a wall or floor, multiple scattering occurs between the RSMA and the surface and causes errors in the recorded signals. Based on the mirror-image principle of acoustics, an equivalent two-sphere model is suggested, and the multipole expansion method is applied to analyze the multiple scattering between the RSMA and the reflective surface. Using an RSMA with 50 microphones, the error in the RSMA output signals caused by multiple scattering is analyzed as a function of frequency, direction of the incident plane wave, and distance of the RSMA from the reflective surface.
Convention Paper 8710 (Purchase now)

P5-8 Calibration of Soundfield Microphones Using the Diffuse-Field Response—Aaron Heller, SRI International - Menlo Park, CA, USA; Eric M. Benjamin, Surround Research - Pacifica, CA, USA
The soundfield microphone utilizes an array of microphones to derive various components of the sound field to be recorded or measured. Given that at high frequencies the response varies with the angle of incidence, it may be argued that any angle of incidence is as important as another, and thus it is important to achieve a calibration that achieves an optimum perceived response characteristic. Gerzon noted that “Above a limiting frequency F ≈ c/(πr) [. . .] it is found best to equalise the nominal omni and figure-of-eight outputs for an approximately flat response to homogeneous random sound fields.” In practice, however, soundfield microphones have been calibrated to realize a flat axial response. The present work explores the theoretical ramifications of using a diffuse-field equalization target as opposed to a free-field equalization target and provides two practical examples of diffuse-field equalization of tetrahedral microphone arrays.
Convention Paper 8711 (Purchase now)
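
As a rough numerical illustration of the limiting frequency quoted from Gerzon in the P5-8 abstract, the sketch below evaluates F ≈ c/(πr) for a few array radii. The speed of sound (343 m/s) and the example radii are assumptions chosen for illustration; they are not values taken from the paper.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C


def gerzon_limit_hz(radius_m, c=SPEED_OF_SOUND_M_S):
    """Approximate limiting frequency F = c / (pi * r), in Hz."""
    if radius_m <= 0:
        raise ValueError("radius must be positive")
    return c / (math.pi * radius_m)


if __name__ == "__main__":
    # Hypothetical array radii in millimetres, for illustration only.
    for radius_mm in (10.0, 15.0, 20.0):
        f_hz = gerzon_limit_hz(radius_mm / 1000.0)
        print(f"r = {radius_mm:4.1f} mm -> F ~ {f_hz / 1000.0:.1f} kHz")
```

Under these assumptions the limit falls well inside the audio band for centimetre-scale arrays, which is why the choice of equalization target above that frequency matters.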

Friday, October 26, 3:00 pm — 4:30 pm (Foyer)

Poster: P7 - Amplifiers, Transducers, and Equipment

P7-1 Evaluation of t_rr Distorting Effects Reduction in DCI-NPC Multilevel Power Amplifiers by Using SiC Diodes and MOSFET Technologies—Vicent Sala, UPC-Universitat Politecnica de Catalunya - Terrassa, Catalunya, Spain; Tomas Resano, Jr., UPC-Universitat Politecnica de Catalunya - Terrassa, Catalunya, Spain; MCIA Research Center; Jose Luis Romeral, UPC-Universitat Politecnica de Catalunya - Terrassa, Catalunya, Spain; Jose Manuel Moreno, UPC-Universitat Politecnica de Catalunya - Terrassa, Catalunya, Spain
In the last decade, power amplifier applications have used multilevel diode-clamped-inverter or neutral-point-clamped (DCI-NPC) topologies to achieve very low distortion at high power. Much research has been done to reduce the sources of distortion in DCI-NPC topologies. One of the most important, and least studied, sources of distortion is the reverse recovery time (t_rr) of the clamp diodes and MOSFET parasitic diodes. Today, with the emergence of Silicon Carbide (SiC) technologies, these sources of distortion can be minimized. This paper presents a comparative study and evaluation of the distortion generated by different combinations of diodes and MOSFETs with Si and SiC technologies in a DCI-NPC multilevel power amplifier, with the aim of reducing the distortions generated by the non-idealities of the semiconductor devices.
Convention Paper 8720 (Purchase now)

P7-2 New Strategy to Minimize Dead-Time Distortion in DCI-NPC Power Amplifiers Using COE-Error Injection—Tomas Resano, Jr., UPC-Universitat Politecnica de Catalunya - Terrassa, Catalunya, Spain; MCIA Research Center; Vicent Sala, UPC-Universitat Politecnica de Catalunya - Terrassa, Catalunya, Spain; Jose Luis Romeral, UPC-Universitat Politecnica de Catalunya - Terrassa, Catalunya, Spain; Jose Manuel Moreno, UPC-Universitat Politecnica de Catalunya - Terrassa, Catalunya, Spain
The DCI-NPC topology has become one of the best options for optimizing energy efficiency in high-power, high-quality amplifiers. It can use an analog PWM modulator, which is prone to generating distortion or error, mainly for two reasons: Carriers Amplitude Error (CAE) and Carriers Offset Error (COE). Another main source of error and distortion in the system is the dead time (td). Dead time is necessary to guarantee proper operation of the power amplifier stage, so the errors and distortions it causes are unavoidable. This work proposes negative COE generation to minimize the distortion effects of td. Simulation and experimental results validate this strategy.
Convention Paper 8721 (Purchase now)

P7-3 Further Testing and Newer Methods in Evaluating Amplifiers for Induced Phase and Frequency Modulation via Tones, Amplitude Modulated Signals, and Pulsed Waveforms—Ronald Quan, Ron Quan Designs - Cupertino, CA, USA
This paper will present further investigations from AES Convention Paper 8194 that studied induced FM distortions in audio amplifiers. Amplitude modulated (AM) signals are used for investigating frequency shifts of the AM carrier signal with different modulation frequencies. A square-wave and sine-wave TIM test signal is used to evaluate FM distortions at the fundamental frequency and harmonics of the square wave. Newer amplifiers are tested for FM distortion with a large-level low-frequency signal inducing FM distortion on a small-level high-frequency signal. In particular, amplifiers with low and higher open-loop bandwidths are tested for differential phase and FM distortion as the frequency of the large-level signal is increased from 1 kHz to 2 kHz.
Convention Paper 8722 (Purchase now)

P7-4 Coupling Lumped and Boundary Element Methods Using Superposition—Joerg Panzer, R&D Team - Salgen, Germany
Both the Lumped Element Method and the Boundary Element Method are powerful tools for simulating electroacoustic systems. Each can have its preferred domain of application within the system to be modeled. For example, the Lumped Element Method is practical for electronics, simple mechanics, and internal acoustics. The Boundary Element Method, on the other hand, shows its strength in acoustic-field calculations, such as diffraction, reflection, and radiation impedance problems. Coupling both methods allows the total system to be investigated. This paper describes a method for fully coupling the rigid body mode of the Lumped Element Method to the Boundary Element Method with the help of self- and mutual radiation impedance components, using the superposition principle. As a result, the coupling approach has the convenient property of a high degree of independence between the two domains. For example, one can modify parameters and even, to some extent, change the structure of the lumped-element network without having to re-solve the boundary element system. This paper gives the mathematical derivation and a demonstration example that compares calculation results with measurement. In this example the electronics and mechanics of the three loudspeakers involved are modeled with the lumped element method. Waveguide, enclosure, and radiation are modeled with the boundary element method.
Convention Paper 8723 (Purchase now)

P7-5 Study of the Interaction between Radiating Systems in a Coaxial Loudspeaker—Alejandro Espi, Acústica Beyma - Valencia, Spain; William A. Cárdenas, Sr., University of Alicante - Alicante, Spain; Jose Martinez, Acustica Beyma S.L. - Moncada (Valencia), Spain; Jaime Ramis, University of Alicante - Alicante, Spain; Jesus Carbajo, University of Alicante - Alicante, Spain
In this work the procedure followed to study the interaction between the mid- and high-frequency radiating systems of a coaxial loudspeaker is explained. For this purpose a numerical finite element model was implemented. To fit the model, an experimental prototype was built and a set of experimental measurements was carried out, among them electrical impedance and pressure frequency response in an anechoic plane wave tube. To take into account the displacement-dependent nonlinearities, a parametric analysis over different input voltages was performed, and the internal acoustic impedance was computed numerically in the frequency domain for specific phase plug geometries. By inversely transforming to a time-domain differential equation scheme, a lumped element equivalent circuit was obtained to evaluate the mutual acoustic load effect present in this type of acoustically coupled system. Additionally, the crossover frequency range was analyzed using the Near Field Acoustic Holography technique.
Convention Paper 8724 (Purchase now)

P7-6 Flexible Acoustic Transducer from Dielectric-Compound Elastomer Film—Takehiro Sugimoto, NHK Science & Technology Research Laboratories - Setagaya-ku, Tokyo, Japan; Tokyo Institute of Technology - Midori-ku, Yokohama, Japan; Kazuho Ono, NHK Science & Technology Research Laboratories - Setagaya-ku, Tokyo, Japan; Akio Ando, NHK Science & Technology Research Laboratories - Setagaya-ku, Tokyo, Japan; Hiroyuki Okubo, NHK Science & Technology Research Laboratories - Setagaya-ku, Tokyo, Japan; Kentaro Nakamura, Tokyo Institute of Technology - Midori-ku, Yokohama, Japan
To increase the sound pressure level of a flexible acoustic transducer from a dielectric elastomer film, this paper proposes compounding various kinds of dielectrics into a polyurethane elastomer, which is the base material of the transducer. The studied dielectric elastomer film utilizes a change in side length derived from the electrostriction for sound generation. The proposed method was conceived from the fact that the amount of dimensional change depends on the relative dielectric constant of the elastomer. Acoustical measurements demonstrated that the proposed method was effective because the sound pressure level increased by 6 dB at the maximum.
Convention Paper 8725 (Purchase now)

P7-7 A Digitally Driven Speaker System Using Direct Digital Spread Spectrum Technology to Reduce EMI Noise—Masayuki Yashiro, Hosei University - Koganei, Tokyo, Japan; Mitsuhiro Iwaide, Hosei University - Koganei, Tokyo, Japan; Akira Yasuda, Hosei University - Koganei, Tokyo, Japan; Michitaka Yoshino, Hosei University - Koganei, Tokyo, Japan; Kazuyki Yokota, Hosei University - Koganei, Tokyo, Japan; Yugo Moriyasu, Hosei University - Koganei, Tokyo, Japan; Kenji Sakuda, Hosei University - Koganei, Tokyo, Japan; Fumiaki Nakashima, Hosei University - Koganei, Tokyo, Japan
In this paper a novel digital direct-driven speaker that incorporates a spread spectrum clock generator to reduce electromagnetic interference is proposed. The driving signal of a loudspeaker, which has a large spectral component at a specific frequency, interferes with nearby equipment because the driving signal emits electromagnetic waves. The proposed method switches between two clock frequencies according to the clock selection signal generated by a pseudo-noise circuit. The noise performance deterioration caused by the clock frequency switching can be reduced by the proposed modified delta-sigma modulator (DSM), which changes the coefficients of the DSM according to the width of the clock period. The proposed method can reduce out-of-band noise by 10 dB compared to the conventional method.
Convention Paper 8726 (Purchase now)

P7-8 Automatic Speaker Delay Adjustment System Using Wireless Audio Capability of ZigBee Networks—Jaeho Choi, Seoul National University - Seoul, Korea; Myoung woo Nam, Seoul National University - Seoul, Korea; Kyogu Lee, Seoul National University - Seoul, Korea
The IEEE 802.15.4 (ZigBee) standard is a low data rate, low power consumption, low cost, flexible network system that uses a wireless networking protocol for automation and remote control applications. This paper applies these characteristics to a wireless speaker delay compensation system in a large venue (a hall with over 500 seats). Traditionally, delay adjustment has been done manually by sound engineers, but the suggested system will be able to analyze the delay of the front speaker's sound at the rear speaker automatically and apply an appropriate delay time to the rear speakers. This paper investigates the feasibility of adjusting the wireless speaker delay over the above-mentioned ZigBee network. We present an implementation of ZigBee audio transmission and an LBS (Location-Based Service) application that allows calculation of a speaker delay time.
Convention Paper 8727 (Purchase now)
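
The delay compensation described in the P7-8 abstract rests on a simple relation: sound from the front speaker takes distance divided by the speed of sound to reach the rear speaker position. The sketch below is a minimal illustration of that arithmetic only. The function names, the 343 m/s speed of sound, and the 48 kHz sample rate are assumptions, and nothing here reflects the authors' ZigBee or LBS implementation.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C


def rear_speaker_delay_ms(front_to_rear_distance_m, c=SPEED_OF_SOUND_M_S):
    """Delay (ms) to apply to a rear speaker so it aligns with the front speaker's sound."""
    if front_to_rear_distance_m < 0:
        raise ValueError("distance must be non-negative")
    return 1000.0 * front_to_rear_distance_m / c


def rear_speaker_delay_samples(front_to_rear_distance_m, sample_rate_hz=48000,
                               c=SPEED_OF_SOUND_M_S):
    """Same delay expressed as a whole number of samples (sample rate is an assumption)."""
    if front_to_rear_distance_m < 0:
        raise ValueError("distance must be non-negative")
    return round(front_to_rear_distance_m / c * sample_rate_hz)


if __name__ == "__main__":
    distance_m = 34.3  # hypothetical front-to-rear spacing
    print(rear_speaker_delay_ms(distance_m), "ms")
    print(rear_speaker_delay_samples(distance_m), "samples")
```

In practice an engineer may add a few milliseconds on top of the geometric delay so that the front speakers are perceived as the source; that offset is a mixing choice and is left out of this sketch.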

P7-9 A Second-Order Soundfield Microphone with Improved Polar Pattern Shape—Eric M. Benjamin, Surround Research - Pacifica, CA, USA
The soundfield microphone is a compact tetrahedral array of four figure-of-eight microphones yielding four coincident virtual microphones: one omnidirectional and three orthogonal pressure gradient microphones. As described by Gerzon, above a limiting frequency approximated by fc = πc/r, the virtual microphones become progressively contaminated by higher-order spherical harmonics. To improve the high-frequency performance, either the array size must be substantially reduced or a new array geometry must be found. In the present work an array having nominally octahedral geometry is described. It samples the spherical harmonics in a natural way and yields horizontal virtual microphones up to second order having excellent horizontal polar patterns up to 20 kHz.
Convention Paper 8728 (Purchase now)

P7-10 Period Deviation Tolerance Templates: A Novel Approach to Evaluation and Specification of Self-Synchronizing Audio Converters—Francis Legray, Dolphin Integration - Meylan, France; Thierry Heeb, Digimath - Sainte-Croix, Switzerland; SUPSI, ICIMSI - Manno, Switzerland; Sebastien Genevey, Dolphin Integration - Meylan, France; Hugo Kuo, Dolphin Integration - Meylan, France
Self-synchronizing converters represent an elegant and cost effective solution for audio functionality integration into SoC (System-on-Chip) as they integrate both conversion and clock synchronization functionalities. Audio performance of such converters is, however, very dependent on the jitter rejection capabilities of the synchronization system. A methodology based on two period deviation tolerance templates is described for evaluating such synchronization solutions, prior to any silicon measurements. It is also a unique way for specifying expected performance of a synchronization system in the presence of jitter on the audio interface. The proposed methodology is applied to a self-synchronizing audio converter and its advantages are illustrated by both simulation and measurement results.
Convention Paper 8729 (Purchase now)

P7-11 Loudspeaker Localization Based on Audio Watermarking—Florian Kolbeck, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Giovanni Del Galdo, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Iwona Sobieraj, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Tobias Bliem, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany
Localizing the positions of loudspeakers can be useful in a variety of applications, above all the calibration of a home theater setup. For this aim, several existing approaches employ a microphone array and specifically designed signals to be played back by the loudspeakers, such as sine sweeps or maximum length sequences. While these systems achieve good localization accuracy, they are unsuitable for those applications in which the end-user should not be made aware that the localization is taking place. This contribution proposes a system that fulfills these requirements by employing an inaudible watermark to carry out the localization. The watermark is specifically designed to work in reverberant environments. Results from realistic simulations confirm the practicability of the proposed system.
Convention Paper 8730 (Purchase now)

Friday, October 26, 4:00 pm — 6:00 pm (Room 123)

Network Audio: N1 - Error-Tolerant Audio Coding

Chair:
David Trainor, CSR - Belfast, Northern Ireland, UK
Panelists:
Bernhard Grill, Fraunhofer IIS - Erlangen, Germany
Deepen Sinha, ATC Labs - Newark, NJ, USA
Gary Spittle, Dolby Laboratories - San Francisco, CA, USA

Abstract:
Two important and observable trends are the increasing delivery of real-time audio services over the Internet or cellular network and the implementation of audio networking throughout a residence, office, or studio using wireless technologies. This approach to distributing audio content is convenient, ubiquitous, and can be relatively inexpensive. However, the nature of these networks is such that their capacity and reliability for real-time audio streaming can vary considerably with time and environment. Therefore, error-tolerant audio coding techniques have an important role to play in maintaining audio quality for relevant applications. This workshop will discuss the capabilities of error-tolerant audio coding algorithms and recent advances in the state of the art.

Friday, October 26, 4:00 pm — 6:00 pm (Room 130)

Product Design: PD3 - Don't Make Your Product a Noise Nightmare

Presenter:
William E. Whitlock, Jensen Transformers, Inc. - Chatsworth, CA, USA; Whitlock Consulting - Oxnard, CA, USA

Abstract:
Audio systems that operate on AC power will experience interference in the form of ground voltage differences, magnetic fields, and electric fields. Although, theoretically, balanced interfaces are immune to such interference, realizing high immunity in real-world, mass-produced equipment is not trivial. Designers who use ordinary balanced input stages fail to recognize the critical importance of very high common-mode impedances, which are the advantage of transformers. Because legacy test methods don’t account for signal sources, most modern audio gear has poor immunity in real-world systems. The new IEC test for CMRR is explained as well as excellent alternatives to ordinary input stages. Other ubiquitous design errors, such as the “pin 1 problem” and the “power-line prima donna” syndrome are described as well as avoidance measures.

Friday, October 26, 4:00 pm — 5:30 pm (Room 120)

Live Sound Seminar: LS3 - Practical Application of Audio Networking for Live Sound

Chair:
Kevin Kimmel, Yamaha Commercial Audio - Fullerton, CA, USA
Panelists:
Steve Seable, Yamaha Commercial Audio - Fullerton, CA, USA
Steve Smoot, Yamaha Commercial Audio - Fullerton, CA, USA
Kieran Walsh, Audinate Pty. Ltd. - Ultimo, NSW, Australia

Abstract:
This panel will focus on the use of several audio networking technologies, including A-Net, Dante, EtherSound, Cobranet, Optocore, Rocknet, and AVnu AVB and their deployment in live sound applications. Network panelists will be industry professionals who have experience working with various network formats.

Saturday, October 27, 9:00 am — 10:00 am (Room 120)

Product Design: PD4 - Audio in HTML 5

Presenters:
Jeff Essex, AudioSyncrasy
Jory K. Prum, studio.jory.org - Fairfax, CA, USA

Abstract:
HTML 5 is coming. Many expect it to supplant Flash as an online rich media player, as Apple has made abundantly clear. But audio support is slow in coming, and there are currently marked differences between browsers. From an audio content standpoint, it's the Nineties all over again. The W3C's Audio Working Group is developing standards, but this is a fast-moving target. This talk will provide an update on what's working and what isn't.

Saturday, October 27, 9:00 am — 12:30 pm (Room 121)

Paper Session: P8 - Emerging Audio Technologies

Chair:
Agnieszka Roginska, New York University - New York, NY, USA

P8-1 A Method for Enhancement of Background Sounds in Forensic Audio Recordings—Robert C. Maher, Montana State University - Bozeman, MT, USA
A method for suppressing speech while retaining background sound is presented in this paper. The procedure is useful for audio forensics investigations in which a strong foreground sound source or conversation obscures subtle background sounds or utterances that may be important to the investigation. The procedure uses a sinusoidal speech model to represent the strong foreground signal and then performs a synchronous subtraction to isolate the background sounds that are not well-modeled as part of the speech signal, thereby enhancing the audibility of the background material.
Convention Paper 8731 (Purchase now)

P8-2 Transient Room Acoustics Using a 2.5 Dimensional Approach—Patrick Macey, Pacsys Ltd. - Nottingham, UK
Cavity modes of a finite acoustic domain with rigid boundaries can be used to compute the transient response for a point source excitation. Previous work, considering steady state analysis, showed that for a room of constant height the 3-D modes can be computed very rapidly by computing the 2-D cross section modes. An alternative to a transient modal approach is suggested, using a trigonometric expansion of the pressure through the height. Both methods are much faster than 3-D FEM but the trigonometric series approach is more easily able to include realistic damping. The accuracy of approximating an “almost constant height” room to be constant height is investigated by example.
Convention Paper 8732 (Purchase now)

P8-3 Multimodal Information Management: Evaluation of Auditory and Haptic Cues for NextGen Communication Displays—Durand Begault, Human Systems Integration Division, NASA Ames Research Center - Moffett Field, CA, USA; Rachel M. Bittner, New York University - New York, NY, USA; Mark R. Anderson, Dell Systems, NASA Ames Research Center - Moffett Field, CA, USA
Auditory communication displays within the NextGen data link system may use multiple synthetic speech messages replacing traditional air traffic control and company communications. The design of an interface for selecting among multiple incoming messages can impact both performance (time to select, audit, and release a message) and preference. Two design factors were evaluated: physical pressure-sensitive switches versus flat panel “virtual switches,” and the presence or absence of auditory feedback from switch contact. Performance with stimuli using physical switches was 1.2 s faster than virtual switches (2.0 s vs. 3.2 s); auditory feedback provided a 0.54 s performance advantage (2.33 s vs. 2.87 s). There was no interaction between these variables. Preference data were highly correlated with performance.
Convention Paper 8733 (Purchase now)

P8-4 Prototype Spatial Auditory Display for Remote Planetary Exploration—Elizabeth M. Wenzel, NASA-Ames Research Center - Moffett Field, CA, USA; Martine Godfroy, NASA Ames Research Center - Moffett Field, CA, USA; San Jose State University Foundation; Joel D. Miller, Dell Systems, NASA Ames Research Center - Moffett Field, CA, USA
During Extra-Vehicular Activities (EVA), astronauts must maintain situational awareness (SA) of a number of spatially distributed "entities" such as other team members (human and robotic), rovers, and a lander/habitat or other safe havens. These entities are often outside the immediate field of view and visual resources are needed for other task demands. Recent work at NASA Ames has focused on experimental evaluation of a spatial audio augmented-reality display for tele-robotic planetary exploration on Mars. Studies compared response time and accuracy performance with different types of displays for aiding orientation during exploration: a spatial auditory orientation aid, a 2-D visual orientation aid, and a combined auditory-visual orientation aid under a number of degraded vs. nondegraded visual conditions. The data support the hypothesis that the presence of spatial auditory cueing enhances performance compared to a 2-D visual aid, particularly under degraded visual conditions.
Convention Paper 8734 (Purchase now)

P8-5 The Influence of 2-D and 3-D Video Playback on the Perceived Quality of Spatial Audio Rendering for Headphones—Amir Iljazovic, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Florian Leschka, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Bernhard Neugebauer, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Jan Plogsties, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany
Algorithms for processing of spatial audio are becoming more attractive for practical applications as multichannel formats and processing power on playback devices enable more advanced rendering techniques. In this study the influence of the visual context on the perceived audio quality is investigated. Three groups of 15 listeners are presented with audio-only, audio with 2-D video, and audio with 3-D video content. The 5.1 channel audio material is processed for headphones using different commercial spatial rendering techniques. Results indicate a preference for spatial audio processing over a downmix to conventional stereo, with the effect being larger in the presence of 3-D video content. Also, the influence of video on perceived audio quality is significant for 2-D and 3-D video presentation.
Convention Paper 8735 (Purchase now)

P8-6 An Autonomous System for Multitrack Stereo Pan Positioning—Stuart Mansbridge, Queen Mary University of London - London, UK; Saorise Finn, Queen Mary University of London - London, UK; Birmingham City University - Birmingham, UK; Joshua D. Reiss, Queen Mary University of London - London, UK
A real-time system for automating stereo panning positions for a multitrack mix is presented. Real-time feature extraction of loudness and frequency content, constrained rules, and cross-adaptive processing are used to emulate the decisions of a sound engineer, and pan positions are updated continuously to provide spectral and spatial balance with changes in the active tracks. As such, the system is designed to be highly versatile and suitable for a wide number of applications, including both live sound and post-production. A real-time, multitrack C++ VST plug-in version has been developed. A detailed evaluation of the system is given, where formal listening tests compare the system against professional and amateur mixes from a variety of genres.
Convention Paper 8736 (Purchase now)

P8-7 DReaM: A Novel System for Joint Source Separation and Multitrack Coding—Sylvain Marchand, University of Western Brittany - Brest, France; Roland Badeau, Telecom ParisTech - Paris, France; Cléo Baras, GIPSA-Lab - Grenoble, France; Laurent Daudet, University Paris Diderot - Paris, France; Dominique Fourer, University Bordeaux - Talence, France; Laurent Girin, GIPSA-Lab - Grenoble, France; Stanislaw Gorlow, University of Bordeaux - Talence, France; Antoine Liutkus, Telecom ParisTech - Paris, France; Jonathan Pinel, GIPSA-Lab - Grenoble, France; Gaël Richard, Telecom ParisTech - Paris, France; Nicolas Sturmel, GIPSA-Lab - Grenoble, France; Shuhua Zang, GIPSA-Lab - Grenoble, France
Active listening consists in interacting with the music playing, has numerous applications from pedagogy to gaming, and involves advanced remixing processes such as generalized karaoke or respatialization. To get this new freedom, one might use the individual tracks that compose the mix. While multitrack formats lose backward compatibility with popular stereo formats and increase the file size, classic source separation from the stereo mix is not of sufficient quality. We propose a coder / decoder scheme for informed source separation. The coder determines the information necessary to recover the tracks and embeds it inaudibly in the mix, which is stereo and has a size comparable to the original. The decoder enhances the source separation with this information, enabling active listening.
Convention Paper 8737 (Purchase now)

Saturday, October 27, 10:00 am — 11:30 am (Foyer)

Engineering Brief: EB1 - eBrief Presentations—Posters 1

EB1-1 Accuracy of ITU-R BS.1770 Algorithm in Evaluating Multitrack Material—Pedro Duarte Pestana, CITAR-UCP - Almada, Portugal; CEAUL-FCUL; Alvaro Barbosa, Universidade Lisboa - Lisbon, Portugal
Loudness measurement that is computationally efficient and applicable to digital material regardless of listening level is a very important feature for automatic mixing. Recent work on broadcast specifications of loudness (ITU-R BS.1770) has gained broad acceptance and seems a likely candidate for extension to multitrack material, though the original design did not have this kind of development in mind. Some empirical observations have suggested that certain types of individual source materials are not evaluated properly by the ITU’s algorithm. In this paper a subjective test is presented that tries to shed some light on the subject.
Engineering Brief 53 (Download now)
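
For orientation, the ungated loudness measure defined in ITU-R BS.1770 can be summarized as below. This is a summary of the Recommendation, not of the engineering brief: y_i is the K-weighted signal of channel i, z_i is its mean square over the measurement interval T, and G_i is the channel weight (1.0 for left, right, and center; about 1.41 for the surround channels). How a single track of a multitrack session should be mapped onto these channels is not specified here and is part of what the brief examines.

```latex
z_i = \frac{1}{T} \int_0^T y_i^2(t)\, dt ,
\qquad
L_K = -0.691 + 10 \log_{10} \left( \sum_i G_i \, z_i \right) \ \text{LKFS}
```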

EB1-2 An Online Resource for the Subjective Comparison of Vocal Microphones—Bradford Swanson, University of Massachusetts - Lowell - Lowell, MA, USA
Forty-eight microphones were gathered into small groups and tested using four vocalists (two male, two female). The recorded results are collected online so users may subjectively compare a single performance on closely related microphones.
Engineering Brief 54 (Download now)

EB1-3 Perception of Distance and the Effect on Sound Recording Distance Suitability for a 3-D or 2-D Image—Luiz Fernando Kruszielski, Tokyo University of the Arts - Adachi-ku, Tokyo, Japan; Toru Kamekawa, Tokyo University of the Arts - Tokyo, Japan; Atsushi Marui, Tokyo University of the Arts - Adachi-ku, Tokyo, Japan
Possible differences in the perception of sound caused by a 3-D image are still unclear. The aim of this research is to understand a possible difference in the perception of distance caused by the interaction of sound and a 3-D image compared to a 2-D image, and also how this could affect the suitability of the sound recording distance. Using a 3-D setup, a saxophone player was recorded at five different distances. The subjects were asked to judge the subjective distance and also the suitable sound for the presented image. The results show that one group perceived 3-D to be more distant; however, the sound suitability did not change between 3-D and 2-D.
Engineering Brief 55 (Download now)

EB1-4 Modeling Auditory-Haptic Interface Cues from an Analog Multi-line Telephone—Durand Begault, Human Systems Integration Division, NASA Ames Research Center - Moffett Field, CA, USA; Mark R. Anderson, Dell Systems, NASA Ames Research Center - Moffett Field, CA, USA; Rachel M. Bittner, New York University - New York, NY, USA
The Western Electric company produced influential multi-line telephone designs during the 1940s–1970s using a six-button interface (line selection, hold button, intercom). Its simplicity was an example of a successful human factors design. Unlike touchscreen or membrane switches used in its modern equivalents, the older multi-line telephone used raised surface mechanical buttons that provided robust tactile, haptic, and auditory cues. This multi-line telephone was used as a model for a trade study comparison of two interfaces: a touchscreen interface (iPad) versus a pressure-sensitive strain gauge button interface (Phidget USB interface controllers). This engineering brief describes how the interface logic and the visual and auditory cues of the original telephone were analyzed and then synthesized using MAX-MSP. (The experiment and results are detailed in the authors' AES 133rd convention paper "Multimodal Information Management: Evaluation of Auditory and Haptic Cues for NextGen Communication Displays").
Engineering Brief 56 (Download now)

EB1-5 Tailoring Practice Room Acoustics to Student Needs—Scott R. Burgess, Central Michigan University - Mt. Pleasant, MI, USA
A crucial part of any music education facility is the student practice rooms. While these rooms typically vary in size, the acoustic treatments often take a "one size fits all" approach. This can lead to student dissatisfaction and a lack of rooms that are suitable to some instruments. The School of Music at Central Michigan University surveyed our students and created a variety of acoustic environments based on the results of this survey. This presentation will discuss this process and the results of the follow-up survey, which indicates an improvement in student satisfaction, along with suggestions for further study.
Engineering Brief 57 (Download now)

EB1-6 Acoustic Properties of Small Practice Rooms Where Musicians Can Practice Contentedly: Effect of Reverberation on Practice—Ritsuko Tsuchikura, SONA Corp. - Nakano-ku, Tokyo, Japan; Masataka Nakahara, ONFUTURE Ltd. - Tokyo, Japan; SONA Corp. - Tokyo, Japan; Takashi Mikami, SONA Co. - Tokyo, Japan; Toru Kamekawa, Tokyo University of the Arts - Tokyo, Japan; Atsushi Marui, Tokyo University of the Arts - Adachi-ku, Tokyo, Japan
This paper describes the results of a study on practice room acoustics, focusing on how satisfied players are with the acoustical conditions. Two different factors are found to be involved when musicians evaluate the acoustics of practice rooms: "comfort in practice" and "comfort in performance." Further evaluation of the two factors shows that "comfort in practice" and "comfort in performance" have different desired reverberation times. The average absorption coefficients, therefore, are estimated. Though the experiments were carried out on several kinds of instruments, this paper describes the results of experiments involving trumpeters and violinists.
Engineering Brief 58 (Download now)
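
Note on EB1-6: the brief does not say how the average absorption coefficients are estimated from the desired reverberation times. One common way to make that step is Sabine's equation, T60 = 0.161 V / (S * alpha), solved for alpha. The sketch below assumes Sabine's equation applies (it is less accurate in small, strongly damped rooms), and the function name and room dimensions are hypothetical, not taken from the brief.

```python
def sabine_average_absorption(volume_m3, surface_m2, rt60_s):
    """Average absorption coefficient implied by a target reverberation time.

    Assumes Sabine's equation T60 = 0.161 * V / (S * alpha) in metric units.
    """
    if volume_m3 <= 0 or surface_m2 <= 0 or rt60_s <= 0:
        raise ValueError("volume, surface area, and RT60 must be positive")
    return 0.161 * volume_m3 / (surface_m2 * rt60_s)


# Hypothetical 3 m x 4 m x 2.5 m practice room.
length, width, height = 3.0, 4.0, 2.5
volume = length * width * height
surface = 2 * (length * width + length * height + width * height)

# A shorter target reverberation time calls for more absorption.
alpha_dry = sabine_average_absorption(volume, surface, 0.3)
alpha_live = sabine_average_absorption(volume, surface, 0.6)
```

Halving the target reverberation time doubles the required average absorption, which is why two different comfort factors would lead to two different treatment targets for the same room.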

EB1-7 Bellamy Baffle Array: A Multichannel Recording Technique to Improve Listener Envelopment—Steven Bellamy, Humber College - Toronto, ON, Canada
The paper outlines a 6-microphone technique that makes use of a baffle between front and rear arrays. This addresses three common challenges in multichannel recording for 5.1 channel playback. First, to improve the sense of connectedness between LS/RS, L/LS and R/RS channel pairs. Second, to maintain clarity of the direct sound while allowing for strong levels of room sound in the mix. Third, to provide a flexible system that can work well with a variety of ensembles. The result is a flexible microphone technique that yields recordings of increased clarity and envelopment.
Engineering Brief 59 (Download now)

Saturday, October 27, 11:00 am — 1:00 pm (Room 132)

Tutorial: T5 - Getting the Sound Out of (and Into) Your Head: The Practical Acoustics of Headsets

Presenter:
Christopher Struck, CJS Labs - San Francisco, CA, USA

Abstract:
The electroacoustics of headsets and other head-worn devices are presented. The Insertion Gain concept is reviewed and appropriate free and diffuse field target responses are detailed. The selection and use of appropriate instrumentation, including ear and mouth simulators, and test manikins appropriate for head-worn devices are described. Boom, close-talking, and noise-canceling microphone tests are shown and practical methods are discussed for obtaining consistent data. Relevant standards are reviewed, typical measurement examples are described, and applications of these methods to analogue, USB, and Bluetooth wireless devices are explained.

Saturday, October 27, 12:00 pm — 1:00 pm (Room 120)

Product Design: PD5 - Graphical Audio/DSP Applications Development Environment for Fixed and Floating Point Processors

Presenter:
Miguel Chavez, Analog Devices

Abstract:
Graphical development environments have been used in the audio industry for a number of years. An ideal graphical environment not only allows for algorithm development and prototyping but also facilitates the development of run-time DSP applications by producing production-ready code. This presentation will discuss how a graphical environment’s real-time control and parameter tuning make audio DSPs easy to evaluate, design, and use, resulting in a shortened development time and reduced time-to-market. It will then describe and explain the software architecture decisions and design challenges involved in developing a new and expanded development environment for audio-centric, commercially available fixed- and floating-point processors.

Saturday, October 27, 12:30 pm — 2:00 pm (Foyer)

Student / Career: SC3 - Education and Career/Job Fair

Abstract:
The combined AES 133 Education and Career Fair will match job seekers with companies and prospective students with schools.

Companies
Looking for the best and brightest minds in the audio world? No place will have more of them assembled than the 133rd Convention of the Audio Engineering Society. Companies are invited to participate in our Education and Career Fair, free of charge. This is the perfect chance to identify your ideal new hires!
All attendees of the convention, students and professionals alike, are welcome to come visit with representatives from participating companies to find out more about job and internship opportunities in the audio industry. Bring your resume!

Schools
One of the best reasons to attend AES conventions is the opportunity to make important connections with your fellow educators from around the globe. Academic Institutions offering studies in audio (from short courses to graduate degrees) will be represented in a "table top" session. Information on each school's respective programs will be made available through displays and academic guidance. There is no charge for schools/institutions to participate. Admission is free and open to all convention attendees.

Saturday, October 27, 2:00 pm — 3:30 pm (Room 123)

Network Audio: N2 - Open IP Protocols for Audio Networking

Presenter:
Kevin Gross, AVA Networks - Boulder, CO, USA

Abstract:
The networking and telecommunication industry has its own set of network protocols for carriage of audio and video over IP networks. These protocols have been widely deployed for telephony and teleconferencing applications, internet streaming, and cable television. This tutorial will acquaint attendees with these protocols and their capabilities and limitations. The relationship to AVB protocols will be discussed.

Specifically, attendees will learn about Internet protocol (IP), voice over IP (VoIP), IP television (IPTV), HTTP streaming, real-time transport protocol (RTP), real-time transport control protocol (RTCP), real-time streaming protocol (RTSP), session initiation protocol (SIP), session description protocol (SDP), Bonjour, session announcement protocol (SAP), differentiated services (DiffServ), and IEEE 1588 precision time protocol (PTP).

An overview of AES standards work, X192, adapting these protocols to high-performance audio applications will be given.
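
Note on N2: of the protocols listed in this abstract, RTP is the one that carries the audio itself. As a minimal sketch of what it adds to each packet, the code below builds and parses the 12-byte fixed header defined in RFC 3550. The function names are hypothetical, payload type 96 is just an example from the dynamic range, and header extensions and CSRC lists are left out.

```python
import struct

RTP_VERSION = 2  # version number fixed by RFC 3550


def pack_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Build the 12-byte fixed RTP header (no padding, extension, or CSRCs)."""
    if not 0 <= payload_type <= 127:
        raise ValueError("payload type must fit in 7 bits")
    byte0 = RTP_VERSION << 6  # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | payload_type
    # Network byte order: two bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    return struct.pack("!BBHII", byte0, byte1,
                       sequence & 0xFFFF,
                       timestamp & 0xFFFFFFFF,
                       ssrc & 0xFFFFFFFF)


def unpack_rtp_header(packet):
    """Parse the fixed header from the first 12 bytes of an RTP packet."""
    if len(packet) < 12:
        raise ValueError("an RTP packet has at least 12 header bytes")
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "padding": bool(byte0 & 0x20),
        "extension": bool(byte0 & 0x10),
        "csrc_count": byte0 & 0x0F,
        "marker": bool(byte1 & 0x80),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


header = pack_rtp_header(payload_type=96, sequence=1, timestamp=48000, ssrc=0x12345678)
fields = unpack_rtp_header(header)
```

The sequence number lets a receiver detect loss and reordering, and the timestamp counts in units of the media clock so that playout can be reconstructed independently of network arrival time.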

Saturday, October 27, 2:00 pm — 6:00 pm (Room 121)

Paper Session: P10 - Transducers

Chair:
Alex Voishvillo, JBL Professional - Northridge, CA, USA

P10-1 The Relationship between Perception and Measurement of Headphone Sound Quality—Sean Olive, Harman International - Northridge, CA, USA; Todd Welti, Harman International - Northridge, CA, USA
Double-blind listening tests were performed on six popular circumaural headphones to study the relationship between their perceived sound quality and their acoustical performance. In terms of overall sound quality, the most preferred headphones were perceived to have the most neutral spectral balance with the lowest coloration. When measured on an acoustic coupler, the most preferred headphones produced the smoothest and flattest amplitude response, a response that deviates from the current IEC recommended diffuse-field calibration. The results provide further evidence that the IEC 60268-7 headphone calibration is not optimal for achieving the best sound quality.
Convention Paper 8744 (Purchase now)

P10-2 On the Study of Ionic Microphones—Hiroshi Akino, Audio-Technica Co. - Machida-shi, Tokyo, Japan; Kanagawa Institute of Technology - Kanagawa, Japan; Hirofumi Shimokawa, Kanagawa Institute of Technology - Kanagawa, Japan; Tadashi Kikutani, Audio-Technica U.S., Inc. - Stow, OH, USA; Jackie Green, Audio-Technica U.S., Inc. - Stow, OH, USA
Diaphragm-less ionic loudspeakers using both low-temperature and high-temperature plasma methods have already been studied and developed for practical use. This study examined using similar methods to create a diaphragm-less ionic microphone. Although the low-temperature method was not practical due to high noise levels in the discharges, the high-temperature method exhibited a useful shifting of the oscillation frequency. By performing FM detection on this oscillation frequency shift, audio signals were obtained. Accordingly, an ionic microphone was tested in which the frequency response level using high-temperature plasma increased as the sound wave frequency decreased. Maintaining performance proved difficult as discharges in the air led to wear of the needle electrode tip and adhesion of products of the discharge. Study results showed that the stability of the discharge corresponded to the non-uniform electric field that was dependent on the formation shape of the high-temperature plasma, the shape of the discharge electrode, and the use of inert gas that protected the needle electrode. This paper reviews the experimental outcome of the two ionic methods, and considerations given to resolve the tip and discharge product and stability problems.
Convention Paper 8745 (Purchase now)

P10-3 Midrange Resonant Scattering in Loudspeakers—Juha Backman, Nokia Corporation - Espoo, Finland
One of the significant sources of midrange coloration in loudspeakers is the resonant scattering of the exterior sound field from ports, recesses, or horns. This paper discusses the qualitative behavior of the scattered sound and introduces a computationally efficient model for such scattering, based on waveguide models for the acoustical elements (ports, etc.), and mutual radiation impedance model for their coupling to the sound field generated by the drivers. In the simplest case of driver-port interaction in a direct radiating loudspeaker an approximate analytical expression can be written for the scattered sound. These methods can be applied to numerical optimization of loudspeaker layouts.
Convention Paper 8746 (Purchase now)

P10-4 Long Distance Induction Drive Loud Hailer Characterization—Marshall Buck, Psychotechnology, Inc. - Los Angeles, CA, USA; Wisdom Audio; David Graebener, Wisdom Audio Corporation - Carson City, NV, USA; Ron Sauro, NWAA Labs, Inc. - Elma, WA, USA
Further development of the high power, high efficiency induction drive compression driver when mounted on a tight pattern horn results in a high performance loud hailer. The detailed performance is tested in an independent laboratory with unique capabilities, including indoor frequency response at a distance of 4 meters. Additional characteristics tested include maximum burst output level, polar response, and directivity balloons. Outdoor tests were also performed at distances up to 220 meters and included speech transmission index and frequency response. Plane wave tube driver-phase plug tests were performed to assess incoherence, power compression, efficiency, and frequency response.
Convention Paper 8747 (Purchase now)

P10-5 Optimal Configurations for Subwoofers in Rooms Considering Seat to Seat Variation and Low Frequency Efficiency—Todd Welti, Harman International - Northridge, CA, USA
The placement of subwoofers and listeners in small rooms and the size and shape of the room all have profound influences on the resulting low frequency response. In this study, a computer model was used to investigate a large number of room, seating, and subwoofer configurations. For each configuration simulated, metrics for seat to seat consistency and bass efficiency were calculated and combined in a newly proposed metric, which is intended as an overall figure of merit. The data presented has much practical value in small room design for new rooms, or even for modifying existing configurations.
Convention Paper 8748 (Purchase now)

P10-6 Modeling the Large Signal Behavior of Micro-Speakers—Wolfgang Klippel, Klippel GmbH - Dresden, Germany
The mechanical and acoustical losses considered in the lumped parameter modeling of electro-dynamical transducers may become a dominant source of nonlinear distortion in micro-speakers, tweeters, headphones, and some horn compression drivers where the total quality factor Q_ts is not dominated by the electrical damping realized by a high force factor Bl and a low voice coil resistance R_e. This paper presents a nonlinear model describing the generation of the distortion and a new dynamic measurement technique for identifying the nonlinear resistance R_ms(v) as a function of voice coil velocity v. The theory and the identification technique are verified by comparing distortion and other nonlinear symptoms measured on micro-speakers as used in cellular phones with the corresponding behavior predicted by the nonlinear model.
Convention Paper 8749 (Purchase now)
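
Note on P10-6: the abstract's point about Q_ts can be read off the standard small-signal lumped-parameter relations, sketched below. The symbols f_s (resonance frequency) and M_ms (moving mass) do not appear in the abstract and are introduced here only for illustration. The relations are the usual linear ones, not the paper's nonlinear model.

```latex
\[
Q_{es} = \frac{2\pi f_s M_{ms} R_e}{(Bl)^2}, \qquad
Q_{ms} = \frac{2\pi f_s M_{ms}}{R_{ms}}, \qquad
Q_{ts} = \frac{Q_{es}\, Q_{ms}}{Q_{es} + Q_{ms}}
\]
```

With a high Bl and a low R_e, Q_es is small and Q_ts is approximately Q_es, so electrical damping dominates. With a low Bl or a high R_e, Q_es becomes large and Q_ts approaches Q_ms, so any dependence of R_ms on velocity shows up directly in the response.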

P10-7 An Indirect Study of Compliance and Damping in Linear Array Transducers—Richard Little, Far North Electroacoustics - Surrey, BC, Canada
A linear array transducer is a dual-motor, dual-coil, multi-cone, tubularly-shaped transducer whose shape defeats many measurement techniques that can be used to examine directly the force-deflection behavior of its diaphragm suspension system. Instead, the impedance curve of the transducer is compared against theoretical linear models to determine best-fit parameter values. The variation in the value of these parameters with increasing input signal levels is also examined.
Convention Paper 8750 (Purchase now)

P10-8 Bandwidth Extension for Microphone Arrays—Benjamin Bernschütz, Cologne University of Applied Sciences - Cologne, Germany; Technical University of Berlin - Berlin, Germany
Microphone arrays are in the focus of interest for spatial audio recording applications or the analysis of sound fields. But one of the major problems of microphone arrays is the limited operational frequency range. Especially at high frequencies spatial aliasing artifacts tend to disturb the output signal. This severely restricts the applicability and acceptance of microphone arrays in practice. A new approach to enhance the bandwidth of microphone arrays is presented, which is based on some restrictive assumptions concerning natural sound fields, the separate acquisition and treatment of spatiotemporal and spectrotemporal sound field properties, and the subsequent synthesis of array signals for critical frequency bands. Additionally, the method can be used for spatial audio data reduction algorithms.
Convention Paper 8751 (Purchase now)
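
Note on P10-8: the abstract does not name an array geometry or give a frequency limit. For the simplest textbook case, a uniformly spaced line array, spatial aliasing is avoided only while the microphone spacing is at most half a wavelength. The sketch below assumes that geometry; the function name and the 5 cm spacing are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C


def spatial_aliasing_limit_hz(spacing_m, speed_of_sound=SPEED_OF_SOUND_M_S):
    """Highest alias-free frequency for a uniform line array: f = c / (2 d)."""
    if spacing_m <= 0:
        raise ValueError("microphone spacing must be positive")
    return speed_of_sound / (2.0 * spacing_m)


# Halving the spacing doubles the usable bandwidth, at the cost of more
# microphones for the same aperture.
limit_5cm = spatial_aliasing_limit_hz(0.05)
limit_2_5cm = spatial_aliasing_limit_hz(0.025)
```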

Sunday, October 28, 9:00 am — 10:30 am (Room 120)

Product Design: PD6 - Multimedia Device Audio Architecture

Presenter:
Laurent Le Faucheur, Texas Instruments - Villeneuve-Loubet, France

Abstract:
A hardware audio architecture solves several mobile low power multimedia application processor constraints related to: legacy software reuse, signal processing performance, power optimization, multiple data format interfaces, low latency voice and tones, low system costs. The presented audio architecture is optimized for solving those constraints with several assets: a powerful DSP, a low-power Audio Back-End processor, and a high-performance mixed-signal audio device. The DSP audio framework enables the integration of multiple audio features developed by third-parties.

Sunday, October 28, 9:00 am — 10:30 am (Room 122)

Student / Career: SC6 - Student Design Competition

Moderator:
Scott Dorsey, Williamsburg, VA, USA

Abstract:
The three graduate-level and three undergraduate-level finalists of the AES Student Design Competition will present and defend their designs in front of a panel of expert judges. This is an opportunity for aspiring student hardware and software engineers to have their projects reviewed by the best minds in the business. It's also an invaluable career-building event and a great place for companies to identify their next employees.

Students from both audio and non-audio backgrounds are encouraged to submit entries. Few restrictions are placed on the nature of the projects, but designs must be for use with audio. Examples include loudspeaker design, DSP plug-ins, analog hardware, signal analysis tools, mobile applications, and sound synthesis devices. Products should represent new, original ideas implemented in working-model prototypes.

Judges: John La Grou, Dave Collins, Bill Whitlock, W.C. (Hutch) Hutchison

Sunday, October 28, 9:00 am — 11:00 am (Room 132)

Workshop: W8 - The Controversy over Upsampling—Boon or Scam?

Chair:
Vicki R. Melchior, Technical Consultant, Audio DSP - Lynn, MA, USA
Panelists:
Poppy Crum, Dolby Laboratories - San Francisco, CA, USA
Robert Katz, Digital Domain Mastering - Orlando, FL, USA
Bruno Putzeys, Hypex Electronics - Rotselaar, Belgium; Grimm Audio
Rhonda Wilson, Dolby Laboratories - San Francisco, CA, USA

Abstract:
Many “high resolution” releases on Blu-ray, DVD, and also some HD downloads are created by upsampling Redbook or 48-kHz data, a practice that draws frequent and quite vehement outcries of fraud. Yet at the same time upsamplers, both hardware and software, are commonly marketed to consumers and professionals with the promise of boosting Redbook sonics to near-equality with high resolution. What's going on? A panel of experienced mastering engineers and DAC and DSP designers looks at the long-standing controversies and claims behind upsampling, as well as its uses in audio. The issues relating to down/up conversion go well beyond headroom and relate directly to hardware and software implementation, the design of low pass filters, consequences of signal processing, and mastering considerations related to the production and release of modern high quality discs and music files.

Sunday, October 28, 9:00 am — 11:30 am (Room 121)

Paper Session: P14 - Spatial Audio Over Headphones

Chair:
David McGrath, Dolby Australia - McMahons Point, NSW, Australia

P14-1 Preferred Spatial Post-Processing of Popular Stereophonic Music for Headphone Reproduction—Ella Manor, The University of Sydney - Sydney, NSW, Australia; William L. Martens, University of Sydney - Sydney, NSW, Australia; Densil A. Cabrera, University of Sydney - Sydney, NSW, Australia
The spatial imagery experienced when listening to conventional stereophonic music via headphones is considerably different from that experienced in loudspeaker reproduction. While the difference might be reduced when stereophonic program material is spatially processed in order to simulate loudspeaker crosstalk for headphone reproduction, previous listening tests have shown that such processing typically produces results that are not preferred by listeners in comparisons with the original (unprocessed) version of a music program. In this study a double blind test was conducted in which listeners compared five versions of eight programs from a variety of music genres and gave both preference ratings and ensemble stage width (ESW) ratings. Out of four alternative postprocessing algorithms, the outputs that were most preferred resulted from a nearfield crosstalk simulation mimicking low-frequency interaural level differences typical for close-range sources.
Convention Paper 8779 (Purchase now)

P14-2 Interactive 3-D Audio: Enhancing Awareness of Details in Immersive Soundscapes?—Mikkel Schmidt, Technical University of Denmark - Kgs. Lyngby, Denmark; Stephen Schwartz, SoundTales - Helsingør, Denmark; Jan Larsen, Technical University of Denmark - Kgs. Lyngby, Denmark
Spatial audio and the possibility of interacting with the audio environment is thought to increase listeners' attention to details in a soundscape. This work examines if interactive 3-D audio enhances listeners' ability to recall details in a soundscape. Nine different soundscapes were constructed and presented in either mono, stereo, 3-D, or interactive 3-D, and performance was evaluated by asking factual questions about details in the audio. Results show that spatial cues can increase attention to background sounds while reducing attention to narrated text, indicating that spatial audio can be constructed to guide listeners' attention.
Convention Paper 8780 (Purchase now)

P14-3 Simulating Autophony with Auralized Oral-Binaural Room Impulse Responses—Manuj Yadav, University of Sydney - Sydney, NSW, Australia; Luis Miranda, University of Sydney - Sydney, NSW, Australia; Densil A. Cabrera, University of Sydney - Sydney, NSW, Australia; William L. Martens, University of Sydney - Sydney, NSW, Australia
This paper presents a method for simulating the sound that one hears from one’s own voice in a room acoustic environment. Impulse responses from the mouth to the two ears of the same head are auralized within a computer-modeled room in ODEON, using higher-order ambisonics for modeling the directivity pattern of an anthropomorphic head and torso. These binaural room impulse responses, which can be measured for all possible head movements, are input into a mixed-reality room acoustic simulation system for talking-listeners. With the system, “presence” in a room environment different from the one in which one is physically present is created in real-time for voice related tasks.
Convention Paper 8781 (Purchase now)

P14-4 Head-Tracking Techniques for Virtual Acoustics Applications—Wolfgang Hess, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany
Synthesis of auditory virtual scenes often requires the use of a head-tracker. Virtual sound fields benefit from continuous adaptation to a listener’s position while presented through headphones or loudspeakers. For this task position- and time-accurate, continuous robust capturing of the position of the listener’s outer ears is necessary. Current head-tracker technologies allow solving this task by cheap and reliable electronic techniques. Environmental conditions have to be considered to find an optimal tracking solution for each surrounding and for each field of application. A categorization of head-tracking systems is presented. Inside-out describes tracking stationary sensors from inside a scene, whereas outside-in is the term for capturing from outside a scene. Marker-based and marker-less approaches are described and evaluated by means of commercially available products, e.g., the MS Kinect, and proprietary developed systems.
Convention Paper 8782 (Purchase now)

P14-5 Scalable Binaural Synthesis on Mobile Devices—Christian Sander, University of Applied Sciences Düsseldorf - Düsseldorf, Germany; Robert Schumann Hochschule Düsseldorf - Düsseldorf, Germany; Frank Wefers, RWTH Aachen University - Aachen, Germany; Dieter Leckschat, University of Applied Sciences Düsseldorf - Düsseldorf, Germany
The binaural reproduction of sound sources through headphones in mobile applications is becoming a promising opportunity to create an immersive three-dimensional listening experience without the need for extensive equipment. Many ideas for outstanding applications in teleconferencing, multichannel rendering for headphones, gaming, or auditory interfaces implementing binaural audio have been proposed. However, the diversity of applications calls for scalability of quality and performance costs so as to use and share hardware resources economically. For this approach, scalable real-time binaural synthesis on mobile platforms was developed and implemented in a test application in order to evaluate what current mobile devices are capable of in terms of binaural technology, both qualitatively and quantitatively. In addition, the audio part of three application scenarios was simulated.
Convention Paper 8783 (Purchase now)

Sunday, October 28, 10:30 am — 12:00 pm (Foyer)

Student / Career: SC7 - Student Design Exhibition

Abstract:
All accepted student entries to the AES Student Design Competition will have the opportunity to show off their designs at this poster/tabletop exhibition. This audio "science fair" is free and open to all convention attendees and is an opportunity for aspiring student hardware and software engineers to have their projects seen by the AES design community. It's also an invaluable career-building event and a great place for companies to identify their next employees.

Students from both audio and non-audio backgrounds are encouraged to submit entries. Few restrictions are placed on the nature of the projects, but designs must be for use with audio. Examples include loudspeaker design, DSP plug-ins, analog hardware, signal analysis tools, mobile applications, and sound synthesis devices. Products should represent new, original ideas implemented in working-model prototypes.

Sunday, October 28, 10:45 am — 11:45 am (Room 122)

Historical: H3 - Lee de Forest: The Man Who Invented the Amplifier

Presenter:
Mike Adams, San Jose State University - San Jose, CA, USA

Abstract:
After Lee de Forest received his PhD in physics and electricity from Yale University in 1899, he spent the next 30 years turning the 19th-century science he learned into the popular audio media of the 20th century. First he added sound to the wireless telegraph of Marconi and created a radiotelephone system. Next, he invented the triode vacuum tube by adding a control grid to Fleming’s two-element diode tube, creating the three-element vacuum tube used as an audio amplifier and oscillator for radio wave generation. Using his tube and building on the earlier work of Ruhmer and Bell, he created a variable density sound-on-film process, patented it, and began working with fellow inventor Theodore Case. In order to promote and demonstrate his process he made hundreds of short sound films, found theaters for their showing, and issued publicity to gain audiences for his invention. While de Forest did not profit from sound-on-film, it was his earlier invention of the three-element vacuum tube that allowed amplification of audio through loudspeakers for radio and the movies and finally helped create their large public audiences.

Sunday, October 28, 11:00 am — 1:00 pm (Room 132)

Tutorial: T8 - An Overview of Audio System Grounding and Interfacing

Presenter:
William E. Whitlock, Jensen Transformers, Inc. - Chatsworth, CA, USA; Whitlock Consulting - Oxnard, CA, USA

Abstract:
Equipment makers like to pretend the problems don’t exist, but this tutorial replaces hype and myth with insight and knowledge, revealing the true causes of system noise and ground loops. Unbalanced interfaces are exquisitely vulnerable to noise due to an intrinsic problem. Although balanced interfaces are theoretically noise-free, they’re widely misunderstood by equipment designers, which often results in inadequate noise rejection in real-world systems. Because of a widespread design error, some equipment has a built-in noise problem. Simple, no-test-equipment, troubleshooting methods can pinpoint the location and cause of system noise. Ground isolators in the signal path solve the fundamental noise coupling problems. Also discussed are unbalanced to balanced connections, RF interference, and power line treatments. Some widely used “cures” are both illegal and deadly.

Sunday, October 28, 11:45 am — 12:45 pm (Room 121)

Product Design: PD7 - A Next Generation Audio Processing Suite for the Enhancement of Acoustically Challenged Devices

Presenter:
Alan Seefeldt, Dolby Laboratories - San Francisco, CA, USA

Abstract:
This tutorial will describe the design principles and algorithms behind a recently released commercial audio processing suite intended to enhance the sound of acoustically challenged devices such as laptops, tablets, and mobile phones. The suite consists of numerous algorithms, all operating within a common frequency domain framework, with several of these algorithms tuned specifically for the acoustics of the device on which it operates.

Sunday, October 28, 12:00 pm — 1:00 pm (Room 122)

Workshop: W9 - Acoustic and Audio iPhone Apps

Chair:
Peter Mapp, Peter Mapp Associates - Colchester, Essex, UK

Abstract:
A range of audio and acoustic apps are available for the iPhone, iPad, and other smartphones. Both Measurement and Calculation apps are available. The workshop will review current apps and discuss some of their uses and limitations.

Sunday, October 28, 1:30 pm — 4:00 pm (Off-Site 2)

Historical: H4 - The Egg Show: A Demonstration of the Artistic Uses of Sound on Film

Presenter:
Ioan Allen, Dolby Laboratories Inc. - San Francisco, CA, USA

Abstract:
The Egg Show is a two-hour entertaining and evocative look at the influences of sound mixing on motion picture presentation. This educational and exquisitely entertaining survey of sound design and acoustical properties in film has been presented at many industry events, film festivals, technology seminars, and film schools with great success. With pictures of eggs ingeniously illustrating his points, and an array of 35mm clips, Mr. Allen explains mixing and editing techniques that sound designers and filmmakers have developed over the years. Many film examples are used, each of which has been chosen to demonstrate a different mixing issue. You will see how film sound mixers highlight a conversation in a crowd scene, inter-cut music to film and create artificial sound effects that sound more real on film than on a live recording. Numerous slides enhance the lecture, most of which use photographs of eggs to illustrate the artistic uses of sound on film.

Location
This event will be held at Dolby Laboratories’ Theater, a multi-format projection room/recording studio complex within Dolby’s South-of-Market headquarters. The facility was designed as an ideal listening and viewing environment for film, recorded sound, and live presentations.

A limited number of $10 tickets will be available exclusively to registered convention attendees at the tours counter in the main lobby at Moscone. The marked bus will depart at 1:30 for the short ride to Dolby.

About the Speaker
In addition to his 1987 Oscar, Ioan Allen has earned Scientific and Engineering Awards from the Academy of Motion Picture Arts and Sciences. An AES Fellow, and recipient of the AES Silver Medal, Mr. Allen spearheaded the introduction of breakthrough film audio formats which have revolutionized the film sound experience. He is a Fellow of SMPTE and BKSTS and active in U.S. and world standards organizations. In 1985 he received the Samuel L. Warner Award for contributions to motion picture sound.

Sunday, October 28, 1:45 pm — 2:45 pm (Room 123)

Product Design: PD8 - Implementing Application Processor Agnostic Audio Systems for Portable Consumer Devices

Chair:
Jess Brown, Wolfson Micro Ltd.

Abstract:
This tutorial will outline future audio trends in portable consumer devices and demonstrate how they can be addressed with advanced audio solutions. Covering areas such as HD audio voice, capture, playback, and share, this tutorial will outline the total "mouth to ear" audio solution, from a technology and device standpoint.

Sunday, October 28, 2:00 pm — 5:30 pm (Room 121)

Paper Session: P15 - Signal Processing Fundamentals

Chair:
Lars Villemoes, Dolby Sweden - Stockholm, Sweden

P15-1 Frequency-Domain Implementation of Time-Varying FIR Filters—Earl Vickers, STMicroelectronics, Inc. - Santa Clara, CA, USA; The Sound Guy, Inc. - San Jose, CA, USA
Finite impulse response filters can be implemented efficiently by means of fast convolution in the frequency domain. However, in applications such as speech enhancement or channel upmix, where the filter is a time-varying function of the input signal, standard approaches can suffer from artifacts and distortion due to circular convolution and the resulting time-domain aliasing. Existing solutions can be computationally prohibitive. This paper compares a number of previous algorithms and presents an alternate method based on the equivalence between frequency-domain convolution and time domain windowing. Additional computational efficiency can be attained by careful choice of the analysis window.
Convention Paper 8784 (Purchase now)
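
The circular-convolution artifact mentioned in the abstract of P15-1 can be illustrated with a small sketch. This is not the paper's method; it only shows, under the assumption of a fixed filter and a naive DFT written for brevity, that multiplying spectra gives a linear convolution when the transform is long enough and a time-aliased result when it is too short. All names and values are hypothetical.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) DFT, adequate for a tiny illustration."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def fft_convolve(signal, taps, size):
    """Multiply spectra of zero-padded inputs; the result is circular with period `size`."""
    a = list(signal) + [0.0] * (size - len(signal))
    b = list(taps) + [0.0] * (size - len(taps))
    prod = [p * q for p, q in zip(dft(a), dft(b))]
    return [v.real for v in dft(prod, inverse=True)]

def direct_convolve(signal, taps):
    """Reference linear convolution in the time domain."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(taps):
            out[i + j] += s * h
    return out

signal = [1.0, 2.0, 3.0, 4.0]   # hypothetical input block
taps = [1.0, -1.0, 0.5]         # hypothetical FIR filter

linear = direct_convolve(signal, taps)        # 6 samples
padded = fft_convolve(signal, taps, size=6)   # enough room: matches `linear`
aliased = fft_convolve(signal, taps, size=4)  # too short: the tail wraps onto the start
```

With a transform of size 4, the last two samples of the linear result fold back onto the first two, which is the time-domain aliasing the abstract refers to.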

P15-2 Estimating a Signal from a Magnitude Spectrogram via Convex Optimization—Dennis L. Sun, Stanford University - Stanford, CA, USA; Julius O. Smith, III, Stanford University - Stanford, CA, USA
The problem of recovering a signal from the magnitude of its short-time Fourier transform (STFT) is a longstanding one in audio signal processing. Existing approaches rely on heuristics that often perform poorly because of the nonconvexity of the problem. We introduce a formulation of the problem that lends itself to a tractable convex program. We observe that our method yields better reconstructions than the standard Griffin-Lim algorithm. We provide an algorithm and discuss practical implementation details, including how the method can be scaled up to larger examples.
Convention Paper 8785 (Purchase now)

P15-3 Distance-Based Automatic Gain Control with Continuous Proximity-Effect Compensation—Walter Etter, Bell Labs, Alcatel-Lucent - Murray Hill, NJ, USA
This paper presents a method of Automatic Gain Control (AGC) that derives the gain from the sound source to microphone distance, utilizing a distance sensor. The concept makes use of the fact that microphone output levels vary inversely with the distance to a spherical sound source. It is applicable to frequently arising situations in which a speaker does not maintain a constant microphone distance. In addition, we address undesired bass response variations caused by the proximity effect. Knowledge of the sound-source to microphone distance permits accurate compensation for both frequency response changes and distance-related signal level changes. In particular, a distance-based AGC can normalize these signal level changes without deteriorating signal quality, as opposed to conventional AGCs, which introduce distortion, pumping, and breathing. Given an accurate distance sensor, gain changes can take effect instantaneously and do not need to be gated by attack and release times. Likewise, frequency response changes due to undesired proximity-effect variations can be corrected adaptively using precise inverse filtering derived from continuous distance measurements, sound arrival angles, and microphone directivity, no longer requiring inadequate static settings on the microphone for proximity-effect compensation.
Convention Paper 8786 (Purchase now)
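
The inverse-distance relationship stated in the abstract of P15-3 suggests a simple gain rule, sketched below. It assumes a free-field 1/distance level law and a hypothetical reference distance; the function names are illustrative, and the proximity-effect (frequency response) compensation described in the abstract is not modeled.

```python
import math

REFERENCE_DISTANCE_M = 0.30  # hypothetical nominal talker-to-microphone distance

def distance_gain_db(distance_m, reference_m=REFERENCE_DISTANCE_M):
    """Gain in dB that restores the level the source would have at the reference
    distance, assuming level falls as 1/distance (spherical source, free field)."""
    if distance_m <= 0 or reference_m <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(distance_m / reference_m)

def apply_gain(samples, gain_db):
    """Scale a block of samples by a gain given in dB."""
    scale = 10.0 ** (gain_db / 20.0)
    return [s * scale for s in samples]

# A talker who moves from 0.30 m to 0.60 m loses about 6 dB; the rule adds it back.
gain_at_double_distance = distance_gain_db(0.60)
restored = apply_gain([0.5], gain_at_double_distance)
```

Because the gain depends only on the measured distance and not on the signal envelope, no attack or release smoothing appears in this sketch.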

P15-4 Subband Comfort Noise Insertion for an Acoustic Echo Suppressor—Guangji Shi, DTS, Inc. - Los Gatos, CA, USA; Changxue Ma, DTS, Inc. - Los Gatos, CA, USA
This paper presents an efficient approach for comfort noise insertion for an acoustic echo suppressor. Acoustic echo suppression causes frequent noise level change in noisy environments. The proposed algorithm estimates the noise level for each frequency band using a minimum variance based noise estimator, and generates comfort noise based on the estimated noise level and a random phase generator. Tests show that the proposed comfort noise insertion algorithm is able to insert an appropriate level of comfort noise that matches the background noise characteristics in an efficient manner.
Convention Paper 8787 (Purchase now)
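
The generation step described in the abstract of P15-4 (an estimated per-band noise level combined with a random phase) can be sketched as follows. The noise estimator itself is not shown; the per-band levels are hypothetical values assumed to come from it, and the function name is illustrative.

```python
import cmath
import random

def comfort_noise_bins(noise_levels, rng):
    """Return one complex value per band: the estimated magnitude with a
    uniformly distributed random phase."""
    return [cmath.rect(level, rng.uniform(-cmath.pi, cmath.pi)) for level in noise_levels]

estimated_levels = [0.020, 0.015, 0.008, 0.004]  # hypothetical per-band noise magnitudes
rng = random.Random(0)                           # seeded only to make the sketch repeatable
frame_a = comfort_noise_bins(estimated_levels, rng)
frame_b = comfort_noise_bins(estimated_levels, rng)
```

Each frame keeps the estimated spectral shape while the phases differ from frame to frame, so the inserted noise does not sound periodic.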

P15-5 Potential of Non-uniformly Partitioned Convolution with Freely Adaptable FFT Sizes—Frank Wefers, RWTH Aachen University - Aachen, Germany; Michael Vorländer, RWTH Aachen University - Aachen, Germany
The standard class of algorithms used for FIR filtering with long impulse responses and short input-to-output latencies is non-uniformly partitioned fast convolution. Here a filter impulse response is split into several smaller sub filters of different sizes. Small sub filters are needed for a low latency, whereas long filter parts allow for more computational efficiency. Finding an optimal filter partition that minimizes the computational cost is not trivial; however, optimization algorithms are known. Mostly the Fast Fourier Transform (FFT) is used for implementing the fast convolution of the sub filters. Usually the FFT transform sizes are chosen to be powers of two, which has a direct effect on the partitioning of filters. Recent studies reveal that the use of FFT transform sizes that are not powers of two has a strong potential to lower the computational costs of the convolution even more. This paper presents a new real-time low-latency convolution algorithm, which performs non-uniformly partitioned convolution with freely adaptable FFT sizes. Alongside it, an optimization technique is presented that allows adjusting the FFT sizes in order to minimize the computational complexity for this new framework of non-uniform filter partitions. Finally the performance of the algorithm is compared to conventional methods.
Convention Paper 8788 (Purchase now)
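
The partitioning idea in the abstract of P15-5 rests on the fact that the outputs of the sub filters, each delayed by its starting offset, add up to the output of the full filter. The sketch below checks that identity with direct time-domain convolution standing in for the FFT-based fast convolution; the partition sizes and all names are hypothetical, and no real-time scheduling or FFT size optimization is attempted.

```python
def convolve(signal, taps):
    """Direct linear convolution, used here in place of FFT-based fast convolution."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(taps):
            out[i + j] += s * h
    return out

def partitioned_convolve(signal, taps, part_sizes):
    """Split `taps` into consecutive sub filters of the given sizes, convolve each
    separately, and add each result at the offset where its sub filter starts."""
    if sum(part_sizes) != len(taps):
        raise ValueError("partition sizes must cover the whole impulse response")
    out = [0.0] * (len(signal) + len(taps) - 1)
    offset = 0
    for size in part_sizes:
        partial = convolve(signal, taps[offset:offset + size])
        for n, value in enumerate(partial):
            out[offset + n] += value
        offset += size
    return out

signal = [1.0, -2.0, 0.5, 3.0, 1.5]
taps = [0.9, 0.5, -0.3, 0.2, 0.1, -0.05, 0.02]
parts = [1, 2, 4]  # hypothetical non-uniform partition: short parts first for low latency

full = convolve(signal, taps)
split = partitioned_convolve(signal, taps, parts)
```

In a real-time implementation the short leading parts would be processed with small transforms to keep latency low, while the longer trailing parts could use larger and cheaper-per-sample transforms.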

P15-6 Comparison of Filter Bank Design Algorithms for Use in Low Delay Audio Coding—Stephan Preihs, Leibniz Universität Hannover - Hannover, Germany; Thomas Krause, Leibniz Universität Hannover - Hannover, Germany; Jörn Ostermann, Leibniz Universität Hannover - Hannover, Germany
This paper is concerned with the comparison of filter bank design algorithms for use in audio coding applications with a very low coding delay of less than 1 ms. Different methods for numerical optimization of low delay filter banks are analyzed and compared. In addition, the use of the designed filter banks in combination with a delay-free ADPCM coding scheme is evaluated. Design properties and results of PEAQ (Perceptual Evaluation of Audio Quality) based objective audio-quality evaluation as well as a listening test are given. The results show that in our coding scheme a significant improvement of audio-quality, especially for critical signals, can be achieved by the use of filter banks designed with alternative filter bank design algorithms.
Convention Paper 8789 (Purchase now)

P15-7 Balanced Phase Equalization; IIR Filters with Independent Frequency Response and Identical Phase Response—Peter Eastty, Oxford Digital Limited - Oxford, UK
It has long been assumed that in order to provide sets of filters with arbitrary frequency responses but matching phase responses, symmetrical, finite impulse response filters must be used. A method is given for the construction of sets of infinite impulse response (recursive) filters that can achieve this aim with lower complexity, power, and delay. The zeros of each filter in a set are rearranged to provide linear phase while the phase shift due to the poles of each filter is counteracted by all-pass compensation filters added to other members of the set.
Convention Paper 8790 (Purchase now)

Sunday, October 28, 2:15 pm — 3:45 pm (Room 122)

Game Audio: G10 - Game Audio in a Web Browser

Presenters:
Owen Grace, Electronic Arts
Roger Powell, Electronic Arts
Chris Rogers, Google Inc.
Guy Whitmore, PopCap Games

Abstract:
Web browser-based computer games are popular because they do not require client application installation, can be played by single or multiple players over the internet, and are generally capable of being played across different browsers and on multiple devices. Audio tools support for developers is varied, with sound engine software typically employing the Adobe Flash plug-in for rendering audio, or the simplistic HTML5 <audio> element tag. This session will focus on a research project to create a game sound engine in JavaScript based on the W3C Web Audio API draft proposal. The sound engine was used to generate 3-D spatialized rich audio content within a WebGL-based graphics game framework. The result, a networked multi-player arena combat-style game, rivals the experience of playing on a dedicated console gaming device.

Sunday, October 28, 2:30 pm — 4:30 pm (Room 120)

Live Sound Seminar: LS8 - Tuning a Loudspeaker Installation

Chair:
Jamie Anderson, Rational Acoustics - Putnam, CT, USA
Panelists:
David Gunness, Fulcrum Acoustic - Sutton, MA, USA
Deward Timothy, Poll Sound - Salt Lake City, UT, USA

Abstract:
Loudspeaker systems are installed to achieve functional and aesthetic goals. Therefore, the act of tuning (aligning) those systems is the process of, or attempt at, achieving those aims. While often equated with simply the adjustment of a system’s drive EQ / DSP, loudspeaker system alignment truly encompasses the sum total of the series of decisions (or non-decisions) made throughout the design, installation, drive adjustment, and usage processes. This session gathers a panel of audio professionals with extensive experience in sound system alignment over a diverse variety of system types and applications to discuss their processes, priorities, and the critical elements that make their alignment goals achievable (or not). Given finite, and often extremely limited, resources (equipment, time, money, labor, space, access, authority), this session asks its panelists what is necessary to successfully tune a loudspeaker installation.

Sunday, October 28, 3:00 pm — 4:30 pm (Room 123)

Network Audio: N7 - Audio Network Device Connection and Control

Chair:
Richard Foss, Rhodes University - Grahamstown, Eastern Cape, South Africa
Panelists:
Jeffrey Alan Berryman, Bosch Communications - Flesherton, ON, Canada
Andreas Hildebrand, ALC NetworX - Munich, Germany
Jeff Koftinoff, Meyer Sound Canada - Vernon, BC, Canada
Kieran Walsh, Audinate Pty. Ltd. - Ultimo, NSW, Australia

Abstract:
In this session a number of industry experts will describe and demonstrate how they have enabled the discovery of audio devices on local area networks, their subsequent connection management, and also control over their various parameters. The workshop will start with a panel discussion that introduces issues related to streaming audio, such as bandwidth management and synchronization, as well as protocols that enable connection management and control. The panelists will have demonstrations of their particular audio network solutions. They will describe these solutions as part of the panel discussion, and will provide closer demonstrations following the panel discussion.

Sunday, October 28, 3:30 pm — 5:00 pm (Room 131)

Broadcast and Streaming Media: B14 - Understanding and Working with Codecs

Chair:
Kimberly Sacks, Optimod Refurbishing - Hollywood, MD, USA
Panelists:
Kirk Harnack, Telos Alliance - Nashville, TN, USA; South Seas Broadcasting Corp. - Pago Pago, American Samoa
James Johnston, Retired - Redmond, WA, USA
Jeffrey Riedmiller, Dolby Laboratories - San Francisco, CA, USA
Chris Tobin, Musicam USA - Holmdel, NJ, USA

Abstract:
In the age of smart phones and internet ready devices, audio transport and distribution has evolved from sharing low quality MP3 files to providing high quality mobile device audio streams, click to play content, over the air broadcasting, audio distribution in large facilities, and more. Each medium has several methods of compressing content by means of a codec. This session will explain which codecs are appropriate for which purposes, common misuse of audio codecs, and how to maintain audio quality by implementing codecs professionally.

Sunday, October 28, 4:00 pm — 5:45 pm (Room 122)

Engineering Brief: EB2 - eBrief Presentations—Lectures 1

Chair:
Lance Reichert, Sennheiser Electronic Corporation - San Francisco, CA, USA

EB2-1 A Comparison of Highly Configurable CPU- and GPU-Based Convolution Engines—Michael Schoeffler, International Audio Laboratories Erlangen - Erlangen, Germany; Wolfgang Hess, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany
In this work the performance of real-time audio signal processing convolution engines is evaluated. A CPU-based implementation using the Integrated Performance Primitives Library and two GPU-based implementations using CUDA and OpenCL are compared. The purpose of these convolution engines is auralization, e.g., the binaural rendering of virtual multichannel configurations. Any multichannel input and output configuration is supported, e.g., 22.2 to 5.1, 7.1 to 2.0, vice versa, etc. This ability results in a trade-off between configurability and performance. Using a 5.1-to-binaural setup with continuous filter changes due to simulated head-tracking, GPU processing is more efficient when 24 filters of more than 1.92 seconds duration each at a 48 kHz sampling rate are convolved. The GPU is capable of convolving longer filters in real time than CPU-based processing. Comparing the two GPU-based implementations, negligible performance differences between OpenCL and CUDA were measured.
Engineering Brief 60 (Download now)

EB2-2 Multichannel Audio Processor which Adapts to 2-D and 3-D Loudspeaker Setups—Christof Faller, Illusonic - Uster, Switzerland
A general audio format conversion concept is described for reproducing stereo and surround audio content on loudspeaker setups with any number of channels. The goal is to improve localization and to generate a recording-related spatial impression of depth and immersion. It is explained how, with these goals in mind, signals are processed using a strategy that is independent of a specific loudspeaker setup. The implementation of this general audio format conversion concept in the Illusonic Immersive Audio Processor is described.
Engineering Brief 61 (Download now)

EB2-3 A Comparison of Recording, Rendering, and Reproduction Techniques for Multichannel Spatial Audio—David Romblom, McGill University - Montreal, Quebec, Canada; Catherine Guastavino, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada; Richard King, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada
The objective of this project is to compare the relative merits of two different spatial audio recording and rendering techniques within the context of two different multichannel reproduction systems. The two recording and rendering techniques are "natural," using main microphone arrays, and "virtual," using spot microphones, panning, and simulated acoustic delay. The two reproduction systems are the 3/2 system (5.1 surround), and a 12/2 system, where the frontal L/C/R triplet is replaced by a 12-loudspeaker linear array. Additionally, the project seeks to know if standard surround techniques can be used in combination with wavefront reconstruction techniques such as Wave Field Synthesis. The Hamasaki Square was used for the room effect in all cases, exhibiting the startling quality of increasing the depth of the frontal image.
Engineering Brief 62 (Download now)

EB2-4 The Reactive Source: A Reproduction Format Agnostic and Adaptive Spatial Audio Effect—Frank Melchior, BBC R&D - Salford, UK
Spatial audio has become a more and more active field of research and various systems are currently under investigation on different scales of effort and complexity. Given the potential of 3-D audio systems, spatial effects beyond source positioning and room simulation are desirable to enhance the creative flexibility. This paper describes a new adaptive spatial audio effect called reactive source. The reactive source uses low-level features of the incoming audio signal to dynamically adapt the spatial behavior of a sound source. Furthermore, the concept is format agnostic so that the effect could easily be applied to different 3-D audio reproduction methods using the same interaction method. To verify the basic concept, a prototype system for multichannel reproduction has been developed.
Engineering Brief 63 (Download now)

EB2-5 Teaching Critical Thinking in an Audio Production Curriculum—Jason Corey, University of Michigan - Ann Arbor, MI, USA
The practice of sound recording and production can be characterized as a series of decisions based primarily on subjective impressions of sound. These subjective impressions lead to equipment choices and use, not only for artistic effect but also to accomplish technical objectives. Nonetheless, the ability to think critically about recording techniques, equipment specifications, and sound quality is vitally important to equipment choice and use. The goal of this paper is to propose methods to encourage critical thinking among students in an audio production curriculum and to consider topics that might be included in coursework to help aspiring audio engineers evaluate audio equipment and processing.
Engineering Brief 64 (Download now)

EB2-6 Sync-AV – Workflow Tool for File-Based Video Shootings—Andreas Fitza, University of Applied Science Mainz - Mainz, Germany
The Sync-AV workflow eases the sorting and synchronization of video and audio footage without the need for expensive special hardware. It supports preproduction and shooting as well as post-production. It consists of three elements: (1) A script-information- and metadata-gathering iOS app that is synchronized with a server back end; it can be used on several devices at once to exchange information on set. (2) A server database with a web front end that can sort files by their metadata and show dailies; it can also be used to distribute and manage information during preproduction. (3) A local client that can synchronize and rename the files and that implements the metadata.
Engineering Brief 65 (Download now)

EB2-7 Audio over IP —Kieran Walsh, Audinate Pty. Ltd. - Ultimo, NSW, Australia
Developments in both IP networking and the attitude of professional audio to emerging technologies have presented the opportunity to consider a more abstract and all-encompassing approach to the ways that we manage data. We will examine this paradigm shift and discuss the benefits presented both in practical terms and creatively.
Engineering Brief 66 (Download now)

Sunday, October 28, 5:00 pm — 6:00 pm (Room 123)

Product Design: PD9 - Audio for iPad Publishers

Chair:
Jeff Essex, AudioSyncrasy

Abstract:
Book publishers are running to the iPad, and not just for iBooks, or one-off apps. They're building storefronts and creating subscription models, and the children's book publishers are leading the way. Through two case studies, this talk will explore how to build the audio creation and content management systems needed to produce multiple apps in high-volume environments, including VO production, concatenation schemes, file-naming conventions, audio file types for iOS, and perhaps most important, helping book publishers make the leap from the printed page to interactive publishing.

Sunday, October 28, 5:00 pm — 6:30 pm (Room 131)

Broadcast and Streaming Media: B15 - Audio Processing Basics

Chair:
Richard Burden, Richard W. Burden Associates - Canoga Park, CA, USA
Panelists:
Tim Carroll, Linear Acoustic Inc. - Lancaster, PA, USA
Frank Foti, Omnia - New York, NY, USA
James Johnston, Retired - Redmond, WA, USA
Robert Orban, Orban - San Leandro, CA, USA

Abstract:
Limiting peak excursions to prevent overmodulation and increasing the average level through compression to improve signal-to-noise ratio are worthwhile objectives. Just as we can all agree that a little salt and pepper in the stew enhances the flavor, the argument is how much salt and pepper becomes too much.

It is a given that the louder signal is interpreted by the listener as sounding better. However, misuses of the available tools display a lack of leadership at the point of origin. The variation in energy levels within program and commercial content, as well as the excessive use of compression on many news interviews, is annoying to the listener.

The presentation will cover the fundamentals, the history, and the philosophy of audio processing. An open discussion, with audience participation, on the subject and its practices will follow.

Monday, October 29, 9:00 am — 10:00 am (Room 130)

Product Design: PD10 - Ethernet Standard Audio

Presenter:
Stephen Lampen, Belden - San Francisco, CA, USA

Abstract:
Ethernet has been around since 1973, and with the rise of twisted-pair-based Ethernet there have been many companies who played around to get Ethernet to work for multichannel audio. The problem is that all of their solutions were proprietary and not always compatible between manufacturers. This was the impetus behind IEEE 802.1BA AVB, a re-write of the Ethernet standard to include many bells and whistles for audio and video applications. This presentation will show AVB switches, how they are different, and what is in this new standard.

Monday, October 29, 9:30 am — 11:00 am (Foyer)

Engineering Brief: EB3 - eBrief Presentations—Posters 2

EB3-1 Implementation of an Interactive 3-D Reverberator for Video Games Using Statistical Acoustics—Masataka Nakahara, ONFUTURE Ltd. - Tokyo, Japan; SONA Corp. - Tokyo, Japan; Tomoya Kishi, CAPCOM Co., Ltd. - Osaka-shi, Osaka-fu, Japan; Kenji Kojima, CAPCOM Co., Ltd.; Toshiki Hanyu, Nihon University - Funabashi, Chiba, Japan; Kazuma Hoshi, Nihon University - Chiba-ken, Japan
An interactive reverberator, which applies realistic computed acoustic responses interactively to video game scenes, is a very important technology for the processing of in-game sounds. The main framework of the interactive reverberator that the authors developed is based on statistical acoustics theory, so that it can compute fast enough for real-time processing in fast-changing game scenes. Though statistical reverbs generally do not provide a high level of realism, the authors have achieved a quantum leap in sound quality by applying Hanyu's algorithm to conventional theories. The reverberator has the following features: (1) No pre-computing jobs, including room modeling, are required. (2) Three-dimensional responses are generated automatically. (3) Complex factors such as a room's shape, open-air areas, and the effects of neighboring reverberations are expressed. The authors experimentally implemented the reverberator in Capcom’s middleware and have verified that it runs effectively. In this paper the algorithm, background theories, and implementation techniques are introduced.
Engineering Brief 67 (Download now)

EB3-2 Printable Loudspeaker Arrays for Flexible Substrates and Interactive Surfaces—Jess Rowland, University of California, Berkeley - Berkeley, CA, USA; Adrian Freed, University of California, Berkeley - Berkeley, CA, USA
Although planar loudspeaker drivers have been well explored for many years, a flat speaker array system that may flex or fold freely remains a current challenge to engineer. We will demonstrate a viable technique for building large loudspeaker arrays that allow for diffused fields of sound transduction on flexible membranes. Planar voice coils are made from machine-cut copper sheets, or by inkjet printing and electroless copper plating, on paper, thin plastic, or similar lightweight material. We will present various ways of attaching thin magnets to these membranes, including a novel alternative strategy of mounting magnets in gloves worn by the listener. This creates an engaging experience for listeners in which gestures can control sounds from the speaker array interactively.
Engineering Brief 68 (Download now)

EB3-3 Nonlinear Distortion Measurement in Audio Amplifiers: The Perceptual Nonlinear Distortion Response—Phillip Minnick, University of Miami - Coral Gables, FL, USA
A new metric for measuring nonlinear distortion is introduced called the Perceptual Nonlinear Distortion Response (PNDR) to measure nonlinear distortion in audio amplifiers. This metric accounts for the auditory system's masking effects. Salient features of previously developed nonlinear distortion measurements are considered in the development of the PNDR. A small group of solid-state and valve audio amplifiers were subjected to various benchmark tests. A listening test was created to test perceptibility of nonlinear distortions generated in the amplifiers. These test results were analyzed and the Perceptual Nonlinear Distortion Response was more successful than traditionally used distortion metrics. This cognitive tool could provide the audio industry with more accurate test methods, facilitating product research and development.
Engineering Brief 69 (Download now)

EB3-4 EspGrid: A Protocol for Participatory Electronic Ensemble Performance—David Ogborn, McMaster University - Hamilton, ON, Canada
EspGrid is a protocol developed to streamline the sharing of timing, code, audio, and video in participatory electronic ensembles, such as laptop orchestras. An application implementing the protocol runs on every machine in the ensemble, and a series of “thin” helper objects connect the shared data to the diverse languages that live electronic musicians use during performance (Max, ChucK, SuperCollider, PD, etc.). The protocol/application has been developed and tested in the busy rehearsal and performance environment of McMaster University’s Cybernetic Orchestra, during the project “Scalable, Collective Traditions of Electronic Sound Performance” supported by Canada’s Social Sciences and Humanities Research Council (SSHRC), and the Arts Research Board of McMaster University.
Engineering Brief 70 (Download now)

EB3-5 A Microphone Technique for Improved Stereo Image, Spatial Realism, and Mixing Flexibility: STAAG (Stereo Technique for Augmented Ambience Gradient)—Jamie Tagg, McGill University - Montreal, Quebec, Canada
While working on location, recording engineers are often challenged by insufficient monitoring. Poor (temporary control room) acoustics or headphone monitoring can make judgments regarding microphone choice and placement difficult. These choices often lead to timbral, phase, and stereo image problems. We are often forced to choose between the improved spatial imaging of near-coincident techniques and the acoustic envelopment from spaced omni-directional mics. This poster proposes a new technique: STAAG (Stereo Technique for Augmented Ambience Gradient), which aims to improve stereo image, acoustic realism, and flexibility in the mix. The STAAG technique allows for adjustment of the acoustic envelopment once in a proper monitoring environment.
Engineering Brief 71 (Download now)

Monday, October 29, 10:15 am — 12:15 pm (Room 130)

Product Design: PD11 - Rub & Buzz and Other Irregular Loudspeaker Distortion

Presenter:
Wolfgang Klippel, Klippel GmbH - Dresden, Germany

Abstract:
Loudspeaker defects caused by manufacturing, aging, overload, or climate impact generate a special kind of irregular distortion, commonly known as rub & buzz, that is highly audible and intolerable to the human ear. Contrary to regular loudspeaker distortions defined in the design process, the irregular distortions are hardly predictable and are generated by an independent process triggered by the input signal. Traditional distortion measurements such as THD fail to detect those defects reliably. The tutorial discusses the most important defect classes, new measurement techniques, audibility, and the impact on perceived sound quality.

Monday, October 29, 10:45 am — 12:15 pm (Room 131)

Tutorial: T12 - Sound System Intelligibility

Presenters:
Ben Kok, BEN KOK acoustic consulting - Uden, Netherlands
Peter Mapp, Peter Mapp Associates - Colchester, Essex, UK

Abstract:
The tutorial will discuss the background to speech intelligibility and its measurement, how room acoustics can potentially affect intelligibility, and what measures can be taken to optimize the intelligibility of a sound system. Practical examples of real-world problems and solutions will be given based on the wide experience of the two presenters.

Monday, October 29, 11:00 am — 12:30 pm (Room 120)

Live Sound Seminar: LS11 - Audio DSP in Unreal-Time, Real-Time, and Live Settings

Chair:
Robert Bristow-Johnson, audioImagination - Burlington, VT, USA
Panelist:
Kevin Gross, AVA Networks - Boulder, CO, USA

Abstract:
In audio DSP we generally worry about two problem areas: (1) the Algorithm: what we're trying to accomplish with the sound and the mathematics for doing it; and (2) Housekeeping: the "guzzintas" and the "guzzoutas," and other overhead. On the other hand is the audio processing (or synthesis) setting which might be divided into three classes: (1) Non-real-time processing of sound files; (2) Real-time processing of a stream of samples; (3) Live processing of audio. The latter is more restrictive than the former. We'll get a handle on defining what is real-time and what is not, what is live and what is not. What are the essential differences? We'll discuss how the setting affects how the algorithm and housekeeping might be done. And we'll look into some common techniques and less common tricks that might assist in getting non-real-time algorithms to act in a real-time context and to get *parts* of a non-live real-time algorithm to work in a live setting.

Monday, October 29, 2:00 pm — 3:15 pm (Room 121)

Engineering Brief: EB4 - eBrief Presentations—Lectures 2

Chair:
Tad Rollow

EB4-1 Parametric Horn Design—Ambrose Thompson, Martin Audio - High Wycombe, UK
The principal barrier to more widespread use of numerical techniques for horn design is not a shortage of advanced computational libraries, but rather the difficulty in defining the problem to solve in a suitable format. The traditional approach of creating horn geometry in large commercial CAD programs then exporting, meshing, assigning conditions, solving, and inspecting results denies the engineer the ability to easily iterate the design. We introduce an object-orientated parametric description of a horn that enables the engineer to modify meaningful parameters to generate geometry of appropriate complexity. The entire process is performed in one environment and an early implementation has been shown to allow nearly 100 iterations of one horn in one week. Results for this and another multi-driver HF horn are given.
Engineering Brief 72 (Download now)

EB4-2 Integration of Touch Pressure and Position Sensing with Speaker Diaphragms—Adrian Freed, University of California, Berkeley - Berkeley, CA, USA
Speaker cones and other driver diaphragms are usually too fragile to be good sites for touch interaction. This can be solved by employing new, lightweight piezoresistive e-textiles with flat, rectangular, stiff surfaces used in full-range drivers from HiWave. Good low-frequency performance of piezoresistive fabric has an advantage over piezoelectric sensing for this situation. Applications of these integrated sensor/actuators include haptic feedback user interfaces and responsive electronic percussion instruments.
Engineering Brief 73 (Download now)

EB4-3 Power Entry—Where High Performance Design Begins—Christopher Peters, Schurter - Santa Rosa, CA, USA; Diane Cupples, Schurter, Inc. - Santa Rosa, CA, USA
What is EMC/EMI? Where does it come from? What steps do I need to take to ensure compatibility in today's world? Power quality as it relates to electromagnetic compatibility is a growing topic among audio equipment manufacturers with the advent of the digital age. This abstract looks at equipment emissions and susceptibility and how to remedy noise problems with effective EMC design in the early stages. The paper and presentation will also offer ideas for integrating voltage selection, overcurrent protection, and on/off switching into the overall aspects of compact, high performance design. Cord connection and retaining systems will also be covered.
Engineering Brief 74 (Download now)

EB4-4 Design and Construction of the Tri-Baffle: A Modular Acoustic Modification System for Task-Based Mixing Experiments—Scott Levine, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada; Brett Leonard, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada; Richard King, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada
The Tri-Baffle is a modular system capable of providing multiple acoustic conditions within a space through the use of absorptive, reflective, and diffusive materials. Each baffle is constructed in a triangular frame, and capable of rotation via a ground-positioned motor. The system was designed and constructed to fit multiple experimental requirements such as acoustic characteristics, time requirements, installation concerns, and portability. As constructed, the Tri-Baffle is fully portable and is capable of installation in any space where task-based experimentation is desired.
Engineering Brief 75 (Download now)

EB4-5 Another View of Distortion Perception—John Vanderkooy, University of Waterloo - Waterloo, ON, Canada; Kevin B. Krauel, University of Waterloo - Waterloo, ON, Canada
Perception of distortion is difficult to determine since it relies critically on signal level. We study a distortion characteristic possessing a relative distortion independent of signal level: a simple change in slope between positive and negative signal excursion. A mathematical analysis is presented, showing the resulting distortion to be mainly even harmonic but with some rectification effects, which need consideration. Various signals are evaluated by informal A/B listening tests, including pure tones and music. Judiciously chosen signals have distortions that are detectable only if they are above 1%, in keeping with psychoacoustic masking data, while real music signals are considerably more tolerant of distortion, up to levels of 5% or more! This implies that, except for crossover distortion, present-day electronic systems are all sufficiently linear.
Engineering Brief 76 (Download now)
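
The slope-change characteristic described in EB4-5 can be illustrated numerically. The following is a minimal sketch, not the authors' analysis: it assumes unity gain for positive excursions and gain (1 - eps) for negative ones, where the parameter `eps` and the function names are hypothetical and introduced only for illustration. Under that assumption the output equals (1 - eps/2)·x + (eps/2)·|x|, so a sine input picks up a DC offset (the rectification effect) plus even harmonics only, and every component scales linearly with input level.

```python
import math


def asymmetric_slope(x, eps):
    """Assumed characteristic: unity gain for x >= 0, gain (1 - eps) for x < 0."""
    return x if x >= 0 else (1.0 - eps) * x


def harmonic_amplitudes(eps, level=1.0, n=4096, max_harmonic=6):
    """Return [DC, H1, H2, ...] amplitudes for one period of a distorted sine."""
    y = [asymmetric_slope(level * math.sin(2 * math.pi * i / n), eps)
         for i in range(n)]
    amps = [abs(sum(y)) / n]  # DC term produced by the rectification effect
    for k in range(1, max_harmonic + 1):
        # Direct evaluation of the k-th Fourier coefficient over one period.
        re = sum(v * math.cos(2 * math.pi * k * i / n) for i, v in enumerate(y))
        im = sum(v * math.sin(2 * math.pi * k * i / n) for i, v in enumerate(y))
        amps.append(2 * math.hypot(re, im) / n)
    return amps


def relative_harmonics(eps, level):
    """Harmonic amplitudes normalized to the fundamental."""
    amps = harmonic_amplitudes(eps, level)
    return [a / amps[1] for a in amps]


if __name__ == "__main__":
    for level in (1.0, 0.01):
        ratios = relative_harmonics(0.1, level)
        print(level, ["%.5f" % r for r in ratios])
```

Running the script at two very different input levels prints the same normalized ratios, which is the level-independence property the abstract refers to; the odd harmonics above the fundamental come out at numerical zero.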

EB4-6 How Can Sample Rates Be Properly Compared in Terms of Audio Quality?—Richard King, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada; Daniel Levitin, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada; Brett Leonard, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada
A listening test was designed to compare audio quality at different sample rates. A Yamaha Disklavier player piano was used as a repeatable acoustic source, and the signal from the microphone preamps was sent to three identical analog-to-digital converters of the same make and model. The digitized signals were then re-converted to analog and compared to the original "live" signal through the use of a four-way switcher. Sample rates were 44.1, 96, and 192 kHz. Technical setup and the "somewhat inconclusive" results are discussed, along with some possible options for future testing.
Engineering Brief 77 (Download now)


EXHIBITION HOURS: October 27th 10am – 6pm; October 28th 10am – 6pm; October 29th 10am – 4pm

REGISTRATION DESK: October 25th 3pm – 7pm; October 26th 8am – 6pm; October 27th 8am – 6pm; October 28th 8am – 6pm; October 29th 8am – 4pm

TECHNICAL PROGRAM: October 26th 9am – 7pm; October 27th 9am – 7pm; October 28th 9am – 7pm; October 29th 9am – 5pm

Audio Engineering Society
