AES San Francisco 2010
Recording Industry Event Details


Thursday, November 4, 9:30 am — 11:30 am

Technical Tour: TT2 - Fantasy Studios

Fantasy Studios at the Zaentz Media Center is one of the Bay Area’s largest recording and post-production facilities, with a rich history spanning nearly 50 years at this location. Noted artists ranging from Creedence Clearwater Revival to Will.i.Am have recorded acclaimed albums at Fantasy Studios, and sound and scores for Oscar-winning films such as Ratatouille and The English Patient have been produced in this landmark building. The studio manager will lead this tour, featuring all three recording studios and one of the building’s theaters.

Technical Tours are made available on a first come, first served basis. Tickets can be purchased during normal registration hours at the convention center.

Price: $40 Members / $50 Nonmembers

Thursday, November 4, 11:30 am — 1:00 pm (Room 130)

Tutorial: T2 - Equalization—Are You Getting the Most Out of this Humble Effect?

Alex U. Case, University of Massachusetts Lowell - Lowell, MA, USA

Track by track, mix by mix, we reach for equalization constantly. Easy at first, EQ becomes more intuitive when you have a deep understanding of the parameters, types, and technologies used—plus deep knowledge of the spectral content of the most common pop and rock instruments. Alex Case offers a routine for applying EQ and strategies for its use: fixing problems, enhancing features, fitting the spectral pieces together, and more.

Thursday, November 4, 4:30 pm — 6:30 pm (Room 131)

Tutorial: T6 - Managing Tinnitus as a Working Audio Professional

Neil Cherian, Cleveland Clinic - Cleveland, OH, USA
Michael Santucci, Sensaphonics Hearing Conservation - Chicago, IL, USA

Tinnitus is a common yet poorly understood disorder where sound is perceived in the absence of an external source. Significant sound exposure, with or without hearing loss, is the most common risk factor. Tinnitus can be debilitating and can impair quality of life. Anxiety, depression, and sleep disorders are potential consequences. Most importantly for those in the audio industry, it can significantly impair auditory perception.

This tutorial will focus on methods of managing tinnitus in the life of an audio professional. Background information will be provided on the basic concept of tinnitus, pertinent anatomy and physiology, audiologic parameters of tinnitus, and current research. Suggestions for identifying and mitigating high-risk behaviors will be covered, and elements of the medical and audiologic evaluation of tinnitus will also be reviewed.

Thursday, November 4, 5:00 pm — 6:30 pm (Room 206)

Game Audio: G3 - The Wide Wonderful World of 5.1 Orchestral Recordings

Richard Dekkard, Director, Orphic Media LLC
Tim Gedemer, Owner/Supervising Sound Editor, Source Sound Inc.

When recording an orchestra was a simple affair using two or three microphones, the performance, the choice and placement of said microphones, and the quality of the recording medium were all that factored into the result. These days, orchestral recording takes almost as many forms as pop recording. Spot mics, multichannel arrays, postproduction, and editing are all used in the production process. In this panel, experts in both game and film audio will discuss the means by which producers and engineers arrive at their final goals for different formats and deal with the challenges of 5.1 orchestral recording for their respective media. Topics will include the different footprint limits in the 5.1 format used for games versus the film format, as well as those involved in streaming bandwidth for both games and movies. Panelists will go over editing in 5.1 for games to accommodate player-driven music as compared to the linear progression standard in film editing, and will also discuss the lack of standards for 5.1 in games versus the established process and standards for film.

Friday, November 5, 9:00 am — 10:45 am (Room 133)

Workshop: W5 - How Does It Sound Now? The Evolution of Audio

Gary Gottlieb, Webster University
Ed Cherney
Mark Rubel
Elliot Scheiner
Al Schmitt

With 27 Grammy awards between them, panelists Al Schmitt, Elliot Scheiner, Ed Cherney, and Mark Rubel are uniquely qualified to address the issues surrounding quality in audio, the one constant through decades of transitions in our business. Moderator Gary Gottlieb (engineer, author and educator) draws from the old Chet Atkins story with the punch line, "How does it sound now?" as these audio all-stars discuss the methodology employed when confronted with new and evolving technology and how we retain quality and continue to create a product that conforms to our own high standards. This may lead to other conversations about the musicians we work with, the consumers we serve, and the differences and similarities between their standards and our own. How high should your standards be? How should it sound now? How should it sound tomorrow?

Friday, November 5, 9:00 am — 10:30 am (Room 220)

Paper Session: P6 - Microphone Processing

Jon Boley

P6-1 Digitally Enhanced Shotgun Microphone with Increased Directivity
Helmut Wittek, SCHOEPS Mikrofone GmbH - Karlsruhe, Germany; Christof Faller, Illusonic LLC - Lausanne, Switzerland; Christian Langen, SCHOEPS Mikrofone GmbH - Karlsruhe, Germany; Alexis Favrot, Christophe Tournery, Illusonic LLC - Lausanne, Switzerland
Shotgun microphones are still state-of-the-art when the goal is to achieve the highest possible directivity and signal-to-noise ratio with high signal fidelity. As opposed to beamformers, properly designed shotgun microphones do not suffer greatly from inconsistencies and sound color artifacts. A digitally enhanced shotgun microphone is proposed, using a second backward-oriented microphone capsule and digital signal processing with the goal of improving directivity and reducing diffuse gain at low and medium frequencies significantly, while leaving the sound color essentially unchanged. Furthermore, the shotgun microphone’s rear lobe is attenuated.
Convention Paper 8187 (Purchase now)

P6-2 Conversion of Two Closely Spaced Omnidirectional Microphone Signals to an XY Stereo Signal
Christof Faller, Illusonic LLC - St-Sulpice, Switzerland
For cost and form factor reasons it is often advantageous to use omni-directional microphones in consumer devices. If the signals of a pair of such microphones are used directly, time-delay stereo with possibly some weak level-difference cues (device body shadowing) is obtained. The result is weak localization and little channel separation. If the microphones are relatively closely spaced, time-delay cues can be converted to intensity-difference cues by applying delay-and-subtract processing to obtain two cardioids. The delay-and-subtract processing is generalized to also be applicable when there is a device body between the microphones. The two cardioids could be directly used as stereo signal, but to prevent low frequency noise the output signals are derived using a time-variant filter applied to the input microphone signals.
Convention Paper 8188 (Purchase now)
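The delay-and-subtract step described in this abstract can be sketched as follows. This is a minimal free-field illustration with an integer-sample delay; the function name and parameters are invented for the example, and the paper's actual processing additionally handles a device body between the microphones and derives the outputs with a time-variant filter.

```python
import numpy as np

def omni_pair_to_cardioids(x1, x2, spacing_m=0.02, fs=48000, c=343.0):
    """Convert two closely spaced omni signals into forward/backward
    cardioids via delay-and-subtract (a first-order differential array).

    x1 is the front microphone, x2 the rear one. The subtraction delay
    equals the acoustic travel time across the spacing; it is rounded to
    whole samples here for simplicity (a real implementation would use a
    fractional-delay filter and equalize the resulting 6 dB/oct tilt).
    """
    delay = int(round(spacing_m / c * fs))  # travel time in samples
    x1d = np.concatenate([np.zeros(delay), x1[:len(x1) - delay]])
    x2d = np.concatenate([np.zeros(delay), x2[:len(x2) - delay]])
    fwd = x1 - x2d  # cardioid facing x1: nulls sound arriving from behind
    bwd = x2 - x1d  # cardioid facing x2
    return fwd, bwd
```

A source on the rear axis reaches x2 first and x1 exactly `delay` samples later, so its contribution cancels in `fwd` while remaining in `bwd`, which is what converts the time-delay cues into intensity-difference cues.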

P6-3 Determined Source Separation for Microphone Recordings Using IIR Filters
Christian Uhle, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Josh Reiss, Queen Mary University of London - London, UK
A method for determined blind source separation for microphone recordings is presented that attenuates the direct path cross-talk using IIR filters. The unmixing filters are derived by approximating the transmission paths between the sources and the microphones by a delay and a gain factor. For the evaluation, the proposed method is compared to three other approaches. Degradation of the separation performance is caused by fractional delays and the directivity of microphones and sources, which are discussed here. The advantages of the proposed method are low latency, low computational complexity, and high sound quality.
Convention Paper 8189 (Purchase now)
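As a rough illustration of the delay-and-gain approximation the abstract describes (not the paper's actual unmixing filters), a two-source, two-microphone direct-path cross-talk canceller with a known gain and delay can be written as a recursive (IIR) update; gain and delay values here are arbitrary stand-ins for quantities that would be estimated per setup.

```python
import numpy as np

def unmix_crosstalk(x1, x2, g=0.5, d=8):
    """Attenuate direct-path cross-talk between two close-miked sources.

    Assumed mixing model: each microphone picks up its own source plus
    the other source scaled by gain g and delayed by d samples. The exact
    inverse of that model is the recursive (IIR) update
        y1[n] = x1[n] - g * y2[n - d]
        y2[n] = x2[n] - g * y1[n - d]
    which has low latency and low computational cost.
    """
    n = len(x1)
    y1 = np.zeros(n)
    y2 = np.zeros(n)
    for i in range(n):
        f1 = y2[i - d] if i >= d else 0.0
        f2 = y1[i - d] if i >= d else 0.0
        y1[i] = x1[i] - g * f1
        y2[i] = x2[i] - g * f2
    return y1, y2
```

With the idealized integer delay the recovery is exact; the fractional delays and microphone/source directivity mentioned in the abstract are what degrade separation in practice.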

Friday, November 5, 9:15 am — 12:00 pm

Technical Tour: TT7 - Polarity Post Studios and Crescendo! Studios

Polarity Post Studios

In continuous operation since 1985, Barbary Coast resident Polarity Post Production is a five-studio audio postproduction house focusing on stereo and surround mixing for TV, radio, video games, film, and the Web, as well as ISDN patches and full-service localization. Its client list includes Apple, CBS, Electronic Arts, Disney, and Chevron. The tour will include a demonstration of the Penteo Stereo-to-5.1 conversion process that has been used in Quentin Tarantino's Inglourious Basterds and the major motion picture Watchmen, as well as in digital TV programming and SACD. Developed by John Wheeler at Penteo Surround, the process separates a stereo source into panorama-based slices for 5.1 surround masters, which can also downmix back to stereo with no sonic artifacts.

Crescendo! Studios

Crescendo! Studios, one of San Francisco’s premier audio post facilities, is located in the historic Barbary Coast waterfront district and was designed by WSDG in an Italian motif. The twin Roma and Firenze studios, along with the Dolby 5.1 surround equipped Venezia studio, marry state-of-the-art technology with old world aesthetics. All studios are built around Pro Tools HD Accel systems with HD video playback available facility wide. While touring Crescendo! Studios, AES members will experience the blend of surroundings, service, and staff that have made this studio a resource for media professionals for almost 15 years.

Technical Tours are made available on a first come, first served basis. Tickets can be purchased during normal registration hours at the convention center.

Price: $35 Members / $45 Nonmembers

Friday, November 5, 9:30 am — 11:00 am (Room 226)

Poster: P8 - Audio Processing—1

P8-1 Near and Far-Field Control of Focused Sound Radiation Using a Loudspeaker Array
Sangchul Ko, Youngtae Kim, Jung-Woo Choi, SAIT, Samsung Electronics Co. Ltd. - Gyeonggi-do, Korea
In this paper a sound manipulation technique is proposed to prevent unwanted eavesdropping, or disturbing others nearby, when a multimedia device is used in a public place. The technique creates a spatial region of high acoustic potential energy at the listener’s position. To this end, the paper discusses the design of multichannel filters with a spatial directivity pattern for a given arbitrary loudspeaker array configuration. First, some limitations of conventional beamforming techniques are presented; then a novel control strategy is suggested for reproducing a desired acoustic property in a spatial area of interest close to the loudspeaker array. The technique also allows an acoustic property to be controlled in an area relatively far from the array with a single objective function. In order to precisely produce a desired shape of energy distribution in both areas, a spatial weighting technique is introduced. The results are compared with those obtained by controlling each area separately.
Convention Paper 8198 (Purchase now)

P8-2 A Real-Time Implementation of a Novel Psychoacoustic Approach for Stereo Acoustic Echo Cancellation
Stefania Cecchi, Laura Romoli, Paolo Peretti, Francesco Piazza, Università Politecnica delle Marche - Ancona (AN), Italy
Stereo acoustic echo cancellers (SAECs) are used in teleconferencing systems to reduce undesired echoes originating from coupling between loudspeakers and microphones. The main problem of this approach is related to the issue of uniquely identifying each pair of room acoustic paths, due to high interchannel coherence. In this paper a real-time implementation of a novel approach for SAEC based on the psychoacoustic effect of missing fundamental is proposed. An adaptive algorithm is employed to track and remove the fundamental frequency of one of the two channels, ensuring a continuous decorrelation without affecting the stereo quality. Several tests are presented taking into account a real-time implementation on a DSP framework in order to confirm its effectiveness.
Convention Paper 8199 (Purchase now)

P8-3 Solo Plucked String Sound Detection by the Energy-to-Spectral Flux Ratio (ESFR)
Byung Suk Lee, LG Electronics Inc. - Seocho-Gu, Seoul, Korea, Columbia University, New York, NY, USA; Chang-Heon Lee, Yonsei University - Seoul, Korea; Gyuhyeok Jeong, In Gyu Kang, LG Electronics Inc. - Seocho-Gu, Seoul, Korea
We address the problem of distinguishing solo plucked string sound from speech. Due to the harmonic components present in both types of signals, a low complexity music/speech classifier often misclassifies these signals. To capture the sustained harmonic structures observed in solo plucked string sound, we propose a new feature, the Energy-to-Spectral Flux Ratio (ESFR). The values and the statistics of the ESFR for solo plucked string sound were distinct from those for speech when calculated over windows of 20 to 50 ms. By building a low complexity detector with the ESFR, we demonstrate the discriminating performance of the ESFR feature for the considered problem.
Convention Paper 8200 (Purchase now)
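The intuition behind the feature can be sketched as a ratio of frame energy to spectral flux. The exact formulation is in the paper; the version below is an illustrative reconstruction from the abstract, with invented frame sizes, so only the qualitative behavior should be relied on.

```python
import numpy as np

def esfr(x, frame=512, hop=256, eps=1e-12):
    """Illustrative Energy-to-Spectral Flux Ratio over a block of samples.

    Sustained harmonic content (e.g., a plucked string ringing out) changes
    little from frame to frame, so the spectral flux is small and the ratio
    is large; speech modulates quickly, giving larger flux and a smaller
    ratio. This is the contrast a low-complexity detector can threshold on.
    """
    win = np.hanning(frame)
    frames = [x[i:i + frame] * win for i in range(0, len(x) - frame, hop)]
    mags = [np.abs(np.fft.rfft(f)) for f in frames]
    energy = sum(float(np.sum(m ** 2)) for m in mags)
    flux = sum(float(np.sum((m1 - m0) ** 2))
               for m0, m1 in zip(mags[:-1], mags[1:]))
    return energy / (flux + eps)
```

On a steady tone this ratio is orders of magnitude larger than on rapidly varying material, which matches the 20 to 50 ms analysis windows reported in the abstract.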

P8-4 Separation of Repeating and Varying Components in Audio Mixtures
Sean Coffin, Stanford University - Stanford, CA, USA
A large amount of modern pop music contains digital “loops” or “samples” (short audio clips) that appear multiple times during a song. In this paper a novel approach to separating these exactly repeating component waveforms from the rest of an audio mixture is presented. By examining time-frequency representations of the mixture during several instances of a single repeating component and taking the complex value for each time-frequency bin with the smallest magnitude across all instances, we can effectively extract the content that is perceived to be repeating, given that the rest of the mixture varies sufficiently. Results are presented demonstrating successful application to commercially available recordings as well as to constructed audio mixtures, achieving signal-to-interference ratios of up to 42.8 dB.
Convention Paper 8201 (Purchase now)
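The per-bin minimum-magnitude selection is simple to state in code. The sketch below assumes the loop instances have already been located and time-aligned (the hard part in practice), and operates on their complex STFT matrices:

```python
import numpy as np

def extract_repeating(instances):
    """Estimate the repeating component from several aligned loop instances.

    `instances` is a list of equally sized complex STFT matrices, one per
    occurrence of the suspected loop in the mixture. For each
    time-frequency bin, the complex value with the smallest magnitude
    across instances is kept: the varying material (vocals, fills) adds
    energy on top of the loop, so the minimum-magnitude bin is typically
    the one least contaminated by it.
    """
    stack = np.stack(instances)             # (n_instances, freq, time)
    idx = np.argmin(np.abs(stack), axis=0)  # cleanest instance per bin
    return np.take_along_axis(stack, idx[None, ...], axis=0)[0]
```

The caveat from the abstract applies: if the non-loop material can destructively cancel loop energy in a bin, the minimum no longer coincides with the cleanest observation, which is why the method relies on the rest of the mixture varying sufficiently.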

P8-5 High Quality Time-Domain Pitch Shifting Using PSOLA and Transient Preservation
Adrian von dem Knesebeck, Pooya Ziraksaz, Udo Zölzer, Helmut-Schmidt-University - Hamburg, Germany
An enhanced pitch shifting system is presented that uses the Pitch Synchronous Overlap Add (PSOLA) technique and a transient detection for processing of monophonic speech or instrument signals. The PSOLA algorithm requires the pitch information and the pitch marks for the signal segmentation in the analysis stage. The pitch is acquired using a well established pitch detector. A new robust pitch mark positioning algorithm is presented that achieves high quality results and allows the positioning of the pitch marks in a frame-based manner to enable real-time application. The quality of the pitch shifter is furthermore enhanced by extracting the transient components before the PSOLA and reapplying them at the synthesis stage to eliminate repetitions of the transients.
Convention Paper 8202 (Purchase now)

Friday, November 5, 10:30 am — 12:45 pm (Room 206)

Workshop: W6 - Single Unit Surround Microphones

Eddy B. Brixen, EBB-Consult
Gary Elko, mh acoustics LLC
David Josephson, Josephson Engineering
Jim Pace, Sanken Microphones / Plus 24
Pieter Schillebeeckx, SoundField
Morten Stove, DPA Microphones
Mattias Strömberg, Milab
Helmut Wittek, SCHOEPS Mikrofone GmbH

The workshop will present available single-unit surround sound microphones in a kind of "shoot-out." A number of these microphones are on the market, with more on the way, and they are based on different principles; their compact size, however, may impose restrictions on performance. This workshop will present the different products and the ideas and theories behind them.

Friday, November 5, 11:15 am — 12:45 pm (Room 130)

Workshop: W8 - Mastering: Art, Perception, Technologies–1

Michael Romanowski, Michael Romanowski Mastering
Gavin Lurssen, Gavin Lurssen Mastering
Andrew Mendleson, Georgetown Masters
Joe Palmacio, The Place for Mastering
Mike Wells, Mike Wells Mastering

This is a continuation of the Mastering panel from AES 2009 in New York, addressing the state of mastering in 2010. Mastering engineers use technology to achieve the desired results, but the perceptions and approaches that lead an engineer to make those choices get little or no discussion. In this two-part series we talk about art, perception, and technology as they pertain to the mastering industry in 2010 and beyond. This particular discussion will focus on mastering technologies and the state of mastering today and looking forward.

Friday, November 5, 11:30 am — 1:00 pm (Room 226)

Poster: P10 - Audio Processing—2

P10-1 MPEG-A Professional Archival Application Format and its Application for Audio Data Archiving
Noboru Harada, Yutaka Kamamoto, Takehiro Moriya, NTT Communication Science Labs. - Atsugi, Kanagawa, Japan; Masato Otsuka, Memory-Tech Corporation - Tokyo, Japan
ISO/IEC 23000-6 (MPEG-A) Professional Archival Application Format (PA-AF) has just been standardized. This paper proposes an optimized and standard compliant implementation of a PA-AF archiving tool for audio archiving applications. The implementation made use of an optimized MPEG-4 Audio Lossless Coding (ALS) codec library for audio data compression and Gzip for other files. The PA-AF specification was extended to support platform specific attributes of Mac OSs while keeping interoperability among other OSs. Performance test results for actual audio data, such as ProTools HD projects, show that the processing time of a devised PA-AF archiving tool is twice as fast as that of MacDMG and WinZip while the compressed data size is much smaller than that of MacDMG and WinZip.
Convention Paper 8207 (Purchase now)

P10-2 Switched Convolution Reverberator with Two-Stage Decay and Onset Time Control
Keun-Sup Lee, Jonathan S. Abel, Stanford University - Stanford, CA, USA
An efficient artificial reverberator with two-stage decay and onset time controls is presented. A second-order comb filter controlling the reverberator's frequency-dependent decay rates and onset times drives a switched convolution with short noise sequences. In this way, a non-exponential reverberation envelope is produced by the comb filter, while the switched convolution structure produces a high echo density. Several schemes for generating two-stage decays and onset time controls, with different onset characteristics in different frequency bands, are described.
Convention Paper 8208 (Purchase now)

P10-3 Guitar-to-MIDI Interface: Guitar Tones to MIDI Notes Conversion Requiring No Additional Pickups
Mamoru Ishikawa, Takeshi Matsuda, Michael Cohen, University of Aizu - Aizu-Wakamatsu, Fukushima-ken, Japan
Many musicians, especially guitarists (both professional and amateur), use effects processors. In recent years, a large variety of digital processing effects have been made available to consumers. Further, desktop music, the “lingua franca” of which is MIDI, has become widespread through advances in computer technology and DSP. Therefore, we are developing a “Guitar to MIDI” interface device that analyzes the analog guitar audio signal and emits a standard MIDI stream. Similar products are already on the market (such as the Roland GI-20 GK-MIDI Interface), but almost all of them need additional pickups or guitar modification. The interface we are developing requires no special guitar accessories. We describe a prototype platformed on a PC that anticipates a self-contained embedded system.
Convention Paper 8209 (Purchase now)

P10-4 A Mixed Mechanical/Digital Approach for Sound Beam Pointing with Loudspeakers Line Array
Paolo Peretti, Stefania Cecchi, Francesco Piazza, Università Politecnica delle Marche - Ancona (AN), Italy; Marco Secondini, Andrea Fusco, FBT Elettronica S.p.a. - Recanati (MC), Italy
Digital steering is often used in line array sound systems in order to tilt the reproduced sound beam in a desired direction. Unfortunately, the working frequency range is limited to low and medium frequencies; sound beams at high frequencies can be tilted only by mechanical steering, which involves both expensive manufacturing and a greater environmental impact. The proposed solution is a mixed approach to sound beam steering, combining an on-axis mechanical rotation of each loudspeaker with the classical digital control applied to the entire system. In this manner the sound beam can be tilted at high frequencies as well, while the linear array geometry is maintained. Simulations considering real loudspeaker directivity will be shown to demonstrate the effectiveness of the proposed approach.
Convention Paper 8210 (Purchase now)

P10-5 The Non-Flat and Continually Changing Frequency Response of Multiband Compressors
Earl Vickers, STMicroelectronics, Inc. - Santa Clara, CA, USA
Multiband dynamic range compressors are powerful, versatile tools for audio mastering, broadcast, and playback. However, they are subject to certain problems relating to frequency response. First, when excited by a time-varying narrow-band input such as a swept sinusoid, they create unwanted magnitude peaks at the band boundaries. Second, and more importantly, the frequency response continually changes, which may have unwanted effects on the long-term average spectral balance. This paper proposes a frequency-domain solution for the unwanted magnitude peaks, whereby slight adjustments to the band boundaries prevent sinusoidal peaks from being midway between two bands. For the second problem, real-time spectral balance compensation may be implemented in either the time or frequency domain.
Convention Paper 8211 (Purchase now)

P10-6 Volterra Series-Based Distortion Effect
Finn T. Agerkvist, Technical University of Denmark - Kgs. Lyngby, Denmark
A large part of the characteristic sound of the electric guitar comes from nonlinearities in the signal path. Such nonlinearities may come from the input or output stage of the amplifier, which is often equipped with vacuum tubes, or from a dedicated distortion pedal. In this paper the Volterra series expansion for nonlinear systems is investigated with respect to generating good distortion. The Volterra series allows unlimited adjustment of the level and frequency dependency of each distortion component. Subjectively relevant ways of linking the different orders are discussed.
Convention Paper 8212 (Purchase now)

Friday, November 5, 2:00 pm — 4:00 pm (Room 134)

Special Event: Platinum Mastering

Bob Ludwig
Michael Fremer
Doug Sax

Mastering legend Bob Ludwig will moderate a panel that explores vinyl mastering and disc cutting. Besides Bob, the other panelists will be renowned engineer and disc-cutting expert Doug Sax and vinyl guru Michael Fremer, producer of the DVD 21st Century Vinyl: Michael Fremer's Practical Guide to Turntable Set-Up. Panelists will discuss how they approach vinyl mastering, disc cutting, and turntable setup. Ludwig will show archival videos of how lacquers are made as well as a clip of him cutting the vinyl master to the hit Genesis album "Invisible Touch."

Friday, November 5, 2:30 pm — 6:30 pm (Room 220)

Paper Session: P11 - Acoustical and Physical Modeling

Julius O. Smith

P11-1 Virtual Acoustic Prototyping—Practical Applications for Loudspeaker Development
Alex Salvatti, JBL Professional - Northridge, CA, USA
Acoustic simulations using finite elements have been used in loudspeaker development for over 20 years, with complexity and accuracy accelerating in tandem with the increases in computing power generally available on the engineering desktop. Using user-friendly, modern FEA software, the author presents an overview of methods for building virtual prototypes of both horns and loudspeaker drivers that allow a significant reduction in the number of physical prototypes, as well as reduced development time. A comparison of simulated vs. measured data proves the validity of the methods.
Convention Paper 8213 (Purchase now)

P11-2 Simulation of Horn Driver Response by Combination of Matrix Analysis and FEA
Alex Voishvillo, JBL Professional - CA, USA
To assess the performance of a horn driver (a compression driver loaded by a horn), frequency response must be measured on-axis and off-axis. The measurement process is time-consuming, especially if the entire 3-dimensional “balloon” of responses is to be measured. Directional responses of the horn alone (without the compression driver) can be predicted by FEA (Finite Element Analysis) or BEA (Boundary Element Analysis). However, FEA or BEA of the horn alone provides only the horn's relative directional properties. The SPL responses of the horn driver at different angles remain unknown, because they depend on the interaction of the electrical, mechanical, and acoustical parameters of the compression driver with the acoustical parameters of the horn. A new method based on a combination of FEA and matrix analysis makes it possible to predict the response of a combination of various compression drivers and horns without measuring each combination, and even without physically building the horns. This method was verified during the development of the new AM series of JBL professional loudspeaker systems and showed high accuracy.
Convention Paper 8214 (Purchase now)

P11-3 Dynamic Motion of the Corrugated Ribbon in a Ribbon Microphone
Daniel Moses Schlessinger, Sennheiser DSP Research Laboratory - Palo Alto, CA, USA; Jonathan S. Abel, Stanford University - Stanford, CA, USA
Ribbon microphones are known for their warm sonics, owing in part to the unique ribbon motion induced by the sound field. Here the motion of the corrugated ribbon element in a sound field is considered, and a physical model of the ribbon motion is presented. The model separately computes propagating torsional disturbances and coupled transverse and longitudinal disturbances. Each propagation mode is implemented as a mass-spring model where a mass is identified with a ribbon corrugation fold. The model is parametrized using ribbon material and geometric properties. Laser vibrometer measurements are presented, revealing stiffness in the transverse and longitudinal propagation and showing close agreement between measured and modeled ribbon motion.
Convention Paper 8215 (Purchase now)

P11-4 Modeling of Leaky Acoustic Tube for Narrow-Angle Directional Microphone
Kazuho Ono, Takehiro Sugimoto, Akio Ando, Kimio Hamasaki, NHK Science and Technology Research Laboratories - Kinuta Setagaya-ku, Tokyo, Japan; Takeshi Ishii, Yutaka Chiba, Keishi Imanaga, Sanken Microphone Co. Ltd. - Suginami-ku, Tokyo, Japan
Line microphones have long been popular as narrow directional microphones. Their structure adopts a leaky acoustical tube with many slits to suppress off-axis sensitivity, together with a directional capsule attached to this tube. Although many microphones of this type are on the market, there has been no quantitative theory to explain their behavior, which is very important for effectively designing directivity. We therefore modeled the leaky acoustical tube using a distributed equivalent circuit and combined it with the directional capsule’s equivalent circuit model. The analysis showed that the model agreed well with the measurement results, particularly in the directional characteristics, while an ordinary model of the acoustical tube using delay-and-sum modeling did not.
Convention Paper 8216 (Purchase now)

P11-5 Modeling Viscoelasticity of Loudspeaker Suspensions Using Retardation Spectra
Tobias Ritter, Finn Agerkvist, Technical University of Denmark - Kgs. Lyngby, Denmark
It is well known that, due to viscoelastic effects in the suspension, the displacement of the loudspeaker increases with decreasing frequency below the resonance. Present creep models are either not precise enough, or purely empirical and not derived from physical principles. In this investigation, the viscoelastic retardation spectrum, which provides a more fundamental description of the suspension viscoelasticity, is first used to explain the accuracy of the empirical LOG creep model (Knudsen et al.). Then, two extensions to the LOG model are proposed that include the low and high frequency limits of the compliance, not accounted for in the original LOG model. The new creep models are verified by measurements on two 5.5-inch loudspeakers with different surrounds.
Convention Paper 8217 (Purchase now)

P11-6 Physical Modeling and Synthesis of Motor Noise for Replication of a Sound Effects Library
Simon Hendry, Josh Reiss, Queen Mary University of London - London, UK
This paper presents the results of objective tests exploring the concept of using a small number of physical models to create and replicate a large number of samples from a traditional sound effects library. The design of a DC motor model is presented and this model is used to create both a household drill and a small boat engine. The harmonic characteristics, as well as the spectral centroid were compared with the original samples, and all the features agree to within 6.1%. The results of the tests are discussed with a heavy emphasis on realism and perceived accuracy, and the parameters that have to be improved in order to humanize a model are explored.
Convention Paper 8218 (Purchase now)

P11-7 Measures and Parameter Estimation of Triodes for the Real-Time Simulation of a Multi-Stage Guitar Preamplifier
Ivan Cohen, Ircam - Paris, France, Orosys R&D, Montpellier, France; Thomas Hélie, Ircam - Paris, France
This paper deals with the real-time simulation of a multi-stage guitar preamplifier. Dynamic triode models based on Norman Koren’s model, together with "secondary phenomena" such as grid rectification and parasitic capacitances, are considered. The circuit is then modeled by a nonlinear differential-algebraic system with extended state-space representations. Standard numerical schemes yield efficient, stable simulations of the circuit and are implemented as VST plug-ins. Measurements of real triodes were performed in order to develop new triode models and to characterize the capabilities of aged and new triodes. The results are compared for all the models, using lookup tables generated from the measurements and Norman Koren’s model with its parameters estimated from the measurements.
Convention Paper 8219 (Purchase now)

P11-8 ZFIT: A MATLAB Tool for Thiele-Small Parameter Fitting and Optimization
Christopher Struck, CJS Labs - San Francisco, CA, USA
Over the years, many approaches to the calculation of the Thiele-Small parameters have been presented. Most current methods rely upon curve-fitting the impedance magnitude data to a specific lumped parameter model. A flexible MATLAB least-mean-squares optimization tool for complex loudspeaker impedance data is described. Magnitude and phase data are fit to a user-selected lumped parameter model of variable complexity. Appropriate constraints on the optimization help identify whether the selected model is of sufficient order or overly complex for the given data. Examples are shown for impedance data from several different loudspeaker drivers.
Convention Paper 8220 (Purchase now)
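The lumped parameter model at the heart of such a tool can be sketched as follows. This is the basic Thiele-Small voice-coil impedance evaluated in Python rather than MATLAB, with illustrative driver values not taken from the paper; ZFIT itself fits models of variable complexity to measured magnitude and phase by least squares.

```python
import numpy as np

def ts_impedance(f, Re, Le, Bl, Rms, Mms, Cms):
    """Complex voice-coil impedance of the basic Thiele-Small lumped model:
    electrical Re + jwLe in series with the motional impedance
    Bl^2 / (Rms + jwMms + 1/(jwCms)) reflected from the mechanical side."""
    w = 2 * np.pi * f
    z_mech = Rms + 1j * w * Mms + 1.0 / (1j * w * Cms)
    return Re + 1j * w * Le + Bl**2 / z_mech

# Illustrative woofer-like parameters (SI units):
f = np.arange(10.0, 200.0, 0.1)
z = ts_impedance(f, Re=6.0, Le=1e-4, Bl=7.0, Rms=1.5, Mms=0.012, Cms=1.2e-3)
f_peak = f[np.argmax(np.abs(z))]  # |Z| peaks near fs = 1/(2*pi*sqrt(Mms*Cms))
```

A fitting tool adjusts (Re, Le, Bl, Rms, Mms, Cms), or the parameters of a higher-order model, until this predicted complex impedance matches the measured data in a least-squares sense.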

Friday, November 5, 4:00 pm — 7:00 pm (Room 206)

Student / Career: Recording Competition Surround

The Student Recording Competition is a highlight at each convention. A distinguished panel of judges participates in critiquing finalists of each category in an interactive presentation and discussion. This event presents stereo and surround recordings in these categories:

• Surround Sound for Picture 4:00 pm to 5:00 pm
• Surround Classical 5:00 pm to 6:00 pm
• Surround Non-Classical 6:00 pm to 7:00 pm

The top three finalists in each category, as identified by our judges, present a short summary of their production intentions and the key recording and mix techniques used to realize their goals. They then play their projects for all who attend. Meritorious awards are determined here and will be presented at the closing Student Delegate Assembly Meeting (SDA-2) on Sunday afternoon.

The competition is a great chance to hear the work of your fellow students at other educational institutions. Everyone learns from the judges’ comments even if your project isn’t one of the finalists, and it's a great chance to meet other students and faculty.

Friday, November 5, 4:30 pm — 6:00 pm (Room 132)

Workshop: W11 - AES42 and Digital Microphones

Helmut Wittek, SCHOEPS Mikrofone GmbH
Stephan Flock, DirectOut GmbH
Tom Frey, Sennheiser
Stephan Peus, Georg Neumann GmbH

The AES42 interface for digital microphones is not yet widely used. This may be due to the relative youth of digital microphone technology, but also to a lack of knowledge of and practice with digital microphones and the corresponding interface. The advantages and disadvantages have to be communicated in an open and neutral way, regardless of commercial interests, on the basis of the actual needs of engineers. Building on an available white paper about AES42 and digital microphones, which was compiled by several authors and aims to provide neutral, in-depth information, this workshop intends to examine the facts and prejudices surrounding the topic.

Friday, November 5, 4:30 pm — 6:30 pm (Room 131)

Workshop: W12 - Keep Turning it Down! Developing an Exit Strategy for the Loudness Wars

Martin Walsh, DTS Inc.
Bob Ludwig, Gateway Mastering
Thomas Lund, TC Electronic
Susan Rogers, Berklee College of Music

Following on from the popular workshop presented at the 127th AES Convention that delved into topics relating to the nature and the consequences of the loudness wars, our panel of loudness experts and "master" mastering engineers will provide an update on the progress toward ending the war and returning peace, harmony, and dynamic range to the people.

The workshop will focus on alternatives to the practice of overly aggressive dynamic range compression using weapons such as seasoned mastering techniques and gain normalization algorithms and standards. Audience participation is encouraged and all are welcome to voice their own opinion and comments in relation to the issues discussed.

Friday, November 5, 4:30 pm — 6:00 pm (Room 226)

Poster: P14 - Loudspeakers and Microphones

P14-1 Coaxial Flat Panel Loudspeaker System with Dynamic Push-Pull Drive
Drazenko Sukalo, DSLab - Device Solution Laboratory - Munich, Germany
After the successful introduction of the flat television, acousticians have been concerned with the design of a “full-range” flat panel loudspeaker. A new design with low manufactured depth, consisting of an array of two conventional cone drivers and a transmission line, together with a method for driving them, is presented. The main aim was to build a small flat panel box with extended low-frequency response and low-distortion output due to the extended linear diaphragm excursion. The PSpice-OrCAD® simulator was used to represent a distributed model of a transmission line. The results of the simulation show the influence of the parameters of the transmission-line enclosure on the impedance curve and resonant frequency of the woofer driver. Among other things, this paper is concerned with an active filter design for driving loudspeaker drive units in an appropriate phase relationship in the low-frequency region by means of implementing the DPP drive. A prototype of the flat panel loudspeaker was built according to the described design concept, and the results of sound pressure level measurements are presented. The design results from work performed for DSLab and is subject to the referenced patent.
Convention Paper 8235 (Purchase now)

P14-2 A Novel Universal-Serial-Bus-Powered Digitally Driven Loudspeaker System with Low Power Dissipation and High Fidelity
Hajime Ohtani, Akira Yasuda, Kenzo Tsuihiji, Ryota Suzuki, Daigo Kuniyoshi, Hosei University - Koganei, Tokyo, Japan; Junichi Okamura, Trigence Semiconductor - Chiyoda, Tokyo, Japan
We propose a novel digitally driven loudspeaker system in which a newly devised mismatch-shaper method, multilevel noise-shaping dynamic element matching, is used to realize high fidelity, high sound power level, and low power dissipation. The unit used for the mismatch-shaper method can easily increase the number of sound pressure levels with the aid of an H-bridge circuit, even when the number of sub-speakers is fixed. Further, it reduces the noise caused by quantization and loudspeaker mismatches and decreases the switching loss. The output sound power level of the system, equipped with six voice coils, is 94 dB/m when a 3.3-V universal-serial-bus power supply is used exclusively. The power efficiency is 95% at 0 dBFS and 75% at –10 dBFS.
Convention Paper 8236 (Purchase now)

P14-3 Loudspeaker Rub Fault Detection by Means of a New Nonstationary Procedure Test
German Ruiz, Vicent Sala, Miguel Delgado, Juan Antonio Ortega, UPC-Universitat Politecnica de Catalunya - Terrassa, Spain
This paper addresses the detection of rub defects in loudspeakers. The study includes a simulation, with a rub model based on classical static Coulomb friction added to the parametric model of loudspeaker nonlinearities, to demonstrate the viability of the current signal for rub failure detection. The electric current signal is analyzed by means of the Zhao-Atlas-Marks distribution (ZAMD). A failure extractor based on segmentation of relevant harmonic ZAMD frequency regions and the Mahalanobis distance is presented. The simulation and experimental results show the effectiveness and reliability of the presented rub detection method.
Convention Paper 8237 (Purchase now)
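
The Mahalanobis distance used here as the failure classifier is a standard computation: the distance of a feature vector from a reference class, weighted by that class's covariance. A minimal sketch (the feature vectors below are made-up placeholders, not the paper's ZAMD features):

```python
import numpy as np

def mahalanobis(x, mean, cov):
    """Distance of feature vector x from a class described by (mean, cov)."""
    d = x - mean
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

# Toy "healthy loudspeaker" feature distribution (illustrative numbers only).
healthy = np.array([[1.0, 0.2], [1.1, 0.25], [0.9, 0.18], [1.05, 0.22]])
mean = healthy.mean(axis=0)
cov = np.cov(healthy, rowvar=False)

d_ok = mahalanobis(np.array([1.0, 0.21]), mean, cov)   # near the cluster
d_rub = mahalanobis(np.array([3.0, 0.9]), mean, cov)   # rub-like outlier
```

A rub fault would then be declared when the distance of a measured feature vector exceeds a threshold learned from healthy units.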

P14-4 Contributions to the Improvement of the Response of a Pleated Loudspeaker
Jaime Ramis, Rita Martinez, Acustica Beyma S.L. - Moncada, Valencia, Spain; E. Segovia, Obras Públicas e Infraestructura Urbana - Spain; Jesus Carbajo, Jaime Ramis, Universidad de Alicante - Alicante, Spain
In this paper we describe some results that have led to the improvement of the response of an Air Motion Transformer loudspeaker. First, it is noteworthy that an approximate analytical solution has been found to the system of differential equations that governs the behavior of the moving assembly of this type of transducer, valid when the length of the pleat is much greater than the radius of the cylindrical part. This solution is valid for any type of analysis (static, modal, and harmonic), and the modes are significantly simplified under the above-mentioned hypothesis. In addition, we have analyzed the influence of the thickness and the perforation shape of the pole piece on the frequency response of the loudspeaker.
Convention Paper 8238 (Purchase now)

P14-5 Exploring the Ultra-directional Acoustic Responses of an Electret Cell Array Loudspeaker
Yu-Chi Chen, Wen-Ching Ko, National Taiwan University - Taipei, Taiwan; Chang-Ho Liou, National Taiwan University - Taipei, Taiwan, Industrial Technology Research Institute, Hsinchu, Taiwan; Wen-Hsin Hsiao, Chih-Chiang Cheng, Wen-Jong Wu, Pei-Zen Chang, National Taiwan University - Taipei, Taiwan; Chih-Kung Lee, National Taiwan University - Taipei, Taiwan, Institute for Information Industry, Taipei, Taiwan
In recent years, novel thin-plate loudspeakers have attracted much interest. Applications in areas such as 3C peripherals, automobile audio systems, and home theater have been actively discussed. However, the acoustic directivity of a thin-plate loudspeaker depends on the frequency response, and at present thin-plate loudspeakers have poor directivity. If this limitation can be overcome, thin-plate loudspeakers can find useful applications in museums, supermarkets, or exhibition areas that require channeling sound to a particular area or location without affecting nearby areas or unintended audiences. Previous studies have confirmed electret cell arrays to be excellent flexible flat loudspeakers, since they can create high-performance sound in the mid-to-high frequency range. An electret loudspeaker can generate ultra-directional audible sound by adjusting the array size, amplitude modulation, and layout structure.
Convention Paper 8239 (Purchase now)

P14-6 A Soundfield Microphone Using Tangential Capsules
Eric Benjamin, Surround Research - Pacifica, CA, USA
The traditional soundfield microphone is a tetrahedral array of pressure-gradient microphones, the outputs of which are linearly combined to realize signals proportional to those of co-located microphones: one with omnidirectional sensitivity and three orthogonal microphones with figure-of-eight sensitivity. This configuration works well and has been the basis of commercial products for a number of years. Recently, an alternative array type has been disclosed [2,3] by Craven, Law, and Travis, composed of pressure-gradient sensors arranged with their principal axes oriented tangentially with respect to the center. Additional analysis has been performed, and several prototypes were constructed and evaluated.
Convention Paper 8240 (Purchase now)

P14-7 A 2-Way Loudspeaker Array System with Pseudorandom Spacing for Music Concerts
Yuki Ayabe, Saburo Nakano, Tokyo City University - Setagaya-ku, Tokyo, Japan; Kaoru Ashihara, Advanced Industrial Science and Technology - Tsukuba, Japan; Shogo Kiryu, Tokyo City University - Setagaya-ku, Tokyo, Japan
A 96-channel loudspeaker array system that allows real-time control of the sound field has been developed for live music concerts. Multiple sounds focused at different points can be generated and controlled independently using the system. The variable delay circuits, the controller of the power amplifier, and the communication circuit between the hardware and the computer are implemented in FPGAs. In order to extend the frequency range and reduce spatial aliasing, the loudspeaker array is assembled from two-way loudspeakers with pseudorandom spacing.
Convention Paper 8241 (Purchase now)
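
The delays needed to focus an array at a point follow directly from geometry: each loudspeaker is delayed so that all wavefronts arrive at the focal point simultaneously. A small sketch of generic delay-and-sum focusing (not the authors' FPGA implementation; the array layout and speed of sound are assumed):

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed)

def focus_delays(speaker_xy, focus_xy):
    """Per-speaker delays (s) that align all arrivals at the focal point."""
    d = np.linalg.norm(speaker_xy - focus_xy, axis=1)   # path lengths
    return (d.max() - d) / C              # farthest speaker gets zero delay

# A small linear array with pseudorandom spacing, focused 2 m in front.
rng = np.random.default_rng(0)
x = np.cumsum(0.05 + 0.05 * rng.random(16))             # irregular positions
speakers = np.column_stack([x, np.zeros_like(x)])
focus = np.array([x.mean(), 2.0])
delays = focus_delays(speakers, focus)

# Total time of flight (delay + propagation) is equal for every speaker.
tof = delays + np.linalg.norm(speakers - focus, axis=1) / C
```

Several focal points can be served at once by maintaining one such delay set per focus and summing the delayed signals per loudspeaker, which is what the FPGA delay lines in the system above make possible in real time.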

Friday, November 5, 5:45 pm — 7:00 pm (Room 133)

Broadcast and Media Streaming: B9 - Audio Processing for Streaming

Bill Sacks, Optimod.FM
Ray Archie, CBS
Frank Foti, Telos
Greg Ogonowski, Orban
Skip Pizzi, Consultant/Radio Ink

Finding the silver lining in the internet cloud

Traditional media people from broadcast and recording have a far different perspective from IT people. To a seasoned audio engineer, "compression" now means two very different things, depending on whether the conversation is about dynamic range reduction or a data stream's efficiency. We must learn to communicate with one another. We must learn the TCP/IP language and protocols well enough to relate our needs to our IT counterparts, and we need to be able to teach them, in their language, what we need them to do with us in order to accomplish our mission.

This panel will discuss the evolving relationship between audio and IT and how to improve not just the technical interfaces but also the mutual understanding of each other's needs.

Saturday, November 6, 9:00 am — 10:45 am (Room 132)

Product Design: PD4 - Grounding & Shielding - Circuits and Interference (Part 1)

Ralph Morrison

This first session will discuss the way signals and power are transported. We will discuss the basic meanings of words such as voltage, current, capacitance, and inductance; the role conductor geometries play in controlling where signals and power can travel; the problems of utility and facility design, together with the meaning of ground and earth; the interference problems created by transformers and facility wiring; and shielding as applied to analog circuits and radiating structures. Terms such as differential, balanced, common-mode, normal-mode, and single-ended will be explained.

Saturday, November 6, 9:00 am — 1:00 pm (Room 220)

Paper Session: P15 - Multichannel Audio Playback

Alex Voishvillo

P15-1 Why Ambisonics Does Work
Eric Benjamin, Surround Research - Pacifica, CA, USA; Richard Lee, Pandit Littoral - Cooktown, Australia; Aaron Heller, SRI International - Menlo Park, CA, USA
Several techniques exist for surround sound, including Ambisonics, VBAP, WFS, and pair-wise panning. Each system has strengths and weaknesses, but Ambisonics has long been favored for its extensibility and for being a complete solution, including both recording and playback. Yet Ambisonics has not met with great critical or commercial success despite having been available in one form or another for many years. Some observers have gone so far as to suggest that Ambisonics can’t work. The present paper provides an analysis of the performance of Ambisonics according to various psychoacoustic mechanisms in spatial hearing, such as localization and envelopment.
Convention Paper 8242 (Purchase now)

P15-2 Design of Ambisonic Decoders for Irregular Arrays of Loudspeakers by Non-Linear Optimization
Aaron J. Heller, SRI International - Menlo Park, CA, USA; Eric Benjamin, Surround Research - Pacifica, CA, USA; Richard Lee, Pandit Littoral - Cooktown, Queensland, Australia
In previous papers, the present authors described techniques for design, implementation, and evaluation of Ambisonic decoders for regular loudspeaker arrays. However, to accommodate domestic listening rooms, irregular arrays are often required. Because the figures of merit used to predict decoder performance are non-linear functions of loudspeaker positions, non-linear optimization techniques are needed. In this paper we discuss the implementation of an open-source application based on the NLopt non-linear optimization software library that derives decoders for arbitrary arrays of loudspeakers, as well as providing a prediction of their performance using psychoacoustic criteria, such as Gerzon’s velocity and energy localization vectors. We describe the implementation and optimization criteria and report on listening tests comparing the decoders produced.
Convention Paper 8243 (Purchase now)
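
Gerzon's velocity localization vector, one of the figures of merit named above, is straightforward to evaluate for a given set of decoder gains: rV = (sum of gain-weighted unit vectors toward the loudspeakers) / (sum of gains). A sketch for a first-order horizontal decode on a square array (the decoder weights here are illustrative, not those produced by the authors' optimization tool):

```python
import numpy as np

def velocity_vector(gains, azimuths):
    """Gerzon velocity vector rV for signed loudspeaker gains."""
    u = np.column_stack([np.cos(azimuths), np.sin(azimuths)])
    return (gains @ u) / gains.sum()

# Square array and a basic first-order decode of a source at azimuth 0.
spk_az = np.radians([45.0, 135.0, 225.0, 315.0])
src_az = 0.0
w, x, y = 1.0, np.cos(src_az), np.sin(src_az)        # B-format encode (W, X, Y)
gains = 0.5 * w + 1.0 * (x * np.cos(spk_az) + y * np.sin(spk_az))

rv = velocity_vector(gains, spk_az)                  # points toward the source
```

For a regular array, weights like these can be chosen by hand to make |rV| = 1 at the source azimuth; for the irregular arrays the paper targets, |rV| and the energy vector rE become nonlinear functions of the loudspeaker positions, which is why numerical optimization is needed.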

P15-3 Discrete Driving Functions for Wave Field Synthesis and Higher Order Ambisonics
César D. Salvador, Universidad de San Martín de Porres - Lima, Peru
Practical implementations of physics-based spatial sound reproduction techniques, such as Wave Field Synthesis (WFS) and Higher Order Ambisonics (HOA), require real-time filtering, scaling, and delaying operations on the audio signal to be spatialized. These operations form the so-called driving function of each loudspeaker. This paper describes a discretization method to obtain a rational representation in the z-plane from the continuous WFS and HOA driving functions. Visual and numerical comparisons between the continuous and discrete driving functions, and between the continuous and discrete sound pressure fields synthesized with circular loudspeaker arrays, are shown. The percentage discretization errors, in the reproducible frequency range and in the whole listening area, are on the order of 1%. A methodology for the reconstruction of immersive soundscapes composed of nature sounds is also reported as a practical application.
Convention Paper 8244 (Purchase now)
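
A common route from a continuous-time prototype to a rational z-plane representation is the bilinear transform s → 2fs(1 − z⁻¹)/(1 + z⁻¹). As a minimal illustration of that step (a generic first-order lowpass section, not the WFS/HOA driving functions themselves):

```python
import numpy as np

def bilinear_first_order(wc, fs):
    """Discretize H(s) = wc / (s + wc) via s = 2*fs*(1 - z^-1)/(1 + z^-1).
    Returns (b, a) for H(z) = (b0 + b1*z^-1) / (1 + a1*z^-1)."""
    k = 2.0 * fs
    b0 = wc / (k + wc)
    b1 = wc / (k + wc)
    a1 = (wc - k) / (k + wc)
    return np.array([b0, b1]), np.array([1.0, a1])

def freq_response(b, a, w):
    """H(e^{jw}) evaluated directly on the unit circle."""
    z = np.exp(-1j * w)
    return (b[0] + b[1] * z) / (a[0] + a[1] * z)

b, a = bilinear_first_order(wc=2 * np.pi * 1000.0, fs=48000.0)
h_dc = freq_response(b, a, 0.0)                           # DC gain stays 1
h_c = freq_response(b, a, 2 * np.pi * 1000.0 / 48000.0)   # near cutoff, ~ -3 dB
```

The small discrepancy between the continuous and discrete responses near the cutoff (bilinear frequency warping) is the kind of discretization error the paper quantifies for the full driving functions.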

P15-4 Reducing Artifacts of Focused Sources in Wave Field Synthesis
Hagen Wierstorf, Matthias Geier, Sascha Spors, Technische Universität Berlin - Berlin, Germany
Wave Field Synthesis provides the possibility to synthesize virtual sound sources located between the loudspeaker array and the listener. Such sources are known as focused sources. Previous studies have shown that the reproduction of focused sources is subject to audible artifacts. The strength of those artifacts heavily depends on the size of the loudspeaker array. This paper proposes a method to reduce artifacts in the reproduction of focused sources by using only a subset of loudspeakers of the array. A listening test verifies the method and compares it to previous results.
Convention Paper 8245 (Purchase now)

P15-5 On the Anti-Aliasing Loudspeaker for Sound Field Synthesis Employing Linear and Circular Distributions of Secondary Sources
Jens Ahrens, Sascha Spors, Deutsche Telekom AG Laboratories - Berlin, Germany
The theory of analytical approaches for sound field synthesis, such as wave field synthesis, nearfield-compensated higher order Ambisonics, and the spectral division method, requires continuous distributions of secondary sources. In practice, discrete loudspeakers are employed, and the synthesized sound field is corrupted by a number of artifacts commonly referred to as spatial aliasing. This paper presents a theoretical investigation of the properties of the loudspeakers required to suppress such spatial aliasing artifacts. It is shown that employing such loudspeakers is not desirable, since the suppression of spatial aliasing comes at the cost of an essential restriction of the reproducible spatial information when practical loudspeaker spacings are assumed.
Convention Paper 8246 (Purchase now)

P15-6 The Relationship between Sound Field Reproduction and Near-Field Acoustical Holography
Filippo M. Fazi, Philip Nelson, University of Southampton - Southampton, UK
The problem of reproducing a desired sound field with an array of loudspeakers and the technique known as Near-Field Acoustical Holography share some fundamental theoretical aspects. It is shown that both problems can be formulated as an integral equation that usually defines an ill-posed problem. The example of spherical geometry and planar geometry is discussed in detail. It is shown that for both the reproduction and the acoustical holography cases, the ill-conditioning of the problem is greatly affected by the distance between the source layer and the measurement/control surface.
Convention Paper 8247 (Purchase now)

P15-7 Surround Sound with Height in Games Using Dolby Pro Logic IIz
Nicolas Tsingos, Christophe Chabanne, Charles Robinson, Dolby Laboratories - San Francisco, CA, USA; Matt McCallus, RedStorm Entertainment - Cary, NC, USA
Dolby Pro Logic IIz is a new matrix encoding/decoding system that enables the transmission of a pair of height channels within a conventional surround sound stream (e.g., 5.1). In this paper we provide guidelines for the use of Pro Logic IIz in interactive gaming applications, including recommended speaker placement, creation of elevation information, and details on how to embed the height channels within a 5- or 7-channel stream. Surround sound with height is already widely available in home-theater receivers. It offers increased immersion to the user and is a perfect fit for 2-D or stereoscopic 3-D video games.
Convention Paper 8248 (Purchase now)

P15-8 Optimal Location and Orientation for Midrange and High Frequency Loudspeakers in the Instrument Panel of an Automotive Interior
Roger Shively, Harman International - Novi, MI, USA; Jérôme Halley, Harman International - Karlsbad, Germany; François Malbos, Harman International - Chateau du Loir, France; Gabriel Ruiz, Harman International - Bridgend, Wales, UK
In a follow-up to a previous paper (AES Convention Paper 8023, May 2010), using the modeling process described there for modeling loudspeakers in an automotive interior, the optimization of midrange and high-frequency tweeter loudspeaker positions for best acoustic performance on the driver's (left) and passenger's (right) sides of the automotive instrument panel is reported.
Convention Paper 8249 (Purchase now)

Saturday, November 6, 10:30 am — 1:00 pm

Technical Tour: TT10 - Studio Trilogy

Designed by John Storyk of WSDG, Studio Trilogy includes three state-of-the-art control rooms, four integrated isolation booths, and one of San Francisco’s most versatile large tracking rooms. The facility, located in the creative heart of San Francisco’s SOMA district, features the only commissioned SSL 9000K console in Northern California, Pro Tools|HD 3 systems, classic Studer analog 2" tape, and a full complement of vintage and modern instruments and gear. The newly re-opened studio is owned and staffed by veterans of the Bay Area studio scene who have held positions at Different Fur, SF Soundworks, Crescendo! Studios, Russian Hill Recording, Nu-tone, Studio 880, and Talking House.

Technical Tours are made available on a first come, first served basis. Tickets can be purchased during normal registration hours at the convention center.

Price: $35 Members / $45 Nonmembers

Saturday, November 6, 11:30 am — 1:00 pm (Room 134)

Special Event: Platinum Artists

David Goggin
Bruce Botnick
Corey Cunningham
Ray Manzarek
Veronica Romeo
CJ Vanston

World-renowned recording artists, producers, and engineers share their insights into the recording craft. What do artists look for in producers, engineers, and studios? How has recording changed since multitracking burst on the scene? What were some of the magical moments they experienced in the studio? The panel will consist of the following:

• Legendary Doors keyboardist Ray Manzarek and the band’s producer, Bruce Botnick
• Spanish pop star Veronica Romeo and her producer, CJ Vanston
• Corey Cunningham of up-and-coming San Francisco-based rock band Magic Bullets and the band’s producer, KamranV
• Moderator David Goggin, noted recording industry author and photographer

Saturday, November 6, 2:15 pm — 5:15 pm

Technical Tour: TT11 - Studio 880

Newly relocated to a completely restored Victorian mansion built in 1886 on the small bay island of Alameda, Studio 880 includes an art-deco-themed 7.1 mix theater and a Victorian-gothic tracking room. During the tour, legendary producer/mixer John Lucasey (Ne-Yo, Keith Urban, Norah Jones) will discuss the aesthetics of studio design and ambiance in this rare glimpse inside one of the most unique recording studios in the Bay Area. Lucasey previously designed and built the original award-winning Studio 880 in Jingletown, Oakland, which is referred to on Green Day’s Grammy Award-winning album American Idiot and is the setting of their Tony Award-winning Broadway musical.

Technical Tours are made available on a first come, first served basis. Tickets can be purchased during normal registration hours at the convention center.

Price: $40 Members / $50 Nonmembers

Saturday, November 6, 2:30 pm — 4:00 pm (Room 134)

Special Event: Grammy SoundTable
Sonic Imprints: Songs That Changed My Life

Sylvia Massy
Joe Barresi
Bob Clearmountain
DJ Khalil
Jimmy Douglass
Nathaniel Kunkel

Some songs are hits, some we just love, and some have changed our lives. Our panelists break down the DNA of their favorite tracks and explain what moved them, what grabbed them, and why these songs left a lifelong impression.

Saturday, November 6, 2:30 pm — 4:30 pm (Room 132)

Product Design: PD5 - Grounding & Shielding - Circuits and Interference (Part 2)

Ralph Morrison

This second session of Grounding and Shielding Circuits takes off where the first session ended. Under discussion will be: the role of digital logic and processors in the audio world; A/D converters; transmission line basics; impedance control and impedance matching; the relation between rise and fall times and frequency spectrum; the need for ground and power planes; decoupling and filtering on logic structures; the interface between analog and digital circuits; digital and analog filters; aliasing errors; balanced digital lines and common-mode rejection; multilayer boards; and interference problems such as crosstalk, ground bounce, and via locations.

Saturday, November 6, 2:30 pm — 6:00 pm

Technical Tour: TT12 - Studio D Recording and Loudville

Studio D Recording

The 5.1-surround-capable Pro Tools HD|3 Accel system and a collection of classic preamps, compressors, and EQs make Studio D a perfect balance of cutting edge and vintage, attracting artists like Carlos Santana, Train, and Kenny Wayne Shepherd. During the tour, studio owner and head engineer Joel Jaffe will conduct an on-site demonstration of surround mixing and different ways to mix a track, examining placement, the mics used, and overall mix results.


Loudville is a step into the future, combining professional audio and video for live streaming or archival and editing. With an oversized audio control room that can double as a live-performance sound stage and robotic HD video cameras covering every angle, this world-class recording studio is able to stream live at any time. The Avid D-Command console is wired to talk to the video control room for smooth sound-to-picture integration. Loudville plans to stream its portion of this technical tour live through its website.

Studio D Recording and Loudville are in the same complex, combined as one tour.

Technical Tours are made available on a first come, first served basis. Tickets can be purchased during normal registration hours at the convention center.

Price: $40 Members / $50 Nonmembers

Saturday, November 6, 2:30 pm — 6:30 pm (Room 220)

Paper Session: P17 - Real-Time Audio Processing

Jayant Datta

P17-1 A Time Distributed FFT for Efficient Low Latency Convolution
Jeffrey Hurchalla, Garritan Corp. - Orcas, WA, USA
To enable efficient low latency convolution, a Fast Fourier Transform (FFT) is presented that balances processor and memory load across incoming blocks of input. The proposed FFT transforms a large block of input data in steps spread across the arrival of smaller blocks of input and can be used to transform large partitions of an impulse response and input data for efficiency, while facilitating convolution at very low latency. Its primary advantage over a standard FFT as used for a non-uniform partition convolution method is that it can be performed in the same processing thread as the rest of the convolution, thereby avoiding problems associated with the combination of multithreading and near real-time calculations on general purpose computing architectures.
Convention Paper 8257 (Purchase now)
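
The baseline this paper improves on, frequency-domain partitioned convolution, can be sketched compactly. The version below is the plain uniform-partition overlap-save scheme; it does not include the paper's time-distributed FFT, and the block size and test signals are illustrative.

```python
import numpy as np

def partitioned_convolve(x, h, B):
    """Uniformly partitioned FFT convolution (overlap-save), block size B."""
    P = int(np.ceil(len(h) / B))                    # number of IR partitions
    H = [np.fft.rfft(h[p * B:(p + 1) * B], 2 * B) for p in range(P)]
    fdl = [np.zeros(B + 1, complex) for _ in range(P)]  # freq-domain delay line
    nblocks = int(np.ceil(len(x) / B))
    x = np.pad(x, (0, nblocks * B - len(x)))
    prev = np.zeros(B)
    out = []
    for n in range(nblocks):
        blk = x[n * B:(n + 1) * B]
        fdl.insert(0, np.fft.rfft(np.concatenate([prev, blk])))
        fdl.pop()
        acc = sum(Hp * Xp for Hp, Xp in zip(H, fdl))
        out.append(np.fft.irfft(acc)[B:])           # last B samples are valid
        prev = blk
    return np.concatenate(out)

# Compare against direct convolution on random data.
rng = np.random.default_rng(1)
sig = rng.standard_normal(1000)
ir = rng.standard_normal(300)
y = partitioned_convolve(sig, ir, 64)
ref = np.convolve(sig, ir)[:len(y)]
```

In a non-uniform partition scheme the large partitions need large FFTs computed all at once, which is the load spike the paper's time-distributed FFT spreads across the arrival of the smaller input blocks.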

P17-2 An Infinite Impulse Response (IIR) Hilbert Transformer Filter Design Guide for Audio
Daniel Harris, Sennheiser Research Laboratory - Palo Alto, CA, USA; Edgar Berdahl, Stanford University - Stanford, CA, USA
Hilbert Transformers have found many applications in the signal processing community, from single-sideband communication systems to audio effects. IIR implementations are attractive for computationally sensitive systems due to their lower number of coefficients. However, as in any advanced filter design problem, their tuning and implementation present a number of design challenges and tradeoffs. Furthermore, while literature addressing these problems exists, designers must draw from several sources to find answers. In this paper we present a complete start-to-finish explanation of how to implement an efficient infinite impulse response (IIR) Hilbert transformer filter. We start from a half-band filter design and show how the poles move as the half-band filter is transformed into summed all-pass filters and then from there into a Hilbert transformer filter. The design technique is based largely on pole locations and creates a filter in the cascaded 1st order allpass form, which is numerically robust.
Convention Paper 8258 (Purchase now)
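
The "summed allpass" structure the paper builds on rests on the fact that a real-coefficient first-order allpass section has exactly unit magnitude at every frequency, so two parallel allpass chains differ only in phase; the design problem is choosing pole locations so that the phase difference approximates 90°. A sketch verifying the building block (the coefficients here are arbitrary placeholders, not a designed Hilbert pair):

```python
import numpy as np

def allpass_response(coeffs, w):
    """Frequency response of cascaded 1st-order allpass sections
    A(z) = (a + z^-1) / (1 + a*z^-1), evaluated at angular frequencies w."""
    z1 = np.exp(-1j * w)                  # z^-1 on the unit circle
    h = np.ones_like(z1)
    for a in coeffs:
        h *= (a + z1) / (1.0 + a * z1)
    return h

w = np.linspace(0.05, np.pi - 0.05, 512)
path_a = allpass_response([0.40, 0.85], w)            # arbitrary demo chain
path_b = allpass_response([0.15, 0.65], w) * np.exp(-1j * w)  # plus 1-sample delay

mag_err = np.abs(np.abs(path_a) - 1.0).max()          # allpass: |H| == 1
phase_diff = np.unwrap(np.angle(path_a) - np.angle(path_b))
```

With properly designed coefficients (as in the paper's procedure), phase_diff would sit near pi/2 across the audio band while both paths remain exactly allpass.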

P17-3 Automatic Parallelism from Dataflow Graphs
Ramy Sadek, University of Southern California - Playa Vista, CA, USA
This paper presents a novel algorithm to automate high-level parallelization from graph-based data structures representing data flow. Algorithm correctness is shown via a formal proof by construction. This automatic optimization yields large performance improvements for multi-core machines running host-based applications. Results of these advances are shown through their incorporation into the audio processing engine Application Rendering Immersive Audio (ARIA) presented at AES 117. Although the ARIA system is the target framework, the contributions presented in this paper are generic and therefore applicable in a variety of software such as Pure Data and Max/MSP, game audio engines, non-linear editors and related systems. Additionally, the parallel execution paths extracted are shown to give effectively optimal cache performance, yielding significant speedup for such host-based applications.
Convention Paper 8259 (Purchase now)
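
One common way to extract parallelism from a dataflow graph is to group nodes into topological "levels": every node in a level depends only on earlier levels, so all nodes within one level can run concurrently. A small sketch of that generic idea (not ARIA's proven algorithm):

```python
from collections import defaultdict

def parallel_levels(edges, nodes):
    """Group DAG nodes into levels; nodes within one level are independent."""
    preds = defaultdict(set)
    for src, dst in edges:
        preds[dst].add(src)
    levels, done = [], set()
    while len(done) < len(nodes):
        # Nodes whose predecessors are all already scheduled.
        ready = {n for n in nodes if n not in done and preds[n] <= done}
        if not ready:
            raise ValueError("cycle detected; not a DAG")
        levels.append(sorted(ready))
        done |= ready
    return levels

# A toy audio graph: two sources feed independent effects, then a mixer.
edges = [("src1", "eq"), ("src2", "reverb"), ("eq", "mix"), ("reverb", "mix")]
lv = parallel_levels(edges, {"src1", "src2", "eq", "reverb", "mix"})
```

Here the two sources can be computed in parallel, then the two effects, with the mixer running last; a scheduler would dispatch each level across worker threads or cores.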

P17-4 The Design of Low-Complexity Wavelet-Based Audio Filter Banks Suitable for Embedded Platforms
Neil Smyth, CSR - Cambridge Silicon Radio - Belfast, N. Ireland, UK
Many audio applications require the use of low complexity, low power, and low latency filter banks (e.g., real-time audio streaming to mobile devices). The underlying mathematics of wavelet transforms provides these attractive characteristics for embedded platforms. However, commonly used wavelets (Haar, Daubechies) possess coefficients containing irrational numbers that lead to distortion in fixed-point implementations. This paper discusses the development and provides practical performance comparisons of filter banks using wavelet transforms as an alternative to more commonly used sub-banding filter banks in PCM audio coding algorithms. The advantages and disadvantages of wavelets used in such audio compression applications are also discussed.
Convention Paper 8260 (Purchase now)
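
The Haar case mentioned above illustrates why wavelets are attractive on embedded platforms: one analysis/synthesis level needs only sums, differences, and a scale factor, yet reconstructs the input exactly; the irrational 1/sqrt(2) normalization is precisely the coefficient that, per the abstract, causes distortion when rounded to fixed point. A minimal one-level sketch:

```python
import numpy as np

def haar_analysis(x):
    """One level: lowpass (averages) and highpass (differences) sub-bands."""
    s = 1.0 / np.sqrt(2.0)                # the irrational coefficient in question
    lo = s * (x[0::2] + x[1::2])
    hi = s * (x[0::2] - x[1::2])
    return lo, hi

def haar_synthesis(lo, hi):
    """Perfect-reconstruction inverse of one Haar level."""
    s = 1.0 / np.sqrt(2.0)
    x = np.empty(2 * len(lo))
    x[0::2] = s * (lo + hi)
    x[1::2] = s * (lo - hi)
    return x

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 0.0, 2.0])
lo, hi = haar_analysis(x)
y = haar_synthesis(lo, hi)                # y reconstructs x exactly in float
```

In a fixed-point implementation the 1/sqrt(2) factors must be quantized (or folded into an integer lifting scheme), and the reconstruction error that introduces is the tradeoff the paper examines.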

P17-5 Application of Optimized Inverse Filtering to Improve Time Response and Phase Linearization in Multiway Loudspeaker Systems
Mario Di Cola, Audio Labs Systems - Casoli (CH), Italy; Daniele Ponteggia, Studio Ing. Ponteggia - Terni (TR), Italy
Digital processing has been widely demonstrated to be a very useful technique for improving loudspeaker system performance. Inverse filtering applied to loudspeaker systems is particularly interesting because it can improve performance and sound quality in terms of transient response and reduced overall phase shift. Inverse filtering can be realized with FIR filtering techniques, using a specific sequence of taps synthesized “ad hoc” for a specific transducer and/or loudspeaker system configuration. Most studies on this matter so far, with very few exceptions, have focused on the “DSP processing” point of view, generally addressing the mathematics involved and the related numerical problems. This paper discusses the philosophy that should drive the application of this technique to processing a loudspeaker system in order to really improve it, and consequently focuses on analyzing the nature of the loudspeaker system and understanding what can really be processed with a one-dimensional “action.” We discuss what can be synthesized as a “2-port” model of the loudspeaker and what can effectively be obtained by processing the input signal of a loudspeaker system.
Convention Paper 8261 (Purchase now)

P17-6 Filter Design for a Double Dipole Flat Panel Loudspeaker System Using Time Domain Toeplitz Equations
Tobias Corbach, Martin Holters, Udo Zölzer, Helmut-Schmidt-University/University of the Federal Armed Forces - Hamburg, Germany
Today, flat panel loudspeakers are used in multiple applications. Due to their high directivity and good structural integration properties, they are commonly used for directed acoustic information. A previously proposed system of two parallel flat panel dipole loudspeakers with adapted input filtering ensures a high suppression of the backward radiation with only minor influence on the forward radiation side. This paper presents a new approach to the filter computation for this application. It makes use of time-domain convolution, realized by Toeplitz matrices, and builds the desired filter impulse responses with a least-squares approach. The filter computations as well as the numerical and measured results are shown.
Convention Paper 8262 (Purchase now)
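
The core least-squares step, building a convolution (Toeplitz) matrix in the time domain and solving for the filter taps that best produce a desired response, can be sketched generically. This is the textbook formulation, not the authors' dipole-specific system; the impulse response and target below are illustrative.

```python
import numpy as np

def convolution_matrix(g, n_taps):
    """Toeplitz matrix T with T @ h == np.convolve(g, h) for len(h) == n_taps."""
    rows = len(g) + n_taps - 1
    T = np.zeros((rows, n_taps))
    for j in range(n_taps):
        T[j:j + len(g), j] = g       # each column is g shifted down by j
    return T

# Design h so that conv(g, h) approximates a delayed unit impulse.
g = np.array([1.0, 0.6, 0.25, 0.1])  # illustrative system impulse response
n_taps, delay = 64, 8
T = convolution_matrix(g, n_taps)
d = np.zeros(T.shape[0])
d[delay] = 1.0                       # desired: delta delayed by 8 samples
h, *_ = np.linalg.lstsq(T, d, rcond=None)
achieved = T @ h                     # equals np.convolve(g, h)
```

In the paper's setting, d would instead encode the desired forward response and backward-radiation null of the dipole pair, with the Toeplitz blocks stacked for both output directions.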

P17-7 A Low Complexity Approach for Loudness Compensation
Pradeep D. Prasad, Ittiam Systems Pvt. Ltd. - Bangalore, Karnataka, India
The essence of loudness compensation is to maintain the perceived spectral balance of audio content irrespective of the playback volume level. The need for this compensation arises from the inherent non-linearity of human aural perception, which manifests as a change in spectral balance. The compensation varies with critical band and with the original and playback specific loudness. This leads to a computationally intensive approach: estimating the original and target specific loudness and calculating the required compensation for every frame. A low-complexity algorithm is proposed to enable resource-constrained devices to perform loudness compensation efficiently. A closed-form expression is derived for the proposed compensation, followed by an analysis of the quality-versus-complexity tradeoff.
Convention Paper 8263 (Purchase now)

P17-8 MPEG Spatial Audio Object Coding—The ISO/MPEG Standard for Efficient Coding of Interactive Audio Scenes
Oliver Hellmuth, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Heiko Purnhagen, Dolby Sweden AB - Stockholm, Sweden; Jeroen Koppens, Philips Applied Technologies - Eindhoven, The Netherlands; Jürgen Herre, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Jonas Engdegård, Dolby Sweden AB - Stockholm, Sweden; Johannes Hilpert, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Lars Villemoes, Dolby Sweden AB - Stockholm, Sweden; Leonid Terentiev, Cornelia Falch, Andreas Hölzer, María Luis Valero, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Barbara Resch, Dolby Sweden AB - Stockholm, Sweden; Harald Mundt, Dolby Germany GmbH - Nürnberg, Germany; Hyen-O Oh, Digital TV Lab., LG Electronics - Seoul, Korea
In 2007, the ISO/MPEG Audio standardization group started a new work item on efficient coding of sound scenes comprising several audio objects by parametric coding techniques. Finalized in the summer of 2010, the resulting MPEG “Spatial Audio Object Coding” (SAOC) specification allows the representation of such scenes at bit rates commonly used for coding of mono or stereo sound. At the decoder side, each object can be interactively rendered, supporting applications like user-controlled music remixing and spatial teleconferencing. This paper summarizes the results of the standardization process, provides an overview of MPEG SAOC technology, and illustrates its performance with the results of the recent verification tests. The tests include operation modes for several typical application scenarios that take advantage of object-based processing.
Convention Paper 8264 (Purchase now)

Saturday, November 6, 2:30 pm — 4:00 pm (Room 226)

Poster: P19 - Spatial Sound Processing—1

P19-1 Estimation of the Probability Density Function of the Interaural Level Differences for Binaural Speech Separation
David Ayllon, Roberto Gil-Pita, Manuel Rosa-Zurera, University of Alcalá - Alcalá de Henares (Madrid), Spain
Source separation techniques are applied to audio signals to separate several sources from one mixture. One important challenge of speech processing is noise suppression, and several methods have been proposed. However, in some applications, like hearing aids, we are not interested just in removing noise from speech but in amplifying speech and attenuating noise. A novel method based on the estimation of the probability density function of the interaural level differences, in conjunction with time-frequency decomposition and binary masking, is applied to speech-noise mixtures in order to obtain both signals separately. Results show that both signals are clearly separated, and the method entails low computational cost, so it could be implemented in a real-time environment such as a hearing aid device.
Convention Paper 8273 (Purchase now)
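The general principle behind ILD-based binary masking can be sketched briefly: classify each time-frequency bin of a stereo mixture by its interaural level difference and keep the bins dominated by the target. This is a hedged illustration of that family of methods, not the authors' algorithm (which estimates the ILD probability density function rather than using a fixed threshold); the tone frequencies and threshold here are invented for the example.

```python
import numpy as np

def stft(x, n_fft=256, hop=128):
    """Simple short-time Fourier transform with a Hann window."""
    win = np.hanning(n_fft)
    frames = [np.fft.rfft(win * x[i:i + n_fft])
              for i in range(0, len(x) - n_fft + 1, hop)]
    return np.array(frames)

def ild_binary_mask(left, right, threshold_db=0.0):
    """True where the left channel dominates a time-frequency bin."""
    L, R = stft(left), stft(right)
    ild = 20.0 * np.log10((np.abs(L) + 1e-12) / (np.abs(R) + 1e-12))
    return ild > threshold_db

# Toy mixture: a 500 Hz tone panned left and a 1500 Hz tone panned right.
fs = 8000
t = np.arange(fs) / fs
s1 = np.sin(2 * np.pi * 500 * t)
s2 = np.sin(2 * np.pi * 1500 * t)
mask = ild_binary_mask(s1 + 0.3 * s2, 0.3 * s1 + s2)
# Bin width is fs / n_fft = 31.25 Hz: 500 Hz -> bin 16, 1500 Hz -> bin 48.
print(mask[:, 16].all(), mask[:, 48].any())  # → True False
```

Applying the mask to the left-channel STFT and inverting would then recover (an estimate of) the left-panned source.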

P19-2 The Learning Effect of HRTF-Based 3-D Sound Perception with a Horizontally Arranged 8-Loudspeaker System
Akira Saji, Keita Tanno, Li Huakang, Tetsuya Watanabe, Jie Huang, The University of Aizu - Aizuwakamatsu City, Fukushima, Japan
This paper examines the learning effect on the localization of HRTF-based 3-D sound using an 8-channel loudspeaker system that creates virtual sound images. The system can realize elevated sound with eight loudspeakers arranged on the horizontal plane by convolving HRTFs, without using high- or low-mounted loudspeakers. The position of the sound image that the system creates is initially difficult to perceive because such HRTF-based sounds are unfamiliar. However, after repetition of the learning process, almost all listeners can perceive the position of the sound images better. This paper demonstrates this learning effect for an HRTF-based 3-D sound system.
Convention Paper 8274 (Purchase now)

P19-3 Spatial Audio Attention Model Based Surveillance Event Detection
Bo Hang, Ruimin Hu, Xiaochen Wang, Weiping Tu, Wuhan University - Wuhan, China
In this paper we propose a bottom-up audio attention model based on spatial audio cues and subband energy change for unsupervised event detection in stereo audio surveillance. First, the spatial audio parameter Interaural Level Difference (ILD) is extracted to calculate and represent attention events caused by rapidly moving sound sources. Then the subband energy change is computed to represent salient changes of the energy distribution in the frequency domain. Finally, an environment-adaptive normalization is used to assess the normalized attention level. Experimental results demonstrate that the proposed audio attention model is effective for audio surveillance event detection.
Convention Paper 8275 (Purchase now)

P19-4 Investigating Perceptual Effects Associated with Vertically Extended Sound Fields Using Virtual Ceiling Speaker
Yusuke Ono, Sungyoung Kim, Masahiro Ikeda, Yamaha Corporation - Iwata, Shizuoka, Japan
Virtual Ceiling Speaker (VCS) is a signal processing method that creates an elevated auditory image using an optimized cross-talk compensation for a 5-channel reproduction system. In order to understand the latent perceptual effects caused by virtually elevated sound imagery, we experimentally compared the perceptual differences between physically and virtually elevated sound sources in terms of ASW, LEV, Powerfulness, and Clarity. The results showed that listeners perceived higher LEV or Clarity when either physically or virtually elevated early reflections were added to 5-channel content. This may indicate that attributes related to spatial dimensions were relatively well conveyed by signals virtually elevated using VCS.
Convention Paper 8276 (Purchase now)

P19-5 Enhancing 3-D Audio Using Blind Bandwidth Extension
Tim Habigt, Marko Durkovic, Martin Rothbucher, Klaus Diepold, Technische Universität München - München, Germany
Blind bandwidth extension techniques are used to recreate the high-frequency bands of a narrowband audio signal. These methods make it possible to increase the perceived quality of signals that are transmitted via a narrow frequency band, as in telephone or radio communication systems. We evaluate the possibility of using blind bandwidth extension methods in 3-D audio applications, where high-frequency components are necessary to create an impression of elevated sound sources.
Convention Paper 8277 (Purchase now)

Saturday, November 6, 4:30 pm — 6:00 pm (Room 226)

Poster: P20 - Spatial Sound Processing—2

P20-1 Inherent Doppler Properties of Spatial Audio
Martin Morrell, Joshua D. Reiss, Queen Mary University of London - London, UK
The Doppler shift is a naturally occurring phenomenon that shifts the pitch of a sound when the emitting object’s distance to the listener is not constant. These pitch deviations, alongside amplitude changes, help humans localize a source’s position, velocity, and movement direction. In this paper we investigate spatial audio reproduction methods to determine whether Doppler shift is present for a moving sound source. We expand spatialization techniques to include time-variance in order to produce the Doppler shift. Recordings of several different loudspeaker layouts demonstrate the presence of Doppler with and without time-variance, comparing this to the pre-calculated theoretical values.
Convention Paper 8278 (Purchase now)
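The pre-calculated theoretical values mentioned above come from the classical Doppler relation for a moving source and a stationary listener, sketched here as a minimal reference (a standard textbook formula, not code from the paper):

```python
# Classical Doppler shift for a source moving relative to a stationary listener.
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C

def doppler_shifted_frequency(f_source: float, radial_velocity: float) -> float:
    """Frequency heard by a stationary listener.

    radial_velocity > 0 means the source approaches the listener,
    raising the perceived pitch; negative values lower it.
    """
    return f_source * SPEED_OF_SOUND / (SPEED_OF_SOUND - radial_velocity)

# A 440 Hz source approaching at 10 m/s is heard slightly sharp:
print(round(doppler_shifted_frequency(440.0, 10.0), 1))  # → 453.2
```

A time-variant spatialization system reproduces this effect implicitly when the per-loudspeaker delays change as the virtual source moves.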

P20-2 A Binaural Model with Head Motion that Resolves Front-Back Confusions for Analysis of Room Impulse Responses
John T. Strong, Jonas Braasch, Ning Xiang, Rensselaer Polytechnic Institute - Troy, NY, USA
Front-back confusions occur in both psychoacoustic localization tests and interaural cross-correlation-based binaural models. Head motion has been hypothesized and tested successfully as a method of resolving such confusions. The ICC-based model set forth here simulates head motion by filtering test signals with a trajectory of HRTFs and shifting an azimuth remapping function to follow the same trajectory. By averaging estimated azimuths over time, the correct source location prevails while the front-back reversed location washes out. This model algorithm is then extended to room impulse response analysis. The processing is performed on simulated binaural impulse responses at the same position but different head angles. The averaging allows the model to discriminate reflections coming from the front from those arriving from the rear.
Convention Paper 8279 (Purchase now)

P20-3 A Set of Microphone Array Beamformers Designed to Implement a Constant-Amplitude Panning Law
Yoomi Hur, Stanford University - Stanford, CA, USA, Yonsei University, Seoul, Korea; Jonathan S. Abel, Stanford University - Stanford, CA, USA; Young-cheol Park, Yonsei University - Wonju, Korea; Dae-Hee Youn, Yonsei University - Seoul, Korea
This paper describes a technique for designing a collection of beamformers, a "beamformer bank," that approximately produces a constant-amplitude panning law. Useful in multichannel recording scenarios, a point source will appear with energy above a specified sidelobe level in at most two adjacent beams, and the beam sum will approximate the source signal. A non-parametric design method is described in which a specified sidelobe level determines beam width as a function of arrival direction and frequency, leading directly to the number and placement of beams at every frequency. Simulation results using several microphone array configurations are reported to verify the performance of the proposed technique.
Convention Paper 8280 (Purchase now)
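The constant-amplitude panning law referenced above can be stated in two lines: the gains of the two adjacent beams sum to one, so the beam sum reconstructs the source amplitude. A hedged sketch, contrasted with the familiar constant-power law (the pairing with constant-power is for illustration; it is not drawn from the paper):

```python
import math

def constant_amplitude_gains(pan: float) -> tuple:
    """pan in [0, 1]: 0 = fully in beam A, 1 = fully in beam B.
    The two gains always sum to one, so the beam sum preserves amplitude."""
    return (1.0 - pan, pan)

def constant_power_gains(pan: float) -> tuple:
    """For comparison: squared gains sum to one (energy preserved instead)."""
    theta = pan * math.pi / 2.0
    return (math.cos(theta), math.sin(theta))

gA, gB = constant_amplitude_gains(0.25)
print(gA + gB)  # → 1.0
```

A source falling between two beams of the beamformer bank is thus split across them without a level change in the summed output.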

P20-4 A 3-D Sound Creation System Using Horizontally Arranged Loudspeakers
Keita Tanno, Akira Saji, Huakang Li, Jie Huang, The University of Aizu - Fukushima, Japan
In this research we have studied a 3-D sound creation system using 5- and 8-channel loudspeaker arrangements. This system has the great advantage that it does not require users to purchase a new audio system or to relocate loudspeakers. The only change required of content creators at television stations, video game companies, and so on is to adopt the proposed method for creating 3-D sound sources. Head-related transfer functions are used to create the signals of the left and right loudspeaker groups. An extended amplitude panning method is proposed to decide the amplitude ratios between and within loudspeaker groups. Listening experiments show that the subjects could perceive the elevation of sound images created by the system as well.
Convention Paper 8281 (Purchase now)

P20-5 Locating Sounds Around the Screen
David Black, Hochschule Bremen, University of Applied Sciences - Bremen, Germany; Jörn Loviscach, Hochschule Bielefeld, University of Applied Sciences - Bielefeld, Germany
Today's large computer screens can display enough information to overload the user's visual perceptual channel. Looking for a remedy, we investigated providing additional acoustic cues through surround-sound loudspeakers mounted around the screen. In this paper we present the results of user evaluations of interaction with screen elements using this surround-screen setup. The tests show that surround-screen sound can improve response times in a simple task, and that users can localize the approximate origin of a sound played back with this technique.
Convention Paper 8282 (Purchase now)

Saturday, November 6, 4:45 pm — 6:15 pm (Room 131)

Workshop: W16 - Mastering: Art, Perception, and Technologies–2

Michael Romanowski, Michael Romanowski Mastering
Gavin Lurssen, Gavin Lurssen Mastering
Andrew Mendleson, Georgetown Masters
Joe Palmacio, The Place for Mastering
Paul Stubblebine, Paul Stubblebine Mastering
Mike Wells, Mike Wells Mastering

This is a continuation of the Mastering panel from AES 2009 in New York, in which mastering engineers discuss the state of mastering in 2010. Mastering engineers use technology to achieve the desired results, but what gets little or no discussion are the perceptions and approaches that lead an engineer to make those choices. In this two-part series we examine the art, perception, and technology of the mastering industry in 2010 and beyond. In this session we will talk about perception and the art form of mastering, and how decisions are made based on our approaches and perceptions in the mastering environment.

Sunday, November 7, 9:00 am — 1:00 pm (Room 206)

Student / Career: Recording Competition Stereo

The Student Recording Competition is a highlight at each convention. A distinguished panel of judges participates in critiquing finalists of each category in an interactive presentation and discussion. This event presents stereo recordings in these categories:

Classical 9:00 am to 10:00 am
Jazz/Blues 10:00 am to 11:00 am
World/Folk 11:00 am to 12:00 noon
Pop/Rock 12:00 noon to 1:00 pm

The top three finalists in each category, as identified by our judges, present a short summary of their production intentions and the key recording and mix techniques used to realize their goals. They then play their projects for all who attend. Meritorious awards are determined here and will be presented at the closing Student Delegate Assembly Meeting (SDA-2) on Sunday afternoon.

The competition is a great chance to hear the work of your fellow students at other educational institutions. Everyone learns from the judges’ comments even if your project isn’t one of the finalists, and it's a great chance to meet other students and faculty.

Sunday, November 7, 9:00 am — 12:00 pm

Technical Tour: TT13 - Tiny Telephone Studio and Women's Audio Mission Studio

Tiny Telephone
Tiny Telephone was opened by John Vanderslice in 1997 to provide affordable recording to SF's independent music community. Situated in the Mission District in a gated, private compound, the 1700 sq. ft. studio offers a Bob Hodas-tuned control room, a discrete Neve 5316, a Studer 827, and the city's most comfortable couch. Tiny Telephone engineers have worked with a variety of artists ranging from Modest Mouse to Death Cab For Cutie and Elvis Costello.

Women's Audio Mission
Women's Audio Mission is a non-profit organization dedicated to the advancement of women in music production and the recording arts. In a field where women are chronically under-represented (less than 5%), WAM seeks to "change the face of sound" by providing women with hands-on media technology training, career counseling, and job placement. The tour will be highlighted by a visit to WAM’s recently redesigned studio.

Technical Tours are made available on a first come, first served basis. Tickets can be purchased during normal registration hours at the convention center.

Price: $35 Members / $45 Nonmembers

Sunday, November 7, 9:00 am — 10:00 am (Room 236)

Paper Session: P22 - Enhancement of Audio Reproduction

Richard Foss, Rhodes University - Grahamstown, South Africa

P22-1 Enhancing Stereo Audio with Remix Capability
Hyen-O Oh, LG Electronics Inc. - Seoul, Korea, Yonsei University, Seoul, Korea; Yang-Won Jung, LG Electronics Inc. - Seoul, Korea; Alexis Favrot, Illusonic LLC - Lausanne, Switzerland; Christof Faller, Illusonic LLC - Lausanne, Switzerland, EPFL, Lausanne, Switzerland
Many audio appliances feature capabilities for modifying audio signals, such as equalization, acoustic room effects, etc. However, these modification capabilities are limited in the sense that they apply to the audio signal as a whole and not to a specific "audio object." We propose a scheme that enables modification of the stereo panning and gain of specific objects inherent in a stereo signal. This capability is enabled (possibly in a stereo backward-compatible manner) by adding a few kilobits of side information to the stereo signal. To generate the side information, the signals of the objects to be modified in the stereo signal are needed.
Convention Paper 8290 (Purchase now)

P22-2 Automatically Optimizing Situation Awareness and Sound Quality for a Sound Isolating Earphone
John Usher, Hearium Labs - San Francisco, CA, USA
Sound isolating (SI) earphones are increasingly used by the general public with portable media players in noisy urban and transport environments. The dangers of these SI earphones are becoming increasingly apparent, and an urgent review of their usage is being recommended by legislators. The problem is that the user is removed from the local ambient scene: a reduction in "situation awareness" that often leads to accidents involving unheard oncoming vehicles. This paper introduces a new automatic gain control system that automatically mixes the ambient sound field with the reproduced audio material. A discussion of the audio system architecture is given, and an analysis of 20 different warning sounds is used to suggest suitable parameters.
Convention Paper 8291 (Purchase now)
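The core concept of such an automatic gain control can be sketched simply: when the ambient level crosses a threshold (e.g., a warning sound is present), duck the media playback and pass the microphone ambience through to the earphone. This is a hedged illustration of the idea only, not the paper's system; the threshold and duck depth are invented illustration values.

```python
def db_to_lin(db: float) -> float:
    """Convert a level in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

def agc_mix_gains(ambient_level_db: float,
                  threshold_db: float = -30.0,
                  duck_db: float = -12.0) -> tuple:
    """Return (media_gain, ambience_gain) as linear factors.

    Above the threshold the media is ducked and ambience is passed through,
    restoring situation awareness; below it, normal isolated listening.
    """
    if ambient_level_db > threshold_db:
        return db_to_lin(duck_db), 1.0
    return 1.0, 0.0

media_g, amb_g = agc_mix_gains(-10.0)  # loud ambient event
print(round(media_g, 3), amb_g)  # → 0.251 1.0
```

A real system would of course smooth these gains over time (attack/release) rather than switching instantaneously.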

Sunday, November 7, 9:30 am — 11:00 am (Room 226)

Poster: P23 - Perception and Subjective Evaluation of Audio

P23-1 Toward an Algorithm to Simulate Ensemble Rhythmic Interaction Based on Quantifiable Strategy Functions
Nima Darabi, U. Peter Svensson, Norwegian University of Science and Technology - Trondheim, Norway; Chris Chafe, Stanford University - Stanford, CA, USA
This paper studies the strategy taken by a pair of ensemble performers under the influence of delay. A general quantifiable measure of the strategy taken by performers in an interactive rhythmic performance is represented in the form of a single-parameter strategy function. This is done by fitting to the observed data an assumed decision-making process for “onset generation” by a participant, with one degree of freedom. We present specific examples of such strategy functions, suitable for different scenarios of rhythmic collaboration. By perpendicular projection of the strategy functions of an ensemble performance trial onto Cartesian axes, a nominal trial was transformed into a “strategy path” showing how the performers change their strategies during the course of a trial. By mathematical induction it was proven that this transformation from the time domain to a “strategy domain” is conditionally reversible, i.e., the time vectors of an ensemble trial can be reconstructed, domino-like, from its time-free strategy path and a given initial state. This algorithm is considered a means of simulating ensemble trials based on the overall strategies guiding them.
Convention Paper 8292 (Purchase now)

P23-2 Hearing Threshold of Pure Tones and a Fire Alarm Sound for People Listening to Music with Headphones
Kaori Sato, Shogo Kiryu, Tokyo City University - Setagaya-ku, Tokyo, Japan; Kaoru Ashihara, Advanced Industrial Science and Technology - Tsukuba, Japan
When listening to music through headphones, listeners may be less sensitive to environmental sounds. The sound pressure level of a fire alarm bell was measured in an actual internet cafe. The hearing thresholds of pure tones and the fire alarm bell sound were then measured for subjects wearing headphones. The minimum sound pressure level of the fire alarm bell recorded in the cafe was about 40 dB under the worst condition. When the subjects listened to pseudo-music signals through headphones, the hearing threshold of the fire alarm sound increased to about 80 dB.
Convention Paper 8293 (Purchase now)

P23-3 Psychoacoustic Measurement and Auditory Brainstem Response in the Frequency Range between 10 kHz and 30 kHz
Motoi Koubori, Tokyo City University - Setagaya-ku, Tokyo, Japan; Kaoru Ashihara, Advanced Industrial Science and Technology - Tsukuba, Japan; Mizuki Omata, Masaki Kyoso, Shogo Kiryu, Tokyo City University - Setagaya-ku, Tokyo, Japan
High-frequency components above 20 kHz can be recorded on recent high-resolution audio media. However, it is debated whether such components can be perceived. In this paper a psychoacoustic measurement and auditory brainstem response in the high-frequency range are reported. In the psychoacoustic measurement, some subjects could perceive high-frequency sounds above 20 kHz, and the auditory brainstem response could be measured for one subject at 22 kHz. However, the threshold sound pressure levels were beyond 80 dB in both measurements. The results were unremarkable. Because the auditory brainstem response is a direct signal from the auditory nerve, the nerve appears not to be stimulated by weak high-frequency sounds.
Convention Paper 8294 (Purchase now)

P23-4 Acoustical Design of Control Room for Stereo and Multichannel Production and Reproduction—A Novel Approach
Bogic Petrovic, Zorica Davidovic, BoZo Electronics, MyRoom Acoustics - Beograd, Serbia
This paper describes a new method of acoustic treatment for control rooms, with the goal of satisfying the conditions necessary for a quality control room: one that provides better mix translation to other systems, requires less adaptation by the engineer, and is equally suitable for stereo and surround monitoring. Two practical examples of control rooms realized using the new principles are described, along with the experiences of sound engineers who have worked in them.
Convention Paper 8295 (Purchase now)

P23-5 New 10.2-Channel Vertical Surround System (10.2-VSS); Comparison Study of Perceived Audio Quality in Various Multichannel Sound Systems with Height Loudspeakers
Sunmin Kim, Young Woo Lee, Samsung Electronics - Suwon, Korea; Ville Pulkki, Aalto University School of Science and Technology - Aalto, Finland
This paper presents listening test results on perceived audio quality for several loudspeaker arrangements, with the aim of finding the optimal loudspeaker configuration for a next-generation multichannel sound system. We compare new reproduction formats with the NHK 22.2-channel setup and the 7.1-channel setup of Recommendation ITU-R BS.775-2. The subjective evaluations, focused on the loudspeaker configurations of the top layer, were carried out with test materials generated by different methods: by mixing and by reproducing B-format recordings. The results show that the difference in overall perceived quality between the new 10.2-channel vertical surround system with 3 top loudspeakers and the NHK 22.2-channel system was imperceptible on the grading scale used in the experiment.
Convention Paper 8296 (Purchase now)

P23-6 Perceptually Motivated Scoring of Musical Meter Classification Algorithms
Matthias Varewyck, Jean-Pierre Martens, Ghent University - Ghent, Belgium
In this paper perceived confusions between the four most popular meters in Western music, 2/4, 3/4, 4/4, and 6/8, are examined. A theoretical framework for modeling these confusions is proposed and translated into a perceptually motivated objective score that can be used to evaluate meter classification algorithms against meter labels elicited from a single annotator. Experiments with three artificial and two real algorithms showed that the new score is preferable to the traditional accuracy measure, since it rewards algorithms that make reasonable errors and appears more robust across different annotators.
Convention Paper 8297 (Purchase now)

P23-7 Classification of Audiovisual Media Content
Ulrich Reiter, Norwegian University of Science and Technology - Trondheim, Norway
This paper describes a qualitative experiment designed to ultimately derive a set of meaningful attributes for the classification of audiovisual media content. Whereas such attributes are available for the classification of video-only content, they are missing for audiovisual content. Based on the suggestions made by Woszczyk et al. in their 1995 AES Convention paper [Preprint 4133], we have taken a closer look in a combined set of experiments, one consisting of a quality trade-off decision and one of a relevance-sorting task with respect to these attributes.
Convention Paper 8298 (Purchase now)

P23-8 The Influence of Texture and Spatial Quality on the Perceived Quality of Blindly Separated Audio Source Signals
Thorsten Kastner, University of Erlangen - Erlangen, Germany, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany
Blind Audio Source Separation (BASS) algorithms are often employed in applications where the aim is the acoustic reproduction of the separated source signals. The perceived quality of the reproduced signals is therefore a crucial criterion. Two factors that influence the perceived quality of blindly separated source signals can be roughly distinguished: first, the quality of the separation of the desired target source from the signal mixture; second, the preservation of the spatial image of the source, i.e., the spatial position of the target source in the signal mixture as perceived by the listener. Based on extensive MUSHRA-style listening tests, results are presented reflecting the influence of both factors on the overall basic audio quality of BASS signals. Furthermore, a nonlinear regression model is set up to parametrize the influence of both factors on subjective audio quality. A correlation of 0.98 between predicted and measured subjective quality and a root-mean-square prediction error of 2.7 on the [0,100] MUSHRA scale were achieved when predicting the basic audio quality for an unseen listening test.
Convention Paper 8299 (Purchase now)

P23-9 Perceptual Evaluation of Spatial Audio Quality
Hwan Shim, Eunmi Oh, Sangchul Ko, Samsung Electronics - Gyeonggi-do, Korea; Sang Ha Park, Seoul National University - Seoul, Korea
With the rapid development of multimedia devices, realistic spatial audio is of growing interest. In this paper we discuss how to evaluate the realism of an audio experience and determine the major perceptual attributes needed to deliver it to listeners. We propose eight attributes in three categories: “timbre,” “localization,” and “spaciousness.” Each perceptual attribute is evaluated in subjective listening tests using different surround reproduction systems, including 10.2- and 22.2-channel systems. The experimental results show which spatial audio attributes are influential for a realistic audio experience and which are difficult to reproduce with current reproduction systems.
Convention Paper 8300 (Purchase now)

Sunday, November 7, 11:30 am — 1:00 pm (Room 134)

Special Event: Platinum Producers & Engineers

Paul Verna
Niko Bolas
Joe Chiccarelli
Ross Hogarth

The recording industry and the technology that empowers it have undergone seismic shifts over the past decade. Despite these upheavals, the roles of the producer and engineer have remained vital to the recording process. How do today’s top studio professionals stay focused in a fragmented, rapidly changing landscape? What challenges and opportunities do the struggles of the broader recording industry present? How do producers and engineers promote quality to an audience that seems more interested in convenience? These are just some of the questions that panelists Joe Chiccarelli (Frank Zappa, My Morning Jacket, Counting Crows), Niko Bolas (Neil Young, Warren Zevon, Spinal Tap), and Ross Hogarth (Lyle Lovett, John Mellencamp, Jewel) will entertain.

Sunday, November 7, 2:30 pm — 4:30 pm (Room 206)

Workshop: W19 - The Challenge of Producing Blu-ray

Stefan Bock, msm-studios - Munich, Germany
Markus Hinz, Minnetonka Audio Software - Minnetonka, MN, USA
John McDaniel, dts - Los Angeles, CA, USA
Joe Rice, MX Production - San Francisco, CA, USA
Mark Waldrep, AIX Media Group - Los Angeles, CA, USA

Blu-ray is now catching on, whether as a storage medium, a platform for high-definition video and audio, or even a super-high-quality format for audio-only titles such as Pure Audio Blu-ray. Do mixing and mastering engineers need to change their workflow to accommodate such formats? What is the challenge of working for Blu-ray compared to other surround formats? How can lossless codecs be implemented on Blu-ray? How can Blu-ray discs be authored according to the AES X-188 draft? The panel will present to the audience different authoring concepts as found in current commercial products.

Sunday, November 7, 2:30 pm — 5:00 pm (Room 236)

Paper Session: P27 - Room Acoustics

Søren Bech

P27-1 First Results from a Large-Scale Measurement Program for Home Theaters
Tomlinson Holman, Ryan Green, University of Southern California - Los Angeles, CA, USA, Audyssey Laboratories, Los Angeles, CA, USA
The introduction of one auto-equalization system to the home theater market, with an accompanying reporting infrastructure, provides methods of data collection that allow research into many practical system installations. Among the results delivered are histograms of room volume, reverberation time vs. volume and frequency, early-arrival sound frequency response both equalized and unequalized, and steady-state frequency response both equalized and unequalized. The variation in response over the listening area is studied as well and sheds light on contemporary use of the Schroeder frequency.
Convention Paper 8310 (Purchase now)

P27-2 Improving the Assessment of Low Frequency Room Acoustics Using Descriptive Analysis
Matthew Wankling, Bruno Fazenda, William J. Davies, University of Salford - Salford, Greater Manchester, UK
Several factors contribute to the perceived quality of reproduced low-frequency audio in small rooms. Listeners often use descriptive terms such as “boomy” or “resonant.” However, a robust terminology for rating samples during listening tests does not currently exist. This paper reports on a procedure to develop such a set of subjective descriptors for low-frequency reproduced sound, using descriptive analysis. The descriptors that resulted are Articulation, Resonance, and Bass Content. These terms have been used in listening tests to measure the subjective effect of changing three objective room parameters: modal decay time, room volume, and source/receiver position. Reducing decay time increased Articulation, while increased preference is associated with increased Articulation and decreased Resonance.
Convention Paper 8311 (Purchase now)

P27-3 Subjective Preference of Modal Control Methods in Listening Rooms
Bruno M. Fazenda, Lucy A. Elmer, Matthew Wankling, J. A. Hargreaves, J. M. Hirst, University of Salford - Greater Manchester, UK
Room modes are well known to cause unwanted effects in the correct reproduction of low frequencies in critical listening rooms. Methods to control these problems range from simple loudspeaker/listener positioning to quite complex digital signal processing. Nonetheless, the subjective importance and impact of these methods have rarely been quantified. A number of simple control methods were implemented in an IEC standard listening environment. Eight different configurations were set up in the room simultaneously and could therefore be tested in direct comparison to each other. A panel of 20 listeners was asked to state their preferred configuration using the method of paired comparison. Results show clear winners and losers, indicating an informed strategy for efficient control.
Convention Paper 8312 (Purchase now)

P27-4 Wide-Area Psychoacoustic Correction for Problematic Room Modes Using Non-Linear Bass Synthesis
Adam J. Hill, Malcolm O. J. Hawksford, University of Essex - Colchester, UK
Small-room acoustics are characterized by a limited number of dominant low-frequency room modes that produce wide spatio-pressure variations, which traditional room correction systems find difficult to correct over a broad listening area. A psychoacoustic-based methodology is proposed whereby signal components coincident with problematic modes are filtered out and substituted by virtual bass components to forge an illusion of the suppressed frequencies. A scalable and hierarchical approach is studied using the Chameleon Subwoofer Array (CSA), and subjective evaluation confirms uniform performance over a large area. Bass synthesis exploits parallel nonlinear and phase vocoder generators with outputs blended as a function of transient and steady-state signal content.
Convention Paper 8313 (Purchase now)

P27-5 Beyond Coding: Reproduction of Direct and Diffuse Sounds in Multiple Environments
James D. Johnston, DTS Inc. - Kirkland, WA, USA; Jean-Marc Jot, DTS Inc. - Scotts Valley, CA, USA; Zoran Fejzo, DTS Inc. - Calabasas, CA, USA; Steve R. Hastings, DTS Inc. - Scotts Valley, CA, USA
For many years, the difference in perception between perceptually direct sounds (i.e., sounds with a specific direction) and perceptually diffuse sounds (i.e., sounds that "surround" or "envelop" the listener) has been recognized, leading to a variety of approaches for simulating or capturing these perceptual effects. Here we discuss a system that separates direct and diffuse signals, or, for synthetic signals (e.g., those made by modern production methods), synthesizes the diffuse signal in one of several ways, so that the reproduction system, after measuring the characteristics of the playback equipment, can provide the best possible sensation from that particular set of playback equipment.
Convention Paper 8314 (Purchase now)

Sunday, November 7, 3:30 pm — 4:30 pm (Room 133)

Master Class: M4 - Hybrid Mixing: A Step by Step Class on Mixing The All-American Rejects Hit Single "Gives You Hell"

Eric Valentine

Eric Valentine will walk through the process of mixing "Gives You Hell." He will discuss all of the techniques, plug-ins, outboard gear, and external summing used in the process. Valentine will start with the unmixed material and go through the process of transforming it into the final mixed version that many folks may be familiar with. The audience will be able to participate by asking questions throughout the process and will be invited to improve on the finished version once it is complete.

Sunday, November 7, 4:30 pm — 6:00 pm (Room 130)

Workshop: W20 - Return to Quality in Audio Production

Andres Mayo, Andres Mayo Mastering - Buenos Aires, Argentina
Ronald Prent, Galaxy Studios - Mol, Belgium
Francisco Miranda, Engineer/Studio Owner - Mexico City, Mexico
Dave Reitzas, Mixer/Producer - Los Angeles, CA, USA
Jeff Wolpert, Producer/Educator - Toronto, Ontario, Canada

We are witnessing a return to the search for better quality in current audio productions, with engineers and producers more concerned about long-lasting recordings than about MP3 and Internet delivery alone. In Los Angeles, London, and Mexico City (just to name a few), great-sounding studios have recently opened, and established ones are regaining clientele thanks to new and improved recording and mastering systems. This panel will discuss the paradigm shift that is positively affecting industry professionals throughout the globe.

Return to Recording Industry Events