AES San Francisco 2008
Audio Product Design Event Details
Thursday, October 2, 9:00 am — 12:30 pm
P1 - Audio Coding
Chair: Marina Bosi, Stanford University - Stanford, CA, USA
P1-1 A Parametric Instrument Codec for Very Low Bit Rates
Mirko Arnold, Gerald Schuller, Fraunhofer Institute for Digital Media Technology - Ilmenau, Germany
A technique for the compression of guitar signals is presented that utilizes a simple model of the guitar. The goal for the codec is to obtain acceptable quality at significantly lower bit rates compared to universal audio codecs. This instrument codec achieves its data compression by transmitting an excitation function and model parameters to the receiver instead of the waveform. The parameters are extracted from the signal using weighted least squares approximation in the frequency domain. For evaluation a listening test has been conducted and the results are presented. They show that this compression technique provides a quality level comparable to recent universal audio codecs. The application however is, at this stage, limited to very simple guitar melody lines.
[This paper is being presented by Gerald Schuller.]
Convention Paper 7501
P1-2 Stereo ACC Real-Time Audio Communication
, University of Porto - Porto, Portugal, ATC Labs, Chatham, NJ, USA; Filipe Abreu, SEEGNAL Research - Portugal; Deepen Sinha, ATC Labs - Chatham, NJ, USA
Audio Communication Coder (ACC) is a codec that has been optimized for monophonic encoding of mixed speech/audio material while minimizing codec delay and improving intrinsic error robustness. In this paper we describe two major recent algorithmic improvements to ACC: on-the-fly bit rate switching and coding of stereo. A combination of source, parametric, and perceptual coding techniques allows a very graceful switching between different bit rates with minimal impact on the subjective quality. A real-time GUI demonstration platform is available that illustrates the ACC operation from 16 kbit/s mono to 256 kbit/s stereo. A real-time two-way stereo communication platform over Bluetooth has been implemented that illustrates the ACC operational flexibility and robustness in error-prone environments.
Convention Paper 7502
P1-3 MPEG-4 Enhanced Low Delay AAC—A New Standard for High Quality Communication
Markus Schnell, Markus Schmidt, Manuel Jander, Tobias Albert, Ralf Geiger, Fraunhofer IIS - Erlangen, Germany; Vesa Ruoppila, Per Ekstrand, Dolby Stockholm/Sweden, Nuremberg/Germany; Bernhard Grill, Fraunhofer IIS - Erlangen, Germany
The MPEG Audio standardization group has recently concluded the standardization process for the MPEG-4 ER Enhanced Low Delay AAC (AAC-ELD) codec. This codec is a new member of the MPEG Advanced Audio Coding family. It represents the efficient combination of the AAC Low Delay codec and the Spectral Band Replication (SBR) technique known from HE-AAC. This paper provides a complete overview of the underlying technology, presents points of operation as well as applications, and discusses MPEG verification test results.
Convention Paper 7503
P1-4 Efficient Detection of Exact Redundancies in Audio Signals
José R. Zapata G., Universidad Pontificia Bolivariana - Medellín, Antioquia, Colombia; Ricardo A. Garcia, Kurzweil Music Systems - Waltham, MA, USA
An efficient method to identify bitwise identical long-time redundant segments in audio signals is presented. It uses audio segmentation with simple time domain features to identify long term candidates for similar segments, and low level sample accurate metrics for the final matching. Applications in compression (lossy and lossless) of music signals (monophonic and multichannel) are discussed.
Convention Paper 7504
P1-5 An Improved Distortion Measure for Audio Coding and a Corresponding Two-Layered Trellis Approach for its Optimization
Vinay Melkote, Kenneth Rose, University of California - Santa Barbara, CA, USA
The efficacy of rate-distortion optimization in audio coding is constrained by the quality of the distortion measure. The proposed approach is motivated by the observation that the Noise-to-Mask Ratio (NMR) measure, as it is widely used, is only well adapted to evaluate relative distortion of audio bands of equal width on the Bark scale. We propose a modification of the distortion measure to explicitly account for Bark bandwidth differences across audio coding bands. Substantial subjective gains are observed when this new measure is utilized instead of NMR in the Two Loop Search, for quantization and coding parameters of scalefactor bands in an AAC encoder. Comprehensive optimization of the new measure, over the entire audio file, is then performed using a two-layered trellis approach, and yields nearly artifact-free audio even at low bit-rates.
Convention Paper 7505
P1-6 Spatial Audio Scene Coding
Michael M. Goodwin, Jean-Marc Jot, Creative Advanced Technology Center - Scotts Valley, CA, USA
This paper provides an overview of a framework for generalized multichannel audio processing. In this Spatial Audio Scene Coding (SASC) framework, the central idea is to represent an input audio scene in a way that is independent of any assumed or intended reproduction format. This format-agnostic parameterization enables optimal reproduction over any given playback system as well as flexible scene modification. The signal analysis and synthesis tools needed for SASC are described, including a presentation of new approaches for multichannel primary-ambient decomposition. Applications of SASC to spatial audio coding, upmix, phase-amplitude matrix decoding, multichannel format conversion, and binaural reproduction are discussed.
Convention Paper 7507
P1-7 Microphone Front-Ends for Spatial Audio Coders
, Illusonic LLC - Lausanne, Switzerland
Spatial audio coders, such as MPEG Surround, have enabled low bit-rate and stereo backwards compatible coding of multichannel surround audio. Directional audio coding (DirAC) can be viewed as spatial audio coding designed around specific microphone front-ends. DirAC is based on B-format spatial sound analysis and has no direct stereo backwards compatibility. We present a number of two-capsule-based, stereo-compatible microphone front-ends and corresponding spatial audio encoder modifications that enable the use of spatial audio coders to directly capture and code surround sound.
Convention Paper 7508
Thursday, October 2, 9:00 am — 10:45 am
T1 - Electroacoustic Measurements
Christopher J. Struck, CJS Labs - San Francisco, CA, USA
This tutorial focuses on applications of electroacoustic measurement methods, instrumentation, and data interpretation as well as practical information on how to perform appropriate tests. Linear system analysis and alternative measurement methods are examined. The topic of simulated free field measurements is treated in detail. Nonlinearity and distortion measurements and causes are described. Last, a number of advanced tests are introduced.
This tutorial is intended to enable the participants to perform accurate audio and electroacoustic tests and provide them with the necessary tools to understand and correctly interpret the results.
Thursday, October 2, 11:00 am — 1:00 pm
T2 - Standards-Based Audio Networks Using IEEE 802.1 AVB
, Harman International - CA, USA
Matthew Xavier Mora, Apple - Cupertino, CA, USA
Michael Johas Teener, Broadcom Corp. - Irvine, CA, USA
Recent work by IEEE 802 working groups will allow vendors to build a standards-based network with the appropriate quality of service for high quality audio performance and production. This new set of standards, developed by the IEEE 802.1 Audio Video Bridging Task Group, provides three major enhancements for 802-based networks:
1. Precise timing to support low-jitter media clocks and accurate synchronization of multiple streams,
2. A simple reservation protocol that allows an endpoint device to notify the various network elements in a path so that they can reserve the resources necessary to support a particular stream, and
3. Queuing and forwarding rules that ensure that such a stream will pass through the network within the delay specified by the reservation.
These enhancements require no changes to the Ethernet lower layers and are compatible with all the other functions of a standard Ethernet switch (a device that follows the IEEE 802.1Q bridge specification). As a result, all of the rest of the Ethernet ecosystem is available to developers—in particular, the various high speed physical layers (up to 10 gigabit/sec in current standards, even higher speeds are in development), security features (encryption and authorization), and advanced management (remote testing and configuration) features can be used. This tutorial will outline the basic protocols and capabilities of AVB networks, describe how such a network can be used, and provide some simple demonstrations of network operation (including a live comparison with a legacy Ethernet network).
Thursday, October 2, 2:30 pm — 4:30 pm
T3 - Broadband Noise Reduction: Theory and Applications
, iZotope, Inc. - Boston, MA, USA
Jeremy Todd, iZotope, Inc. - Boston, MA, USA
Broadband noise reduction (BNR) is a common technique for attenuating background noise in audio recordings. Implementations of BNR have steadily improved over the past several decades, but the majority of them share the same basic principles. This tutorial discusses various techniques used in the signal processing theory behind BNR. This will include earlier methods of implementation such as broadband and multiband gates and compander-based systems for tape recording. Beyond these early methods, greater emphasis will be placed on recent advances in more modern techniques such as spectral subtraction. These include multi-resolution processing, psychoacoustic models, and the separation of noise into tonal and broadband parts. We will compare examples of each technique for their effectiveness on several types of audio recordings.
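As an illustration of the spectral subtraction mentioned in this abstract, the basic per-bin rule can be sketched as follows. This is a minimal textbook-style sketch under stated assumptions, not the presenters' method: it works on magnitude spectra assumed to come from some short-time Fourier transform, the noise estimate is assumed to have been measured from a noise-only passage, and the names `spectral_subtract`, `over_subtraction`, and `floor` are hypothetical.

```python
def spectral_subtract(signal_mag, noise_mag, over_subtraction=1.0, floor=0.05):
    """Basic magnitude spectral subtraction for one analysis frame.

    signal_mag: magnitudes of the noisy frame, one value per frequency bin.
    noise_mag:  estimated noise magnitudes for the same bins (assumed given).
    Returns the per-bin gains to apply to the noisy spectrum.
    """
    if len(signal_mag) != len(noise_mag):
        raise ValueError("signal and noise spectra must have the same length")
    gains = []
    for s, n in zip(signal_mag, noise_mag):
        if s <= 0.0:
            gains.append(floor)  # empty bin: nothing to preserve
            continue
        gain = 1.0 - over_subtraction * n / s
        # Clamping to a small floor avoids negative gains and helps limit
        # the tonal artifacts often called "musical noise".
        gains.append(max(gain, floor))
    return gains


# Hypothetical frame: the first bin is dominated by signal, the last by noise.
frame = [1.0, 0.5, 0.1]
noise = [0.1, 0.1, 0.1]
gains = spectral_subtract(frame, noise)
cleaned = [g * s for g, s in zip(gains, frame)]
```

Bins well above the noise estimate pass almost unchanged, while bins at the noise level are attenuated down to the floor.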
Thursday, October 2, 2:30 pm — 4:30 pm
P4 - Acoustic Modeling and Simulation
Chair: Scott Norcross, Communications Research Centre - Ottawa, Ontario, Canada
P4-1 Application of Multichannel Impulse Response Measurement to Automotive Audio
, Fraunhofer Institute for Digital Media Technology - Ilmenau, Germany, and Technical University of Delft, Delft, The Netherlands; Diemer de Vries, Technical University of Delft - Delft, The Netherlands
Audio reproduction in small enclosures differs in several respects from conventional room acoustics. Today’s car audio systems meet sophisticated expectations, but the automotive listening environment still presents critical acoustic properties. During the design of such an audio system it is helpful to gain insight into the temporal and spatial distribution of the acoustic field's properties. Because room acoustic modeling software reaches its limits, the use of acoustic imaging methods can be seen as a promising approach. This paper describes the application of wave field analysis based on a multichannel impulse response measurement in an automotive use case. Besides a suitable preparation of the theoretical aspects, the analysis method is used to investigate the acoustic wave field inside a car cabin.
Convention Paper 7521
P4-2 Multichannel Low Frequency Room Simulation with Properly Modeled Source Terms—Multiple Equalization Comparison
Ryan J. Matheson, University of Waterloo - Waterloo, Ontario, Canada
At low frequencies unwanted room resonances in regular-sized rectangular listening rooms cause problems. Various methods for reducing these resonances are available including some multichannel methods. Thus, with the introduction of setups like 5.1 surround into home theater systems, there are now more options available to perform active resonance control using the existing loudspeaker array. We focus primarily on comparing, separately, each step of loudspeaker placement and its effects on the response in the room as well as the effect of adding additional symmetrically placed loudspeakers in the rear to cancel out any additional room resonances. The comparison is done by use of a Finite Difference Time Domain (FDTD) simulator with focus on properly modeling a source in the simulation. The ability of a standard 5.1 setup to utilize a multichannel equalization technique (without adding additional loudspeakers to the setup) and a modal equalization technique is then discussed.
Convention Paper 7522
P4-3 A Super-Wide-Range Microphone with Cardioid Directivity
Kazuho Ono, Takehiro Sugimoto, Akio Ando, NHK Science and Technical Research Laboratories - Tokyo, Japan; Tomohiro Nomura, Yutaka Chiba, Keishi Imanaga, Sanken Microphone Co. Ltd. - Japan
This paper describes a super-wide-range microphone with cardioid directivity, which covers the frequency range up to 100 kHz. The authors have successfully developed an omni-directional microphone capable of picking up sounds of up to 100 kHz with low noise. The proposed microphone uses the omni-directional capsule adopted in that omni-directional super-wide-range microphone and a bi-directional capsule that is newly designed to fit the characteristics of the omni-directional one. The output signals of both capsules are combined to achieve cardioid directivity. The measurement results show that the proposed microphone achieves a wide frequency range up to 100 kHz, as well as low noise characteristics and excellent cardioid directivity.
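The way an omni-directional and a bi-directional capsule add up to a cardioid can be illustrated with idealized polar patterns. The following is a minimal sketch, not the authors' implementation: it assumes ideal, frequency-independent capsules (omni response 1, bi-directional response cos θ) summed with equal weight, and the function name `cardioid_response` is hypothetical.

```python
import math


def cardioid_response(theta_deg: float) -> float:
    """Idealized first-order cardioid sensitivity at angle theta_deg.

    Assumes ideal capsules: omni = 1, bi-directional (figure-8) = cos(theta),
    combined with equal weight.
    """
    theta = math.radians(theta_deg)
    omni = 1.0
    bidirectional = math.cos(theta)
    return 0.5 * (omni + bidirectional)


# Print the relative sensitivity on axis, to the side, and to the rear.
for angle in (0, 90, 180):
    print(angle, round(cardioid_response(angle), 3))
```

On axis the two capsule signals add; at 180 degrees they have opposite sign and cancel, which gives the single rear null of a cardioid.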
Convention Paper 7523
P4-4 Methods and Limitations of Line Source Simulation
, Ahnert Feistel Media Group - Berlin, Germany; Ambrose Thompson, Martin Audio - High Wycombe, Bucks, UK; Wolfgang Ahnert, Ahnert Feistel Media Group - Berlin, Germany
Although line array systems are in widespread use today, investigations of the requirements and methods for accurate modeling of line sources are scarce. In previous publications the concept of the Generic Loudspeaker Library (GLL) was introduced. We show that on the basis of directional elementary sources with complex directivity data finite line sources can be simulated in a simple, general, and precise manner. We derive measurement requirements and discuss the limitations of this model. Additionally, we present a second step of refinement, namely the use of different directivity data for cabinets of identical type based on their position in the array. All models are validated by measurements. We compare the approach presented with other proposed solutions.
Convention Paper 7524
Thursday, October 2, 4:30 pm — 6:30 pm
B4 - Mobile/Handheld Broadcasting: Developing a New Medium
, Public Broadcasting Service
Panelists:
, Sinclair Broadcast Group
Sterling Davis, Cox Broadcasting
Brett Jenkins, Ion Media Networks
Dakx Turcotte, Neural Audio Corp.
The broadcasting industry, the broadcast and consumer equipment vendors, and the Advanced Television Systems Committee have been vigorously moving forward toward the development of a Mobile/Handheld DTV broadcast standard and its practical implementation. In order to bring this new service to the public, players from various industry segments have come together in an unprecedented fashion. In this session key leaders in this activity will present what the emerging system includes, how far the industry has progressed, and what’s left to be done.
Thursday, October 2, 5:00 pm — 6:45 pm
W5 - Engineering Mistakes We Have Made in Audio
, Oxford Digital Limited - UK
Panelists:
, Audio Imagination
James D. (JJ) Johnston, Neural Audio Corp.
Mel Lambert, Media & Marketing
George Massenburg, Massenburg Design Works
Jim McTigue, Impulsive Audio
Six leading audio product developers will share the enlightening, thought-provoking, and (in retrospect) amusing lessons they have learned from actual mistakes they have made in the product development trenches.
Friday, October 3, 9:00 am — 10:45 am
L4 - White Space Issues
, Shure Incorporated - Ni
Panelists:
, Production Radio Rentals - Yonkers, NY, USA
The DTV conversion will be complete on February 17, 2009. The impact of this and surrounding FCC decisions is of great concern to wireless microphone users. Will 700 MHz band mics retain type certification? Will proposed white space devices create new interference? Will there be an FCC crack-down on unlicensed microphone use? This panel will discuss the latest FCC rule decisions and decisions still pending.
Friday, October 3, 9:00 am — 10:30 am
W6 - Audio Networking for the Pros
, ZP Engineering srl
Panelists:
, Peavey Digital Research
Greg Shay, Axia Audio
Jérémie Weber, Auvitran
Aidan Williams
Several solutions are available on the market today for digital audio transfer over conventional data cabling, but only some of them allow usage of standard networking equipment. This workshop presents some commercially available solutions (Cobranet, Livewire, Ethersound, Dante), with specific focus on noncompressed, low-latency audio transmission for pro-audio and live applications using standard IEEE 802.3 network technology. The main challenges of digital audio transport will be outlined, including compatibility with common networking equipment, reliability, latency, and deployment. Typical scenarios will be proposed, with panelists explaining their own approaches and solutions.
Friday, October 3, 9:00 am — 1:00 pm
P6 - Loudspeaker Design
Chair: Alexander Voishvillo, JBL Professional - Northridge, CA, USA
P6-1 Loudspeaker Production Variance
, Equity Sound Investments - Bloomington, IN, USA; Laurie Fincham, THX Ltd. - San Rafael, CA, USA
Numerous quality assurance philosophies have evolved over the last few decades designed to manage manufacturing quality. Managing quality control of production loudspeakers is particularly challenging. Variation of subcomponents and assembly processes across loudspeaker driver production batches may lead to excessive variation of sensitivity, bandwidth, frequency response, and distortion characteristics, etc. As loudspeaker drivers are integrated into production audio systems these variants result in broad performance permutation from system to system that affects all aspects of acoustic balance and spatial attributes. This paper will discuss traditional electro-dynamic loudspeaker production variation.
Convention Paper 7530
P6-2 Distributed Mechanical Parameters Describing Vibration and Sound Radiation of Loudspeaker Drive Units
Wolfgang Klippel, University of Technology Dresden - Dresden, Germany; Joachim Schlechter, KLIPPEL GmbH - Dresden, Germany
The mechanical vibration of loudspeaker drive units is described by a set of linear transfer functions and geometrical data that are measured at selected points on the surface of the radiator (cone, dome, diaphragm, piston, panel) by using a scanning technique. These distributed parameters supplement the lumped parameters (T/S, nonlinear, thermal), simplify the communication between cone, driver, and loudspeaker system design and open new ways for loudspeaker diagnostics. The distributed vibration can be summarized to a new quantity called accumulated acceleration level (AAL), which is comparable with the sound pressure level (SPL) if no acoustical cancellation occurs. This and other derived parameters are the basis for modal analysis and novel decomposition techniques that make the relationship between mechanical vibration and sound pressure output more transparent. Practical problems and indications for practical improvements are discussed for various example drivers. Finally, the usage of the distributed parameters within finite and boundary element analyses is addressed and conclusions for the loudspeaker design process are made.
Convention Paper 7531
P6-3 A New Methodology for the Acoustic Design of Compression Driver Phase-Plugs with Radial Channels
, Celestion International Ltd. - Ipswich, UK, and GP Acoustics (UK) Ltd., Maidstone, UK; Jack Oclee-Brown, GP Acoustics (UK) Ltd. - Maidstone, UK, and University of Southampton, Southampton, UK
Recent work by the authors describes an improved methodology for the design of annular-channel, dome compression drivers. Although not so popular, radial channel phase plugs are used in some commercial designs. While there has been some limited investigation into the behavior of this kind of compression driver, the literature is much more extensive for annular types. In particular, the modern approach to compression driver design, based on a modal description of the compression cavity, as first pioneered by Smith, has no equivalent for radial designs. In this paper we first consider if a similar approach is relevant to radial-channel phase plug designs. The acoustical behavior of a radial-channel compression driver is analytically examined in order to derive a geometric condition that ensures minimal excitation of the compression cavity modes.
Convention Paper 7532
P6-4 Mechanical Properties of Ferrofluids in Loudspeakers
Guy Lemarquand, Romain Ravaud, Valerie Lemarquand, Claude Depollier, Laboratoire d’Acoustique de l’Université du Maine - Le Mans, France
This paper describes the properties of ferrofluid seals in ironless electrodynamic loudspeakers. The motor consists of several outer stacked ring permanent magnets. The inner moving part is a piston. In addition, two ferrofluid seals are used that replace the classic suspension. Indeed, these seals fulfill several functions. First, they ensure the airtightness between the loudspeaker faces. Second, they act as bearings and center the moving part. Finally, the ferrofluid seals also exert a pull back force on the moving piston. Both radial and axial forces exerted on the piston are calculated thanks to analytical formulations. Furthermore, the shape of the seal is discussed as well as the optimal quantity of ferrofluid. The seal capacity is also calculated.
Convention Paper 7533
P6-5 An Ironless Low Frequency Subwoofer Functioning under its Resonance Frequency
, Université du Maine - Le Mans, France, Orkidia Audio, Saint Jean de Luz, France; Guy Lemarquand, Université du Maine - Le Mans, France; Bernard Nemoff, Orkidia Audio - Saint Jean de Luz, France
A low frequency loudspeaker (10 Hz to 100 Hz) is described. Its structure is totally ironless in order to avoid nonlinear effects due to the presence of iron. The large diaphragm and the high force factor of the loudspeaker lead to its high efficiency. Efforts have been made for reducing the nonlinearities of the loudspeaker for a more accurate sound reproduction. In particular we have developed a motor totally made of permanent magnets, which create a uniform induction across the entire intended displacement of the coil. The motor linearity and the high force factor of this flat loudspeaker make it possible to function under its resonance frequency with great accuracy.
Convention Paper 7534
P6-6 Line Arrays with Controllable Directional Characteristics—Theory and Practice
Laurie Fincham, Peter Brown, THX Ltd. - San Rafael, CA, USA
A so-called arc line array is capable of providing directivity control. Applying simple amplitude shading can, in theory, provide good off-axis lobe suppression and constant directivity over a frequency range determined at low frequencies by line length and at high frequencies by driver spacing. Array transducer design presents additional challenges: the dual requirements of close spacing, for accurate high-frequency control, and a large effective radiating area, for good bass output, are incompatible with the use of multiple full-range drivers. A novel drive unit layout is proposed and theoretical and practical design criteria are presented for a two-way line with controllable directivity and virtual elimination of spatial aliasing. The PC-based array controller permits real-time changes in beam parameters for multiple overlaid beams.
Convention Paper 7535
P6-7 Loudspeaker Directivity Improvement Using Low Pass and All Pass Filters
, Excelsior Audio Design & Services, LLC - Gastonia, NC, USA
The response of loudspeaker systems employing multiple drivers within the same pass band is often less than ideal. This is due to the physical separation of the drivers and their lack of proper acoustical coupling within the higher frequency region of their use. The resultant comb filtering is sometimes addressed by applying a low pass filter to one or more of the drivers within the pass band. This can cause asymmetries in the directivity response of the loudspeaker system. A method is presented to greatly minimize these asymmetries through the use of low pass and all pass filters. This method is also applicable as a means to extend the directivity control of a loudspeaker system to lower frequencies.
Convention Paper 7536
P6-8 On the Necessary Delay for the Design of Causal and Stable Recursive Inverse Filters for Loudspeaker Equalization
Avelino Marques, Diamantino Freitas, Polytechnic Institute of Porto - Porto, Portugal
The authors have developed and applied a novel approach to the equalization of non-minimum phase loudspeaker systems, based on the design of Infinite Impulse Response (recursive) inverse filters. In this paper the results and improvements attained on this novel IIR filter design method are presented. Special attention has been given to the delay of the equalized system. The boundaries to be posed on the search space of the delay for a causal and stable inverse filter, to be used in the nonlinear least squares minimization routine, are studied, identified, and related with the phase response of a test system and with the order of the inverse filter. Finally, these observations and relations are extended and applied to multi-way loudspeaker systems, demonstrating the connection of the lower and upper bounds of the delay with the loudspeaker’s crossover filters phase response and inverse filter order.
Convention Paper 7537
Friday, October 3, 2:30 pm — 4:30 pm
, Audience
Greg Duckett, Rane
Michael Poimboeuf, DigiDesign
Richard Wear
This session is aimed at job candidates in electrical engineering and computer science who want a private, no-cost, no-obligation confidential review of their resume. You can expect feedback such as: what is missing from the resume; what you should omit from the resume; how to strengthen your explanation of your talents and skills. Recent graduates, juniors, seniors, and graduate students who are now seeking, or will soon be seeking, a full-time employment position in the audio and music industries in hardware or software engineering will especially benefit from participating, but others with more industry experience are also invited. You will meet one-on-one with someone from a company in the audio and music industries with experience in hiring for R&D positions. Bring a paper copy of your resume and be prepared to take notes.
Friday, October 3, 2:30 pm — 3:30 pm
T6 - Modern Perspectives on Hewlett's Sine Wave Oscillator
, Linear Technology - Milpitas, CA, USA
This tutorial describes the thesis and related work of a Stanford University graduate student, William R. Hewlett. Hewlett’s 1939 thesis, concerning a then-new type of sine wave oscillator, is reviewed. His use of new concepts and ideas of Nyquist, Black, and Meacham is considered. Hewlett displays an uncanny knack for combining ideas to synthesize his desired result. The oscillator is a beautiful example of lateral thinking. The whole problem was considered in an interdisciplinary spirit, not just an electronic one. This is the signature of superior problem solving and good engineering. Although the theoretics and technology are now passé, the quality of Hewlett's thinking remains rare, and singularly human. No computer driven “expert system” could ever emulate such lateral thinking, advertising copy notwithstanding. Modern adaptations of Hewlett’s guidance complete the tutorial. Handouts include Hewlett’s thesis, a detailed production schematic of the oscillator, and modern versions of the circuit.
Friday, October 3, 2:30 pm — 6:30 pm
P9 - Multichannel Sound Reproduction
Chair: Durand Begault, NASA Ames Research Center - Mountain View, CA, USA
P9-1 An Investigation of 2-D Multizone Surround Sound Systems
, Industrial Research Limited - Lower Hutt, Wellington, New Zealand
Surround sound systems can produce a desired sound field over an extended region of space by using higher order Ambisonics. One application of this capability is the production of multiple independent soundfields in separate zones. This paper investigates multi-zone surround systems for the case of two-dimensional reproduction. A least squares approach is used for deriving the loudspeaker weights for producing a desired single frequency wave field in one of N zones. It is shown that reproduction in the active zone is more difficult when an inactive zone is in-line with the virtual sound source and the active zone. Methods for controlling this problem are discussed.
Convention Paper 7551
P9-2 Two-Channel Matrix Surround Encoding for Flexible Interactive 3-D Audio Reproduction
, Creative Advanced Technology Center - Scotts Valley, CA, USA
The two-channel matrix surround format is widely used for connecting the audio output of a video gaming system to a home theater receiver for multichannel surround reproduction. This paper describes the principles of a computationally-efficient interactive audio spatialization engine for this application. Positional cues including 3-D elevation are encoded for each individual sound source by frequency-independent interchannel phase and amplitude differences, rather than HRTF cues. A matrix surround decoder based on frequency-domain Spatial Audio Scene Coding (SASC) is able to faithfully reproduce both ambient reverberation and positional cues over headphones or arbitrary multichannel loudspeaker reproduction formats, while preserving source separation despite the intermediate encoding over only two channels.
Convention Paper 7552
P9-3 Is My Decoder Ambisonic?
, SRI International - Menlo Park, CA, USA; Richard Lee, Pandit Littoral - Cooktown, Queensland, Australia; Eric Benjamin, Dolby Laboratories - San Francisco, CA, USA
In earlier papers, the present authors established the importance of various aspects of Ambisonic decoder design: a decoding matrix matched to the geometry of the loudspeaker array in use, phase-matched shelf filters, and distance compensation. These are needed for accurate reproduction of spatial localization cues, such as interaural time difference (ITD), interaural level difference (ILD), and distance cues. Unfortunately, many listening tests of Ambisonic reproduction reported in the literature either omit the details of the decoding used or utilize suboptimal decoding. In this paper we review the acoustic and psychoacoustic criteria for Ambisonic reproduction; present a methodology and tools for "black box" testing to verify the performance of a candidate decoder; and present and discuss the results of this testing on some widely used decoders.
Convention Paper 7553
P9-4 Exploiting Human Spatial Resolution in Surround Sound Decoder Design
David Moore, Jonathan Wakefield, University of Huddersfield - West Yorkshire, UK
This paper presents a technique whereby the localization performance of surround sound decoders can be improved in directions in which human hearing is more sensitive to sound source location. Research into the Minimum Audible Angle is explored and incorporated into a fitness function based upon a psychoacoustic model. This fitness function is used to guide a heuristic search algorithm to design new Ambisonic decoders for a 5-speaker surround sound layout. The derived decoder is successful in matching the variation in localization performance of the human listener with better performance to the front and rear and reduced performance to the sides. The effectiveness of the standard ITU 5-speaker layout versus a non-standard layout is also considered in this context.
Convention Paper 7554
P9-5 Surround System Based on Three-Dimensional Sound Field Reconstruction
—Filippo M. Fazi, Philip A. Nelson, Jens E. Christensen
, University of Southampton - Southampton, UK; Jeongil Seo
, Electronics and Telecommunications Research Institute (ETRI) - Daejeon, Korea
The theoretical fundamentals and the simulated and experimental performance of an innovative surround sound system are presented. The proposed technology is based on the physical reconstruction of a three-dimensional target sound field over a region of space using an array of loudspeakers surrounding the listening area. The computation of the loudspeaker gains involves the numerical or analytical solution of an integral equation of the first kind. The experimental setup and the measured reconstruction performance of a prototype system consisting of a three-dimensional array of 40 loudspeakers are described and discussed.
Convention Paper 7555
P9-6 A Comparison of Wave Field Synthesis and Higher-Order Ambisonics with Respect to Physical Properties and Spatial Sampling
—Sascha Spors, Jens Ahrens
, Technische Universität Berlin - Berlin, Germany
Wave field synthesis (WFS) and higher-order Ambisonics (HOA) are two high-resolution spatial sound reproduction techniques that aim to overcome some of the limitations of stereophonic reproduction. In the past, the theoretical foundations of WFS and HOA have been formulated in quite different ways. Although some work has been published that aims at comparing both approaches, their similarities and differences are not well documented. This paper formulates the theory of both approaches in a common framework and highlights the different assumptions made to derive the driving functions and the resulting physical properties of the reproduced wave field. Special attention is paid to the spatial sampling of the secondary sources, since the two approaches differ significantly here.
Convention Paper 7556
P9-7 Reproduction of Virtual Sound Sources Moving at Supersonic Speeds in Wave Field Synthesis
—Jens Ahrens, Sascha Spors
, Technische Universität Berlin - Berlin, Germany
In conventional implementations of wave field synthesis, moving sources are reproduced as sequences of stationary positions. As reported in the literature, this process introduces various artifacts. It has been shown recently that these artifacts can be reduced when the physical properties of the wave field of moving virtual sources are explicitly considered. However, the findings were only applied to virtual sources moving at subsonic speeds. In this paper we extend the published approach to the reproduction of virtual sound sources moving at supersonic speeds. The properties of the actual reproduced sound field are investigated via numerical simulations.
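One well-known physical property of a supersonic source is the Mach cone, whose half-angle follows from the ratio of the speed of sound to the source speed. The sketch below only illustrates that textbook relation; the function name and the default speed of sound of 343 m/s are assumptions, and nothing here reproduces the driving functions derived in the paper.

```python
import math

def mach_angle_deg(source_speed, sound_speed=343.0):
    """Half-angle of the Mach cone in degrees: sin(angle) = c / v."""
    if source_speed <= sound_speed:
        raise ValueError("a Mach cone only exists for supersonic sources")
    return math.degrees(math.asin(sound_speed / source_speed))

# A source at Mach 2 (twice the assumed speed of sound).
print(mach_angle_deg(686.0))
```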
Convention Paper 7557
P9-8 An Efficient Method to Generate Particle Sounds in Wave Field Synthesis
—Michael Beckinger, Sandra Brix
, Fraunhofer Institute for Digital Media Technology - Ilmenau, Germany
Rendering a few virtual sound sources for wave field synthesis (WFS) in real time is nowadays feasible using the computing power of state-of-the-art personal computers. If immersive atmospheres containing thousands of sound particles, such as rain and applause, are to be rendered in real time for a large listening area with high spatial accuracy, the computational complexity increases enormously. A new algorithm based on continuously generated impulse responses and subsequent convolutions, which renders many sound particles efficiently, will be presented in this paper. The algorithm was verified by initial listening tests, and its computational complexity was evaluated as well.
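The general idea of replacing many individually rendered particles with one generated impulse response and a single convolution can be sketched as follows. This is a minimal illustration for a single loudspeaker, assuming every particle plays the same short grain, integer-sample delays, and a simple 1/r gain; the function names and parameter values are hypothetical, and this is not the authors' algorithm.

```python
import random

def particle_impulse_response(particle_distances, sample_rate, length, c=343.0):
    """Build one impulse response holding a delayed, attenuated impulse per particle."""
    ir = [0.0] * length
    for r in particle_distances:
        delay = int(round(r / c * sample_rate))  # propagation delay in samples
        if delay < length:
            ir[delay] += 1.0 / max(r, 1.0)       # 1/r spreading, clamped near the source
    return ir

def convolve(signal, ir):
    """Direct-form convolution that skips the zero taps of a sparse impulse response."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for n, h in enumerate(ir):
        if h != 0.0:
            for m, s in enumerate(signal):
                out[n + m] += h * s
    return out

rng = random.Random(0)
distances = [rng.uniform(2.0, 20.0) for _ in range(1000)]  # hypothetical particle cloud
grain = [1.0, -0.6, 0.3, -0.1]                             # hypothetical particle sound
rendered = convolve(grain, particle_impulse_response(distances, 8000, 512))
print(len(rendered))
```

The cost of the convolution depends on the impulse response length rather than on the number of particles, which is the motivation for this kind of approach.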
Convention Paper 7558
Friday, October 3, 2:30 pm — 5:00 pm
P10 - Nonlinearities in Loudspeakers
: Laurie Fincham
, THX Ltd. - San Rafael, CA, USA
P10-1 Audibility of Phase Response Differences in a Stereo Playback System. Part 2: Narrow-Band Stimuli in Headphones and Loudspeakers
—Sylvain Choisel, Geoff Martin
, Bang & Olufsen A/S - Struer, Denmark
A series of experiments was conducted in order to measure the audibility thresholds of phase differences between channels using mismatched crossover networks. In Part 1 of this study, it was shown that listeners are able to detect very small inter-channel phase differences when presented with wide-band stimuli over headphones, and that the threshold was frequency dependent. This second part of the investigation focuses on listeners’ abilities with narrow-band signals (from 63 to 8000 Hz) in headphones as well as loudspeakers. The results confirm the frequency dependency of the audibility threshold over headphones, whereas for loudspeaker playback the threshold was essentially independent of frequency.
Convention Paper 7559
P10-2 Time Variance of the Suspension Nonlinearity
, Technical University of Denmark - Lyngby, Denmark; Bo Rhode Petersen
, Aalborg University - Esbjerg, Denmark
It is well known that the resonance frequency of a loudspeaker depends on how it is driven before and during the measurement. Measurements made right after exposing it to high levels of electrical power and/or excursion give lower values than those measured when the loudspeaker is cold. This paper investigates the changes in compliance that the driving signal can cause. This includes low-level, short-duration measurements of the resonance frequency as well as high-power, long-duration measurements of the nonlinearity of the suspension. It is found that at low levels the suspension softens but recovers quickly. The high-power, long-term measurements affect the nonlinearity of the loudspeaker by increasing the compliance for all values of displacement. This level dependency is validated with distortion measurements, and it is demonstrated how improved accuracy of the nonlinear model can be obtained by including the level dependency.
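The link between suspension compliance and resonance frequency can be made concrete with the standard lumped-parameter relation f_s = 1 / (2π·sqrt(M·C)). The sketch below applies it to placeholder values; the moving mass, the compliance, and the 20% softening are hypothetical numbers chosen for illustration, not measurements from the paper.

```python
import math

def resonance_frequency(moving_mass_kg, compliance_m_per_n):
    """Resonance frequency in Hz of a simple mass-spring model of the driver."""
    return 1.0 / (2.0 * math.pi * math.sqrt(moving_mass_kg * compliance_m_per_n))

mass = 0.020             # hypothetical moving mass in kg
cold_compliance = 1e-3   # hypothetical compliance in m/N
f_cold = resonance_frequency(mass, cold_compliance)
f_soft = resonance_frequency(mass, 1.2 * cold_compliance)  # suspension softened by 20%
print(round(f_cold, 2), round(f_soft, 2))
```

Because the frequency scales with the inverse square root of the compliance, a 20% increase in compliance lowers the resonance by roughly 9%.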
Convention Paper 7560
P10-3 A Study of the Creep Effect in Loudspeakers Suspension
, Technical University of Denmark - Lyngby, Denmark; Knud Thorborg, Carsten Tinggaard
, Tymphany A/S - Taastrup, Denmark
This paper investigates the creep effect, the viscoelastic behavior of loudspeaker suspension parts, which can be observed as an increase in displacement far below the resonance frequency. The creep effect means that the suspension cannot be modeled as a simple spring. The need for an accurate creep model is even greater now that loudspeaker models are expected to remain valid far into the nonlinear domain of the loudspeaker. Different creep models are investigated and implemented both in simple lumped-parameter models and in time-domain nonlinear models. The simulation results are compared with a series of measurements on three versions of the same loudspeaker with different thicknesses and rubber types used in the surround.
Convention Paper 7561
P10-4 The Influence of Acoustic Environment on the Threshold of Audibility of Loudspeaker Resonances
, Bang & Olufsen A/S - Struer, Denmark and University of Surrey, Guildford, Surrey, UK; Sylvain Choisel
, Bang & Olufsen A/S - Struer, Denmark
Resonances in loudspeakers can produce a detrimental effect on sound quality. The reduction or removal of unwanted resonances has therefore become a recognized practice in loudspeaker tuning. This paper presents the results of a listening test that has been used to determine the audibility threshold of a single resonance in different acoustic environments: headphones, loudspeakers in a standard listening room, and loudspeakers in a car. Real loudspeakers were measured and the resonances modeled as IIR filters. Results show that there is a significant interaction between acoustic environment and program material.
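A single resonance modeled as an IIR filter is often written as a second-order peaking section. The sketch below uses the widely circulated Audio EQ Cookbook peaking formulas as one possible form; the abstract does not say which filter structure the authors used, and the resonance frequency, gain, and Q are placeholder values.

```python
import cmath
import math

def peaking_biquad(f0, gain_db, q, sample_rate):
    """Normalized (b, a) coefficients of a peaking biquad (Audio EQ Cookbook form)."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * a_lin, -2.0 * math.cos(w0), 1.0 - alpha * a_lin]
    a = [1.0 + alpha / a_lin, -2.0 * math.cos(w0), 1.0 - alpha / a_lin]
    return [x / a[0] for x in b], [x / a[0] for x in a]

def magnitude_db(b, a, freq, sample_rate):
    """Magnitude response in dB at a single frequency."""
    z1 = cmath.exp(-2j * math.pi * freq / sample_rate)  # z^-1 on the unit circle
    h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
    return 20.0 * math.log10(abs(h))

# Hypothetical resonance: 3 dB peak at 1 kHz with Q = 10, sampled at 48 kHz.
b, a = peaking_biquad(1000.0, 3.0, 10.0, 48000.0)
print(magnitude_db(b, a, 1000.0, 48000.0), magnitude_db(b, a, 100.0, 48000.0))
```

In a listening test of this kind, the gain and Q would be the parameters varied to find the audibility threshold.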
Convention Paper 7562
P10-5 Confirmation of Chaos in a Loudspeaker System Using Time Series Analysis
, Queen Mary, University of London - London, UK; Ivan Djurek, Antonio Petosic
, University of Zagreb - Zagreb, Croatia; Danijel Djurek
, AVAC – Alessandro Volta Applied Ceramics, Laboratory for Nonlinear Dynamics - Zagreb, Croatia
The dynamics of an experimental electrodynamic loudspeaker is studied by using the tools of chaos theory and time series analysis. Delay time, embedding dimension, fractal dimension, and other empirical quantities are determined from experimental data. Particular attention is paid to issues of stationarity in the system in order to identify sources of uncertainty. Lyapunov exponents and fractal dimension are measured using several independent techniques. Results are compared in order to establish independent confirmation of low dimensional dynamics and a positive dominant Lyapunov exponent. We thus show that the loudspeaker may function as a chaotic system suitable for low dimensional modeling and the application of chaos control techniques.
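The delay time and embedding dimension mentioned above are the two parameters of time-delay embedding, the usual first step in this kind of analysis. A minimal sketch, with a hypothetical function name and toy data, looks like this:

```python
def delay_embed(series, dimension, delay):
    """Return delay vectors (x[i], x[i + delay], ..., x[i + (dimension - 1) * delay])."""
    span = (dimension - 1) * delay
    return [tuple(series[i + k * delay] for k in range(dimension))
            for i in range(len(series) - span)]

# Toy series, embedded in 3 dimensions with a delay of 2 samples.
print(delay_embed([0, 1, 2, 3, 4, 5], 3, 2))  # [(0, 2, 4), (1, 3, 5)]
```

Quantities such as the fractal dimension and Lyapunov exponents are then estimated from the geometry of these reconstructed vectors.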
Convention Paper 7563
Friday, October 3, 5:00 pm — 6:30 pm
W7 - Same Techniques, Different Technologies—Recurring Strategies for Producing Game, Web, and Mobile Audio
, NickOnline
George "The Fatman" Sanger
, Legendary Game Audio Guru
Guy Whitmore
, Microsoft Game Studio
When any new technology develops, the limitations of current systems are inevitably met. Bandwidth constraints then generate a class of techniques designed to maximize information transfer. Over time as bottlenecks expand, new kinds of applications become possible, making previous methods and file formats obsolete. By the time broadband access becomes available, we can observe a similar progression taking place in the next developing technology. The workshop discusses this trend as exhibited in the gaming, Internet, and mobile industries, with particular emphasis on audio file types and compression techniques. The presenter will compare and contrast obsolete tricks of the trade with current practices and invite industry veterans to discuss the trend from their points of view. Finally the panel makes predictions about the evolution of media.
Friday, October 3, 5:00 pm — 6:30 pm
P12 - Amplifiers and Automotive Audio
P12-1 Imperfections and Possible Advances in Analog Summing Amplifier Design
, MMK Instruments - Belgrade, Serbia; Dragan Drincic
, Advanced School for Electrical & Computer Engineering - Belgrade, Serbia; Sasha Jankovic
, OXYGEN-Digital, Parkgate Studio - Sussex, UK
The major requirement in the design of an analog summing amplifier is the quality of the summing bus. The key problem in most common designs is that the summing bus impedance is an artifact: it cannot be considered a true physical impedance because it is generated by negative feedback. The loop gain of the amplifier limits performance at higher audio frequencies, where the loop gain is lower, increasing inter-channel crosstalk. An inevitable effect of heavy feedback is the increased susceptibility of the amplifier to oscillation as well as sensitivity to RFI. The advanced solution presented in this paper uses a transistor common-base pair (CB-CB) configuration as the summing bus. The CB pair offers inherently low input impedance, low noise, and very good frequency response and, very importantly, makes the application of total feedback unnecessary.
Convention Paper 7569
P12-2 A Switchmode Power Supply Suitable for Audio Power Amplifiers
, Factor One Inc. - Keyport, NJ, USA
Power supplies for audio amplifiers have different requirements than typical commercial power supplies. A tabulation of power supply parameters that affect the audio application is presented and discussed. Different types of audio amplifiers are categorized and shown to have different requirements. Over time new technologies have emerged that affect the implementation of AC to DC converters used in audio amplifiers. A brief history of audio power supply technology is presented. The evolution of the newly proposed interleaved boost with LLC resonant half bridge topology from preceding technologies is shown. The operation of the new topology is explained and its advantages are shown by a simulation of the circuit.
Convention Paper 7570
P12-3 On the Optimization of Enhanced Cascode
, Consultant - Miami, FL, USA
Twenty years ago, enhanced cascode and other circuit topologies based on the same design principles were presented to audio amplifier designers. The circuit was intended for use in transconductance gain stages and current sources. Enhanced cascode was used in some commercial products but has not been widely adopted. It was speculated that enhanced cascode has reduced phase margin and, at times, higher distortion compared to conventional cascode. Enhanced cascode is analyzed on the basis of distortion and frequency response. It is shown how to make the most of enhanced cascode. An optimized novel circuit topology is presented.
Convention Paper 7571
P12-4 An Active Load and Test Method for Evaluating the Efficiency of Audio Power Amplifiers
—Harry Dymond, Phil Mellor
, University of Bristol - Bristol, UK
This paper presents the design, implementation, and use of an “active load” for audio power amplifier efficiency testing. The active load can simulate linear complex loads representative of real-world amplifier operation with a load modulus between 4 and 50 ohms inclusive, load phase-angles between -60° and +60° inclusive, and operates from 20 to 20,000 Hz. The active load allows for the development of an automated test procedure for evaluating the efficiency of an audio power amplifier across a range of output voltage amplitudes, load configurations, and output signal frequencies. The results of testing a class-B and a class-D amplifier, each rated at 100 watts into 8 ohms, are presented.
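To see what a load modulus and phase angle mean for an efficiency test, the sketch below converts them to a complex impedance and computes the real power delivered by a sinusoidal output voltage. The function names are hypothetical, and the supply power in the example is a placeholder value rather than a result from the paper.

```python
import cmath
import math

def load_impedance(modulus_ohm, phase_deg):
    """Complex load impedance from its modulus and phase angle."""
    return cmath.rect(modulus_ohm, math.radians(phase_deg))

def real_power(v_rms, impedance):
    """Average power dissipated in the load: V^2 / |Z| * cos(phase)."""
    return v_rms ** 2 / abs(impedance) * math.cos(cmath.phase(impedance))

def efficiency(load_power_w, supply_power_w):
    """Ratio of power delivered to the load to power drawn from the supply."""
    return load_power_w / supply_power_w

v_rms = math.sqrt(800.0)  # 100 W into a resistive 8-ohm load
for phase in (0.0, 60.0):
    p = real_power(v_rms, load_impedance(8.0, phase))
    # 150 W of supply power is a hypothetical measurement used only for illustration.
    print(phase, round(p, 3), round(efficiency(p, 150.0), 3))
```

At a 60° phase angle the same output voltage delivers only half the real power, which is why efficiency figures quoted for resistive loads can be optimistic for reactive ones.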
Convention Paper 7572
P12-5 An Objective Method of Measuring Subjective Click-and-Pop Performance for Audio Amplifiers
—Kymberly Christman (Schmidt)
, Maxim Integrated Products - Sunnyvale, CA, USA
Click-and-pop refers to any “clicks” and “pops” or other unwanted, audio-band transient signals that are reproduced by headphones or loudspeakers when the audio source is turned on or off. Until recently, the industry’s characterization of this undesirable effect has been almost purely subjective. Marketing phrases such as “low pop noise” and “clickless/popless operation” illustrate the subjectivity applied in quantifying click-and-pop performance. This paper presents a method that objectively quantifies this parameter, allowing meaningful, repeatable comparisons to be drawn between different components. Further, results of a subjective click-and-pop listening test are presented to provide a baseline for objectionable click-and-pop levels in headphone amplifiers.
Convention Paper 7573
P12-6 Effective Car Audio System Enabling Individual Signal Processing Operations of Coincident Multiple Audio Sources through Single Digital Audio Interface Line
—Chul-Jae Yoo, In-Sik Ryu
, Hyundai Autonet - South Korea
There are three major audio sources in recent car environments: primary audio (usually music, including radio), navigation voice prompts, and hands-free voice. Listening situations in cars include not only listening to a single audio source but also listening to multiple concurrent audio sources, for example navigation guidance while listening to music, or navigation guidance or music while talking on a hands-free cell phone. In this paper a conventional external amplifier system connected to a head unit by three audio interface lines is first introduced. Then an effective automotive audio system with a single SPDIF interface line, capable of concurrently processing the above three kinds of audio sources, is proposed. The new system reduces the wire harness in the car and also improves voice quality by transmitting voice signals over an SPDIF digital line rather than over analog lines.
Convention Paper 7574
P12-7 Digital Equalization of Automotive Sound Systems Employing Spectral Smoothed FIR Filters
—Marco Binelli, Angelo Farina
, University of Parma - Parma, Italy
In this paper we investigate the use of spectrally smoothed FIR filters for equalizing a car audio system. A further aim is to build short filters that can run on DSP processors with limited computing power. The inversion algorithm is based on the Nelson-Kirkeby method and on independent phase and magnitude smoothing, by means of a continuous-phase method as shown by Panzer and Ferekidis. The filter aims to produce a "target" frequency response, not necessarily flat, using a small number of taps while maintaining good performance everywhere inside the car's cockpit. As listening tests also show, smoothing and the choice of a suitable target frequency response improve the performance of car audio systems.
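The core of a Kirkeby-style inversion is a regularized division in the frequency domain, C(f) = conj(H(f)) / (|H(f)|² + ε), which keeps the inverse filter from boosting deep notches without limit. The sketch below shows only that step, with a constant regularization term and made-up response bins. The smoothing stages and the target response described above are not included, and practical implementations often make ε frequency dependent.

```python
def regularized_inverse(response_bins, regularization):
    """Per-bin regularized inverse: conj(H) / (|H|^2 + epsilon)."""
    return [h.conjugate() / (abs(h) ** 2 + regularization) for h in response_bins]

# Hypothetical measured response bins: a healthy bin and a deep notch.
measured = [2.0 + 0.0j, 0.01 + 0.0j]
plain = regularized_inverse(measured, 0.0)     # ordinary inverse, notch boosted by 100x
limited = regularized_inverse(measured, 0.01)  # regularized, notch boost stays below 1x
print([abs(c) for c in plain], [abs(c) for c in limited])
```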
Convention Paper 7575
P12-8 Implementation of a Generic Algorithm on Various Automotive Platforms
—Thomas Esnault, Jean-Michel Raczinski
, Arkamys - Paris, France
This paper describes a methodology to adapt a generic automotive algorithm to various embedded platforms while keeping the same audio rendering. To overcome the limitations of the target DSPs, we have developed tools to control the transition from one platform to another, including algorithm adaptation and coefficient computation. Objective and subjective validation processes allow us to certify the quality of the adaptation. With this methodology, productivity has been increased in an industrial context.
Convention Paper 7576
P12-9 Advanced Audio Algorithms for a Real Automotive Digital Audio System
—Stefania Cecchi, Lorenzo Palestini, Paolo Peretti, Emanuele Moretti, Francesco Piazza
, Università Politecnica delle Marche - Ancona, Italy; Ariano Lattanzi, Ferruccio Bettarelli
, Leaff Engineering - Porto Potenza Picena (MC), Italy
In this paper an innovative modular digital audio system for car entertainment is proposed. The system is based on a plug-in-based real-time software framework that allows reconfigurability and flexibility. Each plug-in is dedicated to a particular audio task, such as equalization or crossover filtering, and implements innovative algorithms. The system, running on a PC, has been tested in a real car environment with a hardware platform comprising professional audio equipment. Informal listening tests have been performed to validate the overall audio quality, and satisfactory results were obtained.
Convention Paper 7577
Friday, October 3, 5:30 pm — 6:45 pm
T8 - Free Source Code for Processing AES Audio Data
:Gregg C. Hawkes
, Xilinx - San Jose, CA, USA
Reed Tidwell
, Xilinx - San Jose, CA, USA
This session is a tutorial on the Xilinx free Verilog and VHDL source code for extracting and inserting audio in SDI streams, including “on the fly” error correction and high performance, continuously adaptive, asynchronous sample rate conversion. The audio sample rate conversion supports large ratios as well as fractional conversion rates and maintains high performance while continuously adapting itself to the input and output rates without user control. The features, device utilization, and performance of the IP will be presented and demonstrated with industry standard audio hardware.
Saturday, October 4, 9:00 am — 11:00 am
The Career/Job Fair will feature several companies from the exhibit floor. All attendees of the convention, students and professionals alike, are welcome to come visit with representatives from the companies and find out more about job and internship opportunities in the audio industry. Bring your resume!
Saturday, October 4, 9:00 am — 10:45 am
B7 - DTV Audio Myth Busters
, Fraunhofer USA Digital Media Technologies
Tim Carroll
, Linear Acoustic, Inc.
Ken Hunold
, Dolby Laboratories
David Wilson
, Consumer Electronics Association
There is no limit to the confusion created by the audio options in DTV. What do the systems really do? What happens when the systems fail? How much control can be exercised at each step in the content food chain? There are thousands of opinions and hundreds of options, but what really works and how do you keep things under control? Bring your questions and join the discussion as four experts from different stages in the chain try to sort it out.
Saturday, October 4, 9:00 am — 10:30 am
W9 - Low Frequency Acoustic Issues in Small Critical Listening Environments - Today's Audio Production Rooms
:Renato Cipriano
Dave Kotch
Increasing real estate costs coupled with the reduced size of current audio control room equipment have dramatically impacted the current generation of recording studios. Small room environments (those under 300 sq. ft.) are now the norm for studio design. These rooms, particularly in view of current 5.1 audio requirements, create special challenges associated with low-frequency audio response in an ever-expanding listening sweet spot. Real-world conditions and result data will be presented for Ovesan Studios (New York), Roc the Mic Studios (New York), and Diante Do Trono (Brazil).
Saturday, October 4, 9:00 am — 10:45 am
T9 - How I Does Filters: An Uneducated Person’s Way to Design Highly Regarded Digital Equalizers and Filters
, Oxford Digital Limited - Oxfordshire, UK
Much has been written in many learned papers about the design of audio filters and equalizers; this is NOT another one of those. The presenter is a bear of little brain and has over the years had to reduce the subject of digital filtering into bite-sized lumps containing a number of simple recipes that have got him through most of his professional life. The tutorial covers complete practical implementations of high-pass and low-pass multi-order filters, bell (or presence) filters, and shelving filters, including the infrequently seen higher-order types. The tutorial is designed for the complete novice; it is light on mathematics and heavy on explanation and visualization. Even so, the provided code works and can be put to practical use.
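The tutorial's own code is not reproduced in this listing. As a rough idea of what a "simple recipe" for a low-pass filter can look like, the sketch below implements a second-order low-pass section using the common Audio EQ Cookbook coefficients and a direct-form difference equation. The function names, cutoff, and sample rate are assumptions for illustration and should not be read as the presenter's method.

```python
import math

def lowpass_biquad(cutoff_hz, sample_rate, q=1.0 / math.sqrt(2.0)):
    """Normalized (b, a) coefficients of a second-order low-pass (Butterworth by default)."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 - cos_w0) / 2.0 / a0, (1.0 - cos_w0) / a0, (1.0 - cos_w0) / 2.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a

def process(b, a, samples):
    """Run samples through the biquad using the direct-form I difference equation."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

b, a = lowpass_biquad(1000.0, 48000.0)
step_response = process(b, a, [1.0] * 2000)
print(step_response[-1])  # settles close to the DC gain of 1
```

Higher-order filters of the kind mentioned above are commonly built by cascading several such sections with different Q values.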
Saturday, October 4, 9:00 am — 10:30 am
P15 - Loudspeakers—Part 1
P15-1 Advanced Passive Loudspeaker Protection
, Kludge Audio - Williamsburg, VA, USA
In a follow-on to a previous conference paper (AES Convention Paper 5881), the author explores the use of polymeric positive temperature coefficient (PPTC) protection devices that have a discontinuous I/V curve that is the result of a physical state change. He gives a simple model for designing networks employing incandescent lamps and PPTC devices together to give linear operation at low levels while providing effective limiting at higher levels to prevent loudspeaker damage. Some discussion of applications in current service is provided.
Convention Paper 7588
P15-2 Target Modes in Moving Assemblies of Compression Drivers and Other Loudspeakers
—Fernando Bolaños, Pablo Seoane
, Acústica Beyma S.A. - Valencia, Spain
This paper deals with how the important modes in a moving assembly of compression drivers and other loudspeakers can be found. Dynamic importance is an essential tool for those who work on modal analysis of systems with many degrees of freedom and complex structures. The calculation or measurement of important modes in moving assemblies is an objective (absolute) method for finding the relevant modes that act on the dynamics of these transducers. Our paper discusses axial modes and breathing modes, which are basic for loudspeakers. The modal generalized masses and the participation factors are useful tools for finding the important modes (target modes) of the moving assembly. The strain energy of the moving assembly, which represents the amount of available potential energy, is essential as well.
Convention Paper 7589
P15-3 Determining Manufacture Variation in Loudspeakers Through Measurement of Thiele/Small Parameters
—Scott Laurin, Karl Reichard
, Pennsylvania State University - State College, PA, USA
Thiele/Small parameters have become a standard for characterizing loudspeakers. Using fairly straightforward methods, the Thiele/Small parameters for twenty nominally identical loudspeakers were determined. The data were compiled to determine the manufacturing variations. Manufacturing tolerances can have a large impact on the variability and quality of loudspeakers produced. Generally, when more stringent tolerances are applied, there is less variation and drivers become more expensive. Now that the loudspeakers have been characterized, each one will be driven to failure. Some loudspeakers will be intentionally degraded to accelerate failures. The goal is to correlate variation in the Thiele/Small parameters with variation in speaker failure modes and operating life.
Convention Paper 7590
P15-4 About Phase Optimization in Multitone Excitations
—Delphine Bard, Vincent Meyer
, University of Lund - Lund, Sweden
Multitone signals are often used as excitation for the characterization of audio systems. The frequency spectrum of the response consists of harmonics of the frequencies contained in the excitation and intermodulation products. Besides choosing the frequencies so as to avoid frequency overlap, there is also the need to choose adequate magnitudes and phases for the different components that constitute the multitone signal. In this paper we investigate how the choice of phases affects the properties of the multitone signal and the performance of a compensation method based on Volterra kernels that uses multitone signals as excitation.
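One property that depends strongly on the phases is the crest factor (peak divided by RMS). The sketch below compares equal-amplitude tones with all phases set to zero against the classic Schroeder phase rule. Harmonically related tones are used here only to keep the example short, and the abstract does not say which phase rules the paper evaluates.

```python
import math

def multitone(phases, samples_per_period):
    """One period of equal-amplitude cosines at harmonics 1..N with the given phases."""
    return [sum(math.cos(2.0 * math.pi * (k + 1) * t / samples_per_period + phases[k])
                for k in range(len(phases)))
            for t in range(samples_per_period)]

def crest_factor(signal):
    """Peak absolute value divided by the RMS value."""
    peak = max(abs(s) for s in signal)
    rms = math.sqrt(sum(s * s for s in signal) / len(signal))
    return peak / rms

def schroeder_phases(n):
    """Schroeder's quadratic phase rule for a flat-amplitude multitone."""
    return [-math.pi * k * (k - 1) / n for k in range(1, n + 1)]

n_tones = 16
crest_zero = crest_factor(multitone([0.0] * n_tones, 1024))
crest_schroeder = crest_factor(multitone(schroeder_phases(n_tones), 1024))
print(crest_zero, crest_schroeder)
```

With zero phases all tones peak together, giving a crest factor of sqrt(2N). The quadratic phase rule spreads the energy over the period and lowers the peak for the same RMS level.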
Convention Paper 7591
P15-5 Viscous Friction and Temperature Stability of the Mid-High Frequency Loudspeaker
—Ivan Djurek, Antonio Petosic
, University of Zagreb - Zagreb, Croatia; Danijel Djurek
, Alessandro Volta Applied Ceramics (AVAC) - Zagreb, Croatia
Mid-high frequency loudspeakers behave quite differently from low-frequency units with regard to effects arising from the surrounding air. Previous work stressed the strong influence of the imaginary part of the viscous force, which significantly affects the resonance frequency of mid-high frequency loudspeakers. The viscous force depends relatively strongly on the temperature and humidity of the surrounding air, and in this paper we evaluate how changes in temperature and humidity affect the loudspeaker's linearity, which may be significant for the quality of sound reproduction.
Convention Paper 7592
P15-6 Calorimetric Evaluation of Intrinsic Friction in the Loudspeaker Membrane
—Antonio Petosic, Ivan Djurek
, University of Zagreb - Zagreb, Croatia; Danijel Djurek
, Alessandro Volta Applied Ceramics (AVAC) - Zagreb, Croatia
Friction losses in the vibrating system of an electrodynamic loudspeaker are represented by the intrinsic friction Ri, which enters the equation of motion, and these losses are accompanied by irreversible release of the heat. A method is proposed for measurement of the friction losses in the loudspeaker's membrane by measurement of the thermocouple temperature probe glued to the membrane. Temperature on the membrane surface fluctuates stochastically as a result of thermo-elastic coupling in the membrane material. Evaluation of the amplitude in the temperature fluctuations enables an absolute and direct evaluation of intrinsic friction Ri entering friction force F=Ri·?(x), irrespective of the nonlinearity type and strength associated with the loudspeaker operation.
Convention Paper 7593
P15-7 Phantom Powering the Modern Condenser Microphone: A Practical Look at Conditions for Optimized Performance
—Mark Zaim, Tadashi Kikutani, Jackie Green
, Audio-Technica U.S., Inc.
Phantom powering a microphone is a decades-old concept with powering conventions and methods that may have become obsolete, ineffective, or inefficient. Modern sound techniques, including those of live sound settings, now use many condenser microphones in settings that were previously dominated by dynamics. As a prerequisite for considering a modern phantom power specification or method, we study the efficiencies and requirements of microphones in typical multiple-microphone and high-SPL settings in order to gain an understanding of the circuit and design requirements for maximum dynamic range performance.
Convention Paper 7594
Saturday, October 4, 11:00 am — 1:00 pm
B8 - Lip Sync Issue
:Jonathan S. Abrams
, Nutmeg Audio Post
Panelists
, Syntax-Brillian
Richard Fairbanks
, Pharoah Editorial, Inc.
David Moulton
, Sausalito Audio, LLC
Kent Terry
, Dolby Laboratories
This is a complex problem, with several causes and fewer solutions. From production to broadcast, there are many points in the signal path and postproduction process where lip sync can either be properly corrected, or made even worse.
This session’s panel will discuss several key issues. Where do the latency issues exist in postproduction? Where do they exist in broadcast? Is there an acceptable window of latency? How can this latency be measured? What correction techniques exist? Does one type of video display exhibit less latency than another? What is being done in display design to address the latency? What proposed methods are on the horizon for addressing this issue in the future?
Join us as our panel covers the field from measurement, to post, to broadcast, and to the home.
Saturday, October 4, 1:00 pm — 2:00 pm
Abstract:The Music Business Is Dead—Long Live the NEW Music Business!
Peter Gotcher of Topspin Media
Peter Gotcher will deliver a high-level view of the changing business models facing the music industry today. Gotcher will explain why it no longer works for artists to derive their income from record labels that provide a tiny share of high volume sales. He will also explore new revenue models that include multiple revenue streams for artists; the importance of getting rid of unproductive middlemen; and generating more revenue from fewer fans.
Saturday, October 4, 2:30 pm — 4:00 pm
P20 - Loudspeakers—Part 3
P20-1 Preliminary Results of Calculation of a Sound Field Distribution for the Design of a Sound Field Effector Using a 2-Way Loudspeaker Array with Pseudorandom Configuration
, Musashi Institute of Technology - Tokyo, Japan; Kaoru Ashihara
, Advanced Industrial Science and Technology - Tsukuba, Japan; Shogo Kiryu
, Musashi Institute of Technology - Tokyo, Japan
We have been developing a loudspeaker array system that can control a sound field in real time for live concerts. In order to reduce the sidelobes and to improve the frequency range, a 2-way loudspeaker array with pseudorandom configuration is proposed. Software is being developed to determine the configuration. For now, the configuration is optimized for a focused sound. The software calculates the ratio between the sound pressure of the focus point and the average of the sound pressure around the focus. It was shown that the sidelobes can be reduced with a pseudorandom configuration.
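The ratio described above, the sound pressure at the focus point relative to the average pressure around it, can be sketched for ideal point sources with delay-based focusing. The geometry, frequency, and function names below are assumptions chosen for illustration. The example uses a regular line array, and a pseudorandom configuration could be evaluated by passing different source positions to the same function.

```python
import cmath
import math

def focus_delays(sources, focus, c=343.0):
    """Per-source delays in seconds that make all wavefronts arrive at the focus together."""
    distances = [math.dist(s, focus) for s in sources]
    farthest = max(distances)
    return [(farthest - d) / c for d in distances]

def pressure(sources, delays, point, freq, c=343.0):
    """Magnitude of the summed pressure of delayed monopoles at a point (1/r spreading)."""
    omega = 2.0 * math.pi * freq
    total = 0j
    for s, tau in zip(sources, delays):
        r = math.dist(s, point)
        total += cmath.exp(-1j * omega * (tau + r / c)) / r
    return abs(total)

def focus_ratio(sources, focus, neighbours, freq):
    """Pressure at the focus divided by the mean pressure at the neighbouring points."""
    delays = focus_delays(sources, focus)
    p_focus = pressure(sources, delays, focus, freq)
    p_around = sum(pressure(sources, delays, p, freq) for p in neighbours) / len(neighbours)
    return p_focus / p_around

# Hypothetical setup: 16 sources spaced 0.1 m apart, focus 1 m in front of the array centre.
sources = [(0.1 * i - 0.75, 0.0) for i in range(16)]
focus = (0.0, 1.0)
neighbours = [(-0.6, 1.0), (-0.3, 1.0), (0.3, 1.0), (0.6, 1.0)]
ratio = focus_ratio(sources, focus, neighbours, 2000.0)
print(ratio)
```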
Convention Paper 7616
P20-2 Design and Implementation of a Sound Field Effector Using a Loudspeaker Array
—Seigo Hayashi, Tomoaki Tanno
, Musashi Institute of Technology - Tokyo, Japan; Toru Kamekawa
, Tokyo National University Arts and Music - Tokyo, Japan; Kaoru Ashihara
, Advanced Industrial Science and Technology - Tokyo Japan; Shogo Kiryu
, Musashi Institute of Technology - Tokyo, Japan
We have been developing an effector that uses a 128-channel two-way loudspeaker array system for live concerts. The system was designed to change the sound field within 10 ms. The variable delay circuits and the communication circuit between the hardware and the control computer are implemented in one FPGA. All of the delay data, calculated in advance, are stored in the SDRAM mounted on the FPGA board, and only a simple command is sent from the control computer. The system can control up to four sound focuses independently.
Convention Paper 7617
P20-3 Wave Field Synthesis: Practical Implementation and Application to Sound Beam Digital Pointing
—Paolo Peretti, Laura Romoli, Lorenzo Palestini, Stefania Cecchi, Francesco Piazza
, Universita Politecnica delle Marche - Ancona, Italy
Wave Field Synthesis (WFS) is a digital signal processing technique introduced to achieve an optimal acoustic sensation in a larger area than in traditional systems (Stereophony, Dolby Digital). It is based on a large number of loudspeakers, and its real-time implementation requires efficient solutions in order to limit the computational cost. To this end, in this paper we propose an approach based on preprocessing the component of the driving function that does not depend on the audio stream. Tests with linear and circular geometries will be described, and the application of this technique to digital pointing of the sound beam will be presented.
Convention Paper 7618
P20-4 Highly Focused Sound Beamforming Algorithm Using Loudspeaker Array System
—Yoomi Hur, Seong Woo Kim
, Yonsei University - Seoul, Korea; Young-cheol Park
, Yonsei University - Wonju, Korea; Dae Hee Youn
, Yonsei University - Seoul, Korea
This paper presents a sound beamforming technique that can generate a highly focused sound beam using a loudspeaker array. For this purpose, we find the optimal weights that maximize the contrast of sound power between the target region and the other regions. However, the directly derived weights are limited in how far they can lower the level in the non-target regions, so the iterative pattern synthesis technique, which was introduced for antenna arrays, is investigated. By assuming that there are imaginary signal powers in the non-target regions, the system iteratively improves the contrast ratio further. The performance of the proposed method was evaluated, and the results showed that it could generate a more highly focused sound beam than the conventional method.
Convention Paper 7619
P20-5 Super-Directive Loudspeaker Array for the Generation of a Personal Sound Zone
—Jung-Woo Choi, Youngtae Kim, Sangchul Ko, Jung-Ho Kim
, Samsung Electronics Co. Ltd. - Gyeonggi-do, Korea
A sound manipulation technique is proposed for selectively enhancing a desired acoustic property in a zone of interest called a personal sound zone. In order to create a personal sound zone in which a listener can experience a high sound level, acoustic energy is focused on only a selected area. Recently, two performance measures indicating acoustic properties of the personal sound zone, acoustic brightness and contrast, were employed to optimize the driving functions of a loudspeaker array. In this paper, some limitations of the individual control methods are presented first, and then a novel control strategy is suggested such that the advantages of both are combined in a single objective function. Precise control of a sound field with a desired shape of energy distribution is made possible by introducing a continuous spatial weighting technique. The results are compared to those based on the least-square optimization technique.
Convention Paper 7620
Saturday, October 4, 4:30 pm — 6:00 pm
T13 - Improved Power Supplies for Audio Digital-Analog Conversion
, National Semiconductor Corporation - Santa Clara, CA, USA
Robert A. Pease, National Semiconductor Corporation - Santa Clara, CA, USA
It is well known that good, stable, linear, constant-impedance, wide-bandwidth power supplies are important for high-quality Digital-to-Analog Conversion. Poor supplies can add noise and jitter and other unknown uncertainties, adversely affecting the audio.
Saturday, October 4, 5:00 pm — 6:45 pm
B10 - Audio Transport
, APT Ltd.
Chris Crump, Comrex
Angela DePascale, Global Digital Datacom Services Inc.
Herb Squire, DSI RF
Mike Uhl
This session discusses techniques and technologies used for transporting audio (e.g., STL, RPU, codecs). Transporting audio can be complex, and the discussion covers the various roads you can take.
Saturday, October 4, 5:00 pm — 6:30 pm
P21 - Low Bit-Rate Audio Coding
P21-1 A Framework for a Near-Optimal Excitation Based Rate-Distortion Algorithm for Audio Coding
, Nokia Research Center - Tampere, Finland
An optimal excitation based rate-distortion algorithm remains an elusive target in audio coding. Typical complexity of the problem for one frame alone is in the order of 60^50. This paper presents a framework for reducing the complexity. Excitation is calculated using cochlear filters that have relatively steep slopes above and below the central frequency of the filter. An approximation of the excitation can be calculated by limiting the cochlear filters to a small frequency region. For example, the cochlear filters may span 15 subbands. In this way, the complexity can be reduced approximately to the order of 60^15.
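To give a feel for the size of the reduction, the short calculation below takes the two complexity orders as 60 raised to the 50th and 15th powers. That reading of the figures quoted above is an assumption, and the variable names are illustrative only.

```python
import math

# Assumed reading of the two complexity orders quoted in the abstract.
full_search = 60 ** 50      # unrestricted cochlear filters
limited_search = 60 ** 15   # filters limited to a span of 15 subbands

# Orders of magnitude (base-10 exponents) of each search space.
full_exponent = math.log10(full_search)
limited_exponent = math.log10(limited_search)

# Factor by which the search space shrinks.
reduction = full_search // limited_search
```

Under this reading the restricted search is smaller by a factor of 60 to the 35th power, yet it is still far too large to enumerate, which is consistent with the abstract describing the result as near-optimal rather than exhaustive.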
Convention Paper 7621
P21-2 Audio Bandwidth Extension by Frequency Scaling of Sinusoidal Partials
—Tomasz Zernicki, Maciej Bartkowiak
, Poznan University of Technology - Poznan, Poland
This paper describes a new technique for efficient coding of high-frequency signal components as an alternative to Spectral Band Replication. The main idea is to reconstruct the high-frequency harmonic structure trajectories by using fundamental frequencies obtained at the encoder side. The audio signal is decomposed into narrow subbands by demodulation based on the local instantaneous fundamental frequency of individual partials. High-frequency components are reconstructed by modulation of the baseband signals with appropriately scaled instantaneous frequencies. Such an approach offers correct synthesis of rapidly changing sinusoids as well as proper reconstruction of harmonic structure in the high-frequency band. This technique allows correct energy adjustment over sinusoidal partials. The high efficiency of the proposed technique has been confirmed by listening tests.
Convention Paper 7622
P21-3 Robustness Issues in Multi-View Audio Coding
, Nokia Research Center - Tampere, Finland
This paper studies the problem of noise unmasking when multiple spatial filtering options (multiple views) are required from multi-microphone recordings compressed with lossy coding. The envisaged application is re-use and postprocessing of user-created content. A potential solution based on inter-channel prediction is outlined, that would also allow subtractive downmix options without excessive noise unmasking. The simple case of two relatively closely spaced omnidirectional microphones and mono downmix is used as an example, experimenting with real-world recordings and MPEG-1 Layer 3 coding.
Convention Paper 7623
P21-4 Quality Improvement of Very Low Bit Rate HE-AAC Using Linear Prediction Module
—GunWoo Lee, JaeSeong Lee
, University of Yonsei - Seoul, Korea; YoungCheol Park
, University of Yonsei - Wonju-city, Korea; DaeHee Youn
, University of Yonsei - Seoul, Korea
This paper proposes a new method of improving the quality of High Efficiency Advanced Audio Coding (HE-AAC) at very low bit rates under 16 kbps. Low bit rate HE-AAC often produces obvious spectral holes, inducing musical noise in low-energy frequency bands, due to its limited number of available bits. In the proposed system, a linear prediction module is combined with HE-AAC as a pre-processor to reduce the spectral holes. For efficient implementation, the masking threshold of the psychoacoustic model is normalized with the LPC spectral envelope so that the LPC residual signal is quantized with an appropriate masking threshold. To reduce pre-echo, we also modified the block switching module. Experimental results show that, in very low bit rate modes, the linear prediction module effectively reduces the spectral holes, which results in less musical noise than conventional HE-AAC.
Convention Paper 7624
P21-5 An Implementation of MPEG-4 ALS Standard Compliant Decoder on ARM Core CPUs
—Noboru Harada, Takehiro Moriya, Yutaka Kamamoto
, NTT Communication Science Labs. - Kanagawa, Japan
MPEG-4 Audio Lossless Coding (ALS) is a standard that losslessly compresses audio signals in an efficient manner. MPEG-4 ALS is a suitable compression scheme for high-sound-quality portable music players. We have implemented a decoder compliant with the MPEG-4 ALS standard on the ARM platform. In this paper the required CPU resources for MPEG-4 ALS tools on ARM9E are characterized by using an ARM CPU emulator, called ARMulator, as a simulation platform. It is shown that the CPU clock frequency required for decoding MPEG-4 ALS standard compliant bit streams is less than 20 MHz for 44.1-kHz/16-bit stereo signals on ARM9E when the combination of the MPEG-4 ALS tools is properly selected and coding parameters are properly restricted.
Convention Paper 7625
Sunday, October 5, 9:00 am — 10:30 am
The design competition is a competition for audio projects developed by students at any university or recording school challenging students with an opportunity to showcase their technical skills. This is not for recording projects or theoretical papers, but rather design concepts and prototypes. Designs will be judged by a panel of industry experts in design and manufacturing. Multiple prizes will be awarded.
Sunday, October 5, 9:00 am — 11:00 am
M4 - Acoustics and Multiphysics Modeling
, Comsol - Palo Alto, CA, USA
This Master Class covers acoustics and multiphysics modeling using Comsol. The Acoustics Module is specifically designed for those who work in classical acoustics with devices that produce, measure, and utilize acoustic waves. Application areas include the design of loudspeakers, microphones, hearing aids, noise control, sound barriers, mufflers, buildings, and performance spaces.
Sunday, October 5, 9:00 am — 10:45 am
B11 - Internet Streaming—Audio Quality, Measurement, and Monitoring
, CBS Radio
Rusty Hodge, SomaFM
Benjamin Larson, Streambox, Inc.
Greg J. Ogonowski, Orban/CRL
Skip Pizzi, Contributing Editor, Radio World magazine
Geir Skaaden, Neural Audio Corp.
Internet streaming has become an established provider of audio and video content to the public. Now that the public has recognized the medium, providers need to deliver content with a quality comparable to that of other media. Audio monitoring is becoming important, as is the ability to quantify performance so that the streamer can deliver a product of a standard quality.
Sunday, October 5, 9:00 am — 10:30 am
P24 - Audio Digital Signal Processing and Effects—Part 1
P24-1 Simple Arbitrary IIRs
, Pandit Littoral - Cooktown, Queensland, Australia
This is a method of fitting IIRs (Infinite Impulse Response filters) to an arbitrary frequency response that is simple enough to incorporate in intelligent AV receivers. Short IIR filters are useful where computational power is limited and at low frequencies where FIRs have poor performance. Loudspeaker and microphone frequency response defects are often better matched to IIRs. Some caveats for digital EQ design are discussed. The emphasis is on loudspeakers and microphones.
Convention Paper 7635
P24-2 Analysis of Design Parameters for Crosstalk Cancellation Filters Applied to Different Loudspeaker Configurations
—Yesenia Lacouture Parodi
, Aalborg University - Aalborg, Denmark
Several approaches to rendering binaural signals through loudspeakers have been proposed in past decades. Some studies have focused on the optimum loudspeaker arrangement while others have proposed more efficient filters. However, to our knowledge, the identification of optimal parameters for crosstalk cancellation filters applied to different loudspeaker configurations has not yet been addressed systematically. In this paper we document a study of three different inversion techniques applied to several loudspeaker arrangements. Least square approximations in the frequency and time domains are evaluated along with a crosstalk canceller based on a minimum-phase approximation. The three methods are simulated in two-channel configurations and the least square approaches in four-channel configurations. Different span angles and elevations are evaluated for each case. In order to obtain optimum parameters, we varied the bandwidth, filter length, and regularization constant for each loudspeaker position and each method. We present a description of the simulations carried out and the optimum regularization values, expected channel separation, and performance error obtained for each configuration.
Convention Paper 7636
P24-3 A Hybrid Time and Frequency Domain Audio Pitch Shifting Algorithm
, University of Fribourg - Fribourg, Switzerland; Stefan Müller Arisona
, University of Santa Barbara - Santa Barbara, CA, USA; Simon Schubiger-Banz
, Computer Systems Institute, ETH Zürich - Zürich, Switzerland
This paper presents an abstract algorithm that performs audio pitch shifting as a combination of a signal analysis, a filter bank, and frequency shifting operations. Then, it is shown that two previously proposed pitch shifting algorithms are actually concrete implementations of the presented abstract algorithm. One of them is implemented in the frequency domain whereas the other is implemented in the time domain. Based on an analysis and comparison of the properties of these two implementations (quality, artifacts, assumptions on the signal), we propose a new hybrid implementation working partially in the frequency domain and partially in the time domain, and achieving superior quality by taking the best from each of the two existing implementations.
Convention Paper 7637
P24-4 A Colored Noise Suppressor Using Lattice Filter with Correlation Controlled Algorithm
—Arata Kawamura, Youji Iiguni
, Osaka University - Toyonaka, Osaka, Japan
A noise suppression technique is necessary in a wide range of applications including mobile communication and speech recognition systems. We have previously proposed a noise suppressor using a lattice filter that can cancel white noise from an observed signal. Unfortunately, many practical noises are not white, and hence the conventional noise suppressor cannot be applied to them. In this paper we propose a new adaptive algorithm for the lattice filter to suppress colored noise. The proposed algorithm can be directly derived from the conventional time recursive algorithm. To extract speech from speech mixed with colored noise, the lattice filter with the proposed algorithm produces a noise replica whose auto-correlation is close to that of the noise. Subtracting the noise replica from the observed noisy speech yields the extracted speech. Simulation results showed that the proposed noise suppressor can extract speech from speech mixed with tunnel noise, which is a colored noise recorded in a practical environment.
Convention Paper 7638
P24-5 Accurate IIR Equalization to an Arbitrary Frequency Response, with Low Delay and Low Noise Real-Time Adjustment
, Oxford Digital Limited - Stonesfield, Oxfordshire, UK
A new form of equalizer has been developed that combines minimum phase, low delay, IIR signal processing with low noise, real-time adjustment of coefficients to accurately deliver an arbitrary frequency response as entered from a graphical user interface. The use of a join-the-dots type graphical user interface combined with cubic or similar splines is a common method of entering curved lines into 2-D drawing programs. The equalizer described in this paper combines a similar type of user interface with low-delay, minimum phase, IIR audio DSP. Key attributes also include real-time, nearly noiseless adjustment of the DSP coefficients in response to user input. All necessary information for the construction of these filters is included.
Convention Paper 7639
P24-6 A Method of Capacity Increase for Time-Domain Audio Watermarking Based on Low-Frequency Amplitude Modification
—Harumi Murata, Akio Ogihara, Motoi Iwata, Akira Shiozaki
, Osaka Prefecture University - Osaka, Japan
The objective of this work is to increase the capacity of watermark information in “the audio watermarking method based on amplitude modification,” which has been proposed by W. N. Lie as a prevention technique against copyright infringement. In this conventional method, the capacity of watermark information is insufficient. In this paper we increase the capacity of watermark information by embedding multiple watermarks in different levels of the audio data independently. The proposed method has many data-channels for embedding, and hence it is possible to embed multiple watermarks by selecting the proper data-channel according to the required data capacity or recovery rate.
Convention Paper 7640
P24-7 Constrained-Optimized Sound Beamforming of Loudspeaker-Array System
—Myung Song, Soonho Baek
, Yonsei University - Seoul, Korea; Seok-Pil Lee
, Korea Electronics Technology Institute - Seongnam, Korea; Hong-Goo Kang
, Yonsei University - Seoul, Korea
This paper proposes a novel loudspeaker-array system to form relatively high sound pressure toward the desired location. The proposed algorithm adopts a constrained-optimization technique such that the desired array response is maintained over the mainlobe width while the sidelobe level is minimized. First, the characteristics of sound propagation in a reverberant environment are analyzed by off-line computer simulation. Then, the performance of the implemented loudspeaker-array system is evaluated by measuring the sound pressure distribution in a real test room. The results show that the proposed sound beamforming algorithm forms a more concentrated sound beam toward the desired location than conventional algorithms, even in a reverberant environment.
Convention Paper 7641
Sunday, October 5, 11:00 am — 1:00 pm
The Evolution of Electronic Instrument Interfaces: Past, Present, Future
, editor of Electronic Musician magazine
Panelists
, Roger Linn Designs
Tom Oberheim, Founder, Oberheim Electronics
Dave Smith, Dave Smith Instruments
Developing musical instruments that take advantage of new technologies is exciting. However, coming up with something that is not only intuitive and musically useful but that will be accepted by musicians requires more than just a feature-rich box with sexy industrial design. This panel will discuss the issues involved in creating new musical instruments, with a focus on interface design, as well as explore ways to avoid the mistakes of the past when designing products for the future. These three panelists have brought a variety of innovative products to market (with varying degrees of success), which have made each of them household names in the MI world.
Sunday, October 5, 11:00 am — 1:00 pm
W11 - Upcoming MPEG Standard for Efficient Parametric Coding and Rendering of Audio Objects
, Fraunhofer Institute for Integrated Circuits IIS
Panelists: Jonas Engdegård, Christof Faller, Jürgen Herre, Leon van de Kerkhof
Through exploiting the human perception of spatial sound, “Spatial Audio Coding” technology enabled new ways of low bit-rate audio coding for multichannel signals. Following the finalization of the MPEG Surround specification, ISO/MPEG launched a follow-up standardization activity for bit-rate-efficient and backward compatible coding of several sound objects. On the receiving side, such a Spatial Audio Object Coding (SAOC) system renders the objects interactively into a sound scene on a reproduction setup of choice. The workshop reviews the ideas, principles, and prominent applications behind Spatial Audio Object Coding and reports on the status of the ongoing ISO/MPEG Audio standardization activities in this field. The benefits of the new approach will be highlighted and illustrated by means of real-time demonstrations.
Sunday, October 5, 11:30 am — 1:00 pm
T15 - Real-Time Embedded Audio Signal Processing
, DSP Concepts, LLC - Sunnyvale, CA, USA
Product developers implementing audio signal processing algorithms in real-time encounter a host of challenges and tradeoffs. This tutorial focuses on the high-level architectural design decisions commonly faced. We discuss memory usage, block processing, latency, interrupts, and threading in the context of modern digital signal processors with an eye toward creating maintainable and reusable code. The impact of integrating audio decoders and streaming audio to the overall design will be presented. Examples will be drawn from typical professional, consumer, and automotive audio applications.
Sunday, October 5, 11:30 am — 1:00 pm
T16 - Latest Advances in Ceramic Loudspeakers and Their Drivers
, Maxim, Inc. - Sunnyvale, CA, USA
Robert Polleros, Maxim Integrated Products - Austria
Peter Tiller, Murata - Atlanta, GA, USA
New cell phone designs demand small form factor while maintaining audio sound-pressure level. Speakers have typically been the component that limits the thinness of the design. New developments in ceramic, or piezoelectric, loudspeakers have opened the door for new sleek designs. Due to the capacitive nature of these ceramic speakers, special considerations need to be taken into account when choosing an audio amplifier to drive them. Today’s portable devices need smaller, thinner, more power-efficient electronic components. Cellular phones have become so thin that the dynamic speaker is typically the limiting factor in how thin manufacturers can make their handsets. The ceramic, or piezoelectric, speaker is quickly emerging as a viable alternative to the dynamic speaker. These ceramic speakers can deliver competitive sound-pressure levels (SPL) in a thin and compact package, thus potentially replacing traditional voice-coil dynamic speakers.
Sunday, October 5, 2:30 pm — 4:30 pm
T17 - An Introduction to Digital Pulse Width Modulation for Audio Amplification
, Freescale Semiconductor Inc. - Austin, TX, USA
Digital PWM is highly suitable for audio amplification. Digital audio sources can be readily converted to digital PWM using digital signal processing. The mathematical nonlinearity associated with PWM can be corrected with extremely high accuracy. Natural sampling and other techniques that convert a PCM signal to a digital PWM signal will be discussed. Due to limitations of digital clock speeds and jitter, the duty ratio of the PWM signal has to be quantized to a small number of bits. The noise due to quantization can be effectively shaped to fall outside the audio band. PWM-specific noise shaping techniques will be explained in detail. Further, there is a need for sample rate conversion for a digital PWM modulator to work with a digital PCM signal that is generated using a different clock. The mathematics of asynchronous sample rate converters will also be discussed. Digital PWM signals are amplified by a power stage that introduces nonlinearity and mixes in noise from the power supply. This mechanism will be examined and ways to correct for it will be discussed.
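The idea of quantizing the duty ratio to a few bits while pushing the quantization noise toward high frequencies can be sketched with a generic first-order error-feedback quantizer. This is a textbook structure used here for illustration, not the PWM-specific noise shaper covered in the tutorial. The function name, the 4-bit resolution, and the test signal are assumptions.

```python
import math


def noise_shaped_duty(samples, bits=8):
    """Quantize duty ratios in [0, 1] to `bits` bits with first-order error feedback.

    Returns integer duty codes in the range 0 .. 2**bits - 1.
    """
    levels = (1 << bits) - 1
    error = 0.0
    codes = []
    for x in samples:
        target = x + error                # add back the previous quantization error
        code = round(target * levels)     # nearest representable duty code
        code = max(0, min(levels, code))  # clamp to the valid code range
        error = target - code / levels    # carry the new error to the next sample
        codes.append(code)
    return codes


# Hypothetical input: a slow sine mapped to duty ratios between 0.1 and 0.9.
samples = [0.5 + 0.4 * math.sin(2.0 * math.pi * i / 64) for i in range(1024)]
codes = noise_shaped_duty(samples, bits=4)
```

Because each sample's error is carried into the next one, the running sum of the quantized duty ratios never drifts from the running sum of the input by more than half a quantization step. That is the time-domain view of the noise being moved away from low frequencies.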
Sunday, October 5, 2:30 pm — 4:30 pm
T18 - FPGA for Broadcast Audio
, Fairlight Pty LTD - Frenchs Forest, Australia
Girish Malipeddi, Altera Corporation - San Jose, CA, USA
This tutorial presents broadcast-quality solutions based on FPGA technology for audio processing with significant cost savings over existing discrete solutions. The solutions include digital audio interfaces such as AES3/SPDIF and I2S, audio processing functions such as sample rate converters and SDI audio embed/de-embed functions. Along with these solutions, an audio video framework that consists of a suite of A/V functions, reference designs, an open interface to easily stitch the AV blocks, system design methodology, and development kits is introduced. Using the framework system designers can quickly prototype and rapidly develop complex audio video systems.
Sunday, October 5, 2:30 pm — 4:00 pm
P26 - Audio Digital Signal Processing and Effects—Part 2
P26-1 Applications of Algorithmically-Generated Digital Audio for Web-Based Sonic Measure Ear Training
, Towson University - Towson, MD, USA
This paper examines applications of algorithmically-generated digital audio for a new type of ear training. This approach, called sonic measure ear training, circumvents the many limits of MIDI-based aural testing, and may offer a valuable resource for computer musicians and audio engineers. The Post-Ut system, introduced here, is the first web-based ear training system to offer sonic measure ear-training. After describing the design of the Post-Ut system, including the use of athenaCL, Csound, Python, and MySQL, the audio generation procedures are examined in detail. The design of questions and perceptual considerations are evaluated, and practical applications and opportunities for future development are outlined.
Convention Paper 7645
P26-2 A Perceptual Model-Based Speech Enhancement Algorithm
, Dolby Laboratories - San Francisco, CA, USA
This paper presents a perceptual model-based speech enhancement algorithm. The proposed algorithm measures the amount of the audible noise in the input noisy speech explicitly by using a psychoacoustic model, and decides an appropriate amount of noise reduction accordingly to achieve good noise level reduction without introducing significant distortion to the clean speech embedded in the input noisy signal. The proposed algorithm also mitigates the musical noise problem commonly encountered in conventional speech enhancement algorithms by having the amount of noise reduction adapt to the instantly estimated noise amplitude. Good performance of the proposed algorithm has been confirmed through objective and subjective tests.
Convention Paper 7646
P26-3 Real Time Implementation of an ESPRIT-Based Bass Enhancement Algorithm
—Lorenzo Palestini, Emanuele Moretti, Paolo Peretti, Stefania Cecchi, Laura Romoli, Francesco Piazza
, Università Politecnica delle Marche - Ancona, Italy
This paper presents a software real-time implementation for the NU-Tech platform of a bass enhancement algorithm based on the FAPI subspace tracker and the ESPRIT algorithm for fundamentals estimation to realize bass improvement of small loudspeakers exploiting the well known psychoacoustic phenomenon of the missing fundamental. Comparative informal listening tests have been performed to validate the virtual bass improvement, and their results show that the proposed method is well appreciated.
Convention Paper 7647
P26-4 Low-Power Implementation of a Subband Acoustic Echo Canceller for Portable Devices
—Julie Johnson, David Hermann, John Wdowiak, Edward Chau, Hamid Sheikhzadeh
, ON Semiconductor - Waterloo, Ontario, Canada
Portable audio communication devices require increasingly superior audio quality while using minimal power. Devices such as cell phones with speakerphone functionality can generate substantial acoustic echo due to the proximity of the microphone and speaker. To improve the audio quality in such devices, an oversampled subband acoustic echo canceller has been implemented on a miniature low-power dual core DSP system. This application is comprised of three subband-based algorithms: a Pseudo-Affine Projection adaptive filter, an Ephraim-Malah based single-microphone noise reduction algorithm, and a novel nonlinear residual echo suppressor. The system consumes less than 4 mW of power when configured with a 128 ms filter. Real-world tests indicate an echo return loss enhancement of greater than 30 dB for typical input levels.
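Echo return loss enhancement (ERLE), the figure quoted above, is commonly computed as the ratio in dB of the echo power at the microphone to the residual echo power after cancellation. The sketch below illustrates that definition on synthetic signals. The function name and test data are hypothetical and are not taken from the paper.

```python
import math


def erle_db(mic_echo, residual):
    """ERLE in dB: echo power before cancellation over residual power after it."""
    power_before = sum(x * x for x in mic_echo)
    power_after = sum(x * x for x in residual)
    return 10.0 * math.log10(power_before / power_after)


# Synthetic example: the canceller is assumed to leave a residual whose
# amplitude is 10**(-1.5) times the original echo (a factor of 1000 in power).
mic_echo = [math.sin(0.05 * n) for n in range(1, 2001)]
residual = [x * 10.0 ** -1.5 for x in mic_echo]
erle = erle_db(mic_echo, residual)
```

In practice the two powers would be measured over the same time window during far-end-only speech, so that near-end speech and noise do not inflate the residual.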
Convention Paper 7648
P26-5 A Digital Model of the Echoplex Tape Delay
—Steinunn Arnardottir, Jonathan S. Abel, Julius O. Smith
, Stanford University - Stanford, CA, USA
The Echoplex is a tape delay unit featuring fixed playback and erase heads, a moveable record head, and a tape loop moving at roughly 8 ips. The relatively slow tape speed allows large frequency shifts, including "sonic booms" and shifting of the tape bias signal into the audio band. Here, the Echoplex tape delay is modeled with read, write, and erase pointers moving along a circular buffer. The model separately generates the quasiperiodic capstan and pinch wheel components and drift of the observed fluctuating time delay. This delay drives an interpolated write simulating the record head. To prevent aliasing in the presence of a changing record head speed, an anti-aliasing filter with a variable cutoff frequency is described.
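The pointer-on-a-circular-buffer idea can be illustrated with a simplified delay line. Note the swap: the paper describes an interpolated write driven by the fluctuating delay, whereas this sketch writes at a fixed rate and uses a linearly interpolated read, which is easier to show in a few lines. The class name and parameters are hypothetical, and the erase pointer and anti-aliasing filter are omitted.

```python
class TapeDelay:
    """Circular-buffer delay line with a fractional, linearly interpolated read."""

    def __init__(self, size):
        self.buf = [0.0] * size
        self.write = 0  # index of the next cell to be written

    def process(self, x, delay):
        """Write one sample, then read `delay` samples behind the write pointer.

        `delay` may be fractional and should satisfy 0 <= delay <= size - 2.
        """
        size = len(self.buf)
        self.buf[self.write] = x
        pos = (self.write - delay) % size  # fractional read position
        i0 = int(pos)
        frac = pos - i0
        i1 = (i0 + 1) % size               # wrap around the end of the buffer
        y = (1.0 - frac) * self.buf[i0] + frac * self.buf[i1]
        self.write = (self.write + 1) % size
        return y


# An impulse delayed by 2.5 samples is split evenly across outputs 2 and 3.
line = TapeDelay(8)
impulse_response = [line.process(x, 2.5) for x in [1.0, 0.0, 0.0, 0.0, 0.0]]
```

Varying `delay` slowly from call to call produces the pitch fluctuation associated with tape-speed changes. Large, fast changes would alias in this simple version, which is the problem the variable-cutoff filter in the paper addresses.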
Convention Paper 7649
P26-6 A Digital Reverberator Modeled after the Scattering of Acoustic Waves by Trees in a Forest
—Kyle Spratt, Jonathan S. Abel
, Stanford University - Stanford, CA, USA
A digital reverberator modeled after the scattering of acoustic waves among trees in an idealized forest is presented. Termed "treeverb," the technique simulates forest acoustics using a network of digital waveguides, with bi-directional delay lines connecting trees represented by multi-port scattering junctions. The reverberator is designed by selecting tree locations and diameters, with waveguide delays determined by inter-tree distances, and scattering filters fixed according to tree-to-tree angles and trunk diameters. The scattering is modeled as that of plane waves normally incident on a rigid cylinder, and a simple low-order scattering filter is presented and shown to closely approximate the theoretical scattering. Small forests are seen to yield dense, gated reverb-like impulse responses.
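The step of deriving waveguide delays from inter-tree distances can be sketched as follows. The sample rate, speed of sound, center-to-center distance convention, and tree coordinates are all assumptions made for illustration. The scattering junctions and filters are not modeled.

```python
import math
from itertools import combinations

SPEED_OF_SOUND = 343.0  # m/s, assumed
SAMPLE_RATE = 48000     # Hz, assumed


def waveguide_delays(trees, fs=SAMPLE_RATE, c=SPEED_OF_SOUND):
    """Map each pair of tree indices (i, j), i < j, to a one-way delay in samples.

    The same delay applies in both directions of the bi-directional delay line.
    """
    delays = {}
    for (i, a), (j, b) in combinations(enumerate(trees), 2):
        distance = math.dist(a, b)  # center-to-center distance in metres
        delays[(i, j)] = max(1, round(distance * fs / c))
    return delays


# Hypothetical forest: three trees on a plane, coordinates in metres.
trees = [(0.0, 0.0), (3.43, 0.0), (0.0, 6.86)]
delays = waveguide_delays(trees)
```

Rounding to whole samples keeps the delay lines simple. A fractional-delay filter could be used instead where exact inter-tree timing matters.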
Sunday, October 5, 4:30 pm — 6:00 pm
W15 - Interactive MIDI-Based Technologies for Game Audio
, THX Ltd.
Panelists
, IASIG
Larry the O
Tom Savell, Creative Labs
The MIDI Manufacturers Association (MMA) has developed three new standards for MIDI-based technologies with applications in game audio. The 3-D MIDI Controllers specification allows for real-time positioning and movement of music and sound sources in 3-D space, under MIDI control. The Interactive XMF specification marks the first nonproprietary file format for portable, cue-oriented interactive audio and MIDI content with integrated scripting. Finally, the MMA is working toward a completely new, and drastically simplified, 32-bit version of the MIDI message protocol for use on modern transports and software APIs, called the HD Protocol for MIDI Devices.
Sunday, October 5, 4:30 pm — 6:30 pm
T19 - Point-Counterpoint—Fixed vs. Floating-Point DSPs
, Audio Imagination
Jayant Datta, THX - Syracuse, NY, USA
Boris Lerner, Analog Devices - Norwood, MA, USA
Matthew Watson, Texas Instruments, Inc. - Dallas, TX, USA
There is a lot of controversy and interest in the signal processing community concerning the use of fixed and floating-point DSPs. There are various trade-offs between these two approaches. The audience will walk away with an appreciation of these two approaches and an understanding of the strengths and weaknesses of each. Further, this tutorial will focus on audio-specific signal processing applications to show when a fixed-point DSP is applicable and when a floating-point DSP is suitable.
Sunday, October 5, 5:00 pm — 6:45 pm
T20 - Radio Frequency Interference and Audio Systems
, Audio Systems Group, Inc.
This tutorial begins by identifying and discussing the fundamental mechanisms that couple RF into audio systems and allow it to be detected. Attention is then given to design techniques for both equipment and systems that avoid these problems and methods of fixing problems with existing equipment and systems that have been poorly designed or built.