AES Preprints: AES 119th Convention

6525
Perceptual Modeling of Piano Tones
Hamadicharef, Brahim; Ifeachor, Emmanuel
A modeling system for piano tones is presented. It fully automates the modeling process, which includes three main stages: sound analysis, sound synthesis, and sound quality assessment. High quality piano sounds are analysed in the time and frequency domains. Analysis results are then used to design filter models matching the string resonance and to create excitation signals using an inverse filtering technique for the excitation-filter synthesis model. The impact of each sound model parameter on the perceived sound quality has been assessed using the Perceptual Evaluation of Audio Quality (PEAQ) algorithm. This helps to optimise the DSP resource requirements for real-time implementation on multimedia PCs and FPGA-based hardware.

6526
Multi-Channel Audio Processing Using a Unified Domain Representation
Daniels, Michelle L.; Garcia, Ricardo A.; Short, Kevin M.
The Unified Domain representation for synchronized multi-channel audio streams is introduced. This lossless and invertible transformation describes multiple streams of audio as a single frequency domain magnitude component multiplied by a complex matrix encoding the spatial and phase relationship information for each channel. Unified Domain analysis and signal processing techniques for applications such as high-resolution frequency analysis, sound source separation, spatial psychoacoustic models and low-bitrate audio coding are presented.

6527
Multi-Channel Audio Time-Scale Modification
Coyle, Eugene; Dorran, David; Lawlor, Robert
Phase vocoder based approaches to audio time-scale modification introduce a reverberant artefact into the time-scaled output. Recent techniques have been developed to reduce the presence of this artefact; however, these techniques have the effect of introducing additional issues relating to their application to multi-channel recordings. This paper addresses these issues by collectively analysing all channels prior to time-scaling each individual channel.

6528
Improving MPEG-7 Sound Classification
Crysandt, Holger
This paper describes a mechanism to improve the sound classification algorithm included in the MPEG-7 standard without modifying or extending it. The sequential classification is turned into a hierarchical classification. This makes it possible to adapt the classification algorithm more flexibly to the characteristics of the sound classes. This paper also gives a detailed view of how the algorithm is implemented using an XML database to store and request content information of the audio signals and model descriptions of sound classes using the MPEG-7 standard.

6529
Measurement of Architectural Speech Security of Closed Offices and Meeting Rooms
Bradley, John S.; Gover, Bradford N.
A measurement procedure has been developed for rating the architectural speech security of closed offices and meeting rooms. It is based on measuring the attenuation between average levels in the meeting room and received levels at spot locations outside the room, 0.25 m from the room boundaries. These attenuations are used with statistical distributions of speech and noise levels to calculate a suitable signal-to-noise measure. This previously derived objective measure is related to the audibility and intelligibility of the transmitted speech. The measurement at spot receiver locations allows detection and characterization of localized weak spots (“hot spots”) in the room’s boundaries.

6530
Frequency-Based Coloring of the Waveform Display to Facilitate Audio Editing and Retrieval
Rice, Stephen V.
The audio waveform display provides the visual focus in audio-editing systems yet sounds are difficult to see in the display. Using a new technique, the display is colored to represent the frequency content to make sounds more visible. This requires extraction of frequency information from the audio signal and an appropriate mapping of this information to the color space. Ideally, the coloring is independent of recording level, and similar sounds are represented by similar colors. Audio-editing systems are enhanced by the improved user interface. Audio-retrieval systems can present colored waveform displays as visual “thumbnails” in a list of search results.

6531
Development of Auditory Alerts for Air Traffic Control Consoles
Cabrera, Densil; Ferguson, Sam; Laing, Gary
This paper documents a project that developed a hierarchical auditory alert scheme for air traffic control consoles, replacing a basic system of auditory alerts. Alerts are designed to convey the level of urgency, not provoke annoyance, be easily distinguished, minimize speech interference, and be easily localized. User evaluations indicate that the new alert scheme is highly advantageous, especially when combined with improved visual coding of alerts. The alert scheme was implemented in Australian air traffic control centers in July 2005.

6532
Multi-Channel Impulse Response Measurement, Analysis and Rendering in Archaeological Acoustics
Murphy, Damian T.
Developments in measuring the acoustic characteristics of concert halls and opera houses are leading to standardized methods of impulse response capture for a wide variety of auralization applications. This work extends and develops these methods to non-traditional performance venues and examines how objective acoustic parameter analysis can be applied in the field of acoustic archaeology. An initial study of selected archaeological sites in the UK is presented, each site demonstrating some feature of interest in terms of its acoustic characteristics. The resulting database of measurements has a particular use in convolution based reverberation, and an acoustic analysis of the impulse responses provides an additional insight as to the characteristics and construction of these spaces.

6533
Preferred Listening Levels in the Automotive Environment
Benjamin, Eric; Crockett, Brett
In recent years the automobile passenger compartment has become increasingly important as a location in which entertainment, primarily audio entertainment, is consumed. Listener preference for dialog levels in the domestic environment is fairly well understood, and it follows conversational speech levels. It has been known for some time that preferred listening levels for music are higher than those for dialog, and it has also been observed that, because of the high ambient noise levels in the automobile passenger compartment, higher listening levels are used in order to bring the reproduction levels above the noise level. What is the range of noise levels within the automobile passenger compartment, and how does that affect listener preference for sound reproduction levels?

6534
High Frequency Compensation for Compressed Digital Audio Using Sampled-Data Control
Fujiyama, Koji; Iwasaki, Naoya; Kaibe, Riku; Kano, Hiroshi; Yamamoto, Yutaka
Demand is growing for improved quality in compressed digital audio as portable non-CD players become popular. Because high compression causes poor sound quality, a way to improve it is being sought. An effective solution is to generate high frequency components from the lower ones. We propose the idea of using imaging components and digital filters designed using “Sampled-Data Control” to minimize the error between an assumed original analog signal and the interpolated digital signal. We successfully improved the sound quality of compressed audio by achieving a frequency spectrum close to that of CD, especially for audio at low bit rates.

6535
Are There Criteria to Evaluate Optical Disc Quality That Are Relevant for End-Users?
Fontaine, Jean-Marc; Poitevineau, Jacques
This work deals with long-term preservation of sound and audiovisual heritage collections. Once-recordable optical discs (among other media) can fulfill such an objective, provided rigorous measures are taken, essentially through careful inspection of the discs by means of specific equipment. We have studied disc quality criteria by thoroughly considering end-user applications, especially during data transfer processing (initial quality) and later accesses to existing collections (aging behavior). We show that error rates cannot be adopted as the sole quality descriptors, and that other parameters have to be defined, e.g., from the data provided by our research-grade analyzer. A multidimensional statistical method such as Principal Component Analysis (PCA) gives promising indications towards this goal. These studies, which we have been conducting for many years, are now reinforced by the formation of a network (Groupe d’Intérêt Scientifique, GIS, in France) that combines complementary abilities and research equipment.

6536
An Open Design and Implementation for the Enabler Component of the Plural Node Architecture of Professional Audio Devices
Foss, Richard John; Fujimori, Jun-ichi; Okai-Tettey, Harold
The Plural Node architecture is an implementation architecture for professional audio devices that adhere to the “Audio and Music (A/M)” protocol. The Plural Node implementation architecture comprises two components on separate IEEE 1394 nodes: a “Transporter” component dedicated to A/M protocol handling, and an “Enabler” component that controls the Transporter and provides high level plug abstractions. An Open Generic Transporter specification has been developed for the Transporter component. This paper details an open design and implementation for the Enabler component that allows for connection management via abstract mLAN plugs.

6537
Audibility of Spectral Switching in Head-Related Transfer Functions
Faundez Hoffmann, Pablo; Møller, Henrik
Binaural synthesis of a time-varying sound field is performed by updating head-related transfer functions (HRTFs). The updating is done to reflect the changes in the sound transmission to the listener’s ears that occur as a result of moving sound. Unless the differences in HRTFs are sufficiently small, a direct switching between them will cause an audible artifact that is heard as a click. By modeling HRTFs as minimum-phase filters and pure delays, it is possible to study the effects of spectral and time switching separately. Time switching was studied in a previous investigation. This work presents preliminary results on minimum audible spectral switching (MASS).

6538
Virtual Source Location Information for Binaural Cue Coding
Choi, In Yong; Chon, Sang Bae; Moon, Han-gil; Seo, Jeongil; Sung, Koeng-Mo
Binaural Cue Coding (BCC) is an audio coding technology that expresses multi-channel audio signals with mono or stereo downmixed audio signal(s) and side information consisting of Inter-Channel Level Difference (ICLD), Inter-Channel Time Delay (ICTD), and Inter-Channel Correlation (ICC). Among these, the ICLD describes the level difference between the signal of one channel and the reference downmixed signal of the BCC system. The ICLD plays the most important role in lateralizing the spatial image. However, the fact that the spatial image of sound is created in nature by the location of the sound source raises the question of whether there is a more direct way than ICLD to describe the location of the sound source for the spatial image. Virtual Source Location Information (VSLI), the new side information proposed in this paper, provides an answer to this question.
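The level-difference idea in the abstract above can be illustrated with a minimal sketch. It assumes a broadband ICLD computed as a power ratio in decibels over a whole block; practical BCC systems compute the cue per subband and per frame, and the function names `mean_power` and `icld_db` are hypothetical, not taken from the paper.

```python
import math

def mean_power(signal):
    """Average power of a block of samples."""
    return sum(s * s for s in signal) / len(signal)

def icld_db(channel, reference, eps=1e-12):
    """Level difference in dB between one channel and a reference signal.

    eps is an assumed small constant that avoids log(0) for silent blocks.
    """
    ratio = (mean_power(channel) + eps) / (mean_power(reference) + eps)
    return 10.0 * math.log10(ratio)

# Usage: a channel at half the amplitude of the reference sits about 6 dB below it.
reference = [math.sin(2.0 * math.pi * 5.0 * n / 200.0) for n in range(200)]
channel = [0.5 * s for s in reference]
level_difference = icld_db(channel, reference)
```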

6539
Perceptual Movement of Sounds Fed Through Multiway Loudspeakers Perpendicularly Set Up
Agatsuma, Yu; Miyasaka, Eiichi
Vertical localization and perceptual movement of auditory images were investigated using 8 loudspeakers set up perpendicularly, with each loudspeaker 30 cm apart. The results obtained from over 20 observers show that the localization was identified fairly well for one-octave bands of noise with center frequencies from 2 to 8 kHz, and that smooth movement of auditory images from low to high was perceived when a stimulus consisting of one-octave bands of noise was moved linearly upward at various speeds.

6540
High Order Spatial Audio Capture and Its Binaural Head-Tracked Playback Over Headphones with HRTF Cues
Davis, Larry S.; Duraiswami, Ramani; Grassi, Elena; Gumerov, Nail A.; Li, Zhiyun; Zotkin, Dmitry N.
A theory and a system for capturing an audio scene and then rendering it remotely are developed and presented. The sound capture is performed with a spherical microphone array. The sound field at the location, and in a region of space in the neighborhood, of the array is deduced from the captured sound and represented using either spherical wave-functions or plane-wave expansions. The representation is then transmitted to a remote location for immediate rendering or stored for later reproduction. The sound renderer, coupled with the head tracker, reconstructs the acoustic field using individualized head-related transfer functions to preserve the perceptual spatial structure of the audio scene. Rigorous error bounds and a Nyquist-like sampling criterion for the representation of the sound field are presented and verified.

6541
Performance of Spatial Audio Using Dynamic Cross-Talk Cancellation
Assenmacher, Ingo; Lentz, Tobias; Sokoll, Jan
Creating a virtual sound scene with spatially distributed sources needs a technique to introduce spatial cues into audio signals and an appropriate reproduction system. In our case a complete binaural approach is used. It consists of binaural synthesis and head-tracked dynamic cross-talk cancellation (CTC) for the reproduction of the binaural signal at the ears of the listener. In this paper the performance and limitations of the complete system and of the various subsystems are investigated and discussed. The channel separation of the dynamic CTC system is measured for various positions in the listening area, and the subjective localisation performance is examined in listening tests.

6542
An Application of Lined-Up Loudspeaker Array System for Mixed Reality Audio-Visual Reproduction System
Ikenaga, Toshikazu; Komiyama, Setsu; Naito, Youichi; Nakayama, Yasushige; Okubo, Hiroyuki
An interactive audio-visual system called the Mixed Reality Audio-Visual (MRAV) reproduction system has been developed. The MRAV system employs a method of stereoscopic image projection and a technique of multichannel sound field reproduction in which the loudspeaker array is able to focus the sound pressure in front of the listener to coincide with the three-dimensional (3D) visual image. A new sound screen with a silver metallic coating has also been developed. It helps to maintain the sound quality even when the sound is radiated from a loudspeaker array system set behind the screen. This paper describes the design of the loudspeaker array system and investigates the generated sound field corresponding to 3D Computer Graphic (CG) images.

6543
Perceptual Evaluation of 5.1 Downmix Algorithms
Klehs, Beate; Krake, Alexander; Liebetrau, Judith; Muckenschnabl, Gabi; Richter, Felix; Sporer, Thomas; Weitzel, Mandy
In a workshop at the 118th AES Convention, problems and solutions for automatic down-mixing were summarized. The key question is for which items automatic algorithms provide acceptable quality. Closely related to this issue is the question of how to evaluate the quality of the result. Standardized listening test procedures like ITU-R BS.1116 and ITU-R BS.1534 are designed to evaluate the difference between an unimpaired reference and a modified signal under test. They were never intended to be used for the comparison of 5-channel signals to 2-channel signals. In this paper a new listening test procedure is described which is designed to judge the quality of down-mixing algorithms. First results of listening tests performed using this procedure are described.

6544
Discrimination of Auditory Source Focus for Musical Instrument Sounds with Varying Low-Frequency Cross Correlation in Multichannel Loudspeaker Reproduction
Kim, Sungyoung; Martens, William L.; Marui, Atsushi
This study examined the changes in auditory spatial impression associated with changes in signal incoherence within the low-frequency portion of a multichannel loudspeaker reproduction. Multichannel recordings were made in reverberant concert settings of single notes played on musical instruments with significant low-frequency energy. A signal processing method was then developed to manipulate low-frequency correlation in the pre-recorded material while maintaining high sound quality; subsequent listening tests measured the perceptual effects of varying low-frequency correlation on otherwise identical recordings of low-pitch, single-note performances on musical instruments such as the bass violin. For cutoff frequencies ranging from 200 Hz down to 63 Hz, the effects of cutoff frequency on discrimination thresholds were measured for changes in low-frequency correlation using a two-alternative forced-choice task. Listeners also made forced-choice identifications regarding auditory source focus. Results indicated that both discrimination and identification performance were degraded in the presence of the higher-frequency portion of the musical stimuli.

6545
Optimizing Placement and Equalization of Multiple Low Frequency Loudspeakers in Rooms
Birkedal Nielsen, Sofus; Celestinos, Adrian
Every room has a strong influence on the low frequency performance of a loudspeaker. This is often problematic to control and to predict. The modal resonances modify the response of the loudspeaker depending on placement and listening position. In order to anticipate the behaviour of low frequency loudspeakers in rooms, a simulation tool has been created based on finite-difference time-domain (FDTD) approximations. Simulations have shown that by increasing the number of loudspeakers and modifying their placement a significant improvement is achieved. A more even sound pressure level distribution along a listening area is obtained. The placement of loudspeakers has been optimized. Furthermore, an equalization strategy can be implemented for optimization purposes. This solution can be combined with multi-channel sound systems.

6546
An Immersive Audio Environment with Source Positioning Based on Virtual Microphone Control
Braasch, Jonas; Ryan, Timothy J.; Woszczyk, Wieslaw
In this paper, an auditory virtual environment (AVE) is described that uses virtual microphone control (ViMiC) to address a 24-channel loudspeaker system based on ribbon speakers. In the newly designed environment, the microphones, with adjustable directivity patterns and axes of orientation, can be spatially placed as desired. The system architecture was designed to comply with the augmented ITU surround-sound loudspeaker placement and to create sound imagery similar to that associated with standard sound recording practice. The AVE will be used with close-spot microphone techniques in two-way internet audio transmissions to avoid feedback loops and provide dynamic placement for a number of sources.

6547
Simulation and Visualization of Room Compensation for Wave Field Synthesis with the Functional Transformation Method
Petrausch, Stefan; Rabenstein, Rudolf; Spors, Sascha
Active room compensation based on wave field synthesis (WFS) has been recently introduced. So far, the verification of the compensation algorithm has only been possible through elaborate acoustical measurements. Therefore, a new simulation method is applied that is based on the functional transformation method (FTM). Compared with other simulation techniques, the FTM provides several advantages that facilitate the correct simulation of the complete wave field, particularly in the frequency ranges of interest for WFS. The complete procedure, starting from the virtual "measurements" of the acoustical properties of the simulated room, via the correct excitation for the simulated wave field, to the resulting animations and sounds, is presented in this paper.

6548
Acoustic Intensity in Multichannel Rendering Systems
Hurtado-Huyssen, Antoine; Polack, Jean-Dominique
Acoustic intensity is the mechanical energy stream through a point in the sound field. In the far field, it also identifies the direction of the main source (when there is one), but it seemingly remains a neglected quantity in multichannel recording and reproduction systems. This paper describes the information contained in sound intensity, how it can be used, and what can be expected from it, and it underlines the fact that this quantity is accessible in the frequency domain through any existing cardioid recording system, such as ambisonic or double MS.

6549
Towards a Procedure for Stability Analysis of High Order Sigma Delta Modulators
Reiss, Joshua
One of the greatest unsolved problems in the theory of sigma delta modulation concerns the ability to analytically derive the stability, or boundedness, of a high order sigma delta modulator (SDM). In this work, we describe the existing literature and try to clarify the issues involved. We fully derive the stability of first order sigma delta modulators, and derive some important results for the basic second order sigma delta modulator. For third order sigma delta modulators, we describe interesting simulated results as well as sketch a proof of instability, based on linear programming, for one particular SDM. Finally, we present two theoretical results concerning stability of general high order SDMs that point towards promising directions of future research.
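As a small illustration of what boundedness means in the first-order case, the following sketch simulates a textbook single-bit first-order modulator. It is not the derivation from the paper: the structure (one integrator, a sign quantizer with outputs of plus or minus one, inputs limited to the range -1 to 1) and the name `first_order_sdm` are assumptions chosen for illustration. Under those assumptions the integrator state stays within plus or minus 2.

```python
def first_order_sdm(x):
    """Simulate an assumed textbook first-order single-bit sigma-delta modulator.

    Returns the output bit stream (+1.0 / -1.0) and the integrator states.
    """
    state = 0.0      # integrator state
    y_prev = 0.0     # previous quantizer output fed back to the input
    outputs, states = [], []
    for sample in x:
        state = state + sample - y_prev       # integrate the error
        y = 1.0 if state >= 0.0 else -1.0     # one-bit quantizer
        outputs.append(y)
        states.append(state)
        y_prev = y
    return outputs, states

# Usage: a constant input of 0.3 gives a bit stream whose average approaches 0.3.
bits, states = first_order_sdm([0.3] * 1000)
average = sum(bits) / len(bits)
peak_state = max(abs(s) for s in states)
```

For higher orders no such simple bound is available in general, which is the difficulty the abstract refers to.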

6550
An Interface for Analysis-Driven Sound Processing
Bogaards, Niels; Röbel, Axel
AudioSculpt is an application for the musical analysis and processing of sound files. The program unites a very detailed inspection of sound, both visually and auditorily, with high quality analysis-driven effects, such as time-stretch, transposition and spectral filtering. Multiple algorithms provide automatic segmentation to guide the placement of sound treatments and steer processing parameters. By designing transformations directly on the sonogram, very precise spectral modifications can be made, allowing intuitive sound design as well as sound restoration and source separation.

6551
A Comparison of Digital Power Amplifiers with Conventional Linear Technology: Performance, Function and Application
Bell, Craig; Sibson, Isaac
Drawing conclusions about the actual subjective performance of an amplifier from the results of standard tests can be difficult. The measured results for the harmonic distortion and inter-modulation distortion in the case of the linear amplifier are excellent and exceed those of the digital amplifier. However, a subjective assessment with music material ranked the digital amplifier performance as superior. More investigation was required.

6552
Selective Mixing of Sounds
Kleczkowski, Piotr
An interesting psychoacoustic phenomenon has been found: the removal of large parts of musical tracks in the time-frequency domain may not be perceived in the mix at all, whereas some details of the sounds are heard enhanced in the mix. The phenomenon is described and investigations into the possibility of its practical use are presented. It is shown how the details of implementation and particular parameters affect the attributes of the sound. The differences in the sound of a standard mix and the sound of the mix based on this phenomenon are summarised.

6553
An Efficient Asynchronous Sampling-Rate Conversion Algorithm for Multi-Channel Audio Applications
Beckmann, Paul; Stilson, Timothy
We describe an asynchronous sampling-rate conversion (SRC) algorithm that is specifically tailored to multi-channel audio applications. The algorithm is capable of converting between arbitrary asynchronous sampling rates around a fixed operating point, and it is designed to operate in multi-threaded systems. The algorithm uses a set of fractional delay filters together with cubic interpolation to achieve accurate and efficient sampling-rate conversion.
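The cubic interpolation step mentioned above can be sketched in isolation. This is a minimal illustration using 4-point Lagrange interpolation with a fixed conversion ratio; it is an assumption that this particular cubic form resembles what the authors use, and the sketch omits the fractional delay filter bank and the tracking of a drifting (asynchronous) ratio. The names `cubic_interp` and `resample_fixed_ratio` are hypothetical.

```python
import math

def cubic_interp(samples, position):
    """Read `samples` at a fractional position with 4-point Lagrange interpolation.

    Requires 1 <= position < len(samples) - 2 so all four neighbours exist.
    """
    i = int(math.floor(position))
    t = position - i                      # fractional part in [0, 1)
    c_m1 = -t * (t - 1.0) * (t - 2.0) / 6.0
    c_0 = (t + 1.0) * (t - 1.0) * (t - 2.0) / 2.0
    c_1 = -(t + 1.0) * t * (t - 2.0) / 2.0
    c_2 = (t + 1.0) * t * (t - 1.0) / 6.0
    return (c_m1 * samples[i - 1] + c_0 * samples[i]
            + c_1 * samples[i + 1] + c_2 * samples[i + 2])

def resample_fixed_ratio(samples, ratio):
    """Resample by a fixed ratio (output rate / input rate), no anti-alias filter."""
    out = []
    k = 0
    while 1.0 + k / ratio < len(samples) - 2:
        out.append(cubic_interp(samples, 1.0 + k / ratio))
        k += 1
    return out

# Usage: a cubic polynomial sampled on integers is reproduced exactly in between.
poly = [n ** 3 - 2 * n ** 2 + 3 for n in range(8)]
midpoint = cubic_interp(poly, 2.5)
upsampled = resample_fixed_ratio(poly, 2.0)
```

Without the band-limiting filters, this sketch is only suitable for signals that are already heavily oversampled.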

6554
An Approach for Multichannel Recording and Reproduction of Sound Source Directivity
Albrecht, Bernhard; de Vries, Diemer; Jacques, Roland; Melchior, Frank
Current holophonic sound systems allow the creation and free positioning of virtual sound sources. However, only a few systems incorporate the directivity characteristics of natural sources; virtual sources are mostly monopoles. Especially in the case of auditory scenes with a high degree of freedom for source and listener positioning, the reproduction of directivities is desirable. This paper presents an approach for the efficient multichannel capturing of source directivities, as well as a suitable reproduction technique which creates a sound field approximating the directional radiation of the real source. The performance of this system, which is based on Wave Field Synthesis, has been examined by means of dedicated measurements and listening tests from recordings of brass instruments.

6555
Artificial Reverberation Algorithm to Control Distance and Direction of Sound Source for Multi-Channel Audio System
Seo, Jeong-Hun; Shim, Hwan; Sung, Koeng-Mo; Yoo, Jae-Hyoun
A multichannel artificial reverberation algorithm to control perceived direction and distance is described in this paper. In conventional algorithms using IIR filters, reverberation time is the only parameter that can be controlled. Moreover, since conventional convolution-based algorithms apply the same impulse responses without considering sound localization, they are not realistic enough. The new algorithm proposed in this paper utilizes early reflections segmented according to the azimuth from which the direct sound comes, controls perceived direction by panning the direct sound, and controls perceived distance by adjusting the Energy Decay Curve (EDC) of the reverberation and the gain of the direct sound. In addition, the algorithm enhances Listener Envelopment (LEV) by making late reverberation incoherent among channels.
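For readers unfamiliar with the Energy Decay Curve, the sketch below computes it from an impulse response using Schroeder backward integration, the standard definition. How the paper's algorithm adjusts the EDC is not shown here; the function name `energy_decay_curve_db` and the synthetic exponential impulse response are assumptions for illustration.

```python
import math

def energy_decay_curve_db(impulse_response):
    """Energy Decay Curve in dB, normalized to 0 dB at time zero.

    Uses backward integration: EDC[n] is the energy remaining from sample n on.
    """
    remaining = 0.0
    edc = [0.0] * len(impulse_response)
    for n in range(len(impulse_response) - 1, -1, -1):
        remaining += impulse_response[n] * impulse_response[n]
        edc[n] = remaining
    total = edc[0]
    if total <= 0.0:
        raise ValueError("impulse response has no energy")
    # The floor avoids log(0) when the tail of the response is exactly zero.
    return [10.0 * math.log10(max(e / total, 1e-30)) for e in edc]

# Usage: an assumed exponentially decaying response gives a straight-line decay in dB.
h = [0.99 ** n for n in range(2000)]
edc_db = energy_decay_curve_db(h)
```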

6556
Surround Recording of Music: Problems and Solutions
Wuttke, Jörg
How can we make surround recording for pure audio applications more successful? Musical content and good sound (among other concerns) are as important in surround as they are in two-channel stereo. But the manufacturers of home playback equipment and the producers of program material have not yet exhausted all the possibilities. Surround must offer both localization and a sense of spaciousness throughout an enlarged listening area if it is to justify its higher costs. Success therefore depends greatly on the use of recording methods which extract maximum benefit from the center channel. Various possibilities exist for surround microphone arrangements; eight different methods for 5.0 pickup will be described, along with the issues of crosstalk and "stereo down-mix compatibility."

6557
Motion-Tracked Binaural Sound for Personal Music Players
Algazi, V. Ralph; Dalton, Robert J., Jr.; Duda, Richard O.; Thompson, Dennis M.
Motion-tracked binaural or MTB recording enhances headphone-based spatial sound reproduction by capturing and exploiting localization cues that result from voluntary head motion. For music reproduction, the sound field can be stabilized for any arbitrary head rotation by using sixteen microphones to sample the space around the head, and by employing the signal from a head tracker to interpolate between these channels. MTB’s use of headphones makes it particularly suitable for portable music players. However, the technique must be modified to meet the special needs of this application. Methods are described for (a) converting legacy recordings to MTB format, (b) reducing the number of channels from 16 to 2.5, and (c) processing the head-tracker signal to extract head motion from the combination of head and torso motion.

6558
Subjective Consumer Evaluation of Multi-Channel Audio Codecs
Barbour, James L.
Are normal listeners able to identify any significant differences between multi-channel audio codecs when listening to commercial music releases on good quality, consumer audio equipment? Audio professionals have often questioned whether consumers are able to hear the difference between high density, uncompressed multi-channel formats and lower data-rate delivery formats. In this study, formal subjective listening tests were conducted according to the ITU-R BS.1534 (MUSHRA) recommendation to evaluate consumer perception of popular 5.1 surround sound formats, namely Dolby AC-3, DTS, WMA Pro and mp3surround. Results suggest there is a threshold data-rate below which consumers are able to hear audible differences. Experimental design, methodology and results will be presented and discussed.

6559
Advanced Multichannel Audio System for Reproducing a Live Sound Field with Ultimate Sensation of Presence
Hamasaki, Kimio; Iwaki, Masakazu; Nakayama, Yasushige; Nishiguchi, Toshiyuki; Okubo, Hiroyuki; Okumura, Reiko
An advanced multichannel audio system for reproducing a live sound field with ultimate sensation of presence has been studied for reproducing the sound field without any restrictions on the listening points. This system also aims to give audiences the optimum natural impression of presence and reality according to their listening point, and to be applicable to live transmission and broadcasting. This paper introduces the experimental sound system and describes in detail the recording techniques and latest subjective evaluation experiments to evaluate the appropriateness of necessary conditions of the new sound system. This paper also discusses the advantages of the new sound system regarding the impression of presence and interactivity compared with conventional multichannel audio systems.

6560
VisualAudio - An Environment for Designing, Tuning, and Testing Embedded Audio Applications
Beckmann, Paul; Jaffe, David A.; Peddie, Britton; Stilson, Timothy; Van Duyne, Scott
Different hardware configurations and applications suggest different audio system design trade-offs. VisualAudio is focused on embedded processor applications, and currently works with Analog Devices, Inc. SHARC and Blackfin processors. VisualAudio is appropriate for a wide range of applications, including general purpose audio, pro audio, music “stomp” boxes, consumer electronics (such as audio-visual receiver (AVR) systems), and automotive audio systems. This article describes the decisions that were made in the design of VisualAudio and how they are tailored to the embedded processing environment. It contrasts VisualAudio with previous systems created by the authors, particularly Staccato Systems’ “SynthCore,” currently known as Analog Devices’ “SoundMAX.”

6561
Analysis and Design Algorithm of Time Varying Reverberator for Low Memory Applications
Choi, Tacksung; Lee, Junho; Park, Young-cheol; Youn, Daehee
Development of an artificial reverberation algorithm with low memory requirements has been an issue of importance in applications such as mobile multimedia devices. One possible solution to this problem is to embed a time-varying all-pass filter in the feedback loop of the comb filter. In this paper, theoretical and perceptual analyses of reverberators embedding time-varying all-pass filters in their feedback loops are presented. The analyses aim to find a perceptually acceptable degree of phase variation by the all-pass filter. Based on the analyses, we propose a new methodology for designing reverberators embedding time-varying all-pass filters. Through subjective tests, we showed that, even with smaller memory, the proposed method is capable of providing perceptually superior sound to the previous methods involving time-invariant parameters.
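The structure described above, a comb filter with a time-varying all-pass filter in its feedback loop, can be sketched as follows. This is not the authors' design: the first-order all-pass section, the sinusoidal modulation of its coefficient, and all parameter values (`delay`, `g`, `a0`, `depth`, `rate_hz`, `fs`) are illustrative assumptions, as is the function name `tv_allpass_comb`.

```python
import math

def tv_allpass_comb(x, delay=50, g=0.7, a0=0.3, depth=0.1, rate_hz=1.0, fs=8000.0):
    """Feedback comb filter whose loop contains a slowly modulated all-pass.

    y[n] = x[n] + g * allpass(y[n - delay]), with the all-pass coefficient
    a[n] = a0 + depth * sin(2*pi*rate_hz*n/fs).
    """
    line = [0.0] * delay        # circular delay line of past outputs
    ap_in_prev = 0.0            # all-pass state: previous input
    ap_out_prev = 0.0           # all-pass state: previous output
    y = []
    for n, sample in enumerate(x):
        a = a0 + depth * math.sin(2.0 * math.pi * rate_hz * n / fs)
        delayed = line[n % delay]                       # y[n - delay]
        # First-order all-pass: H(z) = (-a + z^-1) / (1 - a z^-1)
        ap_out = -a * delayed + ap_in_prev + a * ap_out_prev
        ap_in_prev, ap_out_prev = delayed, ap_out
        out = sample + g * ap_out
        line[n % delay] = out
        y.append(out)
    return y

# Usage: the impulse response decays because the loop gain g is below one.
impulse = [1.0] + [0.0] * 3999
response = tv_allpass_comb(impulse)
head_energy = sum(v * v for v in response[:1000])
tail_energy = sum(v * v for v in response[-1000:])
```

The modulation smears the regular echo pattern of a plain comb filter, which is the intuition behind needing less delay memory; the acceptable modulation depth is exactly what the paper's perceptual analysis addresses.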

6562
A Comparison of the Performance of “Pruned Tree” Versus “Stack” Algorithms for Look-Ahead Sigma Delta Modulators
Angus, Jamie A. S.
Look-ahead Sigma-Delta modulators look forward k samples before deciding to output a “one” or a “zero”. The Viterbi algorithm is then used to search the trellis of the exponential number of possibilities that such a procedure generates. This paper describes alternative tree-based algorithms. Tree-based algorithms are simpler to implement because they do not require backtracking to determine the correct output value. They can also be made more efficient using “Stack” algorithms. Both the tree algorithm and the more computationally efficient “Stack” algorithms are described. Implementations of both algorithms are described in some detail, in particular the appropriate data structures for both the trial filters and score memories. Comparative results of their performance are also presented.

6563
Adaptive Strategies for Inverse Filtering
Bouchard, Martin; Norcross, Scott G.; Soulodre, Gilbert A.
Inverse filtering methods commonly use techniques such as regularization and/or smoothing to reduce artifacts created by the inverse filter. Previous studies have shown that these additional techniques can themselves introduce audible artifacts. Furthermore, the “optimal” amount of regularization or smoothing must be chosen by trial and error. This paper introduces some adaptive strategies based on analyzing the incoming audio to improve the subjective performance of various inverse filtering methods. The incoming audio signal is processed in blocks and the spectrum or masking curve can be calculated. One can then use the information from the audio signal to modify the inverse filter to help its performance. The characteristics of the incoming audio signal could also be used to determine if the application of an inverse filter is even necessary. In this paper two approaches are used to help define an inverse filter that is dependent on the incoming audio signal based on a frequency-domain fast-deconvolution method.

6564
New Understandings of the Use of Ferrites in the Prevention and Suppression of RF Interference to Audio Systems
Brown, Jim
Building on the work of Muncy, [1] the author has shown that radio-frequency current on cable shields is often coupled to audio systems by two mechanisms – “the pin 1 problem” and shield-current-induced noise (SCIN). [2, 3, 4, 5] An improved equivalent circuit for a ferrite choke is developed that addresses both dimensional resonance within ferrites and the self resonance of inductors formed using those materials, then compared with measured data. Field tests show that chokes formed by passing signal cables through ferrite cores can significantly reduce current-coupled interference over the range of 500 kHz to 1,000 MHz. Guidelines are presented for diagnosing the causes of EMI from sources as diverse as AM broadcast transmitters and cell phones. Solutions are presented for use in new products and for RFI suppression in field installations.

6565
Parametric Control of Filter Slope Versus Time Delay for Linear Phase Crossovers
Baird, Justin; Jackson, Bruce; McGrath, David
Linear phase crossover filters are a powerful tool for sound system designers. They deliver a near-ideal response with ruler-flat pass band, steep transition slopes and adjustable stop-band rejection – all with zero phase shift. Transition slopes can be matched to a target response, for example 24 dB or 48 dB per octave, and can also be arbitrarily specified while still retaining a perfect-reconstruction characteristic. Practical application of linear phase crossovers requires manipulation of center frequency, transition slope and stopband rejection. A graphical user interface is described which gives users new degrees of freedom in defining linear phase filter parameters. By setting bounds for parameters such as delay, a user can continuously vary other parameters while the graphical user interface optimizes the resulting filter. This paper presents new parameters for optimization of a target transition slope within a bounded delay parameter, providing fast and efficient user controls for working with and adjusting the crossover filters in real time.

6566
A Robust Partial Tracker for Analysis of Music Signals
Satar-Boroujeni, Hamid; Shafai, Bahram
We propose a novel approach for tracking of partials in music signals based on a robust Kalman filter. The tracker is based on a regularized least-squares approach that is designed to minimize the worst-possible regularized residual norm over the class of admissible uncertainties at each iteration. We introduce a set of state-space models for our signals based on the evolution of frequency and amplitude in different classes of musical instruments. These prior models are used to estimate future values of partial tracks in successive time frames of our spectral data. Here, the parameters of evolution models are treated as bounded uncertainties and our tracker can robustly track both frequency and power partials in all frequency regions.

6567
Automatic Retrieval of Musical Rhythmic Patterns
Kostek, Bozena; Wojcik, Jaroslaw
Even though research within the Music Information Retrieval domain is well advanced, searching for music is still under development. Thanks to melody search methods applied in 'query by humming' systems, users can retrieve melodies on the basis of an audio input. However, research on rhythm is not yet as advanced. This paper addresses automatic retrieval of rhythmic patterns based on a symbolic representation of music employing repeating rhythmic and melodic patterns. In the experiments, the importance of the melorhythmic representation of a musical piece is verified and compared to the sound duration-based hypothesis ranking method. Since most musical files found on the Internet are polyphonic, the lowest or the highest sounds of the chords are also taken into consideration.

6568
A Spectrogram Display for Loudspeaker Transient Response
Gunness, David W.; Hoy, William R.
A spectrogram is a two-dimensional depiction of a waveform or transfer function in which frequency is depicted on one axis and time is depicted on the other. The level is plotted against frequency and time by using a color or gray scale. If the time resolution is constant, the display is usually referred to as a Fourier transform spectrogram. If the time resolution is scaled to the frequency, it is usually referred to as a wavelet transform spectrogram. In this paper, we present a novel and efficient method for calculating a wavelet transform spectrogram, which is optimized for the analysis of loudspeaker transient response. The new method employs complex convolution of the frequency response, rather than explicit time domain windowing, or the wavelet transform.

6569
Quality Enhancement of Low Bit Rate MPEG1-Layer 3 Audio Based on Audio Resynthesis
Cantzos, Demetrios; Kyriakakis, Chris
One of the most popular audio compression formats is indisputably the MPEG1-Layer 3 format which is based on the idea of low-bit transparent encoding. As these types of audio signals are starting to migrate from portable players with inexpensive headphones to higher quality home audio systems, it is becoming evident that higher bit rates may be required to maintain transparency. We propose a novel method that enhances low bit rate MP3 encoded audio segments by applying multichannel audio resynthesis methods in a post-processing stage or during decoding. Our algorithm employs the highly efficient Generalized Gaussian mixture model which, combined with cepstral smoothing, leads to very low cepstral reconstruction errors. In addition, residual conversion is applied which proves to significantly improve the enhancement performance. The method presented can be easily generalized to include other audio formats for which sound quality is an issue.

6570
Obtaining 120dB Performance Using Switching Power Supplies
Gaddy, Larry; Rouse, Gregg
There is a growing tendency to use switching power supplies to reduce costs. Many designers believe that switching supplies and high performance are mutually exclusive. With some careful design considerations, it is possible to optimize performance using switching power supplies. This paper will review selection of optimal switching frequency. The tradeoffs of switching frequency and efficiency, which are typical in high performance systems, will be examined. Measurement results using asynchronous audio switching clocks will be presented. Measurement results of synchronizing power supply switching to audio clocking multiples will demonstrate how to achieve 120dB performance, the current high benchmark for professional audio systems.
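One way to picture why synchronizing the supply to the audio clock can matter is to compute where a switching tone would land if it leaked into a sampled signal path. The sketch below is a simplified arithmetic illustration and not the authors' measurement method: it assumes ideal sampling at a single rate and ignores oversampling, decimation filters and supply rejection; the function name is hypothetical.

```python
def folded_frequency(f_hz, fs_hz):
    """Frequency at which a tone at f_hz appears after ideal sampling at fs_hz."""
    r = f_hz % fs_hz
    # Components above fs/2 fold back into the band 0..fs/2.
    return min(r, fs_hz - r)

# A switcher locked to 8 x 48 kHz folds to 0 Hz (DC), outside the audio band.
synchronous = folded_frequency(384000, 48000)
# A free-running switcher at 390 kHz folds to 6 kHz, inside the audio band.
asynchronous = folded_frequency(390000, 48000)
```

Under these assumptions, a switching frequency that is an exact multiple of the sample rate folds to DC, while a nearby unrelated frequency folds to an audible tone.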

6571
Influence of Artificial Mouth’s Directivity in Determining Speech Transmission Index
Bilzi, Paolo; Bozzoli, Fabio; Farina, Angelo
In room acoustics, one of the most widely used parameters for evaluating speech intelligibility is the Speech Transmission Index (STI). The experimental evaluation of the STI generally employs an artificial speaker (artificial mouth) and listener (binaural head). In this study, the influence of the emission directivity of the artificial mouth on the measurements was investigated for different acoustic environments. We found that, in many cases (e.g. big rooms or telecommunication systems), the results are not sensitive to modifications of the directivity; on the contrary, inside cars, the shape of the whole directivity balloon is important for determining correct and comparable values, and the different mouths studied give substantially different results in the same situation.

6572
A New Low-Delay Codec for Two-Way High-Quality Audio Communication
Ferreira, Aníbal J. S.; Sinha, Deepen
High-quality audio bit-rate reduction systems are widely used in many application areas involving audio broadcast, streaming and download services. With the advent of 3G mobile and wireless communication networks, there is a clear opportunity for new multimedia services, notably those relying on two-way high-quality audio communication. In this paper we describe a new perceptual audio coder that features low delay, intrinsic error robustness and high subjective audio quality at competitive compression ratios. The structure of the audio coder is described, with emphasis on its innovative approaches to semantic signal segmentation and decomposition, independent coding of sinusoidal and noise components, and bandwidth extension using Accurate Spectral Replacement. A few test results are presented that illustrate the operation and performance of the new coder. Audio demos are available at http://www.atc-labs.com/acc/

6573
Compensation of Nonlinearities of Horn Loudspeakers
Bard, Delphine; Del Nobile, Mauro; Rossi, Mario
This paper presents a method for compensating the nonlinearities of horn loudspeakers. It is possible to compensate the nonlinear effects of electroacoustic devices by applying the inverse nonlinearity upstream. The method is based on measuring the nonlinearity with Volterra series using multitone excitations. Once the Volterra kernels have been determined, we proceed by computing the inverse Volterra kernels, both in magnitude and phase. The method was implemented and validated in non-real time and in real time (DSP implementation). To validate the compensation of nonlinearities, total harmonic distortion measurements with and without compensation were compared.

6574
Diaphragm Parameters and Radiation Characteristics of Multilayer Piezoelectric Ceramic Loudspeakers
Fujii, Jun; Ohga, Juro; Oohira, Ikuo; Sashida, Norikazu
This paper presents diaphragm parameters and an analysis of radiation characteristics for a small loudspeaker using a multilayer piezoelectric ceramic bimorph diaphragm. The multilayer ceramic wafer is suitable for battery-operated mobile phones because of its lower electrical impedance. Three diaphragm parameter measuring methods are compared to develop the optimum measurement of diaphragm parameters. Then, output sound pressure frequency characteristics of a loudspeaker model with actual acoustical loads are analyzed.

6575
A Study on Lumped Elements Model and Thermal Effects of Eddy Currents in Loudspeakers
Shen, Yong; Wu, Ning; Xu, Xiaobing
A frequency-divided thermal model is developed to study the heat arising from the eddy currents in electro-dynamic loudspeakers. Using a pure tone as the test signal, the steady state temperature of the voice coil is measured point by point in the high frequency range. The results shed light on the differences among the existing lumped electrical models of eddy currents and show clearly which is better. Also, a set of innovative thermal expressions covering different circumstances is deduced from the frequency-divided thermal model. With these expressions and the measurement data, the values of all the thermal elements in the model can be obtained. An arbitrary temperature rise can then be predicted easily if several necessary parameters are given.

6576
Spatial Audio Coding System Based on Virtual Source Location Information
Jang, Inseon; Kang, Kyeongok; Seo, Jeongil
Spatial audio coding (SAC) is a process to represent multichannel audio signals as down-mixed mono or stereo signals with spatial cues. The main strength of SAC is the significant bit-rate reduction while maintaining the perceptual sound quality. Binaural cue coding (BCC) has been introduced and has become an important scheme for multichannel SAC, both in terms of audio coding and of standardization in MPEG. However, interchannel level difference (ICLD), one of the essential spatial cues for SAC, has the limitation that quantizing the ICLD for transmission may degrade the sound quality of the decoded signal. In this paper, we propose virtual source location information (VSLI), which is an angle representing geometric spatial information between channels on the playback layout, instead of the ICLD, and also a VSLI-based SAC system. Since a human listener cannot easily distinguish variations of the spatial angle smaller than three degrees, the spatial angle, and hence the VSLI, can be quantized with three-degree resolution while maintaining the perceptual quality of the output signals. The objective and subjective assessment results of our proposed system confirm superior performance to the ICLD-based SAC system.
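The three-degree quantization described here amounts to a uniform quantizer on an angle. The following sketch shows only that step; it does not reproduce how the paper derives the VSLI angle from the channels, and the function name and return convention are assumptions.

```python
def quantize_angle(angle_deg, step_deg=3.0):
    """Uniformly quantize an angle; returns (integer index, reconstructed angle)."""
    index = round(angle_deg / step_deg)   # value that would be transmitted
    return index, index * step_deg        # decoder-side reconstruction

# Example: a 10-degree source angle is sent as index 3 and decoded as 9 degrees.
index, decoded = quantize_angle(10.0)
```

With a three-degree step the reconstruction error never exceeds 1.5 degrees, which stays inside the three-degree tolerance the abstract cites.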

6577
An Ultra High Performance DAC with Controlled Time-Domain Response
Lesso, Paul; Magrath, Anthony J.
This paper describes the design of an ultra-high performance stereo digital-to-analogue converter (DAC) employing advanced digital filtering techniques. Recently there has been a renewed interest in the time-domain properties of digital filters used for interpolation and decimation. Linear phase FIR filters, which have dominated digital filter design for the last two decades, have the undesirable properties of pre-ringing and high group delay. Conversely, minimum phase filters, which offer lower levels of pre-ringing, do not have a uniform phase response. This paper describes the trade-offs in the design of filters with controlled pre-ringing, coupled with desirable phase and magnitude characteristics. The paper also describes architectural choices in the implementation of the DAC signal processing chain, required to achieve commensurate analogue performance.

6578
Understanding the Effects of AES-17 When Evaluating 192kHz Converter Performance
Gaddy, Larry; Kulavik, Richard
This paper will cover hidden performance issues in 192kHz converters that are evaluated using AES-17. AES-17 is the “AES Standard method for digital audio engineering – measurement of digital audio equipment”. This standard specifies how to test most digital audio equipment, and it outlines the standard procedures and methods for audio testing. It is important to understand that these methods allow several important issues to remain hidden in the real spectrum of the converter. These hidden elements will be addressed in detail.

6579
Wideband Piezoelectric Rectangular Loudspeaker Using Tuck Shape PVDF Bimorph
Moriyama, Nobuhiro; Ohga, Juro; Ouchi, Toshinori; Takei, Toshitaka
A bimorph sheet of PVDF (polyvinylidene fluoride) film is applied to a flat rectangular loudspeaker as a folded zigzag tuck-shaped diaphragm whose size is, for example, 260 mm × 144 mm with various depths. These loudspeakers are characterized by moderate size with a wide frequency range, light weight and no magnetic flux radiation. This paper examines the electro-acoustic transducer characteristics of this loudspeaker. Sensitivity and resonant frequency were measured by using both a flat panel baffle and a closed box. Theoretical estimation was carried out using thin curved beam theory. The estimated values are compared to the measured results.

6580
Modelling Compression Drivers Using T-Matrices and Finite Element Analysis
Morgans, Rick; Murphy, David J.
Models for a commercial compression driver were developed using transmission line matrices and Finite Element Analysis using the commercial package ANSYS. The models were compared with measurements using plane wave tube loading, and discrepancies investigated. The electrical impedance was measured in-vacuo to obtain Thiele-Small parameters without acoustic loading. A resonance was investigated and found to be air leakage into the magnet cavity. The development of frequency dependent damping in the matrix and FEA models was necessary to improve the simulation accuracy.

6581
A Proposal for Low Frequency Loudspeaker Design Utilizing Ultrasonic Motor
Negishi, Hirokazu; Ohga, Juro; Oohira, Ikuo
The limited low frequency reproduction ability of conventional direct-radiator loudspeakers is inherent, because their diaphragms must be driven in a mass-controlled range for a flat response. This paper proposes a novel direct-radiator loudspeaker suitable for low-frequency signal radiation. It utilizes an ultrasonic motor (USM) including a piezoelectric transducer. A velocity modulated continuous revolution is better than a reciprocal motion for avoiding waveform distortion due to the difference between dynamic and static frictional forces. A few fundamental ideas (a continuously revolving flat radiator, an air-flow modulation type without any mechanical radiator, and a conventional radiator actuated by a revolving mass) are compared to investigate the merits of the loudspeaker proposed here.

6582
Finite Element Modelling of a Loudspeaker. Part 1: Theory and Validation.
Geaves, Gary P.; Henwood, David J.
The paper describes finite element modelling of an axi-symmetric loudspeaker and the resulting predicted behaviour, both the motion and the resulting sound pressure field, in both the time and frequency domains. The effect of the electrical circuit is included through post-processing. Laser and impedance measurements are shown to aid the estimation of the material parameters. Predictions are compared with measured responses and are seen to represent the main features accurately. A significant spider resonance is described. Modes can be loosely classified by the position of their dominant motion, in the spider, cone or surround. This understanding of the modal structure is used in a study of trying to reduce the influence of a cone mode by varying a surround parameter (thickness). An additional paper, Part 2 at this conference, describes applications of the model.

6583
Radiated Sound Field Analysis of Loudspeaker Systems: Discrete Geometrical Distribution of Circular Membranes Versus Co-Incident Annular Rings.
Debail, Bernard G.A.; Shaiek, Hmaied
This paper addresses the problem of the spatial distribution of sound pressure generated by an annular membrane. A generalized theoretical approach will be developed in order to predetermine the sound field radiation of a disc or ring shaped diaphragm, placed in a rigid infinite baffle. Because no assumption is made regarding the observing point, this generalized method is able to predict the acoustic pressure not only in the far field region but also in near field. Results demonstrate the superiority of the co-incident ring distribution compared to the traditional discrete distribution of discs. A new transducer based on concentric ring and disc especially designed to respect this coincident criterion will be introduced.

6584
Loudspeaker Nonlinearities – Causes, Parameters, Symptoms
Klippel, Wolfgang
The paper addresses the relationship between nonlinear distortion measurement and nonlinearities which are the physical causes for signal distortion in loudspeakers, headphones, micro-speakers and other transducers. Using simulation techniques, characteristic symptoms are found for each nonlinearity and presented systematically in a guide for loudspeaker diagnostics. This information is important for the interpretation of nonlinear parameters and for performing measurements which describe the loudspeaker more comprehensively. The practical application of the new techniques is demonstrated on three different loudspeakers.

6585
Upfront Time Segmentation Methods for Transform Coding of Audio
Heusdens, Richard; Lincklaen Arriëns, Huib J.; Niamut, Omar A.
We study a transform coder that employs a dynamic programming based rate-distortion optimization framework for time segmentation. Although this coder exhibits a high performance, its computational complexity makes it infeasible for many practical applications. It is investigated whether upfront time segmentation can reduce computational complexity without a significant increase in perceptual distortion. Upfront time segmentation can be accomplished by replacing the rate-distortion cost functional with low-complexity cost measures that are independent of bit rate and perceptual distortion. Through both quantitative and qualitative evaluation it is shown that dynamic programming based upfront time segmentation for minimization of perceptual entropy can be a viable alternative to rate-distortion optimal time segmentation.
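Dynamic programming based time segmentation can be sketched generically as a search for the split of a frame sequence that minimizes a sum of per-segment costs. The sketch below assumes an additive cost and takes the cost measure as a caller-supplied function; `segment_cost` and `max_len` are hypothetical names, and the paper's actual cost measures (for example perceptual entropy) are not reproduced.

```python
def best_segmentation(n_frames, segment_cost, max_len):
    """Minimize the summed cost of consecutive segments covering 0..n_frames.

    segment_cost(start, end) returns the cost of the segment [start, end).
    Returns (total cost, list of (start, end) boundaries).
    """
    inf = float("inf")
    best = [inf] * (n_frames + 1)   # best[e]: minimal cost of covering [0, e)
    back = [0] * (n_frames + 1)     # back[e]: start of the last segment in that cover
    best[0] = 0.0
    for end in range(1, n_frames + 1):
        for start in range(max(0, end - max_len), end):
            cost = best[start] + segment_cost(start, end)
            if cost < best[end]:
                best[end], back[end] = cost, start
    # Walk the back-pointers to recover the segment boundaries.
    bounds = []
    end = n_frames
    while end > 0:
        bounds.append((back[end], end))
        end = back[end]
    bounds.reverse()
    return best[n_frames], bounds
```

The search evaluates at most `n_frames * max_len` candidate segments, so its own cost is modest; in a coder the expensive part is typically the evaluation of each segment's cost, which is where a low-complexity cost measure helps.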

6586
Enhanced Accuracy of the Tonality Measure and Control Parameter Extraction Modules in MPEG-4 HE-AAC
Rose, Kenneth; Ryu, Sang-Uk
This paper investigates possible enhancements of the high efficiency-advanced audio coding (HE-AAC) encoder, with focus on the spectral band replication modules. The HE-AAC encoder generates side information, including control parameters, that characterizes the energy distribution across time and frequency as well as tonal and noise components, to ensure perceptually coherent regeneration of the high band at the decoder. The accuracy of the encoder's tonality measure and control parameter extraction modules is analyzed, leading to the proposal of an alternative approach employing sinusoidal analysis, which offers enhanced estimation of tonal and noise energy levels, as well as an improved control parameter extraction procedure. Comparative performance evaluation of the standard and modified encoders, on a set of audio signals demonstrates the perceptual impact of estimation inaccuracy on the regenerated high band quality, and identifies the type of audio where it causes meaningful degradation.

6587
New Techniques in Spatial Audio Coding
Seefeldt, Alan; Vinton, Mark S.; Robinson, Charles Q.
The goal of spatial audio coding is to compress multi-channel audio material by combining channels into a composite signal and transmitting supporting side-information so that a decoder can reconstruct an approximation of the original signal from the composite. Many techniques have been discussed in the literature, most of which manipulate the magnitude and phase of the composite channels across time and frequency to create a perceptual approximation of the original multi-channel sound field. Building on this framework, we discuss new techniques for computing and applying the side-information, new de-correlation techniques, and a new way of utilizing a traditional spatial coding system for the purpose of synthesizing a multi-channel signal blindly from an existing stereo signal. We also compare the performance of this system to other existing systems.

6588
A New Broadcast Quality Low Bit Rate Audio Coding Scheme Utilizing Novel Bandwidth Extension Tools
Ferreira, Aníbal J. S.; Sinha, Deepen
In this paper we describe the components of a novel audio coding algorithm capable of delivering high-fidelity CD-like stereo audio at bit rates of 40-48 kbps and natural sounding FM grade mono at bit rates of 18-22 kbps. Bandwidth Extension has emerged as an important tool for the satisfactory performance of low bit rate audio codecs. Recently we proposed two new bandwidth extension algorithms, Fractal Self-Similarity Model (FSSM) and Accurate Spectral Replacement (ASR), which belong to a new class of Bandwidth Extension techniques that are applied directly to the high resolution frequency representation of the signal (e.g., MDCT or ODFT). The proposed coding scheme uses FSSM and ASR in an adaptive and complementary framework. Another important component of the proposed codec is a wideband psychoacoustic model that makes explicit use of the Comodulation Release of Masking (CMR) phenomenon. It also includes a novel parametric stereo coding technique. The proposed audio coding scheme is geared towards broadcast applications where codec latency and encoder complexity are generally not an overriding concern. In this paper we present algorithmic details of the new codec, audio demonstrations, and a comparison to other audio coding schemes. Further information and audio demonstrations are available at http://www.atc-labs.com/teslapro.

6589
The MPEG-4 Audio Lossless Coding (ALS) Standard - Technology and Applications
Harada, Noboru; Kamamoto, Yutaka; Liebchen, Tilman; Moriya, Takehiro; Reznik, Yuriy A.
MPEG-4 Audio Lossless Coding (ALS) is a new extension of the MPEG-4 audio coding family. The MPEG-4 ALS codec is based on forward-adaptive linear prediction, which enables remarkable compression even with low complexity. Additional features include long-term prediction, multichannel coding, and compression of floating-point audio material. In this paper, authors who have actively contributed to the standard describe the basic elements of the ALS codec with a focus on prediction, entropy coding, and related tools. The paper also presents the latest developments during the standardization process and points out the most important applications of this new lossless audio format.

6590
Improving Loudspeaker Transient Response with Digital Signal Processing
Gunness, David W.
The transient response of a loudspeaker represents the combined effect of a multitude of physical behaviors. Some of these behaviors are time-variant, nonlinear, or spatially variable and are not good candidates for digital correction. Others are sufficiently LTI (linear, time-invariant) and sufficiently consistent directionally to be largely correctable with specialized digital filters. In the particular case of high powered, horn-loaded loudspeakers, most of the observed transient misbehavior is the result of stable, correctable phenomena. Consequently, the transient response of such loudspeakers can be significantly improved with signal preconditioning. Measurements of an example loudspeaker demonstrate the improvements that are possible.

6591
Simulation of Harmonic Distortion in Horns Using an Extended BEM Postprocessing
Makarski, Michael
The Boundary Element Method is a well-known tool to calculate sound radiation of horns. As the BEM is based on the linearized sound field equation, only linear properties of the sound field (frequency response, directivity, etc.) can be calculated. Besides these linear properties, the nonlinear wave propagation in horns is of great interest. It depends mainly on the shape of the horn and the growth rate of the first narrow part. This paper describes a method to combine the purely linear BEM with the calculation of nonlinear wave propagation in horns. Simulation and measurement results of different horns are presented and discussed. As first results indicate, this method offers a fast and accurate way to calculate nonlinear wave propagation in horns.

6592
Modal Analysis and Nonlinear Normal Modes (NNM) on Moving Assemblies of Loudspeakers
Bolaños, Fernando
The most important modes for a direct acoustic radiator are the axial modes, which are axisymmetric circular modes of a high temporal and spatial coherence [38]. Numeric modal analysis and measurement of the free and forced accelerations and displacement responses of the moving assemblies are performed to establish the main modes involved in the acoustic response. The axial modes have been identified by measurements (within the intrinsic degree of uncertainty). The experiments show evidence of clearly nonlinear normal modes (NNM) [18] and [19], which explains the high complexity of mode finding in loudspeaker cones. Based on the axial modes, a three degrees of freedom model is proposed, where only one of the masses is externally forced. The modal analysis of a double cone speaker is treated briefly.

6593
Finite Element Modelling of a Loudspeaker Part 2: Applications
Geaves, Gary P.; Henwood, David J.; Vanderkooy, John
The finite-element loudspeaker model presented in Part 1 is extended to three applications. Firstly, we study the effect of a significant increase of the magnetic motor strength Bl on the breakup modes and other resonances of a typical driver. Approximations to the theory allow modal decomposition even when the loudspeaker voice coil is driven from a normal amplifier, showing that the modes most affected are those for which the back-emf due to voice-coil motion is strong. Secondly, we probe how the shape of the cone influences the resonances, breakup, and acoustic performance. Cones that flare outwards have the most desirable characteristics. A third and final study concerns the result of a change in the distribution of the damping and stiffness of parts of the driver, to see if useful characteristics ensue. The model is also used to investigate some aspects of measurements.

6594
Ground-Plane Constant Beamwidth Transducer (CBT) Loudspeaker Circular-Arc Line Arrays
Button, Douglas J.; Keele, Jr., D.B. (Don)
This paper describes a design variation of the CBT loudspeaker line array that is intended to operate very close to a planar reflecting surface. The original free-standing CBT array is halved lengthwise and then positioned close to a flat surface so that acoustic reflections essentially recreate the missing half of the array. This halved array can then be doubled in size, forming an array that is double the height of the original array. When compared to the original free-standing array, the ground-plane CBT array provides several advantages. It 1. eliminates detrimental floor reflections, 2. doubles array height, 3. doubles array sensitivity, 4. doubles array maximum SPL capability, 5. extends vertical beamwidth control down an octave, and 6. minimizes near-far variation of SPL. This paper explores these characteristics through sound-field simulations and over-the-ground-plane measurements of three systems: 1. a conventional two-way compact monitor, 2. an experimental un-shaded straight-line array, and 3. an experimental CBT Legendre-shaded circular-arc curved-line array.

6595
A Balanced Modal Radiator (BMR)
Bank, Graham; Harris, Neil
The goal of a practical loudspeaker that behaves like the "perfect point source" has been long sought. Mathematical analysis shows that the general prototype for such a device does indeed exist, but it does not point to an obvious embodiment. Using this prototype, a practical flat diaphragm loudspeaker is developed, which has a substantially flat on-axis pressure response, as well as a smooth and extended power response. A fully-coupled FEA model is used to investigate the intrinsic characteristics of this radiator in both the mechanical and acoustical domains. Measurements from a real prototype loudspeaker illustrate the practicality of the method.

6596
Jointly Optimal Time Segmentation, Distribution and Quantisation for Sinusoidal Audio Coding
Heusdens, Richard; Jensen, Jesper; Korten, Pim
In this paper we propose a rate-distortion optimal algorithm for sinusoidal coding of audio and speech. The algorithm determines for a pre-specified target bit-rate the optimal (variable-length) time segmentation, the optimal distribution of sinusoidal components over the segments and the optimal (scalar) quantisers for quantising the sinusoid parameters amplitude, phase and frequency. The optimisation is done by jointly optimising the segment lengths, number of sinusoids and quantisers using high-resolution quantisation theory and dynamic programming techniques, which makes it possible to execute the algorithm in polynomial time. A particular advantage of the proposed method is that, given a target bit-rate, it solves the problem of finding the optimal balance between total number of sinusoids and number of bits per sinusoid.

6597
Enhanced Performance in the Functionality of Fine Grain Scalability
Choo, KiHyun; Kim, JungHoe; Oh, Eunmi; Son, ChangYong
The purpose is to take advantage of the characteristics of arithmetic decoding and thereby improve the coding efficiency of codecs that provide fine-grain scalability. The smart arithmetic decoding algorithm exploits the fact that a decoding buffer still contains meaningful information for arithmetic decoding even if we terminate decoding when there is no bit to be fed into the buffer. We tested the effect of the symbols additionally decoded from truncated MPEG-4 BSAC and MPEG-4 scalable lossless audio coding (SLS) bitstreams. On average, approximately 41 and 13 additional symbols are uniquely decodable per frame in MPEG-4 BSAC and MPEG-4 SLS respectively. The experimental results show much less spectral difference and higher SNR with the smart arithmetic decoding. This additional “compression” can be effective when transmitting truncated bitstreams at lower bitrates.

6598
Scalability in KOZ Audio Compression Technology
Daniels, Michelle L.; Garcia, Ricardo A.; Short, Kevin M.
Intra-codec scalability in the KOZ audio compression technology is presented in detail. The KOZ codec uses a psychoacoustic model and high-resolution spectral analysis to create, prioritize and layer audio objects, making it inherently scalable by varying the number of layers. The layers are sufficiently fine-grained to allow both small-step and large-step bitrate variations in a light-weight, real-time process during content delivery. Decoder scalability based on availability of device resources is introduced. An overview of the architecture of the KOZ technology and some of the applications of its scalability are discussed.

6599
MPEG Spatial Audio Coding / MPEG Surround: Overview and Current Status
Breebaart, Jeroen; Disch, Sascha; Faller, Christof; Herre, Jürgen; Hotho, Gerard; Kjörling, Kristofer; Myburg, Francois; Neusinger, Matthias; Oomen, Werner; Purnhagen, Heiko; Rödén, Jonas
Recently, the MPEG Audio standardization group started a new work item on Spatial Audio Coding. This new approach allows for a fully backward compatible representation of multi-channel audio at bit rates that are only slightly higher than common rates currently used for coding of mono / stereo sound. This paper briefly describes the underlying idea and reports on the current status of the MPEG standardization activities. It provides an overview of MPEG Spatial Audio Coding technology, and discusses its capabilities. The current level of performance will be illustrated by listening test results.

6600
Efficient Design of Time-Frequency Stereo Parameter Sets for Parametric HE-AAC
Chang, Tzu-Wen; Hsu, Han-Wen; Lee, Kan-Chun; Lee, Wen-Chieh; Liu, Chi-Min; Yang, Chung-Han
The Parametric Stereo (PS) coding tool is used to reconstruct a stereo signal from a monaural signal. The tool can be used jointly with HE-AAC to achieve a high compression ratio; the combination is referred to as parametric HE-AAC in this paper. The PS tool is able to capture the stereo image of the audio input signal in a limited number of parameters, requiring only a small overhead. In MPEG-4 HE-AAC, the PS tool segments a frame into several regions in the time domain and into stereo bands in the frequency domain to deliver stereo parameter sets. This paper considers the design of the stereo parameters. These methods are integrated in the NCTU-HE-AAC and objective experiments are conducted to check the quality.

6601
Structural Analysis of Low Latency Audio Coding Schemes
Geiger, Ralf; Lutzky, Manfred; Schmidt, Markus; Schnell, Markus
Low latency audio coding is gaining importance among upcoming high quality communication applications like videoconferencing and VoIP. This paper provides a comparison of two low latency audio codecs suitable for these tasks: MPEG-4 ER AAC-LD and ITU-T G.722.1 Annex C. Despite their similar coding strategies, the two codecs show significant differences with respect to the tools used and coding performance. A comparison of the coding tools is provided and the influence on different signal classes is discussed.

6602
The Design of Half-Band FIR Filters Using Ripple Attenuation of a Manipulated Lowpass
Wise, Duane K.
This paper investigates a technique for extracting a half-band FIR filter from a lowpass root FIR. Employing this technique with a root filter designed with an optimal least squares algorithm can result in a half-band FIR with very low ripple over most of the pass-band and stop-band, at the expense of ripple size at the band-edges. The principal advantage of this technique is in the design of half-band FIRs with less priority to band-edge ripple without the manipulation of an arbitrary weighting function.

6603
Single Channel Source Separation Using Short-Time Independent Component Analysis
Barry, Dan; Coyle, Eugene; Fitzgerald, Derry; Lawlor, Robert
In this paper we develop a method for the sound source separation of single channel mixtures using Independent Component Analysis within a time-frequency representation of the audio signal. We apply standard Independent Component Analysis techniques to contiguous magnitude frames of the short-time Fourier transform of the mixture. Provided that the amplitude envelopes of each source are sufficiently different, it can be seen that it is possible to recover the independent short-time power spectra of each source. A simple scoring scheme based on auditory scene analysis cues is then used to overcome the source ordering problem ultimately allowing each of the independent spectra to be assigned to the correct output source. A final stage of adaptive filtering is then applied which forces each of the spectra to become more independent. Each of the sources is then resynthesised using the standard inverse short-time Fourier transform and overlap add scheme.

6604
A New Class of Smooth Power Complementary Windows and Their Application to Audio Signal Processing
Ferreira, Aníbal J. S.; Sinha, Deepen
In this paper we describe a new family of smooth power complementary windows which exhibit a very high level of localization in both time and frequency domain. This window family is parameterized by a "smoothness quotient". As the smoothness quotient increases the window becomes increasingly localized in time (most of the energy gets concentrated in the center half of the window) and frequency (far field rejection becomes increasingly stronger, to the order of 150 dB or higher). A closed form solution for such a window function exists and the associated design procedure is described. The new class of windows is quite attractive for a number of applications as switching functions, equalization functions, or as windows for overlap-add and modulated filter banks. An extension to the family of smooth windows which exhibits improved near-field response in the frequency domain is also discussed. More information is available at http://www.atc-labs.com/technology/misc/windows.
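
As an illustration of the "power complementary" property mentioned above, the following minimal sketch checks that the squared window plus its half-length-shifted square sums to one. It uses the ordinary sine window as a stand-in, not the authors' smooth window family, whose closed form is not given in the abstract; the function names are hypothetical.

```python
import numpy as np

def sine_window(length):
    """Plain sine window (a stand-in, NOT the window family of the paper)."""
    n = np.arange(length)
    return np.sin(np.pi * (n + 0.5) / length)

def power_complementary_error(window):
    """Largest deviation of w[n]^2 + w[n + N/2]^2 from 1 (N assumed even)."""
    half = len(window) // 2
    return np.max(np.abs(window[:half] ** 2 + window[half:] ** 2 - 1.0))

# The sine window satisfies the condition up to rounding error,
# whereas a Hann window (np.hanning) does not.
sine_error = power_complementary_error(sine_window(1024))
hann_error = power_complementary_error(np.hanning(1024))
```

Any window meeting this condition can serve in overlap-add or modulated filter bank schemes with 50% overlap; the paper's contribution concerns how smooth and well localized such a window can be made.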

6605
Active Leak Compensation in Small Sized Speakers Using Digital Signal Processing
Chopra, Varun
The frequency response of a small sized speaker unit used in applications such as mobile phones changes substantially with changes in the acoustical load of the speaker. Presently, an acoustical solution is used for reducing the variations in the acoustical load of the speaker. The acoustical solution relies on certain space and volume considerations to function satisfactorily, which are difficult to attain in today’s compact mobile phones. An unconventional approach using digital signal processing to counter such degradation in frequency response is described here.

6606
Digital Filter Design and Implementation within the Steinberg Virtual Studio Technology (VST) Architecture
Osorio-Goenaga, Roberto
Steinberg Media Technologies, GmbH, of Germany, is one of the leading manufacturers of pro-audio hardware and software products. Within their software realm, they have developed a plug-in architecture for adding third-party DSP functionality for program developers who choose to support it. The architecture, commonly referred to as VST (Virtual Studio Technology), has become a standard for third-party add-ons over the last decade, partly because of its cross-platform functionality. The software development kit (SDK) for VST plug-ins is available free of charge from Steinberg, and is optimized for building within Microsoft’s Visual C++ environment on x86 PCs, and on the CodeWarrior environment for Apple computers. This project focuses on the implementation of classic and experimental filters within the aforementioned architecture, created and compiled on Visual C++; rebuilding these examples on a Mac should be a straightforward process. DSP literature is included, although only to a certain depth so as to not overwhelm the reader with the mathematics behind the process; a recommended reading list is included for that purpose. The paper is designed for introducing general signal processing theory as well as documenting the process of creating VST plug-ins in a clear and understandable manner. A suite of VST plug-ins is produced and the entire source code is available online as an appendix to this project at http://www.thedigitalvortex.com/appendix.html

6607
Comparison Between Time Delay Based and Non-Uniform Phase Based Equalization for Multi-Channel Loudspeaker-Room Responses
Bharitkar, Sunil; Kyriakakis, Chris
Traditionally, room response equalization is performed to improve sound quality at a listener. Given a loudspeaker and a listener, in a room, a loudspeaker-room response is obtained and an inverse filter is designed for loudspeaker-room magnitude response equalization. However, due to non-coincident positions of any two loudspeakers, in a multi-channel setup, the combined response of the two loudspeakers may have an undesired broad spectral notch or peak or large spectral deviations in the crossover region. These spectral deviations introduced around the crossover, due to the combined phase response, generally cannot be compensated with magnitude response equalization. In this paper, we compare two different methods (time delay and all-pass cascade) for correcting for the spectral deviations in the crossover region. We demonstrate that using non-uniform phase distribution, with all-pass filters, around the crossover region, as opposed to a constant phase (i.e., a fixed and optimized time delay in the satellite), it is possible to obtain better correction in the crossover region but with increased complexity. We also present an automatic approach for evaluating performance with the time-delay approach.

6608
Objective Function for Automatic Multi-Position Equalization and Bass Management Filter Selection
Bharitkar, Sunil; Kyriakakis, Chris
Traditionally, multi-position room response equalization is performed to improve sound quality at multiple listeners. Furthermore, even after multi-position equalization, due to non-coincident positions of the subwoofer and the satellite, in a multi-channel setup, the combined response of the two loudspeakers may include undesirable spectral deviations in the crossover region, which are different at different positions. These spectral deviations introduced around the crossover, due to the combined phase response, may be fixed by proper choice of the bass management filters. In this paper, we present an objective function that can be used to characterize the performance of multi-position equalization, determine the uniformity of equalization, as well as allow automatic selection of the bass management filters for correcting the spectral deviations in the crossover region.

6609
Acoustical Monitoring Research for National Parks and Wilderness Areas
Chen, Zhixin; Gregoire, B. Jerry; Maher, Robert C.
The natural sonic environment, or soundscape, of parks and wilderness areas is not yet fully characterized in a scientific sense. Published research in the U.S. National Park System is generally based on short-term sound level measurements or visitor response surveys associated with regulatory evaluation of noise intrusions from motorized recreational vehicles, tour aircraft, or nearby industrial activity. This paper reviews the history of soundscape studies in the National Park System and describes several recent advances that will allow automated recording and analysis of long-term audio recordings covering days, weeks, and months at a time.

6610
A Binaural Model to Predict Position and Extension of Spatial Images Created with Standard Sound Recording Techniques
Braasch, Jonas
A binaural model was used to investigate different microphone techniques (Blumlein, ORTF, MS, spaced omni). In contrast to previous attempts, the model algorithm was not only designed to predict the position, but also the spatial extent of a reproduced spatial image. The architecture of the model was designed to optimally process binaural cue distributions with multiple peaks as often found in psychoacoustical data. The model also contains elements to simulate the precedence effect, which is required for analyzing spaced-microphone techniques, and is also useful when measuring the influence of the concert space on the recording.

6611
An Unsupervised Adaptive Filtering Approach of 2-to-5 Channel Upmix
Driessen, Peter F.; Li, Yan
A new algorithm for converting two-channel audio material to five channels, based on subband unsupervised adaptive filtering, is proposed in this paper. This algorithm uses a subband analysis-processing-synthesis framework. In each subband, a robust stereo image is obtained using principal component analysis, and an effective energy re-distribution among surround channels is achieved by mapping cross-correlation between two input channels to a weighted panning matrix.
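
The principal component analysis step can be sketched generically as follows. This is a broadband, whole-signal illustration under simple assumptions, not the subband adaptive scheme of the paper; the function names are hypothetical.

```python
import numpy as np

def principal_direction(left, right):
    """Unit vector of the dominant component of a two-channel signal."""
    stereo = np.vstack([left, right])
    covariance = stereo @ stereo.T / stereo.shape[1]
    # eigh returns eigenvalues in ascending order, so the last
    # eigenvector belongs to the largest eigenvalue.
    _, eigenvectors = np.linalg.eigh(covariance)
    primary = eigenvectors[:, -1]
    # Resolve the sign ambiguity so the weights are mostly positive.
    return primary if primary.sum() >= 0 else -primary

def split_dominant_residual(left, right):
    """Project the stereo pair onto the dominant direction and its remainder."""
    primary = principal_direction(left, right)
    stereo = np.vstack([left, right])
    dominant = primary @ stereo
    residual = stereo - np.outer(primary, dominant)
    return dominant, residual
```

For a single source panned with gains 0.8 (left) and 0.6 (right), the principal direction comes out as (0.8, 0.6) and the residual is essentially zero; with real material the residual carries the weakly correlated content that an upmixer might route to surround channels.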

6612
Investigation on the Related Effect Caused by ECM Miniaturization
Hu, Zongbao; Lee, Jun; Wu, Zonghan; Zhang, Tao
Miniaturization is a very important issue that must be considered during the design and production of electronic components. The miniaturization of the ECM (electret condenser microphone) requires corresponding changes to the ECM structure, dimensions, and circuit distribution. Such changes affect the specifications of the ECM. This paper discusses the related effects of ECM miniaturization, which is useful for the design of mini-microphones.

6613
On Amplitude Panning and Asymmetric Loudspeaker Set-Ups
van Leest, Arno J.
The aim of amplitude panning is to create a phantom sound source that is heard at a certain predetermined position by feeding a mono signal to several loudspeakers using particular weighting factors. Two models used for amplitude panning are particularly important: the velocity and energy models. A nice property of these models is that the position of a phantom sound source can be described as a linear combination of the vectors pointing towards the physical positions of the loudspeakers. In this paper, we describe how the coefficients of these linear combinations can be numerically computed by means of simple matrix algebra, such that they meet both the velocity and the energy model simultaneously for a given optimization criterion.
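
The two models can be illustrated with a small sketch. It assumes the commonly used Gerzon-style definitions, in which the velocity vector weights the loudspeaker direction vectors by the gains and the energy vector weights them by the squared gains; the paper's exact formulation and optimization criterion may differ, and the function names are hypothetical.

```python
import numpy as np

def unit_vectors(azimuths_deg):
    """2-D unit vectors pointing at loudspeakers (0 deg = front, positive = left)."""
    az = np.radians(np.asarray(azimuths_deg, dtype=float))
    return np.column_stack([np.cos(az), np.sin(az)])

def velocity_vector(gains, azimuths_deg):
    """Gain-weighted combination of the loudspeaker vectors."""
    g = np.asarray(gains, dtype=float)
    return g @ unit_vectors(azimuths_deg) / g.sum()

def energy_vector(gains, azimuths_deg):
    """Squared-gain-weighted combination of the loudspeaker vectors."""
    g2 = np.asarray(gains, dtype=float) ** 2
    return g2 @ unit_vectors(azimuths_deg) / g2.sum()

def direction_deg(vector):
    """Azimuth, in degrees, of a 2-D vector."""
    return np.degrees(np.arctan2(vector[1], vector[0]))
```

With equal gains on an asymmetric pair at +30 and -45 degrees, both vectors point to -7.5 degrees, the bisector of the pair. With unequal gains the two models generally predict different directions, which is why finding gains that satisfy both at once is posed as an optimization problem.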

6614
Phantom Audio Sources with Vertically Separated Speakers
Kyriakakis, Chris; Sundaram, Shiva
Multichannel auditory displays and Immersive Audio Systems are frequently integrated with video displays. Often, in these applications, the video display placement makes it difficult to place the centre speaker in front of the listener. A solution to this problem is to create a phantom centre channel in front of the listener using speakers placed elsewhere. Phantom sources can be created by amplitude-panned techniques in the horizontal plane. However, since it is practical and aesthetic to have speakers above or below the video display, we propose a technique to create a centre phantom at 0 degrees elevation and 0 degrees azimuth to the listener using two vertically separated speakers placed above and below the horizontal plane at 0 degrees azimuth. This technique can also be extended to create phantom sources in the median plane.

6615
Difference of the Sound Levels Among 15 Japanese Terrestrial and Digital Satellite TV Broadcasting Channels
Kamada, Takahiro; Miyasaka, Eiichi
Towards better sound services for elderly listeners, we investigated, for the first time, the sound levels of 15 Japanese terrestrial analogue broadcasting channels and digital broadcasting channels including NHK and the Commercial broadcasting Bureaus. The results show that (1) the daily averaged sound level of the terrestrial channels was –6.1 dB, which is 4 dB higher than that of BS digital channels, (2) the maximum level difference of the averaged sound levels between main programs and advertisements inserted in the programs was 15 dB and (3) the average level difference was 2.9 dB. These results imply that there exist perceptually significant problems for elderly viewers.

6616
Electroacoustic Analogy Analysis of Electret Condenser Microphones with Noise-Canceling Effects
Lee, Fang-Ching
An electroacoustic analogy is developed to analyze the open-circuit sensitivity, noise-canceling effect and frequency response of electret condenser microphones. In contrast to conventional models of electret condenser microphones in the literature, the present electroacoustic analogy analysis (E.A.A.) model details the open-circuit sensitivity, noise-canceling effect and frequency response of microphones. Two commercially available electret condenser microphones are analyzed to demonstrate the model. The calculated results for the microphones discussed are consistent with the measured data.

6617
Noise Shaping in Time-Domain Quantized LFM
Hawksford, Malcolm J.
SDM binary code can be derived from linear frequency modulation (LFM) using zero crossing detection and time-domain quantization. However, the jitter-like nature of this quantization process does not generally yield well-structured noise shaped characteristics compared to conventional time domain coders. The problem of improving the frequency domain performance of the error resulting from a LFM-based model is studied and further observations are made on the relationship between SDM and quantized LFM.

6618
Stereo and Multichannel Loudness Perception and Metering
Soulodre, Gilbert A.; Lavoie, Michel C.
Much research has been conducted in recent years on loudness perception and metering. This has been motivated by an ITU-R effort to identify a loudness measure suitable for broadcast applications. The initial ITU-R effort focussed exclusively on mono signals, and a simple loudness meter based on a weighted energy sum [Leq(RLB)] was shown to perform best. In the present study, the work on loudness perception was extended to the stereo and multichannel cases. Formal subjective tests were conducted using typical broadcast material to derive a subjective database to evaluate the performance of new loudness meters. The results were used to examine the requirements for a loudness meter for mono, stereo and multichannel signals.

6619
The Influence of Stereophony on the Restitution of Timbre by Loudspeakers
Guyot, Benjamin; Herzog, Philippe; Lavandier, Mathieu; Meunier, Sabine
In a previous study [1], the restitution of timbre by loudspeakers was evaluated by perceptual and physical measurements, in order to find the link between these two parallel approaches. An experimental protocol was built which consists in recording the sound radiated by single loudspeakers in a room, and then submitting the recorded sounds to listening tests under headphones. Even if the spatial dimension of sound reproduction is not investigated here, the stereophony might also influence the restitution of timbre by loudspeakers. A panel of loudspeakers was recorded in both monophonic and stereophonic reproductions. The recordings were submitted to listening tests in order to evaluate the influence of the stereophony on the perceived differences between loudspeakers. This influence turns out to be minor, and the perceptual spaces resulting from multidimensional scaling analysis of the perceived differences were not affected by the change of the reproduction configuration.

6620
Audio Processor ICs for Advanced TV
Mansson, Johan
A family of fully integrated advanced audio processors for advanced TV applications implemented in deep submicron complementary metal oxide semiconductor (CMOS) will be presented. These are mixed signal designs with integrated data converters and digital signal processing to enable a rich acoustic experience for cost-sensitive systems. Digital performance, in terms of speed and large memory, in conjunction with the low cost required by the application, calls for the use of a small-geometry CMOS process. It is technically challenging to design high performance converters on such sub-micron CMOS, particularly due to substrate noise and flicker noise. To overcome these issues, continuous-time converter architectures have been developed and designed.

6621
The Native B-Format Microphone
Benjamin, Eric; Chen, Thomas
Ambisonic sound recording is predicated on the acquisition of audio signals in what has been termed “B-format”, which is the output of a microphone array known as the Soundfield Microphone. That microphone is a tetrahedral array of nominally cardioid microphone capsules. The capsule signals are processed in such a way as to give four output signals which are proportional to the pressure and the three-dimensional particle velocity vector at the center of the array. These same signals can be achieved by the use of a single omnidirectional microphone and three figure-of-eight microphones located close to each other. Errors in the shape of the polar pattern or in the ratio of the direct to diffuse-field sensitivity of the microphones result. The two methods are compared in both a theoretical analysis and in acoustical measurements. The results of the comparison of experimental recordings using these two types of arrays will be presented in part two of this paper.
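
The capsule-to-B-format processing can be sketched under idealized assumptions: perfectly coincident, ideal cardioid capsules in the usual tetrahedral orientations, with no frequency-dependent correction for capsule spacing and no particular B-format gain convention. The names below are hypothetical, and a real Soundfield processor is more involved.

```python
import numpy as np

# Capsule orientations (x = front, y = left, z = up), before normalization.
CAPSULE_SIGNS = np.array([
    [+1, +1, +1],  # front-left-up
    [+1, -1, -1],  # front-right-down
    [-1, +1, -1],  # back-left-down
    [-1, -1, +1],  # back-right-up
], dtype=float)
CAPSULE_AXES = CAPSULE_SIGNS / np.sqrt(3.0)

def cardioid_capsule_gains(source_direction):
    """Ideal cardioid response of each capsule to a plane wave from a direction."""
    d = np.asarray(source_direction, dtype=float)
    d = d / np.linalg.norm(d)
    return 0.5 * (1.0 + CAPSULE_AXES @ d)

def a_to_b_format(capsules):
    """Sum and difference matrix: pressure-like W and velocity-like (X, Y, Z)."""
    a = np.asarray(capsules, dtype=float)  # shape (4,) or (4, num_samples)
    w = a.sum(axis=0)
    xyz = CAPSULE_SIGNS.T @ a
    return w, xyz

# A plane wave from direction (1, 2, 2)/3: the ratio sqrt(3) * XYZ / W
# recovers the direction under these idealized assumptions.
w, xyz = a_to_b_format(cardioid_capsule_gains([1.0, 2.0, 2.0]))
estimated_direction = np.sqrt(3.0) * xyz / w
```

In this idealized model W is independent of source direction while X, Y and Z scale with the direction cosines, which is the behaviour a single omnidirectional microphone plus three figure-of-eight microphones would provide directly.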

6622
A Web Search Engine for Sound Effects
Bailey, Stephen V.; Rice, Stephen V.
FindSounds.com is the first Web search engine for sound effects. Queries are processed using a selective index of Web audio files that includes sound effects and musical instrument samples but excludes song and speech recordings. A text search retrieves audio files based on how they are labelled, and a “sounds-like” search locates audio files based on sound similarity. Each month FindSounds.com processes more than 1.5 million queries for more than 150,000 Internet users.

6623
Alternative Approaches for Recording Surround Sound
Preston, Colin
The aim of this paper is to describe a series of adaptations of stereo microphone techniques for surround sound. The adaptations provide variable microphone polar patterns, multiple outputs for surround sound, as well as retaining a conventional 2 channel stereo output. The system enables the recording engineer to position and adjust the microphone cluster for a conventional stereo output and then derive the surround sound outputs. It is also possible to record the microphone outputs separately, and create the desired polar patterns in postproduction. The technique also has creative applications for multi-microphone or multi-track situations where a number of single microphones would be used.

6624
Microphones, High Wind and Rain
Brixen, Eddy B.
In outdoor-recording, high wind and rain are generally a problem as this causes unwanted noise in the microphone signal. In order to prevent noise, the microphones can be protected from the weather by using windjammers, windscreens, etc. However, the effectiveness of these products varies significantly and full specifications characterizing maximum wind speed permitted, wind attenuation, spectral damping, influence by rain etc., are seldom given. In this paper an overview of the problems involved in outdoor recording is presented. It also presents measurement methods that provide more informative and useful specifications.

6625
Automatic Evaluation of Musical Sound Separation Quality
Dziubinski, Marek; Kostek, Bozena
This paper addresses the problem of evaluating the effectiveness of musical sound separation algorithms. A standardized procedure for evaluating separation quality does not exist. The most convincing and typical way to do this is by carrying out subjective listening tests. However, subjective tests need solid statistical validation, which means that many experts should take part in such tests and the room characteristics should be adequate; equally important, such tests are time consuming. Thus this paper attempts to show that it is possible to carry out the evaluation tests in an automatic way, by employing an Artificial Neural Network (ANN), which is further justified by experts’ opinion.

6626
Internet-Based Automatic Hearing Assessment System
Czyzewski, Andrzej; Kostek, Bozena; Skarzynski, Henryk
This paper describes an Internet-based system that allows for automatic testing of hearing. Hearing impairment is one of the fastest growing diseases of modern society. Therefore it is very important to organize mass screening tests to identify people suffering from this kind of impairment. The described application provides a test that uses automatic questionnaire analysis, standardized audiometric tone test procedures, and assessment of speech intelligibility in noise. When all the testing is completed, the system automatically analyzes the results for each person examined. Based on the number of incorrect answers, the decision is made automatically by the expert system. Persons whose hearing impairment is confirmed are referred to treatment in rehabilitation centers. All these centers are connected via the Internet and are provided with special distributed database access allowing them to automatically register and track the patients discovered during the remote screening.

6627
Constructing Individual and Group Timbre Spaces for Sharpness-Matched Distorted Guitar Timbres
Martens, William L.; Marui, Atsushi
In a previous study on predicting timbral variation resulting from distortion-based effects processing, two types of direct ratings were collected from a large group of naïve subjects (approximately 50) for a set of sharpness-matched guitar timbres; first on a dissimilarity scale and then on 11 bipolar adjective scales. For the current study, similar data were collected from five trained subjects to allow for a comparison between results derived for each of the trained subjects and the previously reported group results. To investigate differences in how individuals might describe these distorted guitar timbres, the trained subjects’ adjective ratings were submitted separately to analysis using a method inspired by Repertory Grid Technique (RGT), as well as being submitted for MultiDimensional Unfolding (MDU) analyses.

6628
Physiological and Content Considerations for a Second Low Frequency Channel for Bass Management, Subwoofers, and LFE
Miller III, Robert (Robin)
By convention, frequencies below 90Hz produce no interaural cues useful for spatial sound or localization. Yet some claim they are able to hear a difference between a single subwoofer channel (whether or not to more than one subwoofer) and two (“stereo bass”). Reported research supports the Jeffress model of interaural time difference (ITD) determination in brain structures, and extending the accepted lower frequency limit of interaural phase difference (IPD). Meanwhile, uncorrelated very low frequencies (VLF<100Hz) exist in nearly all existing multi-channel music and movie content. The audibility, recording, and reproduction of uncorrelated VLF are explored in theory and experiments.

6629
Individual Vocabulary Profiling of Spatial Enhancement Systems for Stereo Headphone Reproduction
Lorho, Gaëtan
This paper presents an audio descriptive analysis experiment employing an individual vocabulary development approach. The aim of the study was to compare the perceptual characteristics of spatial enhancement systems for stereo headphone reproduction. Five musical programs were selected and processed with two subsets of algorithms representing different approaches to spatial enhancement for headphones, including stereo enhancement systems and Virtual Home Theatre systems for headphone reproduction. Ten listeners were selected based on their discriminative and descriptive skills. Each subject developed his or her own set of attributes in three hours and performed a comparative evaluation of seven series of eight stimuli. The methods employed for the descriptive analysis process and for the analysis of this individual vocabulary profiling data are presented and some results from the perceptual evaluation are reported.

6630
Assessing the Suitability of Digital Signal Processing as Applied to Performance Audio Such as In-Ear Monitoring Systems
Armstrong, Steve; Gordon, Keith; Rule, Betty
In the sound reinforcement field, current in-ear monitor (IEM) systems provide a number of benefits over floor wedges including hearing protection, reduced stage volume and improved coverage. However, new problems arise from occlusion caused by the tight earbud seal while old problems such as lack of personalization still remain. By applying digital signal processing (DSP) derived from the current state of the art in the hearing aid (HA) industry, these problems can be overcome. DSP is applied to both ambient microphones located at the users’ ears and the monitor audio feed. It provides multi-band parametric equalization, compression and limiting for each feed and ear separately, allowing for precise tailoring of the sound, including compensation for hearing loss.

6631
New Data Format to Describe Complex Sound Sources
Ahnert, Wolfgang; Bock, Steffen; Feistel, Stefan
Motivated by the evolution of modern sound systems and the resulting need to describe them formally, a new data format was developed to contain the mechanical, electronic and acoustic properties of complex sound sources, such as loudspeaker clusters, column speakers or line arrays. The GLL (Generic Loudspeaker Library) format can utilize measurement data such as impulse responses or transfer functions directly, or include already post-processed fractional-octave data. To conveniently manage the amount of data involved, a specialized storage algorithm was developed. Furthermore, freely available software created during the research work is presented. It allows for utilizing these import and export functions as well as for investigating the data. As a part of the EASE acoustic simulation program, the software will allow loudspeaker manufacturers and users of acoustic prediction software to create and exchange complex loudspeaker data in a high-resolution format.

6632
The Significance of Phase Data for the Acoustic Prediction of Combinations of Sound Sources
Ahnert, Wolfgang; Feistel, Stefan
To date, only a few acoustic prediction software packages utilize complex directivity data to characterize sound sources, such as line arrays. This work gives some theoretical background on the significance of phase data for the prediction of combinations of coherent sound sources. A mathematical model is introduced that allows evaluating the error ranges for several loudspeaker measurement methods. It is shown that, in contrast to what one would naively expect, the choice of the reference point for measuring complex data is rather irrelevant within given limits. These limits are derived based on the propagation equation for spherical waves. It is further shown analytically that the use of phase data reduces the measurement error to be expected by at least an order of magnitude.

6633
Simulation, Auralization and Their Verification of Acoustic Parameters Using Line Arrays
Ahnert, Wolfgang; Feistel, Stefan
An existing room with installed line arrays was investigated by means of corresponding measurements. Furthermore in EASE4.1 a simulation of direct sound coverage as well as of the room impulse response has been done, to derive as a result sound levels and intelligibility measures. Here different approaches have been utilized including a statistic field approximation and the room acoustic algorithms of EASE AURA. Additionally the same acoustic parameters have been measured using the measurement software EASERA. The comparison of measured and simulated results showed a good correlation. To additionally confirm the similarity, a comparison of auralized files was performed which showed deviations audible only for experts.