AES Conventions and Conferences



Session Z3 Sunday, May 9 09:30 h–11:00 h
Posters: Signal Processing & Audio in Broadcasting

Signal Processing
Z3-1 Efficient Arbitrary Sample Rate Conversion Using Zero Phase IIR Filters—Seyed Ali Azizi, Harman/Becker Automotive Systems, Ittersbach, Germany
Modern asynchronous sample rate converters (ASRCs) are composed of an interpolation filter that increases the sample rate by an integer factor, followed by a polynomial interpolator that produces the desired output samples at arbitrary output sampling instants. A crucial feature determining the precision of an ASRC is the phase linearity of the interpolation filter in use. This is the main reason why easily realizable linear phase FIR filters have traditionally been employed as interpolation filters rather than the more economical IIR filters, which suffer from inherent phase nonlinearity. This paper introduces a novel ASRC design approach that uses the zero phase IIR filtering concept to produce highly efficient, linear phase IIR interpolation filters for use in ASRCs. The basic concept is explained and the functions of the units involved are investigated.
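The zero phase IIR idea itself can be illustrated offline by forward-backward filtering, where the time-reversed second pass cancels the phase response of the first; the paper's ASRC presumably realizes this in a streaming-friendly form, but a minimal sketch of the underlying principle (toy one-pole filter, illustrative only) is:

```python
def one_pole_lowpass(x, a):
    """Causal one-pole IIR low-pass: y[n] = (1 - a) * x[n] + a * y[n-1]."""
    y, state = [], 0.0
    for s in x:
        state = (1.0 - a) * s + a * state
        y.append(state)
    return y

def zero_phase_filter(x, a):
    """Forward-backward filtering: the phase responses of the two passes
    cancel, leaving zero phase and the squared magnitude response."""
    forward = one_pole_lowpass(x, a)
    backward = one_pole_lowpass(forward[::-1], a)
    return backward[::-1]
```

Because the net phase is zero, a centered impulse comes out as a symmetric two-sided response instead of a delayed, skewed one.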
Z3-2 A Study on Implementing Switching Transfer Functions Focusing on Wave Discontinuity—Akihiro Kudo, Haruhide Hokari, Shoji Shimada, Nagaoka University of Technology, Niigata, Japan
Many papers have described moving sound image localization schemes that use loudspeakers or headphones. Most of these schemes are based on switching spatial transfer functions, so a wave discontinuity occurs at the moment of switching, which degrades the sound quality. While the characteristics of the wave discontinuity depend on the moving sound image localization scheme, no paper appears to have considered the relationship between the wave discontinuity and the scheme used. To rectify this omission, this paper examines three approaches: the simple switching approach, the overlap-add approach, and the fade-in-fade-out approach. We assess the sound degradation caused by wave discontinuity and use the objective measure of spectrum distortion width to quantify it. We also carry out paired comparison tests as subjective assessments. Both assessments verify that the third approach is the best of the three.
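As an illustration of the third approach, a fade-in-fade-out switch can be sketched as a cross-fade between the signal rendered with the old transfer function and the signal rendered with the new one; the linear ramp and fade length below are assumptions, not the paper's parameters:

```python
def fade_switch(old_out, new_out, fade_len):
    """Fade-in/fade-out switching: cross-fade from the output rendered
    with the old transfer function to the one rendered with the new
    transfer function, avoiding the discontinuity of a hard switch."""
    assert len(old_out) == len(new_out) >= fade_len
    out = []
    for n, (a, b) in enumerate(zip(old_out, new_out)):
        if n < fade_len:
            g = n / fade_len          # linear fade gain, 0 -> 1
            out.append((1.0 - g) * a + g * b)
        else:
            out.append(b)
    return out
```

The simple switching approach corresponds to `fade_len = 0`, which is exactly where the wave discontinuity appears.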
Z3-3 Warped DFT Based Perceptual Noise Reduction System—Alexander Petrovsky, Marek Parfieniuk, Adam Borowicz, Bialystok Technical University, Bialystok, Poland
This paper considers a novel application of the Warped Discrete Fourier Transform in a single channel noise reduction system. Namely, the WDFT is simultaneously the basis for the spectral weighting and the psychoacoustic model, thus allowing the overall system to operate strictly in a critical band domain. The warped transform allows nonuniform allocation of the z-transform frequency samples in accordance with the Bark scale. Thus, the psychoacoustic modeling is more accurate than in DFT-based solutions, and the subjective quality of the enhanced speech increases. The noise suppression algorithm draws on many of the most advanced current ideas in perceptually motivated spectral weighting. Its advantage is that the masking threshold is directly involved in the weighting rule.
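The frequency warping behind such systems is commonly realized with a first-order allpass substitution; a sketch using the widely cited closed-form fit for the Bark-scale warping coefficient (the paper's exact frequency allocation may differ):

```python
import math

def bark_warp_coefficient(fs_hz):
    """Allpass warping coefficient lambda that makes the warped frequency
    axis approximate the Bark scale at sample rate fs_hz (closed-form fit
    commonly attributed to Smith and Abel)."""
    return (1.0674 * math.sqrt((2.0 / math.pi)
                               * math.atan(0.06583 * fs_hz / 1000.0))
            - 0.1916)

def warped_frequency(omega, lam):
    """Phase of the allpass substitution z^-1 -> (z^-1 - lam)/(1 - lam z^-1):
    maps normalized frequency omega (rad/sample) to its warped value."""
    return omega + 2.0 * math.atan2(lam * math.sin(omega),
                                    1.0 - lam * math.cos(omega))
```

With `lam > 0`, uniformly spaced frequency samples are packed more densely at low frequencies, mimicking the nonuniform critical-band resolution of hearing.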
Z3-4 Digital Loudspeaker Arrays Driven by 1-Bit Signals—Nicolas-Alexander Tatlas, John Mourjopoulos, University of Patras, Patras, Greece
Loudspeaker arrays driven by digital bit streams are direct digital-signal to acoustic transducers, usually comprising a digital signal processing module with driving actuators. Current research efforts are focusing on topologies directly driven by multibit digital bit streams. In this paper these investigations are extended to 1-bit driving signals such as sigma-delta, analyzed in both the time and frequency domains. Simulation results will be presented for ideal actuators. Finally, an optimized architecture for such a loudspeaker will be proposed, based on this analysis.
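A first-order sigma-delta modulator, the kind of 1-bit signal referred to above, can be sketched as follows (a minimal textbook model, not the paper's driving scheme):

```python
def sigma_delta_1bit(x):
    """First-order sigma-delta modulator: converts samples in [-1, 1]
    into a +/-1 bit stream whose low-frequency content tracks the input,
    pushing the quantization noise toward high frequencies."""
    bits, acc, fb = [], 0.0, 0.0
    for s in x:
        acc += s - fb                      # integrate error vs. last output bit
        fb = 1.0 if acc >= 0.0 else -1.0   # 1-bit quantizer
        bits.append(fb)
    return bits
```

For a constant input, the density of +1 bits settles so that the stream's average equals the input value, which is why a low-pass element (here, the acoustic path and the ear) recovers the audio.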
Z3-5 Unsupervised Classification Techniques for Multipitch Estimation—Julie Rosier, Yves Grenier, ENST, Paris, France
In this paper we present a fast and efficient technique for multipitch estimation of musical signals. We deal with mixtures in which several instruments are present in a monophonic recording. The approach consists of clustering the spectral peaks of the mixture to obtain a spectral representation of each musical note. These spectra are then used to estimate the fundamental frequencies. We compare two techniques for the classification of the spectral peaks: a K-means procedure and a simpler aggregation technique associated with a criterion that represents the closeness to harmonicity of any pair of frequency peaks. This comparison is made on complex mixtures containing various musical instruments and on piano chord mixtures. The effectiveness of the two estimation methods is evaluated via pitch recognition rates and the mean estimated number of sources.
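The abstract does not spell out the aggregation criterion; a hypothetical minimal version, in which two peaks count as harmonically close when their frequency ratio is near an integer, might look like:

```python
def harmonicity_distance(f1, f2):
    """Distance of the frequency ratio of two spectral peaks from the
    nearest integer: 0 means one peak is an exact harmonic of the other."""
    lo, hi = sorted((f1, f2))
    ratio = hi / lo
    return abs(ratio - round(ratio))

def cluster_peaks(peaks, tol=0.05):
    """Greedy aggregation (hypothetical stand-in for the paper's scheme):
    assign each peak to the first cluster whose lowest peak it is
    harmonically close to, otherwise start a new cluster."""
    clusters = []
    for f in sorted(peaks):
        for c in clusters:
            if harmonicity_distance(c[0], f) < tol:
                c.append(f)
                break
        else:
            clusters.append([f])
    return clusters
```

Each resulting cluster approximates the partials of one note, from which a fundamental frequency can then be estimated; the tolerance `tol` is an illustrative choice.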
Z3-6 Speaker Array Calibration Using Inter-Speaker Range Measurements—Jeffrey Walters¹, Scott Wilson¹, Jonathan Abel²
¹Stanford University, Stanford, CA, USA
²Universal Audio, Inc., Santa Cruz, CA, USA
Given an array of speakers and a set of noisy inter-speaker range estimates, we consider the problem of estimating the relative positions of the array elements. A closed-form position estimator that minimizes an equation error norm is presented and shown to be related to a multidimensional scaling analysis. The information inequality is used to bound the position estimate mean square error and to gauge the accuracy of the closed-form estimator. A geometric interpretation of the variance bound is given and used in examining our simulation results.

Audio in Broadcasting
Z3-7 Loudness in TV Sound—Jean Paul Moerman, VRT, Belgian National Broadcaster for the Flemish Community, Brussels, Belgium
Nowadays, in a world of super-audio formats, loudness is one of the most important factors in giving an audience an informative and relaxed listening experience. When zapping through the channels, loudness differences are the rule rather than the exception. Even within a single broadcaster, levels are not consistent from one program to the next. Viewers are extremely annoyed and complaints are to be expected, yet no major improvement has been undertaken in the broadcast world. Surprisingly enough, the transition from analog to digital did not improve matters—on the contrary, it made them much worse!
The temptation to be the loudest is strong. Heavy compression techniques and the development of new signal processors have fed a culture of rivaling loudness. Louder attracts attention, but in the end the viewer will turn down the volume and discover a beaten, compressed, and uninteresting sound. A common answer to the loudness problem is to try to correct the level at the end of the production chain, but a single piece of equipment inserted right before transmission, a processor that supposedly solves all of the problems, cannot fix this. The result is a sound that actually causes listening fatigue. It should be clear that a more extensive solution is necessary.
Our solution was the installation of a broadcast processor in every facility unit within the VRT. The program is also processed just before transmission, per format: mono, NICAM stereo, and, recently, audio for DVB-T. Most important was not to forget the training of the technicians in every unit, such as postproduction, studio, OB facility, continuity, and transmission. Even the (non-sound-minded) editors who fill in all the production aspects in an off-line video facility need some guidance on how to judge loudness. The external production units for advertising trailers and programs should also be given the necessary information.
Z3-8 Audio Processing for Digital Broadcast Mediums—Frank Foti, Omnia Audio, Cleveland, OH, USA
Over the past few years, as development, testing, and rollout progressed for the HD Radio (IBOC), DAB, and DRM transmission systems, audio processing has been one of the key components augmenting this new technology. It became apparent that dynamics processing would figure in both the aural and technical performance of these new systems. It has already been shown that signal processing improves other bit-rate-reduced audio services, such as Internet audio streaming, especially at low bit rates. This paper will offer examples of proven methods that demonstrate the benefits of audio processing in digital broadcast systems. There are some important issues that must be considered, or digital radio's benefits will not be fully realized.


© 2004, Audio Engineering Society, Inc.