AES 110th Convention: Session M

Session M Tuesday, May 15 8:30 - 12:30 hr Room B

Signal Processing for Audio, Part 2

Chair: John Mourjopoulos, University of Patras, Patras, Greece

8:30 hr M-1
Neural Networks Applied to Sound Localization Detection
Andrzej Czyzewski, Bozena Kostek & Rafal Krolikowski
Technical University of Gdansk, Gdansk, Poland

The primary aim of this paper is to show that a neural network trained for the purpose can localize the direction of an incoming acoustical signal. Consequently, an automatically localized acoustical signal may be attenuated if it obscures the desired target sound. A set of parameters was formulated in order to localize the target source and unwanted signals. A neural network-based system was designed and implemented to process acoustical signals arriving from various directions at the same time. The feature extraction method is discussed thoroughly, the training process is described, and recently obtained results are presented.
Paper 5375

9:00 hr M-2
MPEG-2 AAC Multi-Channel Real-Time Implementation on a Single Floating Point DSP
Wolfgang Fiesel, Harald Gernhardt, Doris Huhn & Nikolaus Rettelbach
Fraunhofer IIS, Erlangen, Germany

The recently introduced ISO/MPEG-2 Advanced Audio Coding (AAC) technology provides a powerful framework which covers almost any application from simple monophonic compression to full-featured multi-channel coding. This paper discusses the approaches taken and the results obtained in implementing a 5.1 AAC multi-channel encoder on a high-performance floating-point catalog DSP platform. Based on a new AAC coding strategy, the implementation yields a very efficient encoder on a single DSP, enabling cost-effective, high-quality multi-channel encoding for consumer-type applications as well.
Paper 5376

9:30 hr M-3
DSD-Wide: A Practical Implementation for Professional Audio
Peter Eastty, Peter Thorpe, Nathan Bentall, Gary Cook, Chris Gerard, Chris Sleight & Mike Smith
Sony Pro-Audio R&D, Oxford, UK

This paper presents practical recipes for the processing of DSD-Wide [64FS 8-bit] signals which are fully compatible with the DSD [64FS 1-bit] signals used by the SACD consumer audio format. The designs are presented in a schematic form compatible with implementation by interested engineers in either FPGA or (with some modification) by traditional DSP methods. This is intended to open up the processing of such Super High Fidelity signals to a wider audience.
Paper 5377

10:00 hr M-4
Kautz Filters and Generalized Frequency Resolution - Theory and Audio Applications
Tuomas Paatero & Matti Karjalainen
Helsinki University of Technology, Espoo, Finland

Frequency-warped filters have recently been applied successfully to a number of audio applications. Replacing the unit delays in digital filters with all-pass delay elements allows enhanced frequency resolution to be focused on the lowest (or highest) frequencies and enables a good match to the psycho-acoustic Bark scale. Kautz filters can be seen as a further generalization in which each transversal element may be different, including complex conjugate poles. This enables arbitrary allocation of frequency resolution in filter design tasks such as modeling and equalization (inverse modeling) of linear systems. In this paper we formulate strategies for using Kautz filters in audio applications. Case studies of loudspeaker equalization, room response modeling, and guitar body modeling for sound synthesis are presented.
Paper 5378
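
The Kautz structure described in the abstract of M-4 can be illustrated with a minimal sketch. It is restricted to real poles (the abstract also mentions complex conjugate poles, which are not covered here), and the function names and pole values are assumptions chosen for illustration, not taken from the paper. Each tap is a normalized one-pole filter fed by a chain of first-order all-pass sections, one section per earlier pole. Choosing the same pole everywhere gives a warped (Laguerre-type) structure, while distinct poles allocate resolution individually.

```python
import math


def kautz_impulse_responses(poles, n_samples):
    """Return the impulse response of each tap of a real-pole Kautz filter.

    Tap i has transfer function
        sqrt(1 - p_i^2) / (1 - p_i z^-1) * prod_{j<i} (z^-1 - p_j) / (1 - p_j z^-1)
    """
    for p in poles:
        if not -1.0 < p < 1.0:
            raise ValueError("each pole must lie strictly inside the unit circle")

    signal = [1.0] + [0.0] * (n_samples - 1)  # unit impulse entering the chain
    responses = []
    for p in poles:
        # Tap output: normalized one-pole low-order section.
        scale = math.sqrt(1.0 - p * p)
        tap, state = [], 0.0
        for x in signal:
            state = scale * x + p * state
            tap.append(state)
        responses.append(tap)

        # All-pass section that feeds the next stage:
        # y[n] = -p x[n] + x[n-1] + p y[n-1]
        out, x_prev, y_prev = [], 0.0, 0.0
        for x in signal:
            y = -p * x + x_prev + p * y_prev
            out.append(y)
            x_prev, y_prev = x, y
        signal = out
    return responses


def inner_product(a, b):
    """Plain dot product of two equally long sequences."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical pole set; the Gram matrix of the tap responses should be
# close to the identity because the tap functions are orthonormal.
taps = kautz_impulse_responses([0.5, 0.8, -0.3], 600)
gram = [[inner_product(a, b) for b in taps] for a in taps]
```

Because the tap responses are orthonormal, the weights that model a target impulse response in the least-squares sense are simply the inner products of the target with each tap response, which is one reason this structure is convenient for modeling and equalization.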

10:30 hr M-5
Segmentation of Musical Signals Using Hidden Markov Models
Jean-Julien Aucouturier & Mark Sandler
King's College, London, UK

In this paper, we present a segmentation algorithm for acoustic musical signals, using a hidden Markov model. Through unsupervised learning, we discover regions in the music that present steady statistical properties: textures. We investigate different front-ends for the system, and compare their performance. We then show that the obtained segmentation often reflects a structure explained by musicology: chorus and verse, different instrumental sections, etc. Finally, we discuss the necessity of the HMM and conclude that an efficient segmentation of music is more than a static clustering and should make use of the dynamics of the data.
Paper 5379
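
The closing point of the M-5 abstract, that segmentation should use the dynamics of the data rather than static clustering, can be illustrated with a small Viterbi decoding sketch. Everything here is an assumption made for illustration: the model parameters are fixed by hand (the paper learns its model without supervision), and discrete symbols 0 and 1 stand in for whatever features the front-end produces. With "sticky" transition probabilities, isolated outlier frames are absorbed into the surrounding segment instead of creating spurious boundaries.

```python
import math


def viterbi(observations, start_prob, trans_prob, emit_prob):
    """Most likely state sequence of a discrete HMM (all probabilities nonzero)."""
    n_states = len(start_prob)
    score = [
        math.log(start_prob[s]) + math.log(emit_prob[s][observations[0]])
        for s in range(n_states)
    ]
    backptr = []
    for obs in observations[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor state for arriving in state s at this frame.
            best_prev = max(
                range(n_states),
                key=lambda r: score[r] + math.log(trans_prob[r][s]),
            )
            new_score.append(
                score[best_prev]
                + math.log(trans_prob[best_prev][s])
                + math.log(emit_prob[s][obs])
            )
            pointers.append(best_prev)
        score = new_score
        backptr.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backptr):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def segments(path):
    """Collapse a state path into (first_frame, last_frame, state) runs."""
    runs, start = [], 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            runs.append((start, i - 1, path[start]))
            start = i
    return runs


# Hypothetical two-texture model with a strong preference for staying put.
start = [0.5, 0.5]
trans = [[0.95, 0.05], [0.05, 0.95]]
emit = [[0.8, 0.2], [0.2, 0.8]]
frames = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]  # frame-wise labels with two outliers

path = viterbi(frames, start, trans, emit)
print(segments(path))  # prints [(0, 4, 0), (5, 9, 1)]
```

Labelling each frame independently would split this sequence into six short runs, whereas the decoded path yields two segments.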

11:00 hr M-6
AudioID: Towards Content-Based Identification of Audio Material
Eric Allamanche, Jürgen Herre, Oliver Hellmuth, Bernhard Fröba & Markus Cremer
Fraunhofer IIS, Erlangen, Germany

Fuelled by the digital revolution, efficient data reduction schemes and the breakthrough of the Internet, an ever-increasing amount of audio material has recently become available in digital format. Efficient handling and identification of these items are becoming extremely important for managing such amounts of content. This paper describes a prototype system for content-based identification of audio material based on a database of registered works. The technical approach is outlined and the system's current performance and the range of possible applications are discussed.
Paper 5380

11:30 hr M-7
Time-Variant Orthogonal Matrix Feedback Delay Network Reverberator
Shreyas Paranjpe
Parthus Technologies, San Jose, CA, USA

We developed an artificial reverberation device based on a novel, time-variant orthogonal matrix feedback delay network topology. Our novel topology uses multiple, time-variant output taps for each delay line, and therefore simultaneously reduces the amount of delay memory required without introducing coloration, and increases the echo density. Furthermore, the system is guaranteed stable provided that certain constraints on the delay line lengths and tap weights are fulfilled. Our implementation on a 24-bit digital signal processor requires only 16384 words of delay line memory for a four channel input / four channel output reverb.
Paper 5381
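
For readers unfamiliar with the building block named in the M-7 abstract, the following is a minimal sketch of a plain, static feedback delay network with an orthogonal feedback matrix. It does not implement the paper's time-variant topology or its multiple output taps per delay line. The delay lengths, the loop gain, and the choice of a scaled 4x4 Hadamard matrix are assumptions made for illustration.

```python
def fdn_impulse_response(delays, gain, n_samples):
    """Impulse response of a 4-line feedback delay network.

    The feedback matrix is a 4x4 Hadamard matrix scaled by 1/2, which is
    orthogonal, so a loop gain below 1 keeps the network stable.
    """
    if len(delays) != 4:
        raise ValueError("this sketch is written for exactly four delay lines")
    if not 0.0 <= gain < 1.0:
        raise ValueError("loop gain must satisfy 0 <= gain < 1")

    hadamard = [
        [1, 1, 1, 1],
        [1, -1, 1, -1],
        [1, 1, -1, -1],
        [1, -1, -1, 1],
    ]
    feedback = [[0.5 * v for v in row] for row in hadamard]

    lines = [[0.0] * d for d in delays]  # one circular buffer per delay line
    index = [0] * 4
    output = []
    for t in range(n_samples):
        x = 1.0 if t == 0 else 0.0
        # The slot about to be overwritten holds the sample written
        # delays[i] steps ago, i.e. the delayed output of line i.
        taps = [lines[i][index[i]] for i in range(4)]
        output.append(sum(taps))
        for i in range(4):
            mixed = sum(feedback[i][j] * taps[j] for j in range(4))
            lines[i][index[i]] = x + gain * mixed
            index[i] = (index[i] + 1) % delays[i]
    return output


# Hypothetical, mutually prime delay lengths (in samples).
response = fdn_impulse_response([7, 11, 13, 17], 0.9, 5000)
```

The first echo appears after the shortest delay, and the orthogonal mixing spreads each echo across all four lines, so the echo density grows with time while the loop gain sets the decay rate.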

12:00 hr M-8
Broad-Band Acoustic Noise Reduction using a Novel Frequency Depended Parametric Wiener Filter - Implementations using Filter-Bank, STFT and Wavelet Analysis/Synthesis Techniques
George Kalliris, George Papanikolaou & Charalampos Dimoulas
Aristotle University of Thessaloniki, Thessaloniki, Greece

Equivalent masking noise estimation could be introduced into conventional broad-band acoustic noise reduction to provide a new class of modified techniques. The psycho-acoustical facts exploited in this paper result in a frequency-dependent parametric Wiener filter. A discussion of classical spectral subtraction, and a proof of its equivalence to Wiener filtering under certain conditions, is given first. The concept of the parametric Wiener filter is then examined, and a frequency dependence based on the model of pure tones masked by broad-band noise is introduced. Finally, filter-bank, STFT and wavelet implementations of the new approach are compared to classical spectral subtraction for background noise reduction in old 78 rpm music disc recordings and noisy speech tape recordings.
Paper 5382
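
The relationship between spectral subtraction and Wiener filtering mentioned in the M-8 abstract can be sketched with one common textbook form of the parametric Wiener gain, G = (S / (S + alpha * N)) ** beta, where S is the estimated clean-signal power and N the noise power in one frequency bin. This form, the function names, and the per-bin alpha values below are assumptions for illustration. The paper's own masking-based rule for the frequency dependence is not given in the abstract and is not reproduced here.

```python
import math


def parametric_wiener_gain(noisy_power, noise_power, alpha=1.0, beta=1.0):
    """Per-bin gain (S / (S + alpha * N)) ** beta with S estimated as max(X - N, 0)."""
    clean_est = max(noisy_power - noise_power, 0.0)
    denom = clean_est + alpha * noise_power
    if denom <= 0.0:
        return 0.0
    return (clean_est / denom) ** beta


def power_spectral_subtraction_gain(noisy_power, noise_power):
    """Classical power spectral subtraction expressed as a gain: sqrt(1 - N / X)."""
    return math.sqrt(max(1.0 - noise_power / noisy_power, 0.0))


# With alpha = 1 and beta = 1/2 the two gains coincide, because
# S + N = X when S is estimated as X - N.
g_wiener = parametric_wiener_gain(4.0, 1.0, alpha=1.0, beta=0.5)
g_subtract = power_spectral_subtraction_gain(4.0, 1.0)

# A frequency-dependent variant simply uses a different alpha in each bin.
# These alpha values are hypothetical placeholders.
noisy_spectrum = [4.0, 3.0, 2.5]
noise_spectrum = [1.0, 1.0, 1.0]
alphas = [0.5, 1.0, 2.0]
gains = [
    parametric_wiener_gain(x, n, alpha=a)
    for x, n, a in zip(noisy_spectrum, noise_spectrum, alphas)
]
```

A larger alpha suppresses a bin more aggressively, so a frequency-dependent alpha lets the amount of noise reduction vary across the spectrum.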
