AES London 2011
Paper Session P16

P16 - Audio Signal Processing and Analysis


Sunday, May 15, 14:00 — 17:30 (Room 1)

Chair: Jayant Datta

P16-1 A New Approach to Designing Decimation Filters for Oversampled A/D Converters
Jamie A. S. Angus, University of Salford - Salford, Greater Manchester, UK
This paper presents a new approach to designing the decimation filter required in oversampled noise-shaping analog-to-digital converters. The filters remain finite impulse response designs, but they are built from novel window functions that produce a stop-band attenuation that increases with frequency, matching the out-of-band noise characteristics of the noise-shaping modulator. As a result, high-quality decimation filters can be realized with shorter filter lengths.
Convention Paper 8414
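
As a rough illustration of the windowed FIR decimation that P16-1 builds on, the sketch below designs a low-pass filter by the window method and keeps every `factor`-th output sample. The paper's novel window functions are not given in the abstract, so a standard Hann window is swapped in here as an assumption. The function names and parameters are hypothetical.

```python
import math


def windowed_sinc_lowpass(num_taps, cutoff):
    """Low-pass FIR taps by the window method.

    cutoff is a fraction of the input sampling rate (0 < cutoff < 0.5).
    ASSUMPTION: a plain Hann window stands in for the paper's windows.
    """
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - mid
        # Ideal (infinite) low-pass impulse response, sampled at tap n.
        if t == 0:
            ideal = 2.0 * cutoff
        else:
            ideal = math.sin(2.0 * math.pi * cutoff * t) / (math.pi * t)
        # Hann window tapers the truncated response.
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    # Normalize so that the gain at DC is exactly one.
    return [c / total for c in taps]


def decimate(x, taps, factor):
    """Filter x with the FIR taps and keep every factor-th output sample."""
    out = []
    for n in range(0, len(x), factor):
        acc = 0.0
        for k, c in enumerate(taps):
            if 0 <= n - k < len(x):
                acc += c * x[n - k]
        out.append(acc)
    return out
```

Only the retained output samples are computed, which is the usual way to avoid wasted work in a decimating FIR filter. A window whose stop-band attenuation rises with frequency, as the abstract describes, would replace the Hann line.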

P16-2 Warped IIR Filter Design with Custom Warping Profiles and its Application to Room Response Modeling and Equalization
Balázs Bank, Budapest University of Technology and Economics - Budapest, Hungary
In traditional warped FIR and IIR filters, the frequency-warping profile is adjusted by a single free parameter, which limits how flexibly frequency resolution can be allocated. For example, it is not possible to achieve a truly logarithmic frequency resolution, which is often desired in audio applications. This paper presents a new approach to warped IIR filter design in which the filter specification is transformed by any desired (e.g., logarithmic) frequency transformation and a standard IIR filter is designed to this transformed specification. The poles and zeros of this transformed filter are then found and mapped back to the original frequency scale. Because mapping the poles and zeros back involves approximations, the resulting transfer function may deviate from its optimal version. This is resolved by an additional optimization of the zeros of the final filter. Examples of loudspeaker-room response modeling and equalization are presented.
Convention Paper 8415

P16-3 Computationally Efficient Nonlinear Chebyshev Models Using Common-Pole Parallel Filters with the Application to Loudspeaker Modeling
Balázs Bank, Budapest University of Technology and Economics - Budapest, Hungary
Many audio systems show some form of nonlinear behavior that has to be taken into account in modeling. A black-box model is often identified for this purpose because of the generality and simplicity of the approach. One such model is the simplified Volterra model, which uses parallel branches, each consisting of a polynomial-type nonlinearity and a linear filter in series. For example, Chebyshev models use Chebyshev polynomials as the nonlinear functions, which makes model identification from logarithmic sweep measurements very straightforward. This paper proposes a highly efficient implementation of Chebyshev models that uses fixed-pole parallel filters for the linear filtering part. The efficiency comes from the fact that parallel filters can have a logarithmic frequency resolution, which fits human hearing better than traditional FIR or IIR filters do. Moreover, the branches can share the same denominators, leading to an additional performance benefit. The proposed model is particularly well suited for the real-time digital simulation of loudspeakers and other weakly nonlinear devices, such as tube guitar amplifiers.
Convention Paper 8416
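
The branch structure described in P16-3 can be sketched as follows, under simplifying assumptions that are not taken from the paper: each pole is a real first-order section, branch k applies the Chebyshev polynomial of order k, and `gains[k][i]` (a hypothetical parameter layout) weights pole i in that branch. Because the filters are linear and the poles are shared, the weighted branch signals can be summed before the recursion, so each pole is updated once per sample rather than once per branch.

```python
def chebyshev_T(order, x):
    """Chebyshev polynomial of the first kind, T_order(x), via the recurrence
    T_{n+1}(x) = 2 x T_n(x) - T_{n-1}(x)."""
    if order == 0:
        return 1.0
    t_prev, t_cur = 1.0, x
    for _ in range(order - 1):
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
    return t_cur


def chebyshev_model(x, poles, gains):
    """Parallel Chebyshev branches that share one set of first-order poles.

    x     : input samples, assumed scaled to [-1, 1]
    poles : real pole positions, |p| < 1 (ASSUMPTION: first-order sections)
    gains : gains[k][i] weights pole i in the branch of order k + 1
    """
    state = [0.0] * len(poles)
    out = []
    for sample in x:
        # Static nonlinearity of every branch for this input sample.
        branch = [chebyshev_T(k + 1, sample) for k in range(len(gains))]
        y = 0.0
        for i, p in enumerate(poles):
            # Shared denominator: mix the branches first, then run one
            # recursion 1 / (1 - p z^-1) per pole.
            drive = sum(gains[k][i] * branch[k] for k in range(len(gains)))
            state[i] = drive + p * state[i]
            y += state[i]
        out.append(y)
    return out
```

With a single pole at 0.5 and only the first-order branch active, `chebyshev_model([1.0, 0.0, 0.0], [0.5], [[1.0]])` reduces to a one-pole linear filter and returns its impulse response `[1.0, 0.5, 0.25]`.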

P16-4 Event-Driven Real-Time Audio Processing with GPGPUs
Tiziano Leidi, Thierry Heeb, Marco Colla, ICIMSI-SUPSI - Manno, Switzerland; Jean-Philippe Thiran, EPFL - Lausanne, Switzerland
Development of real-time audio processing applications for GPGPUs is not without challenges. Parallel processing of audio signals is often constrained by serial dependencies within or between the algorithms. On GPGPUs, insufficient data pressure further limits the attainable performance improvements, as it leaves GPU cores idle. In this paper we analyze the limits of audio processing on GPGPUs and present an approach based on event-driven scheduling that maximizes data pressure to improve performance. We also present recent enhancements of Audio n-Genie, an open-source development environment for audio-processing applications. By combining Audio n-Genie and the proposed approach, we show that it is possible to increase the audio processing speed-up.
Convention Paper 8417

P16-5 A Comparison of Parametric Optimization Techniques for Tone Matching
Matthew Yee-King, Goldsmiths, University of London - London, UK; Martin Roth, Reality Jockey, Ltd. - London, UK
Parametric optimization techniques are compared in their ability to find parameter settings for sound synthesis algorithms that make them emit sounds as similar as possible to target sounds. A hill climber, a genetic algorithm, a neural net, and a data-driven approach are compared. The error metric used is the Euclidean distance in MFCC feature space. This metric is justified on the basis of its success in previous work. The genetic algorithm offers the best results with the FM and subtractive test synthesizers, but the hill climber and data-driven approach also offer strong performance. The concept of sound synthesis error surfaces, allowing the detailed description of sound synthesis space, is introduced. The error surface for an FM synthesizer is described, and suggestions are made as to the resolution required to represent these surfaces effectively. This information is used to inform future plans for algorithm improvements.
Convention Paper 8418
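
A minimal sketch of the hill climber named in P16-5, assuming synthesizer parameters normalized to [0, 1] and a Euclidean error in feature space. The abstract does not give the paper's own climber settings or feature pipeline, so the step size, iteration count, and the `toy_features` function (a hypothetical stand-in for "synthesize a sound, then compute its MFCCs") are illustrative only.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def hill_climb(features_of, target_features, start,
               step=0.05, iterations=2000, seed=0):
    """Perturb the parameters at random and keep only improving candidates."""
    rng = random.Random(seed)
    best = list(start)
    best_err = euclidean(features_of(best), target_features)
    for _ in range(iterations):
        # Gaussian perturbation, clipped to the normalized parameter range.
        cand = [min(1.0, max(0.0, p + rng.gauss(0.0, step))) for p in best]
        err = euclidean(features_of(cand), target_features)
        if err < best_err:
            best, best_err = cand, err
    return best, best_err


def toy_features(params):
    """HYPOTHETICAL placeholder for synthesis followed by MFCC extraction."""
    a, b = params
    return [a + b, a * b, a - b]
```

A typical call would be `hill_climb(toy_features, toy_features([0.3, 0.7]), [0.5, 0.5])`, which returns the best parameter vector found and its error. Because only improving candidates are accepted, the returned error never exceeds the error at the starting point, but the search can stall in a local minimum of the error surface.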

P16-6 On the Multichannel Sinusoidal Model for Coding Audio Object Signals
Toni Hirvonen, Institute of Computer Science, Foundation for Research and Technology–Hellas (FORTH-ICS), Heraklion - Crete, Greece (now with Dolby Laboratories, Stockholm, Sweden); Athanasios Mouchtaris, Institute of Computer Science, Foundation for Research and Technology–Hellas (FORTH-ICS), Heraklion - Crete, Greece, and University of Crete, Heraklion, Crete, Greece
This paper presents two improvements on a recently proposed multichannel sinusoidal modeling system for coding multiple audio object signals. The system includes extracting the sinusoidal components and an LPC envelope for each object signal, as well as transform coding of the residuals' downmix. The contributions of this paper are: (a) a psychoacoustic model for enabling the system to scale well with multiple object signals, and (b) an improved method to encode the common residual, tailored to the "white" nature of this signal. As a result, sound quality of around 90% on the MUSHRA scale is obtained for 10 simultaneous object signals coded with a total rate of 150 kbit/s, while retaining the individual object parametric representations.
Convention Paper 8419

P16-7 An Additive Synthesis Technique for Independent Modification of the Auditory Perceptions of Brightness and Warmth
Asteris Zacharakis, Joshua Reiss, Queen Mary University of London - London, UK
An algorithm was implemented that independently modifies two low-level features correlated with the auditory perceptions of brightness and warmth. The perceptual validity of the algorithm was tested through a series of listening tests in order to examine whether the low-level modification was indeed perceived as independent and to investigate the influence of the fundamental frequency on the perceived modification. A multidimensional scaling (MDS) analysis of listener responses to pairwise dissimilarity comparisons, accompanied by a verbal elicitation experiment, examined the perceptual significance and independence of the two low-level features chosen. This is a first step toward the future development of perceptually based control of an additive synthesizer.
Convention Paper 8420

