AES Amsterdam 2008: Poster Session P16

P16 - Analysis and Synthesis of Sound, Part 1

Monday, May 19, 09:30 — 11:00
P16-1 A Channel Vocoder Using Wavelet Packets over a Reconfigurable Device—César Daniel Salvador Castañeda, Pontificia Universidad Católica del Perú - Lima, Peru
A channel vocoder using wavelet packets for computer music applications is proposed. The inputs are a modulating signal, which is chosen to be voice, and a carrier signal, which can be music or noise. The Wavelet Packets Channel Vocoder transforms windowed frames of both signals to a symmetric multiresolution representation, mixes the envelope of the modulating signal with the carrier, and transforms the result back to the original domain. Simulations were run in Simulink. Real-time implementations are presented for Pure Data and a Xilinx Virtex II Pro FPGA. Appropriate choices of window, overlap, wavelets, decomposition levels, and envelope detector are presented to achieve different sound effects. Finally, new ideas for improving transmission and compression rates in future work are proposed.
Convention Paper 7416
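The signal flow described in the P16-1 abstract can be sketched for a single frame as follows. This is only an illustration under stated assumptions, not the authors' implementation: it uses the Haar wavelet, a full (symmetric) packet tree, and a per-band RMS value as the envelope detector, and it omits windowing and overlap. All function names are hypothetical, and the frame length must be divisible by 2**levels.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(x):
    """One Haar analysis step: (approximation, detail), each half as long as x."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Inverse of haar_step: interleave the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out

def wp_analyze(frame, levels):
    """Full wavelet packet tree: every band is split again at every level."""
    bands = [list(frame)]
    for _ in range(levels):
        split = []
        for band in bands:
            approx, detail = haar_step(band)
            split.extend([approx, detail])
        bands = split
    return bands

def wp_synthesize(bands):
    """Merge sibling bands pairwise until one full-length frame remains."""
    while len(bands) > 1:
        bands = [haar_inverse_step(bands[i], bands[i + 1])
                 for i in range(0, len(bands), 2)]
    return bands[0]

def envelope(band):
    """Assumed envelope detector: RMS of the band's coefficients."""
    return math.sqrt(sum(v * v for v in band) / len(band))

def vocode_frame(modulator, carrier, levels=3):
    """Scale each carrier band by the envelope of the matching modulator band."""
    mod_bands = wp_analyze(modulator, levels)
    car_bands = wp_analyze(carrier, levels)
    mixed = [[envelope(m) * v for v in c] for m, c in zip(mod_bands, car_bands)]
    return wp_synthesize(mixed)
```

With levels=3, a 16-sample frame is split into 8 bands of 2 coefficients each. A real-time version would process overlapping windowed frames and smooth the envelopes across frames.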

P16-2 The Effects of Lossy Audio Encoding on Genre Classification Tasks—Kurt Jacobson, Queen Mary University of London - London, UK; Ben Fields, Goldsmith's College, University of London - London, UK; Mark Sandler, Queen Mary, University of London - London, UK; Michael Casey, Goldsmith's College, University of London - London, UK
In large audio collections, it is common to store audio content using perceptual encoding. However, encoding parameters such as bit rate, sample rate, and codec may vary from collection to collection or even within a collection. We evaluate the effect of various lossy audio encodings on the application of audio spectrum projection features to automatic genre classification tasks. We show that decreases in mean classification accuracy, while small, are statistically significant for bit rates of 96 kbps or lower. In addition, a heterogeneous collection of audio encodings shows a statistically significant decrease in mean classification accuracy compared to a pure PCM collection.
Convention Paper 7417

P16-3 Loop Region Detection in Music Signals—Bee Suan Ong, Sebastian Streich, Centre for Advanced Sound Technologies, Yamaha Corporation - Japan
Spotting loops within a music recording seems to be an easy task for human listeners. Nevertheless, it becomes highly time-consuming and laborious when loop segments are to be identified in a large music collection. The process can be greatly facilitated by an audio editing tool that highlights regions where loops appear and suggests the corresponding loop durations. This paper proposes a method for computing both types of information from music signals. Our approach is based on identifying sequential and regular repetitions of tonal features. In addition, we present a prototype implementation featuring the proposed method to facilitate the audio browsing and searching process. Finally, we discuss other possible applications of this technology in the audio content description context.
Convention Paper 7418
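The idea of finding regular repetitions of tonal features (P16-3) can be illustrated with a lag-based similarity search. The sketch below is an assumption-laden stand-in, not the method of the paper: it takes one feature vector per frame (for example a chroma-like vector, which is an assumption), averages the cosine similarity between frames that are a fixed lag apart, and suggests the shortest lag with the highest average as the loop duration in frames. It does not locate loop regions. All names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def lag_similarity(features, lag):
    """Mean similarity between every frame and the frame `lag` steps later."""
    pairs = len(features) - lag
    return sum(cosine(features[t], features[t + lag]) for t in range(pairs)) / pairs

def suggest_loop_length(features, min_lag=1, max_lag=None):
    """Return (lag, score) for the shortest lag with the highest mean similarity."""
    if max_lag is None:
        max_lag = len(features) // 2
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = lag_similarity(features, lag)
        if score > best_score:  # strict comparison keeps the shortest lag on ties
            best_lag, best_score = lag, score
    return best_lag, best_score
```

For a sequence built by repeating a four-frame pattern, the suggested lag is 4 frames; multiples of the true period score equally well, which is why ties are resolved in favor of the shortest lag.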

P16-4 Music-Inspired Harmony Search Algorithm Applied to Feature Selection for Sound Classification in Hearing Aids—Javier Amor, Enrique Alexandre, Roberto Gil-Pita, Lorena Álvarez, Ester Huerta, Universidad de Alcalá - Alcalá, Spain
This paper explores the application of the music-inspired harmony search algorithm to the problem of feature selection for sound classification in digital hearing aids. This problem matters because of the strong computational constraints inherent to the DSPs used in modern digital hearing aids. The goal of the feature selection algorithm is to select a subset of features in order to reduce the computational complexity of the system while maintaining a low probability of error. A set of experiments will be performed to test the performance of the proposed system, using a total of 74 different features. The results will be compared with those obtained using other widely used algorithms, such as a genetic algorithm, a sequential search algorithm, or random search.
Convention Paper 7419
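A minimal sketch of harmony search applied to feature selection (P16-4) is given below, assuming a binary encoding in which each bit marks whether a feature is selected. The parameter names (hms for harmony memory size, hmcr for harmony memory considering rate, par for pitch adjusting rate) follow common usage for this algorithm, but their values here are arbitrary, and toy_cost is a hypothetical stand-in for the classifier's probability of error plus a complexity penalty. None of this is taken from the paper.

```python
import random

def toy_cost(mask):
    """Hypothetical cost: missed 'relevant' features plus a small size penalty."""
    relevant = (0, 3, 5)  # assumed ground truth, for illustration only
    missed = sum(1 for i in relevant if not mask[i])
    return missed + 0.1 * sum(mask)

def harmony_search(cost, n_features, hms=10, hmcr=0.9, par=0.1,
                   iterations=500, seed=0):
    """Binary harmony search; returns (best feature mask, its cost)."""
    rng = random.Random(seed)
    # Harmony memory: hms random feature masks and their costs.
    memory = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(hms)]
    costs = [cost(h) for h in memory]
    for _ in range(iterations):
        new = []
        for j in range(n_features):
            if rng.random() < hmcr:
                # Memory consideration: reuse bit j of a random stored harmony.
                bit = memory[rng.randrange(hms)][j]
                if rng.random() < par:
                    bit = 1 - bit  # pitch adjustment: flip the bit
            else:
                bit = rng.randint(0, 1)  # random improvisation
            new.append(bit)
        new_cost = cost(new)
        worst = max(range(hms), key=costs.__getitem__)
        if new_cost < costs[worst]:
            # The new harmony replaces the worst one in memory.
            memory[worst] = new
            costs[worst] = new_cost
    best = min(range(hms), key=costs.__getitem__)
    return memory[best], costs[best]
```

A call such as harmony_search(toy_cost, 8) returns a mask of eight bits and its cost. In a real system the cost function would train and evaluate the sound classifier on the selected subset, which dominates the run time.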

P16-5 Analysis of the Effects of Finite Precision in Sound Classifiers for Digital Hearing Aids—Ester Huerta, Enrique Alexandre, Roberto Gil-Pita, Lorena Álvarez, Javier Amor, Universidad de Alcalá - Alcalá de Henares, Madrid, Spain
This paper analyzes quantization effects in an automatic sound classification system for DSP-based hearing aids. The results obtained in this work will be used to assess the impact on hearing aid users of the finite precision imposed by the digital signal processor (DSP). The DSP has a finite word length that affects the main ability of these systems: automatic adaptation to the changing acoustic environment. The goal of this work is to model a quantized neural-network-based classifier in order to compare its probability of error with that of a classifier not subject to finite precision.
Convention Paper 7420
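The kind of comparison described in P16-5 can be illustrated on a much simpler model than the one in the paper. The sketch below assumes a signed fixed-point format with word_bits total bits, frac_bits of them fractional, and saturation on overflow. It quantizes only the parameters of a single linear unit (not the intermediate arithmetic, and not a full neural network) and reports how often the quantized unit disagrees with the full-precision one. All names and formats are hypothetical.

```python
def quantize(value, word_bits=16, frac_bits=12):
    """Round to the nearest representable fixed-point value, with saturation."""
    scale = 1 << frac_bits
    lowest = -(1 << (word_bits - 1))      # most negative integer code
    highest = (1 << (word_bits - 1)) - 1  # most positive integer code
    code = max(lowest, min(highest, round(value * scale)))
    return code / scale

def classify(weights, bias, x):
    """Two-class decision of a single linear unit: 1 if w.x + b >= 0, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias >= 0 else 0

def disagreement_rate(weights, bias, samples, word_bits, frac_bits):
    """Fraction of samples where quantized and full-precision decisions differ."""
    q_weights = [quantize(w, word_bits, frac_bits) for w in weights]
    q_bias = quantize(bias, word_bits, frac_bits)
    differing = sum(
        classify(weights, bias, x) != classify(q_weights, q_bias, x)
        for x in samples
    )
    return differing / len(samples)
```

For example, with 8-bit words and 6 fractional bits, 0.3 is stored as 19/64 = 0.296875, and any value above 127/64 saturates to 1.984375. Sweeping word_bits and frac_bits over a labeled test set would give a curve of error against word length.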

P16-6 A Constructive Algorithm for Multilayer Perceptrons for Speech/Non-Speech Classification in Hearing Aids—Lorena Álvarez, Enrique Alexandre, Raúl Vicen, Lucas Cuadra, Manuel Rosa, Universidad de Alcalá - Alcalá de Henares, Spain
Constructive learning algorithms offer an attractive approach for the incremental construction of near-minimal neural-network architectures for pattern classification. This paper explores the feasibility of using a constructive algorithm for multilayer perceptrons (MLPs) applied to the problem of speech/non-speech classification in hearing aids. When properly designed and trained, MLPs are able to generate an arbitrary classification frontier with relatively low computational complexity. The paper will focus on the design of a constructive algorithm for MLPs that attempts to converge to the minimum-complexity network for the given problem. The results will be compared with those obtained when the constructive algorithm is not used.
Convention Paper 7421

P16-7 Seeing the Inaudible. Descriptors Used for Generating Objective and Reproducible Data in Real-Time for Musical Instrument Playing Standard Situations—Tobias Grosshauser, Diemo Schwarz, Norbert Schnell, IRCAM - Paris, France
This paper describes a method to generate objective and reproducible data to assist instrument teaching and practicing. The method is based on audio descriptors and their efficient visualization, which assist in the perception of musical parameters that are difficult to hear. To aid comparison, we defined and recorded a comprehensive database of positive and negative sound examples from the violin that encompasses frequent mistakes made by students and a wide variety of playing styles.
Convention Paper 7422

Last Updated: 20080612, tendeloo
