Last Updated: 20050816, mei
P9 - Posters: Miscellaneous -1
Saturday, October 8, 1:00 pm — 2:30 pm
P9-1 A Robust Partial Tracker for Analysis of Music Signals—Hamid Satar-Boroujeni, Bahram Shafai, Northeastern University - Boston, MA, USA
We propose a novel approach for tracking of partials in music signals based on a robust Kalman filter. This tracker is based on a regularized least-squares approach that is designed to minimize the worst-possible regularized residual norm over the class of admissible uncertainties at each iteration. We introduce a set of state-space models for our signals based on the evolution of frequency and amplitude in different classes of musical instruments. These prior models are used to estimate future values of partial tracks in successive time frames of our spectral data. Parameters of evolution models are treated as bounded uncertainties, and our tracker can robustly track both frequency and power partials in all frequency regions.
Convention Paper 6566
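The P9-1 abstract does not give enough detail to reproduce the robust filter. As a rough point of reference only, the sketch below uses a standard (non-robust) scalar Kalman filter with an assumed random-walk model for the frequency of one partial across successive time frames. The function name `track_partial` and the noise variances `q` and `r` are illustrative assumptions, not the authors' evolution models.

```python
def track_partial(measurements, q=1.0, r=25.0, p0=100.0):
    """Smooth per-frame frequency measurements (Hz) of a single partial.

    Assumed model (not from the paper):
        x[k] = x[k-1] + w,  w ~ N(0, q)   # frequency drifts as a random walk
        z[k] = x[k]   + v,  v ~ N(0, r)   # noisy spectral peak measurement
    """
    x = measurements[0]  # initial state estimate: first measured peak
    p = p0               # initial estimate variance
    estimates = []
    for z in measurements:
        # Predict: the state is unchanged, the uncertainty grows by q.
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + r)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Hypothetical peak frequencies (Hz) picked from successive spectral frames.
peaks = [440.0, 442.5, 438.0, 441.0, 439.5, 440.5]
smoothed = track_partial(peaks)
```

A robust variant of the kind the abstract describes would additionally treat the model parameters as bounded uncertainties; this sketch assumes they are known exactly.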
P9-2 Automatic Retrieval of Musical Rhythmic Patterns—Bozena Kostek, Gdansk University of Technology - Gdansk, Poland; Jaroslaw Wojcik, Wroclaw University of Technology - Wroclaw, Poland
Even though research in the Music Information Retrieval domain is well advanced, music search is still under development. Thanks to melody search methods applied in "query by humming" systems, users can retrieve melodies on the basis of an audio input. However, research on rhythm is not yet as advanced. This paper addresses automatic retrieval of rhythmic patterns based on a symbolic representation of music, employing repeating rhythmic and melodic patterns. In the experiments, the importance of the melorhythmic representation of a musical piece is verified and compared to the sound duration-based hypothesis ranking method. Since most of the musical files to be found on the Internet are polyphonic, the lowest or the highest sounds of the chords are also taken into consideration.
Convention Paper 6567
P9-3 A Spectrogram Display for Loudspeaker Transient Response—David Gunness, William Hoy, Eastern Acoustic Works, Inc. - Whitinsville, MA, USA
A spectrogram is a two-dimensional depiction of a waveform or transfer function in which frequency is depicted on one axis and time is depicted on the other. The level is plotted against frequency and time by using a color or gray scale. If the time resolution is constant, the display is usually referred to as a Fourier transform spectrogram. If the time resolution is scaled to the frequency, it is usually referred to as a wavelet transform spectrogram. In this paper we present a novel and efficient method for calculating a wavelet transform spectrogram, which is optimized for the analysis of loudspeaker transient response. The new method employs complex convolution of the frequency response, rather than explicit time domain windowing, or the wavelet transform.
Convention Paper 6568
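The P9-3 abstract contrasts explicit time-domain windowing with convolution of the frequency response. The identity behind that exchange is that multiplying two length-N sequences in time corresponds to circularly convolving their DFTs, scaled by 1/N. The sketch below checks this numerically with a naive DFT. It does not reproduce the authors' frequency-scaled window or their complex convolution kernel; the test signal, the window, and all function names are assumptions chosen for illustration.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def circular_convolve_scaled(a, b):
    """Circular convolution of two spectra, scaled by 1/N."""
    n = len(a)
    return [
        sum(a[m] * b[(k - m) % n] for m in range(n)) / n
        for k in range(n)
    ]


def max_difference(h, w):
    """Largest gap between the two routes to the windowed spectrum."""
    # Route 1: window the response in time, then transform.
    time_route = dft([hi * wi for hi, wi in zip(h, w)])
    # Route 2: transform both, then convolve in the frequency domain.
    freq_route = circular_convolve_scaled(dft(h), dft(w))
    return max(abs(p - q) for p, q in zip(time_route, freq_route))


N = 16
# Hypothetical impulse response: a decaying sinusoid.
h = [math.exp(-0.2 * t) * math.cos(2 * math.pi * 3 * t / N) for t in range(N)]
# Hypothetical time window: a periodic Hann window.
w = [0.5 - 0.5 * math.cos(2 * math.pi * t / N) for t in range(N)]

difference = max_difference(h, w)  # expected to be at rounding-error level
```

Because the window is applied through its transform, a different window can in principle be used at each analysis frequency, which is the property a frequency-scaled time resolution needs.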
P9-4 Quality Enhancement of Low Bit Rate MPEG1-Layer 3 Audio Based on Audio Resynthesis—Demetrios Cantzos, Chris Kyriakakis, University of Southern California - Los Angeles, CA, USA
One of the most popular audio compression formats is indisputably the MPEG1-Layer 3 format, which is based on the idea of transparent encoding at low bit rates. As these types of audio signals are starting to migrate from portable players with inexpensive headphones to higher quality home audio systems, it is becoming evident that higher bit rates may be required to maintain transparency. We propose a novel method that enhances low bit rate MP3 encoded audio segments by applying multichannel audio resynthesis methods in a post-processing stage or during decoding. Our algorithm employs the highly efficient Generalized Gaussian mixture model, which, combined with cepstral smoothing, leads to very low cepstral reconstruction errors. In addition, residual conversion is applied, which significantly improves the enhancement performance. The method presented can be easily generalized to include other audio formats for which sound quality is an issue.
Convention Paper 6569
P9-5 Obtaining 120-dB Performance Using Switching Power Supplies—Gregg Rouse, Larry Gaddy, AKM Semiconductors, Inc. - San Jose, CA, USA
There is a growing tendency to use switching power supplies to reduce costs. Many designers believe that switching supplies and high performance are mutually exclusive. With some careful design considerations, it is possible to optimize performance using switching power supplies. This paper will review the selection of an optimal switching frequency. The tradeoffs between switching frequency and efficiency, which are typical in high performance systems, will be examined. Measurement results using asynchronous audio switching clocks will be presented. Measurement results of synchronizing power supply switching to audio clocking multiples will demonstrate how to achieve 120-dB performance, the current high benchmark for professional audio systems.
Convention Paper 6570
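One way to see why P9-5 compares asynchronous switching with switching locked to audio clock multiples is a simplified aliasing calculation. The sketch assumes that any switching residue reaching a converter behaves like a tone sampled at the audio rate, so it folds to the distance from the nearest multiple of the sample rate. The function name and the example frequencies are assumptions; real results also depend on filtering, layout, and supply rejection, which this model ignores.

```python
def alias_frequency(f_switch_hz, f_sample_hz):
    """Frequency (Hz) at which a tone at f_switch_hz appears after sampling.

    Simplified model: the tone folds to its distance from the nearest
    integer multiple of the sample rate.
    """
    nearest_multiple = round(f_switch_hz / f_sample_hz) * f_sample_hz
    return abs(f_switch_hz - nearest_multiple)


# Synchronous case: 384 kHz is exactly 8 x 48 kHz, so the residue folds to DC.
synchronous = alias_frequency(384_000, 48_000)   # 0

# Asynchronous case: 390 kHz is 6 kHz away from 8 x 48 kHz,
# so the residue lands as a 6 kHz tone inside the audio band.
asynchronous = alias_frequency(390_000, 48_000)  # 6000
```

Under this model, a supply locked to a multiple of the audio clock puts its residue at 0 Hz, while a free-running supply can place a tone anywhere up to half the sample rate.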
P9-6 Influence of Artificial Mouth’s Directivity in Determining Speech Transmission Index—Fabio Bozzoli, Paolo Bilzi, Angelo Farina, University of Parma - Parma, Italy
In room acoustics, one of the most widely used parameters for evaluating speech intelligibility is the Speech Transmission Index (STI). The experimental evaluation of the STI generally employs an artificial speaker (artificial mouth) and listener (binaural head). In this paper, the influence of the artificial mouth’s emission directivity on the measurements was investigated for different acoustic environments. We found that in many cases (e.g., large rooms or telecommunication systems) the results are not sensitive to modifications of the directivity. Inside cars, by contrast, the shape of the whole directivity balloon is important for determining correct and comparable values, and the different mouths studied give markedly different results in the same situation.
Convention Paper 6571
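For readers unfamiliar with how an STI value in P9-6 is formed, the core step converts each measured modulation transfer value m into an apparent signal-to-noise ratio, limits it to the range -15 dB to +15 dB, and rescales it to a transmission index between 0 and 1. The sketch below shows that step with a plain unweighted average. The full standardized procedure uses octave-band weighting and further corrections, which are omitted here; the function names and the sample values are assumptions.

```python
import math


def transmission_index(m):
    """Map one modulation transfer value m (0..1) to an index in 0..1."""
    if m <= 0.0:
        return 0.0  # no modulation preserved
    if m >= 1.0:
        return 1.0  # modulation fully preserved
    # Apparent signal-to-noise ratio in dB implied by the modulation loss.
    snr_db = 10.0 * math.log10(m / (1.0 - m))
    # Limit to the +/-15 dB range, then rescale to 0..1.
    snr_db = max(-15.0, min(15.0, snr_db))
    return (snr_db + 15.0) / 30.0


def sti_unweighted(mtf_values):
    """Simplified STI: unweighted mean of the transmission indices."""
    indices = [transmission_index(m) for m in mtf_values]
    return sum(indices) / len(indices)


# Hypothetical modulation transfer values from one measurement position.
example_sti = sti_unweighted([0.9, 0.7, 0.5, 0.3])
```

A value of m = 0.5 corresponds to an apparent signal-to-noise ratio of 0 dB and therefore to a transmission index of 0.5. Because the measured m values depend on the source as well as the room, a change in mouth directivity can shift the result in the way the abstract reports for car interiors.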