AES Munich 2009
Poster Session P29

Sunday, May 10, 13:30 — 15:00

P29 - Signal Analysis, Measurements, Restoration

P29-1 Evaluation and Comparison of Audio Chroma Feature Extraction Methods—Michael Stein, Benjamin M. Schubert, Ilmenau University of Technology - Ilmenau, Germany; Matthias Gruhne, Gabriel Gatzsche, Fraunhofer Institute for Digital Media Technology IDMT - Ilmenau, Germany; Markus Mehnert, Ilmenau University of Technology - Ilmenau, Germany
This paper analyzes and compares different methods for digital audio chroma feature extraction. The chroma feature is a descriptor that represents the tonal content of a musical audio signal in condensed form. Chroma features can therefore be considered an important prerequisite for high-level semantic analysis, such as chord recognition or harmonic similarity estimation. Higher-quality chroma features enable much better results in these high-level tasks. To assess the quality of chroma features, seven different state-of-the-art chroma feature extraction methods have been implemented. The output of these algorithms is critically evaluated on an audio database containing 55 variations of triads. The best results were obtained with the Enhanced Pitch Class Profile.
Convention Paper 7814
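
As a rough illustration of what the chroma feature in P29-1 encodes, the following Python sketch folds spectral peaks into twelve pitch classes. It is a generic pitch class profile, not the Enhanced Pitch Class Profile or any of the seven methods evaluated in the paper. The function names, the 440 Hz tuning reference, and the example peak list are assumptions made for illustration.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(freq_hz, a4_hz=440.0):
    """Map a frequency to a pitch class index (0 = C) via its nearest MIDI note."""
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return midi % 12


def chroma_vector(peaks, a4_hz=440.0):
    """Sum peak magnitudes per pitch class and normalize to a maximum of 1.

    `peaks` is a list of (frequency in Hz, magnitude) pairs, for example the
    strongest bins of a magnitude spectrum (hypothetical input format).
    """
    chroma = [0.0] * 12
    for freq_hz, magnitude in peaks:
        if freq_hz <= 0:
            continue  # skip DC or invalid bins
        chroma[pitch_class(freq_hz, a4_hz)] += magnitude
    peak = max(chroma)
    return [c / peak for c in chroma] if peak > 0 else chroma


# Illustrative peaks of a C major triad (C4, E4, G4) plus the octave C5.
c_major = [(261.63, 1.0), (329.63, 0.8), (392.00, 0.6), (523.25, 0.5)]
chroma = chroma_vector(c_major)
active = [PITCH_CLASSES[i] for i, v in enumerate(chroma) if v > 0]
print(active)
```

Because C4 and C5 fold into the same pitch class, the octave information is discarded and only the tonal content remains, which is the condensed form the abstract refers to.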

P29-2 Measuring Transient Structure-Borne Sound in Musical Instruments—Proposal and First Results from a Laser Intensity Measurement Setup—Robert Mores, Hamburg University of Applied Sciences - Hamburg, Germany; Marcel thor Straten, Consultant - Seevetal, Germany; Andreas Selk, Consultant - Hamburg, Germany
The proposal for this new measurement setup is motivated by curiosity about transients propagating across the arched tops of violins. Understanding the impact of edge construction on transient wave reflection back to the top of a violin, or on conduction into the rib, requires single-shot recordings, where possible without statistical processing. The signal-to-noise ratio should be high, although mechanical amplitudes at distinct locations on the structure surface are in the range of only a few micrometers. In the proposed setup, the intensity of a laser beam is directly measured after passing a screen attached to the device under test. The signal-to-noise ratio achieved for one-micrometer transients in single-shot recordings is significantly more than 60 dB.
Convention Paper 7815
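
To put the figures in P29-2 in perspective: if the signal-to-noise ratio is read as an amplitude ratio (an assumption, since the abstract does not state the definition), 60 dB corresponds to a factor of 1000, so a one-micrometer transient would imply an equivalent noise floor of about one nanometer. A minimal Python sketch of that arithmetic, with hypothetical function names:

```python
import math


def snr_db(signal_amplitude, noise_amplitude):
    """SNR in dB for an amplitude ratio (20 * log10), an assumed definition."""
    return 20.0 * math.log10(signal_amplitude / noise_amplitude)


def noise_floor(signal_amplitude, snr_in_db):
    """Noise amplitude implied by a signal amplitude and an amplitude SNR in dB."""
    return signal_amplitude / (10.0 ** (snr_in_db / 20.0))


transient_m = 1e-6  # one-micrometer transient, in meters
floor_m = noise_floor(transient_m, 60.0)
print(floor_m)
```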

P29-3 Evaluating Ground Truth for ADRess as a Preprocess for Automatic Musical Instrument Identification—Joseph McKay, Mikel Gainza, Dan Barry, Dublin Institute of Technology - Dublin, Ireland
Most research in musical instrument identification has focused on labeling isolated samples or solo phrases. A robust instrument identification system capable of dealing with polytimbral recordings of instruments remains a necessity in music information retrieval. Experiments are described that evaluate the ground truth of ADRess as a sound source separation technique used as a preprocess to automatic musical instrument identification. The ground truth experiments are based on a number of basic acoustic features, with a Gaussian Mixture Model as the classification algorithm. Using all 44 acoustic feature dimensions, successful identification rates are achieved.
Convention Paper 7816

P29-4 Improving Rhythmic Pattern Features Based on Logarithmic Preprocessing—Matthias Gruhne, Christian Dittmar, Fraunhofer Institute for Digital Media Technology IDMT - Ilmenau, Germany
In the area of Music Information Retrieval, the rhythmic analysis of music plays an important role. To derive rhythmic information from music signals, several feature extraction algorithms have been described in the literature. Most of them extract the rhythmic information by auto-correlating the temporal envelope derived from different frequency bands of the music signal. Using the auto-correlated envelopes directly as an audio feature has the disadvantage of tempo dependency. To circumvent this problem, further postprocessing via higher-order statistics has been proposed. However, the resulting statistical features are still tempo dependent to a certain extent. This paper describes a novel method that logarithmizes the lag axis of the auto-correlated envelope and discards the tempo-dependent part. This approach leads to tempo-invariant rhythmic features. A quantitative comparison of the original methods versus the proposed procedure is described and discussed in this paper.
Convention Paper 7817
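
The idea behind the logarithmic lag axis in P29-4 can be illustrated with a short Python sketch: a tempo change scales the lag axis of the autocorrelation, and on a logarithmic axis a scaling becomes a plain shift. This is not the authors' implementation. The pulse-train envelope, the linear interpolation, the 12 bins per octave, and all function names are assumptions chosen for illustration.

```python
import math


def autocorrelation(envelope, max_lag):
    """Unnormalized autocorrelation of an envelope for lags 0..max_lag."""
    n = len(envelope)
    return [sum(envelope[i] * envelope[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def log_lag_resample(acf, bins_per_octave=12, min_lag=1):
    """Resample an autocorrelation onto a logarithmically spaced lag axis."""
    max_lag = len(acf) - 1
    n_bins = int(math.floor(bins_per_octave * math.log2(max_lag / min_lag))) + 1
    out = []
    for k in range(n_bins):
        lag = min_lag * 2 ** (k / bins_per_octave)
        lo = int(math.floor(lag))
        hi = min(lo + 1, max_lag)
        frac = lag - lo
        # Linear interpolation between the two neighboring integer lags.
        out.append((1 - frac) * acf[lo] + frac * acf[hi])
    return out


def pulse_train(period, length):
    """Toy envelope: one unit pulse every `period` samples."""
    return [1.0 if i % period == 0 else 0.0 for i in range(length)]


# The same rhythm at two tempi: the slow version has twice the pulse period.
fast = log_lag_resample(autocorrelation(pulse_train(8, 256), 64))
slow = log_lag_resample(autocorrelation(pulse_train(16, 256), 64))
peak_fast = max(range(len(fast)), key=fast.__getitem__)
peak_slow = max(range(len(slow)), key=slow.__getitem__)
print(peak_fast, peak_slow, peak_slow - peak_fast)
```

With 12 bins per octave, halving the tempo moves the strongest peak by 12 bins, that is, by exactly one octave on the log-lag axis. How the paper then discards the tempo-dependent part is not specified in the abstract and is not reproduced here.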

P29-5 Further Developments of Parameterization Methods of Audio Stream Analysis for Security Purposes—Pawel Zwan, Andrzej Czyzewski, Gdansk University of Technology - Gdansk, Poland
The paper presents an automatic sound recognition algorithm intended for application in an audiovisual security monitoring system. The distributed nature of security systems does not allow for simultaneous observation of multiple multimedia streams, so an automatic recognition algorithm must be introduced. A module for the parameterization and automatic detection of audio events is described. Spectral analyses of the sounds of a window breaking, a gunshot, and a scream are performed, and parameterization methods are proposed and discussed. Moreover, a sound classification system based on the Support Vector Machines (SVM) algorithm is presented and its accuracy is discussed. The practical application of the system with the use of a monitoring station is shown. A plan for further experiments is presented and conclusions are drawn.
Convention Paper 7818

P29-6 Estimating Instrument Spectral Envelopes for Polyphonic Music Transcription in a Music Scene-Adaptive Approach—Julio J. Carabias-Orti, Pedro Vera-Candeas, Nicolas Ruiz-Reyes, Francisco J. Cañadas-Quesada, Pablo Cabañas-Molero, University of Jaén - Linares, Spain
We propose a method for estimating the spectral envelope pattern of musical instruments in a music scene-adaptive scheme, without any prior knowledge about the real transcription. A musical note is defined as stable when the variations between its harmonic amplitudes remain constant during a certain period of time. A density-based clustering algorithm is applied to the stable notes in order to separate different envelope models for each note. Music scene-adaptive envelope patterns are finally obtained from the similarity and continuity of the different note models. Our approach has been tested in a polyphonic music transcription scheme with synthesized and real music recordings, with very promising results.
Convention Paper 7819
