AES E-Library

How Efficient is MPEG-7 for General Sound Recognition?

Our challenge is to analyze and classify video soundtrack content for indexing purposes. To this end, we compare the performance of MPEG-7 Audio Spectrum Projection (ASP) features, computed with several basis decomposition algorithms, against Mel-scale Frequency Cepstrum Coefficients (MFCC). For the basis decomposition step in feature extraction, we evaluate three approaches: Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). Audio features are computed from the resulting reduced vectors and fed into a continuous hidden Markov model (CHMM) classifier. Our conclusion is that established MFCC features yield better performance than MPEG-7 ASP for general sound recognition under practical constraints.
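
To make the basis decomposition step concrete, here is a minimal sketch of an ASP-style projection using PCA. It is not the paper's implementation: the random input stands in for log-scale spectrum frames, the helper names (`normalize_frames`, `pca_basis`, `project`) and the sizes (200 frames, 32 bins, 8 components) are assumptions chosen for illustration, and the ICA and NMF variants as well as the CHMM classifier are left out.

```python
import numpy as np

def normalize_frames(frames):
    """Scale each spectral frame (row) to unit L2 norm."""
    norms = np.linalg.norm(frames, axis=1, keepdims=True)
    return frames / np.maximum(norms, 1e-12)

def pca_basis(unit_frames, n_components):
    """Return the top principal directions of the normalized frames."""
    centered = unit_frames - unit_frames.mean(axis=0)
    # Rows of vt are orthonormal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_components].T  # shape: (n_bins, n_components)

def project(unit_frames, basis):
    """Project normalized frames onto the reduced basis."""
    return unit_frames @ basis

# Synthetic stand-in for log-scale spectrum frames (assumed sizes).
rng = np.random.default_rng(0)
spectra = rng.normal(size=(200, 32))

unit = normalize_frames(spectra)
basis = pca_basis(unit, n_components=8)
features = project(unit, basis)
print(features.shape)
```

In a pipeline like the one described, the reduced `features` matrix would be the per-frame input to the classifier; swapping `pca_basis` for an ICA or NMF routine would change only how the basis is obtained.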
