Metadata for Audio: 25th International AES Conference, 17th to 19th June 2004, London, UK

Poster CD5-2

How Efficient Is MPEG-7 for General Sound Recognition?

Hyoung-Gook Kim, Juan José Burred, Thomas Sikora
Technical University Berlin, Berlin, Germany

Our challenge is to analyze and classify video soundtrack content for indexing purposes. To this end we compare the performance of MPEG-7 Audio Spectrum Projection (ASP) features, based on several basis decomposition algorithms, against Mel-scale Frequency Cepstrum Coefficients (MFCC). For basis decomposition in the feature extraction we evaluate three approaches: Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). Audio features are computed from these reduced vectors and fed into a hidden Markov model (HMM) classifier. Our conclusion is that established MFCC features yield better performance than MPEG-7 ASP in general sound recognition under practical constraints.
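
As an illustration of the basis-decomposition step mentioned above, the following minimal sketch projects spectral frames onto a reduced PCA basis. It is not the authors' implementation and does not reproduce the MPEG-7 extraction chain: the synthetic frames, the function names (pca_basis, project), and the use of power iteration with deflation are all assumptions made for this example. In a real system the reduced vectors would then be passed to an HMM classifier, which is omitted here.

```python
import math
import random

def pca_basis(frames, k, iters=500):
    """Return (mean, basis): the top-k principal directions of `frames`.

    Uses power iteration with deflation on the sample covariance matrix.
    This is a didactic choice (assumption), not the paper's method.
    """
    n, d = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(d)]
    centered = [[f[j] - mean[j] for j in range(d)] for f in frames]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    basis = []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break  # no variance left to explain
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the converged direction.
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                  for i in range(d))
        basis.append(v)
        # Deflate: remove the found component so the next pass finds the next one.
        for i in range(d):
            for j in range(d):
                cov[i][j] -= lam * v[i] * v[j]
    return mean, basis

def project(frames, mean, basis):
    """Project each mean-centered frame onto every basis vector."""
    return [[sum((f[j] - mean[j]) * b[j] for j in range(len(b))) for b in basis]
            for f in frames]

# Synthetic stand-in for log-spectral frames (hypothetical data, 4 bands).
frames = [[5.0 * math.sin(0.3 * t), 2.0 * math.cos(0.7 * t),
           0.1 * math.sin(1.1 * t), 0.05 * math.cos(1.7 * t)]
          for t in range(40)]

mean, basis = pca_basis(frames, k=2)
reduced = project(frames, mean, basis)  # 40 frames, 2 coefficients each
```

Each row of reduced is a low-dimensional feature vector for one frame. ICA or NMF would replace pca_basis with a different decomposition while keeping the same projection step.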
