Bag-of-Features Models Based on C-DNN Network for Acoustic Scene Classification

Pham, Lam; McLoughlin, Ian; Phan, Huy; Palaniappan, Ramaswamy; Lang, Yue

AES E-Library


This work proposes bag-of-features deep learning models for acoustic scene classification (ASC) – identifying recording locations by analyzing background sound. We explore the effect on classification accuracy of various front-end feature extraction techniques, ensembles of audio channels, and patch sizes from three kinds of spectrogram. The back-end process presents a two-stage learning model with a pre-trained CNN (preCNN) and a post-trained DNN (postDNN). Additionally, data augmentation using the mixup technique is investigated for both the pre-trained and post-trained processes, to improve classification accuracy through increasing class boundary training conditions. Our experiments on the 2018 Challenge on Detection and Classification of Acoustic Scenes and Events - Acoustic Scene Classification (DCASE2018-ASC) subtasks 1A and 1B significantly outperform the DCASE2018 reference implementation and approach state-of-the-art performance for each task. Results reveal that the ensemble of multi-spectrogram features and data augmentation is beneficial to performance.
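As a rough illustration of the mixup idea mentioned in the abstract, the sketch below blends two training examples and their one-hot labels using a weight drawn from a Beta distribution. It is not taken from the paper: the function name `mixup_pair`, the flat-list feature representation, and the value `alpha=0.4` are all assumptions chosen for the example, and the authors' actual settings may differ.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=0.4, rng=None):
    """Blend two examples and their one-hot labels (hypothetical helper).

    x1, x2: equal-length lists of feature values (e.g. a flattened patch).
    y1, y2: equal-length one-hot label lists.
    alpha:  Beta distribution parameter; 0.4 is an assumed value.
    """
    rng = rng or random.Random()
    # Mixing coefficient in [0, 1], drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    # The same convex combination is applied to features and labels.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


if __name__ == "__main__":
    rng = random.Random(0)
    x, y, lam = mixup_pair([1.0, 0.0], [1.0, 0.0],
                           [0.0, 1.0], [0.0, 1.0], rng=rng)
    print(lam, x, y)
```

Because the mixed label is a soft target lying between two classes, examples generated this way sit near class boundaries, which is the effect the abstract describes as "increasing class boundary training conditions".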

Authors: Pham, Lam; McLoughlin, Ian; Phan, Huy; Palaniappan, Ramaswamy; Lang, Yue
Affiliations: University of Kent, School of Computing, Medway, UK; University of Kent, School of Computing, Medway, UK; University of Kent, School of Computing, Medway, UK; University of Kent, School of Computing, Medway, UK; Huawei Technologies Co. Ltd., Shenzhen, China (see document for exact affiliation information)
AES Conference: 2019 AES International Conference on Audio Forensics (June 2019)
Paper Number: 12
Publication Date: June 8, 2019
Permalink: https://www.aes.org/e-lib/browse.cfm?elib=20465


E-Library Location: /conf/2019/forensics/2019_audio_forensics_paper_12.pdf
