AES E-Library

Improving Domain Generalization Via Event-Based Acoustic Scene Classification

Acoustic Scene Classification (ASC) has typically been addressed by feeding raw audio features to deep neural networks. However, this audio-based approach has consistently been shown to generalize poorly across different recording devices. Device-specific transfer functions and nonlinear dynamic range compression strongly affect spectro-temporal features, causing a deviation from the learned data distribution known as domain shift. In this paper, we present an alternative ASC paradigm that replaces the classic end-to-end audio-based training with an intermediate event-based representation of the acoustic scenes obtained from large-scale pretrained models. Performance evaluation on the TAU Urban Acoustic Scenes 2020 Mobile Development dataset shows that the proposed event-based approach is up to 160% more robust than corresponding audio-based methods in the face of mismatched recording devices.

Express Paper 5; AES Convention 153; October 2022
Permalink: https://www.aes.org/e-lib/browse.cfm?elib=21933

AES - Audio Engineering Society