MP3 compression classification through audio analysis statistics

McFarlane, Jamie; Chakravarthi, Bharathi Raja

AES E-Library

MP3 audio compression can be undesirable in circumstances where high-quality music presentation is required, yet automated, evidenced, and open-source methods for detecting it are lacking. This study introduced a new and accessible approach to discriminate between compression levels and identify lossy audio transcoding. Machine learning classifiers were trained on feature sets of audio analysis statistics derived from multiple step-wise re-encodings of compressed audio samples. Two classifiers, a stacked model and an XGBoost-based model, achieved accuracies comparable to previous examples in the literature and marketplace (Stacked: 0.947, XGBoost: 0.970, Literature reference: 0.965, Commercial reference: 0.980). For transcoded samples, in which post-processing hides the compression level, the new classifiers were less accurate than existing methods. However, all methods were inaccurate in identifying transcodes where artificial noise was added via the µ-law encoder. A command-line implementation is available at gitlab.com/jammcfar/kbps_detect_proto.
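To illustrate why a µ-law encoder adds noise to a signal, the following is a minimal sketch of a µ-law round trip. It is not taken from the paper or its repository: it uses the textbook continuous companding formula with µ = 255 and a plain uniform 8-bit quantiser, which is an assumed simplification of real µ-law codecs. The function names and the 440 Hz test tone are hypothetical choices made for illustration.

```python
import math

MU = 255  # conventional value for 8-bit mu-law; assumed here


def mu_law_compress(x, mu=MU):
    """Map a sample in [-1, 1] to the companded domain [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize(y, bits=8):
    """Round a value in [-1, 1] to the nearest of 2**bits - 1 uniform levels."""
    half = (2 ** bits - 1) // 2  # 127 for 8 bits
    return round(y * half) / half


def mu_law_round_trip(x, bits=8, mu=MU):
    """Compress, quantise, and expand one sample (simplified codec model)."""
    return mu_law_expand(quantize(mu_law_compress(x, mu), bits), mu)


# Hypothetical test signal: 10 ms of a 440 Hz sine sampled at 44.1 kHz.
samples = [0.5 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(441)]
decoded = [mu_law_round_trip(s) for s in samples]

# The difference between input and output is the noise added by quantisation.
noise = [d - s for s, d in zip(samples, decoded)]
rms_noise = math.sqrt(sum(e * e for e in noise) / len(noise))
print(f"RMS quantisation noise: {rms_noise:.6f}")
```

Because the quantisation happens in the companded domain, the error grows with signal amplitude, so the added noise is spread across the signal rather than confined to a particular frequency band.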

Open Access

Authors: McFarlane, Jamie; Chakravarthi, Bharathi Raja
Affiliation: National University of Ireland
AES Convention: 152 (May 2022)
Paper Number: 10558
Publication Date: May 2, 2022
Subject: Sound Classification
Permalink: https://www.aes.org/e-lib/browse.cfm?elib=21671
