MP3 compression classification through audio analysis statistics
J. McFarlane and B. R. Chakravarthi, "MP3 compression classification through audio analysis statistics," Paper 10558, (2022 May).
Abstract: MP3 audio compression can be undesirable where high-quality music presentation is required, yet there is a lack of automated, evidenced, and open-source methods for detecting it. This study introduced a new and accessible approach to discriminate between compression levels and identify lossy audio transcoding. Machine learning classifiers were trained on feature sets of audio analysis statistics, derived from multiple step-wise re-encodings of compressed audio samples. Two classifiers, a stacked model and an XGBoost-based model, achieved accuracies comparable to previous examples in the literature and marketplace (Stacked: 0.947, XGBoost: 0.970, Literature reference: 0.965, Commercial reference: 0.980). For transcoded samples, which hide compression levels with post-processing, the new classifiers were less accurate than existing methods. However, all methods were inaccurate in identifying transcodes where artificial noise was added via the µ-law encoder. A command-line implementation is available at gitlab.com/jammcfar/kbps_detect_proto.
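The idea of deriving features from step-wise re-encodings can be illustrated with a toy sketch. It is not the authors' implementation: uniform quantization stands in for MP3 encoding, a simple threshold rule stands in for the trained classifiers, and every name here (`quantize`, `rms_diff`, `reencode_features`, `estimate_level`) is hypothetical. The only point it shows is that re-encoding a sample at several levels and measuring how much it changes at each step yields a feature vector that depends on how the sample was compressed in the first place.

```python
import math
import random


def quantize(signal, bits):
    """Toy lossy 'codec' (an assumption, not MP3): snap samples to a 2**-bits grid."""
    step = 2.0 ** -bits
    return [round(x / step) * step for x in signal]


def rms_diff(a, b):
    """Root-mean-square difference between two equal-length signals."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def reencode_features(signal, levels):
    """One statistic per level: how much the signal changes when re-encoded there."""
    return [rms_diff(signal, quantize(signal, bits)) for bits in levels]


def estimate_level(features, levels, tol=1e-12):
    """Toy rule (not a trained model): lowest level whose re-encoding changes nothing."""
    for bits, change in zip(levels, features):
        if change <= tol:
            return bits
    return None  # no tested level left the signal unchanged


LEVELS = (4, 6, 8, 10)

random.seed(0)
original = [random.uniform(-1.0, 1.0) for _ in range(2000)]
compressed = quantize(original, 6)  # sample whose level we pretend not to know

features = reencode_features(compressed, LEVELS)
features_original = reencode_features(original, LEVELS)
guess = estimate_level(features, LEVELS)
```

In this toy setting, re-encoding the compressed sample at or above its own level leaves it unchanged, while re-encoding below that level alters it, so the feature vector separates the cases. Real MP3 encoding is not idempotent in this way, which is one reason a learned classifier over richer statistics, as described in the abstract, would be needed in practice.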
@article{mcfarlane2022mp3,
author={McFarlane, Jamie and Chakravarthi, Bharathi Raja},
journal={Journal of the Audio Engineering Society},
title={MP3 compression classification through audio analysis statistics},
year={2022},
volume={},
number={},
pages={},
doi={},
month={may},}
TY - paper
TI - MP3 compression classification through audio analysis statistics
SP -
EP -
AU - McFarlane, Jamie
AU - Chakravarthi, Bharathi Raja
PY - 2022
JO - Journal of the Audio Engineering Society
IS -
VO -
VL -
Y1 - May 2022
Open Access
Authors:
McFarlane, Jamie; Chakravarthi, Bharathi Raja
Affiliation:
National University of Ireland
AES Convention:
152 (May 2022)
Paper Number:
10558
Publication Date:
May 2, 2022
Subject:
Sound Classification
Permalink:
http://www.aes.org/e-lib/browse.cfm?elib=21671