Large audio collections commonly store their content using perceptual encoding. However, encoding parameters may vary from collection to collection, or even within a collection, with different bit rates, sample rates, codecs, etc. We evaluate the effect of various lossy audio encodings on the application of audio spectrum projection features to the automatic genre classification task. We show that decreases in mean classification accuracy, while small, are statistically significant for bit rates of 96 kbps or lower. In addition, a heterogeneous collection of audio encodings shows a statistically significant decrease in mean classification accuracy compared to a pure PCM collection.