This paper proposes a robust music audio fingerprinting system for automatic music retrieval. The fingerprint feature is extracted from a long-term dynamic modulation spectrum estimate in the perceptual compressed domain. Modulation frequency analysis, smoothing with a low-pass filter, and low-resolution quantization significantly improve the robustness of the feature. Fast search is achieved by hash-table lookup with 32-bit hash values, whose bits are quantized from the logarithmic-scale modulation frequency coefficients. The system obtains 50.6%, 92.6%, 99.4%, or 100% search precision, with an approximately zero false positive rate, when the query clips’ signal-to-noise ratio is <0 dB, 0–5 dB, 5–15 dB, or >15 dB, respectively.
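The hash-based search step can be illustrated with a minimal Python sketch. The abstract does not give the quantization rule, so everything specific here is an assumption made for illustration: each frame is taken to carry 33 log-scale modulation frequency coefficients, one hash bit is set by the sign of the difference between adjacent coefficients, and matches are ranked by voting on the time offset. The function names `quantize_to_hash`, `build_index`, and `search` are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

NUM_BITS = 32  # hash width stated in the abstract


def quantize_to_hash(coeffs):
    """Pack one frame into a 32-bit integer.

    Assumed rule (not from the paper): bit i is 1 when coeffs[i + 1] > coeffs[i].
    The first comparison ends up in the most significant bit.
    """
    if len(coeffs) != NUM_BITS + 1:
        raise ValueError("expected 33 coefficients per frame")
    value = 0
    for i in range(NUM_BITS):
        bit = 1 if coeffs[i + 1] > coeffs[i] else 0
        value = (value << 1) | bit
    return value


def build_index(tracks):
    """Map each hash value to the (track_id, frame_position) pairs that produced it."""
    index = defaultdict(list)
    for track_id, frames in tracks.items():
        for pos, coeffs in enumerate(frames):
            index[quantize_to_hash(coeffs)].append((track_id, pos))
    return index


def search(index, query_frames):
    """Return (track_id, offset, votes) for the best-supported alignment, or None."""
    votes = Counter()
    for qpos, coeffs in enumerate(query_frames):
        for track_id, pos in index.get(quantize_to_hash(coeffs), ()):
            # Frames from the same excerpt agree on the offset pos - qpos.
            votes[(track_id, pos - qpos)] += 1
    if not votes:
        return None
    (track_id, offset), count = votes.most_common(1)[0]
    return track_id, offset, count
```

This sketch only finds exact hash matches. A system aimed at low signal-to-noise queries would presumably also need to tolerate bit errors in the hash, which is outside the scope of this illustration.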