Scalable, Content-Based Audio Identification by Multiple Independent Psychoacoustic Matching
A software system for content-based identification of audio recordings is presented. The system transforms its input using a perceptual model of the human auditory system, making its output robust to lossy compression and to other distortions. In order to make use of both the instantaneous pattern of a recording's perceptual features and the information contained in the evolution of these features over time, the system first matches fragments of the input against a database of fragments of known recordings. In a subsequent step, these matches at the fragment level are assembled in order to identify a single recording that matches consistently over time. In a small-scale test the system has matched all queries successfully against a database of 100 000 commercially released recordings.
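The two-stage scheme the abstract describes — match fragments first, then look for a recording whose fragment matches line up consistently over time — can be sketched with a simple offset-voting index. This is an illustrative reconstruction, not the paper's actual algorithm: the perceptual feature extraction is abstracted into precomputed integer fingerprints, and all names (`build_index`, `identify`) are hypothetical.

```python
from collections import defaultdict

def build_index(db_fingerprints):
    """Map each fragment fingerprint to the (track, position) pairs where it occurs.

    db_fingerprints: dict mapping track id -> list of fragment fingerprints
    (here plain integers standing in for perceptual feature vectors).
    """
    index = defaultdict(list)
    for track_id, frags in db_fingerprints.items():
        for pos, f in enumerate(frags):
            index[f].append((track_id, pos))
    return index

def identify(query_frags, index):
    """Assemble fragment-level matches into a single recording.

    Each matching fragment votes for a (track, time-offset) pair; a recording
    that matches consistently over time accumulates many votes at one offset.
    """
    votes = defaultdict(int)
    for qpos, f in enumerate(query_frags):
        for track_id, pos in index.get(f, []):
            votes[(track_id, pos - qpos)] += 1
    if not votes:
        return None
    (track_id, _offset), _count = max(votes.items(), key=lambda kv: kv[1])
    return track_id
```

Voting on the offset `pos - qpos`, rather than on the track alone, is what exploits the temporal evolution of the features: stray fragment collisions scatter across offsets, while a true match concentrates its votes at a single alignment.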