AES E-Library

Supervised and Unsupervised Sound Retrieval by Vocal Imitation

Existing methods for indexing and searching audio documents generally rely on text metadata and text-based search engines. This approach is often problematic and time consuming because a text label does not necessarily describe the audio content. Query by Example (QBE) is an alternative approach that can improve the effectiveness and efficiency of sound retrieval. In this research, the authors propose a novel approach to sound query by vocal imitation. Vocal imitation is commonly used in human communication and can also be employed for human-computer interaction. Two systems are proposed: (1) a supervised system that trains a multiclass classifier on vocal imitations of the different sound classes in the library and then classifies a new imitation query into one of those classes; (2) an unsupervised system that is more flexible because it measures the feature distance between the imitation query and each sound in the library and returns the sounds most similar to the query. Both systems require an effective feature representation of the imitation queries and of the sounds in the library. Existing handcrafted audio features may not work well given the variety of vocal imitations and the mismatch between vocal imitations and actual sounds. The authors therefore propose to learn feature representations automatically from training vocal imitations using a Stacked Auto-Encoder (SAE). Experiments show that retrieval with automatically learned features outperforms retrieval with carefully handcrafted features in both supervised and unsupervised settings.
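The unsupervised system described in the abstract ranks library sounds by their feature distance to the imitation query. The following is a minimal sketch of that idea, not the authors' implementation. The abstract does not name a distance measure, so cosine distance is an assumption here. The three-dimensional feature vectors, the library entries, and the function names `cosine_distance` and `retrieve` are hypothetical; in the paper the features would come from the Stacked Auto-Encoder.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: an all-zero vector is treated as maximally dissimilar.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def retrieve(query_features, library, top_k=3):
    """Rank library sounds by distance to the query, closest first."""
    scored = [
        (cosine_distance(query_features, features), name)
        for name, features in library.items()
    ]
    scored.sort()  # ascending distance; ties broken by name
    return [name for _, name in scored[:top_k]]


# Hypothetical toy library: sound name -> feature vector.
# Real features would be learned (e.g. by an SAE), not written by hand.
library = {
    "dog_bark": [0.9, 0.1, 0.0],
    "door_slam": [0.1, 0.8, 0.2],
    "siren": [0.0, 0.2, 0.9],
}

# Hypothetical feature vector of a vocal imitation of a dog bark.
query = [0.8, 0.2, 0.1]

ranking = retrieve(query, library, top_k=2)
print(ranking)
```

Because the ranking depends only on distances between feature vectors, new sounds can be added to the library without retraining anything, which is the flexibility the abstract attributes to the unsupervised system.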

JAES Volume 64 Issue 7/8 pp. 533-543; July 2016
Permalink: https://www.aes.org/e-lib/browse.cfm?elib=18339


AES - Audio Engineering Society