With the growing capability of recording and storage devices, the problem of indexing large audio databases has received much attention. Most of this effort is dedicated to automatic inferences from indexed metadata. In contrast, browsing audio databases effectively has received less attention. This report studies the relevance of a semantic organization of sounds to ease the browsing of a sound database. For such a task, semantic access to data is traditionally implemented by a keyword selection process. However, various limitations of written language, such as word polysemy, ambiguities, or translation issues, may bias the browsing process. Two sound presentation strategies were considered that organized sounds spatially to reflect an underlying semantic hierarchy. For the sake of comparison, the authors also considered a display whose spatial organization was based only on acoustic cues. These three displays were evaluated in terms of search speed in a crowdsourcing experiment using two different corpora: environmental sounds from urban environments and sounds produced by musical instruments. Consistent results across the two corpora demonstrate the usefulness of an implicit semantic organization for representing sounds, in terms of both search speed and learning efficiency.