A soundscape recording captures the sonic environment at a given location and time using one or more fixed or moving microphones. In most cases, the soundscape is uncontrolled and unscripted. Human listeners experience sonic components as either background or foreground depending on salient perceptual characteristics such as proximity, repetition, and spectral attributes. Analyzing soundscapes in research tasks requires classifying and segmenting the important sonic components, but that process is time-consuming when done manually. This research establishes the background and foreground classification task within a musicological and soundscape context and then presents a method for the automatic segmentation of soundscape recordings. Using a soundscape corpus with ground truth data obtained from a human perception study, the analysis shows that participants have a high level of agreement on the category assigned to background samples (92.5%), foreground samples (80.8%), and background with foreground samples (75.3%). Experiments demonstrate how smaller window sizes affect the performance of the classifier.
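To make the idea of window-based segmentation concrete, the following is a minimal sketch only and not the method of the paper. It assumes a mono signal held as a list of floats, cuts it into consecutive non-overlapping windows, and labels each window with a hypothetical RMS-energy threshold that stands in for a trained classifier. The function names, the threshold rule, and the two-label scheme (the paper also uses a combined "background with foreground" category) are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def segment_windows(num_samples: int, sample_rate: int,
                    window_seconds: float) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of consecutive, non-overlapping windows.

    The final window is shorter when the signal length is not a multiple
    of the window length.
    """
    window_len = int(round(window_seconds * sample_rate))
    if window_len <= 0:
        raise ValueError("window must span at least one sample")
    return [(start, min(start + window_len, num_samples))
            for start in range(0, num_samples, window_len)]


def label_windows(samples: Sequence[float], sample_rate: int,
                  window_seconds: float,
                  threshold: float) -> List[Tuple[int, int, str]]:
    """Label each window by RMS energy.

    Assumption: the RMS threshold is a hypothetical placeholder for a real
    classifier operating on perceptually motivated audio features.
    """
    labels = []
    for start, end in segment_windows(len(samples), sample_rate, window_seconds):
        chunk = samples[start:end]
        rms = (sum(x * x for x in chunk) / len(chunk)) ** 0.5
        label = "foreground" if rms >= threshold else "background"
        labels.append((start, end, label))
    return labels
```

Calling `label_windows` on the same signal with a smaller `window_seconds` yields more, shorter segments, each classified from less audio; that trade-off is the kind of effect the window-size experiments examine.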
https://www.aes.org/e-lib/browse.cfm?elib=18334