Most recent studies on vocal detection in audio recordings focus either on developing new features or on classification methods. The impact of training and test data is largely neglected, which leads to weaknesses in database design: existing databases do not cover the differences in vocal techniques across music genres. In this paper, we compare approaches to singing voice detection on individual genres. For the two best-performing methods, we further investigate the impact of a genre-disjoint distribution of training and test tracks. In particular, tracks from electronic genres, which are barely represented in public databases for vocal detection, contribute to better classification performance when identifying vocals in tracks of other genres.