A Robust and Computationally Efficient Speech/Music Discriminator
A new method for discriminating between speech and music signals is introduced. The strategy is based on the extraction of four features, whose values are combined linearly into a single parameter. This parameter is used to distinguish between the two kinds of signals. The method has achieved an accuracy above 99%, even for severely degraded and noisy signals. Moreover, the low dimensionality of the feature space, together with a very simple information-merging technique, has resulted in remarkable robustness to new situations. The low computational complexity of the method makes it appropriate for applications that demand real-time operation. Finally, excellent resolution for the segmentation of audio streams is achieved by manipulating the analyzed data properly.
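The decision rule described in the abstract (four feature values merged linearly into one parameter, which is then used to separate the two classes) can be illustrated with a minimal sketch. The abstract does not name the features, the weights, the threshold, or which side of the threshold corresponds to music, so `WEIGHTS`, `THRESHOLD`, `combine`, `classify`, and the label assignment below are all hypothetical placeholders and not the paper's values.

```python
from typing import Sequence

# Hypothetical values: the abstract gives neither the weights nor the threshold.
WEIGHTS = (0.4, 0.3, 0.2, 0.1)
THRESHOLD = 0.5


def combine(features: Sequence[float],
            weights: Sequence[float] = WEIGHTS) -> float:
    """Merge four feature values into a single parameter by a weighted sum."""
    if len(features) != 4 or len(weights) != 4:
        raise ValueError("expected exactly four features and four weights")
    return sum(w * f for w, f in zip(weights, features))


def classify(features: Sequence[float],
             weights: Sequence[float] = WEIGHTS,
             threshold: float = THRESHOLD) -> str:
    """Label a segment by comparing the combined parameter to a threshold.

    Assumption: values above the threshold are labelled "music"; the
    abstract does not say which class lies on which side.
    """
    return "music" if combine(features, weights) > threshold else "speech"
```

Because the rule is a single weighted sum followed by one comparison, its cost per analyzed segment is a handful of multiplications and additions, which is consistent with the low computational complexity claimed above.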