Forensic Automatic Speaker Recognition with Degraded and Enhanced Speech
Various types of noise and other forms of degradation in the acoustic signal are typical of speech recordings used in forensic speaker recognition. The results of this study suggest that certain speech enhancement algorithms can be a useful tool for preprocessing speech samples before attempting automated recognition. This is particularly true for additive noise such as instrumental music and noise inside a moving car. A comparison of equal-error rates from identification experiments with ten male speakers, based on the original, degraded, and enhanced voice signals, showed that the performance of the speaker recognition system was most affected by pop music in both single-channel and two-channel recordings. In contrast, road traffic and restaurant noise did not markedly degrade recognition performance.
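The equal-error rate (EER) used for the comparison above is the operating point at which the false-acceptance rate equals the false-rejection rate. The abstract does not say how the study computed it, so the following is only a minimal sketch of one common approach: sweep a decision threshold over the observed scores and take the point where the two error rates are closest. The function name `equal_error_rate` and the example scores are hypothetical illustrations, not data or code from the study.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    target_scores: similarity scores for same-speaker comparisons.
    nontarget_scores: similarity scores for different-speaker comparisons.
    A comparison is accepted when its score is >= the threshold.
    Returns (eer, threshold) at the point where the two error rates are closest.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    best = None  # (gap between error rates, eer estimate, threshold)
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # False acceptance: a different-speaker score passes the threshold.
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        # False rejection: a same-speaker score falls below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Made-up scores for illustration only (assumption, not from the paper).
same_speaker = [0.9, 0.8, 0.7, 0.4]
different_speaker = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(same_speaker, different_speaker)
print(f"EER = {eer:.2f} at threshold {threshold}")
```

Under this sketch, a lower EER on enhanced speech than on degraded speech would indicate that the enhancement step helped recognition. With only a handful of scores the estimate is coarse, and evaluation toolkits typically interpolate between thresholds instead.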