Various types of noise and other forms of degradation in the acoustic signal are typical of speech recordings used in forensic speaker recognition. The results of this study suggest that certain speech enhancement algorithms can be a useful tool for preprocessing speech samples before automated recognition is attempted. This is particularly true for additive noise such as instrumental music and noise inside a moving car. A comparison of equal-error rates from identification experiments with ten male speakers, based on the original, degraded, and enhanced voice signals, showed that the speaker recognition system was most affected by pop music, in both single-channel and two-channel recordings. In contrast, road traffic and restaurant noise did not markedly degrade recognition performance.