Three experiments investigated the identification of speakers from their voices after electronic disguise using pitch scaling and vocal tract length scaling. A cohort of undergraduate students was used as a source of both speakers and listeners. The speed and accuracy with which speakers were identified from their voices were measured in conditions ranging from undisguised to severely distorted. Results show that when listeners know speakers well, identification accuracy can be very high, and it is hard to disguise speakers by pitch and vocal tract length scaling alone. Recognition levels close to chance were achieved only when extreme levels of disguise were applied, corresponding to a pitch increase of 12 semitones together with a vocal tract length reduction of 20%. These were also the most unnatural and most distorted conditions. The implications of the study for the use of voice disguise in witness protection are considered.
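
As a rough illustration of the scale of the most extreme condition, the two disguise parameters can be converted into frequency ratios. This is a minimal sketch, not the processing used in the study: the function names are hypothetical, the semitone conversion assumes equal temperament, and the formant ratio assumes a uniform-tube model in which formant frequencies vary inversely with vocal tract length. The 120 Hz starting pitch is an invented example value.

```python
def semitones_to_ratio(semitones: float) -> float:
    """Fundamental-frequency ratio for a shift in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


def vtl_change_to_formant_ratio(length_change: float) -> float:
    """Formant-frequency ratio for a fractional change in vocal tract length.

    Assumes a uniform-tube model (formants inversely proportional to length).
    length_change is -0.20 for a 20% reduction.
    """
    return 1.0 / (1.0 + length_change)


# Most extreme condition reported above: +12 semitones, 20% shorter tract.
pitch_ratio = semitones_to_ratio(12)
formant_ratio = vtl_change_to_formant_ratio(-0.20)

# Hypothetical speaker with a 120 Hz fundamental (illustrative value only).
example_f0_hz = 120.0
print(f"pitch ratio:   {pitch_ratio:.2f}")
print(f"formant ratio: {formant_ratio:.2f}")
print(f"example F0:    {example_f0_hz:.0f} Hz -> {example_f0_hz * pitch_ratio:.0f} Hz")
```

Under these assumptions, a 12-semitone increase doubles the fundamental frequency and a 20% length reduction raises formant frequencies by about a quarter, which is consistent with these conditions being the most unnatural ones.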