AES E-Library

Phonetic-Oriented Identification of Twin Speakers Using 4-Second Vowel Sounds and a Combination of a Shift-Invariant Phase Feature (NRD), MFCCs, and F0 Information

Automatic speaker identification typically relies on sophisticated statistical modeling and classification which requires large amounts of data for good performance. However, in actual audio forensics casework, frequently only a few seconds of speech material are available. In this paper, we favor diversity in feature extraction, simple modeling and classification, and constructive combination of congruent classification scores. We use phase, spectral magnitude and F0-related features in speaker identification experiments on a database of 35 speakers most of whom are twins. Using only 4.4 sec. of vowel-like sounds per speaker, we characterize the performance that is reached with individual features and we characterize simple and yet effective ways of classification score fusion. Insights for further research are also presented.

Permalink: https://www.aes.org/e-lib/browse.cfm?elib=20466

AES - Audio Engineering Society