Phonetic-Oriented Identification of Twin Speakers Using 4-Second Vowel Sounds and a Combination of a Shift-Invariant Phase Feature (NRD), MFCCs, and F0 Information

Ferreira, Aníbal

AES E-Library

Automatic speaker identification typically relies on sophisticated statistical modeling and classification, which requires large amounts of data for good performance. However, in actual audio forensics casework, frequently only a few seconds of speech material are available. In this paper, we favor diversity in feature extraction, simple modeling and classification, and constructive combination of congruent classification scores. We use phase, spectral magnitude, and F0-related features in speaker identification experiments on a database of 35 speakers, most of whom are twins. Using only 4.4 sec. of vowel-like sounds per speaker, we characterize the performance that is reached with individual features and we characterize simple and yet effective ways of classification score fusion. Insights for further research are also presented.

Author: Ferreira, Aníbal
Affiliation: University of Porto, Porto, Portugal
AES Conference: 2019 AES International Conference on Audio Forensics (June 2019)
Paper Number: 13
Publication Date: June 8, 2019
Permalink: https://www.aes.org/e-lib/browse.cfm?elib=20466

E-Library Location: /conf/2019/forensics/2019_audio_forensics_paper_13.pdf
