AES E-Library

On the Physiological Validity of the Group Delay Response of All-Pole Vocal Tract Modeling

Magnitude-oriented approaches dominate the voice analysis front-ends of most current technologies for tasks such as speaker identification, speech coding and compression, and voice reconstruction and re-synthesis. A popular technique is all-pole vocal tract modeling. The phase response of all-pole models is known to be non-linear and highly dependent on the magnitude frequency response. In this paper, we use a shift-invariant, phase-related feature estimated from signal harmonics to study the impact of all-pole models on the phase structure of voiced sounds. We relate that impact to the phase structure found in natural voiced sounds in order to draw conclusions about the physiological validity of the group delay of all-pole vocal tract modeling. Our findings emphasize that harmonic phase models are idiosyncratic, which is important for speaker identification and for improving the quality and naturalness of synthetic and reconstructed speech.
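As a rough illustration of why the phase of an all-pole model is tied to its magnitude response, the sketch below evaluates the group delay of a filter H(z) = 1/A(z) directly from its denominator coefficients. It is not the shift-invariant harmonic feature used in the paper; the function names `allpole_group_delay` and `resonator`, and the pole radius and angle in the usage note, are assumptions chosen only for illustration.

```python
import cmath
import math


def allpole_group_delay(a, omega):
    """Group delay, in samples, of H(z) = 1 / A(z) at radian frequency omega.

    a = [1, a1, ..., ap] holds the coefficients of A(z) = sum_k a[k] * z**-k.
    Hypothetical helper for illustration, not the paper's method.
    """
    # A(e^{j omega}) and its "ramped" counterpart B = sum_k k * a[k] * e^{-j omega k}
    A = sum(ak * cmath.exp(-1j * omega * k) for k, ak in enumerate(a))
    B = sum(k * ak * cmath.exp(-1j * omega * k) for k, ak in enumerate(a))
    # The FIR polynomial A has group delay Re(B / A); inverting it flips the sign.
    return -(B / A).real


def resonator(r, theta):
    """Denominator of a two-pole resonance with poles at r * exp(+/- j * theta)."""
    return [1.0, -2.0 * r * math.cos(theta), r * r]
```

For an assumed resonance with `r = 0.95` and `theta = math.pi / 4`, `allpole_group_delay(resonator(0.95, math.pi / 4), math.pi / 4)` is large and positive, while the value at `math.pi / 2` is slightly negative. The group delay therefore peaks where the magnitude response peaks, which is the kind of magnitude-phase coupling the abstract refers to.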

Permalink: https://www.aes.org/e-lib/browse.cfm?elib=19764

AES - Audio Engineering Society