Low Bit-Rate Speech Codec Based on a Long-Term Harmonic Plus Noise Model

Ben Ali, Faten; Djaziri-Larbi, Sonia; Girin, Laurent

AES E-Library

In speech/music coders and analysis/synthesis systems, spectral modeling is generally performed on a short-term (ST), frame-by-frame basis, which is justified by the fact that the signal is only locally (quasi-)stationary. The vocal tract configuration evolves slowly and smoothly, resulting in a high correlation between the spectral parameters of successive frames. Long-term modeling of the ST parameters exploits this correlation, at the cost of longer modeling/coding delays. The short-delay constraint can be relaxed in many applications, such as text-to-speech modification/synthesis, telephony surveillance data, digital answering machines, electronic voicemail, digital voice logging, electronic toys, and video games. The long-term harmonic plus noise model (LT-HNM) for speech offers additional data compression because it exploits the smooth evolution of the time trajectories of the short-term harmonic plus noise model parameters by applying a discrete cosine model (DCM). In this paper, the authors extend the LT-HNM to a complete low bit-rate speech coder based on a long-term approach (ca. 200 ms). The proposed LT-HNM coder reaches a bit-rate of 2.7 kbps for wideband speech.
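
To make the idea of modeling a parameter trajectory with a discrete cosine model more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a DCT-II style cosine basis fitted by least squares. The function names (`dcm_basis`, `dcm_fit`, `dcm_eval`), the toy pitch trajectory, the 20-frame length (roughly 200 ms at a hypothetical 10 ms frame hop), and the model order are all illustrative assumptions; the paper's exact DCM formulation and parameter choices may differ.

```python
import numpy as np


def dcm_basis(num_frames, order):
    """Cosine basis matrix of shape (num_frames, order + 1).

    Assumed DCT-II style basis: column p is cos(pi * p * (n + 0.5) / num_frames).
    """
    n = np.arange(num_frames)
    p = np.arange(order + 1)
    return np.cos(np.pi * np.outer(n + 0.5, p) / num_frames)


def dcm_fit(trajectory, order):
    """Least-squares fit of order + 1 cosine coefficients to a trajectory."""
    trajectory = np.asarray(trajectory, dtype=float)
    basis = dcm_basis(len(trajectory), order)
    coeffs, *_ = np.linalg.lstsq(basis, trajectory, rcond=None)
    return coeffs


def dcm_eval(coeffs, num_frames):
    """Rebuild the smooth trajectory from its cosine coefficients."""
    return dcm_basis(num_frames, len(coeffs) - 1) @ coeffs


# Toy example: a slowly varying pitch contour (in Hz) over 20 short-term frames.
frames = np.arange(20)
f0 = 120.0 + 15.0 * np.sin(2.0 * np.pi * frames / 40.0)

coeffs = dcm_fit(f0, order=4)          # 5 coefficients stand in for 20 values
f0_hat = dcm_eval(coeffs, len(f0))     # smooth approximation of the contour
rms_err = float(np.sqrt(np.mean((f0 - f0_hat) ** 2)))
print(f"{len(coeffs)} coefficients, RMS error = {rms_err:.3f} Hz")
```

In this sketch, the compression comes from storing a handful of coefficients per long-term segment instead of one value per short-term frame; raising the order trades bits for accuracy, and an order of `num_frames - 1` reproduces the trajectory exactly.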

Authors: Ben Ali, Faten; Djaziri-Larbi, Sonia; Girin, Laurent
Affiliations: University of Tunis El Manar, National Engineering School of Tunis, Signal and Systems Lab, Tunis, Tunisia; GIPSA Lab, University Grenoble Alpes, France, and INRIA Grenoble Rhone-Alpes, France (see document for exact affiliation information)
JAES Volume 64 Issue 11 pp. 844-857; November 2016
Publication Date: December 1, 2016
Permalink: https://www.aes.org/e-lib/browse.cfm?elib=18522

E-Library Location: (CD JAES64) /jaes64/11/pg844.pdf

DOI: http://dx.doi.org/10.17743/jaes.2016.0028
