The authors discuss how the quality of a speech signal is perceived and how different degradations contribute to the overall perceived speech (listening) quality. More specifically, ITU-T Recommendation P.862 (Perceptual Evaluation of Speech Quality, PESQ), which provides a perceptual modeling approach for predicting the subjectively perceived speech quality, is used as the starting point for a degradation decomposition algorithm. This algorithm decomposes the perceived degradation into three contributions by finding specific degradation indicators, each of which quantifies the impact of one type of degradation separately. The first degradation indicator quantifies the impact of additive noise as found in many speech-processing situations, such as when unwanted background noise is sent over a voice connection. The second degradation indicator quantifies the impact of linear time-invariant frequency response distortions as introduced, for example, by a band-limited telephone system. The third degradation indicator quantifies the impact of the time-varying behavior of the system under test. This time response degradation indicator covers both temporal signal loss, as found with packet loss in modern digital speech connections, and pulses (clicks), as found in many speech-processing systems.