A Simple Hybrid Approach to the Time-Scale Modification of Speech
Time-domain methods of time-scale modification (TSM) are attractive because of their low computational cost. However, they produce audible artifacts at larger time-stretch ratios (greater than 1.3 times the original duration). The occurrence of these artifacts is often the main justification for using more involved analysis/synthesis methods at these ratios. For speech signals, the artifacts take two forms: transient repetition, which causes a “stuttering” effect, and roughness due to spectral mismatch at segment boundaries, most obvious during voiced signal periods. These phenomena are not addressed by existing time-domain methods. A simple hybrid algorithm that uses both time-domain and analysis/synthesis methods is presented, illustrating how these distortions may be minimized. Results of formal listening tests show an improvement in basic audio quality for time-stretched speech signals when compared with equivalent samples processed by the synchronized overlap and add (SOLA) algorithm.
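For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of a generic SOLA time-stretcher, not the hybrid algorithm the paper presents. The function name `sola`, the parameters `frame`, `hop_a`, and `search`, and their default values are illustrative assumptions and are not taken from the paper. Each input frame is placed near its nominal output position, shifted to the lag with the highest normalized cross-correlation against the existing output, and linearly cross-faded over the overlap.

```python
import math


def sola(x, alpha, frame=400, hop_a=100, search=40):
    """Stretch the sample list x by roughly `alpha` using a basic SOLA scheme.

    All parameter names and defaults are illustrative assumptions.
    """
    hop_s = int(round(alpha * hop_a))  # synthesis hop = alpha * analysis hop
    # These bounds guarantee every candidate position overlaps the output
    # built so far and that frame start positions strictly increase.
    if not (2 * search < hop_s and hop_s + 2 * search < frame):
        raise ValueError("alpha out of range for this frame/hop/search setup")

    out = list(x[:frame])  # the first frame is copied unchanged
    m = 1
    while m * hop_a + frame <= len(x):
        seg = x[m * hop_a : m * hop_a + frame]
        nominal = m * hop_s

        # Search for the shift that best aligns seg with the existing output.
        best_k, best_score = 0, -math.inf
        for k in range(-search, search + 1):
            start = nominal + k
            ov = len(out) - start  # overlap length for this candidate
            num = e_out = e_seg = 0.0
            for i in range(ov):
                a, b = out[start + i], seg[i]
                num += a * b
                e_out += a * a
                e_seg += b * b
            den = math.sqrt(e_out * e_seg)
            score = num / den if den > 0.0 else 0.0
            if score > best_score:
                best_k, best_score = k, score

        # Linear cross-fade over the overlap, then append the remainder.
        start = nominal + best_k
        ov = len(out) - start
        for i in range(ov):
            w = i / ov
            out[start + i] = (1.0 - w) * out[start + i] + w * seg[i]
        out.extend(seg[ov:])
        m += 1
    return out


# Usage: stretch a short 200 Hz tone sampled at 8 kHz by a factor of 1.5.
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(2000)]
stretched = sola(tone, 1.5)
```

Because whole frames are reused at a larger output spacing, a transient inside a frame can appear more than once in the output, which is one way the “stuttering” artifact described above can arise at larger stretch ratios.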