We present a method that uses temporal information in musical audio to considerably improve pitch-preserving time-scaling. This work builds on a signal model in which the original signal is split into transient and steady-state components, based on an adaptive multiresolution analysis. By considering only the transient signal content, temporal segmentation is achieved with a much higher degree of accuracy than with standard onset detection algorithms. Only the segmented steady-state regions are then stretched, whilst phase is locked in the temporally masked regions at transients. Despite local variations in the stretching factor, rhythm is maintained globally, yielding results of very high perceptual quality for a range of complex polyphonic musical signals, at a low computational cost.
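As a rough illustration of how stretching only the steady-state regions can still preserve rhythm, the sketch below computes a per-segment stretch factor. It is an assumption-laden reading of the abstract, not the paper's algorithm: it supposes that each inter-onset segment consists of an unstretched transient region followed by a steady-state region, and that every segment's total duration should scale by the global factor. The names `steady_state_factors`, `segments`, and `alpha` are hypothetical.

```python
def steady_state_factors(segments, alpha):
    """Return one stretch factor per segment for its steady-state part.

    segments: list of (transient_len, steady_len) pairs in seconds, one per
              inter-onset interval (hypothetical representation).
    alpha:    desired global time-scale factor (e.g. 1.5 = 50% longer).

    Assumption: the transient part keeps its original length, so the
    steady-state part must absorb the whole change in duration.
    """
    factors = []
    for transient_len, steady_len in segments:
        if steady_len <= 0:
            raise ValueError("segment has no steady-state region to stretch")
        # Target length of the whole segment after global scaling.
        target_len = alpha * (transient_len + steady_len)
        # The steady-state part must fill whatever the transient leaves.
        factor = (target_len - transient_len) / steady_len
        if factor <= 0:
            raise ValueError("alpha too small: transient alone exceeds target")
        factors.append(factor)
    return factors


# Example: two segments with 50 ms transients, stretched globally by 1.5.
example_segments = [(0.05, 0.45), (0.05, 0.20)]
example_factors = steady_state_factors(example_segments, 1.5)
```

Under these assumptions the local factor differs from segment to segment (shorter steady-state regions need a larger factor), yet every onset lands at `alpha` times its original time, which is one way to interpret "local variations in the stretching factor" with rhythm maintained globally.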