State-of-the-art audio encoders are based on transform-domain coding algorithms. Because of time-frequency uncertainty, transform-domain coders suffer from "pre-echo" and "diffusion" artifacts during transient portions of the signal. These artifacts arise from the long transform lengths used to achieve higher coding gain. Audio encoders employ tools such as adaptive transform lengths and temporal noise shaping (TNS) to code transient portions of the audio signal efficiently. Audio signals typically contain time-domain transients (e.g. castanets), frequency-domain transients (e.g. flute, clarinet), and the transients observed in speech during consonant-to-vowel transitions. Identifying these transients is vital to achieving good perceptual quality at low bit rates. This paper discusses the transient classes present in audio signals and describes a transient detector designed to model all of these classes efficiently. The proposed transient detector has been incorporated into an MPEG-4 AAC encoder and is independent of the psychoacoustic analysis methodology used. Listening tests as well as OPERA scores indicate a substantial improvement in audio quality over the baseline encoder.
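To make the idea of time-domain transient detection concrete, the following is a minimal sketch of a generic block-energy detector that drives a long/short transform decision. It is not the detector proposed in the paper, whose details the abstract does not give; the function names, block size, and threshold are all illustrative assumptions, and it addresses only time-domain attacks, not the frequency-domain or speech transients mentioned above.

```python
import math


def block_energies(samples, block_size):
    """Sum of squared samples over consecutive non-overlapping blocks."""
    return [
        sum(x * x for x in samples[i:i + block_size])
        for i in range(0, len(samples) - block_size + 1, block_size)
    ]


def detect_transients(samples, block_size=128, ratio_threshold=8.0, floor=1e-9):
    """Return indices of blocks whose energy jumps sharply over the previous block.

    block_size, ratio_threshold and floor are hypothetical tuning values,
    chosen for illustration only.
    """
    energies = block_energies(samples, block_size)
    hits = []
    for k in range(1, len(energies)):
        # The floor avoids division-like blowups after digital silence.
        if energies[k] > ratio_threshold * max(energies[k - 1], floor):
            hits.append(k)
    return hits


def choose_window(frame, **detector_args):
    """Pick a short transform if the frame contains an attack, else a long one."""
    return "short" if detect_transients(frame, **detector_args) else "long"


# Synthetic frame: a quiet tone followed by a loud one starting at sample 512.
frame = [
    (0.01 if n < 512 else 1.0) * math.sin(2 * math.pi * n / 32)
    for n in range(1024)
]
print(detect_transients(frame), choose_window(frame))
```

In a real encoder the decision would normally also involve look-ahead and the psychoacoustic model; the sketch only shows why a sudden energy rise within a long frame is a cue for switching to shorter transforms to confine pre-echo.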