Improved Prediction of Multichannel Audio Quality by the Use of Envelope ITD of High Frequency Sounds
In assessing multichannel audio coding systems, spatial factors matter as well as timbral factors. One example is the prediction model of Choi et al., which extended ITU-R Rec. BS.1387-1 to multichannel audio coding systems by adding three spatial features: ITDDist (Interaural Time Difference Distortion), ILDDist (Interaural Level Difference Distortion), and IACCDist (InterAural Cross Correlation Distortion). In that implementation, ITDDist was computed only for low-frequency sounds (below 1500 Hz) and ILDDist only for high-frequency components, which is reasonable under classical duplex theory. In the high-frequency range, however, interaural differences in the temporal envelope are also important in spatial perception, especially in sound localization. To investigate quantitatively the role of this envelope ITD in predicting perceived spatial quality, this paper introduces a new model that computes the ITD distortion of the temporal envelopes of high-frequency components. The computed envelope ITD distortions were highly correlated with perceived sound quality. Moreover, when the proposed envelope ITD distortion was included in the model of Choi et al. as one of multiple features for predicting overall sound quality, it improved prediction performance relative to the original model.
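To make the idea of an envelope ITD distortion concrete, the following is a minimal sketch and not the model described in the paper. It assumes a single high-frequency band that has already been isolated, takes the Hilbert envelope of each ear signal, estimates the envelope ITD as the lag that maximizes the circular cross-correlation of the two envelopes, and defines the distortion as the absolute difference between the reference and test ITDs. The function names, the 1 ms lag search range, and the single-band, whole-signal processing are all illustrative assumptions; the abstract does not specify the filterbank, framing, or distortion measure actually used.

```python
import numpy as np


def analytic_envelope(x):
    """Hilbert envelope computed with the FFT (one-sided spectrum method)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep DC once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep Nyquist once
        weights[1:n // 2] = 2.0           # double positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))


def envelope_itd(left, right, fs, max_itd=1e-3):
    """Envelope ITD in seconds; positive when the right envelope lags the left.

    max_itd (assumed 1 ms here) bounds the lag search range.
    """
    env_l = analytic_envelope(left)
    env_r = analytic_envelope(right)
    env_l = env_l - env_l.mean()          # remove DC so the peak reflects modulation
    env_r = env_r - env_r.mean()
    max_lag = int(round(max_itd * fs))
    lags = np.arange(-max_lag, max_lag + 1)
    # Circular cross-correlation of the two envelopes over the candidate lags.
    corr = [np.dot(env_l, np.roll(env_r, -lag)) for lag in lags]
    return lags[int(np.argmax(corr))] / fs


def envelope_itd_distortion(ref_l, ref_r, test_l, test_r, fs):
    """Hypothetical distortion measure: |ITD_test - ITD_ref| in seconds."""
    return abs(envelope_itd(test_l, test_r, fs) - envelope_itd(ref_l, ref_r, fs))
```

For example, an amplitude-modulated 4 kHz tone whose right channel is delayed by a few samples gives an envelope ITD equal to that delay, even though the fine-structure ITD of the carrier is ambiguous at that frequency. A fuller model would presumably repeat this per auditory band and per time frame before aggregating.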