Because subjective evaluation of audio systems and processing schemes is time-consuming and resource-intensive, objective evaluation methods have attracted considerable research interest as an alternative. However, current perceptual models cannot yet replace a human listener, especially when the test stimulus is complex, for example a sound scene consisting of multiple time-varying acoustic images. This paper describes a data-driven approach to developing a model that predicts the subjective evaluation of complex acoustic scenes, using the extensive set of listening test results collected in the latest MPEG-H 3-D audio initiative as training data. The results showed that a few selected outputs of various auditory models may form a useful feature set: with these features, linear regression and multilayer perceptron models reasonably predicted the overall distribution of listening test scores, estimating both mean and variance.