The acceptability of a newly proposed technology for commercial application is often assumed if the sound quality reached in a listening test surpasses a certain target threshold. As an example, it is a well-established procedure for decisions on the deployment of audio codecs to run a listening test comparing the coded/decoded signal with the uncoded reference signal. For other technologies, e.g. upmix or binaural processing, however, the unprocessed signal only can act as a “comparison signal”. Here, the goal is to achieve a significant preference of the processed over the comparison signal. For such preference listening tests, we underline the importance of clustering the test results to obtain additional valuable information, as opposed to using the standard statistic metrics like mean and confidence interval. This approach allows determining the size of the user group that significantly prefers to use the proposed algorithm when it would be available in a consumer device. As an example, listening test data for binaural processing algorithms are analyzed in this investigation.
This paper costs $33 for non-members and is free for AES members and E-Library subscribers.