Friday, October 20, 1:30 pm — 4:00 pm
P14-1 A Statistical Model that Predicts Listeners’ Preference Ratings of In-Ear Headphones: Part 2—Development and Validation of the Model—Sean Olive, Harman International - Northridge, CA, USA; Todd Welti, Harman International Inc. - Northridge, CA, USA; Omid Khonsaripour, Harman International - Northridge, CA, USA
Part 1 of this paper presented the results of controlled listening tests in which 71 listeners, both trained and untrained, gave preference ratings for 30 different models of in-ear (IE) headphones. Both trained and untrained listeners preferred the headphone equalized to the Harman IE target curve. Objective measurements indicated that the magnitude response of a headphone appeared to be a predictor of its preference rating: the further it deviated from the response of the Harman IE target curve, the less it was generally preferred. Part 2 presents a linear regression model that accurately predicts the headphone preference ratings (r = 0.91) based on the size, standard deviation, and slope of the magnitude response deviation from the response of the Harman IE headphone target curve.
Convention Paper 9878
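The kind of model described in the abstract above can be illustrated with a minimal sketch. It assumes that "size" means the mean absolute deviation in dB and that slope is fitted against log frequency; neither definition is stated in the abstract. The function names are hypothetical, and no coefficients from the paper are reproduced here.

```python
import numpy as np

def deviation_features(freqs_hz, response_db, target_db):
    """Summarize the error curve (response minus target) with three numbers.

    Assumed definitions (not taken from the paper): size = mean absolute
    error in dB, spread = standard deviation in dB, slope = dB per decade
    from a straight-line fit against log10(frequency).
    """
    error_db = np.asarray(response_db, float) - np.asarray(target_db, float)
    log_f = np.log10(np.asarray(freqs_hz, float))
    slope_db_per_decade = np.polyfit(log_f, error_db, 1)[0]
    return np.mean(np.abs(error_db)), np.std(error_db), slope_db_per_decade

def fit_preference_model(features, ratings):
    """Ordinary least squares: rating ~ b0 + b1*size + b2*spread + b3*slope."""
    features = np.atleast_2d(np.asarray(features, float))
    X = np.column_stack([np.ones(len(features)), features])
    coeffs, *_ = np.linalg.lstsq(X, np.asarray(ratings, float), rcond=None)
    return coeffs

def predict(coeffs, features):
    """Apply fitted coefficients to one or more feature rows."""
    features = np.atleast_2d(np.asarray(features, float))
    return coeffs[0] + features @ coeffs[1:]
```

With one feature row per headphone and the mean listener rating as the response, the correlation between `predict(...)` and the observed ratings would play the role of the r value quoted in the abstract.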
P14-2 Comparison of Hedonic and Quality Rating Scales for Perceptual Evaluation of High- and Intermediate Quality Stimuli—Nick Zacharov, DELTA SenseLab - Hørsholm, Denmark; Christer Volk, DELTA SenseLab - Hørsholm, Denmark; Tore Stegenborg-Andersen, DELTA SenseLab - Hørsholm, Denmark
In this study, four rating scales for perceptual evaluation of preference were compared: the 9-point hedonic scale, the Continuous Quality Scale (CQS) (e.g., as used in ITU-R BS.1534-3, “MUSHRA”), the Labelled Hedonic Scale (LHS), and a modified version of the LHS. The CQS was tested in three configurations to study the role and impact of the reference and anchor stimuli, namely: a full MUSHRA test with anchors and references, a test without references, and a test with neither references nor anchors. The six test configurations were tested with two groups of AAC codec qualities: high and intermediate quality ranges. Results showed that the largest differences in scale usage were caused by having a declared reference, but also that the scale range usage is not strongly related to stimuli discrimination power.
Convention Paper 9879
P14-3 Perceptual Evaluation of Source Separation for Remixing Music—Hagen Wierstorf, University of Surrey - Guildford, Surrey, UK; Dominic Ward, University of Surrey - Guildford, Surrey, UK; Russell Mason, University of Surrey - Guildford, Surrey, UK; Emad M. Grais, University of Surrey - Guildford, Surrey, UK; Chris Hummersone, University of Surrey - Guildford, Surrey, UK; Mark D. Plumbley, University of Surrey - Guildford, Surrey, UK
Music remixing is difficult when the original multitrack recording is not available. One solution is to estimate the elements of a mixture using source separation. However, existing techniques suffer from imperfect separation and perceptible artifacts on single separated sources. To investigate their influence on a remix, five state-of-the-art source separation algorithms were used to remix six songs by increasing the level of the vocals. A listening test was conducted to assess the remixes in terms of loudness balance and sound quality. The results show that some source separation algorithms are able to increase the level of the vocals by up to 6 dB at the cost of introducing a small but perceptible degradation in sound quality.
Convention Paper 9880
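The remixing operation described in the abstract above (raising the vocal level using separated sources) can be sketched as follows. This assumes a separation algorithm has already produced a vocal estimate and an accompaniment estimate of equal length; the function names are hypothetical and the separation step itself is not shown.

```python
import numpy as np

def db_to_gain(level_db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10.0 ** (level_db / 20.0)

def remix(vocals_est, accompaniment_est, vocal_boost_db):
    """Sum the separated estimates after boosting the vocal estimate.

    Any separation error (leakage, artifacts) in vocals_est is boosted too,
    which is why larger boosts can degrade sound quality.
    """
    vocals_est = np.asarray(vocals_est, float)
    accompaniment_est = np.asarray(accompaniment_est, float)
    return db_to_gain(vocal_boost_db) * vocals_est + accompaniment_est
```

A 6 dB boost corresponds to roughly doubling the amplitude of the vocal estimate before it is summed back with the accompaniment.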
P14-4 Adaptive Low-frequency Extension Using Auditory Filterbanks—Sunil G. Bharitkar, HP Labs., Inc. - San Francisco, CA, USA; Timothy Mauer, HP, Inc. - San Francisco, CA, USA; Charles Oppenheimer, HP, Inc. - San Francisco, CA, USA; Teresa Wells, HP, Inc. - San Francisco, CA, USA; David Berfanger, HP, Inc. - San Francisco, CA, USA
Microspeakers used in mobile devices and PCs have a band-limited frequency response because small drivers are constrained in tight enclosures, which results in the loss of low-frequency playback content. The lack of low frequencies in turn degrades the audio quality and perceived loudness. One method to overcome this physical limitation is to leverage the auditory phenomenon of the missing fundamental, whereby synthesizing the harmonic structure lets the listener perceive the missing fundamental frequency. The proposed approach employs side-chain processing to synthesize the harmonics from only the dominant portions of the low-frequency signal, using critical-band filters. Additionally, a parametric filter is used to shape the harmonics. Listening tests reveal that the proposed technique is preferred in terms of both the overall sound quality and the bass-only quality.
Convention Paper 9881
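The general idea behind the abstract above (a side chain that generates harmonics of the low band so the fundamental is implied rather than reproduced) can be sketched as follows. This is not the authors' method: the brick-wall FFT masks stand in for their critical-band filters, the half-wave rectifier is one assumed choice of harmonic generator, a plain gain replaces the parametric shaping filter, and the function and parameter names are hypothetical.

```python
import numpy as np

def virtual_bass(x, fs, cutoff_hz=100.0, gain=0.5):
    """Replace content at or below cutoff_hz with harmonics derived from it."""
    x = np.asarray(x, float)
    n = len(x)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    # Side chain: isolate the low band that a small driver cannot reproduce.
    low = np.fft.irfft(np.where(freqs <= cutoff_hz, spectrum, 0), n)

    # Half-wave rectification is a nonlinearity, so it creates energy at
    # integer multiples of the low-band frequencies.
    rectified = np.maximum(low, 0.0)

    # Keep only the generated components above the cutoff; this discards
    # the DC offset and the fundamental itself.
    harm_spec = np.fft.rfft(rectified)
    harmonics = np.fft.irfft(np.where(freqs > cutoff_hz, harm_spec, 0), n)

    # Main path: the original signal above the cutoff, plus scaled harmonics.
    high = np.fft.irfft(np.where(freqs > cutoff_hz, spectrum, 0), n)
    return high + gain * harmonics
```

For a 60 Hz input tone, the output contains no energy at 60 Hz but does contain components at 120 Hz and higher multiples, which is the harmonic pattern that the missing-fundamental effect relies on.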
P14-5 The Bandwidth of Human Perception and its Implications for Pro Audio—Thomas Lund, Genelec Oy - Iisalmi, Finland; Aki Mäkivirta, Genelec Oy - Iisalmi, Finland
Locked away inside its shell, the brain has only ever learned about the world through our five primary senses. With them, we receive just a fraction of the information actually available, and we perceive far less still. A fraction of a fraction: the perceptual bandwidth. Conscious perception is furthermore subject to 400 ms of latency, and is associated with a temporal grey zone that can only be tapped into via reflexes or training. Based on a broad review of physiological, clinical, and psychological research, the paper proposes three types of listening strategies that we should distinguish between, not only in our daily lives but also when conducting subjective tests: easy listening, trained listening, and slow listening.
Convention Paper 9882