Head-Related Transfer Function (HRTF) personalization is key to improving spatial audio perception and localization in virtual auditory displays. We investigate the task of personalizing HRTFs from anthropometric measurements, which can be decomposed into two sub tasks: Interaural Time Delay (ITD) prediction and HRTF magnitude spectrum prediction. We explore both problems using state-of-the-art Machine Learning (ML) techniques. First, we show that ITD prediction can be significantly improved by smoothing the ITD using a spherical harmonics representation. Second, our results indicate that prior unsupervised dimensionality reduction-based approaches may be unsuitable for HRTF personalization. Last, we show that neural network models trained on the full HRTF representation improve HRTF prediction compared to prior methods.
http://www.aes.org/e-lib/browse.cfm?elib=19287
Click to purchase paper as a non-member or login as an AES member. If your company or school subscribes to the E-Library then switch to the institutional version. If you are not an AES member and would like to subscribe to the E-Library then Join the AES!
This paper costs $33 for non-members and is free for AES members and E-Library subscribers.
Learn more about the AES E-Library
Start a discussion about this paper!