AR/VR applications commonly struggle to binaurally spatialize many sound sources at once because of computational constraints. Existing techniques for efficient binaural rendering, such as Ambisonics, Vector-Based Amplitude Panning, or Principal Component Analysis, alleviate this issue by approximating Head-Related Transfer Function (HRTF) datasets with a linear combination of basis filters. This paper proposes a novel binaural renderer that convolves each basis filter with a layer of low-order finite impulse response filters applied in the time domain, and that derives both the spatial functions and the filter coefficients by minimizing a perceptually motivated error function. In a MUSHRA test, expert listeners had more difficulty differentiating the proposed method from the HRTF dataset it approximates than they did with existing methods configured with an equivalent number of Fast Fourier Transforms and identical HRTF preprocessing. This result was consistent across both an internal Microsoft HRTF dataset and an individual head from the SADIE database.
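The basis-filter idea that the existing techniques share can be illustrated with a minimal sketch. It is not the renderer proposed in the paper: it omits the low-order FIR layer and the perceptual error function, uses a truncated SVD (a PCA-like decomposition) as a stand-in, handles one ear only, and runs on synthetic random impulse responses rather than a real HRTF dataset. All names (`render_direct`, `render_basis`, `gains`, `basis`) are hypothetical. It shows that mixing the sources into a few basis channels and convolving once per basis filter gives the same output as convolving every source with its approximated filter.

```python
import numpy as np

def render_direct(signals, hrirs):
    # One convolution per source: cost grows with the number of sources.
    return sum(np.convolve(x, h) for x, h in zip(signals, hrirs))

def render_basis(signals, gains, basis):
    # Mix all sources into K channels using per-source spatial gains,
    # then run only K convolutions, one per basis filter.
    mixed = gains.T @ signals  # shape (K, n_samples)
    return sum(np.convolve(m, b) for m, b in zip(mixed, basis))

rng = np.random.default_rng(0)
n_sources, n_taps, n_samples, n_basis = 16, 32, 256, 4

# Assumption: synthetic single-ear impulse responses, one per source direction.
hrirs = rng.standard_normal((n_sources, n_taps))
signals = rng.standard_normal((n_sources, n_samples))

# Truncated SVD so that hrirs is approximated by gains @ basis.
u, s, vt = np.linalg.svd(hrirs, full_matrices=False)
gains = u[:, :n_basis] * s[:n_basis]  # spatial weights, shape (n_sources, K)
basis = vt[:n_basis]                  # basis filters, shape (K, n_taps)
approx_hrirs = gains @ basis

out_basis = render_basis(signals, gains, basis)
out_direct = render_direct(signals, approx_hrirs)
```

Because convolution is linear, the two outputs agree up to floating-point error, while the basis path needs 4 convolutions instead of 16 in this configuration. How closely `approx_hrirs` matches the original responses depends on the number of basis filters and on how the basis is derived, which is the part the paper addresses.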