Binaural recording technology offers an inexpensive, portable solution for spatial audio capture. In this paper a full-sphere 2D localization method is proposed that utilizes the Model-Based Expectation-Maximization Source Separation and Localization system (MESSL). The localization model is trained using a full-sphere head related transfer function dataset and produces localization estimates by maximum-likelihood of frequency-dependent interaural parameters. The model’s robustness is assessed using matched and mismatched HRTF datasets between test and training data, with environmental sounds and speech. Results show that the majority of sounds are estimated correctly with the matched condition in low noise levels; for the mismatched condition, a “cone of confusion” arises with albeit effective estimation of lateral angles. Additionally, the results show a relationship between the spectral content of the test data and the performance of the proposed method.
Click to purchase paper as a non-member or login as an AES member. If your company or school subscribes to the E-Library then switch to the institutional version. If you are not an AES member and would like to subscribe to the E-Library then Join the AES!
This paper costs $33 for non-members and is free for AES members and E-Library subscribers.