This article presents a method for real-time estimation of the directions of speech sources from captured binaural audio. Accurate direction estimates are required to embed the sound sources correctly into the auditory environment of the far-end user in telecommunication between two augmented reality audio (ARA) users. The method avoids the dependency of the estimation accuracy on the orientation of the near-end user by combining information from an orientation tracker with the direction estimates. Results from anechoic experiments illustrate that the presented method can estimate the direction of one or more non-simultaneous speech sources in real time, and that head movement improves the estimation accuracy for sources at the sides of the user.
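The step of combining orientation-tracker data with a direction estimate can be illustrated with a minimal sketch. It is not the article's implementation. It assumes that only the horizontal angle (azimuth) matters, that the tracker reports head yaw in degrees, and that both angles are positive in the same rotational direction. The function names `wrap_degrees` and `to_world_azimuth` are hypothetical.

```python
def wrap_degrees(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def to_world_azimuth(head_relative_azimuth, head_yaw):
    """Convert a head-relative azimuth estimate to a world-frame azimuth.

    Assumption (not from the article): both angles are in degrees and
    share the same sign convention, so compensation is a simple sum.
    """
    return wrap_degrees(head_relative_azimuth + head_yaw)


# A source fixed at 60 degrees in the world frame appears at -30 degrees
# relative to a head turned to a yaw of 90 degrees.
world = to_world_azimuth(-30.0, 90.0)
```

Under these assumptions the world-frame estimate stays constant while the head turns, which is what makes the estimate independent of the near-end user's orientation.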