Investigating Spatial Audio Coding Cues for Meeting Audio Segmentation
Because participants in multiparty meetings are generally stationary while actively speaking, participant location information can be used to segment recorded meeting audio into speaker ‘turns.’ This paper investigates speaker location information derived from the ‘spatial cues’ generated by spatial audio coding techniques. The validity of using spatial cues for meeting audio segmentation is explored by investigating multiple-microphone meeting audio recording techniques and by extracting and comparing the spatial cues used by different spatial audio coders. Experimental results show that the statistical relationship between speaker location and interchannel level-based and phase-based spatial cues depends strongly on the microphone pattern. Results also indicate that interchannel correlation-based spatial cues represent location information that is ambiguous for meeting audio segmentation.
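To make the three cue families named above concrete, the following is a minimal sketch of how interchannel level, phase, and correlation values might be computed for one frame of a two-channel recording. It uses common textbook definitions, not necessarily those of the coders studied in the paper. The function name `spatial_cues`, its parameters, and the single-frequency phase estimate are assumptions made for illustration.

```python
import cmath
import math


def spatial_cues(left, right, freq_hz, sample_rate):
    """Illustrative cues for one frame of two-channel audio (hypothetical helper).

    Returns (level difference in dB, phase difference in radians, correlation).
    """
    eps = 1e-12  # guards against division by zero on silent frames

    # Level cue: ratio of channel energies, expressed in dB.
    energy_l = sum(x * x for x in left)
    energy_r = sum(x * x for x in right)
    level_db = 10.0 * math.log10((energy_l + eps) / (energy_r + eps))

    # Phase cue: angle of the cross-spectrum at a single analysis frequency,
    # obtained here from a one-bin DFT of each channel.
    w = -2j * math.pi * freq_hz / sample_rate
    bin_l = sum(x * cmath.exp(w * n) for n, x in enumerate(left))
    bin_r = sum(x * cmath.exp(w * n) for n, x in enumerate(right))
    phase = cmath.phase(bin_l * bin_r.conjugate())

    # Correlation cue: normalized zero-lag cross-correlation of the channels.
    cross = sum(a * b for a, b in zip(left, right))
    corr = cross / math.sqrt((energy_l + eps) * (energy_r + eps))

    return level_db, phase, corr


# Synthetic frame: a 500 Hz tone that is quieter and delayed in the right channel.
fs, f0, n_samples = 8000, 500, 160
left = [math.sin(2 * math.pi * f0 * n / fs) for n in range(n_samples)]
right = [0.5 * math.sin(2 * math.pi * f0 * n / fs - math.pi / 4) for n in range(n_samples)]
level_db, phase, corr = spatial_cues(left, right, f0, fs)
```

In this sketch a positive level difference means the left channel carries more energy, and a positive phase difference means the left channel leads the right. Practical coders typically compute such cues per frequency band and per time frame rather than at a single frequency.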