For machine hearing in complex scenes (i.e., with reverberation and noise), sound localization either serves as the front end or is implicitly encoded in speech enhancement models. However, it has been suggested that there may be cross-talk between the identification and localization streams of the auditory system. Based on this idea, a multi-task sound localization method is proposed in this study. The proposed model takes the waveform as input and simultaneously estimates the azimuth of the sound source and the time-frequency (T-F) masks. Localization experiments were performed using binaural simulation in a reverberant environment, and the results show that, compared to a single-task sound localization method, the presence of the speech enhancement task improves localization performance.
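The abstract does not give architectural details, but the multi-task idea it describes (a shared encoder over the binaural waveform feeding two heads, one for azimuth and one for T-F masks) can be sketched as follows. This is a minimal NumPy illustration, not the paper's model; all dimensions, layer choices, and names (`W_shared`, `W_azi`, `W_mask`, the azimuth grid, the mask size) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions; the paper does not specify its architecture).
n_samples = 512    # waveform frame length per ear
n_hidden = 64      # shared representation size
n_azimuths = 37    # e.g. -90..+90 degrees in 5-degree steps (assumed grid)
n_tf_bins = 257    # T-F mask bins per frame (assumed)

# Shared encoder weights plus two task-specific heads.
W_shared = rng.standard_normal((2 * n_samples, n_hidden)) * 0.01
W_azi = rng.standard_normal((n_hidden, n_azimuths)) * 0.01
W_mask = rng.standard_normal((n_hidden, n_tf_bins)) * 0.01

def forward(binaural_frame):
    """Map a stacked binaural frame to (azimuth posterior, T-F mask)."""
    h = np.tanh(binaural_frame @ W_shared)       # shared representation
    logits = h @ W_azi
    azi_post = np.exp(logits - logits.max())
    azi_post /= azi_post.sum()                   # softmax over azimuth classes
    mask = 1.0 / (1.0 + np.exp(-(h @ W_mask)))   # sigmoid T-F mask in [0, 1]
    return azi_post, mask

frame = rng.standard_normal(2 * n_samples)       # left + right channels stacked
azi_post, mask = forward(frame)
```

Training such a model would minimize a weighted sum of the two task losses (e.g. cross-entropy on the azimuth posterior and a mask-estimation loss), which is how the speech enhancement task can regularize the localization stream.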