A Generalized Subspace Approach for Multichannel Speech Enhancement Using Machine Learning-Based Speech Presence Probability Estimation

Ke, Yuxuan; Hu, Yi; Li, Jian; Zheng, Chengshi; Li, Xiaodong

AES E-Library

A generalized subspace-based multichannel speech enhancement method in the frequency domain is proposed, in which the multichannel speech presence probability is estimated using machine learning methods. An efficient, low-latency neural network (NN) model is introduced to discriminatively learn a gain mask that separates the speech and noise components in noisy scenarios. In addition, a generalized subspace-based approach in the frequency domain is proposed, where the speech power spectral density (PSD) matrix and the noise PSD matrix are estimated using short-term and long-term averaging periods, respectively. Experimental results show that the proposed method outperforms existing NN-based beamforming methods in terms of the perceptual evaluation of speech quality (PESQ) score and the segmental signal-to-noise ratio improvement.
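The abstract gives no equations, so the following is only a minimal sketch of how a mask-driven generalized subspace filter of this kind could look for a single frequency bin. It is not the authors' implementation. The function names `update_psd` and `subspace_filter`, the smoothing constants, the trade-off parameter `mu`, and the use of the speech presence probability as a weight on the PSD updates are all assumptions made for illustration. The filter shown is the classical generalized-eigenvalue form, which is algebraically equal to `Rx @ inv(Rx + mu * Rn)`.

```python
import numpy as np


def update_psd(psd, y, weight, alpha):
    """Recursive average of a PSD matrix for one frequency bin.

    psd    : (M, M) current estimate
    y      : (M,) multichannel STFT coefficients of the current frame
    weight : scalar in [0, 1], e.g. a speech presence probability (assumed)
    alpha  : smoothing factor; closer to 1 means a longer averaging period
    """
    return alpha * psd + (1.0 - alpha) * weight * np.outer(y, y.conj())


def subspace_filter(speech_psd, noise_psd, mu=1.0, eps=1e-8):
    """Generalized subspace filter H = V^{-H} diag(g) V^H (illustrative).

    V jointly diagonalizes the two PSD matrices; it is obtained here by
    whitening with the Cholesky factor of the noise PSD matrix.
    """
    m = noise_psd.shape[0]
    # Diagonal loading keeps the Cholesky factorization well defined.
    chol = np.linalg.cholesky(noise_psd + eps * np.eye(m))
    chol_inv = np.linalg.inv(chol)
    # Whitened speech PSD matrix, symmetrized against rounding errors.
    whitened = chol_inv @ speech_psd @ chol_inv.conj().T
    whitened = 0.5 * (whitened + whitened.conj().T)
    eigvals, eigvecs = np.linalg.eigh(whitened)
    eigvals = np.maximum(eigvals, 0.0)
    # Wiener-like gain per generalized eigenvalue.
    gains = eigvals / (eigvals + mu)
    return ((chol @ eigvecs) * gains) @ eigvecs.conj().T @ chol_inv


def enhance_frame(y, spp, speech_psd, noise_psd,
                  alpha_speech=0.6, alpha_noise=0.98, mu=1.0):
    """Process one frame of one frequency bin; returns output and new PSDs."""
    # Short averaging period for speech, long averaging period for noise.
    speech_psd = update_psd(speech_psd, y, spp, alpha_speech)
    noise_psd = update_psd(noise_psd, y, 1.0 - spp, alpha_noise)
    h = subspace_filter(speech_psd, noise_psd, mu)
    return h @ y, speech_psd, noise_psd
```

In this sketch, a smaller `alpha_speech` lets the speech PSD matrix follow fast spectral changes, while a larger `alpha_noise` assumes the noise statistics vary slowly. How the paper actually maps the NN gain mask to these updates is not stated in the abstract.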

Authors: Ke, Yuxuan; Hu, Yi; Li, Jian; Zheng, Chengshi; Li, Xiaodong
Affiliations: University of Chinese Academy of Sciences, Beijing, China; University of Wisconsin - Milwaukee, Milwaukee, WI, USA; Institute of Acoustics, Chinese Academy of Sciences, Beijing, China (See document for exact affiliation information.)
AES Convention: 146 (March 2019) Paper Number: 10192
Publication Date: March 10, 2019
Subject: Poster Session 3
Permalink: https://www.aes.org/e-lib/browse.cfm?elib=20325

E-Library Location: /conv/146/10192.pdf