Monaural Speech Source Separation by Estimating the Power Spectrum Using Multi-Frequency Harmonic Product Spectrum

Ayllon, David; Gil-Pita, Roberto; Rosa-Zurera, Manuel

AES E-Library

This paper proposes an algorithm for monaural speech source separation based on time-frequency masking. The algorithm estimates the power spectrum of each original speech signal, modeled as a carrier signal multiplied by an envelope. A Multi-Frequency Harmonic Product Spectrum (MF-HPS) algorithm estimates the fundamental frequencies of the signals in the mixture. These frequencies are then used to estimate both the carrier and the envelope of each signal from the mixture. Binary masks are generated by comparing the estimated spectra of the signals. Results show a notable improvement in separation over the original algorithm, which uses only the information from the HPS.
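The abstract gives no implementation details, so the following is only a rough sketch of two building blocks it mentions: a fundamental-frequency estimate from a harmonic product spectrum, and binary masks obtained by comparing two estimated power spectra. It uses the classic single-pitch HPS, not the paper's MF-HPS, and it does not model the carrier or the envelope. The function names, the default parameters (number of harmonics, search band, zero-padding factor), and the tie-breaking rule are assumptions made for illustration.

```python
import numpy as np


def hps_f0(frame, sample_rate, num_harmonics=4, f_min=60.0, f_max=500.0):
    """Estimate the fundamental frequency of one frame with a classic HPS.

    Illustrative only: this is the textbook harmonic product spectrum,
    not the multi-frequency variant (MF-HPS) described in the paper.
    """
    windowed = frame * np.hanning(len(frame))
    n_fft = 4 * len(frame)  # zero-padding (assumed factor) for a finer grid
    magnitude = np.abs(np.fft.rfft(windowed, n_fft))

    # Multiply the spectrum by its decimated copies: bin k of the copy
    # decimated by h holds the magnitude at h times the frequency of bin k,
    # so the product peaks where all harmonics of a candidate f0 align.
    limit = len(magnitude) // num_harmonics
    product = magnitude[:limit].copy()
    for h in range(2, num_harmonics + 1):
        product *= magnitude[::h][:limit]

    # Search only within an assumed plausible speech pitch range.
    freqs = np.arange(limit) * sample_rate / n_fft
    band = (freqs >= f_min) & (freqs <= f_max)
    return freqs[band][np.argmax(product[band])]


def binary_masks(power_a, power_b):
    """Assign each time-frequency bin to the source with the larger
    estimated power. Ties go to source b (an arbitrary choice)."""
    mask_a = power_a > power_b
    return mask_a, ~mask_a
```

In a sketch like this, each mask would be multiplied element-wise with the short-time spectrum of the mixture before resynthesis. How the two power spectra are actually estimated from the carrier and envelope is described in the paper itself and is not reproduced here.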

Authors: Ayllon, David; Gil-Pita, Roberto; Rosa-Zurera, Manuel
Affiliation: University of Alcala, Alcalá de Henares, Spain
AES Convention: 134 (May 2013)
Paper Number: 8832
Publication Date: May 4, 2013
Subject: Speech Processing
Permalink: https://www.aes.org/e-lib/browse.cfm?elib=16733


E-Library Location: (CD 134Papers) /conv/134/8832.pdf
