Gaussian Mixture Model for Singing Voice Separation from Stereophonic Music

Kim, Minje; Beack, Seungkwon; Choi, Keunwoo; Kang, Kyeongok

This paper presents an adaptive method for predicting source-specific ranges of binaural cues, such as inter-channel level difference (ILD) and inter-channel phase difference (IPD), in order to separate a centrally positioned singing voice. To this end, we employ a Gaussian mixture model (GMM) to cluster the underlying distributions in the feature domain of the mixture signal. By treating the responsibilities of those distinct Gaussians as unmixing coefficients for each sample of the mixture spectrogram, the proposed method reduces the artificial deformations that previous center channel extraction methods usually suffer from, which are caused by imprecise or coarse decisions about the ranges of the central subspaces. Experiments on commercial music show the superiority of the proposed method.
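
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the feature definitions, the number of Gaussians, the covariance structure, or the initialization, so everything below is an assumption made for illustration. The sketch assumes ILD in dB and IPD in radians per time-frequency bin, a three-component diagonal-covariance GMM fitted with plain EM, and a heuristic that picks the component whose mean lies closest to (ILD, IPD) = (0, 0) as the "center" source. The function names (binaural_cues, fit_diag_gmm, center_mask) and the synthetic test signal are hypothetical.

```python
import numpy as np

def binaural_cues(L, R, eps=1e-12):
    """Per-bin features [ILD in dB, IPD in radians] from complex stereo spectrograms."""
    ild = 20.0 * np.log10((np.abs(L) + eps) / (np.abs(R) + eps))
    ipd = np.angle(L * np.conj(R))
    return np.stack([ild.ravel(), ipd.ravel()], axis=1)

def fit_diag_gmm(X, init_means, n_iter=200, var_floor=1e-3):
    """Plain EM for a diagonal-covariance GMM (assumed model, not from the paper).

    Returns weights, means, variances and the (N, K) responsibilities
    computed in the final E-step.
    """
    means = np.array(init_means, dtype=float)
    K = means.shape[0]
    weights = np.full(K, 1.0 / K)
    variances = np.tile(X.var(axis=0) + var_floor, (K, 1))
    for _ in range(n_iter):
        # E-step: log of (mixing weight * Gaussian density), normalized per sample
        diff = X[:, None, :] - means[None, :, :]              # shape (N, K, D)
        log_p = -0.5 * np.sum(diff**2 / variances
                              + np.log(2.0 * np.pi * variances), axis=2)
        log_p += np.log(weights)
        log_p -= log_p.max(axis=1, keepdims=True)             # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the soft assignments
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        diff = X[:, None, :] - means[None, :, :]
        variances = np.einsum('nk,nkd->kd', resp, diff**2) / nk[:, None] + var_floor
    return weights, means, variances, resp

def center_mask(L, R, init_means=((0.0, 0.0), (12.0, 0.0), (-12.0, 0.0))):
    """Soft mask in [0, 1]: responsibility of the component nearest zero ILD/IPD."""
    X = binaural_cues(L, R)
    _, means, _, resp = fit_diag_gmm(X, init_means)
    center = np.argmin(np.sum(means**2, axis=1))   # heuristic choice (assumption)
    return resp[:, center].reshape(L.shape)

# Synthetic stereo "spectrogram": rows 0-19 centered, 20-39 panned left, 40-59 panned right.
rng = np.random.default_rng(0)
shape = (60, 40)
S = rng.normal(size=shape) + 1j * rng.normal(size=shape)
gain_db = np.zeros(shape)
gain_db[20:40] = 12.0
gain_db[40:] = -12.0
gain_db += rng.normal(scale=1.0, size=shape)       # jitter so clusters have spread
phase = rng.normal(scale=0.1, size=shape)
g = 10.0 ** (gain_db / 40.0)                       # split the level difference across channels
L = S * g * np.exp(0.5j * phase)
R = S / g * np.exp(-0.5j * phase)

mask = center_mask(L, R)
center_estimate = mask * 0.5 * (L + R)             # responsibilities used as unmixing coefficients
```

Because the mask is a posterior probability rather than a hard threshold on ILD and IPD, bins near the boundary between sources are attenuated gradually instead of being switched on or off, which is the property the abstract credits with reducing artificial deformations. On real music the spectrograms would come from a short-time Fourier transform of the left and right channels, and the initial means would need to be chosen or learned for the material at hand.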

Authors: Kim, Minje; Beack, Seungkwon; Choi, Keunwoo; Kang, Kyeongok
Affiliation: Electronics and Telecommunications Research Institute (ETRI), Daejeon, Korea
AES Conference: 43rd International Conference: Audio for Wirelessly Networked Personal Devices (September 2011)
Paper Number: 6-2
Publication Date: September 29, 2011
Subject: Interactive Audio
Permalink: https://www.aes.org/e-lib/browse.cfm?elib=16121

E-Library Location: (CD 43rdPapers) /conf/43/aes43-000017.pdf
