In this paper we present a scheme for unsupervised extraction of sound objects, or sources, from a single recording containing a mixture of sounds. Separation is performed by orthogonal projection of the mixed sound onto subspaces derived by clustering transform coefficients, such as those obtained by PCA or ICA. The clustering step reveals residual non-linear grouping structure in the signal that the linear transform does not capture. To achieve independence, we search for a partitioning that maximizes the mutual information between a component and the set to which it belongs. This information is obtained from a pairwise distance measure among all coefficients. Source separation experiments are reported in the paper.
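The pipeline described above (linear transform, clustering of components, orthogonal projection onto each cluster's subspace) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes PCA computed by SVD on overlapping frames, and it swaps in a simple stand-in for the mutual-information criterion: a pairwise distance based on the correlation of coefficient magnitudes, grouped by average-linkage agglomerative clustering. All function names, the synthetic test signal, and the parameter values (frame length, hop, number of components) are hypothetical choices made for illustration.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames (one frame per row)."""
    n = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def pca_basis(frames, n_components):
    """Return the leading principal directions as orthonormal rows."""
    centered = frames - frames.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_components]

def envelope_distance(coeffs):
    """Assumed stand-in distance: 1 - |correlation| of coefficient magnitudes.

    PCA coefficients are uncorrelated, but their magnitudes can still
    co-vary; that residual dependence is what the grouping relies on here.
    """
    corr = np.corrcoef(np.abs(coeffs), rowvar=False)
    return 1.0 - np.abs(corr)

def agglomerate(dist, n_clusters):
    """Average-linkage agglomerative clustering on a distance matrix."""
    clusters = [[i] for i in range(dist.shape[0])]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def project_onto_clusters(frames, basis, clusters):
    """Orthogonally project the frames onto each cluster's subspace."""
    return [frames @ basis[idx].T @ basis[idx] for idx in clusters]

# Synthetic single-channel mixture: two gated tones plus a little noise.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
tone_a = np.sin(2 * np.pi * 220 * t) * (np.sin(2 * np.pi * 2 * t) > 0)
tone_b = np.sin(2 * np.pi * 1300 * t) * (np.sin(2 * np.pi * 3 * t + 1) > 0)
mixture = tone_a + tone_b + 0.01 * rng.standard_normal(t.size)

frames = frame_signal(mixture, frame_len=64, hop=32)
basis = pca_basis(frames, n_components=6)
coeffs = frames @ basis.T                      # per-frame transform coefficients
clusters = agglomerate(envelope_distance(coeffs), n_clusters=2)
parts = project_onto_clusters(frames, basis, clusters)
```

Because the basis rows are orthonormal and the clusters partition them, the per-cluster projections are mutually orthogonal and sum to the projection of the frames onto the full retained subspace. How well each part corresponds to an actual source depends on the distance measure and clustering criterion, which this sketch only approximates.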