Separating the singing voice from accompanying instruments is important in music information-retrieval systems, since it allows for such applications as melody extraction, lyrics recognition, and singer identity. The authors investigate effective methods for unsupervised separation of the singing voice, called H-Semantics (Hybrid Singing Extraction through Multiband Amplitude Enhanced Thresholding and Independent Component Subtraction). The proposed method adds time-domain separation to the previous work that was based on frequency-domain cepstral methods. The results indicate separation of approximately 8.5 dB signal-to-distortion ratio over the baseline.
Click to purchase paper as a non-member or login as an AES member. If your company or school subscribes to the E-Library then switch to the institutional version. If you are not an AES member and would like to subscribe to the E-Library then Join the AES!
This paper costs $33 for non-members and is free for AES members and E-Library subscribers.