Binary masking is a common technique for separating target audio from an interferer, and its use is often justified by the high signal-to-noise ratio it achieves. However, the mask can introduce musical noise artifacts, which limit its perceptual performance and that of techniques that build on it. Three mask-processing techniques, based on adding noise or on cepstral smoothing, are tested, and the processed masks are compared to the ideal binary mask using the perceptual evaluation for audio source separation (PEASS) toolkit. The parameters of each technique are optimized before the comparison is made. Each technique is found to improve the overall perceptual score of the separation, and the results show a trade-off between interferer suppression and artifact reduction.
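
For readers unfamiliar with the baseline, the following is a minimal sketch of an ideal binary mask under the usual local-SNR definition: a time-frequency cell is kept when the target magnitude exceeds the interferer magnitude by more than a threshold. It is an illustration and not the paper's implementation. The function names, the 0 dB default threshold, and the toy magnitude values are assumptions, and none of the three mask-processing techniques is reproduced here.

```python
import math

def ideal_binary_mask(target_mag, interferer_mag, threshold_db=0.0):
    """Return a 0/1 mask over time-frequency cells (lists of rows).

    A cell is 1 when the local SNR (target vs. interferer) exceeds
    threshold_db, else 0. The 0 dB default is an assumed, common choice.
    """
    eps = 1e-12  # avoids log of zero and division by zero
    mask = []
    for t_row, i_row in zip(target_mag, interferer_mag):
        row = []
        for t, i in zip(t_row, i_row):
            local_snr_db = 20.0 * math.log10((t + eps) / (i + eps))
            row.append(1 if local_snr_db > threshold_db else 0)
        mask.append(row)
    return mask

def apply_mask(mixture_mag, mask):
    """Multiply each mixture cell by its mask value (hard gating)."""
    return [[m * x for m, x in zip(m_row, x_row)]
            for m_row, x_row in zip(mask, mixture_mag)]

# Toy 2x2 magnitude "spectrograms" (hypothetical values).
target = [[1.0, 0.1], [0.5, 0.05]]
interferer = [[0.2, 0.8], [0.5, 0.6]]
mixture = [[1.2, 0.9], [1.0, 0.65]]

mask = ideal_binary_mask(target, interferer)
separated = apply_mask(mixture, mask)
```

The hard on/off decision per cell is what yields high interferer suppression. Isolated cells that switch on and off are also the usual source of the musical noise described above.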