Separation of underdetermined audio mixtures is often performed in the Time-Frequency (TF) domain by masking each TF element according to its target-to-mixture ratio. This work uses sigmoidal functions to map the target-to-mixture ratio to mask values. The family of functions used includes the ratio mask and an approximation of the binary mask. Mixtures are chosen to represent a range of amounts of TF overlap, then separated and evaluated using objective measures. PEASS results show that masking softer than that applied by binary or ratio masks can improve interferer suppression and artifact scores. These improvements yield a higher overall perceptual score, and the observation is repeated at multiple TF resolutions.
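The abstract does not give the exact sigmoidal functions, so the following is only a minimal sketch of one family with the stated properties, not the authors' formulation. It assumes the ratio is expressed as the log of target power over interferer power in each TF element, and that a single hypothetical `slope` parameter controls mask hardness: `slope=1` reproduces the power-domain ratio mask, large values approach a binary mask, and values below 1 give softer masking. The function name, the `eps` guard, and the example values are all illustrative assumptions.

```python
import numpy as np

def sigmoid_mask(target_power, interferer_power, slope=1.0, eps=1e-12):
    """Map per-element target and interferer power to a mask value in [0, 1].

    Assumed form (not taken from the paper): a logistic function of the
    log power ratio. slope=1 gives target / (target + interferer).
    """
    target_power = np.asarray(target_power, dtype=float)
    interferer_power = np.asarray(interferer_power, dtype=float)
    # Log ratio of target to interferer power; eps avoids log(0).
    log_ratio = np.log(target_power + eps) - np.log(interferer_power + eps)
    # Clip the exponent so np.exp cannot overflow for steep slopes.
    exponent = np.clip(-slope * log_ratio, -700.0, 700.0)
    return 1.0 / (1.0 + np.exp(exponent))

# Three TF elements: interferer-dominated, equal power, target-dominated.
target = np.array([1.0, 4.0, 9.0])
interferer = np.array([9.0, 4.0, 1.0])

ratio_like = sigmoid_mask(target, interferer, slope=1.0)    # ratio mask
binary_like = sigmoid_mask(target, interferer, slope=50.0)  # near-binary mask
softer = sigmoid_mask(target, interferer, slope=0.5)        # softer than ratio mask
```

In this sketch the mask would be multiplied element-wise with the mixture's TF representation before inversion to the time domain; sweeping `slope` is one way to compare harder and softer masking on the same mixture.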