Perceptual Quality of Audio Separated Using Sigmoidal Masks

Stokes, Toby; Hummersone, Christopher; Brookes, Tim; Mason, Andrew

AES E-Library

Separation of underdetermined audio mixtures is often performed in the Time-Frequency (TF) domain by masking each TF element according to its target-to-mixture ratio. This work uses sigmoidal functions to map the target-to-mixture ratio to mask values. The series of functions used encompasses the ratio mask and an approximation of the binary mask. Mixtures are chosen to represent a range of amounts of TF overlap, then separated and evaluated using objective measures. PEASS results show that improved interferer suppression and artifact scores can be achieved using softer masking than that applied by binary or ratio masks. The improvement in these scores gives an improved overall perceptual score; this observation is repeated at multiple TF resolutions.
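
The abstract does not give the form of the sigmoidal functions, so the following is only a minimal sketch of one family that behaves as described, not the authors' method. It assumes the target-to-mixture ratio lies in [0, 1] and applies a logistic function to its log-odds. The names sigmoidal_mask, apply_mask and slope are hypothetical. Under this assumption, slope = 1 reproduces the ratio mask, a large slope approximates a binary mask with a threshold of 0.5, and a slope below 1 gives a mask softer than the ratio mask.

```python
import math


def sigmoidal_mask(ratio, slope):
    """Map a target-to-mixture ratio in [0, 1] to a mask value in [0, 1].

    Hypothetical parameterisation (not taken from the paper): a logistic
    function of the log-odds of the ratio, scaled by `slope`.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    if slope <= 0.0:
        raise ValueError("slope must be positive")
    # The endpoints map to themselves for every slope; this avoids log(0).
    if ratio == 0.0 or ratio == 1.0:
        return ratio
    z = slope * math.log(ratio / (1.0 - ratio))
    # Numerically stable logistic: only ever exponentiate a non-positive value.
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def apply_mask(mixture_tf, ratios, slope):
    """Scale each TF element of the mixture by its sigmoidal mask value.

    `mixture_tf` and `ratios` are equally shaped lists of rows; the mixture
    holds (complex) TF coefficients, `ratios` the target-to-mixture ratios.
    """
    return [
        [sigmoidal_mask(r, slope) * x for x, r in zip(row_x, row_r)]
        for row_x, row_r in zip(mixture_tf, ratios)
    ]
```

For example, with a ratio of 0.8 this family returns 0.8 at slope 1 (ratio mask), a value close to 1 at slope 50 (near-binary), and roughly 0.67 at slope 0.5 (softer than the ratio mask).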

Authors: Stokes, Toby; Hummersone, Christopher; Brookes, Tim; Mason, Andrew
Affiliations: University of Surrey, Guildford, Surrey, UK; BBC Research and Development, London, UK (see document for exact affiliation information)
AES Convention: 137 (October 2014)
Paper Number: 9182
Publication Date: October 8, 2014
Subject: Signal Processing
Permalink: https://www.aes.org/e-lib/browse.cfm?elib=17505

E-Library Location: (CD 137Papers) /conv/137/9182.pdf
