"Sparsification" of Audio Signals Using the MDCT/IntMDCT and a Psychoacoustic Model—Application to Informed Audio Source Separation
J. Pinel and L. Girin, "'Sparsification' of Audio Signals Using the MDCT/IntMDCT and a Psychoacoustic Model—Application to Informed Audio Source Separation," Paper 4-4, (July 2011).
Abstract: Sparse representations have proved a very useful tool in a variety of domains, e.g. speech/music source separation. As strictly sparse representations (in the sense of l0) are often impossible to achieve, other ways of studying signal sparsity have been proposed. In this paper, we revisit the irrelevance filtering analysis-synthesis approach proposed in (Balazs et al., IEEE Trans. ASLP, 18(1), 2010), where the time-frequency (TF) coefficients that are below some masking threshold are set to zero. Instead of using the Gabor transform and a specific psychoacoustic model, we use tools directly inspired by perceptual audio coding, for instance MPEG-AAC. We show that significantly better "sparsification performance" is obtained on music signals, at lower computational cost. We then apply the sparsification process to the informed source separation (ISS) problem and show that it significantly decreases the computational cost at the ISS decoder.
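The zeroing step mentioned in the abstract (coefficients below a masking threshold are set to zero) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sparsify` and `zero_fraction`, the sample coefficients, and the per-bin thresholds are all hypothetical, and no transform or psychoacoustic model is computed here. It only assumes that one frame of integer coefficients and one threshold per coefficient are already available.

```python
def sparsify(coeffs, thresholds):
    """Zero every coefficient whose magnitude is strictly below its threshold."""
    if len(coeffs) != len(thresholds):
        raise ValueError("coeffs and thresholds must have the same length")
    return [c if abs(c) >= t else 0 for c, t in zip(coeffs, thresholds)]


def zero_fraction(coeffs):
    """Fraction of coefficients that are exactly zero (0.0 for an empty frame)."""
    if not coeffs:
        return 0.0
    return sum(1 for c in coeffs if c == 0) / len(coeffs)


# Hypothetical integer coefficients for one frame and made-up per-bin
# masking thresholds; real values would come from a transform such as the
# IntMDCT and from a psychoacoustic model, neither of which is modeled here.
frame = [120, -3, 7, -45, 2, 0, 15, -1]
masking = [10, 10, 8, 8, 5, 5, 20, 20]

sparse_frame = sparsify(frame, masking)
print(sparse_frame)
print(zero_fraction(sparse_frame))
```

A coefficient exactly equal to its threshold is kept in this sketch; whether the boundary case is kept or zeroed is a design choice the abstract does not specify.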
@article{pinel2011sparsification,
author={Pinel, Jonathan and Girin, Laurent},
journal={Journal of the Audio Engineering Society},
title={{``Sparsification''} of Audio Signals Using the {MDCT/IntMDCT} and a Psychoacoustic Model---Application to Informed Audio Source Separation},
year={2011},
volume={},
number={},
pages={},
doi={},
month={July},}
TY - paper
TI - "Sparsification" of Audio Signals Using the MDCT/IntMDCT and a Psychoacoustic Model—Application to Informed Audio Source Separation
SP -
EP -
AU - Pinel, Jonathan
AU - Girin, Laurent
PY - 2011
JO - Journal of the Audio Engineering Society
IS -
VO -
VL -
Y1 - July 2011
Authors:
Pinel, Jonathan; Girin, Laurent
Affiliation:
Grenoble Institute of Technology, Grenoble, France
AES Conference:
42nd International Conference: Semantic Audio (July 2011)
Paper Number:
4-4
Publication Date:
July 22, 2011
Subject:
Informed Source Separation
Permalink:
http://www.aes.org/e-lib/browse.cfm?elib=15956