Sparse Time-Frequency Representations for Polyphonic Audio Based on Combined Efficient Fan-Chirp Transforms

Costa, Maurício V. M.; Apolinário, Isabela F.; Biscainho, Luiz W. P.

AES E-Library

In audio signal processing, many techniques rely on a time-frequency representation (TFR) of the signal, particularly in music information retrieval applications such as automatic music transcription, sound source separation, and classification of the instruments playing in a musical piece. This paper presents a novel method for obtaining a sparse TFR by combining different instances of the Fan-Chirp Transform (FChT). The method comprises two main steps: computing multiple FChTs by means of the structure tensor, and combining them, along with spectrograms, using the smoothed local sparsity method. Experiments with synthetic and real-world audio signals suggest that the proposed method yields much better TFRs than the standard short-time Fourier transform, especially in the presence of fast frequency variations, which makes the FChT usable for polyphonic audio signals. As a result, the proposed method allows more precise information to be extracted from audio signals with multiple sources.
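
To make the combination step more concrete, here is a minimal sketch of choosing among several candidate TFRs by sparsity. It is not the authors' smoothed local sparsity method, whose details are not given in the abstract. It is a simplified, per-frame stand-in that assumes sparsity is scored with the Gini index and that the sparsest candidate is kept for each frame. The function names `gini_index` and `combine_by_sparsity` and the toy data are hypothetical.

```python
def gini_index(values):
    """Gini index of a magnitude vector: 0 for a flat vector,
    approaching 1 as the energy concentrates in a single bin."""
    mags = sorted(abs(v) for v in values)  # ascending order
    n = len(mags)
    total = sum(mags)
    if n == 0 or total == 0:
        return 0.0  # convention chosen here for empty or silent frames
    weighted = sum(
        (m / total) * ((n - k + 0.5) / n)
        for k, m in enumerate(mags, start=1)
    )
    return 1.0 - 2.0 * weighted


def combine_by_sparsity(tfrs):
    """Per frame, keep the candidate TFR whose spectrum is sparsest.

    `tfrs` is a list of candidate TFRs; each TFR is a list of frames and
    each frame is a list of magnitudes. All candidates must have the same
    number of frames. Returns (combined TFR, index chosen per frame).
    Assumption: selection is per frame; a local method would instead
    score small time-frequency neighbourhoods and smooth the result.
    """
    n_frames = len(tfrs[0])
    if any(len(tfr) != n_frames for tfr in tfrs):
        raise ValueError("all candidate TFRs need the same number of frames")
    combined, choices = [], []
    for i in range(n_frames):
        # Ties are resolved in favour of the first candidate.
        best = max(range(len(tfrs)), key=lambda j: gini_index(tfrs[j][i]))
        choices.append(best)
        combined.append(list(tfrs[best][i]))
    return combined, choices


# Toy candidates: each one is sharp in a different frame.
tfr_a = [[1, 1, 1, 1], [0, 0, 4, 0]]
tfr_b = [[0, 3, 0, 0], [1, 1, 1, 1]]
combined, choices = combine_by_sparsity([tfr_a, tfr_b])
```

In this toy case the second candidate is kept for the first frame and the first candidate for the second, because those are the frames in which each candidate concentrates its energy in a single bin.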

Authors: Costa, Maurício V. M.; Apolinário, Isabela F.; Biscainho, Luiz W. P.
Affiliations: Signals, Multimedia, and Telecommunications Lab (SMT) – DEL/Poli & PEE/COPPE, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil (all three authors; see document for exact affiliation information)
JAES Volume 67 Issue 11 pp. 894-905; November 2019
Publication Date: November 22, 2019
Permalink: https://www.aes.org/e-lib/browse.cfm?elib=20703

E-Library Location: (CD JAES67) /jaes67/11/pg894.pdf

DOI: https://doi.org/10.17743/jaes.2019.0039
