This paper describes a novel, low-cost method for combining time-frequency representations into a sparser one. To this end, a new local quality measure based on an amplitude-weighted version of the Hoyer sparsity is proposed. The time-frequency resolution attained is assessed with a detailed evaluation procedure that employs a dataset of melodic signals with nearly perfect f0 annotations and a set of white-noise pulses. The proposed method is shown to produce state-of-the-art results among existing combination methods in terms of energy concentration at frequency contours, onsets, and offsets, while meeting the most desirable requirements: high time-frequency resolution, low computational cost, and the capability of combining representations with non-linear frequency scales.
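For readers unfamiliar with the measure named in the abstract, the following is a minimal sketch of the standard (unweighted) Hoyer sparsity of a vector of magnitudes, which is 0 for a perfectly flat vector and 1 when a single entry carries all the energy. It does not reproduce the paper's amplitude-weighted, local variant, whose exact definition is not given here; the function name `hoyer_sparsity` and the convention of returning 0 for an all-zero input are assumptions made for illustration.

```python
import math


def hoyer_sparsity(values):
    """Standard Hoyer sparsity: (sqrt(n) - L1/L2) / (sqrt(n) - 1).

    Illustrative only; the paper's amplitude-weighted local measure
    is NOT implemented here.
    """
    mags = [abs(v) for v in values]
    n = len(mags)
    if n < 2:
        raise ValueError("need at least two values")
    l1 = sum(mags)                              # L1 norm
    l2 = math.sqrt(sum(m * m for m in mags))    # L2 norm
    if l2 == 0.0:
        return 0.0  # assumed convention for an all-zero input
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1.0)


if __name__ == "__main__":
    # Energy concentrated in one bin versus spread evenly over four bins.
    print(hoyer_sparsity([0.0, 0.0, 3.0, 0.0]))
    print(hoyer_sparsity([1.0, 1.0, 1.0, 1.0]))
```

In a combination setting, a measure of this kind could be evaluated over small time-frequency neighbourhoods of each candidate representation so that the locally sparsest one is favoured, though how the paper does this is not specified in the abstract.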
https://www.aes.org/e-lib/browse.cfm?elib=21889