Music Structure Boundaries Estimation Using Multiple Self-Similarity Matrices as Input Depth of Convolutional Neural Networks

Cohen-Hadria, Alice; Peeters, Geoffroy

AES E-Library

In this paper, we propose a new input representation for a Convolutional Neural Network, with the goal of detecting music structure boundaries. For this task, previous work used a late fusion of two networks, one taking a Mel-scaled Log-magnitude Spectrogram (MLS) as input and the other taking lag matrices. We propose instead to use several self-similarity matrices, each representing a different audio descriptor, combined along the depth of the input layer. We show that this representation improves the results over the use of the lag matrix. We also show that using the depth of the input layer provides a convenient way to perform early fusion of representations.
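The idea of stacking several self-similarity matrices along the input depth can be illustrated with a minimal sketch. It is not the authors' implementation: the cosine similarity measure, the function names `self_similarity` and `stack_ssms`, the two descriptor names, and the random feature values are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def self_similarity(features):
    """Cosine self-similarity matrix of a (frames, dims) feature sequence.

    Cosine similarity is an assumed choice; the abstract does not name one.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return unit @ unit.T

def stack_ssms(feature_sets):
    """Stack one SSM per descriptor along a trailing depth (channel) axis."""
    return np.stack([self_similarity(f) for f in feature_sets], axis=-1)

rng = np.random.default_rng(0)
n_frames = 64

# Random placeholders standing in for two hypothetical audio descriptors.
timbre = rng.standard_normal((n_frames, 13))
harmony = rng.standard_normal((n_frames, 12))

# Each descriptor contributes one channel of the CNN input (early fusion).
cnn_input = stack_ssms([timbre, harmony])
print(cnn_input.shape)  # (64, 64, 2)
```

Because every descriptor yields a matrix of the same frames-by-frames size regardless of its own dimensionality, the matrices can share one input tensor, much like the colour channels of an image.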

Authors: Cohen-Hadria, Alice; Peeters, Geoffroy
Affiliation: IRCAM, Paris, France
AES Conference: 2017 AES International Conference on Semantic Audio (June 2017)
Paper Number: 5-3
Publication Date: June 13, 2017
Subject: Deep Learning
Permalink: https://www.aes.org/e-lib/browse.cfm?elib=18763

E-Library Location: /conf/2017/semantic/semantic_audio_2017_paper_30.pdf
