Deep Learning for Timbre Modification and Transfer: An Evaluation Study
L. Gabrielli, C. E. Cella, F. Vesperini, D. Droghini, E. Principi, and S. Squartini, "Deep Learning for Timbre Modification and Transfer: An Evaluation Study," Paper 9996, (May 2018).
Abstract: In recent years, several hybridization techniques have been proposed to synthesize novel audio content that derives its properties from two audio sources. These algorithms, however, usually provide no feature learning, leaving the user, often intentionally, to explore parameters by trial and error. The introduction of machine learning algorithms in the music processing field calls for an investigation into how their properties, such as the ability to learn semantically meaningful features, can be exploited. In this first work we adopt a neural network autoencoder architecture and enhance it to exploit temporal dependencies. In our experiments the architecture was able to modify the original timbre so that it resembled what was learned during the training phase, while preserving the pitch envelope of the input.
@article{gabrielli2018deep,
author={Gabrielli, Leonardo and Cella, Carmine Emanuel and Vesperini, Fabio and Droghini, Diego and Principi, Emanuele and Squartini, Stefano},
journal={Journal of the Audio Engineering Society},
title={Deep Learning for Timbre Modification and Transfer: An Evaluation Study},
year={2018},
volume={},
number={},
pages={},
doi={},
month={may},
}
TY - paper
TI - Deep Learning for Timbre Modification and Transfer: An Evaluation Study
SP -
EP -
AU - Gabrielli, Leonardo
AU - Cella, Carmine Emanuel
AU - Vesperini, Fabio
AU - Droghini, Diego
AU - Principi, Emanuele
AU - Squartini, Stefano
PY - 2018
JO - Journal of the Audio Engineering Society
IS -
VO -
VL -
Y1 - May 2018
Authors:
Gabrielli, Leonardo; Cella, Carmine Emanuel; Vesperini, Fabio; Droghini, Diego; Principi, Emanuele; Squartini, Stefano
Affiliations:
Università Politecnica delle Marche, Ancona, Italy; IRCAM, Paris, France (See document for exact affiliation information.)
AES Convention:
144 (May 2018)
Paper Number:
9996
Publication Date:
May 14, 2018
Subject:
Audio Processing and Effects – Part 1
Permalink:
http://www.aes.org/e-lib/browse.cfm?elib=19513