Low Latency Timbre Interpolation and Warping using Autoencoding Neural Networks
Citation & Abstract
J. Colonel and S. Keene, "Low Latency Timbre Interpolation and Warping using Autoencoding Neural Networks," Paper 10406, (2020 October).
Abstract: A lightweight algorithm for low latency timbre interpolation of two input audio streams using an autoencoding neural network is presented. Short-time Fourier transform magnitude frames of each audio stream are encoded, and a new interpolated representation is created within the autoencoder’s latent space. This new representation is passed to the decoder, which outputs a spectrogram. An initial phase estimation for the new spectrogram is calculated using the original phase of the two audio streams. Inversion to the time domain is done using a Griffin-Lim iteration. A method for avoiding pops between processed batches is discussed. An open source implementation in Python is made available.
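The pipeline the abstract describes — encode STFT magnitude frames of two streams, interpolate in the latent space, decode to a new magnitude spectrogram, seed the phase from the two input phases, then invert with Griffin-Lim — can be sketched as follows. This is a minimal NumPy/SciPy illustration, not the paper's implementation: the random-projection encoder/decoder below merely stands in for the trained autoencoder, the sine tones stand in for real audio streams, and all parameter choices (frame size, latent width, iteration count, interpolation weight) are illustrative assumptions.

```python
import numpy as np
from scipy.signal import stft, istft

rng = np.random.default_rng(0)
fs, nperseg = 16000, 1024

# Two toy input streams (sine tones standing in for real audio)
t = np.arange(fs) / fs
x1 = np.sin(2 * np.pi * 220 * t)
x2 = np.sin(2 * np.pi * 440 * t)

# STFT of each stream, split into magnitude and phase
_, _, Z1 = stft(x1, fs=fs, nperseg=nperseg)
_, _, Z2 = stft(x2, fs=fs, nperseg=nperseg)
m1, p1 = np.abs(Z1), np.angle(Z1)
m2, p2 = np.abs(Z2), np.angle(Z2)

# Stand-in linear "autoencoder": a random projection encoder and its
# pseudo-inverse as decoder (the paper trains a neural network instead)
n_bins, n_latent = m1.shape[0], 64
W = rng.standard_normal((n_latent, n_bins)) / np.sqrt(n_bins)
W_dec = np.linalg.pinv(W)
encode = lambda m: W @ m
decode = lambda z: np.clip(W_dec @ z, 0.0, None)  # magnitudes are nonnegative

# Interpolate in latent space, then decode to a new magnitude spectrogram
alpha = 0.5
z = (1 - alpha) * encode(m1) + alpha * encode(m2)
mag = decode(z)

# Initial phase estimate from the two input phases, then Griffin-Lim:
# alternate inverse/forward STFTs, keeping the decoded magnitude fixed
phase = np.angle((1 - alpha) * np.exp(1j * p1) + alpha * np.exp(1j * p2))
for _ in range(32):
    _, y = istft(mag * np.exp(1j * phase), fs=fs, nperseg=nperseg)
    _, _, Z = stft(y, fs=fs, nperseg=nperseg)
    phase = np.angle(Z)[:, :mag.shape[1]]  # guard against frame-count drift

_, y = istft(mag * np.exp(1j * phase), fs=fs, nperseg=nperseg)  # output audio
```

Warm-starting the phase from the inputs, rather than from random noise, is what lets a small number of Griffin-Lim iterations suffice for low-latency batchwise operation.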
@article{colonel2020low,
  author = {Colonel, Joseph and Keene, Sam},
  journal = {Journal of the Audio Engineering Society},
  title = {Low Latency Timbre Interpolation and Warping using Autoencoding Neural Networks},
  year = {2020},
  volume = {},
  number = {},
  pages = {},
  doi = {},
  month = {October},
  abstract = {A lightweight algorithm for low latency timbre interpolation of two input audio streams using an autoencoding neural network is presented. Short-time Fourier transform magnitude frames of each audio stream are encoded, and a new interpolated representation is created within the autoencoder's latent space. This new representation is passed to the decoder, which outputs a spectrogram. An initial phase estimation for the new spectrogram is calculated using the original phase of the two audio streams. Inversion to the time domain is done using a Griffin-Lim iteration. A method for avoiding pops between processed batches is discussed. An open source implementation in Python is made available.},
}
TY - paper
TI - Low Latency Timbre Interpolation and Warping using Autoencoding Neural Networks
SP -
EP -
AU - Colonel, Joseph
AU - Keene, Sam
PY - 2020
JO - Journal of the Audio Engineering Society
IS -
VO -
VL -
Y1 - October 2020
AB - A lightweight algorithm for low latency timbre interpolation of two input audio streams using an autoencoding neural network is presented. Short-time Fourier transform magnitude frames of each audio stream are encoded, and a new interpolated representation is created within the autoencoder’s latent space. This new representation is passed to the decoder, which outputs a spectrogram. An initial phase estimation for the new spectrogram is calculated using the original phase of the two audio streams. Inversion to the time domain is done using a Griffin-Lim iteration. A method for avoiding pops between processed batches is discussed. An open source implementation in Python is made available.
Authors:
Colonel, Joseph; Keene, Sam
Affiliations:
Queen Mary University of London, UK; The Cooper Union for the Advancement of Science and Art, New York, NY, USA (see document for exact affiliation information)
AES Convention:
149 (October 2020)
Paper Number:
10406
Publication Date:
October 22, 2020
Subject:
Audio Processing
Permalink:
http://www.aes.org/e-lib/browse.cfm?elib=20943