Enhancing LSTM RNN-Based Speech Overlap Detection by Artificially Mixed Data
G. Hagerer, V. Pandit, F. Eyben, and B. Schuller, "Enhancing LSTM RNN-Based Speech Overlap Detection by Artificially Mixed Data," Paper P1-1, (June 2017).
Abstract: This paper presents a new method for Long Short-Term Memory Recurrent Neural Network (LSTM) based speech overlap detection. To this end, speech overlap data is created artificially by mixing large amounts of speech utterances. Our elaborate training strategies and presented network structures demonstrate performance surpassing the considered state-of-the-art overlap detectors. Thereby we target the full ternary task of non-speech, speech, and overlap detection. Furthermore, speakers' gender is recognised, as the first successful combination of this kind within one model.
@article{hagerer2017enhancing,
author={Hagerer, Gerhard and Pandit, Vedhas and Eyben, Florian and Schuller, Björn},
journal={Journal of the Audio Engineering Society},
title={Enhancing {LSTM} {RNN}-Based Speech Overlap Detection by Artificially Mixed Data},
year={2017},
volume={},
number={},
pages={},
doi={},
month={june},
abstract={This paper presents a new method for Long Short-Term Memory Recurrent Neural Network (LSTM) based speech overlap detection. To this end, speech overlap data is created artificially by mixing large amounts of speech utterances. Our elaborate training strategies and presented network structures demonstrate performance surpassing the considered state-of-the-art overlap detectors. Thereby we target the full ternary task of non-speech, speech, and overlap detection. Furthermore, speakers' gender is recognised, as the first successful combination of this kind within one model.},
}
TY - paper
TI - Enhancing LSTM RNN-Based Speech Overlap Detection by Artificially Mixed Data
SP -
EP -
AU - Hagerer, Gerhard
AU - Pandit, Vedhas
AU - Eyben, Florian
AU - Schuller, Björn
PY - 2017
JO - Journal of the Audio Engineering Society
IS -
VO -
VL -
Y1 - June 2017
AB - This paper presents a new method for Long Short-Term Memory Recurrent Neural Network (LSTM) based speech overlap detection. To this end, speech overlap data is created artificially by mixing large amounts of speech utterances. Our elaborate training strategies and presented network structures demonstrate performance surpassing the considered state-of-the-art overlap detectors. Thereby we target the full ternary task of non-speech, speech, and overlap detection. Furthermore, speakers' gender is recognised, as the first successful combination of this kind within one model.
Authors:
Hagerer, Gerhard; Pandit, Vedhas; Eyben, Florian; Schuller, Björn
Affiliations:
audEERING GmbH, Gilching, Germany; University of Passau, Passau, Germany (See document for exact affiliation information.)
AES Conference:
2017 AES International Conference on Semantic Audio (June 2017)
Paper Number:
P1-1
Publication Date:
June 13, 2017
Subject:
Semantic Audio
Permalink:
http://www.aes.org/e-lib/browse.cfm?elib=18764