Semantic Speech Tagging: Towards Combined Analysis of Speaker Traits
B. Schuller, M. Wöllmer, F. Eyben, G. Rigoll, and D. Arsic, "Semantic Speech Tagging: Towards Combined Analysis of Speaker Traits," Paper 2-1, (July 2011).
Abstract: A number of paralinguistic problems, such as emotion, health state, or personality, are often dealt with in isolation. However, there are also good examples of mutual benefit, mostly incorporating knowledge of speaker gender. In this paper we address the question of how further paralinguistic information, such as speaker age, height, or race, can be beneficial when its ground truth is provided within single-task speaker classification. Tests with openSMILE's 1.5 k Paralinguistic Challenge feature set on the TIMIT corpus of 630 speakers reveal a significant boost in accuracy or cross-correlation, depending on the representation form of the problem at hand.
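One way to picture the idea in the abstract, supplying ground-truth knowledge of other speaker traits to a single-task classifier, is to append those known traits to the acoustic feature vector before classification. The following is only a minimal sketch under that assumption; the abstract does not say how the paper actually combines the information, and the function names, trait names, and category lists here are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def one_hot(value: str, categories: Sequence[str]) -> List[float]:
    """Encode a categorical trait as a one-hot vector over `categories`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]


def augment_features(
    acoustic: Sequence[float],
    categorical: Dict[str, Tuple[str, Sequence[str]]],
    numeric: Dict[str, float],
) -> List[float]:
    """Append known speaker traits to an acoustic feature vector.

    `categorical` maps a trait name to (value, list of possible values);
    `numeric` maps a trait name to a number. Both are assumed to hold
    ground-truth side information, not predictions. Traits are appended
    in sorted name order so every speaker gets the same vector layout.
    """
    augmented = list(acoustic)
    for name in sorted(categorical):
        value, categories = categorical[name]
        augmented.extend(one_hot(value, categories))
    for name in sorted(numeric):
        augmented.append(float(numeric[name]))
    return augmented


if __name__ == "__main__":
    # Toy three-dimensional vector standing in for the much larger
    # openSMILE feature set; the values are made up for illustration.
    acoustic = [0.1, 0.2, 0.3]
    vector = augment_features(
        acoustic,
        categorical={"gender": ("male", ["female", "male"])},
        numeric={"height_cm": 180.0},
    )
    print(vector)
```

The augmented vector would then be passed to whatever classifier or regressor handles the target trait, with the target trait itself left out of the side information.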
@article{schuller2011semantic,
author={Schuller, Björn and Wöllmer, Martin and Eyben, Florian and Rigoll, Gerhard and Arsic, Dejan},
journal={Journal of the Audio Engineering Society},
title={Semantic Speech Tagging: Towards Combined Analysis of Speaker Traits},
year={2011},
volume={},
number={},
pages={},
doi={},
month={july},}
TY - paper
TI - Semantic Speech Tagging: Towards Combined Analysis of Speaker Traits
SP -
EP -
AU - Schuller, Björn
AU - Wöllmer, Martin
AU - Eyben, Florian
AU - Rigoll, Gerhard
AU - Arsic, Dejan
PY - 2011
JO - Journal of the Audio Engineering Society
IS -
VO -
VL -
Y1 - July 2011
Authors:
Schuller, Björn; Wöllmer, Martin; Eyben, Florian; Rigoll, Gerhard; Arsic, Dejan
Affiliations:
Müller-BBM Vibroakustiksysteme, Planegg, Germany; Technische Universität München, Munich, Germany (See document for exact affiliation information.)
AES Conference:
42nd International Conference: Semantic Audio (July 2011)
Paper Number:
2-1
Publication Date:
July 22, 2011
Subject:
Speech Processing and Analysis
Permalink:
http://www.aes.org/e-lib/browse.cfm?elib=15947