AES E-Library

Premature Overspecialization in Emotion Recognition Systems

Emotion recognition from speech, the ability to identify expressed emotional states in vocal utterances, is an inherent ability humans apply in their daily interactions. Though heavily researched, automatic emotion recognition has yet to match human performance levels, which may be due to the overspecialization of most automatic recognition systems or their inability to adapt to non-emotional human conversational traits. Given that these traits may contain information pertinent to a speech-based recognition system, generalization should be emphasized in the early stages of emotional feature extraction. To support this, an application of the VGGVox speaker recognition model has been evaluated for emotional feature extraction. Results on state-of-the-art classifiers were comparable to other recent speech emotion recognition techniques.

Permalink: https://www.aes.org/e-lib/browse.cfm?elib=20471
