Toward Language-Agnostic Speech Emotion Recognition

Ntalampiras, Stavros

Cross-language speech emotion recognition is receiving increased attention due to its extensive real-world applicability. This work proposes a language-agnostic speech emotion recognition algorithm focusing on the Italian and German languages. Mel-scaled and temporal modulation spectral representations are combined and then modeled by means of Gaussian mixture models. Emotion prediction is carried out via a Kullback-Leibler divergence scheme. The proposed methodology is applied to two problem settings: a binary one that classifies positive vs. negative emotions, and a second one in which all Big Six emotional states are considered. A thorough experimental campaign demonstrated the efficacy of the method, as well as its superiority over other generative modeling schemes and state-of-the-art approaches. The results demonstrate the feasibility of recognizing emotional states in a language-, gender-, and speaker-independent setting.
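
The abstract does not say how the divergence-based decision is computed, so the following is only a minimal sketch of one way such a step could look, not the paper's implementation. It assumes each emotion class and each test utterance is already summarized by a Gaussian mixture, estimates KL(utterance || class) by Monte Carlo sampling (there is no closed form for mixtures), and picks the class with the smallest value. The 1-D mixtures, the parameter values, the direction of the divergence, and the names `gmm_logpdf`, `gmm_sample`, `kl_monte_carlo`, and `predict` are all hypothetical choices made for illustration.

```python
import math
import random

# A 1-D Gaussian mixture is a list of (weight, mean, std) tuples.
# Real feature vectors would be multi-dimensional; 1-D keeps the sketch short.


def gmm_logpdf(x, gmm):
    """Log-density of a 1-D Gaussian mixture at x (log-sum-exp for stability)."""
    logs = [
        math.log(w) - math.log(s) - 0.5 * math.log(2 * math.pi)
        - 0.5 * ((x - m) / s) ** 2
        for w, m, s in gmm
    ]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))


def gmm_sample(gmm, rng):
    """Draw one sample: pick a component by weight, then sample its Gaussian."""
    weights = [w for w, _, _ in gmm]
    _, m, s = rng.choices(gmm, weights=weights, k=1)[0]
    return rng.gauss(m, s)


def kl_monte_carlo(p, q, n_samples=5000, seed=0):
    """Estimate KL(p || q) as the mean of log p(x) - log q(x) over x ~ p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = gmm_sample(p, rng)
        total += gmm_logpdf(x, p) - gmm_logpdf(x, q)
    return total / n_samples


def predict(utterance_gmm, class_gmms):
    """Return the class label whose model has the smallest estimated divergence."""
    return min(
        class_gmms,
        key=lambda label: kl_monte_carlo(utterance_gmm, class_gmms[label]),
    )


# Hypothetical class models and test-utterance model (illustrative values only).
CLASS_GMMS = {
    "positive": [(0.6, 2.0, 0.5), (0.4, 3.0, 0.8)],
    "negative": [(0.5, -2.0, 0.6), (0.5, -1.0, 0.7)],
}
UTTERANCE = [(0.7, -1.8, 0.6), (0.3, -0.9, 0.5)]

if __name__ == "__main__":
    print(predict(UTTERANCE, CLASS_GMMS))
```

In this toy setup the utterance mixture sits close to the "negative" class model, so that label yields the smaller divergence. A fixed seed keeps the Monte Carlo estimate reproducible; in practice the mixtures would be fitted to the combined spectral features rather than written by hand.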

Author: Ntalampiras, Stavros
Affiliation: Department of Computer Science, University of Milan, Via Celoria 18, 20133 Milan, Italy
JAES Volume 68 Issue 1/2 pp. 7-13; January 2020
Publication Date: February 5, 2020
Permalink: https://www.aes.org/e-lib/browse.cfm?elib=20713

E-Library Location: (CD JAES68) /jaes68/1/pg7.pdf

DOI: https://doi.org/10.17743/jaes.2019.0045
