Since people regularly use computers for listening, emotion classification is an important part of human-computer interaction, with various applications in the industrial and commercial sectors. This research investigates and compares the recognition of vocal emotions by three different classifiers: multiclass support vector machine, AdaBoost, and random forests. The decisions of these classifiers are then combined using majority voting. The proposed method has been applied to two different emotional databases: the Surrey Audio-Visual Expressed Emotion (SAVEE) Database and the Polish Emotional Speech Database. A vector of 14 features was used to recognize seven basic emotions from the SAVEE database and six emotions from the Polish database. Features extracted from these databases include pitch, intensity, first through fourth formants and their bandwidths, mean autocorrelation, mean noise-to-harmonic ratio, and standard deviation. Recognition rates ranged from 71% to 87%.
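The majority-voting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `majority_vote` and `combine`, the example labels, and in particular the tie-breaking rule (when no label wins outright, take the tied label from the earliest classifier in the list) are assumptions, since the abstract does not say how disagreements among all three classifiers are resolved.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine the labels predicted by several classifiers for one utterance.

    predictions: list of labels, one per classifier, in priority order.
    Tie-breaking is an assumption of this sketch: among the labels that
    share the highest vote count, return the one predicted by the
    earliest classifier in the list.
    """
    counts = Counter(predictions)
    top_count = max(counts.values())
    tied = {label for label, count in counts.items() if count == top_count}
    # The first prediction that belongs to the tied set wins.
    return next(label for label in predictions if label in tied)


def combine(per_classifier_labels):
    """Apply majority voting utterance by utterance.

    per_classifier_labels: list of equal-length label sequences,
    one sequence per classifier.
    """
    return [majority_vote(list(votes)) for votes in zip(*per_classifier_labels)]


# Illustrative (made-up) predictions for four utterances.
svm_labels = ["anger", "happiness", "sadness", "fear"]
adaboost_labels = ["anger", "neutral", "sadness", "happiness"]
forest_labels = ["happiness", "neutral", "fear", "neutral"]

combined = combine([svm_labels, adaboost_labels, forest_labels])
print(combined)
```

In the fourth utterance all three classifiers disagree, so the assumed rule falls back to the first classifier in the list. With seven or six emotion classes such three-way disagreements are possible, which is why some explicit tie-breaking rule is needed.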