Human emotions can be recognized through speech analysis. A central problem in this field is the lack of databases with enough patterns for reliable training, which makes it harder for the learned models to generalize. One possible solution is to enlarge the training set by creating new virtual patterns. To do so, we modify the average pitch using the technique known as Pitch Synchronous Overlap and Add (PSOLA) combined with resampling, which changes the average pitch without altering either the pitch variations or the speech rate. The emotion in the utterance is therefore unaltered. Results on the original test set show that the proposed creation of new virtual training patterns significantly reduces the loss of generalization.
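The combination of PSOLA and resampling described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`psola_time_stretch`, `resample_to_length`, `shift_average_pitch`) are hypothetical, the pitch period is assumed to be constant and known in advance (real speech would need a pitch-mark detector), and resampling is done by simple linear interpolation. Under those assumptions, the idea is to first lengthen the signal by the desired pitch factor with PSOLA, which keeps the pitch, and then resample back to the original length, which scales the pitch by that factor while restoring the duration.

```python
import numpy as np


def psola_time_stretch(x, period, factor):
    """Time-stretch x by `factor` without changing its pitch (TD-PSOLA).

    Assumption for this sketch: the pitch period (in samples) is constant
    and known, so pitch marks are simply placed every `period` samples.
    """
    n_out = int(round(len(x) * factor))
    # Two-period Hann grain centered on each pitch mark; grains spaced one
    # period apart sum to a constant gain of 1.
    window = np.hanning(2 * period + 1)
    analysis_marks = np.arange(period, len(x) - period, period)
    synthesis_marks = np.arange(period, n_out - period, period)
    y = np.zeros(n_out)
    for t_s in synthesis_marks:
        # Reuse the input grain whose mark is closest to the matching input time.
        t_a = analysis_marks[np.argmin(np.abs(analysis_marks - t_s / factor))]
        grain = x[t_a - period : t_a + period + 1] * window
        y[t_s - period : t_s + period + 1] += grain
    return y


def resample_to_length(x, n_out):
    """Resample x to exactly n_out samples using linear interpolation."""
    src_pos = np.linspace(0.0, len(x) - 1, n_out)
    return np.interp(src_pos, np.arange(len(x)), x)


def shift_average_pitch(x, period, factor):
    """Scale the pitch of x by `factor` while keeping its duration."""
    stretched = psola_time_stretch(x, period, factor)  # longer, same pitch
    return resample_to_length(stretched, len(x))       # original length, shifted pitch
```

For instance, calling `shift_average_pitch(x, period=80, factor=1.25)` on a signal sampled at 16 kHz with a 200 Hz fundamental would return a signal of the same length whose fundamental sits near 250 Hz. Several factors close to 1 could then be applied to each training utterance to produce the virtual patterns.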