Speech enhancement performance may degrade when the peak level of the noisy speech is significantly different from the training datasets in Deep Neural Networks (DNN)-based speech enhancement algorithms, especially when the non-normalized features are used in practical applications, such as log-power spectra. To overcome this shortcoming, we introduce an automatic gain control (AGC) method as a preprocessing technique. By doing so, we can train the model with the same peak level of all the speech utterances. To further improve the proposed DNN-based algorithm, the feature compensation method is combined with the AGC method. Experimental results indicate that the proposed algorithm can maintain consistent performance when the peak of the noisy speech changes in a large range.
Click to purchase paper as a non-member or login as an AES member. If your company or school subscribes to the E-Library then switch to the institutional version. If you are not an AES member and would like to subscribe to the E-Library then Join the AES!
This paper costs $33 for non-members and is free for AES members and E-Library subscribers.