In many situations, measuring the amount and type of reverberation in a room assumes that the room impulse response is available for the computation. When that impulse response is not available, a nonintrusive room acoustic (NIRA) method must be used. In this report, the authors use the C50 clarity index to characterize reverberation in the signal because it has been shown to be more highly correlated with speech recognition performance than other measures of reverberation. Multiple features are extracted from a reverberant speech signal and then used to train a bidirectional long short-term memory model that maps from the feature space to the target C50 value. Prediction intervals, which provide upper and lower bounds on the estimate, can be derived from the standard deviation of the per-frame estimates. Confidence measures are then obtained by normalizing these prediction intervals. These measures are highly correlated with the absolute C50 estimation errors. The performance of the prediction intervals and confidence measures is shown to be consistent across many different noisy reverberant environments. The procedure proposed in this paper for deriving C50 prediction intervals and confidence measures could also be applied to the estimation of other room acoustic parameters, for example, T60 (reverberation decay time to 60 dB) or DRR (direct-to-reverberant ratio).
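As a rough illustration of the quantities involved, the sketch below computes C50 from a known impulse response using the standard definition (the ratio, in dB, of the energy in the first 50 ms after the direct path to the energy arriving later), and then shows one possible way to turn per-frame C50 estimates into a prediction interval and a normalized confidence value. The abstract does not give the exact formulas, so the function names, the interval multiplier `k`, the normalization constant `max_width`, and the choice of the strongest tap as the direct path are all assumptions made for this example and are not the authors' method.

```python
import numpy as np


def c50_from_rir(h, fs):
    """C50 in dB from a room impulse response h sampled at fs Hz."""
    h = np.asarray(h, dtype=float)
    # Assumption: the direct path is the strongest tap of the response.
    onset = int(np.argmax(np.abs(h)))
    split = onset + int(round(0.050 * fs))  # 50 ms after the direct path
    early = np.sum(h[onset:split] ** 2)
    late = np.sum(h[split:] ** 2)
    return 10.0 * np.log10(early / late)


def interval_and_confidence(frame_estimates, k=2.0, max_width=20.0):
    """Prediction interval and confidence from per-frame C50 estimates (dB).

    k and max_width are hypothetical parameters: the interval is
    mean +/- k * std, and the confidence is 1 minus the interval width
    divided by max_width, clipped to the range [0, 1].
    """
    est = np.asarray(frame_estimates, dtype=float)
    mean = est.mean()
    std = est.std(ddof=1)  # sample standard deviation across frames
    lower, upper = mean - k * std, mean + k * std
    confidence = 1.0 - min((upper - lower) / max_width, 1.0)
    return mean, (lower, upper), confidence
```

With this normalization, frames that agree closely give a narrow interval and a confidence near 1, while widely scattered per-frame estimates push the confidence toward 0.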