Sunday, October 12, 4:00 pm - 5:30 pm
Session Z6 Posters: Sound Quality and Listening Tests
Z6-1 Authentic Reproduction of Stereo Sound: A Wiener Filter Approach - Sang-Myeong Kim, Kwang-Institute of Science & Technology, Gwangju, Korea
This paper considers authentic binaural reproduction over loudspeakers using crosstalk cancellation. A systematic time-domain deconvolution method is presented using both stochastic and deterministic Wiener filters. This approach has an advantage over its frequency-domain counterpart in that no additional stabilization process is required, since the Wiener filter is inherently causal and stable. A series of reproduction tests was conducted in an anechoic chamber, varying the length of the Wiener filters, with a PC-based reproduction system and a soundcard. Excellent performance was achieved with a filter length of only 500: reproduction errors were less than 1 percent and 2 percent for the monaural and binaural reproduction tests, respectively.
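As a rough illustration of time-domain deconvolution with a deterministic Wiener (least-squares) filter, the sketch below designs a causal FIR inverse for a single channel. It is not the authors' implementation: the single-channel simplification (real crosstalk cancellation inverts a 2x2 matrix of responses), the toy impulse response, and the names `wiener_inverse` and `reproduction_error` are all assumptions made for illustration.

```python
def convolve(a, b):
    """Full linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def wiener_inverse(h, length, delay):
    """FIR filter w minimizing || h * w - delta[n - delay] ||^2.

    Solves the normal equations R w = p, where R is the Toeplitz
    autocorrelation matrix of h and p is the cross-correlation of h
    with the delayed unit impulse.
    """
    r = [sum(h[n] * h[n + k] for n in range(len(h) - k)) if k < len(h) else 0.0
         for k in range(length)]
    R = [[r[abs(i - j)] for j in range(length)] for i in range(length)]
    p = [h[delay - i] if 0 <= delay - i < len(h) else 0.0
         for i in range(length)]
    return solve(R, p)


def reproduction_error(h, w, delay):
    """Squared error of h * w against a unit impulse at `delay`."""
    y = convolve(h, w)
    d = [0.0] * len(y)
    d[delay] = 1.0
    return sum((a - b) ** 2 for a, b in zip(y, d))


# Toy (assumed) impulse response, not a measured loudspeaker response.
h = [1.0, 0.5]
w = wiener_inverse(h, length=16, delay=0)
err = reproduction_error(h, w, delay=0)
```

Because the filter is an FIR solution of a least-squares problem, it is causal and stable by construction; increasing `length` (or adding a modelling `delay` for non-minimum-phase responses) trades filter cost against residual error.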
Z6-2 Objective Evaluation of Noise Reduction Algorithms in Speech Applications - Karthikeyan Umapathy, Vijay Parsa, University of Western Ontario, London, Ontario, Canada
We objectively evaluated the comparative performance of five noise reduction algorithms. These algorithms were based on Short-Time Spectral Amplitude (STSA) estimation, subspace projection, wavelet packets with auditory masking, and time-frequency decompositions using matching pursuits. Speech stimuli corrupted by speech-shaped noise and multitalker babble at five different Signal-to-Noise Ratios (SNRs) were used to test the performance of the noise reduction algorithms. Noise reduction performance was quantified using two different methods. In the first method, the Perceptual Evaluation of Speech Quality (PESQ) measure was computed twice: once between the original and noisy speech, and once between the original and enhanced speech. The difference between these two PESQ values indicated the performance of the noise reduction algorithm. The second method was based on the phase-reversed noise technique, in which the noise reduction algorithm was run twice, once with speech + noise and once with speech + phase-reversed noise. The PESQ and SNR gain measures were then computed on the combined output. The results of this study indicate that the STSA-based algorithm performs better in terms of the amount of noise reduction, while the wavelet-packet-based algorithm performs better in terms of minimizing the speech distortion introduced by the noise reduction process.
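The phase-reversed noise technique can be sketched as follows: adding the two outputs cancels the noise and leaves the processed speech, while subtracting them leaves the processed noise, from which an SNR gain follows. This is a minimal sketch, not the authors' code; the separation is exact only for a linear processor, and the two-tap averaging filter and sinusoidal test signals are stand-ins chosen for illustration, not one of the five algorithms or the stimuli from the study.

```python
import math


def power(x):
    """Mean-square power of a sequence."""
    return sum(v * v for v in x) / len(x)


def snr_db(signal, noise):
    return 10.0 * math.log10(power(signal) / power(noise))


def phase_reversal_snr_gain(speech, noise, process):
    """SNR gain (dB) of `process`, estimated by the phase-reversal method."""
    out_plus = process([s + n for s, n in zip(speech, noise)])
    out_minus = process([s - n for s, n in zip(speech, noise)])
    # Sum cancels the noise; difference cancels the speech.
    speech_out = [(a + b) / 2 for a, b in zip(out_plus, out_minus)]
    noise_out = [(a - b) / 2 for a, b in zip(out_plus, out_minus)]
    return snr_db(speech_out, noise_out) - snr_db(speech, noise)


def two_tap_average(x):
    """Hypothetical stand-in processor: a simple low-pass filter."""
    return [(x[i] + (x[i - 1] if i > 0 else 0.0)) / 2 for i in range(len(x))]


# Assumed test signals: low-frequency "speech", high-frequency "noise".
speech = [math.sin(2 * math.pi * 0.01 * n) for n in range(1000)]
noise = [0.5 * math.sin(2 * math.pi * 0.4 * n) for n in range(1000)]

gain_db = phase_reversal_snr_gain(speech, noise, two_tap_average)
identity_gain_db = phase_reversal_snr_gain(speech, noise, lambda x: list(x))
```

The low-pass stand-in attenuates the high-frequency noise more than the speech, so `gain_db` is positive, whereas a processor that passes its input unchanged yields a gain of essentially zero.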
Z6-3 Directivity Balloons of Real and Artificial Mouth Simulators for Measurement of the Speech Transmission Index - Fabio Bozzoli, Angelo Farina, University of Parma, Parma, Italy
One of the most widely used intelligibility parameters is the Speech Transmission Index. The techniques for determining it employ an artificial speaker and listener. In many cases (e.g., large rooms or telecommunication systems) the precision of the directivity of an artificial mouth does not influence the result very much. In contrast, inside cars and in some other cases, the shape of the whole directivity balloon is important for determining correct and comparable values, and different mouths give different results in the same situation. Moreover, no current standard fixes the whole directivity balloon of an artificial mouth; the standards define limits only for some frontal positions. For these reasons we have determined, in an anechoic room, the directivities of a real speaker and of some artificial mouths. Finally, we have compared them to underline the need for a more precise standard in this field.
Z6-4 Intrusive Speech Transmission Quality Measurements for Low Bit-Rate Coded Audio Signals - Jan Holub, Czech Technical University, Prague, Czech Republic; Michael D. Street, NATO, The Hague, The Netherlands; Radislav Smid, Czech Technical University, Prague, Czech Republic
This paper describes an algorithm that allows intrusive speech transmission quality measurements in networks with low bit-rate coding schemes (600 to 2400 bits/s) as used in satellite and military communications. The proposed algorithm is based on the PESQ (ITU-T P.862) perceptual model, extended to be robust to input noise. The new algorithm also reflects the noise cancellation capabilities of modern audio coders. The correlation between Absolute Category Rating (ACR) Mean Opinion Score (MOS) listening test results and the output of the developed algorithm reaches 0.89 without knowledge of the original noise-free speech sample.
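A figure such as the 0.89 above is typically a correlation coefficient between subjective scores and the objective predictions for the same samples. The abstract does not say which coefficient was used; the sketch below assumes Pearson's, and the score lists are invented purely to show the call, not data from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented illustrative values (not from the paper):
# listening-test MOS and objective scores for the same five samples.
subjective_mos = [1.8, 2.4, 3.1, 3.6, 4.2]
objective_score = [2.0, 2.3, 3.4, 3.3, 4.0]

r = pearson(subjective_mos, objective_score)
```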
Z6-5 Automatic Level Alignment for the Arbitrary Multichannel Reproduction System - Se-Ung Kim, Sin-Lyul Lee, Lae-Hoon Kim, Koeng-Mo Sung, Seoul National University, Seoul, Korea
Correct level alignment of a multichannel reproduction system is critical to the quality of the reproduction. In conventional products, however, the level alignment is set manually by the user, and a non-expert user cannot easily set the correct level for each speaker. This paper shows how to apply binaural energy summation, evaluated over all available speaker positions, to level alignment for an arbitrary multichannel reproduction system. If the distance and angle of each speaker can be measured, and an omnidirectional microphone is available, the correct levels can be set automatically by applying binaural energy summation to the speaker positions thus obtained.
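One way to picture such an automatic alignment is sketched below. It is an assumption-laden illustration, not the method of the paper: it supposes that each speaker's level has been measured with the omnidirectional microphone at the listening position (so distance loss is already included), and that an angle-dependent correction `binaural_weight_db`, representing the summed energy at the two ears relative to the microphone reading, is available from some lookup. The function names and the flat placeholder weight are hypothetical.

```python
def alignment_trims_db(measured_db, angles_deg, binaural_weight_db, reference=0):
    """Per-speaker gain trims (dB) that equalize the estimated binaural level.

    measured_db        -- level of each speaker at the listening position (dB)
    angles_deg         -- azimuth of each speaker as seen by the listener
    binaural_weight_db -- hypothetical function: angle -> correction (dB) from
                          the omni reading to the summed energy at both ears
    reference          -- index of the speaker whose level is kept unchanged
    """
    perceived = [level + binaural_weight_db(angle)
                 for level, angle in zip(measured_db, angles_deg)]
    target = perceived[reference]
    return [target - p for p in perceived]


def flat_weight(angle_deg):
    """Placeholder weight: treats every direction alike (assumption)."""
    return 0.0


# Assumed example layout: three speakers with unequal measured levels.
trims = alignment_trims_db([70.0, 73.0, 68.0], [0.0, 30.0, -110.0], flat_weight)
```

With the flat placeholder the trims simply match the microphone readings to the reference speaker; a measured or modelled direction-dependent weight would shift the trims for speakers at the sides or rear.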