AES San Francisco 2008
Paper Session P23

Sunday, October 5, 9:00 am — 11:00 am

P23 - Audio DSP

Chair: Jon Boley, LSB Audio

P23-1 Determination and Correction of Individual Channel Time Offsets for Signals Involved in an Audio Mixture—Enrique Perez Gonzalez, Joshua Reiss, Queen Mary University of London - London, UK
A method for reducing comb-filtering effects due to delay time differences between audio signals in a sound mixer has been implemented. The method uses a multichannel cross-adaptive effect topology to automatically determine the minimal delay and polarity contributions required to optimize the sound mixture. The system uses real-time, time-domain transfer function measurements to determine and correct the individual channel offset for every signal involved in the audio mixture. The method has applications in live and recorded audio mixing where a single sound source must be recorded through more than one signal path, for example when recording a drum set with multiple microphones. Results are reported that determine the effectiveness of the proposed method.
Convention Paper 7631
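
The idea of measuring and undoing a per-channel delay and polarity offset can be illustrated with a minimal sketch. It does not reproduce the paper's transfer function measurement or its cross-adaptive topology; it swaps in a brute-force time-domain cross-correlation, and the function names `estimate_offset` and `align` are hypothetical.

```python
def estimate_offset(ref, sig, max_lag):
    """Find (lag, polarity) so that sig[n] is close to polarity * ref[n - lag].

    Assumption: a plain cross-correlation peak search stands in for the
    transfer function measurement described in the abstract.
    """
    best_lag, best_corr = 0, 0.0
    for lag in range(-max_lag, max_lag + 1):
        corr = 0.0
        for n, r in enumerate(ref):
            m = n + lag
            if 0 <= m < len(sig):
                corr += r * sig[m]
        # Compare magnitudes so an inverted channel is still detected.
        if abs(corr) > abs(best_corr):
            best_lag, best_corr = lag, corr
    polarity = -1 if best_corr < 0 else 1
    return best_lag, polarity


def align(sig, lag, polarity):
    """Shift sig earlier by lag samples and undo its polarity, zero-padding the edges."""
    n = len(sig)
    return [polarity * sig[i + lag] if 0 <= i + lag < n else 0.0
            for i in range(n)]
```

In a multi-microphone setup, one channel would serve as the reference and every other channel would be passed through `estimate_offset` and `align` before summing, which removes the delay that causes comb filtering in the mix.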

P23-2 STFT-Domain Estimation of Subband Correlations—Michael M. Goodwin, Creative Advanced Technology Center - Scotts Valley, CA, USA
Various frequency-domain and subband audio processing algorithms for upmix, format conversion, spatial coding, and other applications have been described in the recent literature. Many of these algorithms rely on measures of the subband autocorrelations and cross-correlations of the input audio channels. In this paper we consider several approaches for estimating subband correlations based on a short-time Fourier transform representation of the input signals.
Convention Paper 7632
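
As a rough illustration of what an STFT-domain correlation estimate can look like, the sketch below recursively averages per-bin products across frames. The abstract does not say which estimators the paper compares, so the recursive (exponential) averaging, the forgetting factor `forget`, the frame sizes, and the function names are all assumptions. A naive DFT is used to keep the example free of dependencies.

```python
import cmath
import math


def stft(x, frame=16, hop=8):
    """Hann-windowed STFT; returns a list of frames with frame // 2 + 1 bins each."""
    win = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame) for n in range(frame)]
    frames = []
    for start in range(0, len(x) - frame + 1, hop):
        seg = [x[start + n] * win[n] for n in range(frame)]
        frames.append([
            sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame)
                for n in range(frame))
            for k in range(frame // 2 + 1)
        ])
    return frames


def subband_correlations(x1, x2, frame=16, hop=8, forget=0.9):
    """Recursively averaged auto- and cross-correlations per bin.

    Returns (r11, r22, r12, coeff) as of the last frame, where coeff is the
    normalized cross-correlation r12 / sqrt(r11 * r22) for each bin.
    """
    bins = frame // 2 + 1
    r11 = [0.0] * bins
    r22 = [0.0] * bins
    r12 = [0j] * bins
    for f1, f2 in zip(stft(x1, frame, hop), stft(x2, frame, hop)):
        for k in range(bins):
            r11[k] = forget * r11[k] + (1 - forget) * abs(f1[k]) ** 2
            r22[k] = forget * r22[k] + (1 - forget) * abs(f2[k]) ** 2
            r12[k] = forget * r12[k] + (1 - forget) * f1[k] * f2[k].conjugate()
    eps = 1e-12  # guards against division by zero in empty bins
    coeff = [r12[k] / math.sqrt(r11[k] * r22[k] + eps) for k in range(bins)]
    return r11, r22, r12, coeff
```

A larger `forget` gives smoother but slower-reacting estimates; a smaller one tracks changes faster at the cost of more variance.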

P23-3 Separation of Singing Voice from Music Accompaniment with Unvoiced Sounds Reconstruction for Monaural Recordings—Chao-Ling Hsu, Jyh-Shing Roger Jang, National Tsing Hua University - Hsinchu, Taiwan; Te-Lu Tsai, Institute for Information Industry - Taipei, Taiwan
Separating singing voice from music accompaniment is an appealing but challenging problem, especially in the monaural case. One existing approach is based on computational audio scene analysis, which uses pitch as the cue to resynthesize the singing voice. However, the unvoiced parts of the singing voice are ignored entirely, since they have no pitch. This paper proposes a method to detect unvoiced parts of an input signal and to resynthesize them without using pitch information. Experimental results show that the unvoiced parts can be reconstructed successfully, with a signal-to-noise ratio 3.28 dB higher than that achieved by the current state-of-the-art method in the literature.
Convention Paper 7633

P23-4 Low Latency Convolution In One Dimension Via Two Dimensional Convolutions: An Intuitive Approach—Jeffrey Hurchalla, Garritan Corp. - Orcas, WA, USA
This paper presents a class of algorithms that can be used to efficiently perform the running convolution of a digital signal with a finite impulse response. The impulse response is uniformly partitioned and transformed into the frequency domain, changing the one-dimensional convolution into a two-dimensional convolution that can be efficiently solved with nested short-length acyclic convolution algorithms applied in the frequency domain. The latency of the running convolution is the time needed to acquire a block of data equal in size to the uniform partition length.
Convention Paper 7634
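
The structure described in this abstract can be sketched with a conventional uniformly partitioned overlap-save convolution. This is not the paper's nested acyclic algorithm: the sum over partitions below is evaluated directly, whereas the paper applies short acyclic convolution algorithms to that same sum. The function names and the naive DFT are assumptions made to keep the example self-contained.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive DFT (or inverse DFT); a real implementation would use an FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
               for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out


def partitioned_convolve(signal, impulse, block):
    """Running convolution with the impulse response split into equal partitions."""
    parts = -(-len(impulse) // block)  # ceiling division
    h = list(impulse) + [0.0] * (parts * block - len(impulse))
    # Spectrum of each partition, zero-padded to twice the block length.
    H = [dft(h[p * block:(p + 1) * block] + [0.0] * block) for p in range(parts)]

    total = len(signal) + len(impulse) - 1
    nblocks = -(-total // block)
    x = list(signal) + [0.0] * (nblocks * block - len(signal))

    history = []           # spectra of recent input windows, newest first
    prev = [0.0] * block
    out = []
    for m in range(nblocks):
        cur = x[m * block:(m + 1) * block]
        history.insert(0, dft(prev + cur))
        del history[parts:]
        # Per frequency bin, this is a convolution along the block index:
        # Y[m][k] = sum over p of X[m - p][k] * H[p][k].
        acc = [0j] * (2 * block)
        for p, X in enumerate(history):
            for k in range(2 * block):
                acc[k] += X[k] * H[p][k]
        y = dft(acc, inverse=True)
        out.extend(v.real for v in y[block:])  # overlap-save: keep the second half
        prev = cur
    return out[:total]
```

Each output block is produced as soon as one input block of `block` samples has arrived, which matches the latency statement in the abstract.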
