AES San Francisco 2012
Paper Session P15
Sunday, October 28, 2:00 pm — 5:30 pm (Room 121)
Paper Session: P15 - Signal Processing Fundamentals
Chair: Lars Villemoes, Dolby Sweden - Stockholm, Sweden
P15-1 Frequency-Domain Implementation of Time-Varying FIR Filters—Earl Vickers, STMicroelectronics, Inc. - Santa Clara, CA, USA; The Sound Guy, Inc. - San Jose, CA, USA
Finite impulse response filters can be implemented efficiently by means of fast convolution in the frequency domain. However, in applications such as speech enhancement or channel upmix, where the filter is a time-varying function of the input signal, standard approaches can suffer from artifacts and distortion due to circular convolution and the resulting time-domain aliasing. Existing solutions can be computationally prohibitive. This paper compares a number of previous algorithms and presents an alternative method based on the equivalence between frequency-domain convolution and time-domain windowing. Additional computational efficiency can be attained by careful choice of the analysis window.
Convention Paper 8784
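The circular-convolution artifact mentioned in the abstract above can be illustrated with a small sketch. It is not the paper's method: it only shows that multiplying spectra of too short a transform wraps the tail of the filtered signal back onto its start, while zero-padding to at least len(x) + len(h) - 1 points gives the linear result. The function names and the naive DFT (a stand-in for a real FFT) are assumptions made for illustration.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, used here in place of an FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

def freq_domain_convolve(x, h, size):
    """Multiply size-point spectra; the result is a circular convolution of length size."""
    xp = list(x) + [0.0] * (size - len(x))
    hp = list(h) + [0.0] * (size - len(h))
    spectrum = [a * b for a, b in zip(dft(xp), dft(hp))]
    return [v.real for v in dft(spectrum, inverse=True)]

def direct_convolve(x, h):
    """Reference linear convolution computed in the time domain."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, a in enumerate(x):
        for j, b in enumerate(h):
            y[i + j] += a * b
    return y

x = [1.0, 2.0, 3.0, 4.0]
h = [1.0, 1.0, 1.0]
linear = direct_convolve(x, h)            # 6 samples: [1, 3, 6, 9, 7, 4]
padded = freq_domain_convolve(x, h, 6)    # matches linear up to rounding
aliased = freq_domain_convolve(x, h, 4)   # last two samples wrap onto the first two
```

With a 4-point transform the samples 7 and 4 from the tail are added to the first two outputs, which is the time-domain aliasing the abstract refers to.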
P15-2 Estimating a Signal from a Magnitude Spectrogram via Convex Optimization—Dennis L. Sun, Stanford University - Stanford, CA, USA; Julius O. Smith, III, Stanford University - Stanford, CA, USA
The problem of recovering a signal from the magnitude of its short-time Fourier transform (STFT) is a longstanding one in audio signal processing. Existing approaches rely on heuristics that often perform poorly because of the nonconvexity of the problem. We introduce a formulation of the problem that lends itself to a tractable convex program. We observe that our method yields better reconstructions than the standard Griffin-Lim algorithm. We provide an algorithm and discuss practical implementation details, including how the method can be scaled up to larger examples.
Convention Paper 8785
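For orientation, the Griffin-Lim baseline named in the abstract above can be sketched as alternating projections between the given STFT magnitudes and the set of spectrograms that correspond to a real signal. This is only the baseline, not the paper's convex formulation. The frame size, hop, window, test signal, and all function names are assumptions chosen to keep the example small, and the naive DFT stands in for an FFT.

```python
import cmath
import math
import random

def dft(x, inverse=False):
    """Naive O(N^2) DFT, used here in place of an FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

def stft(x, win, hop):
    """Windowed DFT of each frame."""
    n = len(win)
    return [dft([x[s + i] * win[i] for i in range(n)])
            for s in range(0, len(x) - n + 1, hop)]

def istft(frames, win, hop, length):
    """Least-squares overlap-add: sum(w * frame) / sum(w^2) per sample."""
    n = len(win)
    num = [0.0] * length
    den = [0.0] * length
    for f, spec in enumerate(frames):
        seg = dft(spec, inverse=True)
        for i in range(n):
            num[f * hop + i] += win[i] * seg[i].real
            den[f * hop + i] += win[i] ** 2
    return [a / d if d > 1e-12 else 0.0 for a, d in zip(num, den)]

def griffin_lim(mag, win, hop, length, iters=30, seed=0):
    """Start from random phases, then alternate: resynthesize, re-analyze, restore magnitudes."""
    rng = random.Random(seed)
    spec = [[m * cmath.exp(2j * math.pi * rng.random()) for m in row] for row in mag]
    errors = []
    for _ in range(iters):
        x = istft(spec, win, hop, length)
        est = stft(x, win, hop)
        # Squared mismatch between the estimate's magnitudes and the target magnitudes.
        errors.append(sum((abs(e) - m) ** 2
                          for row_e, row_m in zip(est, mag)
                          for e, m in zip(row_e, row_m)))
        # Keep the estimated phase, impose the target magnitude.
        spec = [[m * e / abs(e) if abs(e) > 1e-12 else complex(m)
                 for e, m in zip(row_e, row_m)]
                for row_e, row_m in zip(est, mag)]
    return x, errors

n, hop, length = 8, 2, 32
win = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]   # periodic Hann
target = [math.sin(2 * math.pi * 3 * t / length) for t in range(length)]
mag = [[abs(v) for v in row] for row in stft(target, win, hop)]
estimate, errors = griffin_lim(mag, win, hop, length)
```

The mismatch recorded in errors does not increase from one iteration to the next, but the iteration can stall at a local minimum, which is the nonconvexity issue the abstract describes.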
P15-3 Distance-Based Automatic Gain Control with Continuous Proximity-Effect Compensation—Walter Etter, Bell Labs, Alcatel-Lucent - Murray Hill, NJ, USA
This paper presents a method of Automatic Gain Control (AGC) that derives the gain from the sound-source-to-microphone distance, using a distance sensor. The concept makes use of the fact that microphone output levels vary inversely with the distance to a spherical sound source. It is applicable to frequently arising situations in which a speaker does not maintain a constant microphone distance. In addition, we address undesired bass response variations caused by the proximity effect. Knowledge of the sound-source-to-microphone distance permits accurate compensation for both frequency response changes and distance-related signal level changes. In particular, a distance-based AGC can normalize these signal level changes without deteriorating signal quality, as opposed to conventional AGCs, which introduce distortion, pumping, and breathing. Given an accurate distance sensor, gain changes can take effect instantaneously and do not need to be gated by attack and release times. Likewise, frequency response changes due to undesired proximity-effect variations can be corrected adaptively using precise inverse filtering derived from continuous distance measurements, sound arrival angles, and microphone directivity, so that inadequate static proximity-effect compensation settings on the microphone are no longer required.
Convention Paper 8786
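The inverse-distance relation stated in the abstract above implies a very simple gain rule, sketched here under stated assumptions: if the output level falls as 1/distance, then normalizing to a chosen reference distance needs a linear gain of distance / reference. The function names and the 0.1 m reference are hypothetical, and the proximity-effect (frequency response) compensation described in the abstract is not covered.

```python
import math

def distance_gain(distance_m, reference_m=0.1):
    """Linear gain that restores the level heard at reference_m, assuming a 1/distance law."""
    if distance_m <= 0 or reference_m <= 0:
        raise ValueError("distances must be positive")
    return distance_m / reference_m

def distance_gain_db(distance_m, reference_m=0.1):
    """The same gain expressed in decibels."""
    return 20.0 * math.log10(distance_gain(distance_m, reference_m))

# A talker who moves from 0.1 m to 0.2 m needs about +6 dB to keep a constant level.
doubled_db = distance_gain_db(0.2)
```

Because the gain follows the measured distance directly, it needs no level detector and hence no attack or release time constants.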
P15-4 Subband Comfort Noise Insertion for an Acoustic Echo Suppressor—Guangji Shi, DTS, Inc. - Los Gatos, CA, USA; Changxue Ma, DTS, Inc. - Los Gatos, CA, USA
This paper presents an efficient approach for comfort noise insertion for an acoustic echo suppressor. Acoustic echo suppression causes frequent noise level changes in noisy environments. The proposed algorithm estimates the noise level for each frequency band using a minimum-variance-based noise estimator, and generates comfort noise based on the estimated noise level and a random phase generator. Tests show that the proposed comfort noise insertion algorithm is able to insert an appropriate level of comfort noise that matches the background noise characteristics in an efficient manner.
Convention Paper 8787
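The generation step described in the abstract above (estimated noise level plus random phase) can be sketched as follows. The noise estimator itself is not reproduced: the per-bin magnitudes are hypothetical values standing in for its output, one value per DFT bin rather than per subband, and the function names and the naive DFT are assumptions for illustration.

```python
import cmath
import math
import random

def dft(x, inverse=False):
    """Naive O(N^2) DFT, used here in place of an FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

def comfort_noise_frame(band_mag, rng):
    """Build one real noise frame whose spectrum has the given magnitudes for bins 0..N/2."""
    half = len(band_mag) - 1
    n = 2 * half
    spec = [0j] * n
    for k, mag in enumerate(band_mag):
        if k == 0 or k == half:
            # DC and Nyquist bins must be real for a real-valued frame.
            spec[k] = complex(mag * rng.choice((-1.0, 1.0)), 0.0)
        else:
            spec[k] = mag * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
            spec[n - k] = spec[k].conjugate()   # conjugate symmetry gives a real signal
    return [v.real for v in dft(spec, inverse=True)]

rng = random.Random(1)
noise_level = [0.0, 0.5, 0.4, 0.3, 0.1]   # hypothetical estimated magnitudes, bins 0..4
frame = comfort_noise_frame(noise_level, rng)
recovered = [abs(v) for v in dft(frame)[:len(noise_level)]]   # matches noise_level
```

Each call draws new phases, so successive frames share the estimated spectral shape without repeating the same waveform.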
P15-5 Potential of Non-uniformly Partitioned Convolution with Freely Adaptable FFT Sizes—Frank Wefers, RWTH Aachen University - Aachen, Germany; Michael Vorländer, RWTH Aachen University - Aachen, Germany
The standard class of algorithms used for FIR filtering with long impulse responses and short input-to-output latencies is non-uniformly partitioned fast convolution. Here a filter impulse response is split into several smaller sub-filters of different sizes. Small sub-filters are needed for low latency, whereas long filter parts allow for more computational efficiency. Finding an optimal filter partition that minimizes the computational cost is not trivial; however, optimization algorithms are known. The fast convolution of the sub-filters is mostly implemented with the Fast Fourier Transform (FFT). Usually the FFT sizes are chosen to be powers of two, which has a direct effect on the partitioning of filters. Recent studies reveal that the use of FFT sizes that are not powers of two has strong potential to lower the computational cost of the convolution even further. This paper presents a new real-time low-latency convolution algorithm, which performs non-uniformly partitioned convolution with freely adaptable FFT sizes. In addition, an optimization technique is presented that allows the FFT sizes to be adjusted in order to minimize the computational complexity for this new framework of non-uniform filter partitions. Finally, the performance of the algorithm is compared to conventional methods.
Convention Paper 8788
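The partitioning idea in the abstract above rests on linearity: convolving with each sub-filter separately, delaying each result by the sub-filter's offset in the impulse response, and summing gives the same output as convolving with the whole filter. The sketch below shows only that identity with a non-uniform partition. It uses direct time-domain convolution for each part, whereas a real-time implementation would use FFT-based block processing. The partition sizes and function names are assumptions, and the paper's FFT-size optimization is not shown.

```python
def convolve(x, h):
    """Direct linear convolution (stand-in for FFT-based fast convolution)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, a in enumerate(x):
        for j, b in enumerate(h):
            y[i + j] += a * b
    return y

def partitioned_convolve(x, h, sizes):
    """Convolve x with h split into consecutive sub-filters of the given sizes."""
    if sum(sizes) != len(h):
        raise ValueError("partition sizes must cover the impulse response exactly")
    y = [0.0] * (len(x) + len(h) - 1)
    offset = 0
    for size in sizes:
        sub_filter = h[offset:offset + size]
        part = convolve(x, sub_filter)
        # Each sub-filter's output is delayed by its position in the impulse response.
        for i, v in enumerate(part):
            y[offset + i] += v
        offset += size
    return y

x = [1.0, 0.0, 2.0, -1.0, 3.0]
h = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
full = convolve(x, h)
split = partitioned_convolve(x, h, [2, 2, 4])   # short parts first, longer part later
```

Placing the short sub-filters at the start of the impulse response is what keeps latency low, since only those parts must be computed immediately; the later, longer parts can be computed in larger and cheaper blocks.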
P15-6 Comparison of Filter Bank Design Algorithms for Use in Low Delay Audio Coding—Stephan Preihs, Leibniz Universität Hannover - Hannover, Germany; Thomas Krause, Leibniz Universität Hannover - Hannover, Germany; Jörn Ostermann, Leibniz Universität Hannover - Hannover, Germany
This paper is concerned with the comparison of filter bank design algorithms for use in audio coding applications with a very low coding delay of less than 1 ms. Different methods for numerical optimization of low delay filter banks are analyzed and compared. In addition, the use of the designed filter banks in combination with a delay-free ADPCM coding scheme is evaluated. Design properties and results of PEAQ (Perceptual Evaluation of Audio Quality) based objective audio quality evaluation as well as a listening test are given. The results show that in our coding scheme a significant improvement of audio quality, especially for critical signals, can be achieved by the use of filter banks designed with alternative filter bank design algorithms.
Convention Paper 8789
P15-7 Balanced Phase Equalization; IIR Filters with Independent Frequency Response and Identical Phase Response—Peter Eastty, Oxford Digital Limited - Oxford, UK
It has long been assumed that in order to provide sets of filters with arbitrary frequency responses but matching phase responses, symmetrical, finite impulse response filters must be used. A method is given for the construction of sets of infinite impulse response (recursive) filters that can achieve this aim with lower complexity, power, and delay. The zeros of each filter in a set are rearranged to provide linear phase while the phase shift due to the poles of each filter is counteracted by all-pass compensation filters added to other members of the set.
Convention Paper 8790