AES Paris 2016
Paper Session P7
P7 - Audio Signal Processing—Part 2: Beamforming, Upmixing, HRTF
Sunday, June 5, 09:00 — 12:30 (Room 352B)
Jamie Angus, University of Salford - Salford, Greater Manchester, UK; JASA Consultancy - York, UK
P7-1 Dual-Channel Beamformer Based on Hybrid Coherence and Frequency Domain Filter for Noise Reduction in Reverberant Environments—Hong Liu, Peking University - Beijing, China; Miao Sun, Shenzhen Graduate School, Peking University - Guangdong, China
Although adaptive beamforming is an effective technique for suppressing coherent noise, its performance degrades strongly in reverberant rooms because of multipath room reflections of the received signals. In this paper a dual-channel beamformer based on noise coherence and a frequency-domain filter is proposed. First, a hybrid coherence based on the coherent-to-diffuse energy ratio (CDR) is introduced to approximate the coherence of the noise signals. Then the hybrid coherence is used to estimate the noise power spectral density (PSD), which is applied in a frequency-domain filter to reduce the noise and reverberation components in the microphone signals. Finally, the outputs of the filter are processed by a beamformer to suppress residual noise. Experiments demonstrate that the proposed system yields noticeable improvements in SNR and output quality in reverberant environments.
Convention Paper 9518 (Purchase now)
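The CDR-based noise PSD and filtering stage can be illustrated with a generic sketch; this is not the authors' estimator, but the common textbook construction it builds on: an ideal diffuse-field (sinc) coherence model for the noise, a CDR estimate from the measured inter-microphone coherence, and a Wiener-type frequency-domain gain. All function names are hypothetical.

```python
import numpy as np

def diffuse_coherence(f, d, c=343.0):
    # Theoretical coherence of an ideal diffuse (reverberant) field
    # between two omni microphones spaced d metres apart:
    # sin(2*pi*f*d/c) / (2*pi*f*d/c).  np.sinc supplies the pi factor.
    return np.sinc(2.0 * f * d / c)

def cdr_estimate(gamma_meas, gamma_diffuse):
    # One simple CDR estimator: assumes the target is fully coherent
    # (coherence 1) and the noise diffuse; clipped to be non-negative.
    num = np.real(gamma_diffuse - gamma_meas)
    den = np.minimum(np.real(gamma_meas) - 1.0, -1e-6)  # keep den < 0
    return np.maximum(num / den, 0.0)

def wiener_gain(cdr, floor=0.1):
    # Frequency-domain filter: the CDR acts as an SNR proxy, so the
    # Wiener-type gain CDR / (1 + CDR) attenuates diffuse-noise bins,
    # with a spectral floor to limit musical noise.
    return np.maximum(cdr / (1.0 + cdr), floor)
```

In a full system this per-bin gain would be applied to both microphone spectra before the beamforming stage suppresses the residual noise.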
Convention Paper 9519 (Purchase now)
P7-3 Estimation of Individualized HRTF in Unsupervised Conditions—Mounira Maazaoui, UMR STMS, IRCAM-CNRS-UPMC Sorbonne Universités - Paris, France; Olivier Warusfel, UMR STMS, IRCAM-CNRS-UPMC Sorbonne Universités - Paris, France
Head-Related Transfer Functions (HRTFs) are the key features of binaural sound spatialization. These filters are specific to each individual and are generally measured in an anechoic room using a complex process. Although the use of non-individual filters can cause perceptual artifacts, such measurements remain hardly accessible to the general public. Many authors have therefore proposed alternative individualization methods that avoid measuring HRTFs, based for example on numerical modeling, adaptation of non-individual HRTFs, or selection of non-individual HRTFs from a database. In this article we propose an individualization method in which the best-matching set of HRTFs is selected from a database on the basis of unsupervised binaural recordings of the listener in a real-life environment.
Convention Paper 9520 (Purchase now)
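The selection step can be pictured as a nearest-neighbour search over a feature space; a hypothetical sketch, with the feature extraction from the binaural recordings left abstract and the function name invented for illustration:

```python
import numpy as np

def select_hrtf_set(target_feat, database_feats):
    # Pick the database entry whose features (e.g. log-magnitude
    # spectral features per direction) are closest, in mean squared
    # error, to features derived from the listener's own recordings.
    d = [np.mean((target_feat - f) ** 2) for f in database_feats]
    return int(np.argmin(d))
```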
P7-4 Plane Wave Identification with Circular Arrays by Means of a Finite Rate of Innovation Approach—Falk-Martin Hoffmann, University of Southampton - Southampton, Hampshire, UK; Filippo Maria Fazi, University of Southampton - Southampton, Hampshire, UK; Philip Nelson, University of Southampton - Southampton, UK
Many problems in the field of acoustic measurements depend on the direction of incoming wave fronts with respect to a measurement device or aperture. This knowledge is useful for signal processing purposes such as noise reduction, source separation, de-aliasing, and super-resolution. This paper presents a signal processing technique for identifying the directions of travel of the principal plane wave components in a sound field measured with a circular microphone array. The technique is derived from a finite rate of innovation data model, and its performance is evaluated by means of a simulation study for different numbers of plane waves in the sound field. [Also a poster—see session P12-13]
Convention Paper 9521 (Purchase now)
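At the core of finite-rate-of-innovation methods is the annihilating filter: on a circular array, after equalizing the mode strengths, the angular Fourier coefficients of K plane waves reduce (under idealized conditions) to a sum of K complex exponentials whose frequencies are the arrival angles. A minimal sketch of that single step, not the paper's full method:

```python
import numpy as np

def doa_from_exponentials(s, K):
    # Annihilating-filter step: the sequence s[n] = sum_k a_k e^{j n t_k}
    # is annihilated by a length-(K+1) filter whose polynomial roots are
    # e^{j t_k}.  Build the Toeplitz system S h = 0 and take the null
    # space via SVD, then read the angles off the filter's roots.
    N = len(s)
    S = np.array([[s[i + K - j] for j in range(K + 1)]
                  for i in range(N - K)])
    _, _, Vh = np.linalg.svd(S)
    h = Vh[-1].conj()                      # annihilating filter coeffs
    roots = np.roots(h)
    return np.sort(np.mod(np.angle(roots), 2.0 * np.pi))
```

With noisy data the same system is solved in a least-squares / total-least-squares sense, which is where the FRI machinery of the paper comes in.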
P7-5 Mismatch between Interaural Level Differences Derived from Human Heads and Spherical Models—Ramona Bomhardt, RWTH Aachen University - Aachen, Germany; Janina Fels, RWTH Aachen University - Aachen, Germany
The individualization of head-related transfer functions (HRTFs) is important in binaural reproduction to reduce measurement effort and localization errors. One common assumption of individualization for frequencies below 6 kHz is that the sound pressure field around a sphere is similar to that around a human head. To investigate the accuracy of this approximation, this paper compares the frequency-dependent interaural level differences (ILDs) obtained from a spherical approximation, from boundary-element simulations based on magnetic resonance imaging, and from individually measured HRTFs of 23 adult heads. With this database it is possible to analyze the influence of the head shape and the pinna on the ILD using the boundary element method and the measured HRTFs. While the mismatch between the spherical and the human ILD in the horizontal plane is small below 1.5 kHz, the two differ above this frequency. Between 1.5 and 3.5 kHz the ILD on one side of the head is dominated by two maxima; the offset of the ear canal entrance towards the back of the head and the depth of the head are the two major influencing factors. In general, the maxima of the spherical ILD are much smaller and more widely spaced than those of the human ILD, and above 4 kHz the difference between the human and spherical ILDs becomes even stronger.
Convention Paper 9522 (Purchase now)
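The comparison itself rests on a simple quantity: the frequency-dependent ILD is the dB magnitude ratio of a left/right HRTF pair, and the model/human mismatch can be summarized per frequency band. A minimal sketch with hypothetical function names:

```python
import numpy as np

def ild_db(h_left, h_right, eps=1e-12):
    # Frequency-dependent interaural level difference in dB from the
    # magnitudes of a left/right HRTF pair on a common frequency grid.
    return 20.0 * np.log10((np.abs(h_left) + eps) / (np.abs(h_right) + eps))

def ild_mismatch(ild_model, ild_human, f, band):
    # Mean absolute ILD deviation between a model (e.g. rigid sphere)
    # and a measured human ILD, restricted to a frequency band in Hz.
    lo, hi = band
    m = (f >= lo) & (f <= hi)
    return float(np.mean(np.abs(ild_model[m] - ild_human[m])))
```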
P7-6 Stereo Panning Law Remastering Algorithm Based on Spatial Analysis—François Becker, Paris, France; Benjamin Bernard, Medialab Consulting SNP - Monaco, Monaco; Longcat Audio Technologies - Chalon-sur-Saone, France
Changing the panning law of a stereo mixture is often impossible when the original multitrack session cannot be retrieved or used, or when the mixing desk uses a fixed panning law. Yet such a modification would be of interest during tape mastering sessions, among other applications. We present a frequency-based algorithm that computes the panorama power ratio from the stereo signals and changes the panning law without altering the original panorama. [Also a poster—see session P19-12]
Convention Paper 9523 (Purchase now)
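As an illustration of the idea (not the authors' algorithm), assume each frequency bin is dominated by a single source panned with a linear (-6 dB centre) law; the pan position can then be recovered from the per-bin magnitude ratio and the bin re-rendered under a constant-power (-3 dB centre) law without moving the source in the panorama:

```python
import numpy as np

def linear_to_constant_power(L, R, eps=1e-12):
    # Per-bin sketch.  Linear law: L = (1 - p) s, R = p s, so the pan
    # position p follows from the magnitude ratio and the source
    # spectrum is recovered as s = L + R.  The bin is then re-rendered
    # with constant-power gains (cos, sin) at the same pan position.
    mL, mR = np.abs(L), np.abs(R)
    p = mR / (mL + mR + eps)          # per-bin pan position in [0, 1]
    s = L + R                          # source estimate under linear law
    t = p * np.pi / 2.0
    return np.cos(t) * s, np.sin(t) * s
```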
P7-7 Non-Linear Extraction of a Common Signal for Upmixing Stereo Sources—François Becker, Paris, France; Benjamin Bernard, Medialab Consulting SNP - Monaco, Monaco; Longcat Audio Technologies - Chalon-sur-Saone, France
In the context of a two- to three-channel upmix, center-channel derivation falls within the field of common-signal extraction methods. In this paper we examine the pertinence of performance criteria that can be obtained from a probabilistic approach to source extraction; we propose a new, non-linear method for extracting a common signal from two sources that chooses deeper extraction under a criterion of information preservation; and we report the results of preliminary listening tests made with real-world audio material. [Also a poster—see session P19-13]
Convention Paper 9524 (Purchase now)
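For context, a classical non-linear centre-extraction heuristic (explicitly not the authors' method) bounds the common component per bin by the smaller of the two channel magnitudes:

```python
import numpy as np

def common_signal(L, R):
    # Per-bin centre estimate: the magnitude of a component common to
    # both channels cannot exceed the smaller channel magnitude; take
    # that bound with the phase of the mid signal L + R.
    m = np.minimum(np.abs(L), np.abs(R))
    C = m * np.exp(1j * np.angle(L + R))
    return C, L - C, R - C   # centre, residual left, residual right
```

The paper's method replaces this ad hoc bound with an extraction depth justified by a probabilistic, information-preservation criterion.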