Authors:He, Jianjun; Ranjan, Rishabh; Gan, Woon-Seng; Chaudhary, Nitesh Kumar; Hai, Nguyen Duy; Gupta, Rishabh
Affiliation:Maxim Integrated, San Jose, CA, USA; DSP Lab, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
The head-related transfer function (HRTF) is an essential component of a system for creating an immersive listening experience over headphones in multimedia and virtual and augmented reality applications. A critical requirement is the measurement of HRTFs for each individual to accommodate ear idiosyncrasies. Conventional static stop-and-go HRTF measurement methods are tedious and time-consuming. Recently proposed continuous HRTF acquisition methods can improve acquisition efficiency, but they still restrict the movements of the subjects and must be conducted in a controlled environment. In this paper the authors propose a fast and continuous HRTF measurement system with an embedded head tracker that can drastically reduce the measurement duration, obtaining HRTFs at high resolution while still allowing unconstrained head movements in both azimuth and elevation directions. To extract the HRTFs from such dynamic binaural measurements with random head movements, an improved adaptive filtering algorithm is proposed that integrates direction quantization, a variable step size, and optimal HRTF selection into the progressive-based normalized least-mean-squares algorithm. Objective evaluations and subjective listening tests were conducted using measurement data obtained from human subjects. The experimental results demonstrate that the proposed system and algorithm can yield HRTFs that are very close to HRTFs obtained with conventional static methods.
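As a rough illustration of the normalized least-mean-squares (NLMS) idea that the abstract builds on, the sketch below identifies a short FIR response from an input signal and its filtered output. It is a plain textbook NLMS, not the authors' improved algorithm: direction quantization, the variable step size, and optimal HRTF selection are all omitted, and the function name, the step size `mu`, and the synthetic test response are assumptions made for this example.

```python
import numpy as np

def nlms_identify(x, d, num_taps, mu=0.5, eps=1e-8):
    """Plain NLMS: estimate FIR taps w so that x filtered by w approximates d.

    Illustrative only; this is not the algorithm proposed in the paper.
    """
    w = np.zeros(num_taps)
    buf = np.zeros(num_taps)  # most recent input samples, newest first
    for n in range(len(x)):
        buf = np.roll(buf, 1)
        buf[0] = x[n]
        e = d[n] - w @ buf                     # a priori estimation error
        w += mu * e * buf / (buf @ buf + eps)  # normalized gradient step
    return w

# Synthetic check: recover a known (hypothetical) impulse response.
rng = np.random.default_rng(0)
true_h = np.array([0.0, 1.0, -0.5, 0.25, 0.1])
x = rng.standard_normal(5000)
d = np.convolve(x, true_h)[:len(x)]
h_est = nlms_identify(x, d, num_taps=len(true_h))
max_error = float(np.max(np.abs(h_est - true_h)))
```

In a continuous measurement the target response changes as the head moves, which is why the abstract's additions (per-direction estimates and an adaptive step size) matter; the fixed-response case above only shows the core update.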
Affiliation:Dept. of Measurement and Information Systems, Budapest University of Technology and Economics, Budapest, Hungary
Equalizing a multi-input multi-output (MIMO) transfer function, a process called multichannel inversion or deconvolution, is a common task in audio and acoustic signal processing. For MIMO equalization, it has been shown that by taking advantage of fixed poles the design of an IIR equalizer is as simple as that of the commonly used FIR equalizer. In many applications parallel filters can be used instead of FIR filters without major modifications to the methods and algorithms, resulting in higher flexibility in frequency resolution. The fixed-pole design of parallel filters is generalized to the multichannel case, and it is shown that the filter parameters can be estimated by least-squares equations similar to those of common FIR equalizers, with the added flexibility and modeling efficiency of IIR filters. The method is illustrated with a MIMO crosstalk canceller for a common-enclosure loudspeaker array and yields significantly better crosstalk cancellation than a least-squares FIR MIMO design of comparable computational complexity. Similar savings are foreseen in other acoustic MIMO applications that may benefit from the nonuniform frequency resolution achievable with IIR filters.
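For orientation, the least-squares FIR equalizer that the abstract uses as its baseline can be sketched in the single-channel case as follows. This is not the fixed-pole parallel filter design of the paper: it only shows the kind of linear least-squares problem involved, and the function name, the tap count, the delay, and the example response are assumptions chosen for illustration.

```python
import numpy as np

def ls_fir_equalizer(h, num_taps, delay):
    """Least-squares FIR inverse of h: minimize ||h * g - delayed impulse||.

    Single-channel baseline sketch; not the paper's parallel filter method.
    """
    n_out = len(h) + num_taps - 1
    # Convolution matrix: column k holds h shifted down by k samples.
    H = np.zeros((n_out, num_taps))
    for k in range(num_taps):
        H[k:k + len(h), k] = h
    target = np.zeros(n_out)
    target[delay] = 1.0  # desired overall response: a delayed unit impulse
    g, *_ = np.linalg.lstsq(H, target, rcond=None)
    return g

# Hypothetical minimum-phase response, so no modeling delay is needed.
h = np.array([1.0, 0.5])
g = ls_fir_equalizer(h, num_taps=32, delay=0)
equalized = np.convolve(h, g)  # should be close to a unit impulse
```

The multichannel case stacks one such convolution matrix per input-output path, and the parameters are still found by a linear least-squares solve, which is the property the abstract says carries over to the fixed-pole design.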
Authors:Stolfi, Ariane; Sokolovskis, Janis; Goródscy, Fábio; Iazzetta, Fernando; Barthet, Mathieu
Affiliation:University of São Paulo, Brazil; Centre for Digital Music, Queen Mary University of London, London, UK
Technology-mediated audience participation is an emergent topic in creative music technology with a blurred distinction between audience and performers. This paper analyzes communication patterns occurring in the online chat of the Open Band system for participatory live music performance. In addition to acting as a multi-user messaging tool, the chat system also serves as a control interface for the sonification of textual messages from the audience. Open Band performances have been presented at various festivals and conferences since 2016. Its web-based platform enables collective “sound dialogues” that are open to everyone regardless of musical skills. Drawing on interactive participatory art and networked music performance, the system aims to provide engaging social experiences in colocated music-making situations. The authors collected data from four public performances, including over 3,000 anonymous messages sent by audiences. After presenting the design of the system, the authors analyze the semantic content of messages using thematic and statistical methods. Findings show how different sonification mechanisms alter the nature of the communication between participants, who articulate both linguistic and musical self-expression. One of the design goals was to provide a platform for free audience expression as a web “agora.” The various themes that emerged from the analyses endorse this idea, as participants felt free to discuss subjects ranging from love to political opinions.
Authors:Mäkivirta, Aki; Liski, Juho; Välimäki, Vesa
Affiliation:Genelec Oy, Iisalmi, Finland; Aalto University, Acoustics Lab, Dept. of Signal Processing and Acoustics, Espoo, Finland
This paper focuses on the modeling of the linear properties of loudspeakers. The impulse response of a generalized multi-way loudspeaker is modeled and delay-equalized using digital filters. The dominant features of a loudspeaker are its low- and high-frequency roll-off characteristics and its behavior at the crossover points. The proposed loudspeaker model also characterizes the main effects of the mass-compliance resonant system. The impulse response, its logarithm and spectrogram, and the magnitude and group-delay responses are visualized and compared with those measured from a high-quality two-way loudspeaker. The model explains the typical local group-delay variations and magnitude-response deviations from a flat response in the passband. The group-delay equalization of a three-way loudspeaker is demonstrated with three different methods. Time-alignment of the tweeter and midrange elements using a bulk delay is shown to cause ripple in the magnitude response. The frequency-sampling method for the design of an FIR group-delay equalizer is detailed and is used to flatten the group delay of the speaker model in both the whole and limited audio range. The full-band equalization is shown to lead to preringing in the impulse response. In contrast, group-delay equalization at mid- and high-frequencies only reduces the length of the loudspeaker impulse response without introducing preringing.
Authors:Vairetti, Giacomo; Sena, Enzo De; Catrysse, Michael; Jensen, Søren Holdt; Moonen, Marc; Waterschoot, Toon van
Affiliation:KU Leuven, Dept. of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, Leuven, Belgium; KU Leuven, Dept. of Electrical Engineering (ESAT), ETC, e-Media Research Lab, Leuven, Belgium; Institute of Sound Recording, University of Surrey, Guildford, Surrey, UK; Televic N.V., Izegem, Belgium; Dept. of Electronic Systems, Aalborg University, Aalborg, Denmark
Parametric equalization of an acoustic system aims to compensate for the deviations of its response from a desired target response using parametric digital filters. An optimization procedure is presented for the automatic design of a low-order equalizer using parametric infinite impulse response (IIR) filters, specifically second-order peaking filters and first-order shelving filters. The proposed procedure minimizes the sum of squared errors between the system and target complex frequency responses instead of the commonly used difference in magnitudes, and exploits a previously unexplored orthogonality property of one particular type of parametric filter. This brings a series of advantages over state-of-the-art procedures, such as (1) improved mathematical tractability of the equalization problem, with the possibility of computing analytical expressions for the gradients, (2) improved initialization of the parameters, including the global gain of the equalizer, (3) the incorporation of shelving filters in the optimization procedure, and (4) a stronger focus on the equalization of the more perceptually relevant frequency peaks. Examples of loudspeaker and room equalization are provided, as well as a note about extending the procedure to multipoint equalization and transfer function modeling.
Authors:Moore, Jonathan B.; Hill, Adam J.
Affiliation:Department of Electronics, Computing and Mathematics, University of Derby, Derby, UK
In many sound reinforcement and reproduction scenarios, the desired audience sound coverage may only be achieved by using multiple electro-acoustic transducers emitting coherent signals at equal or nearly equal sound power levels. When transducers are not part of an acoustically coupled array, any difference in path length from a listening position to two or more loudspeakers will result in a relative phase difference between the received signals. The summation of such signals will result in a frequency response that depends on the path-length difference, with cancellation at frequencies where the phase difference equals 180 degrees. High interchannel correlation can also lead to a lack of apparent source width in multichannel reproduction and a lack of externalization in headphone reproduction. This work examines a time-variant, real-time decorrelation algorithm for the reduction of interchannel correlation that is also capable of reducing correlation between direct sound and early reflections. The focus is on minimizing wide-area low-frequency magnitude response variation in sound reinforcement scenarios, but the algorithm is applicable to a wide range of sound reproduction applications. Key variables that control the balance between decorrelation and processing artifacts, such as transient smearing, are described and evaluated using a MUSHRA test. Parameter values that render the processing transparent while still providing decorrelation are discussed. Additionally, the benefit of transient preservation is investigated and shown to increase transparency while not significantly degrading performance.
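The cancellation described above follows from simple arithmetic: two equal-level coherent signals with a path-length difference of d metres arrive with a relative delay of d / c, and they are 180 degrees out of phase at the odd multiples of c / (2d). The sketch below lists those frequencies. The function name, the default speed of sound of 343 m/s, and the example distances are assumptions for illustration and are not taken from the paper.

```python
def cancellation_frequencies(path_difference_m, max_freq_hz, speed_of_sound=343.0):
    """Frequencies (Hz) at which two equal, coherent signals cancel.

    Cancellation occurs where the relative delay equals an odd number of
    half periods: f_k = (2k + 1) / (2 * delay), k = 0, 1, 2, ...
    """
    delay = path_difference_m / speed_of_sound  # relative arrival delay in seconds
    freqs = []
    k = 0
    while True:
        f = (2 * k + 1) / (2.0 * delay)
        if f > max_freq_hz:
            break
        freqs.append(f)
        k += 1
    return freqs

# Example: a 3.43 m path difference corresponds to a 10 ms delay at 343 m/s.
nulls = cancellation_frequencies(3.43, max_freq_hz=300.0)
```

For this example the first notch falls near 50 Hz and later notches are spaced about 100 Hz apart, which shows why a path difference of a few metres produces strong position-dependent variation at low frequencies.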
Affiliation:Agnew Analog Reference Instruments, UK
Even though the manufacturing of disk mastering systems had ceased by the mid 1980s and the equipment used in this industry was largely manufactured between 1950 and 1985, such systems would still benefit from further research and development because of the recent resurgence of consumer interest in vinyl records. During the tests and experiments described in this article, it was observed that a power amplifier of high quality and performance may not be compatible with a motional feedback cutter head system. Cutting amplifiers need to be designed and optimized specifically for this application. The author explores the design problems of moving coil transducers using motional feedback as applied to professional disk recording and mastering systems. Significant improvement can be achieved. Wideband measurements from 10 Hz to 100 kHz should be considered to be the minimum standard and ideally the range should be DC to 1 MHz. Square waves are used as a test signal and development aid. Cutter heads have traditionally been the limiting factor in the performance of a disk recording system, as well as the most fragile and frequently damaged component. It is recommended that disk recording systems be designed to be transducer limited rather than electronics limited.
Although channel-based audio is not dead in the current push for easily managed immersive audio workflows, alternatives such as object-based and scene-based audio have a lot to recommend them. This is because the latter offer the flexibility to handle the user interaction, context dependency, and multiformat aspects of modern systems. Papers from the recent AES conference dealing with these topics are summarized.