AES Rome 2013
Paper Session P14
P14 - Applications in Audio
Monday, May 6, 14:30 — 17:30 (Sala Carducci)
Chair: Juha Backman, Nokia Corporation - Espoo, Finland
P14-1 Implementation of an Intelligent Equalization Tool Using Yule-Walker for Music Mixing and Mastering—Zheng Ma, Queen Mary University of London - London, UK; Joshua D. Reiss, Queen Mary University of London - London, UK; Dawn A. A. Black, Queen Mary University of London - London, UK
A new approach for automatically equalizing an audio signal toward a target frequency spectrum is presented. The algorithm is based on the Yule-Walker method and designs recursive (IIR) digital filters by least-squares fitting to any desired frequency response. The target equalization curve is obtained from a spectral distribution analysis of a large dataset of popular commercial recordings. A real-time C++ VST plug-in and an off-line MATLAB implementation have been created. A straightforward objective evaluation is also provided, in which the output frequency spectra are compared against the target equalization curve and against those produced by an alternative equalization method.
Convention Paper 8892 (Purchase now)
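The Yule-Walker design step at the core of this approach can be illustrated in a few lines. The sketch below (Python with NumPy/SciPy, not the authors' C++/MATLAB implementation; function and parameter names are illustrative) fits an all-pole filter to a desired magnitude response by converting the target power spectrum into an autocorrelation sequence and solving the Yule-Walker normal equations:

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def yule_walker_allpole(target_mag, order):
    """Fit an all-pole (AR) filter to a desired magnitude response.

    target_mag: desired magnitude response sampled on [0, pi] (inclusive).
    Returns denominator coefficients a (with a[0] == 1) and a gain g.
    """
    # Mirror the half-spectrum to cover the full unit circle, then take
    # the inverse FFT of the power spectrum to get the autocorrelation.
    power = np.concatenate([target_mag, target_mag[-2:0:-1]]) ** 2
    r = np.fft.ifft(power).real

    # Yule-Walker normal equations: Toeplitz(r[0:p]) @ a[1:] = -r[1:p+1]
    a_tail = solve_toeplitz(r[:order], -r[1:order + 1])
    a = np.concatenate([[1.0], a_tail])

    # Matched gain from the prediction-error variance.
    g = np.sqrt(np.dot(a, r[:order + 1]))
    return a, g
```

In MATLAB's `yulewalk` a numerator is additionally fit in a least-squares sense; the all-pole solve above corresponds to the recursive (denominator) part of such a design.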
P14-2 On the Informed Source Separation Approach for Interactive Remixing in Stereo—Stanislaw Gorlow, University of Bordeaux - Talence, France; Sylvain Marchand, Université de Brest - Brest, France
Informed source separation (ISS) has become a popular trend in the audio signal processing community over the past few years. Its purpose is to decompose a mixture signal into its constituent parts at the desired, or the best possible, quality level given some metadata. In this paper we present a comparison between two ISS systems and relate the ISS approach in various configurations to conventional coding of separate tracks for interactive remixing in stereo. The compared systems are Underdetermined Source Signal Recovery (USSR) and Enhanced Audio Object Separation (EAOS). The latter forms a part of MPEG’s Spatial Audio Object Coding technology. The performance is evaluated using objective difference grades computed with PEMO-Q. The results suggest that USSR performs perceptually better than EAOS and has a lower computational complexity.
Convention Paper 8893 (Purchase now)
P14-3 Scene Inference from Audio—Daniel Arteaga, Fundacio Barcelona Media - Barcelona, Spain; Universitat Pompeu Fabra - Barcelona, Spain; David García-Garzón, Universitat Pompeu Fabra - Barcelona, Spain; Toni Mateos, imm sound - Barcelona, Spain; John Usher, Hearium Labs - San Francisco, CA, USA
We report on the development of a system to characterize the geometric and acoustic properties of a space from an acoustic impulse response measured within it. This can be thought of as the inverse problem to the common practice of obtaining impulse responses from either real-world or virtual spaces. Starting from an impulse response recorded in an original scene, the method described here uses a non-linear search strategy to select a scene that is perceptually as close as possible to the original one. Potential applications of this method include audio production, non-intrusive acquisition of room geometry, and audio forensics.
Convention Paper 8894 (Purchase now)
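A toy version of the inference loop described above, with the scene reduced to a single parameter (RT60) and a trivial impulse-response model, might look as follows (Python sketch; the model, distance measure, and names are illustrative assumptions, not the paper's method):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def toy_ir(rt60, sr=8000, length=4000, seed=0):
    # Toy "room": exponentially decaying noise, decay rate set by RT60.
    rng = np.random.default_rng(seed)
    t = np.arange(length) / sr
    return rng.standard_normal(length) * 10 ** (-3 * t / rt60)

def infer_rt60(measured_ir, sr=8000):
    # Non-linear search for the scene parameter whose simulated IR best
    # matches the measured one, here via a log-envelope distance.
    # (Sharing the noise seed between candidate and measurement is a
    # contrivance of this toy example that keeps the comparison clean.)
    env = lambda h: np.log(np.abs(h) + 1e-9)
    def cost(rt60):
        return np.mean(
            (env(toy_ir(rt60, sr, len(measured_ir))) - env(measured_ir)) ** 2
        )
    res = minimize_scalar(cost, bounds=(0.05, 2.0), method="bounded")
    return res.x
```

The paper's search runs over a full geometric/acoustic scene description and a perceptual distance; the sketch only shows the "simulate, compare, iterate" shape of the inverse problem.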
P14-4 Continuous Mobile Communication with Acoustic Co-Location Detection—Robert Albrecht, Aalto University - Espoo, Finland; Sampo Vesa, Nokia Research Center - Nokia Group, Finland; Jussi Virolainen, Nokia Lumia Engineering - Nokia Group, Finland; Jussi Mutanen, JMutanen Software - Jyväskylä, Finland; Tapio Lokki, Aalto University - Aalto, Finland
In a continuous mobile communication scenario, e.g., between co-workers, participants may occasionally be located in the same space and thus hear each other naturally. To avoid audible echoes, the audio transmission between such participants should be cut off. In this paper an acoustic co-location detection algorithm is proposed for this task, grouping participants together based solely on their microphone signals and the mel-frequency cepstral coefficients derived from them. The algorithm is tested both on recordings of different communication situations and in real time, integrated into a voice-over-IP communication system. Tests on the recordings show that the algorithm works as intended, and the evaluation with the voice-over-IP conferencing system indicates that it improves the overall clarity of communication compared with not using it. The acoustic co-location detection algorithm thus proves to be a useful aid in continuous mobile communication systems.
Convention Paper 8895 (Purchase now)
P14-5 Advancements and Performance Analysis on the Wireless Music Studio (WeMUST) Framework—Leonardo Gabrielli, Università Politecnica delle Marche - Ancona, Italy; Stefano Squartini, Università Politecnica delle Marche - Ancona, Italy; Francesco Piazza, Università Politecnica delle Marche - Ancona, Italy
Music production devices and musical instruments can take advantage of IEEE 802.11 wireless networks for interconnection and audio data sharing. Previous work has shown that such networks can support high-quality audio streaming between devices at acceptable latencies in several application scenarios. In this work a prototype device discovery mechanism is described that improves ease of use and flexibility. A diagnostic tool that characterizes average network latency and packet loss is also described and made available to the community. Lower latencies are reported after software optimization, and support for multiple simultaneous audio channels is demonstrated by experimental tests.
Convention Paper 8896 (Purchase now)
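A minimal latency and packet-loss probe in the spirit of such a diagnostic tool could time UDP round trips with sequence-numbered datagrams. The sketch below (Python over the loopback interface; names and packet format are illustrative, not the WeMUST tool) measures mean round-trip time and loss rate against a simple echo peer:

```python
import socket
import struct
import threading
import time

def echo_server(sock, n_packets):
    # Echo each received datagram straight back to its sender.
    for _ in range(n_packets):
        data, addr = sock.recvfrom(64)
        sock.sendto(data, addr)

def measure(n_packets=100, timeout=0.2):
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # ephemeral port on loopback
    threading.Thread(target=echo_server,
                     args=(server, n_packets), daemon=True).start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(timeout)
    rtts, lost = [], 0
    for seq in range(n_packets):
        t0 = time.perf_counter()
        client.sendto(struct.pack("!I", seq), server.getsockname())
        try:
            data, _ = client.recvfrom(64)
            if struct.unpack("!I", data)[0] == seq:
                rtts.append(time.perf_counter() - t0)
        except socket.timeout:
            lost += 1  # no echo within the timeout: count as lost
    mean_rtt = sum(rtts) / len(rtts) if rtts else float("inf")
    return mean_rtt, lost / n_packets
```

Over a real 802.11 link the echo peer would run on the remote device, and the loss/RTT statistics would reflect the wireless hop rather than the loopback path.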
P14-6 Acoustical Characteristics of Vocal Modes in Singing—Eddy B. Brixen, EBB-consult - Smorum, Denmark; Cathrine Sadolin, Complete Vocal Institute - Copenhagen, Denmark; Henrik Kjelin, Complete Vocal Institute - Copenhagen, Denmark
The Complete Vocal Technique defines four vocal modes: Neutral, Curbing, Overdrive, and Edge. These modes apply to both the singing voice and the speaking voice. They are clearly identified both by listening and by visual laryngograph inspection of the vocal cords and the surrounding area of the vocal tract. In recent work a model has been described to distinguish between the modes based on acoustical analysis. This paper looks further into the characteristics of the vocal modes in singing in order to test the model already provided. The conclusion is that the model is too simple to cover the full range. The work has also provided information on singers’ SPL and on formant repositioning as a function of pitch. Further work is recommended.
Convention Paper 8897 (Purchase now)