Affiliation: Philips Research, Eindhoven, The Netherlands
Algorithms that manipulate, remix, or transform stereo audio signals to suit the format of sound reproduction systems make assumptions about the way in which the stereo signal was originally mixed. However, these algorithms usually do not test whether those assumptions are valid. To determine the type of mixing used in a stereo mix, this research compares two blind classification algorithms that divide individual time–frequency regions into six classes of mixing types. The results show that mixing strategies vary with musical genre, and that classification can reduce instability and center disintegration when upmixing a stereo signal to a multichannel format.
Authors: Kolundzija, Mihailo; Faller, Christof; Vetterli, Martin
Affiliation: School of Computer and Communications Sciences, Ecole Polytechnique Fédérale Lausanne (EPFL), Lausanne, Switzerland
There are many ways to reproduce a sound field over a wide spatial area using an array of loudspeakers, for example wave field synthesis (WFS), Ambisonics, and spectral division methods. A new approach, called sound field reconstruction (SFR), optimally reproduces a desired sound field in a given listening area while keeping loudspeaker driver signals within physical constraints. The reproduction of a continuous sound field is represented as an inversion of the discrete acoustic channel from a loudspeaker array to a grid of control points. Extensive simulations comparing SFR with WFS, which is the state of the art, show that on average SFR provides better reproduction accuracy.
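The idea of inverting a discrete acoustic channel from loudspeakers to control points can be illustrated with a small sketch. It is not the authors' SFR algorithm: it assumes free-field monopole loudspeakers, a single frequency, and Tikhonov-regularized least squares as a stand-in for the paper's (unspecified here) way of keeping driver signals bounded. The geometry, the 500 Hz frequency, and all function names are hypothetical, and only the Python standard library is used.

```python
import cmath
import math

def green(src, pt, k):
    """Free-field monopole transfer function between two 2-D points (assumed model)."""
    r = math.dist(src, pt)
    return cmath.exp(-1j * k * r) / (4 * math.pi * r)

def solve(a, b):
    """Solve the square complex system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def channel_matrix(speakers, points, k):
    """Rows are control points, columns are loudspeakers."""
    return [[green(s, p, k) for s in speakers] for p in points]

def driver_signals(g, target, lam):
    """Minimize |G d - target|^2 + lam |d|^2 via (G^H G + lam I) d = G^H target."""
    rows, cols = len(g), len(g[0])
    a = [[sum(g[m][i].conjugate() * g[m][j] for m in range(rows))
          + (lam if i == j else 0.0)
          for j in range(cols)] for i in range(cols)]
    b = [sum(g[m][i].conjugate() * target[m] for m in range(rows))
         for i in range(cols)]
    return solve(a, b)

def relative_error(g, d, target):
    """Reproduction error at the control points, relative to the target field."""
    num = sum(abs(sum(gij * dj for gij, dj in zip(row, d)) - t) ** 2
              for row, t in zip(g, target))
    den = sum(abs(t) ** 2 for t in target)
    return math.sqrt(num / den)

def norm(d):
    """Euclidean norm of the driver-signal vector (a proxy for driver effort)."""
    return math.sqrt(sum(abs(x) ** 2 for x in d))

# Hypothetical setup: 8 loudspeakers on a line, a 5 x 4 grid of control points,
# and a virtual monopole source behind the array as the desired field.
K = 2 * math.pi * 500.0 / 343.0
SPEAKERS = [(-1.4 + 0.4 * i, 0.0) for i in range(8)]
POINTS = [(-0.8 + 0.4 * ix, 1.0 + 0.5 * iy) for ix in range(5) for iy in range(4)]
SOURCE = (0.3, -1.5)
G = channel_matrix(SPEAKERS, POINTS, K)
TARGET = [green(SOURCE, p, K) for p in POINTS]

for lam in (1e-6, 1e-2):
    d = driver_signals(G, TARGET, lam)
    print(lam, relative_error(G, d, TARGET), norm(d))
```

Raising `lam` trades reproduction accuracy at the control points for smaller driver signals, which is one simple way to express the tension between accuracy and physical constraints that the abstract describes.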
Authors: Samardzic, Nikolina; Novak, Colin
Affiliation: Department of Mechanical, Automotive and Materials Engineering, University of Windsor, Windsor, Ont., Canada
In the acoustically complex environment of an automobile, speech communication among the vehicle occupants varies from unintelligible to excellent depending on their location, vehicle speed, and vocal effort. Because of its strong correlation with subjective tests, the speech transmission index (STI) was used to explore the influence of sound source parameters on in-vehicle intelligibility. For example, STI scores were higher when there was a direct path from the talker to the listener; scores were much lower when the talker was in front of the listener; and scores were lower for the outboard ear because of road noise.
Author: Abuelma’atti, Muhammad Taher
Affiliation: King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
A simple mathematical model is proposed to represent the nonlinear characteristics of the combined power amplifier and loudspeaker in acoustic echo cancellers. Three adjustable parameters of the model relate to the distinct parts of the nonlinearity: the linear region for small amplitudes, the onset elbow where the nonlinearity begins, and the saturation region for large amplitudes. Unlike a piecewise linear model, this approach has continuous first derivatives. In combination with linear adaptive filters, this model allows for the efficient design of echo cancellers.
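The abstract does not give the model's formula, so the sketch below uses a Rapp-style soft limiter as a stand-in, chosen only because it also has three parameters that map onto the three regions described: a small-signal gain, an elbow sharpness, and a saturation level. The parameter names `gain`, `sat`, and `p` are assumptions for illustration. A hard clipper is included to show the slope discontinuity that a piecewise linear model has at the elbow.

```python
def rapp(x, gain=1.0, sat=1.0, p=2.0):
    """Rapp-style soft limiter (stand-in model, not the paper's formula).

    gain: slope of the linear region for small amplitudes
    sat:  output level approached in the saturation region
    p:    elbow sharpness; larger values give a more abrupt onset
    """
    u = gain * x
    return u / (1.0 + abs(u / sat) ** (2.0 * p)) ** (1.0 / (2.0 * p))

def hard_clip(x, gain=1.0, sat=1.0):
    """Piecewise linear reference: linear up to +/- sat, flat beyond."""
    return max(-sat, min(sat, gain * x))

def slope(f, x, h=1e-6):
    """Central-difference estimate of the first derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Compare the slopes just below and just above the elbow at x = 1.
for name, f in (("rapp", rapp), ("hard_clip", hard_clip)):
    print(name, slope(f, 0.999), slope(f, 1.001))
```

The soft limiter's slope changes gradually across the elbow, whereas the hard clipper's slope jumps from the linear gain to zero, which is the property the abstract contrasts with a piecewise linear model.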
The field of psychoacoustics covers a multitude of topics concerned with human perception of sound. There is a “hard” or classical branch that deals in threshold detection, masking, loudness, and other low-level phenomena; then there is a “softer” branch that deals with the higher-level cognitive aspects of sound perception, including the ways in which we describe and evaluate aspects of sound quality. The classical branch typically employs relatively simple, laboratory-controlled test stimuli such as tones, clicks, and noises in order to find out fundamental things about the hearing process. The sound quality branch tends to be more interested in ecological signals such as music, speech, and environmental sounds. These are harder to control and make consistent, but they offer the possibility of finding out about the way we behave toward sound in richer, “real world” situations.