Journal of the AES
2011 October - Volume 59 Number 10
Algorithms that manipulate, remix, or transform stereo audio signals to suit the format of a sound reproduction system make assumptions about how the stereo signal was originally mixed. However, these algorithms usually do not test whether those assumptions are valid. To determine the type of mixing used in a stereo mix, this research compares two blind classification algorithms that divide individual time–frequency regions into six classes of mixing types. The results show that mixing strategies vary with musical genre, and that the classification algorithm can reduce instability and center disintegration when upmixing a stereo signal to a multichannel format.
There are many ways to reproduce a sound field over a wide spatial area using an array of loudspeakers, for example wavefield synthesis (WFS), ambisonics, and spectral division methods. A new approach, called sound field reconstruction (SFR), optimally reproduces a desired sound field in a given listening area while keeping loudspeaker driver signals within physical constraints. The reproduction of a continuous sound field is represented as an inversion of the discrete acoustic channel from a loudspeaker array to a grid of control points. Extensive simulations comparing WFS, which is the state of the art, with SFR show that on average SFR provides better reproduction accuracy.
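The idea of inverting a discrete acoustic channel from loudspeakers to control points can be sketched as a regularized least-squares problem. The example below is only an illustration under stated assumptions, not the SFR algorithm from the paper: it assumes free-field monopole loudspeakers, a 2-D geometry, a single frequency, and uses Tikhonov regularization as a simple stand-in for the paper's physical constraints on the driver signals. The names `green_matrix` and `solve_drivers`, the array layout, and all numeric values are hypothetical.

```python
import numpy as np

def green_matrix(sources, points, k):
    """Free-field monopole transfer matrix (assumed model): rows are control
    points, columns are sources."""
    r = np.linalg.norm(points[:, None, :] - sources[None, :, :], axis=2)
    return np.exp(-1j * k * r) / (4.0 * np.pi * r)

def solve_drivers(G, p_desired, reg):
    """Tikhonov-regularized least squares:
    minimize ||G d - p_desired||^2 + reg * ||d||^2 over driver signals d."""
    n = G.shape[1]
    A = G.conj().T @ G + reg * np.eye(n)
    return np.linalg.solve(A, G.conj().T @ p_desired)

# Illustrative values only (not taken from the paper).
c = 343.0                      # speed of sound in m/s
f = 500.0                      # frequency in Hz
k = 2.0 * np.pi * f / c        # wavenumber

# 16 loudspeakers on a line at y = 0.
speakers = np.stack([np.linspace(-1.5, 1.5, 16), np.zeros(16)], axis=1)

# 8 x 8 grid of control points in front of the array.
xs, ys = np.meshgrid(np.linspace(-0.5, 0.5, 8), np.linspace(1.0, 2.0, 8))
points = np.stack([xs.ravel(), ys.ravel()], axis=1)

# Desired field: a virtual point source behind the array.
virtual_source = np.array([[0.0, -1.0]])
p_desired = green_matrix(virtual_source, points, k)[:, 0]

G = green_matrix(speakers, points, k)
d = solve_drivers(G, p_desired, reg=1e-6)

# Relative reproduction error at the control points.
rel_error = np.linalg.norm(G @ d - p_desired) / np.linalg.norm(p_desired)
```

Raising `reg` trades reproduction accuracy for smaller driver signals, which is one simple way to keep the loudspeaker drive levels bounded in this kind of formulation.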
In the acoustically complex environment of an automobile, speech communication among the vehicle occupants varies from unintelligible to excellent depending on their location, vehicle speed, and vocal effort. Because of its strong correlation with subjective tests, the speech transmission index (STI) was used to explore the influence of sound source parameters on in-vehicle intelligibility. For example, STI scores were higher when there was a direct path from the talker to the listener; scores were much lower when the talker was in front of the listener; and scores were lower for the outboard ear because of road noise.
A Note on Mathematical Modeling of Power Amplifier/Loudspeaker Nonlinearity in Acoustic Echo Cancelers
A simple mathematical model is proposed to represent the nonlinear characteristics of the combined power amplifier and loudspeaker in acoustic echo cancelers. Three adjustable parameters of the model relate to the distinct parts of the nonlinearity: the linear region for small amplitudes, the onset elbow where the nonlinearity begins, and the saturation region for large amplitudes. Compared to a piecewise linear model, this approach has continuous first derivatives. In combination with linear adaptive filters, this model allows for the efficient design of echo cancelers.
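The abstract does not give the model's formula, so the following is only a sketch of what a three-parameter smooth nonlinearity of this kind could look like. It uses a generic soft-limiter curve, which is an assumption and not the authors' model. The function name `soft_limiter` and the parameters `gain` (slope of the linear region), `sat` (saturation level), and `knee` (sharpness of the onset elbow) are hypothetical.

```python
def soft_limiter(x, gain=1.0, sat=1.0, knee=2.0):
    """Generic smooth saturating curve (assumed form, not the paper's model).

    gain : slope of the linear region for small amplitudes
    sat  : output level approached for large amplitudes
    knee : larger values give a sharper elbow between the two regions
    """
    u = gain * x
    return u / (1.0 + abs(u / sat) ** knee) ** (1.0 / knee)

# Small inputs pass almost linearly; large inputs approach +/- sat.
small = soft_limiter(1e-3, gain=2.0)
large = soft_limiter(1e6)
negative = soft_limiter(-0.5)
```

Unlike a hard clipper or a piecewise linear curve, this function has no corner at the elbow, which is the property the abstract highlights for use alongside gradient-based adaptive filters.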
Standards and Information Documents
AES Standards Committee News
Core metadata; audio object metadata; high-resolution multichannel audio interconnection (HRMAI)
42nd Conference Report, Ilmenau
The field of psychoacoustics covers a multitude of topics concerned with human perception of sound. There is a “hard” or classical branch that deals in threshold detection, masking, loudness, and other low-level phenomena; then there is a “softer” branch that deals with the higher-level cognitive aspects of sound perception, including the ways in which we describe and evaluate aspects of sound quality. The classical branch typically employs relatively simple, laboratory-controlled test stimuli such as tones, clicks, and noises in order to find out fundamental things about the hearing process. The sound quality branch tends to be more interested in ecological signals such as music, speech, and environmental sounds. These are harder to control and make consistent, but they offer the possibility of finding out about the way we behave toward sound in richer, “real world” situations.
New Officers 2011/2012
Review of Society’s Sustaining Members
132nd Call for Papers, Budapest
46th Call for Papers, Denver
47th Call for Papers, Chicago