Journal of the AES
2013 January/February - Volume 61 Number 1/2
Editor’s Note and List of Associate Technical Editors
Wave Field Synthesis (WFS) can synthesize virtual sound sources that are perceived to be at locations between loudspeakers and the listener, called focused sources. Because the density of loudspeakers is limited in practice, artifacts arise. This research explores the extent of perceptual artifacts and the localization of the focused sources. The results from a variety of listening configurations illustrate the trade-offs. The truncation of loudspeaker arrays creates two opposite effects: (a) fewer additional wave fronts reduce the perception of artifacts, (b) stronger diffraction reduces the size of the listening area with adequate binaural cues.
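The idea of a focused source can be illustrated with a small timing calculation. The sketch below is not the driving function used in the paper; it only computes per-loudspeaker delays so that all wave fronts reach the focus point at the same instant, and it ignores amplitude weighting and filtering. The array geometry, the focus position, and the function name `focusing_delays` are assumptions chosen for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def focusing_delays(speaker_positions, focus_point, c=SPEED_OF_SOUND):
    """Per-loudspeaker delays (s) so that all wave fronts meet at focus_point."""
    distances = [math.dist(p, focus_point) for p in speaker_positions]
    farthest = max(distances)
    # The farthest loudspeaker fires first (zero delay); nearer ones wait.
    return [(farthest - d) / c for d in distances]


# Hypothetical linear array: 9 loudspeakers spaced 0.15 m apart on the x axis,
# with the focus point 1 m in front of the array centre.
array = [(-0.6 + 0.15 * i, 0.0) for i in range(9)]
focus = (0.0, 1.0)
delays = focusing_delays(array, focus)

# Emission delay plus travel time should be the same for every loudspeaker.
arrivals = [t + math.dist(p, focus) / SPEED_OF_SOUND
            for t, p in zip(delays, array)]
```

Shortening the array in this sketch removes the outermost entries of `array`, which is the truncation whose perceptual trade-offs the abstract describes.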
For perceptual evaluations of room acoustics, a spatial room impulse response is measured in a real space, encoded for a multichannel loudspeaker reproduction system, and then convolved with anechoic music. This paper describes the Spatial Decomposition Method (SDM). The encoding decomposes the spatial impulse response into a set of image-sources. In contrast to previous methods, SDM can be applied to an arbitrary compact microphone array having a small number of microphones without regard to the spatial sound reproduction technique. SDM relies upon the assumption that the sound propagation direction is the average of all the waves arriving at the microphone array, and that the sound pressure wave at the geometric center of the array represents its impulse response. Listening tests with simulated impulse responses show that the proposed method produces an auralization indistinguishable from the reference in the best case.
In many audio processing applications, signals are represented by linear combinations of basis functions (such as with windowed Fourier transforms) that are collected in so-called dictionaries. These are considered well adapted to a particular class of signals if they lead to sparse representations, meaning only a small number of basis functions are required for a good approximation of the signals. Most natural signals have strong inherent structures, such as harmonics and transients, a fact that can be used for adapting audio processing algorithms. This paper considers the audio-denoising problem from the perspective of structured sparse representation. A generalized thresholding scheme is presented from which simple audio-denoising operators are derived. They perform as well as state-of-the-art methods while featuring significantly lower computational cost.
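Thresholding of transform coefficients can be sketched in a few lines. The example below is not the operator defined in the paper: it shows plain soft thresholding, plus one assumed structured variant in which each coefficient is shrunk according to the energy of its neighbourhood, so that weak coefficients next to strong ones survive. The function names, the neighbourhood shape, and the sample values are hypothetical.

```python
import math


def soft_threshold(coeffs, lam):
    """Shrink each coefficient toward zero by lam; small ones become zero."""
    return [c * max(0.0, 1.0 - lam / abs(c)) if c != 0 else 0.0
            for c in coeffs]


def neighborhood_threshold(coeffs, lam, radius=1):
    """Shrink each coefficient by a gain derived from its neighbourhood energy."""
    out = []
    n = len(coeffs)
    for i, c in enumerate(coeffs):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        energy = math.sqrt(sum(abs(x) ** 2 for x in coeffs[lo:hi]))
        gain = max(0.0, 1.0 - lam / energy) if energy > 0 else 0.0
        out.append(c * gain)
    return out


# Hypothetical coefficients: one strong component surrounded by weak ones.
noisy = [0.1, 0.1, 4.0, 0.1, 0.1]
plain = soft_threshold(noisy, 1.0)
structured = neighborhood_threshold(noisy, 1.0)
```

With these values, `plain` zeroes every weak coefficient, while `structured` keeps the ones adjacent to the strong component. Both functions use `abs`, so they also accept complex coefficients such as those of a windowed Fourier transform.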
A common method for displaying, modeling, and equalizing the frequency response of audio systems is to use smoothing to eliminate the raggedness of the response. Fixed-pole parallel filters, which produce modest computational loading for both signal filtering and parameter estimation, possess the beneficial properties of smoothing. This makes them an efficient method for modeling or equalizing audio systems. The resolution of smoothing is controlled by the choice of pole frequencies: for obtaining a smoothing with 1/β-octave resolution, β/2 pole pairs are placed in each octave (e.g., sixth-octave resolution is achieved by having three pole pairs per octave). In addition, an analysis shows the theoretical equivalence of parallel filters and Kautz filters; the formulas for converting the parameters of the two types of filter are given.
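The pole-placement rule can be made concrete with a short calculation. The sketch below only generates logarithmically spaced pole frequencies with β/2 pole pairs per octave; it does not design the filter itself. The function name `pole_frequencies` and the 20 Hz to 20.48 kHz band (exactly ten octaves) are assumptions for illustration.

```python
import math


def pole_frequencies(f_low, f_high, beta):
    """Log-spaced pole frequencies (Hz) with beta/2 pole pairs per octave."""
    pairs_per_octave = beta / 2
    octaves = math.log2(f_high / f_low)
    # Small epsilon guards against rounding just below an integer.
    count = int(math.floor(octaves * pairs_per_octave + 1e-9)) + 1
    return [f_low * 2 ** (k / pairs_per_octave) for k in range(count)]


# Sixth-octave resolution (beta = 6) gives three pole pairs per octave.
freqs = pole_frequencies(20.0, 20480.0, 6)
```

Every third entry of `freqs` is one octave above the entry three places earlier, which matches the sixth-octave example in the abstract.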
Physical models of musical instruments are often the basis for computer synthesis of the sound when played. By manipulating control parameters of the model to mimic the performer’s actions, a good imitation of the music can be achieved. A digital waveguide synthesis of the geomungo, a Korean traditional plucked string instrument, must take into account the fact that vigorous playing techniques produce extreme vibrato with a noticeable fluctuation in the decay of the harmonics. To model pitch fluctuation and the decay characteristics of its harmonic partials, a time-varying loss filter with a sinusoidal loop gain is used. The model uses a generalized form of the Karplus-Strong algorithm with a one-pole filter to model loss, and a Lagrange interpolation filter to implement the fundamental frequency. A real-time system shows the potential for creating a virtual geomungo.
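The building blocks named above can be combined in a generic plucked-string loop. This is a minimal sketch of a Karplus-Strong style loop with a one-pole loss filter and a Lagrange fractional-delay filter, not the geomungo model from the paper. All parameter values, the shape of the sinusoidal gain modulation, and the names `lagrange_coeffs` and `pluck` are assumptions.

```python
import math
import random


def lagrange_coeffs(delay, order=3):
    """FIR coefficients of a Lagrange fractional-delay filter."""
    coeffs = []
    for k in range(order + 1):
        h = 1.0
        for m in range(order + 1):
            if m != k:
                h *= (delay - m) / (k - m)
        coeffs.append(h)
    return coeffs


def pluck(f0, fs=44100, duration=0.5, gain=0.995, pole=0.3,
          mod_depth=0.0, mod_rate=5.0, order=3, seed=0):
    """Synthesize a plucked-string tone; returns a list of samples."""
    # Low-frequency delay of the one-pole lowpass, in samples.
    filter_delay = pole / (1.0 - pole)
    target = fs / f0 - filter_delay  # delay left for the delay line
    int_delay = math.floor(target - (order - 1) / 2)
    h = lagrange_coeffs(target - int_delay, order)

    rng = random.Random(seed)
    history = int_delay + order
    # A burst of noise in the delay line stands in for the pluck.
    y = [rng.uniform(-1.0, 1.0) for _ in range(history)]
    lp = 0.0
    for n in range(int(fs * duration)):
        i = history + n
        # Fractionally delayed sample read from the delay line.
        x = sum(h[k] * y[i - int_delay - k] for k in range(order + 1))
        # Loop gain with optional sinusoidal modulation, never above `gain`.
        phase = 2.0 * math.pi * mod_rate * n / fs
        g = gain * (1.0 - mod_depth * 0.5 * (1.0 + math.sin(phase)))
        # One-pole lowpass models frequency-dependent string loss.
        lp = (1.0 - pole) * x + pole * lp
        y.append(g * lp)
    return y[history:]


tone = pluck(220.0)
```

Setting `mod_depth` above zero makes the decay rate fluctuate periodically, a rough stand-in for the time-varying loss the abstract mentions. The fractional delay is kept in the central interval of the interpolator, where Lagrange filters are most accurate.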
While there has been great progress in inventing complex algorithms to evaluate room acoustics of large spaces, simple statistical models remain attractive because they can describe the space with a few parameters. This paper explores how Barron’s model can be used to investigate the early and late energy parameters in small shoebox-shaped rooms, which place additional burdens on the model. Measurements and simulations for three small rooms without furnishings were used to evaluate this simple model. Errors were only a few dB for both reflected energy and early reflected sound when averaged across a room in the 250 Hz to 2 kHz frequency range. The model may be useful for studying the influence of parameters such as room volume, source-receiver distance, and microphone and loudspeaker directivity.
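A statistical model of this kind needs only reverberation time, room volume, and source-receiver distance. The sketch below uses the commonly cited textbook form of Barron's revised theory, with energies normalized to the direct sound at 10 m and an 80 ms early/late boundary. The coefficients come from that general form and not from this abstract, and the paper may adapt the model for small rooms. The function names and room values are hypothetical.

```python
import math


def barron_energies(reverb_time, volume, distance):
    """Direct, early-reflected, and late energy, relative to direct sound at 10 m."""
    direct = 100.0 / distance ** 2
    # Reflected energy falls off with distance as well as with room absorption.
    reflected = (31200.0 * reverb_time / volume) * math.exp(
        -0.04 * distance / reverb_time)
    late_fraction = math.exp(-1.11 / reverb_time)  # share arriving after 80 ms
    early = reflected * (1.0 - late_fraction)
    late = reflected * late_fraction
    return direct, early, late


def clarity_c80(reverb_time, volume, distance):
    """Early-to-late energy ratio in dB."""
    direct, early, late = barron_energies(reverb_time, volume, distance)
    return 10.0 * math.log10((direct + early) / late)


def strength_g(reverb_time, volume, distance):
    """Total level in dB relative to the direct sound at 10 m."""
    return 10.0 * math.log10(sum(barron_energies(reverb_time, volume, distance)))


# Hypothetical small room: 60 cubic metres, 0.5 s reverberation time.
near = clarity_c80(0.5, 60.0, 2.0)
far = clarity_c80(0.5, 60.0, 4.0)
```

In this form both clarity and strength decrease as the receiver moves away from the source, because the direct sound falls faster than the reflected energy.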
Teleconferencing applications often use a microphone placed directly on the table surface. A boundary or pressure zone microphone has its membrane mounted so close to a sound-reflecting plane that the membrane receives direct and reflected sounds in phase at all frequencies of interest, thereby avoiding destructive phase interference. However, in typical applications with small tables or a podium surface, a more detailed model is required to determine if the planar boundary assumption is still valid. Diffraction at the surface edges might significantly affect the frequency response. A comparison between the calculated and measured frequency responses indicates that the simulation gives 1/3-octave band levels that are typically within 1 dB of measured values. As expected, the response from a small table is less uniform than that from a large table.
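The in-phase condition for a boundary microphone follows from simple path geometry. The sketch below assumes an infinite, perfectly reflecting plane and a plane wave, which is exactly the idealization the paper questions for small tables, so it does not model edge diffraction. The function names and the speed of sound value are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def path_difference(height, incidence_deg=0.0):
    """Extra path (m) travelled by the reflection; angle measured from the normal."""
    return 2.0 * height * math.cos(math.radians(incidence_deg))


def level_re_direct_db(freq, height, incidence_deg=0.0, c=SPEED_OF_SOUND):
    """Level of direct plus reflected sound relative to the direct sound alone."""
    delta = path_difference(height, incidence_deg)
    magnitude = abs(2.0 * math.cos(math.pi * freq * delta / c))
    return 20.0 * math.log10(max(magnitude, 1e-12))  # floor avoids log of zero


def first_notch_hz(height, incidence_deg=0.0, c=SPEED_OF_SOUND):
    """Lowest frequency at which direct and reflected sound cancel."""
    return c / (2.0 * path_difference(height, incidence_deg))


# A membrane 1 mm above the surface versus a microphone 5 cm above it.
boundary_level = level_re_direct_db(1000.0, 0.001)
raised_notch = first_notch_hz(0.05)
```

With the membrane 1 mm from the surface the first cancellation lies far above the audio band and the level stays close to +6 dB, whereas at 5 cm the first notch falls near 1.7 kHz.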
Standards and Information Documents
AES Standards Committee News
Audio network control; connector for surround microphones; miniature XLR connectors; carriage of MPEG Surround in AES3; audio-file transfer and exchange file format; acoustics, plane-wave tubes; measuring loudspeaker drive units
Mastering for today’s music delivery media requires an understanding of recent developments in audio coding and in physical and online formats. The advent of Apple’s “Mastered for iTunes” program gives rise to new challenges that were tackled by a panel of experts at the 133rd Convention.
135th Convention, New York, Call for Papers