Journal of the AES
2014 March - Volume 62 Number 3
Letter from an AES Founder
Beamforming Regularization, Scaling Matrices, and Inverse Problems for Sound Field Extrapolation and Characterization: Part I – Theory
The underlying hypothesis in spatial sound reproduction technologies is that a listener immersed in a physical reconstruction of a target sound field will experience the appropriate perception over a large listening area. The aim of this paper is twofold: to develop and describe a method of spatial sound field extrapolation (SFE) based on microphone array measurements of arbitrary geometry, and to develop and define a sound field characterization method and a sound field classification based on known objective and subjective metrics. To achieve SFE, a recently developed method was proposed and further analyzed. Once SFE was achieved, the inverse problem solution was investigated to evaluate different sound field metrics: energy density, sound intensity, direction of arrival, diffuseness, velocity vector, energy vector, directional energy, interaural time difference, incident directivity factor, incident directivity index, and directional diffusion.
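Several of the metrics listed above have common energetic definitions that can be sketched in a few lines. The example below is not the paper's method. It assumes that complex pressure and particle-velocity amplitudes are already available at one point (for instance after extrapolation), and it uses the usual definitions of active intensity, energy density, and an intensity-based diffuseness estimate. The function name `field_metrics`, the frame layout, and the constants for air are assumptions made for illustration.

```python
import math

RHO0 = 1.204   # assumed air density in kg/m^3 (about 20 degrees C)
C = 343.0      # assumed speed of sound in m/s

def field_metrics(frames):
    """frames: list of (p, (ux, uy, uz)) complex amplitudes, one entry per frame."""
    n = len(frames)
    mean_i = [0.0, 0.0, 0.0]
    mean_e = 0.0
    for p, u in frames:
        # Active intensity per frame: I = 1/2 Re{conj(p) u}
        for k in range(3):
            mean_i[k] += 0.5 * (p.conjugate() * u[k]).real / n
        # Energy density per frame: E = rho0/4 |u|^2 + |p|^2 / (4 rho0 c^2)
        u_sq = sum(abs(comp) ** 2 for comp in u)
        mean_e += (RHO0 / 4 * u_sq + abs(p) ** 2 / (4 * RHO0 * C ** 2)) / n
    norm_i = math.sqrt(sum(comp * comp for comp in mean_i))
    # Direction of arrival points against the mean energy flow
    doa = [-comp / norm_i for comp in mean_i] if norm_i > 0 else None
    # Diffuseness: 0 for a single plane wave, 1 when the mean intensity vanishes
    diffuseness = 1.0 - norm_i / (C * mean_e) if mean_e > 0 else None
    return {"intensity": mean_i, "energy_density": mean_e,
            "doa": doa, "diffuseness": diffuseness}

# A single plane wave travelling along +x: u = p / (rho0 c) in the x direction
plane = [(1 + 0j, (1 / (RHO0 * C), 0j, 0j))]
# Two equal plane waves from opposite directions, observed in separate frames
opposed = plane + [(1 + 0j, (-1 / (RHO0 * C), 0j, 0j))]
```

Under these assumptions the single plane wave gives a diffuseness near zero with a direction of arrival along the negative x axis, while the two opposed waves average to zero net intensity and a diffuseness of one.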
The Hearing-Aid Speech Quality Index (HASQI) Version 2 is based on a model of the auditory periphery that incorporates changes due to hearing loss. Audio engineers generally assume that any change in the signal produces quality degradation, but hearing aids are more complicated because they explicitly contain nonlinearities that are intended to make speech more intelligible for hearing-impaired users. A quality index for hearing aids must therefore predict the trade-offs between signal distortion and audibility. The nonlinear component of the HASQI model combines measurements of envelope and temporal fine-structure modifications. The linear component is based on changes in the long-term spectrum; the quality prediction is then formed by taking the product of nonlinear and linear terms. Version 2 of HASQI has been validated for noise and nonlinear distortion, noise suppression, frequency compression, acoustic feedback and feedback cancellation, and modulated noise.
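The final combination step described above can be illustrated with a minimal sketch. It covers only the product of the two terms, not the auditory model or the way each term is computed. The function name `combine_quality` and the assumption that each component index lies between 0 and 1 are illustrative choices, not details taken from the abstract.

```python
def combine_quality(q_nonlinear, q_linear):
    """Overall index formed as the product of the nonlinear and linear terms."""
    for name, q in (("q_nonlinear", q_nonlinear), ("q_linear", q_linear)):
        # Assumption: each component index is normalized to the range [0, 1]
        if not 0.0 <= q <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {q}")
    return q_nonlinear * q_linear
```

With a product, a poor score on either term pulls the overall index down, which is one way to express the trade-off between distortion and spectral change.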
In mobile audio device applications, there is sometimes the need to include processing that can change the pitch or tempo of the sound. This additional signal processing places an extra power load on the batteries, which could be minimized if the decoder stage were combined with the pitch-time shift operation. The authors propose a modified AAC (advanced audio coding) decoder structure that can also change the pitch and tempo of an audio signal. To take advantage of the combination, the algorithm includes a preprocessing block before the IMDCT (inverse modified discrete cosine transform) block and a postprocessing block after the overlap and add block. Test results show that the algorithm produces quality results while reducing the computational load to 64% of the separate algorithms.
A Versatile Analytical Approach for Assessing Harmonic Distortion in Current-Driven Electrodynamic Loudspeakers
This research develops a compact and transparent analytical model of current-driven loudspeakers to assess their associated nonlinearities and harmonic distortion. Taylor polynomials are used to derive simple expressions for the first five harmonic distortion components of the spectrum: the significance of each parameter under consideration is rendered transparent using this methodology. This analytical approach is validated by performing a numerical analysis using Simulink software and can easily be implemented using a standard spreadsheet application. The measured parameters of a midrange generic transducer are used to compare the numerical and analytical approaches. The analytical approach is shown to be effective, and it enables the analysis of the evenness property of a given influence parameter. The analytical approach can also be used to model nonlinear parameters as a function of the diaphragm speed.
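The Taylor-polynomial idea can be illustrated with a generic memoryless nonlinearity. The sketch below is not the loudspeaker model of the paper. It assumes a cubic polynomial y = a1 x + a2 x^2 + a3 x^3 driven by x = A cos(theta), stops at the third harmonic rather than the fifth, and checks the closed-form harmonic amplitudes against a direct discrete Fourier sum. All names and coefficient values are hypothetical.

```python
import cmath
import math

def harmonics_closed_form(a1, a2, a3, amp):
    """Harmonic amplitudes from cos^2 = (1 + cos 2t)/2 and cos^3 = (3 cos t + cos 3t)/4."""
    return {
        1: abs(a1 * amp + 0.75 * a3 * amp ** 3),  # fundamental, shifted by the cubic term
        2: abs(0.5 * a2 * amp ** 2),              # second harmonic, from the even term only
        3: abs(0.25 * a3 * amp ** 3),             # third harmonic, from the odd term only
    }

def harmonics_numeric(a1, a2, a3, amp, n=64):
    """Same amplitudes from a discrete Fourier sum over one period of the output."""
    result = {}
    for k in (1, 2, 3):
        acc = 0j
        for i in range(n):
            theta = 2 * math.pi * i / n
            x = amp * math.cos(theta)
            y = a1 * x + a2 * x ** 2 + a3 * x ** 3
            acc += y * cmath.exp(-1j * k * theta)
        result[k] = 2 * abs(acc) / n
    return result

# Hypothetical coefficients and drive amplitude, chosen only for illustration
analytic = harmonics_closed_form(1.0, 0.1, 0.05, 0.8)
numeric = harmonics_numeric(1.0, 0.1, 0.05, 0.8)
```

In this simplified setting the even coefficient a2 contributes only to the second harmonic (and a DC offset), while the odd coefficient a3 contributes to the fundamental and the third harmonic, which is the kind of evenness property a closed-form expression makes visible.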
For design purposes, the transfer function of a so-called bandpass subwoofer, which is designed to handle frequencies at the lowest end of the spectrum, is best viewed as the product of a high-pass response defining its low-frequency cut-off and a low-pass response that defines its crossover to a complementary high-pass response of a “satellite” system (equivalently, the convolution of the corresponding impulse responses). The 2nd-order function that defines the high-frequency limit must now be included in the design of the crossover process with the remaining drivers. The design approach ensures that both monophonic and stereophonic woofer output will integrate with the overall system response. Tables of alignments for a range of combinations of high-pass with low-pass responses indicate that when the low-pass factor of the transfer function is peakier than is desired, drivers can be chosen with more convenient values of parameters.
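In the frequency domain, cascading a high-pass and a low-pass section amounts to multiplying their transfer functions, while the corresponding impulse responses convolve. The sketch below evaluates such a product for two generic 2nd-order sections. It is not one of the alignments tabulated in the paper: the section orders, the corner frequencies of 30 Hz and 120 Hz, and the Q values are all assumptions chosen for illustration.

```python
import math

Q_BUTTERWORTH = 1 / math.sqrt(2)  # assumed Q for both sections

def high_pass_2nd(f, f0, q):
    """Generic 2nd-order high-pass: s^2 / (s^2 + (w0/q) s + w0^2) at s = j 2 pi f."""
    s = 1j * 2 * math.pi * f
    w0 = 2 * math.pi * f0
    return s * s / (s * s + (w0 / q) * s + w0 * w0)

def low_pass_2nd(f, f0, q):
    """Generic 2nd-order low-pass: w0^2 / (s^2 + (w0/q) s + w0^2) at s = j 2 pi f."""
    s = 1j * 2 * math.pi * f
    w0 = 2 * math.pi * f0
    return w0 * w0 / (s * s + (w0 / q) * s + w0 * w0)

def bandpass(f, f_hp=30.0, f_lp=120.0, q_hp=Q_BUTTERWORTH, q_lp=Q_BUTTERWORTH):
    """Overall response as the product of the high-pass and low-pass factors."""
    return high_pass_2nd(f, f_hp, q_hp) * low_pass_2nd(f, f_lp, q_lp)

def level_db(h):
    """Magnitude in decibels; h must be nonzero."""
    return 20 * math.log10(abs(h))
```

Raising `q_lp` above the assumed value makes the low-pass factor peakier near its corner frequency, which is the situation the alignment tables address.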
Sonification is the systematic representation of data using sounds, such as text-to-speech, color readers, Geiger counters, acoustic radars, and MIDI synthesizers. This paper surveys existing sonification systems and suggests a taxonomy of algorithms and devices. The sonification process requires an artificial mapping between two sensory modalities using a model based on either psychoacoustics or artificial heuristics. In the former, the paradigm exploits the natural discrimination of the source spatial parameters (distance, azimuth, and elevation, for instance). In the latter, the paradigm creates an artificial match between graphical and auditory cues. Artificial sonification uses nonspatial characteristics of the sound, such as frequency, brightness or timbre, formants, saturation, and time intervals, which are not related to the physical characteristics or parameters of objects or surroundings.
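An artificial mapping of the kind described above can be as simple as assigning data values to pitch. The sketch below is one hypothetical mapping, not a system from the survey. It assumes a linear map from a data range onto MIDI note numbers 48 to 84 and uses the standard equal-tempered conversion with note 69 tuned to 440 Hz. The function names and the note range are illustrative.

```python
def value_to_midi(value, lo, hi, note_lo=48, note_hi=84):
    """Map a data value in [lo, hi] linearly onto an integer MIDI note number."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    # Clamp so out-of-range data still produces a playable note
    t = min(max((value - lo) / (hi - lo), 0.0), 1.0)
    return round(note_lo + t * (note_hi - note_lo))

def midi_to_hz(note):
    """Equal-tempered frequency of a MIDI note, with note 69 at 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)

def sonify(values):
    """Return one (note, frequency) pair per data point, scaled to the data range."""
    lo, hi = min(values), max(values)
    return [(n, midi_to_hz(n)) for n in (value_to_midi(v, lo, hi) for v in values)]
```

Pitch here carries no physical relation to the data source, which is what makes the mapping artificial in the sense used above.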
Sonification of 3-dimensional shapes is challenging because there are fundamental differences between the human auditory and visual systems. Subjects were asked to explore the audio representations of top-projected shapes having different levels of visual complexity using several different strategies. After a 50-minute experience with each sonification technique, all subjects agreed that auditory patterns derived from cross-sectional profiles of virtual objects were robust and extremely easy to manipulate mentally, enabling them to mentally rebuild virtual shapes. At the beginning of the test, subjects could not imagine that it would be possible to get a sense of a virtual shape by relying exclusively on ordinary MIDI sounds. Two other techniques were not successful.
Standards and Information Documents
AES Standards Committee News
Audio-file transfer & exchange; audio connectors
The AES67 standard describes a relatively straightforward collection of networking solutions and protocols that together enable audio networks to interoperate. A special session presented during the 135th Convention, led by some of the key protagonists, explained the decisions that were made and the compromises involved during the development of the standard.
137th Convention, Los Angeles, Call for Papers
Special Issue on Sound Field Control, Call for Papers