AES Barcelona 2005: Paper Session G


	AES Barcelona 2005 Paper Session G - Multichannel Sound, Part 1 (5.1 Multichannel, general)

Last Updated: 20050419, mei

Sunday, May 29, 09:00 — 13:00

Chair: Geoff Martin, Bang & Olufsen - Struer, Denmark

G-1 Extraction of Auditory Features and Elicitation of Attributes for the Assessment of Multichannel Reproduced Sound—Sylvain Choisel, Aalborg University - Aalborg, Denmark, and Bang & Olufsen A/S, Struer, Denmark; Florian Wickelmaier, Aalborg University - Aalborg, Denmark
The identification of relevant auditory attributes is pivotal in sound quality evaluation. Two fundamentally different psychometric methods were employed to uncover perceptually relevant auditory features of multichannel reproduced sound. In the first method, called Repertory Grid Technique (RGT), subjects were asked to directly assign verbal labels to the features when encountering them and to subsequently rate the sounds on the scales thus obtained. The second method requires the subjects to consistently identify the perceptually relevant features before assigning them a verbal label. Under sufficient consistency, a lattice representation—as frequently used in Formal Concept Analysis (FCA)—can be derived to depict the structure of auditory features.
Convention Paper 6369

G-2 Interaction of Source and Reverberance Spatial Imagery in Multichannel Loudspeaker Audio—John Usher, Wieslaw Woszczyk, McGill University - Montreal, Quebec, Canada
Sound imagery is discussed in many contexts in subjective loudspeaker audio evaluation. In this paper we investigate imagery in terms of the spatial properties of an auditory object. A general categorization of auditory images into either Source or Reverberance images is well established in the literature (also called ASW and LEV). Here we discuss perceptual organization principles and physical factors which affect this distinction. The degree to which existing theories for controlling an S image direction can be applied to controlling an R image was investigated in a two-part experiment. Using a standard 2/2 loudspeaker system and a graphical mapping system developed in previous work we investigated the spatial imagery associated with source and reverberance images around the listener. Stimuli used were a mono channel of recorded anechoic flute and a channel of artificial reverberation. The image direction was affected with either a real loudspeaker source at various locations or a phantom source created with pair-wise amplitude panning. We compare source and reverberance images in terms of perceived image width, distance, and azimuth. We find that: the spatial location and width of S and R images can be independently described using a GUI; pair-wise amplitude panning of S and R images is possible using the side loudspeakers but the consistency of reported spatial image geometry is less than for frontal images; and we gain an insight into spatial (un)masking effects of an S image by an R image.
Convention Paper 6370
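
As an illustration of the pair-wise amplitude panning mentioned in the G-2 abstract, the following is a minimal sketch. It assumes the classic tangent panning law with constant-power normalization; the abstract does not say which panning law the authors used, and the function name and angle convention here are hypothetical.

```python
import math


def tangent_law_gains(target_deg, half_aperture_deg):
    """Return (g_left, g_right) for a phantom source between two loudspeakers.

    Assumption: tangent law, tan(target) / tan(half_aperture) = (gL - gR) / (gL + gR).
    Angles are measured from the midline of the loudspeaker pair and are
    positive toward the left loudspeaker.
    """
    if not 0 < half_aperture_deg < 90:
        raise ValueError("half aperture must be between 0 and 90 degrees")
    if abs(target_deg) > half_aperture_deg:
        raise ValueError("target must lie between the two loudspeakers")

    ratio = math.tan(math.radians(target_deg)) / math.tan(math.radians(half_aperture_deg))

    # Any pair with (gL - gR) / (gL + gR) == ratio satisfies the law;
    # start from (1 + ratio, 1 - ratio) and scale to unit total power.
    g_left, g_right = 1.0 + ratio, 1.0 - ratio
    norm = math.hypot(g_left, g_right)
    return g_left / norm, g_right / norm
```

With this convention, a target on the midline gives equal gains, and a target at the half aperture sends the whole signal to one loudspeaker. The same function could be applied to a side pair of the 2/2 layout by measuring angles from that pair's own midline.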

G-3 Spatial Impulse Response Rendering: Listening Tests and Applications to Continuous Sound—Ville Pulkki, Juha Merimaa, Helsinki University of Technology - Helsinki, Finland
Spatial impulse response rendering (SIRR) is a method to reproduce room impulse responses over multichannel loudspeaker setups. Time-frequency analysis is used to obtain directional and diffuseness information from recorded sound. Totally nondiffuse sound is reproduced as point-like virtual sources, and totally diffuse sound is reproduced with a diffusion technique. In this paper the synthesis of diffuse sound is analyzed and revised. A hybrid method is proposed based on the analysis results. Listening tests with revised SIRR are presented. It is shown that in large spaces, SIRR reproduction cannot be distinguished from the original sample of virtual acoustics. The modifications of SIRR for continuous sound applications are discussed.
Convention Paper 6371

G-4 Measurement of Speech Intelligibility in Noise: A Comparison of a Stereo Image Source and a Central Loudspeaker Source—Ben Shirley, Paul Kendrick, University of Salford - Salford, UK
A series of listening tests was carried out to assess any intelligibility benefit of presenting dialog through a central loudspeaker, as found in 5.1 surround sound systems, compared with presenting it as a central stereo image. Twenty subjects were presented with a series of sentences containing identifying keywords in a background of multitalker babble over a wide stereo image. Half of the sentences were played from a central loudspeaker source and the rest as a central stereo image. The tests showed average improvements in word recognition of up to 9.4 percent using a separate central loudspeaker when compared with a phantom image between a pair of stereo loudspeakers.
Convention Paper 6372

G-5 The Whys and Wherefores of Microphone Array Crosstalk in Multichannel Microphone Array Design—Michael Williams, Sounds of Scotland - Paris, France
Each aspect of crosstalk has a different and definable influence on a specific segment of the multichannel microphone array system. The aspects considered are: the difference in the effect of crosstalk between coincident (or near-coincident) and spaced multichannel array systems; the crosstalk introduced by microphones adjacent to a specific segment; the crosstalk introduced by microphones on the opposing sides of an array; the effect of crosstalk in the transitory and quasi-steady state regions of a natural signal; and crosstalk reduction in the quasi-steady state. Microphone arrays must therefore be designed to minimize this interference with the final sound image, be it front sound stage coverage or surround sound coverage. This paper also includes a description of the Twisted Quad compatible multichannel/stereo recording array system.
Convention Paper 6373

G-6 Investigation of the Effect of Interchannel Crosstalk in Multichannel Microphone Technique—Hyun-Kook Lee, Francis Rumsey, University of Surrey - Guildford, Surrey, UK
A series of subjective listening tests was carried out to investigate the effect of interchannel crosstalk in a multichannel microphone technique. Perceived attributes of interchannel crosstalk images were first elicited and then graded against several independent variables: the type of microphone array (different combinations of time and intensity differences), the sound source, and the acoustic condition. The results showed that the most dominant effects of interchannel crosstalk were an increase in source width and a decrease in locatedness. The ratio of time and intensity differences in the microphone array was the most significant factor for both effects. Sound source type had a significant effect on the source width increase but not on the locatedness decrease. Acoustic condition was significant for the locatedness decrease but not for the source width increase. This paper describes the experimental method and presents and discusses the results in detail.
Convention Paper 6374

G-7 Reproducing Multichannel Sound on Any Speaker Layout—Arnaud Laborie, Rémy Bruno, Sébastien Montoya, Trinnov Audio - Paris, France
Consumers are increasingly interested in multichannel sound. However, installing a surround system is still a headache for the average user. The ITU recommendations are generally incompatible with the layout of typical homes, so people install their systems however they can, which generally results in large spatial sound distortions. This paper presents a system that overcomes this problem by adapting multichannel sound to the actual loudspeaker layout. The system consists of a small calibration microphone that measures loudspeaker characteristics (3-D position, frequency response) and a process that remaps multichannel sound over the calibrated layout to compensate for the measured loudspeaker misplacement, including full 3-D position.
Convention Paper 6375

G-8 New File Format and Methods for Multichannel Sound in Broadcasting—Lars Jonsson, Swedish Radio - Stockholm, Sweden; Axel Holzinger, D.A.V.I.D. GmbH - Munich, Germany
The RF64 file format is designed to meet the requirements for multichannel sound in broadcasting and audio archiving. It is based on the Microsoft RIFF/WAVE format and uses Wave Format Extensible for multichannel parameters. Additions are made to the basic specification to allow file sizes larger than 4 gigabytes when needed. A maximum of 18 surround channels, a stereo downmix channel, and bitstream signals with non-PCM coded data can be stored in the file. All existing supplements and chunks of the BWF and WAVE formats can be carried over into the new format. Early applications have been developed.
Convention Paper 6376
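
To make the G-8 abstract's point about exceeding the 4 gigabyte limit concrete, here is a minimal sketch of an RF64-style file header. The chunk layout (32-bit size fields set to 0xFFFFFFFF, with the real 64-bit sizes stored in a `ds64` chunk) follows the published EBU RF64 specification rather than anything stated in the abstract. The function name is hypothetical, and the sketch assumes a plain PCM `fmt ` chunk instead of Wave Format Extensible, with an even data size and no extra chunks.

```python
import struct


def rf64_header(num_channels, sample_rate, bits_per_sample, data_size):
    """Build the bytes that precede the audio data in a simplified RF64 file.

    Assumptions: plain PCM format tag (1) instead of Wave Format Extensible,
    an even data_size (no pad byte), and no chunks other than ds64, fmt, data.
    """
    block_align = num_channels * bits_per_sample // 8
    sample_count = data_size // block_align  # sample frames in the data chunk

    # Standard 16-byte PCM 'fmt ' chunk body, little-endian as in RIFF/WAVE.
    fmt_chunk = b"fmt " + struct.pack(
        "<IHHIIHH",
        16,                          # chunk body size
        1,                           # format tag: PCM
        num_channels,
        sample_rate,
        sample_rate * block_align,   # average bytes per second
        block_align,
        bits_per_sample,
    )

    ds64_total = 8 + 28  # ds64 chunk header plus its 28-byte body
    # RIFF size counts everything after the first 8 bytes of the file.
    riff_size = 4 + ds64_total + len(fmt_chunk) + 8 + data_size

    ds64_chunk = b"ds64" + struct.pack(
        "<IQQQI",
        28,            # chunk body size
        riff_size,     # 64-bit RIFF size
        data_size,     # 64-bit data chunk size
        sample_count,  # 64-bit sample count
        0,             # no table entries for other oversized chunks
    )

    return (
        b"RF64" + struct.pack("<I", 0xFFFFFFFF) + b"WAVE"
        + ds64_chunk
        + fmt_chunk
        + b"data" + struct.pack("<I", 0xFFFFFFFF)
    )
```

Because the 32-bit size fields only hold a placeholder, a reader that understands RF64 takes the sizes from `ds64`, which is what allows the data chunk to grow beyond 4 gigabytes.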