AES Budapest 2012
Paper Session P9

P9 - Analysis and Synthesis: Part 1

Friday, April 27, 09:00 — 12:00 (Room: Liszt)

Chair: Juha Backman

P9-1 Emergency Voice/Stress-level Combined Recognition for Intelligent House Applications—Konstantinos Drosos, Andreas Floros, Ionian University - Corfu, Greece; Kyriakos Agkavanakis, BlueDev Ltd. - Patras, Greece; Nicolas-Alexander Tatlas, Technological Educational Institute of Piraeus - Piraeus, Greece; Nikolaos-Grigorios Kanellopoulos, Ionian University - Corfu, Greece
Legacy technologies for word recognition can benefit from emerging affective voice retrieval, potentially leading to intelligent applications for smart houses enhanced with new features. In this paper we introduce the implementation of a system capable of reacting to common spoken words while taking into account the estimated vocal stress level, thus allowing the realization of a prioritized, affective aural interaction path. Upon successful word recognition and the corresponding stress level estimation, the system triggers particular affective-prioritized actions, defined within the application scope of an intelligent home environment. Application results show that the established affective interaction path significantly improves the ambient intelligence provided by an affective vocal sensor that can be easily integrated with any sensor-based home monitoring system.
Convention Paper 8615

P9-2 Loudness Range (LRA)—Design and Evaluation—Esben Skovenborg, TC Electronic A/S - Risskov, Denmark
Loudness Range (LRA) is a measure of the variation of loudness on a macroscopic time-scale. Essentially, LRA is the difference in loudness level between the soft and loud parts of a program. In 2009 the algorithm for computing LRA was published by TC Electronic and was then included in the EBU R-128 recommendation for loudness normalization. This paper describes the design choices underlying the LRA algorithm. For each of its parameters the interval of optimal values is presented, supported by analyses of audio examples. Consequences of setting parameter values outside these intervals are also described. Although the LRA measure has already proven its usefulness in practice, this paper provides background knowledge that could support further refinement and standardization of the LRA measure.
Convention Paper 8616
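
The abstract above describes LRA as the difference in loudness level between the soft and loud parts of a program. A minimal sketch of that idea follows, assuming the gate and percentile values published in EBU Tech 3342 (absolute gate at -70 LUFS, relative gate 20 LU below the gated mean, 10th and 95th percentiles); the abstract itself does not list them. The function name, the input format (a list of short-term loudness values in LUFS), and the percentile indexing rule are illustrative assumptions, not the author's implementation.

```python
import math

# Assumed parameter values (from EBU Tech 3342, not from the abstract).
ABS_GATE_LUFS = -70.0      # absolute gate
REL_GATE_LU = -20.0        # relative gate, applied to the gated mean
LOW_PCT, HIGH_PCT = 10, 95 # percentiles bounding the "soft" and "loud" parts


def loudness_range(short_term_lufs):
    """Return an LRA estimate in LU from short-term loudness values in LUFS."""
    # Step 1: discard values below the absolute gate (silence, noise floor).
    above_abs = [l for l in short_term_lufs if l >= ABS_GATE_LUFS]
    if not above_abs:
        return 0.0

    # Step 2: power-domain mean of the remaining values, then relative gate.
    mean_power = sum(10 ** (l / 10) for l in above_abs) / len(above_abs)
    rel_threshold = 10 * math.log10(mean_power) + REL_GATE_LU
    gated = sorted(l for l in above_abs if l >= rel_threshold)

    # Step 3: difference between the high and low percentiles.
    # The rounding rule used for the index is a hypothetical choice.
    def percentile(p):
        return gated[round((len(gated) - 1) * p / 100)]

    return percentile(HIGH_PCT) - percentile(LOW_PCT)


# Hypothetical program: half at -23 LUFS, half at -33 LUFS, plus some silence.
example = [-23.0] * 50 + [-33.0] * 50 + [-80.0] * 10
example_lra = loudness_range(example)
```

The two gates keep silence and very quiet passages from inflating the result, and the percentiles keep isolated loud or soft events from dominating it.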

P9-3 Statistical Properties of the Close-Microphone Responses—Elias K. Kokkinis, Eleftheria Georganti, John Mourjopoulos, University of Patras - Patras, Greece
The close-microphone technique is widely used in modern sound engineering practice, mainly to minimize the effect of leakage and room acoustics on the received signal. In this paper the properties of the close-microphone response are investigated from a signal processing point of view, through the respective frequency domain statistical moments. Room impulse response measurements were made in various rooms and at various source-microphone distances, and statistical moments were calculated over frequency and distance. It is shown that the statistical properties of the impulse responses remain relatively constant for short source-microphone distances, and this in turn provides a consistent sound, which was validated through a subjective evaluation test.
Convention Paper 8617

P9-4 Reproduction of Proximity Virtual Sources Using a Line Array of Loudspeakers—Jung-Min Lee, Jung-Woo Choi, Dong-Soo Kang, Yang-Hann Kim, Korea Advanced Institute of Science and Technology (KAIST) - Daejeon, Korea
To reproduce a desired sound field from virtual sources, Wave Field Synthesis (WFS) assumes that the virtual sources are positioned in the far field of the loudspeaker array. This far-field assumption inevitably produces reproduction errors when the virtual source is adjacent to the array. In this paper we propose a method that can render the sound field from virtual sources positioned near the loudspeaker array. The driving functions of the loudspeakers are derived for the planar array geometry, and then the surface integral is reduced to a line integral by utilizing approximations different from those of WFS. In addition, a modified equation for the discrete loudspeaker distribution is presented. Numerical simulations show that the proposed method can reduce the reproduction error to a practically acceptable level.
Convention Paper 8618

P9-5 On the Statistics of Binaural Room Transfer Functions—Eleftheria Georganti, University of Patras - Patras, Greece; Tobias May, Steven van de Par, University of Oldenburg - Oldenburg, Germany; John Mourjopoulos, University of Patras - Patras, Greece
The well-known property of the spectral standard deviation of Room Transfer Functions (RTFs), that is, its convergence to 5.57 dB, is extended to reverberant Binaural Room Transfer Functions (BRTFs). The BRTFs are related to the anechoic Head Related Transfer Functions (HRTFs) and the corresponding RTFs. Consequently, the statistical properties of the RTFs and HRTFs can be systematically related to the statistical properties of the BRTFs. In this paper the standard deviation of BRTFs, measured in different types of rooms for various source/receiver distances and azimuth angles, is computed. The derived values are compared to those obtained from the single-channel RTFs measured at the same positions, and their relationship to the 5.57 dB value is discussed.
Convention Paper 8619
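
The 5.57 dB figure mentioned in the abstract above can be checked numerically. The sketch below assumes the diffuse-field model usually associated with this result: at each frequency the real and imaginary parts of the transfer function are independent zero-mean Gaussians with equal variance. Under that assumption the squared magnitude is exponentially distributed and the standard deviation of the level in dB has the closed form (10 / ln 10) · π / √6. The function name, the bin count, and the seed are illustrative choices, and the code does not reproduce the measurements of the paper.

```python
import math
import random
import statistics

# Closed-form value under the assumed complex Gaussian model.
THEORETICAL_STD_DB = (10 / math.log(10)) * math.pi / math.sqrt(6)


def spectral_std_db(num_bins=100_000, seed=0):
    """Simulate the spectral standard deviation (dB) of a diffuse-field RTF."""
    rng = random.Random(seed)
    levels_db = []
    for _ in range(num_bins):
        # One frequency bin: independent Gaussian real and imaginary parts.
        re, im = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        levels_db.append(10 * math.log10(re * re + im * im))
    return statistics.pstdev(levels_db)


simulated_std_db = spectral_std_db()
```

With enough bins the simulated value should settle close to `THEORETICAL_STD_DB`, which rounds to 5.57 dB.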

P9-6 On Acoustic Detection of Vocal Modes—Eddy B. Brixen, EBB-consult - Smorum, Denmark; Cathrine Sadolin, Henrik Kjelin, Complete Vocal Institute - Copenhagen, Denmark
This paper is a last-minute cancellation. According to the Complete Vocal Technique, four vocal modes are defined: Neutral, Curbing, Overdrive, and Edge. These modes are valid for both the singing voice and the speaking voice. The modes are clearly identified both from listening and from visual laryngograph inspection of the vocal cords and the surrounding area of the vocal tract. For many reasons it would be preferable to apply a simple acoustic analysis to identify the modes. This paper looks at the characteristics of the vocal modes from an acoustical perspective, based on voice samples from four male and two female subjects. The paper describes frequency domain criteria for the discrimination of the various modes.
Convention Paper 8620
