AES Conventions and Conferences

Saturday, October 5, 9:00 am – 11:00 am
SESSION B: SIGNAL PROCESSING, PART 1

Chair: Rob Maher, Montana State University, Bozeman, MT, USA

B-1 A Talker-Tracking Microphone Array for Teleconferencing Systems
Kazunori Kobayashi, Ken’ichi Furuya, Akitoshi Kataoka, NTT, Musashino-shi, Tokyo, Japan

We propose a beamforming method for near-field sound sources in which a filter-and-sum microphone array preserves the quality of the target sound better than the conventional delay-and-sum array. We also describe a real-time implementation that steers the beam to detected talker locations. The microphone array also reduces noise levels to achieve high-quality sound acquisition, and it allows the talker to be in any position. Computer simulations and experiments show that our method is effective in teleconferencing systems.
Convention Paper 5642
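
As a point of reference for the B-1 abstract, the following is a minimal sketch of the conventional delay-and-sum baseline steered to a near-field talker position. It is not the authors' filter-and-sum method. The function names, the 343 m/s speed of sound, and the rounding of delays to whole samples are all assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def steering_delays(mic_positions, talker_position, sample_rate):
    """Return each microphone's extra delay, in whole samples, relative to
    the microphone closest to the talker (spherical, near-field wavefront)."""
    distances = [math.dist(mic, talker_position) for mic in mic_positions]
    nearest = min(distances)
    return [round((d - nearest) / SPEED_OF_SOUND * sample_rate) for d in distances]


def delay_and_sum(channels, delays):
    """Advance each channel by its delay so the talker's wavefront lines up,
    then average the aligned samples across microphones."""
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]
```

Steering to a newly detected talker location then amounts to recomputing `steering_delays` for that position. A filter-and-sum array would replace the pure per-channel delays with per-channel filters.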

B-2 An Alternative Model for Sound Signals Encountered in Reverberant Environments; Robust Maximum Likelihood Localization and Parameter Estimation Based on a Sub-Gaussian Model
Panayiotis G. Georgiou, Chris Kyriakakis, University of Southern California, Los Angeles, CA, USA

In this paper we investigate an alternative to the Gaussian density for modeling signals encountered in audio environments. The observation that sound signals are impulsive in nature, combined with the reverberation effects commonly encountered in audio, motivates the use of the sub-Gaussian density. The new sub-Gaussian statistical model and the separable solution of its maximum likelihood estimator are derived. These are used in an array scenario to demonstrate the achievable performance gains, both in simulations and with two different microphone arrays. The simulations show the robustness of the sub-Gaussian-based method, while the real-world experiments reveal a significant performance gain, supporting the claim that the sub-Gaussian model is better suited for sound signals.
Convention Paper 5643

B-3 Imperceptible Echo for Robust Audio Watermarking
Hyen-O Oh¹, Jong Won Seok², Jin Woo Hong², Dae-Hee Youn¹ - ¹Yonsei University, Seoul, Korea; ²Electronics & Telecommunications Research Institute (ETRI), Daejon, Korea

In echo watermarking, the effort to improve robustness often conflicts with the requirement of imperceptibility. There have been inherent trade-offs in general audio watermarking techniques. In this paper we take on the development of imperceptible but detectable echo kernels that are embedded directly into the high-quality audio signal. Mathematical and perceptual characteristics of echo kernels are analyzed in the frequency domain. We obtain a flatter frequency response in perceptually significant bands by combining closely located positive and negative echoes. The proposed echo makes it possible to improve the robustness of an echo watermark without compromising imperceptibility.
Convention Paper 5644
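
To illustrate the idea in the B-3 abstract of pairing a positive echo with a closely located negative one, here is a minimal sketch that embeds an echo kernel and evaluates its magnitude response at a single frequency. The delays (44 and 46 samples), the gain of 0.3, and all function names are assumptions chosen for illustration, not values from the paper.

```python
import cmath
import math


def embed_echo(signal, kernel):
    """Add scaled, delayed copies of the signal to itself.
    `kernel` maps an echo delay in samples to that echo's gain."""
    out = list(signal)
    for delay, gain in kernel.items():
        for n in range(delay, len(signal)):
            out[n] += gain * signal[n - delay]
    return out


def kernel_response(kernel, freq_hz, sample_rate):
    """Magnitude of H(f) = 1 + sum(gain * exp(-j*w*delay)) at one frequency."""
    w = 2 * math.pi * freq_hz / sample_rate
    h = 1 + sum(gain * cmath.exp(-1j * w * delay) for delay, gain in kernel.items())
    return abs(h)


SINGLE_ECHO = {44: 0.3}            # one positive echo (hypothetical values)
PAIRED_ECHO = {44: 0.3, 46: -0.3}  # positive echo plus a nearby negative echo
```

With these assumed values, the two echoes of the paired kernel nearly cancel at low frequencies, so its response stays close to unity there, while the single echo raises the response toward 1.3 near DC. The sketch shows only this low-frequency cancellation; the perceptual analysis described in the abstract is not modeled.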

B-4 New High Data Rate Audio Watermarking Based on SCS (Scalar Costa Scheme)
Frank Siebenhaar¹, Christian Neubauer¹, Robert Bäuml², Jürgen Herre¹ - ¹Fraunhofer Institute for Integrated Circuits, Erlangen, Germany; ²Friedrich-Alexander University, Erlangen, Germany

Presently, distribution of audio material is no longer limited to physical media. Instead, distribution via the Internet is of increasing importance. In order to attach additional information to the audio content, either for forensic or digital rights management purposes or for annotation purposes, watermarking is a promising technique since it is independent of the audio format and transmission technology.
State-of-the-art spread spectrum watermarking systems can offer high robustness against unintentional and intentional signal modifications. However, their data rate is typically comparatively low, often below 100 bit/s. This paper describes the adaptation of a new watermarking scheme called Scalar Costa Scheme (SCS), which is based on dithered quantization of audio signals. In order to fulfill the demands of high quality audio signal processing, modifications of the basic SCS, such as the introduction of a psychoacoustic model and new algorithms to determine quantization intervals, are required. Simulation figures and results of a sample implementation, which show the potential of this new watermarking scheme, are presented in this paper along with a short theoretical introduction to the SCS watermarking scheme.
Convention Paper 5645
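
The dithered quantization behind the basic Scalar Costa Scheme mentioned in the B-4 abstract can be sketched as follows. This is a minimal illustration that embeds one bit per sample with a fixed quantization step. It omits the psychoacoustic model and the adaptive quantization intervals that the abstract says are required for high-quality audio. The function names and the parameters `step`, `alpha`, and `dither` are assumptions for this sketch.

```python
import math


def quantize(value, step):
    """Uniform scalar quantizer: round to the nearest multiple of `step`."""
    return step * math.floor(value / step + 0.5)


def scs_embed(samples, bits, step, alpha, dither):
    """Embed one bit per sample. Each bit selects one of two interleaved
    quantizer lattices, shifted by a secret dither value in [0, 1).
    `alpha` in (0, 1] scales how far the sample moves toward the lattice."""
    marked = []
    for x, bit, k in zip(samples, bits, dither):
        offset = step * (bit / 2 + k)
        error = quantize(x - offset, step) - (x - offset)
        marked.append(x + alpha * error)
    return marked


def scs_extract(received, step, dither):
    """Recover bits by measuring the distance to the bit-0 lattice:
    a small distance means 0, a distance near step/2 means 1."""
    bits = []
    for r, k in zip(received, dither):
        shifted = r - step * k
        distance = abs(quantize(shifted, step) - shifted)
        bits.append(0 if distance < step / 4 else 1)
    return bits
```

In this sketch the embedding distortion per sample is at most `alpha * step / 2`, and noise-free extraction recovers the bits whenever `alpha` is greater than 0.5. A larger `step` buys more robustness at the cost of more distortion.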

(C) 2002, Audio Engineering Society, Inc.