AES Conventions and Conferences


v3.0, 20040325, ME

Session C Saturday, May 8 13:00 h–16:00 h
Chair: Derk Reefman, Philips Research, Eindhoven, The Netherlands

C-1 Taking Care of Tomorrow Before it Is Too Late: A Pragmatic Archiving Strategy
Nicolas Hans (1), Johan de Koster (2)
(1) Dalet Digital Media Systems, Paris, France
(2) Radio Netherlands, Hilversum, The Netherlands
An increasing number of broadcasters and organizations are considering the digitization of their media archives. Implementing digital media libraries so as to ensure the proper preservation of legacy archives has been recognized as a priority. Yet, many organizations are faced with a paradox: although strategic, these digitization projects are postponed because of budgetary constraints. This paper discusses several case studies and suggests a new approach to implementing a successful digital archiving strategy—one that will get approval and support from management.
C-2 Archiving of Radio Broadcast Data Using Automatic Metadata Generation Methods within MediaFabric Framework
Jobst Löffler (1), Joachim Köhler (1), Helge Blohmer (2), Kai-Uwe Kaup (2)
(1) Fraunhofer Institute for Media Communication, Sankt Augustin, Germany
(2) VCS Aktiengesellschaft, Bochum, Germany
This paper describes methods for automatic extraction of descriptive metadata for audio material and the workflow of archiving. These new algorithms and archiving tools developed at Fraunhofer IMK are to be directly integrated into MediaFabric, a commercially available radio broadcasting framework. Processing steps are based on pattern recognition algorithms and include speech/nonspeech detection, speaker change detection and classification, and jingle and advertising recognition. The extracted audio structure is described as a hierarchical representation of segment nodes annotated with suitable metadata. The extended retrieval application allows interactive display and navigation of the audio structure. A novel approach to keyword search based on a syllable representation of audio material is used for effective retrieval within the digital radio archive.
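The "hierarchical representation of segment nodes annotated with suitable metadata" could look roughly like the sketch below. This is only an illustration under stated assumptions: the class name SegmentNode, its fields, and the labels are hypothetical and are not taken from the paper or from MediaFabric.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SegmentNode:
    """Hypothetical node in an audio-structure tree (not the paper's schema)."""
    start: float                      # segment start, in seconds
    end: float                        # segment end, in seconds (exclusive)
    label: str                        # e.g. "speech", "jingle", "speaker"
    metadata: Dict[str, str] = field(default_factory=dict)
    children: List["SegmentNode"] = field(default_factory=list)

    def add(self, child: "SegmentNode") -> "SegmentNode":
        """Attach a child that must lie inside this node's time span."""
        if not (self.start <= child.start and child.end <= self.end):
            raise ValueError("child segment lies outside its parent")
        self.children.append(child)
        return child

    def find(self, t: float) -> List["SegmentNode"]:
        """Return the path from this node down to the deepest node containing t."""
        if not (self.start <= t < self.end):
            return []
        for child in self.children:
            path = child.find(t)
            if path:
                return [self] + path
        return [self]


# Assumed example: a ten-minute programme with two speakers and a jingle.
root = SegmentNode(0.0, 600.0, "programme")
speech = root.add(SegmentNode(0.0, 300.0, "speech"))
speech.add(SegmentNode(0.0, 120.0, "speaker", {"id": "A"}))
speech.add(SegmentNode(120.0, 300.0, "speaker", {"id": "B"}))
root.add(SegmentNode(300.0, 310.0, "jingle"))

path = root.find(130.0)  # navigate from the programme down to second 130
```

Navigating by time position, as find does here, is one plausible way to support the interactive display the abstract mentions; the actual retrieval application may work differently.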
C-3 EBU Tests of Commercial Audio Watermarking Systems
Andrew Mason, BBC Research and Development, Tadworth, Surrey, UK
Audio watermarking has recently had a resurgence of interest, spurred on by the desire for copyright protection of digital audio recordings. Several audio watermarking techniques, some dating back more than 30 years, are described briefly here. The uses to which watermarking might be put are also summarized. Attention is then focussed on the requirements identified by the EBU applicable to distribution over the Eurovision and Euroradio networks. The EBU issued a call for systems to meet its requirements. Subjective and objective tests were done on the systems supplied for testing. Audibility and robustness of the watermarks were measured. The results are encouraging for those considering using audio watermarking in broadcast applications.
C-4 Morphological Sound Description: Computational Model and Usability Evaluation
Julien Ricard, Perfecto Herrera, Pompeu Fabra University, Barcelona, Spain
Metadata for sound samples is usually limited to a source label and several related textual labels. In the context of sound retrieval, this makes retrieving sounds that have no identifiable source (“abstract sounds”) a hard task. We propose a description framework focusing on intrinsic perceptual sound qualities, based on Schaeffer’s research on sound objects, which could be used to represent and retrieve abstract sounds and to refine traditional search by source for non-abstract sounds. We show that some perceptual labels can be automatically extracted with good performance, avoiding the time-consuming manual labeling task, and that the resulting representation is evaluated as useful and usable by a pool of users.
C-5 A Non-Linear Rhythm-Based Style Classification for Broadcast Speech-Music Discrimination
Enric Guaus, Eloi Batlle, Pompeu Fabra University, Barcelona, Spain
Speech-music discriminators are usually designed under some rigid constraints. This paper deals with a more general speech-music discriminator designed for the AIDA project. The system is based on a Hidden Markov Model (HMM) style classification process in which the styles are grouped into two major categories: speech or music. The goals of this subsystem are: (1) extensibility, so that new styles (like “phone female voice”) can be added; (2) the use of new rhythmical descriptors in combination with other typical ones; and (3) robustness of the speech/music discriminator in many different environments, achieved with mathematical morphology and nonlinear postprocessing techniques. The techniques used in our system allow fast tracking of changes between styles, so typical confusions in commercials can be easily cleaned up. The accuracy of this system reaches up to 94.3 percent in a broadcast radio environment.
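As a rough illustration of how mathematical morphology can clean a frame-by-frame speech/music decision, the sketch below applies a 1-D closing followed by an opening to a binary label sequence. The abstract does not say which operators or window sizes the authors use, so the function names, the choice of closing-then-opening, and the window width of 3 frames are all assumptions.

```python
from typing import List


def _window(x: List[int], i: int, radius: int) -> List[int]:
    """Samples within `radius` of index i, clipped at the sequence edges."""
    return x[max(0, i - radius): i + radius + 1]


def erode(x: List[int], radius: int = 1) -> List[int]:
    """Binary erosion: a frame stays 1 only if its whole window is 1."""
    return [min(_window(x, i, radius)) for i in range(len(x))]


def dilate(x: List[int], radius: int = 1) -> List[int]:
    """Binary dilation: a frame becomes 1 if any frame in its window is 1."""
    return [max(_window(x, i, radius)) for i in range(len(x))]


def smooth_labels(x: List[int], radius: int = 1) -> List[int]:
    """Closing (fills short 0-gaps), then opening (removes short 1-bursts)."""
    closed = erode(dilate(x, radius), radius)
    return dilate(erode(closed, radius), radius)


# Assumed frame labels: 1 = music, 0 = speech. The sequence has a one-frame
# dropout inside the music and a one-frame false alarm inside the speech.
raw = [1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
clean = smooth_labels(raw, radius=1)
```

With these assumed settings the one-frame errors disappear while the real music-to-speech transition is kept. A larger radius would remove longer errors at the cost of reacting more slowly to genuine style changes.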
C-6 Audio Patch Method in Audio Decoders: MP3 and AAC
Han-Wen Hsu, Chi-Min Liu, Wen-Chieh Lee, National Chiao Tung University, Hsin-Chu, Taiwan
Current audio encoders like MP3 or AAC introduce some artifacts because of the bit-rate constraint. This paper considers two of them. The first artifact is the unusual spectral valley, which is perceived as fishy noise. The second is spectrum clipping, which leads to muffled audio. This paper proposes the spectrum patch method to handle the two artifacts in the decoders. The technique can be included in MPEG-1 Layer 3 and MPEG-4 AAC (Advanced Audio Coding) decoders to conceal the artifacts without prior information on the original audio tracks. Intensive experiments have been conducted on various encoders and audio tracks to check the quality improvement and the possible risk of degrading the quality. The objective test measure used is the system recommended by ITU-R Task Group 10/4.


(C) 2004, Audio Engineering Society, Inc.