AES 116th Convention: PAPERS

v3.0, 20040325, ME

Session C Saturday, May 8 13:00 h–16:00 h
AUDIO ARCHIVING, STORAGE, AND RESTORATION; CONTENT MANAGEMENT
Chair: Derk Reefman, Philips Research, Eindhoven, The Netherlands

C-1 Taking Care of Tomorrow Before it Is Too Late—A Pragmatic Archiving Strategy—Nicolas Hans¹, Johan de Koster²

¹ Dalet Digital Media Systems, Paris, France; ² Radio Netherlands, Hilversum, The Netherlands
An increasing number of broadcasters and organizations are considering the digitization of their media archives. Implementing digital media libraries so as to ensure the proper preservation of legacy archives has been recognized as a priority. Yet, many organizations are faced with a paradox: although strategic, these digitization projects are postponed because of budgetary constraints. This paper discusses several case studies and suggests a new approach to implementing a successful digital archiving strategy—one that will get approval and support from management.
C-2 Archiving of Radio Broadcast Data Using Automatic Metadata Generation Methods within MediaFabric Framework—Jobst Löffler¹, Joachim Köhler¹, Helge Blohmer², Kai-Uwe Kaup²

¹ Fraunhofer Institute for Media Communication, Sankt Augustin, Germany; ² VCS Aktiengesellschaft, Bochum, Germany
This paper describes methods for automatic extraction of descriptive metadata for audio material and the workflow of archiving. These new algorithms and archiving tools developed at Fraunhofer IMK are to be directly integrated into MediaFabric, a commercially available radio broadcasting framework. Processing steps are based on pattern recognition algorithms and include speech/nonspeech detection, speaker change detection and classification, and jingle and advertising recognition. The extracted audio structure is described as a hierarchical representation of segment nodes annotated with suitable metadata. The extended retrieval application allows interactive display and navigation of the audio structure. A novel approach to keyword search based on a syllable representation of audio material is used for effective retrieval within the digital radio archive.
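As an illustration only, the hierarchical representation of segment nodes mentioned in the abstract could be sketched as a small tree structure. The class name `SegmentNode`, the label strings, the metadata keys, and the example time spans are all hypothetical; nothing here reflects the actual MediaFabric data model.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class SegmentNode:
    """One node of the audio structure: a time span plus descriptive metadata."""
    start: float  # seconds from the start of the recording
    end: float
    label: str  # hypothetical labels, e.g. "speech", "speaker", "jingle"
    metadata: Dict[str, str] = field(default_factory=dict)
    children: List["SegmentNode"] = field(default_factory=list)

    def add(self, child: "SegmentNode") -> "SegmentNode":
        """Attach a child segment; it must lie inside this segment's time span."""
        if not (self.start <= child.start <= child.end <= self.end):
            raise ValueError("child segment must lie inside its parent")
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "SegmentNode"]]:
        """Yield (depth, node) pairs in depth-first order for display or search."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


def find(root: SegmentNode, label: str) -> List[SegmentNode]:
    """Return every node in the tree that carries the given label."""
    return [node for _, node in root.walk() if node.label == label]


# Invented example: a two-minute broadcast with speech, a jingle, and an advert.
show = SegmentNode(0.0, 120.0, "broadcast", {"station": "example"})
speech = show.add(SegmentNode(0.0, 90.0, "speech"))
speech.add(SegmentNode(0.0, 40.0, "speaker", {"id": "A"}))
speech.add(SegmentNode(40.0, 90.0, "speaker", {"id": "B"}))
show.add(SegmentNode(90.0, 100.0, "jingle"))
show.add(SegmentNode(100.0, 120.0, "advertising"))

for depth, node in show.walk():
    print("  " * depth + f"{node.label} [{node.start:.0f}-{node.end:.0f} s] {node.metadata}")
```

A depth-first walk of this kind is one plausible basis for the interactive display and navigation the abstract mentions; the paper itself may organize the structure differently.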
C-3 EBU Tests of Commercial Audio Watermarking Systems—Andrew Mason, BBC Research and Development, Tadworth, Surrey, UK
Audio watermarking has recently had a resurgence of interest, spurred on by the desire for copyright protection of digital audio recordings. Several audio watermarking techniques, some dating back more than 30 years, are described briefly here. The uses to which watermarking might be put are also summarized. Attention is then focussed on the requirements identified by the EBU applicable to distribution over the Eurovision and Euroradio networks. The EBU issued a call for systems to meet its requirements. Subjective and objective tests were done on the systems supplied for testing. Audibility and robustness of the watermarks were measured. The results are encouraging for those considering using audio watermarking in broadcast applications.
C-4 Morphological Sound Description: Computational Model and Usability Evaluation—Julien Ricard, Perfecto Herrera, Pompeu Fabra University, Barcelona, Spain
Metadata for sound samples are usually limited to a source label and several related textual labels. In the context of sound retrieval, this makes retrieving sounds that have no identifiable source (“abstract sounds”) a hard task. We propose a description framework focusing on intrinsic perceptual sound qualities, based on Schaeffer’s research on sound objects, which could be used to represent and retrieve abstract sounds and to refine traditional search by source for non-abstract sounds. We show that some perceptual labels can be automatically extracted with good performance, avoiding the time-consuming manual labeling task, and that the resulting representation is evaluated as useful and usable by a pool of users.
C-5 A Non-Linear Rhythm-Based Style Classification for Broadcast Speech-Music Discrimination—Enric Guaus, Eloi Batlle, Pompeu Fabra University, Barcelona, Spain
Speech-music discriminators are usually designed under some rigid constraints. This paper deals with a more general speech-music discriminator designed for the AIDA project. The system is based on a Hidden Markov Model (HMM) style classification process in which the styles are grouped into two major categories: speech or music. The goals of this subsystem are: (1) expandability, so that new styles (such as “phone female voice”) can be added; (2) the use of new rhythmical descriptors in combination with other typical ones; and (3) robustness of the speech/music discriminator across many different environments, achieved with mathematical morphology and nonlinear postprocessing techniques. The techniques used in our system allow fast tracking of changes between styles, so typical confusions in commercials can be easily cleaned up. The accuracy of this system reaches up to 94.3 percent in a broadcast radio environment.
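To give a rough idea of what morphological postprocessing of a speech/music decision sequence can look like, the sketch below applies a one-dimensional binary opening followed by a closing to frame-level labels (0 = speech, 1 = music). The function names, the window radius, the order of the two operations, and the example sequence are assumptions made for illustration; the abstract does not specify the authors' actual operators or parameters.

```python
from typing import Callable, List, Sequence


def _window_op(labels: Sequence[int], radius: int,
               op: Callable[[Sequence[int]], int]) -> List[int]:
    """Apply op (min or max) over a sliding window, truncated at the edges."""
    n = len(labels)
    return [op(labels[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def erode(labels: Sequence[int], radius: int) -> List[int]:
    """Binary erosion: a frame stays 1 only if its whole window is 1."""
    return _window_op(labels, radius, min)


def dilate(labels: Sequence[int], radius: int) -> List[int]:
    """Binary dilation: a frame becomes 1 if any frame in its window is 1."""
    return _window_op(labels, radius, max)


def opening(labels: Sequence[int], radius: int) -> List[int]:
    """Erosion then dilation: removes short isolated runs of 1s."""
    return dilate(erode(labels, radius), radius)


def closing(labels: Sequence[int], radius: int) -> List[int]:
    """Dilation then erosion: fills short gaps of 0s inside runs of 1s."""
    return erode(dilate(labels, radius), radius)


def smooth(labels: Sequence[int], radius: int) -> List[int]:
    """Assumed cleanup order: opening first, then closing."""
    return closing(opening(labels, radius), radius)


# Invented frame decisions: one spurious "music" frame inside speech and
# one spurious "speech" frame inside music.
raw = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1]
print(smooth(raw, radius=1))
```

With a radius of one frame, runs shorter than the window are treated as classification noise while the real speech-to-music transition is kept; a larger radius would suppress longer confusions at the cost of slower reaction to genuine changes.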
C-6 Audio Patch Method in Audio Decoders—MP3 and AAC—Han-Wen Hsu, Chi-Min Liu, Wen-Chieh Lee, National Chiao Tung University, Hsin-Chu, Taiwan
Current audio encoders such as MP3 or AAC introduce some artifacts because of the bit-rate constraint. This paper considers two of them. The first artifact is the unusual spectral valley, which is perceived as fishy noise. The second is spectrum clipping, which leads to muffled audio. This paper proposes the spectrum patch method to handle the two artifacts in the decoders. The technique can be included in MPEG-1 Layer 3 and MPEG-4 AAC (Advanced Audio Coding) decoders to conceal the artifacts without prior information on the original audio tracks. Intensive experiments have been conducted on various encoders and audio tracks to check the quality improvement and the possible risks of degrading the quality. The objective test measure used is the system recommended by ITU-R Task Group 10/4.