v3.0, 20040325, ME
Session Z2 Saturday, May 8, 14:00 h–15:30 h
Posters: Audio in Computers & Audio Video Systems
Audio in Computers:
Z2-1 Film Music Recording Using Technology, Robert Ellis-Geiger, Hong Kong Polytechnic University, Hong Kong
This paper presents a new approach to recording acoustic music for film that has the potential to dramatically improve the performance of an orchestra, small ensemble, or solo performer for highly emotional scenes. Additionally, this approach to film music production could accommodate sudden changes during the scoring session, such as last-minute film edits, which mostly result in changes to the final score. The paper also reveals some of the processes of film music composition and shows how the use of technology is integral to understanding this new method of music production for film.
Z2-2 Development of a Multimedia Learning Module Covering the Field of Perceptual Audio Coding, Daniel Pape¹, Gerrit Kalkbrenner², Jan Maihorn³
¹ ZAS Berlin, Germany
² Universität Dortmund, Dortmund, Germany
³ Rotterdam Conservatory for Music and Dance, Rotterdam, The Netherlands
An electronic learning module covering the field of perceptual audio coding (best-known representative: MP3) was specified, designed, and implemented by means of the multimedia software Macromedia Director. The presented program is split into different modules. These include: (1) an auralization of the filterbank implemented in MP3; (2) simulations of various classic psychoacoustic experiments (mainly masking thresholds) for three different music styles; and (3) audio examples and explanations of typical error signals introduced by perceptual audio coding. Further audio examples offer a comparison of the sound quality of a Fraunhofer MP3 codec at different bit rates and a comparison of today's most important audio and speech codecs (such as Windows Media Encoder and Real9) at different bit rates. Finally, a structured explanation of the mode of operation of an MP3 encoder and technical papers with further references to publications on perceptual coding were included in the presented software.
Z2-3 Controlling the Quality of Audio Services in the Internet, Bernhard Feiten, Ingo Wolf, Andreas Graffunder, Media Solutions, Berlin, Germany
Future services in the Internet have to support heterogeneous networks and end devices, so audio and video services must allow flexible adaptation of the bit rate. MPEG-21 provides a multimedia framework that supports Digital Item Adaptation in various ways. Adaptation of the quality of a service is supported by the Bitstream Syntax Description Language (BSDL). Additionally, utilities exist to describe the relation between the scaling of the bitstream and the resulting perceived quality. Brightness, cleanness, and wideness are proposed as dimensions for assessing quality and for deriving parameters to control the audio transmission. A mapping of these features onto the model output values (MOVs) of the ITU assessment method PEAQ is proposed.
Z2-4 Design and Implementation of a Commodity Audio System, Men Muheim, Swiss Federal Institute of Technology, Zurich, Switzerland
This paper presents a Ph.D. thesis that envisions a distributed audio system based on commodity computer components. It examines to what extent the real-time behavior of mainstream operating systems leads to audio dropouts and therefore to quality loss. It studies extrapolation methods to prevent this loss of quality and shows that quality improvement to a nonannoying level is possible. A synchronization mechanism is implemented at the application layer in order to allow Ethernet to serve as the only communication network; the thesis thereby shows that a synchronization accuracy of 10 µs between separated loudspeakers is feasible. Furthermore, the thesis proposes a novel software framework that makes the development of distributed audio services easier.
Z2-5 Advanced 3-D Audio Algorithms by a Flexible, Low-Level Application Programming Interface, Aleksandar Simeonov, Giorgio Zoia, Robert Lluis Garcia, Daniel Mlynek, EPFL, Lausanne, Switzerland
The constantly increasing demand for better sound and video quality in multimedia content and virtual reality compels the implementation of ever more sophisticated 3-D audio models in authoring and playback tools. A careful and systematic analysis of the best available development libraries in this area was carried out, comparing different application programming interfaces in terms of their features, extensibility, and portability. The results show that it is often difficult to find a trade-off among flexibility, efficiency, quality, and speed. In this paper we propose a low-level, modular DSP library that can be used to implement advanced 3-D audio models; it is based on reconfigurable primitive methods required by most 3-D algorithms, and it provides fast development and good flexibility.
Z2-6 Real-Time Internet MPEG-4 SA Player and the Streaming Engine, Alvin Su, Yi-Song Shao, National Cheng-Kung University, Tainan, Taiwan
MPEG-4 Structured Audio is an algorithmic coding standard designed for low-bit-rate, high-quality audio. With this standard, the desired sound can be identical on the encoder and decoder sides because the Structured Audio Orchestra Language (SAOL) is used to generate the sound samples. A player and a streaming engine are required when real-time interactive Internet presentations are necessary. In this paper we present such a system implemented on and applied over IBM PC-based computers. The proposed streaming engine follows the ISMA specification, and its implementation is closely related to Apple's Darwin Server. After the streaming SA player receives the bitstream from the server, it converts the SAOL data stream to Java code and links it to a proposed scheduler program generated from the SASL data stream for direct execution, so that one can hear the sound in real time. Unlike sfront, no intermediate C code and no C compiler are necessary. To improve performance, optimized software modules such as the core opcodes and the core wavetable engine have been embedded. Significant speedup is achieved compared to the reference SAOLC decoder. A real-time demonstration of the system will be given during the presentation. A discussion of possible future algorithmic coding methods using Java is also included.
Z2-7 Application Scenarios of Wearable- and Mobile-Augmented Reality Audio, Tapio Lokki, Heli Nironen, Sampo Vesa, Lauri Savioja, Aki Härmä, Matti Karjalainen, Helsinki University of Technology, Espoo, Finland
Several applications for wearable and mobile augmented reality audio are presented. All applications exploit a headset in which microphones are integrated into small headphone elements. The proposed system allows us to implement applications in which virtual sound events are superimposed on the user's auditory environment to produce an augmented audio display. In addition, binaural audio-over-IP connections, wired or wireless, are discussed. Finally, some future application scenarios are sketched.
Audio Video Systems:
Z2-8 ITC Clean Audio Project, Ben Shirley, Paul Kendrick, University of Salford, Salford, UK
The Clean Audio project assesses the effect of a number of processes on the perception of Dolby Digital 5.1 audio for TV. Specifically, the research aims to assess their effect on the enjoyment and clarity of television sound for hard-of-hearing viewers. The preliminary study presented here used subjective listening tests to determine the level of the left and right front surround channels required to enhance enjoyment of the audio without detracting from the clarity of the dialog. The findings provide useful guidelines on the benefits and use of surround sound for hearing-impaired viewers.
Z2-9 Audiovisual Virtual Environments: Enabling Real-Time Rendering of Early Reflections by Scene Graph Simplification, Andreas Dantele, Ulrich Reiter, Mathias Schwark, Technical University of Ilmenau, Ilmenau, Germany
In an audiovisual virtual 3-D environment, agreement between the visual and the auditory impression is important to provide a high level of immersion. Because visual rendering is demanding, the processing power available for auralization (including early and late reverberation) is usually severely restricted. For the audio part, a trade-off between high accuracy and a fast rendering process has to be found, especially for real-time user interaction. We show how the rendering of early reflections can be done in real time by reducing the scene representation to its auditorily relevant elements. A suitable scene-simplification algorithm and the corresponding audio rendering issues are discussed.