Last Updated: 20060823, mei
P18 - Computers & Mobile Audio
Saturday, October 7, 2:30 pm — 6:00 pm
Chair: Jerry Bauck, Cooper Bauck Corporation - Tempe, AZ, USA
P18-1 A Hybrid Speech Codec Employing Parametric and Perceptual Coding Techniques—Maciej Kulesza, Grzegorz Szwoch, Andrzej Czyzewski, Gdansk University of Technology - Gdansk, Poland
A hybrid speech codec for VoIP telephony applications is presented that combines parametric and perceptual coding techniques. The signal is divided into voiced components, which are encoded using the perceptual algorithm; unvoiced components, which are encoded parametrically; and transients, which are not encoded with a lossy method. A codec architecture in which the voiced part of the CELP residual signal is perceptually encoded and transmitted to the decoder along with the main CELP bit stream is also examined. Various methods for transient detection in the speech signal are discussed. The results of experiments revealing the improved subjective quality of the transmitted speech are also presented.
Convention Paper 6956
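The three-way split described in the P18-1 abstract above (voiced, unvoiced, transient) can be illustrated with a minimal sketch. The abstract does not say which detectors the authors use, so the energy-jump and zero-crossing tests, the thresholds, and the names `classify_frame` and `ROUTE` are all hypothetical stand-ins chosen only to show the routing idea.

```python
# Hypothetical routing table: which coder handles each frame class,
# following the split described in the abstract.
ROUTE = {
    "voiced": "perceptual coder",
    "unvoiced": "parametric coder",
    "transient": "lossless path",
}

def classify_frame(frame, prev_energy, zcr_threshold=0.3, jump_ratio=8.0):
    """Label one frame of samples and return (label, mean_energy).

    Assumed heuristics (not taken from the paper):
    - a sudden energy rise relative to the previous frame marks a transient;
    - otherwise a high zero-crossing rate marks an unvoiced frame;
    - everything else is treated as voiced.
    """
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    zcr = crossings / (len(frame) - 1)

    if prev_energy > 0 and energy > jump_ratio * prev_energy:
        label = "transient"
    elif zcr > zcr_threshold:
        label = "unvoiced"
    else:
        label = "voiced"
    return label, energy
```

A caller would feed frames in order, passing each returned energy as `prev_energy` for the next frame, and look up `ROUTE[label]` to pick the coding path.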
P18-2 EuCon: An Object-Oriented Protocol for Connecting Control Surfaces to Software Applications—Steve Milne, Euphonix, Inc. - Palo Alto, CA, USA; Phil Campbell, Hobbyhorse Music LLC - Palo Alto, CA, USA; Scott Freshour, Rob Boyer, Jim McTigue, Martin Kloiber, Euphonix, Inc. - Palo Alto, CA, USA
This paper describes a control-surface-to-application protocol that addresses the problem of raising user interface efficiency in increasingly complex software applications. Compared with existing MIDI-based protocols, this protocol was designed to have enough bandwidth, high control resolution, and a wide variety of controls to provide software application users with the rich and efficient experience offered by modern large format mixing consoles. Recognizing that today’s audio engineer uses many different applications, the protocol can simultaneously control multiple applications running on one or more computers from a single control surface. To give users the widest possible choice of applications, object-oriented design was used to promote ease of adoption by software developers.
Convention Paper 6957
P18-3 Considerations on Audio for Flash: Getting to the Vector Soundstage—Charles Van Winkle, Adobe Systems Incorporated - Seattle, WA, USA
The Flash Platform has been known for animations and interactivity for some time now, and research shows that Flash Player is one of the world’s more pervasive software platforms. Although providing audio-rich video or interactive content through Flash is not new, preparing audio assets for Flash is new to many audio professionals. Audio for Flash poses noteworthy changes to audio professionals’ workflows when compared to more customary media for video or interactive content, e.g., DVDs or video games. This paper gives an overview of the Flash Platform and takes a first look at the considerations audio professionals must make when preparing audio assets for Flash, suggesting modified practices where necessary.
Convention Paper 6958
P18-4 5.1 Surround and 3-D (Full Sphere with Height) Reproduction for Interactive Gaming and Training Simulation—Robert (Robin) Miller III, FilmakerTechnology - Bethlehem, PA, USA
Immersive sound for gaming and simulation, perhaps more than for music and movies, requires preserving directionality of direct sounds, both fixed and moving, and acoustical reflections dynamically affecting those sounds, to effect the spatiality being presented. Conventionally (as with popular music), sources are panned close-microphone signals or synthesized sounds; the presentation pretends “They are here,” where spatiality is largely that of the listening environment. Convolution with room impulse responses can contribute diffuse ambience but not “real” spatiality and tone color. These issues pertain not only to 5.1 where reproduction is a 2-D horizontal circle of loudspeakers, but to advanced 3-D interactive reproduction, where the listener perceives the experience at the center of the sphere of natural hearing. Production techniques are introduced that satisfy both 3-D and compatible 5.1. Independent measurement confirms that the system preserves directionality and reproduces life-like spatiality and tone color continuously in the 3-D perception sphere.
Convention Paper 6959
P18-5 Automatic Volume and Equalization Control in Mobile Devices—Alexander Goldin, Alango Ltd. - Haifa, Israel; Alexey Budkin, Alango Ltd. - St. Petersburg, Russia
Noise spectrum and level change dynamically in mobile environments. A loudspeaker volume that is comfortable in quiet conditions becomes too low when the ambient noise level increases significantly. A loudspeaker volume adjusted for good intelligibility in high ambient noise becomes annoyingly loud in quiet. Automatic Volume Control may compensate for different levels of ambient noise by increasing or decreasing the loudspeaker gain accordingly. However, if the noise and sound spectra are very different, such simple gain adjustment may not work well. More advanced technology dynamically equalizes the reproduced sound so that it exceeds the noise level by a specified ratio across the whole frequency range. This paper describes principles and practical aspects of Automatic Volume and Equalization Control in mobile audio and communication devices.
Convention Paper 6960
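The contrast drawn in the P18-5 abstract above, between a single gain and per-band equalization against noise, can be sketched as follows. This is only an illustration under stated assumptions: band levels are given in dB, and the function names, the 10 dB target ratio, and the 12 dB boost limit are hypothetical and not taken from the paper.

```python
import math

def band_gains_db(signal_db, noise_db, target_snr_db=10.0, max_boost_db=12.0):
    """Per-band boost (dB) so each band exceeds the noise by target_snr_db.

    Bands that already meet the target get 0 dB; boosts are capped at
    max_boost_db (an assumed safety limit).
    """
    gains = []
    for s, n in zip(signal_db, noise_db):
        deficit = (n + target_snr_db) - s
        gains.append(min(max(deficit, 0.0), max_boost_db))
    return gains

def broadband_gain_db(signal_db, noise_db, target_snr_db=10.0, max_boost_db=12.0):
    """Single gain (dB) computed from total signal and noise power."""
    def total_db(levels):
        return 10.0 * math.log10(sum(10.0 ** (lvl / 10.0) for lvl in levels))
    deficit = total_db(noise_db) + target_snr_db - total_db(signal_db)
    return min(max(deficit, 0.0), max_boost_db)
```

With a flat 60 dB signal and noise of 40, 55, and 70 dB in three bands, `band_gains_db` leaves the first band alone and boosts only the noisy bands, whereas `broadband_gain_db` raises every band by the same amount.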
P18-6 Speech Source Enhancement Using Modified ADRess Algorithm for Applications In Mobile Communications—Niall Cahill, Rory Cooney, Kenneth Humphreys, Robert Lawlor, National University of Ireland - Maynooth, Ireland
An approach to refine and adapt an existing music sound source separation algorithm to speech enhancement is presented. The existing algorithm has the capability to extract music sources from stereo recordings using the position of the sources in the stereo field. Described in this paper is the ability of a Modified Azimuth Discrimination and Resynthesis algorithm (M-ADRess) to enhance speech in the presence of noise using a two-microphone array. Also proposed is a novel extension to the algorithm, which enables further noise removal from speech based on elevation angle of arrival. Objective measures of processed speech show the suitability of M-ADRess for cleaning noisy speech mixtures in an anechoic environment.
Convention Paper 6961
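The P18-6 abstract above relies on locating a source by its position in the stereo field. A simplified sketch of the gain-scaling null search commonly described for the original ADRess algorithm is given below; it works on one complex frequency bin, assumes the source is no louder in the left channel than in the right, and does not attempt the M-ADRess modifications or the elevation extension of the paper. The function name and the `resolution` parameter are hypothetical.

```python
def azimuth_null(left_bin, right_bin, resolution=100):
    """Search gains g = i / resolution in [0, 1] for one frequency bin.

    For each g, the right channel is scaled and subtracted from the left.
    The g giving the smallest residual magnitude indicates the pan position
    of the dominant source in this bin. Returns (index_of_null, null_depth),
    where null_depth = max residual - min residual.
    """
    best_i = 0
    min_val = float("inf")
    max_val = 0.0
    for i in range(resolution + 1):
        g = i / resolution
        val = abs(left_bin - g * right_bin)
        if val < min_val:
            min_val, best_i = val, i
        if val > max_val:
            max_val = val
    return best_i, max_val - min_val
```

For a bin where the left channel is 0.4 times the right, the null falls at g = 0.4; bins whose nulls cluster around the target speaker's position would be kept and the rest attenuated.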
P18-7 Frame Loss Concealment for Audio Decoders Employing Spectral Band Replication—Sang-Uk Ryu, Kenneth Rose, University of California at Santa Barbara - Santa Barbara, CA, USA
This paper presents an advanced frame loss concealment technique for audio decoders employing spectral band replication (SBR). The high frequency signal of the lost frame is reconstructed by estimating the parametric information involved in the SBR process. Utilizing all SBR data from the previous and next frames, the high-band envelope is adaptively estimated from the energy evolution in the surrounding frames. The tonality control parameters are determined in a way that ensures smooth inter-frame transition. Subjective quality evaluation demonstrates that the proposed technique implemented within the aacPlus SBR decoder offers better audio quality after concealment than that achieved by the technique adopted in the standard aacPlus decoder.
Convention Paper 6962
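The P18-7 abstract above estimates the lost frame's high-band envelope from the frames on either side. The paper's adaptive estimator is not specified in the abstract, so the sketch below substitutes a plain log-domain interpolation of per-band energies as a stand-in. The function name, the `weight` parameter, and the energy floor are assumptions made for illustration.

```python
import math

def conceal_envelope(prev_env, next_env, weight=0.5, floor=1e-12):
    """Estimate per-band envelope energies for a lost frame.

    prev_env / next_env: per-band energies (linear scale) of the frames
    before and after the loss. Interpolating in the log domain keeps the
    estimate between the two neighbors on a dB scale.
    weight=0 reproduces the previous frame, weight=1 the next frame.
    """
    estimate = []
    for p, n in zip(prev_env, next_env):
        log_p = math.log(max(p, floor))  # floor avoids log(0)
        log_n = math.log(max(n, floor))
        estimate.append(math.exp((1.0 - weight) * log_p + weight * log_n))
    return estimate
```

An adaptive scheme could vary `weight` per band according to how quickly the energy is changing, but that choice is left open here.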