AES New York 2007
P21 - Signal Processing, Part 1
Poster Session P21
Monday, October 8, 10:00 am — 11:30 am
P21-1 Dynamic Bit-Rate Adaptation for Speech and Audio—Nicolle H. van Schijndel, Philips Research - Eindhoven, The Netherlands; Laetitia Gros, France Telecom R&D - Lannion, France; Steven van de Par, Philips Research - Eindhoven, The Netherlands
Many audio and speech transmission applications have to deal with highly time-varying channel capacities, making dynamic adaptation to bit rate an important issue. This paper investigates such adaptation using a coder that is driven by rate-distortion optimization mechanisms, always coding the full signal bandwidth. For perceptual evaluation, the continuous quality evaluation methodology was used, which has specifically been designed for dynamic quality testing. Results show latency and smoothing effects in the judged audio quality, but no quality penalty for the switching between quality levels; the overall quality using adaptation is comparable to using the average available bit rate. Thus, dynamic bit-rate adaptation has a clear benefit as compared to always using the lowest guaranteed available rate.
Convention Paper 7288
P21-2 A 216 kHz 124 dB Single Die Stereo Delta Sigma Audio Analog-to-Digital Converter—YuQing Yang, Terry Scully, Jacob Abraham, Texas Instruments, Inc. - Austin, TX, USA
A 216 kHz single-die stereo delta sigma ADC is designed for high-precision audio applications. A single-loop, fifth-order, thirty-three-level delta sigma analog modulator with positive and negative feedforward paths is implemented. An interpolated multilevel quantizer with unevenly weighted quantization levels replaces a conventional 5-bit flash-type quantizer in this design. These new techniques suppress the signal-dependent energy inside the delta sigma loop and reduce internal channel noise coupling. Integrated with an on-chip bandgap reference circuit, a DEM (dynamic element matching) circuit, and a linear-phase FIR decimation filter, the ADC achieves 124 dB dynamic range (A-weighted) and –110 dB THD+N over a 20 kHz bandwidth. Inter-channel isolation is 130 dB. Power consumption is approximately 330 mW.
Convention Paper 7289
P21-3 Encoding Bandpass Signals Using Level Crossings: A Model-Based Approach—Ramdas Kumaresan, Nitesh Panchal, University of Rhode Island - Kingston, RI, USA
A new approach to representing a time-limited and essentially bandlimited signal x(t) by a set of discrete frequency/time values is proposed. The set of discrete frequencies is the set of frequency locations at which (the real and imaginary parts of) the Fourier transform of x(t) cross certain levels, and the set of discrete time values corresponds to the traditional level crossings of x(t). The proposed representation is based on a simple bandpass signal model, called a Sum-of-Sincs (SOS) model, that exploits our knowledge of the bandwidth/timewidth of x(t). Given the discrete frequency/time locations, we can reconstruct x(t) by solving a least-squares problem. Using this approach, we propose an analysis/synthesis algorithm to decompose and represent composite signals like speech.
Convention Paper 7290
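The least-squares reconstruction step described in the P21-3 abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a simplified lowpass model x(t) = sum_n c[n] * sinc(2*B*t - n) in place of the paper's bandpass SOS model, it uses time-domain level crossings only, and the function names (`sos_matrix`, `level_crossings`, `fit_sos`) are hypothetical.

```python
import numpy as np

def sos_matrix(times, bandwidth_hz, n_terms):
    """Design matrix of the assumed model x(t) = sum_n c[n] * sinc(2*B*t - n)."""
    times = np.asarray(times, dtype=float)
    n = np.arange(n_terms)
    # np.sinc is the normalized sinc, sin(pi*u) / (pi*u).
    return np.sinc(2.0 * bandwidth_hz * times[:, None] - n[None, :])

def level_crossings(t, x, levels):
    """Locate where the sampled curve x(t) crosses each level.

    Crossing instants are estimated by linear interpolation between the two
    neighbouring grid points that bracket a sign change of x - level.
    Returns the crossing times and the level crossed at each time.
    """
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    cross_t, cross_l = [], []
    for level in levels:
        d = x - level
        idx = np.nonzero(d[:-1] * d[1:] < 0)[0]  # sign change between i and i+1
        frac = d[idx] / (d[idx] - d[idx + 1])
        cross_t.extend(t[idx] + frac * (t[idx + 1] - t[idx]))
        cross_l.extend([level] * len(idx))
    return np.array(cross_t), np.array(cross_l)

def fit_sos(times, values, bandwidth_hz, n_terms):
    """Least-squares estimate of the model coefficients from (time, value) pairs."""
    A = sos_matrix(times, bandwidth_hz, n_terms)
    coeffs, *_ = np.linalg.lstsq(A, values, rcond=None)
    return coeffs
```

In use, the times and levels returned by `level_crossings` would be passed to `fit_sos`, and the signal re-synthesized as `sos_matrix(t, B, N) @ coeffs`. The fit is only meaningful when there are at least as many crossings as model terms, so several levels are normally needed.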
P21-4 Theory of Short-Time Generalized Harmonic Analysis (SGHA) and its Fundamental Characteristics—Teruo Muraoka, University of Tokyo - Meguro-ku, Tokyo, Japan; Takahiro Miura, University of Tokyo - Bunkyo-ku, Tokyo, Japan; Daisuke Ochiai, Tohru Ifukube, University of Tokyo - Meguro-ku, Tokyo, Japan
Digital signal processing became practical through rapid progress in processing hardware, driven by IC technology, and in processing algorithms such as the FFT and digital filtering. In short, these techniques modify digitized signals and fall into two classes: (1) digital filtering [parametric processing] and (2) analysis-synthesis [non-parametric processing]. Both share a weakness: they cannot detect and remove locally existing frequency components without side effects. This difficulty can be overcome by applying inharmonic frequency analysis. Its fundamental principle was proven by N. Wiener in his 1930 publication "Generalized Harmonic Analysis (GHA)." Its application to practical signal processing was achieved by Dr. Y. Hirata in 1994. Because that method corresponds to short-time, sequential processing of GHA, we call it Short-Time Generalized Harmonic Analysis (SGHA). The authors have been researching its fundamental characteristics and its application to noise reduction, and have reported the results at previous AES conventions. This paper explains SGHA's fundamental theory together with its characteristics.
Convention Paper 7291
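The idea of inharmonic frequency analysis mentioned in the P21-4 abstract can be sketched for a single frame as follows. This is an illustrative greedy formulation, not the authors' implementation: it assumes that each component is found by searching a caller-supplied grid of candidate frequencies (not tied to the FFT bin spacing) for the sinusoid whose least-squares fit leaves the smallest residual energy, and then subtracting that sinusoid. The names `fit_sinusoid` and `gha_decompose` are hypothetical.

```python
import numpy as np

def fit_sinusoid(residual, t, freq_hz):
    """Least-squares fit of a*cos(2*pi*f*t) + b*sin(2*pi*f*t) to the residual."""
    phase = 2.0 * np.pi * freq_hz * t
    basis = np.column_stack([np.cos(phase), np.sin(phase)])
    amps, *_ = np.linalg.lstsq(basis, residual, rcond=None)
    return amps, basis @ amps

def gha_decompose(x, sample_rate, candidate_freqs, n_components):
    """Greedily extract sinusoids at arbitrary (inharmonic) frequencies.

    Returns a list of (freq_hz, cos_amplitude, sin_amplitude) tuples and the
    residual left after all extracted components have been subtracted.
    """
    residual = np.asarray(x, dtype=float).copy()
    t = np.arange(len(residual)) / sample_rate
    components = []
    for _ in range(n_components):
        best = None
        for f in candidate_freqs:
            amps, fitted = fit_sinusoid(residual, t, f)
            energy = np.sum((residual - fitted) ** 2)
            if best is None or energy < best[0]:
                best = (energy, f, amps, fitted)
        _, f, amps, fitted = best
        components.append((float(f), float(amps[0]), float(amps[1])))
        residual -= fitted  # remove the component before the next search
    return components, residual
```

Because the candidate frequencies are free parameters, a component lying between FFT bins can be removed without the leakage that a fixed harmonic grid would produce. A short-time variant would apply the same procedure frame by frame.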
P21-5 Quality Improvement Using a Sinusoidal Model in HE-AAC—Jung Geun Kim, Dong-Il Hyun, Dae Hee Youn, Yonsei University - Seoul, Korea; Young Cheol Park, Yonsei University - Wonju-City, Korea
This paper identifies a distortion in HE-AAC: when a tone is restored, a noise floor that does not exist in the original input signal is generated. To address this, the paper proposes adding a sinusoidal model to the HE-AAC encoder so that only the original tonal components are restored in decoding. The sinusoidal model is used to analyze a tone and to move it to a place where the noise floor is reduced. The lower the bit rate, the lower the frequency at which restoration by SBR (Spectral Band Replication) starts, and at lower frequencies the distortion caused by noise inflow is more easily perceived. The improvement from the proposed method is therefore greater at lower bit rates, and it has the benefit that no additional information or operation is needed in the decoding process.
Convention Paper 7292
P21-6 Special Hearing Aid for Stuttering People—Piotr Odya, Gdansk University of Technology - Gdansk, Poland; Andrzej Czyzewski, Gdansk University of Technology - Gdansk, Poland, and Excellence Center, PROKSIM, Warsaw, Poland
Owing to recent progress in digital signal processor development, it has become possible to build a subminiature device combining a speech aid and a hearing aid. Despite its small dimensions, the device can execute quite complex algorithms and can be easily reprogrammed. The paper emphasizes issues related to the design and implementation of algorithms applicable to both speech aids and hearing aids. Frequency shifting and delaying the audio signal are often used to improve speech fluency. The basic frequency-altering algorithm is similar to the sound compression algorithm used in some special hearing aids. Therefore, the experimental device presented in this paper provides a universal hearing and speech aid that may be used by hearing-impaired or speech-impaired persons, or by persons with both impairments.
Convention Paper 7293
P21-7 An Improved Low Complexity AMR-WB+ Encoder Using Neural Networks for Mode Selection—Jérémie Lecomte, Roch Lefebvre, Guy Richard, Université de Sherbrooke - Sherbrooke, Quebec, Canada
This paper presents an alternative mode selector based on neural networks to improve the low-complexity AMR-WB+ standard audio coder, especially at low bit rates. The AMR-WB+ audio coder is a multimode coder using both time-domain and frequency-domain modes. In low-complexity operation, the standard encoder determines the coding mode on a frame-by-frame basis, essentially by applying thresholds to parameters extracted from the input signal and using logic that favors time-domain modes. The mode selector proposed in this paper reduces this bias and achieves a mode decision that is closer to that of the full-complexity encoder. This results in measurable quality improvements in both objective and subjective assessments.
Convention Paper 7294
Last Updated: 20070820, mei