AES Los Angeles 2014
Poster Session P5
P5 - Audio Signal Processing
Thursday, October 9, 3:00 pm — 4:30 pm (S-Foyer 1)
P5-1 An Evaluation of Chromagram Weightings for Automatic Chord Estimation—Zhengshan Shi, Stanford University - Stanford, CA, USA; Julius O. Smith, III, Stanford University - Stanford, CA, USA
Automatic Chord Estimation (ACE) is a central task in Music Information Retrieval. Generally, audio files are parsed into chroma-based features for further processing in order to estimate the chord being played. Much work has been done to improve the estimation algorithm by means of statistical models for chroma vector transitions, but not as much attention has been given to the loudness model during the feature extraction stage. In this paper we evaluate the effect on chord-recognition accuracy due to the use of various nonlinear transformations and loudness weightings applied to the power spectrum that is “folded” to form the chromagram in which chords are detected. Nonlinear spectral transformations included square-root magnitude, magnitude, magnitude-squared (power spectrum), and dB magnitude. Weightings included A-weighted dB and Gaussian-weighted magnitude.
Convention Paper 9119
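As a rough illustration of the feature-extraction stage described in P5-1, the sketch below folds a handful of spectral peaks into a 12-bin chroma vector after applying one of the four nonlinear transformations named in the abstract. It is not the authors' implementation: the function names, the 440 Hz tuning reference, the 80 dB floor used for the dB variant, and the sum-to-one normalization are all assumptions made for the example, and the A-weighted and Gaussian weightings are omitted.

```python
import math

def compress(magnitude, mode):
    """Apply one of the nonlinear spectral transformations to a magnitude."""
    if mode == "sqrt":
        return math.sqrt(magnitude)
    if mode == "magnitude":
        return magnitude
    if mode == "power":
        return magnitude ** 2
    if mode == "db":
        # Assumption: clamp at -80 dB and shift by +80 dB so weights stay >= 0.
        return max(0.0, 20.0 * math.log10(max(magnitude, 1e-4)) + 80.0)
    raise ValueError("unknown mode: " + mode)

def pitch_class(freq_hz, ref_a4=440.0):
    """Map a frequency to a pitch class 0..11 (0 = C), assuming equal temperament."""
    midi = 69 + 12 * math.log2(freq_hz / ref_a4)
    return int(round(midi)) % 12

def fold_to_chroma(freqs_hz, magnitudes, mode="magnitude"):
    """Sum transformed magnitudes into 12 pitch-class bins, then normalize."""
    chroma = [0.0] * 12
    for f, m in zip(freqs_hz, magnitudes):
        if f <= 0.0:
            continue  # skip DC and invalid bins
        chroma[pitch_class(f)] += compress(m, mode)
    total = sum(chroma)
    return [c / total for c in chroma] if total > 0.0 else chroma

if __name__ == "__main__":
    # Hypothetical peaks: a strong C4 with weaker E4 and G4.
    freqs = [261.63, 329.63, 392.00]
    mags = [1.0, 0.5, 0.25]
    for mode in ("sqrt", "magnitude", "power", "db"):
        print(mode, [round(c, 3) for c in fold_to_chroma(freqs, mags, mode)])
```

With these hypothetical peaks, the power-spectrum variant concentrates more of the chroma energy on C than the square-root variant, which is the kind of difference in emphasis the paper sets out to evaluate.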
P5-2 CUDA Accelerated Audio Digital Signal Processing for Real-Time Algorithms—Nicholas Jillings, Birmingham City University - Birmingham, UK; Yonghao Wang, Birmingham City University - Birmingham, UK
This paper investigates the use of idle graphics processors to accelerate audio DSP for real-time algorithms. Several common algorithms were identified for acceleration and executed in multiple thread and block configurations to determine the preferred configuration for each algorithm. GPU and CPU performance is compared on the same data sizes and algorithms. From these results the paper discusses the importance of optimizing code for GPU operation, including allocating shared resources, optimizing memory transfers, and forcing serialization of feedback loops. It also introduces a new method for audio processing that uses GPUs as the default processor instead of as an accelerator.
Convention Paper 9120
P5-3 A Modified Variable Step Size NLMS Algorithm for Acoustic Echo Cancellation Application—Youhong Lu, Microsoft - Redmond, WA, USA; Syavosh Zadissa, Microsoft - Redmond, WA, USA
The Variable Step Size (VSS) Normalized Least Mean-Square (NLMS) algorithm has been studied in depth. Numerous publications have covered this technique from both theoretical and practical points of view. This contribution builds on that past knowledge and proposes an improvement: a single-filter approach without any double-talk detection. We will show that the proposed technique, in the context of acoustic echo cancellation, offers superior performance in terms of convergence speed and misalignment error, as well as superior resilience to low Echo-to-Interference Ratio and to Echo Path Change.
Convention Paper 9121
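For readers unfamiliar with the baseline that P5-3 builds on, the following is a minimal sketch of a conventional fixed-step NLMS echo canceller, together with the misalignment measure mentioned in the abstract. The paper's variable step-size rule is not given in the abstract and is not reproduced here; the step size `mu`, the regularizer `eps`, the four-tap echo path, and all function names are assumptions chosen for illustration.

```python
import math
import random

def make_far_end(n, seed=0):
    """Hypothetical far-end signal: uniform white noise in [-1, 1]."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

def simulate_echo(far_end, echo_path):
    """Microphone signal: far-end signal convolved with the echo path."""
    mic = []
    for n in range(len(far_end)):
        acc = 0.0
        for k, h in enumerate(echo_path):
            if n - k >= 0:
                acc += h * far_end[n - k]
        mic.append(acc)
    return mic

def nlms(far_end, mic, num_taps, mu=0.5, eps=1e-8):
    """Fixed-step NLMS: returns the final filter weights and the error signal."""
    w = [0.0] * num_taps
    x_buf = [0.0] * num_taps  # most recent far-end sample first
    errors = []
    for x, d in zip(far_end, mic):
        x_buf = [x] + x_buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x_buf))   # echo estimate
        e = d - y                                      # residual echo
        norm = sum(xi * xi for xi in x_buf) + eps      # input power
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x_buf)]
        errors.append(e)
    return w, errors

def misalignment_db(w, echo_path):
    """Normalized squared distance between estimated and true echo path, in dB."""
    num = sum((wi - hi) ** 2 for wi, hi in zip(w, echo_path))
    den = sum(hi ** 2 for hi in echo_path)
    return 10.0 * math.log10(max(num / den, 1e-30))

if __name__ == "__main__":
    echo_path = [0.5, -0.3, 0.1, 0.0]  # assumed toy echo path
    far_end = make_far_end(2000)
    mic = simulate_echo(far_end, echo_path)
    w, _ = nlms(far_end, mic, num_taps=len(echo_path))
    print("misalignment (dB):", misalignment_db(w, echo_path))
```

This toy setup has no near-end speech or noise, so the fixed step size converges cleanly; the trade-off between fast convergence and robustness to interference is the motivation for variable step-size schemes.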
P5-4 Robust Artificial Bandwidth Extension Technique Using Enhanced Parameter Estimation—Jonggeun Jeon, Hanyang University - Korea; Yaxing Li, Hanyang University - Korea; Sangwon Kang, Hanyang University - Korea; Kihyun Choo, Samsung Electronics Co., Ltd. - Suwon, Korea; Eunmi Oh, Samsung Electronics Co., Ltd. - Suwon, Korea; Hosang Sung, Samsung Electronics - Korea
We propose a robust artificial bandwidth extension (ABE) technique to improve narrowband speech signal quality using enhanced excitation and spectral envelope estimation. For excitation estimation, we use a whitened narrowband excitation signal, generated by passing the excitation signal through a whitening filter. An adaptive spectral double shifting method is introduced to obtain an enhanced wideband excitation signal. For envelope estimation, we propose an enhanced combined method using the codebook and linear mapping. The proposed ABE system is applied to the decoded output of an adaptive multi-rate (AMR) codec at 12.2 kbps. We evaluate its performance using spectral distortion, wideband perceptual evaluation of speech quality, and a formal listening test. The objective and subjective evaluations confirm that the proposed ABE system provides better speech quality than AMR at the same bit rate.
Convention Paper 9122
P5-5 Audio Signal Recovery from Single-Frame Randomly Gated Fourier Magnitudes—Dominic Fannjiang, Davis Senior High School - Davis, CA, USA; Albert Fannjiang, University of California, Davis - Davis, CA, USA
Few-frame phase retrieval, motivated by the demands of real-time audio signal processing, is severely ill-posed and fundamentally challenging. This paper is an exploratory study of single-frame phase retrieval of audio signals with two additional ingredients: a random gating function and a symmetry-breaking DC component. In general, randomly phased gating and a suitably chosen DC component can bring the success rate of single-frame phase retrieval to unity and yield accurate, stable, and rapidly convergent numerical reconstruction. The tradeoff between the diversity of the gate and the magnitude of the DC component is investigated.
Convention Paper 9123
P5-6 Triode Emulator: Part 2—Dimitri Danyuk, Consultant - Palmetto Bay, FL, USA
The paper describes a method for accurately emulating triode behavior at high input levels. Under gross overload the grid current becomes a main source of distortion. Measurements of grid current for the popular 12AX7 triode are presented. In the region of interest, the dependence of grid current on the input signal is emulated with a simple circuit. The output harmonic weighting of the emulator is examined and compared with an existing solution. The results can be applied to solid-state guitar amplifiers.
Convention Paper 9124
P5-7 A SIMULINK Toolbox of Sigma-Delta Modulators for High Resolution Audio Conversions—Isacco Arnaldi, Birmingham City University - Birmingham, UK; Yonghao Wang, Birmingham City University - Birmingham, UK
Sigma-delta modulation is the only form of analog-to-digital conversion that allows achievement of high bit resolution at relatively low cost. There is a lack of tools in academia and industry that allow entry-level engineers to familiarize themselves with the concepts governing this conversion technique, especially for high-order, multi-bit, and different architectures of sigma-delta modulators. The goal of this paper is to present a graphical toolbox, developed in Simulink and based on the behavioral model previously presented in  and available in , that allows simulation and theoretical evaluation of ten different architectures, continuous- and discrete-time, as well as single- and multi-bit implementations of different orders of analog-to-digital sigma-delta modulators.
Convention Paper 9125
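To make the conversion principle behind P5-7 concrete, here is a minimal behavioral sketch of the simplest case: a first-order, single-bit, discrete-time sigma-delta modulator. It is written in plain Python rather than Simulink and is not taken from the toolbox; the function names and the test input are assumptions, and none of the higher-order, multi-bit, or continuous-time architectures covered by the paper are modeled.

```python
def first_order_sigma_delta(samples):
    """First-order single-bit modulator: integrate the error, quantize to +/-1.

    Input samples are assumed to lie strictly inside (-1, 1).
    """
    integrator = 0.0
    feedback = 0.0  # previous quantizer output, fed back through a 1-bit DAC
    bits = []
    for x in samples:
        integrator += x - feedback          # accumulate input minus feedback
        out = 1.0 if integrator >= 0.0 else -1.0  # 1-bit quantizer
        bits.append(out)
        feedback = out
    return bits

def average(values):
    """Crude decimation stand-in: the mean of the bitstream."""
    return sum(values) / len(values)

if __name__ == "__main__":
    dc_input = [0.25] * 4000  # hypothetical constant input
    bits = first_order_sigma_delta(dc_input)
    print("bitstream mean:", average(bits))
```

Because the integrator state stays bounded, the mean of the one-bit stream tracks the input level more closely as more samples are averaged, which is the oversampling trade-off that lets a coarse quantizer deliver high resolution.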