Authors:Sharma, Garima; Umapathy, Karthikeyan; Krishnan, Sridhar
Affiliation:Department of Electrical and Computer Engineering, Ryerson University, Toronto, Canada; Department of Electrical and Computer Engineering, Ryerson University, Toronto, Canada; Department of Electrical and Computer Engineering, Ryerson University, Toronto, Canada
Audio signals are classified into speech, music, and environmental sounds. As audio features have evolved, a substantial body of work has accumulated in speech and music processing. Environmental sounds, on the other hand, have been studied far less, and the major reason is the lack of coherent information in an environmental sound compared with a speech signal or a musical sound. The definition of audio textures is imprecise and insufficient, so audio textures tend to be described by comparison to a known sound source (e.g., "it sounds like a motor" or "like a fan"). Audio textures can be either natural or artificial. Natural audio textures, such as heavy rain, fire, and a flowing stream, are very common. Artificial audio textures include sounds such as applause, a motor running, someone walking on gravel, babble, and many more. Although these audio textures have been used in virtual reality, music, screen saver sounds, and more, a considerable amount of possible work remains untouched. The aim of this study is to summarize the literature on audio textures, textural features, and their applications. In this survey, texture synthesis and features are explained in detail.
Authors:Kim, Taeho; Pöntynen, Henri; Pulkki, Ville
Affiliation:Aalto University, Espoo, Finland; Aalto University, Espoo, Finland; Aalto University, Espoo, Finland
This paper introduces difference-spectrum filters that can be used to control the perceived vertical direction of a sound source presented from ear-level loudspeakers. The difference-spectrum filter was designed to mimic the macroscopic changes in the spectral envelope of head-related transfer functions (HRTFs) between a target elevation angle and the ear-level elevation (0°), where the HRTF envelopes were obtained from averaging an extensive collection of individual HRTFs in a database. Localization tests were conducted to evaluate the effectiveness of difference-spectrum filters on elevation perception, which showed a promising result in the two-channel stereophonic condition for the virtual sound source. The perceived elevation correlated well with the target elevation angle of difference-spectrum filters in the stereophonic condition, although a weak correlation was observed in the monophonic condition. Thus, the test results show that difference-spectrum filters can create robust illusory elevation perception and enable vertical direction control over a wide range of elevation angles in stereophonic loudspeaker reproduction.
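As a rough illustration of the idea in this abstract, a difference spectrum can be viewed as the subject-averaged envelope at the target elevation minus the subject-averaged envelope at ear level. The sketch below assumes the magnitude responses are already available in dB on a shared frequency grid and that averaging is done in dB; the function names and the toy numbers are hypothetical and are not taken from the paper.

```python
def mean_envelope_db(responses_db):
    """Average per-subject magnitude responses (in dB) bin by bin."""
    n_subjects = len(responses_db)
    return [sum(column) / n_subjects for column in zip(*responses_db)]


def difference_spectrum_db(target_db, ear_level_db):
    """Target-elevation envelope minus ear-level envelope, in dB per bin."""
    target = mean_envelope_db(target_db)
    ear_level = mean_envelope_db(ear_level_db)
    return [t - e for t, e in zip(target, ear_level)]


# Toy data (assumed): two subjects, two frequency bins each.
target = [[0.0, 6.0], [2.0, 4.0]]
ear_level = [[1.0, 1.0], [1.0, 3.0]]
diff_db = difference_spectrum_db(target, ear_level)
```

A filter whose magnitude response follows `diff_db` would then be applied to the ear-level loudspeaker signal; how the paper smooths the envelopes and designs the filter is not stated in the abstract.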
Authors:Bai, Mingsian R.; Kung, Fan-Jie
Affiliation:Department of Electrical Engineering, National Tsing Hua University, Hsinchu 300044, Taiwan; Department of Electrical Engineering, National Tsing Hua University, Hsinchu 300044, Taiwan
This paper describes a speech enhancement system based on a multichannel Wiener filter (MWF) embedded in a generalized sidelobe canceller (GSC). Instead of a fixed beamformer, the minimum variance distortionless response (MVDR) beamformer is used in the signal extracting path. Noise and reverberation covariance matrices are estimated for the MVDR beamformer. The noise power spectral density (PSD) is estimated in light of the spherically isotropic noise model and a blocking-based least-squares method. The reverberation covariance matrix is established using a variance normalization delay linear prediction (NDLP) algorithm. To further reduce residual noise at the GSC output, a postfilter is cascaded. We compare the proposed enhancement system with five baseline methods: the integrated sidelobe cancellation and linear prediction Kalman filter (ISCLP), the two-stage beamforming approach (TSBA), the blocking-based multichannel Wiener filter (BMWF), the GSC beamformer with a delay-and-sum (DS) beamformer as its fixed beamformer and a Wiener postfilter (DS-GSC-PF), and the GSC beamformer with a superdirective beamformer as its fixed beamformer and a Wiener postfilter (SB-GSC-PF). The experimental results demonstrate that the proposed approach outperforms the ISCLP, TSBA, BMWF, DS-GSC-PF, and SB-GSC-PF methods in terms of objective quality evaluations.
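For readers unfamiliar with the MVDR beamformer mentioned in this abstract, the standard textbook weights are w = R⁻¹d / (dᴴR⁻¹d), where R is the noise (plus reverberation) covariance matrix and d is the steering vector. The sketch below works this out for a single frequency bin and two microphones only; the covariance and steering values are made-up illustrations, and nothing here reproduces the paper's covariance estimators or its GSC structure.

```python
def mvdr_weights_2ch(R, d):
    """Textbook MVDR weights w = R^-1 d / (d^H R^-1 d) for two channels."""
    (a, b), (c, e) = R
    det = a * e - b * c
    # Closed-form inverse of a 2x2 matrix.
    r_inv = [[e / det, -b / det], [-c / det, a / det]]
    u = [r_inv[0][0] * d[0] + r_inv[0][1] * d[1],
         r_inv[1][0] * d[0] + r_inv[1][1] * d[1]]
    denom = d[0].conjugate() * u[0] + d[1].conjugate() * u[1]
    return [u[0] / denom, u[1] / denom]


# Hypothetical Hermitian covariance matrix and steering vector.
R = [[2.0 + 0j, 0.5 + 0.5j],
     [0.5 - 0.5j, 1.0 + 0j]]
d = [1.0 + 0j, 0.0 + 1.0j]

w = mvdr_weights_2ch(R, d)
# Distortionless constraint: the response toward d should equal 1.
response = w[0].conjugate() * d[0] + w[1].conjugate() * d[1]
```

The final line checks the defining property of the beamformer: the target direction passes with unit gain while output noise power is minimized.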
Authors:Allan, Jon; Leijonhufvud, Susanna
Affiliation:Luleå University of Technology, Sweden; Royal College of Music in Stockholm, Sweden
A cross-disciplinary study between the two research areas of Audio Technology and Music Education was performed to assess how different aspects of education and experience may influence the experience of music listening given a typical streaming service, Spotify. The point of departure is that streamed media facilitates a plenitude of versions of the same song. The paper focuses on the differences between these versions that result from various mastering processes and production choices motivated by the end distribution media, and from user settings in the playback system that aim to alter the sound. These variations may all lead to differences in musical dynamics and timbre. A listening test was conducted to examine listeners' preferences, the assessed audio quality, and subjects' reports on how the music content affected them when given the possibility to compare versions in a controlled environment. The test subjects (n = 76) represented populations with various educational backgrounds and experience within music and audio technology. Among the results, it was found that education and experience in some cases do affect preferences.
Authors:Munroe, Oliver; Novak, Antonin; Simon, Laurent
Affiliation:Laboratoire d’Acoustique de l’Université du Mans, UMR 6613, Institut d’Acoustique - Graduate School, CNRS, Le Mans Université, France; Laboratoire d’Acoustique de l’Université du Mans, UMR 6613, Institut d’Acoustique - Graduate School, CNRS, Le Mans Université, France; Laboratoire d’Acoustique de l’Université du Mans, UMR 6613, Institut d’Acoustique - Graduate School, CNRS, Le Mans Université, France
In the case of an electrodynamic loudspeaker, the reluctance force is purely a nonlinear addition to the Lorentz force. It generates second harmonic and intermodulation distortions and very low-frequency components in the force applied to the moving assembly. Being able to accurately model and then compensate this force is one of the building blocks for a feedforward system designed to linearize an electrodynamic loudspeaker. This work investigates the reluctance force formulation and proposes a more accurate model for motor structures with a shorting ring and a compensation algorithm that does not require root finding or model inversions. The compensation is applied to two drive units and shown to reduce both harmonic and intermodulation distortion by around 18 dB between direct current and 1 kHz. Compensation results using parameters fitted to measured and simulated electrical impedances are compared. Finally, there is a brief discussion on the implications of the results for shorting ring design.
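To see why a reluctance term produces a second harmonic and a constant offset, consider the common textbook form F = ½·i²·dL/dx, which is quadratic in the current. The sketch below uses that simplified form with a constant, hypothetical inductance slope; it is not the more accurate shorting-ring model proposed in the paper. It drives the term with a single tone and measures the resulting force components with a direct DFT sum.

```python
import cmath
import math


def reluctance_force(current, dL_dx):
    """Simplified textbook reluctance term: F = 0.5 * i^2 * dL/dx."""
    return 0.5 * current * current * dL_dx


def bin_amplitude(x, k):
    """Amplitude of the k-th DFT bin of a real sequence x."""
    n_samples = len(x)
    acc = sum(v * cmath.exp(-2j * math.pi * k * n / n_samples)
              for n, v in enumerate(x))
    scale = 1.0 if k == 0 else 2.0  # single-sided scaling for k > 0
    return scale * abs(acc) / n_samples


N = 64          # samples in the analysis window
CYCLES = 5      # whole cycles of the drive tone in the window
DL_DX = -2.0    # hypothetical inductance slope in H/m

current = [math.cos(2 * math.pi * CYCLES * n / N) for n in range(N)]
force = [reluctance_force(i, DL_DX) for i in current]

dc = bin_amplitude(force, 0)
fundamental = bin_amplitude(force, CYCLES)
second_harmonic = bin_amplitude(force, 2 * CYCLES)
```

Because cos² = ½(1 + cos 2ωt), the force contains only a constant part and a component at twice the drive frequency, with nothing at the fundamental. With two drive tones, the same square law would also produce sum and difference frequencies, which is where the intermodulation and very low-frequency components come from.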
Authors:Macdonell, Marcus; Scott, Jonathan
Affiliation:Corax Audio Labs, Cambridge, New Zealand; University of Waikato, Hamilton, New Zealand
This paper presents the theory and design of a Vector-corrected Network Analyzer realized in the acoustic domain. This is a novel measurement instrument based on the established microwave vector network analyzer. It employs directional couplers to separate forward- and reverse-traveling waves in an acoustic waveguide. This instrument is intended to supersede the acoustic impedance tube. Advantages include greatly increased measurement speed and potential for traceability to external standards. Traceability is achieved by means of a calibration through an analytical solution of the error matrix produced from the measurement of a limited number of available acoustic standards. Operation is verified through analysis of the acoustic S-parameters of a passive, asymmetrical, reciprocal acoustic device constructed inside the acoustic waveguide. To the best of the authors' knowledge, this Acoustic Vector-corrected Network Analyzer is the first of its kind.
Author:Jovanovic, Vladan M.
Affiliation:Consultant, Bloomington, IN
This author's recent paper on the zeros of the tracking error for various Löfgren alignments showed that the formula originally derived in 1941 for tracking angle zeros in the case of the Löfgren A alignment method ("minimax" optimization of distortions) provides accurate results in practice, but the approximate formula often used for the Löfgren C alignment (Least Mean Squares optimization) does not appear to work as well. The zero tracking error radii were found to be in error by up to 0.6 mm, causing practically all protractors for Löfgren C alignment to be slightly miscalibrated. This paper investigates the Löfgren C case analytically and presents some new formulae for the optimum offset angle, overhang, and zero tracking error radii, which match the numeric optimization results very well.
Affiliation:Ingenieurbüro für Akustik und Signalverarbeitung, Griesheim, Germany
This paper describes a digital signal processing method for reducing interference when receiving an analog FM stereo signal. The proposed method uses the left and right audio signals of a stereo receiver as input signals. Unlike conventional FM receiver strategies that reduce the stereo separation broadband or in frequency bands to keep the noise at a bearable level, here the noise is reduced to a quality comparable to mono while at the same time preserving the stereo separation and frequency response. This is achieved by applying signal processing rules derived from the observation of matrixed source signals recorded in intensity, time-of-arrival, and equivalence stereophony. The main part of the noise reduction is based on lowering the magnitude spectrum of the disturbed difference signal to the level of the sum signal, eliminating the excessive width of the stereo base caused by noise. The signal processing method is compatible with the FM stereo transmission standard and applicable worldwide.
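The central step named in this abstract, lowering the magnitude spectrum of the difference signal to the level of the sum signal, can be sketched per frequency bin as follows. This is a minimal illustration under stated assumptions: the spectra are given as lists of complex bins, the sum and difference are defined as M = (L + R)/2 and S = (L − R)/2, and the phase of the difference signal is left unchanged. The function names are hypothetical, and the paper's additional processing rules are not reproduced.

```python
def limit_difference(sum_bins, diff_bins):
    """Scale each difference bin so that |S| does not exceed |M|."""
    limited = []
    for m, s in zip(sum_bins, diff_bins):
        mag_m, mag_s = abs(m), abs(s)
        if mag_s > mag_m:
            # Reduce the magnitude to |M| while keeping the phase of S.
            s = s * (mag_m / mag_s)
        limited.append(s)
    return limited


def to_left_right(sum_bins, diff_bins):
    """Recover L = M + S and R = M - S from sum and difference bins."""
    left = [m + s for m, s in zip(sum_bins, diff_bins)]
    right = [m - s for m, s in zip(sum_bins, diff_bins)]
    return left, right


# Toy spectra (assumed): the first difference bin is dominated by noise.
sum_bins = [1.0 + 0.0j, 0.0 + 0.5j]
diff_bins = [3.0 + 4.0j, 0.1 + 0.0j]

clean_diff = limit_difference(sum_bins, diff_bins)
left, right = to_left_right(sum_bins, clean_diff)
```

In this toy case the first difference bin is reduced from magnitude 5 to magnitude 1, while the second bin, already below the sum level, passes through unchanged, so legitimate stereo content is not narrowed.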