Authors: Adami, Alexander; Taghipour, Armin; Herre, Jürgen
Affiliation: International Audio Laboratories Erlangen (AudioLabs), a joint institution of Friedrich-Alexander-Universität Erlangen-Nürnberg and Fraunhofer IIS, Erlangen, Germany; Empa, Swiss Federal Laboratories for Materials Science and Technology, Laboratory for Acoustics/Noise Control, Dübendorf, Switzerland
With applause, an audience expresses its enthusiasm and appreciation; it is therefore a vital part of live performances and thus of live recordings. Applause sounds range from very sparse sounds with easily distinguishable individual claps in small crowds to extremely dense, almost noise-like sounds in very large crowds. Commonly used perceptual attributes such as loudness, pitch, and timbre seem insufficient to characterize different types of applause, while the recently introduced attribute of "density" is important for characterizing applause-like sounds. In this paper, the perceptual properties of applause sounds are investigated with a focus on how spectral equalization affects their perception. Two experiments are presented: the first examines the extent to which spectral equalization influences the perception of density, and the second examines its impact on the perceived similarity of applause sounds with different densities. Additionally, the data of both experiments were jointly evaluated to explore the effect of perceived density and spectral equalization on the perceived similarity of applause sounds. A statistical analysis of the experimental data suggests that spectral equalization has no statistically significant effect on density and only a small but significant effect on similarity. A linear mixed effects model fitted to the experimental data revealed that perceived density differences as well as spectral equalization significantly predict applause similarity, with density differences the dominating factor. Finally, the results appear applicable to other impulsive sounds with a high rate of transient events.
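The shape of such an analysis can be illustrated with a sketch: a fixed-effects-only regression (a simplification of the paper's linear mixed effects model, which also includes per-listener random effects) fitted to synthetic similarity ratings. All variable names and the synthetic data are assumptions for illustration, not the paper's data.

```python
import numpy as np

# Synthetic similarity ratings for applause pairs (illustrative only).
rng = np.random.default_rng(0)
n = 200
density_diff = rng.uniform(0.0, 1.0, n)          # |perceived density A - density B|
equalized = rng.integers(0, 2, n).astype(float)  # 1 if the pair differs in spectral EQ
# Density difference dominates; EQ adds a small effect (mirroring the paper's finding).
similarity = 1.0 - 0.8 * density_diff - 0.1 * equalized + rng.normal(0.0, 0.05, n)

# Ordinary least squares; a full analysis would use a mixed model with
# listener as a random effect (e.g., statsmodels' MixedLM).
X = np.column_stack([np.ones(n), density_diff, equalized])
(intercept, b_density, b_eq), *_ = np.linalg.lstsq(X, similarity, rcond=None)
print(f"density slope: {b_density:.2f}, EQ effect: {b_eq:.2f}")
```

On this synthetic data the recovered density slope is much larger in magnitude than the EQ effect, matching the "dominating factor" conclusion by construction.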
Authors: Adrian, Jens-Alrik; Gerkmann, Timo; van de Par, Steven; Bitzer, Joerg
Affiliation: Institute for Hearing Technology and Audiology, Jade University of Applied Sciences, Oldenburg, Germany; Signal Processing, Department of Informatics, Universität Hamburg, Germany; Acoustics Group, Cluster of Excellence
Algorithms in speech and audio applications are often evaluated under adverse conditions to test their robustness against additive noise. This research describes a method to generate artificial but perceptually plausible acoustic disturbances, thereby providing a controlled and repeatable context for evaluating algorithms. The method allows noise parameters such as coloration, modulation, and amplitude distribution to be controlled independently of each other, and also provides the means to define the amount of coherence among all signal channels. Results of a listening test in a monaural setup show no significant difference in naturalness between synthesized and original signals. It is not always obvious how to create natural-sounding noise; for example, white Gaussian noise was often found to be an inappropriate substitute. Frequency-dependent modulations on a short time scale appear to contribute to naturalness. Vinyl/shellac noise, with its particular impulsive character, requires a dedicated synthesis approach, and rain and applause also proved challenging to synthesize.
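Two of the controllable parameters mentioned, coloration and modulation, can be sketched in a few lines: white Gaussian noise is given an FFT-domain spectral tilt and a sinusoidal amplitude envelope. The parameter names, the 1 kHz reference frequency, and the simple sinusoidal envelope are assumptions for illustration, not the paper's synthesis method.

```python
import numpy as np

def synth_noise(n, fs, tilt_db_oct=-3.0, mod_rate=4.0, mod_depth=0.5, seed=0):
    """Gaussian noise with a spectral tilt (coloration) and amplitude modulation."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    # Coloration: impose `tilt_db_oct` dB/octave around a 1 kHz reference.
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(n, 1.0 / fs)
    f[0] = f[1]  # avoid division trouble at DC
    X *= (f / 1000.0) ** (tilt_db_oct / 6.02)  # 6.02 dB ~= a factor of 2 in amplitude
    x = np.fft.irfft(X, n)
    # Modulation: slow sinusoidal amplitude envelope.
    t = np.arange(n) / fs
    x *= 1.0 + mod_depth * np.sin(2.0 * np.pi * mod_rate * t)
    return x / np.max(np.abs(x))

noise = synth_noise(16384, 16000)
```

Multichannel coherence could then be set by mixing independent realizations of such noise; that step is omitted here.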
Authors: Fleßner, Jan-Hendrik; Huber, Rainer; Ewert, Stephan D.
Affiliation: HörTech gGmbH and Cluster of Excellence Hearing4All, Oldenburg, Germany; Medizinische Physik and Cluster of Excellence Hearing4All, Universität Oldenburg, Oldenburg, Germany
Binaural or spatial presentation of audio signals has become increasingly important not only in consumer sound reproduction but also for hearing-assistive devices like hearing aids, where the signals at both ears might undergo extensive signal processing. Such processing may introduce distortions to the interaural signal properties that affect perception. In this research, an approach for intrusive binaural auditory-model-based quality prediction (BAM-Q) is introduced. BAM-Q uses a binaural auditory model as a front-end to extract three binaural features: interaural level difference, interaural time difference, and a measure of interaural coherence. The current approach focuses on the general applicability (with respect to binaural signal differences) of the binaural quality model to arbitrary binaural audio signals. Two listening experiments were conducted to subjectively measure the influence of these binaural features and their combinations on binaural quality perception, and the results were used to train BAM-Q. Two different hearing aid algorithms were used to evaluate the performance of the model; the correlations between subjective mean ratings and model predictions are higher than 0.9.
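The three features named above can be illustrated with a broadband sketch. Note that BAM-Q's auditory-model front-end computes such features per auditory band; this simplified full-band version, with hypothetical function and parameter names, only shows what each quantity measures.

```python
import numpy as np

def binaural_features(left, right, fs, max_lag_s=1e-3):
    """Broadband ILD (dB), ITD (s), and interaural coherence of a stereo signal.
    Uses circular cross-correlation for simplicity; positive ITD means the
    right channel is delayed relative to the left."""
    # Interaural level difference from channel energies
    ild = 10.0 * np.log10(np.sum(left**2) / np.sum(right**2))
    # Normalized cross-correlation over physiologically plausible lags (+/- 1 ms)
    max_lag = int(max_lag_s * fs)
    lags = np.arange(-max_lag, max_lag + 1)
    norm = np.sqrt(np.sum(left**2) * np.sum(right**2))
    xcorr = np.array([np.sum(np.roll(left, lag) * right) for lag in lags]) / norm
    best = np.argmax(np.abs(xcorr))
    # ITD: lag of the correlation peak; coherence: peak height
    return ild, lags[best] / fs, np.abs(xcorr[best])
```

For a right channel that is an attenuated, delayed copy of the left, the sketch returns the expected level difference, the imposed delay, and a coherence near one.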
Affiliation: Dysonics, San Francisco, CA, USA
The objective of this research is to create an ensemble of decorrelation filters that, when combined with a specially treated B-Format Room Impulse Response (RIR), approximates the expected measurements on a real array of outward-facing cardioid microphones in an acoustic diffuse field. This approach relies on the statistical description of reverberation, in particular the autocorrelation with respect to frequency and space. The contributions of this paper allow systematic filter design, the creation of arbitrary angular spacings for three-dimensional loudspeaker arrays, and a simple binaural adaptation. It is hoped that this diffuse field modeling (DFM) approach will become a complement to existing techniques and assist in the creation of virtual acoustic scenarios that are perceptually indistinguishable from acoustic reality.
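One generic way to build such an ensemble (a sketch of the common random-phase all-pass idea, not necessarily the paper's filter-design procedure) is to draw independent random phase spectra with unit magnitude, so each filter preserves the spectrum of a diffuse tail while the outputs are mutually decorrelated:

```python
import numpy as np

def decorrelation_filters(n_fft, n_filters, seed=0):
    """Impulse responses of random-phase all-pass filters (unit magnitude response)."""
    rng = np.random.default_rng(seed)
    n_bins = n_fft // 2 + 1
    phases = rng.uniform(-np.pi, np.pi, (n_filters, n_bins))
    phases[:, 0] = 0.0   # DC bin must stay real
    phases[:, -1] = 0.0  # Nyquist bin too (n_fft even)
    # Unit-magnitude spectra -> time-domain impulse responses
    return np.fft.irfft(np.exp(1j * phases), n_fft, axis=-1)

h = decorrelation_filters(1024, 4)
```

Convolving the late part of a common RIR with each filter would then yield copies that are decorrelated across loudspeaker channels while keeping the diffuse tail's magnitude spectrum.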
Although machines are getting better at understanding audio signals, the tasks being undertaken are still relatively basic. Identifying a solo on a particular instrument, working out who is talking, detecting when someone is singing, and so forth are tasks that humans mostly do with relative ease; devising an algorithm that enables a machine to do them reliably is surprisingly challenging. Selected papers from the recent conference on semantic audio are summarized.