Journal of the AES

2023 January/February - Volume 71 Number 1/2

Papers

Comparison of Full Factorial and Optimal Experimental Design for Perceptual Evaluation of Audiovisual Quality

Open Access

Authors:Frans, Randy Fela; Zacharov, Nick; Forchhammer, Søren
Affiliation:SenseLab, FORCE Technology, Hørsholm, Denmark; Department of Electrical and Photonics Engineering, Technical University of Denmark, Kgs. Lyngby, Denmark; Meta Reality Labs., Paris, France; Department of Electrical and Photonics Engineering, Technical University of Denmark, Kgs. Lyngby, Denmark
Page:4

Perceptual evaluation of immersive audiovisual quality is often very labor-intensive and costly because numerous factors and factor levels are included in the experimental design. Therefore, the present study aims to reduce the required experimental effort by investigating the effectiveness of optimal experimental design (OED) compared to classical full factorial design (FFD) in the study using compressed omnidirectional video and ambisonic audio as examples. An FFD experiment was conducted and the results were used to simulate 12 OEDs consisting of D-optimal and I-optimal designs varying with replication and additional data points. The fraction of design space plot and the effect test based on the ordinary least-squares model were evaluated, and four OEDs were selected for a series of laboratory experiments. After demonstrating an insignificant difference between the simulation and experimental data, this study also showed that the differences in model performance between the experimental OEDs and FFD were insignificant, except for some interacting factors in the effect test. Finally, the performance of the I-optimal design with replicated points was shown to outperform that of the other designs. The results presented in this study open new possibilities for assessing perceptual quality in a much more efficient way.

Optimal Microphone Placement for Single-Channel Sound-Power Spectrum Estimation and Reverberation Effects

Authors:Bellows, Samuel D.; Leishman, Timothy W.
Affiliation:Acoustics Research Group, Department of Physics and Astronomy, Brigham Young University, Provo, UT
Page:20

The sound power produced by an acoustic source comprises its total sound energy radiated in all directions per unit time. As the global emission, it excites the reverberant field of a surrounding room. Conversely, an acoustic signal detected for audio applications, including driving reverberation effects, often results from a microphone at a discrete location that does not capture the global source sound and its sound-power spectrum. This paper explores several physical bases for how measured high-resolution spherical directivity functions and known room conditions allow audio engineers to optimize a microphone position to yield a signal with a mean-squared spectrum best approximating the time-averaged sound-power spectrum. The proposed approaches provide means to capture the global source sound with its attendant audio benefits, including the production of more realistic reverberation effects.

Adversarial Example Devastation and Detection on Speech Recognition System by Adding Random Noise

Authors:Dong, Mingyu; Yan, Diqun; Gong, Yongkang
Affiliation:College of Information Science and Engineering, Ningbo University, Ningbo Zhejiang, China
Page:34

An automatic speech recognition (ASR) system based on a deep neural network is vulnerable to attack by an adversarial example, especially if the command-dependent ASR fails. A defense method against adversarial examples is proposed to improve the robustness and security of the ASR system. An algorithm of devastation and detection on adversarial examples that can attack current advanced ASR systems is proposed. An advanced text-dependent and command-dependent ASR system is chosen as the target, generating adversarial examples by an optimization-based attack on text-dependent ASR and the genetic-algorithm-based algorithm on command-dependent ASR. The method is based on input transformation of adversarial examples. Different random intensities and kinds of noise are added to adversarial examples to devastate the perturbation previously added to normal examples. Experimental results show that the method performs well. For the devastation of examples, the original speech similarity after adding noise can reach 99.68%, the similarity of adversarial examples can reach zero, and the detection rate of adversarial examples can reach 94%.

The Watkins Woofer

Authors:Degraeve, Sébastien; Oclee-Brown, Jack
Affiliation:GP Acoustics (UK), Eccleston Rd, Maidstone, ME15 6QP, UK
Page:45

The Watkins Woofer is an arrangement, invented and patented by William (Bill) Watkins and subsequently used by Infinity, that uses a novel technique to increase the efficiency of an infinite baffle or closed box loudspeaker. Watkins himself described succinctly the principle of operation of his dual-coil woofer, but no rigorous analysis was published. Furthermore, the self- and mutual inductances were ignored, causing a dip in the impedance magnitude. Using the same approach as Thiele and Small, the Watkins Woofer is, for the first time, fully analyzed to outline the volume, bandwidth, and sensitivity trade-offs.

Analysis of Löfgren's Tonearm Optimization

Author:Hickman, Peet
Affiliation:Department of Physics, Lehigh University, Bethlehem, PA
Page:53

A rigorous proof is given of the new analytic formula recently presented by Jovanovic for the linear offset p in Löfgren C alignment. An alternate but mathematically equivalent form of this formula shows explicitly that p depends weakly on 1/L², where L is the effective length of the tonearm. Simplified derivations of several results for Löfgren A are also presented. Approximate formulas for Löfgren C valid in the limit of large L are derived, compared with accurate numerical calculations, and shown to be sufficiently accurate to account qualitatively for the optimum values of the tonearm parameters over a wide range of L.

Engineering Reports

Recordings of a Loudspeaker Orchestra With Multichannel Microphone Arrays for the Evaluation of Spatial Audio Methods

Open Access

Authors:Ackermann, David; Domann, Julian; Brinkmann, Fabian; Arend, Johannes M.; Schneider, Martin; Pörschmann, Christoph; Weinzierl, Stefan
Affiliation:Audio Communication Group, Technische Universität Berlin, Berlin, Germany; Audio Communication Group, Technische Universität Berlin, Berlin, Germany; Audio Communication Group, Technische Universität Berlin, Berlin, Germany; Audio Communication Group, Technische Universität Berlin, Berlin, Germany; Georg Neumann GmbH, Berlin, Germany; Institute of Communications Engineering, Köln – University of Applied Sciences, Köln, Germany; Audio Communication Group, Technische Universität Berlin, Berlin, Germany
Page:62

For live broadcasting of speech, music, or other audio content, multichannel microphone array recordings of the sound field can be used to render and stream dynamic binaural signals in real time. For a comparative physical and perceptual evaluation of conceptually different binaural rendering techniques, recordings are needed in which all other factors affecting the sound (such as the sound radiation of the sources, the room acoustic environment, and the recording position) are kept constant. To provide such a recording, the sound field of an 18-channel loudspeaker orchestra fed by anechoic recordings of a chamber orchestra was captured in two rooms with nine different receivers. In addition, impulse responses were recorded for each sound source and receiver. The anechoic audio signals, the full loudspeaker orchestra recordings, and all measured impulse responses are available with open access in the Spatially Oriented Format for Acoustics (SOFA 2.1, AES69-2022) format. The article presents the recording process and processing chain as well as the structure of the generated database.

Standards and Information Documents

AES Standards Committee News

Features

Call for Papers - Special Issue - Sonification

AES New Officers

Departments

Book Review

Conv&Conf

Extras

Table of Contents

Cover & Sustaining Members List

AES Officers, Committees, Offices & Journal Staff
