Authors:Frans, Randy Fela; Zacharov, Nick; Forchhammer, Søren
Affiliation:SenseLab, FORCE Technology, Hørsholm, Denmark; Department of Electrical and Photonics Engineering, Technical University of Denmark, Kgs. Lyngby, Denmark; Meta Reality Labs., Paris, France; Department of Electrical and Photonics Engineering, Technical University of Denmark, Kgs. Lyngby, Denmark
Perceptual evaluation of immersive audiovisual quality is often labor-intensive and costly because numerous factors and factor levels are included in the experimental design. The present study therefore aims to reduce the required experimental effort by investigating the effectiveness of optimal experimental design (OED) compared with classical full factorial design (FFD), using compressed omnidirectional video and ambisonic audio as examples. An FFD experiment was conducted, and its results were used to simulate 12 OEDs consisting of D-optimal and I-optimal designs that vary in replication and additional data points. The fraction of design space plot and the effect test based on the ordinary least-squares model were evaluated, and four OEDs were selected for a series of laboratory experiments. After demonstrating an insignificant difference between the simulation and experimental data, this study also showed that the differences in model performance between the experimental OEDs and FFD were insignificant, except for some interacting factors in the effect test. Finally, the I-optimal design with replicated points was shown to outperform the other designs. The results presented in this study open new possibilities for assessing perceptual quality much more efficiently.
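As a rough illustration of the D-optimality idea mentioned in the abstract above, the sketch below compares the D-criterion, det(X^T X), of a small full factorial design with that of the best four-run subset found by exhaustive search. The two factors, the coded levels, the main-effects model, and the function names are all illustrative assumptions; they are not the factors, models, or software used in the study.

```python
import itertools

import numpy as np

# Assumed toy setup: two factors, each at three coded levels.
levels = [-1.0, 0.0, 1.0]
full_factorial = list(itertools.product(levels, repeat=2))  # 9 runs


def d_criterion(runs):
    """Return det(X^T X) for a main-effects model (intercept + 2 factors)."""
    x = np.column_stack([np.ones(len(runs)), np.asarray(runs)])
    return float(np.linalg.det(x.T @ x))


def best_subset(candidates, n_runs):
    """Exhaustively pick the n_runs candidates that maximize det(X^T X).

    Practical OED software uses exchange algorithms instead; brute force
    is only feasible here because the candidate set is tiny.
    """
    return max(itertools.combinations(candidates, n_runs), key=d_criterion)


design = best_subset(full_factorial, 4)
print("selected runs:", sorted(design))
print("D-criterion, 4-run design:", d_criterion(design))
print("D-criterion, full factorial:", d_criterion(full_factorial))
```

For this toy main-effects model the search keeps only the corner points of the design space, which is why a reduced design can estimate the same effects with fewer runs.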
Authors:Bellows, Samuel D.; Leishman, Timothy W.
Affiliation:Acoustics Research Group, Department of Physics and Astronomy, Brigham Young University, Provo, UT
The sound power produced by an acoustic source comprises its total sound energy radiated in all directions per unit time. As the global emission, it excites the reverberant field of a surrounding room. Conversely, an acoustic signal detected for audio applications, including driving reverberation effects, often results from a microphone at a discrete location that does not capture the global source sound and its sound-power spectrum. This paper explores several physical bases for how measured high-resolution spherical directivity functions and known room conditions allow audio engineers to optimize a microphone position to yield a signal with a mean-squared spectrum best approximating the time-averaged sound-power spectrum. The proposed approaches provide means to capture the global source sound with its attendant audio benefits, including the production of more realistic reverberation effects.
Authors:Dong, Mingyu; Yan, Diqun; Gong, Yongkang
Affiliation:College of Information Science and Engineering, Ningbo University, Ningbo Zhejiang, China
An automatic speech recognition (ASR) system based on a deep neural network is vulnerable to attack by an adversarial example, especially if the command-dependent ASR fails. A defense method against adversarial examples is proposed to improve the robustness and security of the ASR system. The method is an algorithm for the devastation and detection of adversarial examples that can attack current advanced ASR systems. An advanced text-dependent ASR system and a command-dependent ASR system are chosen as targets, and adversarial examples are generated by an optimization-based attack on the text-dependent ASR and a genetic-algorithm-based algorithm on the command-dependent ASR. The method is based on input transformation of adversarial examples: noise of different random intensities and kinds is added to adversarial examples to devastate the perturbation previously added to normal examples. Experimental results show that, after noise is added, the similarity for original speech can reach 99.68%, the similarity for adversarial examples can reach zero, and the detection rate of adversarial examples can reach 94%.
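The input-transformation idea in the abstract above can be sketched roughly as follows: add random noise to the input, transcribe it before and after, and flag the input when the two transcriptions diverge. This is a minimal sketch under stated assumptions, not the authors' algorithm. It uses a single white Gaussian noise source at one signal-to-noise ratio, a generic string-similarity measure, and an arbitrary threshold of 0.5; `transcribe` is a hypothetical callable standing in for an ASR system.

```python
import difflib

import numpy as np


def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise scaled to hit the requested SNR exactly."""
    noise = rng.standard_normal(signal.shape)
    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
    return signal + scale * noise


def text_similarity(a, b):
    """Similarity of two transcriptions in [0, 1] (1 means identical)."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def looks_adversarial(signal, transcribe, snr_db=20.0, threshold=0.5, seed=0):
    """Flag the input if noise changes its transcription substantially.

    `transcribe` is a hypothetical ASR callable: waveform in, text out.
    `snr_db` and `threshold` are assumed values, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    before = transcribe(signal)
    after = transcribe(add_noise(signal, snr_db, rng))
    return text_similarity(before, after) < threshold
```

A benign input should transcribe almost identically with and without the added noise, whereas an adversarial perturbation is expected to be fragile, so its transcription changes once the noise is added.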
Authors:Degraeve, Sébastien; Oclee-Brown, Jack
Affiliation:GP Acoustics (UK), Eccleston Rd, Maidstone, ME15 6QP, UK
The Watkins woofer is an arrangement, invented and patented by William (Bill) Watkins and subsequently used by Infinity, that uses a novel technique to increase the efficiency of an infinite baffle or closed box loudspeaker. Watkins himself succinctly described the principle of operation of his dual-coil woofer, but no rigorous analysis was published. Furthermore, the self- and mutual inductances were ignored, causing a dip in the impedance magnitude. Using the same approach as Thiele and Small, the Watkins woofer is, for the first time, fully analyzed to outline the volume, bandwidth, and sensitivity trade-offs.
Affiliation:Department of Physics, Lehigh University, Bethlehem, PA
A rigorous proof is given of the new analytic formula recently presented by Jovanovic for the linear offset p in Löfgren C alignment. An alternate but mathematically equivalent form of this formula shows explicitly that p depends weakly on 1/L², where L is the effective length of the tonearm. Simplified derivations of several results for Löfgren A are also presented. Approximate formulas for Löfgren C valid in the limit of large L are derived, compared with accurate numerical calculations, and shown to be sufficiently accurate to account qualitatively for the optimum values of the tonearm parameters over a wide range of L.
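For orientation, the sketch below evaluates the standard closed-form Löfgren A (Baerwald) results, not the Löfgren C formula discussed in the paper, which is not reproduced here. The default groove radii (60.325 mm and 146.05 mm) and the function names are assumptions made for illustration. In Löfgren A the linear offset p is the mean of the two null radii and so does not depend on L at all, which gives a point of comparison for the weak 1/L² dependence reported for Löfgren C.

```python
import math


def lofgren_a(effective_length, r_inner=60.325, r_outer=146.05):
    """Standard Löfgren A alignment for a pivoted tonearm (lengths in mm).

    The default groove radii are assumed values, not taken from the paper.
    """
    k = 1 / math.sqrt(2)
    product = 2 * r_inner * r_outer
    null_inner = product / ((1 - k) * r_inner + (1 + k) * r_outer)
    null_outer = product / ((1 + k) * r_inner + (1 - k) * r_outer)
    linear_offset = (null_inner + null_outer) / 2       # p, independent of L
    offset_angle = math.degrees(math.asin(linear_offset / effective_length))
    mounting = math.sqrt(effective_length ** 2 - null_inner * null_outer)
    return {
        "null_inner": null_inner,
        "null_outer": null_outer,
        "linear_offset": linear_offset,
        "offset_angle_deg": offset_angle,
        "mounting_distance": mounting,
        "overhang": effective_length - mounting,
    }


def tracking_error_deg(radius, effective_length, alignment):
    """Tracking error at a groove radius; zero at the two null radii."""
    m = alignment["mounting_distance"]
    s = (radius ** 2 + effective_length ** 2 - m ** 2) / (2 * radius * effective_length)
    return math.degrees(math.asin(s)) - alignment["offset_angle_deg"]


short_arm = lofgren_a(229.0)
long_arm = lofgren_a(305.0)
print(short_arm)
print(long_arm)
```

Running this for two different effective lengths gives the same null radii and linear offset, while the offset angle and overhang change with L.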
Authors:Ackermann, David; Domann, Julian; Brinkmann, Fabian; Arend, Johannes M.; Schneider, Martin; Pörschmann, Christoph; Weinzierl, Stefan
Affiliation:Audio Communication Group, Technische Universität Berlin, Berlin, Germany; Audio Communication Group, Technische Universität Berlin, Berlin, Germany; Audio Communication Group, Technische Universität Berlin, Berlin, Germany; Audio Communication Group, Technische Universität Berlin, Berlin, Germany; Georg Neumann GmbH, Berlin, Germany; Institute of Communications Engineering, Köln – University of Applied Sciences, Köln, Germany; Audio Communication Group, Technische Universität Berlin, Berlin, Germany
For live broadcasting of speech, music, or other audio content, multichannel microphone array recordings of the sound field can be used to render and stream dynamic binaural signals in real time. For a comparative physical and perceptual evaluation of conceptually different binaural rendering techniques, recordings are needed in which all other factors affecting the sound (such as the sound radiation of the sources, the room acoustic environment, and the recording position) are kept constant. To provide such a recording, the sound field of an 18-channel loudspeaker orchestra fed by anechoic recordings of a chamber orchestra was captured in two rooms with nine different receivers. In addition, impulse responses were recorded for each sound source and receiver. The anechoic audio signals, the full loudspeaker orchestra recordings, and all measured impulse responses are available with open access in the Spatially Oriented Format for Acoustics (SOFA 2.1, AES69-2022) format. The article presents the recording process and processing chain as well as the structure of the generated database.