AES New York 2019
Paper Session P02
P02 - Audio Signal Processing
Wednesday, October 16, 9:00 am — 12:00 pm (1E11)
Chair: Scott Hawley, Belmont University - Nashville, TN, USA
P02-1 Analyzing and Extracting Multichannel Sound Field—Pei-Lun Hsieh, Ambidio - Glendale, CA, USA
Current post-production workflows require sound engineers to create multiple multichannel audio delivery formats. Inaccurate translation between formats may require extra time and cost for manual adjustment, whereas in sound reproduction it causes misinterpretation of the original mix and deviation from the intended story. This paper proposes a method that combines analysis of an Ambisonics field encoded from the input multichannel signal with analysis of each pair of adjacent channels. This provides an overall understanding of the multichannel sound field while still allowing fine extraction from each channel pair. The result can be used to translate between multichannel formats and to provide more accurate rendering for immersive stereo playback.
Convention Paper 10221
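As a rough illustration of the "encoded Ambisonics field" idea in P02-1, the sketch below encodes a horizontal multichannel bed into first-order components and estimates a dominant direction. It is not the paper's method. The 5.0 speaker azimuths, the unscaled W component, and the helper names `encode_foa_horizontal` and `dominant_azimuth_deg` are all assumptions made for this example.

```python
import math

# Assumed 5.0 layout: azimuth in degrees, counterclockwise positive.
LAYOUT_5_0 = {"L": 30.0, "R": -30.0, "C": 0.0, "Ls": 110.0, "Rs": -110.0}

def encode_foa_horizontal(channels, layout=LAYOUT_5_0):
    """Encode equal-length channel signals into horizontal first-order W, X, Y.

    Assumption: W is the plain sum of the channels (no 1/sqrt(2) scaling).
    """
    names = list(channels)
    n = len(channels[names[0]])
    w, x, y = [0.0] * n, [0.0] * n, [0.0] * n
    for name in names:
        az = math.radians(layout[name])
        c, s = math.cos(az), math.sin(az)
        for i, v in enumerate(channels[name]):
            w[i] += v
            x[i] += v * c
            y[i] += v * s
    return w, x, y

def dominant_azimuth_deg(w, x, y):
    """Estimate one dominant azimuth from the time-summed products W*X and W*Y."""
    ix = sum(a * b for a, b in zip(w, x))
    iy = sum(a * b for a, b in zip(w, y))
    return math.degrees(math.atan2(iy, ix))

# Usage: a tone present only in the left surround channel.
tone = [math.sin(2.0 * math.pi * 5.0 * i / 100.0) for i in range(100)]
silence = [0.0] * 100
bed = {"L": silence, "R": silence, "C": silence, "Ls": tone, "Rs": silence}
w, x, y = encode_foa_horizontal(bed)
azimuth = dominant_azimuth_deg(w, x, y)
```

A single global estimate like this cannot separate sources that sit between specific speakers, which is presumably why the abstract pairs the field analysis with a per-channel-pair analysis.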
P02-2 Profiling Audio Compressors with Deep Neural Networks—Scott Hawley, Belmont University - Nashville, TN, USA; Benjamin Colburn, ARiA Acoustics - Washington, DC, USA; Stylianos Ioannis Mimilakis, Fraunhofer Institute for Digital Media Technology (IDMT) - Ilmenau, Germany
We present a data-driven approach for predicting the behavior of (i.e., profiling) a given parameterized, non-linear, time-dependent audio signal processing effect. Our objective is to learn a function that maps the unprocessed audio to the processed audio, using time-domain samples. We employ a deep auto-encoder model that is conditioned on both time-domain samples and the control parameters of the target audio effect. As a test case, we focus on the offline profiling of two dynamic range compressors, one software-based and the other analog. Our results show that the primary characteristics of the compressors can be captured; however, there is still enough audible noise to merit further investigation before such methods are applied to real-world audio processing workflows.
Convention Paper 10222
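To make "parameterized, non-linear, time-dependent effect" in P02-2 concrete, here is a minimal textbook-style feed-forward compressor. It is neither of the compressors profiled in the paper nor the neural model. The function name `compress`, the parameter set, and the one-pole attack/release smoothing are assumptions chosen for illustration. A profiling model would be trained on pairs of input and output audio, together with control settings like these.

```python
import math

def compress(samples, fs, threshold_db=-20.0, ratio=4.0,
             attack_ms=5.0, release_ms=50.0):
    """Hypothetical feed-forward compressor: static curve plus smoothed gain."""
    a_att = math.exp(-1.0 / (fs * attack_ms * 1e-3))
    a_rel = math.exp(-1.0 / (fs * release_ms * 1e-3))
    gr_db = 0.0  # smoothed gain reduction in dB (always >= 0)
    out = []
    for x in samples:
        level_db = 20.0 * math.log10(max(abs(x), 1e-9))
        over_db = max(level_db - threshold_db, 0.0)
        target_db = over_db * (1.0 - 1.0 / ratio)  # static compression curve
        # Attack when more reduction is needed, release otherwise.
        coeff = a_att if target_db > gr_db else a_rel
        gr_db = coeff * gr_db + (1.0 - coeff) * target_db
        out.append(x * 10.0 ** (-gr_db / 20.0))
    return out

# Usage: a loud constant input settles to a reduced level,
# while a quiet input below the threshold passes through unchanged.
loud = compress([1.0] * 2000, fs=1000)
quiet = compress([0.01] * 100, fs=1000)
```

The output depends on the signal history through `gr_db`, which is the time dependence that makes such effects harder to model than a memoryless waveshaper.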
P02-3 Digital Parametric Filters Beyond Nyquist Frequency—Juan Sierra, Stanford University - Stanford, CA, USA; Meyer Sound Laboratories - Berkeley, CA, USA
Filter digitization through the bilinear transformation is often considered a very good all-around method for producing equalizer sections. The method is well behaved in terms of stability and ease of implementation; however, the frequency warping produced by the transformation leads to abnormalities near the Nyquist frequency. Moreover, it is impossible to design parametric sections whose analog center frequencies are defined above the Nyquist frequency. Such filters, even with center frequencies outside the hearing range, have effects that extend into the hearing bandwidth, with desirable characteristics during mixing and mastering. The purpose of this paper is to surpass these limitations while controlling the abnormalities of the warping produced by the bilinear transform through an alternative definition of the bilinear constant. In the process, a correction factor for the bandwidth of the parametric section is also discussed, to correct abnormalities affecting the digitization of this parameter.
Convention Paper 10224
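The warping and the Nyquist limitation described in P02-3 can be seen from the standard bilinear substitution s = k(z - 1)/(z + 1). The sketch below shows only that standard behavior; it does not reproduce the alternative bilinear constant proposed in the paper. The function names `warped_frequency` and `prewarp_constant` are hypothetical.

```python
import math

def warped_frequency(f_analog, fs, k=None):
    """Digital frequency (Hz) that an analog frequency maps to under
    s = k * (z - 1) / (z + 1). Defaults to the usual constant k = 2 * fs."""
    if k is None:
        k = 2.0 * fs
    return (fs / math.pi) * math.atan(2.0 * math.pi * f_analog / k)

def prewarp_constant(f0, fs):
    """Conventional pre-warped constant: maps analog f0 exactly onto digital f0.

    This is only defined below Nyquist, which is the limitation the
    abstract refers to.
    """
    if not 0.0 < f0 < fs / 2.0:
        raise ValueError("pre-warping is only defined for 0 < f0 < fs/2")
    return 2.0 * math.pi * f0 / math.tan(math.pi * f0 / fs)

fs = 48000.0
low = warped_frequency(1000.0, fs)     # close to 1 kHz: little warping
high = warped_frequency(30000.0, fs)   # analog 30 kHz is squeezed below Nyquist
matched = warped_frequency(10000.0, fs, prewarp_constant(10000.0, fs))
```

With the default constant, every analog frequency lands below fs/2, so an analog center frequency above Nyquist is compressed into the top of the band. The conventional pre-warped constant cannot be evaluated there at all.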
P02-4 Using Volterra Series Modeling Techniques to Classify Black-Box Audio Effects—Ethan Hoerr, Montana State University - Bozeman, MT, USA; Robert C. Maher, Montana State University - Bozeman, MT, USA
Digital models of various audio devices are useful for simulating audio processing effects, but developing good models of nonlinear systems can be challenging. This paper reports on in-progress work on determining attributes of black-box audio devices using Volterra series modeling techniques. In general, modeling an audio effect requires determining whether the system is linear or nonlinear, whether it is time-invariant or time-variant, and whether it has memory. For nonlinear systems, we must determine the degree of nonlinearity of the system and the required parameters of a suitable model. We explain our work in making educated guesses about the order of nonlinearity in a memoryless system and then discuss the extension to nonlinear systems with memory.
Convention Paper 10225
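One simple way to guess the order of a memoryless nonlinearity, in the spirit of P02-4, is to drive the black box with a single sinusoid and look for the highest harmonic it produces. A degree-N polynomial generates harmonics only up to N. This is a plain harmonic probe, not the Volterra procedure from the paper. It assumes the device can be called sample by sample as a Python function, and the names `harmonic_amplitudes` and `estimate_order` are hypothetical.

```python
import math

def harmonic_amplitudes(system, max_harmonic=8, n=1024, cycles=8, amplitude=1.0):
    """Amplitudes of harmonics 1..max_harmonic of the response to a cosine.

    The test tone completes an integer number of cycles in n samples, so each
    harmonic falls exactly on one DFT bin.
    """
    x = [amplitude * math.cos(2.0 * math.pi * cycles * i / n) for i in range(n)]
    y = [system(v) for v in x]
    amps = []
    for k in range(1, max_harmonic + 1):
        re = sum(y[i] * math.cos(2.0 * math.pi * k * cycles * i / n) for i in range(n))
        im = sum(y[i] * math.sin(2.0 * math.pi * k * cycles * i / n) for i in range(n))
        amps.append(2.0 * math.hypot(re, im) / n)
    return amps

def estimate_order(system, max_harmonic=8, tol=1e-9):
    """Highest harmonic whose amplitude exceeds tol (0 if none does)."""
    order = 0
    for k, a in enumerate(harmonic_amplitudes(system, max_harmonic), start=1):
        if a > tol:
            order = k
    return order

# Usage with a hypothetical cubic waveshaper standing in for the black box.
cubic_order = estimate_order(lambda v: v + 0.1 * v ** 3)
```

For a saturating curve such as tanh, the harmonics never stop, so the estimate simply hits `max_harmonic`; in that case a tolerance tied to the measurement noise floor would be the relevant design choice.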
P02-5 Modifying Audio Signals for Reproduction with Reduced Room Effect—Christof Faller, Illusonic GmbH - Uster, Zürich, Switzerland; EPFL - Lausanne, Switzerland
Conventionally, equalizers are applied when reproducing audio signals in rooms to reduce coloration and the effect of room resonances. Another approach, filtering audio signals with an inverse of the room impulse response (RIR), can theoretically eliminate the effect of the room at one point. However, practical issues arise, such as impaired sound at other positions, a need to update when RIRs change, and loudspeaker-challenging signals. A technique is presented that modifies the time-frequency envelopes (spectrogram) of audio signals such that the corresponding spectrogram in the room is more similar to the original signal’s spectrogram, i.e., the room effect is attenuated. The proposed technique has low sensitivity to changes in RIR and listener position.
Convention Paper 10226
P02-6 On the Similarity between Feedback/Loopback Amplitude and Frequency Modulation—Tamara Smyth, University of California, San Diego - San Diego, CA, USA
This paper extends previous work on loopback frequency modulation (FM) to a similar system in which an oscillator is looped back to modulate its own amplitude, so-called feedback amplitude modulation (FBAM). A continuous-time closed-form solution is presented for each, yielding greatly improved numerical properties, reduced dependency on sampling rate, and a more accurate representation of the feedback by eliminating the unit-sample delay required for discrete-time implementation. The two systems produce similar waveforms, and it is shown that FBAM, for a known input frequency, is actually a scaled and offset version of loopback FM having a different carrier frequency but the same sounding frequency. Two distinct representations are used to show mathematical equivalence between the systems while validating the closed-form solution for each.
Convention Paper 10223
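For context on the "unit-sample delay required for discrete-time implementation" mentioned in P02-6, the sketch below implements a commonly cited discrete-time FBAM recursion, y[n] = cos(w0 n)(1 + beta y[n-1]). This form is an assumption taken from the general FBAM literature. It is not the closed-form continuous-time solution the paper presents, and the function name `fbam` and the parameter `beta` are illustrative.

```python
import math

def fbam(f0, fs, beta, num_samples):
    """First-order feedback AM: the oscillator's previous output sample
    modulates its own amplitude (assumed textbook recursion)."""
    w0 = 2.0 * math.pi * f0 / fs
    y_prev = 0.0  # the unit-sample delay state
    out = []
    for n in range(num_samples):
        y = math.cos(w0 * n) * (1.0 + beta * y_prev)
        out.append(y)
        y_prev = y
    return out

# Usage: beta = 0 gives a plain cosine; larger beta adds harmonics.
plain = fbam(440.0, 48000.0, 0.0, 256)
bright = fbam(440.0, 48000.0, 0.5, 256)
```

Because the feedback path uses `y_prev` rather than the current output, the result depends on the sampling rate, which is the dependency the closed-form solution in the abstract is said to reduce.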