Location: Zoom virtual meeting (5 pm Central European Time / Germany time zone)
Moderated by: Elena Shabalina - d&b audiotechnik GmbH
Speaker(s): Jonathan D. Ziegler - Institute for Visual Computing, University of Tübingen
Human interaction increasingly relies on telecommunication as an addition to or replacement for immediate contact. Remote participation in conferences, sporting events, or concerts is more common than ever, and with current global restrictions on in-person contact, this has become an inevitable part of many people’s reality. The work presented here aims to improve these encounters by enhancing the auditory experience. Augmenting fidelity and intelligibility can increase the perceived quality and enjoyment of such interactions and potentially raise acceptance for modern forms of remote experiences. Two approaches to automatic source localization and multichannel signal enhancement are investigated for applications ranging from small conferences to large arenas.
Three first-order microphones of fixed relative position and orientation are used to create a compact, reactive tracking and beamforming algorithm, capable of producing pristine audio signals in small and mid-sized acoustic environments. With inaudible beam steering and a highly linear frequency response, this system aims at providing an alternative to manually operated shotgun microphones or sets of individual spot microphones, applicable in broadcast, live events, and teleconferencing or for human-computer interaction.
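The talk does not disclose the internals of the tracking and beamforming algorithm, but the textbook baseline it improves upon is delay-and-sum beamforming: each channel is delayed so that sound arriving from the steered direction lines up across microphones, then the channels are summed. The sketch below is a minimal illustration of that baseline (not the presented system); the function name, integer-sample delays, and plane-wave assumption are simplifications chosen for clarity.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C

def delay_and_sum(signals, mic_positions, source_direction, fs):
    """Steer a simple delay-and-sum beam toward a source direction.

    signals          : (n_mics, n_samples) array of simultaneous recordings
    mic_positions    : (n_mics, 3) Cartesian positions in metres
    source_direction : unit vector pointing from the array toward the source
    fs               : sampling rate in Hz
    """
    # Under a far-field (plane-wave) assumption, the wavefront reaches each
    # microphone with a time offset given by its position projected onto the
    # source direction.
    delays_s = mic_positions @ source_direction / SPEED_OF_SOUND
    delays_smp = np.round((delays_s - delays_s.min()) * fs).astype(int)

    n_mics, n_samples = signals.shape
    out = np.zeros(n_samples)
    for ch, d in zip(signals, delays_smp):
        # Advance each channel so copies of the target signal align, then sum.
        out[: n_samples - d] += ch[d:] if d > 0 else ch
    return out / n_mics
```

Signals from the steered direction add coherently while sound from other directions adds with mismatched phases, yielding spatial selectivity; a practical system would use fractional (sub-sample) delays and frequency-dependent weighting.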
Multiple microphones with unknown spatial distribution are combined to create a large-aperture array using an end-to-end deep learning approach. This method combines state-of-the-art single-channel signal separation networks with adaptive, domain-specific channel alignment. The Neural Beamformer is capable of learning to extract detailed spatial relations of channels with respect to a learned signal type, such as speech, and to apply appropriate corrections in order to align the signals. This creates an adaptive beamformer for microphones spaced up to roughly 100 m apart.
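For context, the classical counterpart to the learned channel alignment described above is time-delay estimation via the cross-correlation peak (the basis of GCC-style TDOA methods). The sketch below shows that conventional approach only; it is not the Neural Beamformer from the talk, and the function names are illustrative.

```python
import numpy as np

def estimate_lag(reference, channel):
    """Estimate the integer-sample lag of `channel` relative to `reference`
    from the peak of their cross-correlation (classical TDOA estimation)."""
    corr = np.correlate(channel, reference, mode="full")
    # Index len(reference)-1 corresponds to zero lag.
    return int(np.argmax(corr)) - (len(reference) - 1)

def align_channels(signals):
    """Shift every channel so it lines up with the first (reference) channel."""
    ref = signals[0]
    aligned = [ref]
    for ch in signals[1:]:
        lag = estimate_lag(ref, ch)
        aligned.append(np.roll(ch, -lag))  # positive lag: channel arrives late
    return np.stack(aligned)
```

At apertures of tens of metres, plain cross-correlation degrades under reverberation, interfering sources, and channel mismatch, which motivates learning the alignment jointly with a signal model of the target source type.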
Jonathan D. Ziegler is a PhD student at the Wilhelm Schickard Institute for Visual Computing at the Eberhard Karls University in Tübingen, focusing on deep learning and audio signal processing. He received his degree in Physics from the Karlsruhe Institute of Technology. In the past five years he has completed two large research projects as part of the Institute for Applied Artificial Intelligence at the Stuttgart Media University, closely collaborating with industry leaders in microphone and console design. In 2020, he joined the console manufacturer Lawo as a machine learning engineer, working on model optimization for real-time applications. He has more than fourteen years of experience as a musician and producer, and ran a small recording studio for over ten years.
The colloquium will be held in English with a 30-minute presentation.
The Meeting Format: We will be hosting this meeting using Zoom. After registering, you will receive a confirmation email containing information about joining the webinar. Most participants will have audio and video muted during the meeting. The moderator will unmute participants in turn to ask a question during the Q&A period. This will be explained again at the beginning of the meeting. For better audio quality, we suggest using a headset with a microphone.
The presentation will be recorded. By unmuting your microphone you are consenting for your voice to be recorded. By turning on your camera you are consenting for your image to be recorded, which may also be used in a photograph of the event.
iCal Event: Add the iCal event to your office calendar client.
Other Business: There will be a Q&A session after the talk.
Posted: Wednesday, February 3, 2021