Informed Audio Source Separation

Gaël Richard

TELECOM ParisTech and CNRS, France

Audio source separation remains very challenging in many situations, especially in the underdetermined case, when there are fewer observations than sources. This is in particular the case of polyphonic music (i.e. multiple sources) recorded in stereophony (i.e. two channels). In order to improve audio source separation performance, many recent works have followed a so-called informed audio source separation approach, where the separation algorithm relies on some kind of additional information about the sources.
The goal of this keynote is to give a comprehensive review of three major trends in informed audio source separation, namely:
Auxiliary data-informed source separation, where the additional information can be for example a musical score corresponding to the musical source to be separated.
User-guided source separation, where the additional information is created by a user with the intention of improving the source separation, potentially in an iterative fashion. For example, this can be some indication of source activity in the time-frequency domain.
Coding-based informed source separation, where the additional information is created by an algorithm at a so-called encoding stage, where both the sources and the mixtures are assumed to be known. This trend is at the crossroads of source separation and compression, and shares many similarities with the recently introduced Spatial Audio Object Coding (SAOC).

Gaël Richard (SM'06) received the State Engineering degree from Télécom ParisTech, France (formerly ENST) in 1990, the Ph.D. degree in speech synthesis from LIMSI-CNRS, University of Paris-XI, in 1994, and the Habilitation à Diriger des Recherches degree from the University of Paris-XI in September 2001. After the Ph.D. degree, he spent two years at the CAIP Center, Rutgers University, Piscataway, NJ, in the Speech Processing Group of Prof. J. Flanagan, where he explored innovative approaches for speech production. From 1997 to 2001, he successively worked for Matra, Bois d’Arcy, France, and for Philips, Montrouge, France. In particular, he was the Project Manager of several large-scale European projects in the field of audio and multimodal signal processing. In September 2001, he joined the Department of Signal and Image Processing, Télécom ParisTech, where he is now a Full Professor in audio signal processing and Head of the Audio, Acoustics, and Waves research group. He is a coauthor of over 150 papers, an inventor on a number of patents, and one of the experts of the European Commission in the field of speech and audio signal processing. He was an Associate Editor of the IEEE Transactions on Audio, Speech and Language Processing between 1997 and 2011 and one of the guest editors of the special issue on "Music Signal Processing" of the IEEE Journal on Selected Topics in Signal Processing (2011) and of the special issue on "Informed acoustic source separation" of the EURASIP Journal on Advances in Signal Processing (2013). He is currently a member of the IEEE Audio and Acoustic Signal Processing Technical Committee, a member of EURASIP and the AES, and a senior member of the IEEE.

