Learning-Based Approaches to Speech Enhancement and Separation

    •  Le Roux, J., Vincent, E., Erdogan, H., "Learning-Based Approaches to Speech Enhancement and Separation," Tech. Rep. TR2016-113, Interspeech Tutorials, September 2016.
      BibTeX:
      @techreport{LeRoux2016sep,
        author = {Le Roux, Jonathan and Vincent, Emmanuel and Erdogan, Hakan},
        title = {Learning-Based Approaches to Speech Enhancement and Separation},
        booktitle = {Interspeech Tutorials},
        year = 2016,
        month = sep,
        url = {}
      }
  • Research Areas: Artificial Intelligence, Speech & Audio

Being able to isolate a target speech signal from background signals is of direct importance for telephony, hands-free communication, and audio surveillance. It is also critical as a pre-processing step in applications such as voice activity detection, automatic speaker identification, and, most importantly, automatic speech recognition (ASR) in challenging environments. While speech enhancement and separation methods originally did not rely on training, there has recently been an explosion in the use of machine-learning-based methods that exploit large amounts of training data. This tutorial will present a broad overview of these methods, analyzing the insights that can be gained from the pre-deep-learning era of graphical modeling and non-negative matrix factorization (NMF) approaches, then diving into an in-depth presentation of recent deep learning approaches encompassing single-channel methods, multi-channel methods, and new directions.


  • Related News & Events

    •  NEWS   MERL Speech & Audio researchers present two sold-out tutorials at Interspeech 2016
      Date: September 8, 2016
      Where: Interspeech 2016, San Francisco, CA
      MERL Contact: Jonathan Le Roux
      Research Area: Speech & Audio
      • MERL Speech and Audio Team researchers Shinji Watanabe and Jonathan Le Roux presented two tutorials on September 8 at the Interspeech 2016 conference, held in San Francisco, CA. Shinji collaborated with Marc Delcroix (NTT Communication Science Laboratories, Japan) to deliver a three-hour lecture on "Recent Advances in Distant Speech Recognition," drawing upon their experience organizing and participating in six different recent robust speech processing challenges. Jonathan teamed with Emmanuel Vincent (Inria, France) and Hakan Erdogan (Sabanci University, Microsoft Research) to give an in-depth tour of the latest advances in "Learning-Based Approaches to Speech Enhancement and Separation." This collaboration stemmed from extensive stays at MERL by both co-presenters: Emmanuel as a summer visitor, and Hakan as a visiting research scientist for over a year while on sabbatical.

        Both tutorials were sold out, each attracting more than 100 researchers and students in related fields, and received high praise from audience members.