TR2016-115

Recent Advances in Distant Speech Recognition

- Delcroix, M., Watanabe, S., "Recent Advances in Distant Speech Recognition," Tech. Rep. TR2016-115, Interspeech Tutorials, September 2016.
  @techreport{Delcroix2016sep,
    author = {Delcroix, Marc and Watanabe, Shinji},
    title = {{Recent Advances in Distant Speech Recognition}},
    booktitle = {Interspeech Tutorials},
    institution = {Interspeech},
    year = 2016,
    month = sep,
    url = {https://www.merl.com/publications/TR2016-115}
  }
Research Areas:

Artificial Intelligence, Speech & Audio

Abstract:

Automatic speech recognition (ASR) is increasingly being deployed successfully in products such as voice search applications for mobile devices. However, recognition remains challenging when the speaker is distant from the microphone because of noise, attenuation, and reverberation. Research on distant ASR has received increased attention and has progressed rapidly due to 1) the emergence of deep neural network (DNN) based ASR systems, 2) the launch of recent challenges such as the CHiME series, REVERB, ASpIRE, and DIRHA, and 3) the development of new products such as the Microsoft Kinect and the Amazon Echo. This tutorial will review recent progress in distant speech recognition in the DNN era, including single- and multi-channel speech enhancement front-ends and acoustic modeling techniques for robust back-ends. The tutorial will also introduce practical schemes for building distant ASR systems based on the expertise acquired from past challenges.

Related News & Events

NEWS MERL Speech & Audio researchers present two sold-out tutorials at Interspeech 2016
Date: September 8, 2016
Where: Interspeech 2016, San Francisco, CA
MERL Contact: Jonathan Le Roux
Research Area: Speech & Audio
Brief
- MERL Speech and Audio Team researchers Shinji Watanabe and Jonathan Le Roux presented two tutorials on September 8 at the Interspeech 2016 conference, held in San Francisco, CA. Shinji collaborated with Marc Delcroix (NTT Communication Science Laboratories, Japan) to deliver a three-hour lecture on "Recent Advances in Distant Speech Recognition", drawing upon their experience organizing and participating in six different recent robust speech processing challenges. Jonathan teamed with Emmanuel Vincent (Inria, France) and Hakan Erdogan (Sabanci University, Microsoft Research) to give an in-depth tour of the latest advances in "Learning-based Approaches to Speech Enhancement And Separation". This collaboration stemmed from extended stays at MERL by both co-presenters: Emmanuel as a summer visitor, and Hakan as a MERL visiting research scientist for over a year while on sabbatical.
  
  Both tutorials were sold out, each attracting more than 100 researchers and students in related fields, and received high praise from audience members.
