TR2013-044

Discriminative Methods for Noise Robust Speech Recognition: A CHiME Challenge Benchmark


    •  Tachioka, Y., Watanabe, S., Le Roux, J., Hershey, J.R., "Discriminative Methods for Noise Robust Speech Recognition: A CHiME Challenge Benchmark", International Workshop on Machine Listening in Multisource Environments (CHiME), June 2013.
      @inproceedings{Tachioka2013jun,
        author = {Tachioka, Y. and Watanabe, S. and {Le Roux}, J. and Hershey, J.R.},
        title = {Discriminative Methods for Noise Robust Speech Recognition: A CHiME Challenge Benchmark},
        booktitle = {International Workshop on Machine Listening in Multisource Environments (CHiME)},
        year = 2013,
        month = jun,
        url = {https://www.merl.com/publications/TR2013-044}
      }
  • Research Areas: Artificial Intelligence, Speech & Audio

The recently introduced second CHiME challenge is a difficult two-microphone speech recognition task with non-stationary interference. Current approaches in the source-separation community have focused on the front-end problem of estimating the clean signal given the noisy signals. Here we pursue a different approach, focusing on state-of-the-art ASR techniques such as discriminative training and various feature transformations, in addition to simple noise suppression methods based on prior-based binary masking with estimated angle of arrival. We also propose an augmented discriminative feature transformation that can introduce arbitrary features into a discriminative feature transform, and an efficient method for combining Discriminative Language Modeling (DLM) and Minimum Bayes Risk (MBR) decoding in an ASR post-processing stage. Finally, we present a preliminary investigation of the effectiveness of deep neural networks for reverberant and noisy speech recognition. Using these techniques, we present a benchmark on the medium-vocabulary subtask of the CHiME challenge and show their effectiveness for this task. Promising results were also obtained for the proposed augmented feature transformation and for the combination of DLM and MBR decoding. Part of the training code has been released as an advanced ASR baseline, using the Kaldi speech recognition toolkit.
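
As a rough illustration of the MBR decoding idea mentioned above, the following is a minimal sketch of MBR selection over an N-best list, with word-level edit distance as the loss. It is not the paper's method or the released Kaldi code: the function names (`edit_distance`, `mbr_decode`), the `scale` parameter, and the N-best representation as (tokens, log-score) pairs are all assumptions made for this example, and the combination with DLM is not shown.

```python
import math


def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mbr_decode(nbest, scale=1.0):
    """Pick the hypothesis with the lowest expected word error.

    nbest: list of (tokens, log_score) pairs (hypothetical format).
    scale: assumed score-scaling factor applied before normalization.
    """
    # Turn log scores into posteriors over the N-best list (stable softmax).
    max_log = max(score for _, score in nbest)
    weights = [math.exp(scale * (score - max_log)) for _, score in nbest]
    total = sum(weights)
    posteriors = [w / total for w in weights]

    best, best_risk = None, float("inf")
    for hyp, _ in nbest:
        # Expected edit distance of this hypothesis against all others.
        risk = sum(p * edit_distance(hyp, ref)
                   for p, (ref, _) in zip(posteriors, nbest))
        if risk < best_risk:
            best, best_risk = hyp, risk
    return best
```

For instance, with the three hypotheses "a b c" (log score -1.0), "a b d" (-1.2) and "a x d" (-1.2), the highest-scoring hypothesis is "a b c", but `mbr_decode` selects "a b d", because it lies closest on average to the rest of the list.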

 

  • Related News & Events

    •  AWARD   CHiME 2012 Speech Separation and Recognition Challenge Best Performance
      Date: June 1, 2013
      Awarded to: Yuuki Tachioka, Shinji Watanabe, Jonathan Le Roux and John R. Hershey
      Awarded for: "Discriminative Methods for Noise Robust Speech Recognition: A CHiME Challenge Benchmark"
      Awarded by: International Workshop on Machine Listening in Multisource Environments (CHiME)
      MERL Contact: Jonathan Le Roux
      Research Area: Speech & Audio
      Brief
      • The results of the 2nd 'CHiME' Speech Separation and Recognition Challenge are out! The team formed by MELCO researcher Yuuki Tachioka and MERL Speech & Audio team researchers Shinji Watanabe, Jonathan Le Roux and John Hershey obtained the best results in the continuous speech recognition task (Track 2). This very challenging task consisted of recognizing speech corrupted by highly non-stationary noises recorded in a real living room. Our proposal, which also included a simple yet extremely efficient denoising front-end, focused on investigating and developing state-of-the-art automatic speech recognition back-end techniques: feature transformation methods, as well as discriminative training methods for acoustic and language modeling. Our system significantly outperformed those of the other participants. Our code has since been released as an improved baseline for the community to use.
    •  NEWS   International Workshop on Machine Listening in Multisource Environments (CHiME) 2013: publication by Jonathan Le Roux, John R. Hershey, Shinji Watanabe and others
      Date: June 1, 2013
      Where: International Workshop on Machine Listening in Multisource Environments (CHiME)
      MERL Contact: Jonathan Le Roux
      Research Area: Speech & Audio
      Brief
      • The paper "Discriminative Methods for Noise Robust Speech Recognition: A CHiME Challenge Benchmark" by Tachioka, Y., Watanabe, S., Le Roux, J. and Hershey, J.R. was presented at the International Workshop on Machine Listening in Multisource Environments (CHiME).