TR2016-011

Minimum Word Error Training of Long Short-Term Memory Recurrent Neural Network Language Models for Speech Recognition


    •  Hori, T.; Hori, C.; Watanabe, S.; Hershey, J.R., "Minimum Word Error Training of Long Short-Term Memory Recurrent Neural Network Language Models for Speech Recognition", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), DOI: 10.1109/ICASSP.2016.7472827, March 2016, pp. 5990-5994.
      BibTeX:

      @inproceedings{Hori2016mar,
        author = {Hori, T. and Hori, C. and Watanabe, S. and Hershey, J.R.},
        title = {Minimum Word Error Training of Long Short-Term Memory Recurrent Neural Network Language Models for Speech Recognition},
        booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
        year = 2016,
        pages = {5990--5994},
        month = mar,
        doi = {10.1109/ICASSP.2016.7472827},
        url = {http://www.merl.com/publications/TR2016-011}
      }
Research Area: Speech & Audio

This paper describes minimum word error (MWE) training of recurrent neural network language models (RNNLMs) for speech recognition. RNNLMs are usually trained to minimize the cross entropy of estimated word probabilities against the correct word sequence, which corresponds to a maximum likelihood criterion. However, this training does not necessarily maximize the performance measure of the target task; i.e., it does not explicitly minimize the word error rate (WER) in speech recognition. To address this problem, several discriminative training methods have been proposed for n-gram language models, but such methods for RNNLMs have not been sufficiently investigated. In this paper, we propose an MWE training method for RNNLMs and report significant WER reductions when applying the MWE method to a standard Elman-type RNNLM and to a more advanced model, a Long Short-Term Memory (LSTM) RNNLM. We also present efficient MWE training with N-best lists on Graphics Processing Units (GPUs).
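For context, the following is a minimal sketch of the standard expected-word-error (minimum Bayes risk) objective typically used for N-best-list training of this kind; the exact score combination, scaling, and smoothing used in the paper are not reproduced here. Given an N-best list N for an observation O, the objective is the expected edit distance under hypothesis posteriors formed from combined acoustic and language model scores:

\mathcal{L}(\Lambda) = \sum_{W \in \mathcal{N}} P_\Lambda(W \mid O)\, E(W, W_{\mathrm{ref}}),
\qquad
P_\Lambda(W \mid O) \propto \exp\!\big(\alpha \log p_{\mathrm{ac}}(O \mid W) + \log p_\Lambda(W)\big),

where E(W, W_ref) is the word-level edit distance to the reference, p_Λ(W) is the RNNLM probability of hypothesis W, and α scales the acoustic score. Differentiating with respect to each hypothesis's LM log-probability gives the weight P_Λ(W|O)(E(W, W_ref) − Ē), with Ē the expected error, which is why the computation batches naturally over N-best lists on a GPU. The sketch below (not the authors' code; all names are illustrative) computes this loss and its gradient weights for one utterance with NumPy:

import numpy as np

def mwe_loss_and_weights(ac_logprob, lm_logprob, errors, alpha=0.1):
    """Expected word error over one N-best list, plus the gradient
    weights d(loss)/d(lm_logprob) to backpropagate into the RNNLM."""
    score = alpha * ac_logprob + lm_logprob   # combined log score per hypothesis
    post = np.exp(score - score.max())
    post /= post.sum()                        # hypothesis posteriors P(W|O)
    expected_err = np.dot(post, errors)       # the MWE objective for this utterance
    weights = post * (errors - expected_err)  # gradient w.r.t. each LM log-prob
    return expected_err, weights

# Example: a 4-best list with per-hypothesis acoustic scores,
# RNNLM log-probabilities, and word edit distances to the reference.
ac = np.array([-120.0, -121.5, -119.8, -123.0])
lm = np.array([-35.2, -33.9, -36.1, -34.5])
err = np.array([2.0, 3.0, 1.0, 4.0])
loss, w = mwe_loss_and_weights(ac, lm, err)

Note that the gradient weight of a hypothesis is positive when its error exceeds the current expected error and negative otherwise, so training pushes the RNNLM to lower the scores of error-prone hypotheses relative to better ones.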