TR2016-173

Automated Structure Discovery and Parameter Tuning of Neural Network Language Model Based on Evolution Strategy

- Takano, T., Moriya, T., Shinozaki, T., Watanabe, S., Hori, T., Duh, K., "Automated Structure Discovery and Parameter Tuning of Neural Network Language Model Based on Evolution Strategy", IEEE Spoken Language Technology Workshop (SLT), DOI: 10.1109/SLT.2016.7846334, December 2016.
  BibTeX:
  @inproceedings{Takano2016dec,
    author = {Takano, Tomihiro and Moriya, Takafumi and Shinozaki, Takahiro and Watanabe, Shinji and Hori, Takaaki and Duh, Kevin},
    title = {{Automated Structure Discovery and Parameter Tuning of Neural Network Language Model Based on Evolution Strategy}},
    booktitle = {IEEE Spoken Language Technology Workshop (SLT)},
    year = 2016,
    month = dec,
    doi = {10.1109/SLT.2016.7846334},
    url = {https://www.merl.com/publications/TR2016-173}
  }
Research Areas:

Artificial Intelligence, Speech & Audio

Abstract:

Language models based on long short-term memory (LSTM) recurrent neural networks are known to improve speech recognition performance. However, significant effort is required to optimize network structures and training configurations. In this study, we automate the development process using evolutionary algorithms. In particular, we apply the covariance matrix adaptation evolution strategy (CMA-ES), which has demonstrated robustness in other black-box hyper-parameter optimization problems. By flexibly allowing optimization of various meta-parameters, including layer-wise unit types, our method automatically finds a configuration that gives improved recognition performance. Further, by using a Pareto-based multiobjective CMA-ES, both word error rate (WER) and computational cost were reduced jointly: after 10 generations, the relative reductions in WER and in decoding time were 4.1% and 22.7%, respectively, compared to an initial baseline system whose WER was 8.7%.
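
The Pareto-based selection mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not perform CMA-ES sampling or covariance updates; it only shows how non-dominated candidates might be picked when both WER and decoding time are minimized. The population values, the normalized decoding-time unit, and the helper names (`dominates`, `pareto_front`, `apply_relative_reduction`) are assumptions made for illustration. The last helper applies a relative reduction to a baseline, which suggests that a 4.1% relative reduction from an 8.7% WER corresponds to roughly 8.34% absolute WER; that figure is derived arithmetic, not a number stated in the abstract.

```python
from typing import List, Tuple

# (WER in %, decoding time in normalized units); both are minimized.
# The unit for decoding time is a hypothetical choice for this sketch.
Candidate = Tuple[float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands if o != c)]


def apply_relative_reduction(baseline: float, reduction_pct: float) -> float:
    """Value obtained after reducing `baseline` by `reduction_pct` percent."""
    return baseline * (1.0 - reduction_pct / 100.0)


# Hypothetical population of evaluated configurations (not data from the paper).
population = [(8.7, 1.00), (8.5, 1.10), (8.4, 0.80), (8.9, 0.70), (8.6, 0.95)]
front = pareto_front(population)

# WER implied by the abstract's figures: 8.7% baseline, 4.1% relative reduction.
implied_wer = apply_relative_reduction(8.7, 4.1)

print(front)
print(round(implied_wer, 2))
```

In a multiobjective evolution strategy, a front like this would typically decide which candidates survive into the next generation, so that neither objective is improved purely at the expense of the other.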
