Takaaki Hori

  • Biography

    Before joining MERL in 2015, Takaaki spent 15 years doing research on speech and language technology at Nippon Telegraph and Telephone (NTT) in Japan. His work includes studies on speech recognition algorithms using weighted finite-state transducers (WFSTs), efficient search algorithms for spoken document retrieval, spoken language understanding, and automatic meeting analysis.

  • Awards

    •  AWARD   MERL's Speech Team Achieves World's 2nd Best Performance at the Third CHiME Speech Separation and Recognition Challenge
      Date: December 15, 2015
      Awarded to: John R. Hershey, Takaaki Hori, Jonathan Le Roux and Shinji Watanabe
      MERL Contacts: Takaaki Hori; Jonathan Le Roux
      Research Area: Speech & Audio
      Brief
      • The results of the third 'CHiME' Speech Separation and Recognition Challenge were publicly announced on December 15 at the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2015) held in Scottsdale, Arizona, USA. MERL's Speech and Audio Team, in collaboration with SRI, ranked 2nd out of 26 teams from Europe, Asia and the US. The task this year was to recognize speech recorded using a tablet in real environments such as cafes, buses, or busy streets. Due to the high levels of noise and the distance from the speaker's mouth to the microphones, this is a very challenging task, on which the baseline system achieved only a 33.4% word error rate. The MERL/SRI system featured state-of-the-art techniques including a multi-channel front-end, noise-robust feature extraction, and deep learning for speech enhancement, acoustic modeling, and language modeling, leading to a dramatic 73% relative reduction in word error rate, down to 9.1%. The core of the system has since been released as a new official challenge baseline for the community to use.
  • MERL Publications

    •  Watanabe, S., Boyer, F., Chang, X., Guo, P., Hayashi, T., Higuchi, Y., Hori, T., Huang, W.-C., Inaguma, H., Kamo, N., Karita, S., Li, C., Shi, J., Subramanian, A.S., Zhang, W., "The 2020 ESPnet Update: New Features, Broadened Applications, Performance Improvements, and Future Plans", arXiv, January 2021.
      BibTeX
      • @article{Watanabe2021jan,
      • author = {Watanabe, Shinji and Boyer, Florian and Chang, Xuankai and Guo, Pengcheng and Hayashi, Tomoki and Higuchi, Yosuke and Hori, Takaaki and Huang, Wen-Chin and Inaguma, Hirofumi and Kamo, Naoyuki and Karita, Shigeki and Li, Chenda and Shi, Jing and Subramanian, Aswin S and Zhang, Wangyou},
      • title = {The 2020 ESPnet Update: New Features, Broadened Applications, Performance Improvements, and Future Plans},
      • journal = {arXiv},
      • year = 2021,
      • month = jan
      • }
    •  Khurana, S., Moritz, N., Hori, T., Le Roux, J., "Unsupervised Domain Adaptation for Speech Recognition Via Uncertainty Driven Self-Training", arXiv, December 2020.
      BibTeX arXiv
      • @article{Khurana2020dec,
      • author = {Khurana, Sameer and Moritz, Niko and Hori, Takaaki and Le Roux, Jonathan},
      • title = {Unsupervised Domain Adaptation for Speech Recognition Via Uncertainty Driven Self-Training},
      • journal = {arXiv},
      • year = 2020,
      • month = dec,
      • url = {https://arxiv.org/abs/2011.13439}
      • }
    •  Moritz, N., Hori, T., Le Roux, J., "Semi-Supervised Speech Recognition Via Graph-Based Temporal Classification", arXiv, October 2020.
      BibTeX arXiv
      • @article{Moritz2020oct2,
      • author = {Moritz, Niko and Hori, Takaaki and Le Roux, Jonathan},
      • title = {Semi-Supervised Speech Recognition Via Graph-Based Temporal Classification},
      • journal = {arXiv},
      • year = 2020,
      • month = oct,
      • url = {https://arxiv.org/abs/2010.15653}
      • }
    •  Hori, T., Moritz, N., Hori, C., Le Roux, J., "Transformer-based Long-context End-to-end Speech Recognition", Annual Conference of the International Speech Communication Association (Interspeech), DOI: 10.21437/Interspeech.2020-2928, October 2020, pp. 5011-5015.
      BibTeX TR2020-139 PDF
      • @inproceedings{Hori2020oct,
      • author = {Hori, Takaaki and Moritz, Niko and Hori, Chiori and Le Roux, Jonathan},
      • title = {Transformer-based Long-context End-to-end Speech Recognition},
      • booktitle = {Annual Conference of the International Speech Communication Association (Interspeech)},
      • year = 2020,
      • pages = {5011--5015},
      • month = oct,
      • doi = {10.21437/Interspeech.2020-2928},
      • issn = {1990-9772},
      • url = {https://www.merl.com/publications/TR2020-139}
      • }
    •  Moritz, N., Wichern, G., Hori, T., Le Roux, J., "All-in-One Transformer: Unifying Speech Recognition, Audio Tagging, and Event Detection", Annual Conference of the International Speech Communication Association (Interspeech), DOI: 10.21437/Interspeech.2020-2757, October 2020, pp. 3112-3116.
      BibTeX TR2020-138 PDF
      • @inproceedings{Moritz2020oct,
      • author = {Moritz, Niko and Wichern, Gordon and Hori, Takaaki and Le Roux, Jonathan},
      • title = {All-in-One Transformer: Unifying Speech Recognition, Audio Tagging, and Event Detection},
      • booktitle = {Annual Conference of the International Speech Communication Association (Interspeech)},
      • year = 2020,
      • pages = {3112--3116},
      • month = oct,
      • doi = {10.21437/Interspeech.2020-2757},
      • issn = {1990-9772},
      • url = {https://www.merl.com/publications/TR2020-138}
      • }
  • MERL Issued Patents

    • Title: "Methods and Systems for Recognizing Simultaneous Speech by Multiple Speakers"
      Inventors: Le Roux, Jonathan; Hori, Takaaki; Settle, Shane; Seki, Hiroshi; Watanabe, Shinji; Hershey, John R.
      Patent No.: 10,811,000
      Issue Date: Oct 20, 2020
    • Title: "Method and Apparatus for Open-Vocabulary End-to-End Speech Recognition"
      Inventors: Hori, Takaaki; Watanabe, Shinji; Hershey, John R.
      Patent No.: 10,672,388
      Issue Date: Jun 2, 2020
    • Title: "Method and Apparatus for Multi-Lingual End-to-End Speech Recognition"
      Inventors: Watanabe, Shinji; Hori, Takaaki; Seki, Hiroshi; Le Roux, Jonathan; Hershey, John R.
      Patent No.: 10,593,321
      Issue Date: Mar 17, 2020
    • Title: "Method and System for Multi-Modal Fusion Model"
      Inventors: Hori, Chiori; Hori, Takaaki; Hershey, John R.; Marks, Tim
      Patent No.: 10,417,498
      Issue Date: Sep 17, 2019
    • Title: "Method and System for Training Language Models to Reduce Recognition Errors"
      Inventors: Hori, Takaaki; Hori, Chiori; Watanabe, Shinji; Hershey, John R.
      Patent No.: 10,176,799
      Issue Date: Jan 8, 2019
    • Title: "Method and System for Role Dependent Context Sensitive Spoken and Textual Language Understanding with Neural Networks"
      Inventors: Hori, Chiori; Hori, Takaaki; Watanabe, Shinji; Hershey, John R.
      Patent No.: 9,842,106
      Issue Date: Dec 12, 2017