Takaaki Hori

  • Biography

    Before joining MERL in 2015, Takaaki spent 15 years doing research on speech and language technology at Nippon Telegraph and Telephone (NTT) in Japan. His work includes studies on speech recognition algorithms using weighted finite-state transducers (WFSTs), efficient search algorithms for spoken document retrieval, spoken language understanding, and automatic meeting analysis.

  • Awards

    •  AWARD   MERL's Speech Team Achieves World's 2nd Best Performance at the Third CHiME Speech Separation and Recognition Challenge
      Date: December 15, 2015
      Awarded to: John R. Hershey, Takaaki Hori, Jonathan Le Roux and Shinji Watanabe
      MERL Contacts: Takaaki Hori; Jonathan Le Roux
      Research Area: Speech & Audio
      Brief
      • The results of the third 'CHiME' Speech Separation and Recognition Challenge were publicly announced on December 15 at the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2015) held in Scottsdale, Arizona, USA. MERL's Speech and Audio Team, in collaboration with SRI, ranked 2nd out of 26 teams from Europe, Asia and the US. The task this year was to recognize speech recorded using a tablet in real environments such as cafes, buses, or busy streets. Due to the high levels of noise and the distance from the speaker's mouth to the microphones, this is a very challenging task, on which the baseline system achieved only a 33.4% word error rate. The MERL/SRI system featured state-of-the-art techniques including a multi-channel front-end, noise-robust feature extraction, and deep learning for speech enhancement, acoustic modeling, and language modeling, leading to a dramatic 73% relative reduction in word error rate, down to 9.1%. The core of the system has since been released as a new official challenge baseline for the community to use.
  • MERL Publications

    •  Moritz, N., Hori, T., Le Roux, J., "Streaming Automatic Speech Recognition With The Transformer Model", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), DOI: 10.1109/ICASSP40776.2020.9054476, April 2020, pp. 6074-6078.
      TR2020-040
      @inproceedings{Moritz2020apr,
        author = {Moritz, Niko and Hori, Takaaki and Le Roux, Jonathan},
        title = {Streaming Automatic Speech Recognition With The Transformer Model},
        booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
        year = 2020,
        pages = {6074--6078},
        month = apr,
        publisher = {IEEE},
        doi = {10.1109/ICASSP40776.2020.9054476},
        issn = {2379-190X},
        isbn = {978-1-5090-6631-5},
        url = {https://www.merl.com/publications/TR2020-040}
      }
    •  Sari, L., Moritz, N., Hori, T., Le Roux, J., "Unsupervised Speaker Adaptation Using Attention-Based Speaker Memory For End-To-End ASR", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), DOI: 10.1109/ICASSP40776.2020.9054249, April 2020, pp. 7384-7388.
      TR2020-037
      @inproceedings{Sari2020apr,
        author = {Sari, Leda and Moritz, Niko and Hori, Takaaki and Le Roux, Jonathan},
        title = {Unsupervised Speaker Adaptation Using Attention-Based Speaker Memory For End-To-End ASR},
        booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
        year = 2020,
        pages = {7384--7388},
        month = apr,
        publisher = {IEEE},
        doi = {10.1109/ICASSP40776.2020.9054249},
        issn = {2379-190X},
        isbn = {978-1-5090-6631-5},
        url = {https://www.merl.com/publications/TR2020-037}
      }
    •  Li, R., Wang, X., Mallidi, H., Watanabe, S., Hori, T., Hermansky, H., "Multi-Stream End-to-End Speech Recognition", IEEE/ACM Transactions on Audio, Speech and Language Processing, DOI: 10.1109/TASLP.2019.2959721, Vol. 28, pp. 646-655, March 2020.
      TR2020-030
      @article{Li2020mar,
        author = {Li, Ruizhi and Wang, Xiaofei and Mallidi, Harish and Watanabe, Shinji and Hori, Takaaki and Hermansky, Hynek},
        title = {Multi-Stream End-to-End Speech Recognition},
        journal = {IEEE/ACM Transactions on Audio, Speech and Language Processing},
        year = 2020,
        volume = 28,
        pages = {646--655},
        month = mar,
        doi = {10.1109/TASLP.2019.2959721},
        url = {https://www.merl.com/publications/TR2020-030}
      }
    •  Karita, S., Chen, N., Hayashi, T., Hori, T., Inaguma, H., Jiang, Z., Someki, M., Enrique Yalta Soplin, N., Yamamoto, R., Wang, X., Watanabe, S., Yoshimura, T., Zhang, W., "A Comparative Study on Transformer Vs RNN in Speech Applications", IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), DOI: 10.1109/ASRU46091.2019.9003750, December 2019, pp. 449-456.
      TR2019-158
      @inproceedings{Karita2019dec,
        author = {Karita, Shigeki and Chen, Nanxin and Hayashi, Tomoki and Hori, Takaaki and Inaguma, Hirofumi and Jiang, Ziyan and Someki, Masao and Enrique Yalta Soplin, Nelson and Yamamoto, Ryuichi and Wang, Xiaofei and Watanabe, Shinji and Yoshimura, Takenori and Zhang, Wangyou},
        title = {A Comparative Study on Transformer Vs RNN in Speech Applications},
        booktitle = {IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)},
        year = 2019,
        pages = {449--456},
        month = dec,
        doi = {10.1109/ASRU46091.2019.9003750},
        url = {https://www.merl.com/publications/TR2019-158}
      }
    •  Moritz, N., Hori, T., Le Roux, J., "Streaming End-to-End Speech Recognition with Joint CTC-Attention Based Models", IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), December 2019, pp. 936-943.
      TR2019-159
      @inproceedings{Moritz2019dec,
        author = {Moritz, Niko and Hori, Takaaki and Le Roux, Jonathan},
        title = {Streaming End-to-End Speech Recognition with Joint CTC-Attention Based Models},
        booktitle = {IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)},
        year = 2019,
        pages = {936--943},
        month = dec,
        isbn = {978-1-7281-0305-1},
        url = {https://www.merl.com/publications/TR2019-159}
      }
  • MERL Issued Patents

    • Title: "Method and Apparatus for Open-Vocabulary End-to-End Speech Recognition"
      Inventors: Hori, Takaaki; Watanabe, Shinji; Hershey, John R.
      Patent No.: 10,672,388
      Issue Date: Jun 2, 2020
    • Title: "Method and Apparatus for Multi-Lingual End-to-End Speech Recognition"
      Inventors: Watanabe, Shinji; Hori, Takaaki; Seki, Hiroshi; Le Roux, Jonathan; Hershey, John R.
      Patent No.: 10,593,321
      Issue Date: Mar 17, 2020
    • Title: "Method and System for Multi-Modal Fusion Model"
      Inventors: Hori, Chiori; Hori, Takaaki; Hershey, John R.; Marks, Tim
      Patent No.: 10,417,498
      Issue Date: Sep 17, 2019
    • Title: "Method and System for Training Language Models to Reduce Recognition Errors"
      Inventors: Hori, Takaaki; Hori, Chiori; Watanabe, Shinji; Hershey, John R.
      Patent No.: 10,176,799
      Issue Date: Jan 8, 2019
    • Title: "Method and System for Role Dependent Context Sensitive Spoken and Textual Language Understanding with Neural Networks"
      Inventors: Hori, Chiori; Hori, Takaaki; Watanabe, Shinji; Hershey, John R.
      Patent No.: 9,842,106
      Issue Date: Dec 12, 2017