Publications

  •  Cho, J., Baskar, M.K., Li, R., Wiesner, M., Mallidi, S.H., Yalta, N., Karafiat, M., Watanabe, S., Hori, T., "Multilingual Sequence-to-Sequence Speech Recognition: Architecture, Transfer Learning, and Language Modeling", IEEE Spoken Language Technology Workshop, December 2018.
  •  Hayashi, T., Watanabe, S., Zhang, Y., Toda, T., Hori, T., Astudillo, R., Takeda, K., "Back-Translation-Style Data Augmentation for End-to-End ASR", IEEE Spoken Language Technology Workshop, December 2018.
  •  Hori, T., Cho, J., Watanabe, S., "End-to-End Speech Recognition with Word-Based RNN Language Models", IEEE Spoken Language Technology Workshop, December 2018.
  •  Hori, T., Wang, W., Koji, Y., Hori, C., Harsham, B.A., Hershey, J., "Adversarial Training and Decoding Strategies for End-to-end Neural Conversation Models", Computer Speech and Language, December 2018.
  •  Wang, Z.-Q., Le Roux, J., Wang, D., Hershey, J., "End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction", Interspeech, September 2018.
  •  Watanabe, S., Hori, T., Karita, S., Hayashi, T., Nishitoba, J., Unno, Y., Enrique Yalta Soplin, N., Heymann, J., Wiesner, M., Chen, N., Renduchintala, A., Ochiai, T., "ESPnet: End-to-End Speech Processing Toolkit", Interspeech, September 2018.
  •  Wichern, G., Le Roux, J., "Phase Reconstruction with Learned Time-Frequency Representations for Single-Channel Speech Separation", International Workshop on Acoustic Signal Enhancement (IWAENC), September 2018.
  •  Wang, Z.-Q., Le Roux, J., Wang, D., Hershey, J., "End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction", Tech. Rep. TR2018-133, Mitsubishi Electric Research Laboratories, Cambridge, MA, September 2018.
    BibTeX:
    @techreport{MERL_TR2018-133,
      author = {Wang, Z.-Q. and Le Roux, J. and Wang, D. and Hershey, J.},
      title = {End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction},
      institution = {MERL - Mitsubishi Electric Research Laboratories},
      address = {Cambridge, MA 02139},
      number = {TR2018-133},
      month = sep,
      year = 2018,
      url = {http://www.merl.com/publications/TR2018-133/}
    }
  •  Seki, H., Hori, T., Watanabe, S., Le Roux, J., Hershey, J., "A Purely End-to-end System for Multi-speaker Speech Recognition", Annual Meeting of the Association for Computational Linguistics (ACL), July 16, 2018.
  •  Hori, C., Alamri, H., Wang, J., Wichern, G., Hori, T., Cherian, A., Marks, T.K., Cartillier, V., Lopes, R., Das, A., Essa, I., Batra, D., Parikh, D., "End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features", arXiv, July 13, 2018.
    BibTeX:
    @techreport{MERL_TR2018-085,
      author = {Hori, C. and Alamri, H. and Wang, J. and Wichern, G. and Hori, T. and Cherian, A. and Marks, T.K. and Cartillier, V. and Lopes, R. and Das, A. and Essa, I. and Batra, D. and Parikh, D.},
      title = {End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features},
      institution = {MERL - Mitsubishi Electric Research Laboratories},
      address = {Cambridge, MA 02139},
      number = {TR2018-085},
      month = jul,
      year = 2018,
      url = {http://www.merl.com/publications/TR2018-085/}
    }
  •  Alamri, H., Cartillier, V., Lopes, R., Das, A., Wang, J., Essa, I., Batra, D., Parikh, D., Cherian, A., Marks, T.K., Hori, C., "Audio Visual Scene-Aware Dialog (AVSD) Challenge at DSTC7", arXiv, July 12, 2018.
    BibTeX:
    @techreport{MERL_TR2018-069,
      author = {Alamri, H. and Cartillier, V. and Lopes, R. and Das, A. and Wang, J. and Essa, I. and Batra, D. and Parikh, D. and Cherian, A. and Marks, T.K. and Hori, C.},
      title = {Audio Visual Scene-Aware Dialog (AVSD) Challenge at DSTC7},
      institution = {MERL - Mitsubishi Electric Research Laboratories},
      address = {Cambridge, MA 02139},
      number = {TR2018-069},
      month = jul,
      year = 2018,
      url = {http://www.merl.com/publications/TR2018-069/}
    }
  •  Seki, H., Hori, T., Watanabe, S., Le Roux, J., Hershey, J., "A Purely End-to-end System for Multi-speaker Speech Recognition", arXiv, July 10, 2018.
    BibTeX:
    @techreport{MERL_TR2018-058,
      author = {Seki, H. and Hori, T. and Watanabe, S. and Le Roux, J. and Hershey, J.},
      title = {A Purely End-to-end System for Multi-speaker Speech Recognition},
      institution = {MERL - Mitsubishi Electric Research Laboratories},
      address = {Cambridge, MA 02139},
      number = {TR2018-058},
      month = jul,
      year = 2018,
      url = {http://www.merl.com/publications/TR2018-058/}
    }
  •  Wang, Z.-Q., Le Roux, J., Hershey, J., "End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction", arXiv, July 9, 2018.
    BibTeX:
    @techreport{MERL_TR2018-051,
      author = {Wang, Z.-Q. and Le Roux, J. and Hershey, J.},
      title = {End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction},
      institution = {MERL - Mitsubishi Electric Research Laboratories},
      address = {Cambridge, MA 02139},
      number = {TR2018-051},
      month = jul,
      year = 2018,
      url = {http://www.merl.com/publications/TR2018-051/}
    }
  •  Ochiai, T., Watanabe, S., Katagiri, S., Hori, T., Hershey, J.R., "Speaker Adaptation for Multichannel End-to-End Speech Recognition", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), April 2018.
  •  Seki, H., Watanabe, S., Hori, T., Le Roux, J., Hershey, J.R., "An End-to-End Language-Tracking Speech Recognizer for Mixed-Language Speech", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), April 2018.
  •  Settle, S., Le Roux, J., Hori, T., Watanabe, S., Hershey, J.R., "End-to-End Multi-Speaker Speech Recognition", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), April 2018.
  •  Wang, Z.-Q., Le Roux, J., Hershey, J.R., "Multi-Channel Deep Clustering: Discriminative Spectral and Spatial Embeddings for Speaker-Independent Speech Separation", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), April 2018.
  •  Wang, Z.-Q., Le Roux, J., Hershey, J.R., "Alternative Objective Functions for Deep Clustering", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), April 2018.
  •  Watanabe, S., Hori, T., Karita, S., Hayashi, T., Nishitoba, J., Unno, Y., Enrique Yalta Soplin, N., Heymann, J., Wiesner, M., Chen, N., Renduchintala, A., Ochiai, T., "ESPnet: End-to-End Speech Processing Toolkit", Tech. Rep. TR2018-036, arXiv, March 2018.
    BibTeX:
    @techreport{MERL_TR2018-036,
      author = {Watanabe, S. and Hori, T. and Karita, S. and Hayashi, T. and Nishitoba, J. and Unno, Y. and Enrique Yalta Soplin, N. and Heymann, J. and Wiesner, M. and Chen, N. and Renduchintala, A. and Ochiai, T.},
      title = {ESPnet: End-to-End Speech Processing Toolkit},
      institution = {MERL - Mitsubishi Electric Research Laboratories},
      address = {Cambridge, MA 02139},
      number = {TR2018-036},
      month = mar,
      year = 2018,
      url = {http://www.merl.com/publications/TR2018-036/}
    }
  •  Hori, C., Hori, T., "End-to-end Conversation Modeling Track in DSTC6", Dialog System Technology Challenges, December 2017.