TR2018-069

Audio Visual Scene-Aware Dialog (AVSD) Challenge at DSTC7


    •  Alamri, H., Cartillier, V., Lopes, R., Das, A., Wang, J., Essa, I., Batra, D., Parikh, D., Cherian, A., Marks, T.K., Hori, C., "Audio Visual Scene-Aware Dialog (AVSD) Challenge at DSTC7", arXiv, July 12, 2018.
      @article{Alamri2018jul,
        author = {Alamri, Huda and Cartillier, Vincent and Lopes, Raphael and Das, Abhishek and Wang, Jue and Essa, Irfan and Batra, Dhruv and Parikh, Devi and Cherian, Anoop and Marks, Tim K. and Hori, Chiori},
        title = {Audio Visual Scene-Aware Dialog (AVSD) Challenge at DSTC7},
        journal = {arXiv},
        year = 2018,
        month = jul,
        url = {https://www.merl.com/publications/TR2018-069}
      }
  • Research Areas:

    Artificial Intelligence, Computer Vision, Machine Learning, Speech & Audio


Scene-aware dialog systems will be able to have conversations with users about the objects and events around them. Progress on such systems can be made by integrating state-of-the-art technologies from multiple research areas, including end-to-end dialog systems, visual dialog, and video description. We introduce the Audio Visual Scene-Aware Dialog (AVSD) challenge and dataset. In this challenge, which is one track of the 7th Dialog System Technology Challenges (DSTC7) workshop, the task is to build a system that generates responses in a dialog about an input video.