Anoop Cherian

  • Biography

    Anoop was a postdoctoral researcher in the LEAR group at Inria from 2012 to 2015, where his research focused on estimating and tracking human poses in videos. From 2015 to 2017, he was a Research Fellow at the Australian National University, where he worked on recognizing human activities in video sequences. Anoop received the Best Student Paper award at the Intl. Conference on Image Processing in 2012. His current research focuses on modeling the semantics of video data.

  • Recent News & Events

    •  NEWS   Anoop Cherian gave an invited talk at the Multi-modal Video Analysis Workshop, ECCV 2020
      Date: August 23, 2020
      Where: European Conference on Computer Vision (ECCV), online, 2020
      MERL Contact: Anoop Cherian
      Research Areas: Artificial Intelligence, Computer Vision, Machine Learning, Speech & Audio
      Brief
      • MERL Principal Research Scientist Anoop Cherian gave an invited talk titled "Sound2Sight: Audio-Conditioned Visual Imagination" at the Multi-modal Video Analysis workshop held in conjunction with the European Conference on Computer Vision (ECCV), 2020. The talk was based on a recent ECCV paper that describes a new multimodal reasoning task called Sound2Sight and a generative adversarial machine learning algorithm for producing plausible video sequences conditioned on sound and visual context.
    •  NEWS   MERL's Scene-Aware Interaction Technology Featured in Mitsubishi Electric Corporation Press Release
      Date: July 22, 2020
      Where: Tokyo, Japan
      MERL Contacts: Anoop Cherian; Bret Harsham; Chiori Hori; Takaaki Hori; Jonathan Le Roux; Tim Marks; Alan Sullivan; Anthony Vetro
      Research Areas: Artificial Intelligence, Computer Vision, Machine Learning, Speech & Audio
      Brief
      • Mitsubishi Electric Corporation announced that the company has developed what it believes to be the world’s first technology capable of highly natural and intuitive interaction with humans based on a scene-aware capability to translate multimodal sensing information into natural language.

        The novel technology, Scene-Aware Interaction, incorporates Mitsubishi Electric’s proprietary Maisart® compact AI technology to analyze multimodal sensing information for highly natural and intuitive interaction with humans through context-dependent generation of natural language. The technology recognizes contextual objects and events based on multimodal sensing information, such as images and video captured with cameras, audio information recorded with microphones, and localization information measured with LiDAR.

        Scene-Aware Interaction for car navigation, one target application, will provide drivers with intuitive route guidance. The technology is also expected to have applicability to human-machine interfaces for in-vehicle infotainment, interaction with service robots in building and factory automation systems, systems that monitor the health and well-being of people, surveillance systems that interpret complex scenes for humans and encourage social distancing, support for touchless operation of equipment in public areas, and much more. The technology is based on recent research by MERL's Speech & Audio and Computer Vision groups.


        Link:

        Mitsubishi Electric Corporation Press Release
  • Internships with Anoop

    • CV1552: Multimodal Reasoning

      MERL is looking for a self-motivated intern to work on problems at the intersection of video understanding, audio processing, and language models. The ideal candidate would be a PhD student with a strong mathematical background in machine learning and computer vision. The candidate must have prior experience in using deep learning methods for image and video representations (such as scene graphs) and deep audio analysis (such as source separation and localization). Proficiency in Python and flexibility in using different deep learning software (especially PyTorch) are expected. The intern is expected to collaborate with the computer vision and speech teams at MERL to develop algorithms and prepare manuscripts for scientific publications. The internship is for 3 months with a flexible start date. The internship is preferably onsite at MERL, but may be done remotely from where you live if the COVID pandemic makes it necessary.

    • CV1553: Graph Representations for Action Recognition

      MERL is looking for a self-motivated intern to work on problems at the intersection of video understanding and graph representation learning for solving action recognition problems. The ideal candidate would be a senior PhD student (year 3 or later) with a strong mathematical background in machine learning and computer vision who has published at least one paper in a top-tier machine learning or computer vision venue (NIPS/CVPR/ECCV/ICCV/ICML/PAMI etc.). The candidate must have prior experience in using deep learning methods for video understanding (such as action recognition and scene graph representations) and language models (such as in visual question answering or captioning). Proficiency in Python and flexibility in using different deep learning software (such as PyTorch) are expected. The internship is for 3 months with a flexible start date. The internship is preferably onsite at MERL, but may be done remotely from where you live if the COVID pandemic makes it necessary.

  • MERL Publications

    •  Cherian, A., Chatterjee, M., Ahuja, N., "Sound2Sight: Generating Visual Dynamics from Sound and Context", European Conference on Computer Vision (ECCV), Vedaldi, A., Bischof, H., Brox, Th., Frahm, J.-M., Eds., August 2020.
      BibTeX TR2020-121 PDF Software
      • @inproceedings{Cherian2020aug,
      • author = {Cherian, Anoop and Chatterjee, Moitreya and Ahuja, Narendra},
      • title = {Sound2Sight: Generating Visual Dynamics from Sound and Context},
      • booktitle = {European Conference on Computer Vision (ECCV)},
      • year = 2020,
      • editor = {Vedaldi, A. and Bischof, H. and Brox, Th. and Frahm, J.-M.},
      • month = aug,
      • publisher = {Springer},
      • url = {https://www.merl.com/publications/TR2020-121}
      • }
    •  Geng, S., Gao, P., Hori, C., Le Roux, J., Cherian, A., "Spatio-Temporal Scene Graphs for Video Dialog", arXiv, July 2020.
      BibTeX arXiv
      • @article{Geng2020jul,
      • author = {Geng, Shijie and Gao, Peng and Hori, Chiori and Le Roux, Jonathan and Cherian, Anoop},
      • title = {Spatio-Temporal Scene Graphs for Video Dialog},
      • journal = {arXiv},
      • year = 2020,
      • month = jul,
      • url = {https://arxiv.org/abs/2007.03848}
      • }
    •  Cherian, A., Aeron, S., "Representation Learning via Adversarially-Contrastive Optimal Transport", International Conference on Machine Learning (ICML), H. Daumé and A. Singh, Eds., July 2020, pp. 10675-10685.
      BibTeX TR2020-093 PDF Software
      • @inproceedings{Cherian2020jul,
      • author = {Cherian, Anoop and Aeron, Shuchin},
      • title = {Representation Learning via Adversarially-Contrastive Optimal Transport},
      • booktitle = {International Conference on Machine Learning (ICML)},
      • year = 2020,
      • editor = {H. Daumé and A. Singh},
      • pages = {10675--10685},
      • month = jul,
      • url = {https://www.merl.com/publications/TR2020-093}
      • }
    •  Kumar, A., Marks, T., Mou, W., Wang, Y., Cherian, A., Jones, M.J., Liu, X., Koike-Akino, T., Feng, C., "LUVLi Face Alignment: Estimating Landmarks’ Location, Uncertainty, and Visibility Likelihood", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), DOI: 10.1109/CVPR42600.2020.00826, June 2020.
      BibTeX TR2020-067 PDF Video Data Software
      • @inproceedings{Kumar2020jun,
      • author = {Kumar, Abhinav and Marks, Tim and Mou, Wenxuan and Wang, Ye and Cherian, Anoop and Jones, Michael J. and Liu, Xiaoming and Koike-Akino, Toshiaki and Feng, Chen},
      • title = {LUVLi Face Alignment: Estimating Landmarks’ Location, Uncertainty, and Visibility Likelihood},
      • booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2020,
      • month = jun,
      • publisher = {IEEE},
      • doi = {10.1109/CVPR42600.2020.00826},
      • issn = {2575-7075},
      • isbn = {978-1-7281-7168-5},
      • url = {https://www.merl.com/publications/TR2020-067}
      • }
    •  Huang, R., Xu, W., Lee, T.-Y., Cherian, A., Wang, Y., Marks, T., "FX-GAN: Self-Supervised GAN Learning via Feature Exchange", IEEE Winter Conference on Applications of Computer Vision (WACV), February 2020, pp. 3194-3202.
      BibTeX TR2020-014 PDF
      • @inproceedings{Huang2020feb,
      • author = {Huang, Rui and Xu, Wenju and Lee, Teng-Yok and Cherian, Anoop and Wang, Ye and Marks, Tim},
      • title = {FX-GAN: Self-Supervised GAN Learning via Feature Exchange},
      • booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV)},
      • year = 2020,
      • pages = {3194--3202},
      • month = feb,
      • url = {https://www.merl.com/publications/TR2020-014}
      • }
  • Other Publications

    •  Anoop Cherian and Stephen Gould, "Second-order Temporal Pooling for Action Recognition", International Journal of Computer Vision (IJCV), 2018.
      BibTeX
      • @Article{cherian2018ijcv,
      • author = {Cherian, Anoop and Gould, Stephen},
      • title = {Second-order Temporal Pooling for Action Recognition},
      • journal = {International Journal of Computer Vision (IJCV)},
      • year = 2018,
      • publisher = {Springer}
      • }
    •  Rodrigo Santa Cruz, Basura Fernando, Anoop Cherian and Stephen Gould, "Visual Permutation Learning", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018.
      BibTeX
      • @Article{cherian2018permutation,
      • author = {Santa Cruz, Rodrigo and Fernando, Basura and Cherian, Anoop and Gould, Stephen},
      • title = {Visual Permutation Learning},
      • journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
      • year = 2018,
      • publisher = {IEEE}
      • }
    •  Jue Wang, Anoop Cherian, Fatih Porikli and Stephen Gould, "Video Representation Learning Using Discriminative Pooling", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
      BibTeX
      • @Inproceedings{cherian_representation_cvpr18,
      • author = {Wang, Jue and Cherian, Anoop and Porikli, Fatih and Gould, Stephen},
      • title = {Video Representation Learning Using Discriminative Pooling},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2018
      • }
    •  Suryansh Kumar, Anoop Cherian, Yuchao Dai and Hongdong Li, "Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
      BibTeX
      • @Inproceedings{cherian_rigid_cvpr18,
      • author = {Kumar, Suryansh and Cherian, Anoop and Dai, Yuchao and Li, Hongdong},
      • title = {Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2018
      • }
    •  Anoop Cherian, Suvrit Sra, Stephen Gould and Richard Hartley, "Non-Linear Temporal Subspace Representations for Activity Recognition", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
      BibTeX
      • @Inproceedings{cherian_temporal_cvpr18,
      • author = {Cherian, Anoop and Sra, Suvrit and Gould, Stephen and Hartley, Richard},
      • title = {Non-Linear Temporal Subspace Representations for Activity Recognition},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2018
      • }
    •  Anoop Cherian, Basura Fernando, Mehrtash Harandi and Stephen Gould, "Generalized Rank Pooling for Activity Recognition", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
      BibTeX
      • @Inproceedings{cherian2017generalized,
      • author = {Cherian, Anoop and Fernando, Basura and Harandi, Mehrtash and Gould, Stephen},
      • title = {Generalized Rank Pooling for Activity Recognition},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2017
      • }
    •  Anoop Cherian, Panagiotis Stanitsas, Mehrtash Harandi, Vassilios Morellas and Nikolaos Papanikolopoulos, "Learning Discriminative Alpha-Beta Divergences for Positive Definite Matrices", International Conference on Computer Vision (ICCV), 2017.
      BibTeX
      • @Inproceedings{cherian_rigid_iccv17,
      • author = {Cherian, Anoop and Stanitsas, Panagiotis and Harandi, Mehrtash and Morellas, Vassilios and Papanikolopoulos, Nikolaos},
      • title = {Learning Discriminative Alpha-Beta Divergences for Positive Definite Matrices},
      • booktitle = {International Conference on Computer Vision (ICCV)},
      • year = 2017
      • }
    •  Rodrigo Santa Cruz, Basura Fernando, Anoop Cherian and Stephen Gould, "DeepPermNet: Visual Permutation Learning", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
      BibTeX
      • @Inproceedings{cruz2017deeppermnet,
      • author = {Santa Cruz, Rodrigo and Fernando, Basura and Cherian, Anoop and Gould, Stephen},
      • title = {DeepPermNet: Visual Permutation Learning},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2017
      • }
    •  Anoop Cherian, Vassilios Morellas and Nikolaos Papanikolopoulos, "Bayesian Non-Parametric clustering for positive definite matrices", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2016.
      BibTeX
      • @Article{cherian2016bayesian,
      • author = {Cherian, Anoop and Morellas, Vassilios and Papanikolopoulos, Nikolaos},
      • title = {Bayesian Non-Parametric clustering for positive definite matrices},
      • journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
      • year = 2016,
      • publisher = {IEEE}
      • }
    •  Piotr Koniusz and Anoop Cherian, "Sparse coding for third-order super-symmetric tensor descriptors with application to texture recognition", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
      BibTeX
      • @Inproceedings{koniusz2016sparse,
      • author = {Koniusz, Piotr and Cherian, Anoop},
      • title = {Sparse coding for third-order super-symmetric tensor descriptors with application to texture recognition},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2016
      • }
    •  Piotr Koniusz, Anoop Cherian and Fatih Porikli, "Tensor representations via kernel linearization for action recognition from 3D skeletons", European Conference on Computer Vision (ECCV), 2016.
      BibTeX
      • @Inproceedings{koniusz2016tensor,
      • author = {Koniusz, Piotr and Cherian, Anoop and Porikli, Fatih},
      • title = {Tensor representations via kernel linearization for action recognition from 3D skeletons},
      • booktitle = {European Conference on Computer Vision (ECCV)},
      • year = 2016,
      • organization = {Springer}
      • }
    •  Anoop Cherian, Julien Mairal, Karteek Alahari and Cordelia Schmid, "Mixing body-part sequences for human pose estimation", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
      BibTeX
      • @Inproceedings{cherian2014mixing,
      • author = {Cherian, Anoop and Mairal, Julien and Alahari, Karteek and Schmid, Cordelia},
      • title = {Mixing body-part sequences for human pose estimation},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2014
      • }
    •  Anoop Cherian, "Nearest neighbors using compact sparse codes", International Conference on Machine Learning (ICML), 2014.
      BibTeX
      • @Inproceedings{cherian2014nearest,
      • author = {Cherian, Anoop},
      • title = {Nearest neighbors using compact sparse codes},
      • booktitle = {International Conference on Machine Learning (ICML)},
      • year = 2014
      • }
    •  Anoop Cherian and Suvrit Sra, "Riemannian sparse coding for positive definite matrices", European Conference on Computer Vision (ECCV), 2014.
      BibTeX
      • @Inproceedings{cherian2014riemannian,
      • author = {Cherian, Anoop and Sra, Suvrit},
      • title = {Riemannian sparse coding for positive definite matrices},
      • booktitle = {European Conference on Computer Vision (ECCV)},
      • year = 2014,
      • organization = {Springer}
      • }
    •  Anoop Cherian, Suvrit Sra, Arindam Banerjee and Nikolaos Papanikolopoulos, "Jensen-Bregman logdet divergence with application to efficient similarity search for covariance matrices", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2013.
      BibTeX
      • @Article{cherian2013jensen,
      • author = {Cherian, Anoop and Sra, Suvrit and Banerjee, Arindam and Papanikolopoulos, Nikolaos},
      • title = {Jensen-Bregman logdet divergence with application to efficient similarity search for covariance matrices},
      • journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
      • year = 2013,
      • publisher = {IEEE}
      • }
    •  Anoop Cherian, Vassilios Morellas, Nikolaos Papanikolopoulos and Saad J Bedros, "Dirichlet process mixture models on symmetric positive definite matrices for appearance clustering in video surveillance applications", Computer Vision and Pattern Recognition (CVPR), 2011.
      BibTeX
      • @Inproceedings{cherian2011dirichlet,
      • author = {Cherian, Anoop and Morellas, Vassilios and Papanikolopoulos, Nikolaos and Bedros, Saad J},
      • title = {Dirichlet process mixture models on symmetric positive definite matrices for appearance clustering in video surveillance applications},
      • booktitle = {Computer Vision and Pattern Recognition (CVPR)},
      • year = 2011
      • }
    •  Anoop Cherian, Suvrit Sra, Arindam Banerjee and Nikolaos Papanikolopoulos, "Efficient similarity search for covariance matrices via the Jensen-Bregman LogDet divergence", International Conference on Computer Vision (ICCV), 2011.
      BibTeX
      • @Inproceedings{cherian2011efficient,
      • author = {Cherian, Anoop and Sra, Suvrit and Banerjee, Arindam and Papanikolopoulos, Nikolaos},
      • title = {Efficient similarity search for covariance matrices via the Jensen-Bregman LogDet divergence},
      • booktitle = {International Conference on Computer Vision (ICCV)},
      • year = 2011
      • }
    •  Suvrit Sra and Anoop Cherian, "Generalized dictionary learning for symmetric positive definite matrices with application to nearest neighbor retrieval", Machine Learning and Knowledge Discovery in Databases (ECML), 2011.
      BibTeX
      • @Article{sra2011generalized,
      • author = {Sra, Suvrit and Cherian, Anoop},
      • title = {Generalized dictionary learning for symmetric positive definite matrices with application to nearest neighbor retrieval},
      • journal = {Machine Learning and Knowledge Discovery in Databases (ECML)},
      • year = 2011
      • }
    •  Anoop Cherian, Vassilios Morellas and Nikolaos Papanikolopoulos, "Accurate 3D ground plane estimation from a single image", International Conference on Robotics and Automation, 2009.
      BibTeX
      • @Inproceedings{cherian2009accurate,
      • author = {Cherian, Anoop and Morellas, Vassilios and Papanikolopoulos, Nikolaos},
      • title = {Accurate 3D ground plane estimation from a single image},
      • booktitle = {International Conference on Robotics and Automation},
      • year = 2009
      • }
  • Software Downloads