Anoop Cherian

- Phone: 617-621-7519
- Email:
Position:
Research / Technical Staff
Senior Principal Research Scientist
Education:
Ph.D., University of Minnesota, 2013
Research Areas:
- Computer Vision
- Machine Learning
- Artificial Intelligence
- Speech & Audio
- Human-Computer Interaction
- Optimization
- Robotics
-
Biography
Anoop was a postdoctoral researcher in the LEAR group at Inria from 2012 to 2015, where his research focused on the estimation and tracking of human poses in videos. From 2015 to 2017, he was a Research Fellow at the Australian National University, where he worked on recognizing human activities in video sequences. Anoop received the Best Student Paper award at the International Conference on Image Processing in 2012. His current research focuses on modeling the semantics of video data.
-
Recent News & Events
-
NEWS MERL researchers presenting four papers and co-organizing a workshop at CVPR 2023
Date: June 18, 2023 - June 22, 2023
Where: Vancouver, Canada
MERL Contacts: Anoop Cherian; Michael J. Jones; Suhas Lohit; Kuan-Chuan Peng
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
Brief: MERL researchers are presenting 4 papers and co-organizing a workshop at the CVPR 2023 conference, which will be held in Vancouver, Canada June 18-22. CVPR is one of the most prestigious and competitive international conferences in computer vision. Details are provided below.
1. “Are Deep Neural Networks SMARTer than Second Graders?” by Anoop Cherian, Kuan-Chuan Peng, Suhas Lohit, Kevin Smith, and Joshua B. Tenenbaum
We present SMART: a Simple Multimodal Algorithmic Reasoning Task and the associated SMART-101 dataset for evaluating the abstraction, deduction, and generalization abilities of neural networks in solving visuo-linguistic puzzles designed for children in the 6-8 age group. Our experiments using SMART-101 reveal that powerful deep models perform no better than random chance when analyzed for generalization. We also evaluate large language models (including ChatGPT) on a subset of SMART-101 and find that while these models show convincing reasoning abilities, their answers are often incorrect.
Paper: https://arxiv.org/abs/2212.09993
2. “EVAL: Explainable Video Anomaly Localization,” by Ashish Singh, Michael J. Jones, and Erik Learned-Miller
This work presents a method for detecting unusual activities in videos by building a high-level model of activities found in nominal videos of a scene. The high-level features used in the model are human understandable and include attributes such as the object class and the directions and speeds of motion. Such high-level features allow our method to not only detect anomalous activity but also to provide explanations for why it is anomalous.
Paper: https://arxiv.org/abs/2212.07900
3. "Aligning Step-by-Step Instructional Diagrams to Video Demonstrations," by Jiahao Zhang, Anoop Cherian, Yanbin Liu, Yizhak Ben-Shabat, Cristian Rodriguez, and Stephen Gould
The rise of do-it-yourself (DIY) videos on the web has made it possible even for an unskilled person (or a skilled robot) to imitate and follow instructions to complete complex real-world tasks. In this paper, we consider the novel problem of aligning instruction steps that are depicted as assembly diagrams (commonly seen in IKEA assembly manuals) with video segments from in-the-wild videos. We present a new dataset, IKEA Assembly in the Wild (IAW), and propose a contrastive learning framework for aligning instruction diagrams with video clips.
Paper: https://arxiv.org/pdf/2303.13800.pdf
4. "HaLP: Hallucinating Latent Positives for Skeleton-Based Self-Supervised Learning of Actions," by Anshul Shah, Aniket Roy, Ketul Shah, Shlok Kumar Mishra, David Jacobs, Anoop Cherian, and Rama Chellappa
In this work, we propose a new contrastive learning approach to train models for skeleton-based action recognition without labels. Our key contribution is a simple module, HaLP: Hallucinating Latent Positives for contrastive learning. HaLP explores the latent space of poses in suitable directions to generate new positives. Our experiments using HaLP demonstrate strong empirical improvements.
Paper: https://arxiv.org/abs/2304.00387
The 4th Workshop on Fair, Data-Efficient, and Trusted Computer Vision
MERL researcher Kuan-Chuan Peng is co-organizing the fourth Workshop on Fair, Data-Efficient, and Trusted Computer Vision (https://fadetrcv.github.io/2023/) in conjunction with CVPR 2023 on June 18, 2023. This workshop provides a focused venue for discussing and disseminating research in the areas of fairness, bias, and trust in computer vision, as well as adjacent domains such as computational social science and public policy.
-
NEWS MERL Researchers Present Thirteen Papers at the 2023 IEEE International Conference on Robotics and Automation (ICRA)
Date: May 29, 2023 - June 2, 2023
Where: 2023 IEEE International Conference on Robotics and Automation (ICRA)
MERL Contacts: Anoop Cherian; Radu Corcodel; Siddarth Jain; Devesh K. Jha; Toshiaki Koike-Akino; Tim K. Marks; Daniel N. Nikovski; Arvind Raghunathan; Diego Romeres
Research Areas: Computer Vision, Machine Learning, Optimization, Robotics
Brief: MERL researchers will present thirteen papers, including eight main conference papers and five workshop papers, at the 2023 IEEE International Conference on Robotics and Automation (ICRA) to be held in London, UK from May 29 to June 2. ICRA is one of the largest and most prestigious conferences in the robotics community. The papers cover a broad set of topics in robotics including estimation, manipulation, vision-based object recognition and segmentation, tactile estimation and tool manipulation, robotic food handling, robot skill learning, and model-based reinforcement learning.
In addition to the paper presentations, MERL robotics researchers will also host an exhibition booth and look forward to discussing our research with visitors.
-
MERL Publications
- "Active Sparse Conversations for Improved Audio-Visual Embodied Navigation", arXiv, June 2023.
- "H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions", IEEE International Conference on Robotics and Automation (ICRA), DOI: 10.1109/ICRA48891.2023.10160575, May 2023, pp. 7272-7278. TR2023-009
  @inproceedings{Ota2023may,
    author = {Ota, Kei and Tung, Hsiao-Yu and Smith, Kevin and Cherian, Anoop and Marks, Tim K. and Sullivan, Alan and Kanezaki, Asako and Tenenbaum, Joshua B.},
    title = {H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions},
    booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
    year = 2023,
    pages = {7272--7278},
    month = may,
    publisher = {IEEE},
    doi = {10.1109/ICRA48891.2023.10160575},
    url = {https://www.merl.com/publications/TR2023-009}
  }
- "Discriminative 3D Shape Modeling for Few-Shot Instance Segmentation", IEEE International Conference on Robotics and Automation (ICRA), DOI: 10.1109/ICRA48891.2023.10160644, May 2023, pp. 9296-9302. TR2023-010
  @inproceedings{Cherian2023may,
    author = {Cherian, Anoop and Jain, Siddarth and Marks, Tim K. and Sullivan, Alan},
    title = {Discriminative 3D Shape Modeling for Few-Shot Instance Segmentation},
    booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
    year = 2023,
    pages = {9296--9302},
    month = may,
    publisher = {IEEE},
    doi = {10.1109/ICRA48891.2023.10160644},
    url = {https://www.merl.com/publications/TR2023-010}
  }
- "Aligning Step-by-Step Instructional Diagrams to Video Demonstrations", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), May 2023, pp. 2483-2492. TR2023-034
  @inproceedings{Zhang2023may,
    author = {Zhang, Jiahao and Cherian, Anoop and Liu, Yanbin and Shabat, Itzik Ben and Rodriguez, Cristian and Gould, Stephen},
    title = {Aligning Step-by-Step Instructional Diagrams to Video Demonstrations},
    booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = 2023,
    pages = {2483--2492},
    month = may,
    publisher = {CVF},
    url = {https://www.merl.com/publications/TR2023-034}
  }
- "HaLP: Hallucinating Latent Positives for Skeleton-based Self-Supervised Learning of Actions", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), May 2023, pp. 18846-18856. TR2023-035
  @inproceedings{Shah2023may,
    author = {Shah, Anshul and Roy, Aniket and Shah, Ketul and Mishra, Shlok Kumar and Jacobs, David and Cherian, Anoop and Chellappa, Rama},
    title = {HaLP: Hallucinating Latent Positives for Skeleton-based Self-Supervised Learning of Actions},
    booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = 2023,
    pages = {18846--18856},
    month = may,
    publisher = {CVF},
    url = {https://www.merl.com/publications/TR2023-035}
  }
-
Other Publications
- "Second-order Temporal Pooling for Action Recognition", International Journal of Computer Vision (IJCV), 2018.
  @Article{cherian2018ijcv,
    author = {Cherian, Anoop and Gould, Stephen},
    title = {Second-order Temporal Pooling for Action Recognition},
    journal = {International Journal of Computer Vision (IJCV)},
    year = 2018,
    publisher = {Springer}
  }
- "Visual Permutation Learning", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018.
  @Article{cherian2018permutation,
    author = {Santa Cruz, Rodrigo and Fernando, Basura and Cherian, Anoop and Gould, Stephen},
    title = {Visual Permutation Learning},
    journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
    year = 2018,
    publisher = {IEEE}
  }
- "Video Representation Learning Using Discriminative Pooling", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
  @Inproceedings{cherian_representation_cvpr18,
    author = {Wang, Jue and Cherian, Anoop and Porikli, Fatih and Gould, Stephen},
    title = {Video Representation Learning Using Discriminative Pooling},
    booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = 2018
  }
- "Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
  @Inproceedings{cherian_rigid_cvpr18,
    author = {Kumar, Suryansh and Cherian, Anoop and Dai, Yuchao and Li, Hongdong},
    title = {Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective},
    booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = 2018
  }
- "Non-Linear Temporal Subspace Representations for Activity Recognition", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
  @Inproceedings{cherian_temporal_cvpr18,
    author = {Cherian, Anoop and Sra, Suvrit and Gould, Stephen and Hartley, Richard},
    title = {Non-Linear Temporal Subspace Representations for Activity Recognition},
    booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = 2018
  }
- "Generalized Rank Pooling for Activity Recognition", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
  @Inproceedings{cherian2017generalized,
    author = {Cherian, Anoop and Fernando, Basura and Harandi, Mehrtash and Gould, Stephen},
    title = {Generalized Rank Pooling for Activity Recognition},
    booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = 2017
  }
- "Learning Discriminative Alpha-Beta Divergences for Positive Definite Matrices", International Conference on Computer Vision (ICCV), 2017.
  @Inproceedings{cherian_rigid_iccv17,
    author = {Cherian, Anoop and Stanitsas, Panagiotis and Harandi, Mehrtash and Morellas, Vassilios and Papanikolopoulos, Nikolaos},
    title = {Learning Discriminative Alpha-Beta Divergences for Positive Definite Matrices},
    booktitle = {International Conference on Computer Vision (ICCV)},
    year = 2017
  }
- "DeepPermNet: Visual Permutation Learning", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
  @Inproceedings{cruz2017deeppermnet,
    author = {Cruz, Rodrigo Santa and Fernando, Basura and Cherian, Anoop and Gould, Stephen},
    title = {DeepPermNet: Visual Permutation Learning},
    booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = 2017
  }
- "Bayesian Non-Parametric clustering for positive definite matrices", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2016.
  @Article{cherian2016bayesian,
    author = {Cherian, Anoop and Morellas, Vassilios and Papanikolopoulos, Nikolaos},
    title = {Bayesian Non-Parametric clustering for positive definite matrices},
    journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
    year = 2016,
    publisher = {IEEE}
  }
- "Sparse coding for third-order super-symmetric tensor descriptors with application to texture recognition", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
  @Inproceedings{koniusz2016sparse,
    author = {Koniusz, Piotr and Cherian, Anoop},
    title = {Sparse coding for third-order super-symmetric tensor descriptors with application to texture recognition},
    booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = 2016
  }
- "Tensor representations via kernel linearization for action recognition from 3D skeletons", European Conference on Computer Vision (ECCV), 2016.
  @Inproceedings{koniusz2016tensor,
    author = {Koniusz, Piotr and Cherian, Anoop and Porikli, Fatih},
    title = {Tensor representations via kernel linearization for action recognition from 3D skeletons},
    booktitle = {European Conference on Computer Vision (ECCV)},
    year = 2016,
    organization = {Springer}
  }
- "Mixing body-part sequences for human pose estimation", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
  @Inproceedings{cherian2014mixing,
    author = {Cherian, Anoop and Mairal, Julien and Alahari, Karteek and Schmid, Cordelia},
    title = {Mixing body-part sequences for human pose estimation},
    booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = 2014
  }
- "Nearest neighbors using compact sparse codes", International Conference on Machine Learning (ICML), 2014.
  @Inproceedings{cherian2014nearest,
    author = {Cherian, Anoop},
    title = {Nearest neighbors using compact sparse codes},
    booktitle = {International Conference on Machine Learning (ICML)},
    year = 2014
  }
- "Riemannian sparse coding for positive definite matrices", European Conference on Computer Vision (ECCV), 2014.
  @Inproceedings{cherian2014riemannian,
    author = {Cherian, Anoop and Sra, Suvrit},
    title = {Riemannian sparse coding for positive definite matrices},
    booktitle = {European Conference on Computer Vision (ECCV)},
    year = 2014,
    organization = {Springer}
  }
- "Jensen-Bregman logdet divergence with application to efficient similarity search for covariance matrices", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2013.
  @Article{cherian2013jensen,
    author = {Cherian, Anoop and Sra, Suvrit and Banerjee, Arindam and Papanikolopoulos, Nikolaos},
    title = {Jensen-Bregman logdet divergence with application to efficient similarity search for covariance matrices},
    journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
    year = 2013,
    publisher = {IEEE}
  }
- "Dirichlet process mixture models on symmetric positive definite matrices for appearance clustering in video surveillance applications", Computer Vision and Pattern Recognition (CVPR), 2011.
  @Inproceedings{cherian2011dirichlet,
    author = {Cherian, Anoop and Morellas, Vassilios and Papanikolopoulos, Nikolaos and Bedros, Saad J},
    title = {Dirichlet process mixture models on symmetric positive definite matrices for appearance clustering in video surveillance applications},
    booktitle = {Computer Vision and Pattern Recognition (CVPR)},
    year = 2011
  }
- "Efficient similarity search for covariance matrices via the Jensen-Bregman LogDet divergence", International Conference on Computer Vision (ICCV), 2011.
  @Inproceedings{cherian2011efficient,
    author = {Cherian, Anoop and Sra, Suvrit and Banerjee, Arindam and Papanikolopoulos, Nikolaos},
    title = {Efficient similarity search for covariance matrices via the Jensen-Bregman LogDet divergence},
    booktitle = {International Conference on Computer Vision (ICCV)},
    year = 2011
  }
- "Generalized dictionary learning for symmetric positive definite matrices with application to nearest neighbor retrieval", Machine Learning and Knowledge Discovery in Databases (ECML), 2011.
  @Article{sra2011generalized,
    author = {Sra, Suvrit and Cherian, Anoop},
    title = {Generalized dictionary learning for symmetric positive definite matrices with application to nearest neighbor retrieval},
    journal = {Machine Learning and Knowledge Discovery in Databases (ECML)},
    year = 2011
  }
- "Accurate 3D ground plane estimation from a single image", International Conference on Robotics and Automation, 2009.
  @Inproceedings{cherian2009accurate,
    author = {Cherian, Anoop and Morellas, Vassilios and Papanikolopoulos, Nikolaos},
    title = {Accurate 3D ground plane estimation from a single image},
    booktitle = {International Conference on Robotics and Automation},
    year = 2009
  }
-
Downloads
-
Simple Multimodal Algorithmic Reasoning Task Dataset
-
Audio-Visual-Language Embodied Navigation in 3D Environments
-
Instance Segmentation GAN
-
Audio Visual Scene-Graph Segmentor
-
Generalized One-class Discriminative Subspaces
-
Generating Visual Dynamics from Sound and Context
-
Adversarially-Contrastive Optimal Transport
-
Landmarks’ Location, Uncertainty, and Visibility Likelihood
-
Gradient-based Nikaido-Isoda
-
Discriminative Subspace Pooling
-
MERL Issued Patents
-
Title: "System and Method for Manipulating Two-Dimensional (2D) Images of Three-Dimensional (3D) Objects"
Inventors: Marks, Tim; Medin, Safa; Cherian, Anoop; Wang, Ye
Patent No.: 11,663,798
Issue Date: May 30, 2023
-
Title: "InSeGAN: A Generative Approach to Instance Segmentation in Depth Images"
Inventors: Cherian, Anoop; Pais, Goncalo; Marks, Tim; Sullivan, Alan
Patent No.: 11,651,497
Issue Date: May 16, 2023
-
Title: "Method and System for Scene-Aware Interaction"
Inventors: Hori, Chiori; Cherian, Anoop; Chen, Siheng; Marks, Tim; Le Roux, Jonathan; Hori, Takaaki; Harsham, Bret A.; Vetro, Anthony; Sullivan, Alan
Patent No.: 11,635,299
Issue Date: Apr 25, 2023
-
Title: "Scene-Aware Video Encoder System and Method"
Inventors: Cherian, Anoop; Hori, Chiori; Le Roux, Jonathan; Marks, Tim; Sullivan, Alan
Patent No.: 11,582,485
Issue Date: Feb 14, 2023
-
Title: "Low-latency Captioning System"
Inventors: Hori, Chiori; Hori, Takaaki; Cherian, Anoop; Marks, Tim; Le Roux, Jonathan
Patent No.: 11,445,267
Issue Date: Sep 13, 2022
-
Title: "Anomaly Detector for Detecting Anomaly using Complementary Classifiers"
Inventors: Cherian, Anoop; Wang, Jue
Patent No.: 11,423,698
Issue Date: Aug 23, 2022
-
Title: "System and Method for a Dialogue Response Generation System"
Inventors: Hori, Chiori; Cherian, Anoop; Marks, Tim; Hori, Takaaki
Patent No.: 11,264,009
Issue Date: Mar 1, 2022
-
Title: "Scene-Aware Video Dialog"
Inventors: Geng, Shijie; Gao, Peng; Cherian, Anoop; Hori, Chiori; Le Roux, Jonathan
Patent No.: 11,210,523
Issue Date: Dec 28, 2021