Anoop Cherian

  • Biography

    Anoop was a postdoctoral researcher in the LEAR group at Inria from 2012 to 2015, where his research focused on estimating and tracking human poses in videos. From 2015 to 2017, he was a Research Fellow at the Australian National University, where he worked on recognizing human activities in video sequences. Anoop is the recipient of the Best Student Paper award at the Intl. Conference on Image Processing in 2012. Currently, his research focuses on modeling the semantics of video data.

  • Recent News & Events

    •  NEWS    Anoop Cherian gives a podcast interview with AI Business
      Date: September 26, 2023
      Where: Virtual
      MERL Contact: Anoop Cherian
      Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
      Brief
      • Anoop Cherian, a Senior Principal Research Scientist in the Computer Vision team at MERL, gave a podcast interview with award-winning journalist Deborah Yao. Deborah is the editor of AI Business, a leading content platform for artificial intelligence and its applications in the real world, which delivers up-to-the-minute insights into how AI technologies are affecting the global economy and society. The podcast was based on recent research that Anoop and his MERL colleagues conducted with collaborators at MIT; this research attempts to answer objectively the question: are current deep neural networks smarter than second graders? The podcast discusses the shortcomings of recent artificial general intelligence systems in knowledge abstraction, learning, and generalization that this research brings out.
    •  NEWS    MERL researchers presenting four papers and organizing the VLAR-SMART-101 Workshop at ICCV 2023
      Date: October 2, 2023 - October 6, 2023
      Where: Paris, France
      MERL Contacts: Moitreya Chatterjee; Anoop Cherian; Michael J. Jones; Toshiaki Koike-Akino; Suhas Lohit; Tim K. Marks; Pedro Miraldo; Kuan-Chuan Peng; Ye Wang
      Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
      Brief
      • MERL researchers are presenting four papers and organizing the VLAR-SMART-101 workshop at the ICCV 2023 conference, which will be held in Paris, France, on October 2-6. ICCV is one of the most prestigious and competitive international conferences in computer vision. Details are provided below.

        1. Conference paper: "Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis," by Nithin Gopalakrishnan Nair, Anoop Cherian, Suhas Lohit, Ye Wang, Toshiaki Koike-Akino, Vishal Patel, and Tim K. Marks

        Conditional generative models typically demand large annotated training sets to achieve high-quality synthesis. As a result, there has been significant interest in plug-and-play generation, i.e., using a pre-defined model to guide the generative process. In this paper, we introduce Steered Diffusion, a generalized framework for fine-grained photorealistic zero-shot conditional image generation using a diffusion model trained for unconditional generation. The key idea is to steer the diffusion model's image generation at inference time by designing a loss based on a pre-trained inverse model that characterizes the conditional task. Our model shows clear qualitative and quantitative improvements over state-of-the-art diffusion-based plug-and-play models, while adding negligible computational cost.

        2. Conference paper: "BANSAC: A dynamic BAyesian Network for adaptive SAmple Consensus," by Valter Piedade and Pedro Miraldo

        We derive a dynamic Bayesian network that updates individual data points' inlier scores while iterating RANSAC. At each iteration, we apply weighted sampling using the updated scores. Our method works with or without prior data point scorings. In addition, we use the updated inlier/outlier scoring for deriving a new stopping criterion for the RANSAC loop. Our method outperforms the baselines in accuracy while needing less computational time.

        3. Conference paper: "Robust Frame-to-Frame Camera Rotation Estimation in Crowded Scenes," by Fabien Delattre, David Dirnfeld, Phat Nguyen, Stephen Scarano, Michael J. Jones, Pedro Miraldo, and Erik Learned-Miller

        We present a novel approach to estimating camera rotation in crowded, real-world scenes captured using a handheld monocular video camera. Our method uses a novel generalization of the Hough transform on SO(3) to efficiently find the camera rotation most compatible with the optical flow. Because this setting is not addressed well by other datasets, we provide a new dataset and benchmark, with high-accuracy and rigorously annotated ground truth on 17 video sequences. Our method is almost 40 percent more accurate than the next best method.

        4. Workshop paper: "Tensor Factorization for Leveraging Cross-Modal Knowledge in Data-Constrained Infrared Object Detection," by Manish Sharma*, Moitreya Chatterjee*, Kuan-Chuan Peng, Suhas Lohit, and Michael J. Jones

        While state-of-the-art object detection methods for RGB images have reached some level of maturity, the same is not true for infrared (IR) images. The primary bottleneck in bridging this gap is the lack of sufficient labeled training data for IR images. To address this issue, we present TensorFact, a novel tensor decomposition method that splits the convolution kernels of a CNN into low-rank factor matrices with fewer parameters. This compressed network is first pre-trained on RGB images and then augmented with only a few parameters. The augmented network is then trained on IR images while the weights trained on RGB are kept frozen. This prevents over-fitting and allows the network to generalize better. Experiments show that our method outperforms the state of the art.

        5. "Vision-and-Language Algorithmic Reasoning (VLAR) Workshop and SMART-101 Challenge," by Anoop Cherian, Kuan-Chuan Peng, Suhas Lohit, Tim K. Marks, Ram Ramrakhya, Honglu Zhou, Kevin A. Smith, Joanna Matthiesen, and Joshua B. Tenenbaum

        MERL researchers, along with researchers from MIT, Georgia Tech, Math Kangaroo USA, and Rutgers University, are jointly organizing a workshop on vision-and-language algorithmic reasoning at ICCV 2023 and conducting a challenge based on the SMART-101 puzzles described in the paper "Are Deep Neural Networks SMARTer than Second Graders?" A focus of this workshop is to bring together outstanding faculty and researchers working at the intersection of vision, language, and cognition to share their opinions on the recent breakthroughs in large language models and artificial general intelligence, and to showcase cutting-edge research that could inspire the audience to search for the missing pieces in our quest to solve the puzzle of artificial intelligence.

        Workshop link: https://wvlar.github.io/iccv23/

  • Research Highlights

  • MERL Publications

    •  Hori, C., Wang, P., Rahman, M., Vaca-Rubio, C., Khurana, S., Cherian, A., Le Roux, J., "Wi-Fi based Indoor Monitoring Enhanced by Multimodal Fusion", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2024.
      BibTeX TR2024-012 PDF
      • @inproceedings{Hori2024mar,
      • author = {Hori, Chiori and Wang, Pu and Rahman, Mahbub and Vaca-Rubio, Cristian and Khurana, Sameer and Cherian, Anoop and Le Roux, Jonathan},
      • title = {Wi-Fi based Indoor Monitoring Enhanced by Multimodal Fusion},
      • booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
      • year = 2024,
      • month = mar,
      • url = {https://www.merl.com/publications/TR2024-012}
      • }
    •  Carmichael, Z., Lohit, S., Cherian, A., Jones, M.J., Scheirer, W., "Pixel-Grounded Prototypical Part Networks", IEEE Winter Conference on Applications of Computer Vision (WACV), January 2024.
      BibTeX TR2024-002 PDF Presentation
      • @inproceedings{Carmichael2024jan,
      • author = {Carmichael, Zachariah and Lohit, Suhas and Cherian, Anoop and Jones, Michael J. and Scheirer, Walter},
      • title = {Pixel-Grounded Prototypical Part Networks},
      • booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV)},
      • year = 2024,
      • month = jan,
      • url = {https://www.merl.com/publications/TR2024-002}
      • }
    •  Liu, X., Paul, S., Chatterjee, M., Cherian, A., "CAVEN: An Embodied Conversational Agent for Efficient Audio-Visual Navigation in Noisy Environments", AAAI Conference on Artificial Intelligence, December 2023.
      BibTeX TR2023-154 PDF
      • @inproceedings{Liu2023dec2,
      • author = {Liu, Xiulong and Paul, Sudipta and Chatterjee, Moitreya and Cherian, Anoop},
      • title = {CAVEN: An Embodied Conversational Agent for Efficient Audio-Visual Navigation in Noisy Environments},
      • booktitle = {AAAI Conference on Artificial Intelligence},
      • year = 2023,
      • month = dec,
      • url = {https://www.merl.com/publications/TR2023-154}
      • }
    •  Zhu, X., Jha, D.K., Romeres, D., Sun, L., Tomizuka, M., Cherian, A., "Multi-level Reasoning for Robotic Assembly: From Sequence Inference to Contact Selection", arXiv, December 2023.
      BibTeX arXiv
      • @article{Zhu2023dec,
      • author = {Zhu, Xinghao and Jha, Devesh K. and Romeres, Diego and Sun, Lingfeng and Tomizuka, Masayoshi and Cherian, Anoop},
      • title = {Multi-level Reasoning for Robotic Assembly: From Sequence Inference to Contact Selection},
      • journal = {arXiv},
      • year = 2023,
      • month = dec,
      • url = {https://arxiv.org/abs/2312.10571}
      • }
    •  He, Y., Shin, S., Cherian, A., Markham, A., Trigoni, N., "Sound3DVDet: 3D Sound Source Detection using Multiview Microphone Array and RGB Images", IEEE Winter Conference on Applications of Computer Vision (WACV), December 2023.
      BibTeX TR2023-144 PDF
      • @inproceedings{He2023dec,
      • author = {He, Yuhang and Shin, Sangyun and Cherian, Anoop and Markham, Andrew and Trigoni, Niki},
      • title = {Sound3DVDet: 3D Sound Source Detection using Multiview Microphone Array and RGB Images},
      • booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV)},
      • year = 2023,
      • month = dec,
      • url = {https://www.merl.com/publications/TR2023-144}
      • }
  • Other Publications

    •  Anoop Cherian and Stephen Gould, "Second-order Temporal Pooling for Action Recognition", International Journal of Computer Vision (IJCV), 2018.
      BibTeX
      • @Article{cherian2018ijcv,
      • author = {Cherian, Anoop and Gould, Stephen},
      • title = {Second-order Temporal Pooling for Action Recognition},
      • journal = {International Journal of Computer Vision (IJCV)},
      • year = 2018,
      • publisher = {Springer}
      • }
    •  Rodrigo Santa Cruz, Basura Fernando, Anoop Cherian and Stephen Gould, "Visual Permutation Learning", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018.
      BibTeX
      • @Article{cherian2018permutation,
      • author = {Santa Cruz, Rodrigo and Fernando, Basura and Cherian, Anoop and Gould, Stephen},
      • title = {Visual Permutation Learning},
      • journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
      • year = 2018,
      • publisher = {IEEE}
      • }
    •  Jue Wang, Anoop Cherian, Fatih Porikli and Stephen Gould, "Video Representation Learning Using Discriminative Pooling", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
      BibTeX
      • @Inproceedings{cherian_representation_cvpr18,
      • author = {Wang, Jue and Cherian, Anoop and Porikli, Fatih and Gould, Stephen},
      • title = {Video Representation Learning Using Discriminative Pooling},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2018
      • }
    •  Suryansh Kumar, Anoop Cherian, Yuchao Dai and Hongdong Li, "Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
      BibTeX
      • @Inproceedings{cherian_rigid_cvpr18,
      • author = {Kumar, Suryansh and Cherian, Anoop and Dai, Yuchao and Li, Hongdong},
      • title = {Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2018
      • }
    •  Anoop Cherian, Suvrit Sra, Stephen Gould and Richard Hartley, "Non-Linear Temporal Subspace Representations for Activity Recognition", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
      BibTeX
      • @Inproceedings{cherian_temporal_cvpr18,
      • author = {Cherian, Anoop and Sra, Suvrit and Gould, Stephen and Hartley, Richard},
      • title = {Non-Linear Temporal Subspace Representations for Activity Recognition},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2018
      • }
    •  Anoop Cherian, Basura Fernando, Mehrtash Harandi and Stephen Gould, "Generalized Rank Pooling for Activity Recognition", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
      BibTeX
      • @Inproceedings{cherian2017generalized,
      • author = {Cherian, Anoop and Fernando, Basura and Harandi, Mehrtash and Gould, Stephen},
      • title = {Generalized Rank Pooling for Activity Recognition},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2017
      • }
    •  Anoop Cherian, Panagiotis Stanitsas, Mehrtash Harandi, Vassilios Morellas and Nikolaos Papanikolopoulos, "Learning Discriminative Alpha-Beta Divergences for Positive Definite Matrices", International Conference on Computer Vision (ICCV), 2017.
      BibTeX
      • @Inproceedings{cherian_rigid_iccv17,
      • author = {Cherian, Anoop and Stanitsas, Panagiotis and Harandi, Mehrtash and Morellas, Vassilios and Papanikolopoulos, Nikolaos},
      • title = {Learning Discriminative Alpha-Beta Divergences for Positive Definite Matrices},
      • booktitle = {International Conference on Computer Vision (ICCV)},
      • year = 2017
      • }
    •  Rodrigo Santa Cruz, Basura Fernando, Anoop Cherian and Stephen Gould, "DeepPermNet: Visual Permutation Learning", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
      BibTeX
      • @Inproceedings{cruz2017deeppermnet,
      • author = {Cruz, Rodrigo Santa and Fernando, Basura and Cherian, Anoop and Gould, Stephen},
      • title = {DeepPermNet: Visual Permutation Learning},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2017
      • }
    •  Anoop Cherian, Vassilios Morellas and Nikolaos Papanikolopoulos, "Bayesian Non-Parametric clustering for positive definite matrices", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2016.
      BibTeX
      • @Article{cherian2016bayesian,
      • author = {Cherian, Anoop and Morellas, Vassilios and Papanikolopoulos, Nikolaos},
      • title = {Bayesian Non-Parametric clustering for positive definite matrices},
      • journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
      • year = 2016,
      • publisher = {IEEE}
      • }
    •  Piotr Koniusz and Anoop Cherian, "Sparse coding for third-order super-symmetric tensor descriptors with application to texture recognition", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
      BibTeX
      • @Inproceedings{koniusz2016sparse,
      • author = {Koniusz, Piotr and Cherian, Anoop},
      • title = {Sparse coding for third-order super-symmetric tensor descriptors with application to texture recognition},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2016
      • }
    •  Piotr Koniusz, Anoop Cherian and Fatih Porikli, "Tensor representations via kernel linearization for action recognition from 3D skeletons", European Conference on Computer Vision (ECCV), 2016.
      BibTeX
      • @Inproceedings{koniusz2016tensor,
      • author = {Koniusz, Piotr and Cherian, Anoop and Porikli, Fatih},
      • title = {Tensor representations via kernel linearization for action recognition from 3D skeletons},
      • booktitle = {European Conference on Computer Vision (ECCV)},
      • year = 2016,
      • organization = {Springer}
      • }
    •  Anoop Cherian, Julien Mairal, Karteek Alahari and Cordelia Schmid, "Mixing body-part sequences for human pose estimation", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
      BibTeX
      • @Inproceedings{cherian2014mixing,
      • author = {Cherian, Anoop and Mairal, Julien and Alahari, Karteek and Schmid, Cordelia},
      • title = {Mixing body-part sequences for human pose estimation},
      • booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      • year = 2014
      • }
    •  Anoop Cherian, "Nearest neighbors using compact sparse codes", International Conference on Machine Learning (ICML), 2014.
      BibTeX
      • @Inproceedings{cherian2014nearest,
      • author = {Cherian, Anoop},
      • title = {Nearest neighbors using compact sparse codes},
      • booktitle = {International Conference on Machine Learning (ICML)},
      • year = 2014
      • }
    •  Anoop Cherian and Suvrit Sra, "Riemannian sparse coding for positive definite matrices", European Conference on Computer Vision (ECCV), 2014.
      BibTeX
      • @Inproceedings{cherian2014riemannian,
      • author = {Cherian, Anoop and Sra, Suvrit},
      • title = {Riemannian sparse coding for positive definite matrices},
      • booktitle = {European Conference on Computer Vision (ECCV)},
      • year = 2014,
      • organization = {Springer}
      • }
    •  Anoop Cherian, Suvrit Sra, Arindam Banerjee and Nikolaos Papanikolopoulos, "Jensen-Bregman logdet divergence with application to efficient similarity search for covariance matrices", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2013.
      BibTeX
      • @Article{cherian2013jensen,
      • author = {Cherian, Anoop and Sra, Suvrit and Banerjee, Arindam and Papanikolopoulos, Nikolaos},
      • title = {Jensen-Bregman logdet divergence with application to efficient similarity search for covariance matrices},
      • journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
      • year = 2013,
      • publisher = {IEEE}
      • }
    •  Anoop Cherian, Vassilios Morellas, Nikolaos Papanikolopoulos and Saad J Bedros, "Dirichlet process mixture models on symmetric positive definite matrices for appearance clustering in video surveillance applications", Computer Vision and Pattern Recognition (CVPR), 2011.
      BibTeX
      • @Inproceedings{cherian2011dirichlet,
      • author = {Cherian, Anoop and Morellas, Vassilios and Papanikolopoulos, Nikolaos and Bedros, Saad J},
      • title = {Dirichlet process mixture models on symmetric positive definite matrices for appearance clustering in video surveillance applications},
      • booktitle = {Computer Vision and Pattern Recognition (CVPR)},
      • year = 2011
      • }
    •  Anoop Cherian, Suvrit Sra, Arindam Banerjee and Nikolaos Papanikolopoulos, "Efficient similarity search for covariance matrices via the Jensen-Bregman LogDet divergence", International Conference on Computer Vision (ICCV), 2011.
      BibTeX
      • @Inproceedings{cherian2011efficient,
      • author = {Cherian, Anoop and Sra, Suvrit and Banerjee, Arindam and Papanikolopoulos, Nikolaos},
      • title = {Efficient similarity search for covariance matrices via the Jensen-Bregman LogDet divergence},
      • booktitle = {International Conference on Computer Vision (ICCV)},
      • year = 2011
      • }
    •  Suvrit Sra and Anoop Cherian, "Generalized dictionary learning for symmetric positive definite matrices with application to nearest neighbor retrieval", Machine Learning and Knowledge Discovery in Databases (ECML), 2011.
      BibTeX
      • @Article{sra2011generalized,
      • author = {Sra, Suvrit and Cherian, Anoop},
      • title = {Generalized dictionary learning for symmetric positive definite matrices with application to nearest neighbor retrieval},
      • journal = {Machine Learning and Knowledge Discovery in Databases (ECML)},
      • year = 2011
      • }
    •  Anoop Cherian, Vassilios Morellas and Nikolaos Papanikolopoulos, "Accurate 3D ground plane estimation from a single image", International Conference on Robotics and Automation, 2009.
      BibTeX
      • @Inproceedings{cherian2009accurate,
      • author = {Cherian, Anoop and Morellas, Vassilios and Papanikolopoulos, Nikolaos},
      • title = {Accurate 3D ground plane estimation from a single image},
      • booktitle = {International Conference on Robotics and Automation},
      • year = 2009
      • }
  • Software & Data Downloads

  • Videos

  • MERL Issued Patents

    • Title: "Artificial Intelligence System for Classification of Data Based on Contrastive Learning"
      Inventors: Cherian, Anoop; Aeron, Shuchin
      Patent No.: 11,809,988
      Issue Date: Nov 7, 2023
    • Title: "System and Method for Manipulating Two-Dimensional (2D) Images of Three-Dimensional (3D) Objects"
      Inventors: Marks, Tim; Medin, Safa; Cherian, Anoop; Wang, Ye
      Patent No.: 11,663,798
      Issue Date: May 30, 2023
    • Title: "InSeGAN: A Generative Approach to Instance Segmentation in Depth Images"
      Inventors: Cherian, Anoop; Pais, Goncalo; Marks, Tim; Sullivan, Alan
      Patent No.: 11,651,497
      Issue Date: May 16, 2023
    • Title: "Method and System for Scene-Aware Interaction"
      Inventors: Hori, Chiori; Cherian, Anoop; Chen, Siheng; Marks, Tim; Le Roux, Jonathan; Hori, Takaaki; Harsham, Bret A.; Vetro, Anthony; Sullivan, Alan
      Patent No.: 11,635,299
      Issue Date: Apr 25, 2023
    • Title: "Scene-Aware Video Encoder System and Method"
      Inventors: Cherian, Anoop; Hori, Chiori; Le Roux, Jonathan; Marks, Tim; Sullivan, Alan
      Patent No.: 11,582,485
      Issue Date: Feb 14, 2023
    • Title: "Low-latency Captioning System"
      Inventors: Hori, Chiori; Hori, Takaaki; Cherian, Anoop; Marks, Tim; Le Roux, Jonathan
      Patent No.: 11,445,267
      Issue Date: Sep 13, 2022
    • Title: "Anomaly Detector for Detecting Anomaly using Complementary Classifiers"
      Inventors: Cherian, Anoop; Wang, Jue
      Patent No.: 11,423,698
      Issue Date: Aug 23, 2022
    • Title: "System and Method for a Dialogue Response Generation System"
      Inventors: Hori, Chiori; Cherian, Anoop; Marks, Tim; Hori, Takaaki
      Patent No.: 11,264,009
      Issue Date: Mar 1, 2022
    • Title: "Scene-Aware Video Dialog"
      Inventors: Geng, Shijie; Gao, Peng; Cherian, Anoop; Hori, Chiori; Le Roux, Jonathan
      Patent No.: 11,210,523
      Issue Date: Dec 28, 2021