Computer Vision
Extracting meaning and building representations of visual objects and events in the world.
Our main research themes cover the areas of deep learning and artificial intelligence for object and action detection, classification and scene understanding, robotic vision and object manipulation, 3D processing and computational geometry, as well as simulation of physical systems to enhance machine learning systems.
Researchers
- Tim K. Marks
- Anoop Cherian
- Michael J. Jones
- Chiori Hori
- Matthew Brand
- Hassan Mansour
- Jonathan Le Roux
- Suhas Lohit
- Petros T. Boufounos
- Anthony Vetro
- Radu Corcodel
- Devesh K. Jha
- Siddarth Jain
- Dehong Liu
- Daniel N. Nikovski
- Diego Romeres
- Ye Wang
- Kuan-Chuan Peng
- Arvind Raghunathan
- Pedro Miraldo
- Gordon Wichern
- William S. Yerazunis
- Jose Amaya
- Toshiaki Koike-Akino
- Yanting Ma
- Philip V. Orlik
- Kei Ota
- Huifang Sun
- Yebin Wang
- Moitreya Chatterjee
- Jing Liu
Awards
-
AWARD Best Paper - Honorable Mention Award at WACV 2021
Date: January 6, 2021
Awarded to: Rushil Anirudh, Suhas Lohit, Pavan Turaga
MERL Contact: Suhas Lohit
Research Areas: Computational Sensing, Computer Vision, Machine Learning
Brief: A team of researchers from Mitsubishi Electric Research Laboratories (MERL), Lawrence Livermore National Laboratory (LLNL) and Arizona State University (ASU) received the Best Paper Honorable Mention Award at WACV 2021 for their paper "Generative Patch Priors for Practical Compressive Image Recovery".
The paper proposes a novel model of natural images as a composition of small patches which are obtained from a deep generative network. This is unlike prior approaches where the networks attempt to model image-level distributions and are unable to generalize outside training distributions. The key idea in this paper is that learning patch-level statistics is far easier. As the authors demonstrate, this model can then be used to efficiently solve challenging inverse problems in imaging such as compressive image recovery and inpainting even from very few measurements for diverse natural scenes.
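To make the idea of a patch-level prior concrete, here is a minimal toy sketch, not the authors' method. It assumes a hypothetical *linear* patch generator `G` (a random matrix standing in for the deep generative network described above), a random Gaussian measurement matrix `A`, and non-overlapping patches. Every name and size is illustrative. Because each patch is described by only a few latent values, the full signal can be recovered from fewer measurements than pixels.

```python
import numpy as np

def recover_from_patch_prior(num_patches=4, patch_dim=16, latent_dim=3, m=24, seed=0):
    """Toy compressive recovery using a linear stand-in for a patch generator."""
    rng = np.random.default_rng(seed)
    n = num_patches * patch_dim  # total number of pixels in the signal

    # ASSUMPTION: a linear map stands in for the deep patch generator.
    G = rng.standard_normal((patch_dim, latent_dim))

    # Ground-truth signal: each patch is G @ z_i, patches are concatenated.
    z_true = rng.standard_normal((num_patches, latent_dim))
    x_true = (z_true @ G.T).reshape(-1)

    # Compressive measurements y = A x with m < n.
    A = rng.standard_normal((m, n)) / np.sqrt(m)
    y = A @ x_true

    # The whole signal is B @ z, where B is block-diagonal with copies of G.
    B = np.kron(np.eye(num_patches), G)

    # Solve for the latent codes by least squares, then rebuild the signal.
    z_hat, *_ = np.linalg.lstsq(A @ B, y, rcond=None)
    x_hat = B @ z_hat

    rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
    return x_true, x_hat, rel_err

x_true, x_hat, rel_err = recover_from_patch_prior()
print(f"pixels: {x_true.size}, measurements: 24, relative error: {rel_err:.2e}")
```

With a nonlinear generator, the least-squares step would presumably be replaced by iterative optimization over the latent codes; the linear case is used here only because it can be solved in closed form.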
-
AWARD MERL Researchers win Best Paper Award at ICCV 2019 Workshop on Statistical Deep Learning in Computer Vision
Date: October 27, 2019
Awarded to: Abhinav Kumar, Tim K. Marks, Wenxuan Mou, Chen Feng, Xiaoming Liu
MERL Contact: Tim K. Marks
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
Brief: MERL researcher Tim Marks, former MERL interns Abhinav Kumar and Wenxuan Mou, and MERL consultants Professor Chen Feng (NYU) and Professor Xiaoming Liu (MSU) received the Best Oral Paper Award at the IEEE/CVF International Conference on Computer Vision (ICCV) 2019 Workshop on Statistical Deep Learning in Computer Vision (SDL-CV) held in Seoul, Korea. Their paper, entitled "UGLLI Face Alignment: Estimating Uncertainty with Gaussian Log-Likelihood Loss," describes a method which, given an image of a face, estimates not only the locations of facial landmarks but also the uncertainty of each landmark location estimate.
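As a rough illustration of how a Gaussian log-likelihood loss can couple a location estimate with its uncertainty, the sketch below computes the standard negative log-likelihood of a 2D point under a Gaussian with a predicted mean and covariance. This is the generic textbook formula, not the paper's implementation; the function name `gaussian_nll_2d` and the example values are hypothetical.

```python
import math

def gaussian_nll_2d(pred, cov, target):
    """Negative log-likelihood of a 2D target under N(pred, cov).

    pred, target: (x, y) pairs; cov: symmetric 2x2 covariance as nested tuples.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance must be positive definite")
    dx = target[0] - pred[0]
    dy = target[1] - pred[1]
    # Squared Mahalanobis distance, using the closed-form 2x2 inverse.
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    # -log p = log(2*pi) + 0.5 * log(det) + 0.5 * maha
    return math.log(2 * math.pi) + 0.5 * math.log(det) + 0.5 * maha

# Same 3-pixel error, scored under a confident and an uncertain prediction.
identity = ((1.0, 0.0), (0.0, 1.0))
wide = ((9.0, 0.0), (0.0, 9.0))
confident = gaussian_nll_2d((0.0, 0.0), identity, (3.0, 0.0))
uncertain = gaussian_nll_2d((0.0, 0.0), wide, (3.0, 0.0))
print(confident > uncertain)  # True
```

The log-determinant term penalizes large covariances while the Mahalanobis term penalizes errors that are large relative to the predicted covariance, so a model trained with such a loss is pushed to report high uncertainty only where its location errors are actually large.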
-
AWARD CVPR 2011 Longuet-Higgins Prize
Date: June 25, 2011
Awarded to: Paul A. Viola and Michael J. Jones
Awarded for: "Rapid Object Detection using a Boosted Cascade of Simple Features"
Awarded by: Conference on Computer Vision and Pattern Recognition (CVPR)
MERL Contact: Michael J. Jones
Research Area: Machine Learning
Brief: Awarded to the paper from 10 years ago with the largest impact on the field: "Rapid Object Detection using a Boosted Cascade of Simple Features", originally published at the Conference on Computer Vision and Pattern Recognition (CVPR 2001).
News & Events
-
TALK [MERL Seminar Series 2023] Dr. Suraj Srinivas presents a talk titled "Pitfalls and Opportunities in Interpretable Machine Learning"
Date & Time: Tuesday, March 14, 2023; 1:00 PM
Speaker: Suraj Srinivas, Harvard University
MERL Host: Suhas Lohit
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
Abstract: In this talk, I will discuss our recent research on understanding post-hoc interpretability. I will begin by introducing a characterization of post-hoc interpretability methods as local function approximators, and the implications of this viewpoint, including a no-free-lunch theorem for explanations. Next, we shall challenge the assumption that post-hoc explanations provide information about a model's discriminative capabilities p(y|x) and demonstrate that many common methods instead rely on a conditional generative model p(x|y). This observation underscores the importance of being cautious when using such methods in practice. Finally, I will propose to resolve this via regularization of model structure, specifically by training low curvature neural networks, resulting in improved model robustness and stable gradients.
-
EVENT MERL's Virtual Open House 2022
Date & Time: Monday, December 12, 2022; 1:00pm-5:30pm ET
Location: Mitsubishi Electric Research Laboratories (MERL)/Virtual
Research Areas: Applied Physics, Artificial Intelligence, Communications, Computational Sensing, Computer Vision, Control, Data Analytics, Dynamical Systems, Electric Systems, Electronic and Photonic Devices, Machine Learning, Multi-Physical Modeling, Optimization, Robotics, Signal Processing, Speech & Audio, Digital Video
Brief: Join MERL's virtual open house on December 12th, 2022! The event features a keynote, live sessions, research area booths, and opportunities to interact with our research team. Discover who we are and what we do, and learn about internship and employment opportunities.
Internships
-
CV1992: High precision pose estimation of deformable objects
MERL is seeking a highly motivated intern to conduct original research in high precision pose estimation of deformable objects. Applicants are required to have a strong background in image processing, machine vision and point cloud processing using depth cameras. The internship is open to PhD students, preferably specializing in Computer Vision, with a strong publication record, solid programming skills in Python and/or C/C++, and preferably some experience using tactile sensors. Internship duration and start date are flexible.
Recent Publications
- "Discriminative 3D Shape Modeling for Few-Shot Instance Segmentation", IEEE International Conference on Robotics and Automation (ICRA), March 2023. TR2023-010
  @inproceedings{Cherian2023mar,
    author = {Cherian, Anoop and Jain, Siddarth and Marks, Tim K. and Sullivan, Alan},
    title = {Discriminative 3D Shape Modeling for Few-Shot Instance Segmentation},
    booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
    year = 2023,
    month = mar,
    url = {https://www.merl.com/publications/TR2023-010}
  }
- "H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions", IEEE International Conference on Robotics and Automation (ICRA), March 2023. TR2023-009
  @inproceedings{Ota2023mar,
    author = {Ota, Kei and Tung, Hsiao-Yu and Smith, Kevin and Cherian, Anoop and Marks, Tim K. and Sullivan, Alan and Kanezaki, Asako and Tenenbaum, Joshua B.},
    title = {H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions},
    booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
    year = 2023,
    month = mar,
    url = {https://www.merl.com/publications/TR2023-009}
  }
- "Fast and Accurate 3D Registration from Line Intersection Constraints", International Journal of Computer Vision, February 2023. TR2023-007
  @article{Mateus2023feb,
    author = {Mateus, Andre and Ranade, Siddhant and Ramalingam, Srikumar and Miraldo, Pedro},
    title = {Fast and Accurate 3D Registration from Line Intersection Constraints},
    journal = {International Journal of Computer Vision},
    year = 2023,
    month = feb,
    url = {https://www.merl.com/publications/TR2023-007}
  }
- "Cross-Domain Video Anomaly Detection without Target Domain Adaptation", IEEE Winter Conference on Applications of Computer Vision (WACV), Crandall, D., Gong, B., Lee, Y. J., Souvenir, R., and Yu, S., Eds., DOI: 10.1109/WACV56688.2023.00261, January 2023, pp. 2578-2590. TR2023-001
  @inproceedings{Aich2023jan,
    author = {Aich, Abhishek and Peng, Kuan-Chuan and Roy-Chowdhury, Amit K.},
    title = {Cross-Domain Video Anomaly Detection without Target Domain Adaptation},
    booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV)},
    year = 2023,
    editor = {Crandall, D. and Gong, B. and Lee, Y. J. and Souvenir, R. and Yu, S.},
    pages = {2578--2590},
    month = jan,
    publisher = {IEEE},
    doi = {10.1109/WACV56688.2023.00261},
    issn = {2642-9381},
    isbn = {978-1-6654-9346-8},
    url = {https://www.merl.com/publications/TR2023-001}
  }
- "Learning Occlusion-Aware Dense Correspondences for Multi-Modal Images", IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), November 2022. TR2022-149
  @inproceedings{Shimoya2022nov,
    author = {Shimoya, Ryosuke and Morimoto, Tahashi and van Baar, Jeroen and Boufounos, Petros T. and Ma, Yanting and Mansour, Hassan},
    title = {Learning Occlusion-Aware Dense Correspondences for Multi-Modal Images},
    booktitle = {IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)},
    year = 2022,
    month = nov,
    url = {https://www.merl.com/publications/TR2022-149}
  }
- "Learning Partial Equivariances from Data", Advances in Neural Information Processing Systems (NeurIPS), November 2022. TR2022-148
  @inproceedings{Romero2022nov,
    author = {Romero, David and Lohit, Suhas},
    title = {Learning Partial Equivariances from Data},
    booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
    year = 2022,
    month = nov,
    url = {https://www.merl.com/publications/TR2022-148}
  }
- "What Makes a “Good” Data Augmentation in Knowledge Distillation – A Statistical Perspective", Advances in Neural Information Processing Systems (NeurIPS), November 2022. TR2022-147
  @inproceedings{Wang2022nov,
    author = {Wang, Huan and Lohit, Suhas and Jones, Michael J. and Fu, Raymond},
    title = {What Makes a “Good” Data Augmentation in Knowledge Distillation – A Statistical Perspective},
    booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
    year = 2022,
    month = nov,
    url = {https://www.merl.com/publications/TR2022-147}
  }
- "Learning Audio-Visual Dynamics Using Scene Graphs for Audio Source Separation", Advances in Neural Information Processing Systems (NeurIPS), November 2022. TR2022-140
  @inproceedings{Chatterjee2022nov,
    author = {Chatterjee, Moitreya and Ahuja, Narendra and Cherian, Anoop},
    title = {Learning Audio-Visual Dynamics Using Scene Graphs for Audio Source Separation},
    booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
    year = 2022,
    month = nov,
    url = {https://www.merl.com/publications/TR2022-140}
  }
Videos
- Human Perspective Scene Understanding via Multimodal Sensing
- [MERL Seminar Series Spring 2022] Self-Supervised Scene Representation Learning
- [MERL Seminar Series Spring 2022] Learning Speech Representations with Multimodal Self-Supervision
- HealthCam: A system for non-contact monitoring of vital signs
- [MERL Seminar Series 2021] Learning to See by Moving: Self-supervising 3D scene representations for perception, control, and visual reasoning
- [MERL Seminar Series 2021] Look and Listen: From Semantic to Spatial Audio-Visual Perception
- Towards Human-Level Learning of Complex Physical Puzzles
- Scene-Aware Interaction Technology
- 3D Object Discovery and Modeling Using Single RGB-D Images Containing Multiple Object Instances
- Joint 3D Reconstruction of a Static Scene and Moving Objects
- Direct Multichannel Tracking
- FoldingNet: Interpretable Unsupervised Learning on 3D Point Clouds
- FasTFit: A fast T-spline fitting algorithm
- CASENet: Deep Category-Aware Semantic Edge Detection
- Object Detection and Tracking in RGB-D SLAM via Hierarchical Feature Grouping
- Pinpoint SLAM: A Hybrid of 2D and 3D Simultaneous Localization and Mapping for RGB-D Sensors
- Action Detection Using A Deep Recurrent Neural Network
- 3D Reconstruction
- MERL Research on Autonomous Vehicles
- Saffron - Digital Type System
- Obstacle Detection
- Semantic Scene Labeling
- Robot Bin Picking
- Dose optimization for particle beam therapy
- Sapphire - High Accuracy NC Milling Simulation
- Deep Hierarchical Parsing for Semantic Segmentation
- Global Local Face Upsampling Network
- Gaussian Conditional Random Field Network for Semantic Segmentation
- Fast Graspability Evaluation on Single Depth Maps for Bin Picking with General Grippers
- Point-Plane SLAM for Hand-Held 3D Sensors
- Tracking an RGB-D Camera Using Points and Planes
- Fast Plane Extraction in Organized Point Clouds Using Agglomerative Hierarchical Clustering
- Calibration of Non-Overlapping Cameras Using an External SLAM System
- Voting-Based Pose Estimation for Robotic Assembly Using a 3D Sensor
- Fast Object Localization and Pose Estimation in Heavy Clutter for Robotic Bin Picking
- Learning to rank 3D features
Software Downloads
- SOurce-free Cross-modal KnowledgE Transfer
- Instance Segmentation GAN
- Audio Visual Scene-Graph Segmentor
- Generalized One-class Discriminative Subspaces
- Generating Visual Dynamics from Sound and Context
- Adversarially-Contrastive Optimal Transport
- MotionNet
- Contact-Implicit Trajectory Optimization
- FoldingNet++
- Landmarks’ Location, Uncertainty, and Visibility Likelihood
- Gradient-based Nikaido-Isoda
- Circular Maze Environment
- Discriminative Subspace Pooling
- Kernel Correlation Network
- Fast Resampling on Point Clouds via Graphs
- FoldingNet
- Joint Geodesic Upsampling
- Plane Extraction using Agglomerative Clustering
- Partial Group Convolutional Neural Networks