Computer Vision
Extracting meaning and building representations of visual objects and events in the world.
Our main research themes cover the areas of deep learning and artificial intelligence for object and action detection, classification and scene understanding, robotic vision and object manipulation, 3D processing and computational geometry, as well as simulation of physical systems to enhance machine learning systems.
Researchers
Jeroen van Baar
Tim Marks
Michael Jones
Anoop Cherian
Alan Sullivan
Matthew Brand
Chiori Hori
Hassan Mansour
Ronald Perry
Takaaki Hori
Jay Thornton
Radu Corcodel
Petros Boufounos
Devesh Jha
Suhas Lohit
Daniel Nikovski
Arvind Raghunathan
Diego Romeres
Anthony Vetro
Ye Wang
Dehong Liu
Gordon Wichern
William Yerazunis
Bret Harsham
Siddarth Jain
Toshiaki Koike-Akino
Philip Orlik
Huifang Sun
Yebin Wang
Kuan-Chuan Peng
-
Awards
-
AWARD Best Paper - Honorable Mention Award at WACV 2021
Date: January 6, 2021
Awarded to: Rushil Anirudh, Suhas Lohit, Pavan Turaga
MERL Contact: Suhas Lohit
Research Areas: Computational Sensing, Computer Vision, Machine Learning
Brief: A team of researchers from Mitsubishi Electric Research Laboratories (MERL), Lawrence Livermore National Laboratory (LLNL) and Arizona State University (ASU) received the Best Paper Honorable Mention Award at WACV 2021 for their paper "Generative Patch Priors for Practical Compressive Image Recovery".
The paper proposes a novel model of natural images as a composition of small patches which are obtained from a deep generative network. This is unlike prior approaches where the networks attempt to model image-level distributions and are unable to generalize outside training distributions. The key idea in this paper is that learning patch-level statistics is far easier. As the authors demonstrate, this model can then be used to efficiently solve challenging inverse problems in imaging such as compressive image recovery and inpainting even from very few measurements for diverse natural scenes.
-
AWARD MERL Researchers win Best Paper Award at ICCV 2019 Workshop on Statistical Deep Learning in Computer Vision
Date: October 27, 2019
Awarded to: Abhinav Kumar, Tim K. Marks, Wenxuan Mou, Chen Feng, Xiaoming Liu
MERL Contact: Tim Marks
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
Brief: MERL researcher Tim Marks, former MERL interns Abhinav Kumar and Wenxuan Mou, and MERL consultants Professor Chen Feng (NYU) and Professor Xiaoming Liu (MSU) received the Best Oral Paper Award at the IEEE/CVF International Conference on Computer Vision (ICCV) 2019 Workshop on Statistical Deep Learning in Computer Vision (SDL-CV) held in Seoul, Korea. Their paper, entitled "UGLLI Face Alignment: Estimating Uncertainty with Gaussian Log-Likelihood Loss," describes a method which, given an image of a face, estimates not only the locations of facial landmarks but also the uncertainty of each landmark location estimate.
-
AWARD R&D100 award for Deep Learning-based Water Detector
Date: November 16, 2018
Awarded to: Ziming Zhang, Alan Sullivan, Hideaki Maehara, Kenji Taira, Kazuo Sugimoto
MERL Contact: Alan Sullivan
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
Brief: Researchers and developers from MERL, Mitsubishi Electric and Mitsubishi Electric Engineering (MEE) have been recognized with an R&D100 award for the development of a deep learning-based water detector. Automatic detection of water levels in rivers and streams is critical for early warning of flash flooding. Existing systems require that a height gauge be placed in the river or stream, which is costly and sometimes impossible. The new deep learning-based water detector uses only images from a video camera, along with 3D measurements of the river valley, to determine water levels and warn of potential flooding. The system is robust to lighting and weather conditions, working well at night as well as in fog or rain. Deep learning is a relatively new technique that uses neural networks trained on real data to perform human-level recognition tasks. This work is powered by Mitsubishi Electric's Maisart AI technology.
-
News & Events
-
EVENT MERL Virtual Open House 2020
Date & Time: Wednesday, December 9, 2020; 1:00-5:00PM EST
MERL Contacts: Elizabeth Phillips; Jeroen van Baar; Anthony Vetro
Location: Virtual
Research Areas: Applied Physics, Artificial Intelligence, Communications, Computational Sensing, Computer Vision, Control, Data Analytics, Dynamical Systems, Electric Systems, Electronic and Photonic Devices, Machine Learning, Multi-Physical Modeling, Optimization, Robotics, Signal Processing, Speech & Audio
Brief: MERL will host a virtual open house on December 9, 2020. Live sessions will be held from 1-5pm EST, including an overview of recent activities by our research groups and a talk by Prof. Pierre Moulin of the University of Illinois at Urbana-Champaign on adversarial machine learning. Registered attendees will also be able to browse our virtual booths at their convenience and connect with our research staff about engagement opportunities, including internship, post-doc and research scientist openings, as well as visiting faculty positions.
Registration: https://mailchi.mp/merl/merl-virtual-open-house-2020
Schedule: https://www.merl.com/events/voh20
Current internship and employment openings:
https://www.merl.com/internship/openings
https://www.merl.com/employment/employment
Information about working at MERL:
https://www.merl.com/employment
-
NEWS Computer vision and robotics researcher Siddarth Jain appointed as an Associate Editor for the IEEE Robotics and Automation Letters (RA-L)
Date: October 13, 2020
MERL Contact: Siddarth Jain
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning, Robotics
Brief: Computer vision and robotics researcher Siddarth Jain has been appointed to the editorial board of the IEEE Robotics and Automation Letters (RA-L) as an Associate Editor. Siddarth joined MERL in September 2019 after obtaining his Ph.D. in robotics from Northwestern University, where he developed novel robotics systems to help people with motor impairments perform activities of daily living.
RA-L publishes peer-reviewed articles in areas of robotics and automation. RA-L also offers authors the opportunity to publish a paper in a peer-reviewed journal and present the same paper at one of the annual flagship robotics conferences of IEEE RAS, including ICRA, IROS, and CASE.
-
Internships
-
SP1551: Algorithms for Large-Scale Optimal Transport
The Computational Sensing team at MERL is seeking motivated individuals to develop scalable optimal transport algorithms. Ideal candidates should be Ph.D. students with research experience in optimal transport and scalable optimal transport algorithms. Experience with GPU implementations is a plus. Publication of the results produced during our internships is expected. The duration of the internships is anticipated to be 3 months. Start date is flexible. This internship is preferred to be onsite at MERL, but may be done remotely where you live if the COVID pandemic makes it necessary.
-
CV1552: Multimodal Reasoning
MERL is looking for a self-motivated intern to work on problems at the intersection of video understanding, audio processing, and language models. The ideal candidate would be a PhD student with a strong mathematical background in machine learning and computer vision. The candidate must have prior experience in using deep learning methods for image and video representations (such as using scene graphs) and deep audio analysis (such as source separation, localization, etc.). Proficiency in Python and flexibility in using different deep learning software (especially PyTorch) are expected. The intern is expected to collaborate with computer vision and speech teams at MERL to develop algorithms and prepare manuscripts for scientific publications. The internship is for 3 months with a flexible start date. This internship is preferred to be onsite at MERL, but may be done remotely where you live if the COVID pandemic makes it necessary.
-
SP1585: Three-Dimensional Imaging from Compton Camera
The Computational Sensing team at MERL is seeking motivated and qualified individuals to develop algorithms that reconstruct a three-dimensional distribution of a radioactive source when observed using a Compton camera. The project goal is to improve the performance and develop an uncertainty analysis of these algorithms. Ideal candidates should be Ph.D. students and have a solid background and publication record in 3D Compton imaging. Experience in computational tomography, imaging inverse problems, and large-scale optimization is also preferred. Publication of the results produced during our internships is expected. The duration of the internships is anticipated to be 3-6 months. Start date is flexible. This internship is preferred to be onsite at MERL, but may be done remotely where you live if the COVID pandemic makes it necessary.
-
Recent Publications
- "Recovering Trajectories of Unmarked Joints in 3D Human Actions Using Latent Space Optimization", IEEE Winter Conference on Applications of Computer Vision (WACV), January 2021. TR2021-004
@inproceedings{Lohit2021jan,
  author = {Lohit, Suhas and Anirudh, Rushil and Turaga, Pavan},
  title = {Recovering Trajectories of Unmarked Joints in 3D Human Actions Using Latent Space Optimization},
  booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV)},
  year = 2021,
  month = jan,
  url = {https://www.merl.com/publications/TR2021-004}
}
- "Generative Patch Priors for Practical Compressive Image Recovery", IEEE Winter Conference on Applications of Computer Vision (WACV), January 2021. TR2021-003
@inproceedings{Anirudh2021jan,
  author = {Anirudh, Rushil and Lohit, Suhas and Turaga, Pavan},
  title = {Generative Patch Priors for Practical Compressive Image Recovery},
  booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV)},
  year = 2021,
  month = jan,
  url = {https://www.merl.com/publications/TR2021-003}
}
- "Near-Infrared Imaging Photoplethysmography During Driving", IEEE Transactions on Intelligent Transportation Systems, December 2020. TR2020-161
@article{Nowara2020dec,
  author = {Nowara, Ewa and Marks, Tim and Mansour, Hassan and Veeraraghavan, Ashok},
  title = {Near-Infrared Imaging Photoplethysmography During Driving},
  journal = {IEEE Transactions on Intelligent Transportation Systems},
  year = 2020,
  month = dec,
  url = {https://www.merl.com/publications/TR2020-161}
}
- "Spatio-Temporal Graph Scattering Transform", IEEE Transactions on Pattern Analysis and Machine Intelligence, December 2020. TR2020-166
@article{Chen2020dec,
  author = {Chen, Siheng and Li, Maosen and Chen, Xu and Zhang, Ya and Wang, Yanfeng and Tian, Qi},
  title = {Spatio-Temporal Graph Scattering Transform},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year = 2020,
  month = dec,
  url = {https://www.merl.com/publications/TR2020-166}
}
- "Interactive Tactile Perception for Classification of Novel Object Instances", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), November 2020. TR2020-143
@inproceedings{Corcodel2020nov,
  author = {Corcodel, Radu and Jain, Siddarth and van Baar, Jeroen},
  title = {Interactive Tactile Perception for Classification of Novel Object Instances},
  booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year = 2020,
  month = nov,
  url = {https://www.merl.com/publications/TR2020-143}
}
- "Sound2Sight: Generating Visual Dynamics from Sound and Context", European Conference on Computer Vision (ECCV), Vedaldi, A., Bischof, H., Brox, Th., Frahm, J.-M., Eds., August 2020. TR2020-121
@inproceedings{Cherian2020aug,
  author = {Cherian, Anoop and Chatterjee, Moitreya and Ahuja, Narendra},
  title = {Sound2Sight: Generating Visual Dynamics from Sound and Context},
  booktitle = {European Conference on Computer Vision (ECCV)},
  year = 2020,
  editor = {Vedaldi, A. and Bischof, H. and Brox, Th. and Frahm, J.-M.},
  month = aug,
  publisher = {Springer},
  url = {https://www.merl.com/publications/TR2020-121}
}
- "Representation Learning via Adversarially-Contrastive Optimal Transport", International Conference on Machine Learning (ICML), H. Daumé and A. Singh, Eds., July 2020, pp. 10675-10685. TR2020-093
@inproceedings{Cherian2020jul,
  author = {Cherian, Anoop and Aeron, Shuchin},
  title = {Representation Learning via Adversarially-Contrastive Optimal Transport},
  booktitle = {International Conference on Machine Learning (ICML)},
  year = 2020,
  editor = {H. Daumé and A. Singh},
  pages = {10675--10685},
  month = jul,
  url = {https://www.merl.com/publications/TR2020-093}
}
- "Collaborative Motion Prediction via Neural Motion Message Passing", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2020. TR2020-072
@inproceedings{Hu2020jun,
  author = {Hu, Yue and Chen, Siheng and Zhang, Ya and Gu, Xiao},
  title = {Collaborative Motion Prediction via Neural Motion Message Passing},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = 2020,
  month = jun,
  url = {https://www.merl.com/publications/TR2020-072}
}
-
Videos
-
Towards Human-Level Learning of Complex Physical Puzzles
-
3D Object Discovery and Modeling Using Single RGB-D Images Containing Multiple Object Instances
-
Joint 3D Reconstruction of a Static Scene and Moving Objects
-
Direct Multichannel Tracking
-
FoldingNet: Interpretable Unsupervised Learning on 3D Point Clouds
-
FasTFit: A fast T-spline fitting algorithm
-
CASENet: Deep Category-Aware Semantic Edge Detection
-
Object Detection and Tracking in RGB-D SLAM via Hierarchical Feature Grouping
-
Pinpoint SLAM: A Hybrid of 2D and 3D Simultaneous Localization and Mapping for RGB-D Sensors
-
Action Detection Using A Deep Recurrent Neural Network
-
Saffron - Digital Type System
-
Sapphire - High Accuracy NC Milling Simulation
-
MERL Research on Autonomous Vehicles
-
Dose optimization for particle beam therapy
-
3D Reconstruction
-
Robot Bin Picking
-
Semantic Scene Labeling
-
Obstacle Detection
-
Deep Hierarchical Parsing for Semantic Segmentation
-
Global Local Face Upsampling Network
-
Gaussian Conditional Random Field Network for Semantic Segmentation
-
Fast Graspability Evaluation on Single Depth Maps for Bin Picking with General Grippers
-
Point-Plane SLAM for Hand-Held 3D Sensors
-
Tracking an RGB-D Camera Using Points and Planes
-
Fast Plane Extraction in Organized Point Clouds Using Agglomerative Hierarchical Clustering
-
Calibration of Non-Overlapping Cameras Using an External SLAM System
-
Voting-Based Pose Estimation for Robotic Assembly Using a 3D Sensor
-
Fast Object Localization and Pose Estimation in Heavy Clutter for Robotic Bin Picking
-
Learning to rank 3D features
-
Software Downloads
-
Generating Visual Dynamics from Sound and Context
-
Adversarially-Contrastive Optimal Transport
-
MotionNet
-
Contact-Implicit Trajectory Optimization
-
FoldingNet++
-
Landmarks’ Location, Uncertainty, and Visibility Likelihood
-
Gradient-based Nikaido-Isoda
-
Circular Maze Environment
-
Discriminative Subspace Pooling
-
Kernel Correlation Network
-
Fast Resampling on Point Clouds via Graphs
-
FoldingNet
-
Joint Geodesic Upsampling
-
Plane Extraction using Agglomerative Clustering
-