TR2017-068

MonoRGBD-SLAM: Simultaneous Localization and Mapping Using Both Monocular and RGBD Cameras


    •  Yousif, K., Taguchi, Y., Ramalingam, S., "MonoRGBD-SLAM: Simultaneous Localization and Mapping Using Both Monocular and RGBD Cameras", IEEE International Conference on Robotics and Automation (ICRA), DOI: 10.1109/ICRA.2017.7989521, May 2017.
      BibTeX:
      @inproceedings{Yousif2017may2,
        author = {Yousif, Khalid and Taguchi, Yuichi and Ramalingam, Srikumar},
        title = {MonoRGBD-SLAM: Simultaneous Localization and Mapping Using Both Monocular and RGBD Cameras},
        booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
        year = 2017,
        month = may,
        doi = {10.1109/ICRA.2017.7989521},
        url = {https://www.merl.com/publications/TR2017-068}
      }
  • Research Areas:

    Computer Vision, Robotics

Abstract:

RGBD SLAM systems have shown impressive results, but the limited field of view (FOV) and depth range of typical RGBD cameras still cause problems for registering distant frames. Monocular SLAM systems, in contrast, can exploit wide-angle cameras and do not have the depth range limitation, but are unstable for textureless scenes. We present a SLAM system that uses both an RGBD camera and a wide-angle monocular camera to combine the advantages of the two sensors. Our system extracts 3D point features from RGBD frames and 2D point features from monocular frames, which are used to perform both RGBD-to-RGBD and RGBD-to-monocular registration. To compensate for the different FOV and resolution of the cameras, we generate multiple virtual images for each wide-angle monocular image and use the feature descriptors computed on the virtual images to perform the RGBD-to-monocular matches. To compute the poses of the frames, we construct a graph where nodes represent RGBD and monocular frames and edges denote the pairwise registration results between the nodes. We compute the global poses of the nodes by first finding the minimum spanning trees (MSTs) of the graph and then pruning edges that have inconsistent poses due to possible mismatches using the MST result. We finally run bundle adjustment on the graph using all the consistent edges. Experimental results show that our system registers a larger number of frames than a system using only an RGBD camera, leading to larger-scale 3D reconstruction.
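
The graph step described in the abstract (spanning trees first, then pruning of inconsistent edges) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes toy 2D rigid poses (x, y, theta) in place of full 6-DOF camera poses, a hypothetical per-edge cost (for example, a registration error) as the spanning-tree weight, and hypothetical thresholds `max_trans` and `max_rot`. All function names are illustrative, and the final bundle adjustment is omitted.

```python
import math
from collections import defaultdict, deque


def compose(a, b):
    """Compose 2D rigid poses (x, y, theta): express b, given in frame a, in a's parent frame."""
    ax, ay, at = a
    bx, by, bt = b
    c, s = math.cos(at), math.sin(at)
    return (ax + c * bx - s * by, ay + s * bx + c * by, at + bt)


def inverse(a):
    """Invert a 2D rigid pose: (R, t) -> (R^T, -R^T t)."""
    ax, ay, at = a
    c, s = math.cos(at), math.sin(at)
    return (-(c * ax + s * ay), s * ax - c * ay, -at)


def spanning_forest(num_nodes, edges):
    """Kruskal's algorithm; returns indices of edges in a minimum spanning forest.

    Each edge is (i, j, rel_pose, cost), where rel_pose is the pose of node j
    in the frame of node i. The cost is an assumed registration-quality weight.
    """
    parent = list(range(num_nodes))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    tree = []
    for k in sorted(range(len(edges)), key=lambda k: edges[k][3]):
        ri, rj = find(edges[k][0]), find(edges[k][1])
        if ri != rj:
            parent[ri] = rj
            tree.append(k)
    return tree


def propagate_poses(num_nodes, edges, tree):
    """Chain relative poses along tree edges; each component's root sits at the origin."""
    adj = defaultdict(list)
    for k in tree:
        i, j, rel, _ = edges[k]
        adj[i].append((j, rel))
        adj[j].append((i, inverse(rel)))
    poses = {}
    for root in range(num_nodes):
        if root in poses:
            continue
        poses[root] = (0.0, 0.0, 0.0)
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v, rel in adj[u]:
                if v not in poses:
                    poses[v] = compose(poses[u], rel)
                    queue.append(v)
    return poses


def consistent_edges(edges, tree, poses, max_trans=0.1, max_rot=0.1):
    """Keep tree edges, plus non-tree edges that agree with the tree-derived poses."""
    tree_set = set(tree)
    kept = []
    for k, (i, j, rel, _) in enumerate(edges):
        if k not in tree_set:
            predicted = compose(inverse(poses[i]), poses[j])
            err = compose(inverse(predicted), rel)
            rot = abs(math.atan2(math.sin(err[2]), math.cos(err[2])))
            if math.hypot(err[0], err[1]) > max_trans or rot > max_rot:
                continue  # treated as a likely mismatch and pruned
        kept.append(k)
    return kept


# Toy graph: three frames on a line, one good loop edge, one bad registration.
edges = [
    (0, 1, (1.0, 0.0, 0.0), 1.0),
    (1, 2, (1.0, 0.0, 0.0), 1.0),
    (0, 2, (2.0, 0.0, 0.0), 2.0),  # agrees with the tree
    (0, 2, (0.5, 3.0, 1.0), 5.0),  # simulated mismatch
]
tree = spanning_forest(3, edges)
poses = propagate_poses(3, edges, tree)
kept = consistent_edges(edges, tree, poses)
```

In this toy graph the spanning forest uses the two low-cost chain edges, the consistent 0-to-2 edge survives the check, and the simulated mismatch is dropped. In a full system the surviving edges would then feed the bundle adjustment stage mentioned in the abstract.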