Detection, Tracking and 3D Modeling of Objects with Sparse RGB-D SLAM and Interactive Perception

We present an interactive perception system that enables an autonomous agent to deliberately interact with its environment and produce 3D object models. The system verifies object hypotheses through interaction while maintaining a separate 3D SLAM map for each rigidly moving object hypothesis in the scene. We rely on depth-based segmentation and a multigroup registration scheme to assign features to object maps. Our main contribution is a novel segment classification scheme that allows the system to handle incorrect object hypotheses, which are common in cluttered environments because of touching objects or occlusion. We start with a single map and initialize further object maps based on the outcome of depth segment classification. For each existing map, we select a segment to interact with and execute a manipulation primitive intended to disturb it. If at least one of the resulting depth segments does not follow the dominant motion pattern of its map, we split that map, yielding updated object hypotheses. We show qualitative results with a Fetch manipulator and objects of various shapes, which demonstrate the viability of the method for identifying and modeling multiple objects through repeated interactions.
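
The map-splitting rule described above can be illustrated with a minimal sketch. It is not the classification scheme of the system itself, whose details are not given here. The sketch assumes that the dominant rigid motion of a map (a rotation and a translation) has already been estimated, for instance by the registration step, and that each depth segment carries matched 3D feature positions from before and after the interaction. The names `Segment`, `follows_motion`, and `split_map`, as well as the thresholds `tol` and `min_inlier_ratio`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Rotation = Tuple[Point, Point, Point]  # row-major 3x3 rotation matrix


@dataclass
class Segment:
    """Hypothetical depth segment with matched 3D features (metres)."""
    segment_id: int
    before: List[Point]  # feature positions before the interaction
    after: List[Point]   # the same features after the interaction


def apply_rigid(rotation: Rotation, translation: Point, p: Point) -> Point:
    """Apply the rigid motion q = R p + t to a single 3D point."""
    return tuple(
        sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def follows_motion(segment: Segment, rotation: Rotation, translation: Point,
                   tol: float = 0.01, min_inlier_ratio: float = 0.8) -> bool:
    """Return True if enough features of the segment agree with the motion.

    tol and min_inlier_ratio are illustrative values, not taken from the paper.
    """
    if not segment.before:
        return True  # assumption: no evidence, so keep the segment in its map
    inliers = 0
    for p, q in zip(segment.before, segment.after):
        predicted = apply_rigid(rotation, translation, p)
        error = sum((a - b) ** 2 for a, b in zip(predicted, q)) ** 0.5
        if error <= tol:
            inliers += 1
    return inliers / len(segment.before) >= min_inlier_ratio


def split_map(segments: Sequence[Segment], rotation: Rotation,
              translation: Point) -> Tuple[List[int], List[int]]:
    """Partition segment ids into those that stay in the map and those that
    seed a new object hypothesis because they did not follow the motion."""
    keep, split = [], []
    for segment in segments:
        target = keep if follows_motion(segment, rotation, translation) else split
        target.append(segment.segment_id)
    return keep, split
```

In this sketch a map is split whenever the second list returned by `split_map` is non-empty. A segment that was pushed 10 cm along with the map's dominant motion stays in the first list, while a neighbouring segment that remained stationary lands in the second and would seed a new object map.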