Computer Vision

Computer vision and machine learning for processing data from across space and time to extract meaning and build representations of objects and events in the world.

The research in the Imaging group at MERL covers all aspects of extracting information from images. For instance, from a picture of a scene we can compute features that allow specific objects to be detected and localized. We can also learn a dictionary for the appearance of local image patches and use it to classify regions and objects or to improve image quality. We can track a moving object in video to quantify its trajectory. In some cases we can modify the image formation process itself to make subsequent information extraction more effective. For instance, multiple flash exposures can be used to identify an object's edges.

Several of our current projects involve 3D analysis based on 2D images. For example, we have developed algorithms for estimating object pose so that a robot arm can grasp an object from a cluttered workspace. In another project, we infer an automobile's position in a city by matching camera images to a 3D city model. For medical radiation treatment, we align patient position by matching current x-rays to simulated x-rays obtained by projection. In all these cases, the algorithms we have developed must be very fast and accurate. We have also developed algorithms that operate directly on 3D data for reconstruction, detection, and recognition.

For several years, MERL has been a leader in computational photography and imaging. Given that many images are now processed by computer before viewing, this research seeks to modify the capture stage to optimize the transfer of information into the computer and ultimately into the final use, whether human viewing or further computer analysis to extract quantitative measures from the image. In this research MERL has been able to dramatically improve corrections for motion and focus blur, achieve spatial and temporal super-resolution in video, and conceive novel camera optics for wide-field-of-view stereo reconstruction.