Learning low-level vision
We have developed a machine-learning-based method that applies to various problems in low-level vision. We seek the scene interpretation that best explains the image data. For example, we may want to estimate the projected velocities (scene) that best explain two consecutive image frames (image). The method forms low-level scene interpretations efficiently and should apply to a variety of low-level vision problems. We have demonstrated the technique for motion analysis and for estimating high-resolution images from low-resolution ones.
Background & Objective: We want to get a computer to solve vision problems that are trivial for people: interpret a line drawing; estimate the 3-d shape of an object depicted in a photograph; estimate depth from a stereo pair of images; estimate motion from an image sequence. Algorithms exist for many of these problems, but many are brittle. We seek to exploit the memory capacity of modern computers to solve these problems, and we have developed a common machine learning framework that applies to all of them. We hope that a learning-based, memory-intensive approach will be more reliable than other algorithms. This technology may have many applications. The research might ultimately lead to a vision chip, which could input image data and output a mid-level scene representation, such as 3-d shape or reflectances. It might also lead to a method to estimate high-resolution images from low-resolution ones.
Technical Discussion: We ask: can a visual system correctly interpret a visual scene if it models (1) the probability that any local scene patch generated the local image, and (2) the probability that any local scene patch is the neighbor of any other? The first set of probabilities allows scene estimates to be made from local image data, and the second allows these local estimates to propagate. We then propagate probabilities in the resulting Markov network, taking advantage of a "factorization approximation" in which we ignore the effects of network loops. This method is fast and, for the problems we have studied, has proved reliable in practice.
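The propagation step described above can be illustrated with a minimal sketch of sum-product message passing on a pairwise Markov network. This is not the project's own code: the function name propagate_beliefs, the array layout, and the use of a single symmetric compatibility matrix shared by every edge are all assumptions made for illustration. Here phi stands in for the image-to-scene probabilities (1) and psi for the scene-to-scene neighbor probabilities (2). The update is run as if the network had no loops, which is one way to read the "factorization approximation".

```python
import numpy as np

def propagate_beliefs(phi, psi, edges, n_iters=10):
    """Sum-product message passing on a pairwise Markov network.

    phi   : (n_nodes, K) array; phi[i, a] scores candidate scene state a
            at node i against the local image data (hypothetical layout).
    psi   : (K, K) symmetric array; psi[a, b] scores states a and b at
            neighboring nodes (shared by all edges, a simplification).
    edges : list of undirected (i, j) node pairs.
    Returns an (n_nodes, K) array of normalized beliefs per node.
    """
    phi = np.asarray(phi, dtype=float)
    psi = np.asarray(psi, dtype=float)
    n_nodes, K = phi.shape

    neighbors = {i: [] for i in range(n_nodes)}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # One message per directed edge, initialized to uniform.
    msgs = {(i, j): np.full(K, 1.0 / K)
            for i in neighbors for j in neighbors[i]}

    for _ in range(n_iters):
        new_msgs = {}
        for (i, j) in msgs:
            # Local evidence at i times messages from all neighbors except j.
            prod = phi[i].copy()
            for k in neighbors[i]:
                if k != j:
                    prod *= msgs[(k, i)]
            # Sum over the states of node i (psi is assumed symmetric).
            m = psi @ prod
            new_msgs[(i, j)] = m / m.sum()
        msgs = new_msgs

    # Belief at each node: local evidence times all incoming messages.
    beliefs = phi.copy()
    for i in range(n_nodes):
        for k in neighbors[i]:
            beliefs[i] *= msgs[(k, i)]
    return beliefs / beliefs.sum(axis=1, keepdims=True)
```

On a network without loops (a chain or tree) these beliefs equal the exact marginals once the messages have crossed the graph. On a network with loops the same update still runs, but the result is only an approximation.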
Technical Reports:
Learning Low-Level Vision
Learning low-level vision
Technology Areas:
Computer Vision
Advanced Digital Television
Digital Communications
Modification Date: January 23, 2007
