A Factorization Approach to Grouping
Grouping is an important perceptual problem: how do we look at an image and rapidly group the image into clusters of components? This would be a useful task for computers to perform efficiently, having a variety of potential applications in image editing, user interfaces, or image interpretation. We explain how this method, using standard mathematical machinery, arises from an axiomatic approach applied to the problem of grouping.
Background & Objective: The figure shows a collection of line elements. It is quite apparent that their mutual positions contain some `global' information. While it is somewhat unclear how to define this global information, it is natural to consider a local property: the `pairwise similarity' of these points: two points that are close by are `similar' and two points that are far apart are `different'. This is a common situation in vision: we extract tokens from an image using some early visual process, and the notion of `closeness' between pairs of tokens is natural and well defined. The tokens may be anything: from pixels to points, to edgels, to textured patches. While it is unclear how to extract, and even how to define, the global high-level properties of the scene, it is easy and natural to define the pairwise affinity of any two tokens. This idea comes to us from the work by Shi and Malik on grouping using normalized-cuts. We explored a simpler approach than that of Shi and Malik: rather than formulating a grouping problem explicitly, we notice that a useful global property of the scene, the foreground set, may be both `discovered' and estimated starting from the notion of pairwise closeness, or pairwise affinity, of individual elements. We have developed a simple algorithm which factorizes the matrix of pairwise element afinities, and it compares favorably with the algorithm of Shi and Malik.
Technical Discussion: The foreground group in a scene may be `discovered' and computed as a factorized approximation to the pairwise affinity of the elements in the scene. A pointwise approximation of the pairwise affinity information may in fact be interpreted as a `saliency' index, and the foreground of the scene may be obtained by thresholding it. An algorithm called `affinity factorization' is thus obtained which may be used for grouping. We have demonstrated the affinity factorization algorithm on displays composed of points, of lines and of brightness values. The affinity factorization algorithm is shown to be computationally efficient (O(n) floating-point operations for a scene composed of n elements) and to perform well on displays where the background is unstructured.
Outside Collaborations: This is joint work with Prof. Pietro Perona, of the California Institute of Technology, Pasadena, CA, USA.
Contact: Joseph Katz
| Technical Reports: | |
| A factorization approach to grouping | |
Technology Area: Artificial Intelligence
Modification Date: September 12, 2007

