Motion-Based Optical Sensing with Multiple AR Cameras
In this project, we investigate novel designs and algorithms for an optical sensing device that can be used as a 6-DOF "joystick" and as a servo/controller for real and/or virtual objects. The optical motion sensor makes use of Mitsubishi Electric's "Artificial Retina" (AR) CMOS chip, an inexpensive, fast, and low-power image sensor whose on-chip processing can speed up the computation of 2D camera motion. By attaching these motion-sensing units to an object (human, toy, car, etc.), we can measure its relative motion in real time using passive optics. Applications include an optical mouse (2D pointing device), an optical "wand" (3D pointing device), and servo control of real/virtual objects.
Background & Objective: In computer vision, the recovery of camera motion (and 3D scene structure) from optical flow is a relatively mature field and is the basis for various passive navigation techniques. Assuming that the scene is motionless, the optical flow is due entirely to the 3D motion field resulting from the camera movement (egomotion). We propose a much simpler analysis: given minimal occlusion, large depth, little specularity, and minor lighting changes, the optical flow field of egomotion can be represented globally by a single displacement vector that summarizes the camera motion as projected onto the image plane. We refer to this vector simply as the 2D motion. Multiple 2D motion sensors of this simple design, when properly attached to a moving object, should recover all 6 DOF necessary for tracking.
Technical Discussion: One simple but effective strategy is to compute the horizontal and vertical displacements independently, using horizontal and vertical projections of the image. The resulting 1D projection signals can be easily tracked with inter-frame analysis (displacement matching) to compute the 2D motion vector (dx,dy). Mitsubishi Electric's "Artificial Retina" CMOS image sensor ("AR chip") is well-suited to this task since it can compute horizontal and vertical projections in hardware ("on-chip"). The most elementary 6-DOF sensor consists of 3 orthogonally mounted cameras, one looking outward along each axis. In theory, one can recover all 6 DOFs with this configuration. In practice, redundancy (more than 3 cameras) enhances performance.
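The projection-based displacement matching described above can be illustrated with a minimal software sketch. This is not the AR chip's on-chip implementation or the project's actual algorithm. It assumes frames are given as plain 2D lists of pixel intensities, uses a brute-force search over integer shifts with a mean-absolute-difference cost, and all function names and the max_shift parameter are hypothetical choices made for illustration.

```python
def projections(image):
    """Return (column_sums, row_sums) for a 2D list of pixel intensities.

    In the project these projections come from the sensor hardware;
    here they are computed in software purely for illustration.
    """
    col_sums = [sum(col) for col in zip(*image)]
    row_sums = [sum(row) for row in image]
    return col_sums, row_sums


def best_shift(prev, curr, max_shift):
    """Find the integer shift s for which curr[i + s] best matches prev[i].

    The cost is the mean absolute difference over the overlapping samples
    (an assumed matching criterion). Ties go to the smallest shift.
    """
    n = len(prev)
    best_s, best_cost = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        lo, hi = max(0, -s), min(n, n - s)  # indices i with 0 <= i + s < n
        if hi <= lo:
            continue  # no overlap at this shift
        cost = sum(abs(curr[i + s] - prev[i]) for i in range(lo, hi)) / (hi - lo)
        if cost < best_cost:
            best_s, best_cost = s, cost
    return best_s


def estimate_motion(prev_image, curr_image, max_shift=4):
    """Estimate the 2D image displacement (dx, dy) between two frames.

    dx > 0 means image content moved toward higher column indices,
    dy > 0 toward higher row indices (an assumed sign convention).
    """
    prev_cols, prev_rows = projections(prev_image)
    curr_cols, curr_rows = projections(curr_image)
    dx = best_shift(prev_cols, curr_cols, max_shift)
    dy = best_shift(prev_rows, curr_rows, max_shift)
    return dx, dy
```

Calling estimate_motion(prev_frame, curr_frame) on two consecutive frames returns an integer (dx,dy) estimate. Matching two 1D signals in this way costs far less than a full 2D search over the image, which is the appeal of the projection approach. The estimate is only approximate when content enters or leaves the field of view between frames.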
Technology Area: Computer Vision
Modification Date: September 12, 2007
