Multi-Camera Systems
Unlike single-camera, single-room architectures, multi-camera surveillance applications demand automatic and accurate calibration, detection of objects of interest, tracking, fusion of multiple modalities to solve the inter-camera correspondence problem, easy access to and retrieval of video data, the capability to make semantic queries, and effective abstraction of video content. Although several multi-camera setups have been adapted for 3D vision problems, non-overlapping camera systems have not been investigated thoroughly. Given the huge amount of video data a multi-camera system may produce over a short time period, more sophisticated tools for control, representation, and content analysis have become an urgent need.
Background & Objective: We designed a framework that extracts object-wise semantics from a multi-camera system with non-overlapping fields of view. The framework has four main components: camera calibration, automatic tracking, inter-camera data fusion, and query generation. We developed an object-based video content labeling method that restructures camera-oriented videos into object-oriented results. To address the storage and presentation of the immense video data, we proposed a summarization technique that uses the motion activity characteristics of the encoded video segments. To solve the calibration problem, we developed a method based on a correlation matrix and dynamic programming.
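The calibration idea above (a correlation matrix combined with dynamic programming) can be illustrated with a minimal sketch. It is not the project's implementation: the absolute-difference distance, the three allowed path moves, and the mean-bin mapping in the hypothetical `model_function` are all assumptions chosen for illustration. The sketch compares the color histograms of the same scene content as seen by two cameras, finds the cheapest monotonic path through the pairwise distance matrix, and reads a bin-to-bin color mapping off that path.

```python
def histogram_distance_matrix(h_a, h_b):
    """Pairwise absolute differences between the bins of two histograms.
    (Assumed distance; the original method may use a different one.)"""
    return [[abs(a - b) for b in h_b] for a in h_a]


def min_cost_path(cost):
    """Dynamic programming: cheapest monotonic path from (0, 0) to (n-1, m-1),
    moving one step right, down, or diagonally at a time."""
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    acc = [[inf] * m for _ in range(n)]
    acc[0][0] = cost[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best_prev = min(
                acc[i - 1][j] if i > 0 else inf,
                acc[i][j - 1] if j > 0 else inf,
                acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            acc[i][j] = cost[i][j] + best_prev
    # Backtrack from the end, always stepping to the cheapest predecessor.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((acc[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return path


def model_function(path, n_bins):
    """Map each bin of camera A to the (rounded) mean bin it was matched to
    in camera B. Every bin of A appears on a monotonic path at least once."""
    matches = [[] for _ in range(n_bins)]
    for i, j in path:
        matches[i].append(j)
    return [round(sum(js) / len(js)) for js in matches]


# Toy histograms: camera B sees the same content shifted one bin brighter.
h_a = [0, 5, 9, 5, 0, 0]
h_b = [0, 0, 5, 9, 5, 0]
path = min_cost_path(histogram_distance_matrix(h_a, h_b))
mapping = model_function(path, len(h_a))
print(mapping)
```

The resulting list plays the role of a lookup table from color bins in one camera to color bins in the other. Composing two such tables (A to B, then B to C) is one plausible way to obtain an A to C mapping without comparing those cameras directly.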
Technical Discussion: We developed an automatic object tracking and video summarization method for multi-camera systems with a large number of non-overlapping field-of-view cameras. In this framework, a video sequence is stored for each object rather than for each camera. Object-based representation enables annotation of video segments and extraction of content semantics for further analysis. We also present a novel solution to the inter-camera color calibration problem. The transitive model function enables effective compensation for lighting changes and radiometric distortions in large-scale systems. After initial calibration, objects are tracked at each camera by background subtraction and mean-shift analysis. The correspondence of objects between different cameras is established using a Bayesian Belief Network. This framework empowers the user to get a concise response to queries such as "Which locations did an object visit on Monday and what did it do there?"
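To make the object-based representation concrete, the following minimal sketch regroups per-camera track records into per-object sequences and then answers the example query. The `Sighting` record, its field names, the camera labels, and the activity labels are all hypothetical. The sketch also assumes that tracking and inter-camera correspondence have already assigned a consistent `object_id` to each track.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Sighting:
    camera: str        # camera or location label (hypothetical)
    object_id: str     # identity assigned after inter-camera correspondence
    start: datetime    # when the object entered this camera's view
    end: datetime      # when it left
    activity: str      # activity label (hypothetical)


def to_object_index(per_camera):
    """Regroup camera-oriented track lists into time-ordered per-object lists."""
    index = defaultdict(list)
    for sightings in per_camera.values():
        for s in sightings:
            index[s.object_id].append(s)
    for sightings in index.values():
        sightings.sort(key=lambda s: s.start)
    return dict(index)


def visits_on(index, object_id, weekday):
    """Return (camera, activity) pairs for one object on a weekday (0 = Monday)."""
    return [
        (s.camera, s.activity)
        for s in index.get(object_id, [])
        if s.start.weekday() == weekday
    ]


# Camera-oriented input: one list of sightings per camera.
per_camera = {
    "lobby": [
        Sighting("lobby", "obj-1", datetime(2004, 7, 12, 9, 0),
                 datetime(2004, 7, 12, 9, 2), "walking"),
        Sighting("lobby", "obj-2", datetime(2004, 7, 12, 9, 5),
                 datetime(2004, 7, 12, 9, 6), "walking"),
    ],
    "lab": [
        Sighting("lab", "obj-1", datetime(2004, 7, 12, 9, 10),
                 datetime(2004, 7, 12, 11, 0), "standing"),
        Sighting("lab", "obj-1", datetime(2004, 7, 13, 14, 0),
                 datetime(2004, 7, 13, 14, 30), "walking"),
    ],
}

index = to_object_index(per_camera)
# July 12, 2004 was a Monday, so weekday 0 selects the first day's sightings.
print(visits_on(index, "obj-1", weekday=0))
```

Keying the store by object rather than by camera means a query about one object touches only that object's sequence, instead of scanning every camera's footage for the time range of interest.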
Technology Area: Computer Vision
Modification Date: July 15, 2004
