TR2005-040

Compressed Domain Video Object Segmentation



We propose a compressed domain video object segmentation method for MPEG or MPEG-like encoded videos. The main advantage of compressed domain processing is its low computational cost. In addition, it offers two features that are attractive for object analysis. First, texture characteristics are provided by the DCT coefficients and require only partial decoding. Second, for MPEG videos that are not intra-only coded, motion information is readily available without the cost of a complicated motion estimation process. In the proposed method, we first exploit the macro-block structure of the MPEG encoded video to decrease the spatial resolution of the processed data, which substantially reduces the computational load. Further reduction of complexity is achieved by temporally grouping the intra-coded and estimated frames into a single feature layer. Segmentation uses a combination of DCT coefficients for I-frames and block motion vectors for P-frames, from which a frequency-temporal data structure is constructed. Starting from the blocks where the AC-coefficient energy and the local inter-block DC-coefficient variance are small, homogeneous volumes are enlarged by evaluating the distance of candidate vectors to the volume characteristics. Affine motion models are then fit to the volumes. Finally, a hierarchical clustering stage iteratively merges the most similar parts to generate an object partition tree as output. Experimental results show that the proposed compressed domain video segmentation method provides results similar to those of spatial domain processing at much lower computational complexity.
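
The seed selection and growing steps described above can be illustrated with a minimal sketch. It is not the report's implementation: it works on a single frame rather than a frequency-temporal volume, uses only the DC coefficient as the feature, and assumes a 4-connected neighbourhood, an absolute-difference distance, and a fixed threshold. The function names `ac_energy`, `local_dc_variance`, and `grow_region` are hypothetical.

```python
from collections import deque


def ac_energy(block):
    """Sum of squared AC coefficients of one DCT block (DC is block[0][0])."""
    total = sum(c * c for row in block for c in row)
    return total - block[0][0] ** 2


def local_dc_variance(dc, r, c):
    """Variance of DC values in the 3x3 neighbourhood of block (r, c),
    clipped at the frame borders. The 3x3 window is an assumption."""
    rows, cols = len(dc), len(dc[0])
    vals = [dc[i][j]
            for i in range(max(0, r - 1), min(rows, r + 2))
            for j in range(max(0, c - 1), min(cols, c + 2))]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def grow_region(dc, seed, threshold):
    """Grow a 4-connected region from `seed`, admitting a neighbouring block
    when its DC value lies within `threshold` of the running region mean."""
    rows, cols = len(dc), len(dc[0])
    region = {seed}
    total = dc[seed[0]][seed[1]]
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                mean = total / len(region)
                if abs(dc[nr][nc] - mean) <= threshold:
                    region.add((nr, nc))
                    total += dc[nr][nc]
                    queue.append((nr, nc))
    return region


# Toy 3x3 grid of per-block DC values: a dark patch in the top-left corner.
dc_grid = [[10, 10, 50],
           [10, 11, 50],
           [50, 50, 50]]
patch = grow_region(dc_grid, seed=(0, 0), threshold=5)
```

In a fuller version, the seed would be the block minimising some combination of `ac_energy` and `local_dc_variance`, and the feature vector would also carry the P-frame motion vectors; how those terms are weighted is not stated in the abstract.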