Structure Analysis of Soccer Video with Domain Knowledge and Hidden Markov Models

    •  Xie, L.; Xu, P.; Chang, S.-F.; Divakaran, A.; Sun, H., "Structure Analysis of Soccer Video with Domain Knowledge and Hidden Markov Models", Pattern Recognition Letters, Vol. 25, No. 7, pp. 767-775, May 2004.
      @article{Xie2004may,
        author = {Xie, L. and Xu, P. and Chang, S.-F. and Divakaran, A. and Sun, H.},
        title = {Structure Analysis of Soccer Video with Domain Knowledge and Hidden Markov Models},
        journal = {Pattern Recognition Letters},
        year = 2004,
        volume = 25,
        number = 7,
        pages = {767--775},
        month = may,
        url = {}
      }
  • Research Area:

    Digital Video

In this paper, we present statistical techniques for parsing the structure of produced soccer programs. The problem is important for applications such as personalized video streaming and browsing systems, in which videos are segmented into different states and important states are selected based on user preferences. While prior work focuses on the detection of special events such as goals or corner kicks, this paper is concerned with generic structural elements of the game. We define two mutually exclusive states of the game, play and break, based on the rules of soccer. Automatic detection of such generic states is an original and challenging problem because of the high diversity in appearance and temporal dynamics of these states across videos. We select a salient feature set from the compressed domain, dominant color ratio and motion intensity, based on the special syntax and content characteristics of soccer videos. We then model the stochastic structures of each state of the game with a set of hidden Markov models. Finally, higher-level transitions are taken into account and dynamic programming techniques are used to obtain the maximum likelihood segmentation of the video sequence. The system achieves a promising classification accuracy of 83.5%, with light-weight computation on feature extraction and model inference, as well as satisfactory accuracy in boundary timing.
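
The final step described above, combining per-state scores with higher-level transitions through dynamic programming, can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes that each short window of video has already been scored under a play model and a break model (in the paper these scores would come from the hidden Markov models; here they are made-up numbers), and that a simple two-state transition table stands in for the higher-level transitions. The names `segment`, `windows`, `log_trans`, and `log_prior` are hypothetical and chosen only for illustration.

```python
import math

STATES = ("play", "break")


def segment(window_loglik, log_trans, log_prior):
    """Return the highest-scoring play/break label for each window.

    window_loglik: list of dicts mapping state -> log-likelihood of that window
                   (assumed to come from per-state models; hypothetical input).
    log_trans:     dict of dicts, log_trans[prev][cur] = log transition probability.
    log_prior:     dict mapping state -> log probability of the first window's state.
    """
    if not window_loglik:
        return []

    # best[t][s] = best log score of any label path that ends in state s at window t
    best = [{s: log_prior[s] + window_loglik[0][s] for s in STATES}]
    back = []  # back[t-1][s] = best predecessor of state s at window t

    for obs in window_loglik[1:]:
        scores, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + obs[s]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Illustrative inputs only: these numbers are invented, not taken from the paper.
stay, switch = math.log(0.9), math.log(0.1)
log_trans = {
    "play": {"play": stay, "break": switch},
    "break": {"play": switch, "break": stay},
}
log_prior = {"play": math.log(0.5), "break": math.log(0.5)}
windows = [
    {"play": -1.0, "break": -3.0},
    {"play": -1.0, "break": -3.0},
    {"play": -2.0, "break": -1.5},  # weak, isolated evidence for break
    {"play": -1.0, "break": -3.0},
    {"play": -4.0, "break": -1.0},
    {"play": -4.0, "break": -1.0},
]

labels = segment(windows, log_trans, log_prior)
print(labels)
```

With these invented scores, the third window slightly favors break on its own, but the transition penalty keeps it labeled as play. Only the sustained break evidence in the last two windows produces a state change. This is the kind of temporal smoothing that a transition-aware segmentation provides over classifying each window independently.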