TR2004-008

Audio-Visual Event Detection Based on Mining of Semantic Audio-Visual Labels


    •  Goh, K.-S., Miyahara, K., Radhakrishnan, R., Xiong, Z., Divakaran, A., "Audio-Visual Event Detection Based on Mining of Semantic Audio-Visual Labels", SPIE Conference on Storage and Retrieval for Multimedia Databases, January 2004, vol. 5307, pp. 292-299.
      BibTeX
        @inproceedings{Goh2004jan,
          author = {Goh, K.-S. and Miyahara, K. and Radhakrishnan, R. and Xiong, Z. and Divakaran, A.},
          title = {Audio-Visual Event Detection Based on Mining of Semantic Audio-Visual Labels},
          booktitle = {SPIE Conference on Storage and Retrieval for Multimedia Databases},
          year = 2004,
          volume = 5307,
          pages = {292--299},
          month = jan,
          url = {https://www.merl.com/publications/TR2004-008}
        }
Abstract:

Removing commercials from television programs is a much sought-after feature for a personal video recorder. In this paper, we employ an unsupervised clustering scheme (CM Detect) to detect commercials in television programs. Each program is first divided into Ws-minute chunks, and we extract audio and visual features from each of these chunks. Next, we apply k-means clustering to assign each chunk a commercial/program label. In contrast to other methods, we do not make any assumptions regarding the program content. Thus, our method is highly content-adaptive and computationally inexpensive. Through empirical studies on various content, including American news, Japanese news, and sports programs, we demonstrate that our method is able to filter out most of the commercials without falsely removing the regular program.
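
The chunk-labeling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' CM Detect implementation: the function names (`two_means`, `label_chunks`), the farthest-point initialization, and the toy two-dimensional feature vectors are all hypothetical. In particular, the abstract does not say how a cluster is mapped to "commercial" or "program", so the rule used here (the smaller cluster is treated as commercial) is purely an assumption made for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def two_means(features: List[Vector], max_iterations: int = 100) -> List[int]:
    """Plain k-means (Lloyd's algorithm) with k = 2.

    Initialization is an assumption of this sketch: the first vector and
    the vector farthest from it serve as the two starting centroids.
    Returns one cluster id (0 or 1) per input vector.
    """
    if len(features) < 2:
        raise ValueError("need at least two chunks to cluster")
    first = list(features[0])
    farthest = list(max(features, key=lambda v: squared_distance(v, first)))
    centroids = [first, farthest]
    labels = [-1] * len(features)
    for _ in range(max_iterations):
        # Assignment step: each chunk goes to its nearest centroid.
        new_labels = [
            0 if squared_distance(v, centroids[0]) <= squared_distance(v, centroids[1]) else 1
            for v in features
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in (0, 1):
            members = [v for v, lab in zip(features, labels) if lab == k]
            if members:
                centroids[k] = mean_vector(members)
    return labels


def label_chunks(features: List[Vector]) -> List[str]:
    """Map cluster ids to labels.

    ASSUMPTION (not from the paper): the smaller cluster is commercial.
    """
    clusters = two_means(features)
    count_zero = clusters.count(0)
    commercial_cluster = 0 if count_zero < len(clusters) - count_zero else 1
    return ["commercial" if c == commercial_cluster else "program" for c in clusters]


if __name__ == "__main__":
    # Made-up feature vectors, one per chunk; real features would come
    # from the audio and visual analysis of each chunk.
    toy_features = [
        [0.10, 0.20], [0.15, 0.25], [0.12, 0.18],
        [0.90, 0.80], [0.95, 0.85], [0.11, 0.22],
    ]
    print(label_chunks(toy_features))
```

Because the sketch makes no assumption about what the features mean, the same code runs unchanged on any per-chunk feature vectors, which mirrors the content-adaptive property claimed in the abstract.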

 

  • Related News & Events

    •  NEWS    SPIE Conference on Storage and Retrieval for Multimedia Databases 2004: 2 publications by Ajay Divakaran and others
      Date: January 20, 2004
      Where: SPIE Conference on Storage and Retrieval for Multimedia Databases
      Brief
      • The papers "Audio-Visual Event Detection Based on Mining of Semantic Audio-Visual Labels" by Goh, K.-S., Miyahara, K., Radhakrishnan, R., Xiong, Z. and Divakaran, A. and "Video Mining Using Combinations of Unsupervised and Supervised Learning Techniques" by Divakaran, A., Miyahara, K., Peker, K.A., Radhakrishnan, R. and Xiong, Z. were presented at the SPIE Conference on Storage and Retrieval for Multimedia Databases.