
Efficient Activity Detection Using Maximum-Subgraph Search Techniques

This presentation describes an approach to human activity detection in continuous video that frames detection as a maximum-weight connected subgraph problem. Existing methods typically split detection into two separate stages, candidate generation followed by candidate scoring, which can be inefficient. The method instead searches a space-time graph constructed on the test sequence, with node weights derived from a learned classifier. A linear SVM supplies the feature weights, and both low-level (HoG, HoF) and high-level descriptors are used to improve robustness when detecting complex activities against noisy backgrounds. The experiments report a significant reduction in computation time relative to sliding-window search, along with more robust detection.


Presentation Transcript


  1. Efficient Activity Detection with Max-Subgraph Search • Chao-Yeh Chen and Kristen Grauman • University of Texas at Austin

  2. Outline • Introduction • Approach • Define weighted nodes • Link nodes • Search for the maximum-weight graph • Experimental Result • Conclusion

  3. Introduction • Existing methods tend to separate activity detection into two distinct stages: • 1. Generate space-time candidate regions of interest from the test video. • 2. Score each candidate according to how well it matches a given activity model (often a classifier).

  4. How to detect human activity in continuous video? • Status quo approaches:

  5. Introduction • We pose activity detection as a maximum-weight connected subgraph problem over a learned space-time graph constructed on the test sequence.

  6. Approach

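  7. Classifier training for feature weights • Learn a linear SVM from training data; the scoring function has the form $f(S) = \beta + \sum_i \alpha_i \langle h(S), h(S_i) \rangle$, where $i$ indexes the training examples and $\alpha_i$, $\beta$ are the learned weights and bias. • Let $h^j(S)$ denote the j-th bin count of histogram $h(S)$; the j-th word is associated with a weight $w^j = \sum_i \alpha_i h^j(S_i)$ for j = 1,…,K, where K is the dimension of histogram h.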

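  8. Classifier training for feature weights • Thus the classifier response for candidate subvolume $S$ is: $f(S) = \beta + \sum_{j=1}^{K} w^j h^j(S) = \beta + \sum_{i} w^{c_i}$, where $\beta$ is the SVM bias, $w^j$ is the SVM weight for the j-th word, $h^j(S)$ is the number of occurrences of the j-th word in $S$, and $w^{c_i}$ is the SVM weight for the i-th feature's word (the last sum runs over the features inside $S$).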
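
The two ways of writing the classifier response can be checked with a small sketch. It assumes each local feature has already been quantized to a visual-word index; the function names, weights, and bias below are hypothetical values chosen for illustration, not numbers from the presentation.

```python
def score_from_histogram(word_weights, bias, histogram):
    """Response as bias + sum over words of (word weight * word count)."""
    return bias + sum(w * count for w, count in zip(word_weights, histogram))


def score_from_features(word_weights, bias, feature_words):
    """Same response, accumulated one feature at a time.

    feature_words holds the visual-word index of each feature in the
    candidate subvolume (an assumed input format).
    """
    return bias + sum(word_weights[c] for c in feature_words)


# Hypothetical per-word SVM weights and bias (K = 3 words).
word_weights = [0.5, -1.0, 2.0]
bias = 0.25

# Four features inside a candidate subvolume, given by their word index.
feature_words = [0, 2, 2, 1]
histogram = [feature_words.count(j) for j in range(len(word_weights))]

by_hist = score_from_histogram(word_weights, bias, histogram)
by_feat = score_from_features(word_weights, bias, feature_words)
```

Because the response is a plain sum over features, the score of any region can be accumulated from per-feature contributions, which is what allows weights to be attached to graph nodes.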

  9. Bag-of-features (BoF)

  10. Localized Space-Time Features • Low-level descriptors • We use HoG and HoF computed in local space-time cubes [14, 10]. These descriptors capture the appearance and motion in the video. • High-level descriptors

  11. Define weighted nodes • Divide space-time volume into frame-level or space-time nodes. • Compute the weight of nodes from the features inside them.
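
A minimal sketch of the frame-level case follows. It assumes a node's weight is the sum of the per-word classifier weights of the features falling inside it, which fits an additive classifier response but is an assumption here; the function name, the (frame, word) feature format, and the numbers are hypothetical.

```python
def frame_level_node_weights(features, word_weights, num_frames, frames_per_node):
    """Group frames into consecutive temporal nodes and sum feature weights.

    features: list of (frame_index, word_index) pairs (assumed format).
    Returns one weight per node, in temporal order.
    """
    # Ceiling division so a trailing partial group of frames still gets a node.
    num_nodes = (num_frames + frames_per_node - 1) // frames_per_node
    node_weights = [0.0] * num_nodes
    for frame, word in features:
        node_weights[frame // frames_per_node] += word_weights[word]
    return node_weights


# Hypothetical data: 10 frames, 4 frames per node, 3 visual words.
word_weights = [0.5, -1.0, 2.0]
features = [(0, 0), (1, 2), (4, 1), (5, 1), (9, 2)]
node_weights = frame_level_node_weights(features, word_weights, 10, 4)
```

Space-time nodes would work the same way, with the node index computed from a feature's spatial position as well as its frame.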

  12. Link nodes • Two different link strategies: • 1. Link each node only to its immediate neighbors, for frame-level nodes (T-Subgraph) or space-time nodes (ST-Subgraph). • 2. Link each node to its first two neighbors, for frame-level nodes (T-Jump-Subgraph).
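
For frame-level nodes, the two link strategies differ only in how far ahead in time a node may connect. The sketch below builds the edge lists under that reading; `temporal_edges` and its `max_jump` parameter are hypothetical names, and the ST-Subgraph case (which also links spatial neighbors) is not shown.

```python
def temporal_edges(num_nodes, max_jump=1):
    """Undirected edges (i, j), i < j, between frame-level nodes.

    max_jump=1 links immediate neighbors only (T-Subgraph style);
    max_jump=2 also links each node to the node two steps away
    (T-Jump-Subgraph style).
    """
    return [
        (i, i + step)
        for i in range(num_nodes)
        for step in range(1, max_jump + 1)
        if i + step < num_nodes
    ]


t_edges = temporal_edges(4, max_jump=1)
t_jump_edges = temporal_edges(4, max_jump=2)
```

With the extra jump edges, a connected subgraph can skip over a single node, for example one with a strongly negative weight, while remaining connected.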

  13. Search for the maximum-weight graph • Transform the max-weight subgraph problem into a prize-collecting Steiner tree problem. • Solve efficiently with the branch-and-cut method from [15]. • [15] An algorithmic framework for the exact solution of the prize-collecting Steiner tree problem. Math. Prog., 2006.
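
The branch-and-cut solver from [15] is not reproduced here. As an illustration of what the search returns, the sketch below handles only the simplest special case: frame-level nodes linked to immediate neighbors form a chain, every connected subgraph of a chain is a contiguous run of nodes, and the maximum-weight run can be found with a standard maximum-sum subarray scan (Kadane's algorithm). This is a swapped-in technique for that one case, not the method used in the presentation; the function name and weights are hypothetical.

```python
def max_weight_interval(node_weights):
    """Return (best_sum, start, end) for the max-sum non-empty contiguous run.

    start and end are inclusive node indices. Assumes node_weights is
    non-empty and ordered in time.
    """
    best_sum = cur_sum = node_weights[0]
    best_start = best_end = cur_start = 0
    for i in range(1, len(node_weights)):
        if cur_sum < 0:
            # A negative running sum can only hurt; restart the run here.
            cur_sum = node_weights[i]
            cur_start = i
        else:
            cur_sum += node_weights[i]
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i
    return best_sum, best_start, best_end


# Hypothetical node weights for six consecutive frame-level nodes.
detection = max_weight_interval([-1.0, 2.5, -0.5, 3.0, -4.0, 1.0])
```

The selected run keeps the mildly negative node between two strongly positive ones, because dropping it would disconnect the detection. Graphs with jump edges or spatial links do not reduce to a chain and need a general solver.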

  14. Experimental Result • Datasets

  15. Baselines • T-Sliding • ST-Cube-Sliding • ST-Cube-Subvolume [29] • [29] J. Yuan, Z. Liu, and Y. Wu. Discriminative subvolume search for efficient action detection. In CVPR, 2009.

  16. UCF Sports data

  17. Hollywood data

  18. MSR dataset

  19. Example of ST-Subgraph

  20. Overview of all methods on the three datasets

  21. High-level vs Low-level descriptors

  22. Conclusion • Compared to sliding window search, max-subgraph search significantly reduces computation time. • The flexible node structure offers more robust detection in noisy backgrounds. • High-level descriptors show promise for complex activities by incorporating semantic relationships between humans and objects in video.
