
A General Framework for Tracking Multiple People from a Moving Camera

Wongun Choi, Caroline Pantofaru, Silvio Savarese. IEEE Transactions on Pattern Analysis and Machine Intelligence, July 2013. Overview: Motivation, Related Work, Introduction, Proposed Method, Experimental Results, Conclusion.


Presentation Transcript


  1. A General Framework for Tracking Multiple People from a Moving Camera. Wongun Choi, Caroline Pantofaru, Silvio Savarese. IEEE Transactions on Pattern Analysis and Machine Intelligence, July 2013

  2. Overview • Motivation • Related Work • Introduction • Proposed Method • Experimental Results • Conclusion

  3. Motivation 1. The final goal is to track multiple people from a moving camera in both outdoor and indoor video scenes. 2. Several challenges must be solved: • People appear in a wide variety of poses • Multiple people in the same scene produce complex motion patterns • The scene and the illumination change over time

  4. Related work • Tracking by online learning: learning an appearance model [5], [7], [10], [26], [34]; color histogram and mean shift [10] • Tracking with a moving camera: probabilistic framework with multiple detectors [42], [43]; stereo and graphical model [12], [13] [5] S. Avidan. Ensemble tracking. PAMI, 2007. [7] C. Bibby and I. Reid. Robust real-time visual tracking using pixel-wise posteriors. In ECCV, 2008. [10] D. Comaniciu and P. Meer. Mean shift: A robust approach toward feature space analysis. PAMI, 2002. [12] A. Ess, B. Leibe, K. Schindler, and L. van Gool. A mobile vision system for robust multi-person tracking. In CVPR, 2008. [13] A. Ess, B. Leibe, K. Schindler, and L. van Gool. Robust multi-person tracking from a mobile platform. PAMI, 2009. [26] S. Kwak, W. Nam, B. Han, and J. Han. Learning occlusion with likelihoods for visual tracking. In ICCV, 2011. [34] D. Ramanan, D. Forsyth, and A. Zisserman. Tracking people by learning their appearance. PAMI, Jan. 2007. [42] C. Wojek, S. Walk, S. Roth, and B. Schiele. Monocular 3D scene understanding with explicit occlusion reasoning. In CVPR, 2011. [43] C. Wojek, S. Walk, and B. Schiele. Multi-cue onboard pedestrian detection. In CVPR, 2009.

  5. Introduction(1) The proposed method addresses each issue as follows: • People appear in a wide variety of poses: fuse multiple person detection methods with additional observation cues • Multiple people in the same scene produce complex motion patterns: build a motion model that captures the interactions between targets • The scene and the illumination change over time: propose a novel 3D model that explains the process of video generation

  6. Introduction(2) Observation cues:

  7. Introduction(3) Build 3D Model:

  8. Introduction(4) Particle filter: 1. Definition: a posterior density estimation algorithm that estimates the posterior density over the state space by directly implementing the Bayesian recursion equations 2. It uses sampling to represent the posterior state distribution and resampling to reconstruct the new distribution
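
The predict, weight, and resample cycle described on this slide can be sketched as a bootstrap particle filter. This is a minimal illustration, not the tracker from the paper: the 1D random-walk motion model, the Gaussian observation model, and all names and parameter values (`particle_filter_step`, `run_filter`, `motion_std`, `obs_std`) are assumptions chosen for brevity.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1D Gaussian, used here as the assumed observation model."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def particle_filter_step(particles, observation, motion_std, obs_std, rng):
    """One Bayesian recursion step: predict, weight, resample."""
    # Predict: push every particle through the (assumed) random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: score each predicted particle by the observation likelihood.
    weights = [gaussian_pdf(observation, p, obs_std) for p in predicted]
    if sum(weights) == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw a new, equally weighted particle set.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def run_filter(observations, n_particles=500, motion_std=0.5, obs_std=1.0, seed=0):
    """Return the posterior-mean estimate after each observation."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 5.0) for _ in range(n_particles)]  # broad prior
    estimates = []
    for y in observations:
        particles = particle_filter_step(particles, y, motion_std, obs_std, rng)
        estimates.append(sum(particles) / len(particles))
    return estimates
```

Calling `run_filter([10.0] * 30)` should give estimates that move from the prior mean toward the repeated observation as resampling concentrates the particles.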

  9. Introduction(5) Reversible-Jump Markov Chain Monte Carlo (RJMCMC): a class of algorithms for sampling from probability distributions by constructing a Markov chain that allows the dimensionality of the state to change
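
To make the idea of a dimension-changing Markov chain concrete, here is a toy sketch in which the state is a list of target positions whose length varies through birth and death moves. The target distribution (a Poisson prior on the number of targets, a standard normal prior on each position) and every name in the code are assumptions for illustration only; the paper's sampler uses its own moves and posterior.

```python
import math
import random


def rjmcmc_target_count(lam=3.0, max_targets=20, n_steps=50000, seed=0):
    """Toy reversible-jump sampler; returns the number of targets at each step.

    Assumed target: k ~ Poisson(lam) truncated at max_targets, and each of the
    k positions ~ N(0, 1). Birth proposals draw the new position from N(0, 1),
    so that density cancels in the acceptance ratio.
    """
    rng = random.Random(seed)
    state = []   # positions of the current targets; the length is the dimension
    counts = []
    for _ in range(n_steps):
        k = len(state)
        move = rng.choice(("birth", "death", "update"))
        if move == "birth" and k < max_targets:
            # Jump to a higher dimension: acceptance ratio is lam / (k + 1).
            if rng.random() < min(1.0, lam / (k + 1)):
                state.insert(rng.randrange(k + 1), rng.gauss(0.0, 1.0))
        elif move == "death" and k > 0:
            # Jump to a lower dimension: acceptance ratio is k / lam.
            if rng.random() < min(1.0, k / lam):
                state.pop(rng.randrange(k))
        elif move == "update" and k > 0:
            # Fixed-dimension Metropolis random walk on one position.
            i = rng.randrange(k)
            proposal = state[i] + rng.gauss(0.0, 0.5)
            log_ratio = 0.5 * (state[i] ** 2 - proposal ** 2)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                state[i] = proposal
        counts.append(len(state))
    return counts
```

Because the birth and death moves form a reversible pair, the long-run average of the returned counts should settle near `lam`. In a tracker, the analogous moves would add or remove a person hypothesis.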

  10. Proposed Method System overview: 1. Use observation cues to generate detection hypotheses and an observation model 2. Build a motion model that accounts both for people's unexpected motions and for interactions between people 3. Run the sampling procedure of the RJ-MCMC tracker, which includes evaluation (resampling)

  11. Proposed Method Model representation:

  12. Proposed Method • Use the camera, the targets, and the geometric features as random variables and model their relationship with a joint posterior probability • The tracking problem can be formulated as finding the maximum a posteriori (MAP) solution • Observation likelihood • Motion model (transition model) • Posterior at time t-1
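
The three terms listed on this slide are the factors of the standard Bayesian filtering recursion. The slide's own notation did not survive extraction, so the symbols below are assumed: \Omega_t denotes the joint state at time t and \chi_t the observations at time t.

```latex
% Posterior = observation likelihood x integral of (motion model x previous posterior)
P(\Omega_t \mid \chi_{1:t}) \;\propto\;
  \underbrace{P(\chi_t \mid \Omega_t)}_{\text{observation likelihood}}
  \int \underbrace{P(\Omega_t \mid \Omega_{t-1})}_{\text{motion model}}\,
       \underbrace{P(\Omega_{t-1} \mid \chi_{1:t-1})}_{\text{posterior at } t-1}\,
  d\Omega_{t-1}

% MAP estimate of the joint state
\hat{\Omega}_t = \arg\max_{\Omega_t} P(\Omega_t \mid \chi_{1:t})
```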

  13. Proposed Method • Observation likelihood: Camera projection function:
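
A camera projection function maps a 3D position to image coordinates. The slide's formula is not recoverable, so the sketch below uses a generic simplified pinhole model, assuming zero camera roll and a flat ground plane; the parameter names (`focal`, `u_center`, `v_horizon`, `camera_height`) are hypothetical and are not taken from the paper.

```python
def project_ground_point(x, z, focal, u_center, v_horizon, camera_height):
    """Project a ground-plane point to pixel coordinates.

    x: lateral offset from the camera (metres), z: forward distance (metres).
    Assumes a pinhole camera with no roll whose horizon falls on image row
    v_horizon; image rows grow downward.
    """
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    u = u_center + focal * x / z               # horizontal pixel coordinate
    v = v_horizon + focal * camera_height / z  # ground points fall below the horizon
    return u, v
```

For instance, `project_ground_point(0.0, 10.0, 500.0, 320.0, 240.0, 1.0)` places a point 10 m straight ahead at column 320, 50 rows below the horizon.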

  14. Proposed Method Target Observation Likelihood: j: detector index; w_j: weight for detector j

  15. Proposed Method Target Observation Likelihood: 1) pedestrian detector 2) upper body detector 3) target-specific detector based on appearance model 4) detector based on upper-body shape from depth 5) face detector 6) skin detector 7) motion detector
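
One common way to fuse several detectors with per-detector weights w_j is a log-linear combination, in which the weighted log-scores are summed and then exponentiated. The sketch below assumes that form, since the slide's formula is missing; the function name and the dictionary keys are hypothetical.

```python
import math


def target_observation_likelihood(log_scores, weights):
    """Unnormalised likelihood exp(sum_j w_j * l_j) for one target hypothesis.

    log_scores: detector name -> log-score l_j for this hypothesis.
    weights:    detector name -> weight w_j.
    The log-linear form is an assumption made for this sketch.
    """
    if set(log_scores) != set(weights):
        raise ValueError("every detector needs both a score and a weight")
    total = sum(weights[name] * log_scores[name] for name in log_scores)
    return math.exp(total)
```

With this form, setting a weight to zero removes a detector from the fusion, which is one way a system could adapt to sensor configurations that lack, for example, a depth camera.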

  16. Proposed Method Pedestrian and upper body detector using HOG:

  17. Proposed Method Face detector using the OpenCV Viola-Jones face detector:

  18. Proposed Method Skin color detector using threshold on HSV color space:
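
Thresholding in HSV space can be sketched with the standard-library `colorsys` module. The threshold values below are illustrative assumptions, not the values used in the paper, and `is_skin` and `skin_mask` are hypothetical names.

```python
import colorsys


def is_skin(rgb, h_max=50.0 / 360.0, s_range=(0.2, 0.7), v_min=0.35):
    """Classify one 8-bit RGB pixel as skin by thresholding its HSV values.

    All thresholds are assumed, illustrative values.
    """
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)  # each component lies in [0, 1]
    return h <= h_max and s_range[0] <= s <= s_range[1] and v >= v_min


def skin_mask(image):
    """Apply is_skin to every pixel of an image given as rows of RGB tuples."""
    return [[is_skin(pixel) for pixel in row] for row in image]
```

A practical detector would tune these thresholds on labelled data and would usually clean up the resulting mask, for example with morphological filtering.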

  19. Proposed Method Depth shape detector using world coordinate system:

  20. Proposed Method Motion detector: project motion points onto the image plane and threshold:

  21. Proposed Method Geometric Feature likelihood by interest point detector: is the uniform distribution

  22. Proposed Method Motion prior:

  23. Proposed Method Camera motion prior:

  24. Proposed Method Target motion prior:

  25. Proposed Method Existence prior:

  26. Proposed Method Motion prior: Independent Interacting

  27. Proposed Method Independent Motion prior : update

  28. Proposed Method Interacting Motion prior: Mode variable

  29. Proposed Method Repulsion: Group motion: Repulsion force

  30. Proposed Method Tracking by Reversible Jump Markov Chain Monte Carlo Particle filtering: • Sampling: • Convert posterior problem:

  31. Experimental result • Using the ETH dataset [12] • Video frame rate ~14 Hz • Resolution 640×480 pixels

  32. Experimental result • Single-frame detection accuracy is measured by the overlap ratio between the ground-truth bounding box and the tracked bounding box.
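
The slide does not define the overlap ratio. A common choice is intersection over union (IoU), and the sketch below assumes that definition with boxes given as `(x_min, y_min, x_max, y_max)`; the function name is hypothetical.

```python
def overlap_ratio(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max). Using IoU as the overlap
    ratio is an assumption; the slide does not give the formula.
    """
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the intersection, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    intersection = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - intersection
    return intersection / union if union > 0 else 0.0
```

A tracked box is then typically counted as a correct detection when its overlap with the ground-truth box exceeds a chosen threshold.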

  33. Experimental result

  34. Conclusion • Combines a probabilistic model with joint variables • Models the relationship between the camera, the targets, and the geometric features • Combines multiple cues • Adaptable to different sensor configurations and different environments • Allows people to interact • Automatically detects people
