
The Recognition of Human Movement Using Temporal Templates



Presentation Transcript


  1. The Recognition of Human Movement Using Temporal Templates Liat Koren

  2. Lecture subjects • Introduction • Prior work • The Temporal Templates • Usage example

  3. Introduction • Computer vision trends • Less focus on image or camera motion • More focus on labeling of actions • Reasons • More computational power • Wireless applications • Interactive environments

  4. Introduction (cont.) • Recent efforts focus on three-dimensional object reconstruction • The assumption is that 3D reconstruction is needed to recognize human motion. • This article claims otherwise • View-based approach • Direct recognition

  5. Motivating Example

  6. Motivating Example

  7. Motivating Example • Static pictures • Hard to recognize. • Sequence in motion • Humans can recognize the action without three-dimensional reconstruction. • Conclusion • It is possible to recognize movement using only the motion itself.

  8. 3D-based recognition • Process • Recover the pose of the person at each time instant using a 3D model. • The model’s projected image should be as close as possible to the object (e.g., edges of the body in the image) • Drawbacks • Complicated process • Human intervention is usually required • Special imaging environment

  9. 2D-based recognition • An action is a sequence of static poses of the object. • Requires • Normalization • Removal of background

  10. Wilson and Bobick’s approach • Actions are usually hand gestures • Representation • Actual image • Grayscale • No background • Benefits • Hand appearance is fairly similar over a wide range of people • Problems • Actions that involve the appearance of the whole body are not visually consistent across different people.

  11. Yamato et al.’s approach • Representation • No background • Black-and-white silhouettes • Matching • Vector quantization of the silhouettes • Sequences matched with a mathematical model (hidden Markov models) • Benefits • Helps handle the variability between people • Problems • Movement inside the silhouette is lost

  12. Summary of prior work • An action is treated as a sequence of static poses. • Requires individual features or properties that can be extracted and tracked from each frame. • Recognition of movement from a sequence of images is a complicated task. • Usually requires prior recognition and segmentation of the person.

  13. Motion-based recognition • Attempt to characterize the motion itself without reference to the underlying static poses of the body. • Possible approaches • Blob-like representation • Tracking of predefined regions (e.g., legs, head, mouth) using motion. • Facial expression patches • Whole-body patches • Measure typical patterns of muscle activation

  14. Terms • Movement • Where: where motion has occurred in the image sequence. • MEI: Motion-Energy Image • How: how the motion is moving. • MHI: Motion-History Image • MEI + MHI = temporal template

  15. Temporal Templates • Representation of movement • View specific • Movement is motion in time • Vector image that can be matched against stored representations of movements. • Assumptions • Background is static • Camera movements can be removed • Motion of irrelevant objects can be eliminated

  16. Motion-Energy Images • Where did the movement occur?

  17. Motion-Energy Images • Notice that: • If τ is very large, all the differences are accumulated • τ has a strong influence on the temporal representation of a movement.

  18. Motion-Energy Images • A smooth change in the viewing angle causes a smooth change in the viewed image, so coarse sampling of the viewing circle (every 30°) is enough
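
The effect of τ on the MEI can be sketched as follows. This is a minimal illustration, assuming the MEI is the union of the last τ binary motion images (images that mark where motion was detected in each frame). The function name `motion_energy_image` and the toy frames are hypothetical and do not come from the slides.

```python
import numpy as np

def motion_energy_image(motion_masks, tau):
    """Union (logical OR) of the last `tau` binary motion masks.

    motion_masks: array of shape (T, H, W), nonzero where motion was detected.
    tau: window length in frames, assumed to be at least 1.
    """
    recent = np.asarray(motion_masks, dtype=bool)[-tau:]
    return recent.any(axis=0)

# Toy sequence: one "moving" pixel sweeping left to right over 4 frames.
masks = np.zeros((4, 1, 4), dtype=bool)
for t in range(4):
    masks[t, 0, t] = True

mei_short = motion_energy_image(masks, tau=2)  # keeps only the last two positions
mei_long = motion_energy_image(masks, tau=4)   # accumulates the whole sweep
```

With a small τ only the most recent part of the sweep survives, while a large τ accumulates every difference, which is why the choice of τ shapes the template so strongly.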

  19. Motion-History Images • Intensity of a pixel represents the temporal history in that pixel. • Newer movement is brighter.

  20. Motion-History Images • A time window of size τ is used; movement older than τ is ignored. • The article’s results use a simple replacement and decay operator: H_τ(x, y, t) = τ if D(x, y, t) = 1, otherwise max(0, H_τ(x, y, t-1) - 1), where D is the binary image of detected motion. • Notice that the MEI can be calculated from the MHI by painting every non-black pixel white. • One may wonder: why not use only the MHI? The answer is given later.

  21. MEI and MHI in a nutshell • The MEI and MHI together form a two-component vector image designed to encode a variety of motion properties. • A benefit of this representation is that the calculation is recursive, so only up-to-date information needs to be stored, making the computation both fast and space efficient.
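
A replacement and decay update of this kind can be sketched in a few lines. The sketch assumes that a pixel is set to τ wherever motion is detected in the current frame and otherwise decays by 1 per frame, never dropping below 0. The names `update_mhi` and `mei_from_mhi` are illustrative only.

```python
import numpy as np

def update_mhi(mhi, motion_mask, tau):
    """One recursive step: replace with tau where motion occurred, else decay by 1."""
    decayed = np.maximum(mhi - 1, 0)
    return np.where(motion_mask, tau, decayed)

def mei_from_mhi(mhi):
    """Any non-black (nonzero) MHI pixel is white in the MEI."""
    return mhi > 0

tau = 3
mhi = np.zeros((1, 4), dtype=int)
for t in range(4):                      # one pixel sweeps left to right
    mask = np.zeros((1, 4), dtype=bool)
    mask[0, t] = True
    mhi = update_mhi(mhi, mask, tau)

# mhi is now [[0, 1, 2, 3]]: newer movement is brighter, and the
# leftmost pixel (older than the window) has decayed to zero.
mei = mei_from_mhi(mhi)
```

Only the previous MHI and the current motion mask are needed at each step, which is what makes the computation recursive.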

  22. Matching Temporal Templates • Collect training examples of each movement from a variety of viewing angles. • Compute a statistical representation of the MHI/MEI images (Hu moments) • Given an input movement: • Calculate its statistical representation • Use the Mahalanobis distance to find the stored movement that is nearest to the input.

  23. Mahalanobis Distance Example
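
A small numeric example may help here. The sketch below assumes each stored movement is summarized by a mean feature vector and a covariance matrix. The feature values, the names `move_a` and `move_b`, and the helper `nearest_movement` are invented for illustration.

```python
import numpy as np

def mahalanobis(x, mean, cov):
    """Distance from x to a distribution with the given mean and covariance."""
    diff = np.asarray(x, dtype=float) - mean
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))

def nearest_movement(x, models):
    """models maps a movement name to (mean, cov); returns the closest name."""
    return min(models, key=lambda name: mahalanobis(x, *models[name]))

models = {
    # move_a varies a lot along the first feature, move_b is tightly clustered.
    "move_a": (np.array([0.0, 0.0]), np.array([[4.0, 0.0], [0.0, 0.25]])),
    "move_b": (np.array([3.0, 0.0]), np.array([[0.25, 0.0], [0.0, 0.25]])),
}
query = np.array([2.0, 0.0])

d_a = mahalanobis(query, *models["move_a"])   # 1.0
d_b = mahalanobis(query, *models["move_b"])   # 2.0
best = nearest_movement(query, models)        # "move_a"
```

In plain Euclidean terms the query is closer to the mean of `move_b` (distance 1 versus 2), but the Mahalanobis distance accounts for how much each movement's features normally vary, so `move_a` is the better match.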

  24. Reasoning for the algorithm • Mahalanobis distance provides: • Good matching, as shown in the article’s results. • Simple calculation, which makes real-time applications feasible. • Hu moments give an image representation that is invariant to scale and translation. • One problem with Hu moments is that “Hu moments are difficult to reason about intuitively” (the authors)

  25. Testing the system • 18 exercises performed by an experienced aerobics instructor. • MEIs are on the bottom rows.

  26. Why both MHI and MEI? Because the MEI and MHI capture two different characteristics of the movement (the “where” and the “how”), they look different, and thus both are essential.
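
The translation invariance of Hu moments can be checked numerically. Hu defined seven such moments; the sketch below computes only the first two, from normalized central moments, as an illustration and not as the article's implementation. The function name `hu_first_two` and the test blob are hypothetical.

```python
import numpy as np

def hu_first_two(image):
    """First two Hu moments of a 2D intensity image."""
    img = np.asarray(image, dtype=float)
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    cx, cy = (xs * img).sum() / m00, (ys * img).sum() / m00  # centroid

    def eta(p, q):
        # Central moment (translation invariant), normalized for scale.
        mu = ((xs - cx) ** p * (ys - cy) ** q * img).sum()
        return mu / m00 ** (1 + (p + q) / 2)

    h1 = eta(2, 0) + eta(0, 2)
    h2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return np.array([h1, h2])

shape = np.zeros((12, 12))
shape[2:5, 1:7] = 1.0                            # a 3x6 blob
shifted = np.roll(shape, (4, 3), axis=(0, 1))    # the same blob, translated

same = np.allclose(hu_first_two(shape), hu_first_two(shifted))
```

Moving the blob leaves the two values unchanged. On a pixel grid, scale invariance holds only approximately because of discretization, so it is not checked here.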

  27. First experiment [Figure: viewing angles 0°, 30°, 60°, 90°, 120°, 150°, 180°] • Input: 30° left of the subject • Match against all seven views of all 18 moves • 12 out of 18 are correctly recognized

  28. Analysis of the first experiment’s results [Figure: panels labeled input, false match, and correct match; captions: Move 13 at 30°, Move 6 at 0°, the correct match]

  29. Combining multiple views • Two cameras with orthogonal views • Minimize the sum of the Mahalanobis distances between the two input templates and two stored views of a movement that are 90° apart. • Hidden assumption: we know the angular relationship between the cameras.
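
The two-camera search can be sketched as a loop over every movement and every pair of stored views that are 90° apart. The data layout (`models` keyed by movement name and angle), the function `best_two_view_match`, and the toy feature vectors are assumptions made for this illustration.

```python
import numpy as np

VIEW_ANGLES = range(0, 181, 30)  # stored views every 30 degrees
CAMERA_OFFSET = 90               # assumed known angle between the two cameras

def mahalanobis(x, mean, cov):
    diff = np.asarray(x, dtype=float) - mean
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))

def best_two_view_match(feat_a, feat_b, models):
    """models maps (move, angle) -> (mean, cov).

    Returns (score, move, angle of camera A) with the smallest summed distance.
    """
    best = None
    for move in sorted({m for m, _ in models}):
        for angle in VIEW_ANGLES:
            key_a, key_b = (move, angle), (move, angle + CAMERA_OFFSET)
            if key_a not in models or key_b not in models:
                continue  # no stored view 90 degrees away from this one
            score = (mahalanobis(feat_a, *models[key_a])
                     + mahalanobis(feat_b, *models[key_b]))
            if best is None or score < best[0]:
                best = (score, move, angle)
    return best

# Toy models: identity covariance, means that vary with move and angle.
move_ids = {"move_1": 0.0, "move_2": 5.0}
models = {
    (move, angle): (np.array([move_id, angle / 30.0]), np.eye(2))
    for move, move_id in move_ids.items()
    for angle in VIEW_ANGLES
}
feat_a = np.array([5.0, 1.0])  # resembles move_2 seen from 30 degrees
feat_b = np.array([5.0, 4.0])  # resembles move_2 seen from 120 degrees
score, move, angle = best_two_view_match(feat_a, feat_b, models)
```

Because the two views are constrained to belong to the same movement and to be 90° apart, a stored view that happens to resemble only one camera's input scores poorly on the other.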

  30. Second experiment [Figure: viewing angles 0°, 30°, 60°, 90°, 120°, 150°] • Input with two cameras: • 30° left of the subject • 60° right of the subject • Match against all seven views of all 18 moves • 15 out of 18 are correctly recognized

  31. Analysis of the second experiment’s results [Figure: panels labeled input, false match, and correct match; captions: the correct match, Move 16, Move 15]

  32. Segmentation and Recognition • Problem: different people perform a movement at different speeds. • Solution: segmentation • When training the system, calculate τmax and τmin for each movement. • Use an algorithm that matches over a wide range of τ.
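
One way to cover a range of τ cheaply can be sketched as follows. It assumes a replace-and-decay MHI in which a moving pixel is set to τ and otherwise decays by 1 per frame. Under that assumption, the MHI for a shorter window is obtained by subtracting the difference and clipping at zero, so the history does not have to be recomputed for each τ. The helper names are hypothetical.

```python
import numpy as np

def update_mhi(mhi, motion_mask, tau):
    """Replace with tau where motion occurred, otherwise decay by 1 (floor 0)."""
    return np.where(motion_mask, tau, np.maximum(mhi - 1, 0))

def shrink_window(mhi, delta):
    """MHI for window (tau - delta), derived from the MHI for window tau."""
    return np.maximum(mhi - delta, 0)

def sweep_mhi(tau, width=5):
    """MHI of a single pixel sweeping left to right across `width` frames."""
    mhi = np.zeros((1, width), dtype=int)
    for t in range(width):
        mask = np.zeros((1, width), dtype=bool)
        mask[0, t] = True
        mhi = update_mhi(mhi, mask, tau)
    return mhi

mhi_tau4 = sweep_mhi(tau=4)
mhi_tau3_direct = sweep_mhi(tau=3)
mhi_tau3_derived = shrink_window(mhi_tau4, delta=1)
```

Starting from the MHI computed for τmax, each shorter window down to τmin can be derived this way and matched against the stored templates.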

  33. Problems • Problems with the current system • One person partially occludes another • Solution: use several cameras • More than one person appears in the field of view • Solution: use a tracking bounding box

  34. More Problems • The motion of part of the body is not specified during a movement • Possible solutions • Automatically mask away regions with this type of motion • Always include them • Camera motion • Rather easy to eliminate since camera motion is limited. • The person performs the movement while in locomotion

  35. The KidsRoom: An Application • The room is aware of the children (at most 4) • The room takes the children through a story. • The room’s reaction is influenced by the actions of the children. • Current story: an adventurous tour to monster land • In the last scene the monsters teach the children to dance. • Then, the monsters follow the children if they perform movements they “know” • The narration coerces the children to room locations where occlusion is not a problem

  36. The End
