# Bayesian Decision Theory Case Studies

##### Presentation Transcript

1. Bayesian Decision Theory Case Studies • CS479/679 Pattern Recognition • Dr. George Bebis

2. Case Study I • A. Madabhushi and J. Aggarwal, "A Bayesian Approach to Human Activity Recognition", 2nd International Workshop on Visual Surveillance, pp. 25-30, June 1999.

3. Human activity recognition • Recognize human actions using visual information. • Useful for monitoring of human activity in department stores, airports, high-security buildings etc. • Building systems that can recognize any type of action is a difficult and challenging problem.

4. Goal • Build a system capable of recognizing the following ten actions from a frontal or lateral view: • sitting down • standing up • bending down • getting up • hugging • squatting • rising from a squatting position • bending sideways • falling backward • walking

5. Rationale and Approach • Rationale • People sit, stand, walk, bend down, and get up in a more or less similar fashion. • Human actions can be recognized by tracking various body parts. • Head motion trajectory • The head of a person moves in a characteristic fashion during these actions. • Recognition is formulated as Bayesian classification using the movement of the head over consecutive frames.

6. Strengths and Weaknesses • Strengths • The system can recognize actions where the gait of the subject in the test sequence differs considerably from the training sequences. • It can also recognize actions for people of varying physical structure (e.g., tall, short, fat, thin). • Weaknesses • Only actions in the frontal or lateral view can be recognized successfully by this system. • Certain assumptions might not be valid.

7. Main Steps • [Figure: main steps of the system, from input to output]

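8. Action Representation • Estimate the centroid of the head $(x_t, y_t)$ in each frame $t$. • Find the absolute differences in successive frames: $dx_t = |x_{t+1} - x_t|$, $dy_t = |y_{t+1} - y_t|$. • These differences form the feature vectors $X = (dx_1, \dots, dx_n)$ and $Y = (dy_1, \dots, dy_n)$.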
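
The feature extraction on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the head centroids are already available as (x, y) pixel coordinates, one per frame; the function name `head_motion_features` and the sample coordinates are hypothetical and do not come from the paper.

```python
import numpy as np

def head_motion_features(centroids):
    """Return (X, Y): absolute frame-to-frame differences of the
    head centroid's x and y coordinates."""
    c = np.asarray(centroids, dtype=float)   # shape (n_frames, 2)
    diffs = np.abs(np.diff(c, axis=0))       # shape (n_frames - 1, 2)
    return diffs[:, 0], diffs[:, 1]

# Hypothetical centroids for four frames of a downward head motion.
centroids = [(100, 50), (101, 58), (99, 70), (99, 85)]
X, Y = head_motion_features(centroids)
print(X)  # horizontal displacement magnitudes
print(Y)  # vertical displacement magnitudes
```

A sequence of n + 1 frames yields n differences per coordinate, so a 10-frame sequence gives 9-dimensional X and Y vectors.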

9. Head Detection and Tracking • The centroid of the head is tracked from frame to frame. • Accurate head detection and tracking are crucial. • Detection was performed manually here.

10. Bayesian Formulation • Given an input sequence, the posterior probabilities are computed for each action $a_i$ using the Bayes rule: $P(a_i \mid X, Y) = \dfrac{p(X, Y \mid a_i)\, P(a_i)}{p(X, Y)}$ • Assumption:

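11. Probability Density Estimation • Feature vectors X and Y are assumed to be independent (valid?), each following a multivariate Gaussian distribution for a given action $a_i$: $p(X, Y \mid a_i) = p(X \mid a_i)\, p(Y \mid a_i)$, with $p(X \mid a_i) = \mathcal{N}(X; \mu_X, \Sigma_X)$ and $p(Y \mid a_i) = \mathcal{N}(Y; \mu_Y, \Sigma_Y)$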

12. Probability Density Estimation (cont’d) • The sample covariance matrices are used to estimate $\Sigma_X$ and $\Sigma_Y$. • Two distributions are estimated for each action, corresponding to the frontal and lateral views (i.e., 20 densities in total).
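
As a rough sketch of the density estimation step, the code below fits one Gaussian to the X vectors and one to the Y vectors of a single action/view and scores a new sequence under the independence assumption. Several details are assumptions rather than the authors' choices: `np.cov` uses the unbiased (n - 1) normalization, a small `ridge` term is added to keep the covariance invertible, and the training data is synthetic.

```python
import numpy as np

def fit_gaussian(samples, ridge=1e-6):
    """Estimate mean and covariance from samples of shape (m, d).
    The ridge term is an assumption added for numerical stability."""
    s = np.asarray(samples, dtype=float)
    mu = s.mean(axis=0)
    cov = np.cov(s, rowvar=False) + ridge * np.eye(s.shape[1])
    return mu, cov

def log_gaussian(v, mu, cov):
    """Log of the multivariate Gaussian density N(v; mu, cov)."""
    diff = np.asarray(v, dtype=float) - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (mu.size * np.log(2 * np.pi) + logdet + maha)

def log_likelihood(X, Y, model):
    """log p(X, Y | action) = log p(X | action) + log p(Y | action)."""
    (mu_x, cov_x), (mu_y, cov_y) = model
    return log_gaussian(X, mu_x, cov_x) + log_gaussian(Y, mu_y, cov_y)

# Synthetic stand-in for 38 training sequences of 9 differences each.
rng = np.random.default_rng(0)
train_X = np.abs(rng.normal(1.0, 0.5, size=(38, 9)))
train_Y = np.abs(rng.normal(6.0, 2.0, size=(38, 9)))
model = (fit_gaussian(train_X), fit_gaussian(train_Y))
score = log_likelihood(train_X[0], train_Y[0], model)
print(score)
```

Working in log space avoids underflow when the densities of high-dimensional feature vectors become very small.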

13. Recognition • Given an input sequence, the posterior probabilities are computed for each of the stored actions (i.e., 20 values). • The input action is classified as the most likely action: $a^{*} = \arg\max_{i} P(a_i \mid X, Y)$
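
The recognition rule can be illustrated as follows. This sketch assumes the class-conditional log-likelihoods have already been computed for each stored action/view pair, and it assumes equal priors when none are given (the transcript does not preserve the priors that were used). The function names and the scores are hypothetical.

```python
import numpy as np

def posteriors(log_likelihoods, priors=None):
    """Turn per-class log-likelihoods into posterior probabilities
    with the Bayes rule. Equal priors are assumed if none are given."""
    labels = list(log_likelihoods)
    ll = np.array([log_likelihoods[k] for k in labels], dtype=float)
    if priors is None:
        log_prior = np.full(len(labels), -np.log(len(labels)))
    else:
        log_prior = np.log([priors[k] for k in labels])
    joint = ll + log_prior
    joint -= joint.max()          # stabilize before exponentiating
    p = np.exp(joint)
    p /= p.sum()                  # divide by the evidence p(X, Y)
    return dict(zip(labels, p))

def classify(log_likelihoods, priors=None):
    """Return the label with the highest posterior probability."""
    post = posteriors(log_likelihoods, priors)
    return max(post, key=post.get)

# Hypothetical log-likelihoods for three of the 20 action/view models.
scores = {
    ("sitting down", "frontal"): -12.0,
    ("sitting down", "lateral"): -10.5,
    ("walking", "frontal"): -30.2,
}
best = classify(scores)
print(best)
```

With equal priors the decision reduces to picking the largest likelihood; the normalization only matters if the posterior values themselves are reported.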

14. Discriminating Similar Actions • In some actions, the head moves in a similar fashion, making it difficult to distinguish these actions from one another; for example: (1) The head moves downward without much sideward deviation in the following actions: * squatting * sitting down * bending down

15. Discriminating Similar Actions (cont’d) (2) The head moves upward without much sideward deviation in the following actions: * standing up * rising * getting up • A number of heuristics are used to distinguish among these actions. • e.g., when bending down, the head goes much lower than when sitting down.

16. Training • A fixed CCD camera working at 2 frames per second was used to obtain the training sequences. • People of diverse physical appearance were used to model the actions. • Subjects were asked to perform the actions at a comfortable pace.

17. Training (cont’d) • To train the system, 38 sequences were taken of each person performing all the actions of interest in both the frontal and lateral views. • It was found that each action can be completed within 10 frames. • Only the first 10 frames from each sequence were used for training/testing (i.e., 5 seconds)

18. Testing • For testing, 39 sequences were used. • Of the 39 sequences, 31 were classified correctly. • Of the 8 sequences classified incorrectly, 6 were assigned to the correct action but to the wrong view.
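
The reported counts translate into two accuracy figures, depending on whether the view must also be correct. The short calculation below uses only the numbers on this slide; the variable names are illustrative.

```python
total = 39            # test sequences
correct = 31          # correct action and view
wrong_view_only = 6   # correct action, wrong view

accuracy_action_and_view = correct / total
accuracy_action_only = (correct + wrong_view_only) / total

print(f"{accuracy_action_and_view:.1%}")  # 79.5%
print(f"{accuracy_action_only:.1%}")      # 94.9%
```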

19. Results (cont’d)

20. Practical Issues • How would you find the first and last frames of an action in general (segmentation)? • Can the system still recognize an action from an incomplete sequence (i.e., when several frames are missing)? • The current system cannot recognize several actions occurring at the same time.

21. Extension • J. Usabiaga, G. Bebis, A. Erol, Mircea Nicolescu, and Monica Nicolescu, "Recognizing Simple Human Actions Using 3D Head Trajectories", Computational Intelligence, vol. 23, no. 4, pp. 484-496, 2007.

22. Case Study II • J. Yang and A. Waibel, "A Real-time Face Tracker", Proceedings of WACV'96, 1996.

23. Goal and Steps • Goal • Build a system that can detect and track a person’s face while the person moves freely in a room. • Main Steps (1) Detect arbitrary human faces in various environments using a generic skin-color model. (2) Track the face of interest by controlling the camera position and zoom. (3) Adapt skin-color model parameters based on individual appearance and lighting conditions.

24. System Components • A probabilistic model to characterize skin-color distributions of human faces. • A motion model to estimate human motion and to predict the search window in the next frame. • A camera model to predict camera motion (needed because the camera’s response was much slower than the frame rate).

25. Search Window

26. Why Use Skin Color for Face Detection? • Traditional systems performed face detection using template matching or facial features. • Using skin color leads to a faster and more robust approach than template matching or facial feature extraction.

27. Challenges Using Skin Color • Human skin colors differ from person to person. • The color representation of a face obtained by a camera is influenced by many factors (e.g., ambient light, motion etc.) • Different cameras produce significantly different color values, even for the same person under the same lighting conditions.

28. Chromatic Color Space • RGB is not the best color representation for characterizing skin color because it encodes brightness as well as color. • Skin color is represented in the chromatic space, which is defined from the RGB space as follows: $r = \dfrac{R}{R + G + B}$, $g = \dfrac{G}{R + G + B}$ (the normalized blue component $b = \dfrac{B}{R + G + B}$ is redundant since $r + g + b = 1$)
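
The conversion to chromatic coordinates is a per-pixel normalization by R + G + B. The sketch below shows one way to do it with NumPy; the function name `rgb_to_chromatic` is illustrative, and mapping pure black pixels to (0, 0) is an assumption made here to avoid division by zero.

```python
import numpy as np

def rgb_to_chromatic(rgb):
    """Map RGB values (last axis = R, G, B) to chromatic (r, g).
    Black pixels (R + G + B = 0) are mapped to (0, 0) by convention."""
    rgb = np.asarray(rgb, dtype=float)
    total = rgb.sum(axis=-1, keepdims=True)
    safe_total = np.where(total == 0, 1.0, total)
    return rgb[..., :2] / safe_total

bright = rgb_to_chromatic([200, 100, 100])
dim = rgb_to_chromatic([100, 50, 50])   # same color at half the brightness
print(bright, dim)
```

Both pixels map to the same (r, g) point, which is the property that makes the chromatic space less sensitive to brightness than raw RGB.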

29. Skin-Color Clustering • Skin colors do not fall randomly in chromatic color space but form clusters at specific points.

30. Skin-Color Clustering (cont’d) • Distributions of skin colors of different people are clustered in chromatic color space • i.e., they differ much less in color than in brightness • [Figure: skin-color distribution of 40 people of different races]

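31. Skin-Color Model • Experiments (i.e., assuming different lighting conditions and different persons) have shown that the skin-color distribution has a regular shape. • Idea: represent the skin-color distribution using a Gaussian with mean μ and covariance Σ: $p(x) = \dfrac{1}{2\pi\,|\Sigma|^{1/2}} \exp\!\left(-\tfrac{1}{2}(x - \mu)^{T} \Sigma^{-1} (x - \mu)\right)$, where $x = (r, g)^{T}$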

32. Parameter Estimation • Select skin-color regions from a set of face images. • Estimate the mean and covariance of skin-color distribution using the sample mean and covariance:

33. Face detection using the skin-color model • Each pixel x in the input image is converted into the chromatic color space and compared with the distribution of the skin-color model.
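
Parameter estimation and per-pixel matching can be sketched together. This is a minimal illustration, assuming the pixels are already expressed in (r, g) chromatic coordinates; the training samples are synthetic, the threshold of 5% of the peak density is a hypothetical choice, and `np.cov` uses the unbiased (n - 1) normalization, which may differ from the estimator used in the paper.

```python
import numpy as np

def fit_skin_model(rg_samples):
    """Sample mean and covariance of skin pixels given as (r, g) pairs."""
    s = np.asarray(rg_samples, dtype=float).reshape(-1, 2)
    return s.mean(axis=0), np.cov(s, rowvar=False)

def skin_likelihood(rg_image, mu, cov):
    """Evaluate the 2-D Gaussian density at every (r, g) pixel."""
    diff = np.asarray(rg_image, dtype=float) - mu
    inv = np.linalg.inv(cov)
    maha = np.einsum("...i,ij,...j->...", diff, inv, diff)
    norm = 1.0 / (2 * np.pi * np.sqrt(np.linalg.det(cov)))
    return norm * np.exp(-0.5 * maha)

# Synthetic skin samples clustered around a hypothetical (r, g) point.
rng = np.random.default_rng(0)
skin = rng.normal([0.45, 0.31], [0.02, 0.01], size=(500, 2))
mu, cov = fit_skin_model(skin)

# A 1x2 "image": one skin-like pixel and one gray pixel.
image = np.array([[[0.45, 0.31], [0.33, 0.33]]])
like = skin_likelihood(image, mu, cov)
peak = skin_likelihood(mu, mu, cov)     # density at the mean
mask = like > 0.05 * peak               # hypothetical threshold
print(mask)
```

The resulting mask marks skin-colored pixels only; separating faces from other skin-colored regions still requires additional cues.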

34. Example

35. Dealing with skin-color-like objects • In general, color matching alone cannot isolate faces • e.g., the background may contain skin colors

36. Dealing with skin-color-like objects (cont’d) • Additional information should be used for rejecting false positives (e.g., geometric features, motion).

37. Skin-color model adaptation • If a person is moving, the apparent skin colors change as the person’s position relative to the camera or light changes. • Idea: adapt model parameters to handle these changes.

38. Skin-color model adaptation (cont’d) • N determines how long the past parameters will influence the current parameters. • The weighting factors $a_i$, $b_i$, $c_i$ determine how much the past parameters will influence the current parameters.
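
The adaptation equations did not survive in the transcript, so the following is only a sketch under stated assumptions: the adapted parameters are taken to be a weighted linear combination of the estimates from the last N frames, with `a` weighting the mean of r, `b` the mean of g, and `c` the covariance. This mapping of the three weight sets, the function name, and all numeric values are hypothetical.

```python
import numpy as np

def adapt_skin_model(means, covs, a, b, c):
    """Combine the last N per-frame estimates into adapted parameters.
    means: (N, 2) array of (r, g) means, oldest first.
    covs:  (N, 2, 2) array of covariance matrices, oldest first.
    a, b, c: length-N weights (assumed roles: r mean, g mean, covariance)."""
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    a, b, c = (np.asarray(w, dtype=float) for w in (a, b, c))
    mu = np.array([a @ means[:, 0], b @ means[:, 1]])
    cov = np.tensordot(c, covs, axes=1)   # weighted sum of the N matrices
    return mu, cov

# N = 3 past frames; the most recent frame gets the largest weight.
weights = [0.2, 0.3, 0.5]
means = [[0.40, 0.30], [0.42, 0.31], [0.44, 0.32]]
covs = [np.eye(2) * 0.001] * 3
mu_new, cov_new = adapt_skin_model(means, covs, weights, weights, weights)
print(mu_new)
```

A larger N makes the model change more slowly, while weights concentrated on recent frames make it follow lighting changes more quickly.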

39. System initialization • Automatic mode • A general skin-color model is used to identify skin-color regions. • Motion and shape information is used to reject non-face regions. • The largest face region is selected (face closest to the camera). • Skin-color model is adapted to the face being tracked.

40. System initialization (cont’d) • Interactive mode • The user selects a point on the face of interest using the mouse. • The tracker searches around the point to find the face using a general skin-color model. • Skin-color model is adapted to the face being tracked.

41. Tracking Speed