This study presents a method for finding the best-fit upper body joint configuration in a 2D video using belief propagation. By subtracting frames, applying thresholds, and utilizing a priori data, the algorithm efficiently infers joint probabilities. It detects candidate joint states based on face detection, shoulder assumptions, and energy maps. Belief propagation is used to propagate probabilities among joint candidates, leading to faster inference. The method improves accuracy and speed by considering multiple frame probabilities and leveraging a Markov model for joint position calculations. The approach is enhanced by discarding unlikely states, fitting skin color and wrist locations, and enabling smoother joint transitions. Additionally, the study discusses the possibility of converting 2D data to 3D using joint projections.
Multiple Frame Motion Inference Using Belief Propagation • Jiang Gao, Jianbo Shi • Presented By: Gilad Kapelushnik • Visual Recognition, Spring 2005, Technion IIT.
Abstract • Find “best fit” upper body joint configuration. • Input is a 2D video. • Each joint is described by its location on a 2D grid: S1(X,Y), S2(X,Y), S3(X,Y), S4(X,Y), S5(X,Y), S6(X,Y). • Let J = {S1,S2,S3,S4,S5,S6} be a joint configuration. We would like to find:
Motion Energy Image • Step 1: Subtract two sequential frames. • Step 2: Apply threshold.
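The two steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: it assumes grayscale `uint8` frames, and the function name `motion_energy_image` and the threshold value 25 are placeholders chosen here.

```python
import numpy as np

def motion_energy_image(prev_frame, next_frame, threshold=25):
    """Binary motion energy image from two sequential grayscale frames."""
    # Step 1: subtract the frames. Cast to a signed type first so that
    # uint8 arithmetic does not wrap around on negative differences.
    diff = np.abs(next_frame.astype(np.int16) - prev_frame.astype(np.int16))
    # Step 2: apply the threshold. Pixels that changed enough become
    # "energy pixels" (value 1); everything else is 0.
    return (diff > threshold).astype(np.uint8)

# Toy example: a 2x2 block of pixels changes between the frames.
prev_frame = np.zeros((4, 4), dtype=np.uint8)
next_frame = prev_frame.copy()
next_frame[1:3, 1:3] = 200
energy = motion_energy_image(prev_frame, next_frame)
# Swapping the frames gives the same result because of the absolute value.
energy_reversed = motion_energy_image(next_frame, prev_frame)
```

The signed cast matters in practice: subtracting `uint8` arrays directly would turn a drop from 200 to 0 into a value of 56 instead of 200.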
From #NrgPixels To Probability • Sum the Energy Pixels in the Patch. • Calculate probability using the following: (Figure labels: S5(10,60), S6(40,30).)
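The patch sum can be sketched as below. The conversion formula itself did not survive in this transcript, so the normalisation used here (energy count divided by patch area) is purely an assumption for illustration, as are the function names and the patch half-width `half`.

```python
import numpy as np

def patch_energy(energy, x, y, half=2):
    """Count energy pixels in the square patch centred on column x, row y."""
    h, w = energy.shape
    # Clip the patch to the image borders.
    r0, r1 = max(y - half, 0), min(y + half + 1, h)
    c0, c1 = max(x - half, 0), min(x + half + 1, w)
    return int(energy[r0:r1, c0:c1].sum())

def patch_probability(energy, x, y, half=2):
    # ASSUMPTION: normalise by the full patch area. The slide's actual
    # formula is not recoverable; this is only a placeholder mapping
    # from an energy count to a value in [0, 1].
    return patch_energy(energy, x, y, half) / float((2 * half + 1) ** 2)

# Toy binary energy map with a 2x2 block of energy pixels.
energy_map = np.zeros((10, 10), dtype=np.uint8)
energy_map[4:6, 4:6] = 1
count = patch_energy(energy_map, 5, 5, half=1)
prob = patch_probability(energy_map, 5, 5, half=1)
```

Any monotone mapping from count to probability would fit the same interface, so the placeholder can be swapped for the real formula without touching the rest of the pipeline.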
Main Idea • Find the configuration J with the highest probability. • Computing all possible probabilities is inefficient. • A-priori data gives better and faster results. • Removing impossible configurations reduces inference time. (Figure labels, example probabilities: 0.12, 0.84, 0.19, 0.68, 0.02.)
a-Priori Data • A probability table for each P(Sx,Sy). • Compute the probability at grid crossings. • Use nearest neighbor for the rest of the image. • Example: for the right arm, P(S2,S3). • Red: low probability. • Green: high probability.
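The nearest-neighbor lookup can be sketched as follows. The table layout (a 4D array indexed by the grid crossings of both joints), the grid `spacing`, and the function names are assumptions made for this example; the slides do not specify how the table is stored.

```python
import numpy as np

def nearest_crossing(x, y, spacing, grid_shape):
    """Index (row, col) of the grid crossing closest to pixel (x, y)."""
    row = int(np.clip(round(y / spacing), 0, grid_shape[0] - 1))
    col = int(np.clip(round(x / spacing), 0, grid_shape[1] - 1))
    return row, col

def pairwise_prior(table, pos_a, pos_b, spacing):
    """Look up P(Sa, Sb) for two pixel positions by snapping each joint
    to its nearest grid crossing (hypothetical 4D table layout)."""
    grid_shape = table.shape[:2]
    ra, ca = nearest_crossing(pos_a[0], pos_a[1], spacing, grid_shape)
    rb, cb = nearest_crossing(pos_b[0], pos_b[1], spacing, grid_shape)
    return float(table[ra, ca, rb, cb])

# Toy table on a 3x3 grid with 20-pixel spacing; one entry is set.
table = np.zeros((3, 3, 3, 3))
table[1, 1, 2, 1] = 0.84
# Elbow near crossing (row 1, col 1), wrist near crossing (row 2, col 1).
p = pairwise_prior(table, (22, 18), (17, 43), spacing=20)
```

Storing probabilities only at crossings keeps the table small; the price is that all pixels in the same grid cell neighbourhood share one value.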
Detect Candidate states (1) • The face is detected using a face detection algorithm. • Initial assumption of shoulder positions from face and pose. • Even using BP there are too many possible states to go through. • Candidates for elbows from shoulders and the energy map. • Candidates for wrists from a skin color model.
Detect Candidate states (2) • Many states can be discarded. • Remove close candidate states. • Pros: Much faster inference. • Cons: Less accurate. • Note: This is only an option. (Figure: candidates that fit skin color and wrist location; red for left wrist, pink for right wrist, blue for elbow.)
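One way to remove close candidate states is a greedy pass that keeps the best-scoring candidate and drops anything within a minimum distance of an already kept one. The slides do not say which pruning rule was used, so the greedy strategy, the function name, and the `min_dist` parameter below are assumptions.

```python
import numpy as np

def prune_close_candidates(points, scores, min_dist):
    """Return indices of candidates to keep, best score first.

    points: array of shape (n, 2) with (x, y) positions.
    scores: array of shape (n,), higher means more likely.
    """
    order = np.argsort(scores)[::-1]  # best candidates first
    kept = []
    for idx in order:
        # Keep a candidate only if it is far enough from every kept one.
        far_enough = all(
            np.hypot(points[idx][0] - points[k][0],
                     points[idx][1] - points[k][1]) >= min_dist
            for k in kept
        )
        if far_enough:
            kept.append(int(idx))
    return kept

points = np.array([[10, 10], [12, 11], [40, 40], [41, 39]], dtype=float)
scores = np.array([0.9, 0.5, 0.3, 0.7])
kept = prune_close_candidates(points, scores, min_dist=5.0)
```

Here two clusters of two candidates each collapse to their best member, which halves the number of states per joint and shrinks every pairwise table accordingly.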
The Markov Model • Empty circles: states (2D positions of joints). • Full circles: observations (computed from the energy map). • Each state corresponds to an observation.
Belief Propagation (1) • Solve the inference problem using an algorithm with linear complexity. • Each joint has a vector with probabilities for each candidate. (Figure: chain of nodes Shoulder, Elbow, Wrist.)
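Belief Propagation (2) • For each iteration: • Each node sends a message to its neighbor nodes containing the “wanted” probability (for each state). • Messages are computed according to:

$$m_{ij}(s_j) = \alpha \sum_{s_i} \psi_{ij}(s_i, s_j)\, \phi_i(s_i) \prod_{k \in N(i) \setminus j} m_{ki}(s_i)$$

• $m_{ij}$: message from i to j. This is actually a vector with a probability for each state. • $\alpha$: normalization variable. • $\sum_{s_i}$: sum over all candidates. • $\psi_{ij}$: a-priori data for each state. • $\phi_i$: observation (# of energy pixels in patch) for each state, converted to a probability. • $m_{ki}$: message from k to i (all messages from the neighbors of i other than j). (Figure: nodes 1 to 4 exchanging messages m12, m21, m23, m32, m14, m41.)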
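A single message computation can be sketched with NumPy as below. This follows the standard sum-product form (a-priori compatibility times observation times incoming messages, summed over the sender's candidates, then normalized); the function name, argument layout, and the numbers are assumptions for illustration, not values from the paper.

```python
import numpy as np

def send_message(pairwise, observation, incoming):
    """Message from node i to node j.

    pairwise:    array [states_i, states_j], a-priori data per state pair.
    observation: array [states_i], observation probability per state of i.
    incoming:    list of arrays [states_i], messages into i from its
                 neighbors other than j.
    """
    local = observation.copy()
    for m in incoming:
        local = local * m
    # Sum over all candidates of the sender i.
    message = pairwise.T @ local
    # Normalize so the message is a probability vector over j's states.
    return message / message.sum()

# Sender with 4 candidate states, receiver with 2 (illustrative numbers).
pairwise = np.array([[0.9, 0.1],
                     [0.8, 0.2],
                     [0.2, 0.8],
                     [0.1, 0.9]])
observation = np.array([0.7, 0.1, 0.1, 0.1])
msg = send_message(pairwise, observation, incoming=[])
```

The result has one entry per state of the receiving node, regardless of how many candidates the sender has.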
Belief Propagation (3) - Example • Message from node 1 to node 2. (Figure labels: 4 states, 2 states.)
Belief Propagation (4) • BP converges after 2-4 iterations (given the right a-priori data). • For every joint there is a probability vector with one entry per candidate state.
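To show how the per-joint probability vectors come out of the iterations, here is a minimal sum-product loop on a shoulder, elbow, wrist chain. It is a sketch under assumptions: synchronous updates, uniform initial messages, and made-up candidate counts and probabilities; `run_bp` is a hypothetical name.

```python
import numpy as np

def run_bp(observations, pairwise, edges, iterations=4):
    """Sum-product BP; returns a normalized belief vector per node."""
    neighbours = {n: [] for n in observations}
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)

    def compat(i, j):
        # Pairwise tables are stored once per edge; transpose if reversed.
        return pairwise[(i, j)] if (i, j) in pairwise else pairwise[(j, i)].T

    # Start from uniform messages.
    messages = {(i, j): np.ones(len(observations[j])) / len(observations[j])
                for i in neighbours for j in neighbours[i]}
    for _ in range(iterations):
        new = {}
        for (i, j) in messages:
            local = observations[i].copy()
            for k in neighbours[i]:
                if k != j:
                    local = local * messages[(k, i)]
            m = compat(i, j).T @ local
            new[(i, j)] = m / m.sum()
        messages = new

    beliefs = {}
    for i in observations:
        b = observations[i].copy()
        for k in neighbours[i]:
            b = b * messages[(k, i)]
        beliefs[i] = b / b.sum()
    return beliefs

observations = {
    "shoulder": np.array([1.0]),        # one fixed shoulder candidate
    "elbow": np.array([0.5, 0.5]),      # energy map is ambiguous
    "wrist": np.array([0.9, 0.1]),      # skin color favours candidate 0
}
pairwise = {
    ("shoulder", "elbow"): np.array([[0.5, 0.5]]),
    ("elbow", "wrist"): np.array([[0.9, 0.1], [0.1, 0.9]]),
}
edges = [("shoulder", "elbow"), ("elbow", "wrist")]
beliefs = run_bp(observations, pairwise, edges)
```

In this toy chain the elbow's own observation is uninformative, yet its belief ends up favouring candidate 0 because the wrist evidence propagates through the a-priori table.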
Multiple Frame Probability • Using multiple frames (8) is proposed for smoother transitions between configurations. • Prevents a joint from changing its state to one that is “far away” (Euclidean distance). • Though BP was designed to work with loop-free models, the authors state that it worked fine. And for those who really want to know:
2D to 3D • 2D -> 3D by Taylor (2000). • Assuming (u1,v1) and (u2,v2) are the image projections of two joints, the depth can be retrieved using the following:
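The formula did not survive in this transcript. As commonly stated for Taylor's scaled-orthographic method, the relative depth between the two endpoints of a segment of known length l is dZ² = l² − ((u1 − u2)² + (v1 − v2)²) / s², where s is the camera scale factor. The sketch below assumes that form; the segment length and scale are inputs the slide does not mention, and the function name is hypothetical.

```python
import math

def relative_depth(p1, p2, length, scale):
    """Magnitude of the depth difference between two joints.

    p1, p2: image projections (u, v) of the two joints.
    length: assumed known 3D length of the segment joining them.
    scale:  assumed scale factor of the scaled-orthographic camera.
    """
    (u1, v1), (u2, v2) = p1, p2
    projected_sq = ((u1 - u2) ** 2 + (v1 - v2) ** 2) / scale ** 2
    dz_sq = length ** 2 - projected_sq
    if dz_sq < 0:
        # The projection is longer than the segment: length or scale
        # is inconsistent with the image measurements.
        raise ValueError("projected segment longer than its 3D length")
    # Only the magnitude is recovered; the sign (towards or away from
    # the camera) stays ambiguous.
    return math.sqrt(dz_sq)

dz = relative_depth((0.0, 0.0), (3.0, 0.0), length=5.0, scale=1.0)
```

Because the square root leaves the sign open, each segment has two possible orientations, so a kinematic chain needs extra constraints to pick one pose.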
Results (3) • Errors occur when two joints intersect each other. • On some occasions, even when limbs intersect, it was possible to infer correctly.