RECOGNIZING FACIAL EXPRESSIONS THROUGH TRACKING
Salih Burak Gokturk
OVERVIEW
• PROBLEM DESCRIPTION
• TRAINING STAGE
• TESTING STAGE
• EXPERIMENTS
• CONCLUSION
Components of the recognition system
• Training with stereo: Data -> Classifier
• Testing with mono: New Data -> Classifier -> Output
• Analysis: Face Tracking (Shape Parameters)
• Intelligence: Support Vector Machine Classifier
PROBLEM DESCRIPTION (Tracking)
[Figure: face tracking; the shape X(t) is the unknown (?)]
PROBLEM DESCRIPTION (Recognition)
[Figure: X(t) -> ? among [Rigid, Open Mouth, Smile]. Training: Data -> Classifier. Testing: New Data -> Classifier -> Output]
[Diagram: Stereo Tracking -> Data -> Learn Shape -> Monocular Tracking And Classification]
• p: degrees of freedom
Support Vector Machines (SVM)
[Diagram: Training: Data -> Classifier. Testing: New Data -> Classifier -> Output]
• Finds the best discriminating hypersurface between two classes of objects
• Maps the data to a higher-dimensional space using a map function
• The hypersurface in the feature space corresponds to a hyperplane in the mapped space
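The mapping idea on this slide can be illustrated with a small sketch. It is not the classifier used in the talk: the data, the map phi(x) = (x, x^2), and the hand-picked hyperplane are all assumptions chosen so that a curved boundary in the original space becomes a flat one in the mapped space. No SVM training is performed here.

```python
# Illustrative 1-D data: class +1 lies far from the origin, class -1 near it.
# No single threshold on x separates the two classes.
points = [(-3.0, 1), (-2.5, 1), (-0.5, -1), (0.0, -1), (0.4, -1), (2.0, 1), (3.0, 1)]


def feature_map(x):
    """Hypothetical map phi(x) = (x, x^2) into a 2-D space."""
    return (x, x * x)


def linear_decision(z, w=(0.0, 1.0), b=-2.0):
    """Signed value of the hyperplane w.z + b in the mapped space.

    The weights are hand-picked for illustration, not learned by an SVM.
    """
    return w[0] * z[0] + w[1] * z[1] + b


def classify(x):
    """Label a 1-D point by the side of the hyperplane its image falls on."""
    return 1 if linear_decision(feature_map(x)) > 0 else -1


predictions = [classify(x) for x, _ in points]
labels = [y for _, y in points]
print(predictions == labels)
```

In the original 1-D space the same decision rule reads x^2 > 2, which is not a single threshold; in the mapped space it is the straight line z2 = 2.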
LUCAS TOMASI KANADE OPTICAL FLOW TRACKER EXTENDED TO 3D
[Figure: given the shape X(t) and the image I(x(t)) at time t, find the unknown shape X(t+1) from the image I(t+1) at time t+1]
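The core Lucas-Kanade step can be sketched in one dimension. This is a deliberately simplified illustration, not the 3-D extension presented in the talk: the Gaussian test signal, the sub-pixel shift, and the function names are assumptions. It linearizes brightness constancy, I(x) - J(x) ≈ d * I'(x), and solves for the displacement d by least squares.

```python
import math


def signal(x, center=10.0, sigma=3.0):
    """Smooth 1-D intensity profile used as a stand-in for an image row."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def lucas_kanade_1d(frame_t, frame_t1):
    """Estimate a single displacement d such that frame_t1(x) ~ frame_t(x - d).

    Least-squares solution of the linearized brightness-constancy equation:
    d = sum(I' * (I - J)) / sum(I' * I').
    """
    num = 0.0
    den = 0.0
    for i in range(1, len(frame_t) - 1):
        grad = (frame_t[i + 1] - frame_t[i - 1]) / 2.0  # central difference
        diff = frame_t[i] - frame_t1[i]
        num += grad * diff
        den += grad * grad
    return num / den


true_shift = 0.3  # assumed sub-pixel motion between the two frames
frame_t = [signal(x) for x in range(21)]
frame_t1 = [signal(x - true_shift) for x in range(21)]
estimated_shift = lucas_kanade_1d(frame_t, frame_t1)
print(round(estimated_shift, 2))
```

A full tracker would iterate this step and solve for more unknowns per feature; here a single unknown keeps the algebra visible.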
One-to-Many Application of Support Vector Machines (SVM)
• One hypersurface per class is calculated
• Each new data point is tested against each hypersurface
• A different probability is assigned to the ith class
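A minimal sketch of the one-to-many decision follows. The slide does not say how the per-class probability is computed, so the softmax over per-class scores is an assumption, as are the made-up scores and the function names.

```python
import math

CLASSES = ["neutral", "open mouth", "close mouth", "smile", "raise eyebrow"]


def softmax(scores):
    """Turn arbitrary real scores into values that sum to 1 (assumed scheme)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def one_vs_rest_decision(scores):
    """Pick the class whose hypersurface gives the highest probability.

    scores[i] stands for the signed output of the i-th class-vs-rest classifier.
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return CLASSES[best], probs


# Made-up outputs of the five per-class classifiers for one test frame.
example_scores = [-0.8, 1.4, 0.2, -1.1, -0.5]
label, probabilities = one_vs_rest_decision(example_scores)
print(label)
```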
• Training (Stereo) with 2 people, 240 frames in total
• Testing with 3 people
• 5 expressions: neutral, open mouth, close mouth, smile, raise eyebrow
• A velocity term is added to the shape vector
• Two other classifiers were tested:
  1. Clustering
  2. N-Nearest Neighbor
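Two of the ideas on this slide can be sketched together. The slide's formula for the velocity term is not preserved, so the frame-to-frame difference used here is an assumption. N-Nearest Neighbor is interpreted as a majority vote among the N closest training vectors. The shape values and labels are made up for illustration.

```python
from collections import Counter
import math


def add_velocity(shape_sequence):
    """Append a finite-difference velocity to each shape vector.

    The first frame has no predecessor, so its velocity is set to zero.
    """
    augmented = []
    previous = shape_sequence[0]
    for current in shape_sequence:
        velocity = [c - p for c, p in zip(current, previous)]
        augmented.append(list(current) + velocity)
        previous = current
    return augmented


def knn_classify(train_vectors, train_labels, query, n=3):
    """Majority vote among the n training vectors closest to the query."""
    order = sorted(
        range(len(train_vectors)),
        key=lambda i: math.dist(train_vectors[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:n])
    return votes.most_common(1)[0][0]


# Made-up two-parameter shape vectors for six consecutive frames.
sequence = [[0.0, 0.0], [0.1, 0.0], [0.9, 0.1], [1.0, 0.1], [0.0, 0.9], [0.1, 1.0]]
frame_labels = ["neutral", "neutral", "open mouth", "open mouth", "smile", "smile"]

features = add_velocity(sequence)
query = [0.95, 0.1, 0.05, 0.0]  # shape parameters followed by velocity
predicted = knn_classify(features, frame_labels, query, n=3)
print(predicted)
```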
Table 1: Performance of the system for different expressions
Rows: input expression (count). Columns: decision of the system.
Input | Neutral | Open mouth | Close mouth | Smile | Raise eyebrow
Neutral (44) | 32 | 6 | 3 | 0 | 3
Open mouth (80) | 0 | 76 | 4 | 0 | 0
Close mouth (50) | 0 | 1 | 49 | 0 | 0
Smile (87) | 2 | 0 | 0 | 81 | 4
Raise eyebrow (21) | 3 | 0 | 0 | 0 | 18
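The counts in Table 1 can be summarized with a short script. The numbers are copied from the table. The per-class rates and the overall rate are derived here and do not appear on the slide in this form.

```python
LABELS = ["Neutral", "Open mouth", "Close mouth", "Smile", "Raise eyebrow"]

# Rows: input expression. Columns: decision of the system (Table 1).
CONFUSION = [
    [32, 6, 3, 0, 3],
    [0, 76, 4, 0, 0],
    [0, 1, 49, 0, 0],
    [2, 0, 0, 81, 4],
    [3, 0, 0, 0, 18],
]

# Correct decisions sit on the diagonal.
correct = sum(CONFUSION[i][i] for i in range(len(LABELS)))
total = sum(sum(row) for row in CONFUSION)
overall_accuracy = correct / total

# Fraction of each input expression that was recognized correctly.
per_class = {
    LABELS[i]: CONFUSION[i][i] / sum(CONFUSION[i]) for i in range(len(LABELS))
}

print(f"{correct}/{total} = {overall_accuracy:.1%}")  # prints 256/282 = 90.8%
for name, rate in per_class.items():
    print(f"{name}: {rate:.1%}")
```

The 256/282 total matches the "SVM with kernel erbf" entry in Table 2.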
Table 2: Comparison Between Different Methods
Method | Same person | Total
SVM with kernel erbf | 176/182 | 256/282
SVM with kernel rbf | 170/182 | 253/282
Clustering | 161/182 | 242/283
N-Nearest with N=9 | 173/182 | 255/282
N-Nearest with N=5 | 173/182 | 253/282
Table 3: Comparison Between Different Methods with only one person in the training set
• Training (Stereo) with 1 person, 130 frames in total
• Testing with 3 people
• 5 expressions: neutral, open mouth, close mouth, smile, raise eyebrow
Method | Same person | Total
SVM with kernel erbf | 98/110 | 216/282
SVM with kernel rbf | 99/110 | 207/282
Clustering | 109/110 | 233/282
N-Nearest with N=9 | 109/110 | 231/282
N-Nearest with N=5 | 110/110 | 229/282
Table 4: Comparison Between Different Methods with three emotional expressions
• Training (Stereo) with 2 people, 240 frames in total
• Testing with 3 people
• 3 emotional expressions: neutral, happy, surprise
• Transitions between expressions are separated
Method | Same person | Total
SVM with kernel erbf | 164/165 | 222/228
SVM with kernel rbf | 165/165 | 223/228
Clustering | 152/165 | 213/228
N-Nearest with N=9 | 163/165 | 225/228
N-Nearest with N=5 | 164/165 | 224/228
N-Nearest with N=3 | 164/165 | 223/228
N-Nearest with N=1 | 164/165 | 223/228
Performance Comparison with Previous Expression Recognition Work
Work | Recognition Rate | Pose Change | Number of Expressions | Test/Train Subject | Number of Data | Comments
Chen et al., ICME 2000 | 89% | Direct camera view | 7 | Different subject | 470 images | Problem with different people
Wang et al., AFGR 1998 | 96% | Direct camera view | 3 | Different subject | 29 image sequences | Sequence classification (easier)
Lien et al., AFGR 1998 | 85%-93% | ~10 degrees rotation | 4 | Different subject | ~130 images | Only the upper part of the face is classified
Hiroshi et al., ICPR 1996 | 70% | ~45-60 degrees rotation | 5 | Same subject | 900 images | Permits rotations, but rates are not as good
Chang et al., IJCNN 1999 | 92% | Direct camera view | 3 | Different subject | 38 images | Small test and training set
Matsuno et al., ICCV 1995 | 80% | Direct camera view | 4 | Different subject | 45 images | Small test and training set
Hong et al., AFGR 1998 | 65%-85% | Direct camera view | 7 | Same and different subject | ~250 images | 85% with known person, 65% with unknown person
Hong et al., AFGR 1998 | 81%-97% | Direct camera view | 3 | Same and different subject | ~250 images | 97% with known person, 81% with unknown person
Sakaguchi et al., ICPR 1996 | 84% | Direct camera view | 6 | Same subject | - | The test and training sets are not mentioned
Our Work | 91% | ~70-80 degrees rotation | 5 | Different subject | 282 images | Table 2
Our Work | 98% | ~70-80 degrees rotation | 3 | Different subject | 228 images | Table 4 - Emotional Expressions
Conclusions
• Breakthrough facial expression recognition rates.
• 3-D is the right way to go…
Future Work
• Test with more subjects and expressions.
• Further application to face recognition (?)