Statistical Learning of Multi-View Face Detection. Microsoft Research Asia Stan Li, Long Zhu, Zhen Qiu Zhang, Andrew Blake, Hong Jiang Zhang, Harry Shum Presented by Derek Hoiem. Overview. Viola-Jones AdaBoost FloatBoost Approach Multi-View Face Detection FloatBoost Results
Statistical Learning of Multi-View Face Detection Microsoft Research Asia Stan Li, Long Zhu, Zhen Qiu Zhang, Andrew Blake, Hong Jiang Zhang, Harry Shum Presented by Derek Hoiem
Overview • Viola-Jones AdaBoost • FloatBoost Approach • Multi-View Face Detection • FloatBoost Results • FloatBoost vs. AdaBoost • FloatBoost Discussion
Face Detection Overview • Evaluate windows at all locations and at many scales • A classifier labels each window as Object or Non-Object
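The window scan described on this slide can be sketched as a generator over positions and scales. This is only an illustration: the base window size, scale step, and stride fraction below are assumed values, not ones given in the slides.

```python
def sliding_windows(img_w, img_h, base=20, scale_step=1.25, stride_frac=0.1):
    """Yield (x, y, size) for square windows at every location and scale.

    base, scale_step and stride_frac are illustrative assumptions.
    """
    size = base
    while size <= min(img_w, img_h):
        # Move the window by a fraction of its size, at least one pixel.
        stride = max(1, int(size * stride_frac))
        for y in range(0, img_h - size + 1, stride):
            for x in range(0, img_w - size + 1, stride):
                yield x, y, size
        # Grow the window; the max() guards against a scale that rounds to no growth.
        size = max(size + 1, int(round(size * scale_step)))


windows_24 = list(sliding_windows(24, 24))
```

Each yielded window would then be passed to the classifier; the number of windows grows quickly with image size, which is why a fast rejection scheme matters.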
Viola-Jones AdaBoost • Weak classifiers formed out of simple features • In sequential stages, features are selected and weak classifiers trained with emphasis on misclassified examples • Integral images and a cascaded classifier allow real-time face detection
Viola-Jones Features • For a 24 x 24 image: 190,800 semi-continuous features • Computed in constant time using integral image • Each weak classifier thresholds a filter response • Feature types shown: Vertical, Horizontal, On-Off-On, Diagonal
Integral Image • y = I8 - I7 - I6 + I5 + I4 - I3 - I2 + I1 • Here Ik = I(xk, yk) is the integral image value at the k-th labeled corner point (k = 1, ..., 8)
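A minimal sketch of how an integral image gives constant-time rectangle sums, and how a two-rectangle feature follows from them. The function names and the left-minus-right feature layout are illustrative assumptions, not the exact feature definitions from the slides.

```python
def integral_image(img):
    """Return ii where ii[y][x] is the sum of img over rows < y and columns < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]  # one row/column of zero padding
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left corner (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Left half minus right half of a w-by-h window (w assumed even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12]]
ii = integral_image(img)
```

Every rectangle sum costs four array lookups regardless of rectangle size, which is what makes evaluating hundreds of thousands of candidate features feasible.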
Cascade of Classifiers • [Figure: Input Signal (Image Window) → Stage 1 (1 Weak Classifier) → Stage 2 (5 Weak Classifiers) → … → Stage N (1200 Weak Classifiers) → Class 1 (Face); windows rejected at any stage go to Class 2 (Non-Face). Branch labels: 60% / 40% at each stage; 99.999% / 0.001% overall.]
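The early-rejection idea behind the cascade can be sketched as follows. The stage scoring functions, thresholds, and the dictionary of feature responses are hypothetical stand-ins for trained strong classifiers.

```python
def run_cascade(window, stages):
    """Evaluate stages in order; reject at the first stage whose score is too low.

    stages is a list of (score_fn, threshold) pairs.
    Returns (is_face, number_of_stages_evaluated).
    """
    for n, (score_fn, threshold) in enumerate(stages, start=1):
        if score_fn(window) < threshold:
            return False, n  # rejected early: later, costlier stages never run
    return True, len(stages)


# Toy stages of increasing cost (hypothetical): each window is a dict of
# precomputed feature responses.
stages = [
    (lambda w: w["f1"], 0.0),
    (lambda w: w["f1"] + w["f2"], 1.0),
    (lambda w: w["f1"] + w["f2"] + w["f3"], 2.0),
]

print(run_cascade({"f1": -1.0, "f2": 5.0, "f3": 5.0}, stages))
print(run_cascade({"f1": 1.0, "f2": 1.0, "f3": 1.0}, stages))
```

Because most windows in an image are non-faces, cheap early stages that discard a large share of them keep the average cost per window low.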
Viola-Jones AdaBoost Algorithm • Strong classifier formed from weak classifiers: • At each stage, new weak classifier chosen to minimize bound on classification error (confidence weighted): • This gives the form for our weak classifier:
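The equations for this slide did not survive in the transcript. As a hedged reconstruction, the standard confidence-rated (real) AdaBoost formulation that matches the three bullets is given below; the notation ($H_M$, $h_m$, $w_i$, labels $y_i \in \{-1, +1\}$) is an assumption and may differ from the original slide.

```latex
% Strong classifier: a sum of M real-valued weak classifiers
H_M(x) = \sum_{m=1}^{M} h_m(x), \qquad \text{predicted class} = \operatorname{sign}\bigl(H_M(x)\bigr)

% Exponential upper bound on the classification error, minimized stage by stage
J(H_M) = \sum_{i} \exp\bigl(-y_i \, H_M(x_i)\bigr)

% Minimizer for the new weak classifier, with weights from the previous stage
h_M(x) = \frac{1}{2} \log \frac{P\bigl(y = +1 \mid x, w^{(M-1)}\bigr)}{P\bigl(y = -1 \mid x, w^{(M-1)}\bigr)},
\qquad w_i^{(M-1)} = \exp\bigl(-y_i \, H_{M-1}(x_i)\bigr)
```

Since $\exp(-y H(x)) \ge 1$ whenever $\operatorname{sign}(H(x)) \ne y$, the sum $J$ upper-bounds the number of misclassified training examples, which is why minimizing it drives the error down.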
Viola-Jones AdaBoost: Pros and Cons • Pros: very fast; moderately high accuracy; simple implementation/concept • Cons: greedy search through feature space; highly constrained features; very high training time
FloatBoost • Weak classifiers formed out of simple features • In each stage, the weak classifier that reduces error most is added • In each stage, if any previously added classifier contributes to error reduction less than the latest addition, this classifier is removed • Result is a smaller feature set with same classification accuracy
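The add-then-backtrack idea can be sketched as a generic floating search over a set of candidates. This is a simplified illustration, not the authors' algorithm: real FloatBoost retrains and reweights weak classifiers at every step, whereas here a hypothetical `cost` function simply scores any subset, and the toy cost table is invented for the example.

```python
import math


def floating_search(candidates, cost, max_size, target_cost):
    """Greedy forward selection with conditional backward removal."""
    selected = []
    best_cost = {0: cost([])}  # lowest cost seen so far for each subset size
    while len(selected) < max_size and cost(selected) > target_cost:
        # Forward step: add the candidate that lowers the cost the most.
        remaining = [c for c in candidates if c not in selected]
        if not remaining:
            break
        best = min(remaining, key=lambda c: cost(selected + [c]))
        selected = selected + [best]
        m = len(selected)
        best_cost[m] = min(best_cost.get(m, math.inf), cost(selected))
        # Backward step: drop the least useful member while doing so beats
        # the best subset of that smaller size found so far.
        while len(selected) > 1:
            drop = min(selected,
                       key=lambda c: cost([s for s in selected if s != c]))
            reduced = [s for s in selected if s != drop]
            if cost(reduced) < best_cost[len(reduced)]:
                selected = reduced
                best_cost[len(reduced)] = cost(reduced)
            else:
                break
    return selected


# Invented toy costs: "a" is the best single choice, but "b" and "c" together
# are as good as all three.
cost_table = {
    frozenset(): 1.0,
    frozenset("a"): 0.5, frozenset("b"): 0.6, frozenset("c"): 0.7,
    frozenset("ab"): 0.4, frozenset("ac"): 0.45, frozenset("bc"): 0.1,
    frozenset("abc"): 0.1,
}
toy_cost = lambda subset: cost_table[frozenset(subset)]
chosen = floating_search(["a", "b", "c"], toy_cost, max_size=3, target_cost=0.15)
```

In this toy run, purely greedy selection needs all three candidates to reach a cost of 0.1, while the backward step discovers that the first pick has become redundant and ends with two.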
MS FloatBoost Features • For a 20 x 20 image: over 290,000 features (~500K ?) • Computed in constant time using integral image • Each weak classifier thresholds a filter response • [Figure: feature templates, Microsoft vs. Viola-Jones]
FloatBoost Weak Classifiers • Can be portrayed as density estimation on single variables using averaged shifted histograms with weighted examples • Each weak classifier is a 2-bin histogram from weighted examples • Weights serve to eliminate overcounting due to dependent variables • Strong classifier is a combination of estimated weighted PDFs for selected features
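A 2-bin weighted-histogram weak classifier can be sketched as below: a threshold splits one feature into two bins, and each bin outputs half the log ratio of weighted face mass to weighted non-face mass. The function name, the `eps` smoothing term, and the toy data are assumptions made for illustration.

```python
import math


def train_two_bin_classifier(values, labels, weights, threshold, eps=1e-9):
    """Fit a confidence-rated weak classifier on one feature.

    values: feature responses; labels: +1 (face) or -1 (non-face);
    weights: current boosting weights. eps avoids log(0) for empty bins.
    """
    mass = {(b, y): 0.0 for b in (0, 1) for y in (-1, 1)}
    for v, y, w in zip(values, labels, weights):
        mass[(int(v >= threshold), y)] += w  # accumulate weighted counts per bin
    # Confidence per bin: half the log ratio of weighted class masses.
    conf = [0.5 * math.log((mass[(b, 1)] + eps) / (mass[(b, -1)] + eps))
            for b in (0, 1)]
    return lambda v: conf[int(v >= threshold)]


h = train_two_bin_classifier(
    values=[0.9, 0.8, 0.7, 0.1],
    labels=[1, 1, -1, -1],
    weights=[0.25, 0.25, 0.25, 0.25],
    threshold=0.5,
)
```

The sign of `h(v)` is the predicted class and its magnitude is the confidence; a strong classifier would sum such outputs over the selected features.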
Multi-View Face Detection: Head Rotations • In-plane rotation: -45 to 45 degrees • Out-of-plane rotation: -90 to 90 degrees • Moderate nodding
Multi-View Face Detection: Merging Results • [Figure panels: Frontal, Right Side, Left Side]
Multi-View Face Detection: Summary • Simple, rectangular features used • FloatBoost selects and trains weak classifiers • A cascade of strong classifiers makes up the overall detector • A coarse-to-fine evaluation is used to efficiently find a broad range of out-of-plane rotated faces
Results: Frontal (MIT+CMU) • FloatBoost/AdaBoost/RBK • 20x20 images • 3000 original faces, 6000 total • 100,000 non-faces • [Figure panels: Schneiderman; FloatBoost; FloatBoost vs. AdaBoost]
Results: MS AdaBoost vs. Viola-Jones AdaBoost • More flexible features • Confidence-weighted AdaBoost • Smaller image size
Results: Profile • No quantitative results!
FloatBoost vs. AdaBoost • FloatBoost finds a more potent set of weak classifiers through a less greedy search • FloatBoost results in a faster, more accurate classifier • FloatBoost requires longer training times (5 times longer)
FloatBoost vs. AdaBoost • [Figure: 1 strong classifier, 4000 objects, 4000 non-objects, detection rate fixed at 99.5%]
FloatBoost: Pros • Very Fast Detection (5 fps multi-view) • Fairly High Accuracy • Simple Implementation
FloatBoost: Cons • Very long training time • Not highest accuracy • Does it work well for non-frontal faces and other objects?