This paper presents KLBoosting, a boosting method that uses Kullback-Leibler divergence for feature selection, improving classifier performance in tasks such as face detection. The algorithm iteratively selects features and adjusts training sample weights to minimize classification error. Starting from a pool of candidate features, it selects the most relevant ones and refines the feature set through sequential optimization over general linear projections. Comparisons of KLBoosting with RealBoost and AdaBoost show significant improvements, particularly in generating optimal features for robust face detection.
Kullback-Leibler Boosting — Ce Liu, Heung-Yeung Shum, Microsoft Research Asia, CVPR 2003. Presented by Derek Hoiem.
RealBoost Review
• Start with some candidate feature set
• Initialize training sample weights
• Loop:
  • Add the feature that minimizes the error bound
  • Reweight training examples, giving more weight to misclassified examples
  • Assign a weight to the weak classifier according to the weighted error on the training samples
• Exit the loop after N features have been added
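A minimal sketch of this loop in Python, assuming precomputed feature responses and a histogram log-ratio weak learner (a common choice for RealBoost); the helper `fit_weak` and its bin count are illustrative, not from the paper:

```python
import numpy as np

def fit_weak(f, y, w, n_bins=16, eps=1e-6):
    """Histogram log-ratio weak learner h(f) = 1/2 log(W+ / W-) per bin.
    Returns the weak classifier and its normalization bound Z."""
    edges = np.linspace(f.min(), f.max() + eps, n_bins + 1)
    bins = np.clip(np.digitize(f, edges) - 1, 0, n_bins - 1)
    wp = np.bincount(bins, weights=w * (y == 1), minlength=n_bins) + eps
    wm = np.bincount(bins, weights=w * (y == -1), minlength=n_bins) + eps
    h_vals = 0.5 * np.log(wp / wm)
    z = 2.0 * np.sum(np.sqrt(wp * wm))          # bound to minimize when picking features

    def h(fv):
        b = np.clip(np.digitize(fv, edges) - 1, 0, n_bins - 1)
        return h_vals[b]
    return h, z

def realboost(F, y, n_rounds):
    """Sketch of the RealBoost loop.
    F : (n_samples, n_features) precomputed feature responses
    y : (n_samples,) labels in {-1, +1}"""
    n, d = F.shape
    w = np.full(n, 1.0 / n)                     # initialize sample weights
    chosen, classifiers = [], []
    for _ in range(n_rounds):
        best = None
        for j in range(d):                      # add feature minimizing the error bound
            h, z = fit_weak(F[:, j], y, w)
            if best is None or z < best[2]:
                best = (j, h, z)
        j, h, _ = best
        chosen.append(j)
        classifiers.append(h)
        w *= np.exp(-y * h(F[:, j]))            # reweight: misclassified samples gain weight
        w /= w.sum()
    return chosen, classifiers
```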
The Basic Idea of KLBoosting
• Similar to RealBoost except:
  • Features are general linear projections
  • Generates optimal features
  • Uses KL divergence to select features
  • Finer tuning of the coefficients
Linear Features
• KLBoosting: features are general linear projections f(x) = wᵀx, where w can be any weight vector over the image patch
• VJ AdaBoost: features are restricted to Haar-like rectangle masks
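A small illustration of the contrast, assuming a 24×24 patch (the patch size here is an illustrative assumption, not taken from the paper):

```python
import numpy as np

patch = np.random.rand(24, 24)                  # example image patch

# KLBoosting: a general linear projection; w may be any (unit-norm) vector
w_general = np.random.randn(24 * 24)
w_general /= np.linalg.norm(w_general)
f_kl = w_general @ patch.ravel()

# Viola-Jones AdaBoost: w restricted to a Haar-like rectangle mask
# (here a two-rectangle feature: left half +1, right half -1)
w_haar = np.zeros((24, 24))
w_haar[:, :12] = +1.0
w_haar[:, 12:] = -1.0
f_vj = np.sum(w_haar * patch)
```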
What makes a feature good?
• KLBoosting: maximize the KL divergence between the feature's response distributions on the positive and negative classes
• RealBoost: minimize the upper bound on classification error
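A sketch of the KL selection criterion, scoring one candidate feature by the divergence between histograms of its responses on face and non-face samples. The symmetric form and the bin count are assumptions; the paper's exact estimator may differ:

```python
import numpy as np

def kl_feature_score(f_pos, f_neg, n_bins=32, eps=1e-8):
    """Symmetric KL divergence between histograms of a feature's responses
    on positive (face) and negative (non-face) samples."""
    lo = min(f_pos.min(), f_neg.min())
    hi = max(f_pos.max(), f_neg.max())
    edges = np.linspace(lo, hi, n_bins + 1)
    p = np.histogram(f_pos, bins=edges)[0].astype(float) + eps
    q = np.histogram(f_neg, bins=edges)[0].astype(float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

# Selection: choose the candidate with the largest divergence, e.g.
# scores = [kl_feature_score(F_pos[:, j], F_neg[:, j]) for j in range(n_candidates)]
# best_j = int(np.argmax(scores))
```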
Creating the feature set
• Sequential 1-D optimization
• Begin with a large initial set of candidate features (linear projections)
• Choose the top L features according to KL divergence
• Initial feature = weighted sum of the L features
• Search for the optimal feature along the directions of the L features (a worked example and code sketch follow below)
Example
• Initial feature set of candidate directions [scatter-plot figure of the two classes]
• Top two features by KL divergence: w1, w2
• Initial feature f0 = weighted combination (by KL) of w1 and w2
• Optimize over w1: f1 = f0 + β·w1, with β searched over [-a1, a1]
• Optimize over w2: f2 = f1 + β·w2, with β searched over [-a2, a2] (and repeat…)
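A sketch of this sequential 1-D optimization. The initial weighting, the search range for β, and the number of sweeps are illustrative assumptions (the slides restrict each search to [-aᵢ, aᵢ]); `kl_score` is any callable that maps a direction to the KL divergence of its projected responses, e.g. built from `kl_feature_score` above:

```python
import numpy as np

def sequential_1d_optimization(W, kl_score, n_sweeps=3, n_steps=21):
    """Refine a feature by line searches along the top-L directions.
    W        : (L, d) top-L candidate directions ranked by KL divergence
    kl_score : callable mapping a direction w -> KL divergence of its responses"""
    L, d = W.shape
    # initial feature: combination of the top-L directions,
    # weighted here by their individual KL scores
    scores = np.array([kl_score(W[i]) for i in range(L)])
    f = (scores[:, None] * W).sum(axis=0)
    f /= np.linalg.norm(f)

    best = kl_score(f)
    for _ in range(n_sweeps):
        for i in range(L):                       # 1-D search along direction w_i
            for beta in np.linspace(-1.0, 1.0, n_steps):
                cand = f + beta * W[i]
                cand /= np.linalg.norm(cand)
                s = kl_score(cand)
                if s > best:                     # keep the direction step if it helps
                    best, f = s, cand
    return f, best
```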
Creating the feature set
• [figures: selecting the first feature; the first three features]
Classification
• Strong classifier: F(x) = sign( Σ_k α_k log [ p₊(f_k(x)) / p₋(f_k(x)) ] )
• The coefficients α_k are fixed at ½ in RealBoost; KLBoosting learns them
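A sketch of evaluating this strong classifier; the threshold argument and the callable log-ratio tables are illustrative assumptions:

```python
def strong_classifier(x, features, log_ratios, alphas, threshold=0.0):
    """Sketch of the KLBoosting strong classifier.
    features   : list of projection vectors w_k
    log_ratios : callables mapping a response f_k(x) to log(p_+(f_k) / p_-(f_k)),
                 learned from training histograms
    alphas     : coefficients (fixed at 1/2 in RealBoost, learned in KLBoosting)"""
    score = sum(a * lr(w @ x) for w, lr, a in zip(features, log_ratios, alphas))
    return 1 if score > threshold else -1
```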
Parameter Learning
• With each added feature k:
  • Keep α₁…α₍k−1₎ at their current optimal values
  • Set α_k to 0
  • Minimize recognition error on the training set
  • Solve using a greedy algorithm
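One way to read this greedy step is a 1-D search over the newest coefficient while the earlier ones stay fixed. The grid search below is an illustrative assumption, not the paper's exact procedure:

```python
import numpy as np

def update_alpha_k(H, y, alphas, grid=np.linspace(0.0, 2.0, 201)):
    """Greedy update of the newest coefficient alpha_k (sketch).
    H      : (n_samples, k) weak-classifier outputs h_1..h_k
    y      : labels in {-1, +1}
    alphas : current coefficients alpha_1..alpha_{k-1}, kept fixed"""
    base = H[:, :-1] @ alphas                    # contribution of earlier features
    best_a = 0.0
    best_err = np.mean(np.where(base > 0, 1, -1) != y)
    for a in grid:                               # minimize training recognition error over alpha_k
        pred = np.where(base + a * H[:, -1] > 0, 1, -1)
        err = np.mean(pred != y)
        if err < best_err:
            best_a, best_err = a, err
    return best_a, best_err
```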
KLBoost vs AdaBoost
• [comparison figure; 1024 candidate features for AdaBoost]
Face detection: candidate features
• [figure: candidate feature pools of 52,400, 2,800, and 450 features]
Face detection: training samples
• 8,760 faces + mirror images
• 2,484 non-face images (1.34 billion patches)
• Cascaded classifier allows bootstrapping
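The bootstrapping idea: only non-face patches that the current cascade stages still misclassify as faces are kept as negatives for training the next stage. A sketch, where `sample_patches` is a hypothetical helper that yields patches from an image:

```python
def bootstrap_negatives(non_face_images, cascade, n_needed, sample_patches):
    """Collect hard negatives for the next cascade stage (sketch).
    cascade : list of stage classifiers returning +1 (face) / -1 (non-face)"""
    negatives = []
    for img in non_face_images:
        for patch in sample_patches(img):
            # keep only patches that pass every current stage: hard false positives
            if all(stage(patch) == 1 for stage in cascade):
                negatives.append(patch)
                if len(negatives) >= n_needed:
                    return negatives
    return negatives
```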
Face detection: final features
• [figure: top ten features — global semantic, global non-semantic, local]
Results
• Test time: 0.4 sec per 320×240 image
• [figure: detection comparison vs. Schneiderman (2003)]
Comments
• Training time?
• Which improves performance:
  • Generating optimal features?
  • KL feature selection?
  • Optimizing the alpha coefficients?