
Object Detection Using Semi-Naïve Bayes to Model Sparse Structure



Presentation Transcript


  1. Object Detection Using Semi-Naïve Bayes to Model Sparse Structure Henry Schneiderman, Robotics Institute, Carnegie Mellon University

  2. Object Detection • Find all instances of object X (e.g. X = human faces)

  3. Examples of Detected Objects

  4. Sparse Structure of Statistical Dependency [Figure: three panels, each highlighting a chosen variable]

  5. Sparse Structure of Statistical Dependency [Figure: three panels, each highlighting a chosen coefficient]

  6. Sparse Structure of Statistical Dependency [Figure: three further examples of chosen coefficients]

  7. Detection using a Classifier [Diagram: image window → Classifier → "Object is present" (at fixed size and alignment) or "Object is NOT present" (at fixed size and alignment)]

  8. Proposed Model: Semi-Naïve Bayes • Group the input variables into subsets, e.g. S1 = (x21, x34, x65, x73, x123), S2 = (x3, x8, x17, x65, x73, x111) • Kononenko (1991), Pazzani (1996), Domingos and Pazzani (1997), Rokach and Maimon (2001)
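The factored form behind this model, written out for reference (a standard semi-naïve Bayes decomposition, consistent with the classifier H on Slide 10; the slide itself only names the model):

```latex
% Class-conditional distribution factored over variable subsets S_1, ..., S_n
P(x_1, \ldots, x_m \mid w) \;\approx\; \prod_{k=1}^{n} p_k(S_k \mid w)

% Two-class decision: sum of log-likelihood ratios against a threshold \lambda
H(x_1, \ldots, x_r) \;=\; \sum_{k=1}^{n} \log \frac{p_k(S_k \mid w_1)}{p_k(S_k \mid w_2)} \;\gtrless\; \lambda
```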

  9. Goal: Automatic Subset Grouping • S1 = (x21, x34, x65, x73, x123) • S2 = (x3, x8, x17, x65, x73, x111) • . . . • Sn = (x14, x16, x17, x23, x85, x101, x103, x107)

  10. Approach: Selection by Competition • Input variables x1, x2, x3, . . ., xm • Generate q candidate subsets S1, S2, . . ., Sq • Train q log-likelihood functions: log [p1(S1|w1) / p1(S1|w2)], log [p2(S2|w1) / p2(S2|w2)], . . ., log [pq(Sq|w1) / pq(Sq|w2)] • Select a combination of n candidates, n << q: log [pj1(Sj1|w1) / pj1(Sj1|w2)], . . ., log [pjn(Sjn|w1) / pjn(Sjn|w2)] • H(x1, . . ., xr) = log [pj1(Sj1|w1) / pj1(Sj1|w2)] + log [pj2(Sj2|w1) / pj2(Sj2|w2)] + . . . + log [pjn(Sjn|w1) / pjn(Sjn|w2)]

  11. Approach: Selection by Competition (overview diagram repeated from Slide 10)

  12. Generation of Subsets • Grouping criterion: the "modeling error for assuming independence" among a subset's variables, where q is the size of the subset

  13. Generation of Subsets • Selection of variables by "discrimination power", where q is the size of the subset

  14. Pair-Wise Measurement • Pair-affinity: a pair-wise measurement C(xi, xj) between two variables

  15. Visualization of C(x, *) (frontal faces) [Figure: affinity maps for three choices of x]

  16. Measure over a Subset • Subset-affinity D(Si)

  17. Generation of Candidate Subsets • Input variables x1, x2, x3, . . ., xm • Compute pair-affinities C(x1, x2), C(x1, x3), . . ., C(xm-1, xm) • Heuristic search and selective evaluation of D(Si) • Output: candidate subsets S1, S2, . . ., Sp

  18. Subset Size vs. Modeling Power • Model complexity is limited by the number of training examples, etc. • Examples of limited modeling power: 5 modes in a mixture model; projection onto 7 principal components

  19. Approach: Selection by Competition (overview diagram repeated from Slide 10)

  20. Log-Likelihood Function = Table • Si = (xi1, xi2, . . ., xiq) → vector quantization → feature value fi → table look-up

  21. Sub-Classifier Training by Counting • Count feature values fi over each class to form the histograms Pi(fi | w1) and Pi(fi | w2)
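A minimal sketch of this counting step in Python, assuming each feature fi has already been quantized to an integer in {0, . . ., num_bins-1} (the next slide shows one such quantization; function and variable names here are illustrative, not from the talk):

```python
import numpy as np

def train_subclassifier(f_object, f_nonobject, num_bins, alpha=1.0):
    """Build a log-likelihood-ratio look-up table by counting.

    f_object, f_nonobject: integer arrays of quantized feature values
    f_i observed on object (w1) and non-object (w2) training examples.
    alpha: additive smoothing so that empty bins stay finite.
    """
    # Histogram the feature values for each class.
    counts_w1 = np.bincount(f_object, minlength=num_bins) + alpha
    counts_w2 = np.bincount(f_nonobject, minlength=num_bins) + alpha

    # Normalize counts into P_i(f_i | w1) and P_i(f_i | w2).
    p_w1 = counts_w1 / counts_w1.sum()
    p_w2 = counts_w2 / counts_w2.sum()

    # The sub-classifier h_i is just this table of log-likelihood ratios.
    return np.log(p_w1 / p_w2)
```

Evaluating a trained sub-classifier is then a single indexing operation, table[f].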

  22. Example of VQ • Subset values xi1, xi2, xi3, . . ., xiq • Projection onto 3 principal components: c1, c2, c3 • Quantization to m levels: z1, z2, z3 • f = z1·m^0 + z2·m^1 + z3·m^2
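A sketch of this quantization in Python, assuming the three principal components and their projection ranges were estimated beforehand from training data (all names are illustrative):

```python
import numpy as np

def vq_feature(x_subset, pcs, mins, maxs, m):
    """Quantize one variable subset into a single table index f.

    x_subset: subset values (x_i1, ..., x_iq), shape (q,)
    pcs: top-3 principal components of the subset, shape (3, q)
    mins, maxs: per-component projection ranges defining the bins
    m: number of quantization levels per component
    """
    # Projection onto 3 principal components: c1, c2, c3.
    c = pcs @ x_subset

    # Quantization of each projection to one of m levels: z1, z2, z3.
    z = np.floor((c - mins) / (maxs - mins) * m).astype(int)
    z = np.clip(z, 0, m - 1)

    # Combine the levels into one index: f = z1*m^0 + z2*m^1 + z3*m^2.
    return int(z[0] + z[1] * m + z[2] * m**2)
```

The resulting f indexes a look-up table of m³ entries, matching the tables trained by counting above.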

  23. Approach: Selection by Competition (overview diagram repeated from Slide 10)

  24. • Candidate log-likelihood functions h1(S1), h2(S2), . . ., hP(SP) • Evaluate on training data: E1,w1, E1,w2; E2,w1, E2,w2; . . .; EP,w1, EP,w2 • Evaluate ROCs: ROC1, ROC2, . . ., ROCP • Order the top Q log-likelihood functions: hj1(Sj1), hj2(Sj2), . . ., hjQ(SjQ)

  25. • Form pQ pairs of log-likelihood functions: hj1(Sj1) + h1(S1), . . ., hjQ(SjQ) + hp(Sp) • Sum the evaluations: Ej1,w1 + E1,w1, Ej1,w2 + E1,w2; . . .; EjQ,w1 + Ep,w1, EjQ,w2 + Ep,w2 • Evaluate ROCs: ROC1, . . ., ROCpQ • Order the top Q pairs of log-likelihood functions: hk1,1(Sk1,1) + hk1,2(Sk1,2), . . ., hkQ,1(SkQ,1) + hkQ,2(SkQ,2) • Repeat for n iterations

  26. Cross-Validation Selects Classifier • Q candidates: H1(x1, x2, . . ., xr) = hk1,1(Sk1,1) + hk1,2(Sk1,2) + . . . + hk1,n(Sk1,n); . . .; HQ(x1, x2, . . ., xr) = hkQ,1(SkQ,1) + hkQ,2(SkQ,2) + . . . + hkQ,n(SkQ,n) • Cross-validation over H1(x1, x2, . . ., xr), . . ., HQ(x1, x2, . . ., xr) selects the final classifier H*(x1, x2, . . ., xr)
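Slides 24-26 amount to a beam search over sums of sub-classifiers. A simplified sketch, using ROC area as the scalar ordering criterion (the talk orders by ROC curves but does not pin down an exact figure of merit, so treat these details as assumptions):

```python
import numpy as np

def roc_auc(scores_w1, scores_w2):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    scores = np.concatenate([scores_w1, scores_w2])
    ranks = scores.argsort().argsort() + 1.0
    n1, n2 = len(scores_w1), len(scores_w2)
    return (ranks[:n1].sum() - n1 * (n1 + 1) / 2.0) / (n1 * n2)

def select_by_competition(E_w1, E_w2, Q, n):
    """Beam search over sums of candidate log-likelihood functions.

    E_w1[i], E_w2[i]: evaluations E_i,w1 and E_i,w2 of candidate h_i on
    object and non-object examples, shapes (p, N1) and (p, N2).
    Keeps the top-Q partial sums per iteration; after n iterations the
    surviving index sets are the Q candidates H_1, ..., H_Q handed to
    cross-validation, which picks H*.
    """
    p = E_w1.shape[0]
    beam = [((), np.zeros(E_w1.shape[1]), np.zeros(E_w2.shape[1]))]
    for _ in range(n):
        # Pair every kept partial sum with every candidate, then re-order.
        pool = []
        for members, s1, s2 in beam:
            for i in range(p):
                pool.append((members + (i,), s1 + E_w1[i], s2 + E_w2[i]))
        pool.sort(key=lambda t: roc_auc(t[1], t[2]), reverse=True)
        beam = pool[:Q]
    return [members for members, _, _ in beam]
```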

  27. Example subsets learned for telephones

  28. Evaluation of Classifier [Diagram: image window → Classifier → "Object is present" or "Object is NOT present" (at fixed size and alignment)]

  29. 1) Compute feature values: f1 = #5710, f2 = #3214, . . ., fn = #723

  30. 2) Look up log-likelihoods: f1 = #5710 → log [P1(#5710 | w1) / P1(#5710 | w2)] = 0.53; f2 = #3214 → log [P2(#3214 | w1) / P2(#3214 | w2)] = 0.03; . . .; fn = #723 → log [Pn(#723 | w1) / Pn(#723 | w2)] = 0.23

  31. 3) Make decision: Σ = 0.53 + 0.03 + . . . + 0.23 ≷ λ (sum the looked-up log-likelihood values and compare against the detection threshold λ)
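Slides 29-31 end to end, as a minimal sketch (the feature extractors and tables are assumed to come from the training steps above):

```python
def evaluate_classifier(window, subclassifiers, threshold):
    """Score one candidate window with the semi-naive Bayes classifier.

    subclassifiers: list of (extract_feature, table) pairs, where
    extract_feature maps the window to a quantized feature value f_i and
    table[f_i] holds log [P_i(f_i | w1) / P_i(f_i | w2)].
    """
    total = 0.0
    for extract_feature, table in subclassifiers:
        f = extract_feature(window)   # 1) compute feature value
        total += table[f]             # 2) look up log-likelihood
    return total > threshold          # 3) compare the sum against lambda
```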

  32. Detection using a Classifier (diagram repeated from Slide 7)

  33. View-based Classifiers • Face Classifier #1 • Face Classifier #2 • Face Classifier #3

  34. Detection: Apply Classifier Exhaustively • Search in position • Search in scale
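A common realization of this exhaustive search is a sliding window over an image pyramid; a sketch under that assumption (window size, stride, and the crude octave pyramid are illustrative choices, not the talk's):

```python
def detect(image, classify_window, win_h=32, win_w=32, stride=2):
    """Exhaustive search in position and scale over a 2D image array.

    classify_window: returns True when the object is present in a
    win_h x win_w window at fixed size and alignment.
    Yields (x, y, scale) hits in original-image coordinates.
    """
    scale = 1
    while image.shape[0] >= win_h and image.shape[1] >= win_w:
        # Search in position: slide the fixed-size window over the image.
        for y in range(0, image.shape[0] - win_h + 1, stride):
            for x in range(0, image.shape[1] - win_w + 1, stride):
                if classify_window(image[y:y + win_h, x:x + win_w]):
                    yield (x * scale, y * scale, scale)
        # Search in scale: decimate the image and repeat (a production
        # detector would smooth first and use finer scale steps).
        image = image[::2, ::2]
        scale *= 2
```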

  35. Decision can be made by partial evaluation • The sum 0.53 + 0.03 + . . . + 0.23 ≷ λ can often be decided after evaluating only some of its terms
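One way to realize this, sketched below: reject a window as soon as even the largest possible remaining terms cannot lift the running sum over the threshold (this suffix-maximum pruning rule is an assumption; the slide only states that partial evaluation suffices):

```python
def evaluate_partial(window, subclassifiers, threshold, max_remaining):
    """Early-exit evaluation of the summed log-likelihood ratio.

    max_remaining[k]: precomputed upper bound on the sum of terms
    k..n-1, i.e. sum(table.max() for _, table in subclassifiers[k:]),
    with max_remaining[n] == 0.0.
    """
    total = 0.0
    for k, (extract_feature, table) in enumerate(subclassifiers):
        total += table[extract_feature(window)]
        # Even the best possible remaining terms cannot reach threshold.
        if total + max_remaining[k + 1] <= threshold:
            return False
    return total > threshold
```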

  36. Detection Computational Strategy • Apply log [p1(S1|w1) / p1(S1|w2)] exhaustively to the scaled input image • Apply log [p2(S2|w1) / p2(S2|w2)] to the reduced search space • Apply log [p3(S3|w1) / p3(S3|w2)] to the further reduced search space • The computational strategy changes with the size of the search space

  37. Candidate-Based Evaluation • Compute M² feature values • Look up M² log-likelihood values • Repeat for N² candidates

  38. Feature-Based Evaluation • Compute N² + M² + 2MN = (N + M)² feature values once, shared across candidates • Look up M² log-likelihood values • Repeat the look-ups for N² candidates
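An illustrative operation count, under one reading of these two slides (the numbers are chosen for the example, not from the talk): with N = 100 candidate positions per side and M = 8 features per side, candidate-based evaluation computes N² · M² = 10,000 × 64 = 640,000 feature values, whereas feature-based evaluation computes N² + M² + 2MN = (N + M)² = 108² = 11,664 feature values once and shares them across candidates; the M² table look-ups per candidate are the same in both schemes.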

  39. Cascade Implementation • Create candidate subsets • Train candidate log-likelihood functions on training images of the object and training images of non-objects • Select log-likelihood functions using cross-validation images • Retrain the selected log-likelihood functions using AdaBoost with confidence-rated predictions [Schapire and Singer, 1999] • Determine the detection threshold • Automatically select non-object examples for the next stage by bootstrapping [Sung and Poggio, 1995] from images that do not contain the object • Increment the stage
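A skeletal version of that training loop; every step is passed in as a callable standing in for the corresponding box in the slide, so none of these names come from published code:

```python
def train_cascade(object_images, negatives, background_pool, num_stages,
                  create_subsets, train_lls, select_lls, adaboost_retrain,
                  pick_threshold, mine_false_positives):
    """Train a detector cascade, bootstrapping non-object examples."""
    stages = []
    for _ in range(num_stages):
        candidates = train_lls(create_subsets(object_images),
                               object_images, negatives)
        selected = select_lls(candidates)  # uses cross-validation images
        # Retrain with confidence-rated AdaBoost [Schapire and Singer, 1999].
        strong = adaboost_retrain(selected, object_images, negatives)
        stages.append((strong, pick_threshold(strong, object_images)))
        # Bootstrapping [Sung and Poggio, 1995]: false positives of the
        # cascade so far, mined from images that contain no objects,
        # become the non-object training set for the next stage.
        negatives = mine_false_positives(stages, background_pool)
    return stages
```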

  40. Face, eye, ear detection

  41. Frontal Face Detection • MIT-CMU Frontal Face Test Set [Sung and Poggio, 1995; Rowley, Baluja and Kanade, 1997] • 180 ms for a 300x200 image • 400 ms for a 300x500 image (AMD Athlon 1.2 GHz) • Top rank, Video TREC 2002 Face Detection • Top rank, 2002 ARDA VACE face detection algorithm evaluation

  42. Face & Eye Detection for Red-Eye Removal from Consumer Photos [Figure: output of the CMU Face Detector]

  43. Eye Detection • Experiments performed independently at NIST • Sequestered data set: 29,627 mugshots • Eyes correctly located (within a radius of 15 pixels): 98.2% (assuming one face per image) • Thanks to Jonathon Phillips, Patrick Grother, and Sam Trahan for their assistance in running these experiments

  44. Realistic Facial Manipulation: Earring Example • With Jason Pinto

  45. Telephone Detection

  46. Cart, pose 1

  47. Cart, pose 2

  48. Cart, pose 3

  49. Door Handle Detection

  50. Summary of Classifier Design • Sparse structure of statistical dependency in many image classification problems • Semi-naïve Bayes model • Automatic learning of the structure of a semi-naïve Bayes classifier: generation of many candidate subsets; competition among many log-likelihood functions to find the best combination • CMU on-line face detector: http://www.vasc.ri.cmu.edu/cgi-bin/demos/findface.cgi
