
Recognition Part I


Presentation Transcript


  1. Recognition Part I CSE 576

  2. What we have seen so far: Vision as Measurement Device Real-time stereo on Mars Physics-based Vision Virtualized Reality Structure from Motion Slide Credit: Alyosha Efros

  3. Visual Recognition • What does it mean to “see”? • “What” is “where”, Marr 1982 • Get computers to “see”

  4. Visual Recognition Verification: Is this a car?

  5. Visual Recognition Classification: Is there a car in this picture?

  6. Visual Recognition Detection: Where is the car in this picture?

  7. Visual Recognition Pose Estimation:

  8. Visual Recognition Activity Recognition: What is he doing?

  9. Visual Recognition Object Categorization: Sky Person Tree Horse Car Person Bicycle Road

  10. Visual Recognition Segmentation Sky Tree Car Person

  11. Object recognition: Is it really so hard? Template: “This is a chair.” Task: find the chair in this image. Shown: output of normalized correlation.

  12. Object recognition: Is it really so hard? Find the chair in this image. The output is pretty much garbage; simple template matching is not going to make it.
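As an aside, the normalized-correlation experiment these two slides describe is easy to reproduce with OpenCV's template matcher. A minimal sketch, assuming a scene image and a chair template on disk (the file names are placeholders, not from the lecture):

```python
import cv2

# Hypothetical file names; any grayscale scene and chair template will do.
scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("chair_template.png", cv2.IMREAD_GRAYSCALE)

# Normalized cross-correlation score at every template position
scores = cv2.matchTemplate(scene, template, cv2.TM_CCORR_NORMED)

# Best-scoring location (top-left corner of the match)
_, max_val, _, max_loc = cv2.minMaxLoc(scores)
print(f"best normalized correlation {max_val:.3f} at {max_loc}")
```

On cluttered scenes like the one on the slide, the best-scoring location is usually not the chair, which is exactly the point being made.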

  13. Challenges 1: viewpoint variation slide by Fei-Fei, Fergus & Torralba Michelangelo 1475-1564

  14. Challenges 2: illumination slide credit: S. Ullman

  15. Challenges 3: occlusion slide by Fei-Fei, Fergus & Torralba Magritte, 1957

  16. Challenges 4: scale slide by Fei-Fei, Fergus & Torralba

  17. Challenges 5: deformation slide by Fei-Fei, Fergus & Torralba Xu Beihong, 1943

  18. Challenges 6: background clutter slide by Fei-Fei, Fergus & Torralba Klimt, 1913

  19. Challenges 7: object intra-class variation slide by Fei-Fei, Fergus & Torralba

  20. Let’s start with finding faces. How to tell if a face is present?

  21. One simple method: skin detection • Skin pixels have a distinctive range of colors • Corresponds to region(s) in RGB color space (for visualization, only the R and G components are shown) • Skin classifier: a pixel X = (R,G,B) is skin if it is in the skin region • But how to find this region?

  22. Skin detection • Learn the skin region from examples • Manually label pixels in one or more “training images” as skin or not skin • Plot the training data in RGB space • skin pixels shown in orange, non-skin pixels shown in blue • some skin pixels may be outside the region, non-skin pixels inside. Why? • Skin classifier: given X = (R,G,B), how to determine if it is skin or not?

  23. Skin classification techniques • Skin classifier: given X = (R,G,B), how to determine if it is skin or not? • Nearest neighbor • find the labeled pixel closest to X • choose the label of that pixel • Data modeling • model the distribution that generates the data (generative) • model the boundary (discriminative)
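A minimal sketch of the nearest-neighbor option, assuming the hand-labeled training pixels are available as NumPy arrays (all names and values here are illustrative, not from the lecture code):

```python
import numpy as np

def nn_skin_classify(pixels, train_rgb, train_labels):
    """Label each query pixel with the label of its nearest training pixel.

    pixels:       (M, 3) float array of query RGB values
    train_rgb:    (N, 3) float array of hand-labeled training RGB values
    train_labels: (N,) bool array, True = skin
    """
    # Squared Euclidean distance from every query pixel to every training pixel
    d2 = ((pixels[:, None, :] - train_rgb[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)      # index of the closest training pixel
    return train_labels[nearest]     # copy its skin / not-skin label

# Toy usage: one "skin" (reddish) and one "not skin" (bluish) training pixel
train_rgb = np.array([[200.0, 120.0, 100.0], [30.0, 40.0, 180.0]])
train_labels = np.array([True, False])
query = np.array([[190.0, 110.0, 90.0], [20.0, 50.0, 170.0]])
print(nn_skin_classify(query, train_rgb, train_labels))   # [ True False]
```

The generative alternative mentioned in the last bullet is what the Gaussian-fitting slides later in the deck develop.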

  24. Classification • Probabilistic • Supervised Learning • Discriminative vs. Generative • Ensemble methods • Linear models • Non-linear models

  25. Let’s play with probability for a bit: remembering simple stuff

  26. Probability • Basic probability • X is a random variable • P(X = x) is the probability that X takes a certain value x (a probability mass function for discrete X, or a density for continuous X) • Conditional probability: P(X | Y) • probability of X given that we already know Y

  27. Thumbtack & Probabilities • P(Heads) = θ, P(Tails) = 1 − θ • Flips are i.i.d.: • Independent events • Identically distributed according to Binomial distribution • Sequence D of H Heads and T Tails: D = {xᵢ | i = 1…n}, P(D | θ) = ∏ᵢ P(xᵢ | θ)

  28. Maximum Likelihood Estimation • Data: Observed set D of H Heads and T Tails • Hypothesis: Binomial distribution • Learning: finding θ is an optimization problem • What’s the objective function? • MLE: Choose θ to maximize the probability of D

  29. Parameter learning Set derivative to zero, and solve!
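The algebra on this slide did not survive the transcript; the standard derivation it refers to, with H heads and T tails, is:

```latex
\ln P(D \mid \theta) = H \ln\theta + T \ln(1-\theta),
\qquad
\frac{d}{d\theta} \ln P(D \mid \theta)
  = \frac{H}{\theta} - \frac{T}{1-\theta} = 0
\;\Longrightarrow\;
\hat\theta_{\mathrm{MLE}} = \frac{H}{H+T}
```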

  30. But, how many flips do I need? 3 heads and 2 tails: θ̂ = 3/5, I can prove it! What if I flipped 30 heads and 20 tails? Same answer, I can prove it! What’s better? Umm… The more the merrier???

  31. A bound (from Hoeffding’s inequality) For N = H + T, let θ* be the true parameter; for any ε > 0: P(|θ̂ − θ*| ≥ ε) ≤ 2e^(−2Nε²) (plot: probability of mistake decays exponentially in N)
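Solving that bound for N gives the number of flips needed for a target accuracy ε and failure probability δ, i.e. N ≥ ln(2/δ) / (2ε²). A small helper (not from the slides) makes this concrete:

```python
import math

def flips_needed(eps, delta):
    """Smallest N with 2 * exp(-2 * N * eps**2) <= delta (Hoeffding bound)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))

# e.g. to estimate theta within 0.1 of the truth with 95% confidence:
print(flips_needed(eps=0.1, delta=0.05))   # 185
```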

  32. What if I have prior beliefs? • Wait, I know that the thumbtack is “close” to 50-50. What can you do for me now? • Rather than estimating a single θ, we obtain a distribution over possible values of θ (figure: the distribution over θ “in the beginning” vs. “after observations”, e.g. after observing the flips {tails, tails})

  33. How to use a Prior Use Bayes rule: P(θ | D) = P(D | θ) P(θ) / P(D) (posterior = data likelihood × prior / normalization) • Or equivalently: P(θ | D) ∝ P(D | θ) P(θ) • Also, for uniform priors, P(θ) ∝ 1, so this reduces to the MLE objective

  34. Beta prior distribution – P(θ) • Likelihood function: • Posterior:
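The formulas on this slide were lost in the transcript; what it shows is the standard Beta–Binomial conjugacy, with prior hyperparameters βH and βT:

```latex
P(\theta) = \frac{\theta^{\beta_H - 1}(1-\theta)^{\beta_T - 1}}{B(\beta_H, \beta_T)}
\;\sim\; \mathrm{Beta}(\beta_H, \beta_T),
\qquad
P(D \mid \theta) = \theta^{H}(1-\theta)^{T}

P(\theta \mid D) \propto \theta^{H + \beta_H - 1}(1-\theta)^{T + \beta_T - 1}
\;\sim\; \mathrm{Beta}(H + \beta_H,\; T + \beta_T)
```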

  35. MAP for Beta distribution MAP: use most likely parameter:
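The “most likely parameter” is the mode of the Beta(H + βH, T + βT) posterior, θ_MAP = (H + βH − 1) / (H + T + βH + βT − 2) when both posterior parameters exceed 1. A quick sketch (the parameter values below are made up for illustration):

```python
def map_beta(H, T, beta_H, beta_T):
    """Mode of the Beta(H + beta_H, T + beta_T) posterior
    (assumes both posterior parameters are > 1)."""
    return (H + beta_H - 1.0) / (H + T + beta_H + beta_T - 2.0)

# Prior "worth" 50 pseudo-flips centered on 0.5, then observe 3 heads, 2 tails:
print(map_beta(3, 2, 25, 25))   # ~0.509, pulled strongly toward 0.5
```

With only 5 real flips, the prior dominates; with 3000 heads and 2000 tails the same prior barely moves the estimate away from 3/5, which is the “prior beliefs wash out” behavior the slides are building toward.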

  36. What about continuous variables?

  37. We like Gaussians because • Affine transformations (multiplying by a scalar and adding a constant) of a Gaussian are Gaussian • X ~ N(μ, σ²), Y = aX + b ⇒ Y ~ N(aμ + b, a²σ²) • The sum of independent Gaussians is Gaussian • X ~ N(μX, σ²X), Y ~ N(μY, σ²Y), Z = X + Y ⇒ Z ~ N(μX + μY, σ²X + σ²Y) • Easy to differentiate
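A quick numerical sanity check of the two closure properties (all parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=1_000_000)   # X ~ N(2, 3^2)
y = rng.normal(loc=-1.0, scale=2.0, size=1_000_000)  # Y ~ N(-1, 2^2), independent of X

# Affine transform: 5X + 1 should be ~ N(5*2 + 1, 5^2 * 3^2) = N(11, 225)
z1 = 5 * x + 1
print(z1.mean(), z1.var())        # ~11, ~225

# Sum of independent Gaussians: X + Y ~ N(2 + (-1), 9 + 4) = N(1, 13)
z2 = x + y
print(z2.mean(), z2.var())        # ~1, ~13
```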

  38. Learning a Gaussian • Collect a bunch of data • Hopefully, i.i.d. samples • e.g., exam scores • Learn parameters • Mean: μ • Variance: σ²

  39. MLE for Gaussian • Prob. of i.i.d. samples D = {x1, …, xN}: • Log-likelihood of data:

  40. MLE for mean of a Gaussian What’s MLE for mean?

  41. MLE for variance Again, set derivative to zero:

  42. Learning Gaussian parameters MLE:
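The formulas behind slides 39–42 were lost in the transcript; the standard results they show are:

```latex
\ln P(D \mid \mu, \sigma)
  = -N \ln\!\bigl(\sigma\sqrt{2\pi}\bigr)
    - \sum_{i=1}^{N} \frac{(x_i - \mu)^2}{2\sigma^2}

\hat\mu_{\mathrm{MLE}} = \frac{1}{N} \sum_{i=1}^{N} x_i,
\qquad
\hat\sigma^2_{\mathrm{MLE}} = \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat\mu)^2
```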

  43. Fitting a Gaussian to Skin samples
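A minimal sketch of the generative skin model this slide and the next describe, assuming skin_rgb and nonskin_rgb hold labeled training pixels (the names and test values are illustrative, not the lecture code): fit one Gaussian per class with the MLE formulas above, then classify each pixel by comparing class posteriors.

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_gaussian(samples):
    """MLE mean and covariance of an (N, 3) array of RGB samples."""
    mu = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False, bias=True)   # 1/N normalization (MLE)
    return mu, cov

def classify_skin(pixels, skin_rgb, nonskin_rgb, prior_skin=0.5):
    """Return a boolean mask: True where P(skin | x) > P(not skin | x)."""
    mu_s, cov_s = fit_gaussian(skin_rgb)
    mu_n, cov_n = fit_gaussian(nonskin_rgb)
    log_skin = multivariate_normal.logpdf(pixels, mu_s, cov_s) + np.log(prior_skin)
    log_not = multivariate_normal.logpdf(pixels, mu_n, cov_n) + np.log(1 - prior_skin)
    return log_skin > log_not

# Toy usage with made-up training pixels
skin = np.random.default_rng(0).normal([200, 140, 120], 10, size=(500, 3))
nonskin = np.random.default_rng(1).normal([80, 90, 100], 30, size=(500, 3))
print(classify_skin(np.array([[195.0, 138.0, 118.0]]), skin, nonskin))  # [ True]
```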

  44. Skin detection results

  45. Supervised Learning: find f • Given: Training set {(xi, yi) | i = 1 … n} • Find: A good approximation to f : X → Y • What is x? What is y?

  46. Simple Example: Digit Recognition (example digit images labeled 0, 1, 2, 1, and an unlabeled “??”) • Input: images / pixel grids • Output: a digit 0-9 • Setup: • Get a large collection of example images, each labeled with a digit • Note: someone has to hand label all this data! • Want to learn to predict labels of new, future digit images • Features: ? “Screw you, I want to use pixels :D”

  47. Let’s take a probabilistic approach!!! • Can we directly estimate the data distribution P(X,Y)? • How do we represent these? How many parameters? • Prior, P(Y): • Suppose Y is composed of k classes (k − 1 parameters) • Likelihood, P(X|Y): • Suppose X is composed of n binary features (k(2ⁿ − 1) parameters if the joint over X is modeled directly)

  48. Conditional Independence • X is conditionally independent of Y given Z if the probability distribution for X is independent of the value of Y, given the value of Z: for all x, y, z, P(X = x | Y = y, Z = z) = P(X = x | Z = z) • Equivalent to: P(X | Y, Z) = P(X | Z) • e.g., once Z is known, also knowing Y tells us nothing more about X

  49. Naïve Bayes • Naïve Bayes assumption: features are independent given the class: P(X1, X2 | Y) = P(X1 | Y) P(X2 | Y) • More generally: P(X1, …, Xn | Y) = ∏i P(Xi | Y)

  50. The Naïve Bayes Classifier (graphical model: class Y with children X1, X2, …, Xn) • Given: • Prior P(Y) • n conditionally independent features X given the class Y • For each Xi, we have likelihood P(Xi | Y) • Decision rule: y* = argmaxy P(y) ∏i P(xi | y)
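A compact sketch of this classifier on binary pixel features, in the spirit of the digit example from slide 46. Everything here is illustrative rather than the lecture's code, and the Laplace smoothing term alpha is an addition not shown on the slide; it implements exactly the decision rule y* = argmax_y P(y) ∏i P(xi | y), in log space to avoid underflow.

```python
import numpy as np

class BernoulliNaiveBayes:
    """Naive Bayes for binary feature vectors (e.g. thresholded pixel grids)."""

    def fit(self, X, y, alpha=1.0):
        # X: (N, n) array of 0/1 features, y: (N,) integer class labels
        self.classes = np.unique(y)
        self.log_prior = np.log(
            np.array([(y == c).mean() for c in self.classes]))
        # P(X_i = 1 | Y = c) with Laplace smoothing alpha
        self.p = np.array([
            (X[y == c].sum(axis=0) + alpha) / ((y == c).sum() + 2 * alpha)
            for c in self.classes])
        return self

    def predict(self, X):
        # log P(y) + sum_i log P(x_i | y), evaluated for every class
        log_lik = X @ np.log(self.p).T + (1 - X) @ np.log(1 - self.p).T
        return self.classes[np.argmax(self.log_prior + log_lik, axis=1)]

# Toy usage: four "images" with 3 binary pixels each, two classes
X = np.array([[1, 0, 1], [1, 1, 1], [0, 0, 1], [0, 1, 0]])
y = np.array([0, 0, 1, 1])
print(BernoulliNaiveBayes().fit(X, y).predict(np.array([[1, 0, 1]])))  # [0]
```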
