
SIFT (scale invariant feature transform)

2011.4.14. Reporter: Fei-Fei Chen.


Presentation Transcript


  1. SIFT (scale invariant feature transform) 2011.4.14 Reporter: Fei-Fei Chen

  2. What is Computer Vision?

  3. Local Invariant Feature

  4. Applications • Wide-baseline matching • Object recognition • Texture recognition • Scene classification • Robot wandering • Motion tracking • Change in illumination • 3D camera viewpoint • etc.

  5. Object recognition

  6. 3D object recognition

  7. Image retrieval (1/3) … > 5000 images change in viewing angle

  8. Image retrieval (2/3) 22 correct matches

  9. Image retrieval (3/3) … > 5000 images change in viewing angle + scale change

  10. Automatic image stitching (1/2)

  11. Automatic image stitching (2/2)

  12. Motivation: Matching Problem • Find corresponding features across two or more views.

  13. Motivation: Patch Matching • Elements to be matched are image patches of fixed size • Task: Find the best (most similar) patch in a second image.
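
A minimal sketch of fixed-size patch matching, assuming the sum of squared differences (SSD) as the similarity measure. The slide does not name a measure, so SSD and the helper names `ssd`, `extract_patch`, and `best_match` are illustrative assumptions.

```python
def ssd(patch_a, patch_b):
    """Sum of squared differences between two equally sized patches."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(patch_a, patch_b)
               for a, b in zip(row_a, row_b))


def extract_patch(image, top, left, size):
    """Cut a size x size patch whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def best_match(patch, image):
    """Return (top, left) of the patch in `image` with the lowest SSD."""
    size = len(patch)
    height, width = len(image), len(image[0])
    best_pos, best_cost = None, float("inf")
    for top in range(height - size + 1):
        for left in range(width - size + 1):
            cost = ssd(patch, extract_patch(image, top, left, size))
            if cost < best_cost:
                best_pos, best_cost = (top, left), cost
    return best_pos


# Toy second image (hypothetical values) containing one distinctive patch.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 1, 0, 0],
    [0, 2, 8, 0, 0],
    [0, 0, 0, 0, 0],
]
patch = [[9, 1], [2, 8]]
print(best_match(patch, image))
```

An exhaustive search like this only works when the patch is distinctive; a flat patch yields many equally good positions.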

  14. Not all patches are created equal • Intuition: This would be a good patch for matching, since it is very distinctive.

  15. Not all patches are created equal • Intuition: This would be a BAD patch for matching, since it is not very distinctive.

  16. What are corners? • Intuitively, junctions of contours. • Generally more stable features over change of viewpoint. • Intuitively, large variations in the neighborhood of the point in all directions. • They are good features to match!

  17. SIFT (detector and descriptor) • Detection of scale-space extrema • Accurate keypoint localization • Orientation assignment • Keypoint descriptor

  18. 1. Detection of scale-space extrema • For scale invariance, search for stable features across all possible scales using a continuous function of scale, known as scale space. • SIFT uses the DoG filter for scale space because it is efficient and as stable as the scale-normalized Laplacian of Gaussian.

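  19. DoG filtering • Convolution with a variable-scale Gaussian: $L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$, where $G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2)/(2\sigma^2)}$ • Difference-of-Gaussian (DoG) filter: $G(x, y, k\sigma) - G(x, y, \sigma)$ • Convolution with the DoG filter: $D(x, y, \sigma) = (G(x, y, k\sigma) - G(x, y, \sigma)) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma)$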
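
The DoG response can be sketched as the difference of two Gaussian-blurred copies of the image. The code below is a rough illustration in plain Python, not an optimized implementation. The kernel truncation at 3σ, the clamped border handling, and all function names are assumptions made for this example.

```python
import math


def gaussian_kernel(sigma):
    """1D Gaussian kernel truncated at 3*sigma and normalized to sum to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_rows(image, kernel):
    """Convolve every row with a 1D kernel, clamping indices at the border."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [[sum(kernel[k + radius] * row[min(max(x + k, 0), width - 1)]
                 for k in range(-radius, radius + 1))
             for x in range(width)]
            for row in image]


def transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, sigma):
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    blurred_rows = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(blurred_rows), kernel))


def difference_of_gaussians(image, sigma, k):
    """Blur at k*sigma minus blur at sigma, pixel by pixel."""
    coarse = gaussian_blur(image, k * sigma)
    fine = gaussian_blur(image, sigma)
    return [[c - f for c, f in zip(row_c, row_f)]
            for row_c, row_f in zip(coarse, fine)]


# A single bright pixel on a dark background (hypothetical test image).
spot = [[0.0] * 9 for _ in range(9)]
spot[4][4] = 1.0
dog = difference_of_gaussians(spot, sigma=1.0, k=2 ** 0.5)
print(dog[4][4] < 0)  # the blob centre gives a negative DoG response
```

Computing the response as a difference of two blurred images, rather than convolving with an explicit DoG kernel, is what makes the filter cheap: the blurred images are needed for the scale space anyway.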

  20. Scale space • $\sigma$ doubles for the next octave • $k = 2^{1/s}$, where $s$ is the number of intervals per octave • Dividing into octaves is for efficiency only.
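
As a small worked example of the scale schedule, the sketch below lists the blur levels per octave for k = 2^(1/s). The base scale of 1.6 and the s + 3 blurred images per octave are taken from Lowe's paper rather than from this slide, and the function name `octave_sigmas` is hypothetical.

```python
import math


def octave_sigmas(sigma0, s, num_octaves):
    """Blur levels per octave: sigma0 * 2**octave * k**i with k = 2**(1/s).

    Each octave holds s + 3 levels (an assumption from Lowe's paper), so that
    s DoG images have both a finer and a coarser neighbor for extrema search.
    """
    k = 2 ** (1.0 / s)
    return [[sigma0 * (2 ** octave) * (k ** i) for i in range(s + 3)]
            for octave in range(num_octaves)]


sigmas = octave_sigmas(sigma0=1.6, s=3, num_octaves=2)
for octave, levels in enumerate(sigmas):
    print(octave, [round(level, 3) for level in levels])

# After s steps within an octave the scale has doubled, which is exactly
# where the next octave starts.
print(math.isclose(sigmas[0][3], sigmas[1][0]))
```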

  21. Detection of scale-space extrema

  22. Keypoint localization • X is selected if it is larger or smaller than all 26 neighbors (8 in the same DoG image and 9 in each of the two adjacent scales)
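
The 26-neighbor test can be sketched as follows, assuming the DoG images of one octave are stored as a nested list indexed `dog[scale][row][col]`. That layout and the function name `is_extremum` are assumptions for illustration.

```python
def is_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly larger or strictly smaller than all
    26 neighbors in the 3x3x3 cube centred on it."""
    value = dog[s][y][x]
    neighbors = [dog[s + ds][y + dy][x + dx]
                 for ds in (-1, 0, 1)
                 for dy in (-1, 0, 1)
                 for dx in (-1, 0, 1)
                 if (ds, dy, dx) != (0, 0, 0)]
    return value > max(neighbors) or value < min(neighbors)


# Hypothetical 3x3x3 stack: all zeros except the centre sample.
stack = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
stack[1][1][1] = 1.0
print(is_extremum(stack, 1, 1, 1))
```

The caller is expected to skip the first and last scale and the image border, since those samples do not have a full set of neighbors.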

  23. 2. Accurate keypoint localization • Reject (1) points with low contrast (flat) and (2) points poorly localized along an edge (edge) • Fit a 3D quadratic function for sub-pixel maxima

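  24. 2. Accurate keypoint localization • Taylor series of several variables • Two variables: $f(x, y) \approx f(0, 0) + \left(\frac{\partial f}{\partial x} x + \frac{\partial f}{\partial y} y\right) + \frac{1}{2}\left(\frac{\partial^2 f}{\partial x^2} x^2 + 2\frac{\partial^2 f}{\partial x \partial y} xy + \frac{\partial^2 f}{\partial y^2} y^2\right)$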

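  25. 2. Accurate keypoint localization • Taylor expansion in matrix form, where $\mathbf{x}$ is a vector and $f$ maps $\mathbf{x}$ to a scalar: $f(\mathbf{x}) \approx f(\mathbf{0}) + \mathbf{g}^T \mathbf{x} + \frac{1}{2} \mathbf{x}^T \mathbf{H} \mathbf{x}$ • $\mathbf{g}$ is the gradient • $\mathbf{H}$ is the Hessian matrix (often symmetric)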
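
A short worked derivation may help connect the matrix form to the sub-pixel fit. Writing the gradient at the sample point as g and the Hessian as H (symbols chosen here for illustration, with H assumed symmetric and invertible), setting the derivative of the quadratic approximation to zero gives the location and value of the extremum:

```latex
\begin{align*}
f(\mathbf{x}) &\approx f(\mathbf{0}) + \mathbf{g}^{T}\mathbf{x}
  + \tfrac{1}{2}\,\mathbf{x}^{T}\mathbf{H}\,\mathbf{x} \\
\frac{\partial f}{\partial \mathbf{x}} &\approx \mathbf{g} + \mathbf{H}\,\mathbf{x} = \mathbf{0}
  \quad\Longrightarrow\quad \hat{\mathbf{x}} = -\mathbf{H}^{-1}\mathbf{g} \\
f(\hat{\mathbf{x}}) &\approx f(\mathbf{0}) + \mathbf{g}^{T}\hat{\mathbf{x}}
  + \tfrac{1}{2}\,\hat{\mathbf{x}}^{T}\mathbf{H}\,\hat{\mathbf{x}}
  = f(\mathbf{0}) + \tfrac{1}{2}\,\mathbf{g}^{T}\hat{\mathbf{x}}
\end{align*}
```

The last step uses $\mathbf{H}\hat{\mathbf{x}} = -\mathbf{g}$, so the quadratic term cancels half of the linear term.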

  26. 2D illustration

  27. Derivation of matrix form

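  28. 2. Accurate keypoint localization • $\mathbf{x} = (x, y, \sigma)^T$ is a 3-vector: $D(\mathbf{x}) = D + \frac{\partial D}{\partial \mathbf{x}}^T \mathbf{x} + \frac{1}{2} \mathbf{x}^T \frac{\partial^2 D}{\partial \mathbf{x}^2} \mathbf{x}$, with its extremum at $\hat{\mathbf{x}} = -\left(\frac{\partial^2 D}{\partial \mathbf{x}^2}\right)^{-1} \frac{\partial D}{\partial \mathbf{x}}$ • Remove the sample point if the offset $\hat{\mathbf{x}}$ is larger than 0.5 in any dimension • Throw out low-contrast points ($|D(\hat{\mathbf{x}})| < 0.03$)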
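
A minimal sketch of this refinement step, assuming the DoG stack is indexed `dog[scale][row][col]`, pixel values lie in [0, 1] (so that the 0.03 threshold is meaningful), and derivatives are estimated by finite differences of neighboring samples. All function names are hypothetical. Following the slide, a point whose offset exceeds 0.5 is simply rejected; Lowe's paper instead re-centres the fit on the neighboring sample.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 linear system a x = b with Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("singular Hessian")
    return [det3([[b[r] if c == col else a[r][c] for c in range(3)]
                  for r in range(3)]) / d
            for col in range(3)]


def refine_keypoint(dog, s, y, x, contrast_threshold=0.03):
    """Return ((dx, dy, ds), interpolated value), or None if rejected."""
    def d(ds, dy, dx):
        return dog[s + ds][y + dy][x + dx]

    centre = d(0, 0, 0)
    # Gradient by central differences, ordered (x, y, scale).
    grad = [(d(0, 0, 1) - d(0, 0, -1)) / 2.0,
            (d(0, 1, 0) - d(0, -1, 0)) / 2.0,
            (d(1, 0, 0) - d(-1, 0, 0)) / 2.0]
    # Hessian by second-order finite differences.
    dxx = d(0, 0, 1) - 2 * centre + d(0, 0, -1)
    dyy = d(0, 1, 0) - 2 * centre + d(0, -1, 0)
    dss = d(1, 0, 0) - 2 * centre + d(-1, 0, 0)
    dxy = (d(0, 1, 1) - d(0, 1, -1) - d(0, -1, 1) + d(0, -1, -1)) / 4.0
    dxs = (d(1, 0, 1) - d(1, 0, -1) - d(-1, 0, 1) + d(-1, 0, -1)) / 4.0
    dys = (d(1, 1, 0) - d(1, -1, 0) - d(-1, 1, 0) + d(-1, -1, 0)) / 4.0
    hessian = [[dxx, dxy, dxs], [dxy, dyy, dys], [dxs, dys, dss]]

    offset = solve3(hessian, [-g for g in grad])   # x_hat = -H^-1 g
    if max(abs(o) for o in offset) > 0.5:
        return None                                # offset too large
    value = centre + 0.5 * sum(g * o for g, o in zip(grad, offset))
    if abs(value) < contrast_threshold:
        return None                                # low contrast
    return offset, value
```

Cramer's rule keeps the example dependency-free; a real implementation would more likely call a linear algebra library for the 3x3 solve.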

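  29. Eliminating edge responses • Hessian matrix at keypoint location: $\mathbf{H} = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix}$ • Let $\alpha$ and $\beta$ be the eigenvalues of $\mathbf{H}$ with $\alpha = r\beta$, so that $\frac{\mathrm{Tr}(\mathbf{H})^2}{\mathrm{Det}(\mathbf{H})} = \frac{(\alpha + \beta)^2}{\alpha\beta} = \frac{(r + 1)^2}{r}$ • Keep the points with $\frac{\mathrm{Tr}(\mathbf{H})^2}{\mathrm{Det}(\mathbf{H})} < \frac{(r + 1)^2}{r}$, using $r = 10$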
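
The edge test needs only the trace and determinant of the 2x2 Hessian, so the eigenvalues never have to be computed. A small sketch, assuming the second derivatives have already been estimated at the keypoint; the function name `passes_edge_test` is hypothetical, and rejecting points with a non-positive determinant follows Lowe's paper rather than this slide.

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep a keypoint only if Tr(H)^2 / Det(H) < (r + 1)^2 / r."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Curvatures of opposite sign (or a flat direction): not a stable peak.
        return False
    return trace * trace / det < (r + 1) ** 2 / r


print(passes_edge_test(-2.0, -2.0, 0.0))   # blob-like: similar curvatures
print(passes_edge_test(-20.0, -1.0, 0.0))  # edge-like: one dominant curvature
```

With r = 10 the threshold is 12.1. The first call has a ratio of 4 and is kept, while the second has a ratio of 22.05 and is rejected.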

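  30. 3. Orientation assignment • By assigning a consistent orientation, the keypoint descriptor can be orientation invariant. • For a keypoint, $L$ is the Gaussian-smoothed image with the closest scale. • From the gradient $(L_x, L_y)$, compute the magnitude $m(x, y) = \sqrt{(L(x+1, y) - L(x-1, y))^2 + (L(x, y+1) - L(x, y-1))^2}$ and the orientation $\theta(x, y) = \tan^{-1}\left(\frac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)}\right)$ • Build an orientation histogram (36 bins)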
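
The gradient magnitude and orientation at a pixel can be sketched with pixel differences, assuming the smoothed image is a nested list indexed `L[row][col]`. The function name is hypothetical, and `atan2` is used so the angle covers the full 360° range.

```python
import math


def gradient_magnitude_orientation(L, y, x):
    """Return (m, theta in degrees within [0, 360)) at pixel (x, y)."""
    lx = L[y][x + 1] - L[y][x - 1]   # horizontal pixel difference
    ly = L[y + 1][x] - L[y - 1][x]   # vertical pixel difference
    m = math.hypot(lx, ly)
    theta = math.degrees(math.atan2(ly, lx)) % 360.0
    return m, theta


# Hypothetical image whose intensity increases from left to right.
ramp = [[float(col) for col in range(3)] for _ in range(3)]
print(gradient_magnitude_orientation(ramp, 1, 1))
```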

  31. Orientation assignment

  32. Orientation assignment

  33. Orientation assignment

  34. Orientation assignment σ=1.5*scale of the keypoint

  35. Orientation assignment

  36. Orientation assignment

  37. Orientation assignment accurate peak position is determined by fitting

  38. Orientation assignment • 36-bin orientation histogram over 360°, weighted by m and by a Gaussian falloff with σ = 1.5 × scale • The highest peak is the keypoint orientation • Any local peak within 80% of the highest peak creates an additional orientation • About 15% of keypoints have multiple orientations, and they contribute a lot to stability
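
The histogram and peak selection can be sketched as below. Each sample is a `(weight, theta)` pair, where the caller is assumed to have already multiplied the gradient magnitude by the Gaussian falloff. The parabola fit through the three bins around each peak follows Lowe's paper; the bin-centre convention and the function names are assumptions for this example.

```python
def orientation_histogram(samples, num_bins=36):
    """Accumulate (weight, theta in degrees) samples into num_bins bins."""
    hist = [0.0] * num_bins
    bin_width = 360.0 / num_bins
    for weight, theta in samples:
        index = int((theta % 360.0) // bin_width) % num_bins
        hist[index] += weight
    return hist


def dominant_orientations(hist, peak_ratio=0.8):
    """Angles (degrees) of every local peak within peak_ratio of the maximum."""
    n = len(hist)
    bin_width = 360.0 / n
    threshold = peak_ratio * max(hist)
    orientations = []
    for i, h in enumerate(hist):
        left, right = hist[(i - 1) % n], hist[(i + 1) % n]
        if h >= threshold and h > left and h > right:
            # Vertex of the parabola through (i-1, left), (i, h), (i+1, right).
            shift = 0.5 * (left - right) / (left - 2.0 * h + right)
            orientations.append(((i + 0.5 + shift) * bin_width) % 360.0)
    return orientations


# Hypothetical histogram: a main peak, a second peak at 90% of it,
# and a third peak at 50% that falls below the 80% threshold.
hist = [0.0] * 36
hist[1], hist[18], hist[30] = 10.0, 9.0, 5.0
print(dominant_orientations(hist))
```

Each returned angle becomes a separate keypoint at the same location and scale, which is how one location can carry multiple orientations.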

  39. 4. Local image descriptor • Thresholded image gradients are sampled over a 16x16 array of locations in scale space, Gaussian-weighted with σ = 0.5 × width of the descriptor window • Create an array of orientation histograms (w.r.t. the key orientation) • 8 orientations x 4x4 histogram array = 128 dimensions • Normalize to unit length to reduce the effect of illumination change, clip values larger than 0.2, then renormalize
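
A simplified sketch of the descriptor layout and normalization, assuming the 16x16 gradient magnitudes and angles around the keypoint are already sampled. It deliberately omits the Gaussian weighting, the rotation of the sampling grid, and the interpolation between bins used in full implementations. The function names are hypothetical.

```python
import math


def unit_length(vec):
    """Scale a vector to unit Euclidean length (zero vectors are unchanged)."""
    norm = math.sqrt(sum(c * c for c in vec))
    return [c / norm for c in vec] if norm > 0 else list(vec)


def normalize_descriptor(vec, clip=0.2):
    """Normalize, clip large entries at `clip`, then renormalize."""
    clipped = [min(c, clip) for c in unit_length(vec)]
    return unit_length(clipped)


def build_descriptor(magnitudes, angles, key_orientation=0.0):
    """4x4 cells x 8 orientation bins = 128 values from 16x16 samples."""
    hist = [0.0] * 128
    for y in range(16):
        for x in range(16):
            cell = (y // 4) * 4 + (x // 4)
            # Measure the angle relative to the keypoint orientation.
            relative = (angles[y][x] - key_orientation) % 360.0
            bin_index = int(relative // 45.0) % 8
            hist[cell * 8 + bin_index] += magnitudes[y][x]
    return normalize_descriptor(hist)


# Hypothetical patch with uniform gradients pointing along the key orientation.
mags = [[1.0] * 16 for _ in range(16)]
angs = [[30.0] * 16 for _ in range(16)]
descriptor = build_descriptor(mags, angs, key_orientation=30.0)
print(len(descriptor))
```

Normalizing removes a uniform contrast change, while clipping at 0.2 limits the influence of a few very large gradients before the vector is renormalized.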

  40. Conclusions for SIFT • Detection of scale-space extrema (for scale invariance) • Accurate keypoint localization (remove unstable feature points) • Orientation assignment (for rotation invariance) • Keypoint descriptor (for illumination invariance)

  41. Conclusions for SIFT • Image scale invariance. • Image rotation invariance. • Robust matching across a substantial range of (1) affine distortion, (2) change in 3D viewpoint, (3) addition of noise, (4) change in illumination.

  42. Feature matching • For a feature x, find the closest feature x1 and the second-closest feature x2. If the distance ratio d(x, x1) / d(x, x2) is smaller than 0.8, the match is accepted.
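
The ratio test can be sketched with a brute-force nearest-neighbor search, assuming Euclidean distance between descriptors. The function names and the toy 2D "descriptors" are hypothetical; real SIFT descriptors have 128 dimensions.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def match_features(queries, candidates, ratio=0.8):
    """Return (query index, candidate index) pairs that pass the ratio test."""
    matches = []
    for qi, query in enumerate(queries):
        ranked = sorted((euclidean(query, cand), ci)
                        for ci, cand in enumerate(candidates))
        if len(ranked) < 2:
            continue  # the test needs a second-closest feature
        (d1, best), (d2, _) = ranked[0], ranked[1]
        if d1 < ratio * d2:
            matches.append((qi, best))
    return matches


queries = [[0.0, 0.0], [5.0, 5.0]]
candidates = [[0.1, 0.0], [3.0, 0.0], [5.0, 4.0], [5.0, 6.0]]
print(match_features(queries, candidates))
```

The second query has two equally close candidates, so its ratio is 1 and it is rejected as ambiguous, while the first query has one clearly closest candidate and is accepted.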

  43. Maxima in DoG

  44. Remove low contrast

  45. Remove edges

  46. SIFT descriptor

  47. SIFT descriptor

  48. SIFT descriptor

  49. SIFT descriptor

  50. Image Matching
