
Paper Overviews



Presentation Transcript


  1. Paper Overviews • 3 types of descriptors: • SIFT / PCA-SIFT (Ke, Sukthankar) • GLOH (Mikolajczyk, Schmid) • DAISY (Tola et al.; Winder et al.) • Comparison of descriptors (Mikolajczyk, Schmid)

  2. Paper Overviews • PCA-SIFT: SIFT-based but with a smaller descriptor • GLOH: modifies the SIFT descriptor for robustness and distinctiveness • DAISY: novel descriptor that uses graph cuts for matching and depth map estimation

  3. SIFT • “Scale Invariant Feature Transform” • 4 stages: • Peak selection • Keypoint localization • Keypoint orientation • Descriptors

  5. SIFT • 1. Peak Selection • Make Gaussian pyramid http://www.cra.org/Activities/craw_archive/dmp/awards/2006/Bolan/DMP_Pages/filters.html

  6. SIFT • 1. Peak Selection • Find local peaks using difference of Gaussians • Peaks are found at different scales http://www.cra.org/Activities/craw_archive/dmp/awards/2006/Bolan/DMP_Pages/filters.html
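The peak-selection stage can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names, the sigma values, the contrast threshold, and the single-octave layout (no downsampling between pyramid levels) are choices made for this sketch, not values from the slides.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel, truncated at 3 sigma and normalized to sum to 1."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_blur(image, sigma):
    """Separable Gaussian blur: filter along rows, then along columns."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image.astype(float), radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def dog_peaks(image, sigmas=(1.0, 1.6, 2.56, 4.1), threshold=0.01):
    """Return (scale, y, x) positions that are extrema of the DoG stack.

    The sigmas and threshold are illustrative assumptions.
    """
    blurred = [gaussian_blur(image, s) for s in sigmas]
    # Difference of Gaussians: subtract adjacent blur levels.
    dog = np.stack([b2 - b1 for b1, b2 in zip(blurred, blurred[1:])])
    peaks = []
    for s in range(1, dog.shape[0] - 1):
        for y in range(1, dog.shape[1] - 1):
            for x in range(1, dog.shape[2] - 1):
                value = dog[s, y, x]
                # Compare against the 3x3x3 neighborhood in space and scale.
                cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                if abs(value) > threshold and value in (cube.max(), cube.min()):
                    peaks.append((s, y, x))
    return peaks
```

Because each candidate is compared with its neighbors in the adjacent DoG layers as well as in its own layer, a peak is tied to a particular scale, which is what "peaks are found at different scales" refers to.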

  8. SIFT • 2. Keypoint Localization • Remove peaks that are “unstable”: • Peaks in low-contrast areas • Peaks along edges • Features not distinguishable
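A rough sketch of the two rejection tests follows. The slide does not give thresholds; the contrast threshold of 0.03 and the edge ratio of 10 used here are the values suggested in Lowe's IJCV 2004 paper, and the function name is hypothetical. The sub-pixel refinement that SIFT also performs at this stage is left out.

```python
import numpy as np

def is_stable(dog_layer, y, x, contrast_thresh=0.03, edge_ratio=10.0):
    """Return True if the DoG peak at (y, x) passes both stability tests.

    dog_layer is one 2-D difference-of-Gaussians image. The default
    thresholds are assumptions taken from Lowe's paper, not the slides.
    """
    value = dog_layer[y, x]
    # Test 1: reject peaks in low-contrast areas.
    if abs(value) < contrast_thresh:
        return False
    # Test 2: reject peaks along edges using the 2x2 Hessian of the DoG.
    dxx = dog_layer[y, x + 1] - 2.0 * value + dog_layer[y, x - 1]
    dyy = dog_layer[y + 1, x] - 2.0 * value + dog_layer[y - 1, x]
    dxy = (dog_layer[y + 1, x + 1] - dog_layer[y + 1, x - 1]
           - dog_layer[y - 1, x + 1] + dog_layer[y - 1, x - 1]) / 4.0
    trace = dxx + dyy
    det = dxx * dyy - dxy ** 2
    if det <= 0:
        return False  # curvatures have opposite signs or one is zero
    # An edge has one large and one small curvature, so trace^2/det is large.
    return trace ** 2 / det < (edge_ratio + 1.0) ** 2 / edge_ratio
```

An isolated blob has similar curvature in both directions and passes; a ridge has near-zero curvature along its length and is rejected.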

  10. SIFT • 3. Keypoint Orientation • Make histogram of gradients for a patch of pixels • Orient all patches so the dominant gradient direction is vertical http://www.inf.fu-berlin.de/lehre/SS09/CV/uebungen/uebung09/SIFT.pdf
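The orientation step can be illustrated with a magnitude-weighted histogram of gradient directions. This is a minimal sketch: the 36-bin resolution follows Lowe's paper rather than the slide, the function name is hypothetical, and the Gaussian weighting window used by SIFT is omitted.

```python
import numpy as np

def dominant_orientation(patch, num_bins=36):
    """Return (histogram, dominant angle in degrees) for a grayscale patch.

    num_bins=36 (10 degrees per bin) is an assumption for this sketch.
    """
    gy, gx = np.gradient(patch.astype(float))  # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    bin_width = 360.0 / num_bins
    bins = (angle // bin_width).astype(int) % num_bins
    # Each pixel votes for its direction, weighted by gradient magnitude.
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(),
                       minlength=num_bins)
    # Report the center of the strongest bin as the dominant direction.
    return hist, (np.argmax(hist) + 0.5) * bin_width
```

Rotating the patch by the negative of the returned angle (plus a fixed offset for whichever canonical direction is chosen) puts every keypoint into the same reference frame before the descriptor is computed.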

  12. SIFT • 4. Descriptors • Ideal descriptor: • Compact • Distinctive from other descriptors • Robust against lighting / viewpoint changes

  13. SIFT • 4. Descriptors • A SIFT descriptor is a 128-element vector: • 4x4 array of 8-bin histograms • Each histogram is a smoothed representation of gradient orientations of the patch
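To make the 4x4x8 = 128 layout concrete, here is a simplified sketch. It assumes a 16x16 patch that has already been rotated to its canonical orientation, and it uses hard binning; the smoothing mentioned on the slide (Gaussian weighting and interpolation between neighboring bins) is deliberately left out, so this is not a full SIFT descriptor. The function name is hypothetical.

```python
import numpy as np

def sift_like_descriptor(patch):
    """Build a 128-element vector: a 4x4 grid of 8-bin orientation histograms."""
    if patch.shape != (16, 16):
        raise ValueError("this sketch expects a 16x16 patch")
    gy, gx = np.gradient(patch.astype(float))
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    bins = (angle // 45.0).astype(int) % 8  # 8 orientation bins of 45 degrees
    descriptor = np.zeros((4, 4, 8))
    for y in range(16):
        for x in range(16):
            # Each 4x4-pixel cell accumulates its own histogram.
            descriptor[y // 4, x // 4, bins[y, x]] += magnitude[y, x]
    descriptor = descriptor.ravel()
    norm = np.linalg.norm(descriptor)
    # Unit-length normalization reduces sensitivity to contrast changes.
    return descriptor / norm if norm > 0 else descriptor
```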

  14. PCA-SIFT • Changes step 4 of the SIFT process to create different descriptors • Rationale: • Construction of SIFT descriptors is complicated • Reason for constructing them that way is unclear • Is there a simpler alternative?

  15. PCA-SIFT • “Principal Component Analysis” (PCA) • A widely-used method of dimensionality reduction • Used with SIFT to make a smaller feature descriptor • By projecting the gradient patch into a smaller space

  16. PCA-SIFT • Creating a descriptor for keypoints: • Create patch eigenspace • Create projection matrix • Create feature vector

  17. PCA-SIFT • 1. Create patch eigenspace • For each keypoint: • Take a 41x41 patch around the keypoint • Compute horizontal / vertical gradients • Put all gradient vectors for all keypoints into a matrix

  18. PCA-SIFT • 1. Create patch eigenspace • M = matrix of gradients for all keypoints • Calculate covariance of M • Calculate eigenvectors of covariance(M)

  19. PCA-SIFT • 2. Create projection matrix • Choose first n eigenvectors • This paper uses n = 20 • This is the projection matrix • Store for later use, no need to re-compute

  20. PCA-SIFT • 3. Create feature vector • For a single keypoint: • Take its gradient vector, project it with the projection matrix • Feature vector is of size n • This is called Grad PCA in the paper • “Img PCA” - use image patch instead of gradient • Size difference: 128 elements (SIFT) vs. n = 20
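The three PCA-SIFT steps can be sketched as below. The function names are hypothetical, the unit-length normalization of the gradient vector is an assumption of this sketch, and the eigenvectors are obtained from an SVD of the centered data rather than by forming the covariance matrix explicitly. The two approaches yield the same principal directions, but the SVD avoids building a very large square matrix.

```python
import numpy as np

def gradient_vector(patch):
    """Flatten the horizontal and vertical gradients of a patch into one vector."""
    gy, gx = np.gradient(patch.astype(float))
    vec = np.concatenate([gx.ravel(), gy.ravel()])
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec  # normalization is an assumption

def build_projection(patches, n=20):
    """Steps 1 and 2: build the patch eigenspace and keep the first n eigenvectors.

    Requires n <= number of patches. Returns (mean vector, n x d matrix).
    """
    m = np.stack([gradient_vector(p) for p in patches])  # one row per keypoint
    mean = m.mean(axis=0)
    # Rows of vt are the eigenvectors of covariance(m), strongest first.
    _, _, vt = np.linalg.svd(m - mean, full_matrices=False)
    return mean, vt[:n]

def pca_sift_descriptor(patch, mean, projection):
    """Step 3: project one keypoint's gradient vector into the n-D space."""
    return projection @ (gradient_vector(patch) - mean)
```

A typical use would compute `mean, projection = build_projection(training_patches)` once, store both, and then call `pca_sift_descriptor` for every new keypoint, which matches the slide's remark that the projection matrix does not need to be recomputed.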

  21. PCA-SIFT • Results • Tested SIFT vs. “Grad PCA” and “Img PCA” on a series of image variations: • Gaussian noise • 45° rotation followed by 50% scaling • 50% intensity scaling • Projective warp

  22. PCA-SIFT • Results (Precision-recall curves) • Grad PCA (black) generally outperforms Img PCA (pink) and SIFT (purple) except when brightness is reduced • Both PCA methods outperform SIFT with illumination changes

  23. PCA-SIFT • Results • PCA-SIFT also gets more matches correct on images taken at different viewpoints

  24. A Performance Evaluation of Local Descriptors • Krystian Mikolajczyk and Cordelia Schmid

  25. Problem Setting for Comparison • Matching Problem (from a slide of David G. Lowe, IJCV 2004) • As we did in Project 2: Panorama, we want to find correct pairs of points in two images.

  26. Overview of Compared Methods • Region Detector: detects interest points • Region Descriptor: describes the points • Matching Strategy: how to find pairs of points in two images

  27. Region Detector • Harris Points • Blob Structure Detector: 1. Harris-Laplace Regions (similar to DoG) 2. Hessian-Laplace Regions 3. Harris-Affine Regions 4. Hessian-Affine Regions • Edge Detector: Canny Detector

  28. Region Descriptors (descriptor, dimension, notes) • SIFT-based descriptors (distance measure: Euclidean): SIFT, 128; PCA-SIFT, 36; GLOH, 128 • Shape Context, 36: similar to SIFT, but focuses on edge locations found with the Canny detector • Spin, 50: a sparse set of affine-invariant local patches is used • Differential descriptors (distance measure: Mahalanobis): Steerable Filters, 14; Differential Invariants, 14; these focus on the properties of local derivatives (local jet) • Complex Filters, 1681: consists of many filters • Gradient Moments, 20: moment-based descriptor • Cross Correlation, 81: uniformly sampled locations

  29. Matching Strategy • Threshold-Based Matching • Nearest Neighbor Matching – Threshold • Nearest Neighbor Matching – Distance Ratio (D_B: the first neighbor, D_C: the second neighbor)
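The three matching strategies differ only in which pairs they accept, which a short sketch makes explicit. Euclidean distance and the function names are assumptions made for illustration; the slide does not fix either.

```python
import numpy as np

def pairwise_distances(desc_a, desc_b):
    """Euclidean distance between every descriptor in A and every one in B."""
    a = np.asarray(desc_a, dtype=float)
    b = np.asarray(desc_b, dtype=float)
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

def threshold_matching(desc_a, desc_b, t):
    """Accept every pair closer than t; one descriptor may match several."""
    d = pairwise_distances(desc_a, desc_b)
    return [(int(i), int(j)) for i, j in zip(*np.nonzero(d < t))]

def nn_threshold_matching(desc_a, desc_b, t):
    """Accept only the nearest neighbor, and only if it is closer than t."""
    d = pairwise_distances(desc_a, desc_b)
    nearest = d.argmin(axis=1)
    return [(i, int(j)) for i, j in enumerate(nearest) if d[i, j] < t]

def nn_ratio_matching(desc_a, desc_b, ratio):
    """Accept the nearest neighbor D_B if it is much closer than the second, D_C.

    Needs at least two descriptors in desc_b.
    """
    d = pairwise_distances(desc_a, desc_b)
    order = np.argsort(d, axis=1)
    matches = []
    for i in range(d.shape[0]):
        first, second = order[i, 0], order[i, 1]
        # Equivalent to d_first / d_second < ratio, without dividing by zero.
        if d[i, first] < ratio * d[i, second]:
            matches.append((i, int(first)))
    return matches
```

The ratio test rejects a match whose best and second-best candidates are almost equally close, since such a match is ambiguous even when the absolute distance is small.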

  30. Performance Measurements • Repeatability rate, ROC • Recall-Precision • Recall = TP (true positives) / actual positives = (# of correct matches found) / (total # of correct matches) • Precision = TP (true positives) / predicted positives = (# of correct matches found) / (# of correct matches found + # of false matches)

  31. Example of Recall-Precision • Suppose our method produced the following: • 50 corresponding pairs were extracted (A, predicted positives) • 40 of the extracted pairs were correct (C) • The ground truth contains 200 correct pairs (B, actual positives) • Then Recall = C/B = 40/200 = 20% and Precision = C/A = 40/50 = 80% • The perfect descriptor gives 100% recall for any value of precision.
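The arithmetic of this example can be checked with a few lines of code. The function and parameter names are made up for this sketch.

```python
def recall_precision(num_extracted, num_correct, num_ground_truth):
    """Return (recall, precision) from raw match counts.

    num_extracted:    pairs the method returned (predicted positives)
    num_correct:      returned pairs that are actually correct (true positives)
    num_ground_truth: correct pairs that exist in total (actual positives)
    """
    recall = num_correct / num_ground_truth
    precision = num_correct / num_extracted
    return recall, precision

recall, precision = recall_precision(50, 40, 200)
print(f"Recall = {recall:.0%}, Precision = {precision:.0%}")
# prints "Recall = 20%, Precision = 80%"
```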

  32. Dataset • 6 different transformed images: • Rotation • Zoom + Rotation • Image Blur • Viewpoint Change • Light Change • JPEG Compression

  33. Matching Strategies (Hessian-Affine Regions) • Threshold-Based Matching • Nearest Neighbor Matching – Threshold • Nearest Neighbor Matching – Distance Ratio

  34. Viewpoint Change • With Hessian-Affine Regions • With Harris-Affine Regions

  35. Scale Change with Rotation • Hessian-Laplace Regions • Harris-Laplace Regions

  36. Image Rotation of 30 to 45 Degrees • Harris Points

  37. Image Blur • Hessian-Affine Regions

  38. JPEG Compression • Hessian-Affine Regions

  39. Illumination Changes • Hessian-Affine Regions

  40. Ranking of Descriptors (from high to low performance) • 1. SIFT-based descriptors (GLOH, SIFT), 128 dimensions • 2. Shape Context, 36 dimensions • 3. PCA-SIFT, 36 dimensions • 4. Gradient Moments (20 dimensions) and Steerable Filters (14 dimensions) • 5. Other descriptors • Note: This ranking is for the matching problem; it is not a measure of general performance.

  41. Ranking of Difficult Image Transformations • Scene types, from easy to difficult: 1. Structured Scene 2. Textured Scene; Two Textured Scenes • Transformations, from easy to difficult: 1. Scale & Rotation & Illumination 2. JPEG Compression 3. Image Blur 4. Viewpoint Change

  42. Other Results • Hessian regions are better than Harris regions • Nearest-neighbor-based matching is better than simple threshold-based matching • SIFT improves when the nearest-neighbor distance ratio is used • Robust region descriptors perform better than point-wise descriptors • Image rotation does not have a big impact on the accuracy of descriptors

  43. A Fast Local Descriptor for Dense Matching • Engin Tola, Vincent Lepetit, Pascal Fua • Ecole Polytechnique Fédérale de Lausanne, Switzerland

  44. Paper novelty • Introduces the DAISY local image descriptor • Much faster to compute than SIFT for dense point matching • Performs on par with or better than SIFT • DAISY descriptors are fed into an expectation-maximization (EM) algorithm that uses graph cuts to estimate the scene's depth • Works on low-quality images such as those captured from video streams

  45. SIFT local image descriptor • SIFT descriptor is a 3–D histogram in which two dimensions correspond to image spatial dimensions and the additional dimension to the image gradient direction (normally discretized into 8 bins)

  46. SIFT local image descriptor • Each bin contains a weighted sum of the norms of the image gradients around its center, where the weights roughly depend on the distance to the bin center

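  47. DAISY local image descriptor • Gaussian convolved orientation maps are calculated for every direction: $G_o^{\Sigma} = G_{\Sigma} * \left(\frac{\partial I}{\partial o}\right)^{+}$ • $G_{\Sigma}$: Gaussian convolution filter with variance $\Sigma$ • $\frac{\partial I}{\partial o}$: image gradient in direction $o$ • $(\cdot)^{+}$: operator such that $(a)^{+} = \max(a, 0)$ • $G_o^{\Sigma}$: orientation maps • Every location in $G_o^{\Sigma}$ contains a value very similar to what a bin in SIFT contains: a weighted sum of gradient norms computed over an area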
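The orientation-map step on this slide (a Gaussian filter applied to the positive part of the directional image gradient) can be sketched as follows. The function names, the choice of 8 directions, and the use of `sigma` as a standard deviation in pixels are assumptions of this sketch.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Separable Gaussian blur with a kernel truncated at 3 sigma."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(image.astype(float), radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def orientation_maps(image, num_dirs=8, sigma=2.0):
    """Return an array of shape (num_dirs, H, W), one smoothed map per direction."""
    gy, gx = np.gradient(image.astype(float))
    maps = []
    for k in range(num_dirs):
        theta = 2.0 * np.pi * k / num_dirs
        # Directional derivative, keeping only the positive part: (a)+ = max(a, 0).
        directional = np.maximum(gx * np.cos(theta) + gy * np.sin(theta), 0.0)
        # Gaussian smoothing turns each pixel into a weighted sum over an area.
        maps.append(gaussian_blur(directional, sigma))
    return np.stack(maps)
```

Because the maps are computed once for the whole image with convolutions, the per-pixel cost of building a descriptor afterwards is just a handful of lookups, which is where the speed advantage for dense matching comes from.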

  48. DAISY local image descriptor

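  49. DAISY local image descriptor • Histograms at every pixel location are computed: $h_{\Sigma}(u, v) = \left[G_1^{\Sigma}(u, v), \ldots, G_8^{\Sigma}(u, v)\right]^{T}$ • $h_{\Sigma}(u, v)$: histogram at location $(u, v)$ • $G_1^{\Sigma}, \ldots, G_8^{\Sigma}$: Gaussian convolved orientation maps • Histograms are normalized to unit norm, giving $\tilde{h}_{\Sigma}(u, v)$ • Local image descriptor is computed as: $D(u_0, v_0) = \left[\tilde{h}_{\Sigma_1}^{T}(u_0, v_0),\ \tilde{h}_{\Sigma_1}^{T}(l_1(u_0, v_0, R_1)), \ldots, \tilde{h}_{\Sigma_1}^{T}(l_N(u_0, v_0, R_1)),\ \tilde{h}_{\Sigma_2}^{T}(l_1(u_0, v_0, R_2)), \ldots, \tilde{h}_{\Sigma_3}^{T}(l_N(u_0, v_0, R_3))\right]^{T}$ • $l_j(u, v, R)$: the location with distance $R$ from $(u, v)$ in the direction given by $j$ when the directions are quantized into $N$ values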
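Assembling the descriptor from precomputed orientation maps can be sketched like this. The function name, the treatment of u as the column and v as the row, the rounding of sample locations to the nearest pixel, and the example radii are all assumptions of this sketch.

```python
import numpy as np

def daisy_descriptor(maps_per_ring, u, v, radii, num_dirs=8):
    """Concatenate unit-norm histograms sampled at the center and on each ring.

    maps_per_ring[i] has shape (num_orientations, H, W) and holds the
    orientation maps smoothed with the Gaussian used for ring i.
    """
    def unit_histogram(maps, x, y):
        # Clamp to the image so samples near the border stay valid.
        y = min(max(y, 0), maps.shape[1] - 1)
        x = min(max(x, 0), maps.shape[2] - 1)
        h = maps[:, y, x]
        norm = np.linalg.norm(h)
        return h / norm if norm > 0 else h

    # Center histogram uses the finest smoothing level.
    parts = [unit_histogram(maps_per_ring[0], u, v)]
    for maps, radius in zip(maps_per_ring, radii):
        for j in range(num_dirs):
            theta = 2.0 * np.pi * j / num_dirs
            # Location at distance `radius` from (u, v) in quantized direction j.
            x = int(round(u + radius * np.cos(theta)))
            y = int(round(v + radius * np.sin(theta)))
            parts.append(unit_histogram(maps, x, y))
    return np.concatenate(parts)
```

With 3 rings, 8 directions per ring, and 8 orientation maps per level, the result has (1 + 3 x 8) x 8 = 200 elements.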

  50. From Descriptor to Depth Map • The model uses EM to estimate depth map $Z$ and occlusion map $O$ by maximizing the posterior $p(Z, O \mid D_1, \ldots, D_N)$ • $D_n$: descriptor of image $n$
