Lecture 2

Lecture 2. Mei-Chen Yeh 03/09/2010. Outline. Demos Image representation and feature extraction Global features Local features: SIFT Assignment #2 (due: 03/16). Demos. Augmented Reality http://www.youtube.com/watch?v=P9KPJlA5yds http://www.youtube.com/watch?v=U2uH-jrsSxs Tracking


Presentation Transcript


  1. Lecture 2 Mei-Chen Yeh 03/09/2010

  2. Outline • Demos • Image representation and feature extraction • Global features • Local features: SIFT • Assignment #2 (due: 03/16)

  3. Demos • Augmented Reality • http://www.youtube.com/watch?v=P9KPJlA5yds • http://www.youtube.com/watch?v=U2uH-jrsSxs • Tracking • Traffic • Counting people • Image search • MyFinder: http://128.111.56.44/myFinder/ • Simplicity: http://wang14.ist.psu.edu/cgi-bin/zwang/regionsearch_show.cgi • Image annotation • ALIPR: http://alipr.com/ • Embedded face detection and recognition • Tiling slide show • Pivot: http://www.technologyreview.com/video/?vid=533

  4. Multimedia Systems: A Multidisciplinary Subject • Signal Processing • Data Mining • Machine Learning • Pattern Recognition • Networking • … and more!

  5. Topics (1) • Image/video processing • Feature extraction • Video syntax analysis • Compression

  6. Topics (2) • Content-based image/video retrieval • Copy detection • Region-based retrieval • Multi-dimensional indexing

  7. Topics (3) • Multimodal system • Audio processing • Multimodality analysis

  8. Topics (4) • Semantic concept detection • Object detection • Object recognition

  9. Topics (5) • Tracking • Motion features • Models • Single-, multiple-object tracking

  10. Topics (6) • Quality of Service/Experience • QoE Framework • VoIP System Evaluation • Imaging System Evaluation

  11. Sources of the readings • ACM International Conference on Multimedia • The premier annual event on multimedia research, technology, and art • Held since 1993 • >400 attendees • Program: Content, Systems, Applications, HC tracks • Full papers (16%), short papers (28%) • Technical demonstrations, open source software competition, the doctoral symposium, tutorials (6), workshops (11), a brave new topic session, panels (2), Multimedia grand challenge • IEEE Transactions on Multimedia

  12. Image Representations

  13. Multimedia file formats • A list of some formats used in the popular product “Macromedia Director” • These formats differ mainly in how data are compressed. • Features are normally extracted from raw data.

  14. 1-bit images • Each pixel is stored as a single bit (0 or 1), so such an image is also referred to as a binary image. • Also called a 1-bit monochrome image: it contains no color.

  15. 8-bit gray-level images • Each pixel has a gray value between 0 and 255 (0 = black, 255 = white). • Image resolution refers to the number of pixels in a digital image. • A 640 x 480 grayscale image at one byte per pixel requires 640 x 480 = 307,200 bytes, or about 300 kB.

  16. 24-bit color images • Each pixel is represented by three bytes, usually representing RGB. • This format supports 256x256x256 (16,777,216) possible colors. • A 640x480 24-bit color image would require 921.6 kB! Lena: 1997 Lena: 1972
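The storage figures on the last three slides all follow from width x height x bits per pixel. A minimal sketch of that arithmetic, assuming uncompressed data with no header or row padding (the function name `raw_image_bytes` is made up for illustration):

```python
def raw_image_bytes(width, height, bits_per_pixel):
    """Uncompressed pixel data size in bytes (no header, no row padding)."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes

for bpp in (1, 8, 24):
    size = raw_image_bytes(640, 480, bpp)
    print(f"{bpp:2d}-bit 640x480 image: {size} bytes = {size / 1000:.1f} kB")
```

Here 1 kB is taken as 1000 bytes, which matches the 921.6 kB quoted for the 24-bit case.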

  17. Image Features

  18. Feature types • Global features • Color • Shape • Texture • Local features • SIFT • SURF • Self-similarity descriptor • Shape context descriptor • … … A fixed-length feature vector … …

  19. Color histogram • A color histogram counts the pixels that have a given pixel value in Red, Green, and Blue (RGB). • An example of a histogram that has 256³ (16,777,216) bins, for 24-bit color images:

  20. Color histogram (cont.) • Quantization
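Quantization shrinks the 256³-bin histogram by merging neighboring values in each channel. A minimal sketch, assuming uniform quantization and pixels given as (r, g, b) tuples; the function name and the default of 4 bins per channel are illustrative choices, not taken from the slides:

```python
def quantized_color_histogram(pixels, bins_per_channel=4):
    """Count pixels per quantized RGB cell.

    pixels: iterable of (r, g, b) tuples with values in 0..255.
    Returns a flat list of bins_per_channel ** 3 counts.
    """
    n = bins_per_channel
    hist = [0] * (n ** 3)
    for r, g, b in pixels:
        # Map 0..255 onto 0..n-1 for each channel.
        ri, gi, bi = r * n // 256, g * n // 256, b * n // 256
        hist[(ri * n + gi) * n + bi] += 1
    return hist

pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 0, 0)]
hist = quantized_color_histogram(pixels)
print(len(hist), hist[48], hist[3], hist[0])
```

With 4 bins per channel the feature has 64 entries instead of about 16.7 million, and the two slightly different reds fall into the same bin.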

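  21. Color histogram (cont.) • Problems of such a representation: a histogram ignores where pixels are located, so images that look different can have the SAME histogram (figure: Case 1, Case 2, Case 3, each labeled SAME!)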
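The weakness is easy to reproduce: a histogram only counts values, so any rearrangement of the same pixels is indistinguishable. A small sketch with two made-up 4x4 gray-level images (not the ones on the slide):

```python
from collections import Counter

def histogram(image):
    """Count how often each pixel value occurs, ignoring position."""
    return Counter(value for row in image for value in row)

# Left half black, right half white.
half_split = [[0, 0, 255, 255] for _ in range(4)]
# Same number of black and white pixels, arranged as a checkerboard.
checkerboard = [[0, 255, 0, 255],
                [255, 0, 255, 0],
                [0, 255, 0, 255],
                [255, 0, 255, 0]]

print(half_split != checkerboard)                          # the images differ
print(histogram(half_split) == histogram(checkerboard))    # the histograms do not
```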

  22. Search by color histograms
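The slide does not say which similarity measure the search uses. As one common option, the sketch below ranks database images by histogram intersection of normalized histograms; the measure, the function names, and the toy three-bin histograms are all assumptions for illustration:

```python
def normalize(hist):
    """Scale a histogram so its bins sum to 1 (if it is not empty)."""
    total = sum(hist)
    return [h / total for h in hist] if total else list(hist)

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(normalize(h1), normalize(h2)))

def rank_by_color(query_hist, database):
    """database: dict mapping image name -> histogram. Best match first."""
    scores = {name: histogram_intersection(query_hist, hist)
              for name, hist in database.items()}
    return sorted(scores, key=scores.get, reverse=True)

database = {"sunset": [8, 1, 1], "forest": [1, 8, 1], "sea": [1, 1, 8]}
print(rank_by_color([7, 2, 1], database))
```

Normalizing first makes the comparison independent of image size.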

  23. Regional color • Divide the image into regions • Extract a color histogram for each region • Put together those color histograms into a long feature vector
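The three steps above can be sketched as follows. To keep the example short it assumes a single gray channel, a regular grid of regions, and uniform quantization; the function name and parameters are illustrative:

```python
def regional_histograms(image, grid=2, bins=4):
    """Concatenate one histogram per grid cell into a single feature vector.

    image: list of rows of gray values in 0..255.
    Returns a list of length grid * grid * bins.
    """
    height, width = len(image), len(image[0])
    feature = []
    for gy in range(grid):
        for gx in range(grid):
            hist = [0] * bins
            for y in range(gy * height // grid, (gy + 1) * height // grid):
                for x in range(gx * width // grid, (gx + 1) * width // grid):
                    hist[image[y][x] * bins // 256] += 1
            feature.extend(hist)
    return feature

half_split = [[0, 0, 255, 255] for _ in range(4)]
checkerboard = [[0, 255, 0, 255], [255, 0, 255, 0],
                [0, 255, 0, 255], [255, 0, 255, 0]]
print(regional_histograms(half_split, grid=2, bins=2))
print(regional_histograms(checkerboard, grid=2, bins=2))
```

These two images have identical global histograms, but their regional feature vectors differ because each region keeps its own counts.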

  24. Textures • Many natural and man-made objects are distinguished by their texture. • Man-made textures • Walls, clothes, rugs… • Natural textures • Water, clouds, sand, grass, … What is this?

  25. Examples More: http://www.ux.uis.no/~tranden/brodatz.html

  26. Texture features • Structural • Describe arrangement of texture elements • E.g., “texton model”, “texel model” • Statistical • Characterize texture in terms of statistics • E.g., co-occurrence matrix, Markov random field • Spectral • Analyze in spatial-frequency domain • E.g., Fourier transform, Gabor filter, wavelets
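As an example of a statistical texture feature, the sketch below builds a gray-level co-occurrence matrix for one pixel offset and derives a contrast statistic from it. The offset, the tiny two-level images, and the function names are assumptions for illustration:

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """matrix[i][j] counts pixels of level i whose neighbor at (dx, dy) has level j."""
    height, width = len(image), len(image[0])
    matrix = [[0] * levels for _ in range(levels)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                matrix[image[y][x]][image[ny][nx]] += 1
    return matrix

def contrast(matrix):
    """Average squared gray-level difference over all counted pairs."""
    total = sum(sum(row) for row in matrix)
    weighted = sum((i - j) ** 2 * count
                   for i, row in enumerate(matrix)
                   for j, count in enumerate(row))
    return weighted / total

stripes = [[0, 1, 0, 1] for _ in range(4)]   # vertical stripes
flat = [[0, 0, 0, 0] for _ in range(4)]      # no texture
print(contrast(cooccurrence_matrix(stripes, levels=2)))
print(contrast(cooccurrence_matrix(flat, levels=2)))
```

The striped image yields a high contrast for the horizontal offset, while the flat image yields zero.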

  27. Textural Properties • Coarseness: coarse vs. fine • Contrast: high vs. low • Orientation: directional vs. non-directional • Edge: line-like vs. blob-like • Regularity: regular vs. random • Roughness: rough vs. smooth

  28. Shape • Boundary-based feature • Use only the outer boundary of the shape • E.g. Fourier descriptor, shape context descriptor • Region-based feature • Use the entire shape region • Local descriptors

  29. Shape: Fourier descriptor

  30. Properties • Invariant to translation, scale, and rotation
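One common way to obtain these invariances is to treat each boundary point as a complex number x + iy, take the discrete Fourier transform of the sequence, drop the DC term (translation), divide by the magnitude of the first coefficient (scale), and keep only magnitudes (rotation and starting point). The sketch below follows that recipe; it is not necessarily the exact formulation on the slide, and it assumes a closed boundary sampled counter-clockwise with more than `n_coeffs + 2` points:

```python
import cmath

def fourier_descriptor(boundary, n_coeffs=6):
    """boundary: list of (x, y) points along a closed contour."""
    z = [complex(x, y) for x, y in boundary]
    n = len(z)
    # Plain O(n^2) DFT; fine for short contours.
    coeffs = [sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
              for k in range(n)]
    scale = abs(coeffs[1])          # coeffs[0] (translation) is skipped
    return [abs(coeffs[k]) / scale for k in range(2, 2 + n_coeffs)]

# A 4x4 square sampled counter-clockwise, 16 points.
square = ([(x, 0) for x in range(4)] + [(4, y) for y in range(4)] +
          [(x, 4) for x in range(4, 0, -1)] + [(0, y) for y in range(4, 0, -1)])

# The same shape scaled by 3, rotated by 0.7 rad, and shifted.
moved = []
for x, y in square:
    w = 3 * cmath.exp(0.7j) * complex(x, y) + complex(5, -2)
    moved.append((w.real, w.imag))

a, b = fourier_descriptor(square), fourier_descriptor(moved)
print(max(abs(p - q) for p, q in zip(a, b)) < 1e-9)
```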

  31. Feature types • Global features • Color • Shape • Texture • Local features • SIFT • SURF • Self-similarity descriptor • Shape context descriptor • … … A fixed-length feature vector … …

  32. David G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints, IJCV, 2004

  33. What is SIFT? • Scale Invariant Feature Transform (SIFT) is an approach for detecting and extracting local feature descriptors from an image. • SIFT feature descriptors are reasonably invariant to • scaling • rotation • image noise • changes in illumination • small changes in viewpoint

  34. Types of invariance viewing angle illumination scale rotation

  35. Example SIFT output • Header: 621 128 (number of keypoints, feature dimension) • First keypoint: 162.38 155.79 44.30 2.615 7 6 0 0 0 0 0 1 58 63 1 0 7 6 1 8 8 9 0 0 24 42 39 14 0 0 0 0 0 0 7 2 44 7 0 0 23 22 6 69 137 64 0 0 0 0 11 137 55 12 0 0 2 25 137 112 0 0 0 0 3 17 30 6 34 1 0 0 20 51 137 89 137 89 0 0 0 15 115 102 137 47 0 0 4 37 26 43 0 0 0 0 19 45 4 0 0 0 0 0 0 16 137 53 33 2 0 0 0 56 137 51 57 2 0 0 0 3 14 35 0 0 0 0 0 2 0 0 • Second keypoint: 282.47 185.76 27.80 2.009 0 0 0 0 0 0 0 0 1 41 13 1 0 12 4 0 5 17 15 16 17 83 35 16 19 0 0 1 2 13 24 104 0 1 9 0 0 0 0 0 22 127 127 5 0 0 0 1 127 127 75 16 6 0 0 70 55 2 0 1 0 0 25 127 1 1 9 0 0 1 1 2 115 22 49 4 0 0 0 68 127 127 30 4 0 0 0 58 67 127 69 0 0 0 5 20 2 0 0 0 4 65 5 2 85 50 6 0 1 15 2 30 56 93 53 19 0 0 4 41 22 127 86 1 0 2 17 20 ……….
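A dump like this can be read with a few lines of code. The sketch below assumes the layout used by Lowe's demo keypoint files: a header with the keypoint count and descriptor length, then for each keypoint four floating-point values followed by the integer descriptor. Interpreting the four values as row, column, scale, and orientation is an assumption, as are the function and field names:

```python
def parse_keypoints(text):
    """Parse whitespace-separated keypoint text into a list of dicts."""
    tokens = text.split()
    count, dim = int(tokens[0]), int(tokens[1])
    keypoints, pos = [], 2
    for _ in range(count):
        # Assumed field order: row, column, scale, orientation.
        row, col, scale, orientation = (float(v) for v in tokens[pos:pos + 4])
        descriptor = [int(v) for v in tokens[pos + 4:pos + 4 + dim]]
        keypoints.append({"row": row, "col": col, "scale": scale,
                          "orientation": orientation, "descriptor": descriptor})
        pos += 4 + dim
    return keypoints

# Tiny synthetic file with 2 keypoints and 4-d descriptors (real files use 128-d).
sample = "2 4\n1.5 2.5 3.0 0.5\n1 2 3 4\n7.0 8.0 1.2 -1.0\n5 6 7 8\n"
for kp in parse_keypoints(sample):
    print(kp["row"], kp["col"], kp["scale"], kp["orientation"], kp["descriptor"])
```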

  36. Matching two images
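The slide does not spell out the matching rule. A widely used one, described in Lowe's paper, is nearest-neighbor search with a distance ratio test: a match is accepted only if the closest descriptor is clearly closer than the second closest. The sketch below is a brute-force version; the 0.8 threshold and the toy 2-d descriptors are illustrative assumptions:

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Return (index_in_a, index_in_b) pairs that pass the ratio test."""
    matches = []
    for i, a in enumerate(desc_a):
        dists = sorted((squared_distance(a, b), j) for j, b in enumerate(desc_b))
        # Compare squared distances, so the ratio is squared as well.
        if len(dists) >= 2 and dists[0][0] < (ratio ** 2) * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches

image_a = [[0, 0], [10, 10], [5, 5]]
image_b = [[0, 1], [10, 10], [50, 50]]
print(match_descriptors(image_a, image_b))
```

The third descriptor in `image_a` is about equally close to two candidates in `image_b`, so it is rejected as ambiguous.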

  37. SIFT features • Densely cover the image (an image with 500x500 pixels => 2000 feature vectors) • Distinctive • Invariant to image scale and rotation, and partially invariant to changes in viewpoint and illumination • Perform best among local descriptors • K. Mikolajczyk and C. Schmid, “A performance evaluation of local descriptors,” PAMI 2005.

  38. Simple test (scale and rotate) • Scale to 60% and rotate by 30 degrees • 693 keypoints vs. 349 keypoints: 214 matches!

  39. Simple test (illumination) 467 matches! 693 keypoints 633 keypoints

  40. Simple test (different appearance) 25 matches! 693 keypoints 728 keypoints

  41. Simple test (different appearance) 1 match! 693 keypoints 832 keypoints

  42. Simple test (different appearance with occlusion) 0 matches! 693 keypoints 1124 keypoints

  43. About SIFT… • How are SIFT feature descriptors generated? • How are SIFT feature descriptors used (for object recognition, image retrieval, etc.)?

  44. SIFT: Overview (interest point detector + descriptor) • Major stages of SIFT computation • Input: an image • Scale-space extrema detection: identify potential interest points => (location, scale) • Keypoint localization: localize candidate keypoints => reduced set of (location, scale) • Orientation assignment: identify the dominant orientations => (location, scale, orientation) • Keypoint descriptor: build a descriptor based on histograms of gradients in a local neighborhood => feature vectors (128-d)
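The last two stages both rely on histograms of gradient orientations. As a rough illustration, the sketch below accumulates a magnitude-weighted orientation histogram over a small patch and reads off the dominant bin. The 36-bin layout follows Lowe's paper; central differences and the omission of Gaussian weighting and peak interpolation are simplifications made here:

```python
import math

def orientation_histogram(patch, bins=36):
    """Magnitude-weighted histogram of gradient directions (0..360 degrees)."""
    hist = [0.0] * bins
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(dx, dy)
            angle = math.degrees(math.atan2(dy, dx)) % 360.0
            hist[int(angle // (360.0 / bins)) % bins] += magnitude
    return hist

# Intensity increases with the row index, so the gradient points along +y (90 degrees).
ramp = [[float(y)] * 5 for y in range(5)]
hist = orientation_histogram(ramp)
dominant_bin = hist.index(max(hist))
print(dominant_bin, dominant_bin * 10, "degrees")
```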

  45. Step 1: Scale-space extrema detection • How do we detect locations that are invariant to scale change of the image? • Detect extrema in scale-space • For a given image I(x, y), its linear scale-space representation is L(x, y, σ) = G(x, y, σ) * I(x, y), where G is a Gaussian kernel with standard deviation σ and * denotes convolution • This can be implemented efficiently by searching for local peaks in a series of DoG (difference-of-Gaussian) images

  46. Step 1: Scale-space extrema detection • Gaussian images at scales σ, kσ, k²σ

  47. Gaussian images DoG images
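A DoG image is the difference between two Gaussian-blurred copies of the image at neighboring scales. The sketch below builds a short stack in pure Python. It blurs the original image directly at each scale and skips the octave structure and downsampling of a full SIFT implementation; σ = 1.6, k = √2, and the number of levels are illustrative assumptions:

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian weights, truncated at about 3 sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(image, sigma):
    """Separable Gaussian blur; border pixels are replicated."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])

    def clamp(v, size):
        return min(max(v, 0), size - 1)

    horizontal = [[sum(w * row[clamp(x + i - radius, width)]
                       for i, w in enumerate(kernel))
                   for x in range(width)] for row in image]
    return [[sum(w * horizontal[clamp(y + i - radius, height)][x]
                 for i, w in enumerate(kernel))
             for x in range(width)] for y in range(height)]

def dog_images(image, sigma=1.6, k=2 ** 0.5, levels=4):
    """Return levels - 1 difference images: G(k^(i+1) sigma) - G(k^i sigma)."""
    gaussians = [blur(image, sigma * k ** i) for i in range(levels)]
    return [[[b - a for a, b in zip(row_a, row_b)]
             for row_a, row_b in zip(g_a, g_b)]
            for g_a, g_b in zip(gaussians, gaussians[1:])]

impulse = [[0.0] * 15 for _ in range(15)]
impulse[7][7] = 1.0
dogs = dog_images(impulse)
print(len(dogs), dogs[0][7][7])
```

For a single bright pixel the DoG response at the center is negative, because the more strongly blurred image has the lower peak.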

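  48. Step 1: Scale-space extrema detection (cont.) • Each sample X in a DoG image is compared with its neighbors in the same DoG image and in the adjacent DoG images (the scales above and below). • If X is the largest or the smallest of all of its neighbors, X is called a keypoint.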
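The neighbor test can be sketched as a brute-force scan over a stack of DoG images. Following Lowe's paper, the sketch assumes the neighborhood is the 3x3x3 block around X (8 neighbors at the same scale plus 9 in each adjacent scale); the function name and the (x, y, scale index) output format are illustrative:

```python
def find_extrema(dog):
    """dog: list of equally sized 2-D lists, ordered by scale.

    Returns (x, y, s) for every sample that is strictly larger or strictly
    smaller than all 26 neighbors. Border samples and the first and last
    scale are skipped because they lack a full neighborhood.
    """
    keypoints = []
    for s in range(1, len(dog) - 1):
        for y in range(1, len(dog[s]) - 1):
            for x in range(1, len(dog[s][0]) - 1):
                value = dog[s][y][x]
                neighbors = [dog[s + ds][y + dy][x + dx]
                             for ds in (-1, 0, 1)
                             for dy in (-1, 0, 1)
                             for dx in (-1, 0, 1)
                             if (ds, dy, dx) != (0, 0, 0)]
                if value > max(neighbors) or value < min(neighbors):
                    keypoints.append((x, y, s))
    return keypoints

stack = [[[0.0] * 5 for _ in range(5)] for _ in range(3)]
stack[1][2][2] = 5.0    # a single peak in the middle scale
print(find_extrema(stack))
```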

  49. Why DoG? • An efficient function to compute • A close approximation to the scale-normalized Laplacian of Gaussian, σ²∇²G • Lindeberg showed that the normalization of the Laplacian with the factor σ² is required for true scale invariance. (1994) • Mikolajczyk found that the maxima and minima of σ²∇²G produce the most stable image features. (2002) • DoG vs.
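The approximation can be made explicit with the argument given in Lowe's paper, which starts from the heat diffusion equation satisfied by the Gaussian. The steps below use the same symbols as the slides: G for the Gaussian kernel, σ for its scale, and k for the ratio between adjacent scales.

```latex
\begin{align}
\frac{\partial G}{\partial \sigma} &= \sigma \nabla^2 G
  && \text{(heat diffusion equation)} \\
\sigma \nabla^2 G = \frac{\partial G}{\partial \sigma}
  &\approx \frac{G(x, y, k\sigma) - G(x, y, \sigma)}{k\sigma - \sigma}
  && \text{(finite difference)} \\
G(x, y, k\sigma) - G(x, y, \sigma) &\approx (k - 1)\, \sigma^2 \nabla^2 G
\end{align}
```

Since the factor (k − 1) is the same at every scale, it does not change where the extrema are located, so DoG extrema approximate those of the scale-normalized Laplacian of Gaussian.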
