
Data-driven Visual Similarity for Cross-domain Image Matching


Presentation Transcript


  1. Data-driven Visual Similarity for Cross-domain Image Matching Abhinav Shrivastava*, Tomasz Malisiewicz, Abhinav Gupta, Alexei A. Efros Carnegie Mellon University, *MIT To appear in SIGGRAPH Asia, 2011

  2. Outline • Introduction • Approach • Data-driven Uniqueness • Algorithm Description • Experimental Validation • Sketch-to-Image Matching • Painting-to-Image Matching • Applications • Limitations

  3. Introduction

  4. Visual matching approaches • Exact matching: These methods usually fail when tasked with finding similar, but not identical, objects (e.g., try using the Google Goggles app to find a cup or a chair).

  5. Approximate matching: • Most methods focus on image representations that aim to capture the important, salient parts of the image (e.g., GIST, HoG). • In Content-Based Image Retrieval (CBIR), the aim is to retrieve semantically relevant images, even if they do not appear visually similar.

  6. Cross-domain matching: • Particular domains: sketches to photographs [Chen et al. 2009; Eitz et al. 2010]; photos under different illuminants [Chong et al. 2008]. • Across multiple domains: matching local self-similarities across images and videos, the work of Shechtman and Irani [2007].

  7. Each query image decides the best way to weight its constituent parts.

  8. Approach • There are two requirements for a good visual similarity function: • It has to focus on the content of the image (the “what”), rather than the style (the “how”). • It should be scene-dependent.

  9. Data-driven Uniqueness: • If we re-weight the different elements of an image based on how unique they are, the resulting similarity function should focus on the content that matters. • We compute uniqueness in a data-driven way — against a very large dataset of randomly selected images. • We look for the features that would best discriminate this image (the positive sample) against the rest of the data (the negative samples).

  10. Given the learned, query-dependent weight vector wq, the visual similarity between a query image Iq and any other image/sub-image Ii can be defined simply as the dot product S(Iq, Ii) = wq · xi, where xi is Ii’s extracted feature vector. • We employ a linear Support Vector Machine (SVM) to learn the feature weight vector. • Image feature: Histogram of Oriented Gradients (HoG) template descriptor.
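
A minimal sketch of this dot-product scoring, assuming NumPy and that HoG descriptors for the candidates have already been extracted elsewhere (function and variable names are illustrative, not from the paper):

    import numpy as np

    def similarity_scores(w_q, candidate_features):
        # Score every candidate image/sub-image by S(Iq, Ii) = wq . xi.
        # candidate_features: array of shape (n_candidates, d) of HoG descriptors.
        X = np.asarray(candidate_features, dtype=float)
        return X @ np.asarray(w_q, dtype=float)

    # Rank candidates from most to least similar to the query:
    # ranking = np.argsort(-similarity_scores(w_q, candidate_features))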

  11. To visualize how the SVM captures the notion of data-driven uniqueness, the learned weights can be inspected: they highlight which parts of the query are most discriminative.

  12. Algorithm Description: • Learning the weight vector wq amounts to minimizing a convex SVM objective: a squared-norm regularizer on w plus hinge-loss terms that push wq · x up for the positive samples and down for the negative samples. • Each query image (Iq) is represented with a rigid, grid-like HoG feature template (xq). • To cope with image misalignment, we create a set of extra positive data points, P, by applying small transformations (shift, scale, and aspect ratio) to the query image Iq and generating xi for each sample. • The SVM classifier is learned using Iq and P as positive samples, and a set N containing millions of sub-images (extracted from 10,000 randomly selected Flickr images) as negatives. • We use LIBSVM [Chang and Lin 2011] for learning wq, with a common regularization parameter λ = 100 and the standard hinge loss function h(x) = max(0, 1 − x). A sketch of this training setup follows below.
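
As a rough sketch of this setup (not the authors’ code), the per-query SVM could be trained as follows. scikit-learn’s LinearSVC stands in for LIBSVM, the function and variable names are illustrative, and the C parameter is scikit-learn’s regularization trade-off rather than the slide’s λ:

    import numpy as np
    from sklearn.svm import LinearSVC

    def learn_query_weights(x_query, jittered_positives, negative_pool, C=0.01):
        # Positives: the query HoG template plus its jittered copies (the set P).
        # Negatives: HoG templates of sub-images from random Flickr photos (the set N).
        x_query = np.atleast_2d(x_query)
        X = np.vstack([x_query, jittered_positives, negative_pool])
        y = np.concatenate([np.ones(len(x_query) + len(jittered_positives)),
                            -np.ones(len(negative_pool))])
        svm = LinearSVC(C=C, loss="hinge", dual=True)   # linear SVM with hinge loss
        svm.fit(X, y)
        return svm.coef_.ravel()                        # query-specific weight vector wq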

  13. Experimental Validation • To demonstrate our approach, we performed a number of image-matching experiments on different image datasets, comparing against the following popular baseline methods: • Tiny Images • GIST • BoW • Spatial Pyramid • Normalized-HoG (N-HoG)

  14. Sketch-to-Image Matching • We collected a dataset of 50 sketches (25 cars and 25 bicycles) to be used as queries. • The sketches were used to query into the PASCAL VOC dataset [Everingham et al. 2007].

  15. For quantitative evaluation, we compared how many car and bicycle images were retrieved in the top-K images for car and bicycle sketches, respectively. • We used the bounded mean Average Precision (mAP) metric of [Jégou et al. 2008].
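
For illustration only, the top-K check in the first bullet could be computed with a simple helper like the one below (hypothetical names; this is plain precision at K, not the bounded mAP of [Jégou et al. 2008]):

    import numpy as np

    def precision_at_k(retrieved_labels, query_label, k):
        # Fraction of the top-K retrieved images whose class matches the query sketch.
        top_k = np.asarray(retrieved_labels[:k])
        return float(np.mean(top_k == query_label))

    # e.g., average precision_at_k over the 25 car sketches and the 25 bicycle sketches.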

  16. Painting-to-Image Matching • We collected a dataset of 50 paintings of outdoor scenes spanning a diverse set of painting styles and geographical locations. • The retrieval set was sub-sampled from the 6.4M GPS-tagged Flickr images of [Hays and Efros 2008]. • For each query, we created a set of 5,000 images randomly sampled within a 50-mile radius of the painting’s location.
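
A minimal sketch of how that geographic sub-sampling step could look; the haversine distance and the (image_id, lat, lon) record layout are assumptions for illustration, not details from the paper:

    import math
    import random

    def haversine_miles(lat1, lon1, lat2, lon2):
        # Great-circle distance in miles between two GPS coordinates.
        r = 3958.8  # mean Earth radius in miles
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp = math.radians(lat2 - lat1)
        dl = math.radians(lon2 - lon1)
        a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
        return 2 * r * math.asin(math.sqrt(a))

    def sample_retrieval_set(tagged_images, query_lat, query_lon, radius=50.0, n=5000):
        # tagged_images: iterable of (image_id, lat, lon) records from a GPS-tagged collection.
        nearby = [rec for rec in tagged_images
                  if haversine_miles(rec[1], rec[2], query_lat, query_lon) <= radius]
        return random.sample(nearby, min(n, len(nearby)))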

  17. Applications • Internet Re-photography

  18. Painting2GPS

  19. Visual Scene Exploration

  20. Limitations • Two main failure modes: • We fail to find a good match due to the relatively small size of our dataset (10,000 images) compared to Google’s billions of indexed images. • The query scene is so cluttered that it is difficult for any algorithm to decide which parts of the scene (the car, the people on the sidewalk, the building in the background) it should focus on.

  21. Thank you!
