1 / 64

Sensible Visual Search

Sensible Visual Search. Shih-Fu Chang Digital Video and Multimedia Lab Columbia University www.ee.columbia.edu/dvmm June 2008 (Joint Work with Eric Zavesky and Lyndon Kennedy). User Expectation for Web Search. Keyword search is still the primary search method

fedora
Télécharger la présentation

Sensible Visual Search

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Sensible Visual Search Shih-Fu Chang Digital Video and Multimedia Lab Columbia University www.ee.columbia.edu/dvmm June 2008 (Joint Work with Eric Zavesky and Lyndon Kennedy)

  2. User Expectation for Web Search • Keyword search is still the primary search method • Straightforward extension to visual search “…type in a few words at most, then expect the engine to bring back the perfect results. More than 95 percent of us never use the advanced search features most engines include, …” – The Search, J. Battelle, 2003

  3. Keyword-based Visual Search Paradigm

  4. Web Image Search • Text Query : “Manhattan Cruise” over Goggle Image • What are in the results? • Why are these images returned? • How to choose better search terms?

  5. Minor Changes in Keywords  Big Difference • Text Query : “Cruise around Manhattan”

  6. + - Semantic Indexes Anchor Snow Soccer Building Outdoor When metadata are unavailable:Automatic Image Classification • Audio-visual features • Geo, social features • SVM or graph models • Context fusion • Rich semantic description based on content analysis . . . Statistical models

  7. A few good detectors for LSCOM concepts waterfront bridge crowd explosion fire US flag Military personnel Remember there are many not so good detectors.

  8. Keyword Search over Statistical Detector Scores Columbia374: Objects, people, location, scenes, events, etc Concepts defined by expert analysts over news video www.ee.columbia.edu/cuvidsearch

  9. Query “car crash snow” over TRECVID video using LSCOM concepts • How are keywords mapped to concepts? • What classifiers work? What don’t? • How to improve the search terms?

  10. Frustration of Uninformed Users of Keyword Search • Difficult to choose meaningful words/concepts without in-depth knowledge of entire vocabulary

  11. Pains of Uninformed Users • Forced to take “one shot” searches, iterating queries with a trial and error approach...

  12. Challenge: user frustration in visual search ? Research needed to address user frustration A lot of work on content analytics

  13. Proposal: Sensible SearchMake search experience more Sensible • Help users stay “informed” • in selecting effective keywords/concepts • in understanding the search results • in manipulating the search criteria rapidly and flexibly • Keep users engaged • Instant feedback with minimal disruptionas opposed to “trial-and-error”

  14. A prototype CuZero: Zero-Latency Informed Search & Navigation

  15. Informed User: Instant Informed Query Formulation

  16. Informed User for Visual Search:Instant visual concept suggestion Instant Concept Suggestion query time concept mining

  17. Lexical mapping LSCOM Mapping keywords to concept definition, synonyms, sense context, etc

  18. Co-occurrent concepts car images road Baghdad to attend the game I see more goals and the players did not offer great that Beijing Games as the beginning of his brilliance Nayyouf 10 this atmosphere the culture of sports championship George Rizq led Hertha for the local basketball game the wisdom and sports championship of the president Basketballcourts and the American won the Saint Denis on the Phoenix Suns because of the 50 point for 19 in their role within the National Association of Basketball text

  19. results visual mining dominant concepts person suits Query-Time Concept Mining

  20. CuZero Real-Time Query Interface (demo) Auto-completespeech transcripts Instant Concept Suggestion

  21. A prototype CuZero: Zero-Latency Informed Search & Navigation (Zavesky and Chang, MIR2008)

  22. CMU Informedia Concept Filter only outdoor only people Informed User:Intuitive Exploration of Results linear browser restricts inspection flexibility

  23. Informed User:Rapid Exploration of Results Media Mill Rotor Browser

  24. Revisit the user struggle… Car detector Snow detector How did each concept influence the results? Car crash detector Query: {car, snow, car_crash}

  25. “sky” “water” “boat” CuZero:Real-Time Multi-Concept Navigation Map • Create a multi-concept gradient map • Direct user control: “nearness” = “more influence” • Instant display for each location, without new query

  26. Achieve Breadth-Depth Flexibilityby Dual Space Navigation (demo) • Breadth: Quick scan of many permutations • Depth: Instant exploration of results with fixed weights many query permutations Deep exploration of single permutation

  27. Latency Analysis: Workflow Pipeline execute query and download ranked concept list package results with scores transmit to client unpackage results at interface score images by concept weights; guarantee unique positions download images to interface in cached mode time to execute is disproportional!

  28. Pipelined processing for low latency Concept formulation (“car”) concept formulation (“snow”) • Overlap (concept formulation) with (map rendering) • Hide rendering latency during user interaction • Course-to-fine concept map planning/rendering • Speed optimization on-going …

  29. Challenge: user frustration in visual search ? Research needed to address user frustration sensible search:(1) query (2) visualize + (3) analyze

  30. Help Users Make Sense of Image Trend Query: “John Kennedy” • Many re-used content found • How did it occur? • What manipulations? • What distribution path? • Correlation with perspective change?

  31. Manipulationcorrelated withPerspective Anti-Vietnam War,Ronald and Karen Bowen, 1969 Raising the Flag on Iwo Jima Joe Rosenthal, 1945

  32. Reused Images Over Time

  33. Issue a text query Find duplicate images, merge into clusters Explore history/trend Get top 1000 results from web search engine Rank clusters (size?, original rank?) Question for Sensible Search: Insights from Plain Search Results?

  34. Duplicate Clusters Reveal Image Provenance Biggest Clusters Contain Iconic Images Smallest Clusters Contain Marginal Images

  35. Deeper Analysis of Search Results: Visual Migration Map (VMM) (Kennedy and Chang, ACM Multimedia 2008) Visual Migration Map Duplicate Cluster

  36. “Most Original” at the root Visual Migration Map (VMM) Images Derived through Series of Manipulations VMM uncovers history of image manipulation and plausible dissemination paths among content owners and users. “Most Divergent” at the leaves

  37. Ground truth VMM is hard to get • Hypothesis • Approximation of history is feasible by visual analysis. • Detect manipulation types between two images • Derive large scale history among a large image set

  38. Cropped Insertion Original Overlay Scaled Gray Basic Image Manipulation Operators • Each is observable by inspecting the pair • Each implies direction (one image derived from other) • Other possible manipulations: color correction, multiple compression, sharpening, blurring

  39. Detecting Near-Duplicates • Duplicate detection is very useful and relatively reliable • Remaining challenges: scalability/speed; video duplicates; object (sub-image) (TRECVID08) Graph Matching [Zhang & Chang, 2004] Matching SIFT points [Lowe, 1999]

  40. Scale Detection • Draw bounding box around matching points in each image • Compare heights/widths of each box • Relative difference in box size can be used to normalize scales

  41. Color Removal • Simple case: image stored in single channel file • Other cases: image is grayscale, but stored in 3-channel file • Expect little difference in values in various channels within pixels

  42. More Challenging:Overlay Detection? • Given two images, we can observe that a region is different between the two • But how do we know which is the original?

  43. Cropping or Insertion? • Can find differences in image area • But is the smaller-area due to a crop or is the larger area due to an insertion? Original Cropping Insertion

  44. Use Context from Many Duplicates Normalize Scales and Positions Get average value for each pixel “Composite” image

  45. Image B Composite B Residue B Cropping Detection w/ Context • In cropping, we expect the content outside the crop area to be consistent with the composite image Image A Composite A Residue A

  46. Image B Composite B Residue B Overlay Detection w/ Context • Comparing images against composite image reveals portions that differ from typical content • Image with divergent content may have overlay Image A Composite A Residue A

  47. Image B Composite B Residue B Insertion Detection w/ Context • In insertion, we expect the area outside the crop region to be different from the typical content Image A Composite A Residue A

  48. Evaluation: Manipulation Detection Context-Free Context-Dependent • Context-Free detectors have near-perfect performance • Context-Dependent detectors still have errors • Consistency checking can further improve the accuracy • Are these error-prone results sufficient to build manipulation histories?

  49. r Not Plausible Inferring Direction from Consistency

  50. a Plausible Manipulation Direction from Consistency

More Related