
Using Analogy to Discover the Meaning of Pictures

Melanie Mitchell, Computer Science Department, Portland State University, and External Professor, Santa Fe Institute


Presentation Transcript


  1. Using Analogy to Discover the Meaning of Pictures. Melanie Mitchell, Computer Science Department, Portland State University, and External Professor, Santa Fe Institute

  2. An image-understanding task:

  3. High-level perception (“Meaning”); ?; Simple segmentation; Color, shape, texture; Object recognition; Pattern recognition; Low-level vision

  4. High-level perception (“Meaning”); Simple segmentation; Color, shape, texture; Object recognition; Pattern recognition; Low-level vision; The “SEMANTIC GAP”

  5. The HMAX model for object recognition (Serre, Wolf, Bileschi, Riesenhuber, and Poggio, 2006)

  6. Gabor Filters. Gabor filter: essentially a localized Fourier transform in the image. Each filter has an associated frequency, scale s, and orientation. The response measures the extent to which that frequency is present at that orientation and scale s, centered about pixel (x, y).
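A filter of this kind can be sketched in a few lines. This is a minimal illustration, not the HMAX implementation: the cosine-times-Gaussian form and the parameter names (`wavelength`, `sigma`, `theta`, `gamma`) are assumptions standing in for the symbols lost from the slide, and the helper names are hypothetical.

```python
import math

def gabor_kernel(size, wavelength, sigma, theta, gamma=0.3):
    """Build a size x size Gabor kernel (size should be odd) as a list of rows."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoid varies along orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope localizes the filter; sigma sets its scale.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            # Sinusoidal carrier selects one spatial frequency (1 / wavelength).
            carrier = math.cos(2 * math.pi * xr / wavelength)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def filter_response(image, kernel, cx, cy):
    """Absolute filter response centered at pixel (cx, cy); outside pixels count as 0."""
    half = len(kernel) // 2
    total = 0.0
    for ky, row in enumerate(kernel):
        for kx, weight in enumerate(row):
            iy, ix = cy + ky - half, cx + kx - half
            if 0 <= iy < len(image) and 0 <= ix < len(image[0]):
                total += weight * image[iy][ix]
    return abs(total)

# Toy image (an assumption for illustration): intensity varies along x with period 4.
stripes = [[math.cos(2 * math.pi * x / 4) for x in range(9)] for _ in range(9)]

kernel_0 = gabor_kernel(7, wavelength=4.0, sigma=2.0, theta=0.0)
kernel_90 = gabor_kernel(7, wavelength=4.0, sigma=2.0, theta=math.pi / 2)

r_aligned = filter_response(stripes, kernel_0, cx=4, cy=4)
r_orthogonal = filter_response(stripes, kernel_90, cx=4, cy=4)
```

On this toy image the filter whose orientation matches the direction of variation responds far more strongly than the one rotated by 90 degrees, which is the selectivity the slide describes.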

  7. S1 units: Gabor filters (one per pixel); 16 scales/frequencies, 4 orientations

  8. C1 units: maximum value of a group of S1 units, pooled over slightly different positions and scales; 8 scales/frequencies, 4 orientations
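The pooling step can be sketched as a maximum over a small spatial block and over adjacent scales. This is a simplified sketch under stated assumptions: the S1 maps passed in share one orientation and one shape, the pooling blocks do not overlap, and `c1_pool` and `pool_size` are hypothetical names, not taken from the HMAX code.

```python
def c1_pool(s1_maps, pool_size):
    """Max-pool S1 response maps over position and scale.

    s1_maps: list of 2-D lists, one per adjacent scale, all the same shape.
    Returns one 2-D map holding, for each pool_size x pool_size block,
    the largest response found in that block across all given scales.
    """
    height, width = len(s1_maps[0]), len(s1_maps[0][0])
    pooled = []
    for top in range(0, height, pool_size):
        row = []
        for left in range(0, width, pool_size):
            row.append(max(
                s1[y][x]
                for s1 in s1_maps
                for y in range(top, min(top + pool_size, height))
                for x in range(left, min(left + pool_size, width))
            ))
        pooled.append(row)
    return pooled

# Two toy S1 maps standing in for two adjacent scales.
scale_a = [[1, 2, 0, 0],
           [3, 4, 0, 0],
           [0, 0, 5, 1],
           [0, 0, 1, 1]]
scale_b = [[0, 0, 9, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0],
           [7, 0, 0, 0]]

c1_map = c1_pool([scale_a, scale_b], pool_size=2)
```

Pooling pairs of adjacent scales this way is consistent with the slide's move from 16 S1 scales to 8 C1 scales.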

  9. S2 units: Radial Basis Functions over “Natural Image Patches”
     • Idea is that natural images contain universal, low-level features that are useful in classifying objects.
     • Randomly sample small “crops” from natural images, and feed them through S1 and C1 layers.
     • Collect a set of N patches, {P_i | i = 1, ..., N}, of C1 layer from this random sample.
     • Now, with a new image, a unit S2_i corresponding to P_i gets input X from C1 layer and computes a radial basis function of X and P_i.
     • Gives “degree” to which feature P_i is present in input X.
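The slide's formula did not survive in the transcript. A common choice for such a unit is a Gaussian radial basis function of the squared distance between the input and the stored patch; the sketch below assumes that form, and the name `s2_response` and the sharpness parameter `beta` are hypothetical.

```python
import math

def s2_response(x, patch, beta=1.0):
    """Gaussian RBF response of one S2 unit.

    x and patch are flattened C1 values of equal length. The response is 1.0
    when the input equals the stored patch and decays toward 0 as they differ.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, patch))
    return math.exp(-beta * squared_distance)

stored_patch = [0.2, 0.8, 0.5]   # toy prototype P_i
same_input = [0.2, 0.8, 0.5]     # identical to the prototype
shifted_input = [1.2, 0.8, 0.5]  # differs by 1.0 in one component

r_same = s2_response(same_input, stored_patch)
r_shifted = s2_response(shifted_input, stored_patch)
```

A larger `beta` makes the unit more selective, so only inputs very close to the stored patch produce a high response.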

  10. C2 units: Compute maximum over groups of S2 inputs
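As a rough sketch, if each stored patch yields many S2 responses (one per position and scale), the C2 stage keeps only the best one per patch. The data layout and the name `c2_features` are assumptions made for illustration.

```python
def c2_features(s2_responses):
    """Reduce S2 responses to one value per prototype.

    s2_responses[i] lists every response of prototype i across positions
    and scales; the C2 value is the maximum of that group.
    """
    return [max(responses) for responses in s2_responses]

# Toy responses for three prototypes.
toy_s2 = [
    [0.1, 0.7, 0.3],
    [0.9, 0.2],
    [0.05, 0.05, 0.4, 0.1],
]

feature_vector = c2_features(toy_s2)
```

The resulting list has one entry per prototype regardless of image size, which is what makes it usable as a fixed-length feature vector for a classifier.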

  11. Feature vector representing image, then Support Vector Machine classification

  12. Object detection (here, “car”) with HMAX model (Bileschi, 2006)

  13. Sample of results from Poggio model (Serre et al., 2006; Bileschi, 2006)

  14. Question: Is this a picture of “dog walking”?

  15. Can we use a simple ontology to answer this question? “Dog walking”: Person holds leash; leash attached to Dog; Person action: walking; Dog action: walking.

  16. But...

  17. Can we use a simple ontology to answer this question? “Dog walking”: Person holds leash; leash attached to Dog or Dogs; Person action and Dog action: walking or running.

  18. But...

  19. Can we use a simple ontology to answer this question? “Dog walking”: Person holds leash; leash attached to Dog, Dogs, Cat, or Iguana; Person action and Dog action: walking or running.

  20. But...

  21. But...

  22. But...

  23. But...

  24. Can we use a simple ontology to answer this question? “Dog walking”: Person, Helicopter, Bicycle, or Car holds leash; leash attached to Dog, Dogs, Cat, or Iguana; Person action and Dog action: walking or running.
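One way to read the ontology slides is as a table of roles, each with an explicit list of allowed fillers. The sketch below encodes that reading; the role names (`walker`, `walked`, `action`, `link`) and the matching function are hypothetical, and the grouping of fillers into roles is an interpretation of the flattened diagram.

```python
# Each role lists the fillers the slides have added so far.
DOG_WALKING = {
    "walker": {"person", "helicopter", "bicycle", "car"},
    "walked": {"dog", "dogs", "cat", "iguana"},
    "action": {"walking", "running"},
    "link": {"leash"},
}

def matches_dog_walking(scene):
    """True only if every role in the scene is filled by a listed filler."""
    return all(scene.get(role) in allowed for role, allowed in DOG_WALKING.items())

typical = {"walker": "person", "walked": "dog", "action": "walking", "link": "leash"}
stretched = {"walker": "bicycle", "walked": "iguana", "action": "running", "link": "leash"}
# "rope" is a made-up filler that the ontology has not enumerated.
unlisted = {"walker": "person", "walked": "dog", "action": "walking", "link": "rope"}

results = [matches_dog_walking(s) for s in (typical, stretched, unlisted)]
```

Every new picture that stretches the concept forces another entry into the table, and a scene with any unlisted filler is rejected outright, which is the brittleness the repeated “But...” slides point at.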

  25. But...

  26. Why is image understanding hard for computers?

  27. Why is image understanding hard for computers? • It is vastly open-ended.

  28. Dog grooming, Fanny pack, Dog walking, Gasoline, Lawn mower, Sidewalk, Beach, Stick, Inside, Runway, Sky, Helicopter, Leash, Army, Grass, Airplane, Dog, Outside, Person, Ground, Holding, Attached to, Tree, Backpack, Car, Far from, Close to, Standing, Running, Above, Left of, Walking, Track

  29. Why is image understanding hard for computers?
     • It is vastly open-ended.
     • Can’t solve by feeding the image’s feature vector to all known “object classifiers”; in general there are too many such classifiers, and they are too imperfect! (Compare with StreetScenes system.)
     • In general can’t even construct a high-level “feature vector” ahead of time, since there are too many possible features and you don’t know which features are relevant.
     • Need dynamics! Need to construct a “probable”, coherent, consistent representation of the picture at “recognition time”. Construction process must allow different parts of the representation to influence one another dynamically.

  30. In constructing the representation, need to limit exploration of features to the most promising possibilities, but how do you know which ones are promising without exploring them?
     • Need prior, higher-level knowledge to interact with lower-level vision in both directions (bottom-up and top-down).
     • Need to allow prior knowledge to be “fluid”: allow concepts to “slip”. Need to perceive essential similarity in the face of superficial differences (analogy-making).
     • In short, need “active symbols”: concepts with dynamic activation (relevance) that can be activated by other active symbols, spread activation to conceptual neighbors, and that can push for themselves to be instantiated in the perception of a situation.
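The spreading-activation part of “active symbols” can be sketched as a weighted graph in which each concept passes part of its activation to its neighbors while its own activation decays. This is a minimal sketch, not the architecture from the talk: the update rule, the link weights, and the names `spread_activation`, `rate`, and `decay` are all assumptions, and the sketch does not model concepts pushing to be instantiated.

```python
def spread_activation(activation, links, rate=0.5, decay=0.1):
    """One synchronous update of concept activations.

    activation: dict mapping concept -> level in [0, 1].
    links: dict mapping source concept -> {neighbor: link weight}.
    Each concept keeps (1 - decay) of its level and gains rate times the
    weighted activation arriving from concepts linked to it, capped at 1.0.
    """
    updated = {}
    for concept, level in activation.items():
        incoming = sum(
            activation.get(source, 0.0) * neighbors.get(concept, 0.0)
            for source, neighbors in links.items()
        )
        updated[concept] = min(1.0, level * (1 - decay) + rate * incoming)
    return updated

# Toy concept network with made-up weights.
links = {
    "leash": {"dog": 0.8, "dog walking": 0.6},
    "dog": {"dog walking": 0.5},
}
start = {"leash": 1.0, "dog": 0.0, "dog walking": 0.0, "helicopter": 0.0}

after_one = spread_activation(start, links)
after_two = spread_activation(after_one, links)
```

Here noticing a leash raises the relevance of “dog” and “dog walking” while the unrelated “helicopter” stays inactive, so exploration can be biased toward the concepts that currently look promising.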

  31. Active Symbol Architectures (Hofstadter et al.): Concept network; “Top-down” perceptual agents (codelets); Workspace; Temperature; “Bottom-up” perceptual agents (codelets)

  32. Architecture of Copycat: Concept network (Slipnet); Workspace: abc ---> abd, iijjkk ---> ?; Perceptual and structure-building agents (codelets); Temperature

  33. Idealizing analogy-making

  34. Idealizing analogy-making: abc ---> abd; ijk ---> ?

  35. Idealizing analogy-making: abc ---> abd; ijk ---> ijl (replace rightmost letter by successor)

  36. Idealizing analogy-making: abc ---> abd; ijk ---> ijl (replace rightmost letter by successor) or ijd (replace rightmost letter by ‘d’)

  37. Idealizing analogy-making: abc ---> abd; ijk ---> ijl (replace rightmost letter by successor) or ijd (replace rightmost letter by ‘d’) or ijk (replace all ‘c’s by ‘d’s)

  38. Idealizing analogy-making: abc ---> abd; ijk ---> ijl (replace rightmost letter by successor) or ijd (replace rightmost letter by ‘d’) or ijk (replace all ‘c’s by ‘d’s) or abd (replace any string by ‘abd’)
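The four candidate rules can be written as string functions to check that each one explains abc ---> abd yet gives a different answer for ijk. This is only an illustration of the rules as stated on the slides, not of how Copycat finds them; the helper names are hypothetical, and the successor of ‘z’ is not handled.

```python
def successor(letter):
    """Next letter in the alphabet (no wrap-around past 'z')."""
    return chr(ord(letter) + 1)

# Each rule is a candidate description of the change abc ---> abd.
RULES = {
    "replace rightmost letter by successor": lambda s: s[:-1] + successor(s[-1]),
    "replace rightmost letter by 'd'": lambda s: s[:-1] + "d",
    "replace all 'c's by 'd's": lambda s: s.replace("c", "d"),
    "replace any string by 'abd'": lambda s: "abd",
}

# Every rule reproduces the example, so the example alone cannot pick one.
explains_example = all(rule("abc") == "abd" for rule in RULES.values())

# Applied to ijk, the rules disagree.
answers = {name: rule("ijk") for name, rule in RULES.items()}
```

All four rules are equally consistent with the single example; choosing among them depends on which description a perceiver finds most natural.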

  39. Idealizing analogy-making: abc ---> abd; iijjkk ---> ?

  40. Idealizing analogy-making: abc ---> abd; iijjkk ---> iijjkl (replace rightmost letter by successor)

  41. Idealizing analogy-making: abc ---> abd; iijjkk ---> ?

  42. Idealizing analogy-making: abc ---> abd; iijjkk ---> iijjll (replace rightmost “letter” by successor)
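The difference between the two answers for iijjkk comes down to what counts as a “letter”. The sketch below, with hypothetical helper names, applies the same rule once to single characters and once to runs of identical characters treated as units.

```python
from itertools import groupby

def successor(letter):
    """Next letter in the alphabet (no wrap-around past 'z')."""
    return chr(ord(letter) + 1)

def split_into_groups(string):
    """Split a string into runs of identical letters, e.g. 'iijjkk' -> ['ii', 'jj', 'kk']."""
    return ["".join(run) for _, run in groupby(string)]

def replace_rightmost_letter(string):
    """Literal reading: only the last character changes."""
    return string[:-1] + successor(string[-1])

def replace_rightmost_group(string):
    """Group reading: the last run of identical letters changes as a unit."""
    groups = split_into_groups(string)
    last = groups[-1]
    groups[-1] = successor(last[0]) * len(last)
    return "".join(groups)

literal_answer = replace_rightmost_letter("iijjkk")
group_answer = replace_rightmost_group("iijjkk")
```

On abc both readings agree, because every run has length one; they diverge only when the target string invites grouping.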

  43. Idealizing analogy-making: abc ---> abd; kji ---> ?

  44. Idealizing analogy-making: abc ---> abd; kji ---> kjj (replace rightmost letter by successor)

  45. Idealizing analogy-making: abc ---> abd; kji ---> ?

  46. Idealizing analogy-making: abc ---> abd; kji ---> lji (replace “rightmost” letter by successor)

  47. Idealizing analogy-making: abc ---> abd; kji ---> ?

  48. Idealizing analogy-making: abc ---> abd; kji ---> ?

  49. Idealizing analogy-making: abc ---> abd; kji ---> kjh (replace rightmost letter by “successor”)
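The three answers for kji correspond to three readings of the same rule, depending on which concept is allowed to slip. The sketch below parameterizes the rule by position and direction; the function and parameter names are hypothetical, and it shows only the outcome of each slippage, not how a system would choose one.

```python
def change_letter(string, position, step):
    """Shift one letter of the string.

    position: "rightmost" or "leftmost".
    step: +1 for successor, -1 for predecessor.
    """
    index = len(string) - 1 if position == "rightmost" else 0
    shifted = chr(ord(string[index]) + step)
    return string[:index] + shifted + string[index + 1:]

# Literal reading: rightmost letter, successor.
literal = change_letter("kji", "rightmost", +1)
# "rightmost" slips to "leftmost" because kji runs in the opposite direction.
position_slip = change_letter("kji", "leftmost", +1)
# "successor" slips to "predecessor" for the same reason.
relation_slip = change_letter("kji", "rightmost", -1)
```

Each slipped reading keeps the rule's structure and swaps one concept for its opposite, which is the kind of fluidity the earlier “active symbols” slide asks for.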
