
Bayesian models of human inductive learning Josh Tenenbaum MIT






Presentation Transcript


  1. Bayesian models of human inductive learning Josh Tenenbaum MIT Department of Brain and Cognitive Sciences Computer Science and AI Lab (CSAIL)

  2. Collaborators Vikash Mansinghka Tom Griffiths Pat Shafto Charles Kemp Takeshi Yamada Chris Baker Naonori Ueda Lauren Schmidt Funding: US NSF, AFOSR, ONR, DARPA, NTT Communication Sciences Laboratories, Schlumberger, Eli Lilly & Co., James S. McDonnell Foundation

  3. The probabilistic revolution in AI • Principled and effective solutions for inductive inference from ambiguous data: • Vision • Robotics • Machine learning • Expert systems / reasoning • Natural language processing • Standard view: no necessary connection to how the human brain solves these problems.

  4. Bayesian models of cognition Visual perception [Weiss, Simoncelli, Adelson, Richards, Freeman, Feldman, Kersten, Knill, Maloney, Olshausen, Jacobs, Pouget, ...] Language acquisition and processing [Brent, de Marcken, Niyogi, Klein, Manning, Jurafsky, Keller, Levy, Hale, Johnson, Griffiths, Perfors, Tenenbaum, …] Motor learning and motor control [Ghahramani, Jordan, Wolpert, Kording, Kawato, Doya, Todorov, Shadmehr,…] Associative learning [Dayan, Daw, Kakade, Courville, Touretzky, Kruschke, …] Memory [Anderson, Schooler, Shiffrin, Steyvers, Griffiths, McClelland, …] Attention [Mozer, Huber, Torralba, Oliva, Geisler, Yu, Itti, Baldi, …] Categorization and concept learning [Anderson, Nosofsky, Rehder, Navarro, Griffiths, Feldman, Tenenbaum, Rosseel, Goodman, Kemp, Mansinghka, …] Reasoning [Chater, Oaksford, Sloman, McKenzie, Heit, Tenenbaum, Kemp, …] Causal inference [Waldmann, Sloman, Steyvers, Griffiths, Tenenbaum, Yuille, …] Decision making and theory of mind [Lee, Stankiewicz, Rao, Baker, Goodman, Tenenbaum, …]

  5. Everyday inductive leaps How can people learn so much about the world from such limited evidence? • Learning concepts from examples “horse” “horse” “horse”

  6. “tufa” “tufa” “tufa” Learning concepts from examples

  7. Everyday inductive leaps How can people learn so much about the world from such limited evidence? • Kinds of objects and their properties • The meanings of words, phrases, and sentences • Cause-effect relations • The beliefs, goals and plans of other people • Social structures, conventions, and rules

  8. The solution Strong prior knowledge (inductive bias).

  9. What is the relation between y and x?

  10. What is the relation between y and x?

  11. What is the relation between y and x?

  12. What is the relation between y and x?

  13. The solution Strong prior knowledge (inductive bias). • How does background knowledge guide learning from sparsely observed data? • What form does the knowledge take, across different domains and tasks? • How is that knowledge itself learned? Our goal: Computational models that answer these questions, with strong quantitative fits to human behavioral data and a bridge to state-of-the-art AI and machine learning.

  14. The approach: from statistics to intelligence 1. How does background knowledge guide learning from sparsely observed data? Bayesian inference. 2. What form does background knowledge take, across different domains and tasks? Probabilities defined over structured representations: graphs, grammars, predicate logic, schemas, theories. 3. How is background knowledge itself acquired, constraining learning while maintaining flexibility? Hierarchical probabilistic models, with inference at multiple levels of abstraction. Nonparametric models in which complexity grows automatically as the data require.

  15. Basics of Bayesian inference • Bayes’ rule: P(h|d) = P(d|h) P(h) / Σh′ P(d|h′) P(h′) • An example • Data d: John is coughing • Some hypotheses: 1. John has a cold 2. John has lung cancer 3. John has a stomach flu • Likelihood P(d|h) favors 1 and 2 over 3 • Prior probability P(h) favors 1 and 3 over 2 • Posterior probability P(h|d) favors 1 over 2 and 3
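The coughing example can be worked through numerically. A minimal sketch: the prior and likelihood values below are purely illustrative assumptions, chosen only to match the qualitative orderings on the slide.

```python
# Illustrative numbers (assumptions, not from the slides): the likelihood
# favors cold and lung cancer, while the prior favors cold and stomach flu.
priors = {"cold": 0.5, "stomach flu": 0.49, "lung cancer": 0.01}     # P(h)
likelihoods = {"cold": 0.8, "stomach flu": 0.1, "lung cancer": 0.9}  # P(d|h), d = "John is coughing"

# Bayes' rule: P(h|d) = P(d|h) P(h) / sum over h' of P(d|h') P(h')
unnorm = {h: priors[h] * likelihoods[h] for h in priors}
Z = sum(unnorm.values())
posterior = {h: p / Z for h, p in unnorm.items()}

best = max(posterior, key=posterior.get)  # "cold" dominates the posterior
```

As on the slide, the posterior favors the cold hypothesis because it is the only one that scores well under both the prior and the likelihood.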

  16. “Similarity”, “Typicality”, “Diversity” Property induction • How likely is the conclusion, given the premises? Argument 1: Gorillas have T9 hormones. Seals have T9 hormones. Squirrels have T9 hormones. Therefore, flies have T9 hormones. Argument 2: Gorillas have T9 hormones. Seals have T9 hormones. Squirrels have T9 hormones. Therefore, horses have T9 hormones. Argument 3: Gorillas have T9 hormones. Chimps have T9 hormones. Monkeys have T9 hormones. Baboons have T9 hormones. Therefore, horses have T9 hormones.

  17. The computational problem (“Transfer Learning”, “Semi-Supervised Learning”): given observed features and a new, sparsely observed property over the species (Horse, Cow, Chimp, Gorilla, Mouse, Squirrel, Dolphin, Seal, Rhino, Elephant), predict the missing entries. Features: 85 features for 50 animals (Osherson et al.): e.g., for Elephant: ‘gray’, ‘hairless’, ‘toughskin’, ‘big’, ‘bulbous’, ‘longleg’, ‘tail’, ‘chewteeth’, ‘tusks’, ‘smelly’, ‘walks’, ‘slow’, ‘strong’, ‘muscle’, ‘fourlegs’,…

  18. Premises X: Horses have T9 hormones. Rhinos have T9 hormones. Cows have T9 hormones. Query Y: which of the remaining species (Horse, Cow, Chimp, Gorilla, Mouse, Squirrel, Dolphin, Seal, Rhino, Elephant) have the property? Hypotheses h: candidate extensions of the property over all ten species, weighted by a prior P(h).

  19. Premises X: Horses have T9 hormones. Rhinos have T9 hormones. Cows have T9 hormones. Hypotheses h: candidate extensions of the property over all ten species, weighted by a prior P(h). Prediction P(Y | X): sum the posterior weight of the hypotheses h in which Y holds.
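The prediction step amounts to Bayesian hypothesis averaging: P(Y | X) is the total posterior weight of the property extensions h that contain the observed premises X and also predict Y. A toy sketch with a hand-picked four-species hypothesis space (the sets and prior weights are assumptions for illustration; the actual model derives P(h) from structure over all fifty species):

```python
# Toy hypothesis space: each h is a candidate extension of the property,
# paired with its prior P(h). Sets and weights are invented for illustration.
hypotheses = [
    ({"horse", "cow", "rhino"}, 0.3),
    ({"horse", "cow"}, 0.3),
    ({"horse", "cow", "rhino", "mouse"}, 0.2),
    ({"mouse"}, 0.2),
]

def predict(X, y):
    # P(y | X) = total prior weight of hypotheses consistent with X that
    # contain y, normalized by the weight of all hypotheses consistent with X.
    consistent = [(h, p) for h, p in hypotheses if X <= h]
    Z = sum(p for _, p in consistent)
    return sum(p for h, p in consistent if y in h) / Z

p_rhino = predict({"horse", "cow"}, "rhino")  # 0.5 / 0.8 = 0.625
p_mouse = predict({"horse", "cow"}, "mouse")  # 0.2 / 0.8 = 0.25
```

Observing that horses and cows have the property eliminates the {mouse}-only hypothesis, so the prediction concentrates on extensions that include the premises.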

  20. Hierarchical Bayesian Framework F: form (e.g., a tree with species — mouse, squirrel, chimp, gorilla — at leaf nodes), drawn from P(form). S: structure, drawn from P(structure | form). D: data — observed features F1, F2, F3, F4 plus a partially observed property (“Has T9 hormones”) — drawn from P(data | structure).

  21. P(D|S): How the structure constrains the data of experience • Define a stochastic process over structure S that generates candidate property extensions h. • Intuition: properties should vary smoothly over structure. Smooth: P(h) high Not smooth: P(h) low

  22. P(D|S): How the structure constrains the data of experience S Gaussian Process (~ random walk, diffusion) [Zhu, Lafferty & Ghahramani 2003] y Threshold h

  23. P(D|S): How the structure constrains the data of experience S Gaussian Process (~ random walk, diffusion) [Zhu, Lafferty & Ghahramani 2003] y Threshold h
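One way to read the smoothness intuition above: the latent value y is penalized for changing across each edge of the structure S, and the binary property h is y thresholded. A minimal sketch in pure Python, assuming a toy four-species chain (the edge list and the eps regularizer are illustrative choices, not the paper's exact kernel):

```python
# Sketch of a smoothness prior over a structure S (cf. Zhu, Lafferty &
# Ghahramani 2003): log P(y) is proportional to -1/2 * y^T (L + eps*I) y,
# i.e. -1/2 * (sum over edges (y_i - y_j)^2 + eps * ||y||^2), where L is the
# graph Laplacian. Assumed toy chain: mouse - squirrel - chimp - gorilla.
edges = [(0, 1), (1, 2), (2, 3)]

def log_prior(y, eps=0.1):
    rough = sum((y[i] - y[j]) ** 2 for i, j in edges)  # y^T L y for the chain
    return -0.5 * (rough + eps * sum(v * v for v in y))

# A property h is obtained by thresholding y; smooth y's are more probable.
smooth = log_prior([1, 1, 0, 0])      # changes once along the chain
not_smooth = log_prior([1, 0, 1, 0])  # flips at every edge
```

The labeling that changes once along the chain gets a higher log-prior than the one that flips at every edge, which is exactly the "smooth: P(h) high, not smooth: P(h) low" intuition.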

  24. Structure S Data D Species 1 Species 2 Species 3 Species 4 Species 5 Species 6 Species 7 Species 8 Species 9 Species 10 Features 85 features for 50 animals (Osherson et al.): e.g., for Elephant: ‘gray’, ‘hairless’, ‘toughskin’, ‘big’, ‘bulbous’, ‘longleg’, ‘tail’, ‘chewteeth’, ‘tusks’, ‘smelly’, ‘walks’, ‘slow’, ‘strong’, ‘muscle’, ‘fourlegs’,…

  25. [c.f., Lawrence, 2004; Smola & Kondor 2003]

  26. Structure S Data D Species 1 Species 2 Species 3 Species 4 Species 5 Species 6 Species 7 Species 8 Species 9 Species 10 ? ? ? ? ? ? ? ? Features New property 85 features for 50 animals (Osherson et al.): e.g., for Elephant: ‘gray’, ‘hairless’, ‘toughskin’, ‘big’, ‘bulbous’, ‘longleg’, ‘tail’, ‘chewteeth’, ‘tusks’, ‘smelly’, ‘walks’, ‘slow’, ‘strong’, ‘muscle’, ‘fourlegs’,…

  27. Model predictions vs. human judgments (tree prior vs. 2D spatial prior) for arguments such as: Cows have property P. Elephants have property P. Horses have property P. / Gorillas have property P. Mice have property P. Seals have property P. Therefore, all mammals have property P.

  28. Testing different priors (inductive bias): correct bias, wrong bias, too weak a bias, too strong a bias.

  29. Learning about spatial properties Geographic inference task: “Given that a certain kind of native American artifact has been found in sites near city X, how likely is the same artifact to be found near city Y?” 2D Tree

  30. Hierarchical Bayesian Framework F: form — Chain, Tree, or Space. S: structure of the chosen form over the species (mouse, squirrel, chimp, gorilla). D: data — features F1, F2, F3, F4 for each species.

  31. Discovering structural forms: the same species (Snake, Turtle, Crocodile, Robin, Ostrich, Bat, Orangutan) arranged under different candidate structures.

  32. Discovering structural forms: the “Great chain of being” — a linear order running from Rock and Plant through the animals (Snake, Turtle, Bat, Crocodile, Robin, Ostrich, Orangutan) up to Angel and God — contrasted with Linnaeus’s tree over the same species.

  33. People can discover structural forms • Scientific discoveries: tree structure for biological species, periodic structure for chemical elements (“great chain of being”, 1579; Linnaeus’s Systema Naturae, 1735; 1837) • Children’s cognitive development: hierarchical structure of category labels, clique structure of social groups, cyclical structure of seasons or days of the week, transitive structure for value • Systema Naturae: Kingdom Animalia → Phylum Chordata → Class Mammalia → Order Primates → Family Hominidae → Genus Homo → Species Homo sapiens

  34. Typical structure learning algorithms assume a fixed structural form Flat Clusters Line Circle K-Means Mixture models Competitive learning Guttman scaling Ideal point models Circumplex models Grid Tree Euclidean Space Hierarchical clustering Bayesian phylogenetics Self-Organizing Map Generative topographic mapping MDS PCA Factor Analysis

  35. The ultimate goal “Universal Structure Learner” K-Means Hierarchical clustering Factor Analysis Guttman scaling Circumplex models Self-Organizing maps ··· Data Representation

  36. A “universal grammar” for structural forms: a table pairing each structural form with the generative process that grows it (Form | Process).

  37. F: form — Linear, Tree, or Clusters; the prior over forms favors simplicity. S: structure over the species (mouse, squirrel, chimp, gorilla); the prior over structures favors smoothness [Zhu et al., 2003]. D: data — features F1, F2, F3, F4 for each species.

  38. Development of structural forms as more data are observed “blessing of abstraction”

  39. Summary so far Bayesian inference over hierarchies of structured representations (form F, structure S, data D) provides a framework to understand core questions of human cognition: • What is the content and form of human knowledge, at multiple levels of abstraction? • How does abstract domain knowledge guide learning of new concepts? • How is abstract domain knowledge learned? What must be built in?

  40. Other questions • How can we learn domain structures if we do not already know in advance which features are relevant? • How can we discover richer models of a domain, with multiple ways of structuring objects? • How can we learn models for more complex domains, with not just a single object-property matrix but multiple different types of objects, their properties and relations to each other? • How do these ideas & tools apply to other aspects of cognition, beyond categorizing and predicting the properties of objects?

  41. A single way of structuring a domain rarely describes all its features… Raw data matrix:

  42. A single way of structuring a domain rarely describes all its features… Conventional clustering (CRP mixture):
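The CRP mixture mentioned above gets its "complexity grows as the data require" behavior from the Chinese restaurant process prior over partitions. A minimal sampler sketch (assumed parameter names; not the authors' code):

```python
import random

def sample_crp(n, alpha, rng):
    """Chinese restaurant process: customer i joins an existing table k with
    probability proportional to its occupancy n_k, or opens a new table with
    probability proportional to alpha, so the number of clusters is not fixed
    in advance but grows with the data."""
    tables = []       # tables[k] = list of customers seated at table k
    assignments = []  # assignments[i] = table index of customer i
    for i in range(n):
        weights = [len(t) for t in tables] + [alpha]  # last slot = new table
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(tables):
            tables.append([])  # open a new table
        tables[k].append(i)
        assignments.append(k)
    return assignments

z = sample_crp(20, alpha=1.0, rng=random.Random(0))  # one sampled partition
```

Smaller alpha yields fewer, larger clusters; larger alpha yields more clusters, and the expected number of tables grows roughly logarithmically with n.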

  43. Learning multiple structures to explain different feature subsets (Shafto et al.; Shafto, Mansinghka, Tenenbaum, Yamada & Ueda, 2007) CrossCat: System 1 System 2 System 3

  44. Discovering structure in relational data Input: a 15 × 15 binary relation TalksTo(person, person). Output: the people reordered and grouped so that the relation shows clean block structure.

  45. Infinite Relational Model (IRM) (Kemp, Tenenbaum, Griffiths, Yamada & Ueda, AAAI 06) A partition z of the people into groups, together with a matrix of link probabilities between groups (e.g., 0.9 within a group, 0.1 between groups), generates the observed relation O.
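The core scoring step can be sketched as a collapsed Beta-Bernoulli block likelihood: given a partition z, the links in each group-pair block are i.i.d. Bernoulli with a Beta(a, a) prior on the block's link probability, integrated out in closed form. This is a hedged sketch, not the authors' implementation (the function name and the symmetric Beta(0.5, 0.5) default are assumptions):

```python
import math

def block_loglik(R, z, a=0.5):
    """Log marginal likelihood of binary relation R under partition z,
    with each block's link probability integrated out against Beta(a, a)."""
    n = len(z)
    ll = 0.0
    for g in sorted(set(z)):
        for h in sorted(set(z)):
            cells = [R[i][j] for i in range(n) for j in range(n)
                     if z[i] == g and z[j] == h]
            ones = sum(cells)
            zeros = len(cells) - ones
            # log Beta(a + ones, a + zeros) - log Beta(a, a)
            ll += (math.lgamma(a + ones) + math.lgamma(a + zeros)
                   - math.lgamma(2 * a + len(cells)))
            ll -= 2 * math.lgamma(a) - math.lgamma(2 * a)
    return ll

# Two chatty cliques: the block-respecting partition scores higher than one
# that mixes the cliques together.
R = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
good = block_loglik(R, [0, 0, 1, 1])
bad = block_loglik(R, [0, 1, 0, 1])
```

Searching over partitions (with a CRP prior on z supplying the "infinite" part) then prefers groupings under which each block is near-uniformly on or off, which is what the reordered TalksTo matrix on the previous slide shows.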

  46. Infinite Relational Model (IRM) (Kemp, Tenenbaum, Griffiths, Yamada & Ueda, AAAI 06) concept × predicate × concept: biomedical predicate data from UMLS (McCray et al.): • 134 concepts: enzyme, hormone, organ, disease, cell function ... • 49 predicates: affects(hormone, organ), complicates(enzyme, cell function), treats(drug, disease), diagnoses(procedure, disease) …
