Bayesian models of human learning and reasoning Josh Tenenbaum MIT Department of Brain and Cognitive Sciences Computer Science and AI Lab (CSAIL). Collaborators. Tom Griffiths. Charles Kemp. Noah Goodman. Chris Baker. Amy Perfors. Vikash Mansinghka. Lauren Schmidt. Pat Shafto.


Presentation Transcript


  1. Bayesian models of human learning and reasoning Josh Tenenbaum MIT Department of Brain and Cognitive Sciences Computer Science and AI Lab (CSAIL)

  2. Collaborators Tom Griffiths Charles Kemp Noah Goodman Chris Baker Amy Perfors Vikash Mansinghka Lauren Schmidt Pat Shafto

  3. The probabilistic revolution in AI • Principled and effective solutions for inductive inference from ambiguous data: • Vision • Robotics • Machine learning • Expert systems / reasoning • Natural language processing • Standard view: no necessary connection to how the human brain solves these problems.

  4. Bayesian models of cognition Visual perception [Weiss, Simoncelli, Adelson, Richards, Freeman, Feldman, Kersten, Knill, Maloney, Olshausen, Jacobs, Pouget, ...] Language acquisition and processing [Brent, de Marcken, Niyogi, Klein, Manning, Jurafsky, Keller, Levy, Hale, Johnson, Griffiths, Perfors, Tenenbaum, …] Motor learning and motor control [Ghahramani, Jordan, Wolpert, Kording, Kawato, Doya, Todorov, Shadmehr,…] Associative learning [Dayan, Daw, Kakade, Courville, Touretzky, Kruschke, …] Memory [Anderson, Schooler, Shiffrin, Steyvers, Griffiths, McClelland, …] Attention [Mozer, Huber, Torralba, Oliva, Geisler, Movellan, Yu, Itti, Baldi, …] Categorization and concept learning [Anderson, Nosofsky, Rehder, Navarro, Griffiths, Feldman, Tenenbaum, Rosseel, Goodman, Kemp, Mansinghka, …] Reasoning [Chater, Oaksford, Sloman, McKenzie, Heit, Tenenbaum, Kemp, …] Causal inference [Waldmann, Sloman, Steyvers, Griffiths, Tenenbaum, Yuille, …] Decision making and theory of mind [Lee, Stankiewicz, Rao, Baker, Goodman, Tenenbaum, …]

  5. Everyday inductive leaps How can people learn so much about the world from such limited evidence? • Learning concepts from examples “horse” “horse” “horse”

  6. “tufa” “tufa” “tufa” Learning concepts from examples

  7. Everyday inductive leaps How can people learn so much about the world from such limited evidence? • Kinds of objects and their properties • The meanings of words, phrases, and sentences • Cause-effect relations • The beliefs, goals and plans of other people • Social structures, conventions, and rules

  8. Modeling Goals • Principled quantitative models of human behavior, with broad coverage and a minimum of free parameters and ad hoc assumptions. • Explain how and why human learning and reasoning works, in terms of (approximations to) optimal statistical inference in natural environments. • A framework for studying people’s implicit knowledge about the structure of the world: how it is structured, used, and acquired. • A two-way bridge to state-of-the-art AI and machine learning.

  9. The approach: from statistics to intelligence 1. How does background knowledge guide learning from sparsely observed data? Bayesian inference. 2. What form does background knowledge take, across different domains and tasks? Probabilities defined over structured representations: graphs, grammars, predicate logic, schemas, theories. 3. How is background knowledge itself acquired? Hierarchical probabilistic models, with inference at multiple levels of abstraction. Flexible nonparametric models in which complexity grows with the data.

  10. Outline • Predicting everyday events • Learning concepts from examples • The big picture

  11. Basics of Bayesian inference • Bayes’ rule: • An example • Data: John is coughing • Some hypotheses: • John has a cold • John has lung cancer • John has a stomach flu • Likelihood P(d|h) favors 1 and 2 over 3 • Prior probability P(h) favors 1 and 3 over 2 • Posterior probability P(h|d) favors 1 over 2 and 3
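A minimal numeric sketch of this slide's example; the prior and likelihood values below are invented for illustration (the slide gives no numbers, only the qualitative orderings):

```python
# Hypothetical priors for the coughing example (not from the talk);
# the prior favors cold and stomach flu over lung cancer, as the slide states.
priors = {"cold": 0.6, "lung cancer": 0.001, "stomach flu": 0.399}

# Hypothetical likelihoods P(d | h) of coughing under each hypothesis;
# the likelihood favors cold and lung cancer over stomach flu.
likelihoods = {"cold": 0.8, "lung cancer": 0.9, "stomach flu": 0.1}

# Bayes' rule: P(h | d) ∝ P(d | h) P(h), normalized over the hypotheses.
unnormalized = {h: likelihoods[h] * priors[h] for h in priors}
evidence = sum(unnormalized.values())
posterior = {h: w / evidence for h, w in unnormalized.items()}
```

With these numbers the posterior favors the cold hypothesis over both alternatives, matching the slide's qualitative conclusion.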

  12. Bayesian inference in perception and sensorimotor integration (Weiss, Simoncelli & Adelson 2002) (Kording & Wolpert 2004)

  13. Everyday prediction problems (Griffiths & Tenenbaum, 2006) • You read about a movie that has made $60 million to date. How much money will it make in total? • You see that something has been baking in the oven for 34 minutes. How long until it’s ready? • You meet someone who is 78 years old. How long will they live? • Your friend quotes to you from line 17 of his favorite poem. How long is the poem? • You meet a US congressman who has served for 11 years. How long will he serve in total? • You encounter a phenomenon or event with an unknown extent or duration, t_total, at a random time or value of t < t_total. What is the total extent or duration t_total?

  14. Bayesian analysis P(t_total | t) ∝ P(t | t_total) P(t_total). Assume a random sample: P(t | t_total) = 1/t_total for 0 < t < t_total (else 0). Form of P(t_total)? e.g., uninformative (Jeffreys) prior ∝ 1/t_total.

  15. Bayesian analysis P(t_total | t) ∝ 1/t_total × 1/t_total (random sampling × “uninformative” prior). [Figure: posterior probability P(t_total | t) as a function of t_total, given observed t.] Best guess for t_total: t* such that P(t_total > t* | t) = 0.5. Yields Gott’s Rule: guess t* = 2t.
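A quick numerical check of the rule derived on this slide, using a discretized grid of candidate totals (the grid bounds and the observed value are arbitrary choices for the sketch):

```python
import numpy as np

t = 10.0                                   # observed duration
T = np.linspace(t, 1_000 * t, 2_000_000)   # candidate totals t_total >= t
post = 1.0 / T**2                          # 1/t_total likelihood x 1/t_total prior
post /= post.sum()                         # normalize on the (uniform) grid
cdf = np.cumsum(post)
t_star = T[np.searchsorted(cdf, 0.5)]      # posterior median: P(t_total > t*) = 0.5
# t_star comes out close to 2 * t, i.e., Gott's rule.
```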

  16. Evaluating Gott’s Rule • You read about a movie that has made $78 million to date. How much money will it make in total? • “$156 million” seems reasonable. • You meet someone who is 35 years old. How long will they live? • “70 years” seems reasonable. • Not so simple: • You meet someone who is 78 years old. How long will they live? • You meet someone who is 6 years old. How long will they live?

  17. Priors P(ttotal) based on empirically measured durations or magnitudes for many real-world events in each class: Median human judgments of the total duration or magnitude ttotal of events in each class, given that they are first observed at a duration or magnitude t, versus Bayesian predictions (median of P(ttotal|t)).
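The same posterior-median rule with a calibrated prior can be sketched as follows; the roughly Gaussian lifespan prior and its parameters are illustrative stand-ins, not the fitted empirical distributions from Griffiths & Tenenbaum:

```python
import numpy as np

def predict(t, prior_pdf, grid):
    """Posterior median of t_total given observed t, with likelihood 1/t_total."""
    T = grid[grid >= t]
    post = prior_pdf(T) / T          # P(t_total | t) proportional to P(t_total) / t_total
    cdf = np.cumsum(post) / post.sum()
    return T[np.searchsorted(cdf, 0.5)]

grid = np.linspace(0.5, 120.0, 24_000)
# Illustrative, unnormalized lifespan prior: roughly Gaussian, mean 75, sd 16.
lifespan = lambda T: np.exp(-0.5 * ((T - 75.0) / 16.0) ** 2)

young = predict(6.0, lifespan, grid)    # near the prior median, not 2 * 6
old = predict(78.0, lifespan, grid)     # somewhat above 78, far below 2 * 78
```

Unlike the uninformative-prior case, the predictions no longer scale as t* = 2t: the 6-year-old is predicted a roughly average lifespan, and the 78-year-old only a few more years.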

  18. You learn that in ancient Egypt, there was a great flood in the 11th year of a pharaoh’s reign. How long did he reign?

  19. You learn that in ancient Egypt, there was a great flood in the 11th year of a pharaoh’s reign. How long did he reign? How long did the typical pharaoh reign in ancient Egypt?

  20. Summary: prediction • Predictions about the extent or magnitude of everyday events follow Bayesian principles. • Contrast with Bayesian inference in perception, motor control, memory: no “universal priors” here. • Predictions depend rationally on priors that are appropriately calibrated for different domains. • Form of the prior (e.g., power-law or exponential) • Specific distribution given that form (parameters) • Non-parametric distribution when necessary. • In the absence of concrete experience, priors may be generated by qualitative background knowledge.

  21. Learning concepts from examples • Word learning: “tufa” “tufa” “tufa” • Property induction: Cows have T9 hormones. Seals have T9 hormones. Squirrels have T9 hormones. All mammals have T9 hormones. / Cows have T9 hormones. Sheep have T9 hormones. Goats have T9 hormones. All mammals have T9 hormones.

  22. The computational problem (cf. semi-supervised learning) [Figure: matrix of species (Horse, Cow, Chimp, Gorilla, Mouse, Squirrel, Dolphin, Seal, Rhino, Elephant) by features, plus a new property column with unobserved (?) entries.] Features (85 features from Osherson et al. E.g., for Elephant: ‘gray’, ‘hairless’, ‘toughskin’, ‘big’, ‘bulbous’, ‘longleg’, ‘tail’, ‘chewteeth’, ‘tusks’, ‘smelly’, ‘walks’, ‘slow’, ‘strong’, ‘muscle’, ‘quadrapedal’,…)

  23. Similarity-based models Human judgments of argument strength Model predictions Cows have property P. Elephants have property P. Horses have property P. All mammals have property P. Gorillas have property P. Mice have property P. Seals have property P. All mammals have property P.

  24. Beyond similarity-based induction • Reasoning based on dimensional thresholds (Smith et al., 1993): Poodles can bite through wire. German shepherds can bite through wire. vs. Dobermans can bite through wire. German shepherds can bite through wire. • Reasoning based on causal relations (Medin et al., 2004; Coley & Shafto, 2003): Salmon carry E. Spirus bacteria. Grizzly bears carry E. Spirus bacteria. vs. Grizzly bears carry E. Spirus bacteria. Salmon carry E. Spirus bacteria.

  25. [Figure: premises such as “Horses have T9 hormones”, “Cows have T9 hormones” (X) bear on a conclusion such as “Rhinos have T9 hormones” (Y); hypotheses h are candidate extensions of the property over the ten species, each with prior P(h).]

  26. [Figure: as on the previous slide, premises X and conclusion Y over hypotheses h with prior P(h), now adding the prediction P(Y | X), computed by averaging over the hypotheses consistent with X.]
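A toy version of this hypothesis-averaging computation, over just four species; the prior weights and the choice of "coherent" subsets are illustrative assumptions, not the model's actual prior:

```python
from itertools import combinations

species = ["horse", "cow", "chimp", "gorilla"]

# Illustrative prior: taxonomically coherent subsets get extra weight.
coherent = {frozenset({"horse", "cow"}), frozenset({"chimp", "gorilla"}),
            frozenset(species)}

def prior(h):
    return 4.0 if h in coherent else 1.0

# Hypotheses h: all nonempty candidate extensions of the novel property.
hypotheses = [frozenset(c) for r in range(1, len(species) + 1)
              for c in combinations(species, r)]

def p_property(y, observed):
    """P(y has the property | the observed species have it): the posterior
    mass on hypotheses containing y, with P(X | h) = 1 iff X is a subset of h."""
    weights = {h: prior(h) for h in hypotheses if observed <= h}
    z = sum(weights.values())
    return sum(w for h, w in weights.items() if y in h) / z

p_cow = p_property("cow", frozenset({"horse"}))      # same cluster as horse
p_chimp = p_property("chimp", frozenset({"horse"}))  # different cluster
```

Observing that horses have the property raises the prediction more for cows than for chimps, because the high-prior hypotheses respect the taxonomic clusters.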

  27. Where does the prior come from? Horse Cow Chimp Gorilla Mouse Squirrel Dolphin Seal Rhino Elephant ... ... Prior P(h) Why not just enumerate all logically possible hypotheses along with their relative prior probabilities?

  28. Different sources for priors • Chimps have T9 hormones. Gorillas have T9 hormones. (taxonomic similarity) • Poodles can bite through wire. Dobermans can bite through wire. (jaw strength) • Salmon carry E. Spirus bacteria. Grizzly bears carry E. Spirus bacteria. (food web relations)

  29. Hierarchical Bayesian Framework F: form (background knowledge; e.g., a tree with species at leaf nodes), with P(form). S: structure over mouse, squirrel, chimp, gorilla, with P(structure | form). D: data (features F1 F2 F3 F4, e.g., “Has T9 hormones”, with some entries unobserved), with P(data | structure).

  30. The value of structural form knowledge: inductive bias

  31. Hierarchical Bayesian Framework: property induction. F: form (a tree with species at leaf nodes). S: structure over mouse, squirrel, chimp, gorilla. D: data (features F1 F2 F3 F4, e.g., “Has T9 hormones”, with some entries unobserved).

  32. P(D|S): How the structure constrains the data of experience • Define a stochastic process over structure S that generates hypotheses h. • Intuitively, properties should vary smoothly over structure. Smooth: P(h) high Not smooth: P(h) low

  33. P(D|S): How the structure constrains the data of experience. Define y over structure S via a Gaussian Process (~ random walk, diffusion) [Zhu, Ghahramani & Lafferty 2003]; threshold y to obtain the binary hypothesis h.

  34. P(D|S): How the structure constrains the data of experience. Define y over structure S via a Gaussian Process (~ random walk, diffusion) [Zhu, Ghahramani & Lafferty 2003]; threshold y to obtain the binary hypothesis h.
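The smoothness prior on these slides can be sketched with a graph Laplacian; the 5-node chain (standing in for the tree over species) and the epsilon value are arbitrary choices for illustration:

```python
import numpy as np

# Graph Laplacian of a 5-node chain (a stand-in for the tree structure S).
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
n = 5
L = np.zeros((n, n))
for i, j in edges:
    L[i, i] += 1.0; L[j, j] += 1.0
    L[i, j] -= 1.0; L[j, i] -= 1.0

def log_prior(y, eps=0.1):
    """Unnormalized Gaussian-process log-density with precision L + eps*I.
    y^T L y sums (y_i - y_j)^2 over edges, so smooth y scores higher."""
    return -0.5 * y @ (L + eps * np.eye(n)) @ y

smooth = np.array([1.0, 1.0, 1.0, -1.0, -1.0])   # changes once along the chain
jagged = np.array([1.0, -1.0, 1.0, -1.0, 1.0])   # flips at every edge

# Thresholding y at 0 gives the binary hypothesis h; log_prior(smooth)
# exceeds log_prior(jagged), so P(h) is higher for the smooth property.
```

This is the sense in which "properties vary smoothly over structure": hypotheses that cut few edges of the graph receive most of the prior mass.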

  35. [Figure: structure S generates data D, a matrix of Species 1-10 by features.] 85 features for 50 animals (Osherson et al.): e.g., for Elephant: ‘gray’, ‘hairless’, ‘toughskin’, ‘big’, ‘bulbous’, ‘longleg’, ‘tail’, ‘chewteeth’, ‘tusks’, ‘smelly’, ‘walks’, ‘slow’, ‘strong’, ‘muscle’, ‘fourlegs’,…

  36. [cf. Lawrence, 2004; Smola & Kondor, 2003]

  37. [Figure: structure S generates data D, a matrix of Species 1-10 by features, plus a new property with unobserved (?) entries.] 85 features for 50 animals (Osherson et al.): e.g., for Elephant: ‘gray’, ‘hairless’, ‘toughskin’, ‘big’, ‘bulbous’, ‘longleg’, ‘tail’, ‘chewteeth’, ‘tusks’, ‘smelly’, ‘walks’, ‘slow’, ‘strong’, ‘muscle’, ‘fourlegs’,…

  38. Model predictions under Tree vs. 2D priors, for two arguments: Cows have property P. Elephants have property P. Horses have property P. All mammals have property P. / Gorillas have property P. Mice have property P. Seals have property P. All mammals have property P.

  39. Testing different priors: inductive bias. Correct bias, wrong bias, no bias, too strong a bias.

  40. A connectionist alternative (Rogers and McClelland, 2004) Species Features Emergent structure: clustering on hidden unit activation vectors

  41. Reasoning about spatially varying properties “Native American artifacts” task

  42. Property types and their theories: “has T9 hormones”: taxonomic tree + diffusion process; “can bite through wire”: directed chain + drift process; “carry E. Spirus bacteria”: directed network + noisy transmission. [Figure: each theory generates a hypothesis space over Classes A-G.]

  43. Reasoning with two property types (Shafto, Kemp, Bonawitz, Coley & Tenenbaum) “Given that X has property P, how likely is it that Y does?” Biological properties follow a tree; disease properties follow a food web, over herring, tuna, mako shark, sand shark, dolphin, human, and kelp.

  44. Summary so far • A framework for modeling human inductive reasoning as rational statistical inference over structured knowledge representations • Qualitatively different priors are appropriate for different domains of property induction. • In each domain, a prior that matches the world’s structure fits people’s judgments well, and better than alternative priors. • A language for representing different theories: graph structure defined over objects + probabilistic model for the distribution of properties over that graph. • Remaining question: How can we learn appropriate theories for different domains?

  45. Hierarchical Bayesian Framework F: form (Chain, Tree, or Space) S: structure over mouse, squirrel, chimp, gorilla D: data (features F1 F2 F3 F4)

  46. Discovering structural forms [Figure: alternative candidate structures over Snake, Turtle, Crocodile, Robin, Ostrich, Bat, Orangutan.]

  47. Discovering structural forms [Figures: Linnaeus’s tree over Snake, Turtle, Crocodile, Robin, Ostrich, Bat, Orangutan, and the “Great chain of being”, a linear order spanning God, Angel, the animals, Plant, and Rock.]

  48. People can discover structural forms • Scientific discoveries: tree structure for biological species; periodic structure for chemical elements • Children’s cognitive development: hierarchical structure of category labels; clique structure of social groups; cyclical structure of seasons or days of the week; transitive structure for value [Figures dated 1579 (“great chain of being”), 1735 (Systema Naturae: Kingdom Animalia > Phylum Chordata > Class Mammalia > Order Primates > Family Hominidae > Genus Homo > Species Homo sapiens), and 1837.]

  49. Typical structure learning algorithms assume a fixed structural form • Flat Clusters: K-Means, Mixture models, Competitive learning • Line: Guttman scaling, Ideal point models • Circle: Circumplex models • Grid: Self-Organizing Map, Generative topographic mapping • Tree: Hierarchical clustering, Bayesian phylogenetics • Euclidean Space: MDS, PCA, Factor Analysis
