
A Probabilistic Approach to Semantic Representation



  1. A Probabilistic Approach to Semantic Representation Tom Griffiths, Mark Steyvers, Josh Tenenbaum

  2. How do we store the meanings of words? • question of representation • requires efficient abstraction

  3. How do we store the meanings of words? • question of representation • requires efficient abstraction • Why do we store this information? • function of semantic memory • predictive structure

  4. Latent Semantic Analysis (Landauer & Dumais, 1997): a word-document co-occurrence matrix is mapped into a high-dimensional space via the singular value decomposition, X = U D V^T.

                   Doc1   Doc2   Doc3   …
       words         34      0      3   …
       in             0     12      2   …
       semantic       5     19      6   …
       spaces        11      6      1   …
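As a concrete illustration (not the authors' code), here is a minimal numpy sketch of the LSA pipeline on the toy matrix above; the choice of k = 2 dimensions and cosine similarity are standard LSA conventions assumed for illustration:

```python
import numpy as np

# Toy word-document co-occurrence matrix (rows = words, columns = documents),
# mirroring the counts on the slide.
words = ["words", "in", "semantic", "spaces"]
X = np.array([[34,  0, 3],
              [ 0, 12, 2],
              [ 5, 19, 6],
              [11,  6, 1]], dtype=float)

# LSA: factor X = U D V^T and keep the top k singular dimensions.
U, d, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
U_k, d_k = U[:, :k], d[:k]

# Each word is now a point in a k-dimensional space; semantic similarity
# is typically measured by the cosine between word vectors.
word_vecs = U_k * d_k
def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(cosine(word_vecs[2], word_vecs[3]))  # "semantic" vs. "spaces"
```

On real corpora LSA retains a few hundred dimensions; the comparisons later in the talk use 400.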

  5. Mechanistic Claim Some component of word meaning can be extracted from co-occurrence statistics

  6. Mechanistic Claim Some component of word meaning can be extracted from co-occurrence statistics But… • Why should this be true? • Is the SVD the best way to treat these data? • What assumptions are we making about meaning?

  7. Mechanism and Function Some component of word meaning can be extracted from co-occurrence statistics Semantic memory is structured to aid retrieval via context-specific prediction

  8. Functional Claim Semantic memory is structured to aid retrieval via context-specific prediction • Motivates sensitivity to co-occurrence statistics • Identifies how co-occurrence data should be used • Allows the role of meaning to be specified exactly, and finds a meaningful decomposition of language

  9. A Probabilistic Approach • The function of semantic memory • The psychological problem of meaning • One approach to meaning • Solving the statistical problem of meaning • Maximum likelihood estimation • Bayesian statistics • Comparisons with Latent Semantic Analysis • Quantitative • Qualitative

  10. A Probabilistic Approach • The function of semantic memory • The psychological problem of meaning • One approach to meaning • Solving the statistical problem of meaning • Maximum likelihood estimation • Bayesian statistics • Comparisons with Latent Semantic Analysis • Quantitative • Qualitative

  11. The Function of Semantic Memory • To predict what concepts are likely to be needed in a context, and thereby ease their retrieval • Similar to rational accounts of categorization and memory (Anderson, 1990) • Same principle appears in semantic networks (Collins & Quillian, 1969; Collins & Loftus, 1975)

  12. The Psychological Problem of Meaning • Simply memorizing the whole word-document co-occurrence matrix does not help • Generalization requires abstraction, and this abstraction identifies the nature of meaning • Specifying a generative model for documents allows inference and generalization

  13. One Approach to Meaning • Each document is a mixture of topics • Each word is chosen from a single topic • words drawn from topic parameters φ(z), giving P(w | z) • topics drawn from document parameters θ(d), giving P(z)

  14. One Approach to Meaning Word distributions for two example topics, P(w | z = 1) = φ(1) and P(w | z = 2) = φ(2):

                     topic 1   topic 2
       HEART           0.2       0.0
       LOVE            0.2       0.0
       SOUL            0.2       0.0
       TEARS           0.2       0.0
       JOY             0.2       0.0
       SCIENTIFIC      0.0       0.2
       KNOWLEDGE       0.0       0.2
       WORK            0.0       0.2
       RESEARCH        0.0       0.2
       MATHEMATICS     0.0       0.2

  15. One Approach to Meaning Choose mixture weights for each document, then generate a "bag of words": θ = {P(z = 1), P(z = 2)} ranging over {0, 1}, {0.25, 0.75}, {0.5, 0.5}, {0.75, 0.25}, {1, 0}. Example documents (the proportion of topic-1 words rises as P(z = 1) increases): MATHEMATICS KNOWLEDGE RESEARCH WORK MATHEMATICS RESEARCH WORK SCIENTIFIC MATHEMATICS WORK SCIENTIFIC KNOWLEDGE MATHEMATICS SCIENTIFIC HEART LOVE TEARS KNOWLEDGE HEART MATHEMATICS HEART RESEARCH LOVE MATHEMATICS WORK TEARS SOUL KNOWLEDGE HEART WORK JOY SOUL TEARS MATHEMATICS TEARS LOVE LOVE LOVE SOUL TEARS LOVE JOY SOUL LOVE TEARS SOUL SOUL TEARS JOY
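A minimal sketch of this generative process, using the two topic distributions from the previous slide; document length and the random seed are illustrative assumptions:

```python
import numpy as np
rng = np.random.default_rng(0)

vocab = ["HEART", "LOVE", "SOUL", "TEARS", "JOY",
         "SCIENTIFIC", "KNOWLEDGE", "WORK", "RESEARCH", "MATHEMATICS"]
phi = np.array([[0.2]*5 + [0.0]*5,    # topic 1: P(w|z=1)
                [0.0]*5 + [0.2]*5])   # topic 2: P(w|z=2)

def generate_document(theta, n_words=16):
    """Generate a bag of words: draw a topic z for each word from theta,
    then draw the word from that topic's distribution phi[z]."""
    z = rng.choice(2, size=n_words, p=theta)
    return [vocab[rng.choice(len(vocab), p=phi[zi])] for zi in z]

for theta in ([0, 1], [0.25, 0.75], [0.5, 0.5], [0.75, 0.25], [1, 0]):
    print(theta, " ".join(generate_document(np.array(theta))))
```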

  16. One Approach to Meaning • Generative model for co-occurrence data • Introduced by Blei, Ng, and Jordan (2002) • Clarifies pLSI (Hofmann, 1999) [graphical model: θ → z → w]

  17. Matrix Interpretation C = Φ Θ, where C is the normalized word-document co-occurrence matrix (words × documents), Φ holds the mixture components (words × topics), and Θ holds the mixture weights (topics × documents). A form of non-negative matrix factorization.

  18. Matrix Interpretation The topic-model factorization C = Φ Θ (words × topics times topics × documents) parallels the SVD used by LSA, C = U D V^T (words × vectors, vectors × vectors, vectors × documents).
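To make the contrast concrete, this hedged sketch computes both factorizations of the same toy matrix with scikit-learn (an anachronistic stand-in; the talk predates the library): NMF yields non-negative Φ and Θ, while the truncated SVD yields signed, orthogonal factors.

```python
import numpy as np
from sklearn.decomposition import NMF, TruncatedSVD

# C: normalized word-document co-occurrence matrix (toy random counts).
rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(50, 20)).astype(float)
C = counts / counts.sum()

# Topic-model view: C ~ Phi @ Theta, all entries non-negative.
nmf = NMF(n_components=5, init="nndsvda", max_iter=500)
Phi = nmf.fit_transform(C)        # words x topics
Theta = nmf.components_           # topics x documents

# LSA view: C ~ U D V^T, factors orthogonal but possibly negative.
svd = TruncatedSVD(n_components=5)
UD = svd.fit_transform(C)         # words x vectors (U scaled by D)
Vt = svd.components_              # vectors x documents

print(Phi.min() >= 0, UD.min() >= 0)  # True False (typically)
```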

  19. The Function of Semantic Memory • Prediction of needed concepts aids retrieval • Generalization aided by a generative model • One generative model: mixtures of topics • Gives non-negative, non-orthogonal factorization of word-document co-occurrence matrix

  20. A Probabilistic Approach • The function of semantic memory • The psychological problem of meaning • One approach to meaning • Solving the statistical problem of meaning • Maximum likelihood estimation • Bayesian statistics • Comparisons with Latent Semantic Analysis • Quantitative • Qualitative

  21. The Statistical Problem of Meaning • Generating data from parameters is easy • Learning parameters from data is hard • Two approaches to this problem • Maximum likelihood estimation • Bayesian statistics

  22. Inverting the Generative Model • Maximum likelihood estimation: WT + DT parameters • Variational EM (Blei, Ng & Jordan, 2002): WT + T parameters • Bayesian inference: 0 parameters (φ and θ integrated out)

  23. Bayesian Inference • The sum in the denominator is over T^n terms • The full posterior is tractable only up to a normalizing constant
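Concretely, the posterior the slide refers to has the usual Bayes form; its denominator enumerates every one of the T^n joint assignments of n words to T topics, which is what makes exact computation intractable:

```latex
P(\mathbf{z} \mid \mathbf{w}) \;=\;
  \frac{P(\mathbf{w} \mid \mathbf{z})\, P(\mathbf{z})}
       {\sum_{\mathbf{z}'} P(\mathbf{w} \mid \mathbf{z}')\, P(\mathbf{z}')}
```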

  24. Markov Chain Monte Carlo • Sample from a Markov chain which converges to target distribution • Allows sampling from an unnormalized posterior distribution • Can compute approximate statistics from intractable distributions (MacKay, 2002)

  25. Gibbs Sampling For variables x_1, x_2, …, x_n: at step t, draw x_i^(t) from the full conditional P(x_i | x_-i), where x_-i = x_1^(t), x_2^(t), …, x_{i-1}^(t), x_{i+1}^(t-1), …, x_n^(t-1)
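A minimal sketch of the scheme for a case where the full conditionals are known exactly: a bivariate Gaussian with correlation ρ, whose conditionals are univariate Gaussians (a textbook example, not from the talk):

```python
import numpy as np
rng = np.random.default_rng(0)

# Gibbs sampling for a bivariate Gaussian with correlation rho:
# each conditional P(x_i | x_-i) is itself Gaussian.
rho, n_iter = 0.9, 5000
x1, x2 = 0.0, 0.0
samples = []
for t in range(n_iter):
    # Draw x1 from P(x1 | x2), then x2 from P(x2 | the new x1).
    x1 = rng.normal(rho * x2, np.sqrt(1 - rho**2))
    x2 = rng.normal(rho * x1, np.sqrt(1 - rho**2))
    samples.append((x1, x2))

samples = np.array(samples)
print(np.corrcoef(samples[1000:].T)[0, 1])  # ~0.9 after burn-in
```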

  26. Gibbs Sampling [figure: illustration of successive Gibbs updates, after MacKay, 2002]

  27. Gibbs Sampling • Need full conditional distributions for the variables • Since we only sample z, we need P(z_i = j | z_-i, w), which depends on two counts: the number of times word w is assigned to topic j, and the number of times topic j is used in document d
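The full conditional being described is the collapsed Gibbs update published in Griffiths & Steyvers (2004); here the counts n_{-i} are computed with token i removed, W is the vocabulary size, T the number of topics, and α, β the Dirichlet hyperparameters:

```latex
P(z_i = j \mid \mathbf{z}_{-i}, \mathbf{w}) \;\propto\;
  \frac{n^{(w_i)}_{-i,j} + \beta}{n^{(\cdot)}_{-i,j} + W\beta}
  \;\times\;
  \frac{n^{(d_i)}_{-i,j} + \alpha}{n^{(d_i)}_{-i,\cdot} + T\alpha}
```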

  28.–36. Gibbs Sampling [animation: the topic assignment of each word is resampled in turn; the frames show the assignments after iteration 1, iteration 2, and so on up to iteration 1000]
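For concreteness, a minimal collapsed Gibbs sampler implementing the conditional above; the toy corpus, hyperparameters, and iteration count are illustrative assumptions:

```python
import numpy as np
rng = np.random.default_rng(0)

# Toy corpus: docs[d] is a list of word ids; W word types, T topics.
docs = [[0, 1, 2, 0, 1], [3, 4, 3, 4, 2], [0, 3, 1, 4, 2]]
W, T, alpha, beta = 5, 2, 1.0, 0.1

# Random initial topic assignments plus smoothed count tables:
# nwj[w, j] = beta  + #(word w assigned to topic j)
# ndj[d, j] = alpha + #(topic j used in document d)
z = [[int(rng.integers(T)) for _ in doc] for doc in docs]
nwj = np.full((W, T), beta)
ndj = np.full((len(docs), T), alpha)
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        nwj[w, z[d][i]] += 1
        ndj[d, z[d][i]] += 1

for _ in range(1000):
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            j = z[d][i]
            nwj[w, j] -= 1; ndj[d, j] -= 1          # remove token i
            p = nwj[w] / nwj.sum(axis=0) * ndj[d]   # full conditional (unnormalized)
            j = int(rng.choice(T, p=p / p.sum()))
            nwj[w, j] += 1; ndj[d, j] += 1          # reassign token i
            z[d][i] = j

print(nwj / nwj.sum(axis=0))  # estimated P(w|z) for each topic
```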

  37. A Visual Example: Bars Sample each pixel from a mixture of topics (pixel = word, image = document)
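A hedged sketch of how such data can be generated; the grid size, document length, and Dirichlet weights are illustrative assumptions:

```python
import numpy as np
rng = np.random.default_rng(0)

# "Topics" are horizontal and vertical bars on a 5x5 grid; each topic is a
# distribution over the 25 pixels (words), uniform over one bar.
side = 5
topics = []
for r in range(side):            # horizontal bars
    t = np.zeros((side, side)); t[r, :] = 1; topics.append(t.ravel())
for c in range(side):            # vertical bars
    t = np.zeros((side, side)); t[:, c] = 1; topics.append(t.ravel())
topics = np.array([t / t.sum() for t in topics])

def generate_image(n_pixels=50):
    """An image (document) mixes a few random bars (topics)."""
    theta = rng.dirichlet(np.ones(len(topics)))
    z = rng.choice(len(topics), size=n_pixels, p=theta)
    pixels = [rng.choice(side * side, p=topics[zi]) for zi in z]
    return np.bincount(pixels, minlength=side * side).reshape(side, side)

images = [generate_image() for _ in range(1000)]
```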

  38. A Visual Example: Bars

  39. From 1000 Images

  40. Interpretable Decomposition • SVD gives a basis for the data, but not an interpretable one • The true basis is not orthogonal, so rotation does no good

  41. Application to Corpus Data • TASA corpus: text from first grade to college • Vocabulary of 26,414 words • Set of 36,999 documents • Approximately 6 million words in corpus

  42. A Selection of Topics (top 20 words per topic)
     THIRD FIRST SECOND THREE FOURTH FOUR GRADE TWO FIFTH SEVENTH SIXTH EIGHTH HALF SEVEN SIX SINGLE NINTH END TENTH ANOTHER
     BRAIN NERVE SENSE SENSES ARE NERVOUS NERVES BODY SMELL TASTE TOUCH MESSAGES IMPULSES CORD ORGANS SPINAL FIBERS SENSORY PAIN IS
     NATURE WORLD HUMAN PHILOSOPHY MORAL KNOWLEDGE THOUGHT REASON SENSE OUR TRUTH NATURAL EXISTENCE BEING LIFE MIND ARISTOTLE BELIEVED EXPERIENCE REALITY
     CURRENT ELECTRICITY ELECTRIC CIRCUIT IS ELECTRICAL VOLTAGE FLOW BATTERY WIRE WIRES SWITCH CONNECTED ELECTRONS RESISTANCE POWER CONDUCTORS CIRCUITS TUBE NEGATIVE
     ART PAINT ARTIST PAINTING PAINTED ARTISTS MUSEUM WORK PAINTINGS STYLE PICTURES WORKS OWN SCULPTURE PAINTER ARTS BEAUTIFUL DESIGNS PORTRAIT PAINTERS
     STUDENTS TEACHER STUDENT TEACHERS TEACHING CLASS CLASSROOM SCHOOL LEARNING PUPILS CONTENT INSTRUCTION TAUGHT GROUP GRADE SHOULD GRADES CLASSES PUPIL GIVEN
     SPACE EARTH MOON PLANET ROCKET MARS ORBIT ASTRONAUTS FIRST SPACECRAFT JUPITER SATELLITE SATELLITES ATMOSPHERE SPACESHIP SURFACE SCIENTISTS ASTRONAUT SATURN MILES
     THEORY SCIENTISTS EXPERIMENT OBSERVATIONS SCIENTIFIC EXPERIMENTS HYPOTHESIS EXPLAIN SCIENTIST OBSERVED EXPLANATION BASED OBSERVATION IDEA EVIDENCE THEORIES BELIEVED DISCOVERED OBSERVE FACTS

  43. A Selection of Topics (top 20 words per topic)
     JOB WORK JOBS CAREER EXPERIENCE EMPLOYMENT OPPORTUNITIES WORKING TRAINING SKILLS CAREERS POSITIONS FIND POSITION FIELD OCCUPATIONS REQUIRE OPPORTUNITY EARN ABLE
     SCIENCE STUDY SCIENTISTS SCIENTIFIC KNOWLEDGE WORK RESEARCH CHEMISTRY TECHNOLOGY MANY MATHEMATICS BIOLOGY FIELD PHYSICS LABORATORY STUDIES WORLD SCIENTIST STUDYING SCIENCES
     BALL GAME TEAM FOOTBALL BASEBALL PLAYERS PLAY FIELD PLAYER BASKETBALL COACH PLAYED PLAYING HIT TENNIS TEAMS GAMES SPORTS BAT TERRY
     FIELD MAGNETIC MAGNET WIRE NEEDLE CURRENT COIL POLES IRON COMPASS LINES CORE ELECTRIC DIRECTION FORCE MAGNETS BE MAGNETISM POLE INDUCED
     STORY STORIES TELL CHARACTER CHARACTERS AUTHOR READ TOLD SETTING TALES PLOT TELLING SHORT FICTION ACTION TRUE EVENTS TELLS TALE NOVEL
     MIND WORLD DREAM DREAMS THOUGHT IMAGINATION MOMENT THOUGHTS OWN REAL LIFE IMAGINE SENSE CONSCIOUSNESS STRANGE FEELING WHOLE BEING MIGHT HOPE
     DISEASE BACTERIA DISEASES GERMS FEVER CAUSE CAUSED SPREAD VIRUSES INFECTION VIRUS MICROORGANISMS PERSON INFECTIOUS COMMON CAUSING SMALLPOX BODY INFECTIONS CERTAIN
     WATER FISH SEA SWIM SWIMMING POOL LIKE SHELL SHARK TANK SHELLS SHARKS DIVING DOLPHINS SWAM LONG SEAL DIVE DOLPHIN UNDERWATER


  45. A Probabilistic Approach • The function of semantic memory • The psychological problem of meaning • One approach to meaning • Solving the statistical problem of meaning • Maximum likelihood estimation • Bayesian statistics • Comparisons with Latent Semantic Analysis • Quantitative • Qualitative

  46. Probabilistic Queries • Conditional probabilities such as P(w2 | w1) can be computed in different ways • Fixed topic assumption: assume both words are drawn from the same topic z, and sum over it • Multiple samples: average the query over samples from the posterior
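As an illustration of the fixed-topic reading, here is a hedged sketch of P(w2 | w1) = Σ_z P(w2 | z) P(z | w1), averaged over posterior samples; the function and variable names are hypothetical, and a uniform prior over topics is assumed:

```python
import numpy as np

def p_w2_given_w1(w1, w2, phi_samples):
    """P(w2|w1) assuming both words come from the same (unknown) topic z:
    P(w2|w1) = sum_z P(w2|z) P(z|w1), averaged over posterior samples."""
    out = 0.0
    for phi in phi_samples:               # phi: topics x words, rows = P(w|z)
        pz = phi[:, w1]                   # likelihood of w1 under each topic
        pz = pz / pz.sum()                # P(z|w1) by Bayes (uniform prior)
        out += (phi[:, w2] * pz).sum()    # sum_z P(w2|z) P(z|w1)
    return out / len(phi_samples)

# Example with two hand-built topics (rows sum to 1 over a 4-word vocab):
phi = np.array([[0.7, 0.1, 0.1, 0.1],
                [0.1, 0.1, 0.1, 0.7]])
print(p_w2_given_w1(0, 1, [phi]))
```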

  47. Quantitative Comparisons • Two types of task • general semantic tasks: dictionary, thesaurus • prediction of memory data • All tests use LSA with 400 vectors, and a probabilistic model with 100 samples each using 500 topics

  48. Fill in the Blank • 12,856 sentences extracted from WordNet • Overall performance: LSA gives a median rank of 3393; the probabilistic model gives a median rank of 3344 • Example sentences: "his cold deprived him of his sense of _" • "silence broken by dogs barking _" • "a _ hybrid accent"

  49. Fill in the Blank

  50. Synonyms • 280 sets of five synonyms from WordNet, ordered by number of senses • Two tasks: • Predict first synonym • Predict last synonym • Increasing number of synonyms
     BREAK (78)  EXPOSE (9)   DISCOVER (8)   DECLARE (7)   REVEAL (3)
     CUT (72)    REDUCE (19)  CONTRACT (12)  SHORTEN (5)   ABRIDGE (1)
     RUN (53)    GO (34)      WORK (25)      FUNCTION (9)  OPERATE (7)
