Modular Approaches to Crossword Puzzles and Other Language Games Michael L. Littman Rutgers University mlittman@cs.rutgers.edu Motivation Software that can understand language. Question answering Constructing databases from text Natural-language interaction
Modular Approaches to Crossword Puzzles and Other Language Games
Michael L. Littman
Rutgers University
mlittman@cs.rutgers.edu
Crosswords and Constraint Satisfaction

Motivation
Software that can understand language.
• Question answering
• Constructing databases from text
• Natural-language interaction
• Automatic summarization/briefing
How can programs represent meaning?

Traditional Approach
~like(me, bananas) ^ respect(me, x situation(x) ^ humorous(bananas,x))
I don’t really like bananas, but have long respected their humorous potential.
Semantics tricky; word meanings informal.
Behavioral approach: do something with it.

Language Games
Like other games:
• Evaluation process clean.
• Challenging (and fun!) for people.
Unlike logical games:
• Meaning matters!
• No closed world assumption; messy.
• Learning necessary... moving target.
Machine performance far from humans’.

Word Games
Super-human performance common:
• Scrabble™: Maven, near-perfect (Sheppard 02)
• Boggle™: millisecond solutions (Boyan 98)
• Hangman (Littman 00)
  • 99.97% of 9-letter words under 5 guesses
  • 1.35 misses on average

Trivial Pursuit™
Race around board, answer questions.
Categories: Geography, Entertainment, History, Literature, Science, Sports

Wigwam
QA via AQUA (Abney et al. 00)
• back off: word match in order helps score.
• “When was Amelia Earhart's last flight?”
• 1937, 1897 (birth), 1997 (reenactment)
• Named entities only, 100G of web pages
Move selection via MDP (Littman 00)
• Estimate category accuracy.
• Minimize expected turns to finish.

Modular Approach to QA
High-performance question answering system uses a variety of approaches:
• huge corpus of text on many topics
• database of questions and answers
• tables of facts
• combines multiple extraction methods
Meaning has many faces

Wigwam’s Knowledge
                    wigwam   me    trivia   web
arts & literature   .3       .6    .6       .9
entertainment       .3       .3    .5       .9
science & nature    .2       .7    .7       .7
geography           .1       .2    .4       .9
history             .1       .2    .5       .9
sports & leisure    .025     .6    .7       .4
~turns/game         414      48    22       8

Who Wants to Be a Millionaire
“You know, we ought to enter her in one of those TV quiz shows. We could make a fortune.” (Danny Dunn in Williams & Abrashkin 58)
Mult. choice questions, increasing difficulty
• 100, 200, 300, 500, 1000
• 2000, 4000, 8000, 16000, 32000
• 64000, 125000, 250000, 500000, 1000000

Question Answering Approach
Choose highest ranked choice.
• 75%, 68%, 56% (Clarke, Cormack & Lynam 01)
Expected value (always go on):
• $3,689
• Most value due to (rare) $1M.
People:
• $97,357

The Humble Crossword
NYT, Saturday, October 10th, 1998

Variety of Clue Types
• Thesaurus: Cut off _ _ _ _ _ _ _ _ (ISOLATED)
• Puns & Wordplay: Monk’s head? _ _ _ _ _ (ABBOT)
• Arts & Literature: “Foundation Trilogy” author _ _ _ _ _ _ (ASIMOV)
• Popular Culture: Pal of Pooh _ _ _ _ _ _ (TIGGER)
• Encyclopedic: Mountain known locally as Chomolungma _ _ _ _ _ _ _ (EVEREST)
• Crosswordese: Kind of coal or coat _ _ _ (PEA)

PROVERB: System Design
Candidate generation (Keim et al. 99)
• Like information retrieval: clue implies target
• Variety of approaches used simultaneously
Merging
• Like meta search engine: create master list
Grid filling (Shazeer et al. 99)
• Like constraint satisfaction: fit answer to grid
Probabilities are the common language

PROVERB System Architecture

Modules: ClueDB
Sample database entries:
• Nymph pursuer: SATYR
• Bugs pursuer: ELMER
• Nymph chaser: SATYR
• Place for an ace: SLEEVE
• Highball ingredient: RYE
• Highball ingredient: ICE
exact: Highball ingredient: RYE
partial: Ace place?: SLEEVE
TransModule: Bugs chaser: ELMER (X chaser ↔ X pursuer)
Also Dijkstra[1-2], d[1-2]c, lsicwdb
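
The exact-match and transformation lookups on this slide can be illustrated with a minimal sketch. It assumes a tiny in-memory clue list taken from the slide's examples and a hand-written "chaser"/"pursuer" swap table; the names `build_index`, `exact_candidates`, `transformed_candidates`, and `SUFFIX_SWAPS` are hypothetical and are not PROVERB's actual code, which the slides do not show.

```python
from collections import defaultdict

# Toy clue database built from the examples on the slide.
CLUE_DB = [
    ("Nymph pursuer", "SATYR"),
    ("Bugs pursuer", "ELMER"),
    ("Nymph chaser", "SATYR"),
    ("Place for an ace", "SLEEVE"),
    ("Highball ingredient", "RYE"),
    ("Highball ingredient", "ICE"),
]

# Hypothetical hand-written swap table; it stands in for the
# "X chaser" <-> "X pursuer" transformation named on the slide.
SUFFIX_SWAPS = {"chaser": "pursuer", "pursuer": "chaser"}


def build_index(pairs):
    """Map each lower-cased clue to the list of answers seen with it."""
    index = defaultdict(list)
    for clue, answer in pairs:
        index[clue.lower()].append(answer)
    return index


def exact_candidates(index, clue, length):
    """Answers previously seen with exactly this clue and the right length."""
    return [a for a in index.get(clue.lower(), []) if len(a) == length]


def transformed_candidates(index, clue, length):
    """Rewrite the clue's last word with a known swap, then look it up."""
    words = clue.split()
    if not words:
        return []
    swap = SUFFIX_SWAPS.get(words[-1].lower())
    if swap is None:
        return []
    rewritten = " ".join(words[:-1] + [swap])
    return exact_candidates(index, rewritten, length)


INDEX = build_index(CLUE_DB)
```

Here "Bugs chaser" has no exact match in the toy database, but rewriting it to "Bugs pursuer" recovers ELMER, which mirrors the TransModule example above.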

Available at .com
ClueDB Comparison
             exact   TransModule   partial
Coverage     40.3%   73.0%         92.6%
Accuracy     91.4%   79.8%         71.0%
# Returned   1.3     1.5           493.0

Modules: Other DBs
Database modules: Transform clue to DB query.
• imdb: Warner of Hollywood: OLAND
• wordnet: Fruitless: ARID
Syntactic: Variations of fill-in-the-blanks.
• blanks_movies: “Heavens ____!”: ABOVE
• also blanks_{books, geo, movies, music, quotes}, kindof
Web search: Not used in experimental system.
• google: “The Way To Natural Beauty” author, 1980: TIEGS
Also rogetsyns, geo, writers, compass, myth, altavista, yahoo, infoseek, EbModule, lsiency, etc.

Modules: Backstops
Word lists: Ignore clue, return all words.
• wordList: 10,000 words, perhaps: NOVELETTE
Implicit modules: Probability distributions over all strings of words (e.g., bigram).
• segmenter: 1934 Hall and Nordhoff adventure novel: PITCAIRNISLAND
Also bigWordList, wordList, DbList
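
As a rough illustration of what an implicit module might look like, the sketch below scores any letter string with a letter-bigram model. The training list, the add-one smoothing, and the names `train_letter_bigrams` and `log_prob` are assumptions made for this example; the slides do not say how PROVERB's bigram module was built.

```python
import math
from collections import Counter


def train_letter_bigrams(words):
    """Count letter-to-letter transitions, with ^ and $ as start/end markers."""
    counts = Counter()
    context = Counter()
    for w in words:
        padded = "^" + w.upper() + "$"
        for a, b in zip(padded, padded[1:]):
            counts[(a, b)] += 1
            context[a] += 1
    return counts, context


def log_prob(model, candidate, alphabet_size=27):
    """Log-probability of a string under the bigram model.

    Add-one smoothing (an assumption of this sketch) gives every string a
    nonzero probability; alphabet_size covers 26 letters plus the end marker.
    """
    counts, context = model
    padded = "^" + candidate.upper() + "$"
    total = 0.0
    for a, b in zip(padded, padded[1:]):
        total += math.log((counts[(a, b)] + 1) / (context[a] + alphabet_size))
    return total
```

Because every string receives some probability, a module like this can act as a backstop when no database or word list contains the answer.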

Merging Candidate Lists
Module Performance

CSPs
Constraint satisfaction is a core CS task.
Apps:
• planning and scheduling
• design
• vision
• natural language understanding
• temporal reasoning
• protocol verification
Crossword puzzles the poster child.

Grid Filling and CSPs

CSPs and IR
Domain from ranked candidate list?
Tortellini topping: TRATORIA, COUSCOUS, SEMOLINA, PARMESAN, RIGATONI, PLATEFUL, FORDLTDS, SCOTTIES, ASPIRINS, MACARONI, FROSTING, RYEBREAD, STREUSEL, LASAGNAS, GRIFTERS, BAKERIES, … MARINARA, REDMEATS, VESUVIUS, …
Standard recall/precision tradeoff.

Probabilities to the Rescue?
Annotate domain with the bias.

Solution Probability
Proportional to the product of the probability of the individual choices.
Can pick sol’n with maximum probability.
Maximizes prob. of whole puzzle correct.
Won’t maximize number of words correct.

Posterior Score
Posterior probability of a candidate in a slot is sum of the solution probabilities.

Maximum Expected Overlap
Max words in common with random sol’n.
Q: expected overlap
q_{x_i}: prob. of word x_i in slot i.
PP-complete. Equivalent to stochastic satisfiability.
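
The three quantities on the preceding slides (solution probability, posterior score, expected overlap) can be made concrete by brute-force enumeration on a toy puzzle. This is only a sketch under stated assumptions: the data layout (one dict of candidate priors per slot, crossings as `(slot_a, pos_a, slot_b, pos_b)` tuples) and the function names are hypothetical, and exhaustive enumeration is feasible only for tiny grids, which is why the real problem calls for approximation.

```python
from itertools import product
from math import prod


def consistent(words, crossings):
    """True if every crossing cell gets the same letter from both slots."""
    return all(words[a][i] == words[b][j] for a, i, b, j in crossings)


def analyze(candidates, crossings):
    """Enumerate all fills; return posteriors and the two 'best' solutions.

    candidates[k] maps each candidate word for slot k to its prior probability.
    """
    slots = [list(c.items()) for c in candidates]
    solutions = []
    for combo in product(*slots):
        words = tuple(w for w, _ in combo)
        if consistent(words, crossings):
            # Solution probability: proportional to the product of the choices.
            solutions.append((words, prod(p for _, p in combo)))
    total = sum(weight for _, weight in solutions)
    if total == 0:
        raise ValueError("no consistent fill with nonzero probability")
    solutions = [(words, weight / total) for words, weight in solutions]

    # Posterior score: sum of the probabilities of solutions using the word.
    posterior = [dict.fromkeys(c, 0.0) for c in candidates]
    for words, p in solutions:
        for k, w in enumerate(words):
            posterior[k][w] += p

    def expected_overlap(words):
        # Expected number of slots shared with a randomly drawn solution.
        return sum(posterior[k][w] for k, w in enumerate(words))

    max_prob = max(solutions, key=lambda s: s[1])[0]
    max_overlap = max(solutions, key=lambda s: expected_overlap(s[0]))[0]
    return posterior, max_prob, max_overlap


# Toy puzzle (invented for illustration): slot 0 is a 3-letter across word,
# slot 1 a 2-letter down word whose first letter crosses slot 0's second letter.
toy_candidates = [{"CAT": 0.6, "COT": 0.4}, {"AX": 0.2, "OX": 0.8}]
toy_crossings = [(0, 1, 1, 0)]
toy_posterior, toy_max_prob, toy_max_overlap = analyze(toy_candidates, toy_crossings)
```

In the toy puzzle the crossing constraint overrides slot 0's prior: CAT starts out more likely than COT, but COT has the higher posterior once the strong prior on OX is taken into account.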

Fast Approximation
Compute exact posterior quickly on trees.
• Only consider slots reachable in d steps.
• Assume independence of paths (tree).
Increase d iteratively, improve approx.
Cache intermediate results (DP).
Polytime!

Connection to Turbo Decoding
Turbo decoding (loopy belief propagation).
• transmitting messages from deep space
• true message = crossword solution
• double encoding = across/down clues
• corruption = answer uncertainty
• 4-cycles
Same decoding algorithm in use!

Artificial Problems
Random on 5x5 grids. Improves with d. NYT: 52% to 90%.

Grid Filling
Final: 88% words, 97% letters

PROVERB Results
Test collection (370 puzzles, @~15 min.)
• 95% words, 98% letters, 46% puzzles
• NYT: 89.5% (95.5% MTW, 85.0% TFSS)
• Ablation: ClueDB only 88%, no ClueDB 27%
American Crossword Puzzle Tournament
• 1998: 190/251, 80% words (vs. 100%)
• tricks: letter pairs, words in single square
• 1999: 147/261, 75% words
• tricks: Home is near: ALASKA

TOEFL Synonyms
Used in college applications.
fish
• scale
• angle
• swim
• dredge

Synonym Approaches
Latent Semantic Indexing (Landauer & Dumais 97)
• Analyze 30k paragraphs, 300d embedding
• 64% ~Boulder http://lsa.colorado.edu/
Pointwise Mutual Information-IR (Turney 01)
• Counts in Altavista (350M); 74% (us: 77%)
Thesaurus
• http://Wordsmyth.net: 98% prec.; 74% cov.
• Combine with PMI-IR: 92%
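
The PMI-IR idea can be sketched in a few lines using its simplest co-occurrence score: the number of documents containing both the problem word and a choice, divided by the number containing the choice. The hit counts below are invented purely for illustration (they are not real search-engine counts), and the function names are hypothetical.

```python
def pmi_ir_score(hits_joint, hits_choice):
    """Simplest PMI-IR score: hits(problem AND choice) / hits(choice)."""
    if hits_choice == 0:
        return 0.0
    return hits_joint / hits_choice


def pick_synonym(choices, joint_hits, choice_hits):
    """Return the choice that co-occurs most strongly with the problem word."""
    return max(choices, key=lambda c: pmi_ir_score(joint_hits[c], choice_hits[c]))


# Invented counts for the TOEFL-style item "fish"; NOT real search data.
choices = ["scale", "angle", "swim", "dredge"]
choice_hits = {"scale": 900_000, "angle": 400_000, "swim": 600_000, "dredge": 50_000}
joint_hits = {"scale": 9_000, "angle": 12_000, "swim": 9_000, "dredge": 500}

best = pick_synonym(choices, joint_hits, choice_hits)
```

Dividing by the choice's own count keeps a very common word from winning on raw co-occurrence alone.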

Verbal Analogies
Used in college boards (SATs, GREs), and as an intelligence test.
cat : meow ::
• mouse : scamper
• bird : peck
• dog : bark
• horse : groom
• lion : scratch

Wrap Up
Modular language-game systems.
PROVERB:
• Human-competitive performance.
• Components theoretically motivated.
• Probabilistically grounded.

What’s Next?
Better module merging: Much work has been ad hoc. Now evaluating a probabilistic combining rule.
RL: Corpus-based approach to behavior. Recognize similarities to past experience.
Meaning comes from experience, not rules.