This document provides an overview of IBM Watson, a question-answering (QA) system developed by IBM using a massively parallel, evidence-based architecture. Built on the IBM Power7 platform, Watson relies on extensive preprocessing, natural language tools, and a corpus derived from diverse text sources. Key components include content acquisition, hypothesis generation and scoring, and game strategies. The system analyzes question types, retrieves relevant text, and applies roughly 100 techniques to generate and rank answers, combining natural language processing with structured data.
A Brief Overview of Watson
CSC 9010, Spring 2011. Paula Matuszek
Watson
• QA system developed by IBM and collaborators
• A “massively parallel probabilistic evidence-based architecture”
• Hardware is a high-end IBM system built on the Power7 platform:
  • 10 Power7 server blades
  • 90 servers
  • 4 processors per server
  • 8 cores per processor
• A robotic arm presses the buzzer.
• Input is text only: no speech recognition, no vision.
Watson
• Software is built on top of UIMA (Unstructured Information Management Architecture), a framework built by IBM and since open-sourced.
• The information corpus was downloaded and indexed offline; there was no web access during the game.
• The corpus was developed from a large variety of text sources:
  • baseline from Wikipedia, Project Gutenberg, newspaper articles, thesauri, etc.
  • extended via web retrieval: extract potentially relevant text “nuggets,” score them for informativeness, and merge the best into the corpus (a sketch follows below)
• The primary corpus is unstructured text, not a semantically tagged or formal knowledge base.
• Only about 2% of Jeopardy! answers can be looked up directly.
• Watson also leverages semistructured and structured sources such as WordNet and YAGO.
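A minimal sketch of the nugget-scoring step in corpus expansion, using a TF-IDF-style proxy for “informativeness.” The function names and the weighting scheme are illustrative assumptions, not IBM’s actual source-expansion algorithm:

    import math

    def informativeness(nugget, seed_counts, doc_freq, n_docs):
        # Reward terms that are frequent in the seed document but rare
        # corpus-wide; average over the nugget's distinct terms.
        terms = set(nugget.lower().split())
        score = 0.0
        for term in terms:
            tf = seed_counts.get(term, 0)
            idf = math.log((1 + n_docs) / (1 + doc_freq.get(term, 0)))
            score += tf * idf
        return score / max(len(terms), 1)

    def merge_best(nuggets, seed_counts, doc_freq, n_docs, keep=10):
        # Keep only the highest-scoring nuggets for merging into the corpus.
        ranked = sorted(nuggets,
                        key=lambda n: informativeness(n, seed_counts, doc_freq, n_docs),
                        reverse=True)
        return ranked[:keep]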
Components of DeepQA
• About 100 different techniques overall.
• Content acquisition: corpus and sample games; done offline, before the game itself.
• Preprocessing
• Natural language tools
• Retrieve possible answers
• Score answers
• Buzz in
• Game strategies
Preprocessing
• Determine the question category:
  • factoid
  • decomposable
  • puzzle
• Note: questions with audio/visual components and “special instruction” categories were excluded.
• Determine the lexical answer type (LAT):
  • film? person? place? novel? song?
  • About 2,500 distinct LATs appeared in a sample of 20,000 questions; about 12% of clues do not indicate a type. (A LAT-extraction sketch follows below.)
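The LAT is often the noun governed by a demonstrative in the clue (“this film,” “these novels”). A minimal sketch, assuming spaCy’s dependency parse; Watson used its own parsing stack, so this is only an analogy:

    import spacy

    nlp = spacy.load("en_core_web_sm")

    def lexical_answer_type(clue):
        # Return the noun that a demonstrative determiner points at, if any.
        doc = nlp(clue)
        for tok in doc:
            if tok.lower_ in ("this", "these") and tok.dep_ == "det":
                return tok.head.lemma_  # e.g. "film" in "this 1959 film"
        return None  # ~12% of clues carry no explicit LAT

    print(lexical_answer_type("This 1959 film won a record 11 Academy Awards."))  # film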
Initial Natural Language Processing
• Parse the question
• Semantically tag the components of the question
• Reference or coreference resolution
• Named entity recognition
• Relation detection
• Decomposition into subqueries
(An illustration of these steps follows below.)
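These steps map onto standard NLP tooling. Here is an illustration with spaCy; Watson actually used IBM’s own slot-grammar parser and related components, so treat this as an analogy rather than the real stack:

    import spacy

    nlp = spacy.load("en_core_web_sm")
    doc = nlp("This physicist sold his alternating-current patents to Westinghouse in 1888.")

    # Parsing and semantic tagging: part of speech, dependency label, governor
    for tok in doc:
        print(tok.text, tok.pos_, tok.dep_, tok.head.text)

    # Named entity recognition
    print([(ent.text, ent.label_) for ent in doc.ents])
    # e.g. [('Westinghouse', 'ORG'), ('1888', 'DATE')]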
Retrieve Relevant Text
• The component most similar to a web search.
• The focus is on recall.
• Search tools include Indri, Lucene, and SPARQL (for triple stores).
• For some “closed” LATs (all US states, presidents, etc.) the candidate list can be generated directly.
• Otherwise, extract the actual candidate answer from the retrieved text:
  • a title?
  • a person? etc.
• Several hundred hypotheses are typically generated. (A toy retrieval sketch follows below.)
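A toy stand-in for the search step, biased toward recall: candidates come from the union of matching documents rather than the intersection, and downstream scorers do the pruning. The names and the max_hits value are illustrative assumptions:

    from collections import defaultdict

    index = defaultdict(set)  # term -> set of document ids

    def add_document(doc_id, text):
        for term in text.lower().split():
            index[term].add(doc_id)

    def retrieve(question, max_hits=500):
        # Union of postings lists: any shared term makes a document a candidate.
        hits = defaultdict(int)
        for term in question.lower().split():
            for doc_id in index[term]:
                hits[doc_id] += 1
        return sorted(hits, key=hits.get, reverse=True)[:max_hits]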
Score Hypotheses
• Evaluate candidate answers:
  • soft filtering: fast, lightweight filters prune the answers to about 100
  • evidence retrieval: additional structured or unstructured queries
• Score answers:
  • LOTS of algorithms: more than 50 components
  • ranging from simple word counts to complex spatial and temporal reasoning
  • creates an evidence profile: taxonomic, geospatial, temporal, source reliability, etc.
• Merge equivalent answers
• Determine ranking and confidence estimation (a sketch follows below)
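The final merger/ranker combines the many scorer outputs into a single confidence using trained machine-learning models. A minimal logistic-regression sketch; the feature names and weights here are made up for illustration:

    import math

    def confidence(feature_scores, weights, bias=0.0):
        # Weighted sum of scorer outputs squashed to a 0-1 confidence.
        z = bias + sum(w * feature_scores.get(name, 0.0)
                       for name, w in weights.items())
        return 1.0 / (1.0 + math.exp(-z))

    weights = {"word_overlap": 1.2, "type_match": 2.5,
               "temporal": 0.8, "source_reliability": 0.6}
    print(confidence({"word_overlap": 0.4, "type_match": 1.0,
                      "temporal": 0.2}, weights))  # ~0.96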
And Game Strategies!
• Picking a category:
  • tries to find the Daily Double
  • goes for lower-value clues first, to help learn the category
• When to buzz in?
  • normally buzzes in if more than 50% certain
  • will buzz in at lower confidence if that is the only way to win
  • will not buzz in when the only way to lose is by answering incorrectly
  (A decision-rule sketch follows below.)
• How much to bet?
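The buzz decision from the bullets above expressed as a simple rule; the thresholds are illustrative assumptions, not Watson’s actual tuned values:

    def should_buzz(conf, can_win_by_staying_silent, must_answer_to_win):
        # Paraphrase of the strategy bullets; thresholds are assumptions.
        if can_win_by_staying_silent:
            return False          # a wrong answer is the only way to lose
        if must_answer_to_win:
            return conf > 0.2     # accept weak evidence if it is the only path
        return conf > 0.5         # normal case: buzz when >50% certain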
References
• A clip of the end of the Jeopardy! game: http://www.youtube.com/watch?v=8W36OuMU0yE
• A good high-level overview: theswimmingsubmarine.blogspot.com/2011/02/how-ibms-deep-question-answering.html
• A detailed description: www.stanford.edu/class/cs124/AIMagzine-DeepQA.pdf
• Many clips, blogs, and links: ibmwatson.com