
Answering Portuguese Questions

The goal of a question answering (QA) system is to give precise answers to questions formulated in natural language. This differs from the more widely known search engines, such as Google, which retrieve documents based on a set of keywords.




Presentation Transcript


Luís Fernando Costa & Luís Miguel Cabral
{Luis.costa, Luis.M.Cabral}@sintef.no
Linguateca / SINTEF ICT, PB 124, Blindern, NO-0314 Oslo, Norway
http://www.linguateca.pt

The goal of a question answering (QA) system is to give precise answers to questions formulated in natural language.
• This differs from the more widely known search engines, such as Google, which retrieve documents based on a set of keywords.
• A general domain QA system is not specially tuned or prepared to answer questions in a particular domain or subject.

Question Answering Architecture

[Architecture diagram. Components: Question, Anaphor Resolution, Question Reformulation, Search Doc Collections (News, Wikipedia), Search Web, N-Grams, NER, Database of Co-occurrences, Filters, Choice of longer answers, Search Supporting Documents, Answer Selection, Answer(s), Answer=NIL. The system iterates over all the alternative questions.]

Esfinge

• General domain Portuguese QA system.
• Uses information redundancy to retrieve documents (CHAVE collection, Wikipedia and the Web).
• Anaphor resolution using PALAVRAS [Bick, 2000].
• Multiple question generation, multiple answers.
• Experiments with several types of search patterns.
• The named entity recognizer SIEMÊS [Sarmento, 2006] is used to retrieve candidate answers to questions that imply answers of particular named entity (NE) types.
• The Web interface and the source code used in some of the system's modules are available at http://www.linguateca.pt/Esfinge/

CLEF Results

• The Cross-Language Evaluation Forum (CLEF) promotes R&D in multilingual information access.
• Esfinge has participated in CLEF since 2004.
• Errors at CLEF 2007:
  • Wrong or incomplete search patterns (63/165 wrong answers)
  • Document retrieval failure (33/165 wrong answers)
  • Missing patterns to identify the type of answer
  • Search in Wikipedia

Table 2. Causes for wrong answers in the best run

Experiments

• More complete search patterns (added noun phrases)
• Remove the verbs from the search
• Combine two types of search patterns simultaneously:
  • Example: "Que país declarou a independência em 1291?" ("Which country declared independence in 1291?")
  • Predefined text patterns (pattern / score):
    • "declarou a independência em 1291" país / 20
    • país declarou a independência em 1291 / 1
  • Patterns generated using PALAVRAS:
    • declarou; a independência em 1291; país;
    • a independência em 1291; país; (without verbs)

Test set 1: 200 questions from QA@CLEF 2007 for PT-PT
Table 1. Results of the experiments (F: 170 factoid questions; D: 30 definition questions)

Test set 2: 200 questions from QA@CLEF 2008 for PT-PT
Table 3. Results of the experiments (F: 171 factoid questions; D: 29 definition questions)

Conclusions

• Using patterns without verbs as a backup strategy yields better results with both the 2007 and 2008 QA@CLEF questions, but only for factoid questions.
• The benefits of combining two types of search patterns were not confirmed by the experiment with the 2008 questions.
• Errors moved to a later stage in the system's execution.

References

Bick, E.: The Parsing System "Palavras": Automatic Grammatical Analysis of Portuguese in a Constraint Grammar Framework. Aarhus: Aarhus University Press (2000).
Sarmento, L.: SIEMÊS - a named entity recognizer for Portuguese relying on similarity rules. In 7th Workshop on Computational Processing of Written and Spoken Language (PROPOR'2006) (Itatiaia, RJ, Brasil, 13-17 May 2006), Springer, pp. 90-99.

This work was done within the scope of Linguateca, contract no. 339/1.3/C/NAC, a project jointly funded by the Portuguese Government and the European Union.
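The search pattern example in the Experiments section ("Que país declarou a independência em 1291?") can be illustrated with a small Python sketch. This is not Esfinge's code. The rule for "Que <noun> <rest>?" questions, the function names, the chunk tags "V" and "NP", and the fallback logic in search_with_backup are all assumptions made for illustration; only the two predefined patterns with their scores (20 and 1) and the two PALAVRAS-style chunk lists come from the poster. The chunks are supplied by hand, since the sketch does not call PALAVRAS.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SearchPattern:
    query: str   # text sent to the search engine
    score: int   # weight of the pattern, as in "pattern / score"


def predefined_patterns(question: str) -> list[SearchPattern]:
    """Build two text patterns for a question shaped 'Que <noun> <rest>?'.

    Hypothetical rule: it only reproduces the poster's example.
    """
    words = question.rstrip("?").split()
    if len(words) < 3 or words[0].lower() != "que":
        return []
    noun, rest = words[1], " ".join(words[2:])
    return [
        SearchPattern(f'"{rest}" {noun}', 20),  # exact phrase plus the noun
        SearchPattern(f"{noun} {rest}", 1),     # all words, unquoted
    ]


def chunk_patterns(chunks: list[tuple[str, str]]) -> tuple[list[str], list[str]]:
    """Return (all chunks, chunks without verbs) from (text, tag) pairs.

    The tags "V" and "NP" are illustrative labels, not PALAVRAS output.
    """
    with_verbs = [text for text, _ in chunks]
    without_verbs = [text for text, tag in chunks if tag != "V"]
    return with_verbs, without_verbs


def search_with_backup(
    with_verbs: list[str],
    without_verbs: list[str],
    search: Callable[[list[str]], list[str]],
) -> list[str]:
    """Try the full pattern first; fall back to the verb-free one if empty.

    Assumed reading of "patterns without verbs as a backup strategy".
    """
    hits = search(with_verbs)
    return hits if hits else search(without_verbs)


if __name__ == "__main__":
    for p in predefined_patterns("Que país declarou a independência em 1291?"):
        print(f"{p.query} / {p.score}")
```

Run as a script, the sketch prints the two predefined patterns from the example together with their scores. The search argument of search_with_backup is a caller-supplied function, so the same fallback logic could be tried against a document collection or a Web search without changing the sketch.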
