
Cross-Language French-English Question Answering using the DLT System at CLEF 2003

Aoife O’Gorman, Igal Gabbay, Richard F.E. Sutcliffe. Documents and Linguistic Technology Group, University of Limerick.


Presentation Transcript


  1. Cross-Language French-English Question Answering using the DLT System at CLEF 2003 • Aoife O’Gorman, Igal Gabbay, Richard F.E. Sutcliffe • Documents and Linguistic Technology Group, University of Limerick

  2. Outline • Objectives • System architecture • Key components • Task performance evaluation • Findings

  3. Objectives • Learn the issues involved in multilingual QA • Combine the components of our existing English and French monolingual QA systems

  4. System architecture • Query classification • Query translation (Google) & re-formulation • Named entity recognition • Text retrieval (dtSearch) • Answer entity selection

  5. Query classification • Categories based on translated TREC 2002 queries • Keyword-based classification • what_country • De quel pays le jeu de croquet est-il originaire? (Which country does the game of croquet originate from?) • De quelle nation...? (From which nation...?) • Unknown
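
The keyword-based classification on this slide could be sketched roughly as follows. This is a minimal illustration, not the DLT implementation: the pattern table, the function name `classify_query`, and the lowercase label `unknown` are assumptions, and only the `what_country` category shown on the slide is covered.

```python
import re

# Hypothetical keyword patterns. The slide names only the what_country
# category and the two French cues "De quel pays" / "De quelle nation".
CATEGORY_PATTERNS = {
    "what_country": [r"\bde quel(?:le)? (?:pays|nation)\b"],
}


def classify_query(french_query: str) -> str:
    """Return the first category whose keyword pattern matches, else 'unknown'."""
    text = french_query.lower()
    for category, patterns in CATEGORY_PATTERNS.items():
        if any(re.search(pattern, text) for pattern in patterns):
            return category
    # Queries that match no pattern fall into the catch-all category.
    return "unknown"
```

A query such as "De quel pays le jeu de croquet est-il originaire?" matches the `what_country` pattern, while a query with an unexpected formulation falls through to `unknown`.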

  6. Query translation and re-formulation • Submitting the French query in its original form on the Google Language Tools page • Tokenisation • Selective removal of stopwords • Example: • Qui a été élu gouverneur de la California? • Who was elected governor of California? • [ ‘elected’, ‘governor’, ‘California’]
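
The tokenisation and stopword-removal steps can be illustrated with a short sketch. It starts from the already translated English query (the Google translation step is not reproduced here), and the stopword list and the function name `reformulate` are assumptions chosen to reproduce the slide's example rather than the system's actual resources.

```python
import re

# Hypothetical stopword list; the slide does not say which words are removed.
STOPWORDS = {"who", "what", "when", "where", "was", "is", "did",
             "of", "the", "a", "an", "in"}


def reformulate(english_query: str) -> list[str]:
    """Tokenise a translated query and drop stopwords, keeping original case."""
    tokens = re.findall(r"[A-Za-z0-9']+", english_query)
    return [token for token in tokens if token.lower() not in STOPWORDS]
```

Applied to "Who was elected governor of California?", this keeps `elected`, `governor` and `California`, as in the slide's example.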

  7. Text Retrieval: Submitting queries to dtSearch • dtSearch indexed the document collection based on <DOC> tags • Inserting a w/1 connector between two capitalised words • Submitting untranslated quotations for exact match • Inserting an AND connector between all other terms (Boolean) • Limited verb expansion based on common verbs used in TREC questions
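
The connector rules above could be assembled into a query string along these lines. This is a sketch under assumptions: the function name `build_query` is hypothetical, runs of adjacent capitalised terms are joined with `w/1`, quotations are passed in separately, parentheses are added to keep the mixed connectors unambiguous, and the verb expansion step is omitted.

```python
def build_query(terms, quotations=()):
    """Join search terms with w/1 and AND connectors as described on the slide."""
    groups = []  # each group: a run of adjacent capitalised terms, or one other term
    prev_capitalised = False
    for term in terms:
        capitalised = term[:1].isupper()
        if capitalised and prev_capitalised:
            groups[-1].append(term)   # extend the current run of capitalised terms
        else:
            groups.append([term])     # start a new group
        prev_capitalised = capitalised
    parts = ["(" + " w/1 ".join(g) + ")" if len(g) > 1 else g[0] for g in groups]
    # Quotations are submitted untranslated, as exact phrases.
    parts.extend(f'"{quote}"' for quote in quotations)
    return " AND ".join(parts)
```

For the terms `Robert`, `Frost`, `born` this yields `(Robert w/1 Frost) AND born`. Note that adjacency is judged on the term list after stopword removal, so two capitalised words separated by a removed stopword would also be joined.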

  8. Named Entity Recognition: General Names • Captures any instances of general names in cases where we are not sure what to look for. • A general_name is defined in our system to be up to five capitalised terms interspersed with optional prepositions. • Examples: Limerick City • University of Limerick
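
The general_name definition lends itself to a regular expression. The sketch below is one possible reading, not the system's actual recogniser: the preposition list, the pattern names and `find_general_names` are assumptions, and at most one preposition is allowed between consecutive capitalised terms.

```python
import re

CAP = r"[A-Z][\w'-]*"              # one capitalised term
PREP = r"(?:of|in|on|at|for|de)"   # hypothetical preposition list

# One capitalised term followed by up to four more, each optionally
# preceded by a single preposition (five capitalised terms at most).
GENERAL_NAME = re.compile(rf"\b{CAP}(?:\s+(?:{PREP}\s+)?{CAP}){{0,4}}")


def find_general_names(text: str) -> list[str]:
    """Return every non-overlapping general_name candidate found in the text."""
    return [match.group(0) for match in GENERAL_NAME.finditer(text)]
```

Because the pattern is greedy, two names linked only by a preposition (for example "Limerick in Ireland") are returned as a single candidate, and sentence-initial capitalised words are matched as well, so a real recogniser would need further filtering.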

  9. Answer entity selection • highest_scoring • What year was Robert Frost born? • in entity(date,[1,8,7,5],[[],[],[],[],[1,8,7,5]],[],[],[]), poet target([Robert]) target([Frost]) was target([born]) in San Francisco • most_frequent • When did “The Simpsons” first appear on television? • When target([The]) target([Simpsons]) was target([first]) broadcast in entity(date,[1,9,8,9],[[],[],[],[],[],[1,9,8,9],[],[]])
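
The two selection strategies named on this slide can be contrasted with a small sketch. The slide does not say how candidates are scored, so the representation here is an assumption: each candidate is an `(answer, score)` pair, where the score might be, for instance, the number of `target` terms found near the entity. The function names mirror the slide's labels; the sample values used to illustrate them are invented.

```python
from collections import Counter


def highest_scoring(candidates):
    """Return the answer of the single best-scoring candidate, or None."""
    if not candidates:
        return None
    return max(candidates, key=lambda candidate: candidate[1])[0]


def most_frequent(candidates):
    """Return the answer that occurs most often among the candidates, or None."""
    counts = Counter(answer for answer, _score in candidates)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

With the invented candidates `("1989", 2)`, `("1989", 1)` and `("1990", 3)`, the two strategies disagree: `highest_scoring` picks `1990` because of its single high score, whereas `most_frequent` picks `1989` because it occurs twice.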

  10. Task performance evaluation Adapted from Magnini (2003)

  11. Findings • Query classification: unexpected formulation of queries, too few categories • Translation: problems with names and titles • - We need better query-specific translation • - Localisation of names/titles • - Possibly limit translation to search terms • An interface could be built for the parser to enable it to be tested by an end user • Error types 6-13 could be investigated and the parser extended to handle some of them • Practical studies in the use of STS could be carried out

  12. Findings • Text retrieval: allow relaxation and more sophisticated expansion of search queries • Named entity recognition: find better alternatives to answer questions of type Unknown • Answer entity selection: take into account distance and density of query terms • Usability issue: answers may need to be translated back to French
