Question Answering over Linked Data London Text Analytics Meetup, 5 August 2011 Danica Damljanović contact: danica.damljanovic@gmail.com
The goal Automatically answer natural language questions by machine, as if answered by a human.
MARY <is a> PERSON
UNIVERSITY OF SHEFFIELD <is an> ORGANISATION
MARY <works for> UNIVERSITY OF SHEFFIELD
SHEFFIELD <is a> CITY
UNIVERSITY OF SHEFFIELD <is located in> SHEFFIELD
UNITED KINGDOM <is a> COUNTRY
SHEFFIELD <is located in> UNITED KINGDOM
MARY <lives in> SHEFFIELD

SELECT ?country WHERE {
  ?person <lives in> ?city .
  ?city <located in> ?country .
  FILTER (?person = MARY)
}

Mary works for the University of Sheffield, which is located in Sheffield. Sheffield is located in the United Kingdom. Mary lives in Sheffield.
Motivation In which country does Mary live?
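To make the example concrete, here is a minimal runnable sketch (not code from the talk) that stores the triples shown above in an RDF graph with rdflib and answers "In which country does Mary live?" with a SPARQL query. The ex: namespace and the CamelCase resource names are made up for illustration.

from rdflib import Graph, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/")
g = Graph()
g.bind("ex", EX)

# Facts from the slide: Mary works for the University of Sheffield, which is
# located in Sheffield; Sheffield is in the UK; Mary lives in Sheffield.
g.add((EX.Mary, RDF.type, EX.Person))
g.add((EX.UniversityOfSheffield, RDF.type, EX.Organisation))
g.add((EX.Mary, EX.worksFor, EX.UniversityOfSheffield))
g.add((EX.Sheffield, RDF.type, EX.City))
g.add((EX.UniversityOfSheffield, EX.locatedIn, EX.Sheffield))
g.add((EX.UnitedKingdom, RDF.type, EX.Country))
g.add((EX.Sheffield, EX.locatedIn, EX.UnitedKingdom))
g.add((EX.Mary, EX.livesIn, EX.Sheffield))

# "In which country does Mary live?" expressed as SPARQL.
query = """
PREFIX ex: <http://example.org/>
SELECT ?country WHERE {
    ex:Mary ex:livesIn ?city .
    ?city ex:locatedIn ?country .
    ?country a ex:Country .
}
"""
for row in g.query(query):
    print(row.country)  # http://example.org/UnitedKingdom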
Building Question-Answering systems: Challenges Application developers: customise the system when porting it to work with different kinds of data, e.g. Where >> Location/City; Who >> Person, Organisation
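As an illustration of that porting cost, a hypothetical version of such a customisation might be nothing more than a small lookup table maintained by the application developer. The dictionary and helper function below are a sketch, not part of the talk; only the Where/Who mappings come from the slide.

# Hypothetical question-word -> candidate answer-type mapping, the kind of
# customisation done when porting the system to a new dataset.
QUESTION_WORD_TO_CLASSES = {
    "where": ["Location", "City"],
    "who": ["Person", "Organisation"],
}

def candidate_answer_types(question):
    """Return the ontology classes suggested by the question's first word."""
    first_word = question.strip().lower().split()[0]
    return QUESTION_WORD_TO_CLASSES.get(first_word, [])

print(candidate_answer_types("Where does Mary live?"))  # ['Location', 'City']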
Requirements • Flexibility of the supported language • No strict adherence to syntax • Habitable system: the user can easily formulate queries and avoid unsupported constructions without difficulty • Portable with minimal customisation
FREyA - Feedback, Refinement, Extended Vocabulary Aggregator • Feedback: showing the user the system’s interpretation of the query • Refinement: • resolving ambiguity: generating dialog whenever one term refers to more than one concept in the ontology (precision) • Extended Vocabulary: • expressiveness: generating dialog whenever an “unknown” term appears in the question (recall)
Feedback in FREyA • http://gate.ac.uk/freya ESWC 2010
Clarification dialogs • Generated by combining syntactic parsing with ontology-based lookup • the system learns from the user’s selections • “No ranking will be perfect” • Customisation through dialog with the user • application developers vs. end-users
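A minimal sketch of this idea, assuming a much-simplified model of the behaviour described above (this is not FREyA's actual code): an ambiguous term triggers a dialog, and the user's selection is remembered so that it ranks first the next time. The geo: property names in the example are hypothetical.

from collections import defaultdict

# Learned preferences: how often the user picked a concept for a given term.
selection_counts = defaultdict(int)

def rank_candidates(term, candidates):
    """Order candidate concepts by how often the user has chosen them before."""
    return sorted(candidates, key=lambda c: selection_counts[(term, c)], reverse=True)

def clarify(term, candidates, ask_user):
    """Resolve an ambiguous term: ask the user, record the choice, return it."""
    ranked = rank_candidates(term, candidates)
    if len(ranked) == 1:
        return ranked[0]
    choice = ask_user(term, ranked)          # in practice, a dialog in the UI
    selection_counts[(term, choice)] += 1    # learn from the user's selection
    return choice

# Example with hypothetical ontology properties for the ambiguous term "population".
pick_first = lambda term, cands: cands[0]
print(clarify("population", ["geo:statePopulation", "geo:cityPopulation"], pick_first))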
Learning [diagram: IF ... THEN ... rule learned from the user’s selections]
The User Controls the Output [diagram: Potential Ontology Concepts (POCs) from the question mapped to ontology concepts such as geo:loElevation, geo:isLowestPointOf, geo:LoPoint, geo:stateArea and geo:State]
Evaluation • Question-Answering over Linked Data Challenge • Two datasets of different kinds: MusicBrainz and DBpedia • 50 training questions per dataset, along with the correct answers • 50 test questions per dataset, without the answers
FREyA: results • Question-Answering over Linked Data challenge (ESWC’11, Crete) • FREyA was the only system that participated with both provided datasets, demonstrating portability • DBpedia: F-measure 0.58 (PowerAqua 0.5) • MusicBrainz: F-measure 0.71 (SWIP 0.66) • 94.4% precision and recall on the Mooney GeoQuery dataset (PANTO recall 88.05%, Querix precision 87.11%)
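For reference, the F-measure reported above is the harmonic mean of precision and recall. A one-line check (illustrative, using the reported 94.4% figure, where equal precision and recall give the same F-measure):

def f_measure(precision, recall):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

print(round(f_measure(0.944, 0.944), 3))  # 0.944: when precision equals recall, F equals both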
Conclusion • Combining syntactic parsing with ontology-based lookup in an interactive process of feedback and query refinement can increase the precision and recall of Question-Answering over Linked Data, • while reducing the time for customisation by shifting some tasks from application developers to end users.
Future Challenges • Output: • Correct answer OR • Identifying the flaws in the data? • Ranking/disambiguation algorithms to improve MRR
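MRR (Mean Reciprocal Rank), mentioned above, averages 1/rank of the first correct answer over all questions. A small illustrative computation with made-up candidate lists:

def mean_reciprocal_rank(ranked_answer_lists, correct_answers):
    """Average 1/rank of the first correct answer (0 when it is missing)."""
    total = 0.0
    for candidates, correct in zip(ranked_answer_lists, correct_answers):
        for rank, candidate in enumerate(candidates, start=1):
            if candidate == correct:
                total += 1.0 / rank
                break
    return total / len(correct_answers)

# Correct answer ranked 1st, 2nd, and not returned at all: MRR = (1 + 0.5 + 0) / 3
print(mean_reciprocal_rank(
    [["United Kingdom", "France"], ["Paris", "London"], ["Berlin"]],
    ["United Kingdom", "London", "Madrid"],
))  # 0.5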
Thank you for your attention! Questions? Contact: danica.damljanovic@gmail.com
Demos • http://gate.ac.uk/sale/dd/ • United States geography • MusicBrainz • DBpedia