Federated Ontology Search

Federated Ontology Search • Vasco Calais Pedro, Eric Nyberg and Jaime Carbonell • Presenter: Pushkar Acharya

Overview • Introduction • Ontological Search • Ontology description and selection • Merging • Scoring • Results • Related Work • Challenges and Future Work • Conclusion


Introduction • Large number of open-domain ontologies available • Cyc, SUMO, Omega, ThoughtTreasure, Swoogle, etc. • Offer easily accessible, open-domain information • Challenges: • Information merging and reuse • Different frameworks and languages

Introduction • Solutions: • At the ontology provider side – Absorb all knowledge into a single ontology beforehand • Establish a full mapping between concepts and relations • Absorb other ontologies

Introduction • At the ontology provider side – Absorb all knowledge into a single ontology beforehand • Drawbacks: • Non-scalability • Loss of autonomy of ontological knowledge • Language-level mismatches • Ontology-level mismatches • Mappings must be updated whenever the ontologies are updated


Introduction • At the Ontology provider side – Absorb all knowledge into a single ontology beforehand • At application developer side – Querying each ontology individually • Middleware • Query multiple ontologies and merge results • Form ontological chains and inferences

Introduction • At the ontology provider side – Absorb all knowledge into a single ontology beforehand • At the application developer side – Querying each ontology individually • Middleware • Merges only small fragments of ontologies • On-demand basis • Takes advantage of redundant and complementary knowledge to improve performance • Parallelizes query execution
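
The middleware idea (query several ontologies in parallel, then merge the results) can be sketched roughly as follows. This is only an illustration: `federated_query` is a hypothetical name, each ontology is assumed to be a plain callable that takes a query and returns a list of results, and the merge step is a simple set union standing in for the merging discussed later in the talk.

```python
from concurrent.futures import ThreadPoolExecutor


def federated_query(ontologies, query):
    """Run `query` against every ontology in parallel and union the results.

    `ontologies` is a list of callables (an assumption for this sketch);
    each one takes the query and returns an iterable of results.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(ontologies))) as pool:
        partial_results = list(pool.map(lambda onto: onto(query), ontologies))

    # Placeholder merge: a plain union of all partial results.
    merged = set()
    for result in partial_results:
        merged |= set(result)
    return merged


# Two toy "ontologies" that answer a parents-style query.
def toy_ontology_a(query):
    return ["mammal", "animal"] if query == "dog" else []


def toy_ontology_b(query):
    return ["pet", "animal"] if query == "dog" else []


print(federated_query([toy_ontology_a, toy_ontology_b], "dog"))
```

Threads are used here because remote ontology lookups would typically be I/O-bound; that choice is an assumption, not something stated in the slides.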

Ontological Search • This approach will be successful only if the “search” is separated from the information need and the ontology. • Abstracts the formal representation of the query as required by the ontologies • Describes 3 operators: • Rel(a, b, rels) • Parents(a) • Children(a) • By defining operators we delegate their execution to the ontologies. • Each ontology remains free to use its extended features
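
One way to picture the three operators is as a small interface that every ontology wrapper implements. The sketch below is hypothetical: the class names, the triple-based storage, and the reading of Rel(a, b, rels) as "which of the relations in rels hold between a and b" are assumptions made for illustration, not the paper's definitions.

```python
from abc import ABC, abstractmethod


class OntologyWrapper(ABC):
    """Hypothetical adapter: one subclass per underlying ontology."""

    @abstractmethod
    def rel(self, a, b, rels):
        """Return the relations from `rels` that hold between a and b."""

    @abstractmethod
    def parents(self, a):
        """Return the direct parents of concept a."""

    @abstractmethod
    def children(self, a):
        """Return the direct children of concept a."""


class DictOntology(OntologyWrapper):
    """Toy in-memory ontology stored as (source, relation, target) triples."""

    def __init__(self, triples):
        self.triples = set(triples)

    def rel(self, a, b, rels):
        return {r for (x, r, y) in self.triples if x == a and y == b and r in rels}

    def parents(self, a):
        return {y for (x, r, y) in self.triples if x == a and r == "is-a"}

    def children(self, a):
        return {x for (x, r, y) in self.triples if y == a and r == "is-a"}


toy = DictOntology([("dog", "is-a", "mammal"), ("cat", "is-a", "mammal")])
print(toy.parents("dog"), toy.children("mammal"))
```

Because callers only see the three methods, each wrapper is free to answer them with whatever native query language or inference engine its ontology provides.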


Ontological Search • Constraint – The output of the query execution should be in the form of a rooted directed acyclic graph
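
A result can be checked against this constraint with a short routine. The sketch below assumes the result is given as a list of (source, target) edges pointing away from a designated root; the function name and that representation are illustrative choices, not part of the original system.

```python
from collections import defaultdict, deque


def is_rooted_dag(edges, root):
    """True if `edges` form a DAG in which every node is reachable from `root`."""
    succ = defaultdict(set)
    indeg = defaultdict(int)
    nodes = {root}
    for u, v in edges:
        nodes.update((u, v))
        if v not in succ[u]:
            succ[u].add(v)
            indeg[v] += 1

    # Rooted: breadth-first search from the root must reach every node.
    seen = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in succ[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    if seen != nodes:
        return False

    # Acyclic: Kahn's algorithm removes every node only if there is no cycle.
    remaining = {n: indeg[n] for n in nodes}
    queue = deque(n for n in nodes if remaining[n] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for v in succ[u]:
            remaining[v] -= 1
            if remaining[v] == 0:
                queue.append(v)
    return removed == len(nodes)


diamond = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(is_rooted_dag(diamond, "a"))
```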

Ontological Search • Two sub-problems – • Ontology Description and Selection • Merging and Scoring

Ontology Description and Selection • Goal: Selection of a subset of relevant ontologies • Can be modeled as P(O,q) • Difficult with constant updates to ontologies • Use of inference engines and logic mechanisms • Evaluate the relative utility of different ontologies by comparing the results generated for a given input query. • Comparison against a gold set of queries. • Time-consuming process • The paper uses a parameter to model the general accuracy of a given resource. • Use of machine learning algorithms such as expectation maximization

Merging • Reduces problem of merging ambiguous concepts • Primary goal is to find complementary information in the results • Makes the result more complete • Involves inexact graph matching and maximum common sub-graph problems • When dealing with non-isomorphic graphs • NP-Complete problem

Merging • Isomorphic graphs

Merging • Graph similarity • Cost Based Distance • Use of edit operations • Feature Based Distance • Use a set of invariants established from the graph structural description • Maximum Common Subgraph • Maximum clique detection

Merging • Localized Confidence Boosting • Confidence is indicative of the reliability of the association. • Graphs are broken into tuples (cx, cy, r) and merged if the tuples are similar. • Confidence is boosted when merging, using a soft OR.
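
The soft OR formula itself did not survive on this slide. A common form, assumed here and not confirmed by the source, is the probabilistic OR c = 1 - (1 - c1)(1 - c2), which raises confidence whenever two sources agree and never exceeds 1. The sketch below applies it to tuples that match exactly; the approximate tuple matching described in the talk is left out for brevity, and all names are hypothetical.

```python
def soft_or(c1, c2):
    """Probabilistic OR of two confidences in [0, 1] (assumed form)."""
    return 1.0 - (1.0 - c1) * (1.0 - c2)


def merge_tuples(records):
    """Merge (cx, cy, r, confidence) records that share the same (cx, cy, r).

    Exact key matching is a simplification for this sketch.
    """
    merged = {}
    for cx, cy, r, conf in records:
        key = (cx, cy, r)
        merged[key] = soft_or(merged[key], conf) if key in merged else conf
    return merged


records = [
    ("dog", "mammal", "is-a", 0.6),  # from one ontology
    ("dog", "mammal", "is-a", 0.5),  # same association from another ontology
    ("cat", "mammal", "is-a", 0.7),  # seen only once, confidence unchanged
]
print(merge_tuples(records))
```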

Merging • Tuple Similarity • Based on a linear combination of edge similarity and concept similarity • Uses Q-gram distance for comparing concepts or relations
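
A rough sketch of such a tuple similarity is given below. Several details are assumptions, since the slide does not give them: q = 2, padding with '#', normalizing the q-gram distance into a similarity in [0, 1], averaging the two concept similarities, and the weights 0.7 and 0.3.

```python
from collections import Counter


def qgrams(s, q=2):
    """Multiset of q-grams of s, padded with '#' at both ends."""
    padded = "#" * (q - 1) + s.lower() + "#" * (q - 1)
    return Counter(padded[i:i + q] for i in range(len(padded) - q + 1))


def qgram_similarity(s, t, q=2):
    """1 minus the q-gram distance, normalized by the total q-gram count."""
    gs, gt = qgrams(s, q), qgrams(t, q)
    total = sum(gs.values()) + sum(gt.values())
    if total == 0:
        return 1.0
    distance = sum(abs(gs[g] - gt[g]) for g in set(gs) | set(gt))
    return 1.0 - distance / total


def tuple_similarity(t1, t2, w_concept=0.7, w_edge=0.3):
    """Linear combination of concept and edge similarity (weights assumed)."""
    (x1, y1, r1), (x2, y2, r2) = t1, t2
    concept_sim = (qgram_similarity(x1, x2) + qgram_similarity(y1, y2)) / 2
    edge_sim = qgram_similarity(r1, r2)
    return w_concept * concept_sim + w_edge * edge_sim


print(tuple_similarity(("dog", "mammal", "is-a"), ("dogs", "mammals", "isa")))
```

Two tuples would then be merged when this score exceeds some threshold; the threshold value is a tuning choice not stated on the slide.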

Scoring • Score the outcome of each operator before computing the final score • Each operator focuses on either precision or recall • Precision operator: relation • Recall operator: similarity • Precision: fraction of retrieved results that are relevant • Recall: fraction of relevant instances that are retrieved

Scoring • Precision = (relevant instances in outcome) / (total outcome) • Recall = (relevant instances in outcome) / (relevant results)
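
These two ratios can be computed directly from the set of retrieved results and the set of relevant results. The helper below is a minimal sketch; the function name and the choice to return 0.0 for an empty denominator are assumptions.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of results."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant instances in the outcome
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# 4 results retrieved, 2 of them relevant, 3 relevant results overall.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "b", "e"])
print(p, r)
```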

Scoring • Precision scoring metric • Recall scoring metric

Results • Experimental Setup: Type Checking • Ontologies used: WordNet and ThoughtTreasure • 9558 pairs from the Javelin question answering system in TREC QA • A gold standard was created for a subset of the full set of pairs in order to test accuracy

Results • Improved confidence after merging • Recall • Precision and recall

Related Work • Different ways to approach the same problem • FCA-merge algorithm • IF-Map method • PROMPT system • SWOOGLE • DRAGO

Challenges and Future Work • The current approach is not robust when relations in different ontologies differ significantly • Compare the structures in which the two concepts occur to determine similarity • Ontology description is difficult when ontologies change constantly • Future Work: Model an ontology by using random queries to determine its domain

Conclusions • Approach discussed here presents several benefits over full merge • Helps mitigate the issue of dynamic ontologies • Establishes a parallel to federated search

Questions??
