Ontology Evaluation and Ranking using OntoQA • By Samir Tartir and I. Budak Arpinar • Department of Industrial Engineering, Park Jihye
Why “OntoQA”? • More and more ontologies are being introduced • It is difficult to find a good ontology related to the user’s work • Tools are needed for evaluating and ranking ontologies • OntoQA provides a flexible technique to rank ontologies based on the user’s content and relevance • OntoQA is the first approach that evaluates ontologies using their instances as well as their schemas
Contents • Architecture • Terminology • The Metrics • Schema Metrics • Instance Metrics • Ontology Score Calculation • Experiments and Evaluation • Conclusion
Architecture • Input Ontology • OntoQA calculates metric values
Architecture • Input Ontology and Keywords • OntoQA calculates metric values • Uses the metric values to evaluate the overall content of the ontology and obtain its relevance to the keywords • Uses WordNet to expand the keywords to include any related keywords that might exist in the ontology
Architecture • Input Keywords • OntoQA uses Swoogle to find the RDF and OWL ontologies among its top 20 results • OntoQA then evaluates each of the ontologies • OntoQA finally displays the list of ontologies ranked by their score
Terminology • Schema • A set of classes • A set of relationships • A set of class-ancestor pairs • Knowledgebase • A set of instances • A class instantiation function • A relationship instantiation function
Metrics • Two dimensions • Schema: ontology design and its potential for rich knowledge representation • Instances: placement of instance data and distribution of the data • Overall knowledgebase metrics • Class-specific metrics • Relationship-specific metrics
Schema Metrics (1) • Relationship Diversity (RD): shows whether the ontology is mostly a taxonomy or has diverse relationships, so that a user can pick whichever they prefer • If the RD value is close to 0, most of the relationships are inheritance relationships • If the RD value is close to 1, most of the relationships are non-inheritance relationships
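The RD formula did not survive on this slide. A minimal sketch, assuming RD is the share of non-inheritance relationships among all relationships in the schema (which matches the "close to 0" and "close to 1" readings above); the function name and the counts are illustrative only:

```python
def relationship_diversity(num_non_inheritance: int, num_inheritance: int) -> float:
    """Assumed form: non-inheritance relationships / all relationships."""
    total = num_non_inheritance + num_inheritance
    if total == 0:
        raise ValueError("schema has no relationships")
    return num_non_inheritance / total


# Hypothetical schemas: one mostly taxonomic, one with diverse relationships.
taxonomy_like = relationship_diversity(num_non_inheritance=2, num_inheritance=18)
diverse = relationship_diversity(num_non_inheritance=18, num_inheritance=2)
print(taxonomy_like, diverse)
```

A value near 0 for the first schema and near 1 for the second is what the slide's interpretation predicts.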
Schema Metrics (2) • Schema Deepness (SD): distinguishes a shallow ontology from a deep ontology • If the SD value is low, the ontology is deep and covers a specific domain in a detailed manner • If the SD value is high, the ontology is shallow and represents a wide range of general knowledge ?
Instance Metrics (1) Overall KB Metrics • Class Utilization (CU): indicates how the classes defined in the schema are being utilized in the knowledgebase • C’ is the set of populated classes • If the CU value is low, the knowledgebase does not have data that exemplifies all the knowledge that exists in the schema
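The CU formula is missing from the slide. A minimal sketch, assuming CU is the number of populated classes (the set C’) divided by the number of classes defined in the schema; the class names and instance lists are made up for illustration:

```python
def class_utilization(schema_classes, instances_by_class) -> float:
    """Assumed form: populated classes / classes defined in the schema."""
    if not schema_classes:
        raise ValueError("schema has no classes")
    # A class counts as populated when it has at least one instance.
    populated = {c for c in schema_classes if instances_by_class.get(c)}
    return len(populated) / len(schema_classes)


# Hypothetical knowledgebase: "Venue" is empty and "Topic" is absent.
schema_classes = {"Person", "Paper", "Venue", "Topic"}
instances_by_class = {"Person": ["alice", "bob"], "Paper": ["p1"], "Venue": []}
cu = class_utilization(schema_classes, instances_by_class)
print(cu)
```

Here only two of the four classes have instances, so half of the schema is not exemplified by any data.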
Instance Metrics (1) Overall KB Metrics • Cohesion (Coh): represents the number of connected components in the KB • Class Instance Distribution (CID): indicates how instances are spread across the classes in the schema • Computed as the standard deviation of the number of instances per class
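Both metrics on this slide can be sketched directly from their descriptions. The code below assumes that relationship instances are treated as undirected edges between instances and that the population standard deviation is meant (the slide does not say which variant); the data is illustrative:

```python
from statistics import pstdev


def cohesion(instances, links) -> int:
    """Number of connected components, treating each link as an undirected edge.

    Every instance named in `links` must also appear in `instances`.
    """
    neighbours = {i: set() for i in instances}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen = set()
    components = 0
    for start in instances:
        if start in seen:
            continue
        components += 1  # a new, not yet visited component
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return components


def class_instance_distribution(instances_per_class) -> float:
    """Standard deviation of the number of instances per class (population form assumed)."""
    return pstdev(instances_per_class)


coh = cohesion(["a", "b", "c", "d", "e"], [("a", "b"), ("b", "c"), ("d", "e")])
even = class_instance_distribution([10, 10, 10])
skewed = class_instance_distribution([1, 1, 28])
print(coh, even, skewed)
```

The even knowledgebase has a deviation of zero, while the skewed one, with the same total number of instances, has a large deviation.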
Instance Metrics (2) Class Specific Metrics • Class Connectivity (Conn): indicates the centrality of a class • NIREC(C) is the set of relationships that instances of the class have with instances of other classes
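A minimal sketch of Conn, assuming it is the size of NIREC(C) and that only relationship instances whose subject belongs to the class are counted (the slide does not state the direction). The triples and the `class_of` mapping are hypothetical:

```python
def class_connectivity(cls, class_of, triples) -> int:
    """Count relationship instances from instances of `cls` to instances of other classes."""
    return sum(
        1
        for subject, _relation, obj in triples
        if class_of[subject] == cls and class_of[obj] != cls
    )


# Hypothetical knowledgebase as (subject, relationship, object) triples.
class_of = {"alice": "Person", "bob": "Person", "p1": "Paper", "v1": "Venue"}
triples = [
    ("alice", "authored", "p1"),
    ("bob", "authored", "p1"),
    ("alice", "knows", "bob"),  # same class on both ends, so not counted
    ("p1", "publishedIn", "v1"),
]
print(class_connectivity("Person", class_of, triples))
```

A class with a high count under this reading is linked to many other parts of the knowledgebase, which is the sense of centrality the slide describes.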
Instance Metrics (2) Class Specific Metrics • Class Importance (Imp): indicates which parts of the ontology are considered focal and which parts are on the edge • Number of instances that belong to the inheritance subtree rooted at the class in the KB, compared to the total number of class instances in the KB
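The description above translates into a subtree sum divided by a total. A minimal sketch, assuming each instance is counted once under its most specific class; the hierarchy and counts are invented for illustration:

```python
def class_importance(cls, subclasses, instance_counts) -> float:
    """Instances in the inheritance subtree rooted at `cls` / all class instances."""
    subtree_total = 0
    visited = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current in visited:
            continue  # guards against a class reachable by two paths
        visited.add(current)
        subtree_total += instance_counts.get(current, 0)
        stack.extend(subclasses.get(current, []))
    total = sum(instance_counts.values())
    return subtree_total / total if total else 0.0


# Hypothetical hierarchy: direct subclasses per class, and direct instance counts.
subclasses = {"Publication": ["Article", "Book"], "Agent": ["Person"]}
instance_counts = {"Publication": 0, "Article": 60, "Book": 20, "Agent": 0, "Person": 20}
print(class_importance("Publication", subclasses, instance_counts))
print(class_importance("Person", subclasses, instance_counts))
```

In this example the "Publication" subtree holds most of the instances, so it would be read as a focal part of the ontology.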
Instance Metrics (2) Class Specific Metrics • Relationship Utilization (RU): reflects how the relationships defined for each class in the schema are being used at the instance level • Compares the set of distinct relationships used by instances of a class with the set of relationships the class has with other classes in the schema
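The symbols for the two sets were lost from this slide. A minimal sketch, assuming RU is the number of distinct relationships actually used by instances of the class divided by the number of relationships the schema defines for that class; all names and data are hypothetical:

```python
def relationship_utilization(cls, schema_relationships, class_of, triples) -> float:
    """Assumed form: distinct relationships used by instances / relationships defined."""
    defined = schema_relationships.get(cls, set())
    if not defined:
        return 0.0
    used = {
        relation
        for subject, relation, _obj in triples
        if class_of[subject] == cls and relation in defined
    }
    return len(used) / len(defined)


# Hypothetical schema: four relationships are defined for Person, two are used.
schema_relationships = {"Person": {"authored", "reviewed", "memberOf", "knows"}}
class_of = {"alice": "Person", "bob": "Person", "p1": "Paper"}
triples = [
    ("alice", "authored", "p1"),
    ("bob", "authored", "p1"),
    ("alice", "knows", "bob"),
]
ru = relationship_utilization("Person", schema_relationships, class_of, triples)
print(ru)
```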
Instance Metrics (3) Relationship-Specific Metrics • Relationship Importance (Imp): measures the relative importance of the current relationship as a percentage • Number of instances of the relationship in the KB, compared to the total number of property instances in the KB (RI)
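As a rough sketch of this ratio, the code below counts the instances of each relationship and divides by the total number of relationship instances. The triples are illustrative, and the result is a fraction rather than a percentage:

```python
from collections import Counter


def relationship_importance(triples) -> dict:
    """Share of all relationship instances that belongs to each relationship."""
    counts = Counter(relation for _subject, relation, _obj in triples)
    total = sum(counts.values())
    return {relation: n / total for relation, n in counts.items()}


# Hypothetical knowledgebase with three "authored" links and one "cites" link.
triples = [
    ("alice", "authored", "p1"),
    ("bob", "authored", "p1"),
    ("carol", "authored", "p2"),
    ("p2", "cites", "p1"),
]
shares = relationship_importance(triples)
print(shares)
```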
Ontology Score Calculation • The ontology is evaluated based on the entered keywords • The terms entered by the user are extended by adding any related terms • OntoQA determines the classes and relationships whose names contain any term of the extended set of terms • The overall metrics are aggregated to get an overall score for the ontology
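The steps above can be sketched as follows. This is an illustration under assumptions, not OntoQA's implementation: a small hard-coded dictionary stands in for WordNet, name matching is a case-insensitive substring test, and the aggregation is assumed to be a weighted sum. All names, metric values, and weights are hypothetical:

```python
# Hypothetical stand-in for the WordNet expansion step.
RELATED_TERMS = {"paper": {"article", "publication"}}


def expand_terms(keywords):
    """Extend the user's keywords with any related terms."""
    expanded = set()
    for keyword in keywords:
        keyword = keyword.lower()
        expanded.add(keyword)
        expanded |= RELATED_TERMS.get(keyword, set())
    return expanded


def matching_names(names, terms):
    """Class or relationship names that contain any term of the extended set."""
    return [n for n in names if any(t in n.lower() for t in terms)]


def overall_score(metrics, weights) -> float:
    """Assumed aggregation: weighted sum of metric values (missing weight = 0)."""
    return sum(weights.get(name, 0.0) * value for name, value in metrics.items())


terms = expand_terms(["Paper"])
matches = matching_names(["JournalArticle", "Person", "Publication", "Venue"], terms)
score = overall_score({"RD": 0.5, "CU": 0.8}, {"RD": 1.0, "CU": 2.0})
print(sorted(terms), matches, score)
```

The weights are where the user's preferences enter, so the same ontology can rank differently under different weight settings.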
Experiments and Evaluation • Compares the rankings of OntoQA, Swoogle’s OntoRank, and a group of expert users • OntoRank 1) • Similar to Google’s PageRank approach • Gives preference to popular ontologies • wPR(a) is a weighted PageRank variation 1) Finin T., et al. Swoogle: Searching for knowledge on the Semantic Web
Experiments and Evaluation • Problem of OntoRank 1) • If two copies of the same ontology are placed in two different locations and one of these locations is cited more than the other, OntoRank will rank the copy at the popular location higher than the other copy • OntoQA will give both copies the same ranking
Experiments and Evaluation • [Figures: ontology rankings by OntoQA, Swoogle, and users, (a) with balanced weights and (b) with a higher weight for schema size]
Conclusion • Differs from other approaches in that it is tunable and requires minimal user involvement • Considers both the schema and the instances of a populated ontology
Review • The ranking result depends heavily on the weights • It is difficult to decide on proper weights • The metrics are inconsistent: every metric has its own range => “same weight” doesn’t mean “same preference” • There are about 10 kinds of metrics, so there are too many possible weight combinations
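The point about ranges can be illustrated with a small sketch. The numbers are invented: one metric lies in [0, 1], the other is a raw class count, and both get the same weight in an assumed weighted-sum score:

```python
def weighted_score(metrics, weights) -> float:
    """Assumed aggregation: weighted sum of metric values."""
    return sum(weights[name] * value for name, value in metrics.items())


# Hypothetical ontologies; the values are not taken from the experiments.
ontology_a = {"relationship_diversity": 0.9, "schema_size": 50}
ontology_b = {"relationship_diversity": 0.1, "schema_size": 60}
equal_weights = {"relationship_diversity": 1.0, "schema_size": 1.0}

score_a = weighted_score(ontology_a, equal_weights)
score_b = weighted_score(ontology_b, equal_weights)

# Fraction of ontology A's score that comes from schema size alone.
size_share_a = ontology_a["schema_size"] / score_a
print(score_a, score_b, size_share_a)
```

With equal weights, ontology B still ranks first because schema size supplies almost the entire score, even though A is far more diverse. Equal weights therefore do not express equal preference unless the metrics share a common scale.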