
Benchmarking Reasoners for Multi-Ontology Applications

Ameet N Chitnis, Abir Qasem and Jeff Heflin. 11 November 2007.


Presentation Transcript


  1. Benchmarking Reasoners for Multi-Ontology Applications Ameet N Chitnis, Abir Qasem and Jeff Heflin 11 November 2007

  2. Talk Organization • Motivation (a.k.a. why yet another benchmark?) and Influences • The Workload • Domain Ontologies, map ontologies, data sources, queries • The Metrics • How do we generate things? • Domain ontology generation • Map ontology Generation • Parameters & Relationships • Map Generator Algorithm • Data Source Generation • Query Generation • Sample Workload • Conclusion & Future Work

  3. Motivation As the Semantic Web matures … • OWL ontologies and data from various organizations will gain commercial value • Alignment of different ontologies, and integration of the data that commit to them, will be a viable business enterprise • Quite possibly we will have post-development alignments between ontologies (alignment tools, third parties, etc.) • Currently DBPedia and Hawkeye provide some form of third-party alignments (non-commercial) • We wanted to develop a benchmark that reflects the above reality

  4. Influences • Lehigh University Benchmark (LUBM) by Y. Guo, Z. Pan, and J. Heflin. (ISWC 2004) • Extended LUBM (can support both OWL Lite and OWL DL) by L. Ma, Y. Yang, Z. Qiu, G. Xie and Y. Pan. (ESWC 2006) • Statistical analysis of the available Semantic Web ontologies by C. Tempich and R. Volz. (ISWC 2003) • Benchmarking DL systems by I. Horrocks and P. Patel-Schneider. (DL Workshop 1998) • Internet topology generator by J. Winick and S. Jamin. (University of Michigan)

  5. The Workload (1) • Domain ontologies • “Simple” ontologies. We can control number of classes, properties, and branching factor of the hierarchies • Data sources • We can control number of data sources that commit to a given ontology, number of classes that will have individuals, number of properties that will connect those individuals, number of triples. • Queries • Extensional queries in SPARQL. • We can control the mix of classes, properties, individuals • We can control selectivity

  6. The Workload (2) • Map ontologies: main focus of this work • In our work a map ontology consists solely of “mapping” axioms that establish an alignment between two domain ontologies • This separation is just for convenience of generation and analysis. Semantically, map ontologies are not much different from domain ontologies • Macro level: • We generate a directed acyclic graph of domain ontologies • Every edge represents a map ontology • Micro level: • We can control the types of axioms that are used to map two domain ontologies

  7. Metrics

  8. Domain Ontology Generation • Simple taxonomy • The number of terms to generate is drawn from a normal distribution with a user-supplied mean • Given a branching factor and a number of terms, we generate a balanced tree • Complex axioms are left for map ontologies
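
A minimal sketch of the taxonomy step described on this slide, in Python. It is not the authors' generator: the function names, the `C0`, `C1`, ... naming scheme, and the standard-deviation parameter are all assumptions made for illustration (the slide only mentions a user-supplied mean).

```python
import random


def sample_term_count(mean, stddev, rng):
    """Draw a term count from a normal distribution (stddev is an assumed parameter)."""
    return max(1, round(rng.gauss(mean, stddev)))


def generate_taxonomy(num_terms, branching_factor, prefix="C"):
    """Return a {term: parent} map for a balanced tree filled level by level.

    Term i (i >= 1) gets parent (i - 1) // branching_factor, the usual
    array layout of a complete k-ary tree. The root has parent None.
    """
    if num_terms < 1 or branching_factor < 1:
        raise ValueError("num_terms and branching_factor must be positive")
    parents = {f"{prefix}0": None}
    for i in range(1, num_terms):
        parents[f"{prefix}{i}"] = f"{prefix}{(i - 1) // branching_factor}"
    return parents


if __name__ == "__main__":
    rng = random.Random(42)
    n = sample_term_count(mean=20, stddev=5, rng=rng)
    for term, parent in generate_taxonomy(n, branching_factor=3).items():
        print(term, "subClassOf", parent)
```

Each `term -> parent` pair corresponds to one subclass axiom, which keeps the domain ontology a plain taxonomy as the slide intends.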

  9. Map Ontology Generation Inputs • Number of ontologies we want in the workload (onts) • Average out-degree (referred to as out below) • Diameter
     The number of maps created is approximately: maps ≈ (onts − terminal onts) × out
     However, we do not have the number of terminal ontologies as a parameter. A reasonable approximation is: terminal onts ≈ (onts × out) / (diameter + out)
     Substituting, we have: maps ≈ (onts × out × diameter) / (diameter + out)
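
The two approximations on this slide can be checked numerically. The sketch below is plain arithmetic in Python; the function names are hypothetical and the input values are made up for illustration.

```python
def estimate_terminal_ontologies(onts, out, diameter):
    """Approximate number of terminal ontologies: (onts * out) / (diameter + out)."""
    return onts * out / (diameter + out)


def estimate_map_count(onts, out, diameter):
    """Approximate number of maps: (onts * out * diameter) / (diameter + out)."""
    return onts * out * diameter / (diameter + out)


if __name__ == "__main__":
    onts, out, diameter = 100, 2, 8
    terminals = estimate_terminal_ontologies(onts, out, diameter)
    maps = estimate_map_count(onts, out, diameter)
    print(terminals)  # 20.0
    print(maps)       # 160.0
    # Same value via the first form: (onts - terminal onts) * out
    print((onts - terminals) * out)  # 160.0
```

The last line confirms that substituting the terminal estimate into (onts − terminal onts) × out gives the closed form.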

  10. Map Generator Algorithm
     1. Determine and mark the number of terminal nodes
     2. Create a path of diameter length
     3. Choose targets for every non-terminal ontology. Constraints: • No cycles • No path longer than the diameter • Non-terminal nodes should not become terminal
     4. Create the corresponding map ontologies by generating mapping axioms • Update the parameters of the source and the target
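
One way to satisfy the three constraints of step 3 is sketched below in Python. This is an illustrative stand-in, not the authors' algorithm: it assigns every ontology a level between 0 and the diameter and only draws edges from a lower level to a higher one, which rules out cycles and over-long paths by construction. The level trick, the fixed per-node out-degree, and all names are assumptions. Step 4 (axiom generation) is omitted.

```python
import random


def generate_map_graph(onts, out, diameter, seed=0):
    """Return (edges, terminal_nodes) for a DAG whose longest path equals diameter."""
    if out < 1 or diameter < 1 or onts < diameter + 1:
        raise ValueError("need out >= 1, diameter >= 1, onts >= diameter + 1")
    rng = random.Random(seed)

    # Step 1: estimate the number of terminal nodes, clamped to a feasible range.
    terminals = round(onts * out / (diameter + out))
    terminals = max(1, min(terminals, onts - diameter))

    # Step 2: nodes 0..diameter form the path; node i sits at level i.
    level = {i: i for i in range(diameter + 1)}
    edges = {(i, i + 1) for i in range(diameter)}
    rest = list(range(diameter + 1, onts))
    terminal_nodes = {diameter} | set(rest[: terminals - 1])
    for node in rest:
        if node in terminal_nodes:
            level[node] = rng.randint(1, diameter)
        else:
            level[node] = rng.randint(0, diameter - 1)

    # Step 3: each non-terminal picks targets at strictly higher levels.
    for node in range(onts):
        if node in terminal_nodes:
            continue
        have = sum(1 for s, _ in edges if s == node)
        candidates = [t for t in range(onts)
                      if level[t] > level[node] and (node, t) not in edges]
        want = max(0, min(out - have, len(candidates)))
        for target in rng.sample(candidates, want):
            edges.add((node, target))
    return sorted(edges), terminal_nodes


def longest_path(edges, onts):
    """Length (in edges) of the longest path in an acyclic edge list."""
    succ = {n: [] for n in range(onts)}
    for s, t in edges:
        succ[s].append(t)
    memo = {}

    def depth(n):
        if n not in memo:
            memo[n] = max((1 + depth(t) for t in succ[n]), default=0)
        return memo[n]

    return max(depth(n) for n in range(onts))


if __name__ == "__main__":
    edges, terminal_nodes = generate_map_graph(onts=20, out=2, diameter=4, seed=1)
    print(len(edges), "maps;", len(terminal_nodes), "terminal ontologies")
    print("longest path:", longest_path(edges, 20))
```

Because the end of the initial path sits at the top level, every non-terminal always has at least one legal target, so no non-terminal is left without an outgoing map.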

  11. Mapping axioms • Given two domain ontologies and a desired distribution of OWL constructors and restrictions • We choose terms from the domain ontologies and create an axiom that connects them • We can generate fairly complex axioms • E.g. O1:A ⊔ O1:B ⊑ ∃O2:P.O2:C ⊓ ∀O2:Q.O2:D • Currently the algorithm is restricted to generating axioms that keep the ontology within OWLII (a subset of OWL used by OBII; Qasem et al. 2007, ISWC NFR workshop) • But this is NOT a limitation of our approach

  12. Source Generation • Choose an ontology • Choose the number of classes for which to create individuals • Generate triples • We can either generate random individuals or • Use the domain and range information to connect the individuals with properties
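
The domain-and-range variant of source generation could look roughly like the Python sketch below. The data layout (a list of class names and a `{property: (domain, range)}` dictionary), the individual naming scheme, and every function and parameter name are assumptions for illustration only.

```python
import random


def generate_source(classes, properties, num_classes, individuals_per_class,
                    num_links, seed=0):
    """Return a list of (subject, predicate, object) triples for one data source.

    properties maps a property name to its (domain_class, range_class) pair.
    """
    rng = random.Random(seed)

    # Choose which classes get individuals, then create typed individuals.
    populated = rng.sample(classes, num_classes)
    members = {c: [f"{c}_ind{i}" for i in range(individuals_per_class)]
               for c in populated}
    triples = [(ind, "rdf:type", c) for c, inds in members.items() for ind in inds]

    # Only properties whose domain and range were both populated can be used.
    usable = [(p, d, r) for p, (d, r) in properties.items()
              if d in members and r in members]
    for _ in range(num_links if usable else 0):
        prop, dom, rng_class = rng.choice(usable)
        triples.append((rng.choice(members[dom]), prop, rng.choice(members[rng_class])))
    return triples


if __name__ == "__main__":
    source = generate_source(
        classes=["A", "B", "C"],
        properties={"p": ("A", "B"), "q": ("B", "C")},
        num_classes=2, individuals_per_class=3, num_links=4,
    )
    for triple in source:
        print(triple)
```

Respecting domain and range keeps every generated property triple consistent with the ontology the source commits to.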

  13. Query Generation SPARQL Queries (SELECT) • Choose the first predicate from the classes of an ontology. • We bias the next predicate with a 75% chance of being one of the properties from the ontology. • We make use of shared variables in order to implement “joins”. A shared variable is equally likely to be in the subject position or the object position. • For single-predicate queries all the variables are distinguished. For others, on average two-thirds of the variables are distinguished and the rest are non-distinguished. • There is a 10% chance of a constant.
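
The probabilities listed on this slide can be read as the following Python sketch. Several details are assumptions, since the slide does not spell them out: the 10% constant is applied to the non-shared position of a property pattern, the two-thirds share of distinguished variables is applied by rounding rather than on average, and the function name, argument layout, and `example.org` IRIs are hypothetical.

```python
import random


def generate_query(classes, properties, individuals, num_predicates, seed=0):
    """Build a SPARQL SELECT query string from IRIs of classes, properties, individuals."""
    rng = random.Random(seed)
    variables = ["?v0"]
    # The first predicate is always a class membership pattern.
    patterns = [f"?v0 a <{rng.choice(classes)}> ."]

    for _ in range(num_predicates - 1):
        shared = rng.choice(variables)  # reuse a variable to form a join
        if rng.random() < 0.75:  # 75%: a property pattern
            if rng.random() < 0.10:  # 10%: a constant in the other position
                other = f"<{rng.choice(individuals)}>"
            else:
                other = f"?v{len(variables)}"
                variables.append(other)
            prop = rng.choice(properties)
            if rng.random() < 0.5:  # shared variable as subject or object
                patterns.append(f"{shared} <{prop}> {other} .")
            else:
                patterns.append(f"{other} <{prop}> {shared} .")
        else:  # otherwise another class pattern on a shared variable
            patterns.append(f"{shared} a <{rng.choice(classes)}> .")

    if num_predicates == 1:
        selected = variables  # all variables are distinguished
    else:
        keep = max(1, round(2 * len(variables) / 3))
        selected = rng.sample(variables, keep)
    return "SELECT " + " ".join(selected) + " WHERE { " + " ".join(patterns) + " }"


if __name__ == "__main__":
    ns = "http://example.org/"
    print(generate_query([ns + "A", ns + "B"], [ns + "p", ns + "q"],
                         [ns + "i1"], num_predicates=3, seed=5))
```

Variables left out of the SELECT clause play the role of the non-distinguished variables mentioned on the slide.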

  14. A Sample Workload • We used the benchmark to evaluate OBII, a distributed query answering system • We compared it with a “baseline” system, which was essentially a KAON2 wrapper • Some characteristics of the workload • 50% of classes had individuals • On average we generated 75 triples per source • We generated configurations as large as 100 domain ontologies with about 1000 data sources

  15. Conclusion and Future Work • A focus on a workload that accounts for post-development alignments • Micro level: controlling mapping axioms • Macro level: controlling how ontologies are mapped • Domain ontology synthesis can be expanded to support complex axioms • Experiment with different characteristics • Hubs and Authorities (different in-degree / out-degree patterns)
