Instance Data Evaluation for Semantic Web-Based Knowledge Management Systems
Jiao Tao (1), Li Ding (2), Deborah L. McGuinness (3)
Tetherless World Constellation, Rensselaer Polytechnic Institute, Troy, NY, USA
(1) PhD Student; (2) Postdoctoral Research Fellow; (3) Tetherless World Senior Constellation Professor
Semantic Web-based KMS
• The Semantic Web is a next generation of the Web that formally defines the relations among terms with ontologies, gives well-defined meaning to information, and enables machines to comprehend the content on the Web (Berners-Lee, Hendler, & Lassila 2001).
• Semantic Web-based Knowledge Management Systems enable the next generation of KMS.
  • They apply Semantic Web technologies to improve on traditional knowledge-management approaches or to realize emerging knowledge-services requirements (Davies, Lytras, & Sheth 2007).
  • Schemas are represented as ontologies (O), and data is represented as SW instance data (D).
Data Evaluation in SW-based KMS: State of the Art
• In SW-based KMS, instance data often outweighs ontology data by orders of magnitude (Ding & Finin 2006).
• However, most data evaluation work (Rocha et al. 1998) focuses on ontology evaluation, i.e., checking whether the ontologies correctly describe the domain of interest.
• There is very little, if any, work on evaluating the conformance between ontologies and instance data.
Instance Data Evaluation in SW-based KMS
1. Create KMS schema as ontologies O (including embedded semantic expectations)
2. Acquire KMS ontologies
3. Instantiate KMS ontologies O
4. Publish KMS instance data D (Web)
Questions asked of the published data:
• No syntax errors?
• Do semantic expectations match between O and D?
Semantic expectation mismatches: (i) logical inconsistencies; (ii) potential issues
Generic Evaluation Process (GEP)
• Load instance data D
  • Does loading fail?
• Parse instance data D
  • Is D syntactically correct?
• Load referenced ontologies O = {O1, O2, …}
  • Is each Oi reachable, where Oi defines the terms used by D?
• Inspect logical inconsistencies in D
  • Is each Oi logically consistent?
  • Merge all consistent referenced ontologies into O'
  • Are D + O' logically consistent?
• Inspect potential issues in D
  • Compute DC = INF(D, O'), which includes all triples in D and O' and all inferred sub-class/sub-property relations
  • Is there any potential issue in D?
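The process above can be read as a pipeline in which each step may stop or narrow the next one. What follows is a minimal Python sketch of that reading, not the TW OIE implementation. Every callable passed to `evaluate` (`parse`, `referenced`, `load_ontology`, `consistent`, `find_issues`) is a hypothetical placeholder for a parser, ontology loader, reasoner, or issue detector, and the choice to stop once D + O' is inconsistent is an assumption the slides do not confirm.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]
Graph = Set[Triple]


@dataclass
class Report:
    """Findings collected while walking through the evaluation steps."""
    errors: List[str] = field(default_factory=list)
    issues: List[str] = field(default_factory=list)


def evaluate(
    raw: str,
    parse: Callable[[str], Graph],                 # hypothetical; raises ValueError on bad syntax
    referenced: Callable[[Graph], Iterable[str]],  # hypothetical; ontology URIs whose terms D uses
    load_ontology: Callable[[str], Graph],         # hypothetical; raises OSError if unreachable
    consistent: Callable[[Graph], bool],           # hypothetical reasoner call
    find_issues: Callable[[Graph], List[str]],     # hypothetical potential-issue detectors
) -> Report:
    report = Report()
    # Load and parse D; a syntax error ends the evaluation.
    try:
        d = parse(raw)
    except ValueError as exc:
        report.errors.append(f"syntax error: {exc}")
        return report
    # Load each referenced ontology Oi; keep only reachable, consistent ones.
    merged: Graph = set()
    for uri in referenced(d):
        try:
            o_i = load_ontology(uri)
        except OSError:
            report.errors.append(f"unreachable ontology: {uri}")
            continue
        if not consistent(o_i):
            report.errors.append(f"inconsistent ontology: {uri}")
            continue
        merged |= o_i  # accumulates O'
    # Check D + O' for logical inconsistency (assumption: stop if inconsistent).
    combined = d | merged
    if not consistent(combined):
        report.errors.append("D + O' is logically inconsistent")
        return report
    # Potential issues; computing DC = INF(D, O') is delegated to find_issues here.
    report.issues.extend(find_issues(combined))
    return report
```

Passing the steps in as callables keeps the sketch independent of any particular RDF library or reasoner; in practice each would wrap a real parser or inference engine.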
Potential Issues
• Unexpected Individual Type (UIT) Issue
  • rdfs:domain
  • rdfs:range
  • owl:allValuesFrom
• Redundant Individual Type (RIT) Issue
• Non-specific Individual Type (NSIT) Issue
• Missing Property Value (MPV) Issue
  • owl:cardinality
  • owl:minCardinality
• Excessive Property Value (EPV) Issue
  • owl:cardinality
  • owl:maxCardinality
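To make one of these categories concrete, here is a hedged Python sketch of a UIT check driven by rdfs:range. It works on plain (subject, predicate, object) tuples rather than a real RDF store, and it encodes one possible reading of the issue: an object is flagged when it has declared types but none of them is the expected class or a subclass of it. That definition, the `ex:` names, and the decision to skip untyped objects are assumptions for illustration; the cited papers give the authors' own definitions.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
RANGE = "rdfs:range"


def superclasses(cls: str, triples: Set[Triple]) -> Set[str]:
    """Return cls plus every class reachable through rdfs:subClassOf."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def unexpected_range_types(triples: Set[Triple]) -> List[Triple]:
    """Flag (s, p, o) where p has rdfs:range C and o is typed, but not as C or a subclass of C."""
    ranges: Dict[str, Set[str]] = {}
    for s, p, o in triples:
        if p == RANGE:
            ranges.setdefault(s, set()).add(o)
    flagged = []
    for s, p, o in sorted(triples):
        expected_classes = ranges.get(p)
        if not expected_classes:
            continue  # no range expectation on this property
        declared = {t for (x, q, t) in triples if x == o and q == RDF_TYPE}
        if not declared:
            continue  # assumption: untyped objects are not reported
        inferred = set().union(*(superclasses(t, triples) for t in declared))
        if any(c not in inferred for c in expected_classes):
            flagged.append((s, p, o))
    return flagged


# Hypothetical example data (ontology and instance triples together).
data = {
    ("ex:hasMaker", RANGE, "ex:Winery"),
    ("ex:Chateau", SUBCLASS, "ex:Winery"),
    ("ex:w1", "ex:hasMaker", "ex:m1"),
    ("ex:m1", RDF_TYPE, "ex:Chateau"),
    ("ex:w2", "ex:hasMaker", "ex:m2"),
    ("ex:m2", RDF_TYPE, "ex:Person"),
}
print(unexpected_range_types(data))
```

The `superclasses` helper stands in for the inferred sub-class relations that the GEP computes once as DC; a real checker would reuse that closure instead of recomputing it per individual.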
Graph Patterns of Potential Issues
• Example: Missing Property Value Issue
  • Make sure all instances of wine have a maker specified
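The wine example can be expressed as a pattern over triples: find every instance of the class that has fewer values for the property than expected. The Python sketch below is an illustration only. The graph pattern from the original slide is not reproduced here, and the names `ex:Wine` and `ex:hasMaker` are hypothetical.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]


def missing_property_values(
    triples: Set[Triple], cls: str, prop: str, min_count: int = 1
) -> List[str]:
    """Return instances of cls that have fewer than min_count distinct values for prop."""
    instances = sorted({s for s, p, o in triples if p == "rdf:type" and o == cls})
    flagged = []
    for inst in instances:
        values = {o for s, p, o in triples if s == inst and p == prop}
        if len(values) < min_count:
            flagged.append(inst)
    return flagged


# Hypothetical instance data: only ex:w1 names a maker.
wines = {
    ("ex:w1", "rdf:type", "ex:Wine"),
    ("ex:w1", "ex:hasMaker", "ex:m1"),
    ("ex:w2", "rdf:type", "ex:Wine"),
    ("ex:w3", "rdf:type", "ex:Wine"),
    ("ex:w3", "ex:hasColor", "ex:Red"),
}
print(missing_property_values(wines, "ex:Wine", "ex:hasMaker"))
```

The `min_count` parameter corresponds to the number in an owl:minCardinality or owl:cardinality restriction. The check treats the data as closed, which is why the slides call the result a potential issue instead of a logical inconsistency: under open-world semantics, a missing maker could simply be unstated.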
SPARQL Solutions for Potential Issue Detection
• Example: MPV Issue
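The query shown on the original slide did not survive extraction. As a stand-in, the sketch below builds one common SPARQL formulation of the MPV check, using OPTIONAL with `!bound` to express "no value present". The function name `mpv_query` and the example IRIs are hypothetical, and the query is an assumption about how such a check could be written, not the authors' query.

```python
def mpv_query(cls_iri: str, prop_iri: str) -> str:
    """Build a SPARQL SELECT listing instances of cls_iri that have no value for prop_iri.

    The IRIs are inserted as given; this sketch does no escaping or validation.
    """
    return (
        "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n"
        "SELECT DISTINCT ?x WHERE {\n"
        f"  ?x rdf:type <{cls_iri}> .\n"
        f"  OPTIONAL {{ ?x <{prop_iri}> ?v }}\n"
        "  FILTER (!bound(?v))\n"
        "}"
    )


# Hypothetical IRIs for the wine example.
print(mpv_query("http://example.org/wine#Wine", "http://example.org/wine#hasMaker"))
```

The OPTIONAL plus `!bound` idiom works in SPARQL 1.0; SPARQL 1.1 engines also accept `FILTER NOT EXISTS { ?x <prop> ?v }` for the same purpose. This form covers only the "at least one value" case; a minimum cardinality above one would need counting.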
Implementation and Evaluation
• Demo: TW OIE Service http://onto.rpi.edu/demo/oie/
• Comparative experiment results
Status, Current and Future Work
• TW OIE is implemented, and the service is provided as part of the Inference Web (IW) Explanation Framework (McGuinness and Pinheiro da Silva, 2004)
• Ongoing work: characterize and detect potential (integrity) issues in instance data
  • An Initial Investigation on Evaluating Semantic Web Instance Data (WWW 2008)
  • Characterizing and Detecting Integrity Issues in OWL Instance Data (OWLED 2008 EU)
  • Integrity Constraint Modeling and Checking for Semantic Web Data: An Answer Set Programming-based Approach (submitted to ESWC 2009)
• Future work:
  • Formal representation for expressive integrity constraints
  • Automatic updates to data to fix problems
  • Enhanced explanation capabilities
References
• T. Berners-Lee, J. Hendler, and O. Lassila, The Semantic Web: A New Form of Web Content that Is Meaningful to Computers Will Unleash a Revolution of New Possibilities, Scientific American, pp. 34-43, 2001.
• J. Davies, M. Lytras, and A. Sheth, Semantic-Web-Based Knowledge Management, IEEE Internet Computing, Vol. 11, No. 5, pp. 14-16, 2007.
• L. Ding and T. Finin, Characterizing the Semantic Web on the Web, ISWC, pp. 242-257, 2006.
• R. A. Rocha, S. M. Huff, P. J. Haug, D. A. Evans, and B. E. Bray, Evaluation of a Semantic Data Model for Chest Radiology: Application of a New Methodology, Methods of Information in Medicine, Vol. 37, No. 4-5, pp. 477-490, 1998.
• D. L. McGuinness and P. Pinheiro da Silva, Explaining Answers from the Semantic Web: The Inference Web Approach, Journal of Web Semantics, Vol. 1, No. 4, pp. 397-413, 2004.
Inference Web Explanation Architecture
• Semantic Web based infrastructure
• PML is an explanation interlingua
  • Represent knowledge provenance (who, where, when…)
  • Represent justifications and workflow traces across system boundaries
• Inference Web provides a toolkit for data management and visualization
[Figure: architecture diagram. Sources feeding the Proof Markup Language (PML): OWL-S/BPEL and SDS (trace of web service discovery); Learners (learning conclusions); KIF/N3 and JTP/CWM (theorem prover/rules); SPARK-L and SPARK (trace of task execution); Text Analytics and UIMA (trace of information extraction). PML layers: Trust, Justification, Provenance. Toolkit on the WWW: IWTrust (trust computation), IW Explainer/Abstractor (end-user friendly visualization), IWBrowser (expert friendly visualization), IWSearch (search engine based publishing), IWBase (provenance registration).]
Views of Explanation: Global View
[Figure: views of an explanation (in PML): global, focused, filtered, abstraction, discourse, trust, provenance]
• Explanation as a graph
• Customizable browser options
  • Proof style
  • Sentence format
  • Lens magnitude
  • Lens width
• More information
  • Provenance metadata
  • Source PML
  • Proof statistics
  • Variable bindings
  • Link to tabulator
  • …
McGuinness – Microsoft eScience – December 8, 2008
Views of Explanation: Provenance View
[Figure: views of an explanation (in PML): global, focused, filtered, abstraction, discourse, trust, provenance]
• Source metadata: name, description, …
• Source-Usage metadata: which fragment of a source has been used, and when
Links
• Tetherless World Ontology Instance Evaluator (TW OIE): http://onto.rpi.edu/demo/oie/
• Inference Web: inference-web.org
• Semantic eScience class link (with book to follow): http://tw.rpi.edu/wiki/Semantic_e-Science