This paper discusses the importance of instance data evaluation within Semantic Web-based Knowledge Management Systems (KMS). It highlights that while most data evaluation efforts focus on ontology assessment, little attention is given to the conformance between ontologies and instance data. The authors propose a Generic Evaluation Process (GEP) to systematically evaluate instance data against defined ontologies, addressing potential integrity issues. The research also explores techniques for detecting logical inconsistencies and ensuring data quality in KMS, setting the groundwork for future improvements in data management and integrity in the Semantic Web.
Instance Data Evaluation for Semantic Web-Based Knowledge Management Systems • Jiao Tao (PhD Student), Li Ding (Postdoctoral Research Fellow), Deborah L. McGuinness (Tetherless World Senior Constellation Professor) • Tetherless World Constellation, Rensselaer Polytechnic Institute, Troy, NY, USA
Semantic Web-based KMS • The Semantic Web is the next generation of the Web: it formally defines the relations among terms with ontologies, gives well-defined meaning to information, and enables machines to comprehend the content on the Web (Berners-Lee, Hendler, & Lassila 2001). • Semantic Web-based Knowledge Management Systems enable the next generation of KMS • They apply Semantic Web technologies to improve on traditional knowledge-management approaches or to realize emerging knowledge-services requirements (Davies, Lytras, & Sheth 2007) • Schemas are represented as ontologies (O), and data is represented as Semantic Web (SW) instance data (D)
Data Evaluation in SW-based KMS: State of the Art • In SW-based KMS, instance data often accounts for orders of magnitude more data than ontologies (Ding & Finin 2006). • However, most data evaluation work (Rocha et al. 1998) focuses on ontology evaluation, i.e., checking whether the ontologies correctly describe the domain of interest. • There is very little, if any, work on evaluating the conformance between ontologies and instance data.
Instance Data Evaluation in SW-based KMS • [Figure: KMS data life cycle on the Web] 1. Create KMS schema as ontologies O (including embedded semantic expectations) 2. Acquire KMS ontologies 3. Instantiate KMS ontologies O 4. Publish KMS instance data D • Checks on D: No syntax errors? Do semantic expectations match between O and D? • Semantic expectation mismatches: (i) Logical inconsistencies (ii) Potential issues
Generic Evaluation Process (GEP) • Load instance data D • Does loading fail? • Parse instance data D • Is D syntactically correct? • Load referenced ontologies O = {O1, O2, …}, where each Oi defines terms used by D • Is Oi reachable? • Inspect logical inconsistencies in D • Is Oi logically consistent? • Merge all consistent referenced ontologies into O' • Are D + O' logically consistent? • Inspect potential issues in D • Compute DC = INF(D, O'), which includes all triples in D and O' and all inferred sub-class/sub-property relations • Is there any potential issue in D?
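The last GEP step computes DC = INF(D, O'). A minimal Python sketch of that step follows, assuming triples are plain (subject, predicate, object) tuples of prefixed names and that "inferred sub-class/sub-property relations" means the transitive closure of rdfs:subClassOf and rdfs:subPropertyOf. The function names and the ex: terms are hypothetical and are not taken from the TW OIE implementation.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"


def transitive_closure(pairs):
    """Return the transitive closure of a set of (child, parent) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def inf(data, ontology):
    """Sketch of DC = INF(D, O'): every triple of D and O' plus the
    inferred sub-class and sub-property relations (assumed semantics)."""
    triples = set(data) | set(ontology)
    inferred = set(triples)
    for predicate in (SUBCLASS, SUBPROP):
        pairs = {(s, o) for (s, p, o) in triples if p == predicate}
        inferred |= {(s, predicate, o) for (s, o) in transitive_closure(pairs)}
    return inferred


# Hypothetical merged ontology O' and instance data D.
O_MERGED = {
    ("ex:RedWine", SUBCLASS, "ex:Wine"),
    ("ex:Wine", SUBCLASS, "ex:PotableLiquid"),
}
D = {("ex:w1", RDF_TYPE, "ex:RedWine")}

DC = inf(D, O_MERGED)
```

Here DC keeps the three asserted triples and adds the inferred link from ex:RedWine to ex:PotableLiquid, so later issue checks can compare declared and expected types without re-deriving the class hierarchy.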
Potential Issues • Unexpected Individual Type (UIT) Issue • rdfs:domain • rdfs:range • owl:allValuesFrom • Redundant Individual Type (RIT) Issue • Non-specific Individual Type (NSIT) Issue • Missing Property Value (MPV) Issue • owl:cardinality • owl:minCardinality • Excessive Property Value (EPV) Issue • owl:cardinality • owl:maxCardinality
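As an illustration of the first category, the sketch below flags a possible Unexpected Individual Type issue from rdfs:domain and rdfs:range. It assumes one simplified reading of UIT: an individual is flagged when the class expected by a property's domain or range is not among its declared types or their superclasses. The slides do not spell out the exact definition, so that reading, the helper names, and the ex: terms are all assumptions.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"


def superclasses(cls, triples):
    """Return cls plus every class reachable through rdfs:subClassOf."""
    seen, stack = {cls}, [cls]
    while stack:
        current = stack.pop()
        for (s, p, o) in triples:
            if s == current and p == SUBCLASS and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def declared_types(individual, triples):
    """Declared rdf:type classes of an individual, widened to superclasses."""
    types = set()
    for (s, p, o) in triples:
        if s == individual and p == RDF_TYPE:
            types |= superclasses(o, triples)
    return types


def find_uit_issues(data, ontology):
    """Report (individual, property, constraint, expected class) tuples
    where the expected class is not among the declared types (assumed rule)."""
    triples = set(data) | set(ontology)
    expectations = [t for t in sorted(ontology) if t[1] in (DOMAIN, RANGE)]
    issues = []
    for (s, p, o) in sorted(data):
        for (prop, kind, expected) in expectations:
            if prop != p:
                continue
            # Domain constrains the subject; range constrains the object.
            individual = s if kind == DOMAIN else o
            if expected not in declared_types(individual, triples):
                issues.append((individual, p, kind, expected))
    return issues


# Hypothetical ontology and instance data.
ONTOLOGY = {
    ("ex:hasMaker", DOMAIN, "ex:Wine"),
    ("ex:hasMaker", RANGE, "ex:Winery"),
    ("ex:RedWine", SUBCLASS, "ex:Wine"),
}
DATA = {
    ("ex:w1", RDF_TYPE, "ex:RedWine"),
    ("ex:w1", "ex:hasMaker", "ex:m1"),
    ("ex:m1", RDF_TYPE, "ex:Person"),
}

uit_issues = find_uit_issues(DATA, ONTOLOGY)
```

Under OWL's open-world semantics a reasoner would simply infer that ex:m1 is also an ex:Winery, which is why this kind of mismatch is reported as a potential issue rather than a logical inconsistency.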
Graph Patterns of Potential Issues • Example: Missing Property Value Issue Make sure all instances of wine have a Maker specified
SPARQL Solutions for Potential Issue Detection • Example: MPV Issue
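The query from this slide did not survive extraction. The sketch below shows one way the Missing Property Value check from the wine example could be phrased, assuming a hypothetical ex: namespace with ex:Wine and ex:hasMaker. MPV_QUERY is an illustrative SPARQL 1.0-style query (OPTIONAL plus !bound) and is not the authors' query; it is kept as a string and not executed here. The Python function applies the same pattern to in-memory triples so the example stays self-contained.

```python
# Illustrative SPARQL only (assumed namespace and terms); a SPARQL engine
# would be needed to run it, so it is not executed in this sketch.
MPV_QUERY = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex:  <http://example.org/wine#>
SELECT ?wine WHERE {
  ?wine rdf:type ex:Wine .
  OPTIONAL { ?wine ex:hasMaker ?maker }
  FILTER (!bound(?maker))
}
"""


def find_missing_property_values(triples, cls, prop):
    """Return instances of cls that have no value for prop, mirroring
    the OPTIONAL / !bound pattern of MPV_QUERY."""
    instances = {s for (s, p, o) in triples if p == "rdf:type" and o == cls}
    with_value = {s for (s, p, o) in triples if p == prop}
    return sorted(instances - with_value)


# Hypothetical instance data: ex:w2 has no maker.
DATA = {
    ("ex:w1", "rdf:type", "ex:Wine"),
    ("ex:w1", "ex:hasMaker", "ex:m1"),
    ("ex:w2", "rdf:type", "ex:Wine"),
}

missing = find_missing_property_values(DATA, "ex:Wine", "ex:hasMaker")
```

A more general version would read the required property from owl:cardinality or owl:minCardinality restrictions in O' instead of hard-coding the class and property, as the list of potential issues suggests.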
Implementation and Evaluation • Demo: TW OIE Service http://onto.rpi.edu/demo/oie/ • Comparative experiment results
Status, Current and Future Work • TW OIE is implemented, and the service is provided as part of the Inference Web Explanation Framework (IW; McGuinness and Pinheiro da Silva, 2004) • Ongoing work: characterize and detect potential (integrity) issues in instance data • An Initial Investigation on Evaluating Semantic Web Instance Data (WWW 2008) • Characterizing and Detecting Integrity Issues in OWL Instance Data (OWLED 2008 EU) • Integrity Constraint Modeling and Checking for Semantic Web Data: An Answer Set Programming-based Approach (submitted to ESWC 2009) • Future work: • Formal representation for expressive integrity constraints • Automatic updates to data to fix problems • Enhanced explanation capabilities
References • T. Berners-Lee, J. Hendler, and O. Lassila, The Semantic Web: A New Form of Web Content that Is Meaningful to Computers Will Unleash a Revolution of New Possibilities, Scientific American, pp. 34-43, 2001. • J. Davies, M. Lytras, and A. Sheth, Semantic-Web-Based Knowledge Management, IEEE Internet Computing, Vol. 11, No. 5, pp. 14-6, 2007. • L. Ding and T. Finin, Characterizing the Semantic Web on the Web, ISWC, pp. 242-257, 2006. • R. A. Rocha, S. M. Huff, P. J. Haug, D. A. Evans, and B. E. Bray, Evaluation of a Semantic Data Model for Chest Radiology: Application of a New Methodology, Methods of Information in Medicine, Vol. 37, No. 4-5, pp. 477-490, 1998. • D. L. McGuinness and P. Pinheiro da Silva, Explaining Answers from the Semantic Web: The Inference Web Approach, Journal of Web Semantics, Vol. 1, No. 4, pp. 397-413, 2004.
Inference Web Explanation Architecture • Semantic Web based infrastructure • PML is an explanation interlingua • Represent knowledge provenance (who, where, when…) • Represent justifications and workflow traces across system boundaries • Inference Web provides a toolkit for data management and visualization • [Figure: architecture diagram. Sources on the WWW: OWL-S/BPEL and SDS (trace of web service discovery), learners (learning conclusions), KIF/N3 and JTP/CWM (theorem prover/rules), SPARK-L and SPARK (trace of task execution), text analytics and UIMA (trace of information extraction). Core: Proof Markup Language (PML) with Trust, Justification, and Provenance. Toolkit: IWTrust (trust computation), IW Explainer/Abstractor (end-user friendly visualization), IWBrowser (expert friendly visualization), IWSearch (search engine based publishing), IWBase (provenance registration)]
Global View • Views of Explanation: filtered, focused, global, abstraction, discourse, trust, provenance • Explanation (in PML) • Explanation as a graph • Customizable browser options • Proof style • Sentence format • Lens magnitude • Lens width • More information • Provenance metadata • Source PML • Proof statistics • Variable bindings • Link to tabulator • … • McGuinness – Microsoft eScience – December 8, 2008
Provenance View • Explanation (in PML) • Source metadata: name, description, … • Source-Usage metadata: which fragment of a source has been used, and when
Links • Tetherless World Ontology Instance Evaluator (TW OIE): http://onto.rpi.edu/demo/oie/ • Inference Web: inference-web.org • Semantic eScience class link (with book to follow): http://tw.rpi.edu/wiki/Semantic_e-Science