1 / 21

From XML to Semantic Web

From XML to Semantic Web. Changqing Li Tok Wang Ling. Department of Computer Science School of Computing National University of Singapore. Outline. Introduction Background Preliminary A genetic model to organize ontologies The translations Related work Conclusion.

percy
Télécharger la présentation

From XML to Semantic Web

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. From XML to Semantic Web Changqing Li Tok Wang Ling Department of Computer Science School of Computing National University of Singapore

  2. Outline • Introduction • Background • Preliminary • A genetic model to organize ontologies • The translations • Related work • Conclusion

  3. Background – Ontology description languages • Resource Description Framework (RDF) [8] • Organizes information in a Subject-Verb-Object (SVO) (or Resource-Property-Resource triples) form [8] Ora Lassila and Ralph R. Swick: Resource description framework (RDF). 1999.

  4. Background – Ontology description languages • RDF Schema (RDFS) [1] • DARPA Agent Markup Language (DAML) [10] • Ontology Inference Layer (OIL) [6] • DAML+OIL [4] • Web Ontology Language (OWL) [3] • They are all based on the RDF syntax • They all define more primitives to describe information, e.g. “rdfs:subClassOf”, “owl:equivalentClass” etc. [1] Dan Brickley and R.V. Guha. Resource Description Framework (RDF) Schema Specification 1.0, W3C Candidate Recommendation 27 March 2000. [3] Frank van Harmelen, Jim Hendler, Ian Horrocks, Deborah L. McGuinness, Peter F. Patel chneider and Lynn Andrea Stein. OWL Web Ontology Language Reference. [4] Frank van Harmelen, Peter F. Patel-Schneider, and Ian Horrocks. Reference description of the DAML+OIL (March 2001) ontology markup language [6] I. Horrocks, D. Fensel, J. Broekstra, S. Decker, M. Erdmann, C. Goble, F. van Harmelen, M. Klein, S. Staab, R. Studer, and E. Motta. The Ontology Inference Layer OIL. [10] Lynn Andrea Stein, Dan Connolly, and Deborah McGuinness. DAML Ontology language specification. October 2000

  5. <student id="HD1234567"> <name>John</name> <contact_no>9876543</contact_no> <course code="CS4321"> <name>database</name> <grade>A</grade> </course> <part_time> <position>programmer</position> </part_time> </student> The “Student.xml” student 2, 0:1, 0:n sc, 2, 3:8, 4:n course part_time id name contact_no sc code name grade position ORA-SS schema diagram Preliminary • ORA-SS [9] • Object class • Relationship type • Attribute of object class and relationship type • Grade is an attribute of relationship type “sc” [9] Tok Wang Ling, Mong Li Lee, Gillian Dobbie. Semistructured Database Design, Springer, 2005

  6. <student id="HD1234567"> <name>John</name> <contact_no>9876543</contact_no> <course code="CS4321"> <name>database</name> <grade>A</grade> </course> <part_time> <position>programmer</position> </part_time> </student> The “Student.xml” student 2, 0:1, 0:n sc, 2, 3:8, 4:n course part_time id name contact_no sc code name grade position ORA-SS schema diagram. Motivating example • Distinguish the student name and course name • The semantics is clearer when changing student to student_employee • The semantics of the relationship type name “sc” is not clear • DTD and XML Schema have the same problem, and even worse, e.g they do not have the “sc” for the relationship type

  7. RDF Inheritance RDFS Inheritance OWL Inheritance Inheritance Course Ontology Person Ontology Inheritance Inheritance … contact_ number home_phone Employee Ontology Student Ontology Block Inheritance Inheritance Inheritance Mutation per:contact _number per:home _phone EmployeeWoringHome Ontology StudentEmployee Ontology … office_phone Atavism Block … … per.home_phone emp:contact _number Ontology hierarchy A genetic model to organize ontologies • Inheritance • Ontology inherits ontology languages • Lower level ontologies inherits higher level ontologies • Block • Employee Ontology does not inherit the “home_phone” from Person Ontology • Atavism • “home_phone” is an atavism in EmployeeWorkingHome Ontology • Mutation • “contact_number” of Person Ontology is a mutation in Employee Ontology • Multiple inheritance • StudentEmployee Ontology inherits both Student and Employee ontologies • StudentEmployee Ontology inherits the “contact_number” from Student Ontology Atavism– the concepts of grandparent ontology blocked by parent ontology is reused in this ontology. Mutation – same name but different semantics in parent and child ontology.

  8. Our translations • Semantic translation • Structural translation • Schematic translation

  9. Semantic translation • The Semantic Translation (SemT) from an XML file or schema to a semantic web file or schema in this paper means that the XML elements, attributes and values are replaced with concepts from ontologies. • Rule SemT 1 (Rule for that only one matched concept is returned from ontologies). The XML (schema) element, attribute or value is replaced with this only returned concept

  10. Semantic translation • Rule SemT 2 to 4 are used for that more than one matched concepts are returned • Rule SemT 2 (Rule for Multiple Inheritance and Block). If the child ontology inherits several parent ontologies, the concept from that unblocked parent ontology is selected for the replacement. • Rule SemT 3 (Rule for Atavism). If the concept of the grandparent or ancestor ontology is an atavism in the grandchild or descendant ontology, the concepts in the grandchild or descendant ontology are used for the replacement. • Rule SemT 4 (Rule for Mutation). If a concept in the parent ontology is a mutation in the child ontology, the concept in the child ontology is used for the replacement. Example 1. If an XML is about student employee, the StudentEmplyee ontology is specified for search, and the ancestor ontologies of this specified ontology will be searched also. The “contact_number” from Student Ontology is used for replacement.

  11. Semantic translation • Rule SemT 5 (Rule for that no matched concept are returned from ontologies). If the element, attribute or value cannot be found in the ontologies, our system suggests adding new concepts into the ontologies (adding new concepts needs the confirmation from the domain expert).

  12. Semantic translation • Rule Sem 6 (Rule for Numbers). If the values in the XML are numbers, such as the contact_no “9876543”, they need not be searched in ontologies. • Rule Sem 7 (Rule for Person Names). If the values in the XML are person names (or company names etc.), such as “John”, they need not be searched in ontologies.

  13. Structural translation • Structural Translation (StrT) in this paper refers to the translation of an XML file or schema to a file or schema complying with the RDF structure i.e. SVO format. • Rule StrT 1 (Rule for checking structure). For any path of the XML from the root to the leaf, if the nesting is notresource, property, resource, property, resource etc. interleaved, this XML does not satisfy the RDF structure. • Rule StrT 2 (Rule for modifying structure).If resources or properties are required to be inserted in the XML to satisfy the RDF structure, the resources or properties are searched in the ontology hierarchy based on the domain and range of properties (not based on name).

  14. Schematic translation • Schematic Translation (SchT) in this paper means that some features of the XML schema are translated to follow the RDF, RDFS and OWL languages. • Rule SchT 1 (Rule for ID and ID reference). For the object identifier of ORA-SS or the ID attribute of DTD, it will be translated to the “rdf:ID” (an identification primitive of RDF) and the value for the primary key will be kept unchanged. We use the “rdf:resource” to refer to the referenced object. • Rule SchT 2 (Rule for default and fixed values). If the value of an attribute is a default or fixed value, it is kept unchanged. • Rule SchT 3 (Rule for order sensitive, composite and disjunctive attributes). The order sensitive attribute is translated to the “rdf:Seq”, the composite attribute to the “rdf:Bag”, and the disjunctive attribute to the “rdf:Alt”. • Rule SchT 4 (Rule for cardinality). The cardinality to constraint the objects and attributes is kept unchanged after translation. Thus the structure information of the original XML schema can be kept.

  15. stu_emp:Student_Employee emp:part_time 2, 0:1, 0:n cou:take, 2, 3:8, 4:n cou:Course emp:Job rdf:ID per:name stu: contact_number cou:take rdf:ID cou:name cou:grade emp:position The ORA-SS after the three-step translations • “stu_emp”, “per”, “cou” and “emp” are namespaces [11] to refer to Student_Employee, Person, Course and Employee ontologies • “rdf” is the namespace to refer to the RDF ontology language The ORA-SS schema diagram after the three-step translations. [11] Namespaces in XML, World Wide Web Consortium 14-January-1999. http://www.w3.org/TR/REC-xml-names/

  16. <stu_emp:Student_Employeerdf:id="HD1234567"> <per:name>John</per:name> <stu:contact_number>9876543 </stu:contact_number> <cou:take> <cou:Course rdf:id="CS4321"> <cou:name>cou:database</cou:name> <cou:grade>A</cou:grade> </cou:Course> </cou:take> <emp:part_time> <emp:Job> <emp:position>emp:programmer </emp:position > </emp:Job> </emp:part_time> </stu_emp:Student_Employee> The XML file after semantic and structural translation The XML file after the three-step translations

  17. Related work – Tools to translate present web to semantic web • SHOE Knowledge Annotator [5] • Annotate HTML • Manual tool • AeroDAML [7] • Annotate HTML • Automatic tool • A single predefined ontology which includes all the concepts for different domains in [5] Jeff Heflin and James Hendler. A Portrait of the Semantic Web in Action. IEEE Intelligent ystems, 16(2), 2001. [7] Paul Kogut, and William Holmes. AeroDAML: Applying Information Extraction to Generate DAML Annotations from Web Pages. K-CAP 2001 Workshop, October 21, 2001.

  18. Related work – Tools to translate present web to semantic web • OntoParser [2] • Translate XML to satisfy the RDF structure • Only structural translation • The translation is very simple which only add some <terms>, <rdf:Seq>, <rdf:li> etc. among the elements to make the elements are resource, property interleaved. • The translation is not based on the semantics of the elements, thus the semantics between two elements (between two resources or two properties) are still not clear after translation. [2] Avigdor Gal , Ami Eyal, Haggai Roitman, Hasan Jamil, Ateret Anaby-Tavor, and Giovanni Modica. OntoParser: an XML2RDF translator of OntoBuilder ontologies, OntoBuilder project. 2004.

  19. Related work – Tools to translate present web to semantic web • Our translation • Semantic translation • OntoParser does not have this translation. • The search to ontologies is only at some related paths of the genetic model, less concepts need to be traversed and less confused concepts are returned (compared to AeroDAML). • Automatic translation (compared to the manual tool SHOE Knowledge Annotator) • Structural translation • AeroDAML and SHOE Knowledge Annotator does not have this translation. • Our structural translation are based on the semantic translation. The inserted resources or properties have clearer semantics, not just <terms>, etc. (compared to OntoParser). • Schematic translation • Discuss how to process the default, fixed value, cardinality etc. constraints in the XML schemas, which is absent in AeroDAML, SHOE Knowledge Annotator and OntoParser.

  20. Conclusion • Three translations to translate XML to semantic web • Semantic translation • Structural translation • Schematic translation • Schemas are translated firstly, then the XML files confirming to the schemas can be translated easily, which improves the efficiency of translation. • Organize ontologies based on the genetic model. • The searching to ontologies is only at several related paths of the genetic model, thus less concepts need to be traversed and less confused concepts will be returned, and the rules introduced in this paper make the semantics of the returned concepts clearer.

  21. Reference [1] Dan Brickley and R.V. Guha. Resource Description Framework (RDF) Schema Specification 1.0, W3C Candidate Recommendation 27 March 2000. [2] Avigdor Gal , Ami Eyal, Haggai Roitman, Hasan Jamil, Ateret Anaby-Tavor, and Giovanni Modica. OntoParser: an XML2RDF translator of OntoBuilder ontologies, OntoBuilder project. 2004. [3] Frank van Harmelen, Jim Hendler, Ian Horrocks, Deborah L. McGuinness, Peter F. Patel chneider and Lynn Andrea Stein. OWL Web Ontology Language Reference. [4] Frank van Harmelen, Peter F. Patel-Schneider, and Ian Horrocks. Reference description of the DAML+OIL (March 2001) ontology markup language [5] Jeff Heflin and James Hendler. A Portrait of the Semantic Web in Action. IEEE Intelligent ystems, 16(2), 2001. [6] I. Horrocks, D. Fensel, J. Broekstra, S. Decker, M. Erdmann, C. Goble, F. van Harmelen, M. Klein, S. Staab, R. Studer, and E. Motta. The Ontology Inference Layer OIL. [7] Paul Kogut, and William Holmes. AeroDAML: Applying Information Extraction to Generate DAML Annotations from Web Pages. K-CAP 2001 Workshop, October 21, 2001. [8] Ora Lassila and Ralph R. Swick: Resource description framework (RDF). 1999. [9] Tok Wang Ling, Mong Li Lee, Gillian Dobbie. Semistructured Database Design, Springer, 2005 [10] Lynn Andrea Stein, Dan Connolly, and Deborah McGuinness. DAML Ontology language specification. October 2000 [11] Namespaces in XML, World Wide Web Consortium 14-January-1999. http://www.w3.org/TR/REC-xml-names/

More Related