
A Schema Integration Framework over Super-Peer based Network

Hao Ding, NTNU/IDI. SCC-PTPSC 2004, Shanghai, China.


Presentation Transcript


  1. A Schema Integration Framework over Super-Peer based Network • Hao Ding, NTNU/IDI • SCC-PTPSC 2004, Shanghai, China

  2. Agenda • Motivation • Goals and Objectives • Related Works • State-of-the-art • Platform and Technical Suggestions • Conclusion

  3. Motivations: Dilemma • On one hand, large-volume retrieval systems have accumulated, produced by different providers in various schemas. • On the other hand, users prefer to access all this heterogeneous data through a unified interface.

  4. Goals and Objectives • Unified access to a more complete view of domain-specific (or content-related) information, which is disseminated around the world in various forms.

  5. Knowledge Preparation • OAI-PMH • P2P protocol: specifically, the JXTA protocol • Semantic Web and ontology theory: specifically, the JENA Semantic Web framework • RDF, OWL

  6. Related Works • Resource integration • Bibliographic-metadata-based method, e.g., the MARC 856 field (URL) • Database browsing and navigation method, e.g., subject-based

  7. Related Works (cont'd) • Global-as-View (GaV) method • e.g., GaV in NSDL: 9 metadata formats [Figure: Source DB1 … Source DBn, each with Local MD, are linked and integrated into a Global-View MD Set, which receives queries and returns results.]
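
A minimal sketch of the Global-as-View idea on this slide, assuming two hypothetical sources with their own field names. Each global field is defined as a view over the local metadata, so a query against the global schema never mentions local names. The source names, field names, and records are illustrative assumptions and are not taken from NSDL.

```python
# Global-as-View sketch: every global field is defined in terms of local fields.
# Source names, field names, and records are hypothetical.
GLOBAL_VIEW = {
    "source_db1": {"title": "dc_title", "creator": "dc_creator"},
    "source_db2": {"title": "name", "creator": "author"},
}

SOURCES = {
    "source_db1": [{"dc_title": "Schema Matching", "dc_creator": "A. Author"}],
    "source_db2": [{"name": "Peer Networks", "author": "B. Writer"}],
}


def build_global_view(sources, view):
    """Translate every local record into the global schema and integrate them."""
    integrated = []
    for source_name, records in sources.items():
        mapping = view[source_name]
        for record in records:
            row = {g: record.get(l) for g, l in mapping.items()}
            row["source"] = source_name  # remember where the record came from
            integrated.append(row)
    return integrated


def query_global(integrated, field, value):
    """Answer a query posed purely against the global schema."""
    return [row for row in integrated if row.get(field) == value]


global_md = build_global_view(SOURCES, GLOBAL_VIEW)
hits = query_global(global_md, "creator", "B. Writer")
```

The design cost this illustrates is that adding a source means extending the global view definition, which is one reason the approach is tied to a fixed set of metadata formats.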

  8. Related Works (cont'd) • Local-as-View method • e.g., MetaCrawler [Figure: a "Query Dist. & Result Int." component distributes the query to Source DB1 … Source DBn, each with its own Local MD, and collects the results.]
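
The "Query Dist. & Result Int." component in the figure can be sketched as follows. This is a minimal illustration, assuming each source is a hypothetical function that searches its own local metadata and returns (title, score) pairs. The merge rule (keep the best score per title) is an assumption chosen for illustration, not MetaCrawler's actual algorithm.

```python
# Query distribution and result integration sketch.
# The sources, their records, and the scores are hypothetical.
def make_source(records):
    """Return a search function over one source's local metadata."""
    def search(term):
        term = term.lower()
        return [(title, score) for title, score in records if term in title.lower()]
    return search


SOURCES = {
    "source_db1": make_source([("Schema Integration", 0.9), ("Peer Routing", 0.4)]),
    "source_db2": make_source([("Schema Integration", 0.7), ("Schema Matching", 0.8)]),
}


def distribute_and_integrate(term, sources):
    """Send the query to every source, then merge duplicates by best score."""
    best = {}
    for name, search in sources.items():
        for title, score in search(term):
            if title not in best or score > best[title][0]:
                best[title] = (score, name)
    # Highest score first; ties broken by title for a stable order.
    ranked = sorted(best.items(), key=lambda item: (-item[1][0], item[0]))
    return [(title, score, name) for title, (score, name) in ranked]


results = distribute_and_integrate("schema", SOURCES)
```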

  9. State-of-the-art • Problem Statements • Various forms: flat files, databases, schema-based libraries, etc. • Semantic ambiguities: the key characteristic of data integration is not so much its volume as its diversity, heterogeneity and dispersion. (Rechemann, 2000) • Scalability • Authentication

  10. State-of-the-art (cont'd) • Scenario of information searching in a P2P environment • I: shared schema • II: different schema but in the same community • III: different schema and community [Figure: peers a0 to a2, b0 to b3 and c0 to c2 in regions I, II and III; message steps 1. request, 2. find, 3. find, 4. reply, 5. ack; labels "Semantic-based Negotiation" and "harvesting".]
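
The three cases on this slide can be expressed as a small classification over pairs of peers. This is a sketch under stated assumptions: the `Peer` type, the schema labels, and the community labels are hypothetical, and the assignment of peers to cases is not read off the figure.

```python
# Classify how a requesting peer relates to a target peer (cases I, II, III).
# Peer names, schema labels, and community labels are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    name: str
    schema: str
    community: str


def relation(requester, target):
    """Return the case label from the slide for a pair of peers."""
    if requester.schema == target.schema:
        return "I"    # shared schema
    if requester.community == target.community:
        return "II"   # different schema but in the same community
    return "III"      # different schema and community


a0 = Peer("a0", schema="dc", community="A")
a1 = Peer("a1", schema="dc", community="A")
a2 = Peer("a2", schema="marc", community="A")
b0 = Peer("b0", schema="lom", community="B")

cases = {p.name: relation(a0, p) for p in (a1, a2, b0)}
```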

  11. State-of-the-art (cont'd) • One conventional approach adopts OAI-PMH to harvest heterogeneous resources, which must support a common metadata set, e.g., Dublin Core (DC). [Figure: data providers (arXiv, NSDL, OCLC's Experimental Thesis Catalog, …) are harvested by service providers (Arc, Kepler), which serve the users.]
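
As an illustration of what harvesting with a common metadata set looks like on the wire, the sketch below builds OAI-PMH `ListRecords` request URLs. The verb and argument names (`verb`, `metadataPrefix`, `resumptionToken`) and the `oai_dc` prefix come from the OAI-PMH 2.0 specification; the base URL and the token value are placeholders, and no network request is made.

```python
# Build OAI-PMH harvesting requests. The base URL is a placeholder;
# the verb and argument names come from the OAI-PMH 2.0 specification.
from urllib.parse import urlencode

BASE_URL = "https://repository.example.org/oai"  # hypothetical data provider


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Return a ListRecords request URL for one page of a harvest."""
    if resumption_token is not None:
        # resumptionToken is an exclusive argument: it replaces all others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


first_page = list_records_url(BASE_URL)
next_page = list_records_url(BASE_URL, resumption_token="token-123")
```

A service provider would fetch `first_page`, parse the returned XML, and keep requesting with the returned resumption token until none is supplied.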

  12. State-of-the-art (cont'd) • In our approach: • Peers are wrapped with a harvesting protocol in order to obtain content • The 'Service Provider' is removed; there is no 'Mediator' any more • Advantages: • Data are always up to date • Data providers can 'join' and 'leave' freely • A query can reach all available data providers • The query mechanism can also be improved to let users choose their favorite data providers • Challenges: • No control over the quality of the data providers • Scalability is limited by query flooding [Figure: NSDL, OCLC's Experimental Thesis Catalog, arXiv, Kepler and Arc as peers in a P2P network accessed by users.]
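
The scalability challenge named above can be made concrete with a small simulation of TTL-limited query flooding. This is a sketch under stated assumptions: the overlay topology and the TTL value are hypothetical, and each peer forwards the query to all of its neighbours, including the one it received the query from, which is a simplification.

```python
# Count messages generated by TTL-limited query flooding in a P2P overlay.
# The topology and TTL value are hypothetical.
def flood(graph, start, ttl):
    """Flood a query from `start`; return (peers reached, messages sent)."""
    reached = {start}
    frontier = [start]
    messages = 0
    while frontier and ttl > 0:
        next_frontier = []
        for peer in frontier:
            for neighbour in graph[peer]:
                messages += 1  # every forwarded copy costs one message
                if neighbour not in reached:
                    reached.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
        ttl -= 1
    return reached, messages


OVERLAY = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c"],
}

reached, messages = flood(OVERLAY, "a", ttl=2)
```

In this four-peer overlay the query costs 8 messages to reach 3 other peers, because well-connected peers receive duplicate copies; that redundancy is what grows with network size.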

  13. State-of-the-art (cont'd) • Key problems: • Topology of the system infrastructure: connected graphs, not only hierarchies • ‘hierarchies represent the limitation of the human view of complex structures’ (from Keith, Infosam 2004) • Autonomous understanding of complex semantics • Domain ontologies to provide supportive metadata for interoperability • Upper-level ontologies: a foundation for more specific domain ontologies.

  14. Relationship Generation among MD Records • Semantic Web framework: adopting JENA • Ontology language: OWL (which is compatible with JENA). [Figure: two metadata sets (MDi), each passed through an Interpreter, feeding JENA and an Inference Engine.]
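
One kind of relationship an inference engine can generate among metadata records is field equivalence across schemas. The sketch below is plain Python for illustration only and is not the JENA API: it treats pairwise mappings as symmetric and transitive, which is how `owl:equivalentProperty` behaves in OWL, and derives equivalences that were never stated directly. The field names and mappings are hypothetical.

```python
# Infer which metadata fields are equivalent from pairwise mapping statements,
# treating equivalence as symmetric and transitive (as owl:equivalentProperty is).
# Plain Python for illustration; this is not the JENA API. Field names are hypothetical.
def equivalence_classes(pairs):
    """Group fields into classes using a small union-find structure."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for left, right in pairs:
        parent[find(left)] = find(right)

    classes = {}
    for field in parent:
        classes.setdefault(find(field), set()).add(field)
    return list(classes.values())


def related(pairs, a, b):
    """True if fields a and b fall in the same inferred equivalence class."""
    return any(a in cls and b in cls for cls in equivalence_classes(pairs))


MAPPINGS = [("dc:creator", "marc:100a"), ("marc:100a", "lom:author")]
inferred = related(MAPPINGS, "dc:creator", "lom:author")
```

Here `dc:creator` and `lom:author` are never mapped to each other directly; the link is inferred through the shared `marc:100a` mapping.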

  15. Platform and Techniques • Platform: adopting the JXTA protocol for constructing the P2P environment • Semantic Web framework: adopting JENA • Ontology language: OWL (which is compatible with JENA). • Inference engine: Jess or Japster (survey pending) • Testing domain: bibliographic records, the INEX collection. • More testing collections are to be selected on the basis of differences in content, format and access mechanism, e.g., SWISS-PROT, EMBL, etc. • Upper-level ontology: UMLS • Domain-specific ontologies

  16. Conclusions • A scenario of a complete view on heterogeneous resources • Problem statements and state-of-the-art • Proposed platform and technical suggestions • Other open problems • Result integration and ranking • Data provider location • Query decomposition algorithms
