

OASIS SET TC: Automating Intra-domain Mappings. Leveraging SET, OWL, CAM and dictionary-based tools to enable automated cross-dictionary domain translations. David Webber, OASIS SET TC / CAM TC (with excerpts from iSURF presentation by Prof. Dr. Asuman Dogac, METU-SRDC, Turkey).


Presentation Transcript


  1. OASIS SET TC: Automating Intra-domain Mappings. Leveraging SET, OWL, CAM and dictionary-based tools to enable automated cross-dictionary domain translations. David Webber, OASIS SET TC / CAM TC (with excerpts from iSURF presentation by Prof. Dr. Asuman Dogac, METU-SRDC, Turkey)

  2. Agenda
     • Part I: Introduction
       • Intra-domain example use cases
       • Challenges and opportunities
     • Part II: Roadmap
       • CAM templates, OWL, XPath, dictionaries, CCTS
       • Using a dictionary-based approach and SET tools for aligning structure components across syntax vocabularies within a domain
     • Part III: Summary
       • Next steps

  3. Part I: Intra-domain Example Use Cases

  4. Information Exchange Interoperability
     • Many common domains use multiple vocabularies that have arisen historically over time, e.g. banking, healthcare, supply chain (1)
     • These may be weakly or strongly aligned, depending on the domain and on the fragmentation / marketplaces within it
     • All domains share common components such as organisation, person, customer, vehicle, address
     (1) X12/EDI, UN/CEFACT, UBL, GS1, xCBL, cXML, FIX, SWIFT, HL7, more…

  5. Dictionary alignment task challenges
     • Each domain can be inspected by comparing the vocabulary dictionaries
     • Creating dictionaries in a common reference format has previously been a complex and manually intensive process
     • Even within a single domain implementation the vocabulary may be fragmented and inconsistent, because information models evolve over time

  6. Opportunities and Potential
     • Create a domain-agnostic set of methods and tools that allow alignment within any domain, to facilitate consistent information definitions
     • Leverage the approach to also support semi- or fully automated mapping patterns and templates
     • Use open standards and open source tools
     • Provide an open public roadmap for tool vendors
     • Allow standards groups to publish their exchanges in an open, non-proprietary syntax and rule system
     • Enable SMBs to build once, exchange to many

  7. Part II: Roadmap – CAM templates, OWL, XPath, Dictionaries and CCTS

  8. CAM templates, OWL and dictionaries
     • Information components derive their meaning and semantics from the context of their use pattern, not the physical name label, e.g.
       • Customer/Account/Number
       • Order/Item/Number
     • CAM templates and OWL terms share the ability to express use patterns that can be inspected, with equivalence deduced by software agents that traverse the exchange structure components
     • Matching is based on rules that can be tailored and that reference dictionaries of known properties
     • Allows automated generation of domain dictionaries

  9. CAM templates, XPath and dictionaries
     • CAM toolkit contains dictionary analysis tools that can:
       • Create a new dictionary from existing domain exchange transactions
       • Merge dictionaries together
       • Compare exchange transactions to dictionary definitions and produce a spreadsheet of matches and deltas
       • Report XPath location usage patterns of all unique items and exchange transactions
       • Assign unique UID values to each component
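The dictionary-building idea on this slide can be illustrated with a minimal Python sketch. It is not the CAM toolkit (which the slides describe as XSLT-based): the sample XML, the `collect_paths` and `build_dictionary` helpers, and the `D0001`-style UID format are all assumptions made for illustration. It walks one exchange instance, records each unique location path with its usage count, and assigns a UID label to each path, so that `Customer/Account/Number` and `Item/Number` become distinct entries despite sharing the name `Number`.

```python
import xml.etree.ElementTree as ET

# Hypothetical exchange transaction used only for this sketch.
SAMPLE = """<Order>
  <Customer><Account><Number>A-17</Number></Account></Customer>
  <Item><Number>42</Number></Item>
  <Item><Number>43</Number></Item>
</Order>"""


def collect_paths(xml_text):
    """Return {location path: occurrence count} for every element."""
    counts = {}

    def walk(elem, parent_path):
        path = f"{parent_path}/{elem.tag}"
        counts[path] = counts.get(path, 0) + 1
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return counts


def build_dictionary(counts, prefix="D"):
    """Assign an illustrative UID label to each unique path (sorted order)."""
    return {
        path: {
            "uid": f"{prefix}{i:04d}",
            "name": path.rsplit("/", 1)[-1],
            "occurrences": n,
        }
        for i, (path, n) in enumerate(sorted(counts.items()), start=1)
    }


dictionary = build_dictionary(collect_paths(SAMPLE))
for path, entry in dictionary.items():
    print(entry["uid"], path, entry["occurrences"])
```

The context path, not the element name, is the dictionary key here, which mirrors the point that meaning comes from the use pattern rather than the label.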

  10. CAM dictionary generation overview
      [Diagram: XSD schemas and CAM templates are processed by XSLT scripts in three numbered steps, including a compare-and-merge step, to produce the Master Dictionary. Components recorded: name, description, type, restrictions, relationships, usage occurrences, UID.]

  11. Dictionary Tools
      • Generate a dictionary of core components from a set of exchange templates
      • Separate dictionary content by namespace
      • Merge annotations and type definitions from the exchange template into the dictionary
      • Compare each exchange template to the master domain dictionary
      • Produce spreadsheet workbooks
      • Update spreadsheet and export back to dictionary core components

  12. Create Dictionary – CAM process
      • Select dictionary: empty to create a new one, or existing to merge
      • Output dictionary filename
      • Select template content namespace to match with
      • Merge mode: use true to combine content

  13. Compare to Dictionary
      • Pick dictionary to compare with
      • Name of result cross-reference file
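As a rough illustration of the compare step, the sketch below checks the items of one template against a master dictionary and writes the matches and deltas as CSV, a stand-in for the spreadsheet output. The dictionary contents, the UID labels and the `cross_reference` / `to_csv` helpers are hypothetical; the real CAM cross-reference file format is not shown in the slides.

```python
import csv
import io

# Hypothetical master dictionary: item path -> UID label.
master = {
    "Customer/Account/Number": "D0001",
    "Order/Item/Number": "D0002",
}

# Hypothetical item paths found in the template being compared.
template_items = ["Customer/Account/Number", "Order/Item/Quantity"]


def cross_reference(template_items, master):
    """Mark each template item as a match (known UID) or a delta (unknown)."""
    rows = []
    for item in template_items:
        uid = master.get(item)
        rows.append(
            {"item": item, "uid": uid or "", "status": "match" if uid else "delta"}
        )
    return rows


def to_csv(rows):
    """Serialize the cross-reference rows so a spreadsheet tool can open them."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["item", "uid", "status"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = cross_reference(template_items, master)
print(to_csv(rows))
```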

  14. View Cross-Reference as Spreadsheet

  15. Roadmap Summary
      • Develop crosswalks:
        • Convert XSD schema to CAM templates
        • Leverage template structure and XPath rules to build dictionaries with UID labels
        • Build OWL relationships from dictionaries
        • Compare each dictionary to the master dictionary and reference OWL and type knowledge bases to align
        • Produce spreadsheet for manual review
        • Save final results back to master dictionary
      • Build runtime templates:
        • Compare individual CAM templates to the master dictionary and generate a cross-walk section between components
        • Cross-walk can contain alignment rules in XPath for content handling (e.g. code values and re-formatting)

  16. CAM template to OWL exporter
      • The CAM toolkit currently contains a variety of exporter tools, covering XSD schema, XML dictionary and XML test case example generation
      • Opportunity to write an exporter that generates OWL terms directly from CAM template patterns in the dictionary
      • XSLT is used to accomplish this, so the exporter can be easily adapted, extended and tailored
      • Allows an OWL-based reasoner to act with CAM
      • The reasoner can then also update the CAM dictionary to complete the semantic mapping

  17. CAM to OWL generation overview
      [Diagram, five numbered steps: XSLT scripts extract components (name, description, type, restrictions, relationships, UID) from the Master Dictionary and generate OWL term instances; a reasoner outputs UID couplet pairings as XML; the UID couplets are inserted back into the Master Dictionary.]

  18. Dictionaries, UIDs, and CAM templates
      • Within a dictionary, each unique context of an item can be assigned a UID label value
      • These UID label values can then be inserted as references into a CAM template
      • Each UID couplet across exchange formats within a domain can be marked as equivalent (aliases) or similar (rules associated)
      • For similar items, CAM supports transform rules (1)
      • The UID couplets allow automated mapping across CAM template definitions
      (1) Using standard XPath syntax
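One way to picture a UID couplet is as a small record holding the two UIDs, the relation (equivalent or similar) and an optional rule. The sketch below is an assumption about shape only, not the CAM dictionary format: the `Couplet` class and `targets_for` helper are hypothetical, the UID labels are borrowed from the final slide purely as sample values with made-up pairings, and the XPath string is stored as text and never evaluated.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Couplet:
    """A pairing of two dictionary UIDs (hypothetical structure)."""

    source_uid: str
    target_uid: str
    relation: str                # "equivalent" (alias) or "similar"
    rule: Optional[str] = None   # optional XPath rule text for "similar" items

    def __post_init__(self):
        if self.relation not in ("equivalent", "similar"):
            raise ValueError(f"unknown relation: {self.relation}")
        if self.relation == "equivalent" and self.rule is not None:
            raise ValueError("an equivalent (alias) couplet needs no rule")


# Illustrative pairings only; the slides do not say how these UIDs relate.
couplets = [
    Couplet("T0015", "D4310", "equivalent"),
    Couplet("T0015", "C3402", "similar", rule="substring(., 1, 10)"),
]


def targets_for(uid, couplets):
    """Return every couplet whose source side is the given UID."""
    return [c for c in couplets if c.source_uid == uid]


for c in targets_for("T0015", couplets):
    print(c.target_uid, c.relation, c.rule)
```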

  19. Explicate semantics related to the different usages of document data types
      • Different document standards use data types differently
      • For example, "Code.Type" in one standard is represented by "Text.Type" in another standard, and by "Identifier.Type" in yet another
      • This real-world knowledge is expressed through class equivalences, so that not only humans but also the reasoner knows about it
        • Code.Type ≡ Text.Type
        • Name.Type ≡ Text.Type
        • Identifier.Type ≡ Text.Type
      • Can cross-reference via UID as well as type
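Because class equivalence in OWL is symmetric and transitive, the three statements above also imply, for example, that Code.Type is equivalent to Name.Type. The sketch below is not an OWL reasoner; it uses a plain union-find in Python (the `equivalence_classes` helper is hypothetical) to show only that one inference: grouping types into equivalence classes from the declared pairs.

```python
def equivalence_classes(pairs):
    """Group items into classes given declared pairwise equivalences."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    classes = {}
    for item in list(parent):
        classes.setdefault(find(item), set()).add(item)
    return list(classes.values())


# The three equivalences listed on the slide.
PAIRS = [
    ("Code.Type", "Text.Type"),
    ("Name.Type", "Text.Type"),
    ("Identifier.Type", "Text.Type"),
]

classes = equivalence_classes(PAIRS)
print(classes)
```

With these inputs all four types end up in a single class, which is the implicit relationship a reasoner would be expected to surface.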

  20. Dictionary Alignment Step
      • Human / OWL inspectors
      • The dictionary alignment report lists known equivalents (confidence 100%), followed by lesser equivalence rankings based on matching factors
      • Component compound relationships are resolved using CAM template structure layouts
      • Human inspection then reviews, resolves and updates the dictionary (using Excel spreadsheet workbook format)
      • New dictionary produced
      • Iterative refinement over time can enhance alignment, along with common practices through industry agreements
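The ranking idea in the alignment report can be sketched as follows. The slides do not say which matching factors or weights are used, so everything here is an assumption: the `score` and `alignment_report` helpers, the 0.7 / 0.3 weighting, the sample items, and the use of `difflib.SequenceMatcher` for name similarity. Exact name and type matches score 1.0 (the "confidence 100%" case); everything else is capped below that so it is left for human review.

```python
from difflib import SequenceMatcher


def score(a, b):
    """Return a confidence in [0, 1] that two dictionary items align."""
    if a["name"] == b["name"] and a["type"] == b["type"]:
        return 1.0  # known equivalent
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    type_sim = 1.0 if a["type"] == b["type"] else 0.0
    # Assumed weighting; capped so only exact matches reach 100%.
    return min(round(0.7 * name_sim + 0.3 * type_sim, 2), 0.99)


def alignment_report(sources, targets):
    """For each source UID, list (score, target UID) from best to worst."""
    return {
        s["uid"]: sorted(((score(s, t), t["uid"]) for t in targets), reverse=True)
        for s in sources
    }


# Hypothetical dictionary entries.
sources = [{"uid": "S0001", "name": "PostalCode", "type": "Code.Type"}]
targets = [
    {"uid": "T0001", "name": "PostalCode", "type": "Code.Type"},
    {"uid": "T0002", "name": "PostCode", "type": "Text.Type"},
    {"uid": "T0003", "name": "Quantity", "type": "Numeric.Type"},
]

report = alignment_report(sources, targets)
for confidence, uid in report["S0001"]:
    print(uid, confidence)
```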

  21. From Dictionary to Runtime Mapping
      • Once a dictionary is available with UID couplets for domain crosswalks, proceed to align
      • Take templates of actual exchanges and label these with UID couplets
      • Look up UID couplets in the dictionary and update the target template with the UID from the couplet
      • Take the completed templates and use them to drive actual mapping processes
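The lookup step can be pictured with a short sketch, assuming templates are reduced to simple path-to-UID tables. The `match_targets` helper, the paths and the UID pairings are hypothetical, and real CAM templates carry far more structure than this. Each source item's UID is looked up in the couplet table, and the paired UID is used to find the matching item in the target template.

```python
# Hypothetical couplets: source UID -> target UID.
couplets = {"S0001": "T0015", "S0002": "D4310"}

# Hypothetical templates reduced to {item path: UID label}.
source_template = {
    "/Order/Customer/Account/Number": "S0001",
    "/Order/Item/Number": "S0002",
    "/Order/Note": "S0003",
}
target_template = {
    "/PurchaseOrder/Buyer/AccountID": "T0015",
    "/PurchaseOrder/Line/ItemID": "D4310",
}


def match_targets(source_template, target_template, couplets):
    """Pair source paths with target paths through their UID couplets."""
    target_by_uid = {uid: path for path, uid in target_template.items()}
    matched, unmatched = {}, []
    for src_path, src_uid in source_template.items():
        tgt_path = target_by_uid.get(couplets.get(src_uid))
        if tgt_path is None:
            unmatched.append(src_path)  # left for manual review
        else:
            matched[src_path] = tgt_path
    return matched, unmatched


matched, unmatched = match_targets(source_template, target_template, couplets)
print(matched)
print(unmatched)
```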

  22. Create UID driven mapping template
      [Diagram, three numbered steps: an XSLT script takes the UIDs of the source CAM template, looks up each UID couplet (same or similar UID, plus an optional XPath mapping rule) in the Domain Master Dictionary, and writes the UIDs and rules into the target CAM template, producing an updated CAM template with matched targets.]

  23. Automated UID driven mapping
      [Diagram, three numbered steps: an XSLT script reads input XML instances together with the source CAM template (UIDs) and the CAM template with matched targets (UIDs and rules), applies the UID matches and rules, and writes the output mapped XML instances.]
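The slides assign this step to an XSLT script; the sketch below swaps in plain Python with `xml.etree.ElementTree` only to show the shape of the transformation. The `MAPPING` table (as it might look after UID matching), the `ensure_path` and `map_instance` helpers, the element names and the `str.strip` rule are all assumptions. It also handles only the first occurrence of each source path, which a real mapping engine would not.

```python
import xml.etree.ElementTree as ET

# Hypothetical result of UID matching: source path -> (target path, rule).
MAPPING = {
    "Customer/Account/Number": ("Buyer/AccountID", None),
    "Item/Number": ("Line/ItemID", str.strip),  # rule: trim whitespace
}

SOURCE = (
    "<Order><Customer><Account><Number>A-17</Number></Account></Customer>"
    "<Item><Number> 42 </Number></Item></Order>"
)


def ensure_path(root, path):
    """Create (or reuse) nested child elements along path; return the last."""
    node = root
    for tag in path.split("/"):
        child = node.find(tag)
        if child is None:
            child = ET.SubElement(node, tag)
        node = child
    return node


def map_instance(source_xml, target_root_tag, mapping):
    """Copy mapped values from a source instance into a new target instance."""
    source = ET.fromstring(source_xml)
    target = ET.Element(target_root_tag)
    for src_path, (tgt_path, rule) in mapping.items():
        elem = source.find(src_path)  # first occurrence only
        if elem is None or elem.text is None:
            continue
        value = rule(elem.text) if rule else elem.text
        ensure_path(target, tgt_path).text = value
    return ET.tostring(target, encoding="unicode")


mapped = map_instance(SOURCE, "PurchaseOrder", MAPPING)
print(mapped)
```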

  24. Dictionary approach summary
      • If the document components of two different domain standards share the same semantic properties:
        • Use this as an indication that they may be similar
      • Some explicitly defined semantic properties may imply further implicit semantic relationships:
        • Use a reasoner to obtain implicit relationships
      • Align to dictionary definitions, allowing crosswalk
      • Create harmonized dictionary lookup
      • Use abstract UID as common reference (linkage between language-specific named types/objects)
      • Explicate the semantics of the different usages of document data types in different document schemas, so that the desired interpretations can be obtained from these informal semantics
      • Determine similar/match relationships and rules for constraint alignment and compound component relationships (e.g. date-time versus date and time)
      • Provide dictionary structure format for managing relationships
      • Leverage existing OASIS CAM and ebXML Registry TC work
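The date-time versus date and time case is a small example of a compound component relationship: one vocabulary carries a single field, another carries two. A minimal sketch of the conversion in both directions, assuming ISO 8601 formatted values (the slides do not state the formats involved, and the function names are hypothetical):

```python
from datetime import datetime


def split_date_time(value):
    """Split one ISO 8601 date-time value into separate date and time values."""
    dt = datetime.fromisoformat(value)
    return dt.date().isoformat(), dt.time().isoformat()


def join_date_time(date_value, time_value):
    """Combine separate ISO 8601 date and time values into one date-time."""
    return datetime.fromisoformat(f"{date_value}T{time_value}").isoformat()


date_part, time_part = split_date_time("2009-06-15T14:30:00")
print(date_part, time_part)
print(join_date_time(date_part, time_part))
```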

  25. Part III: Summary – Next Steps

  26. Value Proposition
      • Mapping templates provide a localization mechanism to tailor inputs and outputs to patterns and scenarios
      • Domain mapping automation reduces the burden on participants of maintaining multiple mappings
      • Removes issues surrounding versioning and exchange transaction structure differences
      • Simplifies testing and setup
      • Allows alignment over time to coherent domain reference dictionaries; mitigates collisions
      • Lowers the cost of entry and participation

  27. Tools needed
      • CAM
        • Schema ingesting
        • Dictionary builder
      • OWL
        • Reasoner
        • CAM dictionary to OWL generator
        • Extend CAM dictionary format for couplets / rules
        • Extend reasoner to update dictionary couplets
      • Mapping
        • XSLT engine to read input, templates and create output
        • (Can use existing XSLT CAM validator as basis)

  28. The above equivalences are labelled as couplets through the UID dictionary cross-references and can be stored back into the <Extensions> section of CAM templates for runtime crosswalk use.

  29. Runtime crosswalks between template structure member items
      [Diagram: example member items labelled UID: T0015, UID: D4310, UID: C3402]
