The Enterprise Data Management Council Semantics Repository Case Study


Presentation Transcript


  1. The Enterprise Data Management Council Semantics Repository Case Study
     Mike Bennett, EDM Council Inc.

  2. Overview
     • EDM Council Case Study
     • Review format requirements
     • Ontology framework
       - Metamodel
       - Adaptations
       - Extensions
     • Relation to other standards
       - Common terms and standards
       - Ontology modeling standards
     • Proof of Concept activities

  3. EDM Council Requirements
     • The EDM Council
       - “A non-profit trade association focussed on managing and leveraging enterprise data as a strategic asset to enable financial institutions to increase efficiency, minimize risk, and create competitive advantage”
     • Industry requirement
       - Consistent Terms, Definitions and Relationships
       - Growing realization that this needs a semantic approach

  4. History: Financial Standards
     • MDDL: Market data XML
       - Technical (XML) message standard
       - Good design means weak semantics
       - Did try to do a semantic spreadsheet, but technical folks did not use it
     • ISO TC68/SC4/WG11 FIBIM
       - Logical data model (UML)
       - Intended to be semantic but using unextended UML
       - Not a widely recognized approach to semantics
       - Business experts were unable to comment on a “design” model they don’t understand
     • Industry Conclusion: “We need a semantics standard”

  5. Semantic Model Requirements
     • Position as “Conceptual Model”
       - Same rules apply as for Requirements Specifications
       - Must be owned and validated by business
     • Manage the “Language interface” between tech and business subject matter experts
       - Everything should be in English
       - No techie terms and casing like “objectProperty”
     • Everything should be reviewable
       - Spreadsheets
       - Dialect-free diagrams
     • Align with other financial industry standards
       - ISO 20022, EFAMA, FpML, XBRL, MISMO

  6. Early Experiment
     • Looked at what a truly “semantic” layer would be
     • Exercise for the ISO WG responsible for ISO 20022 FIBIM model
     • Example semantic model of industry terms
     • Used TopBraid Composer and Protégé
     • Did it meet “Usability” requirement?
     • Did the semantics stack up?

  7. Possible classes of Thing

  8. Example “Thing”: Equity
     • Real world definition of Equity: "An equity is a financial instrument setting out a number of terms which define rights and benefits to the holder in relation to their holding a portion of the equity within the issuing company".

  9. What is an Equity?
     • [Diagram] Boxes: Financial Instrument; Equity; Equity security; Instrument Terms
     • [Diagram] Relation labels: “Is a kind of”; “In relation to”; “Has rights defined in”
     • Or to put it another way…

  10. What is an Equity?
     • Using OWL to define the classes of real things in the world, and the facts about those things
     • Modeled in TopBraid Composer

  11. Financial Semantics in OWL
     • Pizza approach
       - “Everything is a Thing”
     • What about common terms?
       - Accounting terms for equity, debt, cashflow
       - Places, time concepts
       - Legal terms (securities are contracts)
     • Better partitioning needed

  12. Conclusions
     • Does not provide views that business SMEs could validate
       - Requires them to interpret OWL terms and diagrams
       - They want spreadsheets and simple diagrams of “Things” and relations
     • Does not allow for common reusable terms outside of the financial services industry

  13. EDM Council Members View
     • What can be seen and understood immediately?
     • Spreadsheets
       - Agreed on a spreadsheet format which could represent most OWL features
       - Simple as possible but no simpler
     • Simple block diagrams
       - Boxes and Lines like Visio
     • Process Flowcharts
       - Not required for static terms semantics but keep in mind when we need to model business processes

  14. Semantics Repository Strategy
     • OWL was the way to go for semantics models
       - May need to extend for things OWL does not do well
     • OWL tools were almost but not quite ready for business domain experts’ review
       - Any appearance of dialect or “techie” constructs would limit the review audience and therefore the quality of the model
     • Use a UML Modeling tool for flexibility
       - Generate spreadsheets from this
       - Create “UML-free” diagrams by turning off all UML features
     • Display the results on a dedicated web structure
     • Review this with industry SMEs.

  15. Ontology Definition Metamodel
     • Metamodel and Profile for OWL in UML
       - Early draft available when we started this
       - OWL version 1
     • UML Tool
       - Enterprise Architect from Sparx Systems
       - Implemented metamodel of RDF/RDFS
       - Implemented metamodel of OWL
       - Used recommended stereotypes in Profile
       - Results in OWL and RDF/S toolbars in EA
       - Added stereotypes for non-stereotyped items in ODM so all are on one toolbar, e.g. Generalization, RDFS Sub Property, Unions
     • Tweaks
       - Various tweaks to maintain user diagram commitments
       - Recast all terms in English to maintain user language commitments
       - Exposed predicate logic statements as separate “Logic” classes in XML Spy-like format

  16. English versus OWL concepts
     • Each spreadsheet and diagram feature corresponds to some OWL concept:
       - Thing = OWL class
       - Simple fact = Datatype Property
       - Relationship fact = Object Property, with UML multiplicity or with a predicate logic statement applied to the range
       - Special “Logic” classes to visually render logic combinations
       - Mutually Exclusive = Disjoint
       - Logical Union = OWL Union
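The correspondence on this slide can be read as a simple lookup table. The following is a minimal sketch in Python, assuming the standard OWL vocabulary names (`owl:Class`, `owl:DatatypeProperty`, `owl:ObjectProperty`, `owl:disjointWith`, `owl:unionOf`) are the intended targets; the table name `ENGLISH_TO_OWL` and the function `to_owl` are hypothetical and not part of the Semantics Repository.

```python
# Hypothetical lookup table built from the slide's correspondences.
# Keys are the business-facing English terms; values are standard OWL
# vocabulary names assumed to be the intended equivalents.
ENGLISH_TO_OWL = {
    "Thing": "owl:Class",
    "Simple fact": "owl:DatatypeProperty",
    "Relationship fact": "owl:ObjectProperty",
    "Mutually Exclusive": "owl:disjointWith",
    "Logical Union": "owl:unionOf",
}


def to_owl(english_term: str) -> str:
    """Return the OWL construct for an English term (case-insensitive)."""
    lookup = {key.lower(): value for key, value in ENGLISH_TO_OWL.items()}
    key = english_term.strip().lower()
    if key not in lookup:
        raise KeyError(f"No OWL equivalent recorded for {english_term!r}")
    return lookup[key]
```

A reviewer-facing export could call `to_owl("Simple fact")` when writing OWL and use the dictionary keys when rendering spreadsheets, so the two vocabularies never mix in one view.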

  17. OWL v UML in ODM Implementation
     • Class = class
     • Object Property = Association Class
     • Datatype Property = Attribute
     • Inverse of a Property = “inverse” Association (red)
     • Predicate Logic
       - Simple: Multiplicity
       - Complex: Exposed as “Logic” class icons
     • Disjoint With = “mutually exclusive” Association (red)
     • RDFS Datatype = datatype (same XML set)
     • Enumerated data range = enumeration
     • RDFS Sub-property of = “from” Generalization (green)
     • OWL Union Class: ODM recommends UML Covering Generalization Set with no stereotype. Stereotyped as “union” (purple)

  18. Resulting Model Framework
     • Modelling tool generates diagrams and spreadsheets content
     • Diagrams and models show:
       - Things
       - Facts about those Things
         - Simple facts: names, dates etc.
         - Relationship Facts: relating one Thing to another
     • Framed within a technology neutral theory of meaning
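To make the "Things, Simple facts, Relationship Facts" structure concrete, here is a small sketch of how model content might be flattened into spreadsheet rows. The column layout, the `MODEL` sample data and the function `to_spreadsheet` are assumptions made for illustration; the slides do not give the Council's agreed spreadsheet format.

```python
import csv
import io

# Illustrative model content only. The Equity entry follows the earlier
# slides; the specific simple facts are hypothetical examples.
MODEL = [
    {
        "thing": "Equity",
        "is_a": "Financial Instrument",
        "simple_facts": ["name", "issue date"],
        "relationship_facts": [("has rights defined in", "Instrument Terms")],
    },
]


def to_spreadsheet(model) -> str:
    """Flatten Things and their facts into CSV text, one fact per row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Thing", "Is A", "Fact type", "Fact", "Related Thing"])
    for entry in model:
        for fact in entry["simple_facts"]:
            writer.writerow(
                [entry["thing"], entry["is_a"], "Simple fact", fact, ""]
            )
        for fact, target in entry["relationship_facts"]:
            writer.writerow(
                [entry["thing"], entry["is_a"], "Relationship fact", fact, target]
            )
    return out.getvalue()
```

Every header and cell stays in plain English, which is in line with the requirement that reviewers should not have to read terms such as “objectProperty”.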

  19. Theory of Meaning
     • Set theory constructs
       - Each Thing or class is a set
     • “Is A” relationship defines taxonomy
       - “What kind of thing is it?”
     • Facts about those things
       - Relationship Facts; Simple Facts
       - “What facts distinguish this thing from other things?”
       - Include necessary and contingent facts
     • Identify mutually exclusive sets
       - OWL1 does not support “Completely exhaustive” set of sub-classes
     • Additional written definitions against each term
       - Reviewed and agreed by business domain SMEs
     • NOTE: This is implemented as OWL Full; no limitations are imposed on how a modeler can use the framework to set down facts as they see them.
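The set-theoretic reading above can be illustrated with ordinary Python sets. This is only a sketch: the individuals (`acme_common_share`, `acme_5y_bond`) and the helper names are hypothetical, and the exhaustiveness check is written as plain set arithmetic outside OWL, since the slide notes that OWL1 does not express it directly.

```python
# Each Thing is modeled as a set of individuals (illustrative members only).
financial_instrument = {"acme_common_share", "acme_5y_bond"}
equity = {"acme_common_share"}
debt = {"acme_5y_bond"}


def is_a(sub: set, sup: set) -> bool:
    """'Is A' holds when every member of sub is also a member of sup."""
    return sub <= sup


def mutually_exclusive(a: set, b: set) -> bool:
    """Two Things are mutually exclusive when their sets share no member."""
    return a.isdisjoint(b)


def completely_exhaustive(parent: set, *children: set) -> bool:
    """Sub-classes are exhaustive when their union covers the parent set."""
    return set().union(*children) == parent
```

Under these assumptions `is_a(equity, financial_instrument)` and `mutually_exclusive(equity, debt)` both hold, which mirrors the taxonomy and the "mutually exclusive sets" bullets.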

  20. Spreadsheet Output

  21. Diagrams
     • Block diagrams are derived from the modelling tool but with all the UML features turned off
       - No + signs, no < > brackets, no camelCase
       - Domain experts “know they don’t know” what those symbols would mean
     • The diagrams show:
       - The hierarchy of Things
       - Relationships between those Things
       - Simple facts about Things (optional)
     • Additional diagram types show relationships among relationships, for review by ontology experts and those business domain experts who are able to “Keep up” with what many see as the more philosophical aspects

  22. Sample Screenshot
     • Callouts: Thing; “Is A” relations; Object Property (Relationship Fact in English)

  23. Sample screenshot 2: Different types of Thing

  24. Sample Screenshot 3: Simple Facts

  25. The Elusive CDS

  26. Comparison with Data Models
     • Set theory classes not OO classes
     • Relationships are unidirectional
       - Pair of relationship + inverse = one OO relationship
     • Open World Assumption
       - “Absence of evidence is not evidence of absence”
       - Every fact which defines a thing is included even if data would never be available or is not needed
     • Taxonomy
       - Multiple inheritance
       - Supports real world multiple classifications
     • Data model enumerations
       - Mixed semantics in reference data models
       - Usually points to further semantic modelling requirement
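Two of the contrasts above, unidirectional relationships with separately declared inverses and the Open World Assumption, can be sketched in a few lines of Python. The fact base, the relationship names (“issues”, “is issued by”) and both query functions are hypothetical; the point is only that an unrecorded fact is "unknown" in the open-world reading and "false" in the closed-world reading typical of data models.

```python
from typing import Optional

# Known facts as (subject, relationship, object) triples; names are illustrative.
FACTS = {("AcmeCorp", "issues", "AcmeShare")}

# Each relationship runs in one direction; its inverse is declared separately.
INVERSE = {"issues": "is issued by", "is issued by": "issues"}


def holds_closed_world(s: str, p: str, o: str) -> bool:
    """Data-model style: anything not recorded is treated as false."""
    return (s, p, o) in FACTS


def holds_open_world(s: str, p: str, o: str) -> Optional[bool]:
    """Open-world style: True if the fact or its inverse is recorded,
    otherwise None (unknown). This sketch stores no negative facts,
    so it never returns False."""
    if (s, p, o) in FACTS:
        return True
    inverse = INVERSE.get(p)
    if inverse is not None and (o, inverse, s) in FACTS:
        return True
    return None
```

Here the pair "issues" plus "is issued by" plays the role of the single bidirectional relationship an object-oriented model would have.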

  27. Common Concepts Treatment
     • Goal is interoperability with standards terms
       - E.g. XBRL accounting concepts
     • Define what kind of “Thing” everything is
       - Securities are contracts
       - Tradable Securities v OTC Contracts
     • Need to define high level primitive concepts
       - And the necessary relationships among those concepts

  28. Contexts and Events
     • Digital rights DOI standard
       - “Context” is something with Time and Place
     • English: Something with Time and Place is an Event
       - Event with Actor is an Activity
       - Processes are made up of activities
     • Extended to Activity and Process Model concepts
       - Could we replicate Visio-style process flow models with semantics?
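The chain of definitions on this slide (Event, Activity, Process) maps naturally onto a small class hierarchy. The sketch below uses Python dataclasses; the field names and the choice to keep activities sorted by time are assumptions for illustration, not something the slides specify.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Event:
    """Something with a Time and a Place."""
    time: datetime
    place: str


@dataclass
class Activity(Event):
    """An Event with an Actor."""
    actor: str


@dataclass
class Process:
    """A Process is made up of Activities (kept in time order here,
    which is an assumption of this sketch)."""
    name: str
    activities: List[Activity] = field(default_factory=list)

    def add(self, activity: Activity) -> None:
        """Add an activity and restore chronological order."""
        self.activities.append(activity)
        self.activities.sort(key=lambda a: a.time)
```

Because `Activity` inherits from `Event`, anything true of an Event (it has a time and a place) is automatically true of an Activity, which matches the "Is A" reading used elsewhere in the framework.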

  29. Process Example

  30. The “Grammar” approach
     • Extended the thinking with activity and process model, to all concepts
       - Legal top level model (contracts, terms, laws etc.)
       - Geopolitical concepts
       - Time, Information (identifiers etc.)
     • Defined every common concept we could think of
       - Thing and relationship fact
     • Necessary relationships among these define a “Grammar” which is both inherited and specialized
       - Unlike stereotypes, these are also part of the model content
       - Therefore we call them Archetypes
     • Implemented as UML Stereotypes in UML profiles
       - Importing these profiles results in an editing toolbar for each set of common concepts
     • Now we can model everything using semantic toolbars

  31. Toolbar screenshot (Accounting): Drag and drop

  32. Toolbar Screenshot (Legal)

  33. Top level taxonomy
     • Partitioned the top of the model into different classes of “Thing”
       - None of our archetypes is directly a “Thing”
     • Based on Knowledge representation (KR) Lattice (John F Sowa, 2000)
       - independent, relative and mediating
       - physical and abstract
       - continuant and occurrent
     • Define all common terms in line with these 3 partitions
       - Added parts, time concepts etc.
     • Allows for cleaner management of data model terms
       - Relative concepts like Issuer v repetitive data structures
     • Introduces “Occurrent” partition in contrast to Continuant
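One way to read "define all common terms in line with these 3 partitions" is as a validation rule: each term picks exactly one category from each partition. The sketch below checks that rule. The partition labels (`role`, `nature`, `duration`) and the function `partition_errors` are hypothetical names, and classifying Issuer as physical and continuant in the usage note is an assumption; the slide only states that Issuer is a relative concept.

```python
# The three partitions named on the slide. The dictionary keys are
# hypothetical labels introduced for this sketch.
PARTITIONS = {
    "role": {"independent", "relative", "mediating"},
    "nature": {"physical", "abstract"},
    "duration": {"continuant", "occurrent"},
}


def partition_errors(categories) -> list:
    """Return a list of problems; an empty list means the term picks
    exactly one category from each of the three partitions."""
    chosen = set(categories)
    errors = []
    for layer, options in PARTITIONS.items():
        picked = sorted(chosen & options)
        if len(picked) != 1:
            errors.append(
                f"{layer}: expected exactly one of {sorted(options)}, got {picked}"
            )
    unknown = sorted(chosen - set().union(*PARTITIONS.values()))
    if unknown:
        errors.append(f"unknown categories: {unknown}")
    return errors
```

For example, `partition_errors({"relative", "physical", "continuant"})` returns an empty list, while omitting the physical/abstract choice yields one error for the `nature` layer.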

  34. KR Lattice

  35. Semantics Repository Content
     • Top level: KR Lattice hierarchy
       - Independent v relative v mediating
       - continuant v occurrent
       - concrete v abstract
     • Mid level: Global terms
       - Accounting
       - Legal
       - Math etc.
     • Financial Instruments Ontology
       - Instruments reference terms
       - Dated and Time-dependent terms
       - Processes
     • [Diagram] Labels: KR Lattice; Global terms; Instruments; Dated Terms; Process; Common types and selection lists

  36. Semantics Repository Content
     • Top level: KR Lattice hierarchy
       - Independent v relative v mediating
       - continuant v occurrent
       - concrete v abstract
     • Mid level: Global terms
       - Accounting
       - Legal
       - Math etc.
     • Financial Instruments Ontology
       - Instruments reference terms
       - Dated and Time-dependent terms
       - Processes
     • [Diagram] Labels: KR Lattice; Global terms; Instruments; Dated Terms; Process; Common types and selection lists
     • [Diagram] Callout next to Global terms: “These need to be aligned with the best of the rest”

  37. Summary
     • Presentation to Business SMEs
       - Spreadsheets and tables
       - Simple block diagrams
     • Ontology framework
       - Partitioning: Terms are descended from one term in each of the three partition layers
       - High level grammars define syntax of meaningful connections among Archetypes, e.g. a Transaction always has certain Parties
       - Most of these common terms will be found in industry standards for the relevant industries.

  38. Standards Bodies
     • Financial Securities Industry
       - MDDL: Market Data Definition Language (SIIA/FISD)
       - ISO 20022: Securities messaging (TC68); Registration Authority = SWIFT
       - ISO 20022 FIBIM: WG11 draft from ISO TC68/SC4/WG11
       - FpML: Financial Products Markup Language (ISDA)
       - EFAMA Data dictionary (European Fund and Asset Management Association)
       - FIX: Financial Information eXchange (FPL)
     • Global Terms
       - XBRL: eXtensible Business Reporting Language (XBRL Inc.)
       - MISMO: Standard for loans etc.
       - REA (Resources, Events, Agents) Ontology (William E McCarthy, Michigan State University)
       - DOI Indecs (Digital rights standard)

  39. Financial Industry Standards
     • Reverse engineered these standards as initial repository content:
       - Reference Data Terms
         - ISO 20022 “FIBIM”
         - EFAMA (Funds)
       - Timed and Dated (Market Data) terms
         - MDDL
       - Over the Counter Derivatives: FpML
     • Future / Proof of Concept
       - MISMO (Loans standard)
       - Terms currently imported from project participants, to be realigned with MISMO

  40. Global Concepts Standards
     • Financial Terms
       - XBRL: accounting standard reporting format
         - Used in creating the Financial (Accounting) high level model
         - Disregarded reporting-specific terms
         - Relationships as per basic accounting literature
         - XBRL terms have corresponding archetypes in the SR
     • Other terms to be aligned as material becomes available
       - REA – partly incorporated
       - UN-FAO – partly incorporated

  41. Securities Trading Terms
     • FIX: Financial Information eXchange (FIX) format from FPL
       - Would cover pre-trade terms when we model the securities trading lifecycle
     • Data Model Working Group (DMWG)
       - FIX liaising with MDDL in DMWG initiative
       - EDM Council actively participating in this initiative
       - DMWG will align with EDM Council Semantics Repository

  42. Ontology Format Extension
     • Things that are not in scope of OWL itself
       - Synonym (not owl:sameAs)
       - Archetype
       - Classification facets
       - Provenance of meaning to identify standards bodies / originators
     • Hope to collaborate on standardized use of OWL Annotation Properties
     • N-ary relationships would also be useful
       - Shown diagrammatically at present
       - SME view is that this is needed
     • Next iteration will include OWL2 relationship transitivity

  43. Standards Liaison Strategy
     • Meta-level terms
       - Standardize within Annotation Properties
       - Identify the ones of relevance to ontology practitioners
     • Common Upper Ontology Terms
       - Recognize that provenance of industry standards bodies is more valuable than isolated ontologist assertions
       - Identify bodies who are developing semantic versions of well attested standard terms and business definitions
       - Ontolog Forum SIO Initiative – take this as canonical form of industry standards semantics recognition and provenance
     • Financial Services Industry
       - ISO 20022: Mapping SR to latest FIBIM
       - Liaise with ISO TC68 on next generation semantic layer for ISO 20022 v2
       - EFAMA Data Dictionary mapping
       - FpML review of current OTC Derivatives draft
     • XBRL
       - Identify a canonical XBRL “Taxonomy” (ontology) and align formally

  44. History
     • Initial design: May ’08 – Sept ’08
       - Format review panel – finalized formats of spreadsheets, diagrams and web layout
       - Roadshow presentations
     • Initial draft: Jan ’09
       - Weekly SME Reviews to July ’09
     • Draft released July ’09
       - Weekly SME Reviews: Pricing, OTC
     • Proof of Concept and Validation
       - Baseline for changes Feb 2010
     • May 2010 Beta release

  45. Content status
     • Beta Status
       - Reference terms for tradable Securities
     • Draft
       - Pricing, Analytics etc. (market data)
       - OTC Derivatives
     • New in draft
       - Detailed loan and mortgage terms for PoC
     • To Do:
       - Corporate Events and Actions (CAE)
       - Securities Transactions Processing

  46. Proof of Concept Project
     • Securitization (MBS Issuance)
       - ECB, NY Fed, IBM Research, banks, agencies
       - Demonstrate ability to tag new instruments semantically at issue
       - Plans to make this mandatory
       - Basis for systemic risk regulation
     • Transformation to Semantic Data Model
     • New material: Loans and Mortgages model
     • Findings
       - The domain experts get it
       - Many terms not in ISO standards
         - Will feed these into ISO 20022
         - Refining these with domain experts
       - Complete view of poorly understood securities and missing data linkages (sub prime etc.)
       - It is realistic to tag securities terms at issue, when the semantics are still clearly defined within formal prospectus and other docs, as meanings are grounded legally.

  47. The Future
     • Further work on
       - OTC Derivatives
       - Corporate Events and Actions
     • Track semantics standards evolution (OWL, ODM)
     • Align upper ontology with semantic industry standards as these evolve
       - Align with Ontolog Forum “Sharing and Integration Ontologies” (SIO) initiative
     • ISO Alignment
       - Alignment of content with ISO 20022 Logical Data Model
       - ISO 20022 version 2 semantics layer
         - Work with TC68 on model standard
         - Update the core modeling concepts in line with this
     • Objective: Move from a working prototype model framework to something more standard while contributing our model concepts to industry

  48. Questions?
