
Baseline Findings EPA Enterprise Data Architecture / Data Management Metadata

Mike Fleckenstein, Practice Leader, MDM, Project Performance Corp. mfleckenstein@ppc.com 571-527-6453.


Presentation Transcript


  1. Baseline Findings: EPA Enterprise Data Architecture / Data Management Metadata. Mike Fleckenstein, Practice Leader, MDM, Project Performance Corp. mfleckenstein@ppc.com 571-527-6453

  2. Types of Data Transactional data • Measurements at a point in time • Dollars earned or units sold • Used for trend analysis Reference data • Entities by which transactions are measured • ‘Country’, ‘Prefix’ and ‘Industry’ • Often inconsistently and redundantly stored within an organization Master data • Single version of the truth • Key corporate reference entities like ‘Customer’, ‘Location’ and ‘Product’ Metadata • Describes objects by connecting objects to the subjects they are about

  3. Types of Metadata Technical: data sources, access protocols (ODBC, JDBC, SQL*NET, etc.), physical schema (database definition, table definition, column definition, etc.), logical data source (ER models, object models, etc.). Example: people within IT supporting financial reporting know that the financial data mart resides on machine "XPT001"; the data mart is refreshed at "12 a.m. every Saturday night"; data is sourced from "Hyperion GL"; and period data is captured in the "AP column." Business: contextual data about the information retrieved; taxonomies that define business organizations and product hierarchies; controlled vocabularies or reference data used to define business terms, such as a medical dictionary or financial terminology. Example: people in the finance department know that performance reports come "once a month"; "GPR" stands for "Global Performance Report"; "AP7" means "Accounting Period Number 7"; and the accounting period starts in "February." These descriptions are business metadata.
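The split between technical and business metadata in slide 3 can be sketched as two small records describing the same data mart. This is only an illustration: the class and field names are assumptions made here, and only the quoted values come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class TechnicalMetadata:
    """What IT knows about the data mart (field names are hypothetical)."""
    host: str
    refresh_schedule: str
    source_system: str
    period_column: str


@dataclass
class BusinessMetadata:
    """What the finance department knows (field names are hypothetical)."""
    report_frequency: str
    accounting_period_start: str
    glossary: dict = field(default_factory=dict)

    def expand(self, abbreviation: str) -> str:
        """Return the business meaning of an abbreviation, or the input if unknown."""
        return self.glossary.get(abbreviation, abbreviation)


# Values taken from the slide's two examples.
finance_mart_technical = TechnicalMetadata(
    host="XPT001",
    refresh_schedule="12 a.m. every Saturday night",
    source_system="Hyperion GL",
    period_column="AP",
)

finance_mart_business = BusinessMetadata(
    report_frequency="once a month",
    accounting_period_start="February",
    glossary={
        "GPR": "Global Performance Report",
        "AP7": "Accounting Period Number 7",
    },
)

print(finance_mart_business.expand("GPR"))
```

Keeping the two records separate mirrors the slide: each audience maintains the facts it owns, and both describe the same underlying asset.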

  4. Metadata & Related Terms • Metadata describes objects, and one of the ways in which it does that is by connecting objects to the subjects they are about • Controlled vocabulary is a closed list of subjects that can be used for classification • Taxonomy is a subject-based classification that arranges the terms in the controlled vocabulary into a hierarchy • Thesauri take taxonomies and extend them to make them better able to describe the world by not only allowing subjects to be arranged in a hierarchy
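A minimal sketch of the first three terms in slide 4, assuming a tiny made-up vocabulary: the controlled vocabulary is a closed set, the taxonomy adds a parent for each term, and metadata connects a document to a subject. The subject names and function names are hypothetical.

```python
# Controlled vocabulary: a closed list of subjects (terms are hypothetical).
CONTROLLED_VOCABULARY = {
    "Environment", "Air", "Water", "Air Quality", "Drinking Water",
}

# Taxonomy: the same terms arranged into a hierarchy (child -> parent).
TAXONOMY_PARENT = {
    "Air": "Environment",
    "Water": "Environment",
    "Air Quality": "Air",
    "Drinking Water": "Water",
}


def classify(document_id: str, subject: str, index: dict) -> None:
    """Connect a document to a subject; reject subjects outside the closed list."""
    if subject not in CONTROLLED_VOCABULARY:
        raise ValueError(f"{subject!r} is not in the controlled vocabulary")
    index.setdefault(subject, []).append(document_id)


def path_to_root(term: str) -> list:
    """Walk up the taxonomy from a term to its topmost ancestor."""
    path = [term]
    while path[-1] in TAXONOMY_PARENT:
        path.append(TAXONOMY_PARENT[path[-1]])
    return path


subject_index = {}
classify("report-001", "Drinking Water", subject_index)
print(path_to_root("Drinking Water"))
```

Rejecting unknown subjects is what makes the vocabulary "closed"; the hierarchy only affects navigation, not which terms are allowed.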

  5. Taxonomy • Metadata can be organized using a taxonomy • Helps an audience find information more easily • Blue lines reflect metadata; black lines reflect taxonomy • Blue lines – metadata about the paper • Black lines – subject-based taxonomy

  6. Taxonomy Core Characteristics • Simple terminology • Looser, flatter and more intuitive than traditional taxonomies • E.g. Eight top levels, three levels deep each • Usability in favor of detail • Fewer ‘clicks’ • Must be easy to alter • Don’t overanalyze with too many ‘what ifs’ • United understanding

  7. Taxonomy Categorization Schemes [Figure: categorization schemes arranged on a scale from hardest to easiest]

  8. Thesauri (e.g. ISO 2788) • BT (Broader Term) - refers to the term above this one in the hierarchy • SN (Scope Note) - a string attached to the term explaining its meaning • USE - refers to another term that is preferred to this term • TT (Top Term) - refers to the topmost ancestor • RT (Related Term) - refers to a term related to this term, without being a synonym
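The thesaurus relationships in slide 8 can be sketched as fields on a term record. This is an illustrative data structure, not an implementation of ISO 2788; the term names are hypothetical, and TT is derived here by following BT links rather than stored.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Term:
    name: str
    broader: Optional[str] = None                 # BT: term directly above this one
    scope_note: str = ""                          # SN: text explaining the meaning
    use: Optional[str] = None                     # USE: preferred term to use instead
    related: list = field(default_factory=list)   # RT: related, non-synonymous terms


# A tiny hypothetical thesaurus keyed by term name.
THESAURUS = {
    "Pollution": Term("Pollution", scope_note="Contamination of the environment"),
    "Air Pollution": Term(
        "Air Pollution",
        broader="Pollution",
        scope_note="Contaminants present in the atmosphere",
        related=["Emissions"],
    ),
    "Emissions": Term("Emissions", broader="Pollution"),
    "Atmospheric Pollution": Term("Atmospheric Pollution", use="Air Pollution"),
}


def preferred(name: str) -> str:
    """Resolve a non-preferred term to its preferred term via USE."""
    term = THESAURUS[name]
    return term.use if term.use is not None else term.name


def top_term(name: str) -> str:
    """TT: follow BT links from the preferred term up to the topmost ancestor."""
    term = THESAURUS[preferred(name)]
    while term.broader is not None:
        term = THESAURUS[term.broader]
    return term.name


print(preferred("Atmospheric Pollution"), "->", top_term("Atmospheric Pollution"))
```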

  9. Metadata Maturity Model WITH NO METADATA MGMT • Information is lost or hidden • Data integration is costly • Cannot support everyday business • Information is difficult to find • Partial & dated information • Loss of trust in data METADATA MANAGEMENT The organization of technical and business metadata with the goal of advancing the sharing, retrieving and understanding of enterprise information assets.

  10. Metadata Maturity Model – Phase I PROCESS • Changes are locally acquired, made and consumed • Sharing through conversations with ‘incumbents’ • Infrequent changes TECHNOLOGY • Spreadsheets and unstructured tools • Application-specific metadata components PEOPLE • Small group of rogue metadata warriors • Knowledge is in people’s heads • Sharing of metadata is ad hoc

  11. Metadata Maturity Model – Phase II PROCESS • Limited sharing of metadata • Local or semi-local repositories • Local attempts at managing metadata • Exploration of core metadata and metadata tools TECHNOLOGY • Modeling tools • Application specific metadata components • Some metadata management tools • Mix PEOPLE • Management awareness • Sporadic adding to various repositories • ‘Talk’ about importance of sharing metadata

  12. Metadata Maturity Model – Phase III PROCESS • Governance process is created and enforced • Workflows • Communication with ‘outside’ departments • Beginnings of real-time integration TECHNOLOGY • Metadata management tools with governance process • Workflow engine • Business rule engine • Data integration tools PEOPLE • Data stewards • Data governance body • Management understands importance of administering metadata

  13. Metadata Maturity Model – Phase IV PEOPLE • Constantly seeking optimization • Metadata administrators – centralized validation PROCESS • Enterprise-level standards • Taxonomy, Ontologies, etc. • Authoritative data sources for entities TECHNOLOGY • Collaboration tools • Enterprise data modeling tool • Vocabulary and taxonomy management tool

  14. Metadata Maturity Model – Phase V PEOPLE • Start managing metadata as part of business • Critical, ubiquitous, invisible part of the organization TECHNOLOGY • Ontology management • Reasoning technology • Data mediation PROCESS • Automated real-time integration • Domain ontologies & topic maps • Seamless integration at low cost

  15. Data Governance Components • Data Stewards • Principal – ‘Guardians’ of Data • Business – Help define data and stewardship standards • Data Architects • Part of EA; Understand EA • Broker requests for new data and data changes • Responsible for enterprise-wide taxonomy • Data Advisory Committee (DAC) • Strategic • Managers & Execs • Broad representation • Infrastructure Team • Responsible for physical architecture and data provision • DBAs & Developers • Systems & Network Administrators

  16. DOI Data Governance Framework

  17. Value vs. Cost of Metadata High awareness but no governance • ROI point • Start of governance • Right of Phase III Sharp rise in cost for unmanaged metadata

  18. The Dublin Core Standard • Created in 1995 to aid internet searches • Most common standard • Primarily for ‘document-like objects’ (DLOs) • Example: ‘Author = Ronald Snijder’ • Qualifier: ‘Author (type=personalName) = Ronald Snijder’ • Each element can be repeated (e.g. ‘Author (type=personalName) = Sergeant Pepper’) • Every metadata description should describe just one information resource • 15 elements

  19. • Not syntax specific • Each element is optional • Intrinsicality • Many standards bodies exist • Extensions can be used & registered • DC content is modifiable
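The Dublin Core properties listed in slides 18 and 19 (15 elements, each optional and repeatable, one description per resource, no fixed syntax) can be sketched as a plain dictionary of lists. The helper names below are hypothetical, the record title is made up, and the sketch uses the element name `creator` from the 15-element set where the slide's example says 'Author'.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


def make_record(**elements) -> dict:
    """Build a description of ONE resource; every element is optional and repeatable."""
    record = {}
    for name, values in elements.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"{name!r} is not a Dublin Core element")
        if isinstance(values, str):
            values = [values]  # a single value is just a one-item repetition
        record[name] = list(values)
    return record


def to_pairs(record: dict) -> list:
    """Flatten to (name, value) pairs; any concrete syntax can be generated from these."""
    return [(f"dc:{name}", value) for name in sorted(record) for value in record[name]]


# The creator element is repeated, as in the slide's example; the title is hypothetical.
record = make_record(
    title="Example Report",
    creator=["Ronald Snijder", "Sergeant Pepper"],
)

for name, value in to_pairs(record):
    print(f"{name} = {value}")
```

Storing every element as a list keeps repetition uniform, and leaving serialization to a separate step reflects the "not syntax specific" principle: the same pairs could be written as HTML meta tags, XML or RDF.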

  20. Dublin Core Framework & Extensions • Domain-specific metadata extensions (e.g. geospatial) • Dublin Core adopted as standard • Metadata extensions for managing information through its lifecycle • Mandatory set of Common Look and Feel elements • Extensions for clusters and gateways
