22nd Meeting of the Working Group of STATISTICS, TELEMATIC NETWORKS & EDI, 05-06 June 2007. Agenda Item 14: Tutorial on using SDMX Technical Standards for the management of reference metadata. Kyriakos Kassis – Agilis S.A.; Georgia Pappou – Agilis S.A.
Outline of the Tutorial
• SDMX basics and reference metadata
• Metadata Set (example, formats)
• Metadata Structure Definition (MSD)
• Metadata Flow Definition
• Metadata Reporting in SDMX-ML
• Examples of metadata messages
• Mapping of XML elements to the SDMX Information Model
• Eurostat’s SDMX Registry tool and MSD
SDMX and Reference Metadata
• SDMX version 2.0 (Nov. 2005)
  • introduces a standard, systematic representation of reference metadata
  • supports the routine production, exchange and dissemination of reference metadata suitable for all classes of users
• Definition
  • ‘the larger set of concepts that describe and qualify statistical data sets and processing more generally, and which are often associated not with specific observations or series of data, but with entire collections of data or even the institutions which provide that data’
SDMX benefits
• XML streamline
  • Structured representation format for data and metadata
  • Wide range of tools and technologies for supporting the production, exchange, and dissemination of data and metadata files
  • Reduces IT development costs
• Model-driven approach
  • SDMX Information Model provides a solid base for the development of concrete technical specifications
Metadata Set
• A set of information regarding almost any object within the formal SDMX view of statistical exchange, which may describe:
  • the maintainers of data or structural definitions;
  • the schedule on which data is released;
  • the flow of a single type of data over time;
  • the quality of data, the methodology, etc.
• In SDMX, the creators of reference metadata may take whatever concepts they are concerned with, or obliged to report, and provide a reference metadata set containing that information.
• The Metadata Set contains reference metadata whose content conforms to the specification of a Metadata Structure Definition.
Metadata formats comparison
• Grouping of Metadata concepts (under SDDS)
  • Base Page
  • Dissemination Formats Page
  • Summary Methodology Page
• New metadata reporting format (SDMX)
  • Agency-specific template
  • Data category-specific template
  • In general, Object-specific
    • Metadata can be reported for ANY object found in SDMX-IM
    • Grouping depends on the Metadata Structure Definition
Metadata Structure Definition 1/2
• A reference metadata set also has a set of structural metadata which describes how it is organized. This structural metadata identifies:
  • what reference metadata concepts are being reported,
  • how these concepts relate to each other (typically as hierarchies),
  • how they may be represented (as free text, as coded values, etc.),
  • what their usage role is (mandatory or conditional),
  • with which formal SDMX object types they are associated.
Metadata Structure Definition 2/2
• An MSD comprises two fundamental parts:
  • The Object Type(s) to which metadata can be attached
  • The Concepts for which metadata have to be reported
    • these concepts are grouped under one (or more) Report Structure(s)
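The two parts of an MSD can be pictured as a small data model. The sketch below is only an illustration in Python, not the SDMX Information Model itself: the class and field names are hypothetical, and the identifiers (GENERIC_MSD, DATA_PROVIDER, ESTAT_ORGANISATION_SCHEME, REPORT_FOR_AGENCY_METADATA) are borrowed from the example slides later in this tutorial.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class IdentifierComponent:
    component_id: str        # hypothetical component id, e.g. "DATA_PROVIDER"
    object_type: str         # SDMX object type the component points at
    item_scheme: str         # scheme that supplies the permitted values


@dataclass(frozen=True)
class TargetIdentifier:
    target_id: str
    components: Tuple[IdentifierComponent, ...]


@dataclass(frozen=True)
class ReportStructure:
    report_id: str
    target_id: str                 # target identifier the report attaches to
    concepts: Tuple[str, ...]      # concepts for which metadata is reported


@dataclass
class MetadataStructureDefinition:
    msd_id: str
    targets: Dict[str, TargetIdentifier] = field(default_factory=dict)
    reports: Dict[str, ReportStructure] = field(default_factory=dict)

    def add_target(self, target: TargetIdentifier) -> None:
        self.targets[target.target_id] = target

    def add_report(self, report: ReportStructure) -> None:
        # A report structure must point at a target declared in the same MSD.
        if report.target_id not in self.targets:
            raise ValueError(f"unknown target identifier: {report.target_id}")
        self.reports[report.report_id] = report


msd = MetadataStructureDefinition("GENERIC_MSD")
msd.add_target(
    TargetIdentifier(
        "DATA_PROVIDER",
        (IdentifierComponent("DATA_PROVIDER", "Data Provider",
                             "ESTAT_ORGANISATION_SCHEME"),),
    )
)
msd.add_report(
    ReportStructure("REPORT_FOR_AGENCY_METADATA", "DATA_PROVIDER",
                    ("CONTACT", "INSTITUTIONAL_FRAMEWORK"))
)
print(sorted(msd.reports))
```

Keeping targets and report structures in separate maps mirrors the split described above: one part says where metadata can be attached, the other says which concepts are reported there.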
Metadataflow Definition
• Very similar to a data flow definition; describes, categorises, and constrains metadata sets
• Metadata sets are reported or disseminated according to a metadata flow definition.
• Identifies a Metadata Structure Definition
• May be associated with one or more subject matter domains (this facilitates the search for data according to an organised category scheme)
• Constraints, in terms of reporting periodicity or the subset of possible keys allowed in a metadata set, may be attached to the metadata flow definition.
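As a rough illustration of how a metadata flow definition ties an MSD to categories and an optional key constraint, consider the following Python sketch. The class, its fields and the flow id STS_METADATA_FLOW are assumptions made for this example; only GENERIC_MSD, STS and the dataflow ids come from the tutorial's own examples.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

Key = Tuple[str, ...]  # values of the target's identifier components


@dataclass(frozen=True)
class MetadataflowDefinition:
    flow_id: str                                   # hypothetical id
    msd_id: str                                    # the MSD this flow identifies
    categories: FrozenSet[str] = frozenset()       # subject matter domains
    allowed_keys: Optional[FrozenSet[Key]] = None  # None means "no key constraint"

    def accepts(self, msd_id: str, key: Key) -> bool:
        """Return True if a metadata set with this MSD and key fits the flow."""
        if msd_id != self.msd_id:
            return False
        return self.allowed_keys is None or key in self.allowed_keys


flow = MetadataflowDefinition(
    flow_id="STS_METADATA_FLOW",
    msd_id="GENERIC_MSD",
    categories=frozenset({"STS"}),
    allowed_keys=frozenset({("STS", "STSIND_PROD_M"), ("STS", "STSIND_ORD_M")}),
)

print(flow.accepts("GENERIC_MSD", ("STS", "STSIND_PROD_M")))
print(flow.accepts("GENERIC_MSD", ("QNA", "STSIND_PROD_M")))
```

A periodicity constraint could be added in the same way as another optional field; it is left out here to keep the sketch short.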
Metadata Reporting in SDMX-ML
• Reference Metadata mechanism supports reporting and dissemination through specified types of messages
• Structure Message
  • Provides the Metadata Structure Definition
• Generic Metadata Message
  • Provides a single format for any metadata structure definition
  • All reference metadata expressible in SDMX-ML format can be marked up according to this format, in agreement with the contents of the Structure
  • Performs only a minimum of validation
  • Supports the creation of generic software tools and services for processing reference metadata
• Metadata Report message
  • For each MSD, an XML schema (specific to that MSD) can be created
  • Performs validation on sets of reported data
  • Less verbose than the Generic metadata message
  • Easier to use because the XML mark-up relates directly to the reported concepts
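The difference in verbosity between the two reporting messages can be seen by marking up the same values both ways. The Python sketch below uses simplified, namespace-free element names (AttributeValueSet, ReportedAttribute, Value) as stand-ins; they are assumptions for illustration and should not be read as the normative SDMX-ML schema.

```python
import xml.etree.ElementTree as ET

# Values taken from the tutorial's agency metadata example.
reported = {
    "CONTACT_ORGANISATION": "Eurostat, Statistical Office of the European Communities",
    "CONTACT_MAIL_ADDRESS": "L-2920 Luxembourg",
}


def generic_markup(values):
    """One fixed element per attribute; the concept is carried as data."""
    root = ET.Element("AttributeValueSet")
    for concept, value in values.items():
        attr = ET.SubElement(root, "ReportedAttribute", conceptID=concept)
        ET.SubElement(attr, "Value").text = value
    return root


def specific_markup(values, report_id="REPORT_FOR_AGENCY_METADATA"):
    """MSD-specific style: each concept becomes its own element name."""
    root = ET.Element(report_id)
    for concept, value in values.items():
        ET.SubElement(root, concept).text = value
    return root


generic_xml = ET.tostring(generic_markup(reported), encoding="unicode")
specific_xml = ET.tostring(specific_markup(reported), encoding="unicode")

print(generic_xml)
print(specific_xml)
print(len(generic_xml), len(specific_xml))
```

In the generic form a tool can process any report by looping over the same element names, whereas in the specific form the element names themselves can be validated against a schema generated from the MSD.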
Example Metadata Set in SDDS format
[Figure: an SDDS-style metadata page annotated with its SDMX parts. Target labels: Category, Data Provider, Dataflow. Annotated Reported Attributes, each with a Reported Value: METADATA_UPDATE.LAST_UPDATE, CONTACT, CONTACT.CONTACT_ORGANISATION, CONCEPTS_DEFINITIONS.STATISTICAL_CONCEPT.]
Structure message: Attachments
Metadata Structure Definition = GENERIC_MSD
• Full Target = DATAFLOW
  • Identifier Component Ref: Category
    • Target Object Type = Category
    • Item Scheme = ESTAT_CATEGORY_SCHEME (QNA Quarterly National Accounts; STS Short-Term Statistics)
  • Identifier Component Ref: Data Flow
    • Target Object Type = Data Flow
    • Item Scheme = ESTAT_DATAFLOWS (STSIND_ORD_M Industrial new orders index; STSIND_PROD_M Industrial Production index)
• Partial Target = CATEGORY
  • Identifier Component Ref: Category
• Partial Target = DATA_PROVIDER
  • Identifier Component Ref: Data Provider
    • Target Object Type = Data Provider
    • Item Scheme = ESTAT_ORGANISATION_SCHEME (1A INE, Spain; 2A ONS, UK; 3A ESTAT)
Report Structures
• REPORT_FOR_CATEGORY_METADATA
• REPORT_FOR_AGENCY_METADATA
• REPORT_FOR_DATAFLOW_METADATA
Structure message: Report Structure: Example 1/2
Metadata Report = REPORT_FOR_AGENCY_METADATA
Target Id = DATA_PROVIDER
Metadata Attributes
• CONTACT
  • CONTACT_ORGANISATION, CONTACT_MAIL, CONTACT_MAIL_ADDRESS, CONTACT_EMAIL, CONTACT_PHONE, CONTACT_FAX
• INSTITUTIONAL_FRAMEWORK
• TRANSPARENCY
• RELEVANCE
• QUALITY_MANAGEMENT
Further child attributes shown: LEGAL_ACTS, REPORTING_REQUIREMENTS, CONFIDENTIALITY, INTERNAL_ACCESS, COMMENTARY, CHANGES_IN_METHODOLOGY, METHODOLOGY_DOCUMENTATION
Structure message: Report Structure: Example 2/2
Metadata Report = REPORT_FOR_CATEGORY_METADATA
Target Id = CATEGORY
Metadata Attributes
• METADATA_UPDATE
  • LAST_POSTED, LAST_CERTIFIED, LAST_UPDATE
• SCOPE_COVERAGE
• CONCEPTS_DEFINITIONS
• DATA_VALIDATION
• RELEASE_CALENDAR_POLICY
• STATISTICAL_PRESENTATION
• CONTACT
• etc.
Further child attributes shown: REFERENCE_AREA, TIME_COVERAGE, SECTOR_COVERAGE, STATISTICAL_UNIT, STATISTICAL_POPULATION, STATISTICAL_CONCEPT, DEFINITIONS
<ReportStructure> Hierarchy of Metadata Attributes
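The attribute hierarchy of a report structure maps naturally onto a tree, and the dotted names used in the SDDS example (such as CONTACT.CONTACT_ORGANISATION) can be read as paths through it. The following Python sketch assumes that reading; the MetadataAttribute class and the dotted_paths helper are hypothetical names introduced for the illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class MetadataAttribute:
    concept: str
    children: List["MetadataAttribute"] = field(default_factory=list)


def dotted_paths(attr: MetadataAttribute, prefix: str = "") -> Iterator[str]:
    """Yield the attribute and all its descendants as dot-separated paths."""
    path = f"{prefix}.{attr.concept}" if prefix else attr.concept
    yield path
    for child in attr.children:
        yield from dotted_paths(child, path)


contact = MetadataAttribute(
    "CONTACT",
    [
        MetadataAttribute("CONTACT_ORGANISATION"),
        MetadataAttribute("CONTACT_MAIL_ADDRESS"),
        MetadataAttribute("CONTACT_EMAIL"),
    ],
)

paths = list(dotted_paths(contact))
for p in paths:
    print(p)
```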
Metadata Set: Structure
• References to:
  • a Metadata Structure Definition (MSD)
  • a Report Structure
  • a Target Identifier
• Defines:
  • The actual values of the target objects
• Comprises:
  • The Reported Attributes and their corresponding Values
  • These Attributes may be:
    • coded
    • text
    • date/time
    • number etc.
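The structure listed above can be sketched as a single record holding the three references, the target values and the reported attributes. This Python example is a simplified assumption, not an SDMX API: the MetadataSet class and its unknown_concepts check are hypothetical, and the LAST_UPDATE date is a placeholder value chosen only to show a date-typed attribute.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Iterable, List, Union

# Reported values may be coded or free text (str), date/time, or numbers.
Value = Union[str, date, int, float]


@dataclass
class MetadataSet:
    msd_ref: str                      # reference to the MSD
    report_ref: str                   # reference to a Report Structure
    target_ref: str                   # reference to a Target Identifier
    target_values: Dict[str, str]     # actual values of the target objects
    reported: Dict[str, Value] = field(default_factory=dict)

    def unknown_concepts(self, allowed: Iterable[str]) -> List[str]:
        """Return reported concepts that the report structure does not list."""
        allowed_set = set(allowed)
        return sorted(c for c in self.reported if c not in allowed_set)


mds = MetadataSet(
    msd_ref="GENERIC_MSD",
    report_ref="REPORT_FOR_CATEGORY_METADATA",
    target_ref="CATEGORY",
    target_values={"CATEGORY": "STS"},
    reported={
        "METADATA_UPDATE.LAST_UPDATE": date(2007, 6, 5),  # placeholder date
        "CONTACT.CONTACT_ORGANISATION":
            "Eurostat, Statistical Office of the European Communities",
    },
)

allowed = ["METADATA_UPDATE.LAST_UPDATE", "CONTACT.CONTACT_ORGANISATION"]
print(mds.unknown_concepts(allowed))
```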
Generic Metadata message: Example 1/2
Metadata Set
• Metadata Structure = GENERIC_MSD
• Metadata Report = REPORT_FOR_AGENCY_METADATA
Identifiers
• Category = STS
• Data Provider = ESTAT
Metadata Attributes
• Concept = CONTACT
  • Concept = CONTACT_ORGANISATION; Value = Eurostat, Statistical Office of the European Communities
  • Concept = CONTACT_NAME; Value = Unit C2 : National accounts - Production
  • Concept = CONTACT_MAIL_ADDRESS; Value = L-2920 Luxembourg
  • Concept = CONTACT_EMAIL; Value = http://epp.eurostat.ec.europa.eu/pls/portal/url/page/PGP_DS_SUPPORT
Generic Metadata message: Example 2/2
• Concept = INSTITUTIONAL_FRAMEWORK
  • Concept = LEGAL_ACTS; Value = The legal basis for the indicators is the Council Regulation No 1165/98 of 19 May 1998 concerning short-term statistics, amended by the Regulation No 1158/2005 of 6 July 2005 concerning short-term statistics, referred hereafter as the STS Regulation. The definitions of short-term statistics variables are laid down in Commission Regulation No 1503/2006 of 28 September 2006 implementing and amending Council Regulation N° 1165/98 of 19 May 1998 concerning short-term statistics as regards the definition of variables. The classification by the main industrial groupings (MIGS) is laid down in Commission Regulation N° 586/2001 of 26 March 2001 on implementing Council Regulation N° 1165/98 of 19 May 1998 concerning short-term statistics as regards the definition of Main Industrial Groupings. The above legal references are published in the "Official Journal of the European Communities" (In order to receive these texts in one of the official languages of the EU, please look for information at: http://europa.eu.int/eur-lex/
  • Concept = REPORTING_REQUIREMENTS; Value =
  • Concept = CONFIDENTIALITY; Value = Council Regulation (CE) No 322/97 of 17 February 1997 (OJ No L 52/1) and Council Regulation (EURATOM, EEC) no 1588/90 of 11 June 1990 on the transmission of the data subject to statistical confidentiality to the Statistical Office of the European Communities (OJ No L 151/1) stipulates the detailed rules used for receiving, processing and disseminating confidential data.
<GenericMetadata>
[Figure: XML listing of a generic metadata message with three annotated parts: the Target Object; the “key”; the Reported Attributes and their Values.]
MSD: mapping to Information Model
• Metadata Structure Definition (GENERIC_MSD): uses defined concepts; defines “keys” of object types to which metadata can be “attached”
• Report Structure
• Full Target Identifier (DATAFLOW) and Partial Target Identifier (CATEGORY, DATA_PROVIDER): specify the identifier components (“key”) of the target object
• Identifier Components
  • Target Object Type (Category, Data Flow, Data Provider): identifies the target object type of the component
  • Item Scheme (ESTAT_CATEGORY_SCHEME, ESTAT_DATAFLOWS, ESTAT_ORGANISATION_SCHEME): identifies the code list from which the value of the (key) component must be taken when metadata is reported
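Because each identifier component names the item scheme from which its value must be taken, a reported key can be checked component by component. The Python sketch below illustrates that check for the full target DATAFLOW; the component ids CATEGORY and DATAFLOW and the key_errors helper are assumptions for this example, while the scheme ids and codes are the ones shown in the tutorial's structure message.

```python
from typing import Dict, List

# Item schemes and their codes, as listed in the structure message example.
ITEM_SCHEMES: Dict[str, Dict[str, str]] = {
    "ESTAT_CATEGORY_SCHEME": {
        "QNA": "Quarterly National Accounts",
        "STS": "Short-Term Statistics",
    },
    "ESTAT_DATAFLOWS": {
        "STSIND_ORD_M": "Industrial new orders index",
        "STSIND_PROD_M": "Industrial Production index",
    },
}

# Full target DATAFLOW: component id (hypothetical) -> item scheme.
FULL_TARGET_DATAFLOW: Dict[str, str] = {
    "CATEGORY": "ESTAT_CATEGORY_SCHEME",
    "DATAFLOW": "ESTAT_DATAFLOWS",
}


def key_errors(target: Dict[str, str], key: Dict[str, str]) -> List[str]:
    """List the problems found when checking a reported key against a target."""
    errors = []
    for component, scheme in target.items():
        if component not in key:
            errors.append(f"missing component {component}")
        elif key[component] not in ITEM_SCHEMES[scheme]:
            errors.append(f"{key[component]!r} is not in {scheme}")
    for extra in sorted(set(key) - set(target)):
        errors.append(f"unexpected component {extra}")
    return errors


good = key_errors(FULL_TARGET_DATAFLOW, {"CATEGORY": "STS", "DATAFLOW": "STSIND_PROD_M"})
bad = key_errors(FULL_TARGET_DATAFLOW, {"CATEGORY": "XYZ"})
print(good)
print(bad)
```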
MSD / Report Structure: mapping to Information Model
• Metadata Structure Definition (GENERIC_MSD): can comprise the specification of one or more Metadata Reports
• Metadata Attributes: can have hierarchy; each takes semantic and context from a Concept
• Concept: defined in a Concept Scheme (ESTAT:CROSS_DOMAIN_CONCEPTS)
• Format and Permitted Value List: definition of format and permitted values
• The reporting hierarchy must respect the concept hierarchy but may also introduce an additional hierarchy.
SDMX registry and Reference Metadata
Live Demonstration of the SDMX Registry
• Management of MSD