eSTATISTIK.core (.CORE) integrates respondents' IT systems into statistical data collection and automates the reporting workflow. It aims to make data collection more efficient, improve data quality and reduce the burden on respondents. Validation protocols report on the accuracy and consistency of delivered data.
eSTATISTIK.core - Integrating Respondents’ IT Systems into Data Collection
UNECE Work Session on Statistical Data Editing, Bonn, 25-27 September 2006
Michael Schäfer, Federal Statistical Office, Germany
Outline
• Overview and history
• Architecture
• Workflow
• Validation
• Conclusion
The core of .CORE
• A new procedure for the primary collection of raw data from businesses and authorities
• Integrates the IT systems of respondents into the data collection system, so that reporting can become a fully automated and seamless workflow
• Includes methodological improvements
• Main objectives: reduce the burden on respondents, increase data collection efficiency and data quality
• Initiated 03/2003, in production since 03/2005
• Partners: statistical offices and AWV
Current reporting channels
[Diagram: data flow from respondents' business data management systems (ERP: HR, GL, ..) to the statistical offices via paper, web, phone and fax. Labels:]
• Business data management systems (ERP): large data volumes, but no standard procedures/formats
• Internal workflow: little or no support by standard ERP software
• Reporting channels (paper, web, phone, fax): manual entry, small data volumes, little or no standardisation
• Multiple, parallel reporting with various procedures/formats
• Statistical offices: a multiplicity of data collection procedures
The .CORE reporting channel
[Diagram: business data management systems (ERP: HR, GL, ..) deliver via WWW to CORE.server, which serves the statistical offices. Labels:]
• CORE.server: a single point of delivery
• Direct data retrieval and automated message generation
• Multi-message documents
• Survey-independent data and programming interfaces
Architecture
[Diagram with three areas: Business, Transport, Statistics Production. Labels:]
• Business: business data; statistics modules for Survey A and Survey B; CORE.reporter; CORE.connect; metadata access
• Transport: XML over HTTPS
• Server functions: data reception, validation, transformation, forwarding; resource database (metadata); KonVert
• Document types: DatML/SDF (survey definition), DatML/RAW (raw data), DatML/RES (validation protocol)
• Statistics Production: DMS, DCS, statistical offices (Survey A, Survey B)
Data validation overview
• Where does validation take place?
  • Optionally at the respondent (statistics module/CORE.reporter)
  • Automatically on CORE.server
• How is validation performed?
  • By using the free CORE.connect library (Java, C, .NET)
• What does validation require?
  • The XML schema definition of DatML/RAW
  • One or more survey definitions (DatML/SDF)
• What is the result of a validation?
  • On CORE.server, a detailed validation report (DatML/RES), downloadable via CORE.connect and IDEV (on-line DCS)
  • On the client, an InspectionReport object
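
The client-side flow described on this slide can be pictured roughly as follows. This is a minimal sketch, not the CORE.connect API, and the XML is not real DatML/RAW: the element names (`message`, `var`), the `SURVEY_DEFINITION` dictionary, the `validate_message` function and the fields of the `InspectionReport` class are all hypothetical stand-ins, chosen only to show a message being checked against a survey definition before delivery.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Hypothetical stand-in for a survey definition (DatML/SDF):
# variable name -> (expected Python type, mandatory?)
SURVEY_DEFINITION = {
    "employees": (int, True),
    "turnover": (float, True),
    "remarks": (str, False),
}


@dataclass
class InspectionReport:
    """Hypothetical result object; the slides only name it, they do not describe it."""
    errors: list = field(default_factory=list)

    @property
    def ok(self):
        return not self.errors


def validate_message(xml_text, definition):
    """Check one raw-data message against the definition and collect all findings."""
    report = InspectionReport()
    root = ET.fromstring(xml_text)
    values = {v.get("name"): (v.text or "") for v in root.iter("var")}

    for name, (py_type, mandatory) in definition.items():
        if name not in values:
            if mandatory:
                report.errors.append(f"missing mandatory variable: {name}")
            continue
        try:
            py_type(values[name])  # conformance with the declared type
        except ValueError:
            report.errors.append(f"{name}: cannot be read as {py_type.__name__}")

    for name in values:
        if name not in definition:
            report.errors.append(f"unknown variable: {name}")
    return report


RAW = """<message survey="A">
  <var name="employees">12</var>
  <var name="turnover">abc</var>
</message>"""

report = validate_message(RAW, SURVEY_DEFINITION)
print(report.ok, report.errors)
```

Collecting every finding in one report, rather than stopping at the first error, matches the idea of a detailed validation report that the respondent can act on in one pass.
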
Survey definitions (DatML/SDF)
[Diagram: structure of a survey definition. Elements shown:]
• Survey description
• Input data model: data type, variable, value space, variable group, include
• Output data model: message data group, case data group, include
• Reference data
• Classification
Legend: 1 = 1 occurrence, ? = 0-1 occurrences, * = 0-n occurrences, + = 1-n occurrences
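
One way to picture the nesting named in this diagram is as plain data classes. The sketch below is illustrative only: the class and field names are hypothetical, the cardinalities are assumptions (the per-element occurrence markers are not reproduced here), attaching the value space to the data type is a guess, and the real DatML/SDF format is XML, not Python objects.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataType:
    name: str
    value_space: Optional[List[str]] = None  # None means no restriction (assumption)


@dataclass
class Variable:
    name: str
    data_type: DataType


@dataclass
class VariableGroup:
    name: str
    variables: List[Variable] = field(default_factory=list)


@dataclass
class InputDataModel:
    data_types: List[DataType]
    variable_groups: List[VariableGroup]


@dataclass
class DataGroup:
    name: str
    includes: List[str]  # names of variable groups pulled in via "include"


@dataclass
class OutputDataModel:
    message_data_groups: List[DataGroup]
    case_data_groups: List[DataGroup]


@dataclass
class SurveyDefinition:
    description: str
    input_model: InputDataModel
    output_model: OutputDataModel

    def unresolved_includes(self):
        """Return include names that match no variable group in the input model."""
        known = {g.name for g in self.input_model.variable_groups}
        groups = (self.output_model.message_data_groups
                  + self.output_model.case_data_groups)
        return [inc for g in groups for inc in g.includes if inc not in known]


count = DataType("count")
staff = VariableGroup("staff", [Variable("employees", count)])
definition = SurveyDefinition(
    "Survey A",
    InputDataModel([count], [staff]),
    OutputDataModel(
        message_data_groups=[DataGroup("header", ["respondent"])],
        case_data_groups=[DataGroup("case", ["staff"])],
    ),
)
print(definition.unresolved_includes())
```

Separating the input model (what variables exist) from the output model (how they are grouped into messages and cases) is what lets a consistency check such as `unresolved_includes` be written once, independent of any particular survey.
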
Data validation objects
• Value
  • Conformance with data type and value space
• Occurrence: unconditional / independent
  • Of variables, variable groups and data groups
  • Mandatory or optional
• Occurrence: conditional / dependent
  • Of variables and variable groups
  • Conditions test existence and values of variables and variable groups
  • Mandatory, optional or forbidden
• Occurrence: instances
  • Minimum and maximum number of data and variable groups
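
The four kinds of checks listed on this slide can be illustrated with a small sketch. The rule tables, variable names (`legal_form`, `employees`, `wages`, `establishment`) and the `check_record` function are hypothetical; in .CORE the rules come from a DatML/SDF survey definition and are evaluated by CORE.connect, whose interface is not shown in the slides.

```python
# Hypothetical rule tables, standing in for what a survey definition would declare.
VALUE_SPACE = {"legal_form": {"GmbH", "AG", "KG"}}
MANDATORY = {"legal_form", "employees"}
# Conditional occurrence: (condition on the record, target variable, requirement)
CONDITIONAL = [
    (lambda r: r.get("employees", 0) > 0, "wages", "mandatory"),
    (lambda r: r.get("employees", 0) == 0, "wages", "forbidden"),
]
# Instances: group name -> (minimum, maximum) number of occurrences
INSTANCES = {"establishment": (1, 3)}


def check_record(record):
    """Apply value, occurrence, conditional and instance checks to one record."""
    findings = []

    # Value: conformance with the value space
    for name, allowed in VALUE_SPACE.items():
        if name in record and record[name] not in allowed:
            findings.append(f"value: {name}={record[name]!r} not in value space")

    # Occurrence, unconditional: mandatory variables must be present
    for name in sorted(MANDATORY):
        if name not in record:
            findings.append(f"occurrence: {name} is mandatory")

    # Occurrence, conditional: requirement applies only if the condition holds
    for condition, target, requirement in CONDITIONAL:
        if not condition(record):
            continue
        present = target in record
        if requirement == "mandatory" and not present:
            findings.append(f"conditional: {target} is mandatory here")
        if requirement == "forbidden" and present:
            findings.append(f"conditional: {target} is forbidden here")

    # Occurrence, instances: minimum and maximum number of group instances
    for group, (lo, hi) in INSTANCES.items():
        n = len(record.get(group, []))
        if not lo <= n <= hi:
            findings.append(f"instances: {group} occurs {n} times, expected {lo}-{hi}")

    return findings


bad = {"legal_form": "Ltd", "employees": 5, "establishment": []}
for finding in check_record(bad):
    print(finding)
```

Keeping the rules in data tables rather than in code mirrors the survey-independent, metadata-driven approach: a new survey changes the tables, not the checking logic.
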
Conditions
[Diagram: relationships between conditions and the objects they apply to. Objects: case data group, variable group, variable, condition. Relations: has, includes, targets, overrides.]
Some remarks on quality
• “Give us the data you have and we will see what we can make of it”
  • Facilitates data provision for respondents
  • Reduces errors because respondents do not have to transform or calculate data
• Improved description, fewer interpretation errors
  • Aligning statistical terms with business terms makes our definitions easier to understand
  • Standardised documentation for software producers
  • Respondents no longer need to find out what data we want – they can rely on the statistics module to provide it
• System-provided data
  • Fewer human interactions = increased coherence and consistency of data over time and per respondent
Conclusion
• .CORE is mainly an IT-based solution, but marketing activities, methodological improvements, intensive co-operation with the user community and political support are of equal importance
• .CORE has regular data validation capabilities but provides them in a new, standardised and survey-independent way to respondents and collectors
• Automation, standards and methodological work contribute most to improving data quality
• As a fully metadata-driven system, .CORE relies on good tool support in back-end procedures
Information and contacts
eSTATISTIK.core: http://www.statistik-portal.de
Information and support: +49 (0)611/75-2040, estatistik.core@destatis.de
Mr. Jörg Decker: +49 (0)611/75-2442, joerg.decker@destatis.de
Mr. Michael Schäfer: +49 (0)611/75-3652, michael.schaefer@destatis.de
Thank you very much for your attention! Any questions?