International Collaboration on Industrialization of Editing: Business Case (Part 1, WP38)
Li-Chun Zhang, Statistics Norway
Industrialization of Editing: Some issues to be dealt with
• Overall objective, principles and guidelines (e.g. the “new” paradigm of editing)
• Conceptual reference framework with regard to GSBPM
• Conceptual reference framework with regard to GSIM to-be
• Design of generic functionality
• Minimum set of standard methods
• IT tools and platforms
Objectives & principles
• Example: Objectives (the “new” paradigm)
  • Error-source identification and error prevention
  • Collect information about quality
  • Identification and adjustment of critical errors in data
• Example: Objectives (SNZ proposal)
  • Efficiency as quality against cost
  • Continuous quality improvement
  • Provide quality information
• Example: Principles
  • Retain original data as much as possible (“old” Fellegi-Holt paradigm)
  • Maximum automated processing
  • Analysis of (editing) process efficiency
  • Training, documentation
  • …
Generic Statistical Data Editing Process (GSDEP)
• GSBPM ≠ Flow Chart
• An example from EDIMBUS
• Mapping GSDEP with GSBPM
• Micro vs. macro editing
• “Editing & Imputation” (E&I) vs. “Editing & Estimation” (E&E)
• Connections to GSIM to-be
Common Statistical Data Reference (CSDR): Interface between SDE and GSIM to-be
• Statistical production as transformations of data => steady / major states of data
• Common Micro Data Format (CMDF) for database management
• Common Functional Data Format for method library
Design of generic functionality
• Databases
  • Micro database of CMDF data files (M-Base)
  • Functional database of functional data files and alignment tables (F-Base)
  • Function library (F-Lib) contains all available standardized generic (program) tools.
• Builders
  • Functional data builder (D-Build) transforms relevant CMDF data files into the required functional data files, and updates the relevant alignment tables.
  • Function builder (F-Build) takes functional data files as input data and tools from the F-Lib, and configures the necessary parameters according to a given specification for machine-based or automated data processing.
  • Screen builder (S-Build) takes functional and/or CMDF data files as input data, and configures an environment for manual inspection/editing of individual records/questionnaires according to a given specification.
• Runners
  • Batch processor is the environment for executing automated/machine-based SDE processes, chiefly relying on functions that are configured in the F-Build.
  • Manual processor is the environment for manually executing SDE processes, chiefly relying on the interface provided through the S-Build.
  • Selection and Drilling are the dedicated environments for carrying out selective editing and for drilling up and down among hierarchically structured aggregations.
  • Data processor supports the necessary administration of data and metadata.
• Managers
  • ANOPE is the environment for quality assessment of the editing processes.
  • Response manager provides the interface for re-contact with the data providers, and with other related production processes (such as Process 4, Collect).
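To make the division of labour between F-Lib, D-Build, F-Build and the batch processor more concrete, here is a minimal Python sketch. It is an illustration only, not the design proposed in the slides: the long-format layout assumed for CMDF rows (`unit_id`, `variable`, `value`), the `range_edit` tool, the specification format and all function names are hypothetical.

```python
from functools import partial
from typing import Callable, Dict, List

Record = Dict[str, object]

# F-Lib: a registry of generic tools (the single tool here is hypothetical).
F_LIB: Dict[str, Callable[..., List[Record]]] = {}


def range_edit(records: List[Record], variable: str, low: float, high: float) -> List[Record]:
    """Flag records whose value for `variable` lies outside [low, high]."""
    return [
        {**r, "flag_" + variable: not (low <= r[variable] <= high)}
        for r in records
    ]


F_LIB["range_edit"] = range_edit


def d_build(cmdf_rows: List[Record]) -> List[Record]:
    """D-Build (assumed): pivot long-format micro data into one record per unit."""
    units: Dict[object, Record] = {}
    for row in cmdf_rows:
        unit = units.setdefault(row["unit_id"], {"unit_id": row["unit_id"]})
        unit[row["variable"]] = row["value"]
    return list(units.values())


def f_build(spec: List[dict]) -> List[Callable[[List[Record]], List[Record]]]:
    """F-Build (assumed): bind parameters from a specification to F-Lib tools."""
    return [partial(F_LIB[step["tool"]], **step["params"]) for step in spec]


def batch_run(functional_data: List[Record], steps) -> List[Record]:
    """Batch processor (assumed): apply the configured functions in order."""
    for step in steps:
        functional_data = step(functional_data)
    return functional_data


cmdf_rows = [
    {"unit_id": 1, "variable": "turnover", "value": 100.0},
    {"unit_id": 2, "variable": "turnover", "value": -5.0},
]
spec = [{"tool": "range_edit",
         "params": {"variable": "turnover", "low": 0.0, "high": 1e6}}]

result = batch_run(d_build(cmdf_rows), f_build(spec))
```

Keeping the specification as plain data, separate from the tools, is what would let the same library serve different surveys; a screen builder could read the same functional records to present flagged units for manual review.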
Claude Poirier, Statistics Canada
Next steps
• Objectives, guidelines and principles
• Finalize user requirements
• Identify existing methods
• React to functional gaps
• Set up the framework
• Develop the toolset
• Deliver training
Finalizing user requirements
• Prioritizing edit and imputation requirements
  • Micro-editing methods: automated E&I on numerical and categorical data
  • Macro-editing methods: selective editing; macro editing; editing of macro data
  • On-line editing: collection edits and self-administered edits
  • Data confrontation and certification: methods using multiple data sources
  • Standardized platform: common architecture
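As an illustration of what a selective-editing requirement might cover, the sketch below ranks records by a simple score and sends only the high-scoring ones to manual review. The score used here (design weight times the absolute gap between reported and anticipated value, relative to the anticipated total) is one common form and is an assumption for this example; the slides do not specify a score function, and the field names and threshold are hypothetical.

```python
from typing import Dict, List


def local_score(reported: float, anticipated: float,
                weight: float, total_estimate: float) -> float:
    """Weighted deviation from the anticipated value, relative to the total."""
    return weight * abs(reported - anticipated) / total_estimate


def select_for_review(records: List[Dict[str, float]], threshold: float) -> List[int]:
    """Return unit ids whose score exceeds the threshold, highest score first."""
    total = sum(r["weight"] * r["anticipated"] for r in records)
    scored = [
        (local_score(r["reported"], r["anticipated"], r["weight"], total), r["unit_id"])
        for r in records
    ]
    return [uid for score, uid in sorted(scored, reverse=True) if score > threshold]


records = [
    {"unit_id": 1, "reported": 100.0, "anticipated": 100.0, "weight": 10.0},
    {"unit_id": 2, "reported": 5000.0, "anticipated": 500.0, "weight": 2.0},
    {"unit_id": 3, "reported": 210.0, "anticipated": 200.0, "weight": 5.0},
]

to_review = select_for_review(records, threshold=0.05)
```

Records below the threshold would be left to automated E&I, which is where the efficiency gain of selective editing comes from.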
Existing tools and platforms
• Identifying and analysing existing products
  • SigEE (Australia)
  • BANFF, CANCEIS (Canada)
  • BEST, POSS (New Zealand)
  • ISEE, DYNAREV (Norway)
  • TRITON, SELEKT (Sweden)
Reacting to functional gaps
• Not all requirements will be satisfied
• Brainstorming sessions are being organised
• Development priorities will be discussed

Developing the tool set
• Consolidate preferred tools
• Adapt existing tools to the environment
• Develop pre/post processors to fit the environment
• Develop missing functions
Delivering training material
• User guide
• Methodology documentation
• System documentation

Comments / Questions
• It’s your turn
Frequently asked questions (FAQ)
Q1: What governance model drives the project?
Q2: When do we expect the suite of editing functions to be delivered?
Q3: As a member of the collaboration network, will my agency have to pay any fees for accessing and using released functions?
Q4: My statistical agency is not part of the network. Are any fees planned for using the products?
Q5: My agency would like to join the network. Is this possible? How?
Frequently asked questions (FAQ)
Q6: I understand from your presentation that a common environment is being planned. Would I be able to use the functions in another environment?
Q7: My agency is willing to share a system, but its foundation software is not compliant with the proposed environment. What will happen?
Q8: My agency is willing to offer a system or a module for the network. Who will own the module?
Q9: Will the resulting products become open-source?