Learn about algorithms for identifying and resolving obvious errors in business survey data, improving data accuracy and editing processes for better results.
Algorithms for Detecting and Resolving Obvious Inconsistencies in Business Survey Data
Sander Scholtus, Statistics Netherlands
UN ECE Work Session on Statistical Data Editing, April 22, 2008
Overview
• Selective editing at Statistics Netherlands
• Sign errors
• Rounding errors
Editing Structural Business Statistics (SBS) at Statistics Netherlands (SN)
• Selective editing
• SLICE (automatic editing)
• Automatic correction of systematic errors (obvious inconsistencies)
Systematic errors
• Currently implemented:
  • ‘Thousand-error’
  • Wrongfully negative value
  • Empty total with non-empty component items
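The slides do not say how these three corrections are implemented, so the following is only a minimal sketch of what each check could look like. The function names, the use of a reference value (for instance last year's figure) and the ratio bounds `low` and `high` are all assumptions made for illustration, not the rules used at Statistics Netherlands.

```python
def fix_thousand_error(value, reference, low=300.0, high=3000.0):
    """Divide by 1,000 when the value is roughly 1,000 times a reference value.

    `reference`, `low` and `high` are hypothetical: the slides give no
    detection rule for the thousand-error.
    """
    if reference > 0 and low <= value / reference <= high:
        return round(value / 1000)
    return value


def fix_negative(value):
    """Drop the minus sign of an item that cannot be negative by definition."""
    return abs(value)


def fill_empty_total(total, components):
    """Fill an empty total (None) with the sum of its reported components."""
    if total is None and any(c is not None for c in components):
        return sum(c for c in components if c is not None)
    return total
```

For example, `fix_thousand_error(2_100_000, 2_000)` gives `2100`, while a value of `2100` against the same reference is left alone.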
Obvious inconsistencies
• Correcting (obvious) systematic errors in a separate step…
  • makes the editing process more efficient
  • makes SLICE more effective
Results block (SBS ≤ 2005)
• operating returns – operating costs = operating results
• financial revenues – financial expenditure = operating surplus
• provisions rescinded – provisions added = balance of provisions
• exceptional income – exceptional expenses = exceptional result
• operating results + operating surplus + balance of provisions + exceptional result = pre-tax results
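The edit rules of the results block can be checked mechanically. The sketch below assumes each row is stored as a (returns, costs, balance) triple; the `Row` type, the function name and the sample figures are illustrative choices, not part of the presentation.

```python
from typing import NamedTuple


class Row(NamedTuple):
    returns: int   # e.g. operating returns
    costs: int     # e.g. operating costs
    balance: int   # e.g. operating results


def violated_rules(rows, pre_tax_results):
    """Return labels of the edit rules that the results block violates."""
    # Row rules: returns - costs must equal the reported balance.
    violated = [
        f"row {k}"
        for k, row in enumerate(rows)
        if row.returns - row.costs != row.balance
    ]
    # Total rule: the balances must add up to the pre-tax results.
    if sum(row.balance for row in rows) != pre_tax_results:
        violated.append("total")
    return violated


block = [Row(2100, 1950, 150), Row(0, 10, 10), Row(20, 5, 15), Row(50, 10, 40)]
print(violated_rules(block, 195))  # ['row 1', 'total']
```

An empty list means the block satisfies all five rules.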
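Results block (example 1), as reported
• operating: returns 2,100, costs 1,950, results 150
• financial: revenues 0, expenditure 10, operating surplus 10
• provisions: rescinded 20, added 5, balance 15
• exceptional: income 50, expenses 10, result 40
• pre-tax results: 195
• violated: 0 – 10 ≠ 10, and 150 + 10 + 15 + 40 = 215 ≠ 195
Results block (example 1), corrected
• operating: returns 2,100, costs 1,950, results 150
• financial: revenues 0, expenditure 10, operating surplus –10 (sign changed)
• provisions: rescinded 20, added 5, balance 15
• exceptional: income 50, expenses 10, result 40
• pre-tax results: 195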
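Results block (example 2), as reported
• operating: returns 5,100, costs 4,650, results 450
• financial: revenues 0, expenditure 130, operating surplus 130
• provisions: rescinded 20, added 0, balance 20
• exceptional: income 15, expenses 25, result 10
• pre-tax results: 610
• violated: 0 – 130 ≠ 130, and 15 – 25 ≠ 10
Results block (example 2), corrected
• operating: returns 5,100, costs 4,650, results 450
• financial: revenues 130, expenditure 0, operating surplus 130 (returns and costs interchanged)
• provisions: rescinded 20, added 0, balance 20
• exceptional: income 25, expenses 15, result 10 (returns and costs interchanged)
• pre-tax results: 610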
Definition
• Sign error / interchanged returns and costs:
  • at least one edit rule is violated
  • it is possible to obtain a consistent results block by only changing the signs of balance variables and/or interchanging the values of returns items and costs items
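Mathematical formulation
• Notation: $x_{k,r}$, $x_{k,c}$ and $x_k$ are the returns item, costs item and balance of row $k$ ($k = 0, \dots, 3$); $x_4$ is the pre-tax results
• Edit rules: $x_k = x_{k,r} - x_{k,c}$ for $k = 0, \dots, 3$, and $x_4 = x_0 + x_1 + x_2 + x_3$
• Find the solution $(s_0, \dots, s_4; t_0, \dots, t_3)$, with every $s_j, t_k \in \{-1, 1\}$ and as few values equal to $-1$ as possible, to: $x_0 s_0 + x_1 s_1 + x_2 s_2 + x_3 s_3 = x_4 s_4$ and $x_k s_k = (x_{k,r} - x_{k,c}) t_k$ for $k = 0, \dots, 3$
• if $s_j = -1$ then change the sign of $x_j$
• if $t_k = -1$ then interchange $x_{k,r}$ and $x_{k,c}$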
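Because the results block has only five balance variables and four returns/costs pairs, the search for sign changes and interchanges can be illustrated by plain enumeration. The sketch below is not the presenter's algorithm: it is a brute-force stand-in that tries every combination of a sign indicator per balance variable and a swap indicator per row (both taking the values 1 or -1, names chosen here for illustration) and keeps the consistent combination with the fewest changes.

```python
from itertools import product


def correct_sign_errors(rows, total):
    """Search for sign changes and interchanges that make the block consistent.

    rows:  list of (returns, costs, balance) triples
    total: reported pre-tax results
    Returns (corrected_rows, corrected_total), or None if no combination works.
    """
    n = len(rows)
    best = None
    for signs in product((1, -1), repeat=n + 1):   # one sign per balance, plus the total
        for swaps in product((1, -1), repeat=n):   # one swap indicator per row
            rows_ok = all(
                bal * s == (ret - cost) * t
                for (ret, cost, bal), s, t in zip(rows, signs, swaps)
            )
            total_ok = (
                sum(bal * s for (_, _, bal), s in zip(rows, signs))
                == total * signs[n]
            )
            changes = signs.count(-1) + swaps.count(-1)
            if rows_ok and total_ok and (best is None or changes < best[0]):
                best = (changes, signs, swaps)
    if best is None:
        return None
    _, signs, swaps = best
    fixed = [
        (ret, cost, bal * s) if t == 1 else (cost, ret, bal * s)
        for (ret, cost, bal), s, t in zip(rows, signs, swaps)
    ]
    return fixed, total * signs[n]


example_1 = [(2100, 1950, 150), (0, 10, 10), (20, 5, 15), (50, 10, 40)]
example_2 = [(5100, 4650, 450), (0, 130, 130), (20, 0, 20), (15, 25, 10)]
print(correct_sign_errors(example_1, 195))
print(correct_sign_errors(example_2, 610))
```

On the two example blocks this reproduces the corrections shown on the slides: one sign change in example 1, two interchanges in example 2. When several minimal combinations exist, the sketch simply returns the first one it encounters.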
Rounding errors
• Linear equalities
• Smallest possible violation of edit rule
• Virtually no influence on publication figures
• SLICE
• Separate step
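To make the idea of a "smallest possible violation" concrete, the sketch below treats a violated equality returns – costs = balance as a rounding error when the discrepancy is at most a few units, and removes it by adjusting the balance. The tolerance of 2 units and the choice to adjust the balance are assumptions for illustration only; a complete method would have to keep all linear equalities satisfied at the same time, which this single-rule sketch does not attempt.

```python
def resolve_rounding_error(returns, costs, balance, tolerance=2):
    """Absorb a very small violation of returns - costs = balance.

    `tolerance` is a hypothetical threshold; larger discrepancies are
    left untouched so that they can be handled as genuine errors.
    """
    residual = returns - costs - balance
    if residual != 0 and abs(residual) <= tolerance:
        return returns, costs, balance + residual
    return returns, costs, balance


print(resolve_rounding_error(2100, 1950, 151))  # (2100, 1950, 150)
```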