Data editing and validation
Eric Schulte Nordholt
Senior researcher and project leader of the Census
Statistics Netherlands, Division Social and Spatial Statistics, Department Support and Development, Section Research and Development
ESLE@CBS.NL
UNECE-Eurostat Meeting on Population and Housing Censuses, Geneva, 13-15 May 2008
Contents
• Introduction
• The French paper (Working Paper 8)
• The Italian paper (Working Paper 9)
Introduction
• Activities
  • Regular international meetings (e.g. UNSD, UNECE and Eurostat)
  • Regular contacts and visits between countries
  • UN Recommendations and European regulation
• Aims
  • Better comparability over time
  • Better comparability between countries
• France and Italy both contribute to these aims in their own way
The French paper (Working Paper 8)
• The validation of the census data in France
• Rolling census with advantages (e.g. no longer an enormous peak in the workload) and disadvantages (e.g. a more complicated structure)
• Less attention to the results?
• Reference year? Assumption of stability of census variables over time (research, e.g. based on EU-SILC?)
• Many checks to improve quality, but how large are the final overcount and undercount?
• Hot deck (within classes?) seems a logical choice for the imputation method, but what do we know about the selectivity of the non-response?
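The hot-deck-within-classes idea mentioned above can be sketched as follows: missing values are replaced by values drawn from responding donors in the same imputation class. The class definition (age group by sex) and all names are illustrative assumptions, not taken from the French paper.

```python
import random

def hotdeck_impute(records, class_key, target, seed=0):
    """Random hot-deck imputation within imputation classes (a sketch).

    records: list of dicts; a value of None in the target field is missing.
    class_key: function mapping a record to its imputation class
    (here an assumed age-group-by-sex classification).
    """
    rng = random.Random(seed)
    # Collect observed donor values per imputation class.
    donors = {}
    for r in records:
        if r[target] is not None:
            donors.setdefault(class_key(r), []).append(r[target])
    # Replace each missing value with a randomly drawn donor value
    # from the same class; left missing if the class has no donors.
    for r in records:
        if r[target] is None:
            pool = donors.get(class_key(r))
            if pool:
                r[target] = rng.choice(pool)
    return records

people = [
    {"age": 34, "sex": "F", "occupation": "teacher"},
    {"age": 36, "sex": "F", "occupation": None},
    {"age": 62, "sex": "M", "occupation": "farmer"},
]
hotdeck_impute(people, lambda r: (r["age"] // 20, r["sex"]), "occupation")
```

Note that if the non-response is selective, donors within a class may still differ systematically from non-respondents, which is exactly the concern raised in the bullet above.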
The Italian paper (Working Paper 9)
• An overview of editing and imputation methods for the next Italian censuses
• Strong link to the methodology department at Istat
• Dilemma of timeliness versus quality: automate the editing procedure, but what is the effect of the order of the imputations (checks catch outliers but not inliers, Winkler)?
• Interesting link with graph theory, but a picture is missing (Blaise?)
• Minimum change approach (Chernikova algorithm?)
• Long / short form reduces the number of possible donors
• Progress in quality: unique identifier (fiscal code)
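The minimum change approach mentioned above can be illustrated with a brute-force error localisation sketch: find the smallest set of fields whose values, if changed, allow the record to satisfy every edit rule. This is only a toy illustration of the minimum-change principle, not the Chernikova algorithm the paper refers to; the record fields and edit rules are invented for the example.

```python
from itertools import combinations

def min_change_fields(record, edits):
    """Return the smallest set of fields to flag for imputation so
    that the remaining values can satisfy every edit rule
    (minimum-change principle, brute force for illustration)."""
    fields = list(record)
    def passes(kept):
        sub = {f: record[f] for f in kept}
        return all(edit(sub) for edit in edits)
    # Try ever larger sets of fields to drop; the first set that
    # makes all edits pass is a minimum-cardinality solution.
    for k in range(len(fields) + 1):
        for drop in combinations(fields, k):
            if passes([f for f in fields if f not in drop]):
                return set(drop)

# Edit rules return True when satisfied, or when a needed field is
# missing (in this toy, a dropped field can always be imputed to a
# consistent value).
def edit_married_age(r):
    if "marital" in r and "age" in r:
        return not (r["marital"] == "married" and r["age"] < 16)
    return True

def edit_driver_age(r):
    if "driver" in r and "age" in r:
        return not (r["driver"] and r["age"] < 18)
    return True
```

For the inconsistent record `{"age": 3, "marital": "married", "driver": True}` both edits fail, and changing the single field `age` resolves both, so the minimum-change solution flags only `age` for imputation. Real systems solve this localisation problem over large edit sets, which is where methods such as the Chernikova algorithm come in.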