Canadian Census E&I – Lessons Learned from 2006 with Plans for 2011
Work Session on Statistical Data Editing, Vienna, Austria, April 21-23, 2008
Mike Bankier, Statistics Canada, bankier@statcan.ca
Outline of Talk
• Changes made for the 2006 Census
• Impact of adjusting occupancy status and imputation of total non-response households
• Processing of demographic variables, with an emphasis on age
• Possible enhancements to E&I for 2011
Changes to 2006 Census
• 73% of dwellings were mailed questionnaires
• 18% of dwellings responded by Internet
• 85% gave permission to link to their tax form
• Questionnaires were captured using Intelligent Character Recognition (ICR)
• Non-Response Follow-Up (NRFU) was done from centralized offices
• Failed Edit Follow-Up (FEFU) was done from call centres
2006 Census Changes
• These new approaches reduced the field staff required by 46%
• Because of widespread labour shortages in some regions, the collection period was extended from mid-July to the end of August (Census day was May 15)
• National non-response (NR) rate: 2.8% in 2006 vs. 1.6% in 2001
Dwelling Classification Survey
• Mistakes were made in the field when classifying dwellings as occupied or unoccupied
• In the Dwelling Classification Survey (DCS), a sample of dwellings with no response received was revisited to reassess occupancy status
• The DCS estimated that:
  • 17.4% of the 934,564 dwellings classified as unoccupied were actually occupied
  • 29.1% of the 366,527 dwellings classified as occupied but with no response were actually unoccupied
• Occupancy status was adjusted for individual dwellings. This resulted in a 3.6% increase in the number of occupied dwellings and a 5.2% decrease in the number of unoccupied dwellings
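To make the size of the DCS adjustment concrete, the two misclassification rates can be applied to the two dwelling counts quoted on this slide. This is only a back-of-the-envelope sketch: the variable names are illustrative, and it does not try to reproduce the 3.6% and 5.2% figures, because the slide does not state the bases those percentages were computed on.

```python
# Counts and estimated rates exactly as quoted on the slide.
classified_unoccupied = 934_564      # dwellings classified as unoccupied
occupied_no_response = 366_527       # classified as occupied, no response received
rate_actually_occupied = 0.174       # share of "unoccupied" that were occupied
rate_actually_unoccupied = 0.291     # share of "occupied, no response" that were unoccupied

# Implied number of dwellings moved in each direction (rounded to whole dwellings).
moved_to_occupied = round(classified_unoccupied * rate_actually_occupied)
moved_to_unoccupied = round(occupied_no_response * rate_actually_unoccupied)

# Net flow from the unoccupied group to the occupied group.
net_gain_occupied = moved_to_occupied - moved_to_unoccupied

print(moved_to_occupied, moved_to_unoccupied, net_gain_occupied)
```

The two flows partly cancel, so the net change in each group is much smaller than either gross flow.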
Imputation of Total NR Households
• After the DCS adjustment, total non-response dwellings had all responses imputed by borrowing unimputed responses from another household
• Using a single donor for a total non-response household was less likely to produce implausible results
• In 2001, weighting was used to convert unoccupied dwellings to occupied. It could transfer population from one city block to another, which users could notice
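The single-donor idea can be sketched as follows: every response for a total non-response dwelling is copied from one household whose own responses were not imputed. The slide does not say how the donor is selected, so the random choice below, the function name and the record layout are all assumptions made for illustration.

```python
import random

def impute_total_nonresponse(nr_dwelling_ids, households, rng=None):
    """Copy all responses from a single unimputed donor household.

    Donor selection is random here purely as a placeholder assumption.
    """
    rng = rng or random.Random(0)
    # Only households with unimputed responses may act as donors.
    donors = [h for h in households if not h["imputed"]]
    if not donors:
        raise ValueError("no unimputed donor households available")
    result = {}
    for dwelling_id in nr_dwelling_ids:
        donor = rng.choice(donors)
        result[dwelling_id] = {
            "donor_id": donor["id"],
            "imputed": True,
            # Copy each person record so the donor is left untouched.
            "persons": [dict(p) for p in donor["persons"]],
        }
    return result

households = [
    {"id": "A", "imputed": False, "persons": [{"age": 34}, {"age": 36}]},
    {"id": "B", "imputed": True, "persons": [{"age": 50}]},
]
imputed = impute_total_nonresponse(["X"], households)
print(imputed["X"]["donor_id"])
```

Because the whole household comes from one real respondent, the imputed persons are consistent with each other by construction.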
Demographic E&I
• Demographic E&I does minimum change imputation for blanks and inconsistencies so that a later program can form Census families
• All demographic variables for all persons in a household are imputed simultaneously using CANCEIS
• Three types of Census families:
  • Couples without children
  • Couples with children
  • Lone parents with children
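The idea of minimum change donor imputation can be illustrated with a small brute-force sketch: for each donor, find the smallest set of fields that, when copied from the donor, makes the failed record pass the edits, and keep the overall smallest. This is not the CANCEIS algorithm; the edit rule, field names and search strategy are assumptions chosen only to show the principle on a single person record.

```python
from itertools import combinations

def passes_edits(rec):
    """Hypothetical edits: age must be present, and a married person must be 15+."""
    if rec["age"] is None:
        return False
    return not (rec["marital"] == "married" and rec["age"] < 15)

def minimum_change_impute(failed, donors, fields, passes=passes_edits):
    """Return (fields_changed, imputed_record) using the fewest donor values, or None."""
    best = None
    for donor in donors:
        for k in range(1, len(fields) + 1):
            if best is not None and k >= len(best[0]):
                break  # this donor cannot beat the current best
            found = None
            for subset in combinations(fields, k):
                # Overwrite only the chosen fields with the donor's values.
                candidate = {**failed, **{f: donor[f] for f in subset}}
                if passes(candidate):
                    found = (subset, candidate)
                    break
            if found:
                best = found
                break
    return best

failed = {"age": None, "marital": "married"}
donors = [{"age": 40, "marital": "married"}, {"age": 6, "marital": "single"}]
changed, imputed = minimum_change_impute(failed, donors, ("age", "marital"))
print(changed, imputed)
```

Here the first donor repairs the record by changing one field, while the second would need two, so the first is preferred. A production system would also weigh how similar each donor is to the failed record, which this sketch ignores.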
Couple Editing Concepts
• The two members of a couple should:
  • both be adults (age >= 15)
  • both be married or both be common-law
  • have appropriate relationships to Person 1
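The first two couple rules translate directly into a small check. The sketch below is illustrative only: the record layout and failure messages are assumptions, and the relationship-to-Person-1 rule is left out because the slide does not spell out which relationship pairs count as appropriate.

```python
def couple_edit_failures(p1, p2):
    """Return the list of couple edit rules that the pair fails (empty if none)."""
    failures = []
    # Rule 1: both members must be adults (age >= 15).
    if p1["age"] < 15 or p2["age"] < 15:
        failures.append("not both adults")
    # Rule 2: both married, or both common-law.
    statuses = {p1["marital"], p2["marital"]}
    if statuses not in ({"married"}, {"common-law"}):
        failures.append("marital statuses do not match a couple")
    return failures

print(couple_edit_failures({"age": 40, "marital": "married"},
                           {"age": 38, "marital": "married"}))
print(couple_edit_failures({"age": 14, "marital": "married"},
                           {"age": 38, "marital": "common-law"}))
```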
Child/Parent Editing Concepts
• For a child/parent pair:
  • At least one parent must be 15 or more years older than the child
  • A female parent must not be more than 50 years older than the child
  • The relationships to Person 1 should be appropriate
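The two age-gap rules for a child and its parents can be written as a short check. As a sketch under stated assumptions, parents are passed as hypothetical (age, sex) tuples, and the relationship-to-Person-1 rule is omitted because the slide does not define it.

```python
def child_parent_edit_failures(child_age, parents):
    """parents: list of (age, sex) tuples, with sex given as "F" or "M"."""
    failures = []
    # At least one parent must be 15 or more years older than the child.
    if not any(age - child_age >= 15 for age, _ in parents):
        failures.append("no parent is 15+ years older than the child")
    # A female parent must not be more than 50 years older than the child.
    if any(sex == "F" and age - child_age > 50 for age, sex in parents):
        failures.append("female parent is more than 50 years older than the child")
    return failures

print(child_parent_edit_failures(10, [(35, "F"), (37, "M")]))
print(child_parent_edit_failures(10, [(62, "F")]))
```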
Analysis of Imputation of Age
• AGEU and AGE represent, respectively, the age of the person before and after minimum change donor imputation
• 99.11% had AGEU = AGE
• 0.61% had AGEU = Blank/Invalid
• 0.28% had AGEU ≠ AGE because of an inconsistency between AGEU and another variable
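A breakdown like the one above can be produced by comparing AGEU with AGE record by record. In this minimal sketch, a blank or invalid AGEU is represented by `None`, and the category names and toy data are made up for illustration.

```python
from collections import Counter

def classify_age_change(ageu, age):
    """Classify one person by how AGE (after) relates to AGEU (before)."""
    if ageu is None:
        return "blank_or_invalid"   # AGEU had to be imputed
    if ageu == age:
        return "unchanged"          # AGEU = AGE
    return "inconsistent"           # valid AGEU replaced to resolve an edit failure

def category_shares(pairs):
    """Percentage of (AGEU, AGE) pairs falling in each category."""
    counts = Counter(classify_age_change(ageu, age) for ageu, age in pairs)
    return {name: 100 * n / len(pairs) for name, n in counts.items()}

toy_pairs = [(30, 30), (None, 25), (5, 35), (40, 40)]
shares = category_shares(toy_pairs)
print(shares)
```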
2011 Changes – Small Domains
• Counts for a small domain (e.g. centenarians, same-sex married couples) can be biased upwards by response or data capture errors for persons outside the small domain
• Sometimes there is no alternate source of data to verify the small domain count, and the domain is too large for a 100% manual review
2011 Changes – Small Domains
• Manually review a 20% sample of persons aged 95+ to determine which have an incorrect age
• For the other 80% of persons aged 95+, use nearest neighbour imputation to determine which have an incorrect age
• Then, in a second step, blank out the incorrect ages and impute them
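One way to picture the two steps is a nearest neighbour lookup against the manually reviewed sample, followed by blanking the flagged ages. The slide does not say which variables define "nearest", so the feature tuple used here (reported age, age of the oldest child in the household) is purely hypothetical, as are the function names and toy records.

```python
def nearest_label(features, reviewed):
    """Return the 'incorrect age' flag of the closest manually reviewed record."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, flag = min(reviewed, key=lambda item: sq_dist(features, item[0]))
    return flag

def blank_suspect_ages(unreviewed, reviewed):
    """Flag ages via the nearest reviewed record, then blank the flagged ages."""
    out = []
    for person in unreviewed:
        flagged = nearest_label(person["features"], reviewed)
        out.append({**person, "age": None if flagged else person["age"]})
    return out

# Reviewed 20% sample: (features, age_is_incorrect). Values are invented.
reviewed = [((98, 70), False), ((97, 8), True)]
unreviewed = [
    {"id": 1, "age": 99, "features": (99, 72)},
    {"id": 2, "age": 96, "features": (96, 5)},
]
result = blank_suspect_ages(unreviewed, reviewed)
print([(p["id"], p["age"]) for p in result])
```

The blanked ages would then go through the regular imputation step like any other missing response.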
2011 Changes – Use Failed Records as Donors
• Sometimes the stratum failure rate is so high that the number of donors is insufficient
• Failed records could be used as donors, since a failed record is frequently missing just one or two responses and would be suitable for imputing the other responses
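A simple way to express this is to let any record donate, failed or not, as long as it has usable values for the fields the recipient needs. The sketch below is an assumption-laden simplification: it treats a failure as a blank response (`None`) and does not model inconsistencies, and the function names are hypothetical.

```python
def usable_fields(record, fields):
    """Fields on which this record has a usable (non-blank) response."""
    return {f for f in fields if record.get(f) is not None}

def eligible_donors(recipient, candidates, fields):
    """Candidates, passed or failed, that can supply every field the recipient lacks."""
    needed = set(fields) - usable_fields(recipient, fields)
    return [c for c in candidates if needed <= usable_fields(c, fields)]

fields = ("age", "sex", "marital")
recipient = {"age": None, "sex": "F", "marital": "single"}
candidates = [
    {"id": "passed", "age": 30, "sex": "M", "marital": "married"},
    {"id": "failed_ok", "age": 42, "sex": "F", "marital": None},   # failed, but age is usable
    {"id": "failed_no", "age": None, "sex": "M", "marital": "single"},
]
donor_ids = [c["id"] for c in eligible_donors(recipient, candidates, fields)]
print(donor_ids)
```

Admitting the failed record with a usable age enlarges the donor pool from one record to two in this toy example.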
2011 Changes – More Minimum Change Donor Imputation
• Will do more minimum change donor imputation and less deterministic imputation where possible
• Will combine modules so that more variables are imputed simultaneously where possible
Concluding Remarks
• Sophisticated E&I programs can do a better job of detecting and resolving edit failures
• With this comes the responsibility to make few assumptions about the characteristics of non-respondents or of those giving inconsistent responses
• The impact of imputation should be made clear to users
• E&I should not be viewed as a panacea that allows data quality standards to be lowered