This session provides an introduction to quality measures for ONS population estimates, including plausibility ranges, administrative sources comparison tool, and measures of uncertainty. It also discusses the accuracy of population estimates and the use of demographic analysis tools.
Quality Measures for ONS population estimates: Introduction Local Insight Reference Panels Autumn 2014
Session Summary • Plausibility ranges • Administrative sources and demographic analysis comparison tool • Measures of Uncertainty • Visualisation tool
How accurate do you think our estimates are? • A perfect measure of the population • Within +/- 1% • Within +/- 2% • Within +/- 5% • Within +/- 10% • Within +/- 15% • Within +/- 20% • >20%, little relation to the true population. For: • 2011 Census (all persons) • 2011 Rolled forward (all persons) • 2011 Census (25-29 year olds) • 2011 Census (25-29 year old males)
Accuracy of Population Estimates for 2011 Note: Average = average weighted by population size
Your awareness of quality tools • Had you heard of the Quality tools before this session? • Have you used any of them?
Part 1 Using Administrative Data to Set Plausibility Ranges for Population Estimates- Assessment Following the 2011 Census
Background • Updates the work carried out in 2012, which used the 2009 Mid-Year Estimates (MYEs). • One of several initiatives taken forward for the quality assurance of MYEs. • Release of the 2011 Census estimates allowed the methods to be evaluated. • Same methodology as the 2012 report. • Assesses how the ranges performed against both the 2011 Census estimates and the MYEs for 2011. • Only for those aged 0-15.
What are plausibility ranges? Definition: a plausibility range is a pair of upper and lower limits, calculated using administrative data, within which the population estimates could reasonably be expected to fall. [Diagram: a scale running from the Lower limit to the Upper limit; values between the limits are within range, values beyond either limit are outside range.]
What are plausibility ranges? [Diagram: the Census estimate with its confidence interval and the MYE, shown alongside the plausibility range and the administrative counts SC, PR and CB.] SC = School Census, PR = Patient Register, CB = Child Benefit
Does the Census validate the ranges? [Chart: number of LAs by age group (Under 1s, 1-4yrs, 5-7yrs, 8-11yrs, 12-15yrs). Categories: Below plausibility range; Within range, lower 25%; Within range; Within range, upper 25%; Above plausibility range.]
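The five chart categories can be illustrated with a minimal sketch. It assumes the lower and upper limits are already known, and it assumes that "lower 25%" and "upper 25%" mean the bottom and top quarter of the range width; the function name and the figures are hypothetical and are not taken from the ONS report.

```python
def classify_against_range(estimate, lower, upper):
    """Place a population estimate relative to a plausibility range."""
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    if estimate < lower:
        return "below plausibility range"
    if estimate > upper:
        return "above plausibility range"
    width = upper - lower
    # Assumption: the quarter bands are measured on the width of the range.
    if estimate < lower + 0.25 * width:
        return "within range, lower 25%"
    if estimate > upper - 0.25 * width:
        return "within range, upper 25%"
    return "within range"


# Made-up figures for one local authority and one age group.
print(classify_against_range(1040, 1000, 1200))  # within range, lower 25%
print(classify_against_range(1250, 1000, 1200))  # above plausibility range
```

Counting the labels across all LAs for each age group would give the kind of tally shown in the chart.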
Limitations • Data sources: ranges are only as good as the administrative sources used to calculate them. • Census variability: current methodology compares a point with a range rather than comparing a range with a range. • Methodology: more sensitive at picking up overestimates than underestimates. • Age grouping: following a cohort is difficult, and age groups differ in size. • Specific areas: for example armed forces and areas with high levels of independent schools.
Plausibility Ranges • 5 mins discussion • Were you aware of the original report? • Have you seen the revised report? • Do you agree with our conclusions?
Key Points • Around 1/5th of LAs’ plausibility ranges not validated by the 2011 Census. • Some useful information (5-15 years), little useful information for 0-4 year olds. • Plausibility ranges not advised for future use, as they currently stand. • The use of tolerance ranges (as in Census) not ruled out for future use
Part 2 Mid-year estimate QA tool
Background to MYE QA tool • Quality assurance of the 2011 Census made extensive use of admin data and demographic analysis • MYE QA made some use of these • Wanted to carry out something similar for the MYEs but taking into account the speed of release and the resources available. • Solution: take the most useful and appropriate elements and use those. • Mixed mode approach • Carry out the sort of analysis our stakeholders do
Coherence between counts from admin sources and MYEs • Coverage and definitional differences • School census under-represents resident population • Changes to eligibility for child benefit • Areas with special populations • Timing • PR list inflation • PR list cleaning – reduces list inflation
Comparing MYEs for 2013 with... 2011 and 2012 MYEs on a period basis
Comparing MYEs for 2013 with... 2011 MYEs on a cohort basis
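The difference between the two comparisons can be sketched as follows. A period comparison holds age fixed across years, while a cohort comparison follows the same people as they age. The function names and all figures are made up for illustration and do not describe the QA tool's internals.

```python
def period_change(est_by_year, age, year_from, year_to):
    """Change at a fixed age between two years (period basis)."""
    return est_by_year[year_to][age] - est_by_year[year_from][age]


def cohort_change(est_by_year, age_at_start, year_from, year_to):
    """Change for one birth cohort, which ages one year per calendar year."""
    age_at_end = age_at_start + (year_to - year_from)
    return est_by_year[year_to][age_at_end] - est_by_year[year_from][age_at_start]


# Hypothetical estimates by year and single year of age.
estimates = {
    2011: {10: 500, 11: 480, 12: 470},
    2013: {10: 520, 11: 510, 12: 495},
}

print(period_change(estimates, 10, 2011, 2013))  # 20: age 10 in 2013 vs age 10 in 2011
print(cohort_change(estimates, 10, 2011, 2013))  # -5: age 12 in 2013 vs age 10 in 2011
```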
Quality assuring estimates for women to quality assure estimates for men (1) • Admin data for working age males generally weaker than for working age females. • Availability of data on fertility provides an additional means of looking at estimates of females. • QA of estimates for females more comprehensive than for males. • Use confidence around estimates for females to allow QA of males – sex-ratios.
Quality assuring estimates for women to quality assure estimates for men (2)
Using sex-ratios for QA • Sex-ratio = males/females • Analysis of sex-ratios over the decade between 2001 and 2011 shows these can be a strong indication of issues with the MYEs. • Comparison of sex-ratios for 2001 and 2011 (Census based) shows the distribution of sex-ratios is broadly constant. • Use the distribution of sex-ratios in 2011 to evaluate local authorities over the decade.
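A minimal sketch of how a reference distribution of sex-ratios might be used to pick out local authorities for a closer look. The flagging rule (more than k standard deviations from the reference mean), the value of k and all figures are assumptions made for illustration; the slides do not state the rule ONS applies.

```python
from statistics import mean, stdev


def sex_ratio(males, females):
    """Sex-ratio as defined on the slide: males divided by females."""
    return males / females


def flag_unusual_ratios(ratios_by_la, reference_ratios, k=2.0):
    """Return LAs whose ratio is more than k SDs from the reference mean."""
    centre = mean(reference_ratios)
    spread = stdev(reference_ratios)
    return sorted(
        la for la, ratio in ratios_by_la.items()
        if abs(ratio - centre) > k * spread
    )


# Hypothetical reference distribution (for example, Census-based ratios).
reference = [0.96, 0.98, 1.00, 1.02, 1.04]
# Hypothetical MYE-based ratios for two local authorities.
mye_ratios = {"LA A": sex_ratio(1010, 1000), "LA B": sex_ratio(1200, 1000)}

print(flag_unusual_ratios(mye_ratios, reference))  # ['LA B']
```

In practice the comparison would be made separately by age group, since sex-ratios vary with age.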
Spread of sex-ratios: given by standard deviation. Note: Excludes Isles of Scilly
Using sex-ratios: What does the real distribution of sex-ratios look like?
Summary of MYE QA tool • Necessity of a mixed mode approach • Patient register, child benefit, state pensions, school census. • Sex-ratios • Fertility • Change over time • Present data on period and cohort basis • Published alongside MYEs on day of release • Access to the same data for each local authority (lower & upper tier), regions and England and Wales.
Improving the process • The 2013 MYEs represent the first time we’ve run through this QA process. • The main issue is time: the volume of estimates to QA more than fills the time available to do it. • Increasing the amount of time to allow for contingency would be useful. • For 2014 it is hoped to implement some prioritisation of “more tricky” local authorities. • The potential to automate some of the checks. • More resources
Evaluation • As part of the development of the tool we talked to stakeholders; future developments require further engagement. • Usefulness to stakeholders outside of ONS • What else could be included? • What could be clarified? • Via StatUserNet, Population Statistics Community • Via LIRPs!
What do you think? • Have you looked at or used the MYE QA tool? • Your experiences? • From what you’ve seen today is this something you would find useful? • What else would you like to see? • Do you do something similar?
Part 3 Measuring uncertainty in the ONS mid-year estimates
Uncertainty measures- work in progress • Measuring uncertainty around the mid-year population estimates allows users to evaluate change over time • We have already published uncertainty measures for 2002-10 as research statistics • Uses modelled immigration • Uses school boarder adjustment • We are now reviewing the methods to take into account (1) recent changes in the way the MYEs are calculated and (2) new information from the 2011 Census
Uncertainty measures- our approach • Methods have been developed in collaboration with academics at Southampton University • We use a simulations-based approach to measure variability around the MYEs • Our methods mirror the complexity of current population estimates, which involve using administrative, survey and census data and a range of statistical techniques
Cohort component method
MYEs = Base population +/- Natural change +/- International migration +/- Internal migration +/- Other changes
Uncertainty estimates = built from the same components (Base population, Natural change, International migration, Internal migration, Other changes); the diagram carries the label "Assume no variance", but the components it applies to are not recoverable here
Bootstrapping to create 1,000 simulations to derive 95% CIs for MYEs
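The arithmetic of the cohort component method can be sketched for a single local authority as below. Splitting each migration component into an inflow and an outflow is an assumption about what the paired plus and minus signs represent; the function name and all figures are hypothetical.

```python
def roll_forward(base, births, deaths, intl_in, intl_out,
                 internal_in, internal_out, other=0):
    """Roll a base population forward one year by adding net components."""
    natural_change = births - deaths
    net_international = intl_in - intl_out   # assumption: inflow minus outflow
    net_internal = internal_in - internal_out
    return base + natural_change + net_international + net_internal + other


# Made-up figures for one local authority.
mye = roll_forward(
    base=100_000, births=1_200, deaths=900,
    intl_in=800, intl_out=500,
    internal_in=4_000, internal_out=4_300,
    other=50,
)
print(mye)  # 100350
```

Repeating this calculation with simulated values for the uncertain components, rather than single point values, is what produces a distribution of possible estimates.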
Bootstrapping International Immigration
MYEs = • International Passenger Survey (IPS) National Estimate • Split by type: Workers, Students, Others, UK Returners • Use admin data to distribute to LAs: MWS (Workers), HESA/BIS/WAG (Students), PRDS (Others), Census (UK Returners) • Recombine to create LA totals • 348 Local Authority estimates of international immigrants
Uncertainty estimates = • 1,000 simulations from IPS • 1,000 simulated admin counts for each migrant type in each LA (1,000 Worker counts, 1,000 Student counts, 1,000 ‘Other’ counts, 1,000 UK Returner counts) • Apply admin-based proportions to IPS estimate for each migrant type to derive counts of each migrant type in each LA • Sum these to produce 1,000 LA totals: 1,000 simulated international immigrant counts for each LA • 26th and 975th ranked values provide uncertainty interval
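The simulation steps can be illustrated with a rough sketch. It does not reproduce the ONS bootstrap: simple normal draws stand in for the resampling of the IPS and the admin sources, whose details are not given in these slides. The relative standard error, the noise applied to admin counts, the function names and all figures are assumptions made for illustration.

```python
import random


def simulate_la_totals(national_by_type, admin_counts, n_sims=1000,
                       ips_rel_se=0.05, seed=0):
    """Return {LA: sorted list of n_sims simulated immigrant totals}."""
    rng = random.Random(seed)
    las = sorted({la for counts in admin_counts.values() for la in counts})
    sims = {la: [] for la in las}
    for _ in range(n_sims):
        totals = dict.fromkeys(las, 0.0)
        for mtype, national in national_by_type.items():
            # Assumption: national estimate varies normally around its value.
            sim_national = max(0.0, rng.gauss(national, ips_rel_se * national))
            # Assumption: admin counts vary with square-root noise.
            sim_counts = {la: max(0.0, rng.gauss(c, c ** 0.5))
                          for la, c in admin_counts[mtype].items()}
            total_count = sum(sim_counts.values())
            if total_count == 0:
                continue
            # Apply admin-based proportions to the simulated national figure.
            for la, c in sim_counts.items():
                totals[la] += sim_national * c / total_count
        for la in las:
            sims[la].append(totals[la])
    return {la: sorted(values) for la, values in sims.items()}


def uncertainty_interval(sorted_sims, low_rank=26, high_rank=975):
    """Pick the ranked values (1-based) quoted on the slide."""
    return sorted_sims[low_rank - 1], sorted_sims[high_rank - 1]


# Hypothetical national estimates by migrant type and admin counts by LA.
national = {"workers": 1000.0, "students": 500.0}
admin = {"workers": {"A": 300, "B": 700}, "students": {"A": 400, "B": 100}}

sims = simulate_la_totals(national, admin)
low, high = uncertainty_interval(sims["A"])
print(round(low), round(high))
```

With these made-up inputs the point estimate for LA "A" would be 1000 x 0.3 + 500 x 0.8 = 700, and the interval printed should sit on either side of that value.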