Multiple Indicator Cluster Surveys Survey Design Workshop

Presentation Transcript


  1. Multiple Indicator Cluster Surveys: Survey Design Workshop • Sampling: Advanced Sampling

  2. Major steps in designing MICS sample • Define objectives • Key indicators • Desired level of precision • Sub-national domains of estimation • Identify most appropriate sampling frame • Most recent census of population and housing • Master sample or sample for another survey conducted recently

  3. Major steps in designing MICS sample • Determine sample size and allocation • Determine availability of previous MICS or DHS results to provide measures of sampling parameters

  4. Sampling Frame • Sampling frame: • Nationally-representative • Complete coverage • Measures of size (households or population) for small area units • Generally most recent census is the most effective sampling frame

  5. Sampling Frame • In some cases more recent pre-census listing may be available • When no census is available, identify most complete geographic frame available (e.g. list of villages/localities with estimated population)

  6. Sampling Frame • Common problems with area frames: • Coverage issues • Census maps of poor quality • Errors and changes in area boundaries • Inappropriate type and size of area units • Lack of auxiliary information

  7. Sample Size Determination • n = [4 · r · (1 − r) · deff] / [e² · pb · AveSize · RR]

  8. n is the required sample size (number of households) • 4 is a factor to achieve the 95 percent level of confidence • r is the predicted or estimated value of the indicator in target population • deff is the design effect

  9. RR is the response rate • pb is the proportion of the target subpopulation in total population (upon which the indicator, r, is based) • AveSize is the average household size (that is, average number of persons per household)

  10. e is the margin of error to be tolerated at the 95% level of confidence • Currently e = 0.12r, that is, 12% of r; the relative standard error of r is therefore 6%, because e = 2 × standard error of r
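
A minimal sketch of the calculation, assuming the standard MICS form n = 4·r·(1 − r)·deff / (e²·pb·AveSize·RR) implied by the definitions in slides 8-10, with e = 0.12r. The function name and the input values are illustrative assumptions, not figures from the workshop.

```python
def mics_sample_size(r, deff, pb, ave_size, rr, rel_margin=0.12):
    """Number of households needed for indicator r (assumed MICS formula)."""
    e = rel_margin * r  # margin of error, e = 0.12r as on slide 10
    return (4 * r * (1 - r) * deff) / (e ** 2 * pb * ave_size * rr)


# Hypothetical inputs, chosen only to illustrate the arithmetic.
n = mics_sample_size(r=0.40, deff=1.5, pb=0.03, ave_size=5.0, rr=0.90)
print(round(n))
```

Because e is proportional to r, the result scales with (1 − r)/r under this form, so indicators with low prevalence and small target groups (small pb) drive the sample size up quickly.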

  11. Previously in MICS2 • Two different values were used for the margin of error • Margin of error was 5 percentage points for high values of r (over 25%) • Margin of error was 3 percentage points for low values of r (25% or less) • Users had difficulty deciding on the sample size for their surveys

  12. MICS template for sample size calculation - EXCEL FILE

  13. Selection of key indicators • Choose an important indicator that will yield the largest sample size • Step 1: Select 2 or 3 target populations, each representing a small percentage of the total population (pb); typically • Children 12-23 months: 2-4% or • Children under 5 years: 7%-20%

  14. Selection of key indicators • Step 2: Review important indicators for these target groups but ignore indicators with very low or very high prevalence (less than 10% or over 40%, respectively) • Among indicators for which low coverage is desirable, do not choose one that is already acceptably low • Do not choose childhood and maternal mortality ratios

  15. Explicit Stratification • Explicit stratification: dividing the sampling frame into sub-groups (called strata) of homogeneous (similar) PSUs. • Advantages: • Better precision because reduced variance within stratum given similarity of units • Flexible design, sub-national estimates for smaller domains (differential sampling rates) • Example of stratification: region, urban/rural

  16. Implicit Stratification • Sort the sampling frame according to certain characteristics such as regions, urban-rural residence, sub-regions, districts, etc., then select a systematic PPS sample. • Ensures a representative sample for each subgroup • Automatically provides proportional allocation by size of subgroup

  17. Allocation of sample to strata/domains • Proportional allocation • Effective for precision of estimates at the national level • Equal allocation to each domain • Used when each domain requires same level of precision • Optimum allocation – takes into account differential variance and costs by stratum • For example, variability may be higher in urban areas and enumeration costs may be higher in rural areas – use higher sampling rate for urban areas
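
The first two allocation rules can be sketched as follows. This is an illustration only: the stratum names, sizes and the total of 400 clusters are hypothetical, and the largest-remainder rounding is one reasonable choice rather than a MICS prescription.

```python
def proportional_allocation(total, sizes):
    """Split `total` sample units across strata in proportion to stratum size."""
    grand_total = sum(sizes.values())
    exact = {s: total * size / grand_total for s, size in sizes.items()}
    alloc = {s: int(x) for s, x in exact.items()}
    leftover = total - sum(alloc.values())
    # Hand out units lost to rounding, largest fractional remainder first.
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc


def equal_allocation(total, strata):
    """Give every stratum (domain) the same number of sample units."""
    base, extra = divmod(total, len(strata))
    return {s: base + (1 if i < extra else 0) for i, s in enumerate(strata)}


# Hypothetical household counts per region.
frame = {"North": 500_000, "Central": 300_000, "South": 200_000}
prop = proportional_allocation(400, frame)
equal = equal_allocation(400, list(frame))
print(prop, equal)
```

Optimum allocation is not shown because it needs stratum-level variance and cost estimates that the slides do not provide.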

  18. Subnational estimates • Number of separate areas (domains) for which separate, equally reliable estimates are wanted affects sample size • For example, if 10 regional estimates are wanted, theoretically the sample should be increased by factor of 10 • As a compromise, larger sampling errors accepted for subnational estimates • One proposal (by Dr. Vijay Verma) – increase national sample size by factor of D^0.65, where D is the number of domains • Results in an average increase in the sampling errors for domain estimates by a factor of about 1.5
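
A small sketch of the arithmetic behind this proposal. It assumes equal-sized domains, similar design effects across domains, and standard errors proportional to 1/sqrt(sample size); the function names are hypothetical.

```python
def domain_sample_factor(n_domains, exponent=0.65):
    """Factor by which the national sample size is increased for D domains."""
    return n_domains ** exponent


def domain_se_inflation(n_domains, exponent=0.65):
    """Approximate growth in a domain's standard error.

    Each of D equal domains receives n * D**exponent / D households instead
    of n, so its standard error grows by sqrt(D / D**exponent).
    """
    return (n_domains / n_domains ** exponent) ** 0.5


factor = domain_sample_factor(10)
se_growth = domain_se_inflation(10)
print(round(factor, 2), round(se_growth, 2))
```

Under these assumptions, 10 domains call for roughly 4.5 times the national sample rather than 10 times, at the price of domain standard errors about 1.5 times larger.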

  19. Sampling Stages • Ideal to have two-stage sample design, with EAs defined as PSUs • In some countries only frame of larger administrative units available • Three-stage sample design: larger area units selected as PSUs • Necessary to delineate smaller segments in each sample PSU

  20. Number of PSUs and Cluster Size • Survey costs depend not only on number of households but their distribution among primary sampling units (PSUs) • Important to determine effective balance between number of sample PSUs and number of sample households per cluster • In general, the more PSUs the better for reliability but the greater the cost (mostly costs of travel and listing)

  21. Number of PSUs and Cluster Size • Example: 8000 households selected in 400 PSUs of 20 sample households each is a much more reliable sample than 200 PSUs of 40 households each, but more expensive • Number of sample households per cluster should be as small as practical for reliability • A range of 15-25 households for MICS appears to be effective

  22. Design Effect (DEFF) • Deff - ratio of the variance of an estimate based on the stratified multi-stage sample design to the corresponding variance from a simple random sample of the same size • Measure of the relative efficiency of the sample design • Effective stratification reduces the deff • Cluster sampling increases the deff

  23. Design Effect (DEFF) • In case of cluster sampling, deff generally measures effect of clustering • deff = 1 + δ(n̄ − 1) • δ = intraclass correlation coefficient, or measure of homogeneity within cluster • n̄ = average number of households per cluster • Design effect increases with intraclass correlation and cluster size
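
To make the relationship concrete, the sketch below assumes the usual approximation deff = 1 + δ × (average cluster size − 1) and applies it to the two designs compared on slide 21. The value δ = 0.05 is a hypothetical figure chosen for illustration.

```python
def design_effect(delta, avg_cluster_size):
    """Approximate design effect from clustering alone."""
    return 1 + delta * (avg_cluster_size - 1)


def effective_sample_size(n_households, delta, avg_cluster_size):
    """Size of a simple random sample giving the same variance."""
    return n_households / design_effect(delta, avg_cluster_size)


DELTA = 0.05  # assumed intraclass correlation, not taken from the slides
for n_psus, per_cluster in [(400, 20), (200, 40)]:
    n = n_psus * per_cluster
    deff = design_effect(DELTA, per_cluster)
    n_eff = effective_sample_size(n, DELTA, per_cluster)
    print(n_psus, per_cluster, round(deff, 2), round(n_eff))
```

With the same 8000 households, the design with more, smaller clusters has the lower design effect and the larger effective sample size under this assumed δ, which is the sense in which it is more reliable.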

  24. First Stage Selection of PSUs • Standard methodology for MICS and other household surveys – select EAs or clusters systematically with PPS • Important to sort frame before selection, in order to ensure effective implicit stratification • Traditional procedure – cumulate measures of size, determine sampling interval and random start, generate selection numbers
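
The traditional procedure can be sketched as below. This is a minimal illustration, not the MICS template: the function name, the use of a fractional interval and the example measures of size are assumptions.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def systematic_pps(sizes, n_select, rng=None):
    """Select n_select PSUs systematically with probability proportional to size.

    `sizes` holds each PSU's measure of size in frame order (the frame should
    already be sorted for implicit stratification). Returns the selected
    positions; a position repeats if that PSU receives more than one hit.
    """
    rng = rng or random.Random()
    cumulated = list(accumulate(sizes))        # cumulate measures of size
    interval = cumulated[-1] / n_select        # sampling interval
    start = interval * (1 - rng.random())      # random start in (0, interval]
    selection_numbers = [start + k * interval for k in range(n_select)]
    last = len(sizes) - 1
    # A PSU is hit when a selection number falls inside its cumulated range.
    return [min(bisect_left(cumulated, s), last) for s in selection_numbers]


# Hypothetical household counts for eight EAs in a sorted frame.
measures = [120, 85, 240, 60, 150, 95, 310, 140]
print(systematic_pps(measures, 3, rng=random.Random(2024)))
```

A PSU whose measure of size exceeds the sampling interval can appear more than once in the result, which is the situation slide 25 deals with.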

  25. Large sample PSUs in PPS sampling • Sometimes a PSU may have a measure of size larger than the sampling interval • PSU may be selected more than once in the systematic PPS selection • Option 1 – if the PSU is selected two or more times, multiply the number of households to be selected by the number of “hits” • Option 2 – separate the large PSUs and include in sample with a probability of 1

  26. MICS Sampling Option 1 – new sample with household listing • Design new MICS sample • Two stages with census as frame • Use of implicit stratification, systematic selection of census EAs at first stage with pps • List households in selected EAs/segments • Select households systematically from listing • Interview selected households, no replacement will be allowed

  27. Sampling Option 1 - continued • Advantages of option 1 - simple design - probability-based - if possible self-weighting (national level) • Limitations of option 1 - expense of listing households - time necessary to list households [for example, a sample size of 5,000 households may require 25,000 to 50,000 households to be listed]

  28. MICS Sampling Option 2 – use an existing sample • Design MICS as a rider to another survey if timely and feasible • Use sample from a previous survey and re-interview households for MICS • Or, use old survey sample EAs and construct new listing of households to select for MICS • Old sample must be probability-based, national in scope • Possibilities – DHS, other national health survey, recent labour force survey • Important: design parameters must be known (such as selection probability, stratification, etc.)

  29. Sampling option 2 - continued • Use of existing master sampling frame • Some countries use master sample design for intercensal national household surveys • Master samples generally sufficiently large for MICS; subsample of PSUs can be selected • Advantage – updated maps may be available for master sample of PSUs, and perhaps updated listing

  30. Sampling option 2 - continued • Advantages of using previous sample - cost savings - maps available for interviewers - appropriate sampling plan available - simplicity • Limitations of using old sample - burden on respondents - sample design may need modification * sample size * sub-national coverage * number of PSUs or clusters • Balance between loss and gain

  31. Listing and Selection of Households • Household listing manual is available • Importance of new listing to represent current population • Problems with using previous listing (older than 1 year) • Does not represent newer households • Distribution of sample population by age group distorted, generally with higher median age • Difficulty of finding households in old list

  32. Listing and Selection of Households • MICS recommends a separate household listing operation • More reliable as listing staff are less likely than interviewers to bias the sample by excluding households that are difficult to reach • Allows household selection to be done in a single central location using reliable and uniform procedures

  33. Listing and Selection of Households • Household selection in the office: • Advantages – conducted by specialized staff, possible to avoid selection bias in the field, possible to control overall sample size • Disadvantage – increased costs from having two field visits • Selection in the field: use household selection table • Advantage – cost savings of having one integrated field operation • Disadvantage - correct sampling may be difficult for field staff, selection may be biased

  34. Listing and Selection of Households • Excel template for automatically generating the sample of households based on the number of households listed (see spreadsheet) • Common problems found in listing operations • Problem with quality of sketch maps – difficult to determine segment boundaries • Sometimes large differences found between number of households in frame (census) and number listed.
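
What such a template does can be approximated by equal-probability systematic selection from the listing. The sketch below is not the MICS Excel template; the function name, the fractional interval and the rule of taking every household in a very small cluster are assumptions.

```python
import math
import random


def select_households(n_listed, n_select, rng=None):
    """Return serial numbers (1-based) of households selected from a listing."""
    rng = rng or random.Random()
    if n_listed <= n_select:
        return list(range(1, n_listed + 1))    # assumed rule: take them all
    interval = n_listed / n_select             # fractional sampling interval
    start = rng.random() * interval            # random start in [0, interval)
    return [
        min(math.floor(start + k * interval) + 1, n_listed)
        for k in range(n_select)
    ]


# Hypothetical cluster with 137 listed households and a take of 20.
chosen = select_households(137, 20, rng=random.Random(7))
print(chosen)
```

Running the selection centrally from the submitted listing keeps the procedure uniform across clusters, in line with the recommendation on slide 32.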

  35. Sampling strategy for low fertility countries • In MICS 4 and 5, some low fertility countries are using second-stage stratification of listing by households with and without children under 5 • Higher sampling rate used for households with children • Increases number of households with children in MICS sample, and therefore number of sample children

  36. Sampling strategy for low fertility countries (continued) • Improves the reliability of the child indicators without increasing the sample size to a very high level • This procedure also increases the variability in the weights and the design effects for the overall sample • Important to avoid very large variability in the weights for households with and without children • Differential weights between households with and without children generally should not exceed a factor of about 4

  37. Implications of sampling strategy on sample size calculations • One parameter in the sample size calculation template is the proportion of the indicator subpopulation • Using a higher sampling rate for households with children increases the proportion of children under 5 in the sample • The proportion of children under 5 (or smaller age groups) should be multiplied by a factor that reflects the increase in sample households with children
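
One way to derive that factor, sketched under stated assumptions: households with children are sampled at `rate_ratio` times the rate of other households, and the number of children per household with children does not depend on the sampling rate. The function name and example values are hypothetical.

```python
def oversampling_factor(share_with_children, rate_ratio):
    """Multiplier for pb when households with children are oversampled.

    share_with_children: proportion of listed households with children under 5
    rate_ratio: sampling rate for households with children divided by the
                rate for households without children
    """
    p = share_with_children
    sample_share = rate_ratio * p / (rate_ratio * p + (1 - p))
    return sample_share / p


# Hypothetical low-fertility setting: 25% of households have a child under 5
# and those households are sampled at three times the rate of the others.
factor = oversampling_factor(0.25, 3)
adjusted_pb = 0.08 * factor
print(factor, adjusted_pb)
```

In this sketch the ratio between the two household weights equals `rate_ratio`, so a value of 3 stays within the factor of about 4 suggested on slide 36.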

  38. Implications of sampling strategy on weighting procedures • Under normal MICS sample design, weights vary by sample cluster • With second stage stratification by households with and without children, two weights need to be calculated for each cluster: for households with and without children

  39. Survey weighting procedures • Survey data collected using a complex design featuring clustering, unequal probabilities of selection and stratification: • All analyses must apply survey weights in order to prevent biased results • Formulas for calculating weights depend on the exact sample design used in each country • MICS has 4 sets of weights: households, women, men and children

  40. Survey weighting procedures • Components of MICS survey weights: • Design weight: inverse of the final probability of selection for households • Adjustment factors for nonresponse (cluster, household, woman, child level) • Normalized weights so that the total weighted number of observations is equal to the total unweighted number (sample size)
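
The three components can be combined as in the sketch below, which covers household weights for a two-stage design only. The cluster-level nonresponse adjustment, the field names and the example figures are assumptions; actual formulas depend on each country's design.

```python
def household_weights(clusters):
    """Return one normalized household weight per cluster.

    Each cluster is a dict with:
      p1        - first-stage selection probability of the PSU
      listed    - households listed in the PSU
      selected  - households selected from the listing
      completed - households with a completed interview
    """
    raw = []
    for c in clusters:
        p2 = c["selected"] / c["listed"]          # second-stage probability
        design_weight = 1 / (c["p1"] * p2)        # inverse of final probability
        response_rate = c["completed"] / c["selected"]
        raw.append(design_weight / response_rate)  # nonresponse adjustment
    n_completed = sum(c["completed"] for c in clusters)
    weighted_total = sum(w * c["completed"] for w, c in zip(raw, clusters))
    scale = n_completed / weighted_total           # normalization
    return [w * scale for w in raw]


# Hypothetical sample of two clusters.
sample = [
    {"p1": 0.02, "listed": 150, "selected": 20, "completed": 18},
    {"p1": 0.05, "listed": 90, "selected": 20, "completed": 20},
]
weights = household_weights(sample)
print([round(w, 3) for w in weights])
```

After normalization the weighted count of completed households equals the unweighted count, while the relative sizes of the weights are unchanged.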

  41. Survey weighting procedures

  42. Sampling Error Estimation • Necessary to evaluate reliability of survey estimates • Possible only when probability sampling is used • Should be done for 30-50 important indicators • Methodology is complex and design-specific • Several software packages: • SPSS Complex Samples module – used in MICS • SAS, Stata, SUDAAN, Clusters, WesVar, CENVAR, PCCarp, etc. • Standard error, confidence intervals and DEFF

  43. Sampling Error Estimation SPSS Complex Samples module • Advantages: • Simple to use • Template syntax available for standard indicators • Supported by MICS Global and Regional staff • Steps: • Set up sampling parameter specifications file (csplan) • Define variables for stratum, PSU and weight

  44. Sampling Error Estimation SPSS Complex Samples module • Stratum should be lowest level of explicit stratification (for example, province, urban/rural) • Necessary to have minimum of two sample PSUs per stratum
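
Before setting up the analysis plan it can help to check this requirement on the data file. The sketch below is independent of SPSS; the function name and the example stratum and PSU labels are hypothetical.

```python
from collections import defaultdict


def strata_with_too_few_psus(records, minimum=2):
    """Return strata that contain fewer than `minimum` distinct sample PSUs.

    `records` is an iterable of (stratum, psu) pairs, one per household.
    """
    psus_by_stratum = defaultdict(set)
    for stratum, psu in records:
        psus_by_stratum[stratum].add(psu)
    return sorted(s for s, psus in psus_by_stratum.items() if len(psus) < minimum)


# Hypothetical household records: (stratum, PSU identifier).
households = [
    ("North urban", 101), ("North urban", 102), ("North urban", 101),
    ("North rural", 201), ("North rural", 201),
    ("South urban", 301), ("South urban", 302),
]
problem_strata = strata_with_too_few_psus(households)
print(problem_strata)
```

Any stratum reported here would need attention (for example, being combined with a similar stratum) before variances can be estimated for it.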

  45. Reducing bias • Accuracy of survey results depends on both variance and bias (mostly from nonsampling errors) • Bias should be minimized with quality control for all survey operations • Basic data quality determined during enumeration • Important to have good training and supervision in the field • Data capture should include 100% or sample verification • Important to have quality control for editing and coding procedures • Computer consistency and range checks
