Créer une présentation
Télécharger la présentation

Télécharger la présentation
## Survey Methodology Sampling error and sample size

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -

**Survey MethodologySampling error and sample size**EPID 626 Lecture 4**Lecture overview**• Finish discussion of nonprobability sampling • Discuss sampling error • Discuss sample sizes • Some practical exercises**Nonprobability sampling designs**• Focus groups: small group chosen because the investigator believes they accurately reflect the population of interest- usually used for planning or piloting**Advantages**Cost Efficiency Effort May be the only option Disadvantages Bias Bias Bias Not a valid representation of the population Advantages and Disadvantages**When would nonprobability sampling be appropriate?**• Surveys of hard-to-identify groups (ex. A survey of goals and aspirations among gang members) • Surveys of specific groups (ex. A survey about pain among hospice patients) • Surveys in pilot situations (used for planning purposes, not for research per se)(Fink, 1995)**Sampling error**• The central limit theorem: In a sequence of samples of a population, for a particular estimate (say a mean), there will be a normal distribution around the true population value.**This variation around the true value is the sampling**error-it stems from the fact that, by chance, samples may differ from the population as a whole.**The larger the sample size and the less variance of what is**being measured, the more tightly the sample estimates will “bunch” around the true population value, and the more accurate the sample-based estimate will be.**Standard error of a mean**• The standard deviation of the distribution of sample estimates of the mean that would be formed if an infinite number of samples of a given size were drawn.**~67% of the means of samples of a given size and design will**fall within the range of 1 SE of the true population mean • ~95% will fall within 2 SE -this is where we derive a 95% confidence interval**Proportions**• Mean of a two-value (binomial) distribution • Var of a proportion = p(1-p) • So the**Table 2.1Confidence Ranges for Variability Attributable to**Sampling • Trends • If sample size=75 and p=20,**Confidence intervals**• In a survey of 100 respondents, 20% say yes. What is the confidence interval for a 95% confidence level? • In a survey 250 respondents, 10% say yes. What is the confidence interval for a 95% confidence level? What if 50% said yes?**In a survey of 100 respondents, 20% say yes. What is the**confidence interval for a 95% confidence level? • Interval is 8. • 95% CI=(12%, 28%)**In a survey 250 respondents, 10% say yes. What is the**confidence interval for a 95% confidence level? What if 50% said yes? • Interval is about 3.8. • 95% CI is about (6.2%, 13.8%) • If 50% said yes, CI is about (3.7%, 16.3%)**Sampling error and sampling strategy**• SRS is approximated by the standard error • Systematic sampling • If not stratified, sampling error is the same as in SRS. • If stratified, errors are lower than those associated with SRS for the same size for variables that differ (on average) by stratum, if rates of selection are constant across strata.**Sampling error and sampling strategy (2)**• Unequal rates of selection decrease sampling error for oversampled groups. • It will generally produce sampling errors for the whole sample that are higher than those associated with SRS of the same size for variables that differ by stratum.**Sampling error and sampling strategy (3)**• But when oversampling occurs in strata that have higher than average variances, overall sampling errors will be lower than for a sample of the same size with equal probabilities of selection. (see Fowler text p.32 for more extensive discussion)**Sampling error and sampling strategy (4)**• Clusters will produce sampling errors that are higher than SRS for the same size for variables that are more homogenous within clusters than in the population as a whole.