Two SAS Bootstrapping Programs
Lynn Lethbridge
SHRUG, November 2010
What is Bootstrapping?
• A method to estimate a statistic’s sampling distribution
• Bootstrap samples are drawn repeatedly, with replacement, from the original data
• The statistic is re-calculated from each new sample and saved in a dataset (e.g. 200 bootstraps yield 200 statistics)
• The standard error of the statistic is calculated as the standard deviation of the bootstrap statistics
• Bootstrapping is not used for the point estimate
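The steps above can be sketched in a few lines. This is an illustrative Python sketch, not the SAS code from the talk; the function name `bootstrap_standard_error`, the sample values, and the fixed seed are assumptions made for the example.

```python
import random
import statistics


def bootstrap_standard_error(data, statistic, n_boot=200, seed=0):
    """Return (standard error, list of bootstrap statistics)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(data)
    boot_stats = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the original data.
        resample = rng.choices(data, k=n)
        # Re-calculate the statistic on the new sample and save it.
        boot_stats.append(statistic(resample))
    # Standard error = standard deviation of the saved bootstrap statistics.
    return statistics.stdev(boot_stats), boot_stats


# Hypothetical values, for illustration only.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]

# The point estimate comes from the original data, not from the bootstraps.
point_estimate = statistics.mean(data)
se, boot_stats = bootstrap_standard_error(data, statistics.mean, n_boot=200, seed=42)
print(f"point estimate = {point_estimate:.3f}, bootstrap SE = {se:.3f}")
```

Any function that maps a sample to a single number can be passed as `statistic`, which is what makes the approach useful when no analytical standard error is available.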
When to Use Bootstrapping
• The statistic's distribution has no clear analytical solution
  • e.g. Gini coefficient, poverty intensity
• Testing for sensitivity
• Complex survey design (not a simple random sample)
  • e.g. Statistics Canada surveys use a stratified, multistage design
  • Households within clusters within strata are selected
  • Observations will not be independent, so variance calculated the usual way will be underestimated
Two Programs
• One is ‘traditional’ bootstrapping
  • Re-sampling from the original sample
• The second is bootstrapping using Statistics Canada survey data
  • Statistics Canada does the re-sampling heavy lifting in most of its surveys
  • Use the bootstrap weights provided to calculate the standard error
Program 1
• Project examining the effect of trade on ‘poverty intensity’ in Canada and the US
• Used state/province-level measures in regression analysis
• Used bootstrapping to measure the robustness of results given a different mix of policies
• The dataset consists of 61 unique observations of states and provinces
• Re-sample to see whether the results would change with a different make-up of regions
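The re-sampling idea behind Program 1 can be sketched as follows: draw 61 regions with replacement, re-estimate the regression on each draw, and look at how much the coefficient moves. This is a Python illustration rather than the SAS program from the talk. The data are synthetic, the variable names `trade_exposure` and `poverty_intensity` are hypothetical, and a one-regressor OLS slope stands in for whatever model the project actually used.

```python
import random
import statistics


def ols_slope(rows):
    """Slope from a simple one-regressor OLS fit of y on x."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


rng = random.Random(1)

# Synthetic stand-in for the 61 state/province observations (not real data).
regions = []
for _ in range(61):
    trade_exposure = rng.uniform(0.0, 1.0)
    poverty_intensity = 0.5 - 0.3 * trade_exposure + rng.gauss(0.0, 0.05)
    regions.append((trade_exposure, poverty_intensity))

full_slope = ols_slope(regions)

# Re-sample whole regions (rows) with replacement and re-estimate each time.
boot_slopes = []
for _ in range(200):
    resample = rng.choices(regions, k=len(regions))
    boot_slopes.append(ols_slope(resample))

slope_se = statistics.stdev(boot_slopes)
same_sign = sum((s < 0) == (full_slope < 0) for s in boot_slopes) / len(boot_slopes)
print(f"slope = {full_slope:.3f}, bootstrap SE = {slope_se:.3f}, "
      f"share with same sign = {same_sign:.2f}")
```

Re-sampling whole rows keeps each region's outcome paired with its own regressors, so every bootstrap dataset looks like a plausible alternative mix of regions.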
Program 2
• Project using the National Longitudinal Survey of Children and Youth (NLSCY)
• Examined the effect of having a child with disabilities on the health of mothers and fathers
• Ordered probit model, using Statistics Canada NLSCY bootstrap weights to estimate standard errors
Weighting
• Many survey datasets include sampling weights so that results represent the population
• The mechanics of using bootstrap weights are the same as for sampling weights
• Each individual in the survey has a sample weight and a full set of bootstrap weights
• Re-estimate your model or statistic over and over, using a different bootstrap weight each time
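The mechanics of bootstrap weights can be sketched as below, in Python rather than SAS. Everything here is an assumption made for illustration: the values and sample weights are invented, and `make_toy_bootstrap_weights` builds stand-in weights by simple re-sampling, which is not how a statistical agency derives the weights it ships with a survey file. The variance is taken around the full-sample estimate, which is one common convention; using the standard deviation of the replicate estimates is another.

```python
import math
import random


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def make_toy_bootstrap_weights(sample_weights, n_boot, seed=0):
    """Toy stand-in: scale each sample weight by how often the unit is re-drawn."""
    rng = random.Random(seed)
    n = len(sample_weights)
    all_weights = []
    for _ in range(n_boot):
        counts = [0] * n
        for _ in range(n):
            counts[rng.randrange(n)] += 1
        all_weights.append([w * c for w, c in zip(sample_weights, counts)])
    return all_weights


def bootstrap_weight_se(values, sample_weights, bootstrap_weights, estimator):
    # Point estimate uses the ordinary sample weight.
    point = estimator(values, sample_weights)
    # Re-estimate once per bootstrap weight; only the weight column changes.
    replicates = [estimator(values, bw) for bw in bootstrap_weights]
    variance = sum((r - point) ** 2 for r in replicates) / len(replicates)
    return point, math.sqrt(variance)


# Hypothetical survey records: an outcome and a sample weight per respondent.
values = [3, 4, 2, 5, 4, 3, 1, 4]
sample_weights = [120, 80, 200, 150, 95, 110, 60, 175]

bootstrap_weights = make_toy_bootstrap_weights(sample_weights, n_boot=200, seed=7)
point, se = bootstrap_weight_se(values, sample_weights, bootstrap_weights, weighted_mean)
print(f"weighted mean = {point:.3f}, bootstrap-weight SE = {se:.3f}")
```

With a real survey file the `make_toy_bootstrap_weights` step disappears: the bootstrap weight columns are read from the file, and only the loop over them remains.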
Bootstrap Weight Derivation
[Diagram; labels: Re-sampling, A Miracle Occurs, Bootstrap Weights]