
NLSCY - Variance


liz

Presentation Transcript


  1. NLSCY - Variance NLSCY - National Longitudinal Survey of Children and Youth

  2. Objectives of the Presentation - Demonstration • Why is it necessary to compute the variance? • How can the variance be computed with NLSCY data?

  3. Why Compute the Variance? • NLSCY data come from a probabilistic survey • Estimates produced from any probabilistic survey are subject to sampling variability • To make valid inferences about the population of interest, that variability must be measured

  4. Main Difficulty: Complexity of the NLSCY’s Sampling Plan • Two different sample frames used to select the sample: • Labour Force Survey (LFS), itself a survey with a complex sample design • Birth Register • Use of two frames for certain groups (five-year-olds, Cycle 3)

  5. Complexity of the NLSCY’s Sampling Plan (continued) • Children’s selection probabilities very uneven • Non-response adjustments that cross strata boundaries • Empty clusters from the LFS

  6. Effects of the Complexity of the NLSCY’s Sampling Plan • No exact analytical formula for computing the variance because of the complex sample design. • No commercial application can fully take the NLSCY’s complexity into account in computing the variance.

  7. How to Compute the Variance for the NLSCY • 3 solutions: • Approximate sampling variability tables provided in the user’s guide (in the form of coefficients of variation (CVs)). • Approximate CV tables for a number of specific subject areas (Excel spreadsheet). • Use bootstrap weights and SAS program supplied by the NLSCY.

  8. How to Compute the Variance for the NLSCY • Of these 3 solutions: • The first two provide only an approximation of the variance and are suited to exploratory analysis • Only the third solution computes the variance more precisely

  9. Sampling Variability Tables • Very limited • The user’s guide explains how to use them

  10. Approximate CV Tables (Excel) • Brief excerpt: [image of the Excel table not included in the transcript]

  11. Approximate CV Tables (Excel) • Let’s go directly to the CV table and take a closer look...

  12. Approximate CV Tables (Excel) • Originally created to answer the question: • Is the Cycle 5 sample size large enough? • Approximation of the exact variance: • Takes the sample design into account by using bootstrap weights. • However, the CVs are computed for a randomly generated variable rather than for real survey variables.

  13. Approximate CV Tables (Excel) • CVs available for many subject areas, for a number of proportions. • Lots of additional information available: • sample size, projected size, confidence interval
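The CV and confidence interval listed in such tables relate to the variance in the standard way. A minimal Python sketch, assuming a normal approximation for the interval; the function names and the example numbers are illustrative and do not come from the NLSCY tables:

```python
import math

def cv_percent(estimate, variance):
    """Coefficient of variation: standard error as a percentage of the estimate."""
    return 100.0 * math.sqrt(variance) / abs(estimate)

def confidence_interval(estimate, variance, z=1.96):
    """Normal-approximation interval (z = 1.96 gives roughly 95% coverage)."""
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width

# Illustrative values only: an estimated proportion of 0.20 with variance 0.0004.
cv = cv_percent(0.20, 0.0004)
low, high = confidence_interval(0.20, 0.0004)
```

The variance passed in should be one that reflects the sample design (for example, a bootstrap variance); a variance computed as if the data were a simple random sample would understate or overstate the CV.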

  14. Approximate CV Tables (Excel) • Functions: • Can choose areas of interest and obtain an approximate CV. • Possibility of making queries • Example: What subject areas have CVs of less than 25%?
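The kind of query mentioned in the example amounts to a simple filter on the CV column. A small Python sketch of the idea; the table contents below are hypothetical placeholders (made-up labels and values), not figures from the NLSCY spreadsheet:

```python
# Hypothetical CV table: subject-area labels and CV values (in percent) are placeholders.
cv_table = {"area_A": 8.5, "area_B": 31.0, "area_C": 24.9, "area_D": 25.0}

def areas_below(cv_by_area, threshold):
    """Return the areas whose CV is strictly below the threshold, ordered by CV."""
    selected = (area for area, cv in cv_by_area.items() if cv < threshold)
    return sorted(selected, key=cv_by_area.get)

# Subject areas with a CV of less than 25%.
reliable = areas_below(cv_table, 25.0)
```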

  15. Approximate CV Tables (Excel) • In a nutshell: • Much more detailed than the tables provided in the user’s guide, but … • they can’t replace exact variance calculation • limited number of subject areas

  16. Bootstrap Weights and NLSCY_VES • Computing the variance using the replicate (bootstrap) method: • for longitudinal estimates • for cross-sectional estimates • for all cycles • for all desired subject areas

  17. Bootstrap Weights and NLSCY_VES • SAS program called NLSCY_VES is provided for computing the variance: • set of macros • easy to use, well documented • examples provided • computes variance for totals, ratios, differences between ratios, linear regressions, logistic regressions

  18. Bootstrap Basics • A. Select a subsample from the original sample with replacement • B. For this subsample, calculate the weights as if it were the full sample • Repeat A and B many times (1,000) to produce a set of bootstrap weights
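Steps A and B can be sketched in Python as follows. This is a simplified illustration, not the NLSCY procedure: it assumes a single stratum whose units are the primary sampling units, uses the Rao-Wu rescaling (draw n - 1 units with replacement from n, then scale the design weights), and omits the non-response and post-stratification adjustments that a real reweighting would repeat for each subsample. The function name and parameters are hypothetical.

```python
import random

def rao_wu_bootstrap_weights(design_weights, n_replicates=1000, seed=12345):
    """Build replicate weights for one stratum with the Rao-Wu rescaled bootstrap."""
    n = len(design_weights)
    if n < 2:
        raise ValueError("need at least two sampled units in the stratum")
    rng = random.Random(seed)
    scale = n / (n - 1)
    replicates = []
    for _ in range(n_replicates):
        # Step A: draw n - 1 units with replacement and count how often each is picked.
        counts = [0] * n
        for _ in range(n - 1):
            counts[rng.randrange(n)] += 1
        # Step B: rescale the design weights as if the subsample were the full sample.
        replicates.append([w * scale * c for w, c in zip(design_weights, counts)])
    return replicates

# Illustrative call: 8 units with equal design weights, 20 replicates.
example_replicates = rao_wu_bootstrap_weights([50.0] * 8, n_replicates=20, seed=1)
```

Units that are not drawn in a given replicate receive a weight of zero for that replicate, which is why bootstrap weight files typically contain many zeros.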

  19. Bootstrap Basics (continued) • For a given estimate: • Calculate the estimate with each set of weights • Calculate the variance of the estimates obtained
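The two steps above can be sketched in Python for a weighted proportion. This is a minimal illustration, not the NLSCY_VES code: the data and the three replicate weight sets are made up (a real file has 1,000 sets), and the divisor B with centring on the mean of the replicate estimates is an assumption, since some implementations divide by B - 1 or centre on the full-sample estimate instead.

```python
def weighted_proportion(indicator, weights):
    """Weighted share of units with indicator == 1."""
    return sum(y * w for y, w in zip(indicator, weights)) / sum(weights)

def bootstrap_variance(estimator, replicate_weights):
    """Apply the estimator to each set of replicate weights, then take the variance."""
    estimates = [estimator(w) for w in replicate_weights]
    mean_estimate = sum(estimates) / len(estimates)
    # Assumed convention: divide by the number of replicates B.
    return sum((e - mean_estimate) ** 2 for e in estimates) / len(estimates)

# Made-up example: 4 respondents, 3 sets of replicate weights.
y = [1, 0, 1, 1]
replicate_weights = [
    [10, 20, 30, 40],
    [0, 40, 30, 30],
    [20, 10, 40, 30],
]
variance = bootstrap_variance(lambda w: weighted_proportion(y, w), replicate_weights)
```

Because the estimator is passed in as a function, the same routine applies to totals, ratios, or differences between ratios: the quantity is recomputed with each set of weights, so covariances between its components are reflected automatically.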

  20. Structure of NLSCY_VES • All the macros are in a single SAS file, NLSCY_VES.sas, which requires no changes • Another SAS program calls the macros and allows the user to set the various parameters.

  21. What You Need • The SAS file NLSCY_VES.sas • The SAS program that calls the macros • The data set for which the variance is required • The file of 1,000 bootstrap weights for the appropriate cycle and type of analysis (longitudinal or cross-sectional)

  22. Conclusion • The variance must be computed if we are to make valid inferences • The sample design must be taken into account if we want the variance calculation to be valid. Otherwise, we may draw incorrect conclusions

  23. Conclusion (continued) • The NLSCY provides 3 tools for computing the variance: • tables in the user’s guide (too limited…) • Excel file for many domains (to get an idea) • bootstrap weights (the best approach)

  24. Questions
