
Session 10

Session 10. Sampling Weights: an appreciation. Session Objectives: to provide you with an overview of the role of sampling weights in estimating population parameters; to demonstrate computation of sampling weights for a simple scenario.

bruno-hyde

Presentation Transcript


  1. Session 10 Sampling Weights: an appreciation

  2. Session Objectives • To provide you with an overview of the role of sampling weights in estimating population parameters • To demonstrate computation of sampling weights for a simple scenario • To highlight difficulties in calculating sampling weights for complex survey designs and the need to seek professional expertise for this purpose • To learn about file merging and continue with the on-going project work

  3. What are sampling weights? • Real surveys are generally multi-stage • At each stage, the probabilities of selecting units at that stage are generally not equal • When a population parameter such as a mean or proportion is to be estimated, results from lower levels need to be scaled up from the sample to the population • This scaling-up factor, applied to each unit in the sample, is called its sampling weight.

  4. A simple example • Suppose, for example, that a simple random sample of 500 HHs in a rural district (having 7349 HHs in total) showed 140 were living below the poverty line • Hence the estimated total in the population living below the poverty line = (140/500)*7349 ≈ 2058 • The data for each HH were a 0/1 variable, with 1 allocated if the HH was below the poverty line. • Multiplying this variable by 7349/500 ≈ 14.7 and summing would lead to the same answer. • i.e. sampling weight for each HH ≈ 14.7
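The arithmetic on this slide can be checked with a short sketch. The variable names are illustrative, and the 0/1 data vector is reconstructed from the reported counts (140 of 500), not taken from the actual survey file.

```python
# Figures from the slide: a simple random sample of 500 households
# out of 7349, of which 140 were below the poverty line.
N_HOUSEHOLDS = 7349
n_sampled = 500
n_poor = 140

# Hypothetical 0/1 indicator vector consistent with the reported counts.
below_poverty = [1] * n_poor + [0] * (n_sampled - n_poor)

# Under simple random sampling every household carries the same weight.
weight = N_HOUSEHOLDS / n_sampled

# The weighted sum of the indicator is the estimated population total.
estimated_total = sum(weight * y for y in below_poverty)

print(round(weight, 1))        # 14.7
print(round(estimated_total))  # 2058
```

The weighted sum and the direct calculation (140/500)*7349 agree because the weight is constant across households.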

  5. Why are weights needed? • The above was a trivial example with equal probabilities of selection • In general, units in the sample have very different probabilities of selection • To allow for unequal probabilities of selection, each unit is weighted by the reciprocal of its probability of selection • Thus sampling weight = (1/prob of selection)
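As a minimal sketch of weighting by the reciprocal of the selection probability, the snippet below estimates a total from three units. The values and probabilities are invented purely for illustration, and `sampling_weight` is a hypothetical helper name.

```python
def sampling_weight(prob_selection: float) -> float:
    """Return the reciprocal of a unit's probability of selection."""
    if not 0 < prob_selection <= 1:
        raise ValueError("selection probability must lie in (0, 1]")
    return 1.0 / prob_selection


# Hypothetical sample: (observed value, probability of selection).
sample = [(12.0, 0.10), (7.0, 0.25), (20.0, 0.05)]

# Units that were less likely to be selected receive larger weights.
weights = [sampling_weight(p) for _, p in sample]

# Estimated population total: each value scaled up by its own weight.
estimated_total = sum(y * w for (y, _), w in zip(sample, weights))

print(weights)
print(estimated_total)
```

Here the weights come out at roughly 10, 4 and 20, so the third unit, which had the smallest chance of selection, stands in for the most population units.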

  6. An example • Consider a conveniently rectangular forest with a river running down the middle, thus dividing the forest into Region 1 and Region 2. • Region 1 is divided into 96 strips, each 50m x 50m, while Region 2 is divided into 72 strips. • The data are the number of small trees and the number of large trees in each strip. • Aim: To find the total number of large trees, the total number of small trees, and hence the total number of trees in the forest.

  7. Weights in stratified sampling • Each region can be regarded as a stratum: 8 strips were chosen from region 1 and 6 from region 2. • The mean numbers of large trees per strip were: • 97.875 in region 1, based on n1=8 • 83.500 in region 2, based on n2=6 • Hence the total number of large trees in the forest can be estimated as (96*97.875) + (72*83.5) = 15408 • So what are the sampling weights used for each unit (strip)?
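The stratified total can be reproduced by giving each sampled strip the weight (strips in stratum)/(strips sampled). This is a sketch using only the summary figures on the slide; the dictionary layout and key names are assumptions, and the per-strip counts themselves are not available here.

```python
# Summary figures from the slide; per-strip data are not reproduced here.
strata = {
    "region 1": {"N": 96, "n": 8, "mean_large": 97.875},
    "region 2": {"N": 72, "n": 6, "mean_large": 83.5},
}

for s in strata.values():
    # Sampling weight: number of population strips each sampled strip represents.
    s["weight"] = s["N"] / s["n"]
    # Sum of large-tree counts over the sampled strips in this stratum.
    s["sample_total"] = s["n"] * s["mean_large"]

# Weighted sum over all sampled strips gives the estimated forest total.
estimated_total = sum(s["weight"] * s["sample_total"] for s in strata.values())

for name, s in strata.items():
    print(name, s["weight"])   # 12.0 for both regions
print(estimated_total)         # 15408.0
```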

  8. Self-weighting • The sampling weights are the same for all strips, whether in region 1 or region 2. Why is this? • What are the probabilities of selection here? • In region 1, each unit is selected with prob=8/96 • In region 2, each unit is selected with prob=6/72 • A design where probabilities of selection are equal for all selected units is called a self-weighting design. • Regarding the sample as a simple random sample then gives us the correct mean.

  9. Results for means • Easy to see that the mean number of large trees in the forest is [(96/168)*97.875] + [(72/168)*83.5] = 91.71 • Regarding the 14 observations as though they were drawn as a simple random sample gives 91.71, i.e. the same answer. • The results for variances, however, differ • Variance of stratified sample mean = 1.28 • Variance of mean ignoring stratification = 2.18
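A quick check that the two routes to the mean agree, using only the stratum summaries quoted on the slides. The variances are not recomputed, since that would need the individual strip counts, which are not given in this transcript.

```python
# Stratum sizes, sample sizes and sample means as quoted on the slides.
N1, n1, mean1 = 96, 8, 97.875
N2, n2, mean2 = 72, 6, 83.5
N = N1 + N2  # 168 strips in the whole forest

# Stratified estimate: stratum means weighted by stratum share of the population.
stratified_mean = (N1 / N) * mean1 + (N2 / N) * mean2

# Treating all 14 strips as one simple random sample.
pooled_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)

print(round(stratified_mean, 2))  # 91.71
print(round(pooled_mean, 2))      # 91.71
```

The two agree because the sampling fraction is the same in both regions (8/96 = 6/72), so each stratum's share of the sample matches its share of the population.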

  10. More on weights • Important to note that the weights used in computing a mean, i.e. • (96/168)*(1/8) = 1/14 for strips in region 1, & • (72/168)*(1/6) = 1/14 for strips in region 2, are not sampling weights • Sampling weights refer to the multiplying factor when estimating a total. • Essentially they represent the number of elements in the population that an individual sampling unit represents.
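The distinction between the two kinds of weight can be made concrete with exact fractions. This is an illustrative sketch; the dictionary names are assumptions.

```python
from fractions import Fraction

# (strips in stratum, strips sampled), as given on the slides.
strata = {"region 1": (96, 8), "region 2": (72, 6)}
total_strips = sum(N_h for N_h, _ in strata.values())  # 168

mean_weights = {}
sampling_weights = {}
for name, (N_h, n_h) in strata.items():
    # Weight applied to each sampled strip when estimating the mean.
    mean_weights[name] = Fraction(N_h, total_strips) * Fraction(1, n_h)
    # Sampling weight: multiplier applied when estimating a total.
    sampling_weights[name] = Fraction(N_h, n_h)

for name in strata:
    print(name, mean_weights[name], sampling_weights[name])
# region 1 1/14 12
# region 2 1/14 12
```

In this example the mean weight is just the sampling weight divided by the population size (12/168 = 1/14), which is why the two are easy to confuse.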

  11. Other uses of weights • Weights are also used to deal with non-response and missing values • If measurements are not available on all units for some reason, the sampling weights may be re-computed to allow for this. • e.g. In conducting the Household Budget Survey 2000/2001 in Tanzania, not all rural areas planned in the sampling scheme were visited. As a result, sampling weights had to be re-calculated and used in the analysis.
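One simple way to re-compute weights, sketched below, is to divide the original weight by the achieved response rate within a stratum. This assumes the missing units are missing at random within that stratum; it is not a description of the method used in the Tanzanian survey, and the scenario (only 6 of the 8 planned strips in region 1 being measured) is hypothetical.

```python
def adjusted_weight(population_units: int, planned: int, achieved: int) -> float:
    """Inflate the base weight by the inverse of the response rate.

    Assumes units were missed at random within the stratum.
    """
    if not 0 < achieved <= planned:
        raise ValueError("achieved must be between 1 and planned")
    base_weight = population_units / planned   # weight under the original design
    response_rate = achieved / planned         # share of planned units observed
    return base_weight / response_rate


# Hypothetical: region 1 has 96 strips, 8 were planned, only 6 were measured.
print(adjusted_weight(96, 8, 8))  # 12.0 (no non-response)
print(adjusted_weight(96, 8, 6))  # 16.0 (each measured strip now represents 16)
```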

  12. Computation of weights • The general approach is to find the probability of selecting a unit at every stage of the sample selection process • e.g. in a 3-stage design, three sets of probabilities will result • The probability of selecting each final-stage unit is then the product of these three probabilities • The reciprocal of the above probability is then the sampling weight
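The steps above can be sketched for a single final-stage unit in a 3-stage design. The stage descriptions and all the counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from fractions import Fraction
from math import prod

# Hypothetical selection probabilities for one household:
stage_probs = [
    Fraction(5, 50),    # stage 1: its district, 5 chosen out of 50
    Fraction(4, 20),    # stage 2: its village, 4 chosen out of 20 in the district
    Fraction(10, 200),  # stage 3: the household, 10 chosen out of 200 in the village
]

# Overall probability of selection is the product across stages.
overall_prob = prod(stage_probs)

# The sampling weight is the reciprocal of that probability.
weight = 1 / overall_prob

print(overall_prob)  # 1/1000
print(weight)        # 1000
```

In a real survey the probabilities usually differ from unit to unit (for example, when villages differ in size), so this calculation has to be repeated for each sampled unit.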

  13. Difficulties in computations • Standard methods, as illustrated in textbooks on sampling, often do not apply in real surveys • Complex sampling designs are common • Computing correct probabilities of selection can then be very challenging • Usually professional assistance is needed to determine the correct sampling weights and to use them correctly in the analysis

  14. Software for dealing with weights • When analysing data from complex survey designs, it is important to check that the software can deal with sampling weights • Packages such as Stata, SAS and Epi-info have facilities for dealing with sampling weights • However, you need to be careful that the approaches used are appropriate for your own survey design • Note: The above discussion was aimed at providing you with an overview of sampling weights. See the next slide for the work for the remainder of this session.

  15. Practical work • To understand how files may be merged, work through sections 10.5 and 10.6 of the Stata Guide. • Now move to your project work and practice file merging to address objectives 4 and 5 of your task. • A description of the work you should undertake is provided in the handout titled Practical 10.
