
STT 350: SURVEY SAMPLING Dr. Cuixian Chen






Presentation Transcript


  1. STT 350: SURVEY SAMPLING, Dr. Cuixian Chen. Chapter 8: Cluster sampling. Textbook: Elementary Survey Sampling, 7E, Scheaffer, Mendenhall, Ott and Gerow.

  2. REVIEW: Why is systematic sampling a useful alternative? • It is easier to perform in the field (possibly less subject to selection errors by fieldworkers, especially if a good frame is not available). • It can provide more information per unit cost than simple or stratified random sampling.

  3. Comparison of four sampling schemes • Goal: obtain a specified amount of information about a population parameter at minimum cost. • Stratified random sampling is often better suited for this than is simple random sampling. • Systematic sampling often gives results at least as accurate as those from simple random sampling, and it is easier to perform. • Cluster sampling sometimes gives more information per unit cost than do any of the other three designs.

  4. Cluster sampling • A cluster sample is a probability sample in which each sampling unit is a collection, or cluster, of elements. • Cluster sampling is an effective design for obtaining a specified amount of information at minimum cost under the following conditions: (1) a good frame listing population elements either is not available or is very costly to obtain, but a frame listing clusters is easily obtained; (2) the cost of obtaining observations increases as the distance separating the elements increases.

  5. Illustrative example • Suppose we wish to estimate the average income per household in a large city. How should we choose the sample? • Simple random sampling would require a frame listing all households (elements) in the city, and this frame may be very costly or impossible to obtain. • City block statistics from the Census Bureau are widely used in cluster sampling by market research firms, which may want to estimate the potential market for a product, the potential sales if a new store were to open in the area, or the potential number of clients for a new service, such as an emergency medical facility.

  6. Difference between strata and clusters • The optimal construction of strata (Chapter 5) differs from the optimal construction of clusters. • Strata should be as homogeneous (alike) as possible within, but one stratum should differ as much as possible from another with respect to the characteristic being measured. • Clusters, on the other hand, should be as heterogeneous (different) as possible within, and one cluster should look very much like another in order for the economic advantages of cluster sampling to pay off.

  7. Est of µ and τ with cluster sampling
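
The formulas on this slide did not survive in the transcript. As a reference point, and assuming the standard ratio-type estimator for one-stage cluster sampling with the usual notation (N clusters in the population, n sampled clusters, $m_i$ elements and observation total $y_i$ in sampled cluster i, M elements in the population, $\bar{M} = M/N$), the estimator of the mean per element and its estimated variance can be written as:

```latex
\[
\bar{y} = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} m_i},
\qquad
\hat{V}(\bar{y}) = \left(1 - \frac{n}{N}\right)\frac{s_r^2}{n\,\bar{M}^2},
\qquad
s_r^2 = \frac{\sum_{i=1}^{n} \left(y_i - \bar{y}\,m_i\right)^2}{n-1}.
\]
```

When M is unknown, $\bar{M}$ is commonly replaced by the sample average cluster size $\bar{m} = \sum_{i=1}^{n} m_i / n$, and an approximate bound on the error of estimation is $2\sqrt{\hat{V}(\bar{y})}$.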

  8. Est of µ with cluster sampling • The estimated variance $\hat{V}(\bar{y})$ is biased and is a good estimator of $V(\bar{y})$ only if n is large, say n ≥ 20. • The bias disappears if the cluster sizes $m_1, m_2, \ldots, m_N$ are equal.
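
A minimal Python sketch of the estimate of the mean per element, assuming the standard ratio-type estimator (sum of cluster totals over sum of cluster sizes). The function name, argument names, and the toy numbers are hypothetical and are not the textbook data.

```python
from math import sqrt


def cluster_mean_estimate(totals, sizes, N, M=None):
    """Ratio-type estimate of the mean per element from a one-stage cluster sample.

    totals[i] is the sum of observations in sampled cluster i (y_i),
    sizes[i] is the number of elements in that cluster (m_i),
    N is the number of clusters in the population,
    M is the number of elements in the population (optional).
    """
    n = len(totals)
    if n != len(sizes) or n < 2:
        raise ValueError("need matching totals and sizes for at least 2 clusters")
    y_bar = sum(totals) / sum(sizes)
    # Average cluster size: M/N if M is known, else the sample average m-bar.
    M_bar = M / N if M is not None else sum(sizes) / n
    # Variability of cluster totals around what their sizes predict.
    s_r2 = sum((y - y_bar * m) ** 2 for y, m in zip(totals, sizes)) / (n - 1)
    var_hat = (1 - n / N) * s_r2 / (n * M_bar ** 2)
    bound = 2 * sqrt(var_hat)  # approximate bound on the error of estimation
    return y_bar, var_hat, bound


# Made-up toy sample of 5 clusters out of N = 50 (not the textbook data).
toy_totals = [40.0, 66.0, 30.0, 84.0, 50.0]
toy_sizes = [4, 6, 3, 8, 5]
y_bar, var_hat, bound = cluster_mean_estimate(toy_totals, toy_sizes, N=50)
print(f"y_bar = {y_bar:.3f}, bound = {bound:.3f}")
```

With only 5 clusters the variance estimate here is purely illustrative, since the slide notes that it is reliable only for large n.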

  9. Eg 8.1, page 254 • Q: A sociologist wants to estimate per-capita income in a certain small city. No list of resident adults is available. How should he design the sample survey? • Answer: Cluster sampling, because no list of elements is available. • The city is marked off into rectangular blocks, except for two industrial areas and three parks (with a few houses). • Clusters: (a) each city block is one cluster, (b) the two industrial areas form one cluster, (c) the three parks form one cluster. • Clusters are numbered on a city map from 1 to 415. The experimenter has enough time and money to sample n = 25 clusters and to interview every household within each cluster. • Result: 25 random numbers between 1 and 415 are selected by simple random sampling, and the clusters with those numbers are surveyed.

  10. Eg 8.2, page 256 • Recall Eg 8.1: N = 415; n = ? • Is the target µ, τ, or p? What sampling scheme? • Data are available on the class website: Dataset used in Textbook Examples.

  11. Est of τ with cluster sampling, when M is known • Note that the estimator $M\bar{y}$ is useful only if the number of elements in the population, M, is known.
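
A short Python sketch of the total estimate when M is known, assuming the standard form $M\bar{y}$ with estimated variance $M^2\hat{V}(\bar{y}) = N^2(1 - n/N)\,s_r^2/n$. The function name and the toy values (including M = 260) are hypothetical.

```python
from math import sqrt


def cluster_total_known_M(totals, sizes, N, M):
    """Estimate the population total as M * y_bar when M is known.

    totals[i] and sizes[i] are the observation total and element count of
    sampled cluster i; N is the number of clusters in the population.
    """
    n = len(totals)
    if n != len(sizes) or n < 2:
        raise ValueError("need matching totals and sizes for at least 2 clusters")
    y_bar = sum(totals) / sum(sizes)
    s_r2 = sum((y - y_bar * m) ** 2 for y, m in zip(totals, sizes)) / (n - 1)
    tau_hat = M * y_bar
    # Equals M^2 * V_hat(y_bar) when M_bar = M / N.
    var_hat = N ** 2 * (1 - n / N) * s_r2 / n
    return tau_hat, var_hat, 2 * sqrt(var_hat)


# Made-up toy values (not the textbook data).
tau_hat, tau_var, tau_bound = cluster_total_known_M(
    [40.0, 66.0, 30.0, 84.0, 50.0], [4, 6, 3, 8, 5], N=50, M=260
)
print(f"tau_hat = {tau_hat:.1f}, bound = {tau_bound:.1f}")
```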

  12. Eg 8.3, page 258 • Recall Eg 8.1: N = 415; n = ? • Is the target µ, τ, or p? What sampling scheme? • M is known. • Data are available on the class website: Dataset used in Textbook Examples.

  13. Ex 8.2, page 279 [overview and background for Ex 8.4] • Data are available on the class website: Dataset in Excel format.

  14. Ex 8.4, page 279 • Note: n = 20 clusters are sampled from N = 96; M is known. • Data are available on the class website: Dataset in Excel format.

  15. Est of τ with cluster sampling, when M is unknown • Often M, the number of elements in the population, is unknown. • Note: $\bar{y}_t = \frac{1}{n}\sum_{i=1}^{n} y_i$ is the sample average of the n sampled cluster totals. • It is an unbiased estimator of the population average of the N cluster totals.

  16. Est of τ with cluster sampling, when M is unknown
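
The formulas for this slide are missing from the transcript. A minimal Python sketch, assuming the standard estimator $N\bar{y}_t$ (N times the sample average of the cluster totals) with estimated variance $N^2(1 - n/N)\,s_t^2/n$, where $s_t^2$ is the sample variance of the cluster totals. Function name and toy numbers are hypothetical.

```python
from math import sqrt


def cluster_total_unknown_M(totals, N):
    """Estimate the population total as N * (mean of sampled cluster totals).

    Cluster sizes are not needed, so this works when M is unknown.
    """
    n = len(totals)
    if n < 2:
        raise ValueError("need at least 2 sampled clusters")
    y_t_bar = sum(totals) / n
    s_t2 = sum((y - y_t_bar) ** 2 for y in totals) / (n - 1)
    tau_hat = N * y_t_bar
    var_hat = N ** 2 * (1 - n / N) * s_t2 / n
    return tau_hat, var_hat, 2 * sqrt(var_hat)


# Made-up toy totals for 5 clusters out of N = 50 (not the textbook data).
tau_hat_t, tau_var_t, tau_bound_t = cluster_total_unknown_M(
    [40.0, 66.0, 30.0, 84.0, 50.0], N=50
)
print(f"tau_hat = {tau_hat_t:.1f}, bound = {tau_bound_t:.1f}")
```

Because this version ignores cluster sizes, its variance reflects all of the variation among cluster totals, which tends to make it less precise than $M\bar{y}$ when cluster sizes vary a lot.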

  17. Eg 8.4, page 260 • Recall Eg 8.1: N = 415; n = ? • Is the target µ, τ, or p? What sampling scheme? • Data are available on the class website: Dataset used in Textbook Examples.

  18. Ex 8.2, page 279 [overview and background for Ex 8.3] • Data are available on the class website: Dataset in Excel format.

  19. Ex 8.3, page 279 • Note: n = 20 clusters are sampled from N = 96; M is unknown. • Data are available on the class website: Dataset in Excel format.

  20. Chap 8.5: Find sample size for est of µ • The quantity of information is affected by two factors: the number of clusters and the relative cluster size.
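
A hedged Python sketch of the sample-size calculation for estimating µ, assuming the cluster size is already fixed and using the standard approximation $n = N\sigma_r^2 / (ND + \sigma_r^2)$ with $D = B^2\bar{M}^2/4$, where B is the desired bound on the error of estimation and $\sigma_r^2$ is estimated by $s_r^2$ from earlier data. The function name and the input values are hypothetical.

```python
from math import ceil


def clusters_needed_for_mean(sigma_r2, N, B, M_bar):
    """Approximate number of clusters needed to estimate the mean within B.

    sigma_r2: estimate of sigma_r^2 (for example s_r^2 from a prior survey)
    N:        number of clusters in the population
    B:        desired bound on the error of estimation
    M_bar:    average cluster size (M / N, or m-bar from a prior sample)
    """
    if sigma_r2 <= 0 or B <= 0:
        raise ValueError("sigma_r2 and B must be positive")
    D = B ** 2 * M_bar ** 2 / 4
    n = N * sigma_r2 / (N * D + sigma_r2)
    return ceil(n)  # round up so the bound is not exceeded


# Hypothetical inputs, chosen only to show the call.
n_for_mean = clusters_needed_for_mean(sigma_r2=400.0, N=100, B=2.0, M_bar=5.0)
print(n_for_mean)
```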

  21. Eg 8.6, page 265 • Recall Eg 8.1: N = 415. • Is the target µ, τ, p, or n? What sampling scheme? • Data are available on the class website: Dataset used in Textbook Examples.

  22. Find sample size for est of τ, when M is known • The quantity of information is affected by two factors: the number of clusters and the relative cluster size.
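
The formula for this slide is missing from the transcript. Assuming the estimator $M\bar{y}$ and a desired bound B on the error of estimation, the standard approximation has the same form as for the mean, with only D changing:

```latex
\[
n = \frac{N\,\sigma_r^2}{N D + \sigma_r^2},
\qquad
D = \frac{B^2}{4N^2},
\]
```

where $\sigma_r^2$ is estimated by $s_r^2 = \sum_{i=1}^{n}(y_i - \bar{y}\,m_i)^2/(n-1)$ from a prior or preliminary sample.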

  23. Eg 8.7, page 265 • Recall Eg 8.1: N = 415. • Is the target µ, τ, p, or n? What sampling scheme? • Data are available on the class website: Dataset used in Textbook Examples.

  24. Find sample size for est of τ, when M is unknown • The quantity of information is affected by two factors: the number of clusters and the relative cluster size.
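
A minimal Python sketch for the case where M is unknown, assuming the estimator $N\bar{y}_t$ and the standard approximation $n = N\sigma_t^2 / (ND + \sigma_t^2)$ with $D = B^2/(4N^2)$. Here $\sigma_t^2$ is estimated from the cluster totals of a preliminary sample. The function name and all numbers are hypothetical.

```python
from math import ceil


def clusters_needed_for_total(pilot_totals, N, B):
    """Approximate number of clusters needed to estimate a total within B
    when M is unknown, using cluster totals from a preliminary sample."""
    k = len(pilot_totals)
    if k < 2:
        raise ValueError("need at least 2 preliminary cluster totals")
    if B <= 0:
        raise ValueError("B must be positive")
    mean_total = sum(pilot_totals) / k
    # s_t^2 from the preliminary sample stands in for sigma_t^2.
    s_t2 = sum((y - mean_total) ** 2 for y in pilot_totals) / (k - 1)
    D = B ** 2 / (4 * N ** 2)
    return ceil(N * s_t2 / (N * D + s_t2))


# Made-up preliminary totals; N and B are hypothetical as well.
pilot = [40.0, 66.0, 30.0, 84.0, 50.0]
n_for_total = clusters_needed_for_total(pilot, N=50, B=500.0)
print(n_for_total)
```

Halving B in this sketch roughly doubles the required number of clusters for these inputs, which illustrates how quickly a tighter bound raises the cost.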

  25. Eg 8.8, page 267 • Recall Eg 8.1: N = 415. • Is the target µ, τ, p, or n? What sampling scheme? • Data are available on the class website: Dataset used in Textbook Examples.

  26. Ex 8.5, page 279 • Note: n′ = 20 from N = 96; M is unknown. • Data are available on the class website: Dataset in Excel format.

  27. Est of p with cluster sampling • Let $a_i$ denote the total number of elements in cluster i that possess the characteristic of interest.
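
A short Python sketch of the proportion estimate, assuming the standard form $\hat{p} = \sum a_i / \sum m_i$ with estimated variance $(1 - n/N)\,s_p^2/(n\bar{M}^2)$ and $s_p^2 = \sum (a_i - \hat{p}\,m_i)^2/(n-1)$. It is the mean estimator applied to counts $a_i$ in place of totals $y_i$. Names and toy numbers are hypothetical.

```python
from math import sqrt


def cluster_proportion_estimate(counts, sizes, N, M=None):
    """Estimate a population proportion from a one-stage cluster sample.

    counts[i] is the number of elements in sampled cluster i with the
    characteristic of interest (a_i); sizes[i] is the cluster size (m_i).
    """
    n = len(counts)
    if n != len(sizes) or n < 2:
        raise ValueError("need matching counts and sizes for at least 2 clusters")
    p_hat = sum(counts) / sum(sizes)
    # Average cluster size: M/N if M is known, else the sample average m-bar.
    M_bar = M / N if M is not None else sum(sizes) / n
    s_p2 = sum((a - p_hat * m) ** 2 for a, m in zip(counts, sizes)) / (n - 1)
    var_hat = (1 - n / N) * s_p2 / (n * M_bar ** 2)
    return p_hat, var_hat, 2 * sqrt(var_hat)


# Made-up toy sample of 4 clusters out of N = 40 (not the textbook data).
p_hat, p_var, p_bound = cluster_proportion_estimate(
    [2, 3, 1, 4], [4, 6, 3, 8], N=40
)
print(f"p_hat = {p_hat:.3f}, bound = {p_bound:.3f}")
```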

  28. Eg 8.9, page 269 • Data are available on the class website: Dataset in Excel format.

  29. Ex 8.8, page 280 • Data are available on the class website: Dataset in Excel format.

  30. Est of p with cluster sampling • Let $a_i$ denote the total number of elements in cluster i that possess the characteristic of interest.
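
The slide that follows refers to preliminary study outcomes, which suggests this slide covered sample-size selection for p; the formula itself is missing from the transcript. Under that assumption, and with a desired bound B on the error of estimation, the standard approximation is:

```latex
\[
n = \frac{N\,\sigma_p^2}{N D + \sigma_p^2},
\qquad
D = \frac{B^2 \bar{M}^2}{4},
\qquad
\sigma_p^2 \approx s_p^2 = \frac{\sum_{i=1}^{n'} \left(a_i - \hat{p}\,m_i\right)^2}{n'-1},
\]
```

where $n'$ is the number of clusters in the preliminary sample and $\bar{M}$ is replaced by the preliminary sample's average cluster size when M is unknown.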

  31. Eg 8.10, page 271 • Preliminary study outcomes. • Data are available on the class website: Dataset in Excel format.

  32. Ex 8.9, page 280 • Data are available on the class website: Dataset in Excel format.
