Sampling

lopezdavid

Presentation Transcript


  1. Sampling

  2. Sampling • Probability Sampling • Based on random selection • Non-probability sampling • Based on convenience

  3. Sampling Miscues: Alf Landon for President (1936) • Literary Digest: post cards to voters in 6 states • Correctly predicting elections from 1920-1932 • Names selected from telephone directories and automobile registrations • In 1936, they sent out 10 million post cards • Results pick Landon 57% to Roosevelt 43% • Election: Roosevelt in the largest landslide • Roosevelt 61% of the vote and 523-8 in Elect. Col. • Why so inaccurate?: Poor sampling frame • Leads to selection of wealthy respondents

  4. Sampling Miscues: Thomas E. Dewey for President (1948) • Gallup uses quota sampling to pick winner 1936-1944 • Quota sampling: • matches sample characteristics to characteristics of population • Gallup quota samples on the basis of income • In 1948, Gallup picked Dewey to defeat Truman • Reasons: • 1. Most pollsters quit polling in October • 2. Undecided voters went for Truman • 3. Unrepresentative samples—WWII changed society since census

  5. Non-probability Sampling • In situations where sampling frame for randomization doesn’t exist • Types of non-probability samples: • 1. Reliance on available subjects • convenience sampling • 2. Purposive or judgmental sampling • 3. Snowball sampling • 4. Quota sampling

  6. Reliance on Available Subjects • Person on the street, easily accessible • Examples: • Mall intercepts, college students, person on the street • Frequently used, but usually biased • Notoriously inaccurate • Especially in making inferences about larger population

  7. Purposive or Judgmental Sampling • Dictated by the purpose of the study • Situational judgments about what individuals should be surveyed to make for a useful or representative sample • E.g., Using college students to study third-person effects regarding rap and metal music • 3pe: Others are more affected by exposure than self • Assessing effects on self and others • Using college students makes for homogeneity of self

  8. Snowball Sampling • Used when population of interest is difficult to locate • E.g., homeless people • Researcher collects data from a few people in the targeted group • Initially surveyed individuals asked to name other people to contact • Good for exploration • Bad for generalizability

  9. Quota Sampling • Begins with a table of relevant characteristics of the population • Proportions of Gender, Age, Education, Ethnicity from census data • Selecting a sample to match those proportions • Problems: • 1. Quota frame must be accurate • 2. Sample is not random
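
The quota procedure above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the 60/40 proportions, the arrival stream, and the helper names `quota_targets` and `quota_sample` are hypothetical and do not come from the slides.

```python
from collections import Counter

def quota_targets(proportions, n):
    """Turn population proportions into per-group quotas for a sample of size n."""
    return {group: round(n * share) for group, share in proportions.items()}

def quota_sample(stream, group_of, targets):
    """Accept people in arrival order until every quota is filled (no random selection)."""
    remaining = dict(targets)
    sample = []
    for person in stream:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break
    return sample

# Hypothetical quota frame and arrival stream (illustrative values only).
targets = quota_targets({"female": 0.6, "male": 0.4}, n=10)
stream = [("male", i) for i in range(8)] + [("female", i) for i in range(8)]
sample = quota_sample(stream, group_of=lambda person: person[0], targets=targets)
print(Counter(group for group, _ in sample))
```

The sample matches the quota frame, but within each group it simply takes whoever arrives first, which is the "sample is not random" problem listed on the slide.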

  10. Probability Sampling • Goal: Representativeness • Sample resembles larger population • Random selection • Enhancing likelihood of representative sample • Each unit of the population has an equal chance of being selected into the sample

  11. Population Parameters • Parameter: Summary statistic for the population • E.g., Mean age of the population • Sample is used to make parameter estimates • E.g., Mean age of the sample • Used as an estimate of the population parameter

  12. Sampling Error • Every time you draw a sample from the population, the parameter estimate will fluctuate slightly • E.g.: • Sample 1: Mean age = 37.2 • Sample 2: Mean age = 36.4 • Sample 3: Mean age = 38.1 • If you draw lots of samples, you would get a normal curve of values
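
A small simulation can make the fluctuation concrete. This is a sketch, not the presenter's method: the population of ages is invented (uniform between 18 and 90), and the sample size of 500 and the 1,000 repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 ages drawn uniformly from 18 to 90.
population = [random.randint(18, 90) for _ in range(100_000)]
parameter = statistics.mean(population)  # the population parameter

def sample_mean(n):
    """Mean age of one simple random sample of size n (the parameter estimate)."""
    return statistics.mean(random.sample(population, n))

# Draw many samples; each gives a slightly different estimate.
estimates = [sample_mean(500) for _ in range(1_000)]

print(f"population mean: {parameter:.1f}")
print("first three sample means:", [round(m, 1) for m in estimates[:3]])
print(f"mean of all estimates: {statistics.mean(estimates):.1f}")
```

Individual estimates differ from one another, while their average sits close to the population mean; a histogram of `estimates` would approximate the normal curve the slide describes.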

  13. Normal Curve of Sample Estimates • [Figure: normal curve of the frequency of estimated means from multiple samples, plotted against the estimated mean; the peak marks the likely population parameter]

  14. Standard Error • The standard deviation of sample estimates around the population parameter (roughly, their typical distance from it) • About 68% of sample estimates will fall within one standard error of the population parameter
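
The 68% figure can be checked by simulation. This is a sketch under assumptions: the population (ages with mean 37 and standard deviation 12), the sample size, and the number of repetitions are all hypothetical. The analytic value `sigma / sqrt(n)` is the usual textbook formula for the standard error of a mean.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of ages (illustrative parameters only).
population = [random.gauss(37, 12) for _ in range(50_000)]
mu = statistics.mean(population)       # population parameter
sigma = statistics.pstdev(population)  # population standard deviation
n = 400                                # size of each sample

# Means of 2,000 independent simple random samples.
means = [statistics.mean(random.sample(population, n)) for _ in range(2_000)]

empirical_se = statistics.pstdev(means)  # spread of the sample estimates
analytic_se = sigma / math.sqrt(n)       # textbook formula for a mean
share_within_one_se = sum(abs(m - mu) <= analytic_se for m in means) / len(means)

print(f"empirical s.e.: {empirical_se:.3f}")
print(f"analytic s.e.:  {analytic_se:.3f}")
print(f"share of estimates within 1 s.e.: {share_within_one_se:.1%}")
```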

  15. Normal Curve of Sample Estimates • [Figure: normal curve of the frequency of estimated means from multiple samples, plotted against the estimated mean, with the population parameter and 1 standard error unit marked]

  16. Normal Curve of Sample Estimates • [Figure: normal curve of the frequency of estimated means from multiple samples, plotted against the estimated mean, with the population parameter and 1 standard error unit marked; 2/3 of samples labeled]

  17. Standard Error Estimates and Sample Size • As the sample size increases: • The standard error decreases • In other words, our sample estimate is likely to be closer to the population parameter • As the sample size increases, we get more confident in our parameter estimate
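
For a sample mean, the relationship can be written out explicitly. This is the standard textbook formula rather than something stated on the slide; the symbols σ (population standard deviation) and n (sample size) are introduced here for illustration.

```latex
\[
\mathrm{SE}_{n} = \frac{\sigma}{\sqrt{n}},
\qquad
\mathrm{SE}_{4n} = \frac{\sigma}{\sqrt{4n}} = \tfrac{1}{2}\,\mathrm{SE}_{n}
\]
```

Because the sample size sits under a square root, quadrupling the sample only halves the standard error.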

  18. Confidence Levels • About two thirds of samples will fall within one standard error of the population parameter • Therefore: a single sample has a 68% chance of being within one standard error • Confidence levels: • 68% sure estimate is within 1 s.e. of parameter • 95% sure estimate is within 2 s.e. of parameter • 99.7% sure estimate is within 3 s.e. of parameter
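
The coverage for 1, 2, and 3 standard errors can be computed from the normal distribution. This sketch uses Python's standard-library `statistics.NormalDist` and assumes the sampling distribution is normal; the helper names `coverage` and `z_for` are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def coverage(k):
    """Probability that an estimate falls within k standard errors of the parameter."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

def z_for(confidence):
    """Number of standard errors needed for a two-sided confidence level."""
    return standard_normal.inv_cdf(0.5 + confidence / 2)

for k in (1, 2, 3):
    print(f"within {k} s.e.: {coverage(k):.1%}")

for level in (0.95, 0.99):
    print(f"{level:.0%} confidence needs about {z_for(level):.2f} s.e.")
```

The "2 s.e." rule for 95% is a rounding of about 1.96. Three standard errors cover about 99.7%, while an exact 99% level corresponds to about 2.58 standard errors.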

  19. Confidence Interval • Range of values that we are 95% confident contains the population parameter • For example, we predict that Candidate X will receive 45% of the vote with a margin of error of 3 percentage points • We are 95% sure the parameter will be between: • 42% and 48% • Confidence interval shrinks as: • Standard error is smaller • Sample size is larger
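
The Candidate X example can be reproduced with the usual normal-approximation interval for a proportion. This is a sketch under assumptions: the slide gives no sample size, so n = 1,056 is a hypothetical value chosen because it yields a margin of about 3 points, and `proportion_ci` is an illustrative helper name.

```python
import math
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence=0.95):
    """Normal-approximation confidence interval for a sample proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)      # about 1.96 for 95%
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)     # z times the standard error
    return p_hat - margin, p_hat + margin

# Hypothetical poll: 45% support among 1,056 respondents (n is an assumption).
low, high = proportion_ci(0.45, 1_056)
print(f"95% interval: {low:.1%} to {high:.1%}")
```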

  20. Sample Size & Confidence Interval • How precise does the estimate have to be? • More precise: larger sample size • Larger samples increase precision • But at a diminishing rate • Each unit you add to your sample contributes to the accuracy of your estimate • But the amount it adds shrinks with each additional unit added
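
The diminishing return can be seen by tabulating the 95% margin of error against sample size. This sketch assumes a proportion near 50% (the widest case) and the normal approximation; the sample sizes and the helper name `margin_of_error` are illustrative.

```python
import math
from statistics import NormalDist

def margin_of_error(n, p=0.5, confidence=0.95):
    """Half-width of the normal-approximation interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 400, 1_000, 1_600, 2_500):
    print(f"n = {n:>5}: +/- {margin_of_error(n):.1%}")

# Adding 100 respondents helps far more at n = 100 than at n = 1,000.
gain_early = margin_of_error(100) - margin_of_error(200)
gain_late = margin_of_error(1_000) - margin_of_error(1_100)
print(f"gain from 100 -> 200:   {gain_early:.2%}")
print(f"gain from 1000 -> 1100: {gain_late:.2%}")
```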

  21. 95% Confidence Intervals • [Figure: 95% confidence intervals by sample size]

  22. Sampling Frame • List of units from which sample is drawn • Defines your population • E.g., List of members of organization or community • Ideally you’d like to list all members of your population as your sampling frame • Randomly select your sample from that list • Often impractical to list entire population

  23. Sampling Frames for Surveys • Limitations of the telephone book: • Misses unlisted numbers • Class bias: • Poor people may not have phone • Less likely to have multiple phone lines • Most studies use a technique such as Random Digit Dialing as a surrogate for a sampling frame

  24. Types of Sampling Designs • Simple Random Sampling • Systematic Sampling • Stratified Sampling • Multi-stage Cluster Sampling

  25. Simple Random Sampling • Establish a sampling frame • A number is assigned to each element • Numbers are randomly selected into the sample
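
The three steps above map directly onto a few lines of code. This is a minimal sketch: the membership list is a made-up sampling frame and `simple_random_sample` is an illustrative helper name.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Number every element of the frame, then draw n numbers without replacement."""
    rng = random.Random(seed)
    numbered = dict(enumerate(frame, start=1))              # assign a number to each element
    chosen_numbers = rng.sample(range(1, len(frame) + 1), n)  # randomly select numbers
    return [numbered[i] for i in chosen_numbers]

# Hypothetical sampling frame: a membership list of 50 people.
frame = [f"member-{i}" for i in range(1, 51)]
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```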

  26. Systematic Sampling • Establish sampling frame • Select every kth element with random start • E.g., 1000 on the list, choosing every 10th name yields a sample size of 100 • Sampling interval: standard distance between units on the sampling frame • Sampling interval = population size / sample size • Sampling ratio: proportion of population that are selected • Sampling ratio = sample size / population size
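
The slide's example (1,000 names, every 10th selected) can be sketched as follows. The frame of integers stands in for a real list, and `systematic_sample` is an illustrative helper name; the sketch assumes the population size is at least the sample size.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th element after a random start, where k = population size // n."""
    rng = random.Random(seed)
    k = len(frame) // n            # sampling interval
    if k < 1:
        raise ValueError("sample size cannot exceed the size of the frame")
    start = rng.randrange(k)       # random start within the first interval
    return frame[start::k][:n]

frame = list(range(1, 1_001))      # hypothetical list of 1,000 names, by position
sample = systematic_sample(frame, 100, seed=7)

sampling_interval = len(frame) // len(sample)   # population size / sample size
sampling_ratio = len(sample) / len(frame)       # sample size / population size
print(sample[:5], sampling_interval, sampling_ratio)
```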

  27. Stratified Sampling • Modification used to reduce potential for sampling error • Research ensures that certain groups are represented proportionately in the sample • E.g., If the population is 60% female, stratified sample selects 60% females into the sample • E.g., Stratifying by region of the country to make sure that each region is proportionately represented
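
Proportionate stratification with the slide's 60% female example might look like this. It is a sketch: the population of 600 women and 400 men is invented, and `proportionate_stratified_sample` is an illustrative helper name.

```python
import random
from collections import Counter, defaultdict

def proportionate_stratified_sample(units, stratum_of, n, seed=None):
    """Randomly sample within each stratum in proportion to the stratum's size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # Rounding can make the total differ slightly from n for awkward proportions.
        quota = round(n * len(members) / len(units))
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical population: 60% female, 40% male.
units = [("F", i) for i in range(600)] + [("M", i) for i in range(400)]
sample = proportionate_stratified_sample(units, lambda unit: unit[0], 100, seed=3)
counts = Counter(stratum for stratum, _ in sample)
print(counts)
```

Unlike quota sampling, selection inside each group is random, so the sample stays a probability sample while the group proportions are fixed by design.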

  28. Two Methods of Stratification • 1. Sort population into groups • Randomly select within groups in proportion to relative group size • 2. Sort population into groups • Systematically select within groups using random start • Disproportionate stratification: • Some stratification groups can be over-sampled for sub-group analysis • Samples are then weighted to restore population proportions
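
The weighting step for disproportionate stratification can be illustrated with a common scheme: each group's weight is its population share divided by its sample share. The group labels and counts below are hypothetical, and `stratum_weights` is an illustrative helper name.

```python
def stratum_weights(population_counts, sample_counts):
    """Weight per group = population share / sample share."""
    population_total = sum(population_counts.values())
    sample_total = sum(sample_counts.values())
    return {
        group: (population_counts[group] / population_total)
        / (sample_counts[group] / sample_total)
        for group in population_counts
    }

# Hypothetical case: group B is 10% of the population but over-sampled to 50%.
population_counts = {"A": 900, "B": 100}
sample_counts = {"A": 100, "B": 100}
weights = stratum_weights(population_counts, sample_counts)

# Weighted sample shares should match the population shares again.
weighted_total = sum(sample_counts[g] * weights[g] for g in sample_counts)
weighted_shares = {g: sample_counts[g] * weights[g] / weighted_total for g in sample_counts}
print(weights, weighted_shares)
```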

  29. Cluster Sampling • Frequently, there is no convenient way of listing the population for sampling purposes • E.g., Sample of Dane County or Wisconsin • Hard to get a list of the population members • Cluster sample • Sample of census blocks • List of people for selected census block • Select sub-sample of people living on each block

  30. Multi-stage Cluster Sample • Cluster sampling done in a series of stages: • List, then sample within • Example: • Stage 1: Listing zip codes • Randomly selecting zip codes • Stage 2: List census blocks within selected zip codes • Randomly select census blocks • Stage 3: List households on selected census blocks • Randomly select households • Stage 4: List residents of selected households • Randomly select person to interview
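
The four stages can be sketched with nested dictionaries standing in for the lists made at each stage. Everything here is hypothetical: the zip codes, blocks, households, and residents are generated placeholders, and `multistage_sample` is an illustrative helper name.

```python
import random

def multistage_sample(areas, n_zips, n_blocks, n_households, seed=None):
    """List, then sample within: zip codes -> census blocks -> households -> one resident."""
    rng = random.Random(seed)
    respondents = []
    for zip_code in rng.sample(sorted(areas), n_zips):                  # stage 1
        blocks = areas[zip_code]
        for block in rng.sample(sorted(blocks), n_blocks):              # stage 2
            households = blocks[block]
            for household in rng.sample(sorted(households), n_households):  # stage 3
                respondents.append(rng.choice(households[household]))   # stage 4
    return respondents

# Hypothetical frame: 6 zip codes x 4 blocks x 5 households x 3 residents.
areas = {
    f"zip{z}": {
        f"block{b}": {
            f"hh{h}": [f"zip{z}-block{b}-hh{h}-person{p}" for p in range(3)]
            for h in range(5)
        }
        for b in range(4)
    }
    for z in range(6)
}

respondents = multistage_sample(areas, n_zips=2, n_blocks=2, n_households=3, seed=11)
print(len(respondents), respondents[:2])
```

Only the selected clusters need to be listed at each stage, which is what makes the design practical when no full population list exists.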

  31. Multi-stage Sampling and Sampling Error • Error is introduced at each stage • One solution is to use stratification at each stage to try to reduce sampling error
