
Survey design and sampling

Survey design and sampling. Friday 16th January 2009. Outline: Surveys; Thinking about what you’re researching: case, population, sample; Non-probability samples; Probability samples; Weighting; Sampling error. Survey Analysis.


Presentation Transcript


  1. Survey design and sampling Friday 16th January 2009

  2. Outline • Surveys • Thinking about what you’re researching: case, population, sample • Non-probability samples • Probability samples • Weighting • Sampling error

  3. Survey Analysis • Is most often used where individuals are the unit of analysis (this is not always the case though, for example in surveys of schools) • Individuals, known as Respondents, provide data by responding to questions. • The instrument used to gather data is often referred to as a questionnaire. Questionnaires: • Collect standardised information. • Are used to elicit information to be used in analysis.

  4. Three Types of Surveys: • Self-administered Questionnaires, including: • Mailed Survey (or email) • Web-based surveys • Group Survey (e.g. in a classroom) • Interview Surveys (face to face; including CAPI, computer-assisted personal interviewing) • Telephone Surveys (including CATI, computer-assisted telephone interviewing)

  5. Response Rate • You must keep track of the response rate. This is calculated as the proportion of people selected to take part in the survey (i.e. the eligible members of the sample) who actually participate, e.g. if you receive 75 surveys back from a sample of 100 people, your response rate is 75%. Example: • You are studying women over 50 and are stopping women in the street to ask their age; if they qualify, you ask to interview them. • If you stop 30 women, of whom 20 are under 50 and 10 are over 50, your starting number (those qualified to take part) is 10. • If 5 of these are willing to talk to you, you have a 50% response rate (5/10). • Note: it is irrelevant that you originally stopped 30 women; your response rate is NOT 17% (5/30). You ignore people who are not qualified when calculating the response rate.
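The street-interview example above can be checked with a few lines of Python. This is only a minimal sketch: the function name `response_rate` and the variable names are illustrative assumptions, not part of the original slides.

```python
def response_rate(completed: int, eligible: int) -> float:
    """Return the response rate as a fraction of the eligible sample."""
    if eligible <= 0:
        raise ValueError("eligible must be a positive number of cases")
    return completed / eligible


stopped = 30    # everyone approached in the street
eligible = 10   # women over 50, i.e. those qualified to take part
completed = 5   # eligible women who agreed to be interviewed

rate = response_rate(completed, eligible)
wrong_rate = completed / stopped  # the mistaken calculation the slide warns against

print(f"Response rate: {rate:.0%} (not {wrong_rate:.0%})")
```

The denominator is the eligible sample, so the 20 women under 50 never enter the calculation.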

  6. Time as a Dimension in Survey Research • Cross Sectional Studies: Observations of a sample, or cross-section, of a population or phenomenon made at one point in time – most surveys are cross-sectional → leading to a common criticism of survey research: that it is ahistorical/unsuited for examining social processes. • Longitudinal Studies: Permit observations of the same phenomena over an extended period → enables analysis of change.

  7. Types of Longitudinal Study • Trend Studies – examine change within a population over time (e.g. the census) • Cohort Studies – examine specific subpopulations or cohorts (although not necessarily the same individuals) as they change over time (e.g. interview people who were age 30 in 1970; 40 in 1980; 50 in 1990; and 60 in 2000) • Panel Studies – examine the same set of people each time (e.g. interview the same sample of voters every month during a campaign).

  8. Strengths of Survey Research • Useful in describing the characteristics of a large population. • Makes large samples feasible. • Flexible - many questions can be asked on a given topic. • Has a high degree of reliability (and replicability). • Is a relatively transparent process.

  9. Weaknesses of Survey Research • Can seldom deal with the context of social life. • Inflexible – cannot be altered once it’s begun (therefore poor for exploratory research). • Subject to artificiality – the product of respondents’ consciousness that they are being studied. • Can be weak on validity. • Can be poor at answering questions where the individual is not the unit of analysis. • Usually inappropriate for historical research. • Particularly weak at gathering certain sorts of information: • Highly complex or ‘expert’ knowledge • People’s past attitudes or behaviour • Subconscious (especially macro-social) influences • Shameful or stigmatized behaviour or attitudes (especially in face-to-face interviews) – although survey research may be able to achieve this in some circumstances.

  10. Thinking about what you’re researching: Case, Population, Sample Case: each empirical instance of what you’re researching • So if you’re researching celebrities who have been in trouble with the law, Michael Jackson would be a case, as would Winona Ryder, Pete Doherty, Kate Moss, Boy George, George Michael and OJ Simpson… • If you were interested in fast food companies, McDonald’s would be a case, Burger King would be a case, as would Subway… • If you were interested in users of a homeless shelter on a particular night, each person who came to the shelter on the specified night would be a case.

  11. Thinking about what you’re researching: Case, Population, Sample • Population – all the theoretically relevant cases (e.g. “Tottenham supporters”). • Note: This may be different to the study population, which is all of the theoretically relevant cases which are actually available to be studied (e.g. “all Tottenham club members or season ticket holders”).

  12. Sometimes you can study all possible cases (the total population in which you are interested). For example: • Post WW2 UK Prime Ministers • Homeless people using a particular shelter on Christmas Day 2007 • National football teams in the 2006 World Cup • Secondary schools in Coventry

  13. Often you cannot research the whole population because it’s too big and doing so would be too costly, too time consuming, or impossible. For example, if your ‘population’ is: • Voters in the UK since WW2 • All the homeless people in the UK on Christmas Day 2007 • Club and National Football teams involved in cup competitions in 2006 • Secondary schools in the UK On these occasions you need to select some cases to study. Selecting some cases from the total population is called Sampling.

  14. How you sample depends (among other things) on some linked issues: • What you are especially interested in (what you want to find out) • The frequency with which what you are interested in occurs in the population • The size/complexity of the population • What research methods you are going to use • How many cases you want (or have the resources/time) to study

  15. Sample and population • A range of statistical analyses can be done on a sample. • However what we are interested in generally involves population parameters (e.g. whether women in the UK earn more or less than men, not whether the 3,452 women in our study earn more on average than the 2,782 men in our study). • Therefore statistical analysis usually involves techniques for making inferences from the sample to the population.

  16. Probability and Non-Probability Sampling Probability Samples Have a mathematical relationship to the total population: we can work out mathematically the likelihood (probability) of what is found within the sample being within a given distance of what would be found for the whole population (if we were able to analyze the whole population). Probability sampling allows us to make inferences about the whole population. Non-Probability Samples • Do not formally allow us to make inferences about the whole population. However there are often logistical reasons for their use, and (despite this being statistically dodgy) inferential statistics are frequently employed (and published!)

  17. Types of Non-probability Sampling: 1. Reliance on available subjects: • Literally choosing people because they are available (e.g. approaching the first five people you see outside of the library) • Only justified if less problematic sampling methods are not possible. • Researchers must exercise considerable caution in generalizing from their data when this method is used.

  18. Types of Non-probability Sampling: 2. Purposive or judgmental sampling • Selecting a sample based on knowledge of a population, its elements, and the purpose of the study. Selecting people who would be ‘good’ informants. • Used when field researchers are interested in studying cases that don’t fit into regular patterns of attitudes and behaviors (e.g. deviance). • Relies totally on the researcher’s prior ability to determine ‘suitable’ subjects.

  19. Types of Non-probability Sampling: 3. Snowball sampling • Researcher collects data on members of the target population she can locate, then asks them to help locate other members of that population. • May be appropriate when members of a population are difficult to locate. • By definition respondents who are located by snowball sample will be connected to one another and so more likely to be similar to one another than to other members of the population.

  20. Types of Non-probability Sampling: 4. Quota sampling • Begin with a matrix of the population (e.g. based on it being 50% female, 9% minority ethnic, of a particular age structure). • Data is collected from people with the characteristics of a given cell. • Each cell is assigned a weight appropriate to its share of the population (so if you were going to sample 1,000 people you would want 500 of them to be female and 45 to be minority ethnic women). • Data should provide a representation of the total population. • However the data may not represent the population in terms of criteria that were not factored in to the initial matrix. • You cannot measure response rates. • And the selection may be biased.
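The quota arithmetic above (500 women and 45 minority ethnic women in a sample of 1,000) can be reproduced as a small quota matrix. This is a sketch under one loud assumption: sex and ethnicity are treated as independent, so each cell share is simply the product of the two marginal shares. A real quota matrix would use the observed joint distribution. The labels and variable names are illustrative.

```python
sample_size = 1000
share_female = 0.50
share_minority = 0.09

sexes = (("female", share_female), ("male", 1 - share_female))
ethnic_groups = (("minority ethnic", share_minority),
                 ("not minority ethnic", 1 - share_minority))

# ASSUMPTION: sex and ethnicity are independent, so cell share = product of marginals.
quotas = {}
for sex, p_sex in sexes:
    for group, p_group in ethnic_groups:
        quotas[(sex, group)] = round(sample_size * p_sex * p_group)

for cell, target in quotas.items():
    print(cell, target)
```

Interviewers would then keep recruiting until each cell reaches its target count.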

  21. The Logic of Probability Sampling • Representativeness: A sample is representative of the population from which it’s selected to the extent that it has the same aggregate characteristics (e.g. same percentage of women, of immigrants, of poor and rich…) • EPSEM (Equal Probability of Selection Method): Every member of the population has the same chance of being selected for the sample.

  22. Random Selection: Each element in the population has a known, non-zero chance of selection. ‘Tables’ of random numbers are often used (these come in print form or can be generated by computer). • Sampling Frame: List of every element/case from which a probability sample is selected. Sampling frames may not include every element. It is the researcher’s job to assess the extent of omissions and to correct them if possible.

  23. A Population of 100

  24. Types of Probability Sampling: 1. Simple Random Sample • Feasible only with the simplest sampling frame. • Enumerate sampling frame, and randomly select people. • Despite being the ‘pure’ type of random sampling this actually rarely occurs.

  25. A Simple Random Sample
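A simple random sample from an enumerated frame can be sketched as follows. The frame of 100 numbered cases, the sample size of 10, and the seed are assumptions chosen for illustration (the slide's figure is not reproduced in this transcript).

```python
import random

rng = random.Random(2009)  # fixed seed so the draw is reproducible

# Hypothetical sampling frame: 100 cases numbered 1..100.
sampling_frame = list(range(1, 101))

# Draw 10 cases without replacement; every case has the same chance of selection.
simple_random_sample = rng.sample(sampling_frame, k=10)
print(sorted(simple_random_sample))
```

The computer's random number generator plays the role of the printed table of random numbers.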

  26. Types of Probability Sampling: 2. Systematic Random Sample • Random start and then every kth element selected (e.g. if you wanted to select 1,000 people out of 10,000 you’d select every 10th person: the 3rd, 13th, 23rd…). • Arrangement of elements in the list can result in a biased sample (e.g. if every kth entry on a list of addresses is a corner apartment, you pick corner apartments only).
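The "random start, then every kth element" rule can be sketched like this. The helper name `systematic_sample` and the numbered frame are hypothetical, and the sketch assumes the frame size is a whole multiple of the sample size (any remainder at the end of the list is simply ignored).

```python
import random


def systematic_sample(frame, n, rng):
    """Pick n elements: a random start in the first interval, then every kth element."""
    k = len(frame) // n          # sampling interval, e.g. 10,000 // 1,000 = 10
    start = rng.randrange(k)     # random start among the first k positions
    return [frame[i] for i in range(start, start + k * n, k)]


rng = random.Random(26)
frame = list(range(1, 10_001))   # hypothetical frame: people numbered 1..10,000
chosen = systematic_sample(frame, 1000, rng)
print(chosen[:3])                # first three selected people, 10 apart
```

Because only the start is random, any periodic pattern in the list ordering carries straight through to the sample.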

  27. Types of Probability Sampling: 3. Stratified Sampling • Rather than selecting sample from population at large, researcher draws from homogeneous subsets of the population (e.g. random sampling from a set of undergraduates, and from a set of postgraduates). • Ensures that key sub-populations are adequately represented in the sample. • Results in a greater degree of representativeness by decreasing the probable sampling error.
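The undergraduate/postgraduate example can be sketched as a proportionate stratified sample, in which the same sampling fraction is applied within each stratum. The stratum sizes (800 and 200), the 10% fraction, and the helper name are assumptions for illustration only.

```python
import random


def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample of the given fraction within each stratum."""
    selected = {}
    for name, members in strata.items():
        n = round(len(members) * fraction)
        selected[name] = rng.sample(members, n)
    return selected


# Hypothetical strata: 800 undergraduates and 200 postgraduates.
strata = {
    "undergraduate": [f"UG{i}" for i in range(800)],
    "postgraduate": [f"PG{i}" for i in range(200)],
}

by_stratum = stratified_sample(strata, 0.10, random.Random(27))
print({name: len(people) for name, people in by_stratum.items()})
```

Each stratum is guaranteed its share of the sample, which a single simple random sample of 100 would only deliver on average.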

  28. A Stratified, Systematic Sample with a Random Start

  29. Types of Probability Sampling: 4. Multi-stage Sampling • Often used when it's not possible or practical to create a list of all the elements that compose the target population. • Involves repetition of two basic steps: creating lists of sampling units and sampling. • Can be highly efficient but less accurate.

  30. Example of Multi-stage Sampling Sampling Coventry residents • Write a list of all neighbourhoods in Coventry • Randomly select (sample) 5 neighbourhoods • Write a list of all streets in each selected neighbourhood • Randomly select (sample) 2 streets in each neighbourhood • Write a list of all addresses on each selected street • Select every house/flat [‘Cluster’ sampling!] • Write a list of all residents in each selected house/flat • Randomly select (sample) one person to interview.
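The Coventry steps above can be sketched on a made-up nested frame. Everything about the data is hypothetical: 10 neighbourhoods, 4 streets each, 5 addresses per street, and 3 residents per address, all with invented labels. Only the sequence of list-then-sample steps follows the slide.

```python
import random

rng = random.Random(30)

# Hypothetical frame: neighbourhood -> street -> address -> list of residents.
city = {
    f"N{n}": {
        f"N{n}-S{s}": {
            f"N{n}-S{s}-A{a}": [f"N{n}-S{s}-A{a}-R{r}" for r in range(1, 4)]
            for a in range(1, 6)
        }
        for s in range(1, 5)
    }
    for n in range(1, 11)
}

interviewees = []
for hood in rng.sample(sorted(city), 5):                 # stage 1: 5 neighbourhoods
    for street in rng.sample(sorted(city[hood]), 2):     # stage 2: 2 streets in each
        for residents in city[hood][street].values():    # cluster: every address
            interviewees.append(rng.choice(residents))   # final stage: one resident

print(len(interviewees), "people selected for interview")
```

Note that a full list is only ever written for the units selected at the previous stage, which is what makes the design practical.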

  31. Types of Probability Sampling: 5. Probability Proportionate to Size (PPS) Sample • Sophisticated form of multi-stage sampling. • Used in many large scale survey sampling projects. • Here sampling units are selected with a probability proportionate to their size (e.g. a city 10 times larger than another is 10 times more likely to be selected in the first stage of sampling).
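The "10 times larger, 10 times more likely" rule can be illustrated for a single first-stage draw. The city names and sizes are invented. Drawing several units without replacement under PPS needs a more careful scheme than this sketch shows.

```python
import random

# Hypothetical first-stage units and their population sizes.
sizes = {"City A": 1_000_000, "City B": 100_000, "City C": 400_000}

total = sum(sizes.values())
selection_prob = {city: size / total for city, size in sizes.items()}

# One first-stage draw with probability proportionate to size.
rng = random.Random(31)
drawn = rng.choices(list(sizes), weights=list(sizes.values()), k=1)[0]

print(selection_prob)
print("Selected:", drawn)
```

City A is ten times the size of City B, so its selection probability is ten times as large.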

  32. Note • The sampling strategy used in real projects often combines elements of multi-stage sampling and elements of stratification. See example of Peter Townsend’s survey of poverty (p. 120 Buckingham and Saunders)

  33. Group Exercise Imagine that you are going to conduct a ‘smoking survey’, and want to get as accurate as possible results from a sample of Warwick students. • What sampling strategy would you choose and why? • What biases might this strategy produce?

  34. Weighting • Used when you have “over-sampled” (or “under-sampled”) a particular group. This is called “disproportionate sampling” • It assigns some cases more weight than others on the basis of the different probabilities each case had of selection • The appropriate approach is to give each case a weight that’s (proportional to) the inverse of the case’s selection probability.

  35. Weighting Example • I have a population of 10,000 university students that is 10% minority ethnic. • I want to sample 100 people and compare ‘white’ and minority ethnic respondents. • If I sample randomly I will probably get only about 10 minority ethnic respondents. This won’t give me much of a basis for a comparison. • So I stratify my sample and sample 50/1000 minority ethnic students, giving a probability of selection of .05 • And 50/9,000 ‘white’ students, giving a probability of selection of .0056 • We now have 50 ‘white’ and 50 minority ethnic respondents – this is useful because it provides more balanced information about each sub-population. • However, it now looks from the sample like the population is 50% minority ethnic, which is wrong. • To re-weight the responses to make them represent the ‘real’ population I can multiply each minority ethnic respondent by the inverse of their chance of selection (1000/50 = 20) and each ‘white’ respondent by the inverse of their chance of selection (9000/50 = 180).
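The weighting example above can be worked through in code. The population and sample counts come from the slide; the dictionary layout and variable names are illustrative assumptions.

```python
# Counts from the example: 50 sampled out of 1,000, and 50 out of 9,000.
strata = {
    "minority ethnic": {"population": 1000, "sampled": 50},
    "white": {"population": 9000, "sampled": 50},
}

# Weight = inverse of the selection probability = population / sampled.
weights = {
    group: counts["population"] / counts["sampled"]
    for group, counts in strata.items()
}

# Weighted number of cases each group represents after re-weighting.
weighted_counts = {
    group: weights[group] * counts["sampled"] for group, counts in strata.items()
}
weighted_total = sum(weighted_counts.values())

unweighted_share = strata["minority ethnic"]["sampled"] / 100
weighted_share = weighted_counts["minority ethnic"] / weighted_total

print(weights)
print(f"Unweighted: {unweighted_share:.0%} minority ethnic; weighted: {weighted_share:.0%}")
```

After weighting, the 100 respondents stand in for all 10,000 students and the minority ethnic share returns to the population figure of 10%.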

  36. Sampling Error • A Parameter is the summary description of a given variable in a population (e.g. percentage of women in US population) • When researchers generalize from a sample they’re using sample observations to estimate population parameters • Sampling Error is the degree of error to be expected from a given sample design in making these estimations

  37. Sampling Error The most carefully selected sample will never provide a perfect representation of the population from which it was selected. There will always be some sampling error. The expected extent of error in a sample is expressed in terms of confidence levels (e.g. that you’re 95% confident of being no more than a given amount wrong about the proportion of the population who are Catholic, given how many people in your sample were Catholic).
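One common way to put a number on "no more than a given amount wrong" is the normal-approximation margin of error for a proportion. The slides do not give a formula, so this is a sketch under stated assumptions: a simple random sample, a reasonably large sample size, and a 95% confidence level (z of about 1.96). The figures of 300 Catholics in a sample of 1,000 are invented.

```python
import math


def margin_of_error(successes: int, n: int, z: float = 1.96) -> float:
    """Normal-approximation margin of error for a sample proportion."""
    p = successes / n
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical sample: 300 of 1,000 respondents are Catholic.
n = 1000
catholic = 300
p_hat = catholic / n
moe = margin_of_error(catholic, n)

print(f"Estimate: {p_hat:.1%}, plus or minus {moe:.1%} at the 95% confidence level")
```

Under these assumptions the estimate of 30% would be reported with a margin of a little under three percentage points.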

  38. A population of ten people with $0 - $9

  39. The Sampling Distribution of Samples of 1

  40. The Sampling Distribution of Samples of 2

  41. The Sampling Distribution of Samples of 3, 4, 5, and 6
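The figures for these slides are not part of the transcript, but the idea can be reconstructed by enumerating every possible sample from a population of ten people holding $0 to $9. The sketch assumes sampling without replacement and uses the sample mean as the statistic; both are assumptions about what the original figures showed.

```python
from itertools import combinations

population = list(range(10))  # ten people holding $0, $1, ..., $9
population_mean = sum(population) / len(population)


def sample_means(n):
    """Means of every possible sample of size n, drawn without replacement."""
    return [sum(sample) / n for sample in combinations(population, n)]


for n in (1, 2, 3, 4, 5, 6):
    means = sample_means(n)
    print(f"n={n}: {len(means)} samples, means range from {min(means)} to {max(means)}")
```

As the sample size grows, the possible sample means bunch more tightly around the population mean, which is the point the sequence of slides illustrates.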

  42. Sample Size Sample Size Needed Depends on: • Heterogeneity of the population – the more heterogeneous, the bigger the sample • Number of sub-groups – the more sub groups, the bigger the sample needed • Frequency of the phenomenon you’re trying to detect – the closer to 50% (of the time) that it occurs, the bigger the sample • How accurately you want your sample statistics to reflect the population – the greater the accuracy required, the bigger sample needed. • How confident you want to be in your results!

  43. Other considerations when you’re thinking about Sample Size • Response Rate – if you think that a lot of people will not respond, you need to start off with a larger sample • Form of analysis – some forms of statistical analysis require a larger number of cases than others. If you plan on using one of these you will need to ensure you’ve got enough cases Generally (given a choice): Bigger is Better! (Hence sample size often reflects costs/resources.)
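Several of the factors on the last two slides (frequency of the phenomenon, accuracy, confidence, response rate) can be combined in a standard back-of-envelope calculation for estimating a proportion. The slides give no formula, so this sketch assumes a simple random sample, the normal approximation, and a 95% confidence level. The function names and the example figures are illustrative.

```python
import math


def required_completes(p: float, margin: float, z: float = 1.96) -> int:
    """Completed responses needed to estimate a proportion p within +/- margin."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


def required_invitations(completes: int, response_rate: float) -> int:
    """Starting sample needed to end up with the required completed responses."""
    return math.ceil(completes / response_rate)


common = required_completes(p=0.5, margin=0.03)  # phenomenon occurs about 50% of the time
rare = required_completes(p=0.1, margin=0.03)    # phenomenon occurs about 10% of the time
to_invite = required_invitations(common, response_rate=0.75)

print(common, rare, to_invite)
```

The calculation reflects two points from the slides: a phenomenon closer to 50% needs a bigger sample for the same accuracy, and an expected response rate below 100% means starting with a larger sample.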
