AP Statistics 5.1 Designing Samples
Learning Objective: • Differentiate between an observational study and an experiment • Learn different types of sampling techniques • Use a random digit table to create an SRS • Understand different types of bias
Definitions • designs – arrangements or patterns for producing and collecting data • population – entire group of individuals that we want information about • sample – part of a population that we actually examine in order to gather information
Example: Problem 5.2 p. 248 • Answers: • a) An individual is a person; the population is all adult US residents. • b) An individual is a household; the population is all US households. • c) An individual is a voltage regulator; the population is all the regulators in the last shipment
Some questions that we need to answer as we design a study or experiment: • How many individuals must we collect data from? (sample size?) • How will we select the individuals to be studied? • If (as in many experiments) several groups of individuals are to receive different treatments, how will we form the groups?
Without a systematic design for producing data, we are subject to being misled by incomplete or haphazard data, or by confounding variables.
Sampling Design for Observational Studies • Goal: To use information obtained from a “representative” sample to make inferences about the population from which the sample was taken; the only alternative is taking a census – not very practical! • We will not, as in an experiment, impose a treatment in order to observe the response, but we want to gather information about a large group of individuals. Design of the sample refers to the method used to choose the sample from the population. Poor sample design leads to misleading conclusions.
“Bad” Sampling Methods • voluntary response sample – consists of people who choose themselves by responding to a general appeal; voluntary response samples tend to be biased (especially towards the negative) because people who have a strong opinion are most likely to respond • Ex: call-in polls, internet quick polls, etc.
Example: Problem 5.3 p. 249 • Answer: Only people with a strong opinion on the subject – strong enough that they are willing to spend the time and 50¢ – will respond to this advertisement.
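The reasoning behind voluntary response bias can be checked with a little arithmetic. The sketch below is only an illustration: the population shares and response rates are hypothetical numbers, not data from the textbook problem.

```python
# All shares and response rates below are hypothetical, chosen only to illustrate the effect.
groups = {
    "strong opinion": {"population_share": 0.30, "response_rate": 0.30},
    "mild or no opinion": {"population_share": 0.70, "response_rate": 0.01},
}

# Fraction of the whole population that responds, by group.
responding = {
    name: g["population_share"] * g["response_rate"] for name, g in groups.items()
}
total_responding = sum(responding.values())

# Share of each group among the people who actually respond.
respondent_share = {name: r / total_responding for name, r in responding.items()}

for name, g in groups.items():
    print(f"{name}: {g['population_share']:.0%} of population, "
          f"{respondent_share[name]:.0%} of respondents")
```

Under these assumed rates, people with strong opinions are 30% of the population but about 93% of the respondents, so the poll result says little about the population as a whole.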
convenience sample – “grab” the first n people available – not random! Ex:
Probability Samples – “Good” Sampling Methods • simple random sample (SRS) – each individual in the population has an equal chance of being included in the sample and each subgroup of size n has an equal chance of being in the sample. • You can select an SRS by labeling all the individuals in the population with a number and then randomly selecting a sample (using a calculator or table of random digits).
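A software version of "label, then randomly select" might look like the sketch below. It uses Python's standard `random.sample` in place of a calculator or digit table; the population of 28 labels and the seed are assumptions made for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return an SRS of size n: every subgroup of size n is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, n)  # sampling without replacement

# Hypothetical population: individuals labeled 01 through 28.
labels = [f"{i:02d}" for i in range(1, 29)]
chosen = simple_random_sample(labels, 6, seed=1)
print(chosen)
```

Because `random.sample` draws without replacement, no label can appear twice, which matches the "without repetition" rule used with a random digit table.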
Example: 5.5 p. 252 (1st ed.) • Answer: I started with 01 and numbered the managers down the columns. I used a random digit table (RDT), picking six 2-digit numbers without repetition (ignoring 00 and numbers greater than 28). • The numbers I selected are 04, 10, 17, and 19 from line 139 and 12 and 13 from line 140: • Line 139: 55 58 89 94 04 70 70 84 10 98 43 56 35 69 34 48 39 45 17 19 • Line 140: 12 97 51 32 58 13 • Thus, the six managers chosen to be interviewed are: • 04-Bonds, 10-Fleming, 17-Liao, 19-Naber, 12-Goel, and 13-Gomez
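The table-reading procedure in this answer can be sketched in code. The function name `pick_from_digit_table` is hypothetical; the digits are the ones quoted from lines 139 and 140 above.

```python
def pick_from_digit_table(digits, n, max_label, width=2):
    """Read digits in groups of `width`; keep the first n distinct labels in 1..max_label."""
    stream = "".join(ch for ch in digits if ch.isdigit())  # drop the spaces
    chosen = []
    for i in range(0, len(stream) - width + 1, width):
        label = int(stream[i:i + width])
        # Skip 00, labels above max_label, and repeats.
        if 1 <= label <= max_label and label not in chosen:
            chosen.append(label)
            if len(chosen) == n:
                break
    return chosen

line_139 = "55 58 89 94 04 70 70 84 10 98 43 56 35 69 34 48 39 45 17 19"
line_140 = "12 97 51 32 58 13"
picked = pick_from_digit_table(line_139 + " " + line_140, n=6, max_label=28)
print([f"{p:02d}" for p in picked])  # ['04', '10', '17', '19', '12', '13']
```

The same function handles any labeling scheme with 2-digit labels by changing `n` and `max_label`.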
Example: 5.19 p. 262 • Answer: I numbered the bottles across the rows from 01 to 25. I used a random digit table (RDT), picking three 2-digit numbers without repetition (ignoring 00 and numbers greater than 25). • The chosen numbers are 12 from line 111 and 04 and 11 from line 112 (the repeated 04 is skipped): • Line 111: 81 48 66 94 87 60 51 30 92 97 00 41 27 12 38 27 64 93 99 50 • Line 112: 59 63 68 88 04 04 63 47 11 • Thus, the three bottles chosen to be tested are 12-B0986, 04-A1101, and 11-A2220.
The rest of these probability samples give each individual, but not each subgroup, an equal chance of being selected. • Systematic random sampling – randomly select a starting place/number and then take every kth value/individual
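A minimal sketch of systematic random sampling, assuming the population is stored in a list and that its size is a multiple of k. The list of 100 numbered houses is a made-up example.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k individuals, then take every kth one."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random index from 0 to k-1
    return population[start::k]

# Hypothetical population: houses numbered 1 to 100; take every 10th house.
houses = list(range(1, 101))
sample = systematic_sample(houses, k=10, seed=3)
print(sample)
```

Each house has a 1-in-10 chance of being chosen, but two neighboring houses can never appear in the same sample, so this is not an SRS.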
The following probability samples are used with populations that are very large and/or spread out: • stratified random sampling – break the population into two or more strata (groups – e.g. males and females), then take an SRS from each stratum (similar to blocking used in experiments); ensures that you include in the sample all types of individuals from the population (more representative sample)
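Stratified random sampling can be sketched as one SRS per stratum. The strata, labels, and sample sizes below are hypothetical and chosen only to show the mechanics.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Take an independent SRS of sizes[name] individuals from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, sizes[name]) for name, members in strata.items()}

# Hypothetical population split into two strata.
strata = {
    "male": [f"M{i:03d}" for i in range(1, 201)],
    "female": [f"F{i:03d}" for i in range(1, 51)],
}
sample = stratified_sample(strata, {"male": 20, "female": 5}, seed=7)
print(sample["female"])
```

The sample always contains exactly 20 males and 5 females, so a sample made up entirely of one group can never occur.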
Example: 5.23 p. 264 • Answer: It is not an SRS, because some samples of size 250 have no chance of being selected. For example, using this method, it would be impossible to select a sample containing all women.
multistage sampling • – uses the idea of a cluster sample – randomly select a location, area, row, etc. and then include everything in that group in your sample; doing this multiple times makes it a multistage sample. • Ex:
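As a sketch of the idea, the first function below takes a cluster sample: it randomly selects whole groups and keeps everyone in them. The second function shows one common way to add a stage, by taking an SRS inside each selected group; that second stage is an assumption about how the stages might be set up, not something the slide specifies. The rows and seats are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and include every individual in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly select clusters. Stage 2: take an SRS within each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

# Hypothetical population: 8 rows of 10 seats each.
rows = {f"row{r}": [f"row{r}-seat{s}" for s in range(1, 11)] for r in range(1, 9)}
whole_rows = cluster_sample(rows, n_clusters=2, seed=5)
some_seats = two_stage_sample(rows, n_clusters=2, n_per_cluster=3, seed=5)
print(sorted(whole_rows), some_seats)
```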
Surveys • Surveys are a common method of collecting data in an observational study. There are several problems that arise with surveys: • In order to choose a sample to survey, we need a complete and accurate list of the population, but in reality we rarely have one.
undercoverage • some groups in the population are left out in the process of choosing the sample; e.g. a survey given to households leaves out homeless people, people in prison, students in dorms; opinion polls over the phone leave out people without phones
nonresponse • the selected individual cannot be contacted/found or refuses to answer the questions. The nonresponse rate for surveys often reaches 30% or more.
response bias • bias caused by the behavior of the respondent or of the interviewer; e.g. respondents may lie if the questions deal with illegal or socially unacceptable behavior; the attitude of the interviewer may suggest that one answer is more desirable (therefore interviewers must be trained carefully to remain neutral)
wording of questions (wording effect) • is the most important influence on the answers given to a survey; watch out for leading questions and difficult-to-understand questions
Example: 5.24 p. 264 • Answer: • A) This question will likely elicit more responses against gun control (that is, more people will choose 2). • B) The phrasing of this question will tend to make people respond in favor of a nuclear freeze. Only one side of the issue is presented. • C) HUH? The wording of this question is too technical for most people to understand – and for those rare few that do understand, it is slanted towards supporting recycling. It could be rewritten to say something like: “Do you support economic incentives to promote recycling?”