AP Statistics

AP Statistics 5.1 Designing Samples

Learning Objectives: • Differentiate between an observational study and an experiment • Learn different types of sampling techniques • Use a random digit table to create an SRS • Understand different types of bias

Definitions • designs – arrangements or patterns for producing and collecting data • population – entire group of individuals that we want information about • sample – part of a population that we actually examine in order to gather information

Example: Problem 5.2 p. 248 • Answers: • a) An individual is a person; the population is all adult US residents. • b) An individual is a household; the population is all US households. • c) An individual is a voltage regulator; the population is all the regulators in the last shipment

Some questions that we need to answer as we design a study or experiment: • How many individuals must we collect data from? (sample size?) • How will we select the individuals to be studied? • If (as in many experiments) several groups of individuals are to receive different treatments, how will we form the groups?

Without a systematic design for producing data, we are subject to being misled by incomplete or haphazard data, or by confounding variables.

Sampling Design for Observational Studies • Goal: To use information obtained from a “representative” sample to make inferences about the population from which the sample was taken; the only alternative is taking a census – not very practical! • We will not, as in an experiment, impose a treatment in order to observe the response, but we want to gather information about a large group of individuals. Design of the sample refers to the method used to choose the sample from the population. Poor sample design leads to misleading conclusions.

“Bad” Sampling Methods • voluntary response sample – consists of people who choose themselves by responding to a general appeal; voluntary response samples tend to be biased (especially towards the negative) because people who have a strong opinion are most likely to respond • Ex: call-in polls, internet quick polls, etc.

Example: Problem 5.3 p. 249 • Answer: Only people with a strong opinion on the subject – strong enough that they are willing to spend the time and 50¢ – will respond to this advertisement.
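To see why a voluntary response sample can be badly biased, here is a small arithmetic sketch. All three input numbers are hypothetical (they are not from the textbook problem): 40% of the population is opposed, opposed people respond 30% of the time, and everyone else responds 5% of the time.

```python
def share_among_respondents(p_opposed, respond_if_opposed, respond_if_favor):
    """Expected fraction of respondents who are opposed.

    All three inputs are hypothetical illustration values, not survey data.
    """
    opposed = p_opposed * respond_if_opposed          # opposed AND responds
    favor = (1 - p_opposed) * respond_if_favor        # not opposed AND responds
    return opposed / (opposed + favor)

# 40% opposed in the population; the opposed are six times as likely to respond.
share = share_among_respondents(0.40, 0.30, 0.05)
print(round(share, 2))
```

Under these assumed rates, the respondents look far more opposed than the population actually is, even though no one lied and no question was badly worded.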

convenience sample – “grab” the first n people available – not random! Ex:

Probability Samples – “Good” Sampling Methods • simple random sample (SRS) – a sample of size n chosen so that each individual in the population has an equal chance of being included and every possible group of n individuals has an equal chance of being the sample selected. • You can select an SRS by labeling every individual in the population with a number and then randomly selecting labels (using a calculator or a table of random digits).
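If software is available instead of a calculator or table, the same idea can be sketched in Python. This is only an illustration: the function name `simple_random_sample`, the population of 28 labels, and the seed are assumptions made for the example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return n distinct individuals; every group of size n is equally likely."""
    rng = random.Random(seed)          # seed only makes the run repeatable
    return rng.sample(population, n)   # sampling without replacement

labels = list(range(1, 29))            # hypothetical population labeled 01..28
chosen = simple_random_sample(labels, 6, seed=1)
print(sorted(chosen))
```

`random.sample` draws without replacement, which matches the "without repetition" rule used when reading a random digit table.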

Example: 5.5 p. 252 (1st ed.) • Answer: I numbered the managers down the columns, starting with 01. I used a random digit table (RDT), reading two-digit numbers and picking six without repetition (ignoring 00 and numbers greater than 28). • The numbers I selected are 04, 10, 17, 19, 12, and 13: • Line 139: 55 58 89 94 04 70 70 84 10 98 43 56 35 69 34 48 39 45 17 19 • Line 140: 12 97 51 32 58 13 • Thus, the six managers chosen to be interviewed are: • 04-Bonds, 10-Fleming, 17-Liao, 19-Naber, 12-Goel, and 13-Gomez
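The table-reading procedure in this answer can be checked with a short sketch. The function name `select_from_digits` is hypothetical; the digit strings are the portions of lines 139 and 140 quoted above.

```python
def select_from_digits(digit_lines, max_label, n, width=2):
    """Read fixed-width numbers from random-digit lines, keeping the first n
    distinct labels between 1 and max_label (00 and repeats are skipped)."""
    chosen = []
    for line in digit_lines:
        digits = line.replace(" ", "")            # spacing in the table is cosmetic
        for i in range(0, len(digits) - width + 1, width):
            label = int(digits[i:i + width])
            if 1 <= label <= max_label and label not in chosen:
                chosen.append(label)
                if len(chosen) == n:
                    return chosen
    return chosen                                  # fewer than n if digits run out

line_139 = "55 58 89 94 04 70 70 84 10 98 43 56 35 69 34 48 39 45 17 19"
line_140 = "12 97 51 32 58 13"
managers = select_from_digits([line_139, line_140], max_label=28, n=6)
print(managers)  # [4, 10, 17, 19, 12, 13]
```

The result matches the labels selected by hand: 04, 10, 17, 19, 12, and 13.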

Example: 5.19 p. 262 • Answer: I numbered the bottles across the rows from 01 to 25. I used a random digit table (RDT), reading two-digit numbers and picking three without repetition (ignoring 00 and numbers greater than 25). • The chosen numbers are 12, 04, and 11 (the second 04 is a repeat and is skipped): • Line 111: 81 48 66 94 87 60 51 30 92 97 00 41 27 12 38 27 64 93 99 50 • Line 112: 59 63 68 88 04 04 63 47 11 • Thus, the three bottles chosen to be tested are 12-B0986, 04-A1101, and 11-A2220.

The rest of these probability samples give each individual, but not each subgroup, an equal chance of being selected. • Systematic random sampling – randomly select a starting place/number and then take every kth value/individual
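A minimal sketch of systematic random sampling follows. The function name `systematic_sample`, the population of 100 numbered units, and the choice k = 10 are assumptions for illustration.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k positions, then take every kth."""
    rng = random.Random(seed)
    start = rng.randrange(k)           # 0, 1, ..., k-1, each equally likely
    return population[start::k]

units = list(range(1, 101))            # hypothetical population numbered 1..100
sample = systematic_sample(units, 10, seed=3)
print(sample)
```

Each unit has a 1-in-10 chance of selection, but two neighboring units (say 17 and 18) can never appear together, so not every subgroup is possible and this is not an SRS.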

The following probability samples are used with populations that are very large and/or spread out: • stratified random sampling – break the population into two or more strata (groups – e.g. males and females), then take an SRS from each stratum (similar to blocking used in experiments); ensures that you include in the sample all types of individuals from the population (more representative sample)
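Stratified random sampling can be sketched as one SRS per stratum. The function name `stratified_sample`, the strata sizes (40 and 60), and the per-stratum sample sizes (4 and 6) are made up for illustration.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Take a separate SRS of the requested size from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, sizes[name])
            for name, members in strata.items()}

# Hypothetical population: 40 males and 60 females, labeled M01.. and F01..
strata = {
    "males": [f"M{i:02d}" for i in range(1, 41)],
    "females": [f"F{i:02d}" for i in range(1, 61)],
}
sizes = {"males": 4, "females": 6}
sample = stratified_sample(strata, sizes, seed=7)
print(sample)
```

Because the number drawn from each stratum is fixed in advance, every stratum is guaranteed to be represented, which an SRS of the whole population cannot promise.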

Example: 5.23 p. 264 • Answer: It is not an SRS, because some samples of size 250 have no chance of being selected. For example, using this method, it would be impossible to select a sample containing all women.

multistage sampling – uses the idea of a cluster sample: randomly select a location, area, row, etc., and then include everything in that group in your sample; doing this multiple times makes it a multistage sample. • Ex:
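The cluster step described here can be sketched as follows. The function name `cluster_sample` and the seating layout (8 rows of 5 seats) are hypothetical, chosen only to illustrate the idea.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and include every member of each one."""
    rng = random.Random(seed)
    picked = rng.sample(sorted(clusters), n_clusters)   # choose cluster names
    return {name: list(clusters[name]) for name in picked}

# Hypothetical population: 8 rows of 5 seats; each row is one cluster.
rows = {f"row{r}": [f"row{r}-seat{s}" for s in range(1, 6)] for r in range(1, 9)}
sample = cluster_sample(rows, 2, seed=11)
print(sample)
```

Randomness enters only in the choice of rows; once a row is picked, everyone in it is sampled. A common multistage variant instead takes a further random sample within each chosen cluster.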

Surveys • Surveys are a common method of collecting data in an observational study. There are several problems that arise with surveys: • In order to choose a sample to survey, we need a complete and accurate list of the population, but in reality we rarely have one.

undercoverage • some groups in the population are left out in the process of choosing the sample; e.g. a survey given to households leaves out homeless people, people in prison, students in dorms; opinion polls over the phone leave out people without phones

nonresponse • the selected individual cannot be contacted/found or refuses to answer the questions. The nonresponse rate for surveys often reaches 30% or more.

response bias • bias caused by the behavior of the respondent or of the interviewer; e.g. respondents may lie if the questions deal with illegal or socially unacceptable behavior; the attitude of the interviewer may suggest that one answer is more desirable (therefore interviewers must be trained carefully to remain neutral)

wording of questions (wording effect) • is the most important influence on the answers given to a survey; watch out for leading questions and difficult-to-understand questions

Example: 5.24 p. 264 • Answer: • A) This question will likely elicit more responses against gun control (that is, more people will choose 2). • B) The phrasing of this question will tend to make people respond in favor of a nuclear freeze. Only one side of the issue is presented. • C) HUH? The wording of this question is too technical for most people to understand – and for those rare few that do understand, it is slanted towards supporting recycling. It could be rewritten to say something like: “Do you support economic incentives to promote recycling?”
