Learn how to design experiments and surveys to accurately answer real-life questions. Understand the importance of sample design and avoid biases. Discover different sampling methods like simple random samples, stratified random samples, and multistage sampling.
AP STATISTICS LESSON 5-1 DESIGNING DATA
ESSENTIAL QUESTION: How can data be produced to answer questions about real-life situations? • To design and analyze experiments and surveys in order to answer questions as accurately as possible. • To learn how to use Table B to set up simple random samples. • To discover the difficulties that damage a sample.
Introduction Our goal in choosing a sample is a picture of the population, disturbed as little as possible by the act of gathering information. Sample surveys are one kind of observational study.
Observation vs. Experiment An observational study observes individuals and measures variables of interest but does not attempt to influence the responses. An experiment, on the other hand, deliberately imposes some treatment on individuals in order to observe their responses. Both have important roles depending on the situation and the questions to be answered.
Additional facts • Observational studies of the effect of one variable on another often fail because the explanatory variable is confounded with lurking variables. • In some situations, it may not be possible to observe individuals directly or to perform an experiment. In other cases, it may be logistically difficult or simply inconvenient to do so. • Simulation provides an alternative method to produce data.
Population and Sample The entire group of individuals that we want information about is called the population. A sample is a part of the population that we actually examine in order to gather information.
Sampling vs. Census Sampling involves studying a part in order to gain information about the whole. A census attempts to contact every individual in the entire population. The design of a sample refers to the method used to choose the sample from the population. Poor sample design can produce misleading conclusions.
Voluntary Response Sample A voluntary response sample consists of people who choose themselves by responding to a general appeal. Voluntary response samples are biased because people with strong opinions, especially negative opinions, are most likely to respond. Another type of bad sampling is convenience sampling, which chooses the individuals easiest to reach.
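The effect of self-selection can be illustrated with a small simulation. This is only a sketch under made-up assumptions: a hypothetical population of 10,000 people split evenly on some issue, in which opponents answer a general appeal with probability 0.10 and supporters with probability 0.02. These response rates are invented for illustration and do not come from any real survey.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population: True = supports the proposal, False = opposes it.
population = [True] * 5000 + [False] * 5000

def responds(supports):
    """Assumed response rates: opponents are five times as likely to call in."""
    return rng.random() < (0.02 if supports else 0.10)

# Voluntary response sample: whoever chooses to respond.
voluntary = [opinion for opinion in population if responds(opinion)]

# Simple random sample of 500, chosen by chance instead.
srs = rng.sample(population, 500)

voluntary_support = sum(voluntary) / len(voluntary)
srs_support = sum(srs) / len(srs)

print(f"true support:             {sum(population) / len(population):.2f}")
print(f"voluntary response says:  {voluntary_support:.2f}")
print(f"simple random sample says: {srs_support:.2f}")
```

Under these assumptions the voluntary response sample badly understates support, while the random sample lands near the true value of 0.50.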
Bias The design of a study is biased if it systematically favors certain outcomes. Choosing a sample by chance attacks bias by giving all individuals an equal chance to be chosen.
Simple Random Sample A simple random sample (SRS) of size n consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected. An SRS not only gives each individual an equal chance to be chosen (thus avoiding bias in the choice) but also gives every possible sample an equal chance to be chosen.
Random Digits A table of random digits is a long string of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 with the following two properties: • Each entry in the table is equally likely to be any of the 10 digits 0 through 9. • The entries are independent of each other. That is, knowledge of one part of the table gives no information about any other part. *Table B (in the back of your textbook) is a random digits table.
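A small sketch may make the "every set of n individuals" condition concrete. It assumes a hypothetical population of four named individuals and uses Python's standard `random.sample` as the chance mechanism; the helper name `draw_srs` is an illustration, not part of the lesson.

```python
import random
from itertools import combinations

def draw_srs(population, n, rng):
    """Draw n distinct individuals so that every set of size n is equally likely."""
    return rng.sample(population, n)

# Hypothetical population of four individuals.
population = ["Ana", "Ben", "Cal", "Dee"]

# Every possible sample of size 2. An SRS gives each of these sets
# the same chance of being the sample actually selected.
all_samples = list(combinations(population, 2))
chance_of_each = 1 / len(all_samples)

rng = random.Random(1)  # fixed seed so the run is repeatable
sample = draw_srs(population, 2, rng)

print(len(all_samples), "possible samples, each with chance", round(chance_of_each, 3))
print("sample drawn:", sample)
```

With four individuals and n = 2 there are six possible samples, so each has a 1-in-6 chance of being chosen.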
Choosing an SRS Choose an SRS in two steps: • Label: Assign a numerical label to every individual in the population. • Table: Use Table B to select labels at random.
Probability Sample A probability sample is a sample chosen by chance. We must know what samples are possible and what chance, or probability, each possible sample has. The use of chance to select the sample is the essential principle of statistical sampling.
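The label-then-read procedure can be sketched in code. The example assumes a hypothetical population of 30 individuals labeled 01 through 30 and an illustrative string of digits standing in for a line of a random digits table. The function name `srs_from_digits` is made up for this sketch. Groups of digits that fall outside the label range, or that repeat a label already chosen, are skipped.

```python
def srs_from_digits(digits, population_size, n):
    """Read fixed-width labels from a string of random digits until n are chosen."""
    digits = "".join(ch for ch in digits if ch.isdigit())  # drop spaces
    width = len(str(population_size))  # 30 individuals -> two-digit labels
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        # Skip groups that match no individual and labels already selected.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
            if len(chosen) == n:
                return chosen
    raise ValueError("ran out of digits before the sample was complete")

# Illustrative digits only; in class these would be read from the table.
line = "69051 64817 87174 09517 84534 06489 87201 97245"

print(srs_from_digits(line, population_size=30, n=4))  # [5, 16, 17, 20]
```

Reading in pairs gives 69, 05, 16, 48, 17, 87, 17, ... so the individuals labeled 05, 16, 17 and 20 form the sample; the repeated 17s and every pair above 30 are ignored.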
Stratified Random Sample To select a stratified random sample, first divide the population into groups of similar individuals, called strata. Then choose a separate SRS in each stratum and combine these SRSs to form the full sample. This method is usually used for sampling from large populations spread out over a wide area.
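A minimal sketch of the stratify-then-sample idea follows. It assumes a hypothetical school of 60 juniors and 40 seniors, with class year as the stratum; the function name and the per-stratum sample sizes are choices made for this example only.

```python
import random
from collections import defaultdict

def stratified_sample(individuals, stratum_of, n_per_stratum, rng):
    """Take a separate SRS inside each stratum and combine the results."""
    strata = defaultdict(list)
    for person in individuals:
        strata[stratum_of(person)].append(person)

    sample = []
    for name in sorted(strata):  # fixed order keeps the run repeatable
        sample.extend(rng.sample(strata[name], n_per_stratum[name]))
    return sample

# Hypothetical population: (class year, student id).
students = [("junior", i) for i in range(60)] + [("senior", i) for i in range(40)]

rng = random.Random(0)
sample = stratified_sample(
    students,
    stratum_of=lambda student: student[0],
    n_per_stratum={"junior": 6, "senior": 4},
    rng=rng,
)

print(sample)
```

Because each stratum is sampled separately, the combined sample is guaranteed to contain exactly 6 juniors and 4 seniors, which a single SRS of 10 students would not guarantee.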
Multistage Sampling A typical example of multistage sampling is the sampling design of the Current Population Survey, which proceeds as follows: Stage 1: Divide the United States into 2,007 geographical areas called Primary Sampling Units (PSUs) and select a sample of PSUs. Stage 2: Divide each PSU selected into smaller areas called “neighborhoods” using ethnic and other information, and take a stratified sample of neighborhoods. Stage 3: Sort the housing units in each neighborhood into clusters of four nearby units. Interview the households in a random sample of these clusters. This method saves time and money.
Cautions about sample surveys • Choosing a sample requires a complete and accurate list of the population. • Undercoverage occurs when some groups in the population are left out of the process of choosing the sample. • Nonresponse occurs when an individual chosen for the sample can’t be contacted or does not cooperate. • Response bias occurs when respondents lie, especially if asked about illegal or unpopular behaviors. • An interviewer whose attitude suggests that some answers are more desirable than others will get these answers more often. • The wording of questions is the most important influence on the answers given to a sample survey.
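The staged structure can be sketched on a toy sampling frame. This is not the Current Population Survey's actual procedure: the frame below is invented (6 PSUs, 4 neighborhoods each, 5 clusters each), and a simple random sample is taken at every stage, whereas the real design stratifies neighborhoods at Stage 2.

```python
import random

def multistage_sample(psus, n_psus, n_neighborhoods, n_clusters, rng):
    """Sample PSUs, then neighborhoods within them, then clusters within those."""
    chosen = []
    for psu in rng.sample(sorted(psus), n_psus):                          # stage 1
        neighborhoods = psus[psu]
        for hood in rng.sample(sorted(neighborhoods), n_neighborhoods):   # stage 2
            clusters = neighborhoods[hood]
            for cluster in rng.sample(clusters, n_clusters):              # stage 3
                chosen.append((psu, hood, cluster))
    return chosen

# Toy frame: PSU -> neighborhood -> list of housing-unit clusters.
psus = {
    f"PSU{p}": {
        f"N{h}": [f"PSU{p}-N{h}-C{c}" for c in range(5)]
        for h in range(4)
    }
    for p in range(6)
}

picked = multistage_sample(psus, n_psus=2, n_neighborhoods=2, n_clusters=1,
                           rng=random.Random(0))
for row in picked:
    print(row)
```

Only the 2 selected PSUs ever need a detailed list of neighborhoods and housing units, which is where the savings in time and money come from.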
Inference about the population • Using chance to choose a sample eliminates bias in the actual selection of the sample. • Because we deliberately use chance, the results obey the laws of probability that govern chance behavior. • Larger random samples give more accurate results than smaller samples.
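The last point can be checked by simulation. The sketch below assumes a very large population in which a proportion p = 0.6 has some characteristic, so each sampled individual can be modeled as an independent draw; the value 0.6, the sample sizes 25 and 400, and the 1,000 repetitions are arbitrary choices for illustration.

```python
import random
import statistics

def spread_of_sample_proportions(p, n, repetitions, rng):
    """Standard deviation of the sample proportion over many random samples of size n."""
    proportions = []
    for _ in range(repetitions):
        successes = sum(rng.random() < p for _ in range(n))
        proportions.append(successes / n)
    return statistics.pstdev(proportions)

rng = random.Random(42)  # fixed seed so the run is repeatable
small = spread_of_sample_proportions(p=0.6, n=25, repetitions=1000, rng=rng)
large = spread_of_sample_proportions(p=0.6, n=400, repetitions=1000, rng=rng)

print(f"spread with n = 25:  {small:.3f}")
print(f"spread with n = 400: {large:.3f}")
```

The sample proportions from the larger samples cluster much more tightly around 0.6, which is what "more accurate" means here: less variability from sample to sample, not the removal of bias.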