
Chapter 5 Producing Data


portia

Presentation Transcript


  1. Chapter 5: Producing Data • 5.1 Designing Samples • 5.2 Designing Experiments • 5.3 Simulating Experiments

  2. 5.1 Designing Samples • Population • Entire group of individuals that we want info about • A census attempts to contact the entire population • Sample • Part of the population that we actually examine in order to gather info • Sampling involves studying a part to determine info about the whole

  3. Design of a sample/Sampling Method • Method used to choose sample from population • Poor design could lead to misleading conclusions • Bias • A sampling method is biased if it systematically favors certain outcomes. • Types of Samples • Voluntary Response Sample • Convenience Sampling • Simple Random Sample • Probability Sample • Stratified Random Sample • Cluster Sample • Multistage Sample

  4. Voluntary Response Sample • Consists of people who choose themselves by responding to a general appeal • Negative: Overrepresents people with strong opinions • Ex: Call-in polls • Convenience Sample • Choose individuals who are easiest to reach • Negative: Unlikely to represent the whole population • Ex: Survey everyone in the gym • Voluntary response and convenience samples are often biased

  5. Questions to ask before believing a poll: • Who carried out the survey? • What was the population? • How was the sample selected? • How large was the sample? • What was the response rate? • How were the subjects contacted? • When was the survey conducted? • What were the exact questions asked?

  6. Question Wording • Pair 1 • A. Which do you prefer, milk or orange juice as a breakfast drink? (Milk: 14%) • B. Milk contains high levels of vitamin D and calcium. Which do you prefer, milk or orange juice as a breakfast drink? (Milk: 64%) • Pair 2 • A. Do you watch cartoons? (90% yes) • B. Do you still watch cartoons? (60% yes) • Pair 3 • A. Do you cheat in class? (anonymous) (47% yes) • B. Do you cheat in class? (not anonymous) (15% yes)

  7. Simple Random Sample (SRS) • An SRS of size n consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected • Chosen by chance • Allows neither favoritism by the sampler nor self-selection by respondents • Randomization can come from a computer, calculator, or table of random digits • Choose an SRS in 4 steps • Label • Table/Calc • Stopping Rule • Identify Sample • Class example using calculator and table of random digits
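The four steps above can be sketched in Python. This is a minimal illustration, not the class example: the roster `students`, the function name `simple_random_sample`, and the seed are all hypothetical, and the computer's random number generator stands in for the calculator or table of random digits.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return an SRS of size n: every set of n individuals is equally likely."""
    rng = random.Random(seed)
    # Step 1 (Label): give each individual a numeric label 0..N-1.
    labels = list(range(len(population)))
    # Steps 2-3 (Table/Calc, Stopping Rule): draw distinct labels at random
    # and stop once n different labels have been chosen.
    chosen = rng.sample(labels, n)
    # Step 4 (Identify Sample): translate labels back into individuals.
    return [population[i] for i in chosen]

# Hypothetical class roster of 30 students.
students = [f"student_{i:02d}" for i in range(1, 31)]
sample = simple_random_sample(students, 5, seed=1)
print(sample)
```

Because `random.sample` draws without replacement, no student can be picked twice, which mirrors skipping repeated labels when reading a table of random digits.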

  8. Probability Sample • Sample chosen by chance • Must know what samples are possible and what probability each sample has • Includes: SRS, Stratified, Cluster… • Stratified Random Sample • Divide population into groups of individuals, called strata, that are similar in some way that is important to the response • Choose a separate SRS in each stratum and combine these SRSs to form full sample • Ex: It may be beneficial to divide students into 4 strata: Frosh/Soph/Jr/Sr
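A stratified random sample can be sketched as one SRS per stratum, combined at the end. The roster, the function name `stratified_sample`, and the choice of 3 students per class year are assumptions made only for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(individuals, stratum_of, n_per_stratum, seed=None):
    """Take a separate SRS inside each stratum and combine them."""
    rng = random.Random(seed)
    # Divide the population into strata.
    strata = defaultdict(list)
    for person in individuals:
        strata[stratum_of(person)].append(person)
    # Choose an SRS within each stratum, then pool the results.
    combined = []
    for name in sorted(strata):
        combined.extend(rng.sample(strata[name], n_per_stratum))
    return combined

# Hypothetical roster of (name, class year) pairs, 10 students per year.
years = ["Frosh", "Soph", "Jr", "Sr"]
roster = [(f"s{i}", years[i % 4]) for i in range(40)]
picked = stratified_sample(roster, lambda p: p[1], 3, seed=2)
print(picked)
```

Every class year is guaranteed exactly 3 students in the sample, which an SRS of 12 from the whole roster would not guarantee.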

  9. Cluster Sample • Divide the population into groups, or clusters • Clusters are randomly selected • All individuals in the chosen clusters are selected to be in the sample • Benefit: Practical • Ex: To survey households in all counties of CA • Multistage Sample • Select successively smaller groups within a population in stages, resulting in a sample of clusters of individuals • Select a sample from the roughly 3000 U.S. counties • Select a sample of towns in chosen counties • Select a sample of blocks within chosen towns • Select a sample of households within chosen blocks
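The difference between a cluster sample and a multistage sample can be sketched as follows. The `blocks` data, both function names, and the sizes are hypothetical, and the multistage version is cut down to two stages to keep the example short.

```python
import random

def cluster_sample(clusters, k, seed=None):
    """Randomly pick k clusters and keep EVERY individual in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), k)
    return [member for c in chosen for member in clusters[c]]

def two_stage_sample(clusters, k, m, seed=None):
    """Randomly pick k clusters, then an SRS of m individuals inside each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), k)
    return [member for c in chosen for member in rng.sample(clusters[c], m)]

# Hypothetical town: 8 blocks with 10 households each.
blocks = {f"block_{b}": [f"b{b}_house_{h}" for h in range(10)] for b in range(8)}

all_households_in_3_blocks = cluster_sample(blocks, 3, seed=3)
four_households_per_block = two_stage_sample(blocks, 3, 4, seed=3)
print(len(all_households_in_3_blocks), len(four_households_per_block))
```

A full multistage design would repeat the second step at each level (counties, towns, blocks, households).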

  10. Cautions about Sample Surveys • Undercoverage: Some groups of the population are left out of the process of choosing the sample • Nonresponse: An individual chosen for the sample can’t be contacted or doesn’t cooperate • Response Bias: Respondents lie, bias of interviewers, poor wording, etc. • Inference about the Population • Statistical Inference: produces answers to specific questions, along with a statement of how confident we can be that the answers are correct • Using chance to choose a sample eliminates bias in the selection of the sample • It is unlikely that the results from a sample are exactly the same as for the entire population • Larger random samples give more precise results
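The last point, that larger random samples give results closer to the truth, can be checked with a small simulation. This is only a sketch: the true proportion of 0.6, the sample sizes 25 and 400, and the function name are assumptions chosen for illustration.

```python
import random
import statistics

def spread_of_sample_proportions(p_true, n, repetitions=2000, seed=0):
    """Draw many random samples of size n and report how much the
    sample proportion varies from sample to sample (standard deviation)."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(repetitions):
        # Each individual says "yes" with probability p_true.
        yes_count = sum(rng.random() < p_true for _ in range(n))
        proportions.append(yes_count / n)
    return statistics.pstdev(proportions)

small = spread_of_sample_proportions(0.6, 25)
large = spread_of_sample_proportions(0.6, 400)
print(f"spread with n=25:  {small:.3f}")
print(f"spread with n=400: {large:.3f}")
```

The spread shrinks as n grows, but a larger sample does not repair undercoverage, nonresponse, or response bias.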

  11. Observational Study • Observe individuals and measure variables of interest but do NOT attempt to influence the responses • Ex: Sample Surveys Example of a possible non-survey observational study: Do people wash their hands after going to the bathroom? How would you conduct the study?

  12. 5.2 Designing Experiments • Observational Study • Observe individuals and measure variables of interest but do NOT attempt to influence the responses • Ex: Sample Surveys • Experiment • Deliberately impose some treatment on individuals in order to observe their responses • Provides control that reduces the effect of lurking variables • Cause and Effect? • Well-designed experiments are the only fully convincing way to establish causation • Why not always Experiments? • Unethical or impossible to impose treatments • Ex: Smoking during pregnancy and child’s subsequent IQ • Some explanatory variables are inherent traits

  13. Observational Study or Experiment? Does listening to music while studying improve student test scores? • Scenario 1: 40 students are randomly assigned to two groups, where one group listens to music while studying the causes of WW2 and the other group does not listen to music while studying. The students take a test the next day and the results of the two groups are examined. • Answer: Experiment • Scenario 2: After giving his students a test on the causes of WW2, a teacher hands out a survey that asks students anonymously whether they listened to music while studying. He compares the results of the test to their answers. • Answer: Observational Study

  14. Variables • Explanatory Variable (Factor) • Could be more than one • Ex: Listening to music • Response Variable • Ex: Student test scores • Experimental Units • Individuals on which experiment is done • If units are human, they are called subjects • Ex: Students • Treatment • Condition applied to the units • Ex: 2 Treatments: Music, No Music

  15. Basic principles of experimental design • Control: Comparing 2 or more treatments • Called Comparative Experiments • Helps to reduce the effect of lurking variables • Randomization: Use chance to assign subjects to treatments • Replication: Apply each treatment to many units to reduce the role of chance variation • Placebo • Dummy treatment • Placebo Effect: Patients respond favorably to a dummy treatment, due to trust in the treatment or because some medical issues improve without treatment

  16. Experimental Design: Does vitamin C intake affect # colds? • One Factor Experiment • 100 Subjects -> Random Allocation -> Treatment: Vitamin C or Placebo -> Compare
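The random allocation step in this design can be sketched in a few lines. Subject IDs 1 to 100, the function name `randomize`, and the equal 50/50 split are assumptions; the slide gives only the total of 100 subjects.

```python
import random

def randomize(subjects, treatments, seed=None):
    """Shuffle the subjects, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

subjects = list(range(1, 101))  # hypothetical subject IDs
groups = randomize(subjects, ["Vitamin C", "Placebo"], seed=4)
print({t: len(members) for t, members in groups.items()})
```

Because chance alone decides who gets vitamin C, lurking variables such as age or general health tend to balance out between the two groups.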

  17. Experimental Design: Do vitamin C and Echinacea intake affect # colds? • 2 Factor Experiment: Use a 2 Way Table

  18. Experimental Design: Do vitamin C and Echinacea intake affect # colds? • Random Allocation of subjects to four groups • G1: V/E • G2: V/P • G3: P/E • G4: P/P • Compare Results (V = vitamin C, E = Echinacea, P = placebo)
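In a two-factor experiment, each treatment is one combination of factor levels, which is what the two-way table lays out. A rough sketch, assuming 100 subjects split evenly over the four groups (the slide does not state the group sizes):

```python
import random
from itertools import product

# Levels of each factor: the real substance or its placebo.
vitamin_levels = ["V", "P"]      # vitamin C or placebo
echinacea_levels = ["E", "P"]    # Echinacea or placebo

# Each cell of the two-way table is one treatment.
treatments = [f"{v}/{e}" for v, e in product(vitamin_levels, echinacea_levels)]

rng = random.Random(6)
subjects = list(range(1, 101))   # hypothetical subject IDs
rng.shuffle(subjects)

# Deal the shuffled subjects into the four treatment groups.
groups = {t: subjects[i::len(treatments)] for i, t in enumerate(treatments)}
for name, members in groups.items():
    print(name, len(members))
```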

  19. Experimental Design: Test Multiple Levels of Vitamin C intake: 100 mg, 500 mg, 1000 mg • Random Allocation of subjects to four groups • G1: 100 mg • G2: 500 mg • G3: 1000 mg • G4: Placebo • Compare Results

  20. Two Models for Comparative Experiments: Treatment-Observation or Observation-Treatment-Observation • Treatment -> Observation • Ex: Vit C or Placebo • Observation -> Treatment -> Observation • Ex: PSAT Scores -> Course, Book, or None -> SAT Scores • What would yield more accurate results: 6 subjects or 100? • How would you assign the subjects to the treatments?

  21. Statistical Significance • An observed effect too large to attribute plausibly to chance • Cautions about Experimentation • Always be critical when examining results • Watch for lack of realism • Was there any hidden bias? • Double Blind Experiment • Neither the subjects nor the people who have contact with them know which treatment each subject received

  22. Blocked Experimental Designs • Completely randomized designs are the simplest statistical designs for experiments • A block is a group of subjects that are known before the experiment to be similar in some way that is expected to affect the response to the treatment • In a block design, the random assignment of units to treatments is carried out separately within each block • Controls the effects of lurking variables by bringing them into the experiment to form blocks • Allows more precise overall conclusions

  23. Experimental Design: Block Design • Ex: Compare the effectiveness of 3 TV commercials for the same product • What could be a lurking variable? Gender • Instead of a completely randomized design that treats all males and females as one group, block by gender first

  24. Blocked for Gender • Women -> Random Assignment -> G1: Ad 1, G2: Ad 2, G3: Ad 3 -> Compare • Men -> Random Assignment -> G1: Ad 1, G2: Ad 2, G3: Ad 3 -> Compare
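The block design for the three commercials can be sketched as random assignment carried out separately inside each block. The viewer list, the block sizes of 30, and the function name `block_randomize` are hypothetical.

```python
import random

def block_randomize(units, block_of, treatments, seed=None):
    """Randomly assign treatments separately within each block."""
    rng = random.Random(seed)
    # Form the blocks first (here: by gender).
    blocks = {}
    for unit in units:
        blocks.setdefault(block_of(unit), []).append(unit)
    # Then randomize within each block, never across blocks.
    assignment = {}
    for block in sorted(blocks):
        members = list(blocks[block])
        rng.shuffle(members)
        for i, unit in enumerate(members):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical viewers: 30 women and 30 men.
viewers = [(f"w{i}", "Women") for i in range(30)] + [(f"m{i}", "Men") for i in range(30)]
plan = block_randomize(viewers, lambda v: v[1], ["Ad 1", "Ad 2", "Ad 3"], seed=5)

for gender in ["Women", "Men"]:
    for ad in ["Ad 1", "Ad 2", "Ad 3"]:
        count = sum(1 for (name, g), a in plan.items() if g == gender and a == ad)
        print(gender, ad, count)
```

Each ad is seen by the same number of women and the same number of men, so responses are compared within each gender before any overall conclusion is drawn.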

  25. Matched Pairs Design • Type of Block design • Compares just two treatments • Version 1: Each subject receives both treatments in a random order • Ex: Coke vs. Pepsi • Each subject receives both treatments • Randomize the order • Use some sort of labeling (A, B) • Version 2: The subjects are matched in pairs as closely as possible and each subject in a pair receives one of the two treatments
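For the version in which each subject receives both treatments, the only thing randomized is the order. A minimal sketch, with hypothetical taster names and a hypothetical function name:

```python
import random

def randomize_orders(subjects, treatments=("Coke", "Pepsi"), seed=None):
    """For each subject, pick at random which treatment comes first."""
    rng = random.Random(seed)
    orders = {}
    for subject in subjects:
        order = list(treatments)
        rng.shuffle(order)          # random order for this subject only
        orders[subject] = tuple(order)
    return orders

tasters = [f"taster_{i}" for i in range(1, 11)]  # hypothetical subjects
orders = randomize_orders(tasters, seed=7)
for taster, (first, second) in orders.items():
    print(f"{taster}: first {first}, then {second}")
```

In a real taste test the cups would carry neutral labels such as A and B, so that subjects cannot tell which drink they are tasting.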

  26. 5.3 Simulating Experiments • Three methods we can use to answer questions involving chance: • Method 1: Carry out the experiment many times and calculate the relative frequency of the result • Slow, costly, impractical, logistically difficult • Method 2: Develop a probability model • Chapter 6 • Method 3: Start with a model that reflects the truth about the experiment and develop a procedure for imitating (simulating) a number of repetitions of the experiment • Use a table of random digits, calculator, or computer • Quick and easy

  27. Simulation • The imitation of chance behavior based on a model that accurately reflects the experiment under consideration • Steps to Random Digit Simulation • State the problem or describe experiment • State the assumptions • Assign digits to represent outcomes • Simulate many repetitions • State conclusions
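The five steps can be walked through on a small problem. The problem itself is an assumption chosen for illustration and does not come from the slides: how likely is a run of at least three heads or three tails in a row when a fair coin is tossed 10 times?

```python
import random

# Step 1 (State the problem): toss a coin 10 times; what is the chance of
# a run of at least 3 consecutive heads or 3 consecutive tails?
# Step 2 (State the assumptions): heads and tails are equally likely,
# and tosses are independent.

def has_run_of_three(tosses):
    """True if some three consecutive tosses are all the same."""
    return any(tosses[i] == tosses[i + 1] == tosses[i + 2]
               for i in range(len(tosses) - 2))

def simulate(repetitions=10000, seed=0):
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        # Step 3 (Assign digits): odd digit = heads, even digit = tails.
        digits = [rng.randrange(10) for _ in range(10)]
        tosses = ["H" if d % 2 == 1 else "T" for d in digits]
        # Step 4 (Simulate many repetitions): count how often a run occurs.
        hits += has_run_of_three(tosses)
    return hits / repetitions

# Step 5 (State conclusions): the relative frequency estimates the probability.
estimate = simulate()
print(f"Estimated probability of a run of 3 or more: {estimate:.3f}")
```

With many repetitions the estimate should settle near the exact value 846/1024, about 0.83, which is often higher than people guess.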
