STAT 101 Dr. Kari Lock Morgan. Essential Synthesis. SECTION 4.4, 4.5, ES A, ES B Connecting bootstrap and randomization (4.4) Connecting intervals and tests (4.5) Review ( Ch 1-4). Exam Details. Wednesday, 2/26
STAT 101 Dr. Kari Lock Morgan Essential Synthesis • SECTION 4.4, 4.5, ES A, ES B • Connecting bootstrap and randomization (4.4) • Connecting intervals and tests (4.5) • Review (Ch 1-4)
Exam Details • Wednesday, 2/26 • Closed to everything except one double-sided page of notes prepared by you (no sharing) and a non-cell phone calculator • Best ways to prepare: • #1: WORK LOTS OF PROBLEMS! • Make a good page of notes • Read sections you are still confused about • Come to office hours and clarify confusion • Covers chapters 1-4 (except 2.6) and anything covered in lecture
Practice Problems • Practice exam online (under resources) • Solutions to odd essential synthesis and review problems online (under resources) • Solutions to all odd problems in the book on reserve at Perkins
Office Hours and Help • Monday 4–6pm: Stephanie Sun, Old Chem 211A • Tuesday 3:30–5pm (extra): Prof Morgan, Old Chem 216 • Tuesday 5-7pm: Wenjing Shi (new TA), Old Chem 211A • Tuesday 7-9pm: Mao Hu, Old Chem 211A • REVIEW SESSION: 5 – 6 pm Tuesday (if we can get a room… I’ll keep you posted)
Review from Last Class You will all do a hypothesis test for Project 1. If all of you are doing tests for which the nulls are true, about how many of you will get statistically significant results using α = 0.05? (There are 110 students in the class.) • 110 • 105 • 6 • 0 • Answer: about 6, since 0.05 × 110 = 5.5
Multiple Testing When multiple hypothesis tests are conducted, the chance that at least one test incorrectly rejects a true null hypothesis increases with the number of tests. If the null hypotheses are all true, about a proportion α of the tests will yield statistically significant results just by random chance.
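A minimal sketch of the arithmetic behind this slide, assuming the tests are independent and that each true-null test is significant with probability exactly α (both are simplifying assumptions, and the function names are made up for illustration):

```python
import random

def prob_at_least_one_false_positive(num_tests, alpha=0.05):
    """Chance that at least one of num_tests independent tests rejects a true null."""
    return 1 - (1 - alpha) ** num_tests

def simulate_false_positives(num_tests, alpha=0.05, seed=0):
    """Count significant results in one simulated batch where every null is true.

    Assumption: each test independently comes out significant with
    probability alpha when its null hypothesis is true.
    """
    rng = random.Random(seed)
    return sum(rng.random() < alpha for _ in range(num_tests))

# Expected number of false positives in a class of 110 students
expected = 0.05 * 110
print(expected)
print(prob_at_least_one_false_positive(110))
print(simulate_false_positives(110))
```

With 110 tests the expected count is 5.5, and under these assumptions it is almost certain that at least one test comes out significant by chance alone.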
Multiple Comparisons • Consider a topic that is being investigated by research teams all over the world • Using α = 0.05, about 5% of teams are going to find something significant, even if the null hypothesis is true
Multiple Comparisons • Consider a research team/company doing many hypothesis tests • Using α = 0.05, about 5% of tests are going to be significant, even if the null hypotheses are all true
Multiple Comparisons • This is a serious problem • The most important thing is to be aware of this issue, and not to trust claims that are obviously one of many tests (unless they specifically mention an adjustment for multiple testing) • There are ways to account for this (e.g. Bonferroni’s Correction), but these are beyond the scope of this class
Publication Bias • Publication bias refers to the fact that usually only the significant results get published • The one study that turns out significant gets published, and no one knows about all the insignificant results • This, combined with the problem of multiple comparisons, can yield very misleading results
Jelly Beans Cause Acne! http://xkcd.com/882/
Connections • Today we’ll make connections between… • Chapter 1: Data collection (random sampling?, random assignment?) • Chapter 2: Which statistic is appropriate, based on the variable(s)? • Chapter 3: Bootstrapping and confidence intervals • Chapter 4: Randomization distributions and hypothesis tests
Randomization Distribution For a randomization distribution, each simulated sample should… • be consistent with the null hypothesis • use the data in the observed sample • reflect the way the data were collected
Randomized Experiments • In randomized experiments the “randomness” is the random allocation to treatment groups • If the null hypothesis is true, the response values would be the same, regardless of treatment group assignment • To simulate what would happen just by random chance, if H0 were true: • reallocate cases to treatment groups, keeping the response values the same
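The reallocation idea on this slide can be sketched as follows. This is only an illustration: the response values are made up, the statistic (difference in means, upper tail) is one possible choice, and `reallocation_p_value` is a hypothetical helper name.

```python
import random
from statistics import mean

def reallocation_p_value(treatment, control, num_sims=5000, seed=0):
    """Upper-tail p-value for the difference in means, by reallocating cases.

    Response values stay fixed; only the group labels are re-randomized,
    which is what would happen by chance if H0 were true.
    """
    rng = random.Random(seed)
    observed = mean(treatment) - mean(control)
    pooled = list(treatment) + list(control)
    n_t = len(treatment)
    count = 0
    for _ in range(num_sims):
        rng.shuffle(pooled)                      # random reallocation to groups
        diff = mean(pooled[:n_t]) - mean(pooled[n_t:])
        if diff >= observed:                     # at least as extreme as observed
            count += 1
    return count / num_sims

# Hypothetical response values, made up for illustration only
treatment = [12, 15, 11, 18, 14, 16]
control = [10, 9, 13, 8, 11, 12]
p = reallocation_p_value(treatment, control)
print(p)
```

The group sizes are kept the same in every simulated reallocation, mirroring how the original experiment assigned cases.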
Observational Studies • In observational studies, the “randomness” is random sampling from the population • To simulate what would happen, just by random chance, if H0 were true: • Simulate resampling from a population in which H0 is true • How do we simulate resampling from a population when we only have sample data? • Bootstrap! • How can we generate randomization samples for observational studies? • Make H0 true, then bootstrap!
Body Temperatures • μ = average human body temperature • H0: μ = 98.6 • Ha: μ ≠ 98.6 • We can make the null true just by adding 98.6 – 98.26 = 0.34 to each value, to make the mean be 98.6 • Bootstrapping from this revised sample lets us simulate samples, assuming H0 is true!
Body Temperatures • In StatKey, when we enter the null hypothesis, this shifting is automatically done for us • StatKey p-value = 0.002
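A rough sketch of "make H0 true, then bootstrap" for a single mean. The temperatures below are made up (chosen only so that their mean is 98.26, like the sample mean quoted above); they are not the lecture's data, so the p-value will not match StatKey's. Measuring extremeness by absolute distance from the null value is one convention for a two-tailed test, assumed here.

```python
import random
from statistics import mean

def shift_bootstrap_p_value(sample, null_mean, num_sims=5000, seed=0):
    """Two-tailed p-value for H0: mu = null_mean via shift-then-bootstrap."""
    rng = random.Random(seed)
    xbar = mean(sample)
    # Shift every value so the sample mean equals the null value
    shifted = [x + (null_mean - xbar) for x in sample]
    observed_gap = abs(xbar - null_mean)
    extreme = 0
    for _ in range(num_sims):
        # Resample with replacement from the shifted sample
        resample = rng.choices(shifted, k=len(shifted))
        if abs(mean(resample) - null_mean) >= observed_gap:
            extreme += 1
    return extreme / num_sims

# Hypothetical temperatures (NOT the data from the lecture)
temps = [98.0, 98.4, 97.9, 98.6, 98.2, 98.3, 97.8, 98.8, 98.1, 98.5]
p = shift_bootstrap_p_value(temps, null_mean=98.6)
print(mean(temps), p)
```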
Exercise and Gender • H0: μ_m = μ_f, Ha: μ_m > μ_f • How might we make the null true? • One way (of many): add 3 to every female's value • Bootstrap from this modified sample • In StatKey, the default randomization method is “Reallocate Groups”, but “Shift Groups” is also an option, and will do this
Exercise and Gender p-value = 0.095
Exercise and Gender The p-value is 0.095. Using α = 0.05, we conclude… • Males exercise more than females, on average • Males do not exercise more than females, on average • Nothing • Answer: Nothing. We do not reject the null, so we can’t conclude anything.
Blood Pressure and Heart Rate • H0: ρ = 0, Ha: ρ < 0 • Two variables have correlation 0 if they are not associated. We can “break the association” by randomly permuting/scrambling/shuffling one of the variables • Each time we do this, we get a sample we might observe just by random chance, if there really is no correlation
Blood Pressure and Heart Rate Even if blood pressure and heart rate are not correlated, we would see correlations this extreme about 22% of the time, just by random chance. p-value = 0.219
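The shuffling approach for a correlation can be sketched like this. The paired values are invented for illustration, the helper names are hypothetical, and the lower tail is used to match Ha: ρ < 0.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def permutation_p_value_negative(xs, ys, num_sims=5000, seed=0):
    """Lower-tail p-value for H0: rho = 0 vs Ha: rho < 0 by shuffling ys."""
    rng = random.Random(seed)
    observed = pearson_r(xs, ys)
    shuffled = list(ys)
    count = 0
    for _ in range(num_sims):
        rng.shuffle(shuffled)                    # break any association
        if pearson_r(xs, shuffled) <= observed:  # as or more negative
            count += 1
    return count / num_sims

# Hypothetical paired measurements, made up for illustration only
blood_pressure = [120, 135, 110, 142, 128, 117, 150, 125]
heart_rate = [78, 70, 85, 72, 80, 74, 66, 82]
p = permutation_p_value_negative(blood_pressure, heart_rate)
print(pearson_r(blood_pressure, heart_rate), p)
```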
Randomization Distribution • Paul the Octopus or ESP (single proportion): flip a coin or roll a die • Cocaine Addiction (randomized experiment): rerandomize cases to treatment groups, keeping response values fixed • Body Temperature (single mean): shift to make H0 true, then bootstrap • Exercise and Gender (observational study): shift to make H0 true, then bootstrap • Blood Pressure and Heart Rate (correlation): randomly permute/scramble/shuffle one variable
Body Temperature • We created a bootstrap distribution for average body temperature by resampling with replacement from the original sample (sample mean 98.26)
Body Temperature • We also created a randomization distribution to see if average body temperature differs from 98.6°F by adding 0.34 to every value to make the null true, and then resampling with replacement from this modified sample
Body Temperature • These two distributions are identical (up to random variation from simulation to simulation) except for the center • The bootstrap distribution is centered around the sample statistic, 98.26, while the randomization distribution is centered around the null hypothesized value, 98.6 • The randomization distribution is equivalent to the bootstrap distribution, but shifted over
Bootstrap and Randomization Distributions • Big difference: a randomization distribution assumes H0 is true, while a bootstrap distribution does not
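To see the "same shape, different center" relationship concretely, here is a small sketch on made-up temperatures (mean 98.26, not the lecture's data). Reusing the same random seed for both simulations is purely an illustrative device: it makes both runs draw the same cases, so the two distributions differ by exactly the shift. In practice the two simulations are run independently and agree only up to random variation.

```python
import random
from statistics import mean

def resampled_means(values, num_sims, seed):
    """Means of num_sims resamples drawn with replacement from values."""
    rng = random.Random(seed)
    n = len(values)
    return [mean(rng.choices(values, k=n)) for _ in range(num_sims)]

# Hypothetical temperatures (NOT the data from the lecture)
temps = [98.0, 98.4, 97.9, 98.6, 98.2, 98.3, 97.8, 98.8, 98.1, 98.5]
null_mean = 98.6
shift = null_mean - mean(temps)                  # about 0.34

# Bootstrap distribution: resample the original data (H0 not assumed)
bootstrap = resampled_means(temps, 2000, seed=1)
# Randomization distribution: resample the shifted data (H0 assumed true)
randomization = resampled_means([t + shift for t in temps], 2000, seed=1)

gaps = [r - b for r, b in zip(randomization, bootstrap)]
print(mean(bootstrap), mean(randomization))
```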
Which Distribution? • Let μ be the average amount of sleep college students get per night. Data were collected on a sample of students, and for this sample x̄ = 6.7 hours. • A bootstrap distribution is generated to create a confidence interval for μ, and a randomization distribution is generated to see if the data provide evidence that μ > 7. • Which distribution below is the bootstrap distribution? • Answer: (a), which is centered around the sample statistic, 6.7
Which Distribution? • Intro stat students are surveyed, and we find that 152 out of 218 are female. Let p be the proportion of intro stat students at that university who are female. • A bootstrap distribution is generated for a confidence interval for p, and a randomization distribution is generated to see if the data provide evidence that p > 1/2. • Which distribution is the randomization distribution? • Answer: (a), which is centered around the null value, 1/2
Intervals and Tests • A confidence interval represents the range of plausible values for the population parameter • If the null hypothesized value IS NOT within the CI, it is not a plausible value and should be rejected • If the null hypothesized value IS within the CI, it is a plausible value and should not be rejected
Intervals and Tests If a 95% CI misses the parameter in H0, then a two-tailed test should reject H0 at a 5% significance level. If a 95% CI contains the parameter in H0, then a two-tailed test should not reject H0 at a 5% significance level.
Body Temperatures • Using bootstrapping, we found a 95% confidence interval for the mean body temperature to be (98.05, 98.47) • This does not contain 98.6, so at α = 0.05 we would reject H0 for the hypotheses • H0: μ = 98.6 • Ha: μ ≠ 98.6
Both Father and Mother “Does a child need both a father and a mother to grow up happily?” Let p be the proportion of adults aged 18-29 in 2010 who say yes. A 95% CI for p is (0.487, 0.573). Testing H0: p = 0.5 vs Ha: p ≠ 0.5 with α = 0.05, we • Reject H0 • Do not reject H0 • Reject Ha • Do not reject Ha • Answer: Do not reject H0. 0.5 is within the CI, so it is a plausible value for p. http://www.pewsocialtrends.org/2011/03/09/for-millennials-parenthood-trumps-marriage/#fn-7199-1
Both Father and Mother “Does a child need both a father and a mother to grow up happily?” Let p be the proportion of adults aged 18-29 in 1997 who say yes. A 95% CI for p is (0.533, 0.607). Testing H0: p = 0.5 vs Ha: p ≠ 0.5 with α = 0.05, we • Reject H0 • Do not reject H0 • Reject Ha • Do not reject Ha • Answer: Reject H0. 0.5 is not within the CI, so it is not a plausible value for p. http://www.pewsocialtrends.org/2011/03/09/for-millennials-parenthood-trumps-marriage/#fn-7199-1
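The interval-test connection used in the three examples above reduces to a single check: is the null value inside the 95% interval? A minimal sketch (the function name is made up for illustration; with simulation-based intervals and tests the agreement is approximate rather than exact):

```python
def two_tailed_decision(ci_low, ci_high, null_value):
    """Decision of a two-tailed test at alpha = 0.05 implied by a 95% CI."""
    if ci_low <= null_value <= ci_high:
        return "do not reject H0"   # null value is plausible
    return "reject H0"              # null value is not plausible

print(two_tailed_decision(98.05, 98.47, 98.6))   # body temperature
print(two_tailed_decision(0.487, 0.573, 0.5))    # 2010 survey
print(two_tailed_decision(0.533, 0.607, 0.5))    # 1997 survey
```

The same check works for other confidence levels as long as the significance level matches, for example a 99% interval with α = 0.01.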
Intervals and Tests • Confidence intervals are most useful when you want to estimate population parameters • Hypothesis tests and p-values are most useful when you want to test hypotheses about population parameters • Confidence intervals give you a range of plausible values; p-values quantify the strength of evidence against the null hypothesis
Interval, Test, or Neither? Is the following question best assessed using a confidence interval, a hypothesis test, or is statistical inference not relevant? On average, how much more do adults who played sports in high school exercise than adults who did not play sports in high school? • Confidence interval • Hypothesis test • Statistical inference not relevant
Interval, Test, or Neither? Is the following question best assessed using a confidence interval, a hypothesis test, or is statistical inference not relevant? Do a majority of adults riding a bicycle wear a helmet? • Confidence interval • Hypothesis test • Statistical inference not relevant
Interval, Test, or Neither? Is the following question best assessed using a confidence interval, a hypothesis test, or is statistical inference not relevant? On average, were the players on the 2014 Canadian Olympic hockey team older than the players on the 2014 US Olympic hockey team? • Confidence interval • Hypothesis test • Statistical inference not relevant
Summary • Using α = 0.05, about 5% of all hypothesis tests will lead to rejecting the null, even if all the null hypotheses are true • Randomization samples should be generated: consistent with the null hypothesis, using the observed data, and reflecting the way the data were collected • If a null hypothesized value lies inside a 95% CI, a two-tailed test using α = 0.05 would not reject H0 • If a null hypothesized value lies outside a 95% CI, a two-tailed test using α = 0.05 would reject H0
The Big Picture • [Diagram relating Population, Sampling, Sample, Descriptive statistics, and Statistical Inference]
Cases and Variables We obtain information about cases or units. A variable is any characteristic that is recorded for each case. • Generally each case makes up a row in a dataset, and each variable makes up a column • Variables are either categorical or quantitative
Sampling • Sampling bias occurs when the method of selecting a sample causes the sample to differ from the population in some relevant way. • If sampling bias exists, we cannot generalize from the sample to the population • To avoid sampling bias, select a random sample
Sampling • [Diagram: Population and Sample] • GOAL: Select a sample that is similar to the population, only smaller
Observational Studies • A third variable that is associated with both the explanatory variable and the response variable is called a confounding variable • There are almost always confounding variables in observational studies • Observational studies can almost never be used to establish causation