
Probability



  1. Probability Lecture 2

  2. Probability • Why did we spend last class talking about probability? • How do we use this?

  3. You’re the FDA • A company wants you to approve a new drug • They run an experimental trial • 40 people have the disease • 20 get drug, 20 get placebo • Random assignment • Conducted perfectly

  4. You’re the FDA • Results: • Placebo group: 10 of 20 live • Drug group: 11 of 20 live • Does the drug work? • Would you approve it? • Why or why not?

  5. You’re the FDA • Different study, same design • Results: • Placebo: 2 of 20 live • Drug: 18 of 20 live • Does the drug work? • Would you approve it? • Why or why not?

  6. You’re the FDA • Different study, same design • Results: • Placebo: 8 of 20 live • Drug: 12 of 20 live • Does the drug work? • Would you approve it? • Why or why not? • How big of a difference do we need?

  7. Why probability • Probability provides the answer • A set of agreed-upon rules • All based on mathematical formulas

  8. Example • How many of you would accept the following wager: • If no two people in the class have the same birthday (month and day) you get an automatic A. • If two or more people in class have the same birthday, you get an automatic F. • Not ethical for me to accept the wager

  9. Example • Would you have won?

  10. Example • Would you have won? • What is the probability? • Not 60/365 • Think of the complement • How many possible pairs are there in the class? • Me and each student = N • First student and every other student = N-1 • Second student and every remaining student = N-2 • … • Last two students = 1 • Total = N + (N-1) + … + 1 = N(N+1)/2 = (59 × 60)/2 = 1770 (N = 59 students, 60 people counting me)
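The pair count on this slide can be checked directly. This is a minimal sketch, assuming the class holds 59 students plus the instructor (60 people), which is the size implied by the 1770 pairs; the variable names are illustrative only.

```python
import math

students = 59            # assumption: class size implied by the 1770 pairs on the slide
people = students + 1    # the students plus the instructor

# Count pairs the way the slide does: N + (N-1) + ... + 1
pairs_by_sum = sum(range(1, students + 1))

# The same count written as "60 choose 2"
pairs_by_formula = math.comb(people, 2)

print(pairs_by_sum, pairs_by_formula)  # 1770 1770
```

Both routes give the same number because adding one person at a time to a group creates one new pair with each person already there.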

  11. Example • P that any given pair matches is 1/365 = 0.00274 • P that the pair doesn’t match is 1-0.00274 • = 0.99726 • We have 1770 pairs. • Remember the rule: for independent events, multiply the probabilities • Joint probability of all not matching is: • P(first pair not match)*P(second pair not match)*…*P(last pair not match)

  12. What is random? • What are the odds that the first flip is a heads? • ½ • Each outcome is equally likely • The second flip? • ½ • So what are the odds that both are? • Four outcomes: • HH, HT, TH, TT • so ¼ (each equally likely)

  13. What is random? • Odds the third flip is a heads? • ½ • Odds that all three are heads? • 8 outcomes • HHH, HHT, HTH, HTT, THH, THT, TTH, TTT • So, 1/8 • Odds the fourth flip is a heads? • ½ • All four? • 1/16

  14. What is random? • Odds that five in a row are heads? • 1/32 • Odds that six in a row? • 1/64 • Written as decimals, the probabilities for one through six heads in a row are: • 0.5 • 0.25 • 0.125 • 0.0625 • 0.03125 • 0.015625 • Each is the previous probability multiplied by 0.5
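The halving pattern on these slides can be confirmed by listing every equally likely sequence and counting the all-heads ones. This is a small sketch assuming a fair coin and independent flips; the function name `prob_all_heads` is made up for illustration.

```python
from itertools import product

def prob_all_heads(n_flips):
    """Probability that n_flips fair-coin flips are all heads, by enumeration."""
    outcomes = list(product("HT", repeat=n_flips))        # every equally likely sequence
    all_heads = [o for o in outcomes if set(o) == {"H"}]  # only the HH...H sequence
    return len(all_heads) / len(outcomes)

for n in range(1, 7):
    # Enumeration agrees with multiplying 0.5 by itself n times
    print(n, prob_all_heads(n), 0.5 ** n)
```

Counting outcomes and multiplying probabilities are two views of the same rule: each extra flip doubles the number of sequences, so the single all-heads sequence becomes half as likely.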

  15. Example • P that any given pair matches is 1/365 = 0.00274 • P that the pair doesn’t match is 1-0.00274 • = 0.99726 • We have 1770 pairs. • Remember the rule: for independent events, multiply the probabilities • Joint probability of all not matching is: • P(first pair not match)*P(second pair not match)*…*P(last pair not match) • = 0.99726^1770 • ≈ 0.008 • So it is very likely (about 0.992) that at least one pair would match
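The calculation above treats all 1770 pairs as independent, which is a convenient approximation. The sketch below compares it with the exact count, assuming 60 people, 365 equally likely birthdays, and no leap years; these assumptions and the variable names are illustrative, not taken from the lecture.

```python
import math

people = 60                       # assumption: 59 students plus the instructor
pairs = math.comb(people, 2)      # 1770 pairs

# Slide's approach: every pair gets an independent 364/365 chance of not matching
p_no_match_pairs = (364 / 365) ** pairs

# Exact approach: each new person must avoid all birthdays already taken
p_no_match_exact = 1.0
for k in range(people):
    p_no_match_exact *= (365 - k) / 365

print(round(p_no_match_pairs, 4))   # roughly 0.008
print(round(p_no_match_exact, 4))   # roughly 0.006
```

Either way, the chance that nobody shares a birthday is under 1%, so the wager would almost certainly have been lost.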

  16. Rules of probability and math let us determine how likely an event is. • Want to be able to determine “statistical significance” • Can we conclude that the pattern we see didn’t happen by chance?

  17. What is “statistical significance?” • First, let’s be clear about what statistical significance is NOT. • A finding that a relationship between some X and some Y is “statistically significant” does NOT mean that the relationship is “strong.” (It might be strong, but not because it’s statistically significant.)

  18. This is a common mistake • Many people think that a “statistically significant” relationship is by definition a “strong” one. In fact, many people think that “statistical significance” IS ITSELF a test of the strength of the relationship. It’s not.

  19. Then what is statistical significance? • It is a probabilistic statement—typically, 95% confidence—that the relationship we observe in the sample, no matter how strong or weak, exists in the population.

  20. But, as always… • We could be wrong. If there really is no relationship in the population, a sample will still show one this strong about 5% of the time.

  21. How do we demonstrate statistical significance? • We perform something called “hypothesis testing.” • We actually begin with a statement called the “null hypothesis.” It is always a statement that there is not a relationship between two variables.

  22. Why a Null Hypothesis? • We want to know if there is a relationship • Our theory is not strong enough to tell us how large the effect is • Theory: Gender helps determine vote choice • Hypothesis: Women were more likely to vote for Obama than men were • Problem: How much more? We don’t know. • How large of a difference would be big enough?

  23. Null Hypothesis • Big enough to not happen by chance • Ok, but how much is enough to be “not by chance?” • This is where probability comes in • Anything is possible; the normal distribution is unbounded.

  24. Null Hypothesis • Everything may be possible, but everything is not probable • We want to know the probability that a relationship could exist in the data by chance

  25. Example: Gender Gap

  26. Probability • If we make some assumptions we can calculate how probable any outcome is. • What do we assume? • There is no difference between treatments • What the probability distribution is (this is technical and I will tell you what matters). • With these, we can calculate P(data occurred by chance).
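To make "P(data occurred by chance)" concrete, here is a sketch applied to the three FDA trials from the earlier slides. It assumes the drug does nothing, so the survivors are fixed and random assignment simply deals them into two groups of 20. This is a one-sided Fisher exact test, a technique the lecture does not name; the function `p_value_drug_at_least` is hypothetical.

```python
import math

def p_value_drug_at_least(drug_live, placebo_live, group_size=20):
    """P(drug group has at least `drug_live` survivors) if the drug does nothing.

    Under "no difference", the survivors are fixed and random assignment
    just deals them into two groups (a hypergeometric distribution).
    """
    total = 2 * group_size
    survivors = drug_live + placebo_live
    ways_total = math.comb(total, group_size)   # all ways to pick the drug group
    p = 0.0
    for x in range(drug_live, min(group_size, survivors) + 1):
        # x survivors land in the drug group; the other group_size - x did not survive
        ways = math.comb(survivors, x) * math.comb(total - survivors, group_size - x)
        p += ways / ways_total
    return p

# (drug survivors, placebo survivors) for the three trials on the earlier slides
for drug, placebo in [(11, 10), (18, 2), (12, 8)]:
    print(drug, placebo, p_value_drug_at_least(drug, placebo))
```

Under these assumptions the 11-versus-10 result is a coin toss, the 18-versus-2 result is vanishingly rare, and the 12-versus-8 result falls in between.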

  27. Probability • But that isn’t exactly what we want to know. • We want to know the probability that there is a difference; what we can calculate is the probability of seeing data like ours if there is no difference. • Unfortunately, that is as good as we can do

  28. Probability • So, what is the null hypothesis (since that is where this started)? • It is the hypothesis that there is no relationship (thus, “null”). This is what we can test. • It is the inverse of what we want to know. • So, if our theory is right, the null hypothesis is wrong and we will reject the null hypothesis. • If our theory is wrong, we will fail to reject the null hypothesis (often loosely called “accepting” it)

  29. What does this mean? • How likely are we to see this by chance? • If there were no difference between genders, the probability of seeing a difference at least this large would be 0.01

  30. 0.01 • That is pretty unlikely, but what does that mean? • One of three things occurred • The data are wrong • We were really unlucky • The assumption of no relationship is wrong • Conclusion is the last one. We have a relationship.

  31. Probability • How unlikely does the null have to be for us to reject it? • 1 out of 20 (5%) • Why? • Vestige of pre-computer days • Norm
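As a rough illustration of the 1-in-20 cutoff, take the coin flips from the earlier slides and treat "the coin is fair" as the null hypothesis. That framing is an assumption for illustration and is not used in the lecture. The sketch finds how many heads in a row it takes before the run is rarer than 5%.

```python
threshold = 0.05   # the 1-in-20 convention from the slide

n_flips = 1
# Keep adding flips until an all-heads run is rarer than the threshold
while 0.5 ** n_flips >= threshold:
    n_flips += 1

print(n_flips, 0.5 ** n_flips)  # 5 0.03125
```

Four heads in a row (0.0625) would not clear the conventional bar, but five in a row (0.03125) would.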
