# STAT E100

##### Presentation Transcript

1. STAT E100 Section Week 8 – Intro to Inference and Reviewing the Binomial Distribution

2. Review • The exams are graded; see your TA to get your midterm exam! • If you did not do as well as you hoped, remember that the midterm is worth 20% (undergraduates) or 15% (graduate students). • Homework is worth 30% (undergraduates) or 25% (graduate students). • Interested in forming a study group?

3. Sample Question #1 Harvard College is composed of about 52% women. For all students currently enrolled in Stat 104, 172 of the 381 students are women. Use this information for the following problem. a) What is the expected number of women in a simple random sample of 381 students from Harvard College? What is the standard deviation? b) What is the probability of selecting exactly 172 women in a sample of 381 students from Harvard College? c) What is the probability of selecting 172 women or fewer in a sample of 381 students from Harvard College? d) What is the probability of selecting 172 women or fewer in a sample of 381 students from Harvard College [use software]?

4. Sample Question #1 Harvard College is composed of about 52% women. For all students currently enrolled in Stat 104, 172 of the 381 students are women. Use this information for the following problem. a) What is the expected number of women in a simple random sample of 381 students from Harvard College? What is the standard deviation? Expected number = np = 381 × 0.52 = 198.12; standard deviation = √(np(1−p)) = √(381 × 0.52 × 0.48) = 9.75. b) What is the probability of selecting exactly 172 women in a sample of 381 students from Harvard College? c) What is the probability of selecting 172 women or fewer in a sample of 381 students from Harvard College? d) What is the probability of selecting 172 women or fewer in a sample of 381 students from Harvard College [use software]?

5. Sample Question #1 Harvard College is composed of about 52% women. For all students currently enrolled in Stat 104, 172 of the 381 students are women. Use this information for the following problem. a) What is the expected number of women in a simple random sample of 381 students from Harvard College? What is the standard deviation? Expected number = np = 381 × 0.52 = 198.12; standard deviation = √(np(1−p)) = √(381 × 0.52 × 0.48) = 9.75. b) What is the probability of selecting exactly 172 women in a sample of 381 students from Harvard College? $P(X = 172) = \binom{381}{172}(0.52)^{172}(0.48)^{209} = 0.0011$ c) What is the probability of selecting 172 women or fewer in a sample of 381 students from Harvard College? d) What is the probability of selecting 172 women or fewer in a sample of 381 students from Harvard College [use software]?

6. Sample Question #1 Harvard College is composed of about 52% women. For all students currently enrolled in Stat 104, 172 of the 381 students are women. Use this information for the following problem. a) What is the expected number of women in a simple random sample of 381 students from Harvard College? What is the standard deviation? Expected number = np = 381 × 0.52 = 198.12; standard deviation = √(np(1−p)) = √(381 × 0.52 × 0.48) = 9.75. b) What is the probability of selecting exactly 172 women in a sample of 381 students from Harvard College? $P(X = 172) = \binom{381}{172}(0.52)^{172}(0.48)^{209} = 0.0011$ c) What is the probability of selecting 172 women or fewer in a sample of 381 students from Harvard College? P(X ≤ 172) means we should sum the probabilities of each possible number of women up to 172: P(X=172) + P(X=171) + P(X=170) + P(X=169) + … + P(X=0). YIKES! Or we can use the normal approximation to the binomial distribution: X ~ Bin(n = 381, p = 0.52) → N(μx = 198.12, σx = 9.75). z-score = (172 − 198.12)/9.75 = −2.68 → from the z-table, p = 0.0037. d) What is the probability of selecting 172 women or fewer in a sample of 381 students from Harvard College [use software]?

7. Sample Question #1 Harvard College is composed of about 52% women. For all students currently enrolled in Stat 104, 172 of the 381 students are women. Use this information for the following problem. a) What is the expected number of women in a simple random sample of 381 students from Harvard College? What is the standard deviation? Expected number = np = 381 × 0.52 = 198.12; standard deviation = √(np(1−p)) = √(381 × 0.52 × 0.48) = 9.75. b) What is the probability of selecting exactly 172 women in a sample of 381 students from Harvard College? $P(X = 172) = \binom{381}{172}(0.52)^{172}(0.48)^{209} = 0.0011$ c) What is the probability of selecting 172 women or fewer in a sample of 381 students from Harvard College? P(X ≤ 172) means we should sum the probabilities of each possible number of women up to 172: P(X=172) + P(X=171) + P(X=170) + P(X=169) + … + P(X=0). YIKES! Or we can use the normal approximation to the binomial distribution: X ~ Bin(n = 381, p = 0.52) → N(μx = 198.12, σx = 9.75). z-score = (172 − 198.12)/9.75 = −2.68 → from the z-table, p = 0.0037. d) What is the probability of selecting 172 women or fewer in a sample of 381 students from Harvard College [use software]? P(X≤172) = 0.0032 http://www.stat.tamu.edu/~west/applets/binomialdemo.html
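The calculations in Sample Question #1 can be checked with a short script. This is a minimal sketch using only the Python standard library; the helper names `binom_pmf` and `binom_cdf` are illustrative and not part of the course materials. The normal approximation here uses no continuity correction, matching the z-score on the slide. An exact sum can differ slightly from a table-based or applet-based figure, for example depending on whether the boundary value 172 is included.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 381, 0.52   # sample size and proportion of women, from the slide
k = 172            # number of women observed in Stat 104

mean = n * p                   # expected count, n*p
sd = sqrt(n * p * (1 - p))     # binomial standard deviation


def binom_pmf(x, n, p):
    """Exact binomial probability P(X = x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(x, n, p):
    """Exact binomial probability P(X <= x), summed term by term."""
    return sum(binom_pmf(i, n, p) for i in range(x + 1))


p_exactly_172 = binom_pmf(k, n, p)     # part (b)
z = (k - mean) / sd                    # part (c): z-score, no continuity correction
p_normal_approx = NormalDist().cdf(z)  # part (c): normal approximation to P(X <= 172)
p_exact_cdf = binom_cdf(k, n, p)       # part (d): the sum the slide calls "YIKES"

print(f"mean = {mean:.2f}, sd = {sd:.2f}")
print(f"P(X = 172)              = {p_exactly_172:.4f}")
print(f"z                       = {z:.2f}")
print(f"P(X <= 172), normal     = {p_normal_approx:.4f}")
print(f"P(X <= 172), exact sum  = {p_exact_cdf:.4f}")
```

The term-by-term sum that is impractical by hand is only 173 terms for a computer, which is why part (d) asks for software.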

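8. Key Equations: For large n, the sample proportion p̂ is approximately normal: $\hat{p} \approx N\left(p, \sqrt{p(1-p)/n}\right)$. The main implications of the result • The sampling distribution of p̂ is centered over the true population proportion, p. • Note the formula of the standard deviation of p̂: $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$. • What happens as n increases? • The standard deviation also depends on the unknown parameter p.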
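To see what happens as n increases, the standard deviation of the sample proportion, √(p(1−p)/n), can be tabulated for a few sample sizes. This is a small sketch; the function name `sd_p_hat` and the value p = 0.5 are chosen only for illustration.

```python
from math import sqrt


def sd_p_hat(p, n):
    """Standard deviation of the sample proportion: sqrt(p(1-p)/n)."""
    return sqrt(p * (1 - p) / n)


p = 0.5  # hypothetical true proportion, for illustration only
for n in (25, 100, 400, 1600):
    # Each fourfold increase in n cuts the standard deviation in half.
    print(f"n = {n:5d}   sd(p_hat) = {sd_p_hat(p, n):.4f}")
```

Because n sits under a square root, halving the spread of p̂ takes four times as many observations.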

9. More Key Concepts: • The population standard deviation (typically denoted by σ) is the theoretical standard deviation of a probability distribution. • A sample standard deviation (denoted by ‘s’) is the usual standard deviation of a set of numbers. • The standard deviation of a statistic is the theoretical standard deviation of the sampling distribution for a statistic (Ex: σ/√n) • The standard error of a statistic is the estimated (from data) standard deviation for a sampling distribution, after any unknown parameters have been replaced by their estimates. (Ex: s/√n)
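The difference between the standard deviation of a statistic and its standard error can be illustrated for the sample mean. In this sketch the data values and the "known" σ are made up for illustration; in practice σ is usually unknown, which is why the standard error is what gets computed from data.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample, invented for this illustration.
data = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.4]

n = len(data)
x_bar = mean(data)
s = stdev(data)          # sample standard deviation (divides by n - 1)
std_error = s / sqrt(n)  # standard error of the mean: s / sqrt(n)

sigma = 0.8                   # hypothetical population sd, assumed known here
sd_of_mean = sigma / sqrt(n)  # standard deviation of the mean: sigma / sqrt(n)

print(f"x_bar = {x_bar:.3f}, s = {s:.3f}")
print(f"standard error (estimated)      = {std_error:.3f}")
print(f"standard deviation (theoretical) = {sd_of_mean:.3f}")
```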

10. Key Inference Concepts and Equations: Two main inferential techniques: • Confidence Intervals: for estimating values of population parameters • Hypothesis Testing: for deciding whether the population supports a specific idea/model/hypothesis

11. Sample Question #2 A university dean is interested in determining the proportion of students who receive some sort of financial aid. The dean randomly selects 200 students and finds that 118 of them are receiving financial aid. The 95% confidence interval for p is 0.59 ± 0.057. Interpret this interval. • We are 95% confident that between 53% and 65% of the sampled students receive some sort of financial aid. • We are 95% confident that 59% of the students are on some sort of financial aid. • 95% of the students get between 53% and 65% of their tuition paid for by financial aid. • We are 95% confident that the true proportion of all students receiving financial aid is between 0.53 and 0.65.

12. Sample Question #2 A university dean is interested in determining the proportion of students who receive some sort of financial aid. The dean randomly selects 200 students and finds that 118 of them are receiving financial aid. The 95% confidence interval for p is 0.59 ± 0.057. Interpret this interval. • We are 95% confident that between 53% and 65% of the sampled students receive some sort of financial aid. • We are 95% confident that 59% of the students are on some sort of financial aid. • 95% of the students get between 53% and 65% of their tuition paid for by financial aid. • We are 95% confident that the true proportion of all students receiving financial aid is between 0.53 and 0.65. Answer: the last statement. A confidence interval is a statement about the population parameter p, so it describes the true proportion of all students, not the sampled students, a single point value, or the share of tuition covered.
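The slide does not show how the interval in Sample Question #2 was computed. The sketch below assumes the usual large-sample formula, p̂ ± z*·√(p̂(1−p̂)/n); the function name `wald_interval` is illustrative. Under that assumption, z* = 1.96 gives a margin near 0.068, while the stated margin of 0.057 corresponds to z* ≈ 1.645. The interpretation question is unaffected either way.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes, n, confidence):
    """Return (p_hat, margin) for the large-sample interval p_hat +/- z* * SE."""
    p_hat = successes / n
    std_error = sqrt(p_hat * (1 - p_hat) / n)  # estimated sd of p_hat
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return p_hat, z_star * std_error


p_hat, margin_95 = wald_interval(118, 200, 0.95)
_, margin_90 = wald_interval(118, 200, 0.90)

print(f"p_hat = {p_hat:.2f}")
print(f"95% margin = {margin_95:.3f}")
print(f"90% margin = {margin_90:.3f}")
```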

13. Sample Question #3 Hillary Clinton’s campaign manager is interested in determining Massachusetts’ public opinion toward her candidacy for president in 2016. She randomly samples 160 registered voters in Massachusetts. a) Assuming 60% of all voters in MA truly would vote for Hillary, what is the sampling distribution for p̂? b) What is the probability that p̂ would be greater than 0.50? c) How many observations would she have to sample so that the probability in part (b) would be 97.5%?

14. Sample Question #3 Hillary Clinton’s campaign manager is interested in determining Massachusetts’ public opinion toward her candidacy for president in 2016. She randomly samples 160 registered voters in Massachusetts. a) Assuming 60% of all voters in MA truly would vote for Hillary, what is the sampling distribution for p̂? N(0.6, 0.0387), from the normal approximation for p̂, with standard deviation √(0.6 × 0.4/160) = 0.0387. b) What is the probability that p̂ would be greater than 0.50? c) How many observations would she have to sample so that the probability in part (b) would be 97.5%?

15. Sample Question #3 Hillary Clinton’s campaign manager is interested in determining Massachusetts’ public opinion toward her candidacy for president in 2016. She randomly samples 160 registered voters in Massachusetts. a) Assuming 60% of all voters in MA truly would vote for Hillary, what is the sampling distribution for p̂? N(0.6, 0.0387), from the normal approximation for p̂, with standard deviation √(0.6 × 0.4/160) = 0.0387. b) What is the probability that p̂ would be greater than 0.50? z-score = (0.5 − 0.6)/0.0387 = −2.58; from the z-table, P(Z < −2.58) = 0.0049, so the probability = 1 − 0.0049 = 0.9951, or 99.51%. c) How many observations would she have to sample so that the probability in part (b) would be 97.5%?

16. Sample Question #3 Hillary Clinton’s campaign manager is interested in determining Massachusetts’ public opinion toward her candidacy for president in 2016. She randomly samples 160 registered voters in Massachusetts. a) Assuming 60% of all voters in MA truly would vote for Hillary, what is the sampling distribution for p̂? N(0.6, 0.0387), from the normal approximation for p̂, with standard deviation √(0.6 × 0.4/160) = 0.0387. b) What is the probability that p̂ would be greater than 0.50? z-score = (0.5 − 0.6)/0.0387 = −2.58; from the z-table, P(Z < −2.58) = 0.0049, so the probability = 1 − 0.0049 = 0.9951, or 99.51%. c) How many observations would she have to sample so that the probability in part (b) would be 97.5%? From the z-table, the z-score with 97.5% of the area above it is −1.96, so $-1.96 = \dfrac{0.5 - 0.6}{\sqrt{p(1-p)/n}}$, which gives $\dfrac{p(1-p)}{n} = \left(\dfrac{-0.1}{-1.96}\right)^2$ and $n = \dfrac{p(1-p)}{(-0.1/-1.96)^2} = \dfrac{0.6(1-0.6)}{(-0.1/-1.96)^2} \approx 92.2$, or about 92 observations (rounding up to 93 guarantees a probability of at least 97.5%).
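The three parts of Sample Question #3 can be reproduced with the standard library's `NormalDist`. This is a sketch under the same normal approximation the slides use; the variable names are illustrative. The unrounded sample-size calculation comes out close to 92.2 (intermediate rounding can shift the second decimal), and rounding up is the conservative choice when the probability must reach at least 97.5%.

```python
from math import ceil, sqrt
from statistics import NormalDist

p, n = 0.60, 160  # assumed true proportion and sample size, from the slide

# Part (a): approximate sampling distribution of p_hat.
sd_p_hat = sqrt(p * (1 - p) / n)
sampling_dist = NormalDist(mu=p, sigma=sd_p_hat)

# Part (b): P(p_hat > 0.50) under that distribution.
prob_above_half = 1 - sampling_dist.cdf(0.50)

# Part (c): solve z* = (0.50 - p) / sqrt(p(1-p)/n) for n,
# where z* leaves 97.5% of the area above it (about -1.96).
target = 0.975
z_star = NormalDist().inv_cdf(1 - target)
n_exact = p * (1 - p) / ((0.50 - p) / z_star) ** 2
n_needed = ceil(n_exact)  # round up so the target probability is met

print(f"sd(p_hat)       = {sd_p_hat:.4f}")
print(f"P(p_hat > 0.50) = {prob_above_half:.4f}")
print(f"z*              = {z_star:.2f}")
print(f"n (unrounded)   = {n_exact:.2f}, rounded up = {n_needed}")
```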