

## Chapter 18 – Sampling Distribution Models


**How accurate is our sample?**

Sometimes different polls show different results for the same question. Since each poll samples a different group of people, we should expect some variation in the results. We could try drawing lots of samples and looking at the variation amongst those samples.

**Experiment: Simulating a sample**

• A recent US Census Bureau study reports that about 30% of Americans 25 or older have a Bachelor's degree.
• Open up a blank Minitab worksheet and let's generate some random data:
  • Calc > Random Data > Bernoulli
  • Enter 200 rows
  • Store in columns C1-C20
  • Event probability: 0.3

**Proportion estimates for samples of size 5**

• We can treat each row as a sample and calculate the proportion of each sample using the mean.
• Samples of size 5:
  • Calc > Row Statistics > Mean
  • Input variables: C1-C5
  • Store result in: C21
• Look at these sample proportions. Are they close to the population proportion of 30%?
• Draw a histogram of the sample proportions in C21

**Proportion estimates for samples of size 10**

• Samples of size 10:
  • Calc > Row Statistics > Mean
  • Input variables: C1-C10
  • Store result in: C22
• Look at these sample proportions. Are they close to the population proportion of 30%?
• Draw a histogram of the sample proportions in C22

**Proportion estimates for samples of size 20**

• Samples of size 20:
  • Calc > Row Statistics > Mean
  • Input variables: C1-C20
  • Store result in: C23
• Look at these sample proportions. Are they close to the population proportion of 30%?
• Draw a histogram of the sample proportions in C23

**Sampling Distribution Model for a Proportion**

• Our histogram of the sample proportions started to look like a Normal model
• The larger our sample size gets, the better the Normal model works
• Assumptions:
  • Independence: sampled values must be independent of each other
  • Sample size: n must be large enough

**Conditions to check for assumptions**

• Randomization Condition:
  • Experiments should have treatments randomly assigned
  • Survey samples should be a simple random sample, or otherwise a representative, unbiased sample
• 10% Condition:
  • Sample size n must be no more than 10% of the population
• Success/Failure Condition:
  • Sample size needs to be large enough to expect at least 10 successes and 10 failures

**Sampling Distribution Model for a Proportion**

If the sampled values are independent and the sample size is large enough, the sampling distribution of $\hat{p}$ is modeled by a Normal model with mean $E(\hat{p}) = p$ and standard deviation $SD(\hat{p}) = \sqrt{pq/n}$, where $q = 1 - p$.

**Example: Proportion of Vegetarians**

• 7% of the US population is estimated to be vegetarian. If a random sample of 200 people resulted in 20 people reporting themselves as vegetarians, is this an unusually high proportion?
• Conditions:
  • Randomization
  • 10% condition
  • Success/Failure

**Vegetarians Example continued**

Since our conditions were met, it's OK to use a Normal model.
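The conditions for the vegetarian example can also be checked numerically. The following is a minimal Python sketch, not part of the original slides (which use Minitab); the function names `check_proportion_conditions` and `proportion_z_score` and the optional `population_size` parameter are hypothetical and introduced only for illustration. It assumes the usual Normal-model standard deviation for a proportion, the square root of p(1 - p)/n.

```python
import math


def check_proportion_conditions(n, p, population_size=None):
    """Check the Success/Failure condition and, if a population size
    is supplied, the 10% condition. (Hypothetical helper.)"""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    result = {
        "expected_successes": expected_successes,
        "expected_failures": expected_failures,
        # Need at least 10 expected successes and 10 expected failures.
        "success_failure_ok": expected_successes >= 10 and expected_failures >= 10,
    }
    if population_size is not None:
        # Sample must be no more than 10% of the population.
        result["ten_percent_ok"] = n <= 0.10 * population_size
    return result


def proportion_z_score(p_hat, p, n):
    """z-score of an observed proportion under the Normal model."""
    sd = math.sqrt(p * (1 - p) / n)
    return (p_hat - p) / sd


# Values taken from the vegetarian example: p = 0.07, n = 200, 20 successes.
conditions = check_proportion_conditions(n=200, p=0.07)
z = proportion_z_score(p_hat=20 / 200, p=0.07, n=200)

print(conditions)
print(f"z = {z:.2f}")
```

The randomization condition cannot be verified by arithmetic; it depends on how the sample was collected, so the sketch leaves it to the reader.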
$\hat{p} = 20/200 = 0.10$

$E(\hat{p}) = p = 0.07$

$SD(\hat{p}) = \sqrt{\dfrac{(0.07)(0.93)}{200}} \approx 0.0180$

$z = \dfrac{\hat{p} - p}{SD(\hat{p})} = \dfrac{0.10 - 0.07}{0.0180} \approx 1.66$

This result is within 2 SDs of the mean, so it is not unusual.

**68-95-99.7 Rule with Vegetarians**

[Figure: Normal model centered at p, with bands at ±1σ, ±2σ, and ±3σ covering 68%, 95%, and 99.7% of sample proportions]

**Sampling Distribution of a Mean**

Rolling dice simulation (figures from DeVeaux, Intro to Stats):

• 10,000 individual rolls recorded
• Roll 2 dice 10,000 times and average the dice
• Roll 3 dice 10,000 times and average the dice
• Roll 5 dice 10,000 times and average the dice
• Roll 20 dice 10,000 times and average the dice

Once again, as the sample size increases, a Normal model appears.

**Central Limit Theorem**

• The sampling distribution of any mean becomes more nearly Normal as the sample size grows.
• The larger the sample, the better the approximation will be.
• Observations need to be independent and collected with randomization.

**CLT Assumptions**

• Assumptions:
  • Independence: sampled values must be independent
  • Sample size: sample size must be large enough
• Conditions:
  • Randomization
  • 10% Condition
  • Large enough sample

**Which Normal Model to use?**

The Normal model depends on a mean and an SD.

Sampling Distribution Model for a Mean: when a random sample of size n is drawn from any population with mean $\mu$ and standard deviation $\sigma$, its sample mean $\bar{y}$ has a sampling distribution with:

• Mean: $\mu$
• Standard deviation: $SD(\bar{y}) = \sigma/\sqrt{n}$

**Example: CEO compensation**

• Population of 800 CEOs: mean (in thousands) = 10,307.31, SD (in thousands) = 17,964.62
• Samples of size 50 were drawn with: mean = 10,343.93, SD = 2,483.84
• Samples of size 100 were drawn with: mean = 10,329.94, SD = 1,779.18

According to the CLT, what should the theoretical mean and SD be?

Example from DeVeaux, Intro to Stats
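One way to answer the CEO compensation question is to apply the sampling distribution model for a mean directly: the theoretical mean is the population mean, and the theoretical SD is the population SD divided by the square root of the sample size. The Python sketch below is an illustration only and is not part of the original slides; the function name `sampling_distribution_of_mean` is hypothetical, and the population figures are the ones quoted in the example (in thousands).

```python
import math


def sampling_distribution_of_mean(mu, sigma, n):
    """Return the theoretical (mean, SD) of the sample mean for samples
    of size n from a population with mean mu and SD sigma."""
    return mu, sigma / math.sqrt(n)


# Population values from the CEO compensation example (in thousands).
population_mean = 10_307.31
population_sd = 17_964.62

results = {}
for n in (50, 100):
    results[n] = sampling_distribution_of_mean(population_mean, population_sd, n)
    mean, sd = results[n]
    print(f"n = {n}: theoretical mean = {mean:,.2f}, theoretical SD = {sd:,.2f}")
```

The printed theoretical SDs can be compared with the SDs observed in the drawn samples (2,483.84 for n = 50 and 1,779.18 for n = 100) to see how closely the simulation tracks the model.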
