
Presentation Transcript


  1. STA 291 Spring 2010 Lecture 13 Dustin Lueker

  2. Statistical Inference: Estimation
  • Inferential statistical methods provide predictions about characteristics of a population, based on information in a sample from that population
  • Quantitative variables
    • Usually estimate the population mean
      • Mean household income
  • Qualitative variables
    • Usually estimate population proportions
      • Proportion of people voting for candidate A

  3. Two Types of Estimators
  • Point Estimate
    • A single number that is the best guess for the parameter
    • Sample mean is usually a good guess for the population mean
  • Interval Estimate
    • Point estimator with error bound
    • A range of numbers around the point estimate
    • Gives an idea about the precision of the estimator
    • The proportion of people voting for A is between 67% and 73%

  4. Point Estimator
  • A point estimator of a parameter is a sample statistic that predicts the value of that parameter
  • A good estimator is
    • Unbiased: centered around the true parameter
    • Consistent: gets closer to the true parameter as the sample size gets larger
    • Efficient: has a standard error that is as small as possible (makes use of all available information)
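A minimal simulation sketch (not part of the original slides) illustrating unbiasedness and consistency for the sample mean; the population values, sample sizes, number of repetitions, and random seed are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 50, 10            # assumed population mean and standard deviation

for n in (10, 100, 1000):
    # draw 5,000 samples of size n and compute the sample mean of each
    means = rng.normal(mu, sigma, size=(5000, n)).mean(axis=1)
    # unbiased: the average of the sample means stays near mu
    # consistent: the spread (standard error) shrinks as n grows
    print(n, round(means.mean(), 2), round(means.std(), 3))
```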

  5. Biased
  • A biased estimator systematically underestimates or overestimates the population parameter
  • The definitions of the sample variance and sample standard deviation use n-1 instead of n, because this makes the sample variance an unbiased estimator
  • With n in the denominator, it would systematically underestimate the variance
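A small sketch (assumed setup, not from the slides) of that systematic underestimation; numpy's ddof argument switches the denominator between n and n-1:

```python
import numpy as np

rng = np.random.default_rng(1)
# assumed population: normal with sigma = 5, so the true variance is 25

samples = rng.normal(0, 5, size=(100000, 10))   # many small samples of size n = 10
biased = samples.var(axis=1, ddof=0).mean()     # divide by n
unbiased = samples.var(axis=1, ddof=1).mean()   # divide by n - 1

print(biased)    # about 22.5, systematically below the true variance of 25
print(unbiased)  # about 25
```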

  6. Unbiased
  • An estimator is unbiased if its sampling distribution is centered around the true parameter
  • For example, we know that the mean of the sampling distribution of the sample mean x̄ equals μ, which is the true population mean
  • So, x̄ is an unbiased estimator of μ
  • Note: For any particular sample, the sample mean may be smaller or greater than the population mean
  • Unbiased means that there is no systematic underestimation or overestimation

  7. Efficient
  • An estimator is efficient if its standard error is small compared to other estimators
    • Such an estimator has high precision
  • A good estimator has small standard error and small bias (or no bias at all)
  • The following pictures represent different estimators with different bias and efficiency
    • Assume that the true population parameter is the point (0,0) in the middle of the picture

  8. Bias and Efficiency
  Note that even an unbiased and efficient estimator does not always hit the population parameter exactly. But in the long run, it is the best estimator.

  9. Confidence Interval
  • Inferential statements about a parameter should always provide the accuracy of the estimate
    • How close is the estimate likely to fall to the true parameter value?
    • Within 1 unit? 2 units? 10 units?
  • This can be determined using the sampling distribution of the estimator/sample statistic
  • In particular, we need the standard error to make a statement about accuracy of the estimator

  10. Confidence Interval
  • Range of numbers that is likely to cover (or capture) the true parameter
  • Probability that the confidence interval captures the true parameter is called the confidence coefficient, or more commonly the confidence level
  • Confidence level is a chosen number close to 1, usually 0.90, 0.95, or 0.99
  • Level of significance = α = 1 – confidence level

  11. Confidence Interval
  • To calculate the confidence interval, we use the Central Limit Theorem
    • Substituting the sample standard deviation for the population standard deviation
  • Also, we need a value zα/2 that is determined by the confidence level
  • Formula for the 100(1-α)% confidence interval for μ: x̄ ± zα/2 · σ/√n

  12. Common Confidence Intervals
  • 90% confidence interval
    • Confidence level of 0.90, α = .10, zα/2 = 1.645
  • 95% confidence interval
    • Confidence level of 0.95, α = .05, zα/2 = 1.96
  • 99% confidence interval
    • Confidence level of 0.99, α = .01, zα/2 = 2.576
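These tabled values can be reproduced from the standard normal distribution; a short sketch, assuming scipy is available (not a dependency named by the slides):

```python
from scipy.stats import norm

for conf in (0.90, 0.95, 0.99):
    alpha = 1 - conf
    # z_alpha/2 is the point with upper-tail area alpha/2
    z = norm.ppf(1 - alpha / 2)
    print(f"{conf:.0%} confidence: z = {z:.3f}")   # 1.645, 1.960, 2.576
```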

  13. Confidence Intervals
  • This interval will contain μ with 100(1-α)% confidence
  • If we are estimating μ, it is unreasonable to assume that we also know σ
    • Thus we replace σ by s (the sample standard deviation)
  • This formula is used for a large sample size (n ≥ 30)
  • If we have a sample size less than 30, a different distribution, the t-distribution, is used; we will get to this later

  14. Example
  • Compute a 95% confidence interval for μ if we know that s = 12 and the sample of size 36 yielded a mean of 7
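A worked version of this example; the arithmetic follows the formula above, only the code framing is added:

```python
from math import sqrt

xbar, s, n = 7, 12, 36
z = 1.96                       # z_alpha/2 for 95% confidence
me = z * s / sqrt(n)           # margin of error = 1.96 * 12 / 6 = 3.92
print(xbar - me, xbar + me)    # 95% CI: (3.08, 10.92)
```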

  15. Interpreting Confidence Intervals
  • “Probability” means that in the long run 100(1-α)% of the intervals will contain the parameter
    • If repeated samples were taken and confidence intervals calculated, then 100(1-α)% of the intervals will contain the parameter
  • For one sample, we do not know whether the confidence interval contains the parameter
  • The 100(1-α)% probability only refers to the method that is being used
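A simulation sketch of this long-run interpretation (the population, sample size, and repetition count are assumed values, not from the slides): roughly 95% of intervals built this way capture μ.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, z = 100, 15, 36, 1.96   # assumed population and sample size

hits = 0
for _ in range(10000):
    x = rng.normal(mu, sigma, n)
    me = z * x.std(ddof=1) / np.sqrt(n)          # use s in place of sigma
    hits += (x.mean() - me) <= mu <= (x.mean() + me)

print(hits / 10000)   # close to 0.95
```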

  16. Interpreting Confidence Intervals

  17. Interpreting Confidence Intervals
  • Incorrect statement
    • With 95% probability, the population mean will fall in the interval from 3.5 to 5.2
  • To avoid the misleading word “probability” we say that we are “confident”
    • We are 95% confident that the true population mean will fall between 3.5 and 5.2

  18. Confidence Intervals
  • Changing our confidence level will change our confidence interval
  • Increasing our confidence level will increase the length of the confidence interval
    • A confidence level of 100% would require a confidence interval of infinite length
      • Not informative
  • There is a tradeoff between length and accuracy
    • Ideally we would like a short interval with high accuracy (high confidence level)
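A quick illustration of the length/confidence tradeoff, reusing the assumed numbers from the earlier example (s = 12, n = 36); the set of confidence levels is illustrative:

```python
from math import sqrt
from scipy.stats import norm

s, n = 12, 36
for conf in (0.90, 0.95, 0.99, 0.999):
    z = norm.ppf(1 - (1 - conf) / 2)
    # interval length = 2 * margin of error; it grows with the confidence level
    print(f"{conf:.1%} confidence: length = {2 * z * s / sqrt(n):.2f}")
```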

  19. Choice of Sample Size
  • Start with the confidence interval formula, assuming that the population standard deviation is known, so the error bound is E = zα/2 · σ/√n
  • Mathematically we need to solve the above equation for n, which gives n = (zα/2 · σ/E)²

  20. Example
  • About how large a sample would have been adequate if we merely needed to estimate the mean to within 0.75, with 95% confidence? Assume s = 5
  • Note: We will always round the sample size up to ensure that we get within the desired error bound
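The arithmetic for this example, using the sample-size formula from the previous slide; ceil performs the round-up the note mentions:

```python
from math import ceil

z, s, E = 1.96, 5, 0.75
n = (z * s / E) ** 2
print(n, ceil(n))   # about 170.7, so use n = 171
```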
