Statistical Estimation for Inferences: Methodology and Applications
Learn about point estimates, interval estimation, hypothesis testing, confidence intervals, and more in statistical inferences. Explore estimation of mean, hypothesis testing methodologies, and the central limit theorem.
Unit 3: Statistical Inferences • Wenyaw Chan, Division of Biostatistics, School of Public Health, University of Texas - Health Science Center at Houston
Estimation • Point Estimates • A point estimate of a parameter θ is a single number used as an estimate of the value of θ. • e.g. A natural estimate of the population mean µ is the sample mean $\bar{X}$. • Interval Estimation • If a random interval I=(L,U) satisfies Pr(L<θ<U)=1-α, then the observed values of L and U for a given sample are called a 1-α confidence interval estimate for θ. Which one is more accurate? Which one is more precise?
Estimation What to estimate? • B(n, p): proportion p • Poisson(µ): mean µ • N(µ, σ²): mean µ and/or variance σ²
Estimation of the Mean of a Distribution • A point estimator of the population mean µ is the sample mean $\bar{X}$. • The sampling distribution of $\bar{X}$ is the distribution of values of $\bar{X}$ over all possible samples of size n that could have been selected from the reference population.
Estimation • An estimator of a parameter is unbiased if its expectation is equal to the parameter. • Note: Unbiasedness alone is not sufficient as the only criterion for choosing an estimator. • The unbiased estimator with the minimum variance (MVUE) is preferred. • If the population is normal, then $\bar{X}$ is the MVUE of µ.
Sample Mean • Standard error (of the mean) = standard deviation of the sample mean = $\sigma/\sqrt{n}$ • The estimated standard error is $s/\sqrt{n}$, where s is the sample standard deviation.
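The estimated standard error can be computed directly from a sample. The sketch below is illustrative only: the function name and the data values are hypothetical, and it assumes the usual definition of s with n - 1 in the denominator.

```python
import math
import statistics


def estimated_standard_error(sample):
    """Return s / sqrt(n), the estimated standard error of the sample mean."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return s / math.sqrt(n)


# Hypothetical data, for illustration only.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print("sample mean:", statistics.mean(data))
print("estimated standard error:", estimated_standard_error(data))
```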
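Central Limit Theorem • Let X1,…,Xn be a random sample from some population with mean µ and variance σ². Then, for large n, $\bar{X}$ is approximately distributed as $N(\mu, \sigma^2/n)$, even if the underlying population is not normal.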
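Interval Estimation • Let X1,…,Xn be a random sample from a normal population N(µ, σ²). If σ² is known, a 95% confidence interval (C.I.) for µ is $(\bar{X} - 1.96\,\sigma/\sqrt{n},\ \bar{X} + 1.96\,\sigma/\sqrt{n})$. Why? Because $(\bar{X}-\mu)/(\sigma/\sqrt{n})$ follows N(0, 1), so $\Pr\left(-1.96 < (\bar{X}-\mu)/(\sigma/\sqrt{n}) < 1.96\right) = 0.95$, and solving the inequalities for µ gives the interval.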
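As a minimal sketch of the known-variance interval, the function below computes $\bar{x} \pm z\,\sigma/\sqrt{n}$, where z is the upper normal quantile for the chosen confidence level (about 1.96 for 95%). The function name and the numbers in the usage line are hypothetical.

```python
import math
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, conf_level=0.95):
    """Two-sided confidence interval for the mean when sigma is known."""
    # Quantile leaving (1 - conf_level) / 2 in each tail; about 1.96 for 95%.
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical summary statistics, for illustration only.
lower, upper = z_confidence_interval(xbar=100.0, sigma=15.0, n=25)
print(f"95% C.I.: ({lower:.2f}, {upper:.2f})")
```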
Interval Estimation Interpretation of Confidence Interval • Over the collection of 95% confidence intervals that could be constructed from repeated random samples of size n, 95% of them will contain the parameter µ. • It is wrong to say: There is a 95% chance that the parameter will fall within a particular 95% confidence interval. Once an interval has been computed, both the interval and the parameter are fixed, so the interval either contains the parameter or it does not.
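The repeated-sampling interpretation can be checked by simulation. The sketch below draws many samples from a normal population with known σ, builds a 95% interval from each, and reports the fraction that cover the true mean. All parameter values, the function name, and the seed are hypothetical choices for illustration.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, reps, conf_level=0.95, seed=0):
    """Fraction of simulated known-sigma intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps


# The reported fraction should be close to 0.95, but not exactly equal to it.
print(coverage(mu=120.0, sigma=10.0, n=25, reps=2000))
```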
Interval Estimation • Note: • When σ and n are fixed, a 99% C.I. is wider than a 95% C.I. • If the width of the C.I. is specified, the sample size can be determined. • As n increases, the length decreases; as σ increases, the length increases.
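One way to determine the sample size from a specified width, assuming σ is known: the interval length is $2z\sigma/\sqrt{n}$, so $n = (2z\sigma/\text{width})^2$, rounded up. The function name and inputs below are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_width(sigma, width, conf_level=0.95):
    """Smallest n whose known-sigma C.I. is no wider than `width`."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return math.ceil((2 * z * sigma / width) ** 2)


# Hypothetical inputs: sigma = 10, target total width = 5.
print(sample_size_for_width(10.0, 5.0))        # 95% level
print(sample_size_for_width(10.0, 5.0, 0.99))  # 99% level needs a larger n
```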
Hypothesis Testing • Null hypothesis (H0): the statement to be tested, usually reflecting the status quo. • Alternative hypothesis (H1): the logical complement of H0. • Note: the null hypothesis is analogous to the defendant in court. It is presumed to be true unless the data argue overwhelmingly to the contrary.
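Hypothesis Testing • Four possible outcomes of the decision: • H0 is true and H0 is not rejected: correct decision • H0 is true and H0 is rejected: Type I error • H1 is true and H0 is not rejected: Type II error • H1 is true and H0 is rejected: correct decision • Notation: α = Pr(Type I error) = level of significance • β = Pr(Type II error) • 1-β = power = Pr(reject H0 | H1 is true)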
Hypothesis Testing • Goal: to make α and β both small • Facts: for a fixed sample size, if α decreases then β increases, and if β decreases then α increases • General Strategy: fix α, minimize β
Testing for the Population Mean • When the sample is from a normal population: H0: µ = 120 vs H1: µ < 120 • The best test is based on $\bar{X}$, which is called the test statistic. The "best test" means that the test has the highest power among all tests with a given type I error. Is there any bad test? Yes. • Rejection Region: • the range of values of the test statistic for which H0 is rejected.
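One-tailed test • Our rejection region is $\bar{X} < c$ for some cutoff c. • Now, c is chosen so that the type I error is α, i.e. $\Pr(\bar{X} < c \mid H_0) = \alpha$.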
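Result • To test H0: µ = µ0 vs H1: µ < µ0, based on a sample taken from a normal population with mean µ and unknown variance σ², the test statistic is $t = (\bar{x}-\mu_0)/(s/\sqrt{n})$. • Assume the level of significance is α and let $t_{n-1,\,\alpha}$ denote the 100α-th percentile of the t distribution with n-1 degrees of freedom. Then, • if $t < t_{n-1,\,\alpha}$, we reject H0. • if $t \ge t_{n-1,\,\alpha}$, we do not reject H0.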
P-value • The minimum α-level at which we can reject H0 based on the sample. • The p-value can also be thought of as the probability of obtaining a test statistic as extreme as or more extreme than the actual test statistic obtained from the sample, given that the null hypothesis is true.
Remarks • Two different approaches to determining statistical significance: • Critical value method • P-value method.
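One-tailed test • Testing H0: µ=µ0 vs H1: µ>µ0 when σ is unknown and the population is normal • Test Statistic: $t = (\bar{x}-\mu_0)/(s/\sqrt{n})$ • Rejection Region: $t > t_{n-1,\,1-\alpha}$, the 100(1-α)-th percentile of the t distribution with n-1 degrees of freedom • p-value = $1 - F_{t,n-1}(t)$, where $F_{t,n-1}(\cdot)$ is the cdf of the t distribution with df=n-1. • Note: If σ is known, the s in the test statistic is replaced by σ, $t_{n-1,\,1-\alpha}$ in the rejection region is replaced by $z_{1-\alpha}$, and $F_{t,n-1}(t)$ is replaced by Φ(t).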
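The known-σ case in the note can be sketched with the standard library alone, since it only needs the normal cdf Φ. The function below is a hypothetical helper, not part of the original slides; it returns the z statistic and the p-value for an upper-tailed, lower-tailed, or two-sided alternative. The unknown-σ case would use the t cdf instead, which the Python standard library does not provide.

```python
import math
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n, alternative="greater"):
    """One-sample z test with known sigma; returns (z, p_value)."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    phi = NormalDist().cdf(z)  # Phi(z)
    if alternative == "greater":    # H1: mu > mu0
        p_value = 1 - phi
    elif alternative == "less":     # H1: mu < mu0
        p_value = phi
    elif alternative == "two-sided":
        p_value = 2 * min(phi, 1 - phi)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, p_value


# Hypothetical summary statistics, for illustration only.
z, p = z_test(xbar=103.0, mu0=100.0, sigma=15.0, n=25)
print(f"z = {z:.3f}, one-sided p-value = {p:.4f}")
```

With the p-value method, H0 is rejected when the p-value is below α; with the critical value method, H0 is rejected when z exceeds $z_{1-\alpha}$. Both lead to the same decision.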
Testing For Two-Sided Alternative • Let X1,…,Xn be a random sample from the population N(µ, σ²), where σ² is unknown. • H0: µ=µ0 vs H1: µ≠µ0 • Test Statistic: $t = (\bar{x}-\mu_0)/(s/\sqrt{n})$ • Rejection Region: $|t| > t_{n-1,\,1-\alpha/2}$ • p-value = $2F_{t,n-1}(t)$ if t ≤ 0; $2[1 - F_{t,n-1}(t)]$ if t > 0 (see figures on next slide). • Warning: the exact p-value requires the use of a computer.
Testing For Two-Sided Alternative • [Figures: p-value when $\bar{x} > \mu_0$; p-value when $\bar{x} \le \mu_0$]
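The Power of A Test • To test H0: µ=µ0 vs H1: µ<µ0 in a normal population with known variance σ², the power at the true mean µ1 is $\Phi\left(z_{\alpha} + (\mu_0-\mu_1)\sqrt{n}/\sigma\right)$, where $z_{\alpha}$ is the 100α-th percentile of N(0, 1). • Review: Power = Pr[rejecting H0 | H0 is false] • Factors Affecting the Power: the power increases as α increases, as |µ0-µ1| increases, as σ decreases, and as n increases.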
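A minimal sketch of the power calculation for this lower-tailed test, assuming the standard known-variance expression $\Phi(z_{\alpha} + (\mu_0-\mu_1)\sqrt{n}/\sigma)$ with $z_{\alpha}$ the lower 100α-th normal percentile. The function name and the numeric inputs are hypothetical.

```python
import math
from statistics import NormalDist


def power_lower_tailed_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the level-alpha z test of H0: mu = mu0 vs H1: mu < mu0 at mu1."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(alpha)  # lower percentile, negative for small alpha
    return nd.cdf(z_alpha + (mu0 - mu1) * math.sqrt(n) / sigma)


# Hypothetical values: the power grows with n and with the distance mu0 - mu1.
for n in (10, 25, 50):
    print(n, round(power_lower_tailed_z(120.0, 115.0, 10.0, n), 4))
```

When µ1 equals µ0 the function returns α, which matches the definition of the significance level.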
The Power of The 1-Sample T Test • To test H0: µ=µ0 vs H1: µ<µ0 in a normal population with unknown variance σ², the power at α = .05, for true mean µ1 and true s.d. σ, is $F(t_{n-1,\,.05})$, where F( ) is the c.d.f. of the non-central t distribution with df=n-1 and non-centrality parameter $(\mu_1-\mu_0)\sqrt{n}/\sigma$.
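Power Function For Two-Sided Alternative • To test H0: µ=µ0 vs H1: µ≠µ0 in a normal population with known variance σ², the power is $\Phi\left(-z_{1-\alpha/2} + (\mu_0-\mu_1)\sqrt{n}/\sigma\right) + \Phi\left(-z_{1-\alpha/2} + (\mu_1-\mu_0)\sqrt{n}/\sigma\right)$, where µ1 is the true mean under the alternative.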
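The two-sided power can be sketched the same way, assuming the standard known-variance result that the test rejects in either tail, so the power is the sum of two normal probabilities. The function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist


def power_two_sided_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the level-alpha two-sided z test of H0: mu = mu0 at true mean mu1."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)
    shift = (mu0 - mu1) * math.sqrt(n) / sigma
    # Probability of rejecting in the lower tail plus the upper tail.
    return nd.cdf(-z + shift) + nd.cdf(-z - shift)


# Hypothetical values; the power is symmetric in the sign of mu1 - mu0.
print(round(power_two_sided_z(120.0, 115.0, 10.0, 25), 4))
print(round(power_two_sided_z(120.0, 125.0, 10.0, 25), 4))
```

For alternatives far from µ0, one of the two terms is negligible, which is why a single-term approximation is often used in practice.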
Case of Unknown Variance • For the same test in a population with unknown variance, the power is $F(-t_{n-1,\,1-\alpha/2}) + 1 - F(t_{n-1,\,1-\alpha/2})$, where F( ) is the c.d.f. of the non-central t distribution with df=n-1 and non-centrality parameter $(\mu_1-\mu_0)\sqrt{n}/\sigma$.
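Sample Size Determination • For example: H0: µ=µ0 vs H1: µ<µ0 • Power: $1-\beta = \Phi\left(z_{\alpha} + (\mu_0-\mu_1)\sqrt{n}/\sigma\right)$ • Hence, $n = \sigma^2 (z_{1-\alpha} + z_{1-\beta})^2 / (\mu_0-\mu_1)^2$.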
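Factors Affecting Sample Size • 1. The sample size increases as σ² increases. 2. The sample size increases as α decreases. 3. The sample size increases as the required power 1-β increases. 4. The sample size decreases as |µ0-µ1| increases. • To test H0: µ=µ0 vs H1: µ≠µ0, where σ² is known, the sample size calculation is $n = \sigma^2 (z_{1-\alpha/2} + z_{1-\beta})^2 / (\mu_0-\mu_1)^2$ (approximately).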
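A sketch of the sample size calculation for known σ², assuming the standard formula $n = \sigma^2 (z_{1-\alpha^*} + z_{1-\beta})^2/(\mu_0-\mu_1)^2$ with α* = α for a one-sided test and α* = α/2 for a two-sided test, rounded up to the next integer. The function name and the inputs are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_z(mu0, mu1, sigma, alpha=0.05, power=0.80, two_sided=True):
    """Sample size for a one-sample z test to reach the requested power at mu1."""
    nd = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_a = nd.inv_cdf(1 - tail)   # z_{1-alpha/2} or z_{1-alpha}
    z_b = nd.inv_cdf(power)      # z_{1-beta}
    return math.ceil(sigma ** 2 * (z_a + z_b) ** 2 / (mu0 - mu1) ** 2)


# Hypothetical design: detect a shift from 120 to 115 with sigma = 10.
print(sample_size_z(120.0, 115.0, 10.0))                   # two-sided
print(sample_size_z(120.0, 115.0, 10.0, two_sided=False))  # one-sided
```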
Relationship between Hypothesis Testing and Confidence Interval • To test H0: µ=µ0 vs H1: µ≠µ0, H0 is rejected by a two-sided level-α test if and only if the two-sided 100(1-α)% confidence interval for µ does not contain µ0.
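Exact Method • To test H0: p = p0 vs H1: p ≠ p0 with x events observed in n trials and $\hat{p} = x/n$: • If $\hat{p} < p_0$, the p-value $= \min\left\{2\sum_{k=0}^{x}\binom{n}{k}p_0^{k}(1-p_0)^{n-k},\ 1\right\}$ • If $\hat{p} \ge p_0$, the p-value $= \min\left\{2\sum_{k=x}^{n}\binom{n}{k}p_0^{k}(1-p_0)^{n-k},\ 1\right\}$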
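A sketch of an exact two-sided binomial p-value, assuming the common rule of doubling the tail probability on the side where the observed proportion falls and capping the result at 1. This rule, the function names, and the example counts are assumptions for illustration; the slide's own formulas did not survive extraction.

```python
from math import comb


def binom_pmf(k, n, p):
    """Pr(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def exact_binomial_p_value(x, n, p0):
    """Two-sided exact p-value for H0: p = p0 given x events in n trials."""
    if x / n < p0:
        tail = sum(binom_pmf(k, n, p0) for k in range(0, x + 1))  # Pr(X <= x)
    else:
        tail = sum(binom_pmf(k, n, p0) for k in range(x, n + 1))  # Pr(X >= x)
    return min(2 * tail, 1.0)


# Hypothetical data: 2 events in 10 trials, tested against p0 = 0.5.
print(exact_binomial_p_value(2, 10, 0.5))
```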
One-Sample Inference for the Poisson Distribution • X ~ Poisson with mean µ • To test H0: µ=µ0 vs H1: µ≠µ0 at the α level of significance, • Obtain a two-sided 100(1-α)% C.I. for µ, say (C1, C2) • If µ0 ∈ (C1, C2), we do not reject H0; otherwise we reject H0.
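One-Sample Inference for the Poisson Distribution • The p-value (for the above two-sided test) • If the observed x < µ0, then p-value $= \min[2F(x \mid \mu_0),\ 1]$ • If the observed x > µ0, then p-value $= \min[2(1 - F(x-1 \mid \mu_0)),\ 1]$, where F(x | µ0) is the Poisson c.d.f. with mean µ0.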
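A sketch of the exact Poisson p-value, assuming the usual doubling rule: twice the lower tail Pr(X ≤ x) when x is below µ0, twice the upper tail Pr(X ≥ x) otherwise, capped at 1. The function names and example counts are hypothetical, and treating x equal to µ0 as the upper-tail case is an assumption made here for completeness.

```python
import math


def poisson_cdf(x, mu):
    """F(x | mu) = Pr(X <= x) for X ~ Poisson(mu), with x an integer."""
    if x < 0:
        return 0.0
    term = math.exp(-mu)  # Pr(X = 0)
    total = term
    for k in range(1, x + 1):
        term *= mu / k    # Pr(X = k) from Pr(X = k - 1)
        total += term
    return total


def exact_poisson_p_value(x, mu0):
    """Two-sided exact p-value for H0: mu = mu0 given an observed count x."""
    if x < mu0:
        return min(2 * poisson_cdf(x, mu0), 1.0)
    # x >= mu0 (the x == mu0 case is an assumption): upper tail Pr(X >= x).
    return min(2 * (1 - poisson_cdf(x - 1, mu0)), 1.0)


# Hypothetical counts tested against mu0 = 3.
print(exact_poisson_p_value(0, 3.0))
print(exact_poisson_p_value(8, 3.0))
```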
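Large-Sample Test for Poisson (for µ0 ≥ 10) • To test H0: µ=µ0 vs H1: µ≠µ0 at the α level of significance, • Test Statistic: $X^2 = (x-\mu_0)^2/\mu_0$, which is approximately $\chi^2_1$ under H0 • Rejection Region: $X^2 > \chi^2_{1,\,1-\alpha}$ • p-value: $\Pr(\chi^2_1 > X^2)$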
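A sketch of the large-sample test, assuming the usual normal approximation: $z = (x-\mu_0)/\sqrt{\mu_0}$ is approximately N(0, 1) under H0, and its square is the chi-square statistic with 1 degree of freedom, so the two-sided p-value equals $2[1-\Phi(|z|)]$. The function name and example counts are hypothetical.

```python
import math
from statistics import NormalDist


def poisson_large_sample_test(x, mu0):
    """Return (chi_square, p_value) for H0: mu = mu0; intended for mu0 >= 10."""
    z = (x - mu0) / math.sqrt(mu0)
    chi_square = z * z
    # A chi-square variable with 1 df is a squared standard normal, so
    # Pr(chi2_1 > z^2) = 2 * (1 - Phi(|z|)).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return chi_square, p_value


# Hypothetical data: 18 events observed, 10 expected under H0.
chi_square, p_value = poisson_large_sample_test(18, 10.0)
print(f"X^2 = {chi_square:.2f}, p-value = {p_value:.4f}")
```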