Statistical Estimation for Inferences: Methodology and Applications

Unit 3: Statistical Inferences. Wenyaw Chan, Division of Biostatistics, School of Public Health, University of Texas Health Science Center at Houston

Estimation • Point Estimates • A point estimate of a parameter θ is a single number used as an estimate of the value of θ. • e.g. A natural estimate to use for the population mean μ is the sample mean x̄. • Interval Estimation • If a random interval I = (L, U) satisfies Pr(L < θ < U) = 1 − α, the observed values of L and U for a given sample are called a 100(1 − α)% confidence interval estimate for θ. Which one is more accurate? Which one is more precise?

Estimation What to estimate? • B(n, p) → proportion • Poisson(μ) → mean • N(μ, σ²) → mean and/or variance

Estimation of the Mean of a Distribution • A point estimator of the population mean is the sample mean X̄. • The sampling distribution of X̄ is the distribution of values of x̄ over all possible samples of size n that could have been selected from the reference population.

Estimation • An estimator of a parameter is an unbiased estimator if its expectation is equal to the parameter. • Note: Unbiasedness is not sufficient as the only criterion for choosing an estimator. • The unbiased estimator with the minimum variance (MVUE) is preferred. • If the population is normal, then X̄ is the MVUE of μ.

Sample Mean • Standard error (of the mean) = standard deviation of the sample mean = σ/√n • The estimated standard error is s/√n, where s is the sample standard deviation.
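
As a small illustration of the estimated standard error s/√n, here is a minimal Python sketch. The data values and the function name are made up for illustration; only the standard library is assumed.

```python
import math
import statistics

def estimated_standard_error(sample):
    """Return s / sqrt(n), the estimated standard error of the sample mean."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)

# Hypothetical data, chosen only so the arithmetic is easy to follow.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
x_bar = statistics.mean(sample)          # point estimate of the population mean
se = estimated_standard_error(sample)    # s / sqrt(n)
print(x_bar, se)
```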

Central Limit Theorem • Let X1,…,Xn be a random sample from some population with mean μ and variance σ². Then, for large n, X̄ is approximately distributed as N(μ, σ²/n).
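
The central limit theorem can be checked informally by simulation. The sketch below assumes an Exponential(1) population (mean 1, variance 1), which is an illustrative choice and not part of the slides; it compares the spread of simulated sample means with σ/√n.

```python
import math
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size n from Exponential(1) and return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

means = simulate_sample_means(n=50, reps=4000)
mean_of_means = statistics.fmean(means)   # should be close to mu = 1
sd_of_means = statistics.stdev(means)     # should be close to sigma / sqrt(n)
print(mean_of_means, sd_of_means, 1 / math.sqrt(50))
```

Even though the exponential population is strongly skewed, the sample means cluster around μ with spread near σ/√n, as the theorem suggests for large n.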

Interval Estimation • Let X1,…,Xn be a random sample from a normal population N(μ, σ²). If σ² is known, a 95% confidence interval (C.I.) for μ is (x̄ − 1.96σ/√n, x̄ + 1.96σ/√n). Why? (next slide)
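
A minimal sketch of the known-σ interval x̄ ± z·σ/√n in Python, using `statistics.NormalDist` from the standard library for the normal quantile. The function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Two-sided confidence interval for mu when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    half_width = z * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical summary statistics.
low, high = z_confidence_interval(x_bar=120.0, sigma=10.0, n=25)
print(round(low, 2), round(high, 2))
```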

Interval Estimation
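
One way to see where the 95% interval x̄ ± 1.96σ/√n for a normal population with known σ² comes from is to standardize the sample mean and rearrange the inequality. This is a sketch of the standard argument, not necessarily the exact derivation shown on the slide.

```latex
\begin{align*}
\bar{X} \sim N\!\left(\mu, \tfrac{\sigma^2}{n}\right)
  \;&\Longrightarrow\; Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0,1), \\
\Pr\left(-1.96 < \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} < 1.96\right) &= 0.95, \\
\Pr\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right) &= 0.95 .
\end{align*}
```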

Interval Estimation Interpretation of Confidence Interval • Over the collection of 95% confidence intervals that could be constructed from repeated random samples of size n, 95% of them will contain the parameter μ. • It is wrong to say: There is a 95% chance that the parameter μ will fall within a particular 95% confidence interval.
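
The repeated-sampling interpretation can be illustrated with a short simulation: build many 95% intervals from independent samples and count how often they contain μ. The population values below (μ = 120, σ = 10, n = 25) are assumptions for illustration only.

```python
import math
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, reps, confidence=0.95, seed=1):
    """Fraction of known-sigma confidence intervals that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - half_width < mu < x_bar + half_width:
            hits += 1
    return hits / reps

rate = coverage(mu=120.0, sigma=10.0, n=25, reps=5000)
print(rate)  # expected to be close to 0.95
```

Each individual interval either contains μ or it does not; the 95% describes the long-run proportion over repeated samples.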

Interval Estimation • Note: • When σ and n are fixed, the 99% C.I. is wider than the 95% C.I. • If the width of the C.I. is specified, the sample size can be determined. • n ↑ ⇒ length ↓; σ ↑ ⇒ length ↑
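
Since the known-σ interval has width 2·z·σ/√n, a target width can be solved for n. The following is a minimal sketch under that assumption; the function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist

def n_for_ci_width(sigma, width, confidence=0.95):
    """Smallest n whose known-sigma C.I. is no wider than `width`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # width = 2 * z * sigma / sqrt(n)  =>  n = (2 * z * sigma / width) ** 2
    return math.ceil((2 * z * sigma / width) ** 2)

n95 = n_for_ci_width(sigma=10.0, width=5.0, confidence=0.95)
n99 = n_for_ci_width(sigma=10.0, width=5.0, confidence=0.99)
print(n95, n99)  # the 99% interval needs a larger n for the same width
```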

Hypothesis Testing • Null hypothesis (H0): the statement to be tested, usually reflecting the status quo. • Alternative hypothesis (H1): the logical complement of H0. • Note: the null hypothesis is analogous to the defendant in court. It is presumed to be true unless the data argue overwhelmingly to the contrary.

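Hypothesis Testing • Four possible outcomes of the decision: do not reject H0 when H0 is true (correct); reject H0 when H0 is true (Type I error); do not reject H0 when H1 is true (Type II error); reject H0 when H1 is true (correct). • Notation: α = Pr(Type I error) = level of significance; β = Pr(Type II error); 1 − β = power = Pr(reject H0 | H1 is true)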

Hypothesis Testing • Goal: to make α and β both small • Facts: α ↓ then β ↑; β ↓ then α ↑ • General Strategy: fix α, minimize β

Testing for the Population Mean • When the sample is from a normal population: H0: μ = 120 vs H1: μ < 120 • The best test is based on X̄, which is called the test statistic. The "best test" means that the test has the highest power among all tests with a given type I error. Is there any bad test? Yes. • Rejection Region: the range of values of the test statistic for which H0 is rejected.

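One-tailed test • Our rejection region is of the form x̄ < c for some constant c. • Now, c is chosen so that Pr(X̄ < c | H0) = α; since (X̄ − μ0)/(S/√n) follows a t distribution with n − 1 df under H0, c = μ0 + tn-1,α·s/√n.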

Result • To test H0: μ = μ0 vs H1: μ < μ0, based on a sample taken from a normal population with mean μ and unknown variance, the test statistic is t = (x̄ − μ0)/(s/√n). • Assume the level of significance is α. Then • if t < tn-1,α, we reject H0; • if t ≥ tn-1,α, we do not reject H0. Here tn-1,α denotes the 100α-th percentile of the t distribution with n − 1 df.
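
A minimal sketch of the lower-tailed one-sample t test, t = (x̄ − μ0)/(s/√n). The data are hypothetical, and because the Python standard library has no t quantile function, the critical value is assumed to be read from a printed t table (lower 5th percentile with 9 df, about −1.833).

```python
import math
import statistics

def t_statistic(sample, mu0):
    """One-sample t statistic (x_bar - mu0) / (s / sqrt(n))."""
    n = len(sample)
    return (statistics.mean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))

def reject_lower_tailed(t, critical_value):
    """Reject H0 when t falls below t_{n-1, alpha} (a negative number)."""
    return t < critical_value

# Hypothetical measurements, n = 10.
sample = [112, 118, 109, 121, 115, 117, 110, 119, 114, 113]
T_9_05 = -1.833  # assumed table value: lower 5th percentile of t with 9 df

t = t_statistic(sample, mu0=120)
reject = reject_lower_tailed(t, T_9_05)
print(round(t, 3), reject)
```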

P-value • The minimum α-level at which we can reject H0 based on the sample. • The p-value can also be thought of as the probability of obtaining a test statistic as extreme as or more extreme than the actual test statistic obtained from the sample, given that the null hypothesis is true.

Remarks • Two different approaches to determining statistical significance: • Critical value method • P-value method.

One-tailed test • Testing H0: µ = µ0 vs H1: µ > µ0 when σ is unknown and the population is normal • Test Statistic: t = (x̄ − µ0)/(s/√n) • Rejection Region: t > tn-1,1-α • p-value = 1 − Ft,n-1(t), where Ft,n-1( ) is the cdf of the t distribution with df = n − 1. • Note: If σ is known, the s in the test statistic is replaced by σ, tn-1,1-α in the rejection region is replaced by z1-α, and Ft,n-1(t) is replaced by Ф(t).
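
For the known-σ case mentioned in the note, the upper-tailed test can be sketched with the standard library alone: the statistic is z = (x̄ − µ0)/(σ/√n), the p-value is 1 − Ф(z), and H0 is rejected when z exceeds the upper α quantile of N(0, 1). The function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist

def upper_tailed_z_test(x_bar, mu0, sigma, n, alpha=0.05):
    """Return (z, p_value, reject) for H0: mu = mu0 vs H1: mu > mu0, sigma known."""
    std_normal = NormalDist()
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    z_crit = std_normal.inv_cdf(1 - alpha)   # upper alpha critical value
    p_value = 1 - std_normal.cdf(z)          # Pr(Z >= z) under H0
    return z, p_value, z > z_crit

z, p_value, reject = upper_tailed_z_test(x_bar=124.0, mu0=120.0, sigma=10.0, n=25)
print(z, round(p_value, 4), reject)
```

The critical-value decision and the p-value decision agree: H0 is rejected exactly when the p-value is below α.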

Testing For Two-Sided Alternative • Let X1,…,Xn be a random sample from the population N(µ, σ²), where σ² is unknown. • H0: µ = µ0 vs H1: µ ≠ µ0 • Test Statistic: t = (x̄ − µ0)/(s/√n) • Rejection Region: |t| > tn-1,1-α/2 • p-value = 2·Ft,n-1(t) if t ≤ 0; 2·[1 − Ft,n-1(t)] if t > 0 (see figures on next slide). • Warning: the exact p-value requires use of a computer.
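
The piecewise two-sided p-value rule can be written as a small function that takes the test statistic and a cdf. In this sketch the standard normal cdf Ф stands in for the t cdf (the known-σ case), because the Python standard library does not provide a t distribution; with a t cdf from a statistics package the same function would apply unchanged.

```python
from statistics import NormalDist

def two_sided_p_value(stat, cdf):
    """p = 2*F(stat) if stat <= 0, else 2*[1 - F(stat)]."""
    if stat <= 0:
        return 2 * cdf(stat)
    return 2 * (1 - cdf(stat))

phi = NormalDist().cdf  # stand-in for the t cdf; an assumption of this sketch
p_neg = two_sided_p_value(-1.96, phi)
p_pos = two_sided_p_value(1.96, phi)
print(round(p_neg, 4), round(p_pos, 4))  # equal by symmetry, about 0.05
```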

Testing For Two-Sided Alternative [Figures: two-sided p-value when x̄ > µ0; two-sided p-value when x̄ ≤ µ0]

The Power of A Test • To test H0: µ = µ0 vs H1: µ < µ0 in a normal population with known variance σ², the power is Ф(zα + (µ0 − µ1)√n/σ), where µ1 is the true mean under H1. • Review: Power = Pr[rejecting H0 | H0 is false] • Factors Affecting the Power: α, |µ0 − µ1|, σ, and n
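
A minimal sketch of the lower-tailed power calculation, assuming the standard known-σ expression power = Ф(zα + (µ0 − µ1)√n/σ), where zα is the lower α quantile of N(0, 1). The numerical inputs are hypothetical.

```python
import math
from statistics import NormalDist

def power_lower_tailed(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the one-sided z test of H0: mu = mu0 vs H1: mu < mu0 at mu = mu1."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(alpha)  # negative, e.g. about -1.645 for alpha = 0.05
    return std_normal.cdf(z_alpha + (mu0 - mu1) * math.sqrt(n) / sigma)

power_n25 = power_lower_tailed(mu0=120.0, mu1=115.0, sigma=10.0, n=25)
power_n100 = power_lower_tailed(mu0=120.0, mu1=115.0, sigma=10.0, n=100)
power_null = power_lower_tailed(mu0=120.0, mu1=120.0, sigma=10.0, n=25)
print(round(power_n25, 3), round(power_n100, 3), round(power_null, 3))
```

Power rises with n and with the distance |µ0 − µ1|, and it equals α when µ1 = µ0.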

The Power of The 1-Sample T Test • To test H0: µ = µ0 vs H1: µ < µ0 in a normal population with unknown variance σ² at α = .05, the power, for true mean µ1 and true s.d. σ, is F(tn-1,.05), where F( ) is the c.d.f. of the non-central t distribution with df = n − 1 and non-centrality parameter (µ1 − µ0)√n/σ.

Power Function For Two-Sided Alternative • To test H0: µ = µ0 vs H1: µ ≠ µ0 in a normal population with known variance σ², the power is Ф(−z1-α/2 + (µ0 − µ1)√n/σ) + Ф(−z1-α/2 + (µ1 − µ0)√n/σ), where µ1 is the true mean under the alternative.
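
The two-sided power can be sketched the same way, assuming the standard known-σ expression Ф(−z1-α/2 + (µ0 − µ1)√n/σ) + Ф(−z1-α/2 + (µ1 − µ0)√n/σ). The inputs are hypothetical.

```python
import math
from statistics import NormalDist

def power_two_sided(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the two-sided z test of H0: mu = mu0 at true mean mu1."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)
    shift = (mu0 - mu1) * math.sqrt(n) / sigma
    # One term per rejection tail; the far tail is usually negligible.
    return std_normal.cdf(-z + shift) + std_normal.cdf(-z - shift)

power_alt = power_two_sided(mu0=120.0, mu1=115.0, sigma=10.0, n=25)
power_null = power_two_sided(mu0=120.0, mu1=120.0, sigma=10.0, n=25)
print(round(power_alt, 3), round(power_null, 3))
```

For the same inputs the two-sided power is lower than the one-sided power, because the critical value moves from the α quantile to the α/2 quantile.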

Case of Unknown Variance • For the same test in a population with unknown variance, the power is F(−tn-1,1-α/2) + 1 − F(tn-1,1-α/2), where F( ) is the c.d.f. of the non-central t distribution with df = n − 1 and non-centrality parameter (µ1 − µ0)√n/σ.

Sample Size Determination • For example: H0: µ = µ0 vs H1: µ < µ0 • power: 1 − β = Ф(zα + (µ0 − µ1)√n/σ) • Hence, n = σ²(z1-β + z1-α)²/(µ0 − µ1)²

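Factors Affecting Sample Size 1. n increases as σ² increases. 2. n increases as α decreases. 3. n increases as the required power (1 − β) increases. 4. n increases as |µ0 − µ1| decreases. • To test H0: µ = µ0 vs H1: µ ≠ µ0 when σ² is known, the sample size is n = σ²(z1-β + z1-α/2)²/(µ0 − µ1)²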
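
A minimal sketch of the sample size calculation, assuming the standard known-σ formula n = σ²(z1-β + z*)²/(µ0 − µ1)², with z* = z1-α for a one-sided test and z* = z1-α/2 for a two-sided test, rounded up to the next integer. The function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size(mu0, mu1, sigma, alpha=0.05, power=0.80, two_sided=True):
    """Smallest n giving at least the requested power for a known-sigma z test."""
    std_normal = NormalDist()
    z_power = std_normal.inv_cdf(power)  # z_{1 - beta}
    z_level = std_normal.inv_cdf(1 - alpha / 2 if two_sided else 1 - alpha)
    return math.ceil(sigma ** 2 * (z_power + z_level) ** 2 / (mu0 - mu1) ** 2)

n_two_sided = sample_size(mu0=120.0, mu1=115.0, sigma=10.0, two_sided=True)
n_one_sided = sample_size(mu0=120.0, mu1=115.0, sigma=10.0, two_sided=False)
print(n_two_sided, n_one_sided)
```

Doubling σ or halving |µ0 − µ1| multiplies the required n by about four, which matches the direction of each factor in the formula.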

Relationship between Hypothesis Testing and Confidence Interval • To test H0: µ = µ0 vs H1: µ ≠ µ0, H0 is rejected with a two-sided level α test if and only if the two-sided 100(1 − α)% confidence interval for µ does not contain µ0.
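
The equivalence can be checked numerically. The sketch below assumes the known-σ setting (z test and z interval) so that only the standard library is needed; the grid of sample means is arbitrary.

```python
import math
from statistics import NormalDist

def z_test_rejects(x_bar, mu0, sigma, n, alpha=0.05):
    """Two-sided level-alpha z test of H0: mu = mu0."""
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

def ci_excludes(x_bar, mu0, sigma, n, alpha=0.05):
    """True when the two-sided 100(1 - alpha)% C.I. does not contain mu0."""
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sigma / math.sqrt(n)
    return not (x_bar - half_width < mu0 < x_bar + half_width)

# Hypothetical sample means from 114.0 to 126.0 in steps of 0.5.
grid = [114 + 0.5 * k for k in range(25)]
agree = all(
    z_test_rejects(xb, 120.0, 10.0, 25) == ci_excludes(xb, 120.0, 10.0, 25)
    for xb in grid
)
print(agree)
```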

One Sample Test for the Variance of A Normal Population

One Sample Test for A Proportion

Exact Method • If p̂ < p0, the p-value = min[2·Pr(X ≤ x), 1] • If p̂ ≥ p0, the p-value = min[2·Pr(X ≥ x), 1], where x is the observed number of events and X ~ B(n, p0).
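
A minimal sketch of the exact two-sided binomial p-value, assuming the usual rule of doubling the smaller tail probability under B(n, p0) and capping the result at 1. The counts are hypothetical.

```python
import math

def binom_pmf(k, n, p):
    """Pr(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def exact_binomial_p_value(x, n, p0):
    """Two-sided exact p-value for H0: p = p0 given x events in n trials."""
    if x / n < p0:
        tail = sum(binom_pmf(k, n, p0) for k in range(0, x + 1))   # Pr(X <= x)
    else:
        tail = sum(binom_pmf(k, n, p0) for k in range(x, n + 1))   # Pr(X >= x)
    return min(2 * tail, 1.0)

p_low = exact_binomial_p_value(x=2, n=20, p0=0.3)
p_mid = exact_binomial_p_value(x=6, n=20, p0=0.3)
print(round(p_low, 4), p_mid)
```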

Power and Sample size

One-Sample Inference for the Poisson Distribution • X ~ Poisson with mean μ • To test H0: µ = µ0 vs H1: µ ≠ µ0 at the α level of significance, • Obtain a two-sided 100(1 − α)% C.I. for µ, say (C1, C2) • If µ0 ∈ (C1, C2), we accept H0; otherwise we reject H0.

One-Sample Inference for the Poisson Distribution • The p-value (for the above two-sided test) • If observed x < µ0, then p-value = min[2·F(x | µ0), 1] • If observed x > µ0, then p-value = min[2·(1 − F(x − 1 | µ0)), 1], where F(x | µ0) is the Poisson c.d.f. with mean µ0.
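
A minimal sketch of the exact two-sided Poisson p-value, assuming the usual rule p = min[2·F(x | µ0), 1] when x < µ0 and p = min[2·(1 − F(x − 1 | µ0)), 1] otherwise. Treating x = µ0 with the second branch is an assumption of this sketch; the counts are hypothetical.

```python
import math

def poisson_cdf(x, mu):
    """F(x | mu) = Pr(X <= x) for X ~ Poisson(mu); 0 for x < 0."""
    if x < 0:
        return 0.0
    return sum(math.exp(-mu) * mu ** k / math.factorial(k) for k in range(x + 1))

def exact_poisson_p_value(x, mu0):
    """Two-sided exact p-value for H0: mu = mu0 given an observed count x."""
    if x < mu0:
        return min(2 * poisson_cdf(x, mu0), 1.0)
    return min(2 * (1 - poisson_cdf(x - 1, mu0)), 1.0)

p_low = exact_poisson_p_value(x=4, mu0=10.0)
p_equal = exact_poisson_p_value(x=10, mu0=10.0)
print(round(p_low, 4), p_equal)
```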

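Large-Sample Test for Poisson (for µ0 ≥ 10) • To test H0: µ = µ0 vs H1: µ ≠ µ0 at the α level of significance, • Test Statistic: X² = (x − µ0)²/µ0, which is approximately χ² with 1 df under H0 • Rejection Region: X² > χ²1,1-α • p-value: Pr(χ²1 > X²)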
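
A minimal sketch of the large-sample test, assuming the usual normal-approximation statistic X² = (x − µ0)²/µ0 referred to a χ² distribution with 1 df. Because a χ² variable with 1 df is the square of a standard normal, its upper tail can be computed from Ф without any extra library. The counts are hypothetical.

```python
import math
from statistics import NormalDist

def large_sample_poisson_test(x, mu0, alpha=0.05):
    """Return (X2, p_value, reject) for H0: mu = mu0 vs H1: mu != mu0."""
    std_normal = NormalDist()
    x2 = (x - mu0) ** 2 / mu0
    # Pr(chi2_1 > x2) = Pr(|Z| > sqrt(x2)) = 2 * [1 - Phi(sqrt(x2))]
    p_value = 2 * (1 - std_normal.cdf(math.sqrt(x2)))
    # The upper alpha quantile of chi2_1 equals z_{1 - alpha/2} squared.
    critical = std_normal.inv_cdf(1 - alpha / 2) ** 2
    return x2, p_value, x2 > critical

x2, p_value, reject = large_sample_poisson_test(x=17, mu0=10.0)
print(round(x2, 2), round(p_value, 4), reject)
```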
