Understanding One-Sample t-Tests: Distribution, Degrees of Freedom, and Hypothesis Testing

One-Sample Means Test: What if σ is unknown? (sampling from a normal population)
We know that $z = \frac{\bar{y} - \mu}{\sigma/\sqrt{n}}$ has a standard normal distribution. What is the distribution of $t = \frac{\bar{y} - \mu}{s/\sqrt{n}}$? If the sample came from a normal distribution, t has a t-distribution with n - 1 degrees of freedom.
Properties of the t-distribution:
• 1) Symmetric about 0.
• 2) Looks like a standard normal density, only more spread out.
• 3) The spread of the distribution is indexed by a parameter called the degrees of freedom (df).
• 4) As the degrees of freedom increase, the t-distribution gets closer to the standard normal distribution. (A common rule of thumb: it is safe to use z instead of t when n > 30.)
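A minimal sketch of how the t statistic above could be computed from raw data, using only the Python standard library. The function name `t_statistic` and the sample values are illustrative assumptions, not part of the original slides.

```python
import math
import statistics

def t_statistic(sample, mu0):
    """Return (t, df) for a one-sample test of the hypothesized mean mu0."""
    n = len(sample)
    ybar = statistics.mean(sample)
    s = statistics.stdev(sample)   # sample SD, divides by n - 1
    se = s / math.sqrt(n)          # estimated standard error of the mean
    return (ybar - mu0) / se, n - 1

# Hypothetical data, for illustration only.
weights = [4.8, 5.1, 5.4, 4.9, 5.3]
t, df = t_statistic(weights, mu0=5.0)
print(f"t = {t:.3f}, df = {df}")
```

The only difference from the z statistic is that the sample standard deviation s replaces σ, which is why the reference distribution changes from N(0,1) to t with n - 1 df.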

Tail probabilities of the t-distribution: see Table 3 in Ott & Longnecker.
[Figure: 95th percentiles of N(0,1), t(5), and t(2).]

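Rejection Regions for hypothesis tests using t-distribution critical values
For Pr(Type I error) = α and df = n - 1, test H0: μ = μ0 using $t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$.
HA: 1. μ > μ0   Reject H0 if $t > t_{\alpha,n-1}$
HA: 2. μ < μ0   Reject H0 if $t < -t_{\alpha,n-1}$
HA: 3. μ ≠ μ0   Reject H0 if $|t| > t_{\alpha/2,n-1}$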
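The three rejection rules can be sketched as a small decision function. This is an illustration under stated assumptions: the critical value is looked up by hand from a t-table (here the standard tabled values for 19 df), and the name `reject_h0` and the observed t of 2.0 are hypothetical.

```python
def reject_h0(t, crit, alternative):
    """Apply the rejection rule for a one-sample t-test.

    crit is t_{alpha, n-1} for a one-sided test and
    t_{alpha/2, n-1} for a two-sided test (both positive).
    """
    if alternative == "greater":      # HA: mu > mu0
        return t > crit
    if alternative == "less":         # HA: mu < mu0
        return t < -crit
    if alternative == "two-sided":    # HA: mu != mu0
        return abs(t) > crit
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# Hypothetical observed statistic with df = 19, alpha = 0.05.
t_obs = 2.0
print(reject_h0(t_obs, 1.729, "greater"))    # one-sided critical value t_{0.05,19}
print(reject_h0(t_obs, 2.093, "two-sided"))  # two-sided critical value t_{0.025,19}
```

The same observed t can be significant one-sided but not two-sided, because the two-sided rule splits α between both tails.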

Degrees of Freedom
Why are the degrees of freedom only n - 1 and not n? We start with n independent pieces of information with which we estimate the sample mean. Now consider the sample variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2$.
Because the deviations $y_i - \bar{y}$ sum to zero, if we know n - 1 of them we can figure out the nth. Hence only n - 1 independent deviations are available to estimate the variance (and standard deviation). That is, only n - 1 pieces of information remain for estimating the standard deviation after we "spend" one to estimate the sample mean. The t-distribution can be viewed as a standard normal distribution adjusted for an unknown standard deviation, so it is logical that it must account for the fact that only n - 1 pieces of information are available.
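A short numerical illustration of the argument, with hypothetical data: the last deviation is recovered from the other n - 1, and dividing the sum of squared deviations by n - 1 reproduces the standard library's sample variance.

```python
import statistics

# Hypothetical sample, for illustration only.
sample = [3.0, 7.0, 8.0, 10.0]
n = len(sample)
ybar = statistics.mean(sample)

deviations = [y - ybar for y in sample]
print(sum(deviations))               # deviations from the mean sum to zero

# Knowing the first n - 1 deviations determines the last one.
recovered_last = -sum(deviations[:-1])
print(recovered_last, deviations[-1])

# Sample variance uses the n - 1 independent deviations as its divisor.
s2 = sum(d * d for d in deviations) / (n - 1)
print(s2, statistics.variance(sample))
```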

Confidence Interval for μ when σ is unknown (samples are assumed to come from a normal population)
$\bar{y} \pm t_{\alpha/2,n-1}\,\frac{s}{\sqrt{n}}$, with df = n - 1 and confidence coefficient (1 - α). (Can use $z_{\alpha/2}$ if n > 30.)
Example: Compute 95% CI for μ given
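Since the numbers for the slide's example did not survive, here is a sketch of the interval computation with hypothetical summary statistics (n = 16, mean 50, s = 8). The critical value 2.131 is the tabled $t_{0.025,15}$; the function name is an assumption for illustration.

```python
import math

def t_confidence_interval(ybar, s, n, t_crit):
    """Two-sided CI for the mean: ybar +/- t_crit * s / sqrt(n).

    t_crit is t_{alpha/2, n-1}, looked up from a t-table.
    """
    half_width = t_crit * s / math.sqrt(n)
    return ybar - half_width, ybar + half_width

# Hypothetical summary statistics; t_{0.025,15} = 2.131 from a t-table.
lower, upper = t_confidence_interval(ybar=50.0, s=8.0, n=16, t_crit=2.131)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```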

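One-Sample Median Confidence Interval
If the data are not normal (perhaps skewed) and we have a small sample, then a nonparametric method can be used to make inferences about the median. First order the data from smallest to largest: $y_{(1)} \le y_{(2)} \le \dots \le y_{(n)}$. Then a 100(1-α)% CI for the population median is $(y_{(L)}, y_{(U)})$, where the positions L and U = n + 1 - L are taken from a table of the binomial distribution with p = 1/2 so that the interval has at least the stated confidence.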
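One way to find the order-statistic positions without a table is to compute the binomial probabilities directly. The sketch below assumes the interval $(y_{(L)}, y_{(n+1-L)})$ with coverage $1 - 2\,P(X \le L - 1)$ for X ~ Binomial(n, 1/2), and picks the narrowest interval whose achieved confidence is at least the requested level. Function names and data are hypothetical.

```python
import math

def binom_cdf_half(k, n):
    """P(X <= k) for X ~ Binomial(n, 1/2)."""
    return sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n

def median_ci(data, conf=0.95):
    """Conservative CI for the median from order statistics.

    Returns (lower, upper, achieved_confidence, position).
    """
    y = sorted(data)
    n = len(y)
    position = 0
    # Achieved confidence shrinks as the position moves inward,
    # so keep the largest position that still meets the target.
    for pos in range(1, n // 2 + 1):
        if 1 - 2 * binom_cdf_half(pos - 1, n) >= conf:
            position = pos
    if position == 0:
        raise ValueError("sample too small for this confidence level")
    achieved = 1 - 2 * binom_cdf_half(position - 1, n)
    return y[position - 1], y[n - position], achieved, position

# Hypothetical data: the integers 1..20.
print(median_ci(list(range(1, 21))))
```

Because the binomial distribution is discrete, the achieved confidence is usually somewhat above the nominal level rather than exactly equal to it.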

The Sign Test: One-Sample Median Test
A corresponding nonparametric test for the population median (M) can be developed along similar lines. To test:
1. H0: M ≤ M0 vs. HA: M > M0
2. H0: M ≥ M0 vs. HA: M < M0
3. H0: M = M0 vs. HA: M ≠ M0
The test statistic is B: the number of data points greater than M0. (If the null is true then B should be approximately n/2.) With cutoff values obtained from Table 4, reject the null hypothesis if B is too large (case 1), too small (case 2), or too far from n/2 in either direction (case 3).
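In place of a table lookup, the sign test can be sketched with exact binomial probabilities. The assumptions here are labeled: observations equal to M0 are dropped (a common convention, not stated on the slide), the two-sided p-value doubles the smaller tail, and the function name and data are hypothetical.

```python
import math

def sign_test(data, m0, alternative):
    """Exact sign test for the median; returns (B, n, p_value).

    B counts observations above m0; ties with m0 are dropped (assumption).
    """
    above = sum(1 for y in data if y > m0)
    below = sum(1 for y in data if y < m0)
    n = above + below
    pmf = [math.comb(n, k) / 2 ** n for k in range(n + 1)]
    upper_tail = sum(pmf[above:])        # P(B >= observed)
    lower_tail = sum(pmf[:above + 1])    # P(B <= observed)
    if alternative == "greater":         # HA: M > M0
        p = upper_tail
    elif alternative == "less":          # HA: M < M0
        p = lower_tail
    elif alternative == "two-sided":     # HA: M != M0
        p = min(1.0, 2 * min(upper_tail, lower_tail))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return above, n, p

# Hypothetical data, testing H0: M <= 0 against HA: M > 0.
changes = [2.1, -0.5, 3.3, 1.2, 0.8, -1.1, 2.7, 1.9]
b, n, p = sign_test(changes, m0=0.0, alternative="greater")
print(f"B = {b}, n = {n}, p-value = {p:.4f}")
```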

The Level of Significance of a Statistical Test (p-value)
• Suppose the result of a statistical test you carry out is to reject the null.
• Someone reading your conclusions might ask: "How close were you to not rejecting?"
• Solution: Report a value that summarizes the weight of evidence against Ho, on a scale of 0 to 1. This is the p-value. The smaller the p-value, the stronger the evidence against Ho.
Formal Definition: The p-value of a test is the probability of observing a value of the test statistic that is as extreme or more extreme (toward Ha) than the actually observed value of the test statistic, under the assumption that Ho is true. (Equivalently, it is the smallest Type I error probability α at which the observed test statistic would lead to rejecting Ho.)
Rejection Rule: Having decided upon a Type I error probability α, reject Ho if p-value ≤ α.
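As an illustration of the definition, the p-value of a t-test is a tail area under the t density. The sketch below approximates that area numerically with Simpson's rule using only the standard library; it is one possible approach, not how statistical packages compute it. The example statistic (t = -1.7806, df = 499) is taken from the R output for the Quack diet in this document.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t): Simpson's rule on [0, |t|] plus symmetry about 0."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def p_value(t, df, alternative):
    """Tail area as extreme or more extreme than t, in the direction of Ha."""
    if alternative == "less":
        return t_cdf(t, df)
    if alternative == "greater":
        return 1 - t_cdf(t, df)
    if alternative == "two-sided":
        return 2 * (1 - t_cdf(abs(t), df))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

p = p_value(-1.7806, 499, "less")
print(f"p-value = {p:.4f}; reject Ho at alpha = 0.05: {p <= 0.05}")
```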

Equivalence between confidence intervals and hypothesis tests
Rejecting the two-sided null Ho: μ = μ0 is equivalent to μ0 falling outside a (1-α)100% C.I. for μ.
Rejecting the one-sided null Ho: μ ≥ μ0 is equivalent to μ0 being greater than the upper endpoint of a (1-2α)100% C.I. for μ, or μ0 falling outside a one-sided (1-α)100% C.I. for μ with -infinity as lower bound.
Rejecting the one-sided null Ho: μ ≤ μ0 is equivalent to μ0 being smaller than the lower endpoint of a (1-2α)100% C.I. for μ, or μ0 falling outside a one-sided (1-α)100% C.I. for μ with +infinity as upper bound.
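The equivalence follows from rearranging the test statistic $t = (\bar{y} - \mu_0)/(s/\sqrt{n})$. A short derivation, written for the t-based test and interval used in this document:

```latex
\begin{align*}
|t| \le t_{\alpha/2,\,n-1}
  &\iff \left|\bar{y} - \mu_0\right| \le t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \\
  &\iff \bar{y} - t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \;\le\; \mu_0 \;\le\;
        \bar{y} + t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}, \\[4pt]
t < -t_{\alpha,\,n-1}
  &\iff \mu_0 > \bar{y} + t_{\alpha,\,n-1}\,\frac{s}{\sqrt{n}}.
\end{align*}
```

The first pair of lines says that failing to reject the two-sided null is the same as μ0 lying inside the (1-α)100% interval. The last line says that rejecting Ho: μ ≥ μ0 is the same as μ0 exceeding the upper endpoint built from $t_{\alpha,n-1}$, which is the upper endpoint of a two-sided (1-2α)100% interval.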

Example: Practical Significance vs. Statistical Significance
Dr. Quick and Dr. Quack are both in the business of selling diets, and they have claims that appear contradictory.
Dr. Quack studied 500 dieters and claims, "A statistical analysis of my dieters shows a significant weight loss for my Quack diet. The Quick diet, by contrast, shows no significant weight loss by its dieters."
Dr. Quick followed the progress of 20 dieters and claims, "A study shows that on average my dieters lose 3 times as much weight on the Quick diet as on the Quack diet."
So which claim is right? To decide which diets achieve a significant weight loss we should test Ho: μ ≥ 0 vs. Ha: μ < 0, where μ is the mean weight change (after minus before) achieved by dieters on each of the two diets. (Note: since we don't know σ we should do a t-test.)

MTB output for Quick diet analysis (Stat > Basic Stats > 1-Sample t)

One-Sample T: Quick
Test of mu = 0 vs < 0

                                            95% Upper
Variable   N      Mean     StDev   SE Mean      Bound      T      P
Quick     20  -3.02119  34.16614   7.63978   10.18901  -0.40  0.348

Calculating power for mean = null + difference
Alpha = 0.05  Assumed standard deviation = 35

            Sample
Difference    Size      Power
         3      20  0.0219603

Stat > Nonparametrics > 1-Sample Sign

Sign Test for Median: Quick
Sign test of median = 0.00000 versus < 0.00000

        N  Below  Equal  Above       P  Median
Quick  20     11      0      9  0.4119  -5.036

Sign confidence interval for median

                                   Confidence Interval
                        Achieved
        N  Median     Confidence     Lower    Upper  Position
Quick  20  -5.036         0.8847   -13.129    4.038         7
                          0.9500   -24.126    4.219       NLI
                          0.9586   -27.509    4.274         6
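The t-test columns of the Minitab output can be reproduced from the reported N, Mean, and StDev. This is a sketch for checking the arithmetic; the critical value 1.7291 is the tabled $t_{0.05,19}$ entered by hand, so the upper bound agrees with Minitab only up to rounding of that value.

```python
import math

# Summary statistics copied from the Minitab output above.
n, mean, sd = 20, -3.02119, 34.16614
mu0 = 0.0
t_crit = 1.7291                      # t_{0.05,19} from a t-table (entered by hand)

se_mean = sd / math.sqrt(n)          # "SE Mean" column
t_stat = (mean - mu0) / se_mean      # "T" column
upper_bound = mean + t_crit * se_mean  # "95% Upper Bound" column

print(f"SE Mean = {se_mean:.5f}")
print(f"T = {t_stat:.2f}")
print(f"95% upper bound = {upper_bound:.3f}")
print("Reject Ho (mean < 0)?", t_stat < -t_crit)
```

Since 0 is below the upper bound, the one-sided interval and the t-test agree: the Quick data do not show a significant mean weight loss.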

R output for Quack diet analysis (read 500 values into vector "quack")

> t.test(quack,alternative=c("less"),mu=0,conf.level=0.95)

        One Sample t-test

data:  quack
t = -1.7806, df = 499, p-value = 0.03779
alternative hypothesis: true mean is less than 0
95 percent confidence interval:
        -Inf -0.09036075
sample estimates:
mean of x
-1.212730

> power.t.test(n=500,delta=1,sd=15,type="one.sample", alternative="one.sided")
n = 500, delta = 1, power = 0.438

Summary
• Quick's average weight loss of 3.02 is about 2.5 times the 1.21 weight loss reported by Quack.
• However, Quack's small weight loss was significant, whereas Quick's larger weight loss was not! So Quack might not have a better diet, but he has more evidence: 500 cases compared to 20.
Remarks
• Significance is about evidence, and having a large sample size can make up for having a small effect.
• If you have a large enough sample size, even a small difference can be significant. If your sample size is small, even a large difference may not be significant.
• Quick needs to collect more cases; if his observed effect holds up, he could then dominate the Quack diet (though even a 3 pound loss may not be enough of a practical difference to a dieter).
• Both the Quick and Quack statements are somewhat empty. It's not enough to report an estimate without a measure of its variability. It's not enough to report significance without an estimate of the difference. A confidence interval solves these problems.
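To make the sample-size remark concrete, here is a rough sketch of how many dieters Quick would need for his observed loss to reach one-sided significance at α = 0.05. It rests on strong assumptions that are labeled in the code: the sample mean and standard deviation are treated as fixed at their observed values, and the normal critical value 1.645 stands in for the t critical value, which is reasonable only for large n.

```python
import math

def min_n_for_significance(mean_change, sd, z_crit=1.645):
    """Smallest n with |mean| / (sd / sqrt(n)) > z_crit.

    Assumes the observed mean and SD would stay the same as n grows,
    and uses a normal critical value in place of t (large-n approximation).
    """
    return math.floor((z_crit * sd / abs(mean_change)) ** 2) + 1

# Quick's observed mean and standard deviation from the Minitab output.
n_needed = min_n_for_significance(mean_change=-3.02119, sd=34.16614)
print(n_needed)
```

The test statistic grows with the square root of n, so a noisy sample (SD about 34) needs several hundred cases before a 3 pound average loss stands out from zero.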

A confidence interval shows both statistical and practical significance.
[Figure: Quack two-sided and one-sided 95% CIs. The one-sided CI says the mean is significantly less than zero.]
[Figure: Quick two-sided and one-sided 95% CIs. The one-sided CI says the mean is NOT significantly less than zero.]
