This study examines whether "small" cars have a different average gas mileage compared to "compact" cars. Using data from 13 small and 15 compact vehicles, we conduct hypothesis testing to evaluate the null hypothesis (H0) and alternative hypothesis (HA). The Welch Modified Two-Sample t-Test is applied due to unequal variances, leading to a calculated test statistic and p-value. The findings indicate significant differences in mean gas mileage, emphasizing the importance of statistical analysis in understanding vehicle efficiency.
Hypothesis Testing ESM 206 6 Feb. 2002
Example: Gas Mileage Do “Small” cars have a different average gas mileage than “Compact” cars? Data on mileage of 13 small and 15 compact cars.
Example: gas consumption • Which coefficients are different from zero? • Data from 36 years in US.
Hypothesis testing • Define null hypothesis (H0) • Does direction matter? • Choose test statistic, T • Determine distribution of T under H0 • Calculate observed value of the test statistic, S • Find probability of obtaining a value at least as extreme as S under H0 (P) • P small: reject H0
The null hypothesis • Statement about underlying parameters of the population • We will either reject or fail to reject H0 • Usually a statement of no pattern or of not exceeding some criterion • Examples
The alternative hypothesis • Written HA • Is the logical complement of H0 • Examples
One- and two-sided tests • One-sided test: direction matters • Pick a direction based on regulatory criteria or knowledge of processes • Two-sided test: all that matters is a difference • One-sided has greater power when the true effect lies in the chosen direction • Direction and choice of test must be decided a priori, before analyzing data
Comparing means: the t-test • Compare sample mean to fixed value (eqs. 1-4) • Compare regression coefficient to fixed value (eq. 5) • Compare the difference between two sample means to a fixed value (usually 0) (eqs. 6-7)
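The numbered equations referred to above are not reproduced in this text. As a reminder, the standard textbook forms of these three t statistics are sketched below; the notation is an assumption and may not match the original handout.

```latex
% One-sample: compare the sample mean \bar{x} to a fixed value \mu_0
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \mathrm{df} = n - 1

% Regression: compare an estimated coefficient b to a fixed value \beta_0
% (df = number of observations minus number of estimated parameters)
t = \frac{b - \beta_0}{\mathrm{SE}(b)}

% Two-sample, equal variances: compare \bar{x}_1 - \bar{x}_2 to a fixed value d_0
t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p \sqrt{1/n_1 + 1/n_2}}, \qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}, \qquad
\mathrm{df} = n_1 + n_2 - 2
```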
Assumptions of the t-test • The data in each sample are normally distributed • The populations have the same variance • Can correct for violations of this with the Welch modification of df • Test for difference among variances with F-test
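The Welch modification mentioned above can be sketched in a few lines of Python. This is a minimal illustration that assumes the usual Welch-Satterthwaite formula for the degrees of freedom; the function name `welch_t` and the two samples are hypothetical and are not the car data from this deck.

```python
import math
import statistics


def welch_t(sample_x, sample_y):
    """Return (t, df) for Welch's two-sample t-test of equal means."""
    n_x, n_y = len(sample_x), len(sample_y)
    # Variance of each sample mean: s^2 / n (sample variance divides by n - 1).
    v_x = statistics.variance(sample_x) / n_x
    v_y = statistics.variance(sample_y) / n_y
    t = (statistics.mean(sample_x) - statistics.mean(sample_y)) / math.sqrt(v_x + v_y)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v_x + v_y) ** 2 / (v_x ** 2 / (n_x - 1) + v_y ** 2 / (n_y - 1))
    return t, df


# Hypothetical samples, for illustration only.
t_stat, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10, 12])
print(f"t = {t_stat:.4f}, df = {df:.4f}")
```

The Welch df is generally not an integer and lies between the smaller sample size minus 1 and n1 + n2 - 2. The value reported for the gas mileage example (df = 16.98065 for 13 and 15 cars) falls in that range.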
The P-value • P is the probability of observing data at least as extreme as yours if the null hypothesis is true • P is not the probability that you will be in error if you reject the null hypothesis; the type I error rate is fixed by the rejection threshold chosen in advance • P is not the probability that the null hypothesis is true
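To make "at least as extreme" concrete, the sketch below computes the tail area of a t distribution using only the Python standard library. It assumes a simple Simpson's-rule integration after the substitution x = sqrt(df) * tan(theta); the function name `t_tail_probability` is hypothetical. The t and df values are those reported in this deck's gas mileage output.

```python
import math


def t_tail_probability(t, df, two_sided=True, steps=2000):
    """P-value for an observed t statistic with df degrees of freedom.

    With x = sqrt(df) * tan(theta), the upper-tail area of the t density
    becomes const * integral of cos(theta)**(df - 1) over [atan(|t|/sqrt(df)), pi/2].
    """
    const = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)
    lower = math.atan(abs(t) / math.sqrt(df))
    upper = math.pi / 2
    h = (upper - lower) / steps  # steps must be even for Simpson's rule

    def integrand(theta):
        # Clamp at zero to guard against rounding just past pi/2.
        return max(math.cos(theta), 0.0) ** (df - 1)

    total = integrand(lower) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lower + i * h)
    tail = const * total * h / 3
    # The one-sided value is only meaningful if the observed effect lies
    # in the direction chosen before looking at the data.
    return 2 * tail if two_sided else tail


p_two_sided = t_tail_probability(5.905054, 16.98065)
p_one_sided = t_tail_probability(5.905054, 16.98065, two_sided=False)
print(f"two-sided P = {p_two_sided:.3e}, one-sided P = {p_one_sided:.3e}")
```

The two-sided result should land close to the P-value reported in the gas mileage output, and the one-sided value is half of it.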
Critical values of P • Reject H0 if P is less than threshold • P < 0.05 commonly used • Arbitrary choice • Other values: 0.1, 0.01, 0.001 • Always report P, so others can draw own conclusions
Example: Gas Mileage Do “Small” cars have a different average gas mileage than “Compact” cars? Data on mileage of 13 small and 15 compact cars.
Gas mileage
Test Name: Welch Modified Two-Sample t-Test
Estimated Parameter(s): mean of x = 31, mean of y = 24.13333
Data: x: Small in DS2, and y: Compact in DS2
Test Statistic: t = 5.905054
Test Statistic Parameter: df = 16.98065
P-value: 0.00001738092
95 % Confidence Interval: LCL = 4.413064, UCL = 9.32027
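The reported numbers can be cross-checked against each other with a little arithmetic. The sketch below uses only values from the output above and assumes that H0 is "difference in means = 0", so that t equals the difference divided by its standard error.

```python
# Values copied from the reported output.
mean_small, mean_compact = 31.0, 24.13333
t_stat = 5.905054
lcl, ucl = 4.413064, 9.32027

diff = mean_small - mean_compact              # estimated difference in means
ci_midpoint = (lcl + ucl) / 2                 # a t interval is centred on the estimate
std_error = diff / t_stat                     # since t = diff / SE under H0: difference = 0
implied_t_crit = (ucl - lcl) / 2 / std_error  # half-width = critical t * SE

print(f"difference = {diff:.5f}, CI midpoint = {ci_midpoint:.5f}")
print(f"standard error = {std_error:.5f}, implied critical t = {implied_t_crit:.3f}")
```

The implied critical value is about 2.11, which is the two-sided 5% critical t for roughly 17 degrees of freedom. The interval excludes zero, in agreement with the very small P-value.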
Example: gas consumption • Which coefficients are different from zero? • Data from 36 years in US.
Gas consumption
                  Value  Std. Error  t value  Pr(>|t|)
(Intercept)     -0.0898      0.0508  -1.7687    0.0868
GasPrice        -0.0424      0.0098  -4.3058    0.0002
Income           0.0002      0.0000  23.4189    0.0000
New.Car.Price   -0.1014      0.0617  -1.6429    0.1105
Used.Car.Price  -0.0432      0.0241  -1.7913    0.0830
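Each t value in the table is the estimate divided by its standard error, and the last column is the two-sided P-value for H0: coefficient = 0. The sketch below recomputes the ratios from the rounded table entries and lists the terms with P below an assumed threshold of 0.05; small mismatches with the reported t values come from rounding in the table.

```python
# (Value, Std. Error, reported t value, reported Pr(>|t|)) copied from the table.
coefficients = {
    "(Intercept)":    (-0.0898, 0.0508, -1.7687, 0.0868),
    "GasPrice":       (-0.0424, 0.0098, -4.3058, 0.0002),
    "Income":         (0.0002, 0.0000, 23.4189, 0.0000),
    "New.Car.Price":  (-0.1014, 0.0617, -1.6429, 0.1105),
    "Used.Car.Price": (-0.0432, 0.0241, -1.7913, 0.0830),
}
ALPHA = 0.05  # assumed rejection threshold

recomputed_t = {}
significant = []
for name, (value, std_error, t_reported, p_value) in coefficients.items():
    # Income's standard error rounds to 0.0000, so its ratio cannot be recomputed.
    recomputed_t[name] = value / std_error if std_error > 0 else None
    if p_value < ALPHA:
        significant.append(name)

for name, t_value in recomputed_t.items():
    shown = "n/a" if t_value is None else f"{t_value:.4f}"
    print(f"{name:15s} recomputed t = {shown:>8s}  reported t = {coefficients[name][2]:.4f}")
print("P < 0.05:", significant)
```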
Interpreting model coefficients • Is there statistical evidence that the independent variable has an effect? • Is the parameter estimate significantly different from zero? • Is the coefficient large enough that the effect is important? • Must take into account the variation in the independent variable • Use linear measure of variation – SD, interquartile range, etc.
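One way to take the variation in the independent variable into account is to multiply the coefficient by a linear measure of spread such as the SD. The sketch below uses two coefficients from the gas consumption table; the SD values are hypothetical placeholders, since the deck does not report them, and the helper name `effect_per_sd` is likewise made up for illustration.

```python
def effect_per_sd(coefficient, predictor_sd):
    """Change in the response for a one-SD change in the predictor."""
    return coefficient * predictor_sd


# Coefficients from the gas consumption table.
COEFFICIENTS = {"GasPrice": -0.0424, "Income": 0.0002}
# HYPOTHETICAL standard deviations, chosen only to illustrate the calculation.
HYPOTHETICAL_SD = {"GasPrice": 0.8, "Income": 1500.0}

effects = {
    name: effect_per_sd(COEFFICIENTS[name], HYPOTHETICAL_SD[name])
    for name in COEFFICIENTS
}
for name, effect in effects.items():
    print(f"{name}: {effect:+.4f} per one SD of the predictor")
```

With these made-up SDs, the numerically tiny Income coefficient corresponds to the larger effect, which is why the raw size of a coefficient alone says little about importance.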
Types of error • Type I: reject null hypothesis when it’s really true • Desired level: α • Type II: fail to reject null hypothesis when it’s really false • Desired level: β • Is associated with a given effect size • E.g., want probability 0.1 of failing to reject H0 when the true difference between means is 0.35.
Controlling error levels • α is controlled by setting critical P-value • β is controlled by α, sample size, sample variance, effect size • Tradeoff between α and β • Need to balance costs associated with type I and type II errors • Power is 1-β
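The dependence of β on α, sample size, variance and effect size can be sketched with a normal approximation for a two-sided, two-sample comparison with equal group sizes and a known common SD. This is a rough approximation rather than an exact t-based power calculation. The effect size 0.35 and β = 0.1 come from the earlier example; the SD of 0.5 and the function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def approx_power(n_per_group, effect, sd, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided, two-sample test."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    std_error = sd * math.sqrt(2 / n_per_group)
    # Ignores the negligible chance of rejecting in the wrong direction.
    return _STD_NORMAL.cdf(effect / std_error - z_crit)


def n_per_group_for(effect, sd, alpha=0.05, beta=0.1):
    """Smallest per-group sample size giving type II error of at most beta."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(1 - beta)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)


HYPOTHETICAL_SD = 0.5  # assumed common standard deviation, not from the deck
n_required = n_per_group_for(effect=0.35, sd=HYPOTHETICAL_SD, alpha=0.05, beta=0.1)
print("n per group:", n_required)
print("power at that n:", round(approx_power(n_required, 0.35, HYPOTHETICAL_SD), 3))
# Tightening alpha at the same n lowers power: the alpha/beta tradeoff.
print("power with alpha = 0.01:", round(approx_power(n_required, 0.35, HYPOTHETICAL_SD, alpha=0.01), 3))
```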