Chapter 10
Chapter 10 Statistical Inference for Two Samples
Learning Objectives • Comparative experiments involving two samples • Test hypotheses on the difference in means of two normal distributions • Test hypotheses on the ratio of the variances or standard deviations of two normal distributions • Test hypotheses on the difference in two population proportions • Compute power, type II error probability, and make sample size decisions for two-sample tests • Explain and use the relationship between confidence intervals and hypothesis tests
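Assumptions • Interested in statistical inference on the difference in means μ1-μ2 of two normal distributions • Populations represented by X1 and X2, sampled independently with sample sizes n1 and n2 and known variances σ1² and σ2² • Expected value: E(X̄1 - X̄2) = μ1-μ2 • Variance: V(X̄1 - X̄2) = σ1²/n1 + σ2²/n2
Assumptions • Quantity Z = [X̄1 - X̄2 - (μ1-μ2)] / √(σ1²/n1 + σ2²/n2) • Has a N(0, 1) distribution • Used to form tests of hypotheses and confidence intervals on μ1-μ2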
Hypothesis Tests for a Difference in Means, Variances Known • Test whether the difference in means μ1-μ2 is equal to a specified value ∆0 • H0: μ1-μ2 = ∆0 • H1: μ1-μ2 ≠ ∆0 • Test statistic: Z0 = (X̄1 - X̄2 - ∆0) / √(σ1²/n1 + σ2²/n2)
Hypothesis Tests for a Difference in Means, Variances Known • Alternative Hypothesis • H1: μ1-μ2 ≠ ∆0 • Rejection Criterion • z0 > zα/2 or z0 < -zα/2 • H1: μ1-μ2 > ∆0 • Rejection Criterion • z0 > zα • H1: μ1-μ2 < ∆0 • Rejection Criterion • z0 < -zα
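The three rejection criteria above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `two_sample_z_test`, its argument names, and the numbers in the usage note are assumptions made for this sketch, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(xbar1, xbar2, sigma1, sigma2, n1, n2,
                      delta0=0.0, alternative="two-sided", alpha=0.05):
    """Test H0: mu1 - mu2 = delta0 with known variances.

    Returns (z0, p_value, reject), where reject applies the
    critical-region rule for the chosen alternative.
    """
    nd = NormalDist()  # standard normal N(0, 1)
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z0 = (xbar1 - xbar2 - delta0) / se

    if alternative == "two-sided":      # H1: mu1 - mu2 != delta0
        reject = abs(z0) > nd.inv_cdf(1 - alpha / 2)
        p_value = 2 * (1 - nd.cdf(abs(z0)))
    elif alternative == "greater":      # H1: mu1 - mu2 > delta0
        reject = z0 > nd.inv_cdf(1 - alpha)
        p_value = 1 - nd.cdf(z0)
    elif alternative == "less":         # H1: mu1 - mu2 < delta0
        reject = z0 < -nd.inv_cdf(1 - alpha)
        p_value = nd.cdf(z0)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z0, p_value, reject
```

With made-up inputs such as `two_sample_z_test(10.2, 10.0, 0.5, 0.5, 25, 25)`, the statistic is about 1.41, which falls inside (-1.96, 1.96), so H0 is not rejected at α = 0.05.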
Choice of Sample Size • Use of OC Curves • Use the OC curves in Appendix Charts VIa, VIb, VIc, and VId • Abscissa scale of the OC curves: d = |μ1-μ2-∆0| / √(σ1² + σ2²) = |∆-∆0| / √(σ1² + σ2²)
Choice of Sample Size • Two-sided Sample Size • Sample size n = n1 = n2 required to detect a true difference in means of ∆ with power at least 1-β: n ≈ (zα/2 + zβ)² (σ1² + σ2²) / (∆-∆0)² • Where ∆ is the true difference in means of interest • One-sided Sample Size: n = (zα + zβ)² (σ1² + σ2²) / (∆-∆0)²
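The two sample-size formulas can be evaluated directly. The following is a rough sketch, assuming equal sample sizes and rounding up to the next integer; the name `sample_size_for_power` and the example inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_power(delta, sigma1, sigma2, alpha=0.05, beta=0.10,
                          delta0=0.0, two_sided=True):
    """Common sample size n = n1 = n2 giving power of at least 1 - beta
    when the true difference in means is delta (variances known)."""
    z = NormalDist().inv_cdf
    # z_{alpha/2} for a two-sided alternative, z_alpha for a one-sided one
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(1 - beta)
    n = (z_alpha + z_beta) ** 2 * (sigma1**2 + sigma2**2) / (delta - delta0) ** 2
    return ceil(n)  # round up so the power requirement is met
```

For illustrative values ∆ = 1, σ1 = σ2 = 2, α = 0.05 and β = 0.10, the two-sided formula gives n of about 84.1, so 85 observations per sample, while the one-sided formula needs fewer. The two-sided expression is an approximation that neglects the small second term of the β formula.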
Type II Error • Follows the single-sample case • Two-sided alternative: β = Φ(zα/2 - (∆-∆0)/√(σ1²/n1 + σ2²/n2)) - Φ(-zα/2 - (∆-∆0)/√(σ1²/n1 + σ2²/n2))
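As a hedged sketch, the two-sided type II error probability can be computed with the standard normal CDF from Python's standard library. The function name `beta_two_sided` and the sample inputs are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def beta_two_sided(delta, sigma1, sigma2, n1, n2, alpha=0.05, delta0=0.0):
    """P(fail to reject H0) for the two-sided z-test when the true
    difference in means is delta and the variances are known."""
    nd = NormalDist()
    z_half = nd.inv_cdf(1 - alpha / 2)
    # standardized distance between the true and hypothesized difference
    shift = (delta - delta0) / sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return nd.cdf(z_half - shift) - nd.cdf(-z_half - shift)


# Power is the complement of beta (illustrative inputs only).
power = 1 - beta_two_sided(1.0, 2.0, 2.0, 85, 85)
```

A quick sanity check on the formula: when the true difference equals ∆0, the expression reduces to 1 - α, the probability of correctly not rejecting H0.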
C.I. on a Difference in Means, Variances Known, and Choice of Sample Size • Confidence Interval • 100(1-α)% C.I. on the difference in two means μ1-μ2: x̄1 - x̄2 - zα/2 √(σ1²/n1 + σ2²/n2) ≤ μ1-μ2 ≤ x̄1 - x̄2 + zα/2 √(σ1²/n1 + σ2²/n2)
Choice of Sample Size • To keep the error in estimating μ1-μ2 by x̄1 - x̄2 below E at 100(1-α)% confidence, use n = n1 = n2 with n = (zα/2 / E)² (σ1² + σ2²)
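A minimal sketch of the interval and of the sample size needed for a target error E follows. The function names and the numbers in the usage note are hypothetical; only the formulas come from the slides.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_confidence_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, conf=0.95):
    """Two-sided confidence interval on mu1 - mu2 with known variances."""
    z_half = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z_half * sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin


def sample_size_for_error(E, sigma1, sigma2, conf=0.95):
    """Common n = n1 = n2 so that |(xbar1 - xbar2) - (mu1 - mu2)| < E
    at the stated confidence level."""
    z_half = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((z_half / E) ** 2 * (sigma1**2 + sigma2**2))
```

For instance, with σ1 = σ2 = 2 and E = 1 the formula gives roughly 30.7, so 31 observations per sample would be used.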
Example • Two machines are used for filling plastic bottles with a net volume of 16.0 ounces • The fill volume can be assumed normal, with standard deviations σ1 = 0.020 and σ2 = 0.025 ounces • A member of the quality engineering staff suspects that both machines fill to the same mean net volume, whether or not this volume is 16.0 ounces. A random sample of 10 bottles is taken from the output of each machine as follows
Questions • 1. Do you think the engineer is correct? Use α = 0.05 • 2. What is the P-value for this test? • 3. What is the power of the test in part (1) for a true difference in means of 0.04? • 4. Find a 95% confidence interval on the difference in means. Provide a practical interpretation of this interval. • 5. Assuming equal sample sizes, what sample size should be used to assure that β = 0.05 if the true difference in means is 0.04? Assume that α = 0.05
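Solution-Part 1 • Parameter of interest is the difference in mean fill volume, μ1-μ2 • H0: μ1-μ2 = 0 or μ1 = μ2 • H1: μ1-μ2 ≠ 0 or μ1 ≠ μ2 • α = 0.05 • The test statistic is z0 = (x̄1 - x̄2 - ∆0) / √(σ1²/n1 + σ2²/n2) • Reject H0 if z0 < -zα/2 = -1.96 or z0 > zα/2 = 1.96 • x̄1 = 16.015, x̄2 = 16.005, ∆0 = 0, σ1 = 0.020, σ2 = 0.025, n1 = 10, and n2 = 10, so z0 = 0.010/0.0101 = 0.99 • Since -1.96 < 0.99 < 1.96, do not reject the null hypothesis; at α = 0.05 there is no evidence that the mean fill volumes differ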
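Solution-Part 2 and 3 2. P-value = 2[1 - Φ(0.99)] = 2(1 - 0.8389) = 0.3222 3. β = Φ(zα/2 - ∆/√(σ1²/n1 + σ2²/n2)) - Φ(-zα/2 - ∆/√(σ1²/n1 + σ2²/n2)). Taking ∆ = 0.08 (the value used in part 5), β = Φ(-5.94) - Φ(-9.86) = 0 - 0 = 0. Hence, the power = 1 - 0 = 1. For ∆ = 0.04 as posed in question 3, β = Φ(-1.99) - Φ(-5.91) = 0.0233 and the power = 0.9767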
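Solution-Part 4 4. Confidence interval: x̄1 - x̄2 - zα/2 √(σ1²/n1 + σ2²/n2) ≤ μ1-μ2 ≤ x̄1 - x̄2 + zα/2 √(σ1²/n1 + σ2²/n2), that is, 0.010 - 1.96(0.0101) ≤ μ1-μ2 ≤ 0.010 + 1.96(0.0101), so -0.0098 ≤ μ1-μ2 ≤ 0.0298. With 95% confidence, we believe the true difference in the mean fill volumes is between -0.0098 and 0.0298 ounces. Since 0 is contained in this interval, we can conclude there is no significant difference between the means.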
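Solution-Part 5 5. Assume the sample sizes are to be equal, use α = 0.05, β = 0.05, and ∆ = 0.08: n ≈ (zα/2 + zβ)² (σ1² + σ2²) / ∆² = (1.96 + 1.645)² (0.020² + 0.025²) / (0.08)² = 2.08. Hence, n = 3, use n1 = n2 = 3. For ∆ = 0.04 as posed in question 5, the same formula gives n = 8.33, so use n1 = n2 = 9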
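The five parts of the bottle-filling solution can be re-derived numerically. The sketch below uses the summary values quoted in the solution (the raw bottle data are not reproduced in this transcript) and evaluates the power and sample size for both ∆ = 0.04, the value in the questions, and ∆ = 0.08, the value that appears in Part 5. Variable and function names are illustrative assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist

nd = NormalDist()

# Summary values quoted in the solution
xbar1, xbar2 = 16.015, 16.005
sigma1, sigma2 = 0.020, 0.025
n1 = n2 = 10
alpha, beta_target = 0.05, 0.05

se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
z_half = nd.inv_cdf(1 - alpha / 2)

# Parts 1 and 2: test statistic and two-sided P-value
z0 = (xbar1 - xbar2) / se
p_value = 2 * (1 - nd.cdf(abs(z0)))

# Part 4: 95% confidence interval on mu1 - mu2
ci = (xbar1 - xbar2 - z_half * se, xbar1 - xbar2 + z_half * se)


def power(delta):
    """Part 3: power of the two-sided test for a true difference delta."""
    shift = delta / se
    beta = nd.cdf(z_half - shift) - nd.cdf(-z_half - shift)
    return 1 - beta


def n_required(delta):
    """Part 5: equal sample size giving beta = beta_target at delta."""
    z_beta = nd.inv_cdf(1 - beta_target)
    return ceil((z_half + z_beta) ** 2 * (sigma1**2 + sigma2**2) / delta**2)


results = {delta: (power(delta), n_required(delta)) for delta in (0.04, 0.08)}
```

Because the code carries unrounded intermediate values, its P-value differs slightly from one computed with z0 rounded to 0.99.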
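Hypothesis Tests for a Difference in Means, Variances Unknown • Tests of hypotheses on the difference in means μ1-μ2 of two normal distributions • If n1 and n2 both exceed 40, the normal-based procedures can be used (central limit theorem), with the sample variances in place of σ1² and σ2² • Otherwise base the hypothesis tests and C.I. on the t distribution • Two cases for the variances
Case I: σ1² = σ2² = σ²: Pooled Test • Two normal populations with unknown means and unknown but equal variances • Expected value: E(X̄1 - X̄2) = μ1-μ2 • Form an estimator of σ² • Pooled estimator of σ², denoted by Sp²: Sp² = [(n1-1)S1² + (n2-1)S2²] / (n1+n2-2) • Test statistic: T = [X̄1 - X̄2 - (μ1-μ2)] / (Sp √(1/n1 + 1/n2)), which has a t distribution with n1+n2-2 degrees of freedom
Hypothesis Tests • Test hypothesis • H0: μ1-μ2 = ∆0 • H1: μ1-μ2 ≠ ∆0 • Test statistic: T0 = (X̄1 - X̄2 - ∆0) / (Sp √(1/n1 + 1/n2)) • Where Sp is the pooled estimator of σ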
Critical Regions • Alternative Hypothesis • H1: μ1-μ2 ≠ ∆0 • Rejection Criterion • t0 > tα/2,n1+n2-2 or • t0 < -tα/2,n1+n2-2 • H1: μ1-μ2 > ∆0 • Rejection Criterion • t0 > tα,n1+n2-2 • H1: μ1-μ2 < ∆0 • Rejection Criterion • t0 < -tα,n1+n2-2
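Case II: σ1² ≠ σ2² • Not able to assume that the unknown variances σ1² and σ2² are equal • Test statistic: T0* = (X̄1 - X̄2 - ∆0) / √(S1²/n1 + S2²/n2) • Approximately t distributed with v degrees of freedom, where v is computed from the sample variances • Critical regions • Identical to Case I • Degrees of freedom n1+n2-2 are replaced by v
Confidence Interval on the Difference in Means • Case σ1² = σ2² • 100(1-α)% CI on the difference in means μ1-μ2: x̄1 - x̄2 - tα/2,n1+n2-2 sp √(1/n1 + 1/n2) ≤ μ1-μ2 ≤ x̄1 - x̄2 + tα/2,n1+n2-2 sp √(1/n1 + 1/n2) • Case σ1² ≠ σ2² • 100(1-α)% CI on the difference in means μ1-μ2: x̄1 - x̄2 - tα/2,v √(s1²/n1 + s2²/n2) ≤ μ1-μ2 ≤ x̄1 - x̄2 + tα/2,v √(s1²/n1 + s2²/n2)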
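A sketch of the unequal-variance statistic follows. The slide's own expression for v is not reproduced in this transcript, so the code assumes the widely used Welch-Satterthwaite approximation; some textbook editions use a slightly different formula. The function name `welch_t` and the inputs in the usage note are hypothetical.

```python
from math import sqrt


def welch_t(xbar1, s1_sq, n1, xbar2, s2_sq, n2, delta0=0.0):
    """Return (t0, v) for H0: mu1 - mu2 = delta0 without assuming
    equal variances. s1_sq and s2_sq are the sample variances."""
    v1 = s1_sq / n1   # estimated variance of xbar1
    v2 = s2_sq / n2   # estimated variance of xbar2
    t0 = (xbar1 - xbar2 - delta0) / sqrt(v1 + v2)
    # Welch-Satterthwaite degrees of freedom (assumed form, see lead-in)
    v = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t0, v
```

With equal sample sizes and equal sample variances, v reduces to 2(n - 1), the same degrees of freedom as the pooled test. In practice v is usually rounded down before the t critical value is looked up.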
Example • The diameter of steel rods manufactured on two different extrusion machines is being investigated • Two random samples of sizes n1 = 15 and n2 = 17 are selected, and the sample means and sample variances are x̄1 = 8.73, s1² = 0.35, x̄2 = 8.68, and s2² = 0.40, respectively • Assume that σ1² = σ2² and that the data are drawn from a normal distribution • 1. Is there evidence to support the claim that the two machines produce rods with different mean diameters? Use α = 0.05 in arriving at this conclusion • 2. Find the P-value for the t-statistic you calculated in part (1) • 3. Construct a 95% confidence interval for the difference in mean rod diameter. Interpret this interval
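Solution 1. Parameter of interest, the difference in mean rod diameter, μ1-μ2 2. H0: μ1-μ2 = 0 or μ1 = μ2 3. H1: μ1-μ2 ≠ 0 or μ1 ≠ μ2 4. α = 0.05 5. Test statistic is t0 = (x̄1 - x̄2 - ∆0) / (sp √(1/n1 + 1/n2)) 6. Reject the null hypothesis if t0 < -tα/2,n1+n2-2 where -t0.025,30 = -2.042 or t0 > tα/2,n1+n2-2 where t0.025,30 = 2.042 7. x̄1 = 8.73, x̄2 = 8.68, ∆0 = 0, s1² = 0.35, s2² = 0.40, n1 = 15, and n2 = 17, so sp = √[(14(0.35) + 16(0.40))/30] = 0.614 and t0 = 0.05 / (0.614 √(1/15 + 1/17)) = 0.230
Solution 8. Since -2.042 < 0.230 < 2.042, do not reject the null hypothesis; at α = 0.05 there is insufficient evidence that the two machines produce rods with different mean diameters
Solution-Cont. • P-value = 2P(t > 0.230) > 2(0.40), so P-value > 0.80 • 95% confidence interval, with t0.025,30 = 2.042: 0.05 - 2.042(0.614)√(1/15 + 1/17) ≤ μ1-μ2 ≤ 0.05 + 2.042(0.614)√(1/15 + 1/17), that is, -0.394 ≤ μ1-μ2 ≤ 0.494 • Since zero is contained in this interval, the data do not show a significant difference between the mean rod diameters of machine 1 and machine 2 at the 5% level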
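The pooled calculation for the steel-rod example can be checked with a few lines of Python. This sketch takes the critical value t0.025,30 = 2.042 from the solution as a table lookup, since the standard library has no t distribution; variable names are illustrative.

```python
from math import sqrt

# Summary statistics from the example
n1, n2 = 15, 17
xbar1, xbar2 = 8.73, 8.68
s1_sq, s2_sq = 0.35, 0.40
t_crit = 2.042  # t_{0.025,30}, table value quoted in the solution

# Pooled estimate of the common standard deviation
sp = sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
se = sp * sqrt(1 / n1 + 1 / n2)

# Test statistic for H0: mu1 - mu2 = 0 and the two-sided decision
t0 = (xbar1 - xbar2) / se
reject_h0 = abs(t0) > t_crit

# 95% confidence interval on mu1 - mu2
ci = (xbar1 - xbar2 - t_crit * se, xbar1 - xbar2 + t_crit * se)
```

The interval straddles zero, which agrees with the decision not to reject H0.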
Paired t Test • Special case of the two-sample t-tests • Used when the observations are collected in pairs • Each pair of observations is taken under homogeneous conditions • Conditions may change from one pair to another • Let Dj = X1j - X2j be the difference within pair j, with mean μD = μ1-μ2 • Testing • H0: μD = ∆0 • H1: μD ≠ ∆0
Paired t Test • Test statistic: T0 = (D̄ - ∆0) / (SD/√n) • D̄ is the sample average of the n differences and SD is their sample standard deviation • Rejection Region • t0 > tα/2,n-1 or t0 < -tα/2,n-1 • 100(1-α)% C.I. on the difference in means μD = μ1-μ2: d̄ - tα/2,n-1 sD/√n ≤ μD ≤ d̄ + tα/2,n-1 sD/√n
Example • Ten individuals have participated in a diet-modification program to stimulate weight loss • Their weight both before and after participation in the program is shown in the following list • Is there evidence to support the claim that this particular diet-modification program is effective in producing a mean weight reduction? Use α=0.05.
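Solution 1. Parameter of interest is the difference in mean weight, μd, where di = Weight Before - Weight After 2. H0: μd = 0 3. H1: μd > 0 4. α = 0.05 5. Test statistic is t0 = d̄ / (sd/√n) 6. Reject the null hypothesis if t0 > tα,n-1 where t0.05,9 = 1.833 7. d̄ = 17, sd = 6.41, n = 10, so t0 = 17 / (6.41/√10) = 8.387 8. Since 8.387 > 1.833, reject the null hypothesis and conclude that the program produces a mean weight reduction at α = 0.05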
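A short sketch of the paired calculation is given below. The helper `paired_t_statistic` is a hypothetical function that works from raw before/after lists; the individual weights are not reproduced in this transcript, so the example itself is checked from the summary values d̄ = 17 and sd = 6.41, with t0.05,9 = 1.833 taken from the solution.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after, delta0=0.0):
    """t0 for H0: mu_D = delta0, where D = before - after for each pair."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    # stdev() is the sample standard deviation (divides by n - 1)
    return (mean(diffs) - delta0) / (stdev(diffs) / sqrt(n))


# Diet example, using the summary statistics quoted in the solution
d_bar, s_d, n = 17.0, 6.41, 10
t_crit = 1.833                      # t_{0.05,9}, one-sided table value
t0 = (d_bar - 0.0) / (s_d / sqrt(n))
reject_h0 = t0 > t_crit             # H1: mu_D > 0
```

Pairing removes the person-to-person variation in weight from the comparison, which is why the statistic is built from the differences rather than from the two samples separately.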
Inferences on the Variances of Two Normal Populations • Both populations are normal and independent • Test the hypotheses • H0: σ1² = σ2² • H1: σ1² ≠ σ2² • Requires a new probability distribution, the F distribution
The F Distribution • Define the random variable F as the ratio of two independent chi-square random variables, each divided by its number of degrees of freedom • F = (W/u) / (Y/v), where W and Y are independent chi-square random variables with u and v degrees of freedom • Follows the F distribution with u degrees of freedom in the numerator and v degrees of freedom in the denominator • Usually abbreviated as Fu,v
The F Distribution • Shape of the pdf is determined by the two degrees of freedom, u and v • Table V provides the percentage points of the F distribution • Note that f1-α,u,v = 1/fα,v,u
Hypothesis Tests on the Ratio of Two Variances • Suppose H0: σ1² = σ2² • S1² and S2² are the sample variances • Test statistic • F0 = S1² / S2² • Suppose H1: σ1² ≠ σ2² • Rejection Criterion • f0 > fα/2,n1-1,n2-1 or f0 < f1-α/2,n1-1,n2-1
Example • Two chemical companies can supply a raw material. • The concentration of a particular element in this material is important. • The mean concentration for both suppliers is the same, but we suspect that the variability in concentration may differ between the two companies • The standard deviation of concentration in a random sample of n1=10 batches produced by company 1 is s1=4.7 grams per liter, while for company 2, a random sample of n2=16 batches yields s2=5.8 grams per liter. • Is there sufficient evidence to conclude that the two population variances differ? Use α=0.05.
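Solution 1. Parameters of interest are the variances of concentration, σ1² and σ2² 2. H0: σ1² = σ2² 3. H1: σ1² ≠ σ2² 4. α = 0.05 5. Test statistic is f0 = s1² / s2² 6. Reject the null hypothesis if f0 < f0.975,9,15 where f0.975,9,15 = 0.265 or f0 > f0.025,9,15 where f0.025,9,15 = 3.12 7. n1 = 10, n2 = 16, s1 = 4.7, and s2 = 5.8, so f0 = (4.7)² / (5.8)² = 0.657 8. Since 0.265 < 0.657 < 3.12, do not reject the null hypothesis; there is insufficient evidence that the two population variances differ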
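The variance-ratio test for the two suppliers reduces to one division and two comparisons. The sketch below takes both critical values from the solution as table lookups (the standard library has no F distribution); variable names are illustrative.

```python
# Sample standard deviations and sizes from the example
s1, n1 = 4.7, 10
s2, n2 = 5.8, 16

# Critical values quoted in the solution for alpha = 0.05
f_lower = 0.265   # f_{0.975, 9, 15}
f_upper = 3.12    # f_{0.025, 9, 15}

# Test statistic: ratio of the sample variances
f0 = s1**2 / s2**2

# Two-sided rejection rule
reject_h0 = f0 < f_lower or f0 > f_upper

# By the identity f_{1-a,u,v} = 1 / f_{a,v,u}, the lower point implies
# an upper point f_{0.025, 15, 9} of about 1 / 0.265
implied_f_upper_15_9 = 1 / f_lower
```

The reciprocal identity is what allows a table of upper-tail percentage points alone to supply the lower critical value.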
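Hypothesis Tests on Two Population Proportions • Suppose two binomial parameters of interest, p1 and p2 • Large-Sample Test of H0: p1 = p2 • Test statistic: Z0 = (P̂1 - P̂2) / √(P̂(1-P̂)(1/n1 + 1/n2)), where P̂1 = X1/n1, P̂2 = X2/n2, and P̂ = (X1+X2)/(n1+n2) is the pooled estimate of the common proportion • Critical regions: as for the z-test on means, z0 > zα/2 or z0 < -zα/2 for H1: p1 ≠ p2; z0 > zα for H1: p1 > p2; z0 < -zα for H1: p1 < p2
β-Error • If H1 is two sided, the β-error is β = Φ[(zα/2 √(p̄q̄(1/n1 + 1/n2)) - (p1-p2)) / σ] - Φ[(-zα/2 √(p̄q̄(1/n1 + 1/n2)) - (p1-p2)) / σ] • Where p̄ = (n1p1 + n2p2)/(n1+n2), q̄ = 1 - p̄, and σ = √(p1(1-p1)/n1 + p2(1-p2)/n2) is the standard deviation of P̂1 - P̂2
Confidence Interval on the Difference in Population Proportions • Two-sided 100(1-α)% C.I. on the difference in the true proportions p1-p2: p̂1 - p̂2 - zα/2 √(p̂1(1-p̂1)/n1 + p̂2(1-p̂2)/n2) ≤ p1-p2 ≤ p̂1 - p̂2 + zα/2 √(p̂1(1-p̂1)/n1 + p̂2(1-p̂2)/n2)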
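A minimal sketch of the large-sample interval on p1-p2 is shown below. The function name `proportion_diff_ci` and the counts in the usage note are assumptions for illustration. The interval uses the unpooled standard error, since it does not assume p1 = p2.

```python
from math import sqrt
from statistics import NormalDist


def proportion_diff_ci(x1, n1, x2, n2, conf=0.95):
    """Approximate two-sided confidence interval on p1 - p2,
    given x1 successes in n1 trials and x2 successes in n2 trials."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    z_half = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    # Unpooled standard error of p1_hat - p2_hat
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z_half * se, diff + z_half * se
```

For made-up counts of 30 out of 200 and 20 out of 200, the interval runs from roughly -0.015 to 0.115 and therefore contains zero.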
Example • Two different types of injection-molding machines are used to form plastic parts. A part is considered defective if it has excessive shrinkage or is discolored • Two random samples, each of size 300, are selected, and 15 defective parts are found in the sample from machine 1 while 8 defective parts are found in the sample from machine 2 • Is it reasonable to conclude that both machines produce the same fraction of defective parts, using α=0.05?
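Solution • Parameters of interest are the proportions of defective parts, p1 and p2 • H0: p1 = p2 • H1: p1 ≠ p2 • α = 0.05 • Test statistic is z0 = (p̂1 - p̂2) / √(p̂(1-p̂)(1/n1 + 1/n2)), where p̂ = (x1+x2)/(n1+n2) • Reject the null hypothesis if z0 < -zα/2 where -z0.025 = -1.96 or z0 > zα/2 where z0.025 = 1.96 • n1 = 300, n2 = 300, x1 = 15, x2 = 8, p̂1 = 0.05, p̂2 = 0.0267, p̂ = 23/600 = 0.0383, so z0 = 1.49
Solution-Cont • Since -1.96 < 1.49 < 1.96, do not reject the null hypothesis; it is reasonable to conclude that both machines produce the same fraction of defective parts • P-value = 2(1 - P(z < 1.49)) = 0.13622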
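The injection-molding calculation can be reproduced as follows. This is a sketch using only the counts given in the example; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

nd = NormalDist()

# Counts from the example
n1, x1 = 300, 15
n2, x2 = 300, 8
alpha = 0.05

p1_hat = x1 / n1
p2_hat = x2 / n2
p_pool = (x1 + x2) / (n1 + n2)   # pooled estimate under H0: p1 = p2

# Large-sample test statistic
z0 = (p1_hat - p2_hat) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

# Two-sided decision and P-value
z_half = nd.inv_cdf(1 - alpha / 2)
reject_h0 = abs(z0) > z_half
p_value = 2 * (1 - nd.cdf(abs(z0)))
```

The code keeps z0 unrounded, so its P-value differs in the third decimal place from the slide's 0.13622, which is based on z0 rounded to 1.49.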