Week 10 Comparing Two Means or Proportions
Numerical measurements: means
• What is the difference in average weight loss for those who diet compared to those who exercise to lose weight?
• What difference is there between the mean foot lengths of men and women?
• Population parameter: μ2 – μ1 = difference between population means
• Sample estimate: x̄2 – x̄1 = difference between sample means
Categorical measurements: proportions
• What is the difference between the proportions that would quit smoking if taking the antidepressant bupropion (Zyban) versus wearing a nicotine patch?
• What is the difference between the proportions with heart disease among men who snore and men who don't snore?
• Population parameter: p2 – p1 = difference between population proportions
• Sample estimate: p̂2 – p̂1 = difference between sample proportions
Requirement: independent samples
Two samples are called independent samples when the measurements in one sample are not related to the measurements in the other sample.
• Random samples taken separately from 2 populations
• Randomised experiment with 2 treatments
• One random sample, but a categorical variable splits individuals into 2 groups
Model for numerical data
• Sample 1 ~ population with mean μ1, s.d. σ1; Sample 2 ~ population with mean μ2, s.d. σ2
• Estimation: estimate (μ2 – μ1) with (x̄2 – x̄1)
• Standard error? Confidence interval?
• Testing: is (μ2 – μ1) zero? p-value
Model for categorical data
• Sample 1 ~ population with proportion p1; Sample 2 ~ population with proportion p2
• Estimation: estimate (p2 – p1) with (p̂2 – p̂1)
• Standard error? Confidence interval?
• Testing: is (p2 – p1) zero? p-value
Distribution of difference
• In both cases, we need to find the distribution of a difference: (p̂2 – p̂1) or (x̄2 – x̄1)
• Independent samples ⇒ a difference of independent random variables
• We already know the distributions of the two parts; what is the distribution of their difference?
Sum of 2 variables
For independent variables, means add and variances add:
mean(X1 + X2) = μ1 + μ2, s.d.(X1 + X2) = √(σ1² + σ2²)
• Same distributions: this is the result already used for the sample mean and the sample total
• Different distributions: the same rules still apply
Difference between 2 variables
• mean(X1 – X2) = μ1 – μ2, but the standard deviation is the same as for the sum: s.d.(X1 – X2) = √(σ1² + σ2²)
• If X1 and X2 are normal, so is X1 – X2
• Remember that X1 and X2 must be independent
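As a quick check of these rules, the following simulation (a minimal sketch using numpy, not part of the original slides; the numbers preview the husband/wife example on the next slide) shows that the standard deviation of a difference of independent variables matches √(σ1² + σ2²):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two independent normal variables (same numbers as the heights example below)
x1 = rng.normal(loc=1.85, scale=0.10, size=100_000)
x2 = rng.normal(loc=1.70, scale=0.08, size=100_000)

diff = x1 - x2
print(diff.mean())                  # close to 1.85 - 1.70 = 0.15
print(diff.std())                   # close to the theoretical value below
print(np.sqrt(0.10**2 + 0.08**2))   # sqrt(sigma1^2 + sigma2^2) ≈ 0.128
```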
Example
• Husband height ~ normal(1.85, 0.1); Wife height ~ normal(1.7, 0.08)
• Assume independent. (Probably not!!)
• Prob that wife is taller than husband?
• (Husband – Wife) ~ normal(1.85 – 1.7, √(0.1² + 0.08²)) = normal(0.15, 0.128)
Example
• Husband height ~ normal(1.85, 0.1); Wife height ~ normal(1.7, 0.08)
• Husband – Wife ~ normal(0.15, 0.128)
[Figure: normal curve of the difference, centred at 0.15, with the area below 0 shaded]
• P(diff ≤ 0) = P(Z ≤ (0 – 0.15)/0.128) = P(Z ≤ –1.17) ≈ 0.12
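The same probability can be checked numerically; a minimal sketch, assuming scipy is available:

```python
from math import sqrt
from scipy.stats import norm

mean_diff = 1.85 - 1.70               # mean of (husband - wife) = 0.15
sd_diff = sqrt(0.10**2 + 0.08**2)     # sd of the difference ≈ 0.128

# P(husband - wife <= 0), i.e. the wife is at least as tall as the husband
prob = norm.cdf(0, loc=mean_diff, scale=sd_diff)
print(round(prob, 3))                 # ≈ 0.121
```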
Difference between proportions
• If X1 and X2 are independent, var(X1 – X2) = var(X1) + var(X2)
• If p̂1 and p̂2 are independent, mean(p̂2 – p̂1) = p2 – p1 and s.d.(p̂2 – p̂1) = √(p1(1 – p1)/n1 + p2(1 – p2)/n2)
• For large samples, p̂1 and p̂2 are approx normal, so their difference is too.
Std error for difference in proportions
Nicotine patches vs antidepressant (Zyban)?
n1 = n2 = 244 randomly assigned to each treatment
Zyban: 85 out of 244 quit smoking; Patch: 52 out of 244 quit smoking
So, s.e.(p̂1 – p̂2) = √(0.348 × 0.652/244 + 0.213 × 0.787/244) ≈ 0.040
Approximate 95% C.I.
For sufficiently large samples, the interval
Estimate ± 2 × Standard error
is an approximate 95% C.I.
• Best you can do for a difference between proportions
• For means, the CI can be improved by replacing '2' by a different value.
Patch vs Antidepressant
Study: n1 = n2 = 244 randomly assigned to each group
Zyban: 85 of the 244 Zyban users quit smoking, p̂1 = .348
Patch: 52 of the 244 patch users quit smoking, p̂2 = .213
So, p̂1 – p̂2 = .135 with s.e. ≈ .040
Approx 95% C.I.: .135 ± 2(.040) => .135 ± .080 => .055 to .215
We are 95% confident that Zyban improves the probability of quitting smoking by between 5.5 and 21.5 percentage points compared with the patch.
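A sketch of the same standard-error and interval calculation in Python, using the counts from the study above (plain Python only, nothing beyond the standard library):

```python
from math import sqrt

n1, quit1 = 244, 85        # Zyban group: quitters out of 244
n2, quit2 = 244, 52        # patch group: quitters out of 244

p1_hat = quit1 / n1        # ≈ 0.348
p2_hat = quit2 / n2        # ≈ 0.213

se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)   # ≈ 0.040

diff = p1_hat - p2_hat     # ≈ 0.135
print(diff - 2 * se, diff + 2 * se)   # ≈ 0.055 to 0.216 (matches the slide after rounding)
```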
Difference between means
• If x̄1 and x̄2 are independent, mean(x̄2 – x̄1) = μ2 – μ1
• If x̄1 and x̄2 are independent, s.d.(x̄2 – x̄1) = √(σ1²/n1 + σ2²/n2), estimated by s.e. = √(s1²/n1 + s2²/n2)
• If both populations are normal, so is the difference.
Std error for difference in means
Lose More Weight by Diet or Exercise?
n1 = 42 men on diet, n2 = 47 men on exercise routine
Diet: lost an average of 7.2 kg with std dev of 3.7 kg; Exercise: lost an average of 4.0 kg with std dev of 3.9 kg
So, s.e.(x̄1 – x̄2) = √(3.7²/42 + 3.9²/47) ≈ 0.81
Diet vs Exercise
Study: n1 = 42 men on diet, n2 = 47 men on exercise
Diet: lost an average of 7.2 kg with std dev of 3.7 kg; Exercise: lost an average of 4.0 kg with std dev of 3.9 kg
So, x̄1 – x̄2 = 3.2 kg with s.e. ≈ 0.81
Approximate 95% Confidence Interval: 3.2 ± 2(.81) => 3.2 ± 1.62 => 1.58 to 4.82 kg
We are 95% confident that those who diet lose on average between 1.58 and 4.82 kg more than those who exercise.
Better C.I. for the difference in means
A CI for the difference between two means (independent samples):
(x̄1 – x̄2) ± t* × √(s1²/n1 + s2²/n2), where t* is a value from t-tables.
• d.f. = min(n1 – 1, n2 – 1)
• Welch's approximation gives a different (higher) d.f., but it is a complicated formula
• t* is approx 1.96 if the d.f. is high
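A sketch of both intervals for the diet vs exercise numbers above (assuming scipy is available for the t table; the conservative df = min(n1 – 1, n2 – 1) is used here, not Welch's):

```python
from math import sqrt
from scipy.stats import t

n1, mean1, sd1 = 42, 7.2, 3.7    # diet group
n2, mean2, sd2 = 47, 4.0, 3.9    # exercise group

diff = mean1 - mean2                           # 3.2 kg
se = sqrt(sd1**2 / n1 + sd2**2 / n2)           # ≈ 0.81

# Quick approximate 95% CI: estimate +/- 2 standard errors
print(diff - 2 * se, diff + 2 * se)            # ≈ 1.59 to 4.81

# Better CI: replace 2 with t*, using the conservative df = min(n1 - 1, n2 - 1)
df = min(n1 - 1, n2 - 1)                       # 41
t_star = t.ppf(0.975, df)                      # ≈ 2.02
print(diff - t_star * se, diff + t_star * se)  # ≈ 1.57 to 4.83
```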
Effect of a stare on driving
Randomized experiment: researchers either stared or did not stare at drivers stopped at a campus stop sign, and timed how long (in seconds) it took the driver to proceed from the sign to a mark on the other side of the intersection.
No Stare Group (n = 14): 8.3, 5.5, 6.0, 8.1, 8.8, 7.5, 7.8, 7.1, 5.7, 6.5, 4.7, 6.9, 5.2, 4.7
Stare Group (n = 13): 5.6, 5.0, 5.7, 6.3, 6.5, 5.8, 4.5, 6.1, 4.8, 4.9, 4.5, 7.2, 5.8
Estimate the difference between the mean crossing times.
Checking data • No outliers; no strong skewness. • Crossing times in stare group seem faster & less variable.
Effect of stare on driving
Using df = min(n1–1, n2–1) = 12 gives t* = 2.179
A 95% CI for the difference in means (no-stare minus stare) is 1.04 ± 2.179 × 0.43, i.e. roughly 0.10 to 1.97 sec
Effect of stare on driving (Minitab output)
N.B. The C.I. is based on df = 21 (Welch's approximation)
• Slightly narrower C.I. than we got with d.f. = 12.
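A sketch showing where both intervals come from, computed from the raw times (Welch's df formula below is the standard one; small rounding differences from the Minitab output quoted on the slides are expected):

```python
import numpy as np
from scipy.stats import t

no_stare = np.array([8.3, 5.5, 6.0, 8.1, 8.8, 7.5, 7.8, 7.1,
                     5.7, 6.5, 4.7, 6.9, 5.2, 4.7])
stare = np.array([5.6, 5.0, 5.7, 6.3, 6.5, 5.8, 4.5, 6.1,
                  4.8, 4.9, 4.5, 7.2, 5.8])

n1, n2 = len(no_stare), len(stare)
v1, v2 = no_stare.var(ddof=1), stare.var(ddof=1)   # sample variances
diff = no_stare.mean() - stare.mean()              # ≈ 1.04 sec
se = np.sqrt(v1 / n1 + v2 / n2)                    # ≈ 0.43

# Conservative approach: df = min(n1 - 1, n2 - 1) = 12
t12 = t.ppf(0.975, min(n1 - 1, n2 - 1))
print(diff - t12 * se, diff + t12 * se)            # ≈ 0.10 to 1.97

# Welch's approximation for the df (about 21, as in the Minitab output)
df_w = (v1 / n1 + v2 / n2) ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
t_w = t.ppf(0.975, df_w)
print(df_w, diff - t_w * se, diff + t_w * se)      # df ≈ 21.6, CI ≈ 0.14 to 1.93 (narrower)
```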
Interpretation
A 95% CI for the difference in means (no-stare minus stare) is 0.17 to 1.91 sec
• We are 95% confident that it takes drivers between 0.17 and 1.91 seconds less on average to cross the intersection if someone stares at them.
Testing two proportions
• Hypotheses: H0: p1 – p2 = 0; HA: p1 – p2 ≠ 0 or p1 – p2 < 0 or p1 – p2 > 0
Watch how Population 1 and 2 are defined.
• Data requirements
• Independent samples
• n1p̂1, n1(1 – p̂1), n2p̂2, n2(1 – p̂2) all at least 5, preferably ≥ 10
Test statistic
• Based on p̂1 – p̂2
• Standardise: z = ((p̂1 – p̂2) – 0) / s.e.(p̂1 – p̂2)
Test statistic
• If H0 is true, the best estimate of the common proportion is the pooled estimate p̂ = (total successes in both samples) / (n1 + n2)
• So we use the test statistic z = (p̂1 – p̂2) / √( p̂(1 – p̂)(1/n1 + 1/n2) )
• If H0 is true, this has a standard normal distribution
• p-value from the normal distribution
Prevention of Ear Infections
• Does the use of the sweetener xylitol reduce the incidence of ear infections?
Randomized experiment: of 165 children on placebo, 68 got an ear infection; of 159 children on xylitol, 46 got an ear infection.
• Hypotheses: H0: p1 – p2 = 0; Ha: p1 – p2 > 0 (1 = placebo, 2 = xylitol)
• Data check: at least 5 successes & failures in each group
Prevention of Ear Infections
• Overall proportion getting an infection: p̂ = (68 + 46)/(165 + 159) = 114/324 ≈ 0.352
• Test statistic: z = (0.412 – 0.289) / √(0.352 × 0.648 × (1/165 + 1/159)) ≈ 2.31
• p-value = 0.01
• Conclusion: strong evidence that xylitol reduces the chance of ear infection
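A sketch of the same two-proportion z-test in Python (scipy is used only for the normal tail area):

```python
from math import sqrt
from scipy.stats import norm

n1, infected1 = 165, 68      # placebo group
n2, infected2 = 159, 46      # xylitol group

p1_hat, p2_hat = infected1 / n1, infected2 / n2   # ≈ 0.412 and 0.289
p_pooled = (infected1 + infected2) / (n1 + n2)    # ≈ 0.352, null estimate of the common proportion

se_null = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se_null                   # ≈ 2.31
p_value = norm.sf(z)                              # one-sided upper-tail area ≈ 0.010
print(z, p_value)
```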
Testing two means
• Hypotheses: H0: μ1 – μ2 = 0; HA: μ1 – μ2 ≠ 0 or μ1 – μ2 < 0 or μ1 – μ2 > 0
Watch how Population 1 and 2 are defined.
• Data requirements
• Fairly large n1 and n2 (say 30 or more), or
• Not much skewness & no outliers (normal model reasonable)
Test statistic
• Based on x̄1 – x̄2
• Standardise: t = ((x̄1 – x̄2) – 0) / s.e.(x̄1 – x̄2)
Test
• Test statistic: t = (x̄1 – x̄2) / √(s1²/n1 + s2²/n2)
• If H0 is true, this has approximately a t-distribution with d.f. = min(n1–1, n2–1)
• Same d.f. as the CI for μ1 – μ2
• p-value from the t distribution (Minitab or Excel)
• If n1 and n2 ≥ 30, use normal tables
Effect of a stare on driving
Randomized experiment: researchers either stared or did not stare at drivers stopped at a campus stop sign, and timed how long (in seconds) it took the driver to proceed from the sign to a mark on the other side of the intersection.
No Stare Group (n = 14): 8.3, 5.5, 6.0, 8.1, 8.8, 7.5, 7.8, 7.1, 5.7, 6.5, 4.7, 6.9, 5.2, 4.7
Stare Group (n = 13): 5.6, 5.0, 5.7, 6.3, 6.5, 5.8, 4.5, 6.1, 4.8, 4.9, 4.5, 7.2, 5.8
Test whether the stare speeds up crossing times.
Checking data • Small sample sizes, but • No outliers; no strong skewness.
Effect of stare on driving
• Hypotheses: H0: μ1 – μ2 = 0; HA: μ1 – μ2 > 0, where 1 = no-stare, 2 = stare
Effect of stare on driving
• Test statistic: t = (6.63 – 5.59) / √(1.36²/14 + 0.82²/13) ≈ 2.4
• P-value: df = min(n1–1, n2–1) = 12; upper tail area of the t-distribution (12 d.f.) gives p = 0.016
• Strong evidence that the stare speeds up crossing
Effect of stare on driving (Minitab output)
N.B. The test is based on df = 21 (Welch's approximation)
• Very similar p-value and the same conclusion: strong evidence that the stare speeds up crossing
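The same test can be reproduced from the raw times; a sketch using scipy's unpooled (Welch) t-test, which matches the Minitab approach (the alternative= argument needs scipy 1.6 or later):

```python
from scipy.stats import ttest_ind

no_stare = [8.3, 5.5, 6.0, 8.1, 8.8, 7.5, 7.8, 7.1,
            5.7, 6.5, 4.7, 6.9, 5.2, 4.7]
stare = [5.6, 5.0, 5.7, 6.3, 6.5, 5.8, 4.5, 6.1,
         4.8, 4.9, 4.5, 7.2, 5.8]

# Welch (unpooled) two-sample t-test, one-sided because HA is mu1 - mu2 > 0
result = ttest_ind(no_stare, stare, equal_var=False, alternative='greater')
print(result.statistic, result.pvalue)    # t ≈ 2.41, p ≈ 0.012
```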
Paired data and 2-sample data
• Make sure you distinguish between:
• 2 measurements on each individual (e.g. before & after): paired data
• Measurements from 2 independent groups: 2 independent samples
• Example: different cars assessed for insurance claims in garages A and B: 2 independent samples
• Example: the same cars assessed by both garages: paired data