Comparing Means from Independent Samples (Session 12)
Learning Objectives By the end of this session, you will be able to • explain how means from two populations may be compared • describe the assumptions associated with the independent samples t-test • interpret computer output from a two-sample t-test • present and write up conclusions resulting from such tests • explain the difference between statistical significance and an important result
An example: Comparing 2 means As part of a health survey, cholesterol levels of men in a small rural area were measured, including those working in agriculture and those employed in non-agricultural work. Aim: To see if mean cholesterol levels were different between the two groups.
Summary statistics Begin with summarising each column of data. There appears to be a substantial difference between the two means. Our question of interest is: Is this difference showing a real effect, or could it merely be a chance occurrence?
Setting up the hypotheses To answer the question, we set up: Null hypothesis H0: no difference between the two groups (in terms of mean response), i.e. $\mu_1 = \mu_2$. Alternative hypothesis H1: there is a difference, i.e. $\mu_1 \neq \mu_2$. The resulting test will be two-sided since the alternative is “not equal to”.
Test for comparing means
• Use a two-sample (unpaired) t-test
  - appropriate with 2 independent samples
• Assumptions
  - normal distributions for each sample
  - constant variance (so test uses a pooled estimate of variance)
  - observations are independent
• Procedure
  - assess how large the difference in means is, relative to the noise in this difference, i.e. the std. error of the difference.
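Test Statistic The test statistic is:
$$ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} $$
where $\bar{x}_i$, $s_i^2$ and $n_i$ are the sample mean, sample variance and sample size of group $i$, and $s^2$, the pooled estimate of variance, is given by
$$ s^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2} $$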
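The pooled two-sample t statistic can be sketched in a few lines of Python. This is a minimal illustration using only the standard library. The function name `pooled_t_test` and the two small data lists are hypothetical and are not the cholesterol data from the survey.

```python
import math
from statistics import mean, variance


def pooled_t_test(sample_1, sample_2):
    """Return (t, d.f., pooled variance, s.e. of difference) for two independent samples."""
    n1, n2 = len(sample_1), len(sample_2)
    diff = mean(sample_1) - mean(sample_2)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    s2 = ((n1 - 1) * variance(sample_1) + (n2 - 1) * variance(sample_2)) / (n1 + n2 - 2)
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(s2 * (1 / n1 + 1 / n2))
    return diff / se_diff, n1 + n2 - 2, s2, se_diff


# Hypothetical measurements, for illustration only.
group_a = [5.1, 4.9, 6.2, 5.8, 5.5]
group_b = [4.2, 4.8, 4.5, 5.0, 4.1]

t_stat, df, s2, se_diff = pooled_t_test(group_a, group_b)
print(f"t = {t_stat:.2f} on {df} d.f. (pooled variance {s2:.3f}, s.e. {se_diff:.3f})")
```

The t value would then be compared with tables of the t distribution on the returned degrees of freedom.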
Numerical Results The pooled estimate of variance is $s^2 = 1279.5$. Hence the t-statistic is $t = 41.7 / \sqrt{2 \times 1279.5 / 10} = 2.61$, based on 18 d.f. Comparing with tables of $t_{18}$, this result is significant at the 2% level, so reject H0. Note: The exact p-value = 0.018
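The arithmetic on this slide can be checked directly from the summary figures. The sketch below assumes 10 men in each group, which is inferred from the 18 d.f. and the division by 10 in the calculation. The variable names are illustrative.

```python
import math

diff_in_means = 41.7      # difference between the two group means (from the slide)
pooled_variance = 1279.5  # pooled estimate of variance (from the slide)
n_per_group = 10          # assumption: equal group sizes, inferred from 18 d.f.

# With equal group sizes, s^2 * (1/n + 1/n) simplifies to 2 * s^2 / n.
se_diff = math.sqrt(2 * pooled_variance / n_per_group)
t_stat = diff_in_means / se_diff
df = 2 * n_per_group - 2

print(f"s.e. of difference = {se_diff:.2f}, t = {t_stat:.2f}, d.f. = {df}")
```

The t value agrees with the 2.61 quoted above. The standard error comes out marginally above the 15.99 reported in the results, which is most likely due to rounding in the summary figures.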
Presenting the results
• For comparisons, should report:
  - difference between means
  - s.e. of difference in means
  - 95% confidence interval for true diff.
• In addition, may report for each group:
  - mean
  - s.e. of each mean
  - sample size for each mean
• Conclusions will then follow…
Results and conclusions Difference of means: 41.7 Standard error of difference: 15.99 95% confidence interval for difference in means: (8.09, 75.3). Conclusions: There is some evidence (p=0.018) that the mean cholesterol levels differ between those working in agriculture and others. The difference in means is 42 mg/dL with 95% confidence interval (8.1, 75.3).
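The 95% confidence interval is the difference in means plus or minus a t multiplier times the standard error of the difference. The sketch below uses the figures reported on this slide. The multiplier 2.101 is the two-sided 5% point of t with 18 d.f. taken from standard tables. It is not quoted in the slides, so treat it as an assumption of this example.

```python
diff_in_means = 41.7  # from the slide
se_diff = 15.99       # from the slide
t_crit = 2.101        # assumption: tabulated two-sided 5% point of t on 18 d.f.

half_width = t_crit * se_diff
lower = diff_in_means - half_width
upper = diff_in_means + half_width

print(f"95% CI for difference in means: ({lower:.1f}, {upper:.1f})")
```

The limits agree with the reported interval to one decimal place. Small differences in the second decimal are to be expected from rounding of the inputs.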
Significance ideas again! e.g. Farmers report that using a fungicide increased crop yields by 2.7 kg ha⁻¹, s.e.m.=0.41. This gave a t-statistic of 6.6 (p-value<0.001). Recall that the p-value is the probability, calculated assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed. i.e. a p-value this small means an increase of this size would very rarely arise by chance alone if the fungicide had no effect!
How important are sig. tests? In relation to the example on the previous slide, we may find one of the following situations for different crops.
Mean yields, with and without fungicide:
  With     Without
  589.9    587.2     Not an important finding!
  9.9      7.2       Very important finding!
It is likely that in the first of these results, either too much replication or the incorrect level of replication had been used (e.g. plant level variation, rather than plot level variation used to compare means).
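One way to see why the same 2.7 increase matters so differently in the two cases is to express it relative to the yield without fungicide. This is an illustrative sketch and the helper name `percent_increase` is hypothetical. Whether a given percentage is important remains a subject-matter judgement.

```python
def percent_increase(with_fungicide, without_fungicide):
    """Increase in mean yield as a percentage of the yield without fungicide."""
    return 100 * (with_fungicide - without_fungicide) / without_fungicide


# Mean yields from the slide: (with fungicide, without fungicide).
first_crop = percent_increase(589.9, 587.2)
second_crop = percent_increase(9.9, 7.2)

print(f"First crop:  {first_crop:.2f}% increase")
print(f"Second crop: {second_crop:.1f}% increase")
```

The absolute difference is the same in both rows, but it is under half a percent of the yield in the first case and more than a third in the second.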
What does non-significance tell us? e.g. There was insufficient evidence in the data to demonstrate that using a fungicide had any effect on plant yields (p=0.128).
Mean yields, with and without fungicide:
  With     Without
  157.2    89.9
This difference may be an important finding, but the statistical analysis was unable to pick up this difference as being statistically significant. HOW CAN THIS HAPPEN? Too small a sample size? High variability in the experimental material? One or two outliers? All sources of variability not identified?
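The effect of sample size and variability can be illustrated with a short calculation. The sketch below holds the observed difference in means fixed and shows how the t-statistic changes with the number of plots per group. The pooled standard deviation of 80 and the sample sizes are hypothetical values chosen only for illustration, since the slides do not report them.

```python
import math

diff_in_means = 157.2 - 89.9  # difference in mean yields from the slide
pooled_sd = 80.0              # hypothetical pooled standard deviation


def t_for_n(n_per_group):
    """t-statistic for the fixed difference, with n_per_group observations in each group."""
    se_diff = pooled_sd * math.sqrt(2 / n_per_group)
    return diff_in_means / se_diff


for n in (4, 8, 16, 32):
    print(f"n per group = {n:2d}: t = {t_for_n(n):.2f} on {2 * n - 2} d.f.")
```

With high variability and few replicates the standard error of the difference is large, so even a sizeable difference gives a small t value. The same difference gives a larger t as replication increases.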
Significance – Key Points • Statistical significance alone is not enough. Consider whether the result is also scientifically meaningful and important. • When a significant result is found, report the finding in terms of the corresponding estimates, their standard errors and C.I.’s