Common Non-Parametric Methods for Comparing Two Samples (Session 20)

Learning Objectives At the end of this session, you will be able to • Understand the type of logic behind common non-parametric tests for comparing two groups based on ranks • Interpret and understand the commonly used Wilcoxon signed-rank test • Appreciate some practical problems associated with the methods

An illustrative example Paired-Samples 10 farmers recorded their crop yield (tonnes/hectare) before and after the use of a fertiliser. Has the use of fertiliser changed the yield? Data (after-before pair-wise differences): 0.02, 0.89, -0.06, 0.26, 0.83, 0.42, 0.80, -0.05, 0.64, 0.84 How can we address this question objectively?

Start by plotting - Roughly symmetric distribution?

Addressing the question … • A paired t-test is often employed in such cases • Recall this is simply a one-sample t-test applied to the pair-wise differences • The procedure assumes the pair-wise differences are from a normal distribution
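As a rough illustration of the point above, the paired t-test can be sketched as a one-sample t statistic computed on the after-minus-before differences. The function name `one_sample_t` is an assumption made for this sketch, and only the standard library is used, so the p-value itself (from a t distribution with n-1 degrees of freedom) is left to statistical software.

```python
import math
import statistics

# After-minus-before yield differences (tonnes/hectare) from the example.
diffs = [0.02, 0.89, -0.06, 0.26, 0.83, 0.42, 0.80, -0.05, 0.64, 0.84]


def one_sample_t(values, mu0=0.0):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0.

    Hypothetical helper for illustration; it assumes the values are
    independent draws from a normal distribution.
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return (mean - mu0) / (sd / math.sqrt(n)), n - 1


t_stat, df = one_sample_t(diffs)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

The statistic would then be referred to a t distribution with 9 degrees of freedom to obtain a p-value.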

Addressing the question • Recall that the t-test procedure is quite robust against departures from normality (cf. Session 19) • However, if we are concerned about the validity of the normality assumption, we might use a non-parametric test that does not make this assumption

Addressing the question One possibility is to use a sign test of H0: population median difference = 0 vs. H1: population median difference ≠ 0. However, this procedure is inefficient, as it effectively utilises only the signs (+/-) of the pair-wise differences. Can more information be used?
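A minimal sketch of the sign test on the example data, assuming the usual formulation in which the number of positive differences is Binomial(n, 1/2) under H0 and zero differences are dropped. The function name `sign_test_two_sided` is made up for this illustration.

```python
from math import comb

# After-minus-before yield differences (tonnes/hectare) from the example.
diffs = [0.02, 0.89, -0.06, 0.26, 0.83, 0.42, 0.80, -0.05, 0.64, 0.84]


def sign_test_two_sided(values):
    """Return (number of positive signs, n, two-sided p-value).

    Hypothetical helper: under H0 each non-zero difference is positive
    with probability 1/2, so the count of positives is Binomial(n, 1/2).
    """
    nonzero = [v for v in values if v != 0]
    n = len(nonzero)
    n_pos = sum(v > 0 for v in nonzero)
    k = min(n_pos, n - n_pos)  # size of the smaller group of signs
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n_pos, n, min(1.0, 2 * tail)  # double the smaller tail


n_pos, n, p_value = sign_test_two_sided(diffs)
print(f"{n_pos} of {n} differences are positive; two-sided p = {p_value:.3f}")
```

Only the eight plus signs and two minus signs enter the calculation; the sizes of the differences play no part, which is the inefficiency noted above.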

Wilcoxon signed-rank test Yes, but at a price… We use the signs of the pair-wise differences and the rank order of their magnitudes, but not the actual values. This leads to the Wilcoxon signed-rank test. Assumptions: the pair-wise differences are independent and come from a symmetric, but otherwise unspecified, distribution

Back to the example • Let us assume the pair-wise differences are symmetrically distributed • This is not unreasonable based on the plots • Also, the sample median and mean are similar: 0.53 and 0.46 respectively
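The two summary values quoted on this slide can be checked with a few lines of standard-library Python; this is only a quick arithmetic check, not a formal test of symmetry.

```python
import statistics

# After-minus-before yield differences (tonnes/hectare) from the example.
diffs = [0.02, 0.89, -0.06, 0.26, 0.83, 0.42, 0.80, -0.05, 0.64, 0.84]

sample_median = statistics.median(diffs)  # midpoint of the 5th and 6th ordered values
sample_mean = statistics.mean(diffs)

print(f"median = {sample_median:.2f}, mean = {sample_mean:.2f}")
```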

Wilcoxon signed-rank test • Rank the n=10 differences according to their magnitude • Re-attach the signs to give signed ranks (in data order): 1, 10, -3, 4, 8, 5, 7, -2, 6, 9 Notes • Use average ranks for ties • Zero differences are ignored in the above process (reducing the sample size)
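The ranking steps on this slide can be sketched as follows. The function name `signed_ranks` is an assumption for illustration; it drops zero differences and gives tied magnitudes their average rank, as the notes describe.

```python
# After-minus-before yield differences (tonnes/hectare) from the example.
diffs = [0.02, 0.89, -0.06, 0.26, 0.83, 0.42, 0.80, -0.05, 0.64, 0.84]


def signed_ranks(values):
    """Return signed ranks of the non-zero values, in their original order.

    Hypothetical helper: magnitudes are ranked from smallest to largest,
    tied magnitudes share their average rank, and zeros are discarded.
    """
    nonzero = [v for v in values if v != 0]
    order = sorted(range(len(nonzero)), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * len(nonzero)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of tied magnitudes starting at i.
        j = i
        while j + 1 < len(order) and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    # Re-attach the sign of each difference to its rank.
    return [r if v > 0 else -r for r, v in zip(ranks, nonzero)]


ranks = signed_ranks(diffs)
t_plus = sum(r for r in ranks if r > 0)
t_minus = -sum(r for r in ranks if r < 0)
print(ranks, t_plus, t_minus)
```

For the example data there are no ties or zeros, so all ranks are whole numbers.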

Wilcoxon signed-rank test • Let T+ = sum of positive ranks = 50 and T− = sum of negative ranks (in absolute value) = 5 • Take either T+ or T− as the test statistic • T+ + T− = n(n+1)/2 = 55 • Consider T+. A sufficiently small or large value is evidence to reject H0 • To obtain a p-value we compare T+ with its null distribution • This is a symmetric discrete distribution • A two-sided p-value is then Prob(T+ ≤ 5) + Prob(T+ ≥ 50)
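One way to see where the null distribution of T+ comes from: under H0 with a symmetric distribution, each rank 1..n is equally likely to carry a plus or a minus sign, so all 2^n sign patterns are equally likely. The sketch below counts, for each possible total, how many subsets of the ranks give that total. It assumes no ties and no zero differences, and the function names are made up for this illustration.

```python
def null_counts(n):
    """counts[s] = number of subsets of the ranks {1, ..., n} that sum to s."""
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1  # the empty subset (no positive ranks at all)
    for rank in range(1, n + 1):
        # Go downwards so each rank is used at most once per subset.
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    return counts


def exact_two_sided_p(t_plus, n):
    """Two-sided exact p-value for an integer T+ (no ties, no zeros assumed)."""
    counts = null_counts(n)
    max_sum = n * (n + 1) // 2
    # By symmetry, Prob(T+ <= lo) equals Prob(T+ >= max_sum - lo).
    lo = min(t_plus, max_sum - t_plus)
    lower_tail = sum(counts[: lo + 1]) / 2 ** n
    return min(1.0, 2 * lower_tail)


p_exact = exact_two_sided_p(50, 10)
print(f"exact two-sided p = {p_exact:.4f}")
```

For n = 10 only 10 of the 1024 sign patterns give T+ ≤ 5, and by symmetry another 10 give T+ ≥ 50.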

The p-value calculation • Exact method: cumbersome by hand, so use appropriate software • Large-sample approximation: approximate the null distribution of T+ using a normal distribution; n>20 will usually give a reasonable approximation
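A sketch of the large-sample approximation, using the standard null mean n(n+1)/4 and variance n(n+1)(2n+1)/24 of T+ when there are no ties or zeros. No continuity correction is applied, and the function name is an assumption. The example has only n = 10, below the n > 20 guideline, so the result is shown purely to illustrate the mechanics.

```python
import math


def wilcoxon_normal_approx(t_plus, n):
    """Return (z, two-sided p) from a normal approximation to T+.

    Hypothetical helper: assumes no ties or zero differences and applies
    no continuity correction.
    """
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24
    z = (t_plus - mean) / math.sqrt(variance)
    # Two-sided tail area of a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


z, p_approx = wilcoxon_normal_approx(50, 10)
print(f"z = {z:.2f}, approximate two-sided p = {p_approx:.3f}")
```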

Conclusions • The p-value is small (exact two-sided p ≈ 0.02). Hence, there is evidence to reject H0 • The estimated median difference (after − before), 0.53, is significantly different from zero • There is evidence based on this study for a positive fertiliser effect

Comments While the Wilcoxon signed-rank test makes less restrictive assumptions than the t-test, there are still a number of major practical problems • The symmetry assumption is still quite limiting, as many distributions are skewed • Confidence intervals (CIs): as with the sign test (Session 19), most software packages concentrate on the p-value rather than point estimates and confidence intervals

Two independent samples • For comparing independent samples, a t-test for independent samples is often used • If we were concerned about the validity of the underlying assumptions we could employ a non-parametric method • The Wilcoxon rank-sum test (or the equivalent Mann-Whitney U test) is a common choice • Once again this is based on ranks
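To illustrate why the two tests are equivalent, the sketch below computes the Wilcoxon rank-sum statistic W for the first sample and the Mann-Whitney U statistic, which differ only by the constant n1(n1+1)/2. The two small samples are entirely hypothetical (the session gives no independent-samples data), and the function names are assumptions for this illustration.

```python
def midranks(values):
    """Ranks from smallest to largest, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        i = j + 1
    return ranks


def rank_sum_and_u(x, y):
    """Return (W, U): rank sum of x in the pooled sample, and Mann-Whitney U."""
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[: len(x)])
    u = w - len(x) * (len(x) + 1) / 2
    return w, u


# Hypothetical data, made up purely for illustration.
group_a = [1.1, 2.3, 3.8, 4.0]
group_b = [0.9, 1.5, 2.0]

w, u = rank_sum_and_u(group_a, group_b)
# U also equals the number of (a, b) pairs with a > b (ties count one half).
pairs = sum((a > b) + 0.5 * (a == b) for a in group_a for b in group_b)
print(w, u, pairs)
```

Because W and U differ by a fixed constant for given sample sizes, they lead to the same p-value.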

Concluding remarks • Non-parametric tests may be used. However… • Their usefulness is often over-rated • The lack of confidence intervals is a major disadvantage • Practical statistics is frequently more complicated than comparing two groups • In such cases, t-test methodology naturally extends into a more general modelling framework • The non-parametric tests discussed do not naturally extend
