A short introduction to epidemiology Chapter 6: Precision

A short introduction to epidemiology Chapter 6: Precision Neil Pearce Centre for Public Health Research Massey University Wellington, New Zealand

Chapter 6: Precision • Random error • Basic statistics • Study size and power

Random Error • Random error is not unique to epidemiologic studies and also occurs in randomized trials • Even if the disease under study is not associated with an exposure, there may be a “chance” association in a particular study (e.g. disease may be more common in the exposed than in the non-exposed group)

Random Error • The precision (lack of random error) of an effect estimate (e.g. an odds ratio) is reflected in the 95% confidence interval • Random error reduces, and precision increases, as the study size increases

Basic statistics • Mean • Standard deviation • If we take a sample from a population, and the data are normally distributed, then about 95% of individual values in the sample will lie within ±1.96 standard deviations of the population mean

The normal distribution (insert figure showing the bell curve)
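The 95% figure for ±1.96 standard deviations can be checked directly from the standard normal distribution. The following is a minimal sketch using Python's standard library; the function name `fraction_within` is illustrative and not part of the original slides.

```python
from statistics import NormalDist


def fraction_within(k_sd: float) -> float:
    """Probability that a normally distributed value lies within
    k_sd standard deviations of its mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k_sd) - std_normal.cdf(-k_sd)


coverage_196 = fraction_within(1.96)
print(f"Fraction within 1.96 SD: {coverage_196:.4f}")
```

The result depends only on the number of standard deviations, not on the particular mean or standard deviation of the population.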

Basic statistics • Standard error • If we take repeated samples, then the sample means will vary, and the standard deviation of the sample means is known as the standard error (SE) • 95% of sample means will lie within ±1.96 SE of the population mean

The distribution of sample means (insert figure showing the bell curve)
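The idea of repeated sampling can be illustrated with a small simulation. The sketch below assumes a normally distributed population with hypothetical values (mean 100, standard deviation 15) and samples of size 25; it also relies on the standard result that the standard error of a mean equals the population standard deviation divided by the square root of the sample size, which the slides do not state explicitly.

```python
import math
import random
import statistics


def simulate_sample_means(mu, sigma, n, n_samples, seed=0):
    """Draw n_samples samples of size n and return the list of sample means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


# Hypothetical population and sampling scheme (assumptions for illustration)
MU, SIGMA, N, N_SAMPLES = 100.0, 15.0, 25, 4000

sample_means = simulate_sample_means(MU, SIGMA, N, N_SAMPLES)
empirical_se = statistics.stdev(sample_means)  # SD of the sample means
theoretical_se = SIGMA / math.sqrt(N)          # sigma / sqrt(n)

# Share of sample means lying within 1.96 SE of the population mean
within = sum(abs(m - MU) <= 1.96 * theoretical_se for m in sample_means)
fraction_within_196_se = within / N_SAMPLES

print(f"Empirical SE:   {empirical_se:.2f}")
print(f"Theoretical SE: {theoretical_se:.2f}")
print(f"Fraction of means within 1.96 SE: {fraction_within_196_se:.3f}")
```

With enough repeated samples, the empirical standard error should be close to the theoretical value, and roughly 95% of the sample means should fall within 1.96 SE of the population mean.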

Categorical data • Suppose we want to calculate a proportion (p) • Under the binomial distribution, if the sample is sufficiently large, the sampling distribution will approximate to the normal distribution with mean (p) and standard deviation: $s = \sqrt{p(1-p)/n}$
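As a worked illustration of the formula for the standard deviation of a proportion, the sketch below computes it for a hypothetical proportion of 0.2 at two sample sizes and builds an approximate 95% interval as p ± 1.96 s. The interval construction and the example numbers are assumptions added for illustration; the approximation is only reasonable when the sample is large.

```python
import math


def proportion_se(p: float, n: int) -> float:
    """Standard deviation of the sampling distribution of a proportion."""
    return math.sqrt(p * (1 - p) / n)


def proportion_ci(p: float, n: int, z: float = 1.96):
    """Approximate 95% interval for a proportion (normal approximation)."""
    se = proportion_se(p, n)
    return p - z * se, p + z * se


# Hypothetical example: observed proportion 0.2 in samples of 400 and 1600
se_400 = proportion_se(0.2, 400)
se_1600 = proportion_se(0.2, 1600)
ci_400 = proportion_ci(0.2, 400)

print(f"s (n=400):  {se_400:.4f}")
print(f"s (n=1600): {se_1600:.4f}")
print(f"Approximate 95% interval (n=400): {ci_400[0]:.4f} to {ci_400[1]:.4f}")
```

Quadrupling the sample size halves the standard deviation, which is one way to see why precision increases as the study size increases.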

Testing and estimation • The p-value is the probability that differences as large as or larger than those observed could have arisen by chance if the null hypothesis (of no association between exposure and disease) is correct • The confidence interval provides a range of values within which it is plausible that the true effect may lie • The principal aim of an individual study should be to estimate the size of the effect (using the effect estimate and confidence interval) rather than just to decide whether or not an effect is present (using the p-value)
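The contrast between testing and estimation can be made concrete with a small calculation. The sketch below uses hypothetical cohort data (50 cases among 5,000 exposed, 25 cases among 5,000 non-exposed) and one common large-sample approach, a Wald interval and test on the log risk ratio. This particular method is an assumption chosen for illustration; the slides do not prescribe it.

```python
import math
from statistics import NormalDist


def risk_ratio_summary(cases_exp, n_exp, cases_unexp, n_unexp, z=1.96):
    """Risk ratio with an approximate 95% CI and two-sided p-value,
    using the normal approximation on the log scale (Wald method)."""
    rr = (cases_exp / n_exp) / (cases_unexp / n_unexp)
    # Large-sample standard error of ln(RR)
    se_log_rr = math.sqrt(
        1 / cases_exp - 1 / n_exp + 1 / cases_unexp - 1 / n_unexp
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    # Two-sided p-value for the null hypothesis RR = 1
    z_stat = math.log(rr) / se_log_rr
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return rr, lower, upper, p_value


# Hypothetical cohort data (assumed for illustration only)
rr, ci_lower, ci_upper, p_value = risk_ratio_summary(50, 5000, 25, 5000)

print(f"Risk ratio: {rr:.2f}")
print(f"Approximate 95% CI: {ci_lower:.2f} to {ci_upper:.2f}")
print(f"Two-sided p-value: {p_value:.4f}")
```

The p-value only indicates whether the data are compatible with no association, whereas the effect estimate and interval show both the size of the association and how precisely it has been estimated.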

Study size and power The study power depends on: • The cut-off value (e.g. p<0.05) below which the p-value would be considered “statistically significant” • The disease rate in the non-exposed group in a cohort study or the exposure prevalence of the controls in a case-control study • The expected relative risk (i.e. the specified value of the relative risk under the alternative (non-null) hypothesis) • The ratio of the sizes of the two groups being studied • The total number of study participants

Study size and power • $Z_\beta = \dfrac{N_0^{0.5}(P_1 - P_0)B^{0.5} - Z_\alpha B}{K^{0.5}}$ where: • $Z_\beta$ = standard normal deviate corresponding to a given statistical power • $Z_\alpha$ = standard normal deviate corresponding to an alpha level (the largest p-value that would be considered "statistically significant") • $N_0$ = number of persons in the reference group (i.e. the non-exposed group in a cohort study, or the controls in a case-control study) • $P_1$ = outcome proportion in the study group • $P_0$ = outcome proportion in the reference group • $A$ = allocation ratio of referent to study group (i.e. the relative size of the two groups) • $B = (1-P_0)(P_1 + (A-1)P_0) + P_0(1-P_1)$ • $C = (1-P_0)(AP_1 - (A-1)P_0) + AP_0(1-P_1)$ • $K = BC - A(P_1-P_0)^2$

Study size and power: example • 5,000 exposed persons and 5,000 non-exposed persons • The risk in the non-exposed group is 0.005 • We expect that exposure will double the risk of disease, so the risk will be 0.010 in the exposed group • $Z_\beta$ = 0.994 (the standard normal deviate corresponding to the power) • Power = 83%
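The worked example can be reproduced from the power formula with a few lines of code. The sketch below follows the definitions of B, C and K given with the formula and assumes a two-sided alpha of 0.05 (so the alpha deviate is 1.96) and an allocation ratio of 1; the function name `power_z_beta` is illustrative.

```python
import math
from statistics import NormalDist


def power_z_beta(n0, p1, p0, a=1.0, z_alpha=1.96):
    """Standard normal deviate corresponding to the study power.

    n0: number of persons in the reference group
    p1: outcome proportion in the study group
    p0: outcome proportion in the reference group
    a:  allocation ratio of referent to study group
    z_alpha: deviate for the alpha level (1.96 assumes two-sided 0.05)
    """
    b = (1 - p0) * (p1 + (a - 1) * p0) + p0 * (1 - p1)
    c = (1 - p0) * (a * p1 - (a - 1) * p0) + a * p0 * (1 - p1)
    k = b * c - a * (p1 - p0) ** 2
    return (math.sqrt(n0) * (p1 - p0) * math.sqrt(b) - z_alpha * b) / math.sqrt(k)


# Values from the example: 5,000 per group, risks of 0.005 and 0.010
z_beta = power_z_beta(n0=5000, p1=0.010, p0=0.005, a=1.0)
power = NormalDist().cdf(z_beta)  # convert the deviate to a probability

print(f"Z_beta = {z_beta:.3f}")
print(f"Power  = {power:.3f}")
```

The deviate comes out at about 0.994, matching the example. Converting it with the normal cumulative distribution gives a power of roughly 0.84, close to the 83% quoted; small differences of this kind can arise from rounding or from reading values off printed tables.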

Study size and power • The study power is not the probability that the study will estimate the size of the association correctly • Rather, it is the probability that the study will yield a "statistically significant" finding when an association of the postulated size exists • The observed association could be greater or less than expected, but still be "statistically significant"

Study size and power • Standard calculator and microcomputer programmes incorporating procedures for power calculations are widely available. • EPI-INFO (Dean et al, 1990) can be downloaded for free from http://www.cdc.gov/epiinfo/ • Rothman’s Episheet programme (Rothman, 2002) can be downloaded for free from http://www.oup-usa.org/epi/rothman/

Study size and power • Insert example of power curve from Rothman’s Episheet
