Medical Statistics as a science


Why Do Statistics?
• To extrapolate from the data collected to general conclusions about the larger population from which the sample was drawn
• To allow general conclusions to be drawn from limited amounts of data
• To do this we must assume that the data are randomly sampled from an infinitely large population; we then analyse the sample and use the results to make inferences about that population

Statistical Analysis in a Simple Experiment
• Define the population of interest
• Randomly select a sample of subjects to study (clinical trials do not enrol a randomly selected sample of patients, because of inclusion/exclusion criteria, but they do define a precise patient population)
• Half the subjects receive one treatment and the other half another treatment (usually placebo)
• Measure baseline variables in each group (e.g. age, APACHE II score) to check that randomisation was successful
• Measure trial outcome variables in each group (e.g. mortality)
• Use statistical techniques to make inferences about the distribution of the variables in the general population and about the effect of the treatment

Data
• Categorical data: values belong to categories
  • Nominal data: there is no natural order to the categories, e.g. blood groups
  • Ordinal data: there is a natural order, e.g. Adverse Events (Mild/Moderate/Severe/Life Threatening)
  • Binary data: there are only two possible categories, e.g. alive/dead
• Numerical data: the value is a number (either measured or counted)
  • Continuous data: measurement is on a continuum, e.g. height, age, haemoglobin
  • Discrete data: a "count" of events, e.g. number of pregnancies

• Descriptive Statistics: concerned with summarising or describing a sample, e.g. mean, median
• Inferential Statistics: concerned with generalising from a sample to make estimates and inferences about a wider population, e.g. t test, chi-square test

Statistical Terms
• Mean: the average of the data; sensitive to outlying data
• Median: the middle value of the ordered data; not sensitive to outlying data
• Mode: the most commonly occurring value
• Range: the spread of the data, from the smallest to the largest value
• Interquartile (IQ) range: the spread of the middle 50% of the data; commonly used for skewed data
• Standard deviation (SD): a single number which measures how much the observations vary around the mean
• Symmetrical data: data that follow a normal distribution (mean = median = mode); report mean, standard deviation and n
• Skewed data: not normally distributed (mean ≠ median ≠ mode); report median and IQ range
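The terms above can be tried out on a small data set. The following is a minimal sketch using Python's standard `statistics` module; the `stay_days` values are hypothetical and chosen only to show how one outlying value pulls the mean away from the median.

```python
import statistics

# Hypothetical sample: length of hospital stay in days (illustrative values only).
# The single large value (30) makes the data skewed.
stay_days = [2, 3, 3, 4, 5, 5, 5, 6, 7, 30]

mean = statistics.mean(stay_days)            # sensitive to the outlier
median = statistics.median(stay_days)        # not sensitive to the outlier
mode = statistics.mode(stay_days)            # most commonly occurring value
data_range = max(stay_days) - min(stay_days) # largest minus smallest value
sd = statistics.stdev(stay_days)             # sample SD (n - 1 denominator)

# Quartiles using the module's default ("exclusive") method;
# other software may use slightly different quartile definitions.
q1, q2, q3 = statistics.quantiles(stay_days, n=4)
iqr = q3 - q1

print(f"n={len(stay_days)} mean={mean} median={median} mode={mode}")
print(f"range={data_range} SD={sd:.1f} IQ range={iqr}")
```

Because the mean (7) sits well above the median (5) here, this sample would be summarised by its median and IQ range rather than by mean and SD.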

Standard Normal Distribution
• Mean +/- 1 SD encompasses about 68% of observations
• Mean +/- 2 SD encompasses about 95% of observations
• Mean +/- 3 SD encompasses about 99.7% of observations
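These percentages can be checked from the cumulative distribution function of the normal distribution. A minimal sketch using `statistics.NormalDist` from the Python standard library (the helper name `coverage` is introduced here for illustration):

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def coverage(k):
    """Proportion of a normal distribution lying within k SD of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"mean +/- {k} SD covers {coverage(k):.1%} of observations")
```

The exact multiplier for 95% coverage is about 1.96 SD rather than 2, which is why 1.96 appears in 95% confidence interval formulas.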

Steps in Statistical Testing
• Null hypothesis Ho: there is no difference between the groups
• Alternative hypothesis H1: there is a difference between the groups
• Collect data
• Perform a statistical test, e.g. t test, chi-square test
• Interpret the P value and confidence intervals
  • P value ≤ 0.05: reject Ho
  • P value > 0.05: do not reject Ho (often loosely called "accepting" Ho)
• Draw conclusions

Meaning of P
• P value: the probability of observing a result as extreme as, or more extreme than, the one actually observed if chance alone were at work (i.e. if the null hypothesis were true)
• Lets us decide whether or not to reject the null hypothesis
• P > 0.05: Not significant
• P = 0.01 to 0.05: Significant
• P = 0.001 to 0.01: Very significant
• P < 0.001: Extremely significant
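The labelling convention above can be written as a small lookup. This is only a sketch: the function name `describe_p` is hypothetical, and the handling of values that fall exactly on a boundary (0.05, 0.01, 0.001) is an assumption, since the slide does not say which band they belong to.

```python
def describe_p(p):
    """Map a P value to the descriptive label used on the slide.

    Assumption: boundary values are assigned to the less significant band
    except 0.05 itself, which is treated as significant.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a P value must lie between 0 and 1")
    if p < 0.001:
        return "extremely significant"
    if p < 0.01:
        return "very significant"
    if p <= 0.05:
        return "significant"
    return "not significant"

for p in (0.903, 0.03, 0.006, 0.00005):
    print(p, "->", describe_p(p))
```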

T Test
• The t test checks whether two samples are likely to have come from the same or different populations
• Used on continuous variables
• Example: age of patients in the APC study (APC vs placebo)

                   PLACEBO      APC
  Mean age (yrs)   60.6         60.5
  SD               16.5         17.2
  n                840          850
  95% CI           59.5-61.7    59.3-61.7

• What is the P value? 0.01, 0.05, 0.10, 0.90 or 0.99?
• P = 0.903: not significant; the patients are from the same population (the groups were designed to be matched by randomisation, so no surprise!)
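The P value in the age example can be approximated from the summary figures alone. The sketch below is not the t test itself: it uses a large-sample normal (z) approximation, which is an assumption that works here because both groups have more than 800 patients. The function name `two_sample_z_from_summary` is introduced for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sided comparison of two means from summary statistics.

    Large-sample z approximation to the two-sample t test (assumption:
    both groups are big enough for the normal approximation to hold).
    """
    se_diff = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # standard error of the difference
    z = (mean1 - mean2) / se_diff
    p = 2 * (1 - NormalDist().cdf(abs(z)))         # two-sided P value
    return z, p

# Age in the placebo and APC groups, taken from the slide.
z, p = two_sample_z_from_summary(60.6, 16.5, 840, 60.5, 17.2, 850)
print(f"z = {z:.3f}, P = {p:.3f}")
```

With these inputs the approximation gives a P value of about 0.90, in line with the slide's 0.903.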

T Test: SAFE "Serum Albumin"

             PLACEBO      ALBUMIN
  n          3500         3500
  Mean       28           30
  SD         10           10
  95% CI     27.7-28.3    29.7-30.3

• Q: Are these albumin levels different?
• Ho: the levels are the same (any difference is there by chance)
• H1: the levels are too different to have occurred purely by chance
• Statistical test: t test, P < 0.0001 (extremely significant)
• Reject the null hypothesis (Ho) and accept the alternative hypothesis (H1), i.e. if both samples came from the same overall group, there would be less than a 1 in 10 000 chance of seeing a difference this large, so we can say they are very likely to be different

Effect of Sample Size Reduction

             PLACEBO      ALBUMIN
  n          350          350
  Mean       28           30
  SD         10           10
  95% CI     27.0-29.0    29.0-31.0

• Smaller sample size (one tenth of the original)
• Causes wider CI (we are less confident where the mean is)
• P = 0.008 (i.e. approx 0.01; P is still significant but less so)
• This influence of sample size on the ability to find any particular difference statistically significant is a major consideration in study design

Reducing Sample Size (again)

             PLACEBO      ALBUMIN
  n          35           35
  Mean       28           30
  SD         10           10
  95% CI     24.6-31.4    26.6-33.4

• Using an even smaller sample size (now 1/100 of the original)
• Much wider confidence intervals
• P = 0.41 (not significant any more)
• A SMALLER STUDY has LOWER POWER to find any particular difference statistically significant (mean and SD unchanged)
• POWER: the ability of a study to detect an actual effect or difference
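The three albumin slides can be reproduced approximately by holding the means (28 and 30) and SD (10) fixed and varying only n. The sketch below again uses a normal (z) approximation rather than the t distribution, which is an assumption: for n = 35 it gives a P value of about 0.40 and a slightly narrower interval than the slide's t-based figures. The function name `compare_means` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def compare_means(mean1, mean2, sd, n):
    """Return (two-sided P value, 95% CI half-width for each group mean).

    Assumes equal SD and equal n in both groups and uses the normal
    approximation instead of the t distribution.
    """
    se_diff = sqrt(2 * sd ** 2 / n)               # SE of the difference in means
    z = (mean2 - mean1) / se_diff
    p = 2 * (1 - Z.cdf(abs(z)))
    half_width = Z.inv_cdf(0.975) * sd / sqrt(n)  # about 1.96 * SE of one mean
    return p, half_width

results = {n: compare_means(28, 30, 10, n) for n in (3500, 350, 35)}
for n, (p, half_width) in results.items():
    print(f"n={n:>4} per group: P={p:.4f}, 95% CI = mean +/- {half_width:.1f}")
```

The difference between the groups never changes; only the sample size does, and with it the width of the confidence intervals and the P value.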

Chi Square Test
• Used for proportions or frequencies
• Binary data, e.g. alive/dead
• PROWESS Study: primary endpoint: 28 day all cause mortality

             ALIVE           DEAD           TOTAL          % DEAD
  PLACEBO    581 (69.2%)     259 (30.8%)    840 (100%)     30.8
  APC        640 (75.3%)     210 (24.7%)    850 (100%)     24.7
  TOTAL      1221 (72.2%)    469 (27.8%)    1690 (100%)

• Reduction in death rate = 30.8% - 24.7% = 6.1%, i.e. mortality was 6.1 percentage points lower in the APC group
• Perform chi-square test: P = 0.006 (very significant)
• If the treatment made no difference, a result at least this extreme would happen by chance only about 6 times in 1000, so chance variation is an unlikely explanation

Reducing Sample Size
• Same results but using a much smaller sample size (one tenth)

             ALIVE          DEAD          TOTAL         % DEAD
  PLACEBO    58 (69.2%)     26 (30.8%)    84 (100%)     30.8
  APC        64 (75.3%)     21 (24.7%)    85 (100%)     24.7
  TOTAL      122 (72.2%)    47 (27.8%)    169 (100%)

• Reduction in death rate = 6.1% (still the same)
• Perform chi-square test: P = 0.39
• 39 in 100 times a difference in mortality this large could have happened by chance, therefore the results are not significant
• Again, the power of a study to find a difference depends a lot on sample size, for binary data as well as continuous data
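The chi-square calculation for a 2x2 table is short enough to write out. The sketch below uses the standard shortcut formula with one degree of freedom and, as an assumption, applies Yates' continuity correction by default; the slides do not say which variant of the test was used, so the P values may not match them exactly. The function name `chi_square_2x2` is introduced for illustration.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi-square statistic, two-sided P value) with 1 degree of freedom.
    yates=True applies Yates' continuity correction (an assumption here).
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2)).
    p = erfc(sqrt(chi2 / 2))
    return chi2, p

# Rows: placebo, APC. Columns: alive, dead. Counts taken from the slides.
chi2_full, p_full = chi_square_2x2(581, 259, 640, 210)
chi2_small, p_small = chi_square_2x2(58, 26, 64, 21)

print(f"Full study:    chi2 = {chi2_full:.2f}, P = {p_full:.4f}")
print(f"One-tenth size: chi2 = {chi2_small:.2f}, P = {p_small:.2f}")
```

For the full table this gives a P value of about 0.006. For the one-tenth table it gives a clearly non-significant P value, which may differ somewhat from the slide's 0.39 depending on the correction or exact test used.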

Summary
• Size matters = BIGGER IS BETTER
• Spread matters = SMALLER IS BETTER
• Bigger difference = EASIER TO FIND
• Smaller difference = MORE DIFFICULT TO FIND
• To find a small difference you need a big study
