
Intro to Statistics Part2



  1. Intro to Statistics Part2 Arier Lee University of Auckland

  2. Standard error • Standard error – the standard deviation of the sampling distribution of a statistic • The standard deviation of the sample means is called the standard error of the mean; it measures how precisely the population mean is estimated by the sample mean • The standard error is a measure of the precision of the estimated mean, whereas the standard deviation summarises the variability, or spread, of the observations • Standard error ≤ standard deviation, since SE = SD/√n • The larger the sample size, the smaller the standard error
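The relationship SE = SD/√n on this slide can be sketched in Python (an illustrative sketch, not from the slides; the function name and example numbers are my own):

```python
import math

def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: the standard deviation of the
    sampling distribution of the sample mean, SD / sqrt(n)."""
    return sd / math.sqrt(n)

# The larger the sample size, the smaller the standard error:
print(standard_error(15, 25))   # 3.0
print(standard_error(15, 100))  # 1.5 -- 4x the sample, half the SE
```

Quadrupling the sample size only halves the standard error, which is why precision gains become expensive as studies grow.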

  3. Confidence intervals • A 95% confidence interval for a mean is calculated by (mean-1.96*SE, mean+1.96*SE) • An example: In a sample of 2000 pregnant women, serum cholesterol was measured and it was found that the sample mean is 5.62 and SE=0.15. 95% confidence interval: (5.33, 5.91)
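The slide's interval arithmetic can be reproduced with a small helper (a sketch; the function name is mine, the numbers are the slide's serum cholesterol example):

```python
def ci95(mean: float, se: float) -> tuple[float, float]:
    """95% confidence interval for a mean: mean ± 1.96 × SE."""
    return (mean - 1.96 * se, mean + 1.96 * se)

# Serum cholesterol in 2000 pregnant women: mean 5.62, SE 0.15
lo, hi = ci95(5.62, 0.15)
print(round(lo, 2), round(hi, 2))  # 5.33 5.91
```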

  4. Confidence intervals • A 95% CI does not mean that there is a 95% chance that the true mean lies between 5.33 and 5.91 • If we repeated the study over and over again, calculating a 95% confidence interval each time, about 95 of every 100 such intervals would include the true mean • Whether the one we have obtained from our study is one of them we will never know – but we have some confidence • It is a measure of the precision of our estimate • A wider confidence interval means less precision
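The repeated-sampling interpretation above can be checked by simulation (a sketch under assumed values; the true mean, SD, sample size and repetition count are made up for illustration):

```python
import math
import random

random.seed(1)
mu, sigma = 5.6, 0.7   # assumed "true" population mean and SD
n, reps = 50, 2000     # sample size per study, number of repeated studies

covered = 0
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    se = sd / math.sqrt(n)
    lo, hi = mean - 1.96 * se, mean + 1.96 * se
    if lo <= mu <= hi:
        covered += 1

print(covered / reps)  # close to 0.95
```

Roughly 95% of the simulated intervals contain the true mean, as the slide describes; any single interval either contains it or it does not.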

  5. Graphical presentation of the data • Exploratory data analysis • Presentation of results • Examples: Bar charts, Line graphs, Scatter plots, Box plots, Kaplan Meier Plots etc. • Graphs can only be as good as the data they display • No amount of creativity can produce a good graph from dubious data

  6. Bar chart 2005 maternity report

  7. Line graph

  8. Box plot [Figure: annotated box plot – Q1, median and Q3 form the box; whiskers extend up to 1.5 × (Q3 − Q1); the smallest observation within that range marks the end of the lower whisker; observations beyond the end of a whisker are plotted individually]

  9. Data to chart ratio [Figure: two charts of mental health score by treatment group, one with a good and one with a bad data-to-chart ratio]

  10. Inadequate chart type Effect of ethnicity on road traffic injury deaths and hospitalisations, 2000-8, Auckland region, by age group, adjusted for gender and deprivation (using National Minimum Data Set and Mortality Collection data) Graphs of risk or rate ratios should be presented with • Points with error bars • A log scale

  11. Odds ratio presented with logarithmic scale Outcome: Blindness

  12. Unnecessary 3D effects How often do you read to your child?

  13. Inadequate labelling

  14. Graphical presentation of the data • Use the appropriate graph type for the purpose, e.g. a line chart for a trend • All axes, tick marks and titles should be labelled • Use an appropriate scale • Maintain an adequate data-to-chart ratio • Avoid unnecessary complexity such as • Irrelevant decoration • Too many colours • 3D effects • Keep it simple!

  15. Research process Research question → Primary and secondary endpoints → Study design → Sampling and/or randomisation scheme → Power and sample size calculation → Pre-define analysis methods → Analyse data → Interpret results → Disseminate

  16. Sample size and power of a study • One of the statistical, economic and ethical issues in the design of medical studies • Statistical: ensure the study is large enough to detect an effect if it exists • Economic: ensure we do not enlist more patients than are needed • Ethical: it is unethical to engage more people in a trial than are needed • Larger samples -> more precise estimates • How large?

  17. Sample size and power of a study • The power of a test is the probability of detecting a true difference • The size of the sample needed depends on • required power • detectable difference • variability in the population • level of significance (probability of falsely rejecting the null hypothesis) • statistical test being used • We need information to calculate a meaningful sample size – literature search

  18. Sample size and power of a study- an example • A double blind randomised controlled study on treatment for chronic hypertension during pregnancy • Comparing two treatments: • Standard treatment • New treatment

  19. Sample size and power of a study- an example • Based on current evidence, assume • Detectable difference: 10mmHg • Standard deviation: 15 mmHg • 90% power • 5% significance level • Two-sided test • 1:1 ratio • Using PS (a power and sample size calculation software) – 48 subjects per group • After considering drop-out rate, say 10%, round to, say, 60 subjects per group
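The 48-per-group figure can be reproduced with the standard two-sample formula n = 2(z₁₋α/₂ + z_power)²(σ/δ)², using only the Python standard library (a sketch of the calculation, not the PS software the slide uses; the function name is mine):

```python
import math
from statistics import NormalDist

def n_per_group(delta: float, sd: float, power: float = 0.90,
                alpha: float = 0.05) -> int:
    """Per-group sample size for a two-sided, two-sample comparison
    of means with 1:1 allocation."""
    z = NormalDist()
    z_a = z.inv_cdf(1 - alpha / 2)  # 1.96 for a 5% two-sided test
    z_b = z.inv_cdf(power)          # 1.28 for 90% power
    return math.ceil(2 * (z_a + z_b) ** 2 * (sd / delta) ** 2)

# Detectable difference 10 mmHg, SD 15 mmHg, 90% power, 5% significance
print(n_per_group(delta=10, sd=15))  # 48, matching the slide
```

Note how the sample size scales with (SD/difference)²: halving the detectable difference quadruples the required n, which is what slide 20 illustrates by varying the SD.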

  20. Sample size and power of a study Chronic hypertension during pregnancy example • To detect a difference of 10mmHg • SD varies from 5 to 30mmHg

  21. Sample size and power of a study A sample size calculation is an evidence-based best guess • It relies on assumptions • It is not a precise number • It is no guarantee of a significant effect at the end of the study

  22. Any Questions?
