P Values

Robin Beaumont, 8/2/2012
With much help from Professor Chris Wild's material, University of Auckland

Where do they fit in?

Putting it all together: P value = probability, sampling, statistic, rule

A P value is a special type of probability:
• It considers more than one outcome (one event can have more than one outcome)
• It is a conditional probability
Probability values:
• A typical probability value: 0.25
• A probability must be between 0 and 1
• e.g. probability of winning the lottery: yes = 0.0000001, no = 0.9999999
• All possible outcomes at any one time must add up to 1

Probabilities are relative frequencies

Probability Density Function
[Figure: distribution of 48 scores; x axis = scores (33 to 87), y axis = probability density; area A lies below a score of 45, area B lies above a score of 50]
• The total area = 1
• p(score < 45) = area A
• p(score > 50) = area B
• Multiple outcomes at any one time: P(score < 45 or score > 50) = area A + area B. Just add up the individual outcomes

The ‘more extreme’ idea
[Figure: density curves for the normal distribution (x from -3 to 3) and the chi-squared distribution with df = 9 (x from 0 to 30)]
The probability of a value more extreme?

What happens if events affect each other? = Conditional probability
Example from Taylor, From patient data to medical knowledge, p160
• 20 in a room: 8 female + 12 male, 4 of whom have a beard
• P(bearded) = 4/20 = 0.2
• P(male) = 12/20 = 0.6
• So does the probability of being a bearded male = 0.2 x 0.6 = 0.12? NO
• Multiply along each branch of the tree to get the end value
• First branches: P(male) = 12/20 = 0.6, P(female) = 8/20 = 0.4
• Second branches: P(bearded|male) = 4/12 = 0.3333, P(clear|male) = 1 - 0.3333 = 0.6667
• P(bearded AND male) = P(male) x P(bearded|male) = 0.6 x 0.3333 = 0.2
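The room example can be checked with exact fractions. This is a minimal sketch using only the counts given on the slide; the variable names are illustrative.

```python
from fractions import Fraction

# Counts from the room example: 20 people, 12 male, 4 bearded (all of them male).
total, males, bearded_males = 20, 12, 4

p_male = Fraction(males, total)                        # 12/20
p_bearded = Fraction(bearded_males, total)             # 4/20
p_bearded_given_male = Fraction(bearded_males, males)  # 4/12

# Wrong: multiplying the two simple probabilities treats sex and beard as unrelated.
naive = p_male * p_bearded

# Right: multiply along the branch of the tree.
p_male_and_bearded = p_male * p_bearded_given_male

print(float(naive))               # 0.12
print(float(p_male_and_bearded))  # 0.2
```

The branch product matches the direct count: 4 bearded males out of 20 people.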

Screening Example
• 0.1% of the population (i.e. 1 in a thousand) carry a particular faulty gene.
• A test exists for detecting whether an individual is a carrier of the gene.
• In people who actually carry the gene, the test provides a positive result with probability 0.9: 90% of the time we get the correct result.
• In people who don’t carry the gene, the test provides a positive result with probability 0.01: 1% of the time we get an incorrect positive result.
• Let G = person carries gene, G' = person does not carry gene, P = test is positive for gene, N = test is negative for gene.
• Given that someone has a positive result, find the probability that they actually are a carrier of the gene. We want to find P(G | P).
• We need P(P), looking at the two branches that end in P: P(P) = P(G and P) + P(G' and P) = 0.0009 + 0.00999 = 0.01089
• P(G | P) = P(G and P) / P(P) = 0.0009 / 0.01089 ≈ 0.083
• Errors: P(P | G) ≠ P(G | P). ORDER MATTERS
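The screening calculation can be followed branch by branch. This sketch uses only the three probabilities stated in the example; the variable names are illustrative.

```python
# Probabilities stated in the screening example.
p_gene = 0.001              # P(G): 1 in a thousand carry the gene
p_pos_given_gene = 0.9      # P(P | G)
p_pos_given_no_gene = 0.01  # P(P | G')

# Multiply along the two branches of the tree that end in a positive test.
p_gene_and_pos = p_gene * p_pos_given_gene              # 0.0009
p_no_gene_and_pos = (1 - p_gene) * p_pos_given_no_gene  # 0.00999

# Total probability of a positive test, then reverse the conditioning.
p_pos = p_gene_and_pos + p_no_gene_and_pos              # 0.01089
p_gene_given_pos = p_gene_and_pos / p_pos

print(round(p_pos, 5))             # 0.01089
print(round(p_gene_given_pos, 4))  # 0.0826
```

So P(P | G) is 0.9 while P(G | P) is only about 0.08: the order of the condition matters.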

Disease / Test = Conditional probability
P(disease X AND test+) = P(disease) x P(test+|disease)

observed | hypothesised
• Not this: P(hypothesised value|summary value=x) = the probability of obtaining the hypothesised value GIVEN THAT we obtained the summary value x
• But this: P(summary value=x|hypothesised value) = the probability of obtaining summary value x GIVEN THAT I have this hypothesised value

Combining conditional probability + multiple outcomes = P value
[Figure: chi-squared distribution, df = 9; density plotted for values from 0 to 30; the area to the right of 15 is shaded blue, area = 0.0909]
• Here we have a probability distribution of possible observed values for the chi-square summary statistic GIVEN THAT the hypothesised value is ZERO
• A P value is a conditional probability considering a range of outcomes
• The blue bit represents all those values greater than 15: area = 0.0909. This is the P value
• P value = P(observed chi square value or one more extreme | value = 0)
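The shaded area can be approximated without a statistics package. The sketch below integrates the chi-squared density numerically with Simpson's rule; the function names and the step count are choices made for this illustration, and the density helper assumes df greater than 2.

```python
import math

def chi2_pdf(x, df):
    """Chi-squared density with df degrees of freedom (this sketch assumes df > 2)."""
    if x <= 0:
        return 0.0
    k = df / 2
    return x ** (k - 1) * math.exp(-x / 2) / (2 ** k * math.gamma(k))

def upper_tail(observed, df, steps=2000):
    """P(chi-square >= observed): 1 minus the area from 0 to observed (Simpson's rule)."""
    h = observed / steps  # steps must be even for Simpson's rule
    total = chi2_pdf(0, df) + chi2_pdf(observed, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * chi2_pdf(i * h, df)
    return 1 - total * h / 3

# Observed chi-square value of 15 with df = 9, as in the figure.
p_value = upper_tail(15, df=9)
print(round(p_value, 4))  # 0.0909
```

The result is the area of all outcomes at least as extreme as 15, computed under the condition that the hypothesised value holds.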

Probability summary
• All outcomes at any one time add up to 1
• Probability histogram: area under curve = 1
• Specific areas = sets of outcomes
• “More extreme than x”
• Conditional probability: ORDER MATTERS
• A P value is a conditional probability which considers a range of outcomes

Putting it all together: P value = probability, sampling, statistic, rule

Populations and samples
• Population value: ever constant, at least for your study! = Parameter
• Sample value: estimate = statistic

One sample

Size matters – single samples

Size matters – multiple samples

We only have a rippled mirror

Standard deviation - individual level
• Standard deviation = measure of variability
• 'Standard Normal distribution': total area = 1
• Area within 1 SD of the mean = 68%; within 2 SD = 95%
• Between + and - three standard deviations from the mean = 99.7% of area
• Therefore only 0.3% of the area (scores) is more than 3 standard deviations ('units') away
• But this does not take into account small sample size = t distribution, defined by a sample size aspect ~ df
• Area! Wait and see
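The 68%, 95% and 99.7% areas can be reproduced from the standard normal distribution. This sketch uses Python's standard library `statistics.NormalDist`; the helper name `area_within` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def area_within(k):
    """Area under the standard normal curve between -k and +k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    inside = area_within(k)
    print(f"within {k} SD: {inside:.4f}   more extreme: {1 - inside:.4f}")
```

The areas come out as roughly 0.6827, 0.9545 and 0.9973, which the slide rounds to 68%, 95% and 99.7%.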

Sampling level - ‘accuracy’ of estimate
• Talking about means here
• We can predict the accuracy of your estimate (mean) by just using the SEM formula, from a single sample: SEM = SD/√n
• SD = 5, n = 5: SEM = 5/√5 = 2.236
• SD = 5, n = 25: SEM = 5/√25 = 1
From: http://onlinestatbook.com/stat_sim/sampling_dist/index.html
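A small simulation can show what the SEM formula predicts: the spread of the means of many repeated samples. This is a sketch under stated assumptions: the population is taken to be normal with a hypothetical mean of 50, and only the SD of 5 and the sample sizes 5 and 25 come from the slide.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 50, 5  # hypothetical population; only SD = 5 is from the slide
REPEATS = 5000            # number of repeated samples per sample size

def sd_of_sample_means(n):
    """Draw REPEATS random samples of size n and return the SD of their means."""
    means = [
        statistics.fmean(random.gauss(POP_MEAN, POP_SD) for _ in range(n))
        for _ in range(REPEATS)
    ]
    return statistics.stdev(means)

for n in (5, 25):
    sem = POP_SD / math.sqrt(n)  # SEM formula: SD / sqrt(n)
    print(f"n={n}: SEM formula = {sem:.3f}, simulated = {sd_of_sample_means(n):.3f}")
```

The simulated spread should land close to 2.236 for n = 5 and close to 1 for n = 25, although in practice the formula is applied to a single sample.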

Example - Bradford Hill (Bradford Hill, 1950 p.92)
• Mean systolic blood pressure for 566 males around Glasgow = 128.8 mm. Standard deviation = 13.05
• Determine the ‘precision’ of this mean.
• “We may conclude that our observed mean may differ from the true mean by as much as ± 2.194 (.5485 x 4) but not more than that in around 95% of observations.” page 93 [edited]
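The figures in the Bradford Hill example follow from the SEM formula. The sketch below recomputes them; reading 2.194 as four SEMs, that is the full width of the range mean ± 2 SEM, is an interpretation made here rather than something the slide states.

```python
import math

sd, n = 13.05, 566  # standard deviation and sample size from the example

sem = sd / math.sqrt(n)  # standard error of the mean

print(round(sem, 4))      # 0.5485
print(round(2 * sem, 3))  # 1.097, half-width of the usual mean +/- 2 SEM range
print(round(4 * sem, 3))  # 2.194, the .5485 x 4 figure in the quotation
```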

Sampling summary
• The SEM formula allows us to predict the accuracy of our estimate (i.e. the mean value of our sample)
• From a single sample
• Assumes a random sample

Variation: what have we ignored? On to probability now

Putting it all together: P value = probability, sampling, statistic, rule

Statistics
• Summary measure: SEM, average etc.
• T statistic: different types, simplest: t = (observed sample mean - hypothesised population mean) / SEM
• So when t = 0: 0/anything = 0, the estimated and hypothesised population means are equal
• So when t = 1: the observed difference is the same size as the SEM
• So when t = 10: the observed difference is much greater than the SEM

T statistic example
• Serum amylase values from a random sample of 15 apparently healthy subjects. The mean = 96, SD = 35 units/100 ml.
• How likely is it that such a sample would be obtained from a population of serum amylase determinations with a mean of 120? (taken from Daniel 1991 p.202, adapted)
• A population value = the null hypothesis
• This looks like a rare occurrence? But for what?
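The t statistic for the serum amylase example can be worked through in a few lines. This sketch uses the simplest one-sample form, t = (sample mean - hypothesised mean) / SEM, with the numbers from the example; the variable names are illustrative.

```python
import math

sample_mean, sd, n = 96, 35, 15  # from the serum amylase sample
hypothesised_mean = 120          # the population value = null hypothesis

sem = sd / math.sqrt(n)                      # standard error of the mean
t = (sample_mean - hypothesised_mean) / sem  # difference measured in SEM units

print(round(sem, 3))  # 9.037
print(round(t, 3))    # -2.656
```

The observed mean is about 2.7 standard errors below the hypothesised mean.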

What does the shaded area mean?
[Figure: t density for n = 15, SEM = 9.037; in original units the observed mean 96 is set against the hypothesised mean 120; the areas beyond t = -2.656 and t = 2.656 are shaded, total shaded area = 0.0188]
• Given that the sample was obtained from a population with a mean of 120, a sample with a t statistic (n = 15) of -2.656 or 2.656 or one more extreme will occur 1.88% of the time = just under two samples per hundred on average.
• Given that the sample was obtained from a population with a mean of 120, a sample of 15 producing a mean of 96 (120 - x where x = 24) or 144 (120 + x where x = 24) or one more extreme will occur 1.88% of the time, that is just under two samples per hundred on average.
But is this not a P value?
p = 2 · P(t(n−1) < −|t| | Ho is true) = 2 · [area to the left of −|t| under a t distribution with df = n − 1]

P value and probability for t statistic
• p value = 2 x P(t(n−1) values more extreme than the observed t | Ho is true) = 2 · [area to the left of −|t| under a t distribution with df = n − 1]
• A p value is a special type of probability with: multiple outcomes + conditional upon the specified parameter value
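The two-sided p value for the serum amylase t statistic can be approximated by integrating the t density numerically. This is a sketch only: the function names and the use of Simpson's rule are choices made for this illustration, with t = -2.656 and df = 14 taken from the example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Area in both tails beyond |t|, given that the null hypothesis is true."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3  # area between -|t| and +|t|, by symmetry
    return 1 - central

p = two_sided_p(-2.656, df=14)
print(round(p, 3))  # close to the 0.0188 shaded area on the slide
```

Because the t distribution is symmetric, one minus the central area equals twice the area to the left of -|t|.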

Putting it all together: P value = probability, sampling, statistic, rule
The rule: do we need it?

Rules
[Figure: t density for n = 15, SEM = 9.037; observed mean 96 against hypothesised mean 120; the areas beyond t = -2.656 and t = 2.656 are shaded, total shaded area = 0.0188]
• Set a level of acceptability = critical value (CV)! Say one in twenty (1/20 = 0.05), or 1/100, or 1/1000, or . . .
• If our result has a P value of less than our level of acceptability: reject the parameter value.
• Say 1 in 20 (i.e. CV = 0.05): given that the sample was obtained from a population with a mean (parameter value) of 120, a sample with a t statistic (n = 15) of -2.656 or 2.656 or one more extreme will occur 1.88% of the time. This is less than one in twenty, therefore we dismiss the possibility that our sample came from a population with a mean of 120.
• What do we replace it with?
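The rule is a plain comparison of the P value with the chosen level of acceptability. In this sketch the function name `reject` is illustrative, and the P value is the shaded area from the serum amylase example.

```python
def reject(p_value, level=0.05):
    """Apply the rule: reject the parameter value when the P value is below the level."""
    return p_value < level

p_value = 0.0188  # shaded area from the serum amylase example

for label, level in (("1 in 20", 1 / 20), ("1 in 100", 1 / 100), ("1 in 1000", 1 / 1000)):
    print(label, reject(p_value, level))
# 1 in 20 True
# 1 in 100 False
# 1 in 1000 False
```

The same result is rejected at one level and kept at another, so the level has to be set before looking at the data.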

• Fisher: only know and only consider the model we have, i.e. the parameter we have used in our model. When we reject it we accept that any value but that one can replace it.
• Neyman and Pearson + Gosset: must have an alternative specified value for the parameter

Power - sample size
• Effect size: indication of clinical importance
• If there is an alternative, what is it? Another distribution!
• Serum amylase example again (Daniel 1991 p.202, adapted): random sample of 15 apparently healthy subjects, mean = 96, SD = 35 units/100 ml, hypothesised population mean = 120

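[Figure: two sampling distributions, one centred on the hypothesised mean 120 and one on the alternative mean 96; α = the reject region; areas labelled as correct decisions and incorrect decisions]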

• Insufficient power: never get a significant result even when the effect size is large
• Too much power: get a significant result with a trivial effect size
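How power depends on sample size and effect size can be sketched with a normal approximation. This is an illustration under stated assumptions, not the calculation behind the slides: it treats the SD as known and uses a two-sided z test in place of the t test, and the sample sizes other than 15, along with the one-unit "trivial" difference, are hypothetical.

```python
import math
from statistics import NormalDist

def approx_power(null_mean, alt_mean, sd, n, alpha=0.05):
    """Approximate power of a two-sided z test when the true mean is alt_mean."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)                      # edge of the reject region
    shift = abs(alt_mean - null_mean) / (sd / math.sqrt(n))  # difference in SEM units
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Serum amylase set-up: null mean 120, alternative mean 96, SD 35.
for n in (5, 15, 50):
    print(n, round(approx_power(120, 96, 35, n), 3))

# A trivial difference of 1 unit still gives high power with a very large sample.
print(round(approx_power(120, 119, 35, 20000), 3))
```

With n = 5 the large difference is detected only about a third of the time, while with n = 20000 a difference of one unit is detected almost always.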

Life after P values
• Confidence intervals
• Effect size
• Description / analysis
• Bayesian statistics - qualitative approach by the back door!
Planning to do statistics for your dissertation? See my medical statistics courses:
• Course 1: www.robin-beaumont.co.uk/virtualclassroom/stats/course1.html
• YouTube videos to accompany course 1: http://www.youtube.com/playlist?list=PL9F0EBD42C0AB37D0
• Course 2: www.robin-beaumont.co.uk/virtualclassroom/stats/course2.html
• YouTube videos to accompany course 2: http://www.youtube.com/playlist?list=PL05FC4785D24C6E68

Your attitude to your data
