Clinical Trial Investigation
Interpretation of Results: “to p or not to p”
• Ferran Torres
• Hospital Clínic Barcelona / Universitat Autònoma Barcelona
• EMA:
  • Scientific Advice Working Party (SAWP)
  • Biostatistics Working Party (BSWP)
Ferran.Torres@uab.es
p
Today’s talk is on statistics
Statistical Considerations
Basic statistics
• Why Statistics?
• Samples and populations
• P-value
• Random and systematic errors
• Statistical errors
• Sample size
• Confidence intervals
• Interpretation of CI: superiority, non-inferiority, equivalence
The role of statistics
“Thus statistical methods are no substitute for common sense and objectivity. They should never aim to confuse the reader, but instead should be a major contributor to the clarity of a scientific argument.”
Pocock SJ. The role of statistics. Br J Psychiat 1980; 137:188-190
Why Statistics? Variation!!!!
Variability
Why Statistics?
• Medicine is a quantitative science, but not an exact one
  • Not like physics or chemistry
• Variation characterises much of medicine
• Statistics is about handling and quantifying variation and uncertainty
• Humans differ in response to exposure to adverse effects
  Example: not every smoker dies of lung cancer, and some non-smokers die of lung cancer
• Humans differ in response to treatment
  Example: penicillin does not cure all infections
• Humans differ in disease symptoms
  Example: sometimes cough and sometimes wheeze are the presenting features of asthma
Why Statistics Are Necessary
• Statistics can tell us whether events could have happened by chance and help us make decisions
• We need to use statistics because of variability in our data
• Generalize: can what we know help to predict what will happen in new and different situations?
Population and Samples
[Diagram labels: Target Population, Population of the Study, Sample]
Extrapolation
[Diagram: Study Results from the Sample lead, through Inferential analysis (Statistical Tests, Confidence Intervals), to “Conclusions” about the Population]
Statistical Inference
• Statistical Tests => p-value
• Confidence Intervals
Valid samples?
[Diagram: samples drawn from the Population; labels: Likely to occur, Unlikely to occur, Invalid Sample and Conclusions]
P-value
The p-value is a “tool” to answer the question: Could the observed results have occurred by chance*?
Remember:
• Decision given the observed results in a SAMPLE
• Extrapolating results to the POPULATION
p < .05: “statistically significant”
*: accounts exclusively for random error, not bias
P-value: an intuitive definition
• The p-value is the probability of observing data at least as extreme as ours when the null hypothesis is true (no differences exist)
• Steps:
  1. Calculate the treatment difference in the sample (A-B)
  2. Assume that both treatments are equal (A=B) and then…
  3. …calculate the probability of obtaining a difference at least as large as the one observed, given the assumption in step 2
• We conclude according to the probability:
  • p<0.05: the differences are unlikely to be explained by chance; we assume that the treatment explains the differences
  • p>0.05: the differences could be explained by chance; we do not rule out chance as the explanation
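The three steps above can be sketched as a permutation test, which is only one of several ways to obtain a p-value. The data values and the function name `permutation_p_value` are invented for illustration and do not come from the talk.

```python
import random

def permutation_p_value(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation p-value for a difference in means (illustrative sketch)."""
    rng = random.Random(seed)
    # Step 1: observed treatment difference in the sample (A-B).
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_perm):
        # Step 2: if A=B, the treatment labels are interchangeable, so shuffle them.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a))
        # Step 3: count differences at least as large as the observed one.
        if diff >= observed - 1e-12:
            extreme += 1
    # The +1 terms count the observed arrangement itself.
    return (extreme + 1) / (n_perm + 1)

# Hypothetical reductions in systolic blood pressure (mm Hg), assumed data.
treatment_a = [12, 15, 9, 14, 11, 16, 13, 10]
treatment_b = [8, 7, 11, 6, 9, 10, 5, 8]
p = permutation_p_value(treatment_a, treatment_b)
print(f"p = {p:.4f}")
```

Shuffling the labels is a direct way of "assuming A=B"; a t-test reaches a similar answer by a formula instead of by resampling.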
Factors influencing statistical significance
• Difference: the signal
• Variance (SD): the noise (background)
• Quantity of data: the quantity
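One common way to see how these three factors combine is the z statistic for a difference in means. The setting is an assumption not stated on the slide: two groups of $n$ patients each and a common, known standard deviation $\sigma$.

```latex
z = \frac{\bar{x}_A - \bar{x}_B}{\sigma \sqrt{2/n}}
```

The numerator is the signal, $\sigma$ is the noise and $n$ is the quantity of data; a larger $|z|$ corresponds to a smaller p-value.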
Random vs Systematic error
Example: Systolic Blood Pressure (mm Hg)
[Figure: two panels, each showing measurements 01 to 05 on a 130 to 170 scale around the True Value; one panel is labelled Random, the other Systematic (Bias)]
Random vs Systematic error
[Figure: two plots against Sample size, one for Random error and one for Bias]
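The usual contrast between the two kinds of error can be illustrated with a small simulation: averaging more readings shrinks random error, while a systematic offset stays. All numbers here (true value 150 mm Hg, SD 10, bias +5) and the function name `mean_error` are assumptions chosen for illustration, not values taken from the figure.

```python
import random

def mean_error(n, bias, true_value=150.0, sd=10.0, seed=0):
    """Error of the mean of n simulated readings relative to the true value."""
    rng = random.Random(seed)
    # Each reading = true value + systematic offset (bias) + random noise.
    readings = [rng.gauss(true_value + bias, sd) for _ in range(n)]
    return sum(readings) / n - true_value

for n in (10, 1_000, 100_000):
    random_only = mean_error(n, bias=0.0)
    with_bias = mean_error(n, bias=5.0)
    print(f"n={n:>6}: random only {random_only:+.2f}, with bias {with_bias:+.2f}")
```

In this sketch the "random only" error tends towards 0 as n grows, whereas the biased error tends towards +5 however large the sample is.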
P-value
• A “statistically significant” result (p<.05) tells us NOTHING about clinical or scientific importance, only that the results are unlikely to be due to chance alone.
• A p-value does NOT account for bias, only for random error.
P-value
• A “very low” p-value does NOT imply:
  • Clinical relevance (NO!!!)
  • Magnitude of the treatment effect (NO!!)
• With ↑ n or ↓ variability: ↓ p
• Please never compare p-values!! (NO!!!)
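A minimal sketch of why a low p-value says little about the size of the effect: the same observed difference gives very different p-values depending on n. It assumes a two-sided z-test with a known common SD, and the numbers (difference 2, SD 10) and the name `two_sided_p` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(diff, sd, n_per_group):
    """Two-sided z-test p-value for a difference in means (known, common SD assumed)."""
    se = sd * sqrt(2 / n_per_group)   # standard error of the difference
    z = diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same hypothetical effect (2 units, SD 10), two different trial sizes.
p_small_trial = two_sided_p(diff=2.0, sd=10.0, n_per_group=20)
p_large_trial = two_sided_p(diff=2.0, sd=10.0, n_per_group=2000)
print(p_small_trial, p_large_trial)
```

The effect is identical in both calls; only the amount of data changes, which is one reason p-values from different studies should not be compared as if they measured effect size.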
RCT from a statistical point of view
[Diagram: 1 homogeneous population → Randomisation → Treatment A and Treatment B (control) → 2 distinct populations]
RCT
[Diagram labels: Sample, Population]
Statistics can never PROVE anything beyond any doubt, just beyond reasonable doubt!!
• … because we work with samples and random error
Type I & II Error & Power
The Usefulness of Believing in the Existence of God (according to Pascal)
• H0: God does not exist
• H1: God exists
Type I & II Error & Power
• Type I Error (α)
  • False positive
  • Rejecting the null hypothesis when in fact it is true
  • Standard: α=0.05
  • In words, chance of finding statistical significance when in fact there truly was no effect
• Type II Error (β)
  • False negative
  • Accepting the null hypothesis when in fact the alternative is true
  • Standard: β=0.20 or 0.10
  • In words, chance of not finding statistical significance when in fact there was an effect
• Power = 1 − β
  • In words, chance of finding statistical significance when in fact there was an effect
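Both error rates can be checked by simulating many trials. The sketch below assumes a two-sided z-test with a known SD of 1, 64 patients per group and a true effect of 0.5 SD under the alternative; these numbers and the name `rejection_rate` are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist

def rejection_rate(true_diff, n_per_group=64, sd=1.0, alpha=0.05,
                   n_trials=2000, seed=1):
    """Fraction of simulated trials in which H0 (no difference) is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sd * sqrt(2 / n_per_group)
    rejections = 0
    for _ in range(n_trials):
        mean_a = sum(rng.gauss(true_diff, sd) for _ in range(n_per_group)) / n_per_group
        mean_b = sum(rng.gauss(0.0, sd) for _ in range(n_per_group)) / n_per_group
        if abs(mean_a - mean_b) / se > z_crit:
            rejections += 1
    return rejections / n_trials

type_i_rate = rejection_rate(true_diff=0.0)   # H0 true: rejections are false positives
power = rejection_rate(true_diff=0.5)         # H1 true: rejections are true positives
type_ii_rate = 1 - power                      # H1 true: non-rejections are false negatives
print(type_i_rate, power, type_ii_rate)
```

Under these assumptions the first rate should land close to α=0.05 and the power close to 0.80, that is, β close to 0.20.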
Sample Size
• The planned number of participants is calculated on the basis of:
  • Expected effect of treatment(s): ↗ effect, ↘ number
  • Variability of the chosen endpoint: ↗ variability, ↗ number
  • Accepted risks in conclusion: ↗ risk, ↘ number
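The three dependencies above can be seen in the standard normal-approximation formula for comparing two means, n per group = 2(z₁₋α/₂ + z₁₋β)²σ²/δ². The sketch assumes a two-sided test, equal group sizes and a known SD; the function name `n_per_group` and the example numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect, sd, alpha=0.05, power=0.80):
    """Participants per group to detect `effect` (normal approximation, two-sided test)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # accepted type I risk
    z_beta = z.inv_cdf(power)            # power = 1 - accepted type II risk
    return ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)

base = n_per_group(effect=5, sd=10)
print("baseline:        ", base)
print("larger effect:   ", n_per_group(effect=10, sd=10))
print("more variability:", n_per_group(effect=5, sd=20))
print("more risk:       ", n_per_group(effect=5, sd=10, alpha=0.10))
```

Each call changes one input at a time, so the direction of the arrows on the slide can be read directly from the printed numbers.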
Interval Estimation
“A probability that the population parameter falls somewhere within the interval”
[Diagram: Confidence interval around the Sample statistic (point estimate), from the Confidence limit (lower) to the Confidence limit (upper)]
95%CI
• Better than p-values…
• …use the data collected in the trial to give an estimate of the treatment effect size, together with a measure of how certain we are of our estimate
• A CI is a range of values within which the “true” treatment effect is believed to be found, with a given level of confidence.
• If the trial were repeated many times, 95% of the 95% CIs calculated this way would contain the “true” treatment effect
• Generally, the 95% CI is calculated as:
  Sample Estimate ± 1.96 × Standard Error
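A minimal sketch of the "estimate ± 1.96 × standard error" rule for a difference in means. The data are invented, the name `mean_difference_ci` is hypothetical, and the normal approximation is used as on the slide; with samples this small a t-based interval would normally be preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_difference_ci(a, b, level=0.95):
    """Normal-approximation CI for mean(a) - mean(b): estimate ± z × standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)          # about 1.96 for a 95% CI
    estimate = mean(a) - mean(b)
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return estimate - z * se, estimate, estimate + z * se

# Hypothetical reductions in systolic blood pressure (mm Hg), assumed data.
treatment_a = [12, 15, 9, 14, 11, 16, 13, 10]
treatment_b = [8, 7, 11, 6, 9, 10, 5, 8]
lower, estimate, upper = mean_difference_ci(treatment_a, treatment_b)
print(f"difference = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Unlike a bare p-value, the output reports both the size of the effect and the uncertainty around it.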
Superiority study
[Figure: 95% CI plotted on the d axis; d < 0: − effect (Control better); d = 0: No differences; d > 0: + effect (Test better)]
[Figure: confidence intervals for Treatment-Control plotted against 0 and the Lower and Upper equivalence boundaries (left: Treatment less effective; right: Treatment more effective), illustrating: Statistical and Clinical superiority, Statistical Superiority, Non-inferiority, Equivalence, Inferiority]
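How a confidence interval is read against the boundaries can be sketched as below. The decision rules are commonly used conventions, not rules stated on the slide: the difference is Treatment − Control with higher values meaning a better treatment, the boundaries are symmetric at ±`margin`, and "inferior" means the whole interval lies below 0. The function name `interpret_ci` is hypothetical.

```python
def interpret_ci(lower, upper, margin):
    """Labels for a CI of (Treatment - Control), under the assumptions stated above."""
    labels = []
    if lower > margin:
        labels.append("statistically and clinically superior")  # CI above upper boundary
    elif lower > 0:
        labels.append("statistically superior")                 # CI above 0
    if lower > -margin:
        labels.append("non-inferior")                           # CI above lower boundary
    if lower > -margin and upper < margin:
        labels.append("equivalent")                             # CI within both boundaries
    if upper < 0:
        labels.append("inferior")                               # CI below 0
    return labels or ["inconclusive"]

print(interpret_ci(2.3, 6.7, margin=2.0))
print(interpret_ci(0.5, 3.0, margin=2.0))
print(interpret_ci(-1.0, 1.0, margin=2.0))
print(interpret_ci(-5.0, -1.0, margin=2.0))
```

A single interval can earn more than one label (for example, superior and non-inferior), which is why the function returns a list.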
Scales for measuring effect: risks
Calculating RR and OR
• RR or OR > 1: risk factor
• RR or OR = 1: absence of ‘effect’
• RR or OR < 1: protective factor
Calculating RR and OR
[Figure: Diseased among the Exposed and the Unexposed]
• Proportion in the exposed: 0.50
• Proportion in the unexposed: 0.25
• RR = 2
• Odds in the exposed: 2/2 = 1
• Odds in the unexposed: 1/3
• OR = 3
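The arithmetic above can be reproduced with exact fractions. The counts (2 diseased out of 4 exposed, 1 diseased out of 4 unexposed) are read off the odds quoted on the slide (2/2 and 1/3); the function name is hypothetical.

```python
from fractions import Fraction

def risk_ratio_and_odds_ratio(exp_cases, exp_total, unexp_cases, unexp_total):
    """Return (RR, OR) from diseased counts and group sizes."""
    risk_exp = Fraction(exp_cases, exp_total)                     # 2/4 = 0.50
    risk_unexp = Fraction(unexp_cases, unexp_total)               # 1/4 = 0.25
    odds_exp = Fraction(exp_cases, exp_total - exp_cases)         # 2/2 = 1
    odds_unexp = Fraction(unexp_cases, unexp_total - unexp_cases) # 1/3
    return risk_exp / risk_unexp, odds_exp / odds_unexp

rr, odds_ratio = risk_ratio_and_odds_ratio(2, 4, 1, 4)
print(f"RR = {rr}, OR = {odds_ratio}")
```

The same data give RR = 2 but OR = 3: the two scales agree on the direction of the effect but not on its size, and they only come close when the disease is rare.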
Let's be critical
• Sometimes things are not what they seem
Let's be critical
Obtaining the results
• Is the statistical technique used appropriate?
  • T-test
  • Repeated-measures ANOVA
Let's be critical
Can I trust the value?
• Claims with no results reported
• Percentages without the denominator
• Means without a confidence interval
Let's be critical
Yet another example
• A patient is advised to have surgery and asks about the probability of surviving.
• The surgeon answers that in the 30 operations he has performed, no patient has died.
• Which values of P(dying) are compatible with this information, with 95% confidence?
Let's be critical
Solution
• Upper limit of the 95% CI for p=0 with n=30: Pr(X=0, n=30, ps) = 0.025
• The approximate solution does not work.
• Exact solution, based on the binomial: {0; 0.116}
• Even if mortality is 11.6%, no deaths will be observed in 30 operations with Pr=0.025
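The exact upper limit follows from solving (1 − ps)^n = 0.025 for ps, since with zero deaths the binomial probability Pr(X=0) is simply (1 − ps)^n. A minimal sketch, with a hypothetical function name:

```python
def upper_limit_zero_events(n, tail=0.025):
    """Largest event probability still compatible with 0 events in n trials.

    Solves (1 - p) ** n = tail for p (exact binomial, two-sided 95% CI by default).
    """
    return 1 - tail ** (1 / n)

upper = upper_limit_zero_events(30)
print(f"95% CI for P(dying): 0 to {upper:.3f}")

# Check: with that mortality, 30 operations without a death has probability 0.025.
print((1 - upper) ** 30)
```

The approximate interval based on estimate ± 1.96 × standard error fails here because the estimated standard error is 0 when no events are observed.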
Let's be critical
• If data are available...
• ... they must not be wasted. Well-‘tortured’ data end up singing. p<0.05 !!!
... And what about the denominator? The famous fantastic dog