Statistics for Clinical Trials in Neurotherapeutics

# Statistics for Clinical Trials in Neurotherapeutics


##### Presentation Transcript

1. Statistics for Clinical Trials in Neurotherapeutics Barbara C. Tilley, Ph.D. Medical University of South Carolina

2. Funding: NIA Resource Center on Minority Aging 5 P30 AG21677 NINDS Parkinson’s Disease Statistical Center U01NS043127 and U01NS43128

3. Sample Size

4. Issues in Neurotherapeutics • What is the outcome? • How will this be measured? • One or many measures of outcome? • How will you analyze the data? • (Nquery \$700, STPLAN free, etc.)

5. Sample Size: Putting it all together • Continuous (Normal) Distribution • Need all but one: $\alpha$, $\beta$, $\sigma^2$, $\delta$, $N$ • $Z_\alpha = 1.96$ (2-sided, 0.05); $Z_\beta = 1.645$ (always one-sided, 0.05, 95% power) • $\delta$ = difference between means • $\sigma^2$ = pooled variance • $2n = \dfrac{4 (Z_\alpha + Z_\beta)^2 \sigma^2}{\delta^2}$ (n per group)
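The formula on this slide can be turned into a few lines of code. The sketch below is a minimal illustration using the normal approximation only; the function name `n_per_group` and the default of 80% power are assumptions made here, not something the slides specify.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two means (normal approximation).

    delta: difference between means to detect
    sigma: pooled standard deviation
    alpha: two-sided type I error
    power: 1 - beta (the 0.80 default is an assumption of this sketch)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # always one-sided
    total = 4 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2  # this is 2n
    return math.ceil(total / 2)


print(n_per_group(delta=1, sigma=2))  # 63
print(n_per_group(delta=1, sigma=1))  # 16
```

With a difference of 1 and 80% power, the approximation gives 63 and 16 per group for standard deviations of 2 and 1. The values of 64 and 17 quoted on slide 10 are slightly larger, which would be consistent with a t-distribution correction, although the slides do not say how they were computed.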

6. Adjusting for Drop-outs/Drop-ins • With 10% dropout, increasing sample size by 10% is not enough • Use: multiply the sample size by $1/(1-R)^2$, where $R$ is the drop-out rate (Friedman, Furberg, DeMets)
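A short worked example shows why a 10% increase falls short. This is only a sketch of the $1/(1-R)^2$ inflation factor quoted above; the function name `inflate_for_dropout` and the starting sample size of 100 are illustrative assumptions.

```python
import math


def inflate_for_dropout(n, r):
    """Inflate a sample size n for a drop-out rate r using 1 / (1 - r)^2."""
    if not 0 <= r < 1:
        raise ValueError("drop-out rate must be in [0, 1)")
    return math.ceil(n / (1 - r) ** 2)


print(inflate_for_dropout(100, 0.10))  # 124, versus 110 from a naive 10% increase
```

For a 10% drop-out rate the factor is 1/0.81, roughly a 23% increase rather than 10%.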

7. Sample Size for Multiple Primary Outcomes • Choose largest sample size for any single outcome. • If multiple aims, use largest sample size for any aim.

8. Sample Size: Food for Thought • Is detectable difference biologically/clinically meaningful? • Is sample size too small to be believable? WHERE DID YOU GET the estimate???? • Report power (for design), not conditional power for negative study.

9. Sample Size: Keeping It Small • Study continuous outcome (if variability does not increase) • UPDRS score rather than “above or below cut-point” • Study surrogate outcome where effect is large • Rankin at 3 months rather than stroke mortality • Reduce variability (ANCOVA, training, equipment, choosing model)

10. Sample Size: Keeping It Small • Difference between two means = 1 • Standard deviation = 2; N = 64/group • Standard deviation = 1; N = 17/group

11. Analysis • Parametric? • Normal • Binomial • Nonparametric? • Ranked

12. Sample Size Sample size to detect effect of size observed in NINDS t-PA Stroke Trial Barthel: • Non-parametric N = 507 • Binary N = 335 Rankin: • Non-parametric N = 394 • Binary N = 286

13. Multiple Comparisons • Different questions, can argue no adjustment (O’Brien, 1983) • Effect on blood pressure • Effect on quality of life • All pair-wise comparisons or multiple measures of same outcome, adjust • Pairwise comparisons of Drugs A, B, C (same outcome)

14. Multiple Comparisons • Bonferroni (or less conservative Simes, or Hochberg) • $\alpha$/#tests = 0.05/5 = 0.01 • Sample size, use adjusted $\alpha$ • ANOVA methods – Tukey’s, etc. • Sample size for ANOVA
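To make the difference between Bonferroni and the less conservative Hochberg procedure concrete, here is a minimal sketch. The function names and the five p-values are hypothetical and chosen only for illustration; they do not come from any trial in this presentation.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject each hypothesis whose p-value is at most alpha / (number of tests)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def hochberg(p_values, alpha=0.05):
    """Hochberg step-up procedure: compare the k-th largest p-value with alpha / k."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)  # largest first
    reject = [False] * m
    for k, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha / k:
            # Reject this hypothesis and every hypothesis with a smaller p-value.
            for j in order[k - 1:]:
                reject[j] = True
            break
    return reject


p = [0.011, 0.020, 0.030, 0.040, 0.045]  # hypothetical p-values for 5 outcomes
print(bonferroni(p))  # [False, False, False, False, False]
print(hochberg(p))    # [True, True, True, True, True]
```

In this made-up case every outcome is modestly significant but none reaches 0.05/5 = 0.01, so Bonferroni rejects nothing while Hochberg rejects all five.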

15. Bonferroni for Different Primary Outcomes, Same Construct • All outcomes measure same construct • Stroke recovery • PD progression • May lack power when most measures of efficacy are improved, but no single measure is overwhelmingly so. • Problem exacerbated when outcomes are highly correlated.

16. Use Global Tests When: • No one outcome sufficient or desirable • Outcome is difficult to measure and combination of correlated outcomes useful

17. Properties of Global Test • If all outcome measures perfectly correlated, • test statistic, p-value same as for single (univariate) test • power = power of univariate test • Assumes common dose effect • Power increases as correlation among outcomes decreases

18. O’Brien’s Non-parametric Procedure (Biomet., 1984) • Separately rank each outcome in the two treatment groups combined. • Sum ranks for each subject. • Compare mean ranks in the two treatment groups using • Wilcoxon or t-test • ANOVA if more than two treatments
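The three steps of the rank-sum procedure can be sketched as follows. This is an illustrative implementation, not code from the presentation: the function names and the tiny data set are hypothetical, every outcome is assumed to be oriented so that higher is better, and only the pooled two-sample t statistic is returned (a p-value would come from a t distribution with the returned degrees of freedom).

```python
import math
from statistics import mean, variance


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def obrien_rank_sum(group_a, group_b):
    """Each group is a list of subjects; each subject is a list of outcome values.

    Returns (t statistic, degrees of freedom) comparing per-subject rank sums.
    """
    pooled = group_a + group_b
    scores = [0.0] * len(pooled)
    for k in range(len(pooled[0])):
        # Step 1: rank outcome k across both groups combined.
        ranks = average_ranks([subject[k] for subject in pooled])
        # Step 2: add the rank to each subject's running sum.
        for s, r in enumerate(ranks):
            scores[s] += r
    # Step 3: compare mean rank sums between groups with a pooled t statistic.
    a, b = scores[:len(group_a)], scores[len(group_a):]
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


treated = [[3, 30], [4, 40], [5, 35]]  # hypothetical: two outcomes per subject
control = [[1, 10], [2, 20], [3, 15]]
t_stat, df = obrien_rank_sum(treated, control)
print(round(t_stat, 2), df)
```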

19. Sample Size for Global Test • Use largest sample size for single outcome

20. NINDS t-PA Stroke Trial Binary Outcomes (Part II)

21. NINDS t-PA Trial Observed Agreement & Correlations for Binary Outcomes

22. Randomization

23. Randomization • Stratification • Age, prior stroke, years with PD, site • Greatest gain if N < 20 • Too many strata, difficult to balance • 3 age x 2 years with PD x gender = 12 • Blocking – balance number in each treatment group • Important if number expected per site is small • Minimization • Can be complicated to implement, cause delays
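As an illustration of blocking, the sketch below generates a permuted-block allocation list. The function name, arm labels, block size of 4, and seed are all assumptions made for this example. For a stratified design, one such list would be generated separately for each stratum (for example, each site).

```python
import random


def permuted_blocks(n_subjects, arms=("A", "B"), block_size=4, seed=None):
    """Return a treatment schedule built from randomly permuted balanced blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    schedule = []
    while len(schedule) < n_subjects:
        block = list(arms) * (block_size // len(arms))  # equal numbers per arm
        rng.shuffle(block)
        schedule.extend(block)
    return schedule[:n_subjects]


schedule = permuted_blocks(12, seed=2024)
print(schedule)
```

After every complete block the arms are exactly balanced, which is why blocking matters most when the expected number of patients per site is small.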

24. Interim Analyses • Who? • Why? • When? • How?

25. Stopping “Guidelines” [Figure: group-sequential stopping boundaries for O’Brien-Fleming, Pocock, and Peto; y-axis: Standard Normal Statistic (Zi), from -5.0 to 5.0; x-axis: # Looks, 1 to 5; regions: Reject Ho (upper and lower), Continue, Fail to Reject Ho]
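The shapes of the three boundaries can be reproduced numerically. The sketch below assumes five equally spaced looks and a two-sided overall alpha of 0.05. The constants 2.413 (Pocock) and 2.040 (O'Brien-Fleming) are standard tabulated values for that design and are not taken from the slide; the Peto rule is shown in its common Haybittle-Peto form (|Z| > 3 at interim looks, approximately 1.96 at the final look), which is also an assumption here.

```python
import math

K = 5              # number of equally spaced looks (assumed)
C_POCOCK = 2.413   # tabulated constant, two-sided alpha = 0.05, K = 5
C_OBF = 2.040      # tabulated constant, two-sided alpha = 0.05, K = 5


def pocock(k):
    """Same critical value at every look."""
    return C_POCOCK


def obrien_fleming(k):
    """Very strict early, close to the fixed-sample value at the final look."""
    return C_OBF * math.sqrt(K / k)


def haybittle_peto(k):
    """|Z| > 3 at interim looks; roughly the conventional 1.96 at the end."""
    return 3.0 if k < K else 1.96


print("look  OBF    Pocock  Peto")
for k in range(1, K + 1):
    print(f"{k:>4}  {obrien_fleming(k):.3f}  {pocock(k):.3f}   {haybittle_peto(k):.2f}")
```

A trial would stop for efficacy at look k if |Z_k| exceeds the tabulated value. The O'Brien-Fleming boundary starts near 4.56 and falls to 2.04, matching the funnel shape in the figure.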

26. Intent-to-Treat (ITT) Intent-to-treat means analyzing ALL patients as randomized. • Patients lost to follow-up (LTF) • Patients who do not adhere to treatment • Patients who were randomized and did not receive treatment • Patients incorrectly randomized

27. Imputation • Definition: substituting a value for the missing outcome of those lost to follow-up or not adhering. • Imputation may or may not be ITT.

28. Optimal Approach MAKE IMPUTATION UNNECESSARY!

29. Optimal Approach Continued • Make follow-up a high priority • Monitor follow-up closely • Build in patient incentives • “gifts” for patients (t-shirts, mugs, etc.) • free parking, meal ticket • Transportation • Follow even those off treatment

30. Hypertension Detection and Follow-up Program/MRFIT • Outcome was mortality • HDFP 21/10,940 • MRFIT 30/12,866 • Used Death Index, Social Security, detectives

31. NINDS t-PA Stroke Trial • Four 3-month outcomes • Barthel, NIHSS, GOS, Rankin • NINDS Project Officer pushed for complete ascertainment • Study staff made house calls, searched medical records • 5/612 (<1%) lost to follow-up on at least one of the four outcome measures • Used worst value possible

32. NET-PD Futility Studies: LTF for 1-year outcome (Used worst outcome in assigned group) • FS-1 3/200 • Creatine 2 • Minocycline 0 • Placebo 1 • FS-2 4/213 • GPI 3 • CoQ10 1 • Placebo 0

33. Handling Missing Values • Why? • How?

34. When Data Are Missing: Common Approaches

35. Subgroup Analyses (Sub-set) • Pre-specified based on rationale • NINDS t-PA Stroke Trial • Those randomized 0-90 minutes and 91-180 minutes from stroke onset • Post-hoc in the presence of interaction • (Yusuf, 1991)

36. Subgroup Analyses • The more subgroups examined, the more likely analyses will lead to finding a difference by chance alone. • 10 mutually exclusive subgroups; • 20% chance that in one group the treatment will be better than control and that the converse will be true in another
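A related calculation shows how quickly chance findings accumulate. The sketch below computes the probability of at least one nominally significant result among k independent subgroup tests when there is no true effect. This is a different quantity from the 20% figure quoted on the slide, and the independence assumption and the 0.05 level are assumptions of this example.

```python
def prob_any_false_positive(k, alpha=0.05):
    """P(at least one of k independent tests is significant under the null)."""
    return 1 - (1 - alpha) ** k


for k in (1, 5, 10, 20):
    print(k, round(prob_any_false_positive(k), 3))
```

With 10 mutually exclusive subgroups each tested at 0.05, the chance of at least one spurious "significant" subgroup is about 40%.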

37. Example of Interaction (Effect Modification)

38. Example of Interaction (Effect Modification)

39. Lack of Interaction

40. Trial of Org10172 for Stroke (TOAST) Trial N = 379(M) 238 (F) N=372(M) 239 (F) Test for interaction p = 0.251

41. Pooled Analysis: Carotid Endarterectomy, Rothwell, 2004 • NASCET & ECST • N (men) 4175 • N (women) 1718 • Test for interaction p = 0.007 (Cox model)

42. Pooled Analysis: ECASS, Atlantis, NINDS, Kent 2005 • N (men) 4175 • N (women) 1718 • Test for interaction p = 0.04 (logistic model)

43. References • Rubin DB. More powerful randomization-based p-values in double-blind trials with non-compliance. Statistics in Medicine (1998) 17:317-385. • Little R, Yau L. Intent-to-treat analysis for longitudinal studies with drop-outs. Biometrics (1996) 52:1324-1333. • NINDS t-PA Stroke Trial Study Group. Tissue plasminogen activator for acute stroke. N Engl J Med (1995) 333:1581-1587. • Curb JD, et al. Ascertainment of vital status through the National Death Index and Social Security Administration. Am J Epidemiol (1985) 121:754-766. • Multiple Risk Factor Intervention Trial Research Group. Multiple risk factor intervention trial: risk factor changes and mortality results. JAMA (1982) 248:1466-1477.

44. EXTRA slides not presented

45. Completers • Retain only those patients who remain on treatment • Was used frequently in the past in trials in rheumatoid arthritis • Not intent-to-treat • Obvious potential for bias • Patients not responding to treatment drop out

46. Last Observation Carried Forward • For those missing a final value, use most recent previous observation. • Potential for bias in disease with downward course
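A minimal sketch of last observation carried forward follows. The function name and the visit values are hypothetical; `None` marks a missed visit.

```python
def locf(visits):
    """Fill each missing visit (None) with the most recent earlier observation.

    Values missing before any observation stay None.
    """
    filled, last = [], None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical scores at four visits; the patient drops out after visit 2.
print(locf([30, 33, None, None]))  # [30, 33, 33, 33]
```

The example shows where the bias comes from: if the disease keeps progressing after drop-out, the carried-forward value of 33 understates the patient's true final status.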

47. Worst case • Replace missing values with worst outcome • assumes that those who are lost to follow-up were not successfully treated • generally variance is not inflated • could inflate or deflate differences

48. Best Case/Worst Case • Replace missing values in treatment group with worst outcome and missing values in comparison group with best outcome. • Rarely used • Generally overly conservative, as patients in both the treatment and placebo groups drop out for lack of efficacy.
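The worst-case and best-case/worst-case rules can be compared side by side for a binary outcome. The data below are entirely hypothetical (1 = favorable outcome, 0 = unfavorable, `None` = lost to follow-up), and the helper names are assumptions of this sketch.

```python
def impute(outcomes, fill):
    """Replace every missing outcome (None) with the value `fill`."""
    return [fill if y is None else y for y in outcomes]


def success_rate(outcomes):
    """Proportion of favorable (1) outcomes."""
    return sum(outcomes) / len(outcomes)


treated = [1, 1, 0, 1, None, 1, 0, None]  # hypothetical data
control = [0, 1, 0, None, 0, 1, 0, 0]

# Worst case: missing values become failures in both groups.
worst_diff = success_rate(impute(treated, 0)) - success_rate(impute(control, 0))

# Best case/worst case: failures for treated, successes for control.
bw_diff = success_rate(impute(treated, 0)) - success_rate(impute(control, 1))

print(worst_diff)  # 0.25
print(bw_diff)     # 0.125
```

In this made-up example the best-case/worst-case rule halves the estimated treatment difference, illustrating why it is generally considered overly conservative.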