## STATISTICS 542 Intro to Clinical Trials SURVIVAL ANALYSIS


**Survival Analysis Terminology**
• Concerned with the time to some event
• The event is often death
• The event may also be, for example:
1. Cause-specific death
2. Non-fatal event or death, whichever comes first (death or hospitalization, death or MI, death or tumor recurrence)

**Survival Rates at Yearly Intervals**
[Figure: survival rates at yearly intervals; horizontal axis: years]
• At 5 years, the survival rates are the same
• Survival experience in Group A appears more favorable when the 1-year, 2-year, 3-year and 4-year rates are considered together

**Beta-Blocker Heart Attack Trial**
[Figure: life-table cumulative mortality curve]

**Survival Analysis**
Discuss:
1. Estimation of survival curves
2. Comparison of survival curves

I. Estimation
• Simple case
• All patients are entered at the same time and followed for the same length of time
• The survival curve is estimated at various time points as 1 - (number of deaths)/(number of patients)
• As the intervals become smaller and the number of patients larger, a "smooth" survival curve may be plotted
• Typical clinical trial setting

**Staggered Entry**
[Figure: staggered entry of subjects 1 to 4, each followed for T years; horizontal axis: time since start of trial, from 0 to 2T]
• Each patient has T years of follow-up
• The calendar period in which follow-up takes place may be different for each patient

**Subject**
[Figure: subjects 1 to 4 plotted against time since start of trial (0 to 2T); legend: administrative censoring, failure, censoring due to loss to follow-up]
• Failure time is the time from entry until the time of the event
• Censoring means the vital status of the patient is not known beyond that point

**Subject**
[Figure: the same four subjects plotted against follow-up time (0 to T); legend: administrative censoring, failure, censoring due to loss to follow-up]

**Clinical Trial with Common Termination Date**
[Figure: subjects 1 to 11 with staggered entry; horizontal axis: follow-up time (T years), from 0 to 2T, with the trial termination marked]

**Reduced Sample Estimate (1)**

| Years of follow-up | Patients | Cohort I | Cohort II | Total |
|---|---|---|---|---|
| 1 | Entered | 100 | 100 | 200 |
| 1 | Died | 20 | 25 | 45 |
| 2 | Entered | 80 | 75 | 155 |
| 2 | Died | 20 | | |
| 2 | Survived | 60 | | |

**Reduced Sample Estimate (2)**
• Suppose we estimate the 1-year survival rate:
a. P(1 yr) = 155/200 = 0.775 b.
P(1 yr, cohort I) = 80/100 = 0.80 c. P(1 yr, cohort II) = 75/100 = 0.75

• Now estimate 2-year survival: reduced sample estimate = 60/100 = 0.60
• The estimate is based on cohort I only, so information is lost

**Actuarial Estimate (1)**
• Ref: Berkson & Gage (1950) Proc of Mayo Clinic
• Cutler & Ederer (1958) JCD
• Elveback (1958) JASA
• Kaplan & Meier (1958) JASA
• Note that we can express P(2 yr survival) as

$$P(2\ \text{yrs}) = P(2\ \text{yrs survival} \mid \text{survived 1st yr}) \cdot P(\text{1st yr survival}) = (60/80)(155/200) = (0.75)(0.775) = 0.58$$

• This estimate uses all the available data

**Actuarial Estimate (2)**
[Diagram: follow-up time axis $t_0, t_1, \ldots, t_5$ divided into intervals $I_1, \ldots, I_5$]
• In general, divide the follow-up time into a series of intervals
• Let $p_i$ = probability of surviving $I_i$ given that the patient is alive at the beginning of $I_i$ (i.e. survived through $I_{i-1}$)
• Then the probability of surviving through $t_k$ is $P(t_k) = \prod_{i=1}^{k} p_i$

**Actuarial Estimate (3)**
[Diagram: interval $I_i$ from $t_{i-1}$ to $t_i$]
Define the following:
• $n_i$ = number of subjects alive at the beginning of $I_i$ (i.e. at $t_{i-1}$)
• $d_i$ = number of deaths during interval $I_i$
• $l_i$ = number of losses during interval $I_i$ (either administrative or lost to follow-up)
• We know only that $d_i$ deaths and $l_i$ losses occurred in interval $I_i$

**Estimation of $p_i$**
• a. All deaths precede all losses: $\hat p_i = (n_i - d_i)/n_i$
• b. All losses precede all deaths: $\hat p_i = (n_i - l_i - d_i)/(n_i - l_i)$
• c. Deaths and losses uniform (1/2 deaths before 1/2 losses): $\hat p_i = (n_i - l_i/2 - d_i)/(n_i - l_i/2)$. This is the actuarial (Cutler-Ederer) estimate
• The problem is that P(t) is a function of the interval choice
• For some applications we have no choice, but if we know the exact dates of deaths and losses, the Kaplan-Meier method is preferred

**Actuarial Lifetime Method (1)**
• Used when exact times of death are not known
• Vital status is known at the end of an interval period (e.g. 6 months or 1 year)
• Assume losses are uniform over the interval

**Actuarial Lifetime Method (2)**

Life table:

| Interval | Number at risk ($n_i$) | Number died ($d_i$) | Number lost ($l_i$) | Adjusted no. at risk | Prop. surviving | Prop. surviving up to end of interval |
|---|---|---|---|---|---|---|
| 0-1 | 50 | 9 | 0 | 50 | 41/50 = 0.82 | 0.82 |
| 1-2 | 41 | 6 | 1 | 41 - 1/2 = 40.5 | 34.5/40.5 = 0.852 | 0.852 x 0.82 = 0.699 |
| 2-3 | 34 | 2 | 4 | 34 - 4/2 = 32 | 30/32 = 0.937 | 0.937 x 0.699 = 0.655 |
| 3-4 | 28 | 1 | 5 | 28 - 5/2 = 25.5 | 24.5/25.5 = 0.961 | 0.961 x 0.655 = 0.629 |
| 4-5 | 22 | 2 | 3 | 22 - 3/2 = 20.5 | 18.5/20.5 = 0.902 | 0.902 x 0.629 = 0.567 |

**Actuarial Survival Curve**
[Figure: actuarial survival curve drawn as a step function; vertical axis from 0 to 100, horizontal axis from 1 to 5]

**Kaplan-Meier Estimate (1) (JASA, 1958)**
• Assumptions
• 1. The "exact" time of each event is known (failure = uncensored event; loss = censored event)
• 2. For a "tie", failure always comes before loss
• 3. Divide follow-up time into intervals such that
• a. Each event defines the left side of an interval
• b. No interval has both deaths and losses

**Kaplan-Meier Estimate (2) (JASA, 1958)**
• For the interval beginning at death time $t_i$, $\hat p_i = (n_i - d_i)/n_i$, where $n_i$ = number at risk just prior to the death at $t_i$
• Note that if an interval contains only losses, $p_i = 1.0$
• Because of this, for convenience we may combine intervals containing only losses with the previous interval containing only deaths: X---o--o--o--

**Estimate of S(t) or P(t)**
Suppose that for N patients there are K distinct failure (death) times $t_i$, $i = 1, 2, \ldots, K$. The Kaplan-Meier (K-M), or product-limit, estimate of the survival curve P(t), the probability of surviving beyond time t, becomes

$$\hat P(t) = \prod_{i:\, t_i \le t} \frac{n_i - d_i}{n_i}$$

where $n_i = n_{i-1} - l_{i-1} - d_{i-1}$, $l_{i-1}$ = number of censored events since the death at $t_{i-1}$, and $d_{i-1}$ = number of deaths at $t_{i-1}$.

**Estimate of S(t) or P(t)**
• Variance of $\hat P(t)$ (Greenwood's formula):

$$V[\hat P(t)] = \hat P(t)^2 \sum_{i:\, t_i \le t} \frac{d_i}{n_i (n_i - d_i)}$$

**KM Estimate (1)**
Example (see Table 14-2 in FFD). Suppose we follow 20 patients and observe the event time, either failure (death) or censored (+), as
[0.5, 0.6+), [1.5, 1.5, 2.0+), [3.0, 3.5+, 4.0+), [4.8], [6.2, 8.5+, 9.0+), [10.5, 12.0+ (7 pts)]
There are 6 distinct failure (death) times: 0.5, 1.5, 3.0, 4.8, 6.2, 10.5

**KM Estimate (2)**
1. Failure at $t_1 = 0.5$, interval [0.5, 1.5): $n_1 = 20$, $d_1 = 1$, $l_1 = 1$ (i.e. 0.6+)
• If $t \in [0.5, 1.5)$, $\hat P(t) = \hat p_1 = 0.95$
• $V[\hat P(t_1)] = [0.95]^2 \{ 1/(20 \cdot 19) \} = 0.0024$

**KM Estimate (4)**
Data: [0.5, 0.6+), [1.5, 1.5, 2.0+), 3.0, etc. 2.
Failure at $t_2 = 1.5$, interval [1.5, 3.0): $n_2 = n_1 - d_1 - R_1 = 20 - 1 - 1 = 18$, $d_2 = 2$, $R_2 = 1$ (i.e. 2.0+). Here $R_j$ is the number lost or censored, written $l_j$ on the earlier slides.
• If $t \in [1.5, 3.0)$, then $\hat P(t) = (0.95)(0.89) = 0.84$
• $V[\hat P(t_2)] = [0.84]^2 \{ 1/(20 \cdot 19) + 2/(18 \cdot (18 - 2)) \} = 0.0068$

**Life Table 14.2**
Kaplan-Meier Life Table for 20 Subjects Followed for One Year

| Interval | Interval number $j$ | Time of death | $n_j$ | $d_j$ | $R_j$ | $\hat p_j$ | $\hat P(t)$ | $V[\hat P(t)]$ |
|---|---|---|---|---|---|---|---|---|
| [0.5, 1.5) | 1 | 0.5 | 20 | 1 | 1 | 0.95 | 0.95 | 0.0024 |
| [1.5, 3.0) | 2 | 1.5 | 18 | 2 | 1 | 0.89 | 0.84 | 0.0068 |
| [3.0, 4.8) | 3 | 3.0 | 15 | 1 | 2 | 0.93 | 0.79 | 0.0089 |
| [4.8, 6.2) | 4 | 4.8 | 12 | 1 | 0 | 0.92 | 0.72 | 0.0114 |
| [6.2, 10.5) | 5 | 6.2 | 11 | 1 | 2 | 0.91 | 0.66 | 0.0135 |
| [10.5, ∞) | 6 | 10.5 | 8 | 1 | 7* | 0.88 | 0.58 | 0.0164 |

• $n_j$: number of subjects alive at the beginning of the jth interval
• $d_j$: number of subjects who died during the jth interval
• $R_j$: number of subjects who were lost or censored during the jth interval
• $\hat p_j$: estimate for $p_j$, the probability of surviving the jth interval given that the subject has survived the previous intervals
• $\hat P(t)$: estimated survival curve
• $V[\hat P(t)]$: variance of $\hat P(t)$
• \* Censored due to termination of study

**Survival Curve**
[Figure: Kaplan-Meier estimate; vertical axis: estimated survival curve $\hat P(t)$, from 0.5 to 1.0; horizontal axis: survival time t (months), from 0 to 12]

**Comparison of Two Survival Curves**
• Assume that we now have a treatment group and a control group and we wish to compare their survival experience
• 20 patients in each group (all patients censored at 12 months)
• Control: 0.5, 0.6+, 1.5, 1.5, 2.0+, 3.0, 3.5+, 4.0+, 4.8, 6.2, 8.5+, 9.0+, 10.5, 12+'s
• Trt: 1.0, 1.6+, 2.4+, 4.2+, 4.5, 5.8+, 7.0+, 11.0+, 12+'s

**Kaplan-Meier Estimate for Treatment**
1. $t_1 = 1.0$: $n_1 = 20$, $d_1 = 1$, $l_1 = 3$, $\hat p_1 = (20 - 1)/20 = 0.95$, so $\hat P(t) = 0.95$ 2.
$t_2 = 4.5$: $n_2 = 20 - 1 - 3 = 16$, $d_2 = 1$, $\hat p_2 = (16 - 1)/16 = 0.94$

**Kaplan-Meier Estimate**
[Figure: Kaplan-Meier curves for the TRT and CONTROL groups; vertical axis: estimated survival curve $\hat P(t)$, from 0.5 to 1.0; horizontal axis: survival time t (months), from 0 to 12]

**Comparison of Two Survival Curves**
• Comparison of point estimates
• Suppose at some time $t^*$ we want to compare $P_C(t^*)$ for the control group and $P_T(t^*)$ for the treatment group
• The statistic

$$Z = \frac{\hat P_C(t^*) - \hat P_T(t^*)}{\sqrt{V[\hat P_C(t^*)] + V[\hat P_T(t^*)]}}$$

has approximately a normal distribution under $H_0$

**Comparison of Overall Survival Curve**
• $H_0$: $P_C(t) = P_T(t)$
• A. Mantel-Haenszel Test
• Ref: Mantel & Haenszel (1959) J Natl Cancer Inst
• Mantel (1966) Cancer Chemotherapy Reports
• Mantel and Haenszel (1959) showed that a series of 2 x 2 tables could be combined into a summary statistic (note also: Cochran (1954) Biometrics)
• Mantel (1966) applied this procedure to the comparison of two survival curves
• The basic idea is to form a 2 x 2 table at each distinct death time, determining the number in each group who were at risk and the number who died

**Comparison of Two Survival Curves (1)**
• Suppose we have K distinct times at which a death occurs: $t_i$, $i = 1, 2, \ldots, K$.
For each death time, form the 2 x 2 table:

| | Died at $t_i$ | Alive | At risk (prior to $t_i$) |
|---|---|---|---|
| Treatment | $a_i$ | $b_i$ | $a_i + b_i$ |
| Control | $c_i$ | $d_i$ | $c_i + d_i$ |
| Total | $a_i + c_i$ | $b_i + d_i$ | $N_i$ |

• Consider $a_i$, the observed number of deaths in the TRT group, under $H_0$

**Comparison of Two Survival Curves (2)**

$$E(a_i) = \frac{(a_i + b_i)(a_i + c_i)}{N_i}, \qquad V(a_i) = \frac{(a_i + c_i)(b_i + d_i)(a_i + b_i)(c_i + d_i)}{N_i^2 (N_i - 1)}$$

Mantel-Haenszel statistic:

$$MH = \frac{\left( \sum_{i=1}^{K} a_i - \sum_{i=1}^{K} E(a_i) \right)^2}{\sum_{i=1}^{K} V(a_i)}$$

**Table 14.3 Comparison of Survival Data for a Control Group and an Intervention Group Using the Mantel-Haenszel Procedure**

| Rank $j$ | Event time $t_j$ | Intervention $a_j + b_j$ | Intervention $a_j$ | Intervention $l_j$ | Control $c_j + d_j$ | Control $c_j$ | Control $l_j$ | Total $a_j + c_j$ | Total $b_j + d_j$ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.5 | 20 | 0 | 0 | 20 | 1 | 1 | 1 | 39 |
| 2 | 1.0 | 20 | 1 | 0 | 18 | 0 | 0 | 1 | 37 |
| 3 | 1.5 | 19 | 0 | 2 | 18 | 2 | 1 | 2 | 35 |
| 4 | 3.0 | 17 | 0 | 1 | 15 | 1 | 2 | 1 | 31 |
| 5 | 4.5 | 16 | 1 | 0 | 12 | 0 | 0 | 1 | 27 |
| 6 | 4.8 | 15 | 0 | 1 | 12 | 1 | 0 | 1 | 26 |
| 7 | 6.2 | 14 | 0 | 1 | 11 | 1 | 2 | 1 | 24 |
| 8 | 10.5 | 13 | 0 | 1 | 8 | 1 | | 1 | 20 |

• $a_j + b_j$ = number of subjects at risk in the intervention group prior to the death at time $t_j$
• $c_j + d_j$ = number of subjects at risk in the control group prior to the death at time $t_j$
• $a_j$ = number of subjects in the intervention group who died at time $t_j$
• $c_j$ = number of subjects in the control group who died at time $t_j$
• $l_j$ = number of subjects who were lost or censored between time $t_j$ and time $t_{j+1}$
• $a_j + c_j$ = number of subjects in both groups who died at time $t_j$
• $b_j + d_j$ = number of subjects in both groups who are at risk minus the number who died at time $t_j$

**Mantel-Haenszel Test**
• Operationally
• 1. Rank event times for both groups combined
• 2. For each failure, form the 2 x 2 table
• a. Number at risk ($a_i + b_i$, $c_i + d_i$)
• b. Number of deaths ($a_i$, $c_i$)
• c. Losses ($l_{Ti}$, $l_{Ci}$)
• Example (see Table 14-3 in FFD), using the previous data set
• Trt: 1.0, 1.6+, 2.4+, 4.2+, 4.5, 5.8+, 7.0+, 11.0+, 12.0+'s
• Control: 0.5, 0.6+, 1.5, 1.5, 2.0+, 3.0, 3.5+, 4.0+, 4.8, 6.2, 8.5+, 9.0+, 10.5, 12.0+'s

**1. Ranked Failure Times - Both groups combined**

| Time | 0.5 | 1.0 | 1.5 | 3.0 | 4.5 | 4.8 | 6.2 | 10.5 |
|---|---|---|---|---|---|---|---|---|
| Group | C | T | C | C | T | C | C | C |

There are 8 distinct times for death (K = 8). 2.
At $t_1 = 0.5$ (k = 1), interval [0.5, 1.0), with 1 loss at 0.6+:
• T: $a_1 + b_1 = 20$, $a_1 = 0$, $l_{T1} = 0$
• C: $c_1 + d_1 = 20$, $c_1 = 1$, $l_{C1} = 1$

| | D | A | R |
|---|---|---|---|
| T | 0 | 20 | 20 |
| C | 1 | 19 | 20 |
| Total | 1 | 39 | 40 |

$E(a_1) = \dfrac{1 \cdot 20}{40} = 0.5$, $\quad V(a_1) = \dfrac{1 \cdot 39 \cdot 20 \cdot 20}{40^2 \cdot 39} = 0.25$

3. At $t_2 = 1.0$ (k = 2), interval [1.0, 1.5):
• T: $a_2 + b_2 = (a_1 + b_1) - a_1 - l_{T1} = 20 - 0 - 0 = 20$, $a_2 = 1$, $l_{T2} = 0$
• C: $c_2 + d_2 = (c_1 + d_1) - c_1 - l_{C1} = 20 - 1 - 1 = 18$, $c_2 = 0$, $l_{C2} = 0$

| | D | A | R |
|---|---|---|---|
| T | 1 | 19 | 20 |
| C | 0 | 18 | 18 |
| Total | 1 | 37 | 38 |

$E(a_2) = \dfrac{1 \cdot 20}{38}$, $\quad V(a_2) = \dfrac{1 \cdot 37 \cdot 20 \cdot 18}{38^2 \cdot 37}$

**Eight 2x2 Tables Corresponding to the Event Times Used in the Mantel-Haenszel Statistic in Survival Comparison of Treatment (T) and Control (C) Groups**

| Table | Time* (mo.) | T: D† | T: A‡ | T: R§ | C: D | C: A | C: R | Total D | Total A | Total R |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.5 | 0 | 20 | 20 | 1 | 19 | 20 | 1 | 39 | 40 |
| 2 | 1.0 | 1 | 19 | 20 | 0 | 18 | 18 | 1 | 37 | 38 |
| 3 | 1.5 | 0 | 19 | 19 | 2 | 16 | 18 | 2 | 35 | 37 |
| 4 | 3.0 | 0 | 17 | 17 | 1 | 14 | 15 | 1 | 31 | 32 |
| 5 | 4.5 | 1 | 15 | 16 | 0 | 12 | 12 | 1 | 27 | 28 |
| 6 | 4.8 | 0 | 15 | 15 | 1 | 11 | 12 | 1 | 26 | 27 |
| 7 | 6.2 | 0 | 14 | 14 | 1 | 10 | 11 | 1 | 24 | 25 |
| 8 | 10.5 | 0 | 13 | 13 | 1 | 7 | 8 | 1 | 20 | 21 |

• \* Time $t_j$ of a death in either group
• † D: number of subjects who died at time $t_j$
• ‡ A: number of subjects who are alive between time $t_j$ and time $t_{j+1}$
• § R: number of subjects who were at risk before the death at time $t_j$ (R = D + A)

**Compute MH Statistics**
Recall the 2 x 2 tables for k = 1, 2, 3 ($t_1 = 0.5$, $t_2 = 1.0$, $t_3 = 1.5$) above.
• a. $\sum a_i = 2$ (only two treatment deaths)
• b. $\sum E(a_i) = 20(1)/40 + 20(1)/38 + 19(2)/37 + \ldots = 4.89$
• c. $\sum V(a_i) = 2.22$
• d. $MH = (2 - 4.89)^2 / 2.22 = 3.76$, or $Z_{MH} = (2 - 4.89)/\sqrt{2.22} = -1.94$

**B. Gehan Test (Wilcoxon)**
Ref: Gehan, Biometrika (1965); Mantel, Biometrics (1967)
Gehan (1965) first proposed a modified Wilcoxon rank statistic for survival data with censoring. Mantel (1967) showed a simpler computational version of Gehan's proposed test.
1. Combine all observations $X_T$'s and $X_C$'s into a single sample $Y_1, Y_2, \ldots, Y_{N_C + N_T}$ 2.
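Define $U_{ij}$, for $i = 1, \ldots, N_C + N_T$ and $j = 1, \ldots, N_C + N_T$, as

$$U_{ij} = \begin{cases} -1 & Y_i < Y_j \text{ and death at } Y_i \\ +1 & Y_i > Y_j \text{ and death at } Y_j \\ 0 & \text{elsewhere} \end{cases}$$

3\. Define $U_i = \sum_j U_{ij}$, $i = 1, \ldots, N_C + N_T$

**Gehan Test**
• Note: $U_i$ = {number of observed times definitely less than $Y_i$} - {number of observed times definitely greater}
• 4. Define $W = \sum U_i$ (controls only)
• 5. Variance due to Mantel: $V[W] = \dfrac{N_C N_T \sum_{i=1}^{N_C + N_T} U_i^2}{(N_C + N_T)(N_C + N_T - 1)}$
• 6. $G = W^2 / V[W]$
• Example (Table 14-5 FFD): using the previous data set, rank all observations

**The Gehan Statistic**
The Gehan statistic G involves the scores $U_i$ and is defined as $G = W^2 / V(W)$, where $W = \sum U_i$ ($U_i$'s in the control group only) and $V(W)$ is the variance given above.

**Example of Gehan Statistics Scores $U_i$ for Intervention (I) and Control (C) Groups**

| Observation $i$ | Ranked observed time | Group | Definitely less | Definitely more | $U_i$ |
|---|---|---|---|---|---|
| 1 | 0.5 | C | 0 | 39 | -39 |
| 2 | (0.6)* | C | 1 | 0 | 1 |
| 3 | 1.0 | I | 1 | 37 | -36 |
| 4 | 1.5 | C | 2 | 35 | -33 |
| 5 | 1.5 | C | 2 | 35 | -33 |
| 6 | (1.6) | I | 4 | 0 | 4 |
| 7 | (2.0) | C | 4 | 0 | 4 |
| 8 | (2.4) | I | 4 | 0 | 4 |
| 9 | 3.0 | C | 4 | 31 | -27 |
| 10 | (3.5) | C | 5 | 0 | 5 |
| 11 | (4.0) | C | 5 | 0 | 5 |
| 12 | (4.2) | I | 5 | 0 | 5 |
| 13 | 4.5 | I | 5 | 27 | -22 |
| 14 | 4.8 | C | 6 | 26 | -20 |
| 15 | (5.8) | I | 7 | 0 | 7 |
| 16 | 6.2 | C | 7 | 24 | -17 |
| 17 | (7.0) | I | 8 | 0 | 8 |
| 18 | (8.5) | C | 8 | 0 | 8 |
| 19 | (9.0) | C | 8 | 0 | 8 |
| 20 | 10.5 | C | 8 | 20 | -12 |
| 21 | (11.0) | I | 9 | 0 | 9 |
| 22-40 | (12.0) | 12 I, 7 C | 9 | 0 | 9 |

\* Times in parentheses are censored observations

**Gehan Test**
• Thus $W = (-39) + (1) + (-33) + (-33) + (4) + \ldots = -87$ (control scores only)
• and $V[W] = \dfrac{(20)(20) \{ (-39)^2 + 1^2 + (-36)^2 + \ldots \}}{(40)(39)} = 2314.35$
• so $G = (-87)^2 / 2314.35 = 3.27$
• Note that the MH and Gehan statistics are not equal

**Cox Proportional Hazards Model**
Ref: Cox (1972) Journal of the Royal Statistical Society, Series B
• Recall the simple exponential model: $S(t) = e^{-\lambda t}$
• More complicated: $S(t) = \exp\left( -\int_0^t \lambda(s)\, ds \right)$. If $\lambda(s) = \lambda$, we get the simple model
• Adjust for covariates, x
• Cox Proportional Hazards Model: $\lambda(t, x) = \lambda_0(t)\, e^{\beta x}$

**Cox Proportional Hazards Model**
• So

$$S(t, x) = \exp\left( -\int_0^t \lambda_0(s)\, e^{\beta x}\, ds \right) = \exp\left( -e^{\beta x} \int_0^t \lambda_0(s)\, ds \right) = \left[ S_0(t) \right]^{e^{\beta x}}$$

• Estimate the regression coefficients (non-linear estimation): $\hat\beta$, $SE(\hat\beta)$
• Example
• $x_1$ = 1 for Trt, 2 for Control
• $x_2$ = covariate 1
• $\beta_1$: indicator of treatment effect, adjusted for $x_2, x_3, \ldots$
• If there are no covariates except for treatment group ($x_1$), PHM = logrank

**Homework Problem**
1. Kaplan-Meier 2. Gehan-Wilcoxon 3.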
Mantel-Haenszel

[Table: homework data set; only the footnotes survive]
• a: D = drug; P = placebo
• b: In weeks
• c: A = alive; D = dead
• Source: P.B. Gregory (1974)

**Survival Analysis Summary**
• Time-to-event methodology is very useful in multiple settings
• Can estimate time-to-event probabilities or survival curves
• Methods can compare survival curves
• Can stratify for subgroups
• Can adjust for baseline covariates using a regression model
• Need to plan for this in sample size estimation and overall design
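
**Worked sketch: actuarial life table**

The actuarial life table in "Actuarial Lifetime Method (2)" can be recomputed with a few lines of Python. This is a minimal sketch, not part of the original slides: the names `life_table` and `actuarial` are illustrative, and the code assumes losses are uniform over each interval, so half of the losses are subtracted from the number at risk. Because it does not round intermediate values, the last cumulative proportion comes out near 0.568 rather than the 0.567 obtained from the rounded products in the table.

```python
# (interval, n_i at risk, d_i died, l_i lost), copied from the life table.
life_table = [("0-1", 50, 9, 0), ("1-2", 41, 6, 1), ("2-3", 34, 2, 4),
              ("3-4", 28, 1, 5), ("4-5", 22, 2, 3)]


def actuarial(rows):
    """Return (interval, adjusted n, p_i, cumulative survival) per interval."""
    surv, out = 1.0, []
    for label, n, d, lost in rows:
        n_adj = n - lost / 2          # losses assumed uniform over the interval
        p = (n_adj - d) / n_adj       # proportion surviving the interval
        surv *= p                     # cumulative product of the p_i
        out.append((label, n_adj, p, surv))
    return out


actuarial_rows = actuarial(life_table)
for label, n_adj, p, surv in actuarial_rows:
    print(f"{label}: adjusted n = {n_adj:5.1f}  p = {p:.3f}  P = {surv:.3f}")
```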
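
**Worked sketch: Kaplan-Meier estimate with Greenwood variance**

The Kaplan-Meier life table for the 20 control subjects (Life Table 14.2) can be reproduced as follows. This is a sketch under stated assumptions: the function name `kaplan_meier` and the `(time, is_death)` encoding are illustrative, ties are handled with failures before losses as the slides assume, and the code does not guard against an interval in which everyone at risk dies. Values are computed without intermediate rounding, so the last rows differ slightly from the table (for example about 0.575 instead of 0.58 at t = 10.5).

```python
# Control-group data from the KM example: (time in months, is_death).
control = ([(0.5, True), (0.6, False), (1.5, True), (1.5, True), (2.0, False),
            (3.0, True), (3.5, False), (4.0, False), (4.8, True), (6.2, True),
            (8.5, False), (9.0, False), (10.5, True)] + [(12.0, False)] * 7)


def kaplan_meier(data):
    """Return rows (t_i, n_i, d_i, p_i, P(t_i), Greenwood variance)."""
    death_times = sorted({t for t, dead in data if dead})
    surv, gw_sum, rows = 1.0, 0.0, []
    for t in death_times:
        n = sum(1 for u, _ in data if u >= t)              # at risk just before t
        d = sum(1 for u, dead in data if dead and u == t)  # deaths at t
        p = (n - d) / n
        surv *= p                                          # product-limit estimate
        gw_sum += d / (n * (n - d))                        # Greenwood sum
        rows.append((t, n, d, p, surv, surv ** 2 * gw_sum))
    return rows


km_rows = kaplan_meier(control)
for t, n, d, p, surv, var in km_rows:
    print(f"t={t:5.1f}  n={n:2d}  d={d}  p={p:.3f}  P(t)={surv:.3f}  V={var:.4f}")
```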
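
**Worked sketch: Mantel-Haenszel statistic**

The Mantel-Haenszel computation for the treatment and control data can be sketched by forming the 2 x 2 table at each distinct death time and accumulating the observed deaths, expected deaths and variances. The function name `mantel_haenszel` and the data encoding are assumptions made for illustration. With unrounded arithmetic the sums are roughly 4.89 for the expected deaths and 2.21 for the variance, giving a statistic near 3.78, close to the rounded 2.22 and 3.76 quoted on the slide.

```python
# (time in months, is_death) for each group, from the two-group example.
trt = ([(1.0, True), (1.6, False), (2.4, False), (4.2, False), (4.5, True),
        (5.8, False), (7.0, False), (11.0, False)] + [(12.0, False)] * 12)
control = ([(0.5, True), (0.6, False), (1.5, True), (1.5, True), (2.0, False),
            (3.0, True), (3.5, False), (4.0, False), (4.8, True), (6.2, True),
            (8.5, False), (9.0, False), (10.5, True)] + [(12.0, False)] * 7)


def mantel_haenszel(trt_data, ctl_data):
    """Return (sum a_i, sum E(a_i), sum V(a_i), MH) over all death times."""
    death_times = sorted({t for t, dead in trt_data + ctl_data if dead})
    obs = exp = var = 0.0
    for t in death_times:
        r_t = sum(1 for u, _ in trt_data if u >= t)             # a_i + b_i
        r_c = sum(1 for u, _ in ctl_data if u >= t)             # c_i + d_i
        a = sum(1 for u, dead in trt_data if dead and u == t)   # a_i
        c = sum(1 for u, dead in ctl_data if dead and u == t)   # c_i
        n, deaths = r_t + r_c, a + c
        obs += a
        exp += r_t * deaths / n
        if n > 1:  # variance term is undefined for a single subject at risk
            var += deaths * (n - deaths) * r_t * r_c / (n ** 2 * (n - 1))
    return obs, exp, var, (obs - exp) ** 2 / var


mh_obs, mh_exp, mh_var, mh_stat = mantel_haenszel(trt, control)
print(f"sum a = {mh_obs:.0f}, sum E = {mh_exp:.2f}, "
      f"sum V = {mh_var:.2f}, MH = {mh_stat:.2f}")
```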
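
**Worked sketch: Gehan scores and statistic**

The Gehan scores $U_i$ in the example table can be computed directly from their definition: for each observation, count the times that are definitely less and definitely greater. This is a sketch only; the name `gehan` and the group labels are illustrative, and ties between a death and a censored time (which do not occur in this data set) are treated as neither less nor greater, which is an assumption rather than something the slides specify.

```python
# (time in months, is_death) for each group, from the two-group example.
trt = ([(1.0, True), (1.6, False), (2.4, False), (4.2, False), (4.5, True),
        (5.8, False), (7.0, False), (11.0, False)] + [(12.0, False)] * 12)
control = ([(0.5, True), (0.6, False), (1.5, True), (1.5, True), (2.0, False),
            (3.0, True), (3.5, False), (4.0, False), (4.8, True), (6.2, True),
            (8.5, False), (9.0, False), (10.5, True)] + [(12.0, False)] * 7)


def gehan(trt_data, ctl_data):
    """Return (W, V[W], G) using Mantel's variance for the Gehan test."""
    pooled = ([(t, dead, "I") for t, dead in trt_data]
              + [(t, dead, "C") for t, dead in ctl_data])
    scores = []
    for t_i, dead_i, grp in pooled:
        # Definitely less: a death that occurred strictly before t_i.
        less = sum(1 for t_j, dead_j, _ in pooled if dead_j and t_j < t_i)
        # Definitely more: any time after t_i, but only if i itself is a death.
        more = sum(1 for t_j, _, _ in pooled if dead_i and t_j > t_i)
        scores.append((grp, less - more))
    n_t, n_c = len(trt_data), len(ctl_data)
    total = n_t + n_c
    w = sum(u for grp, u in scores if grp == "C")        # control scores only
    var = n_c * n_t * sum(u * u for _, u in scores) / (total * (total - 1))
    return w, var, w * w / var


gehan_w, gehan_var, gehan_g = gehan(trt, control)
print(f"W = {gehan_w}, V[W] = {gehan_var:.2f}, G = {gehan_g:.2f}")
```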