
Dr. N. Birkett, Department of Epidemiology & Community Medicine, University of Ottawa

EPI 5344: Survival Analysis in Epidemiology Testing the Proportional Hazard Assumption April 1, 2014. Dr. N. Birkett, Department of Epidemiology & Community Medicine, University of Ottawa. Objectives. Background Residuals in Cox regression The ‘STRATA’ statement in PHreg.

Presentation Transcript


  1. EPI 5344: Survival Analysis in Epidemiology. Testing the Proportional Hazard Assumption. April 1, 2014. Dr. N. Birkett, Department of Epidemiology & Community Medicine, University of Ottawa

  2. Objectives • Background • Residuals in Cox regression • The ‘STRATA’ statement in PHreg. • Graphical approaches to PH testing • Model-based approaches to PH testing • The ‘ASSESS’ option in SAS.

  3. Residuals for Cox (1) • Residuals in linear regression measure how far each observed data point lies from the model's fitted value: $e_i = y_i - \hat{y}_i$ • That doesn’t work for Cox because we have no ‘y’ • Two alternative types of residuals are used • Individual • Covariate-wise

  4. Residuals for Cox (2) • Individual Residuals • One residual for each time when an event happens • Three variants • Cox-Snell • Deviance • Martingale • Difference between the observed and expected number of events at each event time • These are not widely used • Will return to them when we discuss ASSESS statement

  5. Residuals for Cox (3) • Schoenfeld Residuals • One residual for each • Covariate at each • Time when at least one event happens • For each subject who has an event at that time point. • Based on the expected value for each of the covariates at every point when an event happens • Computed at each time ‘t’ when an event happens

  6. Residuals for Cox (4) • Schoenfeld Residuals • At each event point, every subject in the risk set has a probability that they would have had an event • Use these probabilities to determine the expected value of each of the covariates at that point in time • The residual for this covariate for each subject having an event at this time point is: • The difference between this expected value and the covariate for the subject who had the event.

  7. Residuals for Cox (5) • Schoenfeld Residuals (cont) • For now, assume only one event at each time point • At event time ‘t’, let subject j* be the one who had the event • Consider any subject ‘j’ who is still in the risk set at ‘t’. From earlier classes, we have:
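
The formula this slide points to did not survive in the transcript. A standard form, written under the assumption of a Cox model with coefficient vector $\beta$, covariate vector $x_j$ for subject j, and risk set $R(t)$ (notation chosen here for illustration, not taken from the slides), is:

```latex
\[
p_j(t) \;=\; \Pr\bigl(\text{subject } j \text{ has the event at } t \,\bigm|\, \text{one event occurs at } t\bigr)
\;=\; \frac{\exp(\beta^\top x_j)}{\sum_{k \in R(t)} \exp(\beta^\top x_k)},
\qquad j \in R(t).
\]
```

The baseline hazard cancels from numerator and denominator, which is why only the covariates of the subjects still at risk are needed.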

  8. Residuals for Cox (6) • Schoenfeld Residuals (cont) • Compute the expected value for covariate ‘i’ at time ‘t’: • The Schoenfeld residual for covariate ‘i’ at time ‘t’ is:
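
The two formulas on this slide are missing from the transcript. A reconstruction consistent with the verbal definition on slide 6 and the arithmetic on slide 12 follows. The symbols are assumptions introduced here: $x_{ij}$ is covariate i for subject j, $R(t)$ is the risk set, and $p_j(t)$ is the probability that subject j is the one having the event.

```latex
\[
\bar{x}_i(t) \;=\; \sum_{j \in R(t)} p_j(t)\, x_{ij},
\qquad
p_j(t) \;=\; \frac{\exp(\beta^\top x_j)}{\sum_{k \in R(t)} \exp(\beta^\top x_k)}
\]
\[
r_i(t) \;=\; x_{ij^*} - \bar{x}_i(t)
\]
```

In practice $\beta$ is replaced by its estimate from the fitted model, and $j^*$ is the subject who had the event at ‘t’.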

  9. Residuals for Cox (7) • Note that there is one residual for: • each covariate • at each time point • for the subject having the event • Differs from linear regression which has one residual for: • each subject • If there is more than one event at the time point • Compute one residual for each subject • Expected value is ‘0’ under PH assumption

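  10. A worked example • True model: ln(HR) = x1 + 2x2 • At time ‘t’: • 3 subjects remain in the risk set • Covariates (recovered from the calculations on slide 12): subject 1: x1 = 0.3, x2 = 0.4; subject 2: x1 = 0.4, x2 = 0.2; subject 3: x1 = 0.5, x2 = 0.1 • Subject #2 has the event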

  11. Compute the probability that each subject has the event at ‘t’ • The values used on slide 12 are approximately 0.42, 0.31 and 0.28 for subjects 1, 2 and 3

  12. • Expected value of x1 at ‘t’ is: 0.42*0.3 + 0.31*0.4 + 0.28*0.5 = 0.39 • Schoenfeld residual for x1 at ‘t’ is: 0.4 – 0.39 = 0.01 • Expected value of x2 at ‘t’ is: 0.42*0.4 + 0.31*0.2 + 0.28*0.1 = 0.258 • Schoenfeld residual for x2 at ‘t’ is: 0.2 – 0.258 = -0.058
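
The arithmetic on slides 10 to 12 can be checked with a short script. This is a minimal sketch, not the course's code: the function names are hypothetical, and the covariate values are those appearing in the slide 12 calculations. Because it uses unrounded probabilities, its results differ slightly from the slide, which works with probabilities rounded to two decimals.

```python
import math

def event_probabilities(covariates, beta):
    """Probability that each subject in the risk set is the one who has the event."""
    # Relative hazard exp(beta . x) for every subject still at risk
    scores = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in covariates]
    total = sum(scores)
    return [s / total for s in scores]

def schoenfeld_residuals(covariates, beta, event_index):
    """One residual per covariate: observed value for the event subject minus expected value."""
    probs = event_probabilities(covariates, beta)
    expected = [
        sum(p * row[i] for p, row in zip(probs, covariates))
        for i in range(len(beta))
    ]
    return [covariates[event_index][i] - expected[i] for i in range(len(beta))]

# Risk set at time t: (x1, x2) for subjects 1, 2, 3, as used on slide 12
risk_set = [(0.3, 0.4), (0.4, 0.2), (0.5, 0.1)]
beta = (1.0, 2.0)  # true model from slide 10: ln(HR) = x1 + 2*x2

probs = event_probabilities(risk_set, beta)
residuals = schoenfeld_residuals(risk_set, beta, event_index=1)  # subject #2 has the event

print([round(p, 3) for p in probs])
print([round(r, 3) for r in residuals])
```

With unrounded probabilities the residuals come out near 0.014 for x1 and -0.055 for x2, close to the slide's 0.01 and -0.058.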

  13. The ‘STRATA’ Statement (1) • PROC LIFETEST has a ‘STRATA’ statement • Used to define two (or more) groups for the log-rank test. • Produces one S(t) curve for each level of the stratification variable. • Plot log(-log(S(t))) vs. ‘t’ to check PH (more later) • PROC PHREG also has a STRATA statement • Useful for ‘adjusting’ out variables which do not meet the PH assumption but which aren’t of interest to us.

  14. The ‘STRATA’ Statement (2) • Effectively, allows a different baseline hazard in each stratum, while the coefficients of the model covariates are common to all strata • Use the BASELINE statement to estimate S(t) in each stratum • Plot log(-log(S(t))) vs. ‘t’ in each stratum to check PH

  15. End of Background

  16. Testing PH (1) • 3 graphical methods and 1 modeling approach • Graphical method #1 • Plot: log(-log(S(t))) vs. log(t) • Can also plot against just ‘t’ • Consider two groups which satisfy the PH assumption.

  17. But: Take another log of both sides: So, plotting log-log curves of the 2 groups should show curves which are parallel. Can plot against ‘t’ or ‘ln(t)’
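
The equations for slides 16 and 17 are missing from the transcript. The standard derivation they appear to follow is sketched here, assuming two groups with hazards $h_0(t)$ and $h_1(t)$ and a constant hazard ratio HR (symbols chosen here for illustration):

```latex
\begin{align*}
h_1(t) &= HR \cdot h_0(t)
  && \text{(PH assumption)} \\
H_1(t) &= HR \cdot H_0(t)
  && \text{(integrate both sides from 0 to } t\text{)} \\
-\log S_1(t) &= HR \cdot \bigl(-\log S_0(t)\bigr)
  && \text{(since } S(t) = e^{-H(t)}\text{)} \\
\log\bigl(-\log S_1(t)\bigr) &= \log(HR) + \log\bigl(-\log S_0(t)\bigr)
  && \text{(take another log of both sides)}
\end{align*}
```

The two log-log curves therefore differ by the constant log(HR) at every time point, which is why they should look parallel whether the horizontal axis is ‘t’ or ‘ln(t)’.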

  18. Testing PH (2) • How to generate the curves? Method #1 • Use the KM method in Proc LIFETEST • Use the STRATA statement to generate different curves for each level of the predictor • Produces one set of log(-log(S(t))) values for each level of the predictor variable • Limitations • Cannot adjust for other factors • Hard to use with continuous predictors.

  19. ODS graphics on;
      proc lifetest data=njb1 plots=(s,ls,lls);
        time week*arrest(0);
        strata fin;
      run;
      ODS graphics off;

  20. [Figure; curve labels: Aid, No aid]

  21. [Figure: H(t); curve labels: No aid, Aid]

  22. [Figure; curve labels: No aid, Aid]

  23. ODS graphics on;
      proc lifetest data=njb1 plots=(s,ls,lls);
        time week*arrest(0);
        strata age (20,25);
      run;
      ODS graphics off;

  24. [Figure: H(t); curve labels: <20, 20-25, >25]

  25. [Figure; curve labels: 20-25, <20, >25]

  26. Testing PH (3) Method #2 • Use Proc PHREG with the STRATA and BASELINE statements • Using ‘baseline’ and ‘strata’ produces an estimate of S(t), ln(S(t)), etc. within each stratum, adjusted for the other variables in the model. • Variable being tested for PH goes into the STRATA statement and not in the model • Plot curves and examine

  27. PROC PHREG DATA=allison.recid;
        class fin;
        MODEL week*arrest(0)=age prio / TIES=EFRON;
        /* Variable being tested for PH goes in STRATA, not in MODEL */
        STRATA fin;
        BASELINE OUT=a SURVIVAL=s logsurv=logs loglogs=loglogs;
      RUN;

  28. Testing PH (4) • Generates one S(t) curve for each stratum level. • Use these to plot ‘log(-log(S(t)))’ for each group • Could you do the same thing using the covariates option as discussed earlier? • NO!! • Gives S(t) for each level of the variable of interest • BUT based on a common baseline hazard • Means that the log(-log(S(t))) curves MUST be parallel
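
The last point can be seen numerically. In the sketch below the baseline survival values and the coefficient are hypothetical numbers chosen for illustration. When the survival curve for a covariate level is derived from a common baseline as S0(t) raised to exp(beta), the gap between the two log(-log(S(t))) curves equals beta at every time point, so the plot cannot reveal non-PH.

```python
import math

def loglog_survival(s):
    """log(-log(S(t))) for a survival probability strictly between 0 and 1."""
    return math.log(-math.log(s))

# Hypothetical baseline survival S0(t) at four successive times (assumption)
baseline_survival = [0.95, 0.85, 0.70, 0.50]
beta = 0.6  # hypothetical Cox coefficient for a 0/1 covariate (assumption)

gaps = []
for s0 in baseline_survival:
    # Model-based survival for x = 1 built on the common baseline
    s1 = s0 ** math.exp(beta)
    gaps.append(loglog_survival(s1) - loglog_survival(s0))

# Every gap equals beta: the curves are parallel by construction
print([round(g, 6) for g in gaps])
```

Curves estimated with the STRATA statement carry no such constraint, which is why they can be used to check PH.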

  29. Testing PH (5) • No need to use the ‘covariates’ option, even for categorical variables • Issue is: • are the lines parallel? • As long as covariates are the same for both groups, it ‘works’ • ODS Graphics can produce the S(t) and H(t) plots. • ODS Graphics cannot produce the log(-log(S(t))) plots directly • Instead, I use SAS Graph

  30. proc sort data=a;
        by fin week;
      run;
      symbol1 interpol=j color=red width=6;
      symbol2 interpol=j color=green width=6;
      axis1 order=(0 to 1 by .1);
      axis2 logbase=10 logstyle=expand order=(1,10,100);
      /* S(t) and log(-log(S(t))) against time */
      proc gplot data=a;
        plot (s)*week=fin / vaxis=axis1;
        plot (loglogs)*week=fin;
      run;
      /* Log-scaled time axis: gives the plot against log(t) */
      proc gplot data=a;
        plot (loglogs)*week=fin / haxis=axis2;
      run;

  31. S(t) vs. time

  32. Log(-log(S(t))) vs. time

  33. Log(-log(S(t))) vs. log(time)

  34. Second example – graphs only

  35. Testing PH (6) Graphical method #2 • Plot: Schoenfeld residuals • Do for each variable in the model • Fit a LOESS curve to each graph • Curve should be parallel to the x-axis • Departures imply PH assumption is violated • Can handle continuous predictors • Problem: Interpreting graphs is ‘tricky’ • Hard to distinguish random fluctuation from non-PH effects

  36. Simulated data; 2 variables. Dichotomous variable: PH is OK. Continuous variable: PH is OK. [Figure panels: Dichot. var, Cont. var]

  37. Simulated data; 2 variables. Dichotomous variable: PH is OK. Continuous variable: PH is NOT OK. [Figure panels: Dichot. var, Cont. var]

  38. Simulated data; 2 variables. Dichotomous variable: PH is NOT OK. Continuous variable: PH is NOT OK. [Figure panels: Dichot. var, Cont. var]

  39. Testing PH (7) Graphical method #3 • Uses the new SAS statement: ASSESS • Easy to produce but tricky to understand • Based on Martingale residuals and a ‘counting process’ approach to survival models • Advanced ideas but, in overview: • We observe N(t): the number of events for a subject by time ‘t’ • N(t) splits into two parts: • ‘process’ (based on the model hazard) • random (martingale)

  40. Testing PH (8) Graphical method #3 • Plot Martingale residuals against time (ASSESS) • Generate 1,000 simulations of the ‘hazard’ process, which meets the PH assumption (RESAMPLE) • For each, compute the Martingale residuals • Plot the observed curve and simulated curves • A Kolmogorov-type supremum test gives a p-value for whether the observed curve is ‘consistent’ with PH.
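
The logic of the supremum test in the last bullet can be sketched as follows. This is an illustration only, not what SAS does internally: the function name and the toy numbers are hypothetical, and the step that generates the simulated paths under PH is not reproduced. The sketch assumes the p-value is the share of simulated paths whose largest absolute excursion is at least as large as that of the observed path.

```python
def supremum_p_value(observed_path, simulated_paths):
    """Share of simulated paths whose maximum absolute value reaches the observed maximum."""
    observed_sup = max(abs(v) for v in observed_path)
    exceed = sum(
        1 for path in simulated_paths
        if max(abs(v) for v in path) >= observed_sup
    )
    return exceed / len(simulated_paths)

# Toy numbers for illustration only (not from the recidivism data)
observed = [0.1, -0.4, 0.2]
simulated = [
    [0.5, 0.1, 0.0],    # sup = 0.5, exceeds the observed sup of 0.4
    [0.2, -0.3, 0.1],   # sup = 0.3
    [-0.45, 0.0, 0.2],  # sup = 0.45, exceeds
    [0.1, 0.1, -0.1],   # sup = 0.1
]

p_value = supremum_p_value(observed, simulated)
print(p_value)
```

A small p-value means few simulated PH-consistent paths wander as far from zero as the observed one, which points towards a violation of PH for that covariate.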

  41. ODS GRAPHICS ON;
      ODS rtf;
      PROC PHREG DATA=allison.recid;
        MODEL week*arrest(0)=fin age race wexp mar paro prio / TIES=EFRON;
        ASSESS PH / RESAMPLE;
      RUN;
      ODS rtf close;
      ODS GRAPHICS OFF;

  42. Testing PH (7) Analytical method • If you have non-PH, that means • The HR varies over time • This should be detectable with a time-varying covariate • What covariate to use? • Can develop specific models • For screening purposes use either: • x*t • x*log(t)

  43. Testing PH (9) Analytical method (cont) • Can use either ‘t’ or ‘log(t)’ • Log(t) is usually preferred • ‘time’ can get very large • Can produce numerical problems • log(t) tends to avoid numerical problems. • PROCESS • Define time varying covariate • Defining variable in Proc step is easier • Place in model and run it • Look for statistical significance of the time varying variable
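
Written out, the screening model described on slides 42 and 43 has the following form. The coefficient names $\beta_1$ and $\beta_2$ are notation introduced here, and the log(t) version is shown:

```latex
\[
h(t \mid x) \;=\; h_0(t)\, \exp\bigl(\beta_1 x + \beta_2\, x \log t\bigr)
\]
\[
HR(t) \;=\; \frac{h(t \mid x + 1)}{h(t \mid x)} \;=\; \exp\bigl(\beta_1 + \beta_2 \log t\bigr)
\]
```

Under PH the hazard ratio does not depend on ‘t’, which corresponds to $\beta_2 = 0$. A statistically significant coefficient on the x*log(t) term is therefore evidence against the PH assumption for x.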

  44. ODS GRAPHICS ON;
      ODS rtf;
      PROC PHREG DATA=allison.recid;
        MODEL week*arrest(0)=fin age race wexp mar paro prio aget / TIES=EFRON;
        /* Time-varying covariate, defined inside the PROC step */
        aget = age*log(week);
      RUN;
      ODS rtf close;
      ODS GRAPHICS OFF;
