Nonparametric Estimation with Recurrent Event Data
290 likes | 463 Vues
Nonparametric Estimation with Recurrent Event Data. Talk at USC Epid/Biostat 03/04/02. Edsel A. Pena Department of Statistics, USC. Based on joint works with R. Strawderman (Cornell) and M. Hollander (Florida State). Research supported by NIH and NSF Grants.
Nonparametric Estimation with Recurrent Event Data
E N D
Presentation Transcript
Nonparametric Estimation with Recurrent Event Data Talk at USC Epid/Biostat 03/04/02 Edsel A. Pena Department of Statistics, USC Based on joint works with R. Strawderman (Cornell) and M. Hollander (Florida State) Research supported by NIH and NSF Grants
A Real Recurrent Event Data(Source: Aalen and Husebye (‘91), Statistics in Medicine) Variable:Migrating motor complex (MMC) periods, in minutes, for 19 individuals in a gastroenterology study concerning small bowel motility during fasting state.
t-S3=4 T2=59 T1=284 T3=186 S2=343 t=533 S3=529 S1=284 Calendar Scale Pictorial Representation of Data for a Subject • Consider subject #3. • K = 3 • Inter-Event Times, Tj: 284, 59, 186 • Censored Time, t-SK: 4 • Calendar Times, Sj: 284, 343, 529 • Limit of Observation Period: t = 533 0
Features of Data Set • Random observation period per subject (administrative, time, resource constraints). • Length of period: t • Event of interest is recurrent. • # of events (K) informative about MMC period distribution (F). • Last MMC period right-censored by which is informative about F. • Calendar times: S1, S2, …, SK.
Assumptions and Problem • Aalen and Husebye: “Consecutive MMC periods for each individual appear (to be) approximate renewal processes.” • Translation: The inter-event times Tij’s are assumed stochastically independent. • Problem: Under this IID or renewal assumption, and taking into account the informativeness of K and the right-censoring mechanism, how to estimate the inter-event distribution, F and its parameters (e.g., median).
General Form of Data Accrual Calendar Times of Event Occurrences Si0=0 and Sij = Ti1 + Ti2 + … + Tij Number of Events in Observation Period Ki = max{j: Sij<ti} Limits of observation periods, ti’s, may be fixed, or assumed IID with unknown distribution G.
Observables Problem Obtain a nonparametric estimator of the inter-event time distribution, F, and determine its properties.
Relevance and Applicability • Recurrent phenomena occur in public health/medical settings. • Outbreak of a disease. • Hospitalization of a patient. • Tumor occurrence. • Epileptic seizures. • In other settings. • Non-life insurance claims. • Stock index (e.g., Dow Jones) decreases by at least 6% in a day. • Labor strikes. • Reliability and engineering.
Existing Methods and Their Limitations • Consider only the first, possibly right-censored, observation per unit and use the product-limit estimator (PLE). • Loss of information • Inefficient • Ignore the right-censored last observation, and use empirical distribution function (EDF). • Leads to bias (“biased sampling”) • Estimator inconsistent
Review: Complete Data • T1, T2, …, Tn IID F(t) = P(T < t) Empirical Survivor Func. (EDF) Asymptotics of EDF
In Hazards View Hazard rate function Cumulative hazard function Product representation
Alternative Representations EDF in Product Form Another representation of the variance
Review: Right-Censored Data • Failure times: T1, T2, …, Tn IID F • Censoring times: C1, C2, …, Cn IID G • Right-censored data (Z1, d1), (Z2, d2), …, (Zn, dn) Zi = min(Ti, Ci) di = I{Ti< Ci} Product-Limit Estimator
PLE Properties • Asymptotics of PLE • If G(w) = 0, so no censoring,
Recurrent Event Setting • Ni(s,t) = number of events for the ith unit in calendar period [0,s] with inter-event times at most t. • Yi(s,t) = number of events for the ith unit which are known during the calendar period [0,s] to have inter-event times at least t. • Ki(s) = number of events for the ith unit that occurred in [0,s]
For MMC Unit #3 Inter-Event Time t=100 t=50 284 343 529 =533 Calendar Time s=550 s=400 K3(s=400) = 2 N3(s=400,t=100) = 1 Y3(s=400,t=100) = 1 t s K3(s=550) = 3 N3(s=550,t=50) = 0 Y3(s=550,t=50) = 3
Aggregated Processes Limit Processes as s Increases
Estimator of F in Recurrent Event Setting Estimator is called the GPLE; generalizes the EDF and the PLE discussed earlier.
Asymptotic Distribution of GPLE For large n and regularity conditions, with variance function
EDF: • PLE: • GPLE (recurrent event): Variance Functions: Comparisons • If in stationary state, R(t) = t/mF, so mG(w) = mean residual life of t at w.
Wang-Chang Estimator (JASA, ‘99) • Can handle correlated inter-event times. Comparison with GPLE may be unfair to WC estimator under the IID setting!
Frailty (“Random Effects”) Model • U1, U2, …, Un are IID unobservedGamma(a, a) random variables, called frailties. • Given Ui = u, (Ti1, Ti2, Ti3, …) are independent inter-event times with • Marginal survivor function of Tij:
Estimator under Frailty Model • Frailty parameter, a: dependence parameter. • Small (Large) a: Strong (Weak) dependence. • EM algorithm needed to obtain the estimator (FRMLE). Unobserved frailties viewed as missing.GPLE needed in EM algorithm. • Under frailty model: GPLE not consistent when frailty parameter is finite.
Comparisons of Estimators • Under gamma frailty model. • F = EXP(q): q = 6 • G = EXP(h): h = 1 • n = 50 • # of Replications = 1000 • Frailty parameter a took values: { (IID Model), 6, 2} • Computer programs: S-Plus and Fortran routines.
Simulated Comparison of the Three Estimators for Varying Frailty Parameter Black=GPLE; Blue=WCPLE; Red=FRMLE IID
Effect of the Frailty Parameter (a) for Each of the Three Estimators (Black=Infty; Blue=6; Red=2)
The Three Estimates of MMC Period Survivor Function IID assumption seems acceptable. Estimate of frailty parameter a is 10.2.