Michael Dennis, Ph.D., Chestnut Health Systems, Bloomington, IL

Practical Approaches for Dealing with Missing Data in Longitudinal Analyses of Adolescent Addiction Programs
Michael Dennis, Ph.D., Chestnut Health Systems, Bloomington, IL
Presentation at the Advisory Committee Meeting for “Economic Evaluation Methods: Development and Applications (R01 DA018645)”, Coconut Grove, FL, November 10-11, 2006. Preparation of this manuscript was supported by funding from the Center for Substance Abuse Treatment (CSAT Contract no. 270-2003-00006). The content of this presentation reflects the opinions of the author and does not reflect the views or policies of the government. Available online at www.chestnut.org/LI/Posters or by contacting Joan Unsicker at 720 West Chestnut, Bloomington, IL 61701, phone: (309) 827-6026, fax: (309) 829-4661, e-mail: junsicker@Chestnut.Org

This presentation provides:
• A quick review of the problems of missingness and methods of imputation, based on Schafer & Graham (2002)
• A summary of the practical approach Chestnut uses to deal with missing data
• A focus on the conceptual issues and actual effectiveness, not the math or computational formulas per se

Types of Missingness
• By design
• Logical skipouts
• Item missing
• Wave missing
• Unobserved latent constructs

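Key Terms (From Rubin)
• Missing Completely at Random (MCAR): Missingness has no relationship to the predictors or the dependent variable
• Missing at Random (MAR): After accounting for observed predictors, missingness has no relationship with the dependent variable (so it can be predicted from observed data)
• Missing Not at Random (MNAR): Missingness is related to the unobserved values themselves (of predictors and/or dependent variables), even after accounting for observed data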

The Problem With Listwise Deletion (default)
• Estimates are increasingly biased as we move away from MCAR
• Estimates become unstable
• Smaller SD inflates significance tests
• Changes correlations and relationships
• Loss of sample is also problematic for multivariate analyses
Source: Schafer & Graham (2002)
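To make the three missingness mechanisms concrete, here is a minimal simulation sketch of listwise (complete-case) deletion. Everything in it is an illustrative assumption rather than part of the presentation: the function names, the 0.7 slope, and the 30%/60% missingness probabilities are all hypothetical. The true mean of the outcome is 0, so any departure of the complete-case mean from 0 is bias.

```python
import random
import statistics


def make_sample(n=20000, seed=0):
    """Simulate (x, y) pairs where y depends on x; the true mean of y is 0."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        x = rng.gauss(0, 1)            # predictor, always observed
        y = 0.7 * x + rng.gauss(0, 1)  # outcome, sometimes missing
        sample.append((x, y))
    return sample


def complete_case_mean(sample, mechanism, seed=1):
    """Drop y under the given mechanism, then average what is left."""
    rng = random.Random(seed)
    kept = []
    for x, y in sample:
        if mechanism == "MCAR":
            p_missing = 0.3                    # unrelated to x or y
        elif mechanism == "MAR":
            p_missing = 0.6 if x > 0 else 0.0  # depends only on observed x
        elif mechanism == "MNAR":
            p_missing = 0.6 if y > 0 else 0.0  # depends on the missing y itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        if rng.random() >= p_missing:
            kept.append(y)
    return statistics.fmean(kept)


sample = make_sample()
means = {m: complete_case_mean(sample, m) for m in ("MCAR", "MAR", "MNAR")}
# MCAR stays close to 0; MAR and MNAR are pulled below 0.
```

In this sketch the MAR bias could in principle be corrected by modeling y from x, whereas the MNAR bias cannot be recovered from the observed data alone.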

Pair-wise
• Pair-wise deletion is particularly efficient and unbiased under the assumption of MCAR
• Becomes rapidly unstable even under MAR
• Often narrows covariance or variance estimates and distorts relationships in regression or structural equation models (SEM)

Problems with other common methods of replacement
• Mean substitution narrows variance
• Regression estimation still narrows variance
• Hot deck is better but still biased
• Only models using real variance are relatively unbiased
Source: Schafer & Graham (2002)
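The variance-narrowing effect of mean substitution can be seen with a few numbers. The sketch below uses made-up values and a hypothetical helper name; it is only meant to show why the standard deviation shrinks when missing values are replaced by the observed mean.

```python
import math
import statistics


def mean_substitute(values):
    """Replace each missing value (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in values]


values = [2, 4, 4, 5, 7, 9, None, None, None]  # illustrative data only
observed = [v for v in values if v is not None]

sd_observed = statistics.stdev(observed)
sd_filled = statistics.stdev(mean_substitute(values))

# The filled-in cases add nothing to the sum of squared deviations but do
# increase n, so the SD shrinks by sqrt((6 - 1) / (9 - 1)) in this example.
shrink_factor = math.sqrt((len(observed) - 1) / (len(values) - 1))
```

A smaller standard deviation with an unchanged mean difference is what inflates significance tests after this kind of replacement.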

Examples of Predictive Methods
• Weighted hot-deck: sort people based on related variables, then randomly replace
• Maximum Likelihood (ML): predict from all other available data
• Restricted Maximum Likelihood (RML): predict from all other available data within the same condition (site, time, etc.) to preserve differences
• Multiple imputation: average over several imputations; a form of bootstrapping that does not assume a normal distribution
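For multiple imputation, the "average over several imputations" step is commonly done with Rubin's rules: average the per-imputation estimates, and combine the within-imputation and between-imputation variance. The sketch below assumes the analysis has already been run on each imputed data set; the function name and the example numbers are hypothetical.

```python
import math
import statistics


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their squared standard errors.

    Returns (pooled_estimate, pooled_standard_error) using Rubin's rules:
    total variance = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)      # average sampling variance
    between = statistics.variance(estimates)  # variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Illustrative numbers only: three imputations of the same parameter.
pooled, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the uncertainty due to missingness into the final standard error, which single-value replacement methods leave out.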

Problem with these methods…
• Complicated on many variables and/or for multiple analyses
• All methods have unknown biases under MNAR unless there is a known a priori basis for modeling missingness (e.g., a common factor)
• In longitudinal analysis, this includes knowing the expected trajectory over time

Chestnut Strategy 1: Minimize it
• Train, monitor, and do quality assurance to get staff to minimize missing data
• Use simple logical skips to minimize not-applicable questions and burden
• Differentiate between refusals (rare), don’t knows (more common), and skip outs (common); track and problem-solve if refusals start occurring on specific items (which is MNAR)
• Put more effort into follow-up

Follow-up Rates are PRIMARILY related to effort Source: Scott (2004)

Accepting a lower follow-up rate “biases” results
• The easiest-to-find people are different on the outcome, which is MNAR
• The differences are as large as or larger than the treatment effects we are looking for
Source: Scott (2004)

Strategy 2: Make Logical Edits
• Design the questionnaire so that there are clear, simple logical edits with implied values
• Test the logic of edits (not all work, e.g., M1)
• Replace logical skip outs with the implied value
• Test the logic of complex edits to create summary measures (not all work, e.g., NHSDA)
• Make complex edits

Strategy 3: Replace missing data within known factors
• Recall that this was one of the few ways to deal with MNAR
• Known common factors should have a Cronbach’s alpha of at least .7
• Evaluate the amount and type of missing data:
  • missing by design (e.g., adding an item in a new version) is MCAR
  • systematic refusal is MNAR
• Calculate the scale as the mean of valid items × the expected number of items (require at least 3 valid items)
• Generally do the above within subscale, then sum up to higher-order scales
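The scale calculation above (mean of valid items times the expected number of items, with at least 3 valid items) can be sketched as follows, together with a standard Cronbach's alpha check on complete cases. The function names and example data are illustrative assumptions, not Chestnut's actual code.

```python
import statistics


def prorated_scale(items, min_valid=3):
    """Mean of valid items times the expected number of items.

    `items` holds one respondent's item scores, with None for missing.
    Returns None when fewer than `min_valid` items were answered.
    """
    valid = [v for v in items if v is not None]
    if len(valid) < min_valid:
        return None
    return statistics.fmean(valid) * len(items)


def cronbach_alpha(rows):
    """Cronbach's alpha from complete cases (each row = one respondent)."""
    k = len(rows[0])
    item_vars = [statistics.variance([row[j] for row in rows]) for j in range(k)]
    total_var = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Four of five items answered: mean 0.75 x 5 expected items = 3.75
score = prorated_scale([1, 0, 1, None, 1])
# Only two items answered: too few to prorate
too_sparse = prorated_scale([1, None, None, None, 1])
```

Prorating treats the missing items as if they behaved like the answered ones, which is why it is only defensible within a factor whose items hang together (the alpha of at least .7 criterion).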

Rasch Model Demonstrating That Item Severities Are NOT Equal
Example: GAIN Substance Problems Scale (SPS)
[Figure: Rasch map of persons (left) and items (right) on a common scale, running from "more"/"rare" at the top (2, truncated) down to -4; each '#' represents 24 persons. Items from rarest to most common: HlthProbs; Withdrawal/ill; ProbW/Law; Unsafe, GiveUpActs, DespiteMedPsyProbs; DepressedNervous, NeededMoreAOD, UnableCutDown; ResponNotMet, LargerAmnt/more; HideWhenUseAOD, Fights/trouble; SpentTimeGetting; ParentComplained; WeeklyAOD.]
Source: Riley et al. (in press)

Use of Rasch Measurement Models and Computer Adaptive Tests (CAT)
• Construct validation: comparing alternative measures to “expected” correlates
• Weighting items with Rasch does a little better
• CAT can closely approximate the full measure with a fraction of the items
Source: Riley et al. (in press)

Strategy 4: Replace structural missing data (e.g., by site)
• Where data are missing structurally by design (i.e., MCAR), use regression to impute values based on correlated factors in other sites (seeking a formula that explains 70% or more of the variance)
• Use simple regression if a small percent of data is missing (under 5%)
• As the amount of missing data goes up to 15%, it is worth considering the use of ML or MI
• Above 15% missing, all methods are questionable
• At this point we usually have less than 1% missing within wave, but 5-20% or more by wave
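The rules of thumb above can be sketched as a small helper plus a single-predictor regression imputation that refuses to impute when the fitted line explains less than 70% of the variance. This is a simplified illustration under stated assumptions: one fully observed predictor, hypothetical function names, and thresholds taken from the bullets (how the exact 5% and 15% boundaries are treated is a choice made here).

```python
def suggest_method(pct_missing):
    """Map percent missing to the rule of thumb from the slide."""
    if pct_missing < 5:
        return "simple regression"
    if pct_missing <= 15:
        return "consider ML or MI"
    return "all methods questionable"


def regression_impute(x, y, min_r2=0.70):
    """Fill missing y (None) from x using ordinary least squares.

    The model is fit on cases with observed y. Returns None if the fit
    explains less than `min_r2` of the variance or cannot be estimated.
    """
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    n = len(pairs)
    if n < 3:
        return None
    mean_x = sum(xi for xi, _ in pairs) / n
    mean_y = sum(yi for _, yi in pairs) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in pairs)
    syy = sum((yi - mean_y) ** 2 for _, yi in pairs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in pairs)
    if sxx == 0 or syy == 0:
        return None
    r_squared = sxy ** 2 / (sxx * syy)
    if r_squared < min_r2:
        return None
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [intercept + slope * xi if yi is None else yi for xi, yi in zip(x, y)]


# Illustrative data: y = 2x + 1 exactly, with one value missing
filled = regression_impute([1, 2, 3, 4, 5], [3, 5, None, 9, 11])
# A predictor with no linear relationship to y is rejected
rejected = regression_impute([1, 2, 3, 4, 5], [1, 5, None, 5, 1])
```

As noted in the earlier slide on common replacement methods, values imputed straight off the regression line still narrow the variance, which is one reason to keep this to small amounts of missing data.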

Strategy 5: Replacement within wave
• Identify remaining items with more than 1-2% missing and the feasibility of replacing them via regression (or ML/MI)
• For the rest, sort data on key dimensions of variation and do a modified weighted hot deck on the 2-3 people above or below
  • we typically sort on a total symptom count and the baseline dependent variable within count, condition, and site
• Can replace with the mean, median, or a random choice; we have found that the median is more stable because of the skewed nature of several distributions, and we use it by default
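One way to read the modified hot deck described above is: within each site and condition, sort cases by symptom count and baseline value, then replace each missing value with the median of the nearest observed neighbors above and below. The sketch below follows that reading; the field names, the `k=2` default, and the example records are hypothetical.

```python
import statistics


def hot_deck_median(records, target, strata_keys, sort_keys, k=2):
    """Fill missing `target` values with the median of up to k observed
    neighbors above and k below in sort order, within the same stratum.

    Only originally observed values act as donors. Records keep their
    input order; a case with no donors in its stratum stays missing.
    """
    filled = [dict(r) for r in records]
    strata = {}
    for idx, rec in enumerate(records):
        strata.setdefault(tuple(rec[s] for s in strata_keys), []).append(idx)
    for indexes in strata.values():
        order = sorted(indexes, key=lambda i: tuple(records[i][s] for s in sort_keys))
        for pos, i in enumerate(order):
            if records[i][target] is not None:
                continue
            above = [records[j][target] for j in reversed(order[:pos])
                     if records[j][target] is not None][:k]
            below = [records[j][target] for j in order[pos + 1:]
                     if records[j][target] is not None][:k]
            donors = above + below
            if donors:
                filled[i][target] = statistics.median(donors)
    return filled


# Illustrative records only
records = [
    {"site": "S1", "condition": "tx", "symptoms": 1, "baseline": 0, "days_used": 1},
    {"site": "S1", "condition": "tx", "symptoms": 2, "baseline": 1, "days_used": 2},
    {"site": "S1", "condition": "tx", "symptoms": 3, "baseline": 2, "days_used": None},
    {"site": "S1", "condition": "tx", "symptoms": 4, "baseline": 3, "days_used": 4},
    {"site": "S1", "condition": "tx", "symptoms": 5, "baseline": 9, "days_used": 30},
    {"site": "S1", "condition": "ctl", "symptoms": 3, "baseline": 2, "days_used": 100},
]
result = hot_deck_median(records, "days_used", ["site", "condition"],
                         ["symptoms", "baseline"])
# The missing case gets median(1, 2, 4, 30) = 3.0; the mean would be 9.25.
```

The skewed donor value of 30 barely moves the median, which illustrates why the median is the more stable default for skewed distributions.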

Understanding the Multidimensional Nature Can Be Used to Create Additional Strata for Replacement
[Figure: risk dimensions and risk groups. Labels shown: Male Sex Risk, Crack Risk, Needle Risk, Female Sex Risk; Male Sex Buyers, High Risk, Needle Sharers, Female Sex Traders.]
Source: Dennis et al. (2001)

Important to Block on Condition in Experiments or Quasi-Experiments
Unrestricted replacement would average out the real variance due to the effect of the experimental condition
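A small made-up example shows how unrestricted replacement shrinks a treatment effect while replacement blocked on condition preserves it. The helper names and data below are hypothetical and use median replacement for simplicity.

```python
import statistics


def fill_with_median(records, target, block_key=None):
    """Replace missing `target` values with a median of observed values.

    With `block_key`, the median is taken within each block (e.g. condition);
    without it, one overall median is used for everyone.
    """
    def block_of(rec):
        return rec[block_key] if block_key is not None else None

    donors = {}
    for rec in records:
        if rec[target] is not None:
            donors.setdefault(block_of(rec), []).append(rec[target])
    medians = {block: statistics.median(vals) for block, vals in donors.items()}
    return [
        dict(rec, **{target: medians.get(block_of(rec))}) if rec[target] is None
        else dict(rec)
        for rec in records
    ]


def condition_gap(records, target, key, first, second):
    """Difference in mean `target` between two conditions."""
    def mean_for(level):
        return statistics.fmean(r[target] for r in records if r[key] == level)
    return mean_for(first) - mean_for(second)


data = [
    {"condition": "tx", "outcome": 10}, {"condition": "tx", "outcome": 12},
    {"condition": "tx", "outcome": None},
    {"condition": "ctl", "outcome": 2}, {"condition": "ctl", "outcome": 4},
    {"condition": "ctl", "outcome": None},
]
blocked_gap = condition_gap(fill_with_median(data, "outcome", "condition"),
                            "outcome", "condition", "tx", "ctl")
unrestricted_gap = condition_gap(fill_with_median(data, "outcome"),
                                 "outcome", "condition", "tx", "ctl")
# blocked_gap keeps the observed difference of 8; unrestricted_gap shrinks to 16/3.
```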

Strategy 6: Replacement Across Waves
• Create a summary measure based on the average across waves times the expected number of waves to get a total (e.g., total days of abstinence)
• Works best when most people have only 1-2 of several (e.g., 4-8) waves missing
• The above can become biased if missing data by wave is high or systematic
• Can regress from first/last or all available waves to fill in
• Need to know the expected trajectory
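The across-wave total and a simple fill-in alternative can be sketched side by side. Linear interpolation between the nearest observed waves is used here only as a stand-in for "regress from first/last"; it, the function names, and the example numbers are assumptions for illustration.

```python
def total_by_average(waves, max_missing=2):
    """Average of observed waves times the expected number of waves.

    Returns None if more than `max_missing` waves are missing or none
    were observed.
    """
    observed = [w for w in waves if w is not None]
    if not observed or len(waves) - len(observed) > max_missing:
        return None
    return sum(observed) / len(observed) * len(waves)


def total_by_interpolation(waves):
    """Fill each missing wave from its nearest observed neighbors, then sum.

    Interior gaps are linearly interpolated; gaps at either end carry the
    nearest observed value. Returns None if no wave was observed.
    """
    known = [i for i, w in enumerate(waves) if w is not None]
    if not known:
        return None
    filled = list(waves)
    for i, w in enumerate(waves):
        if w is not None:
            continue
        before = [k for k in known if k < i]
        after = [k for k in known if k > i]
        if before and after:
            lo, hi = before[-1], after[0]
            fraction = (i - lo) / (hi - lo)
            filled[i] = waves[lo] + fraction * (waves[hi] - waves[lo])
        elif before:
            filled[i] = waves[before[-1]]
        else:
            filled[i] = waves[after[0]]
    return sum(filled)


# Illustrative: days abstinent per wave, rising over time, wave 3 missing
waves = [0, 30, None, 90]
average_total = total_by_average(waves)             # 40 x 4 = 160.0
interpolated_total = total_by_interpolation(waves)  # 0 + 30 + 60 + 90 = 180.0
```

The two totals disagree because the average ignores where in the trajectory the missing wave falls, which is the reason the expected trajectory needs to be known before choosing a method.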

Special Case of A Curvilinear Trajectory Source: Godley et al (2004)

Special Case of A Curvilinear Trajectory Very Biased Source: Godley et al (2004)

Special Case of A Curvilinear Trajectory Much less biased Source: Godley et al (2004)

Strategy 7: Use of Maximum Likelihood (ML)
• Where possible, use ML or Restricted ML (RML) as implemented in software applications like AMOS, Stata, etc.
• Need to evaluate how much data it is replacing
• Need to be confident that the data are MAR (vs. MNAR) by virtue of a small n missing, knowledge of the reason, or other analyses
• Restricted ML (RML) is preferred to control for site, condition, and/or subject differences
Alternative: We have not used them, but have been thinking about exploring some of the new methods of multiple imputation

References
• Dennis, M. L., Wechsberg, W. M., McDermeit (Ives), M., Campbell, R. S., & Rasch, R. R. (2001). The correlates and predictive validity of HIV risk groups among drug users in a community-based sample: Methodological findings from a multi-site cluster analysis. Evaluation and Program Planning, 24, 187-206.
• Godley, S. H., Dennis, M. L., Godley, M. D., & Funk, R. R. (2004). Thirty-month relapse trajectory cluster groups among adolescents discharged from outpatient treatment. Addiction, 99, 129-139.
• Riley, B. B., Conrad, K. J., Bezruczko, N., & Dennis, M. (in press). Relative precision, efficiency and construct validity of different starting and stopping rules for a Computerized Adaptive Test: The GAIN Substance Problem Scale. Journal of Applied Measurement.
• Schafer, J. L., & Graham, J. W. (2002). Missing data: Our view of the state of the art. Psychological Methods, 7, 147-177.
• Scott, C. K. (2004). A replicable model for achieving over 90% follow-up rates in longitudinal studies of substance abusers. Drug and Alcohol Dependence, 74, 21-36.
