Fixed Versus Random Effects Models for Multilevel and Longitudinal Data Analysis

Ashley H. Schempf, PhD
MCH Epidemiology Training Course
June 1, 2012

Outline • Clustered Data • Fixed Effects Models • Random Effects Models • GEE Models • Hybrid Models • Applied Examples • Homework Assignment

Clustered Data • Involves nesting/clustering of observations or data points • Multilevel: clustering over space • Panel/Longitudinal: clustering over time
[Figure: multilevel structure, individuals i (1-12) nested within neighborhoods j (A-D); longitudinal structure, repeated measurements at times t (1-3) nested within individuals/units i (A-D)]

Unique features • Data are correlated within clusters • This violates the modeling assumption that errors are independent • Complexity/redundancy must be accounted for • Variation at multiple levels allows richer examination and distinction of effects • Neighborhood/family versus individual effects • Cross-sectional (selection) versus longitudinal (causation) • A lot of bias can be introduced with single-level data (omitted variables, selection) • Racial disparities when contextual differences aren’t examined (neighborhood-level data omitted) • Association between dieting and weight (longitudinal data omitted)

Between versus Within Cluster Effects • Factors that only vary between clusters are cluster level effects • Multilevel: walkability, crime level • Longitudinal: race/ethnicity, sex • However, any factor that varies within cluster can also vary between cluster • Multilevel: income/poverty, race • Individual-level and neighborhood aggregated (e.g. % poverty, % black) • Longitudinal: smoking, activity, diet • At each time point but also averaged for an individual (e.g. average activity level over time)

Between versus Within Cluster Effects • In multilevel cases, we may care about both between and within-cluster effects • Contextual effect of living in more versus less segregated neighborhoods (% Black) • Individual effect of race/ethnicity • In longitudinal cases, the between-cluster effects of within-cluster variables tend to represent confounded cross-sectional inference • Comparing a person who smokes to one who doesn’t • Comparing outcomes within a person when they smoke and after they quit

Handling Clustered Data • Many ways of accounting for complex errors and violation of independence • Robust SEs, random effects, GEE, survey analysis • Many ways of disentangling between and within-cluster effects • Fixed effects, hybrid models • The correct choice depends on your purpose

Applied Data Example • To demonstrate these options, I’ll use a dataset of birth certificate information from two counties in North Carolina • Multilevel data structure: births nested within neighborhoods (Census block groups) • Covariate of interest: race (Black-White) • Continuous and dichotomous outcome: gestational age and preterm birth (<37 weeks)
Schempf AH, Kaufman JS. Accounting for context in studies of health inequalities: a review and comparison of approaches. Ann Epidemiol. forthcoming
Schempf AH, Kaufman JS, Messer LC, Mendola P. The neighborhood contribution to black-white perinatal disparities: an example from two North Carolina counties, 1999-2001. Am J Epidemiol. 2011;174(6):744-52.

Fixed Effects • Account for all cluster-level variation by holding cluster constant • All inference is therefore within-cluster • Can be implemented either by • entering dummy variables for n-1 clusters • conditional approach • Continuous outcome: “de-meaning” or subtracting the cluster means from all variables before running model • Binary outcome: conditional logistic regression
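The "de-meaning" step can be illustrated with a minimal Python sketch. The six data points are hypothetical (not the North Carolina data): the outcome is built as y = 2x plus a cluster effect that is correlated with x, so the pooled slope is confounded while the slope on cluster-de-meaned data recovers the within-cluster effect of 2. The helper names `slope` and `demean_by_cluster` are illustrative only.

```python
from collections import defaultdict

# Hypothetical toy data: (cluster, x, y) with y = 2*x + cluster effect.
# Cluster A has effect 10, cluster B has effect 0, and B has higher x,
# so the cluster effect confounds the pooled x-y relationship.
data = [("A", 0, 10), ("A", 1, 12), ("A", 2, 14),
        ("B", 5, 10), ("B", 6, 12), ("B", 7, 14)]

def slope(xs, ys):
    """Simple least-squares slope of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def demean_by_cluster(rows):
    """Subtract each cluster's mean from x and y (the within transformation)."""
    groups = defaultdict(list)
    for cluster, x, y in rows:
        groups[cluster].append((x, y))
    xs, ys = [], []
    for members in groups.values():
        mx = sum(x for x, _ in members) / len(members)
        my = sum(y for _, y in members) / len(members)
        for x, y in members:
            xs.append(x - mx)
            ys.append(y - my)
    return xs, ys

pooled_slope = slope([x for _, x, _ in data], [y for _, _, y in data])
within_slope = slope(*demean_by_cluster(data))

print(f"pooled slope: {pooled_slope:.3f}")  # confounded by cluster differences
print(f"within slope: {within_slope:.3f}")  # fixed-effects (within) estimate
```

For a single covariate, the slope from the de-meaned data is the same coefficient that the dummy-variable regression would return; only the bookkeeping differs.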

reg ga_clean race i.b_group

i.b_group         _Ib_group_1-392     (_Ib_group_1 for b_g~p==370630001011 omitted)
note: _Ib_group_51 omitted because of collinearity
note: _Ib_group_155 omitted because of collinearity

      Source |       SS       df       MS              Number of obs =   31489
-------------+------------------------------           F(390, 31098) =    2.86
       Model |   5064.7562   390  12.9865543           Prob > F      =  0.0000
    Residual |  141130.634 31098  4.53825437           R-squared     =  0.0346
-------------+------------------------------           Adj R-squared =  0.0225
       Total |  146195.391 31488  4.64289223           Root MSE      =  2.1303

------------------------------------------------------------------------------
    ga_clean |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |  -.4692281    .032978   -14.23   0.000    -.5338663     -.40459
 _Ib_group_2 |   .0241545   .3980305     0.06   0.952    -.7560012    .8043102
 _Ib_group_3 |   .5450826   .3702998     1.47   0.141    -.1807199    1.270885
 _Ib_group_4 |  -.1346887   .4903915    -0.27   0.784    -1.095876    .8264983
 _Ib_group_5 |    .435551   .4606856     0.95   0.344    -.4674114    1.338513
 _Ib_group_6 |  -.2657666   .5106346    -0.52   0.603    -1.266631    .7350977
 _Ib_group_7 |   .3649276   .4704894     0.78   0.438    -.5572506    1.287106
 _Ib_group_8 |  -.6098332   .5706181    -1.07   0.285    -1.728268    .5086011
 _Ib_group_9 |     .30786   .6502678     0.47   0.636    -.9666911    1.582411
_Ib_group_10 |   .1086299   .5707573     0.19   0.849    -1.010077    1.227337
….↓

Accounting for neighborhood differences (within-neighborhood inference), Black infants are delivered 0.47 weeks earlier than White infants

proc glm data=nc.data_final;
  class b_group;
  model ga_clean = race b_group / clparm solution;
quit;

R-Square     Coeff Var      Root MSE    ga_clean Mean
0.034644      5.476830      2.130318         38.89692

Source      DF      Type I SS      Mean Square    F Value    Pr > F
race         1    2585.192623      2585.192623     569.64    <.0001
b_group    389    2479.563572         6.374199       1.40    <.0001

Source      DF    Type III SS      Mean Square    F Value    Pr > F
race         1     918.775440       918.775440     202.45    <.0001
b_group    389    2479.563572         6.374199       1.40    <.0001

Parameter                     Estimate    Standard Error    t Value    Pr > |t|
Intercept                 39.45939128 B       0.32492837     121.44      <.0001
race                      -0.46922813         0.03297796     -14.23      <.0001
b_group 370630001011      -0.67109834 B       0.45201887      -1.48      0.1376
b_group 370630001012      -0.64694387 B       0.40694049      -1.59      0.1119
b_group 370630001021      -0.12601570 B       0.37957976      -0.33      0.7399
b_group 370630002001      -0.80578708 B       0.49753595      -1.62      0.1053
b_group 370630002002      -0.23554737 B       0.46848328      -0.50      0.6151

Parameter                   95% Confidence Limits
Intercept                 38.82251860    40.09626396
race                      -0.53386626    -0.40459000
b_group 370630001011      -1.55707353     0.21487684
b_group 370630001012      -1.44456361     0.15067587
b_group 370630001021      -0.87000731     0.61797592
b_group 370630002001      -1.78097758     0.16940342

Accounting for neighborhood differences (within-neighborhood inference), Black infants are delivered 0.47 weeks earlier than White infants

xtreg ga_clean race, i(b_group) fe

Fixed-effects (within) regression               Number of obs      =     31489
Group variable: b_group                         Number of groups   =       390

R-sq:  within  = 0.0065                         Obs per group: min =         1
       between = 0.3141                                        avg =      80.7
       overall = 0.0177                                        max =       652

                                                F(1,31098)         =    202.45
corr(u_i, Xb)  = 0.2308                         Prob > F           =    0.0000

------------------------------------------------------------------------------
    ga_clean |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |  -.4692281    .032978   -14.23   0.000    -.5338663     -.40459
       _cons |   39.04992   .0161171  2422.89   0.000     39.01833    39.08151
-------------+----------------------------------------------------------------
     sigma_u |  .39512117
     sigma_e |  2.1303179
         rho |  .03325698   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(389, 31098) =     1.40          Prob > F = 0.0000

Same result, without the fixed coefficient output for each cluster

proc glm data=nc.data_final;
  absorb b_group;
  model ga_clean = race;
quit;

The GLM Procedure
Dependent Variable: ga_clean

Source                DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                390         5064.7562        12.9866       2.86    <.0001
Error              31098       141130.6344         4.5383
Corrected Total    31488       146195.3906

R-Square     Coeff Var      Root MSE    ga_clean Mean
0.034644      5.476830      2.130318         38.89692

Source      DF      Type I SS      Mean Square    F Value    Pr > F
b_group    389    4145.980755        10.658048       2.35    <.0001
race         1     918.775440       918.775440     202.45    <.0001

Source      DF    Type III SS      Mean Square    F Value    Pr > F
race         1    918.7754401      918.7754401     202.45    <.0001

Parameter        Estimate    Standard Error    t Value    Pr > |t|
race         -.4692281302        0.03297796     -14.23      <.0001

Comparison to Conventional Regression
Use cluster-robust SEs to account for complex error (individual and cluster)

reg ga_clean race, vce(cl b_group)

Linear regression                                      Number of obs =   31489
                                                       F(  1,   389) =  351.42
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0177
                                                       Root MSE      =  2.1356

                              (Std. Err. adjusted for 390 clusters in b_group)
------------------------------------------------------------------------------
             |               Robust
    ga_clean |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |  -.6112235   .0326053   -18.75   0.000    -.6753281   -.5471189
       _cons |   39.09623    .014152  2762.59   0.000      39.0684    39.12405
------------------------------------------------------------------------------

Crude effect: -0.61 weeks
Adjusting for neighborhood: -0.47 weeks
Neighborhood explained 23% of the racial disparity (assuming there are no confounders of neighborhood)

Can use surveyreg for cluster-robust SEs in SAS

proc surveyreg data=nc.data_final;
  cluster b_group;
  class b_group;
  model ga_clean = race / clparm solution;
run;

The SURVEYREG Procedure
Regression Analysis for Dependent Variable ga_clean
Estimated Regression Coefficients

Parameter      Estimate    Standard Error    t Value    Pr > |t|    95% Confidence Interval
Intercept    39.0962254        0.01415202    2762.59      <.0001    39.0684014    39.1240495
race         -0.6112235        0.03260526     -18.75      <.0001    -0.6753281    -0.5471189

NOTE: The denominator degrees of freedom for the t tests is 389.

Logistic Model for Binary Outcome

logit ptb_total i.race i.b_group, or

Logistic regression                               Number of obs   =      31157
                                                  LR chi2(367)    =     687.08
                                                  Prob > chi2     =     0.0000
Log likelihood = -9091.5835                       Pseudo R2       =     0.0364

------------------------------------------------------------------------------
   ptb_total | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |   1.614217   .0849715     9.10   0.000     1.455979    1.789652
 _Ib_group_2 |   .7499313   .4644189    -0.46   0.642     .2227859    2.524383
 _Ib_group_3 |   .7817543   .4527046    -0.43   0.671     .2512752    2.432154
 _Ib_group_4 |   1.837389   1.208127     0.93   0.355      .506426     6.66632
 _Ib_group_5 |   1.012978   .6832426     0.02   0.985     .2700685    3.799497
 _Ib_group_6 |   1.871619   1.284711     0.91   0.361     .4874589    7.186159
 _Ib_group_7 |   .6577316    .504969    -0.55   0.585     .1460644    2.961781
 _Ib_group_8 |   4.113144   2.752025     2.11   0.035     1.108285    15.26498
 _Ib_group_9 |   .6818469   .7794273    -0.34   0.738     .0725551    6.407749
_Ib_group_10 |   1.130029   1.000033     0.14   0.890     .1994381     6.40281

Accounting for neighborhood differences (within-neighborhood inference), the odds of PTB are 1.61 times as high for Black as for White infants
23 clusters with 332 observations (31489 - 31157) were dropped because of non-varying outcomes (all 0 or all 1), which would require division by 0 for an OR

margins race, vce(unconditional) post

Predictive margins                                Number of obs   =      31157
Expression   : Pr(ptb_total), predict()
                             (Std. Err. adjusted for 367 clusters in b_group2)
------------------------------------------------------------------------------
             |            Unconditional
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |
          0  |   .0753146   .0021845    34.48   0.000      .071033    .0795962
          1  |   .1154796   .0039339    29.35   0.000     .1077693      .12319
------------------------------------------------------------------------------

Risk Difference
. lincom _b[1.race] - _b[0.race]

 ( 1)  - 0bn.race + 1.race = 0
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |    .040165   .0046569     8.62   0.000     .0310377    .0492924
------------------------------------------------------------------------------

Risk Ratio
. nlcom _b[1.race] / _b[0.race]

       _nl_1:  _b[1.race] / _b[0.race]
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _nl_1 |   1.533297   .0713801    21.48   0.000     1.393394    1.673199
------------------------------------------------------------------------------
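As a quick check on the post-estimation step, the risk difference and risk ratio are simple functions of the two predictive margins reported above. The Python sketch below reproduces only the point estimates; it makes no attempt at the standard errors, which Stata obtains from the unconditional variance estimate (and, for the ratio, the delta method). Variable names are illustrative.

```python
# Predictive margins (average predicted probability of PTB) from the output above
risk_white = 0.0753146  # race = 0
risk_black = 0.1154796  # race = 1

risk_difference = risk_black - risk_white  # what lincom computes
risk_ratio = risk_black / risk_white       # what nlcom computes

print(f"RD = {risk_difference:.6f}")
print(f"RR = {risk_ratio:.4f}")
```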

proc logistic data=nc.data_final;
  class b_group;
  model ptb_total (desc) = race b_group;
run;

Odds Ratio Estimates
Effect                                   Point Estimate    95% Wald Confidence Limits
race                                              1.614         1.456       1.790
b_group 370630001011 vs 371830544023              2.118         0.387      11.580
b_group 370630001012 vs 371830544023              1.588         0.314       8.034
b_group 370630001021 vs 371830544023              1.656         0.347       7.897
b_group 370630002001 vs 371830544023              3.891         0.727      20.828
b_group 370630002002 vs 371830544023              2.145         0.390      11.787
b_group 370630002003 vs 371830544023              3.964         0.709      22.160
b_group 370630003011 vs 371830544023              1.393         0.219       8.850
b_group 370630003012 vs 371830544023              8.710         1.599      47.453
b_group 370630003013 vs 371830544023              1.444         0.120      17.316
b_group 370630003021 vs 371830544023              2.393         0.311      18.388
b_group 370630003022 vs 371830544023              1.250         0.198       7.881
b_group 370630003023 vs 371830544023              2.515         0.431      14.684
b_group 370630004011 vs 371830544023              1.409         0.187      10.615
b_group 370630004012 vs 371830544023              2.206         0.347      14.033
…↓

Would need to use SUDAAN or binomial/Poisson models for RD or RR in SAS

xtlogit ptb_total race, i(b_group) fe or
clogit ptb_total race, group(b_group) vce(cl b_group) or

note: multiple positive outcomes within groups encountered.
note: 23 groups (332 obs) dropped because of all positive or all negative outcomes.

Iteration 0:   log pseudolikelihood = -8466.3107
Iteration 1:   log pseudolikelihood = -8456.9979
Iteration 2:   log pseudolikelihood = -8456.9975

Conditional (fixed-effects) logistic regression   Number of obs   =      31157
                                                  Wald chi2(1)    =      84.25
                                                  Prob > chi2     =     0.0000
Log pseudolikelihood = -8456.9975                 Pseudo R2       =     0.0047

                              (Std. Err. adjusted for 367 clusters in b_group)
------------------------------------------------------------------------------
             |               Robust
   ptb_total | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      1.race |   1.604089   .0825837     9.18   0.000     1.450126    1.774398
------------------------------------------------------------------------------

The conditional approach is recommended for non-linear models because the dummy-variable approach suffers from the incidental parameters problem, which biases estimates upward (away from the null). This is mainly a problem for small clusters, so the two estimates are not too different in this sample (avg cluster size ~80): 1.61 versus 1.60

proc logistic data=nc.data_final;
  strata b_group;
  model ptb_total (desc) = race;
run;

The LOGISTIC Procedure
Conditional Analysis

Model Fit Statistics
Criterion    Without Covariates    With Covariates
AIC                   16994.399          16915.995
SC                    16994.399          16924.352
-2 Log L              16994.399          16913.995

Testing Global Null Hypothesis: BETA=0
Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio       80.4035     1        <.0001
Score                  82.2342     1        <.0001
Wald                   81.7008     1        <.0001

Analysis of Maximum Likelihood Estimates
Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
race          1      0.4726            0.0523            81.7008        <.0001

Odds Ratio Estimates
Effect    Point Estimate    95% Wald Confidence Limits
race               1.604         1.448       1.777

Comparison to Conventional Regression

logit ptb_total race, vce(cl b_group) or

Logistic regression                               Number of obs   =      31489
                                                  Wald chi2(1)    =     278.47
                                                  Prob > chi2     =     0.0000
Log pseudolikelihood = -9312.2722                 Pseudo R2       =     0.0163

                              (Std. Err. adjusted for 390 clusters in b_group)
------------------------------------------------------------------------------
             |               Robust
   ptb_total | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |    2.02922   .0860517    16.69   0.000      1.86738    2.205085
------------------------------------------------------------------------------

Crude OR: 2.03
Adjusting for neighborhood: 1.60
Neighborhood explained ~40% of the racial disparity in PTB (assuming there are no confounders of neighborhood)
N.B. For percent change in an OR, always subtract the null (1.0) first so that you compare excess odds: (0.604 - 1.029)/1.029 = -0.41, a drop of 41% after controlling for contextual differences
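The "percent explained" arithmetic for both outcomes can be sketched in a few lines of Python. The helper `percent_change` is hypothetical (not part of Stata or SAS); it takes the null value of the effect measure (0 for a difference in weeks, 1 for an odds ratio) and is fed the crude and fixed-effects estimates reported in the output above.

```python
def percent_change(crude, adjusted, null):
    """Proportional change in the effect, measured from the null value."""
    return (adjusted - crude) / (crude - null)

# Gestational age (difference in weeks; the null value is 0)
ga_change = percent_change(crude=-0.6112235, adjusted=-0.4692281, null=0.0)

# Preterm birth (odds ratio; the null value is 1)
ptb_change = percent_change(crude=2.02922, adjusted=1.604089, null=1.0)

# Forgetting to subtract the null understates the change in the OR
ptb_naive = (1.604089 - 2.02922) / 2.02922

print(f"GA disparity:  {ga_change:.1%}")
print(f"PTB excess OR: {ptb_change:.1%}")
print(f"PTB naive:     {ptb_naive:.1%}")
```

The naive calculation suggests a drop of about 21%, roughly half the 41% obtained on the excess-odds scale, which is why the null must be subtracted first.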

proc surveylogistic data=nc.data_final;
  cluster b_group;
  model ptb_total (desc) = race;
run;

Testing Global Null Hypothesis: BETA=0
Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio      308.0644     1        <.0001
Score                 324.9556     1        <.0001
Wald                  278.4631     1        <.0001

Analysis of Maximum Likelihood Estimates
Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept     1     -2.6016            0.0285          8352.0534        <.0001
race          1      0.7077            0.0424           278.4631        <.0001

Odds Ratio Estimates
Effect    Point Estimate    95% Wald Confidence Limits
race               2.029         1.867       2.205

Association of Predicted Probabilities and Observed Responses
Percent Concordant        32.9    Somers' D    0.167
Percent Discordant        16.2    Gamma        0.340
Percent Tied              50.8    Tau-a        0.027
Pairs                 80536248    c            0.584

Fixed Effects: Benefits & Disadvantages • Benefits: • Provides within-cluster effects that are not confounded by cluster-level factors because all cluster variation is removed (accounts for unobservable confounding) • No minimum number of clusters • Disadvantages: • Does not allow estimation of observable cluster-level effects, so it is seen more often in longitudinal analyses, where between-cluster effects may not be of interest • Can be inefficient/less precise because it uses fewer degrees of freedom (each cluster counts as a parameter) and exploits only one level of variation

Random Effects • Alternative to fixed effects that models only one additional parameter (instead of k-1) by making greater assumptions • More efficient but vulnerable to bias • Average cluster-specific intercept with the cluster-level variance estimated (τ₀²) • Accounts for variability in the outcome across neighborhoods but assumes the random intercept is uncorrelated with the covariates: corr(μ₀ⱼ, xᵢⱼ) = 0 • Allows estimates of variance at both levels and of cluster-level covariates because the cluster-level variance isn’t completely removed from the model

xtreg ga_clean race, i(b_group) re mle

Random-effects ML regression                    Number of obs      =     31489
Group variable: b_group                         Number of groups   =       390

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =      80.7
                                                               max =       652

                                                LR chi2(1)         =    389.27
Log likelihood  = -68563.638                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    ga_clean |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |  -.5883143    .027936   -21.06   0.000    -.6430679   -.5335606
       _cons |   39.07689   .0180206  2168.46   0.000     39.04157    39.11221
-------------+----------------------------------------------------------------
    /sigma_u |   .1364134    .021792                      .0997419    .1865676
    /sigma_e |   2.131475   .0085433                      2.114796    2.148285
         rho |   .0040792   .0013011                      .0021387    .0074679
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01)=   18.07 Prob>=chibar2 = 0.000

Cluster-specific, within-neighborhood interpretation, but the estimate (-0.59) is significantly larger in magnitude than the FE estimate of -0.47
Intracluster Correlation = 0.004 (proportion of variance that occurs at the neighborhood level)
0.136²/(0.136² + 2.13²) = 0.004
- Significant neighborhood variation but a small fraction of overall variability (0.4%)

proc glimmix data=nc.data_final method=quad;
  class b_group;
  model ga_clean = race / solution;
  random intercept / subject=b_group;
run;

The GLIMMIX Procedure

Optimization Information
Optimization Technique         Dual Quasi-Newton
Parameters in Optimization     4
Lower Boundaries               2
Upper Boundaries               0
Fixed Effects                  Not Profiled
Starting From                  GLM estimates
Quadrature Points              1

Covariance Parameter Estimates
Cov Parm     Subject    Estimate    Standard Error
Intercept    b_group     0.01861          0.005946
Residual                  4.5432           0.03642

Solutions for Fixed Effects
Effect       Estimate    Standard Error       DF    t Value    Pr > |t|
Intercept     39.0769           0.01802      389    2168.69      <.0001
race          -0.5883           0.02793    31098     -21.06      <.0001

Type III Tests of Fixed Effects
Effect    Num DF    Den DF    F Value    Pr > F
race           1     31098     443.54    <.0001

xtlogit ptb_total race, i(b_group) re or

Random-effects logistic regression              Number of obs      =     31489
Group variable: b_group                         Number of groups   =       390

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =      80.7
                                                               max =       652

                                                Wald chi2(1)       =    265.00
Log likelihood  = -9310.5391                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
   ptb_total |         OR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |   1.992814   .0844133    16.28   0.000     1.834049    2.165323
-------------+----------------------------------------------------------------
    /lnsig2u |  -4.059262   .6231052                     -5.280526   -2.837999
-------------+----------------------------------------------------------------
     sigma_u |    .131384    .040933                      .0713425     .241956
         rho |   .0052196   .0032354                      .0015447    .0174837
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =     3.47 Prob >= chibar2 = 0.031

Cluster-specific, within-neighborhood interpretation, but the estimate (1.99) is significantly higher than the FE estimate of 1.60
Intracluster Correlation = 0.005 (most of the variance occurs within neighborhoods at the individual level, 99.5%, rather than between neighborhoods at the neighborhood level, 0.5%)
0.1314²/(0.1314² + π²/3) = 0.005
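The two intracluster correlations can be reproduced from the reported standard deviations. The sketch below uses a hypothetical helper, `icc`, and for the logistic model takes the level-1 variance to be π²/3 (the variance of the standard logistic distribution, which is the latent-variable convention behind Stata's rho).

```python
import math

def icc(sigma_u, level1_variance):
    """Share of total variance that lies between clusters."""
    between = sigma_u ** 2
    return between / (between + level1_variance)

# Linear random-intercept model: level-1 variance is sigma_e squared
icc_linear = icc(sigma_u=0.1364134, level1_variance=2.131475 ** 2)

# Logistic random-intercept model: level-1 variance fixed at pi^2 / 3
icc_logit = icc(sigma_u=0.131384, level1_variance=math.pi ** 2 / 3)

print(f"ICC, gestational age: {icc_linear:.4f}")
print(f"ICC, preterm birth:   {icc_logit:.4f}")
```

Both values match the rho rows in the Stata output (about 0.004 and 0.005), confirming that the standard deviations must be squared before forming the ratio.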

proc glimmix data=nc.data_final method=quad;
  class b_group;
  model ptb_total (descending) = race / solution dist=bin link=logit oddsratio;
  random intercept / subject=b_group;
run;

Covariance Parameter Estimates
Cov Parm     Subject    Estimate    Standard Error
Intercept    b_group     0.01746           0.01081

Solutions for Fixed Effects
Effect       Estimate    Standard Error       DF    t Value    Pr > |t|
Intercept     -2.5945           0.02907      389     -89.26      <.0001
race           0.6894           0.04238    31098      16.27      <.0001

Odds Ratio Estimates
  race     _race    Estimate       DF    95% Confidence Limits
1.3261    0.3261       1.992    31098       1.834       2.165

Type III Tests of Fixed Effects
Effect    Num DF    Den DF    F Value    Pr > F
race           1     31098     264.60    <.0001

Random Intercept + Slope • Also possible to allow random normal variation in a covariate effect across neighborhoods, e.g. allowing the racial disparity to vary by neighborhood

xtmixed ga_clean race || b_group: race, mle

Mixed-effects ML regression                     Number of obs      =     31489
Group variable: b_group                         Number of groups   =       390

                                                Obs per group: min =         1
                                                               avg =      80.7
                                                               max =       652

                                                Wald chi2(1)       =    304.72
Log likelihood = -68537.949                     Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    ga_clean |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |  -.6060449   .0347181   -17.46   0.000    -.6740911   -.5379988
       _cons |   39.09577   .0147177  2656.37   0.000     39.06693    39.12462
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
b_group: Independent         |
                    sd(race) |   .3447168   .0374273      .2786404    .4264626
                   sd(_cons) |   .0221739   .0600709      .0001096    4.485535
-----------------------------+------------------------------------------------
                sd(Residual) |   2.127355   .0085317      2.110699    2.144143
------------------------------------------------------------------------------
LR test vs. linear regression:       chi2(2) =    69.45   Prob > chi2 = 0.0000
Note: LR test is conservative and provided only for reference.

• There appears to be significant variation in the disparity across neighborhoods, but the point estimate (the average within-neighborhood disparity, -0.61) appears biased relative to the FE estimate (-0.47)

proc glimmix data=nc.data_final method=quad;
  class b_group;
  model ga_clean = race / solution;
  random intercept race / subject=b_group;
run;

The GLIMMIX Procedure

Covariance Parameter Estimates
Cov Parm     Subject    Estimate    Standard Error
Intercept    b_group    0.000494          0.003077
race         b_group      0.1188           0.02584
Residual                  4.5256           0.03631

Solutions for Fixed Effects
Effect       Estimate    Standard Error     DF    t Value    Pr > |t|
Intercept     39.0958           0.01498    389    2609.64      <.0001
race          -0.6060           0.03488    352     -17.38      <.0001

Type III Tests of Fixed Effects
Effect    Num DF    Den DF    F Value    Pr > F
race           1       352     301.93    <.0001

xtmelogit ptb_total race || b_group: race, or

Mixed-effects logistic regression               Number of obs      =     31489
Group variable: b_group                         Number of groups   =       390

                                                Obs per group: min =         1
                                                               avg =      80.7
                                                               max =       652

Integration points =   7                        Wald chi2(1)       =    255.50
Log likelihood = -9310.2914                     Prob > chi2        =    0.0000

------------------------------------------------------------------------------
   ptb_total | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |   1.991382   .0858163    15.98   0.000     1.830093    2.166887
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
b_group: Independent         |
                    sd(race) |   .1218997   .0899226      .0287138    .5175045
                   sd(_cons) |   .1101542   .0562791       .040468    .2998403
------------------------------------------------------------------------------
LR test vs. logistic regression:     chi2(2) =     3.96   Prob > chi2 = 0.1380
Note: LR test is conservative and provided only for reference.

No indication of significant neighborhood variation in the PTB racial disparity; the average neighborhood-specific disparity (1.99) is biased relative to FE (1.60)

proc glimmix data=nc.data_final method=quad;
  class b_group;
  model ptb_total (descending) = race / solution dist=bin link=logit oddsratio;
  random intercept race / subject=b_group;
run;

Covariance Parameter Estimates
Cov Parm     Subject    Estimate    Standard Error
Intercept    b_group     0.01230           0.01248
race         b_group     0.01495           0.02201

Solutions for Fixed Effects
Effect       Estimate    Standard Error     DF    t Value    Pr > |t|
Intercept     -2.5967           0.02875    389     -90.31      <.0001
race           0.6886           0.04312    352      15.97      <.0001

Odds Ratio Estimates
  race     _race    Estimate     DF    95% Confidence Limits
1.3261    0.3261       1.991    352       1.829       2.167

Type III Tests of Fixed Effects
Effect    Num DF    Den DF    F Value    Pr > F
race           1       352     255.04    <.0001

When will RE approximate FE? • When there is no between-cluster confounding • No clustering of X • No variation in outcome by cluster • Even when confounding is present, RE can still approximate FE under certain conditions • Normally, a composite of within and between effects but weighted toward the within-effect when it is more precise • Large cluster size • High ICC
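One way to see why large clusters and a high ICC pull RE toward FE is the quasi-demeaning weight used by the GLS random-effects estimator for a balanced panel, θ = 1 - sqrt(σ_e² / (σ_e² + n·σ_u²)): θ = 0 gives pooled OLS and θ = 1 gives the FE (within) estimator. The Python sketch below is only an approximation for these data, since it plugs in the ML variance estimates and the average cluster size even though the clusters are unbalanced; the second call uses hypothetical values chosen to give an ICC of 0.99.

```python
import math

def re_theta(sigma_u, sigma_e, n_per_cluster):
    """Quasi-demeaning weight of the GLS random-effects estimator (balanced case)."""
    var_u = sigma_u ** 2
    var_e = sigma_e ** 2
    return 1 - math.sqrt(var_e / (var_e + n_per_cluster * var_u))

# Gestational-age model: ICC of about 0.004, average cluster size 80.7
theta_nc = re_theta(sigma_u=0.1364134, sigma_e=2.131475, n_per_cluster=80.7)

# Hypothetical high-ICC setting: ICC = 0.99 with only 10 observations per cluster
theta_high_icc = re_theta(sigma_u=math.sqrt(0.99), sigma_e=math.sqrt(0.01),
                          n_per_cluster=10)

print(f"theta, NC births: {theta_nc:.3f}")        # near 0: close to pooled OLS
print(f"theta, ICC=0.99:  {theta_high_icc:.3f}")  # near 1: close to FE
```

With an ICC this low, even an average of about 80 births per neighborhood leaves θ near 0.13, which is consistent with the RE estimate (-0.59) sitting much closer to the crude estimate (-0.61) than to the FE estimate (-0.47).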

Example • Most of the variation is between rather than within cluster (ICC=0.99), so the within-effect is going to be very precise (little variability)

----------------------------------------------------------------
    Variable |    ols       olsc       gee        fe        re
-------------+--------------------------------------------------
           x |  -2.2152   -2.2152     2.0552    2.0563    2.0503
             |    0.799     1.189      0.044     0.045     0.054
       _cons | 166.0880  166.0880    67.3646   67.3404   67.4793
             |   18.868    28.958     13.615     1.049     7.058
----------------------------------------------------------------
                                                    legend: b/se

For most multilevel neighborhood studies, ICC is quite low (<10%) so only a huge cluster size could compensate to get valid within-cluster estimates in the presence of neighborhood confounding • We can also control for observable cluster-level factors but there are many factors that may not be measured or imperfectly measured • e.g. built environment, air quality/toxins, health care access/quality, fresh foods, social cohesion • So it can be rare to get the FE estimate, which controls for all factors--observed and unobserved, by controlling only for a few observed factors in a RE model

xtreg ga_clean race poverty, i(b_group) re mle

Random-effects ML regression                    Number of obs      =     31489
Group variable: b_group                         Number of groups   =       390

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =      80.7
                                                               max =       652

                                                LR chi2(2)         =    407.38
Log likelihood  =  -68554.58                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
    ga_clean |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |  -.5368362    .030145   -17.81   0.000    -.5959193    -.477753
     poverty |  -.0064216   .0015111    -4.25   0.000    -.0093832   -.0034599
       _cons |    39.1246   .0205574  1903.19   0.000     39.08431    39.16489
-------------+----------------------------------------------------------------
    /sigma_u |   .1304851   .0218083                      .0940368    .1810607
    /sigma_e |   2.131108   .0085401                      2.114436    2.147913
         rho |    .003735   .0012466                      .0018992     .007028
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01)=   15.89 Prob>=chibar2 = 0.000

So controlling for poverty moves us closer to the within-cluster effect but doesn’t control for all important neighborhood factors
Crude: -0.61
FE: -0.47
Controlling for neighborhood poverty: -0.54
Explains about half of the neighborhood contribution (0.07/0.14) and 11.5% of the overall disparity (0.07/0.61)

Hausman test for consistency in estimates from FE and RE models

hausman ga_clean_fe ga_clean_re_poverty

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             | ga_clean_fe  ga_clean_r~y     Difference          S.E.
-------------+----------------------------------------------------------------
        race |   -.4692281    -.5500752         .080847        .0156404
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(1) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =       26.72
                Prob>chi2 =      0.0000

Rejects the null of equivalence between the FE and RE estimators
**The RE estimate here comes from a RE model based on generalized least squares, not maximum likelihood

xtlogit ptb_total race poverty, i(b_group) re or

Random-effects logistic regression              Number of obs      =     31489
Group variable: b_group                         Number of groups   =       390

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =      80.7
                                                               max =       652

                                                Wald chi2(2)       =    316.05
Log likelihood  = -9297.2331                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
   ptb_total |         OR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |   1.804966   .0834998    12.77   0.000      1.64851    1.976272
     poverty |   1.010437   .0020192     5.20   0.000     1.006487    1.014403
-------------+----------------------------------------------------------------
    /lnsig2u |   -4.32276   .7583429                     -5.809085   -2.836435
-------------+----------------------------------------------------------------
     sigma_u |   .1151661   .0436677                      .0547739    .2421452
         rho |   .0040153   .0030328                      .0009111    .0175106
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =     2.21 Prob >= chibar2 = 0.068

So controlling for poverty moves us closer to the within-cluster effect but doesn’t control for all important neighborhood factors
Crude: 2.03
FE: 1.60
Controlling for neighborhood poverty: 1.80
Explains about half of the neighborhood contribution (0.23/0.43) and about 22% of the overall disparity (0.23/1.03)

Hausman test for consistency in estimates from FE and RE models

hausman ptb_fe ptb_re_poverty

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |    ptb_fe    ptb_re_pov~y     Difference          S.E.
-------------+----------------------------------------------------------------
        race |    .4725559      .590542       -.1179861        .0243549
------------------------------------------------------------------------------
                         b = consistent under Ho and Ha; obtained from xtlogit
          B = inconsistent under Ha, efficient under Ho; obtained from xtlogit

    Test:  Ho:  difference in coefficients not systematic

                  chi2(1) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =       23.47
                Prob>chi2 =      0.0000

Rejects the null of equivalence between the FE and RE estimators
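With a single coefficient, the Hausman statistic reduces to the squared ratio of the FE-RE difference to its standard error, referred to a chi-square distribution with 1 degree of freedom. The Python sketch below recomputes both tests from the Difference and S.E. columns reported in the two Hausman outputs; the function name `hausman_1df` is illustrative, and the p-value uses the identity P(χ²₁ > x) = erfc(sqrt(x/2)).

```python
import math

def hausman_1df(difference, se_difference):
    """Hausman chi-square statistic and p-value for a single coefficient."""
    chi2 = (difference / se_difference) ** 2
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return chi2, p_value

# Gestational age: FE versus RE (GLS) with neighborhood poverty
chi2_ga, p_ga = hausman_1df(0.080847, 0.0156404)

# Preterm birth: conditional FE logit versus RE logit with neighborhood poverty
chi2_ptb, p_ptb = hausman_1df(-0.1179861, 0.0243549)

print(f"GA:  chi2(1) = {chi2_ga:.2f}, p = {p_ga:.1e}")
print(f"PTB: chi2(1) = {chi2_ptb:.2f}, p = {p_ptb:.1e}")
```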

Random Effects: Benefits & Disadvantages • Benefits: • Ability to estimate covariates both within and between cluster (level 1 and 2 effects) • Ability to partition variance at multiple levels • Examine variation in effects across cluster • Efficient/parsimonious • Disadvantages: • Within-cluster effects can be significantly biased • Requires ~30 clusters for estimation of cluster variance with random normal assumption

GEE
• Handles clustered data by treating the complex error structure as a nuisance, rather than explicitly controlling for it (FE) or modeling it as a quantity of interest (RE)
• Within-cluster correlation specified as
  - Independent (robust SEs, point estimates unchanged)
  - Exchangeable (similar to RE point estimates)
  - Unstructured (allows variation in correlation)
• Inference is population-averaged rather than cluster-specific (the two differ only for the odds ratio, which is not collapsible: the average of the cluster-specific ORs ≠ the overall OR)
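The non-collapsibility of the odds ratio can be seen numerically. The sketch below is an illustration under stated assumptions, not part of the original analysis: it assumes a logistic model with a normal random intercept, a hypothetical baseline log odds of -2, and a cluster-specific OR of 2. It averages the cluster-specific probabilities over the random intercept on a simple grid and forms the population-averaged OR from those averaged probabilities.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def marginal_prob(eta, sigma_u, n=2001, width=8.0):
    """Average expit(eta + u) over u ~ Normal(0, sigma_u^2) on a midpoint grid."""
    if sigma_u == 0:
        return expit(eta)
    lo = -width * sigma_u
    step = 2 * width * sigma_u / n
    total = 0.0
    weight_sum = 0.0
    for k in range(n):
        u = lo + (k + 0.5) * step
        w = math.exp(-0.5 * (u / sigma_u) ** 2)  # unnormalized normal density
        total += w * expit(eta + u)
        weight_sum += w
    return total / weight_sum

def marginal_or(intercept, log_or, sigma_u):
    """Population-averaged OR implied by a cluster-specific log OR."""
    p0 = marginal_prob(intercept, sigma_u)
    p1 = marginal_prob(intercept + log_or, sigma_u)
    return (p1 / (1 - p1)) / (p0 / (1 - p0))

cluster_specific_or = 2.0   # hypothetical conditional effect
intercept = -2.0            # hypothetical baseline log odds

for sigma_u in (0.115, 1.0, 2.0):
    pa = marginal_or(intercept, math.log(cluster_specific_or), sigma_u)
    print(f"sigma_u = {sigma_u:5.3f}  population-averaged OR = {pa:.3f}")
```

With a small random-intercept SD the two odds ratios are nearly identical. As the between-cluster variance grows, the population-averaged OR moves toward 1 even though every cluster has the same cluster-specific OR.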

xtreg ga_clean race, i(b_group) pa corr(ind) vce(robust)
xtgee ga_clean race, corr(ind) vce(robust)

Iteration 1: tolerance = 6.133e-15

GEE population-averaged model                   Number of obs      =     31489
Group variable:                    b_group      Number of groups   =       390
Link:                             identity      Obs per group: min =         1
Family:                           Gaussian                     avg =      80.7
Correlation:                   independent                     max =       652
                                                Wald chi2(1)       =    351.43
Scale parameter:                  4.560647      Prob > chi2        =    0.0000

Pearson chi2(31489):             143610.20      Deviance           = 143610.20
Dispersion (Pearson):             4.560647      Dispersion         =  4.560647

                               (Std. Err. adjusted for clustering on b_group)
------------------------------------------------------------------------------
             |             Semirobust
    ga_clean |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |  -.6112235   .0326047   -18.75   0.000    -.6751276   -.5473194
       _cons |   39.09623   .0141518  2762.63   0.000     39.06849    39.12396
------------------------------------------------------------------------------

Results are very similar to OLS regression with cluster-robust SEs

xtreg ga_clean race, i(b_group) pa corr(exc) vce(robust)
xtgee ga_clean race, corr(exc) vce(robust)

Iteration 1: tolerance = .00997269
Iteration 2: tolerance = .00071555
Iteration 3: tolerance = .00004882
Iteration 4: tolerance = 3.319e-06
Iteration 5: tolerance = 2.256e-07

GEE population-averaged model                   Number of obs      =     31489
Group variable:                    b_group      Number of groups   =       390
Link:                             identity      Obs per group: min =         1
Family:                           Gaussian                     avg =      80.7
Correlation:                  exchangeable                     max =       652
                                                Wald chi2(1)       =    350.49
Scale parameter:                  4.560803      Prob > chi2        =    0.0000

                               (Std. Err. adjusted for clustering on b_group)
------------------------------------------------------------------------------
             |             Semirobust
    ga_clean |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |  -.5939304   .0317245   -18.72   0.000    -.6561093   -.5317514
       _cons |   39.08105    .013988  2793.91   0.000     39.05364    39.10847
------------------------------------------------------------------------------

Results are very similar to the random intercept model with cluster-robust SEs

proc genmod data=nc.data_final;
  class b_group;
  model ga_clean = race;
  repeated subject=b_group / corr=ind;
run;

Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

                       Standard     95% Confidence
Parameter   Estimate      Error         Limits             Z    Pr > |Z|
Intercept    39.0962     0.0141    39.0685   39.1239   2766.18    <.0001
race         -0.6112     0.0326    -0.6750   -0.5474    -18.77    <.0001

proc genmod data=nc.data_final;
  class b_group;
  model ga_clean = race;
  repeated subject=b_group / corr=exc;
run;

Exchangeable Working Correlation
Correlation    0.0028899011

Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

                       Standard     95% Confidence
Parameter   Estimate      Error         Limits             Z    Pr > |Z|
Intercept    39.0811     0.0140    39.0537   39.1084   2797.50    <.0001
race         -0.5939     0.0317    -0.6560   -0.5318    -18.75    <.0001

xtlogit ptb_total race, i(b_group) pa corr(ind) vce(robust) or
xtgee ptb_total race, fam(bin) link(logit) corr(ind) vce(robust) eform

Iteration 1: tolerance = 1.545e-07

GEE population-averaged model                   Number of obs      =     31489
Group variable:                    b_group      Number of groups   =       390
Link:                                logit      Obs per group: min =         1
Family:                           binomial                     avg =      80.7
Correlation:                   independent                     max =       652
                                                Wald chi2(1)       =    278.47
Scale parameter:                         1      Prob > chi2        =    0.0000

Pearson chi2(31489):              31489.00      Deviance           =  18624.54
Dispersion (Pearson):                    1      Dispersion         =  .5914619

                               (Std. Err. adjusted for clustering on b_group)
------------------------------------------------------------------------------
             |             Semirobust
   ptb_total | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |    2.02922   .0860517    16.69   0.000     1.867381    2.205086
------------------------------------------------------------------------------

Results are very similar to logistic regression with cluster-robust SEs

xtlogit ptb_total race, i(b_group) pa corr(exc) vce(robust) or
xtgee ptb_total race, fam(bin) link(logit) corr(exc) vce(robust) eform

Iteration 1: tolerance = .00895091
Iteration 2: tolerance = .00071972
Iteration 3: tolerance = .00006077
Iteration 4: tolerance = 5.223e-06
Iteration 5: tolerance = 4.495e-07

GEE population-averaged model                   Number of obs      =     31489
Group variable:                    b_group      Number of groups   =       390
Link:                                logit      Obs per group: min =         1
Family:                           binomial                     avg =      80.7
Correlation:                  exchangeable                     max =       652
                                                Wald chi2(1)       =    269.83
Scale parameter:                         1      Prob > chi2        =    0.0000

                               (Std. Err. adjusted for clustering on b_group)
------------------------------------------------------------------------------
             |             Semirobust
   ptb_total | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |   1.995782   .0839587    16.43   0.000     1.837828    2.167313
------------------------------------------------------------------------------

Results are very similar to RE logistic regression with cluster-robust SEs
ICC is so low that marginal OR ≈ cluster-specific OR
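A rough calculation shows why a low ICC implies nearly identical marginal and cluster-specific odds ratios. The sketch below makes two assumptions. First, it borrows sigma_u from the poverty-adjusted random-effects logit output earlier in this section as a stand-in, since the race-only RE variance is not shown here. Second, it uses the approximation from Zeger, Liang and Albert (1988) for the attenuation of a logistic coefficient under a normal random intercept. It also recomputes the latent-scale ICC, which takes pi^2/3 as the level-1 variance.

```python
import math

# Random-intercept SD from the poverty-adjusted RE logit output (stand-in value)
sigma_u = 0.1151661
var_u = sigma_u ** 2

# Latent-variable ICC for a random-intercept logistic model:
# the standard logistic distribution has variance pi^2 / 3
rho = var_u / (var_u + math.pi ** 2 / 3)

# Approximate attenuation of the marginal coefficient:
# beta_marginal ~= beta_cluster_specific / sqrt(1 + c^2 * var_u),
# with c = 16 * sqrt(3) / (15 * pi)
c_squared = (16 * math.sqrt(3) / (15 * math.pi)) ** 2
attenuation = 1 / math.sqrt(1 + c_squared * var_u)

print(f"ICC (latent scale)  = {rho:.7f}")
print(f"attenuation factor  = {attenuation:.4f}")
```

The recomputed ICC agrees with the rho reported in that output, and the attenuation factor is within a fraction of a percent of 1, so the population-averaged and cluster-specific log odds ratios are nearly the same for these data.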

proc genmod data=nc.data_final;
  class b_group;
  model ptb_total = race / dist=bin link=logit;
  repeated subject=b_group / corr=ind;
  estimate 'race or' race -11 / exp;
run;

Contrast Estimate Results

                    Mean         Mean             L'Beta   Standard              L'Beta
Label           Estimate   Confidence Limits    Estimate      Error   Alpha   Confidence Limits
race or           0.6699    0.6513    0.6880      0.7077     0.0424    0.05    0.6246    0.7907
Exp(race or)                                      2.0292     0.0859    0.05    1.8676    2.2049

proc genmod data=nc.data_final;
  class b_group;
  model ptb_total = race / dist=bin link=logit;
  repeated subject=b_group / corr=exc;
  estimate 'race or' race -11 / exp;
run;

Contrast Estimate Results

                    Mean         Mean             L'Beta   Standard              L'Beta
Label           Estimate   Confidence Limits    Estimate      Error   Alpha   Confidence Limits
race or           0.6662    0.6476    0.6843      0.6910     0.0420    0.05    0.6087    0.7734
Exp(race or)                                      1.9958     0.0839    0.05    1.8380    2.1671
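The columns of the SAS contrast table are related by simple transformations, which the sketch below checks against the independence-correlation results. It assumes the usual reading of this output: L'Beta is the log odds ratio, the Exp row exponentiates it, the Mean Estimate is the inverse logit of L'Beta, and the limits are Wald intervals. Variable names are illustrative.

```python
import math

l_beta = 0.7077      # L'Beta estimate (log odds ratio) from the corr=ind output
se = 0.0424          # its empirical standard error
z_crit = 1.959964    # two-sided 5% critical value of the standard normal

# Wald confidence limits on the log-odds scale
lower = l_beta - z_crit * se
upper = l_beta + z_crit * se

# The Exp row: odds ratio and its confidence limits
odds_ratio = math.exp(l_beta)
or_lower, or_upper = math.exp(lower), math.exp(upper)

# The Mean Estimate column: inverse logit of L'Beta
mean_estimate = 1 / (1 + math.exp(-l_beta))

print(f"log OR {l_beta:.4f} ({lower:.4f}, {upper:.4f})")
print(f"OR     {odds_ratio:.4f} ({or_lower:.4f}, {or_upper:.4f})")
print(f"mean   {mean_estimate:.4f}")
```

Only the Exp row is of interest for the race comparison. The Mean Estimate column applies the inverse link to the contrast and is not a predicted probability for any group here.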

GEE: Benefits & Disadvantages
• Benefits:
  - Ability to estimate both within- and between-cluster effects and adjust SEs for clustering
  - Examine cross-level interactions
• Disadvantages:
  - Within-cluster effects can be substantially biased
  - Marginal inference leads to effect estimates closer to the null in logistic models (depending on ICC)
  - No variance components

Hybrid Models
• Obtain the appropriate within-neighborhood effect in random effects, GEE, or general cluster-robust models
• Retain the advantages of RE or GEE models without the bias in the within-cluster effects
• Incorporate the cluster mean of the covariate to account for all between-cluster variation related to the covariate (aggregated variable, % Black)
• Centering (subtracting cluster-mean)
• Centering + cluster-mean covariate adjustment
• Cluster-mean covariate adjustment
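The three hybrid options differ only in which derived covariates enter the model. As a minimal sketch on made-up toy data (the helper names and values are hypothetical, not from the course dataset), the code below builds the cluster mean and the cluster-mean-centered covariate that those options use.

```python
from collections import defaultdict

def cluster_means(values, clusters):
    """Return {cluster: mean of values in that cluster}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, c in zip(values, clusters):
        sums[c] += v
        counts[c] += 1
    return {c: sums[c] / counts[c] for c in sums}

def hybrid_terms(values, clusters):
    """Return (centered covariate, cluster-mean covariate), one entry per observation."""
    means = cluster_means(values, clusters)
    cluster_mean = [means[c] for c in clusters]
    centered = [v - m for v, m in zip(values, cluster_mean)]
    return centered, cluster_mean

# Toy data: individual race indicator and neighborhood id (hypothetical)
race = [1, 0, 0, 1, 1, 1, 0, 0]
b_group = ["A", "A", "A", "A", "B", "B", "C", "C"]

race_c, race_bgc = hybrid_terms(race, b_group)

# Option 1: centering only                   -> model uses race_c
# Option 2: centering + cluster mean         -> model uses race_c and race_bgc
# Option 3: cluster-mean adjustment only     -> model uses race and race_bgc
for row in zip(b_group, race, race_bgc, race_c):
    print(row)
```

In this toy example neighborhoods B and C have no within-cluster variation in race, so their centered values are all zero and they contribute nothing to the within-neighborhood estimate.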

egen race_bgc = mean(race), by(b_group)
gen race_c = race - race_bgc
xtreg ga_clean race_c, i(b_group) re mle

Random-effects ML regression                    Number of obs      =     31489
Group variable: b_group                         Number of groups   =       390
Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =      80.7
                                                               max =       652
                                                LR chi2(1)         =    201.73
Log likelihood  = -68657.406                    Prob > chi2        =    0.0000
------------------------------------------------------------------------------
    ga_clean |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      race_c |  -.4692281   .0329833   -14.23   0.000    -.5338742    -.404582
       _cons |   38.85962   .0206466  1882.13   0.000     38.81915    38.90008
-------------+----------------------------------------------------------------
    /sigma_u |    .291513   .0203577                      .2542229    .3342729
    /sigma_e |   2.130663   .0085431                      2.113985    2.147473
         rho |   .0183752   .0025312                      .0139516    .0239413
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01)=  193.82 Prob>=chibar2 = 0.000

Estimate now corresponds to the within-neighborhood effect of race obtained in the FE model (-0.47)
- Note that the ICC is now much larger than in the RE model (0.004 → 0.018)
- Neighborhood variance increased because centering removed the association between race and neighborhood before estimation, so this ICC is the one from a null model
- Other terms in the model (including the random intercept) are not adjusted for the between-neighborhood component of race
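The rho reported for this linear random-intercept model is the share of total variance that lies between neighborhoods. A short check using the two standard deviations from the output above (variable names are illustrative):

```python
sigma_u = 0.291513   # between-neighborhood SD (/sigma_u)
sigma_e = 2.130663   # within-neighborhood residual SD (/sigma_e)

var_between = sigma_u ** 2
var_within = sigma_e ** 2

# Intraclass correlation: between-cluster variance over total variance
rho = var_between / (var_between + var_within)

print(f"between variance = {var_between:.4f}")
print(f"within variance  = {var_within:.4f}")
print(f"ICC (rho)        = {rho:.4f}")
```

The result agrees with the rho line of the Stata output, about 1.8% of the variance in gestational age.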
