Introduction

Multilevel Models in Public Policy Research. Brandon Bartels, GWU Department of Political Science, bartels@gwu.edu. Introduction: Exciting methodological toolkit. Multilevel modeling is not monolithic. There are lots of different types of model specifications that fall under the umbrella.


Presentation Transcript


  1. Multilevel Models in Public Policy Research • Brandon Bartels • GWU Department of Political Science • bartels@gwu.edu

  2. Introduction • Exciting methodological toolkit • Multilevel modeling is not monolithic • There are lots of different types of model specifications that fall under the umbrella. • Various specifications carry different substantive interpretations.

  3. Outline I. Multilevel and hierarchical data II. Motivation and Core Issues III. Modeling approaches IV. Statistical specifications – what you can do with these models V. Applications

  4. Multilevel Data • Contain multiple levels of analysis, with each level consisting of distinct units of analysis. • Most common form of multilevel data: hierarchical data. • Two-level structure: Units from the lowest level of analysis (level-1 units) are nested within units from a higher level of analysis (level-2 units) • Data are “clustered” • Level-2 units are referred to as “clusters” • Three-level structure: Third level is present

  5. Multilevel Data • Examples • Education: students (level-1 units) nested within schools (level-2 units) • Three levels: students nested within schools nested within states • Individuals nested within cities • Voters nested within congressional districts • Voters nested within time (or temporal contexts) • Panel data and time-series cross-sectional (TSCS) data

  6. Multilevel Data • X1 and X2 are level-1 variables • X3 and X4 are level-2 variables. • Balanced data: cluster sizes are equal
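The table this slide annotates did not survive in the transcript. As a rough sketch of the layout it describes, the long-format data below uses hypothetical values and hypothetical helper names (`cluster_sizes`, `is_balanced`, `is_level2`); only the roles of X1 to X4 come from the slide.

```python
from collections import Counter

# Hypothetical long-format data: one row per level-1 unit (e.g., a student).
# x1 and x2 vary inside a cluster (level 1); x3 and x4 are constant inside
# a cluster (level 2).
rows = [
    {"cluster": 1, "x1": 2.0, "x2": 0, "x3": 10.0, "x4": 1},
    {"cluster": 1, "x1": 3.5, "x2": 1, "x3": 10.0, "x4": 1},
    {"cluster": 2, "x1": 1.0, "x2": 1, "x3": 7.0, "x4": 0},
    {"cluster": 2, "x1": 4.0, "x2": 0, "x3": 7.0, "x4": 0},
]


def cluster_sizes(data):
    """Count level-1 units per cluster."""
    return Counter(r["cluster"] for r in data)


def is_balanced(data):
    """Balanced data: every cluster has the same number of level-1 units."""
    return len(set(cluster_sizes(data).values())) == 1


def is_level2(data, var):
    """A variable is level 2 if it takes a single value within every cluster."""
    seen = {}
    for r in data:
        seen.setdefault(r["cluster"], set()).add(r[var])
    return all(len(values) == 1 for values in seen.values())
```

Checking `is_level2` before modeling is a cheap way to confirm which variables can only explain between-cluster differences.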

  7. Multilevel Data • Multilevel data that is non-hierarchical: cross-classified data. • Lower-level units are cross-classified: each belongs to two or more higher-level units that are not themselves nested within one another.

  8. Motivation • The types of phenomena we’re interested in are multilayered and complex. • Incorporating these layers enhances our substantive explanations of phenomena. • People don’t make choices or behave in a vacuum; there’s a context in which they act. • This contextual, or situational, variation may have consequences for how people behave. • Most simple cross-sectional analyses ignore this structure (“naïve pooling”).

  9. Motivation • Parsing explained variance in the DV between individual versus aggregate levels of analysis. • Student versus school effects on performance.

  10. Core Issues • Unobserved heterogeneity (UH) in the response • Between-cluster variation in the response (i.e., the DV) that is not accounted for by observed heterogeneity (i.e., measured IVs). • Unobserved factors specific to each cluster that influence the outcome; factors are shared by observations within each cluster. • UH represents conditional differences between clusters (conditional on observed heterogeneity). • Multilevel models separate the error term into a within-cluster (level 1) and between-cluster (level 2) error. • UH in a cross-sectional context: yi = b0 + b1x1i + b2x2i + ei

  11. Illustrating Unobserved Heterogeneity

  12. Core Issues • Pooling • Degree to which parameters (e.g., intercept, effects of IVs) are “pulled” toward the pooled (global) effect or reflect within-cluster variation. • Spectrum: No Pooling ---- Partial Pooling ---- Complete Pooling • Distinguish within-cluster, between-cluster, and total variation.
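The distinction between within-cluster, between-cluster, and total variation can be made concrete with a sum-of-squares decomposition. This is a minimal sketch with hypothetical outcome values and a hypothetical function name (`decompose`); it illustrates the idea rather than any specific computation from the slides.

```python
from statistics import mean

# Hypothetical outcome values grouped by cluster.
y_by_cluster = {
    "A": [1.0, 2.0, 3.0],
    "B": [4.0, 5.0, 6.0],
    "C": [8.0, 9.0, 10.0],
}


def decompose(groups):
    """Return (total, within, between) sums of squares for grouped data."""
    all_y = [v for ys in groups.values() for v in ys]
    grand = mean(all_y)
    # Total: deviations of every observation from the grand mean.
    total = sum((v - grand) ** 2 for v in all_y)
    # Within: deviations of every observation from its own cluster mean.
    within = sum((v - mean(ys)) ** 2 for ys in groups.values() for v in ys)
    # Between: deviations of cluster means from the grand mean, size-weighted.
    between = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in groups.values())
    return total, within, between


total, within, between = decompose(y_by_cluster)
```

Total variation splits exactly into the within and between parts, which is why ignoring the clustering mixes two different sources of variation.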

  13. Within-Cluster v. Between-Cluster Variation

  14. Three General Modeling Approaches 1. Complete Pooling: • Ignores clustering/hierarchical structure • Between-cluster UH unaccounted for • Doesn’t distinguish within- versus between-cluster variation • Generalization: global, pooled effect across all observations • Estimation technique: Plain-vanilla pooled regression (e.g., OLS) 2. No Pooling: • Effects are unpooled. • Between-cluster UH accounted for completely • Within-cluster variation is all that’s left. • Estimation technique: fixed-effects (within) estimator • Or…separate models for each cluster.
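To see how far the complete-pooling and no-pooling answers can diverge, the sketch below computes a pooled OLS slope and a within (cluster-demeaned) slope by hand. The data and the function names (`slope`, `pooled_slope`, `within_slope`) are hypothetical, chosen so that the within-cluster relationship is negative while clusters with higher x also have higher y.

```python
from statistics import mean

# Hypothetical (x, y) pairs by cluster. Inside each cluster the slope is -1,
# but cluster means of x and y rise together.
data = {
    "A": [(1.0, 5.0), (2.0, 4.0), (3.0, 3.0)],
    "B": [(4.0, 8.0), (5.0, 7.0), (6.0, 6.0)],
    "C": [(7.0, 11.0), (8.0, 10.0), (9.0, 9.0)],
}


def slope(pairs):
    """Bivariate OLS slope: cov(x, y) / var(x)."""
    xbar = mean(x for x, _ in pairs)
    ybar = mean(y for _, y in pairs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in pairs)
    sxx = sum((x - xbar) ** 2 for x, _ in pairs)
    return sxy / sxx


def pooled_slope(groups):
    """Complete pooling: ignore clusters and fit one slope to all rows."""
    return slope([p for pairs in groups.values() for p in pairs])


def within_slope(groups):
    """No pooling (within estimator): demean x and y by cluster, then fit."""
    demeaned = []
    for pairs in groups.values():
        xbar = mean(x for x, _ in pairs)
        ybar = mean(y for _, y in pairs)
        demeaned += [(x - xbar, y - ybar) for x, y in pairs]
    return slope(demeaned)
```

With these made-up numbers the pooled slope is positive and the within slope is negative, so the two approaches answer different substantive questions.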

  15. Three General Modeling Approaches 3. Partial Pooling: • Weighted average between no pooling and complete pooling extremes. • Borrows information from completely pooled effects to generate refined estimate of within-cluster effects (problem with small cluster sizes) • Estimation technique: random intercept (aka, random effects) model; random coefficient model • What most people think of when they talk about a “multilevel model.”

  16. Model Specification Level-1 units indexed i=1, 2, …N. Level-2 units indexed j=1, 2, …J. [Level-1 equation] [Level-2 equation] Reduced form version: • zj = unobserved heterogeneity (between-cluster) • Key specification decision: How we treat zj is directly connected to the three approaches just discussed. • Complete pooling: zj disappears from model; UH unaccounted for • No pooling: zj treated as “fixed”; each cluster gets its own intercept • Fixed effects, “within” approach; UH completely accounted for. • Partial pooling: zj treated as “random” • Random effects, or random intercept model
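The equation images on this slide did not survive in the transcript. A standard two-level layout consistent with the slide's definitions is sketched here; the coefficient names b0 and b1 and the single predictor x_ij are assumptions, while z_j and e_ij follow the slide.

```latex
\begin{align*}
\text{Level 1:} \quad & y_{ij} = b_{0j} + b_1 x_{ij} + e_{ij} \\
\text{Level 2:} \quad & b_{0j} = b_0 + z_j \\
\text{Reduced form:} \quad & y_{ij} = b_0 + b_1 x_{ij} + z_j + e_{ij}
\end{align*}
```

Under this sketch, complete pooling sets z_j to zero for every cluster, no pooling estimates a separate intercept b_0 + z_j for each cluster, and partial pooling treats z_j as a draw from a common distribution.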

  17. Partial Pooling: Random Intercept Model Level-1 units indexed i=1, 2, …N. Level-2 units indexed j=1, 2, …J. N level-1 units nested within J level-2 units. [Level-1 equation] [Level-2 equation] Reduced form • Assumptions: • Errors normally distributed • No correlation between observed IVs and error terms • CONTROVERSIAL ASSUMPTION: cov(xij, zj) = 0 • Var(zj)=y : Between-cluster error variance (UH). • Var(eij)=q : Within-cluster error variance. • Intraclass correlation: r = y / (y + q)
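The intraclass correlation on this slide is a one-line calculation once the two variance components are in hand. The function name and the example values below are hypothetical; the formula is the slide's r = y / (y + q).

```python
def intraclass_correlation(y, q):
    """Share of total error variance that lies between clusters.

    y: between-cluster error variance, Var(z_j)
    q: within-cluster error variance, Var(e_ij)
    """
    if y < 0 or q < 0 or (y + q) == 0:
        raise ValueError("variances must be non-negative and not both zero")
    return y / (y + q)


# Hypothetical variance components, for illustration only.
r = intraclass_correlation(2.0, 6.0)
```

An r near 0 means clusters barely differ once the IVs are accounted for; an r near 1 means almost all of the remaining variation is between clusters.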

  18. Estimation of Linear Random Intercept Model • Can be estimated via GLS and ML; both yield similar results. • Foundation: Estimates of b are a weighted average of the pooled and within estimates of b. Partial pooling of coefficients. What regulates this weighting? • Pooling factor: • Recall: q=within-cluster error variance; y=between error variance • If w = 0, bRI reduces to bWithin(FE). • If w = 1, bRI reduces to bOLS. • Degree of pooling depends on how informative the within-cluster variation in the data is; the less informative, the more it borrows from the between-cluster variation in the data. • As cluster size (n) increases, there’s less pooling. • As q decreases, there’s less pooling. • As y increases (cluster differentiation), there’s less pooling.
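The pooling-factor formula itself is missing from the transcript. One common form that reproduces every property listed on this slide is w = (q/n) / (y + q/n); treat it as an assumption rather than the author's exact expression. The function name is hypothetical.

```python
def pooling_factor(y, q, n):
    """Assumed pooling factor: w = (q / n) / (y + q / n).

    y: between-cluster error variance
    q: within-cluster error variance
    n: cluster size
    w = 0 corresponds to no pooling, w = 1 to complete pooling.
    """
    if n <= 0:
        raise ValueError("cluster size must be positive")
    noise = q / n  # sampling noise in a cluster's own mean
    return noise / (y + noise)


# Hypothetical baseline, then one change at a time.
base = pooling_factor(y=2.0, q=4.0, n=5)
larger_n = pooling_factor(y=2.0, q=4.0, n=50)   # bigger clusters
smaller_q = pooling_factor(y=2.0, q=1.0, n=5)   # less within-cluster noise
larger_y = pooling_factor(y=8.0, q=4.0, n=5)    # more cluster differentiation
```

Each of the three changes lowers w relative to the baseline, matching the slide's statements that larger n, smaller q, and larger y all mean less pooling.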

  19. Considerations for the Random Intercept Model • How do we interpret effects from each approach? • What do the pooled and RI approaches assume about the within- and between-cluster effects of a level-1 variable? That they’re equal. Is that justifiable? • Controversial assumption: no correlation between the random effect and X at level 1.

  20. Applications: Things You Can Do • High School and Beyond Data (1982) • Nationally representative survey of U.S. public and Catholic high schools. • Subsample of the 1982 HSB data • Hierarchical structure: 7,185 students nested within 160 schools • DV: math achievement • Software: Stata, HLM, R, WinBUGS • Stata: • Continuous DVs: xtreg, xtmixed • Binary response: xtlogit, xtprobit, xtmelogit

  21. Organization

  22. Organization

  23. Organization

  24. Organization

  25. Organization

  26. Organization

  27. Organization

  28. Describing Data

  29. Describing Data • Annotated variable list: the DV, three level-1 variables, and three level-2 variables

  30. Unconditional Random Intercept Model • Between-school (level-2) error s.d. • Within-school (level-1) error s.d. • Between-school (level-2) error variance (y) • Within-school (level-1) error variance (q)

  31. Unconditional Random Intercept Model • Intraclass Correlation Coefficient

  32. Unconditional Random Intercept Model

  33. Unconditional Random Intercept Model

  34. Unconditional Random Intercept Model

  35. Random Intercept Model with IVs L1 L2

  36. Random Intercept Model with IVs • Degree of UH • Test of RI v. OLS

  37. Random Intercept Model with IVs • Level-2 error variance (y) • Level-1 error variance (q)

  38. Cluster Confounding • Cluster confounding: When the within-cluster and between-cluster effects of a level-1 independent variable differ. • If they do differ, but you don’t account for that difference, then the effect you get confounds within-cluster and between-cluster variation into an “averaged” effect. • This issue is only applicable to level-1 variables • Level-2 variables only vary between clusters, not within. • Underpinning to the “controversial assumption.” • One can estimate within-cluster and between-cluster effects of a level-1 variable. • This satisfies the controversial assumption.
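Estimating separate within- and between-cluster effects starts by splitting a level-1 variable into its cluster mean and the deviation from that mean. The sketch below shows that split with hypothetical data and a hypothetical function name (`split_within_between`); fitting the model itself is left to the estimation software.

```python
from statistics import mean


def split_within_between(x_by_cluster):
    """Return {cluster: (cluster_mean, [deviations from cluster mean])}."""
    parts = {}
    for cluster, xs in x_by_cluster.items():
        xbar = mean(xs)                                # between component
        parts[cluster] = (xbar, [x - xbar for x in xs])  # within component
    return parts


# Hypothetical level-1 variable, grouped by cluster.
x_by_cluster = {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]}
parts = split_within_between(x_by_cluster)
```

Because the deviations sum to zero inside every cluster, their sample covariance with any quantity that is constant within a cluster is zero. That is the sense in which the within component sidesteps the correlation between x and the cluster-level error.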

  39. Illustrating Cluster Confounding

  40. Illustrating Cluster Confounding

  41. Illustrating Cluster Confounding

  42. Illustrating Cluster Confounding

  43. Accounting for Cluster Confounding • Think of it this way (Method 1): • What is the correlation now between the within-cluster xij and zj? • Identical model, different interpretation (Method 2): • Can perform a Hausman-like test for equality of between and within estimates. d represents the difference between the within- and between-cluster effects. • This method is akin to the Hausman test for the equality of RE and FE estimates. • With this procedure, we’re testing the equality of the within and between effects. It’s more direct than the Hausman.
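The two equations on this slide are missing from the transcript. A standard way to write the two methods, consistent with the slide's description of d, is sketched here. The symbols b_W (within effect), b_B (between effect), and the cluster mean of x are assumptions; d, z_j, and e_ij follow the slides.

```latex
\begin{align*}
\text{Method 1:} \quad & y_{ij} = b_0 + b_W (x_{ij} - \bar{x}_j) + b_B \bar{x}_j + z_j + e_{ij} \\
\text{Method 2:} \quad & y_{ij} = b_0 + b_W x_{ij} + d\, \bar{x}_j + z_j + e_{ij},
\qquad d = b_B - b_W
\end{align*}
```

Expanding Method 1 gives b_W x_ij + (b_B - b_W) times the cluster mean, which is Method 2. Testing d = 0 is therefore a direct test that the within and between effects are equal.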

  44. Accounting for Cluster Confounding

  45. Accounting for Cluster Confounding

  46. Accounting for Cluster Confounding

  47. Accounting for Cluster Confounding • Within effect • Between effect

  48. Accounting for Cluster Confounding • Within effect • Difference between within and between effects (d)

  49. Causal Heterogeneity • Causal heterogeneity • When the relationship between X and Y varies across clusters • How higher-level variables shape lower-level relationships. • Method: Random coefficient model • [Figure: Y plotted against X]

  50. Random Coefficient Model with Cross-Level Interactions Causal heterogeneity, in addition to heterogeneity in the response. [Level-1 equation] [Level-2 equations]
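The level-1 and level-2 equations for this slide are also missing. A conventional random coefficient specification with a cross-level interaction is sketched below. All symbols other than x_ij, y_ij, and e_ij are assumptions: c_j stands for a hypothetical level-2 variable, the g terms are fixed coefficients, and z_0j and z_1j are cluster-level errors for the intercept and slope.

```latex
\begin{align*}
\text{Level 1:} \quad & y_{ij} = b_{0j} + b_{1j} x_{ij} + e_{ij} \\
\text{Level 2:} \quad & b_{0j} = g_{00} + g_{01} c_j + z_{0j} \\
                      & b_{1j} = g_{10} + g_{11} c_j + z_{1j} \\
\text{Reduced form:} \quad & y_{ij} = g_{00} + g_{01} c_j + g_{10} x_{ij}
  + g_{11} c_j x_{ij} + z_{0j} + z_{1j} x_{ij} + e_{ij}
\end{align*}
```

In this sketch g_11 is the cross-level interaction: it captures how the level-2 variable shifts the effect of x on y, while z_1j captures the slope variation across clusters that remains unexplained.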
