Mediation and Multi-group Analyses Lyytinen & Gaskin
Mediation
[Figure: path diagram X → M → Y]
In an intervening variable model, a variable X is postulated to exert an effect on an outcome variable Y through one or more intervening variables called mediators (M): “mediational models advance an X → M → Y causal sequence, and seek to illustrate the mechanisms through which X and Y are related.” (Mathieu & Taylor)
Why Mediation?
• Seeking a more accurate explanation of the causal effect the antecedent (predictor) has on the DV (criterion, outcome), with a focus on the mechanisms that make the causal chain possible
• Missing variables in the causal chain
• Intelligence → Performance
• Intelligence → Work Effectiveness → Performance
Conditions for mediation
Researchers should: (1) justify the causal order of variables, including temporal precedence; (2) reasonably exclude the influence of outside factors; (3) demonstrate acceptable construct validity of their measures; (4) articulate, a priori, the nature of the intervening effects that they anticipate; and (5) obtain a pattern of effects that is consistent with their anticipated relationships while also disconfirming alternative hypotheses through statistical tests.
Conditions for mediation
Inferences of mediation are founded first and foremost in terms of theory, research design, and the construct validity of measures employed, and second in terms of statistical evidence of relationships.
1) Inferences concerning mediational X → M → Y relationships hinge on the validity of the assertion that the relationships depicted unfold in that sequence (Stone-Romero & Rosopa, 2004). As with SEM, multiple qualitatively different models can be fit equally well to the same covariance matrix. Using the exact same data, one could as easily ‘confirm’ a Y → M → X mediational chain as an X → M → Y sequence (MacCallum, Wegener, Uchino, & Fabrigar, 1993).
Conditions for mediation
2) The purpose of experimental designs is to isolate and test, as best as possible, X → Y relationships from competing sources of influence. In mediational designs, however, this focus is extended to a three-phase X → M → Y causal sequence, requiring random assignment to both X and M and related treatments.
“Because researchers may not be able to randomly assign participants to conditions, the causal sequence of X → M → Y is vulnerable to any selection related threats to internal validity (Cook & Campbell, 1979; Shadish et al., 2002). To the extent that individuals’ status on a mediator or criterion variable may alter their likelihood of experiencing a treatment, the implied causal sequence may also be compromised. For example, consider a typical: training → self-efficacy → performance, mediational chain. If participation in training is voluntary, and more efficacious people are more likely to seek training, then the true sequence of events may well be self-efficacy → training → performance. If higher performing employees develop greater self-efficacy (Bandura, 1986), then the sequence could actually be performance → efficacy → training. If efficacy and performance levels remain fairly stable over time, one could easily misconstrue and find substantial support for the training → efficacy → performance sequence when the very reverse is actually occurring.” (Mathieu and Taylor 2006)
Conditions of mediation
• It is a hallmark of good theories that they articulate how and why variables are ordered in a particular way (e.g., Sutton & Staw, 1995; Whetten, 1989). This is perhaps the only basis for advancing a particular causal order in non-experimental studies with simultaneous measurement of the antecedent, mediator, and criterion variables (i.e., classic cross-sectional designs).
• Implicitly, mediational designs advance a time-based model of events whereby X occurs before M, which in turn occurs before Y. It is the temporal relationships of the underlying phenomena that are at issue, not necessarily the timing of measurements.
• In mediation analyses, omitted variables represent a significant threat to the validity of the X → M relationship if they are related both to the antecedent and to the mediator, and have a unique influence on the mediator. Likewise, omitted variables (and related paths) may lead one to conclude falsely that no direct effect X → Y exists, while in fact it holds in the population.
Importance of theory: Cause and effect
[Figure: alternative causal orderings of the same three variables: Training, Self-efficacy, and Performance]
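Types of Mediation
[Figure: three path diagrams over X, M, and Y, labeled Indirect Effect, Partial Mediation, and Full Mediation, with a legend distinguishing significant from insignificant paths]
More complex mediation structures
Chain Model: X → M1 → M2 → M3 → Y
Parallel Model: X → (M1, M2, M3) → Y, with the three mediators side by side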
Hypothesizing Mediation
• All types of mediation need to be hypothesized explicitly, with good theoretical reasons and logic, before testing them
• Indirect Effect
  • You still need to assume and test that X has an indirect effect on Y, even though there is no effect on the path X → Y
  • “X has an indirect, positive effect on Y, through M.”
• Partial or Full
  • “M partially/fully mediates the effect of X on Y.”
  • “The effect of X on Y is partially/fully mediated by M.”
  • “The effect of X on Y is partially/fully mediated by M1, M2, & M3.”
Statistical evidence of relationships
• Each type of mediation needs to be backed by appropriate statistical analysis
• Sometimes the analysis can be based on OLS, but in most cases it needs to be backed by SEM-based path analysis
• There are four types of analyses to detect the presence of mediation relationships:
  • Causal steps approach (Baron & Kenny 1986): tests for the significance of different paths
  • Difference in coefficients: evaluates the changes in betas/coefficients and their significance when new paths are added to the model
  • Product of effects approach: tests for the indirect effect a*b; this always needs to be tested or evaluated using bootstrapping
  • Sometimes, evaluating differences in R squares
Statistical evidence of relationships
• Convergent validity is critical for mediation tests because it forms the basis for reliability. Poor reliability of the mediator is especially problematic: “to the extent that a mediator is measured with less than perfect reliability, the M → Y relationship would likely be underestimated, whereas the X → Y would likely be overestimated when the antecedent and mediator are considered simultaneously” (see Baron & Kenny 1986)
• Discriminant validity must be gauged in the context of the larger nomological network within which the relationships being considered are believed to reside. Discriminant validity does not imply that measures of different constructs are uncorrelated; the issue is whether measures of different variables are so highly correlated as to raise questions about whether they are assessing different constructs.
• It is incumbent on researchers to demonstrate that their measures of X, M, and Y evidence acceptable discriminant validity before any mediational tests are justified.
Statistical evidence of the relationships
In simple partial mediation, βmx is the coefficient for X in predicting M, and βym.x and βyx.m are the coefficients for M and X, respectively, when Y is predicted from both. Here βyx.m is the direct effect of X, whereas the product βmx*βym.x quantifies the indirect effect of X on Y through M. If all variables are observed, then the total effect βyx decomposes as
βyx = βyx.m + βmx*βym.x, or equivalently βmx*βym.x = βyx - βyx.m
• Indirect effect: the amount by which two cases who differ by one unit on X are expected to differ on Y through X’s effect on M, which in turn affects Y
• Direct effect: the part of the effect of X on Y that is independent of the pathway through M
Similar logic can be applied to more complex situations
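The decomposition of the total effect into a direct and an indirect part can be checked numerically. The following is a minimal sketch using ordinary least squares on simulated data; the sample size, the path values (0.5, 0.3, 0.4), and the helper name `ols` are illustrative assumptions, not values from the slides.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, b1, b2, ...] of y on the predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated data (hypothetical values): X -> M -> Y plus a direct X -> Y path
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.3 * x + 0.4 * m + rng.normal(size=n)

b_mx = ols(m, x)[1]               # X predicting M
_, b_yx_m, b_ym_x = ols(y, x, m)  # X and M predicting Y together
b_yx = ols(y, x)[1]               # total effect: X predicting Y alone

indirect = b_mx * b_ym_x
print(f"total    = {b_yx:.4f}")
print(f"direct   = {b_yx_m:.4f}")
print(f"indirect = {indirect:.4f}")
print(f"direct + indirect = {b_yx_m + indirect:.4f}")
```

With observed variables and OLS estimates from the same sample, the total effect equals the direct effect plus the product term up to floating-point error, which is why the difference-in-coefficients and product-of-effects views agree in this simple case.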
Statistical analysis
• Testing for the existence of a mediational effect depends on the type of indirect effect
• The lack of a bivariate X → Y effect (βyx is either zero or not significant) does not demonstrate the lack of a mediated effect
• Therefore three different situations are tested (in this order):
  • The presence of an indirect effect (βmx*βym.x is significant)
  • The presence of full mediation (βyx is significant but βyx.m is not)
  • The presence of partial mediation (βyx is significant and βyx.m is nonzero and significant)
Observations of statistical analysis
• The key is to test for the presence of a significant indirect effect; just demonstrating the significance of the individual paths βyx, βyx.m, βym.x, and βmx is not enough
• One reason is that significance tests of the individual paths are not inferences about the indirect effect itself, which is a product of effects
• The indirect effect can be tested using either the Sobel test (see e.g. www.quantpsy.org) or bootstrapping
• The Sobel test assumes normality of the product term and relatively large sample sizes (>200)
• It lacks power with small sample sizes or if the distribution is not normal
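As an illustration of the Sobel test mentioned above, here is a minimal sketch using the commonly cited first-order formula z = a*b / sqrt(b²·SEa² + a²·SEb²), where a corresponds to βmx and b to βym.x. The function name and the input values are hypothetical, and the p-value relies on the normality assumption noted in the bullets.

```python
import math

def sobel_test(a, se_a, b, se_b):
    """First-order Sobel z and two-tailed p-value for the indirect effect a*b.

    a, se_a: coefficient and standard error for X -> M
    b, se_b: coefficient and standard error for M -> Y (controlling for X)
    """
    se_ab = math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    z = (a * b) / se_ab
    # Two-tailed p-value under a standard normal reference distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical path estimates, for illustration only
z, p = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(f"z = {z:.3f}, p = {p:.4f}")
```

Because the product of two coefficients is generally not normally distributed in small samples, this test is best read alongside a bootstrap interval rather than in place of one.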
Bootstrapping
• Available in most statistical packages, or through additional code for most software packages
• Samples the distribution of the indirect effect by treating the obtained sample of size n as a miniature representation of the population, and then resampling it randomly with replacement: a resample of size n is built by drawing cases from the original sample, and any case once drawn is thrown back so that it can be redrawn
• βmx, βym.x, and their product are estimated and recorded for each resample
• The process is repeated k times, where k is large (>1000)
• Hence we have k estimates of the indirect effect, and their distribution functions as an empirical approximation of the sampling distribution of the indirect effect when taking a sample of size n from the original population
• Lower and upper bounds of a confidence interval are set at the ith lowest and jth largest values in the rank-ordered estimates; the null hypothesis that the indirect effect is zero is rejected, e.g. at the 95% level of confidence, if the interval excludes zero
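The resampling procedure described above can be sketched as a percentile bootstrap. This is a minimal illustration on simulated data, assuming observed variables and OLS path estimates; the function names, the data-generating values, and the choice k = 2000 are assumptions made for the example.

```python
import numpy as np

def indirect_effect(x, m, y):
    """Estimate b_mx * b_ym.x from two OLS regressions."""
    ones = np.ones(len(x))
    b_mx = np.linalg.lstsq(np.column_stack([ones, x]), m, rcond=None)[0][1]
    b_ym_x = np.linalg.lstsq(np.column_stack([ones, x, m]), y, rcond=None)[0][2]
    return b_mx * b_ym_x

def bootstrap_indirect(x, m, y, k=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = np.random.default_rng(seed)
    n = len(x)
    estimates = np.empty(k)
    for i in range(k):
        idx = rng.integers(0, n, size=n)  # draw n cases with replacement
        estimates[i] = indirect_effect(x[idx], m[idx], y[idx])
    alpha = 1 - level
    lower, upper = np.percentile(estimates, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lower, upper

# Simulated data (hypothetical values) with a true indirect effect of 0.5 * 0.4
rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.4 * m + rng.normal(size=n)

lower, upper = bootstrap_indirect(x, m, y)
print(f"95% CI for the indirect effect: [{lower:.3f}, {upper:.3f}]")
print("indirect effect significant" if lower > 0 or upper < 0 else "not significant")
```

The percentile interval is the simplest variant; bias-corrected intervals, which some packages offer, adjust the cut-off ranks but follow the same resampling logic.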
Observations of statistical analysis
• In full and partial mediation, the bivariate X → Y relationship (assessed via the correlation rYX or the coefficient βyx) must be nonzero in the population if the effects of X on Y are mediated by M
• However, establishing a significant bivariate relationship is conditional on sample size
• For example, assume that N=100 and the sample correlations are rXM=.30 and rMY=.30: both would be significant at p<.05. However, a sample correlation rXY=.09 (the product .30 × .30 implied under full mediation) would not!
• Hence tests for full mediation can be precluded even if this is the true model in the population
• This point becomes even more challenging when complex mediations X → M1 → M2 → M3 → Y are present
• Hence full mediations often go undetected due to underpowered designs; the same holds for interactions or suppression variables. In fact, the four-step Baron & Kenny approach has power of .52 with a sample size of 200 to detect a medium effect!
• This can be overcome by bootstrapping
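The N=100 example can be verified with the usual t test for a correlation, t = r·sqrt(N-2)/sqrt(1-r²). The sketch below hard-codes an approximate two-tailed critical value of 1.984 for df = 98 rather than computing it, which is an assumption made to keep the example dependency-free.

```python
import math

def r_to_t(r, n):
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r**2)

# Approximate two-tailed critical value of t at alpha = .05 with df = 98
T_CRIT = 1.984

n = 100
for label, r in [("rXM", 0.30), ("rMY", 0.30), ("rXY", 0.09)]:
    t = r_to_t(r, n)
    verdict = "significant" if abs(t) > T_CRIT else "not significant"
    print(f"{label} = {r:.2f}: t = {t:.2f} ({verdict} at p < .05)")
```

Both paths pass the test while the bivariate X → Y correlation does not, so a causal-steps procedure that requires a significant X → Y relationship would stop before mediation is even examined.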
Observations of statistical analysis
• Testing for full mediation requires that βyx.m is zero. When βyx.m does not drop to zero, the evidence supports partial mediation.
• This requires researchers to make a priori hypotheses concerning full or partial mediation; without them, confirmatory tests turn into exploratory data mining
• What counts as a significant reduction from βyx to βyx.m is not clear (cf. from .15 to .05 vs. from .75 to .65)
• Typically the baseline model for mediation is partial mediation, while theoretical clarity and Occam’s razor would speak for full mediation
Testing for Mediation in AMOS Direct Effects First
Testing for Mediation in AMOS Add Mediator
Testing significance of partially mediated paths: Sobel Test
http://www.danielsoper.com/statcalc/calc31.aspx
• Use the Sobel test online calculator for partially mediated relationships
• Assumes a normal distribution and a sufficiently large sample
Testing significance of indirect effects: Bootstrapping
• Use at least 1000 bootstrap samples
• No missing values allowed!
Compare the p-values of the indirect, direct, and total effects with 0.05:
• No Mediation
  • If Indirect is > 0.05
• Full Mediation
  • Given the direct effects were significant prior to adding the mediator
  • If Indirect < 0.05 and Direct is > 0.05
• Partial Mediation
  • If Direct & Indirect < 0.05, check Total
  • If Total < 0.05, then partial mediation is significant
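The decision rules above can be written as a small function. This is a sketch of one reading of the slide: the function name is hypothetical, a p-value exactly equal to the cut-off is treated as not significant, and the "inconclusive" outcome (direct and indirect significant but total not) is an assumption, since the slide does not name that case.

```python
def classify_mediation(p_indirect, p_direct, p_total, alpha=0.05):
    """Classify a mediation result from the p-values of the bootstrapped effects.

    The caller is expected to have confirmed beforehand that the direct effect
    was significant before the mediator was added (precondition for full mediation).
    """
    if p_indirect >= alpha:
        return "no mediation"
    if p_direct >= alpha:
        return "full mediation"
    if p_total < alpha:
        return "partial mediation"
    return "inconclusive"  # assumption: case not covered by the slide

# Hypothetical p-values, for illustration only
print(classify_mediation(p_indirect=0.20, p_direct=0.01, p_total=0.01))
print(classify_mediation(p_indirect=0.01, p_direct=0.30, p_total=0.01))
print(classify_mediation(p_indirect=0.01, p_direct=0.02, p_total=0.001))
```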
Findings
[Figure: path diagrams for a Partial Mediation example and a Full Mediation example; path coefficients shown: .23***, .37***, .20**, .17*, .08]
WORDING
Overall value partially mediates the effect of trust in agent on loyalty for longterm (p < 0.001).
Overall value fully mediates the effect of trust in company on loyalty for longterm (p < 0.001).
Moderation concept
• Based on the observation that the relationship between an independent and a dependent variable is affected by another independent variable
• This situation is called a moderator effect: it occurs when a moderator variable (a second independent variable) changes the form of the relationship between another independent variable and the DV
• Can be expanded to a situation where the mediated relationship is moderated
Moderation: affecting the effect
• Moderating variables must be chosen with strong theoretical support (Hair et al 2010)
• The causality of the moderator cannot be tested directly
• The analysis becomes potentially confounded when the moderator is correlated with either of the variables in the relationship
• Testing is easiest when the moderator has no significant relationship with the other constructs
• This assumption is important in distinguishing moderators from mediators, which (by definition) are related to both constructs of the mediated relationship
Moderation: Multi-group
• Non-metric moderators: categorical variables are hypothesized as moderators (gender, age, turbulence vs. non-turbulence, non-customer vs. customer)
• For non-metric variables a multi-group analysis is applied, i.e. the data are split into separate groups based on the values of the variable, and the groups are tested for statistical difference (for both the measurement and the structural model)
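The split-and-compare idea can be illustrated with a simplified stand-in: fit the same regression in each group and compare the slopes with a z test, z = (b1 - b2) / sqrt(SE1² + SE2²). This is not the multi-group SEM procedure itself, which also compares the measurement model; the simulated data, group coding, and function names are assumptions made for the example.

```python
import math
import numpy as np

def slope_and_se(x, y):
    """OLS slope of y on x and its standard error."""
    n = len(x)
    X = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (n - 2)        # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)   # covariance of the coefficients
    return coef[1], math.sqrt(cov[1, 1])

def compare_groups(x, y, group):
    """z test for the difference between the slopes in group 0 and group 1."""
    (b1, se1), (b2, se2) = (slope_and_se(x[group == g], y[group == g]) for g in (0, 1))
    z = (b1 - b2) / math.sqrt(se1**2 + se2**2)
    p = math.erfc(abs(z) / math.sqrt(2))    # two-tailed, normal approximation
    return b1, b2, z, p

# Simulated data (hypothetical): the slope differs between two groups
rng = np.random.default_rng(2)
n = 200
group = np.repeat([0, 1], n)                # 0 = low protein, 1 = high protein
exercise = rng.normal(size=2 * n)
true_slope = np.where(group == 0, 0.2, 0.8)
weight_loss = true_slope * exercise + rng.normal(size=2 * n)

b_low, b_high, z, p = compare_groups(exercise, weight_loss, group)
print(f"low protein slope = {b_low:.3f}, high protein slope = {b_high:.3f}")
print(f"z = {z:.2f}, p = {p:.4f}")
```

A significant difference between the group slopes is what a moderation hypothesis with a categorical moderator predicts.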
Multi-group example
[Figure: the Exercise → Weight Loss path estimated separately for the Low Protein and High Protein groups]
Moderator vs. Mediator
[Figure: two path diagrams contrasting a mediator and a moderator; node labels not recoverable]
Mediator: the means by which the IV affects the DV
Moderator: a variable that influences the magnitude of the effect an IV has on a DV
Mediation vs. Moderation Example Notice that the mediator and the moderator can be the same! Can a mediator also be used as a moderator? Yes - see Baron and Kenny 1986 for a complex example
Some Theory-based Criteria (i.e., arguments for mediation and moderation are based on theory first, rather than statistical correlations)
• Mediator
  • Logical effect of IV
  • Logical cause of DV
• Moderator
  • Not logically correlated to IV or DV (if categorical)
  • Holistic/multiplicative effect (interaction)
  • Varying effect for different categorical values (multi-group)
Either, Neither, One or the Other?
Driving home the point: Moderator or Mediator?
[Figure: the Exercise → Weight Loss relationship with an unspecified third variable M]
Candidate variables: Caloric intake, Positive reinforcement, Gender, Age, Heredity, Exercise partner, IQ, Activity level, Protein intake, Attitude