Searching for Causal Models with L atent V ariables

Searching for Causal Models with Latent Variables Peter Spirtes, Richard Scheines, Joe Ramsey, Erich Kummerfeld, Renjie Yang

What is the Causal Relation Between Economic Stability and Political Stability? Economicalstability ? Politicalstability Economicalstability ? Politicalstability ? Economicalstability Politicalstability Economicalstability Politicalstability ? L

Measure Latents with Indicators Country XYZ 1. GNP per capita: _____ 2. Energy consumption per capita: _____ 3. Labor force in industry: _____ 4. Ratings on freedom of press: _____ 5. Freedom of political opposition: _____ 6. Fairness of elections: _____ 7. Effectiveness of legislature _____ Task: learn causal model

Multiple Indicator Models • To draw causal conclusions about the unmeasured Economical stability and Political stability variables we are interested in, use • hypothesized causal relations between X’s , Es and Ps • statistics gathered on X’s (correlation matrix) Economicalstability ? Politicalstability Pure Measurement Model X1 X2 X3 X4 X5 X6 X7

Structural Model – Two Factor Model L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4

Measurement Model L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4

Impurities L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4

Pure Measurement Models • A pure n-factor measurement model for an observed set of variables O is such that: • Each observed variable has exactly n latent parents. • No observed variable is an ancestor of other observed variable or any latent variable. • A set of observed variables O in a pure n-factor measurement model is a pure cluster if each member of the cluster has the same set of n parents.

Alternative Models L1 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 Bifactor L2 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 Higher-Order L1 L3 L2 Higher-Order ⊂ Bifactor ⊂ Connected Bifactor ⊂ Connected Two-Factor

Common Strategy L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4 1. Estimate and test pure Higher-order model. 2. Estimate and test pure Two-Factor model. 3. Choose whichever one fits best.

Two Problems • If a measurement model is impure, and you assume it is pure, this will hinder the inference of the correct structural model. • If a higher-order model has impurities, it will fit a more inclusive pure model such as a pure two-factor model better than a pure higher-order model.

Finding the Structural Model L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4 Generating Model

Finding the Structural Model L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4 • Data fits model with black edges + pure measurement model better than model without black edges + pure measurement model.

Finding the Right Kind of Measurement Model L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 Generating Model

Finding the Right Kind of Measurement Model L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 Worse Fit

Finding the Right Kind of Measurement Model L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4 Better Fit

Finding the Right Kind of Measurement Model L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12X13 Generating Model 1. Identify pure submodel {1,2,3,4,5,8,9,10,11,12,13}. 2. See if it fits Higher-order. 3. If it does select Higher –order, otherwise see if it fits Two-Factor model.

Alternative Strategy L1 L3 X1X2 X3 X4 X5 X8X9 X10 X11 X12X13 Pure submodel fits Higher-order model, so select Higher-order. ?

Alternative Strategy ? L1 L3 X1 X2 X3 X4 X5 X8X9 X10 X11 X12 X13 L2 L4 Data will also fit Two-Factor model (slightly lower chi-squared), but when adjusted for degrees of freedom, p-value will be lower. ?

Rank Constraints

Entailed Algebraic Constraints • An algebraic constraint is linearly entailed by a DAG if it is true of the implied covariance for every value of the free parameters (the linear coefficients and the variances of the noise terms)

Trek and Sides of Treks L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4

Trek-Separation L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4 (CA:CB) trek-separates A from B iff every trek between A and B intersects CA on the A side or CB on the B side.

<{L1,L2},∅> Trek-Separate {1,2,3}:{8,9,10} L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4

<∅,{L3,L4}> Trek-Separate {1,2,3}:{8,9,10} L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4

Theorem (Sullivant, Talaska, Draisma) L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4 If (CA:CB) trek-separates A from B, and the model is an acyclic linear Gaussian model, then rank(cov(A,B)) ≤ #CA+ #CB.

<{L1,L2},∅> Trek-Separate {1,2,3}:{8,9,10} L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4

Theorem (Sullivant, Talaska, Draisma) L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4 If #CA + #CB≤ #C’A+ #C’B for all (C’A:C’B) that trek-separate A from B, then for generic linear acyclic Gaussian models, rank(cov(A,B)) = #CA+ #CB.

Theorem (Sullivant, Talaska, Draisma) L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4 If #CA + #CB> r for all (CA:CB) that trek-separate A from B in DAG G, then for some linear Gaussian parameterization, rank(cov(A,B)) > r.

Linear Acyclic Below the Choke Sets L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4 {1,2,3}:{10,11,12}linear acyclic below <{L1,L2},∅> f(L1,εL3) g(L2,εL4)

Linear Acyclic Below the Choke Sets L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4 {1,2,3}:{10,11,12} notlinear acyclic below < ∅, {L1,L2}> f(L1,εL3) g(L2,εL4)

Theorem (Spirtes) L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4 If (CA:CB) trek-separates A from B, and model is linear acyclic below (CA:CB) for A, B, then rank(cov(A,B)) ≤ #CA+ #CB.

Proof full rank CA CB … … A B

Theorem (Spirtes) L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4 If #CA + #CB > r for all (CA:CB) that trek-separate A from B in DAG G, then for some linear acyclic below (CA:CB) for A, B parameterization, rank(cov(A,B)) > r.

Faithfulness Assumption • If a rank constraint is not entailed by the graphical structure, then the rank constraint does not hold. • If the constraints do not hold for the whole space of parameters (i.e. they are not entailed), but are the roots of rational equations in the parameters, they are of Lebesgue measure 0.

Faithfulness Assumption • This says nothing about the measure of constraints that are not entailed but “almost” hold (i.e. cannot be distinguished from 0 reliably given the power of the statistical tests.) • However, the performance of the algorithm will not depend upon the extent to which individual non-entailed constraints “almost” hold, but the extent to which sets of non-entailed constraints “almost” hold. • This depends upon which sets of constraints affect the performance of the algorithm, and the joint distribution of the constraints which we do not know.

Advantages and Disadvantages of Algebraic Constraints • Advantages • No need for estimation of model. • No iterative algorithm • No local maxima. • No problems with identifiability. • Fast to compute. • Disadvantages • Does not contain information about inequalities. • Power and accuracy of tests? • Difficulty in determining implications among constraints

Find Two Factor Clusters (FTFC) Algorithm • Find a list of pure pentads of variable. • Merge pentads on list that overlap. • Select which merged subsets to output.

1. Construct a List of Pure Fivesomes L1 L3 X1 X2 X3 X4 X5 X6 X7X8 X9 X10 X11 X12 X13 L2 L4 For each subset of size 5, if it is Pure, add to PureList. {1,2,3,4,5}; {9,10,11,12,13}; {8,10,11,12,13}; {8,9,11,12,13}; {8,9,10,12,13}; {8,9,10,11,12}

Test for Purity of {1,2,3,4,5} L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4 <{L1,L2},∅> Trek-Separate All Partitions of {1,2,3,4,5,x}

Test for Purify of {1,2,3,4,8} L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4 No Pair Trek-Separate All Partitions of {1,2,3,4,8,x}, e.g. {1,2,8}:{3,4,9}

Test for Purify of {1,2,3,4,6} L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4 No Pair Trek-Separates All Partitions of {1,2,3,4,6,x}, e.g. {1,2,6}:{3,4,7}

No Pair Trek-Separate {1,2,3}:{7,8,9} L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4

2. Merge Overlapping Items - Theory L1 L3 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 L2 L4 {1,2,3,4,5}; {9,10,11,12,13}; {8,10,11,12,13}; {8,9,11,12,13}; {8,9,10,12,13}; {8,9,10,11,12}→ {1,2,3,4,5}; {8,9,10,11,12,13}

2. Merge Overlapping Items - Practice L1 L3 X1 X2 X3 X4 X5 X6 X7X8 X9 X10 X11 X12 X13 L2 L4 {1,2,3,4,5}; {9,10,11,12,13}; {8,10,11,12,13}; {8,9,11,12,13}; {8,9,10,12,13}; {8,9,10,11,12}; {1,2,3,8,9}(false positive)

2. Merge Overlapping Items - Practice L1 L3 X1 X2 X3 X4 X5 X6 X7X8 X9 X10 X11 X12 X13 L2 L4 {9,10,11,12,13}; {8,10,11,12,13}→ {8,9,10,11,12,13}; All subsets of size 5 of {8,9,10,11,12,13} are in PureList so accept merger, and remove both from PureList.

2. Merge Overlapping Items - Practice L1 L3 X1 X2 X3 X4 X5 X6 X7X8 X9 X10 X11 X12 X13 L2 L4 {1,2,3,4,5}; {1,2,3,8,9} → {1,2,3,4,5,8,9} All subsets of size 5 except {1,2,3,8,9} and {1,2,3,4,5}not on PureList – so reject merger

2. Final List L1 L3 X1 X2 X3 X4 X5 X6 X7X8 X9 X10 X11 X12 X13 L2 L4 {1,2,3,4,5}; {8,9,10,11,12,13}; {1,2,3,8,9}

3. Select Which Ones to Output L1 L3 X1 X2 X3 X4 X5 X6 X7X8 X9 X10 X11 X12 X13 L2 L4 {1,2,3,4,5}; {8,9,10,11,12,13}; {1,2,3,8,9} Output {8,9,10,11,12,13} because it is largest. Output {1,2,3,4,5} because it is next largest that is disjoint.

Theorem • If • The causal graph contains as a subgraph a pure 2-factor measurement model with at least six indicators and at least 5 variables in ech cluster; • The model is linear acyclic below the latent variables; • Whenever there is no trek between two variables they are independent; • There are no correlations equal to zero or one; • The distribution is LA faithful to the causal graph; • then the population FTFC algorithm outputs a clustering in which any two variables in the same output cluster have the same pair of latent parents.

Searching for Causal Models with L atent V ariables

Searching for Causal Models with L atent V ariables

Presentation Transcript

Graphical Models of Probability for Causal Reasoning

Causal Models for Drug and Biomarker Development

Interpreting Probability in Causal Models for Cancer

Causal Models for Regression Modeling Strategies

Causal Rasch Models

SEEM Input V ariables for RBSA Calibration

Intro to E xpressions a nd V ariables

Section V Searching for Scholarships

Bayesian selection of dynamic causal models for fMRI

Causal Models for Performance Analysis of Computer Systems

SOC 681 – Causal Models with Directly Observed Variables

Dynamic Causal Models

Changes in Probabilistic Causal Models

Dynamic Causal Models

Graphical Causal Models

Searching for Preteen Models Through Agencies

Causal Inference and Graphical Models

Dynamic Causal Models

Causal Modeling with TETRAD

Interpreting Probability in Causal Models for Cancer

Discovering Causal Models