Inferring structure to make substantive conclusions: How does it work?

Hypothesis testing approaches: Tests on deviances, possibly penalised (AIC/BIC, etc.), MDL, cross-validation... The problem is how to search model space when its dimension is large

Bayesian approaches: Typically place a prior on all graphs, and a conjugate prior on parameters (hyper-Markov laws, Dawid & Lauritzen), then use MCMC to update both graphs and parameters to simulate the posterior distribution

Graph moves
Giudici & Green (Biometrika, 1999) develop a full Bayesian methodology for model selection in Gaussian models, assuming decomposability (= graph triangulated = no chordless cycles of length 4 or more)
[Figure: example graph on vertices 1 to 7]
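As a rough illustration of what "decomposable" means computationally, the sketch below tests for triangulation by repeatedly removing a simplicial vertex (one whose remaining neighbours are all joined to each other); a graph is triangulated exactly when every vertex can be removed this way. This is a generic textbook test, not the method of Giudici & Green. The function name `is_decomposable` and the constant `EXAMPLE_EDGES` are hypothetical, and the edge list is an assumption reconstructed from the cliques the slides give for the example graph.

```python
from itertools import combinations


def is_decomposable(vertices, edges):
    """Return True if the undirected graph is triangulated (chordal)."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    remaining = set(vertices)
    while remaining:
        # Look for a simplicial vertex: its remaining neighbours are pairwise adjacent.
        simplicial = next(
            (v for v in remaining
             if all(y in adj[x]
                    for x, y in combinations(adj[v] & remaining, 2))),
            None,
        )
        if simplicial is None:
            # No simplicial vertex left: a chordless cycle of length >= 4 exists.
            return False
        remaining.remove(simplicial)
    return True


# Assumed edge list for the 7-vertex example graph (reconstructed from its cliques).
EXAMPLE_EDGES = [(1, 2), (2, 3), (2, 6), (2, 7), (6, 7), (3, 6),
                 (3, 4), (3, 5), (4, 5), (4, 6), (5, 6)]

if __name__ == "__main__":
    print(is_decomposable(range(1, 8), EXAMPLE_EDGES))
    print(is_decomposable([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)]))
```

The second call uses a plain 4-cycle, the smallest non-decomposable graph.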

Is decomposability a serious constraint?
• How many graphs are decomposable? (… out of …)
• Models using decomposable graphs are ‘dense’

Is decomposability any use?
• Maximum likelihood estimates can be computed exactly in decomposable models
• Decomposability is a key to the ‘message passing’ algorithms for probabilistic expert systems (and peeling genetic pedigrees)
[Figure: graph on vertices 1 to 4]

Graph moves
We can traverse graph space by adding and deleting single edges. Some moves are OK, but others make the graph non-decomposable
[Figure: example graph on vertices 1 to 7]

Graph moves
Frydenberg & Lauritzen (1989) showed that all decomposable graphs are connected by single-edge moves. Can we test for maintaining decomposability before committing to making the change?
[Figure: example graph on vertices 1 to 7]

Cliques
A clique is a maximal complete subgraph: here the cliques are {1,2}, {2,6,7}, {2,3,6}, and {3,4,5,6}
[Figure: example graph on vertices 1 to 7]
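The clique list can be checked mechanically. The sketch below enumerates maximal complete subgraphs with the basic Bron-Kerbosch recursion, a standard algorithm that the slides do not themselves mention. The names `maximal_cliques`, `EDGES` and `ADJ` are hypothetical, and the edge list is an assumption reconstructed from the cliques stated above.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques of a graph given as {vertex: set of neighbours}."""
    found = []

    def expand(r, p, x):
        # r: current clique, p: candidates that can extend it, x: already-explored vertices.
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


# Assumed edge list for the 7-vertex example graph.
EDGES = [(1, 2), (2, 3), (2, 6), (2, 7), (6, 7), (3, 6),
         (3, 4), (3, 5), (4, 5), (4, 6), (5, 6)]
ADJ = {v: set() for v in range(1, 8)}
for a, b in EDGES:
    ADJ[a].add(b)
    ADJ[b].add(a)

if __name__ == "__main__":
    for clique in sorted(sorted(c) for c in maximal_cliques(ADJ)):
        print(clique)
```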

Deleting edges?
Deleting an edge maintains decomposability if and only if it is contained in exactly one clique of the current graph (Frydenberg & Lauritzen)
[Figure: example graph on vertices 1 to 7]
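The deletion test amounts to counting cliques. A minimal sketch, assuming the current graph's cliques are already available as a list of sets; `CLIQUES` and `can_delete_edge` are hypothetical names used only for illustration.

```python
# Cliques of the example graph, as listed on the "Cliques" slide.
CLIQUES = [{1, 2}, {2, 6, 7}, {2, 3, 6}, {3, 4, 5, 6}]


def can_delete_edge(cliques, a, b):
    """True if edge (a, b) lies in exactly one clique, so deleting it keeps decomposability."""
    containing = sum(1 for c in cliques if a in c and b in c)
    return containing == 1


if __name__ == "__main__":
    print(can_delete_edge(CLIQUES, 4, 6))  # True: only {3,4,5,6} contains both
    print(can_delete_edge(CLIQUES, 2, 6))  # False: in both {2,6,7} and {2,3,6}
```

Deleting (2,6) would leave the chordless 4-cycle 2, 7, 6, 3, which is why the test rejects it.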

A graph is decomposable if and only if it can be represented by a junction tree (which is not unique)
The running intersection property: For any 2 cliques C and D, C∩D is a subset of every node between them in the junction tree
[Figure: example graph on vertices 1 to 7 and its junction tree, with cliques 12, 267, 236, 3456 and separators 2, 26, 36]
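The running intersection property can be verified directly on a candidate tree. The sketch below assumes the junction tree is given as a list of clique pairs (its edges), with separators implied as the intersection of adjacent cliques; `tree_path`, `has_running_intersection` and the two example trees are hypothetical names and layouts chosen for illustration.

```python
from itertools import combinations


def tree_path(tree_adj, start, goal):
    """Return the path of tree nodes from start to goal, or None if unreachable."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        for nxt in tree_adj[node]:
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    return None


def has_running_intersection(cliques, tree_edges):
    """Check that C & D is contained in every clique on the tree path from C to D."""
    tree_adj = {c: set() for c in cliques}
    for c, d in tree_edges:
        tree_adj[c].add(d)
        tree_adj[d].add(c)
    for c, d in combinations(cliques, 2):
        path = tree_path(tree_adj, c, d)
        if path is None or any(not (c & d) <= node for node in path):
            return False
    return True


C12, C267 = frozenset({1, 2}), frozenset({2, 6, 7})
C236, C3456 = frozenset({2, 3, 6}), frozenset({3, 4, 5, 6})
CLIQUES = [C12, C267, C236, C3456]

# Two assumed layouts: clique 12 hangs off 267 in one and off 236 in the other.
TREE_A = [(C12, C267), (C267, C236), (C236, C3456)]
TREE_B = [(C12, C236), (C267, C236), (C236, C3456)]

if __name__ == "__main__":
    print(has_running_intersection(CLIQUES, TREE_A))
    print(has_running_intersection(CLIQUES, TREE_B))
```

Because each separator is the intersection of two adjacent cliques, checking the cliques on the path also covers the separators between them.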

Non-uniqueness of junction tree
[Figure: two junction trees for the example graph, with cliques 12, 267, 236, 3456 and separators 2, 26, 36; clique 12 can be joined through separator 2 to either 267 or 236]

Adding edges? (Giudici & Green)
Adding an edge (a,b) maintains decomposability if and only if either:
• a and b are in different connected components, or
• there exist sets R and T such that a∪R and b∪T are cliques and R∩T is a separator on the path in the junction tree between them
[Figure: example graph on vertices 1 to 7]

You can add edge (1,7) since 1∪R and 7∪T are cliques (with R={2} and T={2,6}) and R∩T={2} is a separator on the path between them
[Figure: example graph on vertices 1 to 7 and its junction tree, with cliques 12, 267, 236, 3456 and separators 2, 26, 36]

You cannot add edge (1,4) since the only cliques containing 1 and 4 resp. are {1,2} and {3,4,5,6}, and {2}∩{3,5,6}=∅ is not a separator on the path between them
[Figure: example graph on vertices 1 to 7 and its junction tree, with cliques 12, 267, 236, 3456 and separators 2, 26, 36]
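The addition test from Giudici & Green can be sketched as a search over pairs of cliques. The code below is an illustrative reading of the stated condition, not the authors' implementation. It assumes the junction tree is supplied as a list of clique pairs, that a and b are not already joined, and that two cliques with no connecting path lie in different connected components. The names `tree_path`, `can_add_edge` and `TREE` are hypothetical.

```python
def tree_path(tree_adj, start, goal):
    """Return the path of tree nodes from start to goal, or None if unreachable."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        for nxt in tree_adj[node]:
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    return None


def can_add_edge(cliques, tree_edges, a, b):
    """True if adding edge (a, b) keeps the graph decomposable."""
    if any(a in c and b in c for c in cliques):
        raise ValueError("a and b are already adjacent")
    tree_adj = {c: set() for c in cliques}
    for c, d in tree_edges:
        tree_adj[c].add(d)
        tree_adj[d].add(c)
    for ca in (c for c in cliques if a in c):
        for cb in (c for c in cliques if b in c):
            path = tree_path(tree_adj, ca, cb)
            if path is None:
                return True  # assumed: no path means different components
            common = (ca - {a}) & (cb - {b})  # this is R ∩ T
            separators = [x & y for x, y in zip(path, path[1:])]
            if common in separators:
                return True
    return False


C12, C267 = frozenset({1, 2}), frozenset({2, 6, 7})
C236, C3456 = frozenset({2, 3, 6}), frozenset({3, 4, 5, 6})
CLIQUES = [C12, C267, C236, C3456]
TREE = [(C12, C267), (C267, C236), (C236, C3456)]  # one assumed layout

if __name__ == "__main__":
    print(can_add_edge(CLIQUES, TREE, 1, 7))  # True, as in the slide example
    print(can_add_edge(CLIQUES, TREE, 1, 4))  # False, as in the slide example
```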

Proof (in connected case)
First suppose that there are no such sets R and T. We have to show that adding edge (a,b) makes the graph non-decomposable. Let a∪R and b∪T be the cliques containing a and b that have the shortest connecting path in the junction tree: by assumption, R∩T is not a separator (it may be empty), so all separators on the path are proper supersets of R∩T. So there is a shortest path in the original graph: a, r, v1, ..., vk, t, b with k≥0, r∈R\T, t∈T\R and all v’s ∉R∩T. Joining (a,b) will make a chordless (k+4)-cycle, making the graph non-decomposable.

Proof (in connected case)
Conversely, suppose such sets R and T do exist. We can suppose a∪R and b∪T are adjacent in the junction tree (otherwise it is quite easy to show that the junction tree can be manipulated until this is true). Let S=R∩T, P=R\T and Q=T\R. There are 4 cases according to whether P and Q are empty or not.
Both P and Q empty: aS -[S]- bS (separator in brackets) becomes the single clique abS (it is easy to see that you still have a tree & that running intersection property is maintained)

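[Figure: junction tree before and after adding (a,b) in the remaining cases, separators in brackets]
Only Q empty: aSP -[S]- bS becomes aSP -[aS]- abS
Only P empty: aS -[S]- bSQ becomes abS -[bS]- bSQ
Neither P nor Q empty: aSP -[S]- bSQ becomes aSP -[aS]- abS -[bS]- bSQ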

Once the test is complete, actually committing to adding or deleting the edge is little work
It makes only a (relatively) local change to the junction tree
[Figure: example graph on vertices 1 to 7 and its junction tree at successive steps. Initially: cliques 12, 267, 236, 3456; separators 2, 26, 36. After adding edge (1,7): cliques 127, 267, 236, 3456; separators 27, 26, 36. After then deleting edge (4,6): cliques 127, 267, 236, 356, 345; separators 27, 26, 36, 35]

The End
