Reverse engineering gene regulatory networks. Dirk Husmeier, Adriano Werhli, Marco Grzegorczyk. Systems biology: learning signalling pathways and regulatory networks from postgenomic data.
Reverse engineering gene regulatory networks
Dirk Husmeier, Adriano Werhli, Marco Grzegorczyk
Systems biology
Learning signalling pathways and regulatory networks from postgenomic data
[Figure: flow diagram with the labels "unknown", "high-throughput experiments", "postgenomic data", "machine learning" and "statistical methods"]
[Figure: extracted network compared with the true network]
Does the extracted network provide a good prediction of the true interactions?
Reverse Engineering of Regulatory Networks
• Can we learn the network structure from the postgenomic data themselves?
• Statistical methods to distinguish between
  • Direct interactions
  • Indirect interactions
• Challenge: Distinguish between
  • Correlations
  • Causal interactions
• Breaking symmetries with active interventions:
  • Gene knockouts (VIGS, RNAi)
[Figure: four network motifs: direct interaction, common regulator, indirect interaction, co-regulation]
• Relevance networks
• Graphical Gaussian models
• Bayesian networks
Relevance networks (Butte and Kohane, 2000)
1. Choose a measure of association A(.,.)
2. Define a threshold value t_A
3. For all pairs of domain variables (X,Y), compute their association A(X,Y)
4. Connect those variables (X,Y) by an undirected edge whose association A(X,Y) exceeds the predefined threshold value t_A
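The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the absolute Pearson correlation is assumed as the association measure A, and the function name `relevance_network`, the threshold of 0.5 and the toy data are all hypothetical.

```python
import numpy as np

def relevance_network(data, threshold):
    """Return undirected edges (i, j), i < j, whose association exceeds the threshold.

    data: array of shape (n_observations, n_variables).
    Association measure (an assumption): absolute Pearson correlation.
    """
    assoc = np.abs(np.corrcoef(data, rowvar=False))   # step 3: all pairwise associations
    n_vars = assoc.shape[0]
    return [(i, j)
            for i in range(n_vars)
            for j in range(i + 1, n_vars)
            if assoc[i, j] > threshold]               # step 4: keep pairs above t_A

# Toy data (hypothetical): variable 0 drives variable 1, variable 2 is unrelated noise.
rng = np.random.default_rng(0)
x0 = rng.normal(size=500)
x1 = x0 + 0.3 * rng.normal(size=500)
x2 = rng.normal(size=500)

edges = relevance_network(np.column_stack([x0, x1, x2]), threshold=0.5)
print(edges)
```

With this toy data only the pair (0, 1) should pass the threshold. Any other symmetric association measure could be substituted in the first line of the function.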
[Figure: three configurations of nodes 1, 2 and X ('direct interaction', 'common regulator', 'indirect interaction'), with a strong correlation σ12 between nodes 1 and 2]
Relevance networks rely on pairwise associations and do not take the context of the system into consideration.
Relevance networks • Graphical Gaussian models • Bayesian networks
Graphical Gaussian Models
[Figure: direct interaction between nodes 1 and 2, strong partial correlation π12]
Partial correlation, i.e. correlation conditional on all other domain variables: Corr(X1, X2 | X3, …, Xn)
Distinguish between direct and indirect interactions
[Figure: direct interaction, common regulator, indirect interaction, co-regulation]
A and B have a low partial correlation
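To make the difference between correlation and partial correlation concrete, the sketch below uses the standard identity that partial correlations can be read off the inverse covariance (precision) matrix Ω as π_ij = −ω_ij / sqrt(ω_ii ω_jj). The function name `partial_correlations` and the toy chain X1 → X3 → X2 are hypothetical, chosen only to mimic an indirect interaction.

```python
import numpy as np

def partial_correlations(data):
    """Partial correlation matrix from the inverse of the sample covariance matrix."""
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(d, d)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

# Toy chain (hypothetical): X1 -> X3 -> X2, so X1 and X2 interact only through X3.
rng = np.random.default_rng(1)
n = 2000
x1 = rng.normal(size=n)
x3 = x1 + 0.5 * rng.normal(size=n)
x2 = x3 + 0.5 * rng.normal(size=n)
data = np.column_stack([x1, x2, x3])

corr = np.corrcoef(data, rowvar=False)
pcorr = partial_correlations(data)
print("Corr(X1, X2)      =", round(corr[0, 1], 2))    # strong
print("Corr(X1, X2 | X3) =", round(pcorr[0, 1], 2))   # close to zero
```

A relevance network would connect X1 and X2 here, whereas thresholding the partial correlation would not. Note that this direct inversion requires more observations than variables.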
Graphical Gaussian Models
Problem: #observations < #variables, so the empirical covariance matrix is singular and the partial correlations cannot be obtained by direct matrix inversion.
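The problem can be reproduced numerically. The sketch below shows that the sample covariance matrix is rank-deficient when there are fewer observations than variables, and that shrinking it towards its diagonal restores invertibility. Shrinkage is one common remedy and is not necessarily the one used by the authors; the shrinkage intensity `lam = 0.2` is an arbitrary, hypothetical value, and the data are random noise.

```python
import numpy as np

rng = np.random.default_rng(2)
n_obs, n_vars = 10, 30                      # fewer observations than variables
data = rng.normal(size=(n_obs, n_vars))

S = np.cov(data, rowvar=False)              # 30 x 30 sample covariance matrix
rank = np.linalg.matrix_rank(S)             # at most n_obs - 1, hence singular
print("rank of S:", rank, "of", n_vars)

# Shrink towards the diagonal (assumed remedy, hypothetical intensity).
lam = 0.2
S_shrunk = (1 - lam) * S + lam * np.diag(np.diag(S))
precision = np.linalg.inv(S_shrunk)         # now well defined

# Partial correlations from the regularised precision matrix.
d = np.sqrt(np.diag(precision))
pcorr = -precision / np.outer(d, d)
np.fill_diagonal(pcorr, 1.0)
```

In practice the shrinkage intensity would be chosen from the data rather than fixed by hand.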
Graphical Gaussian Models
[Figure: direct interaction, common regulator, indirect interaction]
P(A,B) = P(A)·P(B)
But: P(A,B|C) ≠ P(A|C)·P(B|C)
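The two statements above can be checked on simulated data. The sketch assumes they refer to two independent regulators A and B of a common target C (co-regulation); that reading, the linear model and all variable names are assumptions. It uses the standard first-order formula for the partial correlation of A and B given C.

```python
import numpy as np

# Toy co-regulation (hypothetical): A and B are independent, both regulate C.
rng = np.random.default_rng(6)
n = 2000
a = rng.normal(size=n)
b = rng.normal(size=n)
c = a + b + 0.5 * rng.normal(size=n)

r = np.corrcoef(np.vstack([a, b, c]))
r_ab, r_ac, r_bc = r[0, 1], r[0, 2], r[1, 2]

# First-order partial correlation of A and B given C.
r_ab_given_c = (r_ab - r_ac * r_bc) / np.sqrt((1 - r_ac**2) * (1 - r_bc**2))

print("Corr(A, B)     =", round(r_ab, 2))          # close to zero
print("Corr(A, B | C) =", round(r_ab_given_c, 2))  # clearly non-zero
```

Under this assumption a graphical Gaussian model would report an association between A and B even though neither regulates the other.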
Undirected versus directed edges
• Relevance networks and Graphical Gaussian models can only extract undirected edges.
• Bayesian networks can extract directed edges.
• But can we trust these edge directions? It may be better to learn undirected edges than to learn directed edges with false orientations.
Relevance networks • Graphical Gaussian models • Bayesian networks
Bayesian networks
• Marriage between graph theory and probability theory.
• Directed acyclic graph (DAG) representing conditional independence relations.
• It is possible to score a network in light of the data: P(D|M), D: data, M: network structure.
• We can infer how well a particular network explains the observed data.
[Figure: example DAG with nodes A to F connected by directed edges]
Bayesian networks versus causal networks
Bayesian networks represent conditional (in)dependence relations, not necessarily causal interactions.
Bayesian networks versus causal networks
[Figure: true causal graph over A, B and C, shown next to the graph over B and C when node A is unknown]
Bayesian networks versus causal networks
[Figure: alternative networks over the nodes A, B and C]
• Equivalence classes: networks with the same score P(D|M).
• Equivalent networks cannot be distinguished in light of the data.
Equivalence classes of BNs
[Figure: networks over A, C and B and their completed partially directed graphs (CPDAGs)]
• Equivalent networks: P(A,B) ≠ P(A)·P(B), P(A,B|C) = P(A|C)·P(B|C)
• v-structure: P(A,B) = P(A)·P(B), P(A,B|C) ≠ P(A|C)·P(B|C)
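Score equivalence can be illustrated numerically. The sketch below is a simplified stand-in for P(D|M): it uses the maximised log-likelihood of a linear Gaussian network rather than a marginal likelihood, which is a fair comparison here only because all four candidate structures have the same number of parameters. The helper names `node_loglik` and `dag_loglik`, the linear model and the toy data are assumptions.

```python
import numpy as np

def node_loglik(y, parents):
    """Maximised Gaussian log-likelihood of y regressed linearly on its parents."""
    n = len(y)
    X = np.column_stack([np.ones(n)] + list(parents))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / n                 # maximum-likelihood noise variance
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)

def dag_loglik(data, parent_sets):
    """Sum of node-wise terms; parent_sets maps each node to the list of its parents."""
    return sum(node_loglik(data[v], [data[p] for p in ps])
               for v, ps in parent_sets.items())

# Toy data (hypothetical) generated from the chain A -> C -> B.
rng = np.random.default_rng(3)
n = 1000
a = rng.normal(size=n)
c = a + 0.5 * rng.normal(size=n)
b = c + 0.5 * rng.normal(size=n)
data = {"A": a, "B": b, "C": c}

structures = {
    "A -> C -> B": {"A": [], "C": ["A"], "B": ["C"]},
    "A <- C <- B": {"B": [], "C": ["B"], "A": ["C"]},
    "A <- C -> B": {"C": [], "A": ["C"], "B": ["C"]},
    "A -> C <- B": {"A": [], "B": [], "C": ["A", "B"]},   # v-structure
}
scores = {name: dag_loglik(data, ps) for name, ps in structures.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

The first three structures should receive the same score up to rounding error, so the data cannot separate them, while the v-structure scores differently.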
Symmetry breaking
[Figure: equivalent networks over the nodes A, B and C]
• Interventions
• Prior knowledge
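Interventional data
[Figure: candidate networks over A and B]
A and B are correlated.
Inhibition of A leads to one of two outcomes, depending on the direction of the edge between A and B:
• down-regulation of B
• no effect on B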
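A small simulation shows how an intervention breaks the symmetry. Both candidate structures give the same observational correlation, but they respond differently when A is clamped to a low value. The two-node linear model, the function name `simulate` and the clamping value of -2.0 are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000

def simulate(direction, inhibit_a=False):
    """Sample (A, B) from a two-node linear model; optionally clamp A to a low value."""
    if direction == "A->B":
        a = np.full(n, -2.0) if inhibit_a else rng.normal(size=n)
        b = a + 0.5 * rng.normal(size=n)       # B responds to A
    else:                                      # "B->A"
        b = rng.normal(size=n)                 # B is upstream and unaffected by A
        a = np.full(n, -2.0) if inhibit_a else b + 0.5 * rng.normal(size=n)
    return a, b

results = {}
for direction in ("A->B", "B->A"):
    a_obs, b_obs = simulate(direction)
    _, b_int = simulate(direction, inhibit_a=True)
    results[direction] = {
        "observational_corr": np.corrcoef(a_obs, b_obs)[0, 1],
        "shift_in_B": b_int.mean() - b_obs.mean(),
    }
    print(direction, results[direction])
```

Under A->B the mean of B drops after inhibition of A; under B->A it stays where it was, even though the observational correlations are practically the same.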
Learning Bayesian networks from data
P(M|D) = P(D|M) P(M) / Z
M: network structure. D: data.
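The slide does not define Z. Under the usual reading of Bayes' rule it is the normalisation constant obtained by summing over all candidate network structures, which can be written as:

```latex
P(M \mid D) = \frac{P(D \mid M)\, P(M)}{Z},
\qquad
Z = P(D) = \sum_{M'} P(D \mid M')\, P(M')
```

Because the number of directed acyclic graphs grows super-exponentially with the number of nodes, this sum can be evaluated exactly only for very small networks.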
Evaluation
• On real experimental data, using the gold-standard network from the literature
• On synthetic data simulated from the gold-standard network
Evaluation: Raf signalling pathway
• Cellular signalling network of 11 phosphorylated proteins and phospholipids in human immune system cells
• Deregulation → carcinogenesis
• Extensively studied in the literature → gold-standard network
Raf regulatory network
[Figure: Raf regulatory network, from Sachs et al., Science 2005]
Flow cytometry data
• Intracellular multicolour flow cytometry experiments: concentrations of 11 proteins
• 5400 cells have been measured under 9 different cellular conditions (cues)
• Downsampling to 100 instances (5 separate subsets): indicative of microarray experiments
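The downsampling step might look as follows. This is a sketch under stated assumptions: the array `measurements` is random placeholder data standing in for the real flow cytometry measurements, and the slide does not say how the five subsets were drawn, so each subset is simply sampled here without replacement and independently of the others.

```python
import numpy as np

rng = np.random.default_rng(5)

# Placeholder data (NOT the real measurements): 5400 cells x 11 proteins.
n_cells, n_proteins = 5400, 11
measurements = rng.lognormal(size=(n_cells, n_proteins))

# Draw 5 subsets of 100 cells each (sampling scheme is an assumption).
n_subsets, subset_size = 5, 100
subsets = [
    measurements[rng.choice(n_cells, size=subset_size, replace=False)]
    for _ in range(n_subsets)
]

for k, subset in enumerate(subsets):
    print(f"subset {k}: shape {subset.shape}")
```

Each subset can then be passed separately to a network reconstruction method, which gives five repeated evaluations at a sample size typical of microarray studies.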