## Problem

**Problem**

- Limited number of experimental replications.
- Postgenomic data are intrinsically noisy.
- Poor network reconstruction.
- Can we improve the network reconstruction by systematically integrating different sources of biological prior knowledge?
- Which sources of prior knowledge are reliable?
- How do we trade off the different sources of prior knowledge against each other and against the data?

**Overview of the talk**

- Revision: Bayesian networks
- Integration of prior knowledge
- Empirical evaluation

**Bayesian networks**

- Marriage between graph theory and probability theory.
- Directed acyclic graph (DAG) representing conditional independence relations.
- It is possible to score a network in light of the data: P(D|M), where D is the data and M is the network structure.
- We can infer how well a particular network explains the observed data.

[Figure: example network with nodes A to F connected by edges.]

**Bayesian networks versus causal networks**

- Bayesian networks represent conditional (in)dependence relations, not necessarily causal interactions.
- Equivalence classes: networks with the same score P(D|M).
- Equivalent networks cannot be distinguished in light of the data.

[Figure: true causal graph over nodes A, B, and C, and the case in which node A is unknown.]

[Figure: equivalent networks over nodes A, B, and C.]

**Symmetry breaking**

- P(M|D) = P(D|M) P(M) / Z, where D is the data and M is the network structure.
- P(M): prior knowledge, here that B is a transcription factor with binding sites in the upstream regions of A and C.

[Figure: equivalent networks over nodes A, B, and C; prior knowledge selects one of them.]

**Learning Bayesian networks**

- P(M|D) = P(D|M) P(M) / Z
- M: network structure.
- D: data.

**Biological prior knowledge matrix**

- Each entry indicates some knowledge about the relationship between genes i and j.
- Define the energy of a graph G.

**Notation**

- Prior knowledge matrix: P → B (for “belief”)
- Network structure: G (for “graph”) or M (for “model”)
- P: probabilities

**Energy of a network**

- Prior distribution over networks.

**Sample networks and hyperparameters from the posterior distribution**

- Capture intrinsic inference uncertainty.
- Learn the trade-off parameters automatically.
- P(M|D) = P(D|M) P(M) / Z

**Energy of a network**

- Rewriting the energy.

**Approximation of the partition function**

- Partition function of a perfect gas.

**Sample networks and hyperparameters from the posterior distribution**

- Proposal probabilities.
- Metropolis-Hastings scheme.

**Bayesian networks with biological prior knowledge**

- Biological prior knowledge: information about the interactions between the nodes.
- We use two distinct sources of biological prior knowledge.
- Each source of biological prior knowledge is associated with its own trade-off parameter: β1 and β2.
- A trade-off parameter indicates how much of the biological prior information is used.
- The trade-off parameters are inferred. They are not set by the user!

**Bayesian networks with two sources of prior**

[Figure: the data, Source 1, and Source 2 feed into BNs + MCMC, which returns the recovered networks and the trade-off parameters β1 and β2.]

**Evaluation**

- Can the method automatically evaluate how useful the different sources of prior knowledge are?
- Do we get an improvement in the regulatory network reconstruction?
- Is this improvement optimal?

**Raf regulatory network**

[Figure: Raf regulatory network, from Sachs et al., Science (2005).]

**Evaluation: Raf signalling pathway**

- Cellular signalling network of 11 phosphorylated proteins and phospholipids in human immune system cells.
- Deregulation leads to carcinogenesis.
- Extensively studied in the literature, which provides a gold-standard network.

**Flow cytometry data**

- Intracellular multicolour flow cytometry experiments: concentrations of 11 proteins.
- 5400 cells have been measured under 9 different cellular conditions (cues).
- Downsampling to 100 instances (5 separate subsets): indicative of microarray experiments.

**Microarray example**

- Spellman et al. (1998): cell cycle, 73 samples.
- Tu et al. (2005): metabolic cycle, 36 samples.

[Figure: two expression panels, each with axes "time" and "Genes".]
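The slides on the energy of a network, the prior distribution over networks, and the approximation of the partition function lost their formulas in this transcript. The sketch below illustrates the general idea under explicitly assumed forms that are not taken from the slides: the energy is assumed to be E(G) = Σ |B_ij − G_ij| over off-diagonal entries, the prior is assumed to be the Gibbs distribution P(G | β) = exp(−β E(G)) / Z(β), and the partition function is approximated by treating every edge as independent (ignoring the acyclicity constraint), in the spirit of the non-interacting "perfect gas" analogy. All function names and the example matrix are hypothetical.

```python
import itertools
import math


def energy(B, G):
    """Assumed energy: sum of |B_ij - G_ij| over off-diagonal entries.

    B holds prior beliefs in [0, 1]; G is a 0/1 adjacency matrix.
    """
    n = len(B)
    return sum(abs(B[i][j] - G[i][j])
               for i in range(n) for j in range(n) if i != j)


def log_partition_factorized(B, beta):
    """log Z(beta) when each edge is treated as independent.

    Acyclicity is ignored, so this sums over all directed graphs
    without self-loops and therefore over-counts relative to DAGs.
    """
    n = len(B)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                absent = math.exp(-beta * B[i][j])            # G_ij = 0
                present = math.exp(-beta * (1.0 - B[i][j]))   # G_ij = 1
                total += math.log(absent + present)
    return total


def log_partition_brute_force(B, beta):
    """Same quantity by explicit enumeration (feasible only for tiny n)."""
    n = len(B)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    z = 0.0
    for bits in itertools.product((0, 1), repeat=len(pairs)):
        G = [[0] * n for _ in range(n)]
        for (i, j), bit in zip(pairs, bits):
            G[i][j] = bit
        z += math.exp(-beta * energy(B, G))
    return math.log(z)


def log_prior(B, G, beta):
    """Assumed Gibbs prior: log P(G | beta) = -beta * E(G) - log Z(beta)."""
    return -beta * energy(B, G) - log_partition_factorized(B, beta)


# Hypothetical example with nodes ordered A, B, C. The prior knowledge
# "B is a transcription factor with binding sites upstream of A and C"
# is encoded as high belief (0.9) in B->A and B->C; 0.5 means no knowledge.
B = [[0.5, 0.5, 0.5],
     [0.9, 0.5, 0.9],
     [0.5, 0.5, 0.5]]
G_consistent = [[0, 0, 0], [1, 0, 1], [0, 0, 0]]  # B->A, B->C
G_reversed = [[0, 1, 0], [0, 0, 0], [0, 1, 0]]    # A->B, C->B

for beta in (0.0, 2.0):
    print(beta,
          log_prior(B, G_consistent, beta),
          log_prior(B, G_reversed, beta))
```

With β = 0 the two graphs receive the same prior probability, so the prior knowledge is ignored; as β grows, the graph that agrees with B is increasingly favoured. This is the sense in which a trade-off parameter controls how much prior information is used.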