Probabilistic graphical models and regulatory networks
BMI/CS 576, www.biostat.wisc.edu/bmi576.html
Sushmita Roy (sroy@biostat.wisc.edu)
Nov 27th, 2012
Two main questions in regulatory networks
[figure: regulators Hot1 and Sko1 (X1, X2) controlling the target gene HSP12 (Y), with candidate functions ψ(X1, X2): Boolean, linear, differential equations, probabilistic, ...]
• Structure: who are the regulators? (e.g., HSP12 is a target of Hot1, i.e., Hot1 regulates HSP12)
• Function: how do they determine expression levels?
Graphical models for representing regulatory networks
• Bayesian networks
• Dependency networks
• random variables encode expression levels
• edges correspond to some form of statistical dependencies
[figure: regulators Sho1 and Msb2 (X1, X2) with target Ste20 (Y3); structure is the graph, function is Y3 = f(X1, X2)]
Bayesian networks: estimate a set of conditional probability distributions
[figure: unknown regulators (parents) with edges into the target Yi (child)]
• Function: the conditional probability distribution (CPD) of the target given its regulators
Dependency networks: a set of regression problems
• Function: linear regression with a regularization term
• each target Yi is modeled as a linear combination of its regulators, Yi ≈ Σj bj Xj (in matrix form, a d × p regulator matrix times a p × 1 coefficient vector; d: number of genes, p: number of regulators), fit with a regularization term; a sketch of one such regression follows
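A minimal sketch of one such regularized regression (the data and names are illustrative, not from the slides): fit an L1-penalized linear model for a single target gene, so the nonzero coefficients bj pick out its inferred regulators.

```python
# Dependency-network idea: for each target gene, fit a regularized linear
# regression of its expression on the candidate regulators.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
d, p = 100, 5                    # d measurements, p candidate regulators
X = rng.normal(size=(d, p))      # regulator expression levels
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + 0.1 * rng.normal(size=d)  # one target

# The L1 penalty (the "regularization term") drives most coefficients to
# zero, so the nonzero b_j identify the inferred regulators of this target.
model = Lasso(alpha=0.1).fit(X, y)
print(model.coef_)               # nonzero entries ~ regulators 0 and 2
```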
Bayesian networks
• a BN is a directed acyclic graph (DAG) in which
  • the nodes denote random variables
  • each node X has a conditional probability distribution (CPD) representing P(X | Parents(X))
• a type of probabilistic graphical model (PGM), specified by
  • a graph
  • parameters (for the probability distributions)
• the intuitive meaning of an arc from X to Y is that X directly influences Y
• provides a tractable way to work with large joint distributions, via the factorization P(X1, ..., Xn) = Πi P(Xi | Parents(Xi))
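As a toy illustration of that factorization (the variables and numbers are made up), the joint over two binary variables is just the product of the CPDs:

```python
# P(X1, X2) = P(X1) * P(X2 | X1) for binary variables, with CPDs stored
# as dicts keyed by the parents' values.
p_x1 = {(): 0.3}                   # P(X1=1)
p_x2 = {(0,): 0.8, (1,): 0.2}      # P(X2=1 | X1)

def bernoulli(p_one, value):
    return p_one if value == 1 else 1.0 - p_one

def joint(x1, x2):
    # P(X1=x1, X2=x2) = P(X1=x1) * P(X2=x2 | X1=x1)
    return bernoulli(p_x1[()], x1) * bernoulli(p_x2[(x1,)], x2)

# sanity check: the factorized joint sums to 1
assert abs(sum(joint(a, b) for a in (0, 1) for b in (0, 1)) - 1.0) < 1e-12
```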
Example Bayesian network
[figure: a five-node DAG over binary variables X1, ..., X5, with parents and children labeled]
• assume each Xi is binary
• with no independence assertions, representing the joint distribution needs 2^5 measurements
• with the network's independence assertions, it needs 2^3 measurements
Representing CPDs for discrete variables
• CPDs can be represented using tables or trees
• consider the following case with Boolean variables A, B, C, D
P(D | A, B, C) as a table: one row per configuration of (A, B, C), each giving Pr(D = t)
P(D | A, B, C) as a tree:
  A = f: Pr(D = t) = 0.9
  A = t: test B
    B = f: Pr(D = t) = 0.5
    B = t: test C
      C = f: Pr(D = t) = 0.8
      C = t: Pr(D = t) = 0.5
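A small sketch of the two representations (the tree's branch orientation follows the slide as extracted and may differ from the original figure):

```python
# Tree representation: collapses parent configurations that share a
# probability, so it can be much smaller than the full table.
def cpd_tree(a, b, c):
    if not a:
        return 0.9
    if not b:
        return 0.5
    return 0.8 if not c else 0.5

# Table representation: 2^3 = 8 rows, one per (A, B, C) configuration.
cpd_table = {(a, b, c): cpd_tree(a, b, c)
             for a in (False, True)
             for b in (False, True)
             for c in (False, True)}
print(cpd_table)
```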
Representing CPDs for continuous variables
• a conditional Gaussian CPD: the child is normally distributed with a mean that is a linear function of its parents, e.g. P(X3 | X1, X2) = N(b0 + b1 X1 + b2 X2, σ²)
[figure: parents X1, X2 with child X3; parameters b0, b1, b2, σ²]
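A minimal sketch of evaluating such a CPD, assuming the standard linear-Gaussian parameterization above (the coefficient values here are illustrative):

```python
# Conditional (linear) Gaussian CPD: the child's density given its parents.
from scipy.stats import norm

def cg_density(x3, x1, x2, b0=0.1, b1=0.5, b2=-0.3, sigma=1.0):
    # mean of X3 is a linear function of the parents X1, X2
    mean = b0 + b1 * x1 + b2 * x2
    return norm.pdf(x3, loc=mean, scale=sigma)

print(cg_density(0.0, 1.0, 2.0))
```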
The learning problems
• Parameter learning: given a network structure, learn the parameters from data
• Structure learning: given data, learn the structure (and parameters)
  • subsumes parameter learning
Structure learning
• Maximum likelihood framework: score a candidate graph G by the data likelihood under the best-fit parameters, Score(G) = log P(D | G, θ̂_G)
• Bayesian framework: score by the posterior, Score(G) = log P(D | G) + log P(G)
The structure learning task
• structure learning methods have two main components
  • a scheme for scoring a given BN structure
  • a search procedure for exploring the space of structures
Structure learning using score-based search
[figure: candidate graph structures, each scored (e.g., by maximum likelihood); the search returns the best graph]
Decomposability of scores
• the score decomposes over variables: S(G) = Σi s(Xi, Parents(Xi))
• thus we can independently optimize each term s(Xi, Parents(Xi))
• it all boils down to accurately estimating the conditional probability distributions (a sketch follows below)
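A hedged sketch of one per-variable term: the maximum-likelihood family score for discrete data, computed purely from counts of (parent configuration, child value). Summing it over all variables gives the decomposable score S(G).

```python
# Family log-likelihood s(Xi, Pa(Xi)) for discrete data, from counts.
import numpy as np
from collections import Counter

def family_loglik(child_col, parent_cols, data):
    # data: 2D int array, one row per sample
    counts = Counter()          # counts of (parent config, child value)
    parent_counts = Counter()   # counts of parent configs alone
    for row in data:
        pa = tuple(row[j] for j in parent_cols)
        counts[(pa, row[child_col])] += 1
        parent_counts[pa] += 1
    # ML log-likelihood: sum over cells of n * log(n / n_parent)
    return sum(n * np.log(n / parent_counts[pa])
               for (pa, x), n in counts.items())

data = np.array([[0, 0], [0, 1], [1, 1], [1, 1]])
print(family_loglik(1, [0], data))   # s(X2, {X1})
```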
Structure search operators
[figure: three variants of a network over A, B, C, D, before and after an operator]
• given the current network at some stage of the search, we can
  • add an edge
  • delete an edge
• check for cycles after each change
Bayesian network search: hill-climbing
given: data set D, initial network B0

i = 0
Bbest ← B0
while stopping criteria not met {
    for each possible operator application a {
        Bnew ← apply(a, Bi)
        if score(Bnew) > score(Bbest)
            Bbest ← Bnew
    }
    ++i
    Bi ← Bbest
}
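A runnable sketch of this loop (a stand-in, not the course's code): graphs are frozensets of directed edges over n variables, the operators are add-edge/delete-edge with a cycle check, and the score is a toy function in place of a data-based score.

```python
import itertools

def has_cycle(edges, n):
    # Kahn-style check: repeatedly strip nodes with no incoming edges.
    remaining, es = set(range(n)), set(edges)
    while True:
        sources = {v for v in remaining if not any(e[1] == v for e in es)}
        if not sources:
            return bool(remaining)   # leftovers mean a cycle
        remaining -= sources
        es = {e for e in es if e[0] in remaining}

def neighbors(edges, n):
    # apply each add-edge / delete-edge operator that keeps the graph a DAG
    for e in itertools.permutations(range(n), 2):
        new = edges | {e} if e not in edges else edges - {e}
        if not has_cycle(new, n):
            yield frozenset(new)

def hill_climb(score, n, max_iters=100):
    best = frozenset()               # B0: the empty graph
    for _ in range(max_iters):       # stop when no operator improves
        improved = max(neighbors(best, n), key=score)
        if score(improved) <= score(best):
            return best
        best = improved
    return best

# Toy score that just rewards one edge; a real score would be computed
# from data (e.g., a decomposable likelihood-based score).
print(hill_climb(lambda g: 1.0 if (0, 1) in g else 0.0, n=3))
```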
Learning networks from expression data is difficult because there are too few measurements
• reduce the candidate parents
  • Sparse Candidate
  • prior knowledge
  • MinREG
• reduce the target set
  • module networks
Bayesian network search: the Sparse Candidate algorithm [Friedman et al., UAI 1999]
given: data set D, initial network B0, parameter k
• the algorithm iterates two steps until convergence: a restrict step, which selects at most k candidate parents for each variable, and a maximize step, which searches for a high-scoring network in which each variable's parents come from its candidate set
The restrict step in Sparse Candidate
• to identify candidate parents in the first iteration, we can compute the mutual information between pairs of variables:
  I(X; Y) = Σx,y P̂(x, y) log [ P̂(x, y) / ( P̂(x) P̂(y) ) ]
• where P̂ denotes the probabilities estimated from the data set (a sketch of this computation follows)
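A small sketch of that computation: estimate the empirical joint P̂(x, y) from a discrete data set, then apply the formula above.

```python
import numpy as np

def mutual_information(x, y):
    # empirical joint \hat{P}(x, y) from co-occurrence frequencies
    xs, ys = np.unique(x), np.unique(y)
    joint = np.array([[np.mean((x == a) & (y == b)) for b in ys] for a in xs])
    px = joint.sum(axis=1, keepdims=True)   # \hat{P}(x)
    py = joint.sum(axis=0, keepdims=True)   # \hat{P}(y)
    nz = joint > 0                          # skip zero cells (0 log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px * py)[nz])))

x = np.array([0, 0, 1, 1, 0, 1])
y = np.array([0, 0, 1, 1, 1, 1])
print(mutual_information(x, y))   # higher when X and Y co-vary
```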
The restrict step in Sparse Candidate
[figure: the true network structure over A, B, C, D]
• suppose the true network structure is as shown, and we're selecting two candidate parents for A, with I(A; C) > I(A; D) > I(A; B)
• the candidate parents for A would then be C and D; how could we get B as a candidate parent on the next iteration?
The restrict step in Sparse Candidate
• Kullback-Leibler (KL) divergence provides a distance measure between two distributions, P and Q:
  D_KL(P || Q) = Σx P(x) log [ P(x) / Q(x) ]
• mutual information can be thought of as the KL divergence between P̂(X, Y) and the product P̂(X) P̂(Y), which assumes X and Y are independent
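A direct transcription of these two definitions (natural logs assumed; the example distribution is made up):

```python
import numpy as np

def kl(p, q):
    # D_KL(P || Q) over the cells where P puts mass
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# MI as the KL divergence between the joint and the product of marginals
pxy = np.array([[0.4, 0.1], [0.1, 0.4]])   # empirical P(X, Y)
px, py = pxy.sum(1), pxy.sum(0)
print(kl(pxy.ravel(), np.outer(px, py).ravel()))
```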
The restrict step in Sparse Candidate
[figure: the true distribution versus the current Bayes net over A, B, C, D]
• we can use KL divergence to assess the discrepancy between the network's estimate Pnet(X, Y) and the empirical estimate P̂(X, Y)
• how might we calculate Pnet(A, B)? (one sketch follows below)
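One hedged answer to the question above: sum the network's joint distribution over all variables other than A and B. The toy network below (structure and numbers are illustrative, not from the slides) marginalizes over a shared parent C:

```python
from itertools import product

p_c = {(): 0.5}                    # P(C=1)
p_a = {(0,): 0.2, (1,): 0.7}       # P(A=1 | C)
p_b = {(0,): 0.4, (1,): 0.9}       # P(B=1 | C)

def bern(p1, v):
    return p1 if v else 1.0 - p1

def p_net_ab(a, b):
    # marginalize the joint P(A, B, C) = P(C) P(A|C) P(B|C) over C
    return sum(bern(p_c[()], c) * bern(p_a[(c,)], a) * bern(p_b[(c,)], b)
               for c in (0, 1))

print({(a, b): round(p_net_ab(a, b), 3)
       for a, b in product((0, 1), repeat=2)})
```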
The restrict step in Sparse Candidate
• Pa(Xi): the current parents of Xi
• when selecting the candidate set for Xi, always keep Pa(Xi) in it; this is important to ensure monotonic improvement of the score
The maximize step in Sparse Candidate
• hill-climbing search with add-edge, delete-edge, and reverse-edge operators
• test to ensure that cycles aren't introduced into the graph
Efficiency of Sparse Candidate
• n = number of variables
• limiting each variable to at most k candidate parents means each search step weighs O(kn) possible edge changes, rather than the O(n²) faced by unrestricted search