Tatsuya Akutsu Bioinformatics Center Institute for Chemical Research Kyoto University

Kyushu University Mathematics Intensive Lecture: Comparison, Analysis, and Control of Biological Networks (4) Analysis and Control of Boolean Networks

Contents
• Boolean Network
• Attractor Detection/Enumeration
• Algorithms for Singleton Attractor Detection/Enumeration
• Control of Boolean Networks
• Integer Linear Programming-based Approach

Boolean Network

Boolean Network
• Mathematical model of genetic networks
  • Node ⇔ gene
  • State of a node: 1 (active) / 0 (inactive)
• Regulation rules
  • Boolean functions (AND, OR, NOT, …)
  • Edge from y to x ⇔ y directly controls x
• Synchronous update
• Almost the same as digital circuits (with clocks)
[Kauffman, The Origins of Order, 1993]

Example of Boolean Network
Boolean Network (nodes A, B, C):
  A' = B
  B' = A and C
  C' = not A
State Transition Table:
  INPUT (time t)    OUTPUT (time t+1)
  A  B  C           A' B' C'
  0  0  0           0  0  1
  0  0  1           0  0  1
  0  1  0           1  0  1
  0  1  1           1  0  1
  1  0  0           0  0  0
  1  0  1           0  1  0
  1  1  0           1  0  0
  1  1  1           1  1  0
Example of state transition: 111 ⇒ 110 ⇒ 100 ⇒ 000 ⇒ 001 ⇒ 001 ⇒ 001 ⇒ …
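The synchronous update of this example can be sketched in a few lines of Python. The names `step` and `trajectory` are illustrative and do not come from the slides; the three rules are the ones given above.

```python
from typing import List, Tuple

State = Tuple[int, int, int]  # (A, B, C)


def step(state: State) -> State:
    """One synchronous update: A' = B, B' = A and C, C' = not A."""
    a, b, c = state
    return (b, a & c, 1 - a)


def trajectory(start: State, steps: int) -> List[State]:
    """Return the start state followed by `steps` successive updates."""
    states = [start]
    for _ in range(steps):
        states.append(step(states[-1]))
    return states
```

Calling `trajectory((1, 1, 1), 6)` reproduces the example transition 111 ⇒ 110 ⇒ 100 ⇒ 000 ⇒ 001 ⇒ 001 ⇒ 001.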

Why Boolean Networks?
• Criticism: BN is too simplified
• Without simplification, theoretical analysis, inference, and control are difficult
  • though complex models can be used for simulation
• May be useful for qualitative analyses
• One of the simplest non-linear models
  • Negative results on BN suggest negative results on more general (non-linear) models
• Almost the same as digital circuits
  • Theories and techniques in computer science can be utilized

Our focus: Time Complexity
• Many problems for BN are NP-hard
  • NP-hard means that there is no polynomial-time algorithm (unless P=NP)
  • Naïve methods take O(2^n) time or more
• But we want to do much better
  • because we could then solve cases of
    • n=300 with an O(1.1^n)-time algorithm
    • n=600 with an O(1.05^n)-time algorithm
• Important for coping with large-scale networks
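To see why a smaller base matters, one can ask which problem size a naïve 2^n method could handle with the same number of steps. The helper below is a small sketch of that arithmetic; `equivalent_naive_size` is a hypothetical name, and constant and polynomial factors are ignored.

```python
import math


def equivalent_naive_size(base: float, n: int) -> float:
    """Return n' such that 2**n' equals base**n (same step count)."""
    return n * math.log2(base)
```

Under this rough measure, `equivalent_naive_size(1.1, 300)` and `equivalent_naive_size(1.05, 600)` both come out a little above 40, so an O(1.1^n) method at n=300 costs about as much as the naïve method at n≈41.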

Attractor Detection

Attractor (1)
• Steady state
• Different attractors ⇔ different cell types
• Examples
  • 011 ⇒ 101 ⇒ 010 ⇒ 101 ⇒ 010 ⇒ …
  • 111 ⇒ 110 ⇒ 100 ⇒ 000 ⇒ 001 ⇒ 001 ⇒ 001 ⇒ …
State Transition Table:
  INPUT (time t)    OUTPUT (time t+1)
  A  B  C           A' B' C'
  0  0  0           0  0  1
  0  0  1           0  0  1
  0  1  0           1  0  1
  0  1  1           1  0  1
  1  0  0           0  0  0
  1  0  1           0  1  0
  1  1  0           1  0  0
  1  1  1           1  1  0

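Attractor (2)
[Figure: state transition diagram over the eight states 000, 001, 010, 011, 100, 101, 110, 111, drawn from the same state transition table as in Attractor (1)]
• 111 ⇒ 110 ⇒ 100 ⇒ 000 ⇒ 001, and 001 ⇒ 001 (attractor of period 1)
• 011 ⇒ 101, and 101 ⇒ 010 ⇒ 101 (attractor of period 2)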
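For a network this small, all attractors can be found by following every one of the 2^n states until a state repeats. The sketch below does exactly that for the three-node example; `find_attractors` is an illustrative name, and this exhaustive approach is only feasible for small n.

```python
from itertools import product


def step(state):
    """Example network: A' = B, B' = A and C, C' = not A."""
    a, b, c = state
    return (b, a & c, 1 - a)


def find_attractors(step_fn, n):
    """Follow each of the 2**n states until a repeat; collect the cycles."""
    attractors = set()
    for start in product((0, 1), repeat=n):
        seen = {}   # state -> position on the current path
        path = []
        s = start
        while s not in seen:
            seen[s] = len(path)
            path.append(s)
            s = step_fn(s)
        cycle = path[seen[s]:]
        # Rotate so each cycle starts at its smallest state (canonical form).
        k = cycle.index(min(cycle))
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors
```

For the example network, `find_attractors(step, 3)` yields the point attractor 001 and the period-2 attractor 010 ⇔ 101.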

N-K Model (Kauffman Network)
• N: number of nodes (we use n instead of N)
• K: indegree
  • Indegree = the number of input edges = the number of genes directly affecting node v
  • Each node has (maximum or average) indegree K
• The Boolean function assigned to each node is selected at random
[Figure: a node v with indegree 2 and a node v with indegree 3]
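A random network of this kind can be generated as follows. This is a minimal sketch that assumes every node has exactly K distinct inputs chosen uniformly at random and a uniformly random truth table; the slide allows K to be a maximum or an average, so other variants are possible. The function names are illustrative.

```python
import random


def random_nk_network(n, k, seed=None):
    """Return (inputs, tables): k distinct input nodes and a random
    truth table of length 2**k for each of the n nodes."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables


def step(state, inputs, tables):
    """Synchronous update: look up each node's next value in its table."""
    nxt = []
    for ins, table in zip(inputs, tables):
        idx = 0
        for j in ins:  # input values form a binary index into the table
            idx = (idx << 1) | state[j]
        nxt.append(table[idx])
    return tuple(nxt)
```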

Distribution of Attractors in the N-K Model
• Classical conjecture
  • The number of attractors is O(√n) (for K=2)
• Recent results suggest that this conjecture may not be true
  • Superpolynomial growth (> n^γ for any γ) of the number of attractors (Samuelsson & Troein, PRL, 2003)
  • Superpolynomial growth of the average size of attractors (Drossel et al., PRL, 2005)
• No conclusive result is known

Singleton Attractor (or Point Attractor)
• Biological interpretation of attractors
  • Different attractors ⇔ different cell types
• Point attractor
  • Attractor with period 1
  • Corresponds to a steady state
  • Definition: a state v satisfying v(t+1) = v(t)
• Attractor Detection
  • Input: a Boolean network
  • Output: a point attractor (if any)

Attractor Detection: Previous Works
• Around O(2^n) time is enough since there are 2^n global states
  • But this cannot be applied to large n
• Several heuristics are known, but with no theoretical guarantee [Irons, Physica D, 2006], [Devloo et al., Bull. Math. Biol. 2003], …
• Detection of a singleton attractor is NP-hard [Akutsu et al., GIW 1998]
• We developed algorithms with average-case theoretical bounds [Zhang et al., EURASIP JBSB 2007]
• We also developed algorithms faster than O(2^n) time for AND-OR BNs [Tamura & Akutsu, FCT07, Trans. IEICE 2009] [Tamura & Akutsu, AB08, Math. in CS 2009] [Melkman, Tamura & Akutsu, 2010]

Algorithms for Singleton Attractor Detection/Enumeration

Singleton Attractor (= Attractor with Period 1)
[Figure: state transition diagram with two states labeled "attractor"]

Indegree
• Indegree = the number of input edges = the number of genes directly affecting node v
• We use K to denote the maximum indegree
[Figure: a node v with indegree 2 and a node v with indegree 3]

Simple Recursive Enumeration Algorithm (1)
• Examine 0-1 assignments one by one, and backtrack as soon as a contradiction occurs [Zhang et al., EURASIP JBSB 2007]

Illustration of Recursive Algorithm
[Figure: partial 0-1 assignments are extended one node at a time; a complete consistent assignment is output]

Simple Recursive Enumeration Algorithm (2)
• Examine 0-1 assignments one by one, and backtrack as soon as a contradiction occurs
  0
  00    X  backtrack
  01
  010   X  backtrack
  011   X  backtrack
  10
• Several variants depending on the ordering of nodes
• Much better than the trivial O(n·2^n) time
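The backtracking idea can be sketched as below. This is an illustrative version under simple assumptions, not the authors' implementation: nodes are assigned in index order, `inputs[i]` lists the input nodes of node i, `funcs[i]` is its Boolean function, and after each new bit every node whose value and inputs are all assigned is re-checked against the condition v_i(t+1) = v_i(t).

```python
def enumerate_singleton_attractors(inputs, funcs):
    """Enumerate all states v with f_i(v) = v_i for every node i."""
    n = len(inputs)
    assign = [None] * n
    found = []

    def consistent(m):
        # Check every node i <= m whose inputs are all among nodes 0..m.
        for i in range(m + 1):
            if all(j <= m for j in inputs[i]):
                if funcs[i](*(assign[j] for j in inputs[i])) != assign[i]:
                    return False
        return True

    def recurse(m):
        if m == n:
            found.append(tuple(assign))
            return
        for bit in (0, 1):
            assign[m] = bit
            if consistent(m):      # otherwise backtrack immediately
                recurse(m + 1)
        assign[m] = None

    recurse(0)
    return found


# Example network from the earlier slides: A' = B, B' = A and C, C' = not A.
EXAMPLE_INPUTS = [[1], [0, 2], [0]]
EXAMPLE_FUNCS = [lambda b: b, lambda a, c: a & c, lambda a: 1 - a]
```

On the example network the search prunes most branches early and reports the single point attractor 001.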

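Analysis of Average Case Time Complexity
• Probability that v_i(0) ≠ v_i(1) is detected when the 0-1 assignment for the first m bits is examined (all K inputs of v_i must be among the first m nodes, and a random function then disagrees with probability 0.5): ≈ 0.5·(m/n)^K
• Probability that a random assignment for m bits is consistent (with the definition of a singleton attractor): ≈ (1 - 0.5·(m/n)^K)^m
• Expected number of consistent 0-1 assignments for m bits: ≈ 2^m·(1 - 0.5·(m/n)^K)^m
• By taking the maximum of the above for m in [1…n], we can estimate the complexity
[Figure: nodes v_1, …, v_{m-1}, v_m, v_{m+1}, … at t=0 and t=1, each node with K inputs]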
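The estimate can be evaluated numerically. The sketch below assumes the approximation that a random assignment to the first m of n nodes is consistent with probability (1 - 0.5·(m/n)^K)^m, so that about 2^m·(1 - 0.5·(m/n)^K)^m partial assignments survive; this formula is an assumption of the sketch, and `estimated_base` is a hypothetical helper name. It returns the base c such that the maximum over m is roughly c^n.

```python
import math


def estimated_base(n: int, k: int) -> float:
    """Base c with max_m 2^m * (1 - 0.5*(m/n)^k)^m ~ c^n (assumed model)."""
    best_log2 = 0.0
    for m in range(1, n + 1):
        p_detect = 0.5 * (m / n) ** k          # inconsistency seen at a node
        # Work in log2 to avoid overflow for large n.
        log2_count = m * (1.0 + math.log2(1.0 - p_detect))
        best_log2 = max(best_log2, log2_count)
    return 2.0 ** (best_log2 / n)
```

Under this model the base stays well below 2 for small K and grows toward 2 as K increases.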

Computational Experiment
• Running time increases exponentially, but the bases are less than 2
[Figure: Empirical Time Complexity]

Issues on Worst Case Time Complexity
• Detection of a singleton attractor for BNs with indegree K ⇒ reduces to (K+1)-SAT
  • O(1.322^n) time for K=2 (randomized)
• We developed an O((1.322-δ)^n)-time algorithm for K=2
• The detection problem remains NP-hard even for K=2
• O(1.587^n)-time algorithm for BNs with AND/OR nodes (no constraint on K) [Melkman, Tamura & Akutsu, 2010]

Reduction from BN-ATTRACTOR to SAT
• Detection of a singleton attractor with max. indegree K ⇒ (K+1)-SAT (Boolean SATisfiability problem)
[Figure: node v_i with input nodes v_j and v_k]
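One standard way to realize such a reduction (an assumption here, since the slide's own construction did not survive extraction) is to write one clause per truth-table row of each node: the clause forbids "inputs equal this row and v_i differs from the row's output". With K inputs, each clause has at most K+1 literals. In the sketch, a literal is a pair (variable index, required value), and `satisfies` is a brute-force check standing in for a SAT solver.

```python
from itertools import product


def bn_to_cnf(inputs, funcs):
    """Encode v_i = f_i(inputs of v_i) for all i as CNF clauses."""
    clauses = []
    for i, (ins, f) in enumerate(zip(inputs, funcs)):
        for bits in product((0, 1), repeat=len(ins)):
            out = f(*bits)
            # (some input differs from this row) OR (v_i equals the output)
            clause = [(j, 1 - b) for j, b in zip(ins, bits)] + [(i, out)]
            clauses.append(clause)
    return clauses


def satisfies(assign, clauses):
    """True if every clause contains at least one satisfied literal."""
    return all(any(assign[v] == val for v, val in c) for c in clauses)


# Example network: A' = B, B' = A and C, C' = not A.
EXAMPLE_INPUTS = [[1], [0, 2], [0]]
EXAMPLE_FUNCS = [lambda b: b, lambda a, c: a & c, lambda a: 1 - a]
```

The satisfying assignments of the resulting formula are exactly the singleton attractors of the network.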

Basic Idea in O(1.587^n) Time Algorithm
• Consider recursive assignment of 0-1 values to nodes
  (A) v=0 ⇒ u=0, v=1 ⇒ w=1
  (B) v=0 ⇒ u=0 and w=1
• Let f(k) be #(assignments) for a BN with k variables
• By solving the resulting recurrence (like Fibonacci numbers), f(k) is O(1.4656^k)
• However, the above procedure cannot be applied to all cases (e.g., not to bipartite networks) ⇒ combination with SAT is required ⇒ O(1.587^n) time
[Figure: cases (A) and (B), each a small subnetwork on nodes u, v, w; all nodes are OR nodes; one edge style marks a NOT input]
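The constant 1.4656 is the growth rate of the recurrence f(k) = f(k-1) + f(k-3), which is what case (B) suggests: setting v=1 fixes one variable, while setting v=0 fixes three (v, u, and w). The recurrence itself is not legible in the extracted slide, so treating it as the source of the bound is an assumption of this sketch.

```python
def growth_rate(terms: int = 200) -> float:
    """Ratio of consecutive terms of f(k) = f(k-1) + f(k-3)."""
    f = [1, 1, 1]  # arbitrary positive starting values
    for _ in range(terms):
        f.append(f[-1] + f[-3])
    return f[-1] / f[-2]
```

The ratio converges to the real root of x^3 = x^2 + 1, in the same way that the Fibonacci ratio converges to the golden ratio.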

Attractor Detection: Previous Works (2)
[Table: results for singleton attractors and for cyclic attractors (recursive algorithm, average case)]

Control of Boolean Network

BN-Control: Previous Works
• Datta et al. defined a control problem for PBN (a probabilistic extension of BN) and proposed a dynamic-programming-based method [Machine Learning, 52:169-191, 2003]
  • They also proposed various extensions
  • But their method must handle 2^n × 2^n matrices
• BN-Control (also PBN-Control) is NP-hard
• BN-Control can be solved in polynomial time if the network has a tree structure [Akutsu et al., JTB 2007]
• Practical approach based on model checking/SAT [Langmead & Jha, APBC 2008, JBCB 2009]
• Theoretical studies using the semi-tensor product [Cheng, 2009]

Definition of BN-Control
• Input
  • Internal nodes: v_1, …, v_n; external nodes: u_1, …, u_m
  • Initial state: v^0; desired state: v^M
  • BN
• Output
  • Sequence of states of external nodes: u(0), u(1), …, u(M)
  • such that v(0)=v^0 and v(M)=v^M (leading to the desired state at time M)
[Akutsu et al., J. Theo. Biol. 2007]

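Dynamic Programming for Control of BN
• BN version of the algorithm by Datta et al.
• DP table: D[b_1, …, b_n, t]
  • takes 1 if there is a control sequence leading from state (b_1, …, b_n) at time t to the target state at time M
  • can be computed backward from t = M: D[b, M] = 1 iff b = v^M, and D[b, t-1] = 1 iff some external input u moves b to a state c with D[c, t] = 1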

Illustration of DP Algorithm
[Figure: DP computation, e.g., D[0,1,1, 3] = 1, D[1,1,1, 2] = 1 (with u1=1, u2=1), D[0,0,0, 2] = 0]
• But the size of the DP table is exponential
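The DP can be sketched as a backward sweep over all 2^n internal states. The network in `step` is hypothetical (it is not the one drawn on the slide), and the function names are illustrative; the returned sequence contains u(0), …, u(M-1), since u(M) does not affect v(M).

```python
from itertools import product


def step(v, u):
    """Hypothetical BN: v1' = u1, v2' = v1 AND u2, v3' = v1 OR v2."""
    v1, v2, v3 = v
    u1, u2 = u
    return (u1, v1 & u2, v1 | v2)


def control_table(step_fn, n, m, target, horizon):
    """D[(state, t)] = 1 iff the target is reachable at time `horizon`."""
    states = list(product((0, 1), repeat=n))
    controls = list(product((0, 1), repeat=m))
    D = {(s, horizon): int(s == target) for s in states}
    for t in range(horizon - 1, -1, -1):       # backward in time
        for s in states:
            D[(s, t)] = int(any(D[(step_fn(s, u), t + 1)] for u in controls))
    return D


def control_sequence(D, step_fn, m, start, horizon):
    """Read one valid control sequence off the table, or None."""
    if not D[(start, 0)]:
        return None
    seq, s = [], start
    for t in range(horizon):
        for u in product((0, 1), repeat=m):
            if D[(step_fn(s, u), t + 1)]:
                seq.append(u)
                s = step_fn(s, u)
                break
    return seq
```

The table has 2^n·(M+1) entries, which is the exponential cost noted on the slide.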

Integer Linear Programming-Based Approach

Integer Programming
• Linear Programming (LP)
  • Maximize (or minimize) a linear objective function under constraints given by linear inequalities
• Integer Linear Programming (ILP)
  • LP + constraints that specified variables must take integer values
• Several efficient solvers: CPLEX, Gurobi
• Used for solving various NP-hard problems

ILP for Attractor Detection (1)
• x_i: state of v_i

ILP for Attractor Detection (2)
[Formulas lost in extraction]

ILP for Attractor Detection (3)
• Annotation on the formulation: dummy for using ILP
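As a rough illustration of how Boolean rules become linear constraints over 0-1 variables, the sketch below uses the textbook linearization of AND (y ≤ a, y ≤ b, y ≥ a + b - 1) and NOT (y = 1 - a). This is a swapped-in standard technique, not the formulation on these slides, whose formulas did not survive extraction. Brute-force enumeration over {0,1}^3 stands in for an ILP solver, and the network is the three-node example A' = B, B' = A and C, C' = not A with the fixed-point condition x(t+1) = x(t).

```python
from itertools import product


def feasible(x):
    """Linear constraints for a singleton attractor of the example BN."""
    a, b, c = x
    return (
        a == b                                     # A = B         (A' = B)
        and b <= a and b <= c and b >= a + c - 1   # B = A AND C
        and c == 1 - a                             # C = 1 - A     (C' = not A)
    )


def feasible_points():
    """All 0-1 points satisfying the constraints (solver stand-in)."""
    return [x for x in product((0, 1), repeat=3) if feasible(x)]
```

The only feasible 0-1 point is 001, matching the point attractor of the example network; a real ILP solver would search the same feasible set without enumerating it.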

ILP Formalization for BN-Control
[Figure: ILP formulation with the major changes from attractor detection marked]

Summary

Summary
• Boolean network
  • A discrete model of a genetic network
  • Similar to digital circuits
• Attractor Detection/Enumeration
  • NP-hard
  • Much better than the naïve O(2^n) bound for bounded-indegree cases
  • Identification of cyclic attractors is more difficult
• Control of Boolean networks
  • NP-hard
  • Can be solved by a DP algorithm (but in exponential time)
• Integer Linear Programming-based approach
  • Simple
  • Flexible for modifications/extensions
  • Fast if indegree ≤ 2
