Complex (Biological) Networks

Complex (Biological) Networks. Today: Measuring Network Topology. Thursday: Analyzing Metabolic Networks. Elhanan Borenstein, Spring 2010. Some slides are based on slides from courses given by Roded Sharan and Tomer Shlomi.

Presentation Transcript


  1. Complex (Biological) Networks Today: Measuring Network Topology Thursday: Analyzing Metabolic Networks Elhanan Borenstein Spring 2010 Some slides are based on slides from courses given by Roded Sharan and Tomer Shlomi

  2. Measuring Network Topology • Introduction to network theory • Global Measures of Network Topology • Degree Distribution • Clustering Coefficient • Average Distance • Random Network Models • Network Motifs

  3. What is a Network? • A map of interactions or relationships • A collection of nodes and links (edges)

  4. What is a Network? • A map of interactions or relationships • A collection of nodes and links (edges)

  5. Why Networks? • Focus on the organization of the system (rather than on its components) • Simple representation • Visualization of complex systems • Networks as tools • Underlying diffusion model (e.g. evolution on networks) • The structure and topology of the system affect (determine) its function

  6. Networks vs. Graphs • Graph Theory • Definition of a graph: G=(V,E) • V is the set of nodes/vertices (elements) • |V|=N • E is the set of edges (relations) • One of the most well studied objects in CS • Subgraph finding (e.g., clique, spanning tree) and alignment • Graph coloring and graph covering • Route finding (Hamiltonian path, traveling salesman, etc.) • Many problems are proven to be NP-complete

  7. The Seven Bridges of Königsberg • Published by Leonhard Euler, 1736 • Considered the first paper in graph theory

  8. Types of Graphs/Networks • Directed/undirected • Weighted/non-weighted • Directed Acyclic Graphs (DAG) / Trees • Bipartite Graphs • Hypergraphs

  9. Computational Representation of Networks • Example network with nodes A, B, C, D • List/set of edges: (ordered) pairs of nodes { (A,C), (C,B), (D,B), (D,C) } • Connectivity matrix • Object oriented: one object per node, holding its name and pointers to its neighbors (fields: Name, ngr) • Which is the most useful representation?
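
The three representations on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: the helper names `to_matrix` and `to_adjacency` are hypothetical, and the edges are treated as directed (source, target) pairs, which is an assumption based on the word "ordered".

```python
# Edge list from the slide: ordered (source, target) pairs over nodes A-D.
edges = [("A", "C"), ("C", "B"), ("D", "B"), ("D", "C")]
nodes = sorted({name for edge in edges for name in edge})


def to_matrix(nodes, edges):
    """Connectivity matrix: m[i][j] == 1 if there is an edge from node i to node j."""
    idx = {name: i for i, name in enumerate(nodes)}
    m = [[0] * len(nodes) for _ in nodes]
    for src, dst in edges:
        m[idx[src]][idx[dst]] = 1
    return m


def to_adjacency(nodes, edges):
    """Per-node neighbor lists: the 'object oriented' view (a name plus its neighbors)."""
    adj = {name: [] for name in nodes}
    for src, dst in edges:
        adj[src].append(dst)
    return adj


matrix = to_matrix(nodes, edges)
adjacency = to_adjacency(nodes, edges)
```

The edge list is compact for sparse networks, the matrix gives constant-time edge lookup at a cost of N x N memory, and neighbor lists make traversal cheap, so the "most useful" choice depends on the operation.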

  10. Network Visualization VisualComplexity.com • Art? Science? Cytoscape

  11. Networks in Biology • Molecular networks: • Protein-Protein Interaction (PPI) networks • Metabolic Networks • Regulatory Network • Synthetic lethality Network • Gene Interaction Network • More …

  12. Metabolic Networks • Reflect the set of biochemical reactions in a cell • Nodes: metabolites • Edges: biochemical reactions • Additional representations! • Derived through: • Knowledge of biochemistry • Metabolic flux measurements • Example: S. cerevisiae, 1062 metabolites, 1149 reactions

  13. Protein-Protein Interaction (PPI) Networks • Reflect the cell’s molecular interactions and signaling pathways (interactome) • Nodes: proteins • Edges: interactions(?) • High-throughput experiments: • Protein Complex-IP (Co-IP) • Yeast two-hybrid • Example: S. cerevisiae, 4389 proteins, 14319 interactions

  14. Transcriptional Regulatory Network • Reflect the cell’s genetic regulatory circuitry • Nodes: transcription factors (TFs) and genes; • Edges (directed): from TF to the genes it regulates • Derived through: • Chromatin IP • Microarrays

  15. Other Networks in Biology/Medicine

  16. Non-Biological Networks • Computer related networks: • WWW; Internet backbone • Communication and IP • Social networks: • Friendship (facebook; clubs) • Citations / information flow • Co-authorships (papers); Co-occurrence (movies; Jazz) • Transportation: • Highway system; Airline routes • Electronic/Logic circuits • Many more…

  17. Global Measures of Network Topology

  18. Node Degree / Rank • Degree = Number of neighbors • Local characterization! • Node degree in PPI networks correlates with: • Gene essentiality • Conservation rate • Likelihood to cause human disease

  19. Degree Distribution • Degree distribution P(k): probability that a node has degree k • For directed graphs, two distributions: • In-degree • Out-degree • Average degree: d = Σ_k k·P(k) • Number of edges: Nd/2
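
As a small worked example of these definitions, the sketch below computes P(k) and the average degree d for an undirected toy graph and checks the edge count Nd/2. The function names and the four-node graph are illustrative assumptions, not part of the original slides.

```python
from collections import Counter


def degree_distribution(nodes, edges):
    """Return {k: P(k)}: the fraction of nodes that have degree k (undirected graph)."""
    degree = {n: 0 for n in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())
    return {k: c / len(nodes) for k, c in sorted(counts.items())}


def average_degree(p_k):
    """Average degree d = sum over k of k * P(k)."""
    return sum(k * p for k, p in p_k.items())


# Toy undirected graph (hypothetical example).
nodes = ["A", "B", "C", "D"]
edges = [("A", "C"), ("C", "B"), ("D", "B"), ("D", "C")]

p_k = degree_distribution(nodes, edges)
d = average_degree(p_k)
edge_count_from_degree = len(nodes) * d / 2  # N * d / 2
```

Each undirected edge contributes to the degree of two nodes, which is why the total number of edges is Nd/2.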

  20. Common Distributions • Poisson: P(k) = e^{-d} d^k / k! • Exponential: P(k) ~ e^{-k/d} • Power-law: P(k) ~ k^{-γ}, with k ≠ 0 and γ > 1

  21. The Power-Law Distribution • Fat or heavy tail! • Leads to a “scale-free” network • Characterized by a small number of highly connected nodes, known as hubs • Hubs are crucial: • Affect error and attack tolerance of complex networks (Albert et al. Nature, 2000) • ‘party’ hubs and ‘date’ hubs

  22. The Internet • Nodes – 150,000 routers • Edges – physical links • P(k) ~ k^{-2.3} Govindan and Tangmunarunkit, 2000

  23. Movie Actor Collaboration Network (pictured example: Tropic Thunder, 2008) • Nodes – 212,250 actors • Edges – co-appearance in a movie • (<k> = 28.78) • P(k) ~ k^{-2.3} Barabasi and Albert, Science, 1999

  24. Protein Interaction Networks • Nodes – Proteins • Edges – Interactions (yeast) • P(k) ~ k^{-2.5} Yook et al, Proteomics, 2004

  25. Metabolic Networks • Nodes – Metabolites • Edges – Reactions • P(k) ~ k^{-2.2±2} • Metabolic networks across all kingdoms of life are scale-free • Figure panels: E. coli (bacterium), A. fulgidus (archaeon), C. elegans (eukaryote), averaged (43 organisms) Jeong et al., Nature, 2000

  26. Network Clustering Costanzo et al., Nature, 2010

  27. Clustering Coefficient (Watts & Strogatz) • Characterizes tendency of nodes to cluster • “triangles density” • “How often do my (facebook) friends know each other?” • (if d_i = 0 or 1 then C_i is defined to be 0)

  28. Clustering Coefficient: Example • Lies in [0,1] • For cliques: C=1 • For triangle-free graphs: C=0 • Examples: C_i = 3/10 = 0.3; C_i = 0/10 = 0; C_i = 10/10 = 1
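
The Watts-Strogatz clustering coefficient of a node i with degree d_i is the fraction of its neighbor pairs that are themselves linked: C_i = 2·E_i / (d_i (d_i − 1)), where E_i counts the edges among the neighbors of i. The sketch below is a minimal illustration; the helper names and the hub graph (a node with five neighbors, hence ten neighbor pairs, as in the 3/10 example) are assumptions made for the example.

```python
from itertools import combinations


def make_graph(edges):
    """Build an undirected adjacency map {node: set of neighbors}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj, node):
    """C_i = 2 * E_i / (d_i * (d_i - 1)); defined as 0 when d_i is 0 or 1."""
    neighbors = adj[node]
    d = len(neighbors)
    if d < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2 * links / (d * (d - 1))


# Node "i" has five neighbors; three of the ten neighbor pairs are linked.
spokes = [("i", n) for n in "abcde"]
partial = make_graph(spokes + [("a", "b"), ("b", "c"), ("d", "e")])
star = make_graph(spokes)                                  # no neighbor pair linked
clique = make_graph(list(combinations("iabcde", 2)))       # every pair linked

c_partial = clustering_coefficient(partial, "i")
c_star = clustering_coefficient(star, "i")
c_clique = clustering_coefficient(clique, "i")
```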

  29. Average Distance • Distance: Length of shortest (geodesic) path between two nodes • Average distance: average over all connected pairs
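
For an unweighted graph, shortest-path distances can be found with breadth-first search, and the average is then taken over connected pairs only. The following is a sketch under that assumption; the function names and the four-node path graph are illustrative.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path (geodesic) distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_distance(adj):
    """Mean distance over all connected (ordered) pairs of distinct nodes."""
    total = 0
    pairs = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


# Path graph a - b - c - d, plus an isolated node e that forms no connected pair.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}, "e": set()}
avg = average_distance(path)
```

Unreachable pairs are simply skipped, which matches the "connected pairs" wording on the slide; running one BFS per node costs O(N·(N+E)) in total.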

  30. Small World Networks • Despite their often large size, in most (real) networks there is a relatively short path between any two nodes • “Six degrees of separation” (Stanley Milgram; 1967) • Collaborative distance: • Erdős number • Bacon number • Examples: Danica McKellar: 6; Natalie Portman: 6; Daniel Kleitman: 3

  31. Network Structure in Real Networks

  32. Additional Measures • Network Modularity • Giant component • Betweenness centrality • Current information flow • Bridging centrality • Spectral density

  33. Random Network Models • Random Graphs (Erdős/Rényi) • Generalized Random Graphs • Geometric Random Graphs • The Small World Model (WS) • Preferential Attachment

  34. Random Graphs (Erdős/Rényi) • N nodes • Every pair of nodes is connected with probability p • Mean degree: d = (N-1)p ~ Np

  35. Random Graphs: Properties • Mean degree: d = (N-1)p ~ Np • Degree distribution is binomial • Asymptotically Poisson: P(k) ≈ e^{-d} d^k / k! • Clustering Coefficient: • The probability of connecting two nodes at random is p • ⇒ Clustering coefficient is C = p • In many large networks p ~ 1/N ⇒ C is lower than observed • Average distance: • l ~ ln(N)/ln(d) … (think why?) • Small world! (and fast spread of information)
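
These properties can be checked empirically. The sketch below generates one random graph and measures its mean degree and mean clustering coefficient, which should land near (N-1)p and p respectively. The function names, the seed, and the values N = 400 and p = 0.05 are arbitrary choices for illustration.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, rng):
    """Connect every unordered pair of the n nodes independently with probability p."""
    adj = {i: set() for i in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def mean_degree(adj):
    return sum(len(nb) for nb in adj.values()) / len(adj)


def mean_clustering(adj):
    """Average of the per-node clustering coefficients (0 for degree < 2)."""
    total = 0.0
    for nb in adj.values():
        d = len(nb)
        if d >= 2:
            links = sum(1 for u, v in combinations(nb, 2) if v in adj[u])
            total += 2 * links / (d * (d - 1))
    return total / len(adj)


rng = random.Random(0)  # fixed seed so the run is repeatable
n, p = 400, 0.05
g = erdos_renyi(n, p, rng)
d_observed = mean_degree(g)       # expected to be close to (n - 1) * p
c_observed = mean_clustering(g)   # expected to be close to p
```

Because any two neighbors of a node are linked with the same probability p as any other pair, the clustering coefficient carries no extra structure in this model.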

  36. Generalized Random Graphs • A generalized random graph with a specified degree sequence (Bender & Canfield ’78) • Creating such a graph: • Prepare k copies of each degree-k node • Randomly assign node copies to edges • [Reject if the graph is not simple] This algorithm samples uniformly from the collection of all graphs with the specified degree sequence!
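
The copy-and-match procedure above can be sketched as follows. This is an illustrative version only: the function name, the `max_tries` cutoff, and the example degree sequence are assumptions, and rejection can be slow for sequences where simple graphs are rare.

```python
import random


def generalized_random_graph(degrees, rng, max_tries=1000):
    """Sample a simple graph with the given degree sequence by random stub matching.

    degrees[i] is the desired degree of node i. A whole attempt is rejected
    (and retried) if the pairing creates a self-loop or a repeated edge.
    """
    # Prepare k copies ("stubs") of each degree-k node.
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    if len(stubs) % 2:
        raise ValueError("the sum of degrees must be even")
    for _ in range(max_tries):
        rng.shuffle(stubs)  # random assignment of node copies to edges
        edges = set()
        simple = True
        for u, v in zip(stubs[::2], stubs[1::2]):
            edge = (min(u, v), max(u, v))
            if u == v or edge in edges:
                simple = False  # not a simple graph: reject this attempt
                break
            edges.add(edge)
        if simple:
            return sorted(edges)
    raise RuntimeError("no simple graph found within max_tries attempts")


degrees = [3, 2, 2, 2, 1]  # hypothetical degree sequence
edges = generalized_random_graph(degrees, random.Random(1))
```

Rejecting the whole attempt, rather than re-drawing only the offending pair, is what keeps the sample uniform over simple graphs with this degree sequence.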

  37. Geometric Random Graphs • G=(V,r) • V – set of points in a metric space (e.g. 2D) • E – all pairs of points with distance ≤ r • Captures spatial relationships • Poisson degree distribution
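
A minimal sketch of this construction, assuming points drawn uniformly in the unit square with Euclidean distance (the slide only requires some metric space). The function name, seed, and parameter values are illustrative.

```python
import math
import random


def geometric_random_graph(n, r, rng):
    """Place n points uniformly in the unit square; link pairs at distance <= r."""
    points = [(rng.random(), rng.random()) for _ in range(n)]
    edges = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) <= r
    ]
    return points, edges


points, edges = geometric_random_graph(50, 0.2, random.Random(1))
```

The all-pairs comparison is O(N^2); for large N a spatial grid or tree would be the usual replacement, but that is beyond this sketch.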

  38. The Small World Model (WS) • Generate graphs with high clustering coefficients C and small distance l • Rooted in social systems • Start with order (every node is connected to its K neighbors) • Randomize (rewire each edge with probability p) • Degree distribution is similar to that of a random graph! Varying p leads to transition between order (p=0) and randomness (p=1) Watts and Strogatz, Nature, 1998
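
The two steps (ordered ring, then random rewiring) can be sketched as below. Details the slide leaves open are assumptions here: K is taken to be even with K/2 neighbors on each side of the ring, K < N, and a rewired edge keeps one endpoint and never creates a self-loop or a duplicate edge.

```python
import random


def watts_strogatz(n, k, p, rng):
    """Ring lattice of n nodes (k nearest neighbors each), rewired with probability p."""
    adj = {i: set() for i in range(n)}
    # Start with order: link each node to its k/2 nearest neighbors on each side.
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    # Randomize: each lattice edge (i, i + step) is rewired with probability p.
    for step in range(1, k // 2 + 1):
        for i in range(n):
            if rng.random() >= p:
                continue
            j = (i + step) % n
            candidates = [t for t in range(n) if t != i and t not in adj[i]]
            if candidates:
                new = rng.choice(candidates)
                adj[i].discard(j)
                adj[j].discard(i)
                adj[i].add(new)
                adj[new].add(i)
    return adj


ordered = watts_strogatz(20, 4, 0.0, random.Random(0))   # p = 0: pure lattice
rewired = watts_strogatz(20, 4, 1.0, random.Random(0))   # p = 1: every edge rewired
```

Rewiring moves edges without adding or removing any, so the number of edges stays at NK/2 for every value of p.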

  39. The Scale Free Model: Preferential Attachment • A generative model (dynamics) • Growth: degree-m nodes are constantly added • Preferential attachment: the probability that a new node connects to an existing one is proportional to its degree • “The rich get richer” principle Albert and Barabasi, 2002
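
Growth with preferential attachment can be sketched with an "endpoint list" in which every node appears once per incident edge, so a uniform draw from the list is a degree-proportional draw over nodes. The seed graph (a clique on m+1 nodes), the function name, and the parameter values are assumptions for illustration; the slide does not specify how the process starts.

```python
import random


def preferential_attachment(n, m, rng):
    """Grow a graph to n nodes; each new node attaches to m distinct existing nodes."""
    # Assumed seed: a clique on the first m + 1 nodes.
    edges = [(u, v) for u in range(m + 1) for v in range(u)]
    # Each node is listed once per incident edge, i.e. proportionally to its degree.
    endpoints = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))  # "the rich get richer"
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges


n, m = 200, 2
edges = preferential_attachment(n, m, random.Random(0))
degree = [0] * n
for u, v in edges:
    degree[u] += 1
    degree[v] += 1
```

Inspecting `sorted(degree, reverse=True)[:5]` against the median degree gives a quick feel for the hubs this process tends to produce.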

  40. Preferential Attachment: Clustering Coefficient C ~ N^{-0.75} C ~ N^{-1}

  41. Preferential Attachment: Empirical Evidence • Highly connected proteins in a PPI network are more likely to evolve new interactions Wagner, A. Proc. R. Soc. Lond. B , 2003

  42. Model Problems • Degree distribution is fixed (although there are generalizations of this method that handle various distributions) • Clustering coefficient approaches 0 with network size, unlike real networks • Issues involving biological network growth: • Ignores local events shaping real networks (e.g., insertions/deletions of edges) • Ignores growth constraints (e.g., max degree) and aging (a node is active in a limited period)

  43. Conclusions • No single best model! • Models differ in various network measures • Different models capture different attributes of real networks • In literature, “random graphs” and “generalized random graphs” are most commonly used

  44. Network Motifs

  45. Network Motifs • Going beyond degree distribution … • Generalization of sequence motifs • Basic building blocks • Evolutionary design principles R. Milo et al. Network motifs: simple building blocks of complex networks. Science, 2002

  46. What are Network Motifs? • Recurring patterns of interactions (subgraphs) that are significantly overrepresented (w.r.t. a background model) • Figure: the 13 possible 3-node subgraphs R. Milo et al. Network motifs: simple building blocks of complex networks. Science, 2002

  47. Finding motifs in the Network 1. Generate randomized networks 2a. Scan for all n-node subgraphs in the real network 2b. Record number of appearances of each subgraph (consider isomorphic architectures) 3a. Scan for all n-node subgraphs in the randomized networks 3b. Record number of appearances of each subgraph 4. Compare each subgraph’s data and choose motifs
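
Steps 2 to 4 can be illustrated for a single 3-node pattern, the feed-forward loop (x→y, y→z, x→z). The sketch below counts that pattern and compares the real count with counts from randomized networks using a z-score. Both function names are hypothetical, the z-score is an assumed comparison statistic (the slide only says "compare"), and the counter does not check that the three nodes have no additional edges among them.

```python
from statistics import mean, stdev


def count_feed_forward_loops(edges):
    """Count node triples (x, y, z) with directed edges x->y, y->z and x->z."""
    out = {}
    for u, v in set(edges):
        out.setdefault(u, set()).add(v)
    count = 0
    for x, targets in out.items():
        for y in targets:
            for z in out.get(y, ()):
                if z != x and z != y and z in targets:
                    count += 1
    return count


def z_score(real_count, random_counts):
    """How many standard deviations the real count lies above the randomized mean."""
    sd = stdev(random_counts)
    if sd == 0:
        return 0.0 if real_count == mean(random_counts) else float("inf")
    return (real_count - mean(random_counts)) / sd


# Hypothetical toy network containing exactly one feed-forward loop.
real_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
real_count = count_feed_forward_loops(real_edges)
```

In a full analysis `random_counts` would hold the same count taken on each randomized network, and the procedure would be repeated for every n-node subgraph type.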

  48. Finding motifs in the Network

  49. Network Randomization • Preserve in-degree, out-degree and mutual degree • For motifs with n>3 also preserve distribution of smaller sub-motifs (simulated annealing)

  50. Generation of Randomized Networks • Algorithm A (Markov-chain algorithm): • Start with the real network and repeatedly swap randomly chosen pairs of connections (X1→Y1, X2→Y2 is replaced by X1→Y2, X2→Y1) • Repeat until the network is well randomized • Switching is prohibited if either of the connections X1→Y2 or X2→Y1 already exists
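
The switching algorithm can be sketched as follows for a directed edge list. The function name and the `max_attempts` cutoff are illustrative, and rejecting swaps that would create a self-loop is an added assumption. This version preserves every node's in-degree and out-degree but makes no attempt to preserve mutual edges.

```python
import random


def randomize_by_swaps(edges, n_swaps, rng, max_attempts=None):
    """Return a randomized copy of a directed edge list with the same in/out degrees."""
    edges = list(edges)
    present = set(edges)
    if max_attempts is None:
        max_attempts = 100 * n_swaps
    done = attempts = 0
    while done < n_swaps and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (x1, y1), (x2, y2) = edges[i], edges[j]
        new1, new2 = (x1, y2), (x2, y1)
        if x1 == y2 or x2 == y1:
            continue  # assumed rule: do not create self-loops
        if new1 in present or new2 in present:
            continue  # switching is prohibited if either new connection exists
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
        done += 1
    return edges


# Hypothetical directed network.
real = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (4, 1), (3, 4), (2, 5), (5, 0)]
shuffled = randomize_by_swaps(real, n_swaps=50, rng=random.Random(0))
```

Swapping the targets of two edges leaves each source's out-degree and each target's in-degree unchanged, which is why only the wiring, not the degree sequence, is randomized.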
