3-D Structural Analysis of Protein Interaction Networks Gives New Insight Into Protein Function, Network Topology and Evolution
2006 Telluride Workshop
Philip M. Kim, Ph.D., Yale University, New Haven, CT
August 14th, 2006
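060119_CSB_Talk_PMK
MOTIVATION (ILLUSTRATIVE)
• Network perspective: the two examples are equivalent (=)
• Structural biology perspective: the two examples are not equivalent (≠)
[Figure: protein A with partners B1 to B4; the examples are part of the RNA-pol complex and the Cdk/cyclin complex]
• There remains a rich source of knowledge unmined by network theorists!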
OUTLINE
• Interaction networks and their properties
• A 3-D structural point of view
• Network properties revisited
• Conclusions
PROTEIN INTERACTION NETWORKS IN YEAST (ILLUSTRATIVE)
A snapshot of the current interactome: DIP (Database of Interacting Proteins)
Description and methodologies
• Determined by:
  • Large-scale yeast two-hybrid
  • TAP-tagging
  • Literature curation
• Currently over 20,000 unique interactions available in yeast
• Spawned a field of computational “graph theory” analyses that view proteins as “nodes” and interactions as “edges”
Source: Gavin et al. Nature (2002), Uetz et al. Nature (2000), Cytoscape and DIP
TINY GLOSSARY: DEGREE AND HUBS
• A: Degree = 5
• A is a “Hub”*
• C: Degree = 1
* The definition of hubs is somewhat arbitrary; usually a degree cutoff is used
Source: PMK
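The glossary can be made concrete with a minimal sketch. It assumes the network is given as a list of undirected protein pairs; the function names and the cutoff of 5 are illustrative choices, not values taken from the talk (the slide itself notes that the hub cutoff is arbitrary).

```python
from collections import defaultdict

def degrees(edges):
    """Return {protein: degree} for a list of undirected interactions."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {protein: len(partners) for protein, partners in neighbors.items()}

def hubs(edges, cutoff=5):
    """Proteins whose degree is at least `cutoff` (cutoff is an assumption)."""
    return sorted(p for p, k in degrees(edges).items() if k >= cutoff)

# Toy network mirroring the slide: A has five partners, C has one.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("A", "F")]
toy_degree = degrees(toy_edges)
toy_hubs = hubs(toy_edges, cutoff=5)
```

Storing partners in sets means a pair reported twice (for example by two different experiments) is counted once.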
INTERESTING PROPERTIES OF INTERACTION NETWORKS (OVERVIEW)
Examples of studies
• Network topology
  • What distribution does the degree (number of interaction partners) follow?
• Relationship of topology and genomic features
  • What is the relationship between the degree and a protein's essentiality?
  • Is there a relationship between a protein's connectivity and its expression profile?
  • What is the relationship between a protein's evolutionary rate and its degree?
• Network evolution
  • How did the observed network topology evolve?
Source: Various, see following slides
INTERACTION NETWORKS ARE SCALE-FREE: THEIR TOPOLOGY IS DOMINATED BY SO-CALLED HUBS
• Scale-free topology has been observed in many kinds of networks (among them interaction networks): p(k) ~ k^(-γ), with γ > 0
• Scale-freeness: a small number of hubs and a large number of poorly connected nodes (“power-law behavior”)
• Topology is dominated by “hubs”
• Scale-freeness is in stark contrast to a normal (Gaussian) degree distribution
Source: Barabasi, A. and Albert, R., Science (1999)
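A rough way to check for power-law behavior is to fit a straight line to log p(k) against log k; the negated slope is an estimate of the exponent γ. The sketch below is a crude diagnostic under that assumption, not the method used in the cited work, and the function names are illustrative. It needs at least two distinct degree values.

```python
import math
from collections import Counter

def degree_distribution(degree_list):
    """Return {k: p(k)}, the fraction of nodes having degree k."""
    counts = Counter(degree_list)
    total = len(degree_list)
    return {k: c / total for k, c in sorted(counts.items())}

def fit_gamma(degree_list):
    """Least-squares slope of log p(k) vs. log k; returns gamma = -slope."""
    pk = degree_distribution(degree_list)
    ks = [k for k in pk if k > 0]  # log(0) is undefined, skip isolated nodes
    xs = [math.log(k) for k in ks]
    ys = [math.log(pk[k]) for k in ks]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

# Synthetic degrees with counts proportional to k^(-2): 64, 16, 4, 1.
sample_degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
gamma_estimate = fit_gamma(sample_degrees)
```

On real data a log-log fit is sensitive to binning and to the sparse tail, so the estimate should be read as indicative only.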
HUBS TEND TO BE IMPORTANT PROTEINS: THEY ARE MORE LIKELY TO BE ESSENTIAL AND TEND TO BE MORE CONSERVED
• By now it is well documented that proteins with a large degree tend to be essential proteins in yeast (“Hubs are essential”)
• Likewise, it has been found that hubs tend to evolve more slowly than other proteins (“Hubs are slower evolving”)
Source: Jeong et al. Nature (2001), Yu et al. TiG (2004) and Fraser et al. Science (2002)
… OR ARE THEY? THERE IS AN ONGOING DEBATE ABOUT THE RELATIONSHIP BETWEEN EVOLUTIONARY RATE AND DEGREE (EXAMPLES)
• The two positions: “No, the relationship is unclear” vs. “Yes, hubs are more conserved”
• Studies cited:
  • Fraser et al. Science (2002)
  • Jordan et al. Genome Res. (2002) (?)
  • Jordan et al. BMC Evol. Biol. (2003)
  • Fraser et al. BMC Evol. Biol. (2003)
  • Hahn et al. J. Mol. Evol. (2004)
  • Wuchty Genome Res. (2004)
  • Fraser Nature Genetics (2005)
• But the “Yes” side appears to be winning
Source: See text
THERE IS A RELATIONSHIP BETWEEN NETWORK TOPOLOGY AND GENE EXPRESSION DYNAMICS
[Figure: frequency vs. co-expression correlation]
Source: Han et al. Nature (2004) and Yu*, Kim* et al. (Submitted)
SCALE-FREENESS GENERALLY EVOLVES THROUGH PREFERENTIAL ATTACHMENT (“THE RICH GET RICHER”) (ILLUSTRATIVE)
The Duplication Mutation Model
• Theoretical work shows that a mechanism of preferential attachment leads to a scale-free topology (“The rich get richer”)
• In interaction networks, gene duplication followed by mutation of the duplicated gene is generally thought to lead to preferential attachment
• Simple reasoning: the partners of a hub are more likely to be duplicated than the partners of a non-hub
[Figure: gene duplication; the interaction partners of A are more likely to be duplicated]
Source: Albert et al. Rev. Mod. Phys. (2002) and Middendorf et al. PNAS (2005)
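The reasoning behind the duplication-mutation model can be sketched as a small simulation. This is a toy version under stated assumptions, not the model of the cited papers: at each step a random protein is duplicated, and the copy keeps each inherited interaction with a hypothetical probability `p_keep`. A protein with k partners gains a new edge whenever one of those k partners is duplicated and the edge is retained, so its chance of gaining edges grows with k.

```python
import random

def duplicate_and_diverge(neighbors, rng, p_keep=0.5):
    """One growth step: copy a random protein, then lose some copied edges."""
    parent = rng.choice(sorted(neighbors))
    child = len(neighbors)  # node ids are 0..n-1, so n is a fresh id
    neighbors[child] = set()
    for partner in sorted(neighbors[parent]):
        if rng.random() < p_keep:  # "mutation": the edge survives with p_keep
            neighbors[child].add(partner)
            neighbors[partner].add(child)
    return child

def grow(n_final, seed=0, p_keep=0.5):
    """Grow a network from a single interacting pair up to n_final proteins."""
    rng = random.Random(seed)
    neighbors = {0: {1}, 1: {0}}
    while len(neighbors) < n_final:
        duplicate_and_diverge(neighbors, rng, p_keep)
    return neighbors

toy_network = grow(200, seed=0)
toy_degrees = sorted((len(v) for v in toy_network.values()), reverse=True)
```

Inspecting `toy_degrees` for different seeds and values of `p_keep` gives a feel for how strongly a few early proteins accumulate partners.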
THERE IS A PROBLEM WITH SCALE-FREENESS AND REALLY BIG HUBS IN INTERACTION NETWORKS (ILLUSTRATIVE)
• A really big hub (>200 interactions)
• Gedankenexperiment: what is the maximum number of neighbors a protein can have?
Conclusion
• Clearly, a protein is very unlikely to have >200 simultaneous interactors.
• Some of the >200 are most likely false positives.
• Others are going to be mutually exclusive interactors (i.e., binding to the same interface).
• There is an obvious discrepancy between >200 and 12, the number of neighbors in a close packing of equal spheres.
Wouldn’t it be great to be able to see the different binding interfaces?
Source: DIP, Institut fuer Festkoerperchemie (Univ. Tuebingen)
UTILIZING PROTEIN CRYSTAL STRUCTURES, WE CAN DISTINGUISH THE DIFFERENT BINDING INTERFACES (ILLUSTRATIVE)
• Interactome: ~20000 interactions
  • Use a high-confidence filter
  • Map Pfam domains to all proteins in the interactome
• PDB: ~10000 structures of interactions*
  • Homology mapping of Pfam domains to all structures of interactions
• Annotate interactions with available structures, discard all others
• Combine with all structures of yeast protein complexes
• Distinguish interfaces
* Many redundant structures
Source: PMK
SHORT DIGRESSION: THIS ALLOWS US TO DISTINGUISH SYSTEMATICALLY BETWEEN SIMULTANEOUSLY POSSIBLE AND MUTUALLY EXCLUSIVE INTERACTIONS
[Figure: mutually exclusive interactions vs. simultaneously possible interactions]
Source: PMK
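One way to picture the distinction is a small sketch that labels interactions from interface annotations. The input format and the labeling rule are assumptions made for illustration: each interaction records which interface it uses on each partner, and an interaction is called mutually exclusive if either of its interfaces is also used by another partner. The talk does not spell out its exact rule.

```python
from collections import defaultdict

def classify_interactions(annotated):
    """annotated: list of (protein_a, interface_a, protein_b, interface_b).

    Returns {(protein_a, protein_b): label}.
    """
    usage = defaultdict(set)  # (protein, interface) -> partners bound there
    for a, iface_a, b, iface_b in annotated:
        usage[(a, iface_a)].add(b)
        usage[(b, iface_b)].add(a)

    labels = {}
    for a, iface_a, b, iface_b in annotated:
        # Shared interface on either side: the partners compete for it.
        shared = len(usage[(a, iface_a)]) > 1 or len(usage[(b, iface_b)]) > 1
        labels[(a, b)] = (
            "mutually exclusive" if shared else "simultaneously possible"
        )
    return labels

# Toy example: B1 and B2 bind the same interface of A, B3 binds another one.
toy_annotated = [
    ("A", "i1", "B1", "x"),
    ("A", "i1", "B2", "x"),
    ("A", "i2", "B3", "x"),
]
toy_labels = classify_interactions(toy_annotated)
```

Interface identifiers are scoped per protein here, so the "x" on B1 and the "x" on B2 are treated as different interfaces.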
SIMULTANEOUSLY POSSIBLE INTERACTIONS (“PERMANENT”) MORE OFTEN LINK PROTEINS THAT ARE FUNCTIONALLY SIMILAR, COEXPRESSED AND CO-LOCATED
[Figure, four panels comparing mutually exclusive with simultaneously possible interactions: fraction same biological process, fraction same cellular component, fraction same molecular function, co-expression correlation; p << 0.001 in each panel]
Source: PMK
THIS IS WHAT THE RESULTING NETWORK LOOKS LIKE
The Structural Interaction Dataset (SID): properties
• Represents a “very high confidence” network
• Total of 873 nodes and 1269 interactions, each of which is structurally characterized
• 438 interactions are classified as mutually exclusive and 831 as simultaneously possible
• While much smaller than DIP, it is similar in size to other high-confidence datasets
Source: PDB, Pfam, iPfam and PMK
REMEMBER THE NETWORK PROPERTIES WE DESCRIBED BEFORE? (OVERVIEW)
Examples of studies
• Network topology
  • What distribution does the degree (number of interaction partners) follow?
  • Does the network easily separate into more than one component?
• Relationship of topology and genomic features
  • What is the relationship between the degree and a protein's essentiality?
  • Is there a relationship between a protein's connectivity and its expression profile?
  • What is the relationship between a protein's evolutionary rate and its degree?
• Network evolution
  • How did the observed network topology evolve?
Source: Various, see following slides
THERE DO NOT APPEAR TO BE THE KINDS OF REALLY BIG HUBS SEEN BEFORE: IS THE TOPOLOGY STILL SCALE-FREE?
Degree distribution: properties
• With the maximum number of interactions at 13, there are no “really big hubs” in this network
• Note that in other high-confidence datasets (of similar size), there are still proteins with a much higher degree
• The degree distribution appears to top out much earlier and to be less scale-free than that of other networks
Source: PMK
IT’S REALLY ONLY THE MULTI-INTERFACE HUBS THAT ARE SIGNIFICANTLY MORE LIKELY TO BE ESSENTIAL
[Figure: percentage of essential proteins for the entire genome, all proteins in our dataset, single-interface hubs only, and multi-interface hubs only]
Source: PMK
DATE-HUBS AND PARTY-HUBS ARE REALLY SINGLE-INTERFACE AND MULTI-INTERFACE HUBS
[Figure: frequency vs. expression correlation for all proteins in our dataset, single-interface hubs only, and multi-interface hubs only]
Source: Han et al. Nature (2004) and PMK
AND ONLY MULTI-INTERFACE HUBS EVOLVE MORE SLOWLY; SINGLE-INTERFACE HUBS DO NOT
[Figure: evolutionary rate (dN/dS) for the entire genome, all proteins in our dataset, single-interface hubs only, and multi-interface hubs only]
Source: PMK
… OR ARE THEY? THERE IS AN ONGOING DEBATE ABOUT THE RELATIONSHIP BETWEEN EVOLUTIONARY RATE AND DEGREE
• The two positions: “No, the relationship is unclear” vs. “Yes, hubs are more conserved”
• Studies cited:
  • Fraser et al. Science (2002)
  • Jordan et al. Genome Res. (2002) (?)
  • Jordan et al. BMC Evol. Biol. (2003)
  • Fraser et al. BMC Evol. Biol. (2003)
  • Hahn et al. J. Mol. Evol. (2004)
  • Wuchty Genome Res. (2004)
• But the “Yes” side appears to be winning
This debate may have arisen because both sides were looking at the wrong variable!
Source: See text
IN FACT, EVOLUTIONARY RATE CORRELATES BEST WITH THE FRACTION OF INTERFACE AVAILABLE SURFACE AREA (DATA IN BINS)
• Large portion of surface area involved in interfaces: slow evolving
• Small portion of surface area involved in interfaces: fast evolving
Source: PMK
IS THERE A DIFFERENCE BETWEEN SINGLE-INTERFACE HUBS AND MULTI-INTERFACE HUBS WITH RESPECT TO NETWORK EVOLUTION?
The Duplication Mutation Model in the structural viewpoint
[Figure: gene duplication; the interaction partners of A are more likely to be duplicated]
• If these models were correct, there would be an enrichment of paralogs among B
Source: PMK
MULTI-INTERFACE HUBS DO NOT APPEAR TO EVOLVE BY GENE DUPLICATION: THE DUPLICATION MUTATION MODEL CAN ONLY EXPLAIN THE EXISTENCE OF SINGLE-INTERFACE HUBS
[Figure: fraction of paralogs between pairs of proteins for a random pair, same partner, same partner different interface, and same partner same interface]
• But that also means that the duplication-mutation model cannot explain the full current interaction network!
Source: PMK
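The paralog comparison can be sketched as follows. The input format is an assumption: for each protein, a mapping from interface id to the set of partners bound there, plus a set of known paralog pairs. Pairs of partners are split by whether they share an interface on the common partner, and the fraction of paralogs is computed for each group. All names are illustrative.

```python
from itertools import combinations

def partner_pairs(partners_by_interface):
    """Split partner pairs of each protein into same/different interface."""
    same, different = [], []
    for interfaces in partners_by_interface.values():
        ids = sorted(interfaces)
        for i in ids:  # pairs competing for one interface
            same.extend(combinations(sorted(interfaces[i]), 2))
        for i, j in combinations(ids, 2):  # pairs on distinct interfaces
            different.extend(
                (a, b)
                for a in sorted(interfaces[i])
                for b in sorted(interfaces[j])
                if a != b
            )
    return same, different

def paralog_fraction(pairs, paralogs):
    """Fraction of pairs that appear in the set of paralog pairs."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(1 for a, b in pairs if frozenset((a, b)) in paralogs)
    return hits / len(pairs)

# Toy example: B1 and B2 are paralogs and share interface i1 of A.
toy_interfaces = {"A": {"i1": {"B1", "B2"}, "i2": {"B3"}}}
toy_paralogs = {frozenset(("B1", "B2"))}
same_pairs, different_pairs = partner_pairs(toy_interfaces)
same_fraction = paralog_fraction(same_pairs, toy_paralogs)
different_fraction = paralog_fraction(different_pairs, toy_paralogs)
```

Under the duplication-mutation picture, duplicated partners inherit the same binding interface, which is why enrichment is expected in the same-interface group.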
CONCLUSIONS (PRELIMINARY)
• The topology of a direct physical interaction network is much less dominated by hubs than previously thought
• Several genomic features that were previously thought to be correlated with the degree are in fact related to the number of interfaces and not the degree
• Specifically, a protein's evolutionary rate appears to depend on the fraction of surface area involved in interactions rather than on the degree
• The current network growth model can only explain a part of currently known networks
Source: PMK
ACKNOWLEDGEMENTS
• Mark Gerstein
• Long Jason Lu
• Yu Brandon Xia
• The Gerstein lab, in particular: Alberto Paccanaro, Jan Korbel, Joel Rozowsky, Tara Gianoulis, Tom Royce
BACKUP
UTILIZING PROTEIN CRYSTAL STRUCTURES, WE CAN DISTINGUISH THE DIFFERENT BINDING INTERFACES (ILLUSTRATIVE)
• Start with a high-confidence interactome dataset
• Collect dimer and multimer structures and map Pfam domains onto the corresponding proteins (Pfam homology)
• Remove ubiquitous domains (e.g., WD40)
• Annotate all interactions that contain Pfam domains found to interact in a crystal structure with this structural information (all others are removed)
• Combine with all structures of yeast protein complexes
• Dataset: ~1269 interactions (combined with all structures that were from yeast)
Explain methodology…
Source: PMK
SOME NETWORK STATISTICS: SCALE-FREENESS?
• In the Pfam dataset, the vast majority (570 out of 790) of the proteins (even hubs) have only one distinct interface
• 220 proteins (~25%) have 2 or more interfaces
• Most hubs are mediated by promiscuous interfaces rather than many interfaces: ~2.6 interactions/interface
[Figure labels: max degree, avg. degree, 161 nodes (degree >5); max interfaces, avg. interfaces, 220 nodes (numint>1)]
Source: PMK
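The statistics above suggest a simple way to separate the two kinds of hubs. The sketch below assumes the same hypothetical input as before (protein mapped to interface id mapped to a set of partners) and borrows the thresholds from the figure labels, degree >5 for hubs and more than one interface for multi-interface; both are parameters and may differ from the definitions used elsewhere in the talk.

```python
def classify_hubs(partners_by_interface, degree_cutoff=5, interface_cutoff=1):
    """Label each protein as non-hub, single-interface hub or multi-interface hub."""
    labels = {}
    for protein, interfaces in partners_by_interface.items():
        # Degree = number of distinct partners across all interfaces.
        degree = len(set().union(*interfaces.values()))
        if degree <= degree_cutoff:
            labels[protein] = "non-hub"
        elif len(interfaces) > interface_cutoff:
            labels[protein] = "multi-interface hub"
        else:
            labels[protein] = "single-interface hub"
    return labels

def interactions_per_interface(partners_by_interface):
    """Average number of partners bound per interface over the whole dataset."""
    sizes = [
        len(partners)
        for interfaces in partners_by_interface.values()
        for partners in interfaces.values()
    ]
    return sum(sizes) / len(sizes) if sizes else 0.0

toy_dataset = {
    "P1": {"a": {"Q1", "Q2", "Q3", "Q4", "Q5", "Q6"}},        # one promiscuous interface
    "P2": {"a": {"Q1", "Q2", "Q3"}, "b": {"Q4", "Q5", "Q6"}},  # two interfaces
    "P3": {"a": {"Q1"}},
}
toy_hub_labels = classify_hubs(toy_dataset)
toy_average = interactions_per_interface(toy_dataset)
```

In the toy data, P1 illustrates the "promiscuous interface" case the slide describes: a hub by degree, but with a single binding site.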