Gain new insights into protein function, network topology, and evolution through a structural perspective. Explore interaction networks, hub proteins, scale-free topology, and evolutionary relationships.
3-D Structural Analysis of Protein Interaction Networks Gives New Insight Into Protein Function, Network Topology and Evolution • CSB Seminar • Philip M. Kim, Gerstein Lab • New Haven, CT • January 19th, 2006
060119_CSB_Talk_PMK MOTIVATION • ILLUSTRATIVE • Network perspective: the two examples look the same (=) • Structural biology perspective: they do not (≠) • There remains a rich source of knowledge unmined by network theorists! [Figure: protein A with partners B1-B4 in both examples; part of the RNA-pol complex and the Cdk/cyclin complex]
OUTLINE • Interaction Networks and their properties • A 3-D structural point of view • Network properties revisited • Conclusions
PROTEIN INTERACTION NETWORKS IN YEAST • ILLUSTRATIVE • A snapshot of the current interactome [Figure: interaction network from DIP (Database of Interacting Proteins)] • Description and methodologies • Determined by: large-scale yeast two-hybrid, TAP-tagging, literature curation • Currently over 20,000 unique interactions available in yeast • Spawned a field of computational “graph theory” analyses that view proteins as “nodes” and interactions as “edges” Source: Gavin et al. Nature (2002), Uetz et al. Nature (2000), Cytoscape and DIP
TINY GLOSSARY: DEGREE AND HUBS • Degree = number of interaction partners of a protein • A: Degree = 5 • A is a “Hub”* • C: Degree = 1 * The definition of hubs is somewhat arbitrary; usually a degree cutoff is used Source: PMK
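The glossary terms can be made concrete with a minimal sketch. The toy edge list and the cutoff of 5 are assumptions chosen only to reproduce the degrees quoted on the slide (A: 5, C: 1); they are not taken from the slide's figure.

```python
from collections import defaultdict

# Hypothetical toy network: A has five partners, C has one.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("A", "F")]

def degrees(edge_list):
    """Return the number of distinct interaction partners per protein."""
    partners = defaultdict(set)
    for u, v in edge_list:
        if u != v:  # ignore self-interactions in this sketch
            partners[u].add(v)
            partners[v].add(u)
    return {protein: len(p) for protein, p in partners.items()}

def hubs(degree_by_protein, cutoff=5):
    """Proteins whose degree meets an arbitrary cutoff (assumed here: 5)."""
    return sorted(p for p, k in degree_by_protein.items() if k >= cutoff)

deg = degrees(edges)
print(deg["A"], deg["C"], hubs(deg))  # 5 1 ['A']
```

Changing `cutoff` changes which proteins count as hubs, which is exactly the arbitrariness the footnote points out.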
INTERESTING PROPERTIES OF INTERACTION NETWORKS • OVERVIEW • Examples of studies • Network topology: What distribution does the degree (number of interaction partners) follow? • Relationship of topology and genomic features: What is the relationship between the degree and a protein's essentiality? Is there a relationship between a protein's connectivity and expression profile? What is the relationship between a protein's evolutionary rate and its degree? • Network evolution: How did the observed network topology evolve? Source: Various, see following slides
INTERACTION NETWORKS ARE SCALE-FREE: THEIR TOPOLOGY IS DOMINATED BY SO-CALLED HUBS • So-called scale-free topology has been observed in many kinds of networks (among them interaction networks): p(k) ~ k^(-γ), where p(k) is the fraction of nodes with degree k • Scale-freeness: a small number of hubs and a large number of poorly connected nodes (“power-law behavior”) • Topology is dominated by “hubs” • Scale-freeness is in stark contrast to a normal (Gaussian) distribution Source: Barabasi, A. and Albert, R., Science (1999)
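As a rough illustration of the power-law form, the sketch below builds an empirical degree distribution and estimates the exponent γ from the slope of log p(k) against log k. The degree list is synthetic (constructed so that counts fall off as 1/k²), and a least-squares fit on log-log axes is only a crude estimator assumed here for brevity, not the method used in the talk.

```python
import math
from collections import Counter

def degree_distribution(degrees):
    """Empirical p(k): fraction of nodes with degree k."""
    counts = Counter(degrees)
    n = len(degrees)
    return {k: c / n for k, c in sorted(counts.items())}

def loglog_slope(pk):
    """Least-squares slope of log p(k) versus log k (degrees k > 0 only)."""
    ks = [k for k in pk if k > 0]
    xs = [math.log(k) for k in ks]
    ys = [math.log(pk[k]) for k in ks]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Synthetic degrees: 3600/k^2 nodes of degree k, for k = 1..6.
toy_degrees = [k for k in range(1, 7) for _ in range(3600 // (k * k))]
gamma = -loglog_slope(degree_distribution(toy_degrees))
print(round(gamma, 3))  # 2.0 for this constructed example
```

On real interaction data the tail is noisy, so the fitted value depends strongly on binning and on the range of k included.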
HUBS TEND TO BE IMPORTANT PROTEINS: THEY ARE MORE LIKELY TO BE ESSENTIAL AND TEND TO BE MORE CONSERVED • By now it is well documented that proteins with a large degree tend to be essential proteins in yeast (“Hubs are essential”) • Likewise, it has been found that hubs tend to evolve more slowly than other proteins (“Hubs are slower evolving”) Source: Jeong et al. Nature (2001), Yu et al. TiG (2004) and Fraser et al. Science (2002)
… OR ARE THEY? THERE IS AN ONGOING DEBATE ABOUT THE RELATIONSHIP BETWEEN EVOLUTIONARY RATE AND DEGREE • EXAMPLES • No, the relationship is unclear • Yes, hubs are more conserved • Fraser et al. Science (2002) • Jordan et al. Genome Res. (2002) ? • Jordan et al. BMC Evol. Biol. (2003) • But the “Yes” side appears to be winning • Fraser et al. BMC Evol. Biol. (2003) • Hahn et al. J. Mol. Evol. (2004) • Wuchty Genome Res. (2004) • Fraser Nature Genetics (2005) Source: See text
THERE IS A RELATIONSHIP BETWEEN NETWORK TOPOLOGY AND GENE EXPRESSION DYNAMICS [Figure: frequency vs. co-expression correlation] Source: Han et al. Nature (2004) and Yu*, Kim* et al. (Submitted)
SCALE-FREENESS GENERALLY EVOLVES THROUGH PREFERENTIAL ATTACHMENT (THE RICH GET RICHER) • ILLUSTRATIVE • The Duplication Mutation Model • Description • Theoretical work shows that a mechanism of preferential attachment leads to a scale-free topology (“The rich get richer”) • In interaction networks, gene duplication followed by mutation of the duplicated gene is generally thought to lead to preferential attachment • Simple reasoning: the partners of a hub are more likely to be duplicated than the partners of a non-hub [Figure: gene duplication; the interaction partners of A are more likely to be duplicated] Source: Albert et al. Rev. Mod. Phys. (2002) and Middendorf et al. PNAS (2005)
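A minimal simulation sketch of the duplication-mutation idea follows. It assumes one simple variant: at each step a randomly chosen protein is duplicated, and the copy keeps each inherited interaction with probability `keep_prob` (the "mutation" step). The function name, the parameter values, and the two-node starting network are all illustrative assumptions, not the specific models in the cited papers.

```python
import random

def duplication_mutation(n_nodes, keep_prob=0.5, seed=0):
    """Grow an interaction network by repeated duplication and edge loss.

    Returns an adjacency dict: node -> set of interaction partners.
    """
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # assumed seed network: one interacting pair
    while len(adj) < n_nodes:
        parent = rng.choice(sorted(adj))  # gene chosen for duplication
        child = len(adj)                  # label of the new duplicate
        # The duplicate inherits each of the parent's interactions
        # independently with probability keep_prob.
        kept = {p for p in sorted(adj[parent]) if rng.random() < keep_prob}
        adj[child] = kept
        for partner in kept:
            adj[partner].add(child)
    return adj

adj = duplication_mutation(200)
top_degrees = sorted((len(v) for v in adj.values()), reverse=True)[:5]
print(top_degrees)
```

The "rich get richer" effect arises because a protein with k partners gains a new interaction whenever any one of those k partners is duplicated and the link is kept, so its chance of gaining links grows with its degree. In this variant a duplicate can also end up with no interactions at all.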
A 3-D STRUCTURAL POINT OF VIEW
THERE IS A PROBLEM WITH SCALE-FREENESS AND REALLY BIG HUBS IN INTERACTION NETWORKS • ILLUSTRATIVE • A really big hub (>200 interactions) • Gedankenexperiment: What is the maximum number of neighbors a protein can have? (At most 12 equal-sized spheres can touch a central one.) • Conclusion • Clearly, a protein is very unlikely to have >200 simultaneous interactors • There appears to be an obvious discrepancy between >200 and 12 • Some of the >200 are most likely false positives • Some others are going to be mutually exclusive interactors (i.e. binding to the same interface) • Wouldn’t it be great to be able to see the different binding interfaces? Source: DIP, Institut fuer Festkoerperchemie (Univ. Tuebingen)
UTILIZING PROTEIN CRYSTAL STRUCTURES, WE CAN DISTINGUISH THE DIFFERENT BINDING INTERFACES • ILLUSTRATIVE • Interactome: ~20000 interactions • Use a high-confidence filter • Map Pfam domains to all proteins in the interactome • PDB: ~10000 structures of interactions* • Homology mapping of Pfam domains to all structures of interactions • Annotate interactions with available structures, discard all others • Combine with all structures of yeast protein complexes • Distinguish interfaces * Many redundant structures Source: PMK
SHORT DIGRESSION: THIS ALLOWS US TO DISTINGUISH SYSTEMATICALLY BETWEEN SIMULTANEOUSLY POSSIBLE AND MUTUALLY EXCLUSIVE INTERACTIONS [Figure panels: mutually exclusive interactions; simultaneously possible interactions] Source: PMK
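The distinction can be sketched in code once each interaction is annotated with the interface it uses. The sketch assumes a simplified rule: an interaction is "mutually exclusive" if another partner binds the same interface on the same protein, and "simultaneously possible" otherwise. The tuple layout and the interface labels `if1`/`if2` are hypothetical; in the pipeline described in the talk such annotations come from structures and Pfam domain mapping.

```python
from collections import defaultdict

# Hypothetical annotations: (protein, partner, interface used on the protein).
annotated = [
    ("A", "B1", "if1"),
    ("A", "B2", "if1"),  # B1 and B2 compete for the same interface of A
    ("A", "B3", "if2"),  # B3 binds elsewhere on A
]

def classify_interactions(annotations):
    """Label each (protein, partner) interaction by interface sharing."""
    users = defaultdict(set)  # (protein, interface) -> partners binding there
    for protein, partner, interface in annotations:
        users[(protein, interface)].add(partner)
    labels = {}
    for protein, partner, interface in annotations:
        shared = len(users[(protein, interface)]) > 1
        labels[(protein, partner)] = (
            "mutually exclusive" if shared else "simultaneously possible"
        )
    return labels

labels = classify_interactions(annotated)
print(labels[("A", "B1")], "|", labels[("A", "B3")])
# mutually exclusive | simultaneously possible
```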
SIMULTANEOUSLY POSSIBLE INTERACTIONS (“PERMANENT”) MORE OFTEN LINK PROTEINS THAT ARE FUNCTIONALLY SIMILAR, COEXPRESSED AND CO-LOCATED [Figure: four panels comparing mutually exclusive and simultaneously possible interactions: fraction same biological process, fraction same cellular component, fraction same molecular function, co-expression correlation; p << 0.001 in each panel] Source: PMK
THIS IS WHAT THE RESULTING NETWORK LOOKS LIKE • The Structural Interaction Dataset (SID) • Properties • Represents a “very high confidence” network • Total of 873 nodes and 1269 interactions, each of which is structurally characterized • 438 interactions are classified as mutually exclusive and 831 as simultaneously possible • While much smaller than DIP, it is similar in size to other high-confidence datasets Source: PDB, Pfam, iPfam and PMK
NETWORK PROPERTIES REVISITED
REMEMBER THE NETWORK PROPERTIES WE DESCRIBED BEFORE? • OVERVIEW • Examples of studies • Network topology: What distribution does the degree (number of interaction partners) follow? Does the network easily separate into more than one component? • Relationship of topology and genomic features: What is the relationship between the degree and a protein's essentiality? Is there a relationship between a protein's connectivity and expression profile? What is the relationship between a protein's evolutionary rate and its degree? • Network evolution: How did the observed network topology evolve? Source: Various, see following slides
THERE DO NOT APPEAR TO BE THE KINDS OF REALLY BIG HUBS SEEN BEFORE: IS THE TOPOLOGY STILL SCALE-FREE? • Degree distribution • Properties • With the maximum number of interactions at 13, there are no “really big hubs” in this network • Note that in other high-confidence datasets (of similar size), there are still proteins with a much higher degree • The degree distribution appears to top out much earlier and to be less scale-free than that of other networks [Figure: degree distributions of conventional datasets (e.g. DIP) and of our dataset (SID)] Source: PMK
IT’S REALLY ONLY THE MULTI-INTERFACE HUBS THAT ARE SIGNIFICANTLY MORE LIKELY TO BE ESSENTIAL [Figure: percentage of essential proteins for the entire genome, all proteins in our dataset, single-interface hubs only, and multi-interface hubs only] Source: PMK
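The comparison above can be sketched as follows. The slides do not spell out the exact definitions, so this sketch assumes that a hub is a protein with at least `degree_cutoff` partners and that a multi-interface hub uses more than one distinct interface; the proteins, interface labels, and essentiality set are invented toy data.

```python
def hub_class(interface_by_partner, degree_cutoff=5):
    """Classify one protein from a mapping partner -> interface it binds.

    The cutoff and the 'more than one interface' rule are assumptions.
    """
    if len(interface_by_partner) < degree_cutoff:
        return "non-hub"
    n_interfaces = len(set(interface_by_partner.values()))
    return "multi-interface hub" if n_interfaces > 1 else "single-interface hub"

def essential_fraction(proteins, essential):
    """Fraction of the given proteins found in the essential set."""
    proteins = list(proteins)
    if not proteins:
        return float("nan")
    return sum(p in essential for p in proteins) / len(proteins)

# Hypothetical toy data.
toy_proteins = {
    "P1": {"a": "if1", "b": "if1", "c": "if1", "d": "if1", "e": "if1"},
    "P2": {"a": "if1", "b": "if2", "c": "if3", "d": "if1", "e": "if2"},
    "P3": {"a": "if1"},
}
toy_essential = {"P2"}

classes = {name: hub_class(m) for name, m in toy_proteins.items()}
multi = [p for p, c in classes.items() if c == "multi-interface hub"]
single = [p for p, c in classes.items() if c == "single-interface hub"]
print(classes)
print(essential_fraction(multi, toy_essential),
      essential_fraction(single, toy_essential))
```

Whether a difference between such fractions is significant would need a statistical test on the real dataset, which this sketch does not attempt.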
DATE-HUBS AND PARTY-HUBS ARE REALLY SINGLE-INTERFACE AND MULTI-INTERFACE HUBS, RESPECTIVELY [Figure: frequency vs. expression correlation for all proteins in our dataset, single-interface hubs only, and multi-interface hubs only] Source: Han et al. Nature (2004) and PMK
AND ONLY MULTI-INTERFACE PROTEINS EVOLVE MORE SLOWLY; SINGLE-INTERFACE HUBS DO NOT [Figure: evolutionary rate (dN/dS) for the entire genome, all proteins in our dataset, single-interface hubs only, and multi-interface hubs only] Source: PMK
… OR ARE THEY? THERE IS AN ONGOING DEBATE ABOUT THE RELATIONSHIP BETWEEN EVOLUTIONARY RATE AND DEGREE • No, the relationship is unclear • Yes, hubs are more conserved • Fraser et al. Science (2002) • Jordan et al. Genome Res. (2002) ? • Jordan et al. BMC Evol. Biol. (2003) • But the “Yes” side appears to be winning • Fraser et al. BMC Evol. Biol. (2003) • Hahn et al. J. Mol. Evol. (2004) • Wuchty Genome Res. (2004) • This debate may have arisen because both sides were looking at the wrong variable! Source: See text
IN FACT, EVOLUTIONARY RATE CORRELATES BEST WITH THE FRACTION OF AVAILABLE SURFACE AREA INVOLVED IN INTERFACES • DATA IN BINS • Small portion of surface area involved in interfaces: fast evolving • Large portion of surface area involved in interfaces: slow evolving Source: PMK
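A binned comparison of this kind can be sketched as below. The sketch assumes each protein is described by its interface surface area, its total available surface area, and an evolutionary rate (dN/dS); the numbers are invented, and equal-width bins on the interface fraction are an assumption, since the slide does not state how the bins were chosen.

```python
def interface_fraction(interface_area, total_area):
    """Fraction of available surface area involved in interfaces."""
    return interface_area / total_area

def mean_rate_by_bin(records, n_bins=2):
    """Mean evolutionary rate per equal-width bin of interface fraction.

    records: iterable of (fraction in [0, 1], rate). Empty bins give None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for fraction, rate in records:
        idx = min(int(fraction * n_bins), n_bins - 1)  # fraction 1.0 -> last bin
        sums[idx] += rate
        counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

# Hypothetical proteins: (interface area, total area, dN/dS).
toy = [(100, 1000, 0.20), (200, 1000, 0.30), (600, 1000, 0.10), (800, 1000, 0.05)]
records = [(interface_fraction(a, t), rate) for a, t, rate in toy]
low_bin, high_bin = mean_rate_by_bin(records)
print(low_bin > high_bin)  # True for this toy data
```

In this toy data the low-fraction bin has the higher mean rate, mirroring the "small portion: fast evolving" pattern stated on the slide.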
IS THERE A DIFFERENCE BETWEEN SINGLE-INTERFACE HUBS AND MULTI-INTERFACE HUBS WITH RESPECT TO NETWORK EVOLUTION? • The Duplication Mutation Model • In the structural viewpoint • If these models were correct, there would be an enrichment of paralogs among the interaction partners (B) of a hub A [Figure: gene duplication; the interaction partners of A are more likely to be duplicated] Source: PMK
MULTI-INTERFACE HUBS DO NOT APPEAR TO EVOLVE BY GENE DUPLICATION: THE DUPLICATION MUTATION MODEL CAN ONLY EXPLAIN THE EXISTENCE OF SINGLE-INTERFACE HUBS • But that also means that the duplication-mutation model cannot explain the full current interaction network! [Figure: fraction of paralogs between pairs of proteins for random pairs, pairs with the same partner, same partner and different interface, and same partner and same interface] Source: PMK
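The paralog comparison can be sketched as follows. The sketch assumes a known set of paralogous protein pairs and a mapping from each partner of a hub to the interface it binds; all names and values are invented toy data, and how paralogs were actually identified is not stated on the slide.

```python
from itertools import combinations

def partner_pairs(interface_by_partner, same_interface):
    """Pairs of partners of one hub that share (or do not share) an interface."""
    items = sorted(interface_by_partner.items())
    return [
        (p1, p2)
        for (p1, i1), (p2, i2) in combinations(items, 2)
        if (i1 == i2) == same_interface
    ]

def paralog_fraction(pairs, paralog_pairs):
    """Fraction of the given protein pairs that are paralogs."""
    pairs = list(pairs)
    if not pairs:
        return float("nan")
    return sum(frozenset(p) in paralog_pairs for p in pairs) / len(pairs)

# Hypothetical hub A: B1 and B2 bind interface if1, B3 and B4 bind if2.
hub_partners = {"B1": "if1", "B2": "if1", "B3": "if2", "B4": "if2"}
toy_paralogs = {frozenset({"B1", "B2"})}  # assumed paralog annotation

same = paralog_fraction(partner_pairs(hub_partners, True), toy_paralogs)
diff = paralog_fraction(partner_pairs(hub_partners, False), toy_paralogs)
print(same, diff)  # 0.5 0.0
```

Under the duplication-mutation model, duplicated partners would be expected to bind the same interface, so an enrichment of paralogs only in the same-interface group is consistent with the slide's conclusion.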
CONCLUSIONS • PRELIMINARY • The topology of a direct physical interaction network is much less dominated by hubs than previously thought • Several genomic features that were previously thought to be correlated with the degree are in fact related to the number of interfaces and not the degree • Specifically, a protein's evolutionary rate appears to depend on the fraction of surface area involved in interactions rather than on the degree • The current network growth model can only explain a part of currently known networks Source: PMK
ACKNOWLEDGEMENTS • Mark Gerstein • The nets group (Haiyuan, Jason, Brandon, Tara, Kevin, Zhengdong and Alberto) • The Gerstein Lab