Biological networks
Protein-protein interactions (PPIs) are crucial for cellular functions, mediating most biological activities through physical associations. This overview of PPIs includes definitions, examples like receptor-ligand and transcription factor interactions, and the significance of disturbances resulting in diseases. We explore methods for identifying PPIs: the yeast two-hybrid system, tandem affinity purification, and computational approaches. Additionally, we highlight public databases for interaction data, emphasizing their importance in understanding complex biological networks in various organisms, including humans.
Presentation Transcript
Biological networks Bing Zhang Department of Biomedical Informatics Vanderbilt University bing.zhang@vanderbilt.edu
Protein-protein interaction (PPI) • Definition • Physical association of two or more protein molecules • Examples • Receptor-ligand interactions • Kinase-substrate interactions • Transcription factor-co-activator interactions • Multiprotein complexes, e.g. multimeric enzymes (figure: RNA polymerase II, 12 subunits; Cramer et al. Science 292:1863, 2001) BCHM352, Spring 2011
Significance of protein interaction • Most proteins mediate their function through interacting with other proteins • To form molecular machines • To participate in various regulatory processes • Distortions of protein interactions can cause diseases
Yeast two-hybrid • Method • Bait strain: a protein of interest, bait (B), fused to a DNA-binding domain (DBD) • Prey strains: ORFs fused to a transcriptional activation domain (AD) • Mate the bait strain to prey strains and plate diploid cells on selective media (e.g. without histidine) • If bait and prey interact in the diploid cell, they reconstitute a transcription factor, which activates a reporter gene whose expression allows the diploid cell to grow on selective media • Pick colonies, isolate DNA, and sequence to identify the ORF interacting with the bait • Pros • High-throughput • Can detect transient interactions • Cons • False positives • Non-physiological (done in the yeast nucleus) • Can’t detect multiprotein complexes Uetz P. Curr Opin Chem Biol. 6:57, 2002
Tandem affinity purification • Method • TAP tag: Protein A, Calmodulin binding domain, TEV protease cleavage site • Bait protein gene is fused with the DNA sequences encoding TAP tag • Tagged bait is expressed in cells and forms native complexes • Complexes purified by TAP method • Components of each complex are identified through gel separation followed by MS/MS • Pros • High-throughput • Physiological setting • Can detect large stable protein complexes • Cons • High false positives • Can’t detect transient interactions • Can’t detect interactions not present under the given condition • Tagging may disturb complex formation • Binary interaction relationship is not clear Chepelev et al. Biotechnol & Biotechnol 22:1, 2008
Large scale protein interaction identification • Experimental • Yeast two-hybrid • Tandem affinity purification • Computational • Gene fusion • Ortholog interaction • Phylogenetic profiling • Microarray gene co-expression Valencia et al. Curr. Opin. Struct. Biol, 12:368, 2002
Protein interaction data in the public domain • Database of Interacting Proteins (DIP) http://dip.doe-mbi.ucla.edu/ • The Molecular INTeraction database (MINT) http://mint.bio.uniroma2.it/mint/ • The Biomolecular Interaction Network Database (BIND) http://www.binddb.org/ • The General Repository for Interaction Datasets (BioGRID) http://www.thebiogrid.org/ • Human Protein Reference Database (HPRD) http://www.hprd.org • Online Predicted Human Interaction Database (OPHID) http://ophid.utoronto.ca • The Munich Information Center for Protein Sequences (MIPS) http://mips.gsf.de
HPRD
Protein interaction networks • Saccharomyces cerevisiae: Jeong et al. Nature, 411:41, 2001 • Drosophila melanogaster: Giot et al. Science, 302:1727, 2003 • Caenorhabditis elegans: Li et al. Science, 303:540, 2004 • Homo sapiens: Rual et al. Nature, 437:1173, 2005
Gene regulatory networks • Experimental • Chromatin immunoprecipitation (ChIP) • ChIP-chip • ChIP-seq • Computational • Promoter sequence analysis • Reverse engineering from microarray gene expression data • Public databases • Transfac (http://www.gene-regulation.com) • MSigDB (http://www.broadinstitute.org/gsea/msigdb) • hPDI (http://bioinfo.wilmer.jhu.edu/PDI/) Shen-Orr et al. Nat Genet, 31:64, 2002
KEGG metabolic network
Network visualization tools • Cytoscape • http://www.cytoscape.org Gehlenborg et al. Nature Methods, 7:S56, 2010
Graph representation of networks • Graph: a set of objects called nodes or vertices connected by links called edges. In mathematics and computer science, a graph is the basic object of study in graph theory. (Figure labels: node, edge; RNA polymerase II, Cramer et al. Science 292:1863, 2001)
Undirected graph vs directed graph • Protein interaction network (undirected) • Nodes: proteins • Edges: physical interactions • Krogan et al. Nature 440:637, 2006 • Metabolic network (directed, substrate->product) • Nodes: metabolites • Edges: enzymes • Ravasz et al. Science 297:1551, 2002 • Transcriptional regulatory network (directed, TF->target gene, e.g. Fhl1->RPL2B) • Nodes: transcription factors and genes • Edges: transcriptional regulation • Lee et al. Science 298:799, 2002
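The undirected/directed distinction can be sketched in Python, assuming each network is given as a list of node pairs. Only the Fhl1->RPL2B edge comes from the slide; every other node name and edge here is hypothetical and for illustration only.

```python
from collections import defaultdict

def build_undirected(edges):
    """Adjacency sets for an undirected graph: each edge is stored in both directions."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def build_directed(edges):
    """Successor and predecessor sets for a directed graph (source -> target)."""
    succ, pred = defaultdict(set), defaultdict(set)
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)
    return succ, pred

# Hypothetical protein interaction edges: the order within a pair does not matter.
ppi = build_undirected([("A", "B"), ("B", "C")])

# Regulatory edges, TF -> target gene. "geneX" is a made-up target.
succ, pred = build_directed([("Fhl1", "RPL2B"), ("Fhl1", "geneX")])

out_degree = len(succ["Fhl1"])  # number of genes Fhl1 regulates in this toy network
in_degree = len(pred["Fhl1"])   # number of regulators of Fhl1 in this toy network
```

Storing predecessors separately is a design choice that makes in-degree lookups cheap; a single successor map would also represent the same directed graph.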
Degree, path, shortest path • Degree: the number of edges adjacent to a node. A simple measure of node centrality. In a directed graph, in degree counts incoming edges and out degree counts outgoing edges. • Path: a sequence of nodes such that from each of its nodes there is an edge to the next node in the sequence. • Shortest path: a path between two nodes such that the sum of the distances of its constituent edges is minimized. • Figure examples: Fhl1 has out degree 4 and in degree 0; YDL176W has degree 3.
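These definitions can be illustrated with a small Python sketch. It assumes an unweighted, undirected graph stored as adjacency lists, so every edge has distance 1 and breadth-first search finds a shortest path; weighted edges would need a different algorithm such as Dijkstra's. The five-node graph is hypothetical.

```python
from collections import deque

def degree(adj, node):
    """Number of edges adjacent to a node in an undirected graph."""
    return len(adj[node])

def shortest_path(adj, source, target):
    """Breadth-first search. Returns one shortest path as a list of nodes,
    or None if the target cannot be reached from the source."""
    if source == target:
        return [source]
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj.get(node, ()):
            if nb not in parent:
                parent[nb] = node
                if nb == target:
                    # Walk back through the parents to rebuild the path.
                    path = [nb]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return path[::-1]
                queue.append(nb)
    return None

# Hypothetical toy network.
adj = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
path = shortest_path(adj, "A", "E")  # ['A', 'B', 'D', 'E']
```

Here two shortest paths of equal length exist (through B or through C); the function returns whichever it discovers first.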
Obama vs Lady Gaga: who is more influential?
Account   Twitter following (out degree)   Twitter followers (in degree)
Obama     701,301                          7,035,548
Gaga      144,263                          8,873,525
Eminem    0                                3,509,469
Network properties (I): hubs • Random network • 130 nodes, 215 edges • Homogeneous: most nodes have approximately the same number of links • Five red nodes with the highest number of links reach 27% of the nodes • Scale-free network • 130 nodes, 215 edges • Heterogeneous: the majority of the nodes have one or two links but a few nodes have a large number of links • Five red nodes with the highest degrees reach 60% of the nodes (hubs) Albert et al., Nature, 406:378, 2000
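The contrast between the two network types can be explored with a rough Python sketch. It assumes a simple preferential-attachment growth rule as a stand-in for a scale-free network and a uniformly random edge set as the random network. The function names, the choice of two links per new node, and the seed are all assumptions; the sketch does not reproduce the 215-edge networks or the 27% and 60% figures from the slide.

```python
import random
from itertools import combinations

def preferential_attachment(n, m, rng):
    """Grow a graph in which each new node links to m existing nodes chosen
    with probability proportional to their current degree."""
    # Start from a small fully connected core of m + 1 nodes.
    edges = set(combinations(range(m + 1), 2))
    # Each node appears in this list once per incident edge.
    stubs = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.add((t, new))
            stubs.extend((t, new))
    return edges

def random_graph(n, n_edges, rng):
    """Pick n_edges distinct node pairs uniformly at random."""
    return set(rng.sample(list(combinations(range(n), 2)), n_edges))

def degrees(n, edges):
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def top_hub_reach(n, edges, k=5):
    """Fraction of nodes that are one of the k highest-degree nodes or adjacent to one."""
    deg = degrees(n, edges)
    hubs = set(sorted(range(n), key=lambda x: deg[x], reverse=True)[:k])
    reached = set(hubs)
    for u, v in edges:
        if u in hubs:
            reached.add(v)
        if v in hubs:
            reached.add(u)
    return len(reached) / n

rng = random.Random(0)  # fixed seed so repeated runs give the same graphs
sf_edges = preferential_attachment(130, 2, rng)
rand_edges = random_graph(130, len(sf_edges), rng)  # same node and edge counts
sf_reach = top_hub_reach(130, sf_edges)
rand_reach = top_hub_reach(130, rand_edges)
```

Comparing `sf_reach` with `rand_reach`, or the largest entries of `degrees(...)` for each graph, gives a feel for how hubs concentrate connectivity when degree is heterogeneous.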
Scale-free biological networks • Metabolic network, C. elegans: Jeong et al, Nature, 407:651, 2000 • Protein interaction network, H. sapiens: Stelzl et al. Cell, 122:957, 2005 • Gene co-expression network, S. cerevisiae: Noort et al, EMBO Reports, 5:280, 2004
Network properties (II): small world network • Stanley Milgram’s small world experiment • Social network • Average path length between two people • Instruction given to participants: "If you do not know the target person on a personal basis, do not try to contact him directly. Instead, mail this folder to a personal acquaintance who is more likely than you to know the target person." • Six degrees of separation • Small world network: a graph in which most nodes can be reached from every other by a small number of steps. • Biological interpretation: efficiency in transfer of biological information (Figure: map labeled Wichita, Omaha, Boston)
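The "small number of steps" idea can be quantified as the average shortest-path length over all connected node pairs. The Python sketch below assumes an unweighted, undirected graph stored as adjacency lists; the six-node ring and the single added shortcut are hypothetical and only show how one long-range link shortens the average path.

```python
from collections import deque

def bfs_distances(adj, source):
    """Number of steps from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs of distinct, connected nodes."""
    total, pairs = 0, 0
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0

# A ring of six nodes: each node is linked to its two neighbors only.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

# The same ring with one shortcut between opposite nodes 0 and 3.
shortcut = {i: list(nbs) for i, nbs in ring.items()}
shortcut[0].append(3)
shortcut[3].append(0)

ring_apl = average_path_length(ring)          # 1.8
shortcut_apl = average_path_length(shortcut)  # 50 / 30, about 1.67
```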
Network properties (III): motifs • Network motifs: patterns that occur in the real network significantly more often than in randomized networks. • Three-node patterns, e.g. feed-forward loop, feedback loop Milo et al., Science, 298:824, 2002
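Counting the two three-node patterns named above can be sketched in Python, assuming a directed graph stored as a mapping from each node to the set of its targets. The sketch covers only the counting step: deciding whether a pattern is a motif also requires comparing the count with counts in randomized networks, which is omitted here. The example graphs are hypothetical.

```python
def count_feed_forward_loops(succ):
    """Count triples (x, y, z) with edges x->y, y->z and x->z."""
    count = 0
    for x, targets in succ.items():
        for y in targets:
            if y == x:
                continue
            for z in succ.get(y, ()):
                if z != x and z != y and z in targets:
                    count += 1
    return count

def count_feedback_loops(succ):
    """Count three-node cycles x->y->z->x."""
    count = 0
    for x, targets in succ.items():
        for y in targets:
            if y == x:
                continue
            for z in succ.get(y, ()):
                if z != x and z != y and x in succ.get(z, ()):
                    count += 1
    # Each cycle is found once from each of its three starting nodes.
    return count // 3

# Hypothetical feed-forward loop: X regulates Y and Z, and Y also regulates Z.
ffl = {"X": {"Y", "Z"}, "Y": {"Z"}, "Z": set()}

# Hypothetical feedback loop: A -> B -> C -> A.
cycle = {"A": {"B"}, "B": {"C"}, "C": {"A"}}

n_ffl = count_feed_forward_loops(ffl)    # 1
n_cycle = count_feedback_loops(cycle)    # 1
```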
Network properties (IV): modularity • Modularity: the organization of a network into modules, where a module is a group of physically or functionally linked molecules (nodes) that work together to achieve a relatively distinct function. • Examples • Transcriptional module: a set of co-regulated genes sharing a common function • Protein complex: an assembly of proteins that builds up some cellular machinery; it commonly spans a dense sub-network of proteins in a protein interaction network • Signaling pathway: a chain of interacting proteins propagating a signal in the cell • Figures: protein interaction modules (Palla et al, Nature, 435:841, 2005); gene co-expression modules (Shi et al, BMC Syst Biol, 4:74, 2010)
Network distance vs functional similarity • Proteins that lie closer to one another in a protein interaction network are more likely to have similar functions and to be involved in similar biological processes. Sharan et al. Mol Syst Biol, 3:88, 2007
Network-based disease gene prioritization • For a specific disease, candidate genes can be ranked based on their proximity to known disease genes. Kohler et al. Am J Hum Genet. 82:949, 2008
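One way to make "proximity to known disease genes" concrete is a random walk with restart that starts from the known disease genes and scores every other gene by how often the walker visits it. The Python sketch below is a minimal illustration under stated assumptions, not the cited authors' implementation: the restart probability of 0.3, the function name, and the five-gene network are all hypothetical, and every seed gene is assumed to be present in the network.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that starts on the seed
    nodes and, at every step, jumps back to a seed with probability `restart`."""
    nodes = list(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            if adj[n]:
                # Spread the remaining probability evenly over the neighbors.
                share = (1.0 - restart) * p[n] / len(adj[n])
                for nb in adj[n]:
                    nxt[nb] += share
            else:
                # An isolated node keeps its own probability.
                nxt[n] += (1.0 - restart) * p[n]
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p

# Hypothetical network: D1 and D2 are known disease genes, C1 and C2 are candidates.
adj = {
    "D1": ["C1"],
    "D2": ["C1"],
    "C1": ["D1", "D2", "X"],
    "X": ["C1", "C2"],
    "C2": ["X"],
}
scores = random_walk_with_restart(adj, seeds={"D1", "D2"})
ranked = sorted(["C1", "C2"], key=scores.get, reverse=True)  # ['C1', 'C2']
```

C1 ranks first because it interacts directly with both known disease genes, while C2 is reachable only through X. A simpler alternative would rank candidates by shortest-path distance to the nearest known disease gene.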
Summary • Biological networks • Protein-protein interaction network; Gene regulatory network; Metabolic network • Graph representation of networks • Graph, node, edge, undirected graph, directed graph, degree, path, shortest path • Network properties • Hubs and scale-free degree distribution • Small-world • Motifs • Modularity • Network-based applications • Disease gene prioritization