Protein-protein interface recognition

Protein-protein interface recognition Xiang Li @ Jie’s Lab

Topics before the featured paper
• Brief introduction to protein-protein interfaces: hydrophobicity, hydrophilicity, morphology, amino acid composition.
• Brief introduction to protein-protein interface prediction: sequence based, structure based, sequence & structure based.

Properties of protein-protein interfaces
• Hydrophobicity
• Hydrophilicity
• Morphology
• Amino acid composition

Hydrophobicity
• On average, interfaces contain fewer hydrophobic interactions than the protein interior, but more than the rest of the protein surface (Tsai & Nussinov et al., Protein Sci., 1997).
• Hydrophobic interactions contribute less strongly to protein-protein association than to protein folding (Tsai & Nussinov et al., Protein Sci., 1997).
• A hydrophobic core is present in only a minority of protein-protein interfaces (Larsen, Olson, and Goodsell, Structure, 1998).
• Homodimer interfaces are the most hydrophobic (Janin, Thornton, and Nussinov).

Hydrophilicity
• Hydrophilic residues buried in the interface contribute to the specificity of protein-protein recognition (Honig et al. and Nussinov et al.).
• Hydrophilic residues, especially charged residues, buried in the interface may or may not stabilize the complex (Sheinerman & Honig, 2000, COSB; Nussinov, 1997, JMB; Baker, 2003, PNAS).
• Antibody-antigen interfaces are the most hydrophilic (Thornton, Janin).

Morphology
• Small hydrophobic patches, polar interactions, and water molecules are scattered over the entire interface (Larsen et al., 1998, Structure).
• Considerable pockets exist between the interacting protein surfaces (Hubbard et al., 1994, Protein Sci.).
• Water molecules play an important role in interface packing (Janin, 1999, JMB; Hubbard et al., 1994, Protein Sci.).

Amino acid composition
• The rim of the interface is similar in composition to the rest of the surface, but the core has a distinctive amino acid composition (Janin, 2002, JMB).
• Amino acid composition also differs substantially between different types of interfaces. The interface type can be predicted from amino acid composition alone with ≥ 63% accuracy (Ofran and Rost, 2003, JMB).
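To make the notion of "amino acid composition" concrete, here is a minimal sketch that computes the fractional composition of two residue sets. The `core` and `rim` strings are made-up placeholders, not data from the cited papers, and the function name `composition` is hypothetical.

```python
from collections import Counter


def composition(residues):
    """Return the fraction of each amino acid (one-letter code) in `residues`."""
    counts = Counter(residues)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


# Hypothetical residue sets, for illustration only.
core = "LLVIFWYM"
rim = "KEDSTKEQ"

core_comp = composition(core)
rim_comp = composition(rim)

# Compare how often leucine and lysine appear in each region.
for aa in ("L", "K"):
    print(aa, core_comp.get(aa, 0.0), rim_comp.get(aa, 0.0))
```

Comparing such composition vectors between core, rim, and the rest of the surface is one simple way to quantify the differences described on this slide.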

Protein binding site prediction
• Sequence Based
• Structure Based
• Sequence & Structure Based

Sequence Based
• Proline brackets (Kini et al., 1995, BBRC) --- Proline is the most common residue found in the segments flanking interaction sites (e.g., fibrin polymerization).
• Propensity of amino acids to be located at the interface for different types of complexes (70%) (Ofran and Rost, 2003, FEBS Lett.).
• Correlated mutations --- adaptation (Pazos 1997, 2003) --- residues close to an interaction site are expected to mutate simultaneously during evolution.
• Co-evolution along the interfaces on both sides (Lichtarge et al., 1996, 1997; Goh et al., 2000; Sowa et al., 2001).
Points or patches? Is the binding partner needed or not?

Structure Based
• Docking (Smith & Sternberg, 2002, COSB)
• Surface patch (Thornton, 1997, JMB) --- Solvation potential, residue propensity, hydrophobicity, planarity, protrusion, accessible surface area (66%)

Sequence & Structure Based
• Sequence profile and residue neighbor list (Zhou, 2001, Proteins) --- Neural network: sequence profile from PSI-BLAST, solvent-accessible area, profiles from 20 neighboring residues (70%)
--- ???

SiteLight: Binding-site prediction using phage display libraries
Inbal Halperin, Haim Wolfson, and Ruth Nussinov
• Dataset --- Artificially Selected Proteins/Peptides Database (ASPD)
• Protein complexes and the peptide library
• Basic idea

[Figure: Template and Target proteins with their binding sites, and the peptide library]

Some thoughts before going further
• Very valuable dataset: ASPD
• Does Inbal make full use of this dataset, and what else can others do with it?
---- 1. Predicting the binding site on the Template (Inbal et al.)
---- 2. Predicting the binding site on the Target
• What are the differences in methods and potential applications?
---- 1: similarity matrix, binding partner recognition
---- 2: scoring function, antigenic site recognition

Agenda
• Description of dataset
• Algorithm
• Validation of dataset
• Validation of algorithm
• Results
• Potential problems

Library Types

Algorithm
1. Molecular surface representation.
---- Connolly surface (Connolly 1983a,b)
---- Critical points (Lin et al. 1994)
---- Surface graph G1 = (V1, E1)
---- V1 = atom centers
---- E1 = {(u, v) | u and v share a sparse critical point}
---- A geodesic distance is assigned to each edge in E1
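The surface graph described above can be sketched as follows. This is only an illustration under stated assumptions: the input format (a list of atom-id tuples, one per critical point), the use of Euclidean atom-to-atom distance as edge weight, and the use of Dijkstra shortest paths as an approximation of geodesic distance are all choices made here, not details confirmed by the slide. The function names are hypothetical.

```python
import heapq
import math
from itertools import combinations


def build_surface_graph(coords, critical_points):
    """Build G1 = (V1, E1).

    coords: {atom_id: (x, y, z)} for surface atom centers (V1).
    critical_points: iterable of atom-id tuples; atoms listed together are
    assumed to share one critical point, so they get an edge (E1).
    """
    graph = {atom: {} for atom in coords}
    for atoms in critical_points:
        for u, v in combinations(sorted(set(atoms)), 2):
            d = math.dist(coords[u], coords[v])  # assumed edge weight
            graph[u][v] = d
            graph[v][u] = d
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest-path distances from `source` along graph edges."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Toy example with three atoms and two critical points (made-up coordinates).
coords = {0: (0.0, 0.0, 0.0), 1: (3.0, 0.0, 0.0), 2: (3.0, 4.0, 0.0)}
graph = build_surface_graph(coords, [(0, 1), (1, 2)])
print(geodesic_distances(graph, 0))
```

In the toy example, atoms 0 and 2 share no critical point, so their distance is measured along the path through atom 1 rather than in a straight line.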

2. Patch partition from surface.
---- Divide the protein surface into overlapping patches.
---- Partial exploration of the solution space.
---- Possible sub-graphs within N vertices: 2^N ???
---- Cα atoms as patch centers; the radius is defined by the length of the peptide (X).
Radius = 0.0012 X^3 - 0.0552 X^2 + 0.2985 X + 3.513
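A quick numerical check of the two quantities on this slide may be useful. The sketch below evaluates the radius polynomial exactly as transcribed, taking X to be the peptide length, and counts the 2^N vertex subsets that make exhaustive sub-graph enumeration impractical. The function names and the `linear_coeff` parameter are hypothetical. As a purely arithmetic observation, the coefficients as written give roughly 3.78 Å at X = 5, whereas the worked example on the next slide quotes R = 8.775 Å; a linear coefficient of 1.2985 would reproduce that value, so the transcribed coefficient may be worth checking against the original paper.

```python
def patch_radius(x, linear_coeff=0.2985):
    """Patch radius (Å) for peptide length x, using the slide's cubic fit.

    `linear_coeff` is exposed only so that alternative values can be tried;
    the default is the coefficient as it appears on the slide.
    """
    return 0.0012 * x**3 - 0.0552 * x**2 + linear_coeff * x + 3.513


def subgraph_count(n_vertices):
    """Number of vertex subsets of a graph with n_vertices vertices."""
    return 2 ** n_vertices


print(f"radius as transcribed, X=5: {patch_radius(5.0):.2f}")
print(f"radius with 1.2985, X=5:    {patch_radius(5.0, linear_coeff=1.2985):.2f}")
print(f"vertex subsets for N=30:    {subgraph_count(30)}")
```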

3. Peptide-patch matching
---- If X = 5.0, R = 8.775 Å and 7 residues are included: (7!/(5!2!)) × 5! = 2520 comparisons!
---- 60 patches × 50 peptides × 2520 ≈ 7.5 million comparisons!
---- Solution: maximal bipartite matching algorithm
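The counting argument above can be reproduced in a few lines. The sketch assumes, as the slide's numbers suggest, that each comparison is an ordered assignment of the 5 peptide residues to 5 of the 7 patch residues; the function name is hypothetical.

```python
from math import comb, factorial


def alignments_per_patch(patch_size, peptide_len):
    """Choose peptide_len patch residues, then try every ordering of them."""
    return comb(patch_size, peptide_len) * factorial(peptide_len)


per_pair = alignments_per_patch(7, 5)   # (7! / (5! 2!)) * 5!
total = 60 * 50 * per_pair              # 60 patches, 50 peptides

print(per_pair, total)
```

The exact total is 7,560,000, which the slide rounds to 7.5 million; this is the brute-force cost that motivates the bipartite matching formulation.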

A bipartite graph G = (V, E) is a graph whose vertex set V can be partitioned into two subsets V1 and V2 such that no edge has both of its vertices in the same subset.
The matching is solved as a network flow problem, with the constraint that no vertex is included in more than one edge.
Each edge is assigned a similarity score from McLachlan (1972).
Time complexity is O(n(m + n log n)).
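As a rough illustration of weighted bipartite matching between peptide residues and patch residues, the sketch below uses the Hungarian (Kuhn-Munkres) algorithm. This is a swapped-in technique chosen for brevity: it runs in O(n^2 m) rather than the flow-based O(n(m + n log n)) quoted on the slide, and it is not claimed to be what SiteLight implements. The score matrix is made up and does not contain McLachlan (1972) values; the function name is hypothetical.

```python
def max_weight_matching(score):
    """Maximum-weight matching of every row to a distinct column.

    score[i][j]: similarity of peptide residue i to patch residue j.
    Requires len(score) <= len(score[0]). Returns (total_score, pairs).
    """
    n, m = len(score), len(score[0])
    assert n <= m, "need at least as many patch residues as peptide residues"
    inf = float("inf")
    cost = [[-s for s in row] for row in score]  # maximize by minimizing -score
    u = [0.0] * (n + 1)   # row potentials (1-based)
    v = [0.0] * (m + 1)   # column potentials (1-based)
    p = [0] * (m + 1)     # p[j]: row matched to column j, 0 if free
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:  # flip matched edges along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    pairs = sorted((p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0)
    return sum(score[i][j] for i, j in pairs), pairs


# Hypothetical similarity scores: 2 peptide residues vs 3 patch residues.
scores = [[3, 1, 2],
          [2, 4, 6]]
print(max_weight_matching(scores))
```

Because each residue appears in at most one matched edge, the summed edge scores give a single similarity value for a peptide-patch pair without enumerating all orderings.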

4. Scoring and correctness assessment
---- The similarity score is the sum of the scores of the matched edges.
---- For each match between a peptide and a patch, the matching score is the score of the best alignment.
---- High-scoring matches are iteratively selected until 25% of the Template protein is covered.
The aim is to reduce the search surface by 75% without excluding the binding site.
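The iterative selection step can be sketched as a greedy loop. The data layout (a list of `(score, residue_id_set)` pairs), the function name, and the exact stopping rule (stop once coverage reaches the threshold, which may overshoot it slightly) are assumptions made for illustration; the slide does not specify them.

```python
def select_until_coverage(matches, n_surface_residues, max_fraction=0.25):
    """Pick the highest-scoring matches until the covered fraction of the
    Template surface reaches `max_fraction`.

    matches: list of (score, set_of_residue_ids) for peptide-patch matches.
    Returns (chosen_matches, covered_residue_ids).
    """
    covered = set()
    chosen = []
    ranked = sorted(matches, key=lambda match: match[0], reverse=True)
    for score, residues in ranked:
        if len(covered) / n_surface_residues >= max_fraction:
            break
        covered |= residues
        chosen.append((score, residues))
    return chosen, covered


# Hypothetical matches on a 16-residue surface.
matches = [(5, {7, 8}), (10, {1, 2}), (3, {9}), (8, {2, 3})]
chosen, covered = select_until_coverage(matches, n_surface_residues=16)
print(len(chosen), sorted(covered))
```

A prediction would then count as correct if the known binding site falls inside the covered residues.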

• Validation of dataset --- Artificial interface peptides (82%, 18%)
• Validation of algorithm --- To make sure that in each case at least one peptide can be mapped to the correct binding site.
• Results (very unclear)

Potential Problems
• Irrelevant peptides.
• Site mimicry.
• Conformational differences between the peptides and matched peptides.
• Partial exploration of the solution space.
• Multiple binding interfaces on the Template.

Thank you!
