
Protein Sectors: Evolutionary Units of Three-Dimensional Structure

Protein Sectors: Evolutionary Units of Three-Dimensional Structure. Najeeb Halabi, Olivier Rivoire, Stanislas Leibler, and Rama Ranganathan. Cell 138, 774-786, August 21, 2009. Journal Club, Yizhou Yin, Sep 23, 2009.

Presentation Transcript


  1. Protein Sectors: Evolutionary Units of Three-Dimensional Structure. Najeeb Halabi, Olivier Rivoire, Stanislas Leibler, and Rama Ranganathan. Cell 138, 774-786, August 21, 2009. Journal Club, Yizhou Yin, Sep 23, 2009.

  2. Sequence Conservation
     “…sequence conservation – the degree to which the frequency of amino acids at a given position deviates from random expectation in a well sampled multiple sequence alignment of the protein family...”
     [Diagram labels: sequence, structure, property/function, evolution; sequence conservation; evolutionary relationship; structural/functional importance]

  3. Hypothesis
     - However, in the 3-dimensional structure of a protein, the many interactions between amino acid residues are also fundamental “structural elements”.
     - Amino acid distributions at individual positions should not be treated as independent of one another.
     - Investigating correlations between sequence positions in a protein family leads to a decomposition of the protein into groups of coevolving amino acids: “sectors”.
     - Hypothesis: sectors are features of protein structures and reflect the evolutionary histories of their conserved biological properties.

  4. S1A Family
     - Classification of serine proteases:
       Clan: SA, SB, …
       Family: S1, S2, …
       Sub-family: S1A, …
       Member: trypsin, chymotrypsin, tryptase, kallikrein, granzyme, …
     - Structure shown: rat trypsin (3TGI). Catalytic triad: active site. Binding site: specificity.
     - Broad distribution: prokaryotes, invertebrates, vertebrates.
     - Broad functions: digestion, blood clotting, inflammation, …

  5. Method Outline
     • Identification of sectors
       • Statistical Coupling Analysis
     • Statistical independence
       • Correlation entropy
     • Physical connectivity
     • Distinct biochemical properties
       • Alanine mutagenesis
       • Catalytic power & thermal stability assays
     • Independent divergence
       • Sequence similarity analysis

  6. From Sequence to Sectors
     - Multiple sequence alignment of 1470 members of the S1A family (single domain).
     - Sequences: NCBI nonredundant database, collected through iterative PSI-BLAST.
     - Alignment: Cn3D, ClustalX, followed by standard manual adjustment.
     - Position conservation:
       D_i(a): divergence (or relative entropy)
       f_i(a): observed frequency of amino acid a at position i
       q(a): background frequency of a in all proteins
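
The conservation formula itself did not survive in this transcript. A minimal sketch, assuming the binary relative entropy commonly used in SCA, D_i(a) = f ln(f/q) + (1 - f) ln((1 - f)/(1 - q)) with f = f_i(a) and q = q(a). The function names, the example column, and the uniform background table are illustrative assumptions, not values from the talk.

```python
import math
from collections import Counter


def binary_relative_entropy(f, q):
    """Relative entropy between observing residue a with frequency f
    and the background expectation q (two-state form: a vs. not-a)."""
    d = 0.0
    if f > 0:
        d += f * math.log(f / q)
    if f < 1:
        d += (1 - f) * math.log((1 - f) / (1 - q))
    return d


def position_conservation(column, background):
    """Return {amino acid: D_i(a)} for one alignment column.

    `column` is the sequence of residues observed at position i;
    `background` maps each amino acid to q(a). Gap handling is left out
    of this sketch: any symbol in the column counts toward the total.
    """
    counts = Counter(column)
    n = len(column)
    return {
        aa: binary_relative_entropy(counts.get(aa, 0) / n, q)
        for aa, q in background.items()
    }


# Placeholder background (assumption): uniform over the 20 amino acids.
# A real analysis would use frequencies estimated from a protein database.
uniform_background = {aa: 1 / 20 for aa in "ACDEFGHIKLMNPQRSTVWY"}
conservation = position_conservation("SSSSTSSA", uniform_background)
```

D_i(a) is zero when the observed frequency equals the background and grows as the position departs from it.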

  7. Statistical Coupling Analysis (SCA)
     - SCA matrix: a conservation-weighted covariance matrix.
     - C_ij^ab: frequency-based correlation between positions i and j.
     - ~C_ij^ab: a measure of the significance of the observed correlations, as judged by the conservation of the amino acids under consideration.
     - After the binary approximation, the matrix reduces to a single entry ~C_ij per pair of positions.
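
The slide's equations were images and are absent from the transcript. The block below is a hedged reconstruction from the symbol definitions on slides 6 and 7, assuming the binary relative entropy form of D_i(a) and a weight equal to its derivative with respect to the observed frequency. The weight symbol \phi and the joint frequency f_{ij}^{ab} are notation introduced here; the absolute value in the binary form is an assumption about the published method.

```latex
\begin{align}
D_i^{(a)} &= f_i^{(a)} \ln\frac{f_i^{(a)}}{q^{(a)}}
           + \bigl(1 - f_i^{(a)}\bigr) \ln\frac{1 - f_i^{(a)}}{1 - q^{(a)}} \\
\phi_i^{(a)} &= \frac{\partial D_i^{(a)}}{\partial f_i^{(a)}}
           = \ln\frac{f_i^{(a)}\bigl(1 - q^{(a)}\bigr)}{\bigl(1 - f_i^{(a)}\bigr)\, q^{(a)}} \\
C_{ij}^{ab} &= f_{ij}^{ab} - f_i^{(a)} f_j^{(b)} \\
\tilde{C}_{ij}^{ab} &= \phi_i^{(a)}\, \phi_j^{(b)}\, C_{ij}^{ab} \\
\tilde{C}_{ij} &= \phi_i\, \phi_j\, \bigl| f_{ij} - f_i f_j \bigr|
  \quad \text{(binary approximation, most prevalent residue only)}
\end{align}
```

The second line follows from the first by direct differentiation: the two constant terms cancel and the logarithms combine into a single log-odds ratio.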

  8. Binary approximation
     - D_i(a_i): the conservation of a_i, the most prevalent amino acid at position i.
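
A minimal sketch of the binary approximation, assuming each sequence is reduced to 1 or 0 per position depending on whether it carries the most prevalent residue there, and that the weighted matrix is the product of per-position weights with the absolute binary covariance. The function names are hypothetical, and the weights are supplied by the caller rather than derived here.

```python
from collections import Counter


def binarize(alignment):
    """Map an alignment (list of equal-length strings) to 0/1 rows.

    Entry [s][i] is 1 when sequence s carries the most prevalent residue
    at position i. Ties are resolved by first occurrence in the column.
    """
    n_pos = len(alignment[0])
    prevalent = [
        Counter(seq[i] for seq in alignment).most_common(1)[0][0]
        for i in range(n_pos)
    ]
    binary = [
        [1 if seq[i] == prevalent[i] else 0 for i in range(n_pos)]
        for seq in alignment
    ]
    return binary, prevalent


def weighted_correlation_matrix(binary, weights):
    """Return the matrix weights[i] * weights[j] * |f_ij - f_i * f_j|.

    `weights` stands in for the per-position conservation weights;
    passing all ones gives the unweighted absolute covariance.
    """
    n_seq = len(binary)
    n_pos = len(binary[0])
    f = [sum(row[i] for row in binary) / n_seq for i in range(n_pos)]
    matrix = [[0.0] * n_pos for _ in range(n_pos)]
    for i in range(n_pos):
        for j in range(n_pos):
            f_ij = sum(row[i] * row[j] for row in binary) / n_seq
            matrix[i][j] = weights[i] * weights[j] * abs(f_ij - f[i] * f[j])
    return matrix


# Toy alignment (illustrative only, not data from the talk).
toy_binary, toy_prevalent = binarize(["AC", "AC", "GT", "AC"])
toy_matrix = weighted_correlation_matrix(toy_binary, [1.0, 1.0])
```

This reduces the 20-state problem at each position to two states, so the correlation tensor collapses to one number per pair of positions.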

  9. Spectral cleaning to separate functional correlation from statistical and historical noise
     - Principal Component Analysis: spectral decomposition of the ~C_ij matrix partially sorts out the different contributions to the correlations.
     - 223 eigenvalues in total.
     - Lowest 218: statistical noise. Randomized alignments that retain the same size and the same amino acid propensities at each site show eigenvalues of similar magnitude.
     - First mode: makes the dominant contribution to ~C_ij and reflects historical noise. The first eigenvalue is well approximated to first order, which indicates that the first eigenvector simply reports the net contribution of each position to the total correlation.
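
A minimal sketch of the randomization control described on this slide, assuming it amounts to permuting each alignment column independently: per-site amino acid frequencies and alignment size are preserved while correlations between sites are destroyed. For brevity the sketch uses power iteration to obtain only the largest eigenvalue, which is a substitute for the full spectral decomposition used in the talk; a library eigensolver would give the whole spectrum. All names are hypothetical.

```python
import math
import random


def shuffle_columns(alignment, rng):
    """Permute every column independently.

    Keeps the number of sequences and each site's residue composition,
    but breaks any coupling between sites.
    """
    n_seq = len(alignment)
    n_pos = len(alignment[0])
    columns = []
    for i in range(n_pos):
        column = [seq[i] for seq in alignment]
        rng.shuffle(column)
        columns.append(column)
    return ["".join(columns[i][s] for i in range(n_pos)) for s in range(n_seq)]


def top_eigenvalue(matrix, iterations=200):
    """Largest eigenvalue of a symmetric matrix with non-negative entries,
    estimated by power iteration from a uniform start vector."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    value = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
        value = norm
    return value


# Toy example (illustrative only): columns 0 and 1 are perfectly coupled.
rng = random.Random(0)
toy_alignment = ["AKD", "AKE", "GCD", "GCE"]
toy_shuffled = shuffle_columns(toy_alignment, rng)
```

Eigenvalues of the real correlation matrix that do not exceed those obtained from many such shuffled alignments can be attributed to finite sampling rather than to coevolution.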

  10. Sector Identification using modes 2 to 5

  11. Overview of Sectors

  12. Statistical Independence
      - Compute the correlation entropy to measure the independence of sectors quantitatively.
      - Minimum discriminatory information method.
      - S is a small set of positions: specifically, the top five positions contributing to each sector.

  13. Structure Connectivity
      - Known primary/secondary/subdomain-architecture subdivision
      - Distinction in degree of solvent exposure
      - Difference in proximity to the active site (not for green sector)
      - No sector

  14. Sector locations
      - Red: focused on the S1 pocket (catalytic specificity)
      - Blue: more distributed property
      - Green: focused around the catalytic triad (catalytic activity)
      - Although the analysis uses no information about tertiary structure, and only ~10% of all sequence positions contribute strongly to each sector, each sector shows clear intra-sector physical connectivity and only a few inter-sector contacts.

  15. Biochemical Independence
      - Mutations in the red and blue sectors had very different effects, acting mainly on either catalytic power or thermal stability.
      - Combining mutations from the two groups gave additive effects (figure legend: magenta = observed, white = predicted).
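
A minimal sketch of the additivity check implied by the observed-versus-predicted comparison, assuming each effect is expressed as a change relative to wild type on a scale where independent contributions add (for example a shift in melting temperature or in log catalytic efficiency). The function names and the numbers in the usage line are hypothetical, not measurements from the paper.

```python
def predicted_combined_effect(single_effects):
    """Predicted effect of a combined mutant if the single mutations
    act independently: the sum of their individual effects."""
    return sum(single_effects)


def deviation_from_additivity(single_effects, observed_combined):
    """Observed minus predicted effect; values near zero indicate that
    the mutations behave additively."""
    return observed_combined - predicted_combined_effect(single_effects)


# Hypothetical values for illustration only.
example_deviation = deviation_from_additivity([-1.0, -2.5], -3.5)
```

A deviation close to zero for mutations drawn from different sectors is what biochemical independence of the sectors would predict.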

  16. Independent Sequence Divergence
      - Sequence similarity analysis restricted to one sector classifies family members effectively, and only by the property related to that sector.
      - The same analysis over all positions fails to classify them.
      - Data set: 442 members with functional annotation.
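
A minimal sketch of similarity analysis restricted to a subset of positions, assuming fractional identity as the similarity measure (the talk does not specify the measure). The function names, the toy sequences, and the position lists are illustrative assumptions.

```python
def sector_identity(seq_a, seq_b, positions):
    """Fraction of the given alignment positions at which two aligned
    sequences carry the same residue."""
    matches = sum(1 for i in positions if seq_a[i] == seq_b[i])
    return matches / len(positions)


def similarity_matrix(alignment, positions):
    """Pairwise identity between all sequences over the chosen positions."""
    return [
        [sector_identity(a, b, positions) for b in alignment]
        for a in alignment
    ]


# Toy example: the two sequences agree on positions 0-1 and differ on 2-3.
toy_sector = sector_identity("ACDE", "ACFG", [0, 1])
toy_other = sector_identity("ACDE", "ACFG", [2, 3])
toy_all = sector_identity("ACDE", "ACFG", range(4))
```

Clustering the matrix built from one sector's positions, and comparing it with the matrix built from all positions, mirrors the comparison made on this slide.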

  17. Evidence of “Sector” theory in Other Protein Families
      - PDZ, PAS, SH2, SH3
      - Different regulatory mechanisms

  18. Discussion
      - Novel Structural Organization
      - Implication for Physical Properties of Proteins
      - Alternative View to Calculate Residue Covariance
      - Technical Challenges
      - Protein Modularization
      - Adaptive Advantage
