Protein Structure and Function

Protein Structure and Function. Chapter 4. From Sequence to Function: Case Studies in Structural and Functional Genomics. 4-0. Overview: From Sequence to Function in the Age of Genomics.

floyd
Presentation Transcript


  1. Protein Structure and Function

  2. Chapter 4. From Sequence to Function: Case Studies in Structural and Functional Genomics

  3. 4-0. Overview: From Sequence to Function in the Age of Genomics • Genomics is making an increasing contribution to the study of protein structure and function. • Many computational and experimental tools are now available. • Different experimental methods are required to define a protein’s function. • This chapter covers methods of comparing amino-acid sequences to determine their similarity and to search for related sequences in the sequence databases. • It also covers predicting a protein’s function from its structure.

  4. 4-0. Overview: From Sequence to Function in the Age of Genomics [Figure 4-1. Time and distance scales in functional genomics]

  6. 4-1. Sequence Alignment and Comparison • Sequence comparison provides a measure of the relationship between genes. • Homologous: genes or proteins related by divergent evolution from a common ancestor. • Homology: the similarity between them that results from this shared ancestry. • Alignment is the first step in determining whether two sequences are similar to each other. • Alignment: comparing two or more sequences position by position. • Insertions and deletions sometimes mean that one sequence must be slid relative to the other; this sliding creates gaps. [Figure 4-2. Pairwise alignment]
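
The pairwise alignment idea on this slide can be sketched in a few lines of Python. This is a minimal illustration of a global (Needleman-Wunsch) alignment score, not the method used by the presentation; the function name and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen only for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score (illustrative scoring values)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = score[i - 1][j] + gap  # residue of a aligned to a gap
            gap_in_a = score[i][j - 1] + gap  # residue of b aligned to a gap
            score[i][j] = max(diagonal, gap_in_b, gap_in_a)
    return score[-1][-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

Real tools typically use substitution matrices and separate gap-opening and gap-extension penalties rather than the flat values assumed here.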

  7. 4-1. Sequence Alignment and Comparison • E-value: the probability that an alignment score as good as the one found between two sequences would arise by chance in a comparison of two random sequences. • For E-values up to approximately 10^-10, the likelihood of an identical function is reasonably high; at larger E-values it decreases substantially. [Figure 4-3. Plot of percentage of protein pairs having the same biochemical function as sequence changes]

  8. 4-1. Sequence Alignment and Comparison • Multiple alignments and phylogenetic trees • The alignment process can be expanded to give a multiple sequence alignment. • Any residue, or short stretch of sequence, that is identical in all sequences in a given set is said to be conserved. [Figure 4-4. Multiple alignment]
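
The definition of a conserved residue given above translates directly into code. The sketch below assumes the sequences are already aligned to equal length with "-" marking gaps; the function name and the example sequences are hypothetical.

```python
def conserved_columns(alignment):
    """Return 0-based column indices where all aligned sequences share one residue."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for index, column in enumerate(zip(*alignment)):
        # A column is conserved if it holds a single residue type and no gap.
        if len(set(column)) == 1 and column[0] != "-":
            conserved.append(index)
    return conserved


if __name__ == "__main__":
    print(conserved_columns(["MKCGA", "MRCGA", "MKC-A"]))
```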

  9. 4-1. Sequence Alignment and Comparison • Multiple sequence alignments of homologous proteins or gene sequences from different species are used to derive a so-called evolutionary distance. • These distances can be used to construct phylogenetic trees that attempt to reflect evolutionary relationships between species. [Figure 4-5. Phylogenetic tree comparing the three major MAP kinase subgroups]

  10. 4-2. Protein Profiling • Structural data can help sequence comparison find related proteins. • Straightforward sequence alignment does not indicate any relationship between the prokaryotic and eukaryotic domains. • However, when the alignment is performed by comparing residues in the corresponding secondary structure elements of the prokaryotic and eukaryotic domains, some regions of sequence conservation appear. [Figure 4-6. Some examples of small functional protein domains]

  11. 4-2. Protein Profiling • Sequence and structural motifs and patterns can identify proteins with similar biochemical functions. • Sometimes, only a part of a protein sequence can be aligned with that of another protein. • Local alignment can identify a functional module within a protein. • These function-specific blocks of sequence are called functional motifs. • Two broad classes: short, contiguous motifs (usually binding sites) and discontinuous or non-contiguous motifs (catalytic sites).

  12. 4-2. Protein Profiling [Figure 4-7. Representative examples of short contiguous binding motifs]

  13. 4-2. Protein Profiling • PSI-BLAST: position-specific iterated BLAST. [Figure 4-8. Construction of a profile; figure labels: amino acid position, five sequences, probability for Cys]
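
The profile in Figure 4-8 records, for each alignment position, how often each residue occurs. A minimal sketch of that bookkeeping is shown below. It is a simplification: it uses raw column frequencies, whereas PSI-BLAST also applies sequence weighting and pseudocounts. The function name and the five example sequences are hypothetical.

```python
from collections import Counter


def build_profile(alignment):
    """Return one dict per alignment column mapping residue -> observed frequency."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*alignment):
        counts = Counter(column)
        profile.append({residue: n / len(column) for residue, n in counts.items()})
    return profile


if __name__ == "__main__":
    five_sequences = ["CKA", "CRA", "CKG", "SKA", "CKA"]
    profile = build_profile(five_sequences)
    # Frequency of Cys at the first position of the toy alignment.
    print(profile[0].get("C", 0.0))
```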

  14. 4-3. Deriving Function from Sequence • Sequence information is increasing exponentially. • The growth of sequence information is exponential and shows no sign of slowing down. [Figure 4-9. The growth of DNA and protein sequence information collected by GenBank over 20 years]

  15. 4-3. Deriving Function from Sequence • As one proceeds from prokaryotes to eukaryotes, and from single-celled to multicellular organisms, the number of genes increases markedly. [Figure 4-10. Table of the size of the genomes of some representative organisms]

  16. 4-3. Deriving Function from Sequence • In some cases function can be inferred from sequence. • If a protein has more than about 40% sequence identity to another protein whose biochemical function is known, and if the functionally important residues are conserved between them, it is reasonable to assume that the two proteins have a similar biochemical function. [Figure 4-11. Relationship of sequence similarity to similarity of function; green: non-enzymatic, blue: enzymatic]
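
The 40% identity rule of thumb needs a concrete definition of percent identity. The sketch below assumes two already-aligned sequences of equal length and counts identical residues over the columns where neither sequence has a gap. Other conventions for the denominator exist (for example, the length of the shorter sequence), so the result depends on that choice; the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned, gap-free columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    print(percent_identity("ACDEF", "ACDQF"))
```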

  17. 4-3. Deriving Function from Sequence • Local alignments of functional motifs in the sequence can often identify at least one biochemical function of a protein (e.g. helix-turn-helix and zinc finger motifs). • Walker motif: an ATP- or GTP-binding motif. [Figure 4-12. The P loop of the Walker motif]
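
A short contiguous motif such as the P loop can be searched for with a regular expression. The sketch below uses the commonly cited Walker A consensus, [AG]-x(4)-G-K-[ST]; the slide does not give a pattern, so this consensus and the Ras-like example fragment are assumptions brought in for illustration.

```python
import re

# Commonly cited Walker A (P-loop) consensus: [AG]-x(4)-G-K-[ST].
WALKER_A = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(sequence):
    """Return (0-based start, matched text) for each non-overlapping motif match."""
    return [(m.start(), m.group()) for m in WALKER_A.finditer(sequence)]


if __name__ == "__main__":
    # N-terminal fragment of a Ras-like GTPase, used only as an example input.
    print(find_p_loops("MTEYKLVVVGAGGVGKSALTIQ"))
```

A match only suggests a nucleotide-binding site; such a short pattern can also occur by chance in unrelated proteins.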

  18. 4-3. Deriving Function from Sequence • Sequence comparison is an active area of research because it is now the easiest technique to apply to a new protein sequence. • A large proportion of functions are inferred only from overall sequence similarity to known proteins. [Figure 4-13. Analysis of the functions of the protein-coding sequences in the yeast genome]

  19. 4-4. Experimental Tools for Probing Protein Function • Gene function can sometimes be established experimentally without information from protein structure or sequence homology. • Experience suggests that genes of similar function often display similar patterns of expression. • Expression can be measured at the level of mRNA or protein. • The mRNA-based techniques are DNA microarrays and SAGE. • Microarray technology can provide expression patterns for up to 20,000 genes at a time. [Figure 4-14. DNA microarray]
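
One simple way to quantify "similar patterns of expression" is the Pearson correlation between two genes' expression values across the same set of conditions. The slide does not name a similarity measure, so the choice of Pearson correlation and the function name are assumptions made for this sketch.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    gene_a = [1.0, 2.0, 3.0, 4.0]  # hypothetical expression levels
    gene_b = [2.0, 4.0, 6.0, 8.0]
    print(pearson(gene_a, gene_b))
```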

  20. 4-4. Experimental Tools for Probing Protein Function • High-throughput monitoring of protein expression can be achieved by two-dimensional gel electrophoresis (2D GE). • Protein spots can be identified by mass spectrometry. • 2D GE can detect the amount of a protein and its modifications. • But it is slow and expensive. • It can fail to detect proteins that are present in only a few copies per cell. [Figure 4-15. 2-D protein gel]

  21. 4-4. Experimental Tools for Probing Protein Function • The phenotype produced by inactivating a gene, a gene knockout, is highly informative about the cellular pathway in which the gene operates. • A knockout can be obtained by classical mutagenesis, targeted mutation, RNA interference, the use of antisense RNA, or antibody binding. [Figure 4-16. The phenotype of a gene knockout can give clues to the role of the gene]

  22. 4-4. Experimental Tools for Probing Protein Function • The location of a protein in the cell often provides a valuable clue to its function. • Technique: attach a tag sequence to the gene in question. A commonly used method is to fuse it to the sequence encoding GFP (green fluorescent protein). [Figure 4-17. Protein localization in the cell]

  23. 4-4. Experimental Tools for Probing Protein Function • Interacting proteins can be found with the yeast two-hybrid system. • Two distinct domains are necessary to activate transcription in yeast: ① a DNA-binding domain (DBD), which binds to the promoter, and ② an activation domain (AD). • The DBD is fused to protein A and the AD to protein Y. • If proteins A and Y interact with each other, the DBD and AD are brought close together and transcription starts. [Figure 4-18. Two-hybrid system for finding interacting proteins]

  24. 4-5. Divergent and Convergent Evolution • In general, if the overall identity between two sequences is greater than about 40%, they will code for proteins of similar fold. • Rmsd: root-mean-square difference in the spatial positions of backbone atoms. [Figure 4-19. Relationship between sequence and structural divergence of proteins]
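
The rmsd defined on this slide is, for N matched backbone atoms, the square root of the mean squared distance between corresponding atoms. The sketch below assumes the two structures have already been superimposed and that the atoms are listed in matching order; it does not perform the superposition itself, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square difference between two matched lists of (x, y, z) positions."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    a = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    b = [(3.0, 4.0, 0.0), (3.0, 4.0, 0.0)]
    print(rmsd(a, b))
```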

  25. 4-5. Divergent and Convergent Evolution • Proteins with low sequence similarity but very similar overall structure and active sites are likely to be homologous. • Example: benzoylformate decarboxylase and pyruvate decarboxylase have low sequence similarity but similar structures. [Figure 4-20. Ribbon diagram of the structure of a monomer of benzoylformate decarboxylase (BFD) and pyruvate decarboxylase (PDC)]

  26. 4-5. Divergent and Convergent Evolution • Divergent evolution can produce proteins with sequence and structural similarity but different functions. • Examples with similar structure but different function: steroid delta-isomerase, nuclear transport factor-2 and scytalone dehydratase. [Figure 4-21. Superposition of the three-dimensional structures of steroid delta-isomerase, nuclear transport factor-2 and scytalone dehydratase]

  27. 4-6. Structure from Sequence: Homology Modeling • Homology modeling is used to deduce the structure of a sequence with reference to the structure of a close homolog. • Upper region of the plot: sequence similarity is likely to yield enough structural similarity for homology modeling. • Lower region: homology modeling is highly problematic. [Figure 4-22. The threshold for structural homology]

  28. 4-6. Structure from Sequence: Homology Modeling • The integral membrane protein rhodopsin, with its cluster of conserved interacting residues (red). • Conservation is measured by Gstat; a high value means more conserved. • Homology modeling can be based on conservation. [Figure 4-23. Evolutionary conservation and interactions between residues in the protein-interaction domain PDZ and in rhodopsin]

  29. 4-6. Structure from Sequence: Homology Modeling • Chymotrypsin (green), plasminogen (blue) and chymotrypsinogen (red) differ in active-site conformation, although plasminogen and chymotrypsinogen are very similar to each other. [Figure 4-24. Structural changes in closely related proteins]

  30. 4-7. Structure from Sequence: Profile-Based Threading and “Rosetta” • Profile-based threading tries to predict the structure of a sequence even if no sequence homologs are known. • A computer program forces the sequence to adopt every known protein fold in turn, and in each case calculates a scoring function that measures the suitability of the sequence for that particular fold. • A very high Z-score for one fold indicates that the sequence almost certainly adopts that fold. [Figure 4-25. The method of profile-based threading]
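
The Z-score step can be illustrated separately from the threading itself. Given one raw score per candidate fold, the Z-score expresses how many standard deviations a fold's score lies from the mean over all folds. The sketch below assumes that higher raw scores mean a better fit and uses hypothetical fold names and scores; the scoring function that produces the raw values is not shown.

```python
import statistics


def z_scores(fold_scores):
    """Convert raw per-fold scores (dict fold -> score) to Z-scores."""
    values = list(fold_scores.values())
    if len(values) < 2:
        raise ValueError("need scores for at least two folds")
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    if sd == 0:
        raise ValueError("all scores are identical; Z-scores are undefined")
    return {fold: (score - mean) / sd for fold, score in fold_scores.items()}


def best_fold(fold_scores):
    """Return (fold, Z-score) for the highest-scoring fold."""
    z = z_scores(fold_scores)
    fold = max(z, key=z.get)
    return fold, z[fold]


if __name__ == "__main__":
    # Hypothetical raw threading scores for four candidate folds.
    scores = {"fold_a": 1.0, "fold_b": 2.0, "fold_c": 3.0, "fold_d": 10.0}
    print(best_fold(scores))
```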

  31. 4-7. Structure from Sequence: Profile-Based Threading and “Rosetta” • The Rosetta method attempts to predict protein structure from sequence without the aid of a homologous sequence or structure. • The idea behind Rosetta is that the distribution of conformations sampled by a given short segment of the sequence is approximated by the conformations that this segment and closely related sequences adopt in known protein structures. • Each calculated structure is similar to the real crystal structure, but not identical. [Figure 4-26. Some decoy structures produced by the Rosetta method]

  32. 4-7. Structure from Sequence: Profile-Based Threading and “Rosetta” • The level of agreement with the known native structure varies, but in many cases the overall fold is predicted well enough to be recognizable. [Figure 4-27. Examples of the best-center cluster found by Rosetta for a number of different test proteins]

  34. 4-8. Deducing Function from Structure: Protein Superfamilies • In contrast to the exponential increase in sequence information, structural information (from X-ray crystallography or NMR) has up to now been increasing at a much lower rate. • Superfamily: loosely defined as a set of homologous proteins with similar three-dimensional structures. • Within each superfamily, there are families with more closely related functions and significant (>50%) sequence identity. [Figure 4-28. Growth in the number of structures in the Protein Data Bank]

  35. 4-8. Deducing Function from Structure: Protein Superfamilies • The four superfamilies of serine proteases are examples of convergent evolution. • Serine proteases fall into several structural superfamilies, which are recognizable from their amino-acid sequences and the particular disposition of the three catalytically important residues in the active site. • Chymotrypsin and subtilisin belong to different superfamilies. [Figure 4-29. The overall folds of two members of different superfamilies of serine proteases]

  36. 4-8. Deducing Function from Structure: Protein Superfamilies • Another large enzyme superfamily with numerous different biological roles is characterized by the so-called polymerase fold, which resembles an open hand. [Figure 4-30. A comparison of primer-template DNA bound to three DNA polymerases; panel labels: Taq DNA polymerase, reverse transcriptase, DNA polymerase]

  37. 4-9. Strategies for Identifying Binding Sites • Binding sites are identified as regions where the computed interaction energy between the probe and the protein is favorable for binding. • Zone 1: good site for binding a positively charged group. • Zone 2: good site for binding a hydrophobic group. • Zone 3: good site for binding a negatively charged group. • The figure overlays three pieces of a known inhibitor of dihydrofolate reductase onto the zones, which were computed with the GRID program. [Figure 4-31. Example of the use of GRID]

  38. 4-9. Strategies for Identifying Binding Sites • MSCS (multiple solvent crystal structures) is a crystallographic technique that identifies energetically favorable binding sites and orientations of small organic molecules on the surface of proteins. [Figure 4-32. Some organic solvents used as probes for binding sites for functional groups]

  39. 4-9. Strategies for Identifying Binding Sites • Small organic molecules bind on the protein surface. [Figure 4-33. Structure of subtilisin in 100% acetonitrile]

  40. 4-9. Strategies for Identifying Binding Sites • The binding sites for different organic solvent molecules were obtained by X-ray crystallography of crystals of thermolysin soaked in the solvent. [Figure 4-34. Ribbon representation showing the experimentally derived functionality map of thermolysin]

  41. 4-10. Strategies for Identifying Catalytic Residues • Active-site residues in a structure can sometimes be recognized computationally by their geometry. • The method searches the structure for geometrical arrangements of chemically reactive side chains that match those in the active sites of known enzymes. • Example: the geometry of the catalytic triad of the serine proteases, as used to locate similar sites in other proteins. [Figure 4-35. An active-site template]
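
The geometric search described here can be sketched as a comparison of inter-atomic distances: a candidate set of side-chain atoms matches a template if every pairwise distance agrees within a tolerance. This is a simplified stand-in for the template methods the slide refers to; the coordinates, the 0.5 Å tolerance and the function names are all hypothetical, and a distance-only check cannot distinguish mirror-image arrangements.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Distances between all pairs of 3-D points, in a fixed pair order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def matches_template(candidate, template, tolerance=0.5):
    """True if every pairwise distance in candidate is within tolerance of the template.

    Both arguments list atom coordinates in the same residue order
    (for example Ser, His, Asp for a catalytic triad).
    """
    if len(candidate) != len(template):
        raise ValueError("candidate and template must have the same number of atoms")
    return all(
        abs(dc - dt) <= tolerance
        for dc, dt in zip(pairwise_distances(candidate), pairwise_distances(template))
    )


if __name__ == "__main__":
    template = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]   # made-up geometry
    candidate = [(1.0, 1.0, 1.0), (4.0, 1.0, 1.0), (4.0, 5.0, 1.0)]  # same shape, shifted
    print(matches_template(candidate, template))
```

Because only internal distances are compared, the check is independent of how the protein is positioned or rotated in space.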

  42. 4-10. Strategies for Identifying Catalytic Residues • THEMATICS: the net charge of the potentially ionizable group on each residue in the protein structure is calculated as a function of pH. • Residues that show abnormal ionization curves (His 95, green, and Glu 165, blue, in triosephosphate isomerase) are possible catalytic residues. [Figure 4-36. Theoretical microscopic titration curves]
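
To see what a "normal" titration curve looks like, the sketch below computes the mean charge of a single, non-interacting ionizable group from the Henderson-Hasselbalch relation. This is only the reference behavior against which an abnormal curve would stand out; THEMATICS itself derives its curves from the electrostatics of the whole structure, which is not reproduced here. The function names and the pKa values in the example are placeholders, not measured values.

```python
def fraction_protonated(ph, pka):
    """Henderson-Hasselbalch fraction of a group that is protonated at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def mean_charge(ph, pka, kind):
    """Mean charge of an isolated group.

    kind="base": +1 when protonated, 0 otherwise (e.g. a His side chain).
    kind="acid": 0 when protonated, -1 otherwise (e.g. a Glu side chain).
    """
    f = fraction_protonated(ph, pka)
    if kind == "base":
        return f
    if kind == "acid":
        return f - 1.0
    raise ValueError("kind must be 'base' or 'acid'")


def titration_curve(pka, kind, ph_values):
    """List of (pH, mean charge) points for plotting or comparison."""
    return [(ph, mean_charge(ph, pka, kind)) for ph in ph_values]


if __name__ == "__main__":
    # Placeholder pKa used for illustration only.
    for ph, charge in titration_curve(6.5, "base", [2.0, 4.0, 6.5, 9.0, 12.0]):
        print(ph, round(charge, 3))
```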

  43. 4-10. Strategies for Identifying Catalytic Residues • Structure of triosephosphate isomerase: His 95 and Glu 165 are both located in the active site. [Figure 4-37. Residues that show abnormal ionization behavior with changing pH define the active site]

  44. 4-11. TIM Barrels: One Structure with Diverse Functions • Mandelate racemase interconverts R- and S-mandelate. [Figure 4-38. The chemical reaction catalyzed by mandelate racemase]

  45. 4-11. TIM Barrels: One Structure with Diverse Functions • Muconate lactonizing enzyme transforms the cis,cis-muconic acid derived from mandelate into muconolactone. [Figure 4-39. The chemical reaction catalyzed by muconate lactonizing enzyme]

  46. 4-11. TIM Barrels: One Structure with Diverse Functions • Mandelate racemase and muconate lactonizing enzyme share only 26% sequence identity, yet their overall folds are essentially identical. [Figure 4-40. Mandelate racemase (left) and muconate lactonizing enzyme (right) have almost identical folds]

  47. 4-11. TIM Barrels: One Structure with Diverse Functions • The amino acids that coordinate the metal ion are conserved between mandelate racemase and muconate lactonizing enzyme, and the catalytic residues are similar. [Figure 4-41. A comparison of the active sites of mandelate racemase (left) and muconate lactonizing enzyme (right)]

  48. 4-12. PLP Enzymes: Diverse Structures with One Function • L-aspartate aminotransferase transfers the amino group of L-aspartate to α-ketoglutarate, yielding oxaloacetate and L-glutamate. • It uses the cofactor pyridoxal phosphate (PLP). [Figure 4-42. The overall reaction catalyzed by the pyridoxal phosphate-dependent enzyme L-aspartate aminotransferase]

  49. 4-12. PLP Enzymes: Diverse Structures with One Function • Step 1: The amino group of the amino acid substrate displaces the side-chain amino group of the lysine residue that holds the cofactor PLP in the active site. • Step 2: PLP catalyzes a rearrangement of the amino acid substrate. • Step 3: This is followed by hydrolysis of the keto-acid portion, leaving the nitrogen of the amino acid bound to the cofactor to form the intermediate PMP. [Figure 4-43. The general mechanism for PLP-dependent catalysis of transamination, the interconversion of α-amino acids and α-keto acids]

  50. 4-12. PLP Enzymes: Diverse Structures with One Function • L-aspartate aminotransferase and D-amino acid aminotransferase have no detectable sequence identity and completely different folds. [Figure 4-44. The three-dimensional structures of L-aspartate aminotransferase (left) and D-amino acid aminotransferase (right)]
