Cotranslational Protein Folding National Seminar on Bioinformatics and Functional Genomics, February 15-17, 2006 Madhav Kulkarni
The challenges of ‘Self Assembly’ • Challenges in general • living organisms put themselves together, all by themselves. • getting into the right shape can't happen just by chance. So where are the directions? And how do living things follow them? • Questions that echo through all of biology: • transformation of embryo to infant • complexity of organs • design of a single cell • the building materials of life -- the proteins
Proteins • Building blocks of life. • Assembled in biological organisms to form cell structures, enzymes and other chemicals necessary for life. • Each of the 20 amino acids has unique properties, including size, 3D shape and polarity. Institut für Chemie (1998) Amino Acid Dictionary.
Protein Biosynthesis • Protein Biosynthesis • Transcription • Translation • Events following Protein Biosynthesis • DNA • Gene • mRNA • Ribosome & tRNA • Protein http://www.stanford.edu/group/pandegroup/folding/education/protfold.html
Translation • The message of the mRNA is decoded to make proteins. • Initiation and elongation • The ribosome recognizes the start codon on the mRNA strand and binds to it. • Each tRNA has an anticodon that pairs with a codon on the mRNA, and carries a single amino acid. • As the ribosome travels down the mRNA one codon at a time, another tRNA binds to the mRNA at one of the ribosomal sites. • The first tRNA is released, but the amino acid it carried is transferred to the amino acid on the second tRNA and bonds to it. This cycle of translocation continues, and a long chain of amino acids (the protein) is formed. • When the entire unit reaches the stop codon on the mRNA, it falls apart and the newly formed protein is released.
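The decoding steps on this slide can be sketched in a few lines of Python. This is only an illustration of codon-by-codon reading with the standard genetic code; the function name `translate` and the example mRNA string are made up for the sketch, and real translation involves initiation factors, tRNA charging and ribosomal sites that are not modeled here.

```python
# Standard genetic code, built from the conventional U, C, A, G ordering.
# "*" marks the three stop codons.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna: str) -> str:
    """Read codons from the first AUG until a stop codon (hypothetical helper)."""
    start = mrna.find("AUG")  # the ribosome locates the start codon
    if start < 0:
        return ""
    chain = []
    for pos in range(start, len(mrna) - 2, 3):  # move one codon at a time
        residue = CODON_TABLE[mrna[pos:pos + 3]]
        if residue == "*":  # stop codon: the chain is released
            break
        chain.append(residue)  # the chain grows from N to C terminus
    return "".join(chain)


# Made-up mRNA: 5' leader, AUG GCU UUC AAA, stop codon UAA, trailer.
print(translate("GGAUGGCUUUCAAAUAAGG"))  # prints MAFK
```

Because residues are appended in order, the N-terminal part of the chain exists before the C-terminal part, which is the premise of the cotranslational folding question raised later in the talk.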
Events following Biosynthesis • Protein folding (complete) • The protein takes its functional shape or conformation • Proteins contain hydrophilic and hydrophobic amino acids; the main driving force for folding is the tendency of the hydrophobic portions of the chain to fold away from the surrounding water (typical of globular proteins) • The folding process creates cavities lined with amino acids that can form non-covalent bonds (hydrogen bonds and/or ionic interactions) only with certain ligands • Post-translational modifications • Formation of disulfide bridges and attachment of any of a number of biochemical functional groups, such as acetate, phosphate, various lipids and carbohydrates. • Removal of one or more amino acids from the amino end of the polypeptide chain, or cutting the polypeptide in the middle of the chain • Two or more polypeptide chains that are synthesized separately may associate to become subunits of a protein with quaternary structure
Protein Folding – The Problem • The mechanism by which it happens • The short time span in which it happens • How does an amino acid sequence fold into a unique 3-D shape? • How can the native conformation be found and recognized? • The duration of the folding process varies dramatically depending on the protein of interest • The slowest-folding proteins take many minutes or hours to fold • Small proteins, with lengths of a hundred or so amino acids, typically fold on time scales of milliseconds • The very fastest known protein folding reactions are complete within a few microseconds • Possible intermediates have a very short lifetime
Objective of protein folding studies • To learn the details of the pathways involved (in misfolding) • The kinetics of the folding process, the partitioning of polypeptides among alternative forms, and the yield of correctly folded protein are consequences of kinetic partitioning between alternative pathways. • When proteins do not fold correctly (i.e. "misfold"), there can be serious consequences, including many well-known diseases, such as Alzheimer's, Mad Cow (BSE), CJD, ALS, Huntington's and Parkinson's disease, and many cancers and cancer-related syndromes.
Implications • Solving the folding problem would greatly enhance the ability to utilize the enormous amount of data being generated by genome sequencing projects. • Less or no need to rely on resource-intensive experimental methods for determining protein structures; they could be determined computationally. • Drug discovery could be accelerated, saving significant resources. • Genetic engineering experiments to improve the function of particular proteins would be possible.
Theoretical approach – Folding Process • Leading strategy - To find an amino acid chain's state of minimum energy • The shape that yields the lowest energy state must be a protein's natural shape, or, as chemists call it, its "native conformation."(http://wsrv.clas.virginia.edu/~rjh9u/protfold.html) • Calculation of protein energy landscapes. • Folding funnel • proposed that natural proteins have evolved such that this complicated energy surface has a funneled shape which leads towards the native state, which is the lowest-energy conformation available to the protein.
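The minimum-energy strategy on this slide can be illustrated with a toy model. The sketch below uses the two-dimensional HP lattice model, which is not mentioned in the slides and is swapped in here purely for illustration: residues are either hydrophobic (H) or polar (P), the chain is a self-avoiding walk on a square grid, and each non-bonded H-H contact lowers the energy by 1. The function names and the example sequence are hypothetical, and exhaustive enumeration like this only works for very short chains.

```python
from itertools import product

# Unit moves on a square lattice.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def walk(moves):
    """Return lattice coordinates for a move sequence, or None if the chain overlaps."""
    pos = (0, 0)
    coords = [pos]
    for m in moves:
        dx, dy = MOVES[m]
        pos = (pos[0] + dx, pos[1] + dy)
        if pos in coords:  # self-avoidance: two residues cannot share a site
            return None
        coords.append(pos)
    return coords


def energy(sequence, coords):
    """-1 for each pair of H residues adjacent on the lattice but not along the chain."""
    e = 0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):
            if sequence[i] == "H" and sequence[j] == "H":
                dist = abs(coords[i][0] - coords[j][0]) + abs(coords[i][1] - coords[j][1])
                if dist == 1:
                    e -= 1
    return e


def fold(sequence):
    """Enumerate every self-avoiding conformation and keep one of lowest energy."""
    best_energy, best_moves = 0, None
    for moves in product("UDLR", repeat=len(sequence) - 1):
        coords = walk(moves)
        if coords is None:
            continue
        e = energy(sequence, coords)
        if best_moves is None or e < best_energy:
            best_energy, best_moves = e, "".join(moves)
    return best_energy, best_moves


# Made-up 8-residue sequence; 4**7 move strings are checked.
print(fold("HPPHHPPH"))
```

The number of move strings grows as 4^(n-1), so even this toy search becomes impractical for chains of realistic length, which is one way to see why a funneled energy surface is an attractive idea.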
Practical Approach – Folding Process • Studying the folding process by techniques like • Renaturation • Permuted versions of a protein/mutant proteins • Incorporation of probes (fluorescent) in the protein • Analytical methods • photochemical methods • laser temperature jump spectroscopy • SDS-PAGE with conformation-dependent antigenicity • ultrafast mixing of solutions • Denaturation and renaturation to study the unfolding and/or refolding process (kinetics and pathway) • Many proteins have been refolded from a fully denatured state to the native, biologically active structure. • The same principles have been assumed to govern the folding of proteins during biosynthesis.
Theoretical approach – Final Fold Bioinformatics (Protein Structure Prediction) • Prediction of native structure from amino-acid sequences alone • Comparative modeling algorithms (homology) • To build a model based on a previously determined structure of a related sequence • Threading algorithms • To identify proteins that are structurally similar to one another, although sequence similarity is negligible • Ab initio folding algorithms • To fold the proteins according to basic structural template. • The native fold can often be predicted on the basis of homology or threading. • Only around 2000 distinct protein folds in nature! (Fetrow J.S. et al., Current Pharmaceutical Biotechnology, 3, 329-347 (2002))
Practical Approach – Final Fold • X-ray crystallography and • NMR • determination of the folded structure of a protein • lengthy and complicated process Source: www.rcsb.org updated: 14 February 2006
Amino acid sequence to 3D structure The primary sequence of a protein contains all the information needed for the protein to attain its active conformation. (Anfinsen, C. B., et al., Proc. Natl. Acad. Sci. USA, 47, 1309-1314 (1961) in Fetrow J.S. et al., Current Pharmaceutical Biotechnology, 3, 329-347 (2002))
Levinthal Paradox • It would be impossible for a protein to fold at observed rates by randomly searching all possible conformations of the polypeptide chain. • Three possible conformations per residue: α, β, and L (Ramachandran plot) • If each residue of a 100-residue polypeptide had only three conformations, the total number of conformations would be $3^{100} \approx 5 \times 10^{47}$. Since conformational changes occur on a timescale of $10^{-13}$ seconds, the time required by the 100-residue protein to search all conformations would be $5 \times 10^{47} \times 10^{-13}\ \mathrm{s} = 5 \times 10^{34}\ \mathrm{s} \approx 10^{27}$ years. Nevertheless, proteins are observed to fold in $10^{-1}$ to $10^{3}$ seconds both in vivo and in vitro. • Thus, proteins might be going through a sequence of progressively more structured intermediate states that limit the conformational search and direct the polypeptide chain along a preferred route toward the native conformation. (Roder H. and Colon W., Current Opinion in Structural Biology, 7, 15-28 (1997)) (Levinthal, C., J. Chim. Phys., 65, 44-45 (1968) in Fetrow J.S. et al., Current Pharmaceutical Biotechnology, 3, 329-347 (2002))
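The Levinthal estimate is simple enough to check directly. The sketch below reproduces the arithmetic under the slide's own assumptions (three conformations per residue, one conformational change every 1e-13 s); the function name and the 365.25-day year are choices made for this illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed length of a year for the conversion


def levinthal_search_years(n_residues, states_per_residue=3, seconds_per_step=1e-13):
    """Time in years to visit every conformation once, sampling one per step."""
    conformations = states_per_residue ** n_residues  # exact integer arithmetic
    seconds = conformations * seconds_per_step
    return conformations, seconds, seconds / SECONDS_PER_YEAR


conformations, seconds, years = levinthal_search_years(100)
print(f"conformations: {conformations:.2e}")  # about 5e47
print(f"search time:   {seconds:.2e} s")      # about 5e34 s
print(f"search time:   {years:.2e} years")    # about 1.6e27 years
```

Even with these generous assumptions, the exhaustive search takes many orders of magnitude longer than the observed folding times of 0.1 to 1000 seconds, which is the point of the paradox.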
Protein Folding • Is it really spontaneous? How spontaneous? • Does it happen only after the polypeptide chain is completely synthesized? • Does it overlap with the translation process (cotranslational protein folding)?
Cotranslational folding • Does it occur? • If yes, is the folding pathway the same as when starting from an unfolded protein?
Cotranslational folding Cartoon depiction of Cotranslational folding of a polypeptide. Schematic representation of a section through a protein folding landscape in which the basic funnel concept for refolding polypeptides has been adapted to include the processes of Cotranslational folding.
Difficulties in the study of Cotranslational folding • Low concentration of nascent polypeptide • Heterogeneity of the translation mixture • Aggregation of the intermediates through exposed hydrophobic groups • Formation of incorrect disulfide bonds • Isomerization of proline residues (Fedorov A.N., et al., Journal of Molecular Biology, 228, 2, 351-358 (1992)) (Branden C. and Tooze J., in Introduction to Protein Structure 2nd edition, pg. 91, 1999)
Cotranslational folding • The idea that protein folding is concomitant with synthesis was articulated, and experimental testing was begun, in the early 1960s (Kiho, Y., and Rich, A. (1964) J. Mol. Biol. 51, 111–118) • Today there is substantial experimental support for the cotranslational folding hypothesis. (Fedorov A.N. and Baldwin T.O., JMB, 294, 579-586 (1999)) (Fedorov A. N. and Baldwin T. O., The Journal of Biological Chemistry, 272, 52, 32715-32718 (1997))
Cotranslational folding Evidence • Escherichia coli tryptophan synthase β chains begin to fold during translation, even before appearance of the entire N-terminal domain, as shown by conformation-dependent antigenicity (Fedorov, A. N., Friguet, B., Djavadi-Ohaniance, L., Alakhov, Yu. B., and Goldberg, M. E. (1992) J. Mol. Biol. 228, 351–358; Friguet, B., Fedorov, A. N., Serganov, A., Navon, A., and Goldberg, M. E. (1993) Anal. Biochem. 210, 344–350) • No lag was detected between synthesis of the nascent chains and appearance of immunoreactivity (Tokatlidis, K., Friguet, B., Deville-Bonne, D., Baleux, F., Fedorov, A. N., Navon, A., Djavadi-Ohaniance, L. & Goldberg, M. E. (1995) Philos. Trans. R. Soc. Lond. B Biol. Sci. 348, 89–95 in Fedorov A. N. and Baldwin T. O., The Journal of Biological Chemistry, 272, 52, 32715-32718 (1997))
Cotranslational folding Evidence • Ribosome-bound bovine rhodanese forms protease-resistant N-terminal domains. (Reid, B. G., and Flynn, G. C. (1996) J. Biol. Chem. 271, 7212–7217) • Enzymatically active forms of rhodanese and firefly luciferase remain bound to the ribosomes when these polypeptides are expressed with extended C-terminal segments, so that each enzyme is in the bulk solution. (Kudlicki, W., Chirgwin, J., Kramer, G., and Hardesty, B. (1995) Biochemistry 34, 14284–14287; Makeyev, E. V., Kolb, V. A., and Spirin, A. S. (1996) FEBS Lett. 378, 166–170) • It is important to note that full-length luciferase is virtually inactive in the ribosome-bound state, although acquisition of activity occurs immediately upon release from the ribosome.
Cotranslational folding Evidence • Rat serum albumin: a secretory protein with 17 disulfide bonds in the native structure, spread throughout the polypeptide chain. • In the nascent polypeptides, about one half of the cysteinyl residues exist in disulfide bonds, indicating completion of a substantial part of the overall folding process (Peters, T., and Davidson, L. K. (1982) J. Biol. Chem. 257, 8847–8853) (Fedorov A. N. and Baldwin T. O., The Journal of Biological Chemistry, 272, 52, 32715-32718 (1997))
Cotranslational folding Role of Molecular Chaperones and Folding Catalysts • Involved in proper folding and assembly as well as preventing premature folding. • While being elongated, nascent polypeptide portions reaching the funnel opening interact with ribosome-associated chaperones that assist the folding process. (Baram D. and Yonath A., FEBS Letters 579, 948-954 (2005)) • SecB can bind nascent polypeptides of E. coli secretory proteins, apparently preventing premature folding in the cytoplasm. (Randall, L. L., et al. (1997) Proc. Natl. Acad. Sci. U. S. A. 94, 802–807) • Protein disulfide isomerase (PDI) affects folding of disulfide-containing proteins, both in vivo and in vitro. • PDI is essential for efficient cotranslational formation of disulfide bonds in a coupled translation/translocation system. (Bulleid, N. J., and Freedman, R. B. (1988) Nature 335, 649–651) • Eukaryotic peptidylprolyl isomerase (PPI). (Fedorov A. N. and Baldwin T. O., The Journal of Biological Chemistry, 272, 52, 32715-32718 (1997))
Cotranslational folding Ribosomes as general protein folding modulators • General protein folding activity observed in the large subunit of the ribosome • located in the peptidyl transferase domain of the large RNA of this subunit. • In contrast to the protein folding activity of the molecular chaperones, this activity is • (a) present in the RNA and is • (b) universal, not selective for any protein. • The overlap of this active site with the peptidyl transferase centre on the ribosomal RNA suggests a functional overlap between protein synthesis and folding by the ribosome in the cell. • Ribosomes from both prokaryotic and eukaryotic sources could refold a large number of proteins from their denatured states to the active form. (Fedorov A. N. and Baldwin T. O., The Journal of Biological Chemistry, 272, 52, 32715-32718 (1997))
Cotranslational folding Ribosomes as general protein folding modulators • A large number of proteins, such as bacterial alkaline phosphatase, glucose 6-phosphate dehydrogenase, glucose oxidase, lactate dehydrogenase, horseradish peroxidase, malate dehydrogenase, β-lactamase, restriction endonucleases like EcoRI, BamHI, HindIII and PstI, β-galactosidase, carbonic anhydrase, etc., could be folded by the ribosomes. (Das, B., Chattopadhyay, S. and Das Gupta, C., Biochem. Biophys. Res. Commun., 1992, 183, 774–780; Chattopadhyay, S., Das, B., Bera, A. K., Das Gupta, D. and Das Gupta, C., Biochem. J., 1994, 300, 717–721; Bera, A. K., Das, B., Chattopadhyay, S. and Das Gupta, C., Biochem. Mol. Biol. Int., 1994, 32, 215–223; Das, B., Chattopadhyay, C., Bera, A. K. and Das Gupta, C., Eur. J. Biochem., 1996, 235, 613–621.) • Renaturation of some proteins is improved by the presence of ribosomes • attributed to the large ribosomal subunit, specifically to its RNA, the 23 S and 28 S RNA of prokaryotic and eukaryotic ribosomes, respectively (Chattopadhyay, S., Das, B., and Dasgupta, C. (1996) Proc. Natl. Acad. Sci. U. S. A. 93, 8284–8287; Kudlicki, W., Coffman, A., Kramer, G., and Hardesty, B. (1997) Fold. Des. 2, 101–108) http://www.ias.ac.in/currsci/aug25/articles27.htm by DasGupta Chanchal
Cotranslational folding Is ribosome-mediated protein folding cotranslational or post-translational? • The growing polypeptide chain is fairly flexible • Cross-linking of the growing polypeptide chain with the 50S particle showed many contacts, especially with the nucleotides in domain V • Two major activities: polypeptide synthesis and its folding into the active form • Folding intermediates have a large part of their secondary structure formed, and even some tertiary structure • The final level of folding occurs outside the ribosome and is ‘post-translational’, but the released polypeptide chain has received its instructions for folding from the ribosome (Fedorov, A. N., Friguet, B., Djavadi-Ohaniance, L., Alakhov, Yu. B. and Goldberg, M. E., J. Mol. Biol., 1992, 228, 351–358 in http://www.ias.ac.in/currsci/aug25/articles27.htm by DasGupta Chanchal)
Cotranslational folding • Statistical analysis of more than 200 protein structures has revealed the tendency that, within the length of polypeptide typical for a domain, residues tend to interact with the N-terminal portion of the polypeptide and that the N-terminal region is, on average, more compact than the C-terminal region. This observation is consistent with vectorial folding of nascent polypeptides beginning from the N terminus and proceeding to the C terminus. (Alexandrov, N. (1993) Protein Sci. 2, 1989–1991) (Fedorov A. N. and Baldwin T. O., The Journal of Biological chemistry, 272, 52, 32715-32718 (1997))
Cotranslational folding • Biosynthetic folding, proceeding through a series of intermediate structures (I1, I2, I3), avoids certain kinetic traps, such as Mi in the Figure, which are encountered during refolding of denatured protein. • In the absence of cotranslational folding (Iu1, Iu2, Iu3), the fully synthesized polypeptide would begin folding from an unfolded ensemble, Mu, similar to the refolding reaction, and would unavoidably proceed through the slow-folding Mi intermediate. • The rate of either reaction is limited by the highest activation barrier. • In cotranslational folding, the protein released from the ribosome is close to the transition state, TS, and therefore rapidly assumes the native structure Mn.
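The statement that the highest activation barrier limits the rate can be illustrated numerically. The sketch below assumes simple Arrhenius kinetics for a chain of irreversible steps; the barrier heights, the prefactor of 1e12 per second and the temperature are hypothetical values chosen only for illustration and are not taken from the slides or the cited work.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def arrhenius_rate(barrier_kj_per_mol, temperature_k=298.0, prefactor_per_s=1e12):
    """Rate constant k = A * exp(-Ea / RT) for one step (assumed prefactor A)."""
    return prefactor_per_s * math.exp(-barrier_kj_per_mol * 1000.0 / (R * temperature_k))


def step_times(barriers_kj_per_mol):
    """Mean waiting time 1/k for each step of a sequential, irreversible pathway."""
    return [1.0 / arrhenius_rate(b) for b in barriers_kj_per_mol]


# Hypothetical barriers for three successive steps, in kJ/mol.
barriers = [40.0, 60.0, 80.0]
times = step_times(barriers)
share = max(times) / sum(times)
print(f"fraction of total time spent on the highest barrier: {share:.4f}")
```

With these made-up numbers almost all of the time is spent crossing the largest barrier, so a pathway that bypasses a trap such as Mi can be much faster even if every other step is unchanged.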
Cotranslational folding • Quick • Secondary structure formation and compaction require much less than 1 s (Roder, H., and Colón, W. (1997) Curr. Opin. Struct. Biol. 7, 15–28) • Formation of compact globular intermediates usually requires no more than a few seconds (Ptitsyn, O. B. (1995) Adv. Protein Chem. 47, 83–229) • Polypeptide synthesis requires many seconds (50–300 residues/min for cell-free systems and somewhat faster in vivo); compact intermediates must therefore be formed in the process of synthesis. (Fedorov, A. N., and Baldwin, T. O. (1998) Methods Enzymol. 290) • Stereochemical analysis suggests that the nascent polypeptide emerges from the peptidyltransferase center in an α-helical configuration (Lim, V. I., and Spirin, A. S. (1986) J. Mol. Biol. 188, 565–574) (Fedorov A. N. and Baldwin T. O., The Journal of Biological Chemistry, 272, 52, 32715-32718 (1997))
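The timescale argument on this slide can be made concrete with a short calculation. The sketch below uses the quoted cell-free elongation rates of 50 to 300 residues/min; the 300-residue chain length and the 3 s compaction time are hypothetical values picked for illustration.

```python
def synthesis_time_seconds(n_residues, residues_per_min):
    """Time to synthesize a chain at a constant elongation rate."""
    return n_residues / residues_per_min * 60.0


def residues_added_during(seconds, residues_per_min):
    """How many residues the ribosome adds while a folding event takes place."""
    return seconds * residues_per_min / 60.0


chain_length = 300      # hypothetical protein length
compaction_time = 3.0   # hypothetical stand-in for "a few seconds"

for rate in (50, 300):  # residues/min, range quoted for cell-free systems
    total = synthesis_time_seconds(chain_length, rate)
    added = residues_added_during(compaction_time, rate)
    print(f"{rate} res/min: synthesis takes {total:.0f} s; "
          f"{added:.1f} residues are added during a {compaction_time:.0f} s compaction")
```

Under these assumptions synthesis takes one to six minutes, while compaction of what has already emerged takes seconds, so intermediates have ample time to form before the chain is complete.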
Cotranslational folding Kinetics and Pathway • An upper limit of the rate of Cotranslational folding is imposed by the rate of polypeptide synthesis. • For many proteins, as mentioned above, the C-terminal segment of 20–30 amino acid residues, which is sheltered by the ribosome prior to the release of the full-length polypeptide into the bulk solution, is essential for formation of the native, biologically active structure. Consequently, folding cannot be completed before release of the nascent polypeptide from the ribosome. • Kinetics of folding would be a function of the rates of polypeptide synthesis, folding of the full-length monomer, and for oligomeric proteins, subunit assembly.
Cotranslational folding Kinetics and Pathway • Cotranslational folding of the bacterial luciferase β subunit is rate-limiting in the formation of the native αβ heterodimer when prefolded α subunit is available at a sufficiently high concentration • Coexpression of both subunits leads to much slower formation of the native enzyme, apparently because association becomes the rate-limiting step • Biosynthetic folding seems to be much faster and more efficient than renaturation for several proteins. (Fedorov A. N. and Baldwin T. O., The Journal of Biological Chemistry, 272, 52, 32715-32718 (1997))
Cotranslational folding Kinetics and Pathway • Formation of secondary structural elements like alpha-helices, beta-sheets or beta turns, which act as nucleation sites for the further collapse into the native structure. • Timescale of secondary structure formation: nanoseconds to microseconds http://svr.ssci.liv.ac.uk/~volk/folding/Fasteventinprotein folding.htm
Biosynthetic folding & Renaturation • One of the basic differences between biosynthetic protein folding and protein renaturation is cotranslational folding, that is, folding that occurs during synthesis. • The same conformations are achieved by polypeptides folded in cells as a consequence of biosynthetic processes and as a result of refolding of the full-length polypeptide from the denatured state. • However, identity of the final protein structures does not necessarily mean identity of the pathways leading to their formation (Baldwin, R. L. (1975) Annu. Rev. Biochem. 44, 454–477) • How does the pattern observed for refolding in vitro relate to protein folding within the living system?
Biosynthetic folding & Renaturation • Protein folding in the cell is significantly faster than refolding of the denatured protein in vitro • bacterial luciferase • contains no disulfide bonds. • association of the α chain with the β chain determines the overall rate of enzyme formation. • the β subunit released from the ribosome associates with the α subunit much faster than does βi, which predominates in refolding experiments, • suggesting that the structure of the β subunit when it is released from the ribosome (partially folded) is different from βi (the predominant intermediate in renaturation). • The β subunit produced by biosynthetic folding is a folding intermediate which is beyond a rate-limiting step encountered during refolding of the subunit. (Fedorov, A. N., and Baldwin, T. O. (1995) Proc. Natl. Acad. Sci. U. S. A. 92, 1227–1231) (Fedorov A. N. and Baldwin T. O., The Journal of Biological Chemistry, 272, 52, 32715-32718 (1997))
Biosynthetic folding & Renaturation • The evolutionary pressure for fast folding operates in the context of biosynthetic folding, including biosynthesis and concomitant folding of the nascent polypeptide chain, and obviously not on refolding of the full-length polypeptide. • In refolding, unlike biosynthetic folding, all residues are initially present to influence the folding pathway. • However, in renaturation experiments, especially for large, multidomain and multisubunit proteins under conditions approximating physiological conditions, low final yields, slow rates and even an inability to achieve the native structure from the denatured state are often experienced. • Many proteins fail to refold to their native state. (Fedorov A. N. and Baldwin T. O., The Journal of Biological Chemistry, 272, 52, 32715-32718 (1997))
Focus of theoretical studies • What are the sequence requirements for proteins to fold rapidly and be stable in their native conformations? • What are the thermodynamic mechanism(s) of protein stabilization and the kinetic mechanism(s) of folding? • Are there special structures (motifs) that are more likely to correspond to the native structures of foldable proteins? • What is the best approximation for protein folding energetics (potentials)? • Challenges • What are good models for the potential energy surface? • How can the native conformation be found and recognized? (Shakhnovich E. I., Current Opinion in Structural Biology, 7, 29-40 (1997)) (Fetrow J.S. et al., Current Pharmaceutical Biotechnology, 3, 329-347 (2002))
Theoretical Studies • Homology modeling or threading could yield the final folded structure without giving insight into the folding process. • Ab initio folding with the complete sequence could probably reach a native-like structure, but the probability that it would follow the natural pathway is remote. • How can we know whether the pathways are similar? • In fact, it takes about a day to simulate a nanosecond (1/1,000,000,000 of a second). Unfortunately, proteins fold on the timescale of tens of microseconds (10,000 nanoseconds). Thus, it would take 10,000 CPU days to simulate folding -- i.e. it would take about 30 CPU years! That's a long time to wait for one result! • Classical molecular dynamics may miss many features of the folding process, as the process involves an ensemble of transition states. (http://folding.stanford.edu/science.html)
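The simulation-cost estimate on this slide is a one-line calculation, sketched below. The figures of one CPU day per simulated nanosecond and a 10-microsecond folding time come from the slide; the function name and the 365.25-day year are choices made for this illustration.

```python
def simulation_cost(folding_time_ns, cpu_days_per_ns=1.0):
    """Return (CPU days, CPU years) needed to simulate one folding event."""
    cpu_days = folding_time_ns * cpu_days_per_ns
    cpu_years = cpu_days / 365.25  # assumed length of a year
    return cpu_days, cpu_years


# 10 microseconds = 10,000 nanoseconds, as quoted on the slide.
cpu_days, cpu_years = simulation_cost(10_000)
print(f"{cpu_days:.0f} CPU days, about {cpu_years:.1f} CPU years")
```

The result is a little over 27 CPU years, which the slide rounds to 30; either way, a single trajectory on a single processor is out of reach at the quoted speed.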
The folding funnel • Energy landscape perspectives describe the in vitro progression of an isolated polypeptide chain from an ensemble of denatured, random conformations to the native structure at the global energy minimum • They do not account for the behavior of newly synthesized polypeptide chains released from ribosomes in cells. • They cannot describe the behavior of most polypeptide chains under physiological conditions. • They describe the folding behavior of only a single polypeptide chain at infinite dilution. They do not consider populations or incorporate realistic intermolecular collision frequencies. (Clark P., TRENDS in Biochemical Sciences, 29 (10) 527-534 (2004))
The folding funnel • An intrinsic feature of actual folding processes, namely collisions between partially folded chains that lead to self-association, is excluded from consideration • Misfolding associated with self-association, polymerization or aggregation is not considered • The cotranslational appearance of the polypeptide chain outside the ribosome therefore corresponds to a specific portion of the folding funnel, and the chain presumably folds reasonably quickly and efficiently to this available local energy minimum (Clark P., TRENDS in Biochemical Sciences, 29 (10) 527-534 (2004))
The folding funnel - Open questions • Is the observed folded conformation the one with lowest free energy? Or • Is it the most stable of the kinetically accessible conformations? (kinetically trapped in local minima)
How can a nascent protein fold correctly? • Protein folding, no matter how it works, has to be pretty simple and fast. • Reasonable approach • Folding is a hierarchical process, with primary structure preceding secondary structure, which is then followed by tertiary structure (and finally quaternary structure). (Johnson A. E., FEBS Letters 579, 916-920 (2005)) Figure from http://folding.stanford.edu/science.html
Other methods • Renaturation of denatured protein may not give correct insights into the folding kinetics and/or pathways. • Computational techniques like homology modeling, threading and ab initio algorithms also may not give correct insights into the folding kinetics and/or pathways.
The problem remains unsolved “Despite all the efforts… understanding of protein folding mechanism remains elusive.” • We are not very close to realizing this goal, and so the protein folding problem remains one of the most basic unsolved problems in biology.