1 / 60

Modeling real molecules in virtual water.

Modeling real molecules in virtual water. Alexey Onufriev The Scripps Research Institute La Jolla, CA. Acknowledgements: David Case (Scripps) Don Bashford (Scripps). Support: NIH grant GM 45607. Outline:. Key features of “virtual water” model.

webbashley
Télécharger la présentation

Modeling real molecules in virtual water.

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Modeling real molecules in virtual water. Alexey Onufriev The Scripps Research Institute La Jolla, CA Acknowledgements: David Case (Scripps) Don Bashford (Scripps) . Support: NIH grant GM 45607

  2. Outline: • Key features of “virtual water” model. • Simulated folding of a 46-residue protein using all-atom molecular dynamics based on the GB. • structural basis of stability of halophilic proteins.

  3. THEME I.Protein folding. Amino-acid sequence – translated genetic code. MET—ALA—ALA—ASP—GLU—GLU--…. How? Experiment: amino acid sequence uniquely determines protein’s 3D shape (ground state). Nature does it all the time. Can we?

  4. The magnitude of the protein folding challenge: A small protein is a chain of ~ 50 mino acids (more for most ). Assume that each amino acid has only 10 conformations (vast underestimation) Total number of possible conformations: 1050 Say, you make one MC step per femtosecond. Exhaustive search for the ground state will take 1027 years. Why bother: protein’s shape determines its biological function.

  5. Single amino-acid(phenilalanin) Complexity of protein design Example: PCNA – a human DNA-binding protein. Drawn to scale

  6. Principles of Molecular Dynamics (MD): Y F = dE/dr System’s energy Bond spring + A/r12 – B/r6 + Q1Q2/r Kr2 VDW interaction Electrostatic forces Bond stretching x Each atom moves by Newton’s 2nd Law: F = ma + - + … E =

  7. Computational advantages of representing water implicitly, via a continuum solvent model Implicit water as dielectric continuum Explicit water (traditional) Low computational cost. Fast dynamics. • Other advantages: • Instant dielectric response => no water equilibration necessary. • No viscosity => faster conformational transitions. • Solvation in an infinite volume => no boundary artifacts. • Solvent degrees of freedom taken into account implicitly => easy to • estimate total energy of solvated system. Large computational cost. Slow dynamics.

  8. Traditional approach: e(r ) f(r) = -4pr(r)numerically, on a grid. D D To obtain electrostatic potential, solve Poisson-Boltzmann equation “inside”, e=4 Use molecular structure to define the dielectric boundary. “outside”, e=80

  9. Computational challenges: 1. Solution of a PDE is expensive. Slow.2. Need fine grid => run out of memory fast.3. Problems with obtaining derivatives of energy. Need fast method. Can’t afford to solve Poisson equation on a grid every 2 fs of simulation time. Also, an analytical formula for the energy would be nice here, to get the forces via F = dE/dr.

  10. + + rij qi - - qj - molecule Alternative: generalized Born approximation (GB): Total electrostaticenergy Solvent polarization, DW Vacuum part Function to be determined.

  11. + Solvation energy of individual ion +

  12.  0 f  rij E ~ 1/r f  (RiRj)1/2 E ~ 1/R  0  1 rij j i Interpolates between the case when atoms are far from each otherand Coulomb’s law is recovered i j And when they fuse into one, and Born’s formula is recovered The “magic” formula: f = [ rij2 + RiRjexp(-rij2/4RiRj) ]1/2 /Still et al. 1990 /

  13. All pairs of atoms f – simple, smooth function

  14. Key result: (Onufriev et al. J.Comp.Chem. 2002 ) Given a “good” set of effective radii Ri , the Generalized Born formula approximates the essentially exact Poisson-Boltzmann solution very well. that is given the correct self energy, the interaction energy can be compouted (for vast majority of atom pairs in the molecule).

  15. How to calculate the effectiveBorn rdius, Ri? Can be computed analytically,for each atom. Depends on molecule’s shape. Invert the Born formula Calculate solvationenergy from classicalelectrostatics. Make someapproximations… Do the integral analytically, overatom spheres.

  16. Need a SIMPLE, ANALYTICAL approximation for: 1 Molecular volume water water atom Molecularvolume => no water within. Gridintegration?Not a goodidea. Need simple, analyticalformula. Water doesnot eneter inside the molecule Water 1.4 A 1.4 A 1.4 A 1.4 A Computational challenges. Example. Molecular volume > Si 4/3pR3i

  17. This is how it is done: General idea: l ~ 4/3 compact packing of hard spheres atom spheres

  18. R-1 = (r - 0.09A)-1 – r-1 tanh(aF - bF2 + gF3 ) F = I(r -0.09A) How effective Born radii are computed in the new GB model. original: degree of burial. R-1 = (r - 0.09A)-1 - I new

  19. Effects of radii re-scaling in the new GB model. • Compared to the original GB, the new approach: • Makes Reff of buried atoms larger. • Removes numerical instability for large Reff. • Small molecule results as good (small Reff). New GB Original GB Degree of atom’s burial (normalized)

  20. The improved GB model is validated by: 1. Calculations of pKs of titratable groups in proteins -- comparison to experiment.2. Comparison to Poisson-Boltzmann (exact) theory. 3. MD simulations of native proteins – to see how close is simulated structure to the known X-ray one.

  21. GB – the Generalized Born model (modified). PB -- Poisson Boltzmann theory.

  22. How well does new GB work (compared to Poisson-Boltzmann) ? The change in the electrostatic part of solvation energy DW of the protein in going from its native (N) to the unfolded (U) states, calculated using the PB and GB models. The change, DW (N) - DW (U) , is in Kcal/mol. The quantity is calculated as an average over 50 structures representative of each state.

  23. Potential function: Electrostatic part of the of solvation free energy(calculated by the GB) Total energy Etot = Evacuum + DEelecGB + DGnonpolar Gas-phase potential (AMBER-7) Non-polar contributionto solvation. Approximatedas surface energy ~ sA

  24. The bottom of the folding funnel. 4 5 6 7

  25. Simulated Refolding pathway of the 46-residue protein Movie available at: www.scripps.edu/~onufriev/RESEARCH/in_virtuo.html 1 3 0 1 2 3 4 5 6 5 NB: due to the absence of viscosity, folding occurs on much shorter time-scale than in an experiment.

  26. Recent landmark attempt to fold a (36 residue) protein in virtuo using Molecular Dynamics: Duan Y, Kollman, P Science, 282 740 (1998). Simulation time: 3 months on 256 processors = 64 years on one processor. Result: partially folded structure. Problem: explicit water simulation are too expensive computationally – can’t wait long enough.

  27. Folding a protein in virtuo using Molecular Dynamics based on the Generalized Born (implicit solvation) model. Simulation time: overnight on 16 processors. Protein to fold: 46 -residue protein A (one of the guinea pigs in folding studies). No prior structural knowledge is necessary (homology, etc. ) Protocol details: AMBER-7 package, parm-94 force-field. New GB model.

  28. Protein-A re-folding steps. Formation of residue-residue contacts

  29. Initial stages of re-folding. Contacts are formed between residues in the loops. (mostly hydrophobic) 0 – 0.5 ns 0.5 – 1 ns Contacts superimposed on the native backbone . t=0 ns corresponds to the unfolded structure. Hypothesis: restricted motion in the loops (hinges) direct fast folding. NMR evidence for restricted motions in the “hinge regions” in the unfolded state of apomyoglobin: Schwarzinger S., Wright, P., Dyson, J. Biochem. 41, 12681, (2002)

  30. 1 Correcttopology T=300 K, 5ns T=450 K 2 Wrong topologyHigh energy T=300 K; 6ns T=750 K 3 Correcttopology T=300 K; 3ns T=750 K But loop fluctuations slightly restricted. (f,y) of 4 residues. Harmonic potential 2 kcalmol/rad if deviate more than 200 from native values Correct topology is achieved despite considerable fluctuations of dihedral angles in the high temperature unfolded state. Restricted motion in the loops in the unfolded state may be important for fast folding. (directing formation of the correct topology)

  31. Conclusions to part I: • Molecular Dynamics based on the improved GB model can be used to fold a 46-residue protein (to backbone RMSD to X-ray 2.4 A, starting from an unfolded state at 450 K. ) • Contacts formed in the early stages of folding between residues in the loop regions may direct fast formation of the correct topology.

  32. Restricted motion in the loops in the unfolded state may be important for fast folding. If, instead of 450K, the protein in unfolded at 750K (the rest of the simulation protocol remains the same), it misfolds upon cooling. The misfolded structure has the wrong topology (and higher energy) compared to the native fold achieved in the previous simulation, and represents a kinetic trap. If, however, (f,y) dihedral angles of loop residues (4 in each loop) are slightly restrained during the simulation, the protein finds the correct topology immediately upon cooling . Experimentally, restricted motions in the loops are observed in 8M urea unfolded apomyoglobin (S. Schwarzinger, P. Wright, et al. ) 2 1 3 End result (300K) Correct topology is achieved despite considerable fluctuations of dihedral angles in the high temperature unfolded state. Fluctuations in the unfolded state (0.1 – 1ns)red diamonds – values in the native state. 1 T=450 K 2 T=750 K 3 T=750 K, loop fluctuations slightly restricted loop loop

  33. The generalized Born formula can be more than just a computationally effective alternative to the Poisson-Boltzmann method. After all, its a (simple) formula, and as such can be used for back-of-the-envelope calculations.

  34. BIOLOGY + INFORMATICS = bioinformatics CHEMISTRTY + INFORMATICS = cheminformatics ? physionformatics PHYSICS + INFORMATICS =

  35. Application II: What makes proteins from halophilic organisms stable at very high salt? • Why study extremophiles? • Natural curiosity (generally not fundable). • Industrial uses of extremophiles: (PCR, pharmaceuticals, food industry, etc.Nature 409 1092 (2001). • Life on Mars?

  36. Halophilic proteins EXAMPLE: hMDH. Zaccai et al. JMB 208 491 (1989) an ordinary protein Unlike ordinary proteins, stability ofhalophilic proteins increases with salt concentrations. Why? Can we try to find out? 100% % Folded 0 1 2 3 4 [NaCl] [M]

  37. Unfolded Folded If halophilic, then: Folding free energy = - ( protein stability ) Unfolded DGU->F = GF - GU Folded A necessarycondition for halophilism: d(stability)/d(salt concentration) > 0.

  38. Folding free energy = - (protein stability) Salt dependence enters through the electrostatics… Only this term is affected by salt Can estimate using GB

  39. Which can be approximated via the generalized Born formula: Salt dependence e = 80 (water) Salt dependence ~ exp(-kfGB)

  40. Expect to be large and positive in halophilic proteins > 0 since RU < RF < 0 since fGBU > fGBF for most iJ is to have: Therefore, the only chance to be halophilic, i.e. to have: i.e. the protein must be highly charged The salt dependence of folding free energy is therefore: Unfolded Folded

  41. Sqi = 5 Sqi = 36 Stability decreases with salt (ordinary proteins) Stability increases with salt,as in halophilic proteins Increasing charge on a protein may indeed make it MORE stable at high salt. (salt decreases strongly unfavorable charge-charge interactions in the folded state) Example of a protein: apomyoglobin Onufriev, Bashford, Case JMB. 325, 555 (2003).

  42. Halophilic proteins are indeed strogly charged. Randomly selected N=125 non-homologous proteins from SwissProt database. GLU,ASP count as –1, ARG, LYS count as +1, HIS counts as 0

  43. As expected, is large and positive in halophilic proteins. homo sapiens halophilic Randomly selected N=125 non-homologous proteins from SwissProt database.

  44. As expected, is large and positive in halophilic proteins.

  45. ??? _ _ _ _ _ _ _ + + + + How can adding more charge stabilize a protein (at high salt)? Energy

  46. E13 > 0 (destabilizing ) E23 < 0 (stabilizing) E13 E23 But E13 < | E23| 1-3 interaction is more strongly screened by solvent than 1-2 and 2-3 which occur mostlythough low dielectric protein. _ _ • E12 + E23 + E13 < E12 Additional (negative) charge on the surface may lead to additional stabilizationwhich increases with salt Note: E13 decreases with salt. + Additional charges near the surface CAN make the protein more stable at high salt. E12 < 0 (stabilizing) E12 Solution e =80: +salt Protein e~4

  47. Halophilic proteins are rich in negatively charged surface residues. The positive countercharges are next to them. Structural data from 3 halophilic proteins available in PDB (1D3A, 1DOI, 1HLP)

More Related