340 likes | 468 Vues
CHARMM: A user’s perspective V. M. Dadarlat, March 2007.
E N D
CHARMM: A user’s perspectiveV. M. Dadarlat, March 2007 CHARMM - a research program developed at Harvard University for energy minimization and dynamics simulation of proteins, nucleic acids and lipids in vacuum, solution or crystal environments (Harvard CHARMM Web Page http://yuri.harvard.edu/). - Models the dynamics and mechanics of macromolecular systems using empirical and mixed empirical/quantum mechanical force fields. - Designed to investigate the structure and dynamics of large molecules. It performs free energy calculations of mutations and drug binding as well as conformational folding of peptides. - Uses classical mechanical methods to investigate potential energy surfaces derived from experimental and "ab initio" quantum chemical calculations. - Mixed quantum mechanical/classical systems can be defined to investigate chemical processes such as enzyme catalysis.
CHARMM - a command line program, i.e. a command is read from the input stream (typed, or from a file) and acted upon. • References: • CHARMM: A Program for Macromolecular Energy, Minimization, and Dynamics Calculations, J. Comp. Chem. 4, 187-217 (1983), by B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus. • CHARMM: The Energy Function and Its Parameterization with an Overview of the Program, in The Encyclopedia of Computational Chemistry, 1, 271-277, P. v. R. Schleyer et al., editors (John Wiley & Sons: Chichester, 1998), by A. D. MacKerell, Jr., B. Brooks, C. L. Brooks, III, L. Nilsson, B. Roux, Y. Won, and M. Karplus. • http://www.psc.edu/general/software/packages/charmm/charmm.html • http://www.psc.edu/general/software/packages/charmm/tutorial/index.html • http://www.ch.embnet.org/MD_tutorial/
Potential energy functions in CHARMM: - CHARMM22 - all atom potential function for - proteins (MacKerell et al 1998), - nucleic acids (MacKerell et al. 1995), - lipids (Schlenkrich et al. 1996) and carbohydrates (Ha et al. 1988). - CHARMM27 - CHARMM19
What CHARMM can do for you Main features of the CHARMM program: • Molecular Mechanics (Energy Minimization) • Molecular Dynamics • Classical Simulations • Crystal Simulations • Specialized techniques (Umbrella Sampling, Constant Pressure) • Miscellaneous Mean Field Potentials (MMFP) • Langevin/Implicit Euler Dynamics • Monte Carlo simulation package • Free Energy Perturbation Calculations • Block program • Perturb program • Thermodynamic Simulation Method (TSM)
- Normal Mode and Molecular vibrational analysis facility • - Reaction Path Determination • - Genetic Algorithm • - Advanced implicit solvent models • - Analytical Continuum Electrostatics (ACE) model • - Effective Energy Function 1 (EEF1) • - Generalized Born Solvation Energy (GBorn) • Interfaces • - Multi-body dynamics (MBOND) • - Merck Molecular Force Field (MMFF) • - Quantum and Molecular Mechanical Force Field (QM/MM) • - Analysis facility • - Time series and correlation function calculations • - NMR analysis facility • - Poisson-Boltzmann Equation Solver • - Reference Interaction Site Model (RISM)
Controlling a CHARMM Run: Commands IF command-parameter test-spec comparison-string command-spec GOTO label-string LABEL label-string STREAM [UNIT integer] [file-specification] RETURN SET command-parameter string INCRement command-parameter [BY real] DECRement command-parameter [BY real] IF tests to conditionally execute a single command GOTO and LABEL transfers within a file STREAM and RETURN transfers to different command files. The logical unit in OPEN, CLOSE, and REWIND commands are useful in working with streams.
CHARMM units for quantities: The AKMA system (Angstroms, Kilocalories/Mole, Atomic mass units). - Distances in Angstroms - Energies in kcal/mole - mass in atomic mass units - charge is in units of electron charge. - angles in degrees (analysis and constraint sections) the force constants for angles, dihedrals, and dihedral constraints are specified in kcal/mole/radian/radian.
Running MD with CHARMM: • A. BUILD STRUCTURE • B. EQUILIBRATION • C. PRODUCTION RUN • =EQUILIBRIUM DYNAMICS • D. ANALYSIS
Input files • Structure files: • Topology file (top*.inp); Parameter file (para*inp); • Data structures, information about the molecule: • composition, chemical connectivity, • atomic properties, internal coordinates and • parameters for the energy function. • For a specific molecule, the data is extracted • from these files and stored in the Protein Structure File (PSF). • The Coordinates (COOR): • Coordinate file: from *.pdb (from the Protein Data Bank) to a • CHARMM specific format *.crd • The coordinates are Cartesian for all the atoms in the PSF; • 2 coordinate files: main and comparison sets • CHARMM script file: kidkixdynamics.inp
Output Files • Standard output: *.out (time, energy, temperature, statistics, etc.) • Trajectory file (binary) *.dcd (or *.trj) – a time series, collection of “snapshots” of the positions of each atom in the system as a function of time • A restart file, *.rst – contains positions and velocities of all atoms at the end of the current dynamics run • A time series of velocities can also be saved
A. BUILD STRUCTURE 1fsc.pdb structure representation showing disulfide bonds. • Download 1fsc.pdb from PDB • Import pdb file in QUANTA – gui • Pdb format to crd format • Can build hydrogens • Find and show disulfide bonds, patches • Structure comparisons
Initial decisions and analysis: read the pdb file carefully • - Titratable groups: what to do with Histidines, Glutamic and Aspartic acids? • - Xray waters • - Missing residues • - bound ions and coordination complexes • Disulfide bonds • Solution conditions if NMR • ~ 20 NMR lowest energy structures: which one? • Minimization and solvation • Minimize structure with CHARMM force field, center the molecule, get structural info: Xmin, Xmax; Ymin, Ymax; Zmin, Zmax • - Patches for protein and ligands and any additional rtf and parameter information all into the solvation script. • - Decide on periodic boundary conditions: image or crystal, cubic or octahedral (box) • Build adequate water box, with sizes Xmin-12 A, Xmax+12 A • Equilibrate water box • Overlay protein in water box by removing all water molecules that overlap on the protein
Histidine HSD resid 6 Histidine HSE resid 29
Figuring out Titratable groups for Histidine: Charmm < histidines.inp > histidines.out CHARMM command COOR MINDist will give the closest atom to each ring nitrogen in each histidine important output as follows. Residue 6 has an electron donor near ND1 indicating it should be protonated NE2 has an electron acceptor near it indicating it should not be protonated. We should decide to make residue 6 a HSD. Residue 29 a HSE since there is nothing near ND1 but an acceptor near NE1. MINIMUM DISTANCE FOR ALL SELECTED ATOMS 83 1FSC HSD 6 ND1 - 106 1FSC THR 7 O 2.8497 MINIMUM DISTANCE FOR ALL SELECTED ATOMS 88 1FSC HSD 6 NE2 - 583 1FSC ARG 37 HH12 1.9869 MINIMUM DISTANCE FOR ALL SELECTED ATOMS 446 1FSC HSD 29 ND1 - 420 1FSC ARG 28 HB1 3.6827 MINIMUM DISTANCE FOR ALL SELECTED ATOMS 451 1FSC HSD 29 NE2 - 690 1FSC ASN 47 OD1 2.6855
Equilibrate minimize solvent with solute fixed to relieve bad solvent vdW contacts Run several ps CPT MD at 300K with weak constraints on the solute Run several hundred ps CPT MD to equilibrate all. Analyze energy Analyze pressure and temperature Production Run Depending on the system and investigated properties you have to decide how long your simulation should run. For some systems, e. g. folded proteins, few ns would be enough to assess stability, volume fluctuations, etc. For PMF calculations from MD simulations, hundreds of ns are needed for sampling the conformational space Analyze RMSD with respect to the x-Ray (NMR) structure Atomic Fluctuations rmsd time series Energy, temperature, density time series P2 correlation functions and order parameters
EXAMPLE: MD SIMULATIONS of 2 ALANINE DIPEPTIDE MOLECULES IN WATER
CHARMM Input file - script: generate a potential energy surface for alanine dipeptide(example) • * This is a sample command file for CHARMM which calls a stream file • * to build a structure and then maps out an adiabatic potential • * surface defined by a pair of dihedrals * • OPEN UNIT 10 READ FORM NAME makestruc.inp • STREAM UNIT 10 • SET 1 -180. • SET 2 -180. • LABEL LOOP • CONS CLDH • CONS DIHE first-dihedral-angle-spec FORCE 100.0 MIN @1 • CONS DIHE second-dihedral-angle-spec FORCE 100.0 MIN @2 • MINI minimization-spec • INCR 1 BY 30.0 • IF 1 LT 170. GOTO LOOP • SET 1 -180. • INCR 2 BY 30.0 • IF 2 LT 170. GOTO LOOP • STOP
* file: adp2.dynam.inp * EQUILIBRIUM DYNAMICS:this program starts 5000 CPT dynamics steps with a timestep of 0.002 * bomlev -2 set 1 /bio/terry3/charmm/c30a2x1/toppar open unit 1 read form name @1/top_all22_model.inp read rtf card unit 1 close unit 1 open unit 2 read form name @1/par_all22_prot.inp read param card unit 2 close unit 2 read sequence card * Sequence for generating adp * 1 ALAD generate ADP1 read sequence card * Sequence for generating adp * 1 ALAD generate ADP2 read sequence tip3 990
generate BULK noangle nodihe open unit 1 form read name 2adp990wateq.crd read coor card unit 1 close unit 1 ! Define the box for PBC set theta = 90.00 set alongx = 30.0 set alongy = 30.0 set alongz = 30.0 CRYSTAL DEFINE CUBIC @alongx @alongy @alongz @theta @theta @theta CRYSTAL BUILD CUTOFF 15.0 NOPERATIONS 0 !coor dist sele type CA* .and. segid ADP1 end sele type CA* .and. segid ADP2 end !coor copy comp Image byres select resname TIP3 end Image byseg select segid ADP1 end Image byseg select segid ADP2 end open unit 20 read card name 2adp5free.rst.in open unit 30 write file name 2adp5free.trj open unit 40 write card name 2adp5free.rst
update inbfrq -1 imgfrq 20 - cutim 14.0 ctonnb 10.0 wmin 1.0 - CTOFNB 12.0 CUTNB 14.0 - ! atom vatom vswitch - ewald pmewald order 6 kappa 0.340 fftx 32 ffty 32 fftz 32 !mini abnr nstep 200 LATTICE shake bonh param tol 1.0e-09 !!!!!!Dynamics Specifications DYNAMICS nstep 20000 timestep 0.002 nprint 50 iprfrq 500 - restart firstt 278.0 finalt 278.0 - ! start firstt 278.0 finalt 278.0 - - CPT hoover - Pcons Pref 1.0 Pmass 500.0 Pgamma 25.0 - refT 278.0 Tmass 1000.0 - - inbfrq -1 imgfrq 50 - ntrfrq 100 - iunrea 20 - iunwri 40 - iuncrd 30 nsavc 50 - iunvel -1 nsavv 0
open write card unit 1 name 2adp990wat.crd write coor card unit 1 * After Nps * coor dist sele segid ADP1 .and. type CA* end - sele segid ADP2 .and. type CA* end Stop _____________________________________________________________
ANALYZE the TRAJECTORY: file analyze.inp open unit 34 read unfo name adp5freep.trj traj iread 34 nread 1 skip 50 set 5 1 label loop traj read coor dipo sele segid ADP1 end coor rgyr sele segid ADP1 end inte sele segid ADP1 end sele segid BULK end incr 5 by 1 if 5 lt 401 goto loop stop close unit 34
File: analyze #!/bin/bash STYPE=2adp278nowat DPATH=/caracas/voichi/thermprot/2adp278ana FPATH=caracas-scratch/voichi/2adp278snowat #max=201 #icnt=100 #max=2 max=1501 #icnt=1001 icnt=1376 while [ "$icnt" != "$max" ] do cp $FPATH/${STYPE}.$icnt.dcd 2adp278nowat.dcd /bio/terry3/charmm/c30a2x1/exec/gnu.serial/charmm < analyzenowat.inp > analyze.out #/bio/terry3/charmm/c30a2x1/exec/gnu.serial/charmm < phipsicalc.inp > analyze.out #grep "" fort.11 >> psi1.check.out #grep "" fort.10 >> phi1.check.out #grep -A 3 -f rotmat.grep analyze.out >> rotmat1.out #grep -f surf1apol.grep analyze.out | awk -f surfapol.awk >> partsurfapol.ADP1.out #grep -f surf2apol.grep analyze.out | awk -f surfapol.awk >> partsurfapol.ADP2.out #grep -f surfpol.grep analyze.out | awk -f surfpol.awk >> surfpol.ADP2.out grep -f distcm.grep analyze.out | awk -f distcm.awk >> distcm.out #grep -f rmsd.grep analyze.out | awk -f rmsd.awk >> rmsd1.out #grep -f rotangle.grep analyze.out | awk -f rotangle.awk >> rotangle1.out #grep -f dipo.grep analyze.out | awk -f dipo.awk >> dipo22.all.out #grep -f rgyr.grep analyze.out | awk -f rgyr.awk >> rgyrADP1+ADP2.out #grep -f rotangle.grep analyze2.out | awk -f rotaxis.awk >> rotaxis.out #grep -f ener.grep analyze.out | awk -f ener.awk >> tote.alawat.out #grep -f ener.grep analyze.out | awk -f ener.awk >> tote12.out #grep -f vdw.grep analyze.out | awk -f vdw.awk >> vdw12.out
CONTINUE ANALYZE (note: useful to know grep and awk) #grep -f elec.grep analyze.out | awk -f elec.awk >> vdwelec.alawat.out #grep -f elec.grep analyze.out | awk -f elec.awk >> elec12.out #grep -f cm.grep analyze.out | awk -f cm.awk >> cm1.out #grep -f end2enddist.grep analyze.out | awk -f end2enddist.awk >> endtoenddist2.out #grep -f x.grep analyze.out | awk -f file.awk >> xcl1.out #grep -f y.grep analyze.out | awk -f file.awk >> ycl1.out #grep -f z.grep analyze.out | awk -f file.awk >> zcl1.out #grep -f x.grep analyze.out | awk -f file.awk >> x2ca.out #grep -f y.grep analyze.out | awk -f file.awk >> y2ca.out #grep -f z.grep analyze.out | awk -f file.awk >> z2ca.out #grep -f x1.grep analyze.out | awk -f file.awk >> x1ca.out #grep -f y1.grep analyze.out | awk -f file.awk >> y1ca.out #grep -f z1.grep analyze.out | awk -f file.awk >> z1ca.out icnt=$(($icnt +1)) done
Some Results from Analysis of Trajectories: - Potentials of Mean Force - Conformational Space of alad-alad system
Running CHARMM Under Unix General syntax (assuming /bin/csh): > charmm < filename.inp > filename.out [[param:value] ...] [ & ] filename.inp - a text file containing CHARMM input commands. filename.out- the log file for the CHARMM run, containing echoed commands, and various amounts of command output. The output print level may be increased or decreased, and procedures such as minimization and dynamics have printout frequency specifications. & - the optional ampersand will place the program in the background under most Unix shells.
Run Multiple Dynamics Trajectories: runDyna #!/bin/bash STYPE=2adp5free DPATH=/caracas/voichi/thermprot/2adp278 FPATH=/pucc/fortress/voichi/2adp278sml max=2001 icnt=1 while [ "$icnt" != "$max" ] do /bio/terry3/charmm/c30a2x1/exec/gnu.serial/charmm < $STYPE.dynam.inp > $STYPE.dynam.out cp $DPATH/$STYPE.rst $DPATH/trj/$STYPE.rst.$icnt mv $DPATH/$STYPE.rst $DPATH/$STYPE.rst.in cp $STYPE.dynam.out $DPATH/trj/dynam.$STYPE.out.$icnt cp $DPATH/$STYPE.trj $DPATH/trj/$STYPE.trj.$icnt gzip $DPATH/trj/dynam.$STYPE.out.$icnt gzip $DPATH/trj/$STYPE.rst.$icnt mkdir ${STYPE}trj$icnt mv $DPATH/trj/dynam.$STYPE.out.$icnt.gz ${STYPE}trj$icnt mv $DPATH/trj/$STYPE.rst.$icnt.gz ${STYPE}trj$icnt mv $DPATH/trj/$STYPE.trj.$icnt ${STYPE}trj$icnt tar -cf ${STYPE}trj$icnt.tar ${STYPE}trj$icnt rm -r ${STYPE}trj$icnt mv ${STYPE}trj$icnt.tar $FPATH/. icnt=$(($icnt +1)) done
Submit as batch jobs: the rcac– ITAP computers have a batch job queueing system which should be used for running long MD simulations with CHARMM. CHARMM calculations should be limited to around 12 to 16 hours, to both promote resource sharing and to minimize data loss due to machine or network failures. To accomplish this, the user has to carefully choose the number of time steps for the calculations of dynamics trajectories. Trajectories are stored in the .dcd (or .trj) binary files and typically contain 20 to 40 ps each. Example: qsub –q sp_huge nodes=8, walltime=16:00:00 rundyna
Interfacing to CHARMM A mechanism is provided to allow users of CHARMM to write their own special purpose subroutines which can be incorporated into the system without threatening its integrity. There are six "hooks" into CHARMM which have been specially provided for casual modifiers. For detailed descriptions of each of these hooks, consult the routine in ~/charmm/source/main/usersb.src on UNIX machines or [...CHARMM.SOURCE.MAIN]USERSB.SRC under VAX/VMS.
RUNNING CHARMM at NIH CHARMM at NIH (Bernie Brooks): http://www.lobos.nih.gov/Charmm/ Available on several computer systems at NIH CHARMM at Pittsburg Supercomputing Center ? http://www.lobos.nih.gov/Charmm/chmdoc.html NHLBI LBC Computational Biophysics SectionCHARMM Documentation Index
Use with caution! A tale from the literature: J. Chem. Phys. 2002, 117: 8892-8897. “An examination of the five-site potential (TIP5P) for water” Martin Lísal, Ji í Kolafa, Ivo Nezbeda Parameterization of the five-site model (TIP5P) for water [M. W. Mahoney and W. L. Jorgensen, J. Chem. Phys. 112, 8910 (2000)] has been examined by several computer simulation methods accounting properly for long-range forces. ... It is shown that the simple spherical cutoff method used in the original simulations to find optimized parameters of this five-site model yields results that differ from those obtained by both the Ewald summation and reaction field methods. Consequently, the pivot property to which the parameters were adjusted, the location of the density maximum at 1 atm, does not agree with experimental values. The equilibrium properties then show only a fair agreement with experimental data and are uniformly inferior to those of the four-site TIP4P water over the entire coexistence range.
Use with caution: J. Phys. Chem. B, 104 (15), 3668 -3675, 2000. Molecular Dynamics Simulations of a Polyalanine Octapeptide under Ewald Boundary Conditions: Influence of Artificial Periodicity on Peptide Conformation Wolfgang Weber, Philippe H. Hünenberger,* and J. Andrew McCammon Ewald and related mesh methods are nowadays routinely used in explicit-solvent simulations of solvated biomolecules, although they impose an artificial periodicity in systems which are inherently nonperiodic. In the present study, we investigate the consequences of this approximation for the conformational equilibrium of a polyalanine octapeptide... We report three explicit-solvent molecular dynamics simulations of this peptide in cubic unit cells of edges L = 2, 3, and 4 nm, using the P3M method... The initial configuration of the peptide is helical. In the largest unit cell (L = 4 nm), the helix unfolds quickly... By contrast, in the two smaller unit cells (L = 2 and 3 nm), the helix remains stable during 2 ns. ... The helical conformation is stabilized by artificial periodicity relative to any other configuration sampled during the trajectories. This artificial stabilization is larger for smaller unit cells, and is responsible for the absence of unfolding in the two smaller unit cells... These results suggest that artificial periodicity imposed by the use of infinite periodic (Ewald) boundary conditions in explicit-solvent simulations of biomolecules may significantly perturb the potentials of mean force for conformational equilibria, and even in some cases invert the relative stabilities of the folded and unfolded states.