
Chemistry in Parallel Computing




Presentation Transcript


  1. Chemistry in Parallel Computing Brian W. Hopkins Mississippi Center for Supercomputing Research 29 January 2009

  2. What We’re Doing Here • Discuss the importance of computational chemistry in the HPC field. • Define some common terms in computational chemical sciences. • Discuss the two major branches of computational chemistry. • Discuss the particular needs of various computational chemistry applications and methodologies.

  3. Why We’re Doing It • Computational chemistry is one of the driving forces for continued investment in HPC infrastructure. • Better System Use • More Production • Less Money • More Opportunities • &c. • Stop me if there are questions!

  4. The Primacy of Computational Chemistry in HPC • Nationwide, computational chemistry and molecular biology consume a very large share of HPC resources (54% of total use!). • Here at UM and MCSR, CC is even more important: >90% of MCSR flops are used by CC applications.

  5. Computational Chemistry at UM/MCSR • Quantum programs are the biggest consumer of resources at MCSR by far: • Redwood: 99% (98 of 99 jobs) • Mimosa: 100% (86 of 86 jobs) • Sweetgum: 100% (24 of 24 jobs) • The one job in this snapshot that was not a QC job was an AMBER MD simulation. • This is typical.

  6. Computational Chemistry: A Sort-Of Dichotomy • Quantum chemistry is the attempt to solve the molecular electronic Schrödinger equation, and to compute chemical properties therefrom. • Molecular dynamics is the attempt to simulate the motion of atoms and molecules in space over short (1-10 ns) timespans. • There is actually some overlap between the two.

  7. Quantum Chemistry: Overview • The equations that describe chemical behavior are known: the time-independent Schrödinger equation, Ĥψ = Eψ. • While known, these equations are not solvable by any analytic approach. • The basic problem: the interdependence of a very large number of electronic coordinates. • While analytic solutions are not available, approximate numerical solutions are.

  8. The Polynomial Scaling Problem • Because of the complexity of the Schrödinger equation, the baseline QC method (HF theory) scales with the system size N as O(N^4). • More accurate methods scale from O(N^4) to O(N^8). • The very best method scales with (get this) O(N!). • “System size” here is some cooked-up number generated by hashing the number of electrons, the number of orbitals, symmetry, &c. • The polynomial scaling problem applies to every resource used by a job: CPU time, memory, disk, everything.
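
As a back-of-the-envelope illustration (not from the slides), here is what ideal polynomial scaling implies when a system doubles in size:

```python
# Rough illustration of the polynomial scaling problem (idealized):
# if a method scales as O(N^p), doubling the system size multiplies
# the cost by 2^p.  Exponents follow the slide's range (4 through 8).
for p in (4, 5, 6, 7, 8):
    print(f"O(N^{p}): doubling N multiplies the cost by {2**p}x")
```

At O(N^8), doubling the molecule makes the job 256 times more expensive, which is why the highest-accuracy methods are confined to very small systems.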

  9. A Word on Alphabet Soup • Always remember that the Schrödinger equation cannot be solved exactly; we’re always working at some level of approximation: • HF • DFT • MP2 • MP4 • CCSD, CISD • CCSD(T) • CCSDT, CISDT • CCSDTQ, CISDTQ • … • FCC, FCI • The fewer approximations we make, the better the results (and the more the calculation costs). Moving down the list: increasing accuracy, increasing expense, decreasing scalability, decreasing availability.

  10. Iteration in Quantum Chemistry • To solve the interdependence of coordinates, QC programs rely on iteration. • A guess is made for the location of each electron; that guess is processed; lather, rinse, repeat. • When the solution stops changing, you’re done. • The converged solution gives both a total energy of the molecule and a wavefunction that describes its state.

  11. So…What, Exactly, Is This Program Doing? • Building a guess wavefunction, represented by a huge 4D matrix of double-precision numbers. • Processing that matrix in a variety of ways (mostly matrix multiplies and inversions) • Diagonalizing the matrix. • Using the resulting eigenvectors to build a new guess. • Iterate until self-consistency.
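
The iterate-until-self-consistent pattern can be sketched with a toy fixed-point loop (a schematic stand-in only: the update rule below is a placeholder for "process the matrix, diagonalize, build a new guess", not real quantum chemistry):

```python
import math

# Schematic stand-in for the SCF cycle: start from a guess,
# "process" it, and repeat until the solution stops changing --
# the lather/rinse/repeat pattern described on the slide.
# The update x -> cos(x) is a toy placeholder for one SCF cycle.
def scf_like_iteration(guess, tol=1e-8, max_cycles=100):
    x = guess
    for cycle in range(1, max_cycles + 1):
        x_new = math.cos(x)           # stand-in for one SCF cycle
        if abs(x_new - x) < tol:      # convergence test
            return x_new, cycle
        x = x_new
    raise RuntimeError("SCF-like iteration failed to converge")

solution, cycles = scf_like_iteration(1.0)
print(f"converged to {solution:.6f} in {cycles} cycles")
```

Real codes report exactly this shape of result, e.g. Gaussian's "SCF Done ... after 16 cycles" line.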

  12. Common Chemical Properties • Many common chemical properties are computed by building derivatives of the molecular electronic wavefunction. • molecular structures • harmonic vibrational frequencies • polarizabilities • &c. • These derivatives can be calculated analytically or numerically.

  13. Geometry Optimization • One extremely common job type is the geometry optimization. • Procedure: • Start with a guess set of nuclear coordinates • Compute the wavefunction for the molecule • Compute the derivative of the wavefunction with respect to the nuclear coordinates • Adjust the nuclear coordinates • Repeat until the derivative is within tolerance of zero in every dimension • Note that this is a nested iteration: we’re iterating to build a wavefunction:

  Requested convergence on RMS density matrix=1.00D-08 within 64 cycles.
  Requested convergence on MAX density matrix=1.00D-06.
  SCF Done: E(RHF) = -565.830259809 A.U. after 16 cycles
  Convg = 0.7301D-08   -V/T = 2.0017   S**2 = 0.0000

  • then we’re iterating again to find a geometry:

  Item                  Value     Threshold  Converged?
  Maximum Force         0.000289  0.000015   NO
  RMS Force             0.000078  0.000010   NO
  Maximum Displacement  0.037535  0.000060   NO
  RMS Displacement      0.006427  0.000040   NO

  14. Analytic vs. Numerical Derivatives • Computing derivatives can be done two ways: • analytically, if the relevant functional form is in the code • adds significant expense relative to the underlying energy point • often not as scalable as the corresponding energy point calculation • numerically, by finite displacements along the relevant coordinates • always available; just do lots (and lots, and lots) of energy points (3N-6 internal coordinates, or 3N-5 for linear molecules) • embarrassingly parallel
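
A minimal sketch of the numerical route (the `energy` function here is a hypothetical stand-in for a full energy-point calculation):

```python
# Numerical gradient by finite displacement: each displaced energy
# point is independent of all the others, which is why this
# approach is embarrassingly parallel.
def energy(coords):
    # toy quadratic surface with its minimum at (1.0, 2.0)
    x, y = coords
    return (x - 1.0) ** 2 + (y - 2.0) ** 2

def numerical_gradient(f, coords, h=1e-5):
    grad = []
    for i in range(len(coords)):
        plus = list(coords);  plus[i] += h
        minus = list(coords); minus[i] -= h
        # central difference: (f(x+h) - f(x-h)) / 2h
        grad.append((f(plus) - f(minus)) / (2 * h))
    return grad

print(numerical_gradient(energy, [0.0, 0.0]))   # approx [-2.0, -4.0]
```

Every call to `f` in the loop could be farmed out to a different node, since no displacement depends on any other.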

  15. Scaling a Quantum Chemistry App • QC apps tend not to be very scalable. • There’s often no really good way to decompose the problem for workers. • symmetry blocks excepted • As a result, these codes are extremely talky • Talkiness is mitigated somewhat by use of specialized MP libs (TCGMSG). • Also, the biggest jobs tend to be I/O intensive, which murders performance. • SMP is better than MP, but limited by machine size (watch out!)

  16. The Scaling Wall • Gaussian scales to ~8 procs in the very best cases; many jobs will not scale at all. • NWChem will scale for most jobs to a few dozen procs; some jobs to just a handful. • MPQC will scale to many procs, but functionality is limited. • All parallel QC programs show some limited soft scaling • Always consult program manuals for scalability of a new method. • For most quantum chemists, the principal utility of a big machine like redwood is for running a large number of jobs on a few procs each.
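
The scaling wall follows directly from Amdahl's law; a quick sketch with hypothetical parallel fractions (invented numbers, not measurements of Gaussian, NWChem, or MPQC):

```python
# Amdahl's-law sketch of the "scaling wall": if only a fraction p
# of a job parallelizes, speedup saturates no matter how many
# processors you add.  The fractions below are hypothetical.
def speedup(parallel_fraction, procs):
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / procs)

for p in (0.80, 0.95, 0.99):
    print(f"parallel fraction {p:.2f}: "
          f"8 procs -> {speedup(p, 8):.1f}x, "
          f"64 procs -> {speedup(p, 64):.1f}x")
```

Even at 99% parallel, 64 processors never approach a 64x speedup, which is why running many few-processor jobs is often the better use of a big machine.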

  17. Quantum Chemistry and the Computing Specialist • User-set parameters: • the molecule to be studied • the set of orbitals used to describe the molecule (ie, basis set) • the level of approximation used to compute the wavefunction • Opportunities for user guidance: • what program to use? • how to build/optimize that program for a platform • how to effectively run the program on the machine • identification of common pitfalls (and pratfalls, too) • PARALLEL PROJECTS, SERIAL JOBS

  18. Molecular Simulation Methods • Basic idea: do a very rudimentary energy calculation for a very large number of atomic configurations; translate these energies into thermodynamic properties via the molecular partition function • Configurations can be determined either deterministically (MD) or stochastically (MC), but that doesn’t matter. • we’ll look at MD as an example

  19. The Molecular Dynamics Procedure • Begin with a set of molecules in a periodic box • like Asteroids, only geekier • Compute instantaneous forces on every atom in the box • bonds, angles, dihedrals, impropers within molecules • vdW and coulomb forces for proximal atoms • kspace electrostatic forces for distal atoms • Allow the atoms to move for 0.5 -- 2 fs. • Repeat from 100,000 to 10,000,000 times • Occasionally print out atomic positions, velocities, forces, and thermodynamic properties • Most analysis done post-hoc.
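
The compute-forces-then-move loop above can be sketched with a velocity-Verlet integrator for a single 1D harmonic "bond" (a toy stand-in, not a real forcefield):

```python
# Minimal velocity-Verlet MD loop for one atom on a 1D harmonic
# potential: compute the force, let the atom move for a small
# timestep, recompute the force, repeat.
def force(x, k=1.0):
    return -k * x                      # Hooke's-law restoring force

def velocity_verlet(x, v, dt, steps, m=1.0):
    f = force(x)
    for _ in range(steps):
        x += v * dt + 0.5 * (f / m) * dt * dt   # position update
        f_new = force(x)                        # new instantaneous force
        v += 0.5 * (f + f_new) / m * dt         # velocity update
        f = f_new
    return x, v

# one period of a unit harmonic oscillator is 2*pi time units
x, v = velocity_verlet(x=1.0, v=0.0, dt=0.001, steps=6283)
print(x, v)   # close to the starting point (1.0, 0.0)
```

A production run replaces `force` with the full bonded + pairwise + kspace sum over every atom in the box, and repeats the loop millions of times.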

  20. A Snapshot of Computational Demands • Bonded forces: • bonds • angles • dihedrals • impropers • Short-range pair forces: • van der Waals forces • coulomb forces • Long-range pair forces: • kspace Ewald sum for electrostatics

  21. What’s an Ewald Sum? • The energy of interaction for a van der Waals pair falls off as 1/r^6. • Consequently there’s a point (~10 Å) where these interactions are negligible and can be excluded. • We set this point as a cutoff and exclude vdW pairs beyond it. • By contrast, the ES potential falls off as 1/r. • Over the years, it’s been demonstrated that imposing even a very long cutoff on the ES part is an unacceptable approximation. • The Ewald sum is the current solution.
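
The cutoff argument can be checked numerically (an idealized sketch: unit prefactors, pairs at integer distances, no real units):

```python
# Why a cutoff works for van der Waals but not for electrostatics:
# sum up the energy contributed by pairs beyond the cutoff.
# (In a real 3D box the number of pairs at distance r grows as r^2,
# which makes the contrast even starker than this 1D toy shows.)
def tail_sum(power, r_cut, r_max=100000):
    return sum(1.0 / r**power for r in range(r_cut, r_max))

vdw_tail = tail_sum(6, 10)   # 1/r^6 beyond the cutoff: negligible
es_tail  = tail_sum(1, 10)   # 1/r beyond the cutoff: large, and
print(vdw_tail, es_tail)     # it keeps growing as r_max increases
```

The 1/r^6 tail is vanishingly small, while the 1/r tail is substantial and never stops accumulating, which is why simply truncating the electrostatics is unacceptable.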

  22. Electrostatics in Fourier Space • When transformed into Fourier space (aka kspace), the ES part converges more rapidly. • Thus, the Ewald sum: • Real space ES for pairs within the vdW cutoff • Fourier (k-)space ES for longer range pairs until convergence is achieved (usually 5-7 layers of periodic images). • Fourier space ES for short range pairs as a correction against double counting. • And the particle-mesh Ewald sum: • Real space as in Ewald • Fourier space done by projecting atomic forces onto a grid, computing kspace part, projecting forces from grid back onto atoms
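
The real-space/Fourier-space split rests on an exact identity, 1/r = erfc(αr)/r + erf(αr)/r; a quick numerical check (the splitting parameter α below is chosen arbitrarily):

```python
import math

# The Ewald idea in one identity: the slowly decaying 1/r Coulomb
# term splits into a short-ranged piece (summed in real space) and
# a smooth piece (summed in Fourier / k-space).
alpha = 0.5
for r in (1.0, 5.0, 10.0, 15.0):
    total = 1.0 / r
    real_part = math.erfc(alpha * r) / r    # decays fast: real space
    smooth_part = math.erf(alpha * r) / r   # smooth: Fourier space
    assert abs(real_part + smooth_part - total) < 1e-12
    print(f"r={r:5.1f}  1/r={total:.4f}  "
          f"real={real_part:.2e}  smooth={smooth_part:.4f}")
```

By r ≈ 10 the real-space piece is already negligible, so it can share the vdW cutoff, and only the smooth piece needs the kspace machinery.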

  23. The Cost of an Ewald Sum • An ordinary, real-space LRES calculation would eventually converge, but would require >15 periodic images to do so. • The kspace LRES of Ewald is similarly pairwise and similarly scales with N^2. • However, the more rapid convergence means we only need 5-7 periodic images. • On the other hand, we now have to do some extra stuff: • a 3D FFT to generate the kspace part • extra SR calcs for the double counting correction • a 3D FFT to transform the Ewald forces back to real space

  24. Particle-Mesh Ewald • The expense of an Ewald sum can be reduced by mapping atomic (particle) charges onto a grid (mesh) with far fewer points. • This grid can be transformed into kspace, the Ewald sum done, and transformed back. • The Ewald forces are then reverse-mapped onto the atoms that made up the grid. • This is an approximation, of which there are various flavors: • PME • SPME • PPPM • &c. • Scales as N log N rather than N^2. • Reduced scaling + extra mapping expense = crossover point? • In practice, crossover point is so low as to not matter.
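
The N log N vs N^2 comparison can be sketched with arbitrary constant factors (the factor of 50 below is invented purely to show how low the crossover sits):

```python
import math

# Rough comparison of O(N^2) pairwise Ewald work against
# O(N log N) particle-mesh work.  The constant factor of 50 on the
# mesh side is hypothetical; even with a generous penalty for the
# mapping/FFT overhead, the mesh wins at quite small N.
for n in (1_000, 10_000, 100_000):
    pairwise = n * n
    mesh = 50 * n * math.log2(n)
    print(f"N={n}: N^2 = {pairwise:.2e}, 50*N*log2(N) = {mesh:.2e}")
```

With this (invented) factor the crossover falls below N = 1000 atoms, far smaller than any typical solvated simulation box, which is the sense in which the crossover point "doesn't matter".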

  25. Scaling an MD Simulation • MD programs lend themselves moderately well to MP parallelism. • Typical approach is spatial decomposition; ie, each PE computes forces & integrates for a region within the simulation cell. • Talk points are limited: • Pairlist builds • 3D FFT for Ewald sum • Mapping and unmapping for PME • New atomic positions in integrator • Thus, the main issue is whether or not there’s enough work to go around --> soft scaling very pronounced • I/O can be an issue, esp. for large boxes; should be limited to once every 100-1000 steps, though.

  26. So, How Do MD Programs Do? • As usual, there’s a compromise to be made between “full featured” and “high performance”. • Old, heavily developed platforms like Amber have lots of features but only scale moderately well (to a few dozen procs). • New, high tech platforms like NAMD scale to hundreds of procs but lack common features (NH, &c.) • Again, all MD programs exhibit pronounced soft scaling: • bigger problems more accessible • smaller problems are no faster.

  27. Moving Between MD Programs • Trickier than moving between QC programs. • There are a lot of subtle things that must be considered: • available PME approaches • different pairlisting algorithms • trajectory synchronization • scaling of 1-3 and 1-4 interactions • &c. • It’s almost never possible to build a single research project on simulations run with two different programs • Thus, it’s critical to choose the right program for the whole job at the beginning.

  28. Molecular Simulation and the Computing Specialist • User-set parameters: • the size of the box • the length of the timestep • the number of steps needed • the forcefield features needed • Opportunities for guidance: • Which program to use • building/optimizing for a platform • scaling limits for a job of given size • &c.

  29. Summary • Quantum Chemistry: • listen to what your users need • help the user organize jobs into “parallel projects” • go shopping for the best-scaling program to do individual job types • programs are more or less perfectly interchangeable, with a little care • Molecular Simulation • listen to what your users need • help the user shop for the best program for their sim • be careful about what you choose, because you’ll be stuck with it
