CCLMS
Large-Scale Molecular Dynamics Simulations of Materials on Parallel Computers
Aiichiro Nakano & Priya Vashishta
Concurrent Computing Laboratory for Materials Simulations
Department of Computer Science, Department of Physics & Astronomy
Louisiana State University
Email: nakano@bit.csc.lsu.edu  URL: www.cclms.lsu.edu
VII International Workshop on Advanced Computing & Analysis Techniques in Physics Research
Organizers: Dr. Pushpalatha Bhat & Dr. Matthias Kasemann
October 19, 2000, Fermilab, IL
Outline
1. Scalable atomistic-simulation algorithms
2. Multidisciplinary hybrid-simulation algorithms
3. Large-scale atomistic simulation of nanosystems
   > Nanophase & nanocomposite materials
   > Nanoindentation & nano-impact damage
   > Epitaxial & colloidal quantum dots
4. Ongoing projects
Concurrent Computing Laboratory for Materials Simulations
Faculty (Physics, Computer Science): Rajiv Kalia, Aiichiro Nakano, Priya Vashishta
Postdocs/research faculty: Martina Bachlechner, Tim Campbell, Hideaki Kikuchi, Sanjay Kodiyalam, Elefterios Lidorikis, Fuyuki Shimojo, Laurent Van Brutzel, Phillip Walsh
Ph.D. students: Gurcan Aral, Paulo Branicio, Jabari Lee, Xinlian Liu, Brent Neal, Cindy Rountree, Xiaotao Su, Satavani Vemparala, Troy Williams
Visitors: Elisabeth Bouchaud (ONERA), Antonio da Silva (São Paulo), Simon de Leeuw (Delft), Ingvar Ebbsjö (Uppsala), Hiroshi Iyetomi (Niigata), Shuji Ogata (Yamaguchi), Jose Rino (São Carlos)
Education: Dual-Degree Opportunity
• Ph.D. in physics & MS in computer science in 5 years
  —Broad career options (APS News, August/September '97)
• Synergism between HPCC (MS) & application (Ph.D.) research
  —Best dissertation award (Andrey Omeltchenko, '97)
  —MS publications (Parallel Comput., IEEE CS&E, Comput. Phys. Commun., etc.)
• Internship—a deliverable-oriented approach to real-world problems provides excellent job training
  —Boeing, NASA Ames, Argonne Nat'l Lab. (Web-based simulation/experimentation, Alok Chatterjee, Enrico Fermi Scholar, '99)
• International collaboration
  —Niigata, Yamaguchi (NSF/U.S.-Japan), Studsvik (Sweden), Delft (The Netherlands), São Carlos (Brazil)
• NSF Graduate Research Traineeship Program
• New program: Ph.D. in biological sciences & MS in computer science
International Collaborative Course
Web-based course involving LSU, Delft Univ. in the Netherlands, Niigata Univ. in Japan, & Federal Univ. of São Carlos in Brazil
[Figure: T3E, SP, Origin, and Alpha-cluster platforms linked by video conferencing, an ImmersaDesk VR workbench, and a virtual classroom with chat & whiteboard tools]
DoD Challenge Applications Award 1.3 million node-hours in 2000/2001
CCLMS 1. Scalable Atomistic-Simulation Algorithms
Atomistic Simulation of Nanosystems
[Figure: CMOS line width (SIA Roadmap, 0.25 µm in 1996 down to 70 nm by ~2010) vs. simulation reach: teraflop molecular dynamics covers the atomistic regime (10^7-10^9 atoms, ~10 nm); petaflop computing reaches the continuum regime (10^10-10^12 atoms, ~1 µm), i.e., atomistic simulation of real devices]
• Peta (10^15) flop computers → direct atomistic simulations
• Scalable applications → multiresolution algorithms are key
Molecular Dynamics Simulation
• Newton's equations of motion
• Many-body interatomic potential
  > 2-body: Coulomb; steric repulsion; charge-dipole; dipole-dipole
  > 3-body: bond bending & stretching
  —SiO2, Si3N4, SiC, GaAs, AlAs, InAs, etc.
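To make the integration step concrete, here is a minimal velocity-Verlet sketch in C++ for Newton's equations of motion. The Lennard-Jones pair force and the O(N^2) loop are illustrative stand-ins, not the CCLMS production kernels, which use the many-body (2- and 3-body) potentials listed above together with linked-cell and multipole techniques.

```cpp
// Minimal velocity-Verlet MD step: a sketch, not the CCLMS production code.
// The force law is a simple Lennard-Jones pair potential standing in for the
// many-body (2-body + 3-body) potentials named on this slide.
#include <vector>
#include <cstdio>

struct Vec3 { double x, y, z; };
struct Atom { Vec3 r, v, f; double m; };

// Lennard-Jones forces, O(N^2) for brevity (real codes use linked cells + FMM).
void compute_forces(std::vector<Atom>& atoms, double eps = 1.0, double sig = 1.0) {
    for (auto& a : atoms) a.f = {0, 0, 0};
    for (size_t i = 0; i < atoms.size(); ++i)
        for (size_t j = i + 1; j < atoms.size(); ++j) {
            double dx = atoms[i].r.x - atoms[j].r.x;
            double dy = atoms[i].r.y - atoms[j].r.y;
            double dz = atoms[i].r.z - atoms[j].r.z;
            double r2 = dx*dx + dy*dy + dz*dz;
            double sr2 = sig*sig / r2, sr6 = sr2*sr2*sr2;
            double fmag = 24.0 * eps * sr6 * (2.0*sr6 - 1.0) / r2;  // (-dU/dr)/r
            atoms[i].f.x += fmag*dx; atoms[i].f.y += fmag*dy; atoms[i].f.z += fmag*dz;
            atoms[j].f.x -= fmag*dx; atoms[j].f.y -= fmag*dy; atoms[j].f.z -= fmag*dz;
        }
}

// One velocity-Verlet step of Newton's equations m d2r/dt2 = f.
void verlet_step(std::vector<Atom>& atoms, double dt) {
    for (auto& a : atoms) {  // half-kick + drift
        a.v.x += 0.5*dt*a.f.x/a.m; a.v.y += 0.5*dt*a.f.y/a.m; a.v.z += 0.5*dt*a.f.z/a.m;
        a.r.x += dt*a.v.x;         a.r.y += dt*a.v.y;         a.r.z += dt*a.v.z;
    }
    compute_forces(atoms);   // forces at the new positions
    for (auto& a : atoms) {  // second half-kick
        a.v.x += 0.5*dt*a.f.x/a.m; a.v.y += 0.5*dt*a.f.y/a.m; a.v.z += 0.5*dt*a.f.z/a.m;
    }
}

int main() {
    std::vector<Atom> atoms = {{{0,0,0},{0,0,0},{0,0,0},1.0},
                               {{1.5,0,0},{0,0,0},{0,0,0},1.0}};
    compute_forces(atoms);
    for (int step = 0; step < 100; ++step) verlet_step(atoms, 1e-3);
    std::printf("x separation after 100 steps: %f\n", atoms[1].r.x - atoms[0].r.x);
    return 0;
}
```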
Validation of Interatomic Potentials
[Figure: neutron static structure factors of amorphous SiO2 and amorphous Si3N4 compared with experiment (Johnson et al. '83; Yoshida et al. '93); phonon dispersion of GaAs; high-pressure phase transition of SiC]
Space-time Multiresolution Algorithm
Challenge 1: Scalability to billion-atom systems
• Hierarchical fast multipole method (FMM): long-range forces, O(N^2) → O(N)
• Multiple time-scale (MTS) method: rapid short-range vs. slow long-range dynamics
Scaled speedup on Cray T3E: 1.02-billion-atom MD of SiO2, 26.4 sec/step on 1,024 Cray T3E processors at NAVO-MSRC; parallel efficiency = 0.97
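The multiple time-scale (MTS) idea can be sketched as an r-RESPA-style loop: slowly varying long-range forces (in practice evaluated by the FMM) are applied with a large outer time step, while rapidly varying short-range forces are integrated with a smaller inner step. The force callbacks below are placeholders, not the CCLMS kernels.

```cpp
// RESPA-style multiple time-scale integration: a sketch with placeholder
// force callbacks. Long-range (slow) forces are applied once per outer step;
// short-range (fast) forces are recomputed every inner step.
#include <functional>
#include <vector>

struct State { std::vector<double> r, v, m; };
using ForceFn = std::function<std::vector<double>(const std::vector<double>&)>;

void mts_step(State& s, double dt_long, int n_inner,
              ForceFn short_range,    // rapid, cheap (e.g., bonded/steric terms)
              ForceFn long_range) {   // slow, expensive (e.g., FMM Coulomb)
    const double dt_short = dt_long / n_inner;
    std::vector<double> f_long = long_range(s.r);
    // Outer half-kick with slow forces.
    for (size_t i = 0; i < s.r.size(); ++i) s.v[i] += 0.5 * dt_long * f_long[i] / s.m[i];
    // Inner velocity-Verlet loop with fast forces only.
    std::vector<double> f_short = short_range(s.r);
    for (int k = 0; k < n_inner; ++k) {
        for (size_t i = 0; i < s.r.size(); ++i) {
            s.v[i] += 0.5 * dt_short * f_short[i] / s.m[i];
            s.r[i] += dt_short * s.v[i];
        }
        f_short = short_range(s.r);
        for (size_t i = 0; i < s.r.size(); ++i) s.v[i] += 0.5 * dt_short * f_short[i] / s.m[i];
    }
    // Outer half-kick with slow forces at the new positions
    // (a production code would reuse these forces in the next step).
    f_long = long_range(s.r);
    for (size_t i = 0; i < s.r.size(); ++i) s.v[i] += 0.5 * dt_long * f_long[i] / s.m[i];
}

int main() {
    State s{{1.0}, {0.0}, {1.0}};
    auto fast = [](const std::vector<double>& r) { return std::vector<double>{-r[0]}; };  // stiff spring
    auto slow = [](const std::vector<double>&)   { return std::vector<double>{0.01}; };   // weak drift
    for (int step = 0; step < 1000; ++step) mts_step(s, 0.1, 10, fast, slow);
    return 0;
}
```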
Wavelet-based Load Balancing
Challenge 2: Load imbalance on a parallel computer due to irregular data structures & processor speeds
• "Computational-space decomposition": a regular mesh topology in computational space maps to a curved partition in physical space via a curvilinear coordinate ξ(x)
• A wavelet representation of ξ(x) speeds up the load-balance optimization
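The goal of computational-space decomposition is to deform a regular partition so that every processor receives roughly equal work. The toy sketch below shows only the load-imbalance objective and a 1D quantile-based cut placement; the actual scheme optimizes a curved 3D map ξ(x) and accelerates the minimization with a wavelet representation of that map.

```cpp
// Load-balancing sketch: choose 1D partition boundaries so each processor
// gets about the same number of atoms. This is a toy stand-in for the
// wavelet-accelerated curved-space decomposition described on this slide.
#include <algorithm>
#include <cstdio>
#include <vector>

// Given atom coordinates, place P-1 cuts at equal-count quantiles.
std::vector<double> balance_cuts(std::vector<double> x, int P) {
    std::sort(x.begin(), x.end());
    std::vector<double> cuts;
    for (int p = 1; p < P; ++p)
        cuts.push_back(x[p * x.size() / P]);  // boundary between domains p-1 and p
    return cuts;
}

// Load imbalance = (max per-processor atom count) / (average count).
double imbalance(const std::vector<double>& x, const std::vector<double>& cuts) {
    std::vector<int> load(cuts.size() + 1, 0);
    for (double xi : x) {
        size_t p = std::upper_bound(cuts.begin(), cuts.end(), xi) - cuts.begin();
        ++load[p];
    }
    int maxload = *std::max_element(load.begin(), load.end());
    return maxload * load.size() / double(x.size());
}

int main() {
    std::vector<double> x;                       // non-uniform atom distribution
    for (int i = 0; i < 1000; ++i) x.push_back(i * i / 1.0e6);
    const int P = 8;
    std::vector<double> uniform_cuts;            // regular grid in physical space
    for (int p = 1; p < P; ++p) uniform_cuts.push_back(p / double(P));
    std::printf("uniform-grid imbalance : %.2f\n", imbalance(x, uniform_cuts));
    std::printf("balanced-cut imbalance : %.2f\n", imbalance(x, balance_cuts(x, P)));
    return 0;
}
```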
Fractal-based Data Compression
Challenge 3: Massive data transfer via OC-3 (155 Mbps): 75 GB/frame of data for a 1.5-billion-atom MD!
Scalable encoding:
• Space-filling curve—store relative positions of atoms ordered along the curve
Result:
• I/O size reduced from 50 bytes/atom to 6 bytes/atom
[Figure: cells numbered 1-14 along a space-filling curve]
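A sketch of the scalable-encoding idea: order atoms along a space-filling curve and store quantized relative positions, which become small integers after the ordering and therefore compress well. A Morton (Z-order) key is used here purely for illustration; it stands in for the curve used in the actual I/O scheme.

```cpp
// Space-filling-curve ordering + delta (relative-position) encoding sketch.
// After sorting along the curve, successive atoms are spatially close, so the
// quantized coordinate differences are small integers that compress well.
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <vector>

// Spread the lower 21 bits of v so two zero bits separate each original bit.
uint64_t spread_bits(uint64_t v) {
    v &= 0x1fffffULL;
    v = (v | v << 32) & 0x001f00000000ffffULL;
    v = (v | v << 16) & 0x001f0000ff0000ffULL;
    v = (v | v << 8)  & 0x100f00f00f00f00fULL;
    v = (v | v << 4)  & 0x10c30c30c30c30c3ULL;
    v = (v | v << 2)  & 0x1249249249249249ULL;
    return v;
}

// 63-bit Morton key from coordinates quantized to a 2^21 grid.
uint64_t morton_key(uint32_t x, uint32_t y, uint32_t z) {
    return spread_bits(x) | (spread_bits(y) << 1) | (spread_bits(z) << 2);
}

struct QAtom { uint32_t x, y, z; };  // quantized coordinates

int main() {
    std::vector<QAtom> atoms = {{5,2,9},{6,2,9},{100,200,50},{5,3,9},{101,200,50}};
    std::sort(atoms.begin(), atoms.end(), [](const QAtom& a, const QAtom& b) {
        return morton_key(a.x, a.y, a.z) < morton_key(b.x, b.y, b.z);
    });
    // Delta-encode: the first atom is stored absolutely, the rest as differences.
    std::printf("deltas along the curve:\n");
    for (size_t i = 1; i < atoms.size(); ++i)
        std::printf("(%d, %d, %d)\n",
                    int(atoms[i].x) - int(atoms[i-1].x),
                    int(atoms[i].y) - int(atoms[i-1].y),
                    int(atoms[i].z) - int(atoms[i-1].z));
    return 0;
}
```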
Variable-charge MD
Challenge 4: Complex realism—chemical reactions
Electronegativity equalization (inter-atomic & intra-atomic energy terms):
• Determine atomic charges at every MD step—O(N^3)! (Streitz & Mintmire, '94)
• O(N) via i) the fast multipole method; ii) initial guess q^(init)(t+Δt) = q(t)
Multilevel preconditioned conjugate gradient (MPCG):
• Sparse, short-range interaction matrix as a preconditioner → 20% speedup
• Enhanced data locality: parallel efficiency improved from 0.93 to 0.96 for 26.5M-atom Al2O3 on 64 SP2 nodes
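Electronegativity equalization reduces, at each MD step, to solving a large linear system for the atomic charges. The sketch below is a generic preconditioned conjugate-gradient solver with a small dense matrix and a diagonal preconditioner; the MPCG scheme instead uses the sparse short-range interaction matrix as the preconditioner, evaluates the full matrix-vector product with the FMM, and starts from the previous step's charges.

```cpp
// Generic preconditioned conjugate gradient (PCG) for the linear system
// A q = b arising in electronegativity equalization. Here A is a small dense
// SPD matrix and the preconditioner is diag(A); the MPCG method of this slide
// uses the sparse short-range part of A instead.
#include <cmath>
#include <cstdio>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

Vec matvec(const Mat& A, const Vec& x) {
    Vec y(x.size(), 0.0);
    for (size_t i = 0; i < A.size(); ++i)
        for (size_t j = 0; j < A[i].size(); ++j) y[i] += A[i][j] * x[j];
    return y;
}

double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

Vec pcg(const Mat& A, const Vec& b, int maxit = 100, double tol = 1e-10) {
    const size_t n = b.size();
    Vec q(n, 0.0), r = b, z(n), p(n);
    for (size_t i = 0; i < n; ++i) z[i] = r[i] / A[i][i];  // preconditioner solve
    p = z;
    double rz = dot(r, z);
    for (int k = 0; k < maxit && std::sqrt(dot(r, r)) > tol; ++k) {
        Vec Ap = matvec(A, p);
        double alpha = rz / dot(p, Ap);
        for (size_t i = 0; i < n; ++i) { q[i] += alpha * p[i]; r[i] -= alpha * Ap[i]; }
        for (size_t i = 0; i < n; ++i) z[i] = r[i] / A[i][i];
        double rz_new = dot(r, z);
        for (size_t i = 0; i < n; ++i) p[i] = z[i] + (rz_new / rz) * p[i];
        rz = rz_new;
    }
    return q;
}

int main() {
    Mat A = {{4, 1, 0}, {1, 3, 1}, {0, 1, 5}};  // SPD stand-in for the charge-interaction matrix
    Vec b = {1, 2, 3};                          // stand-in for electronegativity differences
    Vec q = pcg(A, b);
    std::printf("q = (%f, %f, %f)\n", q[0], q[1], q[2]);
    return 0;
}
```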
Linear-Scaling Quantum-Mechanical Algorithm
Challenge 5: Complexity of ab initio QM calculations
• Density functional theory (DFT) (Kohn, '98 Nobel Chemistry Prize): O(C^N) → O(N^3)
• Pseudopotentials (Troullier & Martins, '91)
• Higher-order finite differences (Chelikowsky, Saad, et al., '94)
• Multigrid acceleration (Bernholc et al., '96)
• Spatial-decomposition O(N) algorithm (Mauri & Galli, '94): unconstrained minimization of localized orbitals
• Parallel efficiency ~96% for a 22,528-atom GaAs system on 1,024 Cray T3E processors
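The real-space DFT approach rests on representing the kinetic-energy operator by a higher-order finite-difference stencil on a grid, whose locality is what makes spatial decomposition and O(N) algorithms possible. Below is a 1D, 4th-order sketch of that stencil with a sin(x) accuracy check; the production code works in 3D with pseudopotentials, localized orbitals, and multigrid acceleration.

```cpp
// Higher-order finite-difference Laplacian in 1D: a sketch of the grid
// operator underlying real-space DFT. The coefficients are the standard
// 4th-order central-difference weights for d2/dx2.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// Apply d2/dx2 to f with 4th-order accuracy (interior points only).
std::vector<double> laplacian_1d(const std::vector<double>& f, double h) {
    std::vector<double> d2f(f.size(), 0.0);
    const double c = 1.0 / (12.0 * h * h);
    for (size_t i = 2; i + 2 < f.size(); ++i)
        d2f[i] = c * (-f[i-2] + 16.0*f[i-1] - 30.0*f[i] + 16.0*f[i+1] - f[i+2]);
    return d2f;
}

int main() {
    // Test on f(x) = sin(x), whose exact second derivative is -sin(x).
    const int n = 101;
    const double pi = std::acos(-1.0);
    const double h = 2.0 * pi / (n - 1);
    std::vector<double> f(n);
    for (int i = 0; i < n; ++i) f[i] = std::sin(i * h);
    std::vector<double> d2f = laplacian_1d(f, h);
    double maxerr = 0.0;
    for (int i = 2; i + 2 < n; ++i)
        maxerr = std::max(maxerr, std::fabs(d2f[i] + std::sin(i * h)));
    std::printf("max error of 4th-order Laplacian: %.2e\n", maxerr);
    return 0;
}
```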
Scalable MD/QM Algorithm Suite
Design-space diagram on 1,024 Cray T3E processors
On 1,280 IBM SP3 processors:
• 8.1-billion-atom MD of SiO2
• 140,000-atom DFT of GaAs
Immersive & Interactive Visualization
Last challenge: Sequential bottleneck of the graphics pipeline
• Octree data structure for fast visibility culling
• Multiresolution & hybrid (atom, texture) rendering
• Parallel preprocessing / predictive prefetch
• Graph-theoretical data mining of topological defects
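A sketch of how an octree supports visibility culling: atoms are stored in a spatial octree, and any subtree whose bounding box misses the view volume is skipped without touching its atoms. The view volume is reduced to an axis-aligned box here for brevity; a real frustum test and the multiresolution/prefetching machinery are omitted.

```cpp
// Octree visibility-culling sketch: skip whole subtrees whose bounding boxes
// do not intersect the (here axis-aligned) view volume.
#include <array>
#include <cstdio>
#include <memory>
#include <vector>

struct Box { double lo[3], hi[3]; };
struct Point { double x[3]; };

bool overlaps(const Box& a, const Box& b) {
    for (int d = 0; d < 3; ++d)
        if (a.hi[d] < b.lo[d] || b.hi[d] < a.lo[d]) return false;
    return true;
}

struct Node {
    Box box;
    std::vector<Point> atoms;                  // stored only at leaves
    std::array<std::unique_ptr<Node>, 8> child;
    bool leaf = true;
};

void insert(Node& n, const Point& p, int depth, int max_depth = 5) {
    if (n.leaf && (depth == max_depth || n.atoms.size() < 8)) { n.atoms.push_back(p); return; }
    if (n.leaf) {                              // split the leaf into 8 octants
        n.leaf = false;
        for (int c = 0; c < 8; ++c) {
            Box b = n.box;
            for (int d = 0; d < 3; ++d) {
                double mid = 0.5 * (n.box.lo[d] + n.box.hi[d]);
                if (c & (1 << d)) b.lo[d] = mid; else b.hi[d] = mid;
            }
            n.child[c] = std::make_unique<Node>(Node{b, {}, {}, true});
        }
        for (const Point& q : n.atoms) insert(n, q, depth, max_depth);
        n.atoms.clear();
    }
    int c = 0;                                 // pick the child octant containing p
    for (int d = 0; d < 3; ++d)
        if (p.x[d] > 0.5 * (n.box.lo[d] + n.box.hi[d])) c |= (1 << d);
    insert(*n.child[c], p, depth + 1, max_depth);
}

// Collect atoms whose enclosing octree cells intersect the view volume.
void cull(const Node& n, const Box& view, std::vector<Point>& visible) {
    if (!overlaps(n.box, view)) return;        // whole subtree skipped
    if (n.leaf) { visible.insert(visible.end(), n.atoms.begin(), n.atoms.end()); return; }
    for (const auto& c : n.child) if (c) cull(*c, view, visible);
}

int main() {
    Node root{{{0,0,0},{1,1,1}}, {}, {}, true};
    for (int i = 0; i < 1000; ++i)
        insert(root, {{(i % 10) * 0.1, (i / 10 % 10) * 0.1, (i / 100) * 0.1}}, 0);
    Box view{{0.0, 0.0, 0.0}, {0.3, 0.3, 0.3}};
    std::vector<Point> visible;
    cull(root, view, visible);
    std::printf("%zu of 1000 atoms survive visibility culling\n", visible.size());
    return 0;
}
```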
CCLMS 2. Multidisciplinary Hybrid-Simulation Algorithms
Multiscale Simulation
Lifetime prediction of safety-critical micro-electro-mechanical systems (MEMS) [R. Ritchie, Berkeley]
• Engineering mechanics experimentally validated above ~1 µm
• Atomistic simulation possible below ~0.1 µm
Bridging the length-scale gap by seamlessly coupling:
• Finite-element (FE) calculation based on elasticity
• Atomistic molecular-dynamics (MD) simulation
• Ab initio quantum-mechanical (QM) calculation
Hybrid QM/MD Algorithm
The MD simulation embeds a QM cluster described by real-space multigrid-based density functional theory
• Additive hybridization: reuse of existing QM & MD codes
• Handshake atoms: seamless coupling of the QM & MD systems
cf. FE/MD/tight-binding QM (Abraham, Broughton, Bernstein, Kaxiras, '98)
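The additive-hybridization bookkeeping can be written in a few lines: the hybrid energy is assembled as E = E_MD(whole system) + E_QM(cluster) - E_MD(cluster), so the existing QM and MD codes are called as black boxes. The energy routines below are placeholders standing in for the real DFT and MD engines, and the handshake-atom construction is only indicated by a comment.

```cpp
// Additive hybridization sketch: the total energy of the hybrid QM/MD system
// is assembled as  E = E_MD(all atoms) + E_QM(cluster) - E_MD(cluster),
// so existing QM and MD codes can be reused unchanged. The energy routines
// below are placeholders, not the real DFT/MD kernels.
#include <cstdio>
#include <vector>

struct Atom { double x, y, z; int species; };

// Placeholder stand-ins for the reused MD and QM engines.
double md_energy(const std::vector<Atom>& atoms)   { return -1.0 * atoms.size(); }
double qm_energy(const std::vector<Atom>& cluster) { return -1.2 * cluster.size(); }

// Extract the QM cluster (plus handshake atoms terminating dangling bonds).
std::vector<Atom> qm_cluster(const std::vector<Atom>& all, const std::vector<int>& qm_idx) {
    std::vector<Atom> cluster;
    for (int i : qm_idx) cluster.push_back(all[i]);
    // In the real scheme, handshake (e.g., H-termination) atoms are appended here.
    return cluster;
}

double hybrid_energy(const std::vector<Atom>& all, const std::vector<int>& qm_idx) {
    std::vector<Atom> cluster = qm_cluster(all, qm_idx);
    return md_energy(all) + qm_energy(cluster) - md_energy(cluster);
}

int main() {
    std::vector<Atom> all(100, Atom{0, 0, 0, 14});  // toy system of 100 "Si" atoms
    std::vector<int> qm_idx = {0, 1, 2, 3, 4};      // 5 atoms treated quantum mechanically
    std::printf("E_hybrid = %f\n", hybrid_energy(all, qm_idx));
    return 0;
}
```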
Hybrid MD/FE Algorithm
[Figure: MD, handshake (HS), and FE regions with crystallographic orientations marked]
• FE nodes & MD atoms coincide in the handshake region
• Additive hybridization
Oxidation on a Si Surface
[Figure: FE region, MD Si region, and a QM cluster (QM Si, QM O, handshake H atoms)]
The dissociation energy of O2 on a Si (111) surface is dissipated seamlessly from the QM cluster through the MD region to the FE region
CCLMS 3. Large-Scale Atomistic Simulation of Nanosystems
Fracture Simulation & Experiment
[Figure: microcrack coalescence in a Ti3Al alloy (E. Bouchaud) and in Si3N4; multiple crack branching in glass (K. Ravi-Chandar) and in graphite]
Fracture Energy of GaAs: 100-Million-Atom MD Simulation
256 Cray T3E processors at DoD's NAVO-MSRC
[Figure: 1.3 µm system, color-coded by shear stress from -0.8 to 0.8 GPa]
Good agreement with experiments (*Messmer '81; #Michot '88)
Si3N4-SiC Fiber Nanocomposite
1.5-billion-atom MD on 1,280 IBM SP3 processors at NAVO-MSRC
[Figure: 0.3 µm system; color code: Si3N4, SiC, SiO2]
Fracture surfaces in ceramic-fiber nanocomposites: toughening mechanisms?
Nanoindentation on a Silicon Nitride Surface
Use an Atomic Force Microscope (AFM) tip for nanomechanical testing of hardness
10-million-atom MD at ERDC-MSRC
[Figure: pressure color scale from -5 to >20 GPa]
Highly compressive/tensile local stresses
Indentation Fracture & Amorphization
[Figure: crystallographic directions <1210>, <1010>, <0001> marked]
• Indentation fracture at indenter diagonals: anisotropic fracture toughness
• Amorphous pile-up at indenter edges
Hypervelocity Impact Damage
Design of damage-tolerant spacecraft (meteoroid detector on the Mir orbiter)
[Figure: diamond impactor striking a diamond coating]
• Impact velocity: 8-15 km/s
• Impact graphitization
• Reactive bond-order potential (Brenner, '90)
Impact-Velocity Sensitivity
[Figure: time sequences at V = 8, 11, and 15 km/s]
Crossover from quasi-elastic response to evaporation at ~10 km/s
Epitaxially Grown Quantum Dots
Substrate-encoded size-reducing epitaxy: GaAs (001) substrate; <100> square mesas [A. Madhukar (USC)]
[Figure: ~10 nm AlGaAs quantum dot on a GaAs/AlGaAs mesa; [101] and [001] directions marked]
Stress Domains in Si3N4/Si Nanopixels
27-million-atom MD simulation
[Figure: 70 nm Si3N4/Si nanopixel; stress color scale from -2 GPa to 2 GPa]
• Stress well in Si with a crystalline Si3N4 film due to lattice mismatch
• Stress domains in Si due to an amorphous Si3N4 film
Colloidal Semiconductor Quantum Dots
• Applications: LEDs, displays; pressure synthesis of novel materials
High-pressure structural transformation in a 30 Å GaAs nanocrystal
[Figure: snapshots at 17.5 GPa and 22.5 GPa]
• Nucleation at the surface
• Multiple domains
Oxide Growth in an Al Nanoparticle
Unique metal/ceramic nanocomposite
[Figure: Al core with AlOx oxide shell; length scales of 110 Å and 70 Å marked]
Oxide thickness saturates at 40 Å after 0.5 ns—excellent agreement with experiments
CCLMS 4. Ongoing Projects
Information Grid
Universal access to networked supercomputing (http://www.nas.nasa.gov/aboutNAS/Future2.html)
I. Foster & C. Kesselman, The Grid: Blueprint for a New Computing Infrastructure ('99)
Metacomputing collaboration with DoD MSRCs: 4-billion-atom MD simulation of 0.35 µm fiber composites
MD Moore’s Law Number of atoms in MD simulations has doubled: • Every 19 months in the past 36 years for classical MD • Every 13 months in the past 15 years for DFT-MD 1,280 x IBM SP3 CDC3600 A petaflop computer will enable 1012-atom MD & 107-atom QM
Hybrid Simulation of Functionalized AFM Nanodevices to Design New Biomolecules
[Figure: Si3N4 AFM tip partitioned into QM, MD, and FE regions]
Biological Computation & Visualization Center, LSU ($3.9M, 2000- )
CCLMS Conclusion
Large-scale, multiscale simulations of realistic nanoscale systems will be possible in a metacomputing environment on the Information Grid
Research supported by NSF, AFOSR, ARO, USC/LSU MURI, DOE, NASA, and a DoD Challenge Applications Award