
Grid-enabled Hybrid Computer System for Computational Astrophysics using OmniRPC


Presentation Transcript


  1. Grid-enabled Hybrid Computer System for Computational Astrophysics using OmniRPC Mitsuhisa Sato & Taisuke Boku Center for Computational Physics University of Tsukuba {msato,taisuke}@is.tsukuba.ac.jp http://www.hpcs.is.tsukuba.ac.jp/

  2. Grid and Parallel Applications [Figure: PCs and PC clusters combined into a cluster of clusters forming the HMCS-G grid environment] • "Typical" Grid Applications • Parametric execution: execute the same program with different parameters using a large amount of computing resources • Master-workers type of parallel program • "Typical" Grid Resources • A cluster of clusters: several PC clusters are available • Dynamic resources: load and status change from time to time • Special computing resources such as GRAPE-6

  3. Programming for Grid • Using the Globus shell (GSH) • Submit batch job scripts to remote nodes • Grid MPI (MPICH-G, PACX MPI, …) • General-purpose, but difficult and error-prone • No support for dynamic resources and fault tolerance • Grid RPC • A good programming interface • Ninf (ver. 1) • Uses asynchronous RPC calls for parallel programming • NetSolve • Ninf-G • OmniRPC • Example (see the sketch below): Ninf_call_async(A,…) Ninf_call_async(B,…) … Ninf_call_wait()
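
As an illustration of the asynchronous Grid RPC style named on this slide, here is a minimal C sketch. The call names Ninf_call_async and Ninf_call_wait are taken from the slide; the argument lists, the header, and the remote entry name "work" are assumptions for illustration, not the documented Ninf API.

    /* Sketch of Ninf-style asynchronous RPC; signatures are assumed. */
    #include <stdio.h>
    /* #include <ninf.h>   -- assumed header for the Ninf client library */

    #define N 16

    int main(int argc, char **argv)
    {
        double result[N];
        int i;

        /* Issue N independent remote calls without waiting for each one. */
        for (i = 0; i < N; i++)
            Ninf_call_async("work", i, &result[i]);

        /* Block until every outstanding asynchronous call has completed. */
        Ninf_call_wait();

        for (i = 0; i < N; i++)
            printf("result[%d] = %f\n", i, result[i]);
        return 0;
    }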

  4. OmniRPC Design • Supports master-workers parallel programs for parametric-search grid applications • A Grid RPC system based on the Ninf Grid RPC • The thread-safe RPC design allows OpenMP to be used as a simple and easy parallel programming API for existing programs • Makes use of remote PC/SMP clusters as Grid computing resources • omrpc-agent architecture • Supports large-scale grid environments (up to 1000 hosts!) • Uses a local scheduler (PBS, Sun Grid Engine, Condor, …) for scheduling jobs on the remote clusters • Seamless programming environment from clusters to the grid: it uses "rsh" for a cluster, "GRAM" for a grid managed by Globus, and "ssh" for remote nodes • Persistent data support in remote workers • For applications which require large data • Re-uses connections to workers for fast invocation • Automatically initializable persistence model for flexible re-allocation and fault tolerance • Uses direct stub invocation to provide an RPC interface to a specific remote host with security and authentication • HMCS-G • Code from the slide (a fuller sketch follows below): OmniRpcInit(&argv,&argc); /* initialize RPC */ .... #pragma omp parallel for private(t) reduction(+:s) for(i = 0; i < N; i++){ OmniRpcCall("work",i,&t,..); s += … t …; }
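
The code box on this slide compresses the whole client into one line; below is a minimal self-contained sketch of the same master-workers loop. The exact OmniRPC signatures (OmniRpcInit, OmniRpcCall, the finalize call, and the header name) and the remote entry "work" are assumptions based on the slide, not taken from the OmniRPC manual.

    /* Sketch of an OpenMP + OmniRPC master-workers client; API details are
     * assumed from the slide and may differ from the real OmniRPC library. */
    #include <stdio.h>
    #include <omp.h>
    /* #include <OmniRpc.h>  -- assumed header name */

    #define N 1000

    int main(int argc, char **argv)
    {
        double s = 0.0;
        int i;

        OmniRpcInit(&argc, &argv);      /* initialize the RPC runtime */

        /* Each iteration issues one blocking RPC to the remote entry "work".
         * Because the library is thread-safe, OpenMP threads can issue calls
         * concurrently, so many remote workers compute in parallel while the
         * client code stays a simple loop. */
        #pragma omp parallel for reduction(+:s)
        for (i = 0; i < N; i++) {
            double t;
            OmniRpcCall("work", i, &t); /* parameter i in, result t out */
            s += t;
        }

        printf("sum = %f\n", s);
        OmniRpcFinalize();              /* assumed finalize call */
        return 0;
    }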

  5. OmniRPC Agent [Figure: the master (client) connects to an OmniRPC agent on the cluster server; the agent acts as a proxy, multiplexing I/O to worker libraries on the cluster nodes via the local scheduler (PBS, Condor, …)] • An agent process on the remote node • Invoked by "rsh", "GRAM", or "ssh" at the beginning • Accepts requests from the client and submits RPC executable jobs on cluster nodes to the local scheduler (PBS, Condor, etc.) • Proxy for cluster nodes • Support for clusters with private IP addresses • I/O multiplexing to reduce the number of connections; scalable up to 1000 hosts (see the sketch below) • Connections using ssh port forwarding • Non-server design for security • The OmniRPC agent is invoked at each run • The master accepts connections at dynamically allocated ports
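
The slide only names the agent's I/O multiplexing; the fragment below is a generic poll()-based relay loop, included purely to illustrate the idea of funnelling many worker sockets through a single client connection. It is not OmniRPC source code, and the per-worker framing a real agent would need is only hinted at in a comment.

    /* Generic I/O-multiplexing sketch (conceptual, not OmniRPC code):
     * one process watches many worker sockets with poll() and relays
     * their output over a single connection to the client master. */
    #include <poll.h>
    #include <unistd.h>

    #define MAX_WORKERS 1024

    void relay_loop(int client_fd, const int *worker_fds, int nworkers)
    {
        struct pollfd pfds[MAX_WORKERS];   /* assumes nworkers <= MAX_WORKERS */
        char buf[4096];
        int i;

        for (i = 0; i < nworkers; i++) {
            pfds[i].fd = worker_fds[i];
            pfds[i].events = POLLIN;
        }

        for (;;) {
            if (poll(pfds, nworkers, -1) < 0)
                break;                     /* error or interrupted */
            for (i = 0; i < nworkers; i++) {
                if (pfds[i].revents & POLLIN) {
                    ssize_t n = read(pfds[i].fd, buf, sizeof(buf));
                    if (n <= 0)
                        continue;          /* worker closed or error */
                    /* A real agent would prepend a small header identifying
                     * worker i so the client can demultiplex the stream. */
                    write(client_fd, buf, (size_t)n);
                }
            }
        }
    }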

  6. Main resources in CCP, U. Tsukuba • GRAPE-6 (U. of Tokyo): 8 boards, 8 TFLOPS (Gordon Bell Award 2000 and 2001 by U. Tokyo) • CP-PACS/2048 (U. Tsukuba / Hitachi): 2048 PUs, 614 GFLOPS, #1 in the TOP500 in Nov. 1996 • SR8000/G1 (Hitachi): 12 nodes, 172 GFLOPS

  7. Main resources in CCP (cont'd) • Alpha CPU cluster: HP/Compaq DS20L + Myrinet2K, dual 833 MHz Alpha 21264, 29 nodes, 96 GFLOPS • PC clusters: AthlonMP 2x16 nodes, Itanium 4x4 nodes, Gigabit Ethernet • SGI Origin2K & Onyx2 (graphics and file servers): O2K with 8 CPUs + 8 Fast Ethernet, Onyx2 with 4 CPUs + 4 Fast Ethernet

  8. What is HMCS-G? • HMCS-G is a hybrid computational system concept that combines general-purpose and special-purpose machines on the basis of Grid RPC • Importance of the Grid for computational physics • Efficient utilization of world-wide generic HPC resources (MPP, cluster, storage, network, etc.) = quantitative contribution • Sharing special-purpose machines, installed at a small number of institutes, with users all over the world = qualitative contribution

  9. Basic concept of HMCS • Computational power required for computational physics • In state-of-the-art computational physics, various physical phenomena have to be combined for detailed and precise simulation • Some of them require enormous computational power at large problem sizes • FFT: O(N log N) • Gravity, molecular dynamics: O(N^2) • Nano-scale materials: O(N^3), O(N^4), ... • Ordinary general-purpose machines are not enough in many cases • Hence the requirement for special-purpose machines

  10. Basic concept of HMCS (cont'd) • General-purpose machines: variety of algorithms and easy programming for multi-purpose utilization • Special-purpose machines: absolute computational power for a very limited type of calculation • We need both! Heterogeneous Multi-Computer System combining high-speed computation power with a high-bandwidth network

  11. GRAPE-6 • The 6th generation of the GRAPE (Gravity Pipe) project • Gravity calculation for many particles at 31 Gflops/chip • 32 chips/board ⇒ 0.99 Tflops/board • The full system of 64 boards is installed at the University of Tokyo ⇒ 63 Tflops • On each board, all particle data are loaded into SRAM memory; each target particle is injected into the pipeline, and its acceleration is calculated (see the sketch below for the kernel each pipeline evaluates) • Gordon Bell Prize at SC2000 and SC2001 (Prof. Makino, U. Tokyo); also nominated at SC2002
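
To make concrete what each pipeline evaluates, here is a plain-C sketch of the softened direct-summation gravity kernel (units with G = 1). The data layout and the softening parameter eps are illustrative assumptions, not the GRAPE-6 host interface; the point is the pairwise loop, O(N^2) over all particles, that the hardware executes for each target particle.

    /* Direct-summation gravitational acceleration on one target particle:
     * the O(N) inner loop that GRAPE-6 pipelines in hardware (O(N^2) overall). */
    #include <math.h>

    typedef struct { double x, y, z, mass; } Particle;   /* assumed layout */

    void accel_on(const Particle *p, const Particle *src, int n,
                  double eps, double acc[3])
    {
        acc[0] = acc[1] = acc[2] = 0.0;
        for (int j = 0; j < n; j++) {
            double dx = src[j].x - p->x;
            double dy = src[j].y - p->y;
            double dz = src[j].z - p->z;
            double r2 = dx*dx + dy*dy + dz*dz + eps*eps;  /* softened r^2 */
            double inv_r3 = 1.0 / (r2 * sqrt(r2));
            acc[0] += src[j].mass * dx * inv_r3;
            acc[1] += src[j].mass * dy * inv_r3;
            acc[2] += src[j].mass * dz * inv_r3;
        }
    }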

  12. Block Diagram of HMCS-L (local) [Block diagram showing: the special-purpose machine for particle simulation (GRAPE-6); the MPP for continuum simulation (CP-PACS); the hybrid-system communication cluster (Compaq Alpha) attached via 32-bit PCI x N and 100base-TX switches; the parallel file server (SGI Origin2000) with the parallel I/O system PAVEMENT/PIO; and the parallel visualization server (SGI Onyx2) with the parallel visualization system PAVEMENT/VIZ]

  13. HMCS-G: Grid-enabled HMCS • Purpose • Sharing the special-purpose machine GRAPE-6 among users all over the world who need gravity calculation • Providing local and remote access to the GRAPE-6 facility simultaneously • Utilizing the GRAPE-6 resource efficiently • Hiding long network access latency through communication buffers between end-points • Implementation • Multiple-client support for the GRAPE-6 server cluster • OmniRPC: Grid-enabled RPC • Authentication: globus & ssh

  14. HMCS-G block diagram [Block diagram: at CCP, U. Tsukuba, the GRAPE-6 boards sit behind an HMCS-G server cluster and its agent (OmniRPC stub); local machines such as CP-PACS and a local cluster use the HMCS-L protocol, while remote clusters, MPPs, and vector machines outside the firewall connect via OmniRPC over ssh or globus]

  15. Agent (OmniRPC stub) • Authentication and accounting • Various benefits for application users • Easy API • globus option for de facto standard authentication • ssh option for easy system installation • Data buffering to absorb the communication speed gap between fast and slow client links (see the sketch below) [Figure: several clients, each reaching the HMCS-G server through an OmniRPC agent over fast or slow links, share the GRAPE-6 behind the server]
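
The "speed-gap absorbing" role of the agent is only named on the slide; the sketch below is a generic bounded request queue that decouples a slow client-side link (producer) from the fast GRAPE-6 side (consumer). It is a conceptual illustration under assumed data structures, not HMCS-G code.

    /* Bounded producer/consumer buffer (conceptual sketch, not HMCS-G code).
     * Zero-initialize a RequestQueue and set up the mutex and condition
     * variables with pthread_mutex_init / pthread_cond_init before use. */
    #include <pthread.h>

    #define QSIZE 64

    typedef struct { int n_particles; void *data; } Request;  /* illustrative */

    typedef struct {
        Request slots[QSIZE];
        int head, tail, count;
        pthread_mutex_t lock;
        pthread_cond_t not_empty, not_full;
    } RequestQueue;

    /* Called by the thread reading from the (possibly slow) client link. */
    void queue_put(RequestQueue *q, Request r)
    {
        pthread_mutex_lock(&q->lock);
        while (q->count == QSIZE)
            pthread_cond_wait(&q->not_full, &q->lock);
        q->slots[q->tail] = r;
        q->tail = (q->tail + 1) % QSIZE;
        q->count++;
        pthread_cond_signal(&q->not_empty);
        pthread_mutex_unlock(&q->lock);
    }

    /* Called by the thread feeding GRAPE-6; it never waits on the network. */
    Request queue_get(RequestQueue *q)
    {
        pthread_mutex_lock(&q->lock);
        while (q->count == 0)
            pthread_cond_wait(&q->not_empty, &q->lock);
        Request r = q->slots[q->head];
        q->head = (q->head + 1) % QSIZE;
        q->count--;
        pthread_cond_signal(&q->not_full);
        pthread_mutex_unlock(&q->lock);
        return r;
    }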

  16. Current Environment [Diagram: at CCP, U. Tsukuba, GRAPE-6 and the HMCS-G server cluster with its agent (OmniRPC stub) serve the local Alpha and AthlonMP clusters over the HMCS-L protocol, while remote P-4 and P-III clusters at TITECH (over SuperSINET) and Rikkyo Univ. (over the Internet) connect through the firewall via OmniRPC over ssh or globus]

  17. Network conditions • CCP local • 1000base-SX on the backbone, 100base-TX for leaves • Inside the university • 1000base-LX on the backbone, 100base-TX for leaves • Between U. Tsukuba and TITECH • SuperSINET (4 Gbps, commodity) → will be upgraded to a 1 Gbps dedicated link • Between U. Tsukuba and Rikkyo U. • Commodity Internet (bandwidth unknown)

  18. Performance results • GRAPE-6 pure computation time with 1 node • 0.8 sec for 64K particles • 2.1 sec for 128K particles (computation load for 128K particles: 1.3 TFLOP) • Turnaround time for 1 time step with 64K particles, [total] (communication) • [1.2 sec] (0.4 sec) local, direct • [1.7 sec] (0.9 sec) within the university, OmniRPC-globus • [2.1 sec] (1.3 sec) within the university, OmniRPC-ssh • [2.8 sec] (2.0 sec) TITECH, OmniRPC-ssh, SuperSINET • [9.1 sec] (8.3 sec) Rikkyo, OmniRPC-ssh, Internet
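
Taking the slide's figures at face value, the computation part of each 64K-particle step is constant at about 0.8 sec in every configuration (1.2 - 0.4 = 1.7 - 0.9 = 2.1 - 1.3 = 2.8 - 2.0 = 9.1 - 8.3 = 0.8 sec), matching the pure GRAPE-6 time, so the differences between sites are entirely communication cost; for 128K particles the implied pure-computation speed is roughly 1.3 TFLOP / 2.1 sec ≈ 0.6 TFLOPS per GRAPE-6 node.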

  19. On-line Demonstration • Run a simple client process to compute 10,000 particles with the same mass, gravity calculation only ⇒ on my laptop HERE • Transfer the gravity calculation requests to the HMCS-G agent running at the Center for Computational Physics, U. Tsukuba • Show particle movement in a 2-D mapping (the actual calculation is performed in 3-D) • Multiple processes can be served simultaneously

  20. Simulation of Galaxy Formation • Precise simulation of galaxy formation based on RT-SPH • Smoothed Particle Hydrodynamics with Radiative Transfer (RT-SPH) under gravity • Combination of hydrodynamics computation and gravity calculation • Cluster and MPP for RT-SPH • GRAPE-6 for gravity • RT-SPH with 128K particles for 40,000 time steps takes approximately 60 hours with 32 CPUs of the Alpha 21264 cluster (DS20L based, 833 MHz) • Gravity calculation with GRAPE-6, including communication, takes 11 hours
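
For scale, and taking the quoted totals at face value, 60 hours over 40,000 time steps is roughly 5.4 seconds of RT-SPH work per step on the 32-CPU Alpha cluster, while 11 hours of gravity corresponds to roughly 1 second per step on GRAPE-6 including communication.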

  21. Simulation Result (SPH with High-mass) Blue: “Stars” Other Color: Temperature

  22. Simulation Result (SPH with Low-mass) Blue: “Stars” Other Color: Temperature

  23. Conclusions • HMCS-G enables world-wide utilization of a special-purpose machine (GRAPE-6) based on OmniRPC • Multi-physics simulation is very important in next-generation computational physics • This platform concept can be extended to various special-purpose systems • OmniRPC/ssh is very easy for pure application users to deploy, as is OmniRPC/globus
