Operational COSMO Demonstrator OPCODE
The Operational COSMO Demonstrator (OPCODE) is a collaborative initiative aimed at enhancing the COSMO production suite through advanced GPU technology. Running from June 2011 to the end of 2012, the project integrates contributions from MeteoSwiss, SCS, and the Swiss HPC center CSCS. The key goals include leveraging research outcomes from the HP2C COSMO project and optimizing post-processing tools to improve forecast efficiency. Focused on porting existing codes to GPU and optimizing workflows, this demonstrator addresses significant bottlenecks while enhancing performance and reducing costs.
Presentation Transcript
Operational COSMO Demonstrator OPCODE COSMO-GM, Rome, 5-9 September 2011 André Walser and Oliver Fuhrer MeteoSwiss
Project overview
• Additional proposal to the Swiss HP2C initiative to build an “OPerational COSMO DEmonstrator (OPCODE)”
• Project proposal accepted by end of May
• Start of project 1 June 2011, running until end of 2012
• Project resources:
  • second contract with IT company SCS to continue collaboration until end of 2012
  • 2 new positions at MeteoSwiss for about 1 year
  • Swiss HPC center CSCS
  • C2SM (collaboration with ETH Zurich and others)
Main goals
• Leverage the research results of the ongoing HP2C COSMO project
• Prototype implementation of the MeteoSwiss COSMO production suite making aggressive use of GPU technology
• MeteoSwiss ready to buy GPU-based hardware for the 2015 production machine
• Same time-to-solution on substantially cheaper hardware: GPU-based hardware (a few rack units) instead of Cray XT4 (3 cabinets)
GPU perspectives
• GFLOPS per watt is expected to increase strongly over the next few years
Current production scheme
[Timeline figure: COSMO-7 / COSMO-2 suite, elapsed time in min. Bars: COSMO-7 assimilation, forecast and TC products; COSMO-2 assimilation, forecast and TC products. Labels: 3h assimilation (21 UTC), twice; 0-24h forecast (00 UTC) and TC products, twice; 25-72h forecast (00 UTC) and TC products. Time marks as extracted: 1, 7, 11, 46, 0, 34, 49, 61]
• Time-critical post-processing takes about 15 minutes longer than the forecasts for both COSMO-2 and COSMO-7
• Current bottleneck is the post-processing tool fieldextra
• Entire suite has to be optimized for the demonstrator
Two work packages
• Work package A: Porting the remaining parts of the operational COSMO code @ MeteoSwiss to the demonstrator
• Work package B: Porting the suite to the demonstrator, optimizing it, and operating it
Work package A
To use the full speed-up, data has to remain on the GPU within a time step; it is sent to the CPU for I/O only.
COSMO workflow: what is still missing for a full GPU implementation?
Input, Physics (HP2C), Dynamics (HP2C), Assimilation, Boundary Conditions, Diagnostics, Output
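The point about keeping data on the GPU can be illustrated with a small mock that only counts host/device copies. It is a sketch under stated assumptions, not a model of the actual COSMO port: the function `count_transfers`, the run length, the output interval, and the choice of which components run on the GPU are all hypothetical, and only the component names are taken from the workflow on the slide.

```python
# Mock sketch: count host<->device copies of a field during a time loop.
# Which components run on the GPU is a hypothetical input, not a statement
# about the real COSMO code.

STEP_COMPONENTS = ["physics", "dynamics", "assimilation",
                   "boundary conditions", "diagnostics"]


def count_transfers(on_gpu, nsteps, output_every):
    """Count copies, assuming the field starts on the GPU."""
    transfers = 0
    location = "gpu"
    for step in range(1, nsteps + 1):
        for comp in STEP_COMPONENTS:
            target = "gpu" if comp in on_gpu else "cpu"
            if target != location:
                transfers += 1  # field must move to where the code runs
                location = target
        if step % output_every == 0 and location == "gpu":
            transfers += 1  # copy to the CPU for output; GPU copy stays valid
    return transfers


# Hypothetical run: 100 time steps, output every 10 steps.
full_port = count_transfers(set(STEP_COMPONENTS), nsteps=100, output_every=10)
partial_port = count_transfers({"physics", "dynamics"}, nsteps=100,
                               output_every=10)
print(full_port, partial_port)
```

In this toy setting, a full port copies data only at output steps, while a port of physics and dynamics alone forces roughly two copies per time step.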
Task A2: Inter-/intra-GPU parallelization
• COSMO requires a communication library with halo update as well as several other communication patterns (e.g. global reduce, gather, scatter)
• e.g. peer-to-peer communication
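To make the terms halo update and global reduce concrete, here is a minimal single-process sketch in Python. It is not the communication library of COSMO or of this project: the subdomains live in one process, the domain is 1D and periodic, and the names `HALO`, `split_domain`, `halo_update`, and `global_sum` are hypothetical. In a real setup the copies below would be messages between processes or GPUs.

```python
# Minimal halo-update sketch on a 1D periodic domain (pure Python, no MPI).
# All names are illustrative assumptions.

HALO = 1  # halo width in grid points (assumption)


def split_domain(field, nranks):
    """Split a global 1D field into equal subdomains with empty halo cells."""
    n = len(field) // nranks
    subdomains = []
    for r in range(nranks):
        interior = field[r * n:(r + 1) * n]
        subdomains.append([None] * HALO + interior + [None] * HALO)
    return subdomains


def halo_update(subdomains):
    """Fill each subdomain's halo cells from its neighbours' interior edges."""
    nranks = len(subdomains)
    for r, sub in enumerate(subdomains):
        left = subdomains[(r - 1) % nranks]
        right = subdomains[(r + 1) % nranks]
        # Left halo receives the last interior points of the left neighbour.
        sub[:HALO] = left[-2 * HALO:-HALO]
        # Right halo receives the first interior points of the right neighbour.
        sub[-HALO:] = right[HALO:2 * HALO]


def global_sum(subdomains):
    """Global reduce: sum interior points only, so halos are not counted twice."""
    return sum(sum(sub[HALO:-HALO]) for sub in subdomains)


field = [float(i) for i in range(8)]
subs = split_domain(field, nranks=4)
halo_update(subs)
print(subs[0], global_sum(subs))
```

Only interior points are read during the update, so the order in which the subdomains are processed does not matter in this sketch.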
Task A4: Porting data assimilation to GPU
• The assimilation part is a huge body of code!
Organization
• 0.9 FTE: new position @ MeteoSwiss, 1 year, still open
• 1.9 FTE: new collaborator @ MeteoSwiss (15 months), CSCS
• 1.7 FTE: SCS, CSCS, C2SM