CyberShake Study 2.3 Technical Readiness Review



  1. CyberShake Study 2.3 Technical Readiness Review

  2. Study re-versioning • SCEC software uses year.month versioning • Suggest renaming this study to 13.4

  3. Study 13.4 Scientific Goals • Compare Los Angeles-area hazard maps • RWG V3.0.3 vs AWP-ODC-SGT (CPU) • CVM-S4 vs CVM-H • Different version of CVM-H than previous runs • Adds San Bernardino, Santa Maria basins • 286 sites (10 km mesh + points of interest)

  4. Proposed Study sites

  5. Study 13.4 Data Products • 2 CVM-S4 Los Angeles-area hazard maps • 2 CVM-H 11.9 Los Angeles-area hazard maps • Hazard curves for 286 sites – 10s, 5s, 3s • Calculated with OpenSHA v13.0.0 • 1144 sets of 2-component SGTs • Seismograms for all ruptures (about 470M) • Peak amplitudes in DB for 10s, 5s, 3s • Access via CyberShake Data Product Site (in development)

  6. Study 13.4 Notables • First AWP-ODC-SGT hazard maps • First CVM-H 11.9 hazard maps • First CyberShake use of Blue Waters (SGTs) • First CyberShake use of Stampede (post-processing) • Largest CyberShake calculation by 4x

  7. Study 13.4 Parameters • 0.5 Hz, deterministic post-processing • 200 m spacing • CVMs • Vs min = 500 m/s • GTLs for both velocity models • UCERF 2 • Latest rupture variation generator

  8. Verification work • 4 sites (WNGC, USC, PAS, SBSM) • RWG V3.0.3, CVM-S • RWG V3.0.3, CVM-H • AWP, CVM-S • AWP, CVM-H • Plotted with previously calculated RWG V3 • Expect RWG V3 slightly higher than the others

  9. WNGC hazard curves (CVM-S and CVM-H): RWG V3.0.3 in green, AWP in purple, RWG V3 in orange

  10. USC hazard curves (CVM-S and CVM-H): RWG V3.0.3 in green, AWP in purple, RWG V3 in orange

  11. PAS hazard curves (CVM-S and CVM-H): RWG V3.0.3 in green, AWP in purple, RWG V3 in orange

  12. SBSM hazard curves (CVM-S and CVM-H): RWG V3.0.3 in green, AWP in purple, RWG V3 in orange

  13. SBSM Velocity Profile

  14. Study 13.4 SGT Software Stack • Pre AWP • New production code • Converts velocity mesh into AWP format • Generates other AWP input files • SGTs • AWP-ODC-SGT CPU v13.4 (from verification work) • RWG V3.0.3 (from verification work) • Post AWP • New production code • Creates SGT header files for post-processing with AWP

  15. Study 13.4 PP Software Stack • SGT Extraction • Optimized MPI version with in-memory rupture variation generation • Support for separate header files • Same code as Study 2.2 • Seismogram Synthesis / PSA calculation • Single executable • Same code as Study 2.2 • Hazard Curve calculation • OpenSHA v13.0 • All codes tagged in SVN before study begins

  16. Distributed Processing (SGTs) • Runs placed in pending file on Blue Waters (as scottcal) • Cron job calls build_workflow.py with run parameters • build_workflow.py creates PBS scripts defining jobs with dependencies • Cron job calls run_workflow.py • run_workflow.py submits PBS scripts using qsub dependencies • Limited restart capability • Final workflow jobs ssh to shock, call handoff.py • Performs BW->Stampede SGT file transfer (as scottcal) • scottcal BW proxy must be resident on shock • Registers SGTs in RLS • Adds runs to pending file on shock for post-processing
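The Blue Waters side is scripted rather than Pegasus-managed, so the dependency chaining falls to run_workflow.py. A minimal sketch of that chaining follows; it is not the production script, and the stage names and file layout are assumptions used only to show how qsub afterok dependencies string the PBS jobs together.

  #!/usr/bin/env python
  # Minimal sketch (not the production run_workflow.py): submit a run's PBS
  # scripts in order, chaining each stage to the previous one with qsub's
  # afterok dependency so a failed stage halts the rest of the chain.
  # The stage script names below are illustrative assumptions.
  import subprocess

  def submit_chain(pbs_scripts):
      """Submit PBS scripts with afterok dependencies; return their job IDs."""
      job_ids = []
      prev_id = None
      for script in pbs_scripts:
          cmd = ["qsub"]
          if prev_id:
              # Only start this stage if the previous stage exited cleanly.
              cmd += ["-W", "depend=afterok:" + prev_id]
          cmd.append(script)
          # qsub prints the new job's ID on stdout.
          prev_id = subprocess.check_output(cmd).decode().strip()
          job_ids.append(prev_id)
      return job_ids

  if __name__ == "__main__":
      # Hypothetical per-run stages produced by build_workflow.py.
      print(submit_chain(["preawp.pbs", "awp_sgt_x.pbs", "awp_sgt_y.pbs", "postawp.pbs"]))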

  17. Distributed Processing (PP) • Cron job on shock submits post-processing runs • Pegasus 4.3, from Git repository • Condor 7.6.6 • Globus 5.0.3 • Jobs submitted to Stampede (as tera3d) • 8 sub-workflows • Extract SGT jobs as standard jobs (128 cores) • seis_psa jobs as PMC jobs (1024 cores) • Results staged back to shock, DB populated, curves generated
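On the post-processing side, the cron job hands each pending run to Pegasus. A hedged sketch of what that submission step might look like with the Pegasus 4.3 command-line planner is below; the properties file, DAX filename, and site handles are assumptions rather than the study's actual configuration.

  #!/usr/bin/env python
  # Illustrative sketch only: plan and submit one post-processing workflow
  # with pegasus-plan. The properties file, DAX filename, and site handles
  # are assumptions, not the actual Study 13.4 configuration.
  import subprocess

  def plan_and_submit(dax_file, run_id):
      cmd = [
          "pegasus-plan",
          "--conf", "pegasus.properties",   # hypothetical properties file
          "--dax", dax_file,                # abstract workflow for this run
          "--sites", "stampede",            # execution site (runs as tera3d)
          "--output-site", "shock",         # stage results back to shock
          "--relative-dir", run_id,         # keep per-run submit dirs separate
          "--submit",                       # hand the plan to Condor DAGMan
      ]
      subprocess.check_call(cmd)

  if __name__ == "__main__":
      plan_and_submit("CyberShake_PP.dax", "run_0001")   # hypothetical names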

  18. SGT Computational Requirements • SGTs on Blue Waters • Computational time: 8.4 M SUs • RWG: 16k SUs/site x 286 sites = 4.6 M SUs • AWP: 13.5k SUs/site x 286 sites = 3.8 M SUs • 22.35 M SU allocation, 22 M SUs remaining • Storage: 44.7 TB • 160 GB/site x 286 sites = 44.7 TB

  19. PP computational requirements • Post-processing on Stampede • Computational time: • 4000 SUs/site x 286 sites = 1.1 M SUs • 4.1 M SU allocation, 3.9 M remaining • Storage: 44.7 TB input, 13 TB output • 44.7 TB of SGT inputs; will need to rotate out • Seismograms: 46 GB/site x 286 sites = 12.8 TB • PSA files: 0.8 GB/site x 286 sites = 0.2 TB
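The per-site figures on the last two slides multiply out to the quoted totals; a quick check, using only numbers taken from the slides (TB here means GB divided by 1024):

  # Quick check of the SGT and post-processing totals, using the per-site
  # figures quoted on the two slides above.
  sites = 286
  rwg_sus = 16000 * sites                  # 4,576,000  ~ 4.6 M SUs
  awp_sus = 13500 * sites                  # 3,861,000  ~ 3.8 M SUs
  sgt_storage_tb = 160 * sites / 1024.0    # ~ 44.7 TB of SGTs
  pp_sus = 4000 * sites                    # 1,144,000  ~ 1.1 M SUs
  seis_tb = 46 * sites / 1024.0            # ~ 12.8 TB of seismograms
  psa_tb = 0.8 * sites / 1024.0            # ~ 0.2 TB of PSA files
  print(rwg_sus + awp_sus, sgt_storage_tb, pp_sus, seis_tb, psa_tb)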

  20. Computational Analysis • Monitord for post-processing performance • Will run after workflows have completed • May need python scripts for specific CS metrics • Scripts for SGT performance • Cronjob to monitor core usage on BW • Does wrapping BW jobs in kickstart help? • Ideally, same high-level metrics as Studies 1.0 and 2.2
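For the SGT-side metrics, a cron-driven snapshot of queue state is one simple starting point. The sketch below just appends timestamped qstat output for the study user so core usage can be mined later; it is illustrative, not the actual monitoring script, and the log path is a placeholder assumption.

  #!/usr/bin/env python
  # Illustrative sketch of a cron-driven queue snapshot on Blue Waters:
  # append a timestamped qstat listing for the study user so per-job core
  # usage can be extracted later. The log path is a placeholder assumption.
  import subprocess
  import time

  LOG = "bw_core_usage.log"   # hypothetical log location

  def snapshot(user="scottcal"):
      listing = subprocess.check_output(["qstat", "-u", user]).decode()
      with open(LOG, "a") as f:
          f.write("=== %s ===\n%s\n" % (time.strftime("%Y-%m-%d %H:%M:%S"), listing))

  if __name__ == "__main__":
      snapshot()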

  21. Long-term storage • 44.7 TB SGTs: • To be archived to tape (NCSA? TACC? Somewhere else?) • 13 TB Seismograms, PSA data • Have been using SCEC storage - scec-04? • 5.5 TB workflow logs • Can compress after mining for stats • CyberShake database • 1.4 B entries, 330 GB data (scaling issues?)

  22. Estimated Duration • Limiting factors: • Blue Waters queue time • Uncertain how many sites in parallel • Blue Waters → Stampede transfer • 100 MB/sec seems sustainable from tests, but could get much worse • Up to 50 sites/day, though unlikely to reach that rate • Estimated completion by end of June
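For scale, the Blue Waters to Stampede transfer alone is non-trivial at the tested rate; a rough, illustrative estimate:

  # Rough estimate of total SGT transfer time at the tested 100 MB/s rate
  # (illustrative only; the sustained rate could be much lower, as noted above).
  sgt_tb = 44.7
  rate_mb_per_s = 100.0
  seconds = sgt_tb * 1024 * 1024 / rate_mb_per_s   # TB -> MB, then divide by rate
  print(seconds / 86400.0)                         # about 5.4 days of continuous transfer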

  23. Personnel Support • Scientists • Tom Jordan, Kim Olsen, Rob Graves • Technical Lead • Scott Callaghan • Job Submission / Run Monitoring • Scott Callaghan, David Gill, Phil Maechling • Data Management • David Gill • Data Users • Feng Wang, Maren Boese, Jessica Donovan

  24. Risks • Stampede becomes busier • Post-processing still probably shorter than SGTs • CyberShake database unable to handle data • Would need to create other DBs, distributed DB, change technologies • Stampede changes software stack • Last time, necessitated change to MPI library • Can use Kraken as backup PP site while resolving issues • New workflow system on Blue Waters • May contain as-yet-undetected bugs

  25. Thanks for your time!
