Remote HPC Computing

Remote HPC Computing. Mr. Robert Burke. Relevant FNMOC Projects. Enterprise Operational Modeling (EOM) Enable FNMOC exploitation of enterprise-wide HPC assets Run models remotely at the Navy DSRC Distribute data directly to customers from Navy DSRC


Presentation Transcript


  1. Remote HPC Computing Mr. Robert Burke

  2. Relevant FNMOC Projects
     • Enterprise Operational Modeling (EOM)
       • Enable FNMOC exploitation of enterprise-wide HPC assets
       • Run models remotely at the Navy DSRC
       • Distribute data directly to customers from Navy DSRC
     • Fully Coupled COAMPS-OS Modeling Capability Initiative
     • Atmospheric Model Bridge Strategy
       • Interim solution until Earth System Prediction Capability (ESPC)
       • Anticipated by 2015
       • Needed until at least 2020 for ESPC implementation

  3. NOGAPS to ESPC Baseline and Assumptions
     • NOGAPS will be replaced with the Navy Global Environmental Model (NAVGEM) in 2011
       • New semi-Lagrangian dynamic core and new physics
       • Resolution upgrades will continue if computational resources allow
       • New data upgrades will continue if available and supported
       • NUOPC ensemble and common standards lead to national system
     • ESPC (or other next generation system) is targeted for operational implementation by 2020-2025
       • Anticipate a national modeling capability with Navy as contributor
       • Development and schedule of ESPC are uncertain
     • Bridge strategy required for Navy global NWP between 2013 and 2020
       • Based on NAVGEM data assimilation cycle run at FNMOC, with extended forecasts run at DSRC
       • Goal is to maintain Navy competence while investing in ESPC
       • Computational, manpower, and R&D resources will constrain COAs

  4. HPC Requirements Implied by Models Roadmap

  5. HPC Requirements Implied by Models Roadmap

  6. EOM Project Plans
     • FY11 EOM Plans
       • Operationalize COAMPS-OS for NAVO regions at the Navy DSRC
         • Data Management and Transfer
         • Job Management and Control
         • Information Assurance (IA)
         • Documentation – Processes, Approvals, SOPs
       • Demonstrate NOGAPS Ensemble at the Navy DSRC
     • FY12 and beyond EOM Plans
       • Optimize Operationalization among FNMOC, NAVO, and Navy DSRC
         • Data Management and Transfer
         • Job Management and Control
         • Information Assurance (IA)
         • Configuration Management
       • Operationalize other Compute Intensive Models at Navy DSRC
         • NAVGEM
         • Global ensemble
         • COAMPS-OS ensemble

  7. EOM Data Plan
     • COAMPS-OS Operational Data Transfer Alternatives
       • Best solution: data transfer mechanism via ticketless, kerberized remote copy
         • Best data transfer performance
         • Can be completely automated with any scheduling mechanism
         • Bi-directional data transfer; either system can push or pull data
         • Requires one or more (scalable) A2 Emerald gateway nodes to be provisioned and kerberized
         • Navy ODAA (NAVNETWARCOM) waiver needed to address IA issues
       • Interim solution: data sources via CAGIPS and BFT
         • CAGIPS for all supported data types (currently NOGAPS initial and boundary conditions)
         • BFT for all data types
         • Backup data source: GODAE for NAVDAS atmospheric observations and NCODA ocean observations
       • Interim solution: data transfer back to FNMOC via DMZ
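
The interim data-source bullets on this slide can be read as a fallback chain. The Python sketch below is only an illustration of that reading, not software described in the presentation: the data-type labels, the `available` set, and the function name are hypothetical, and trying CAGIPS before BFT before GODAE is an assumed ordering that the slide does not state.

```python
from typing import Optional, Set

# Hypothetical labels for illustration; not identifiers from any FNMOC system.
CAGIPS_TYPES = {"nogaps_initial_conditions", "nogaps_boundary_conditions"}
GODAE_TYPES = {"navdas_atmospheric_obs", "ncoda_ocean_obs"}


def choose_source(data_type: str, available: Set[str]) -> Optional[str]:
    """Return the first reachable source for data_type, or None if none is reachable."""
    candidates = []
    if data_type in CAGIPS_TYPES:
        candidates.append("CAGIPS")  # interim source, supported data types only
    candidates.append("BFT")         # interim source, all data types
    if data_type in GODAE_TYPES:
        candidates.append("GODAE")   # backup source for observation data
    # Assumed ordering: walk the candidates in the order they were added.
    for source in candidates:
        if source in available:
            return source
    return None


if __name__ == "__main__":
    print(choose_source("nogaps_boundary_conditions", {"CAGIPS", "BFT"}))
    print(choose_source("ncoda_ocean_obs", {"GODAE"}))
```

Keeping the rule in one small function means the ordering could be changed, or the preferred kerberized remote copy added as a first candidate, without touching the callers.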

  8. EOM Data Transfer

  9. EOM Data Transfer

  10. EOM Job Management
     • COAMPS-OS Job Management Situation
       • NAVO runs COAMPS-dependent ocean models once daily
       • Model run mechanisms and paradigms
         • NAVO runs are time dependent, automated via script, and run generally without intervention
         • FNMOC runs are event dependent, tightly controlled and monitored
     • COAMPS-OS Operational Job Management Alternatives
       • PBS Pro remote execution without Supervisor Monitor Scheduler (SMS)
         • PBS Pro unkerberized already in use at both FNMOC and Navy DSRC
         • Longer term plans for EOM should minimize software dependency
         • Alternate control mechanisms with greater operator activity for initiating and controlling run are possible and necessary
       • Rapid Ocean Assessment Model Environment Relocatable (ROAMER) System
         • Script-based job monitoring system
         • Could be tailored and extended for FNMOC usage
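
The contrast between time-dependent (NAVO) and event-dependent (FNMOC) runs can be sketched as two kinds of trigger on a job. This is a minimal Python illustration under assumed names: the `Job` class, the `is_ready` function, and the event label are hypothetical and do not represent PBS Pro, SMS, or ROAMER interfaces.

```python
from dataclasses import dataclass, field
from datetime import time
from typing import Optional, Set


@dataclass
class Job:
    name: str
    # Time-dependent trigger: earliest clock time at which the job may start.
    start_at: Optional[time] = None
    # Event-dependent trigger: events that must all have occurred first.
    required_events: Set[str] = field(default_factory=set)


def is_ready(job: Job, now: time, seen_events: Set[str]) -> bool:
    """A job is ready once its start time has passed and all its events were seen."""
    time_ok = job.start_at is None or now >= job.start_at
    events_ok = job.required_events <= seen_events
    return time_ok and events_ok


if __name__ == "__main__":
    # Hypothetical examples of the two paradigms.
    daily_run = Job("time_dependent_run", start_at=time(6, 0))
    gated_run = Job("event_dependent_run",
                    required_events={"boundary_conditions_arrived"})

    print(is_ready(daily_run, time(6, 30), set()))
    print(is_ready(gated_run, time(6, 30), set()))
    print(is_ready(gated_run, time(6, 30), {"boundary_conditions_arrived"}))
```

A remote run that must satisfy both paradigms would simply set both fields, which is one way to picture why the slide calls for alternate control mechanisms with more operator involvement.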

  11. EOM IA
     • COAMPS-OS IA Situation
       • EOM Framework uses three types of connectivity
         • FNMOC and Navy DSRC connectivity (logon, data transfer, run models)
         • Data transfer from FNMOC to NAVO and NAVO to Navy DSRC
         • FNMOC Job Initiation, Control and Monitoring of DSRC model runs
       • FNMOC and Navy DSRC Maintain Different Security Postures
         • FNMOC part of operational community requiring full C&A with necessary demilitarized zones (DMZ), firewall, and border routers
         • Navy DSRC an R&D HPC center bound by HPCMP and DOD IA policies for R&D systems – currently no DMZ or firewall
         • Navy DSRC does have NAVNETWARCOM ATO with residual risk rating of Low
     • EOM IA Special Requirements
       • Most Ports, Protocols and Services (PPS) required for connection of FNMOC workstations and FNMOC operational cluster to Navy DSRC are Navy network policy compliant
       • Data transfer and job management between MAC II (FNMOC) and MAC III (DSRC) systems

  12. EOM IA Issues Explored
     • DoD IA Mission Assurance Categories
       • Mission Assurance Category I (MAC I)
         • Systems handling information vital to mission effectiveness of deployed or contingency forces in terms of both content and timeliness
         • Require most stringent protection measures
         • Not applicable to FNMOC
       • Mission Assurance Category II (MAC II)
         • Systems handling information important to the support of deployed or contingency forces
         • Consequences of loss of integrity are unacceptable
         • Loss of availability can only be tolerated for a short time
         • Require safeguards beyond best practices to ensure adequate assurance
         • FNMOC operational systems
       • Mission Assurance Category III (MAC III)
         • Systems handling information that is necessary for the conduct of day-to-day business but does not materially affect support to deployed or contingency forces in the short term
         • Consequences of loss of integrity could include delay or degradation of services or commodities enabling routine activities
         • Navy DSRC

  13. EOM Considerations with the DSRC
     • EOM IA Strategy
       • Minimize software and PPS used exclusively for EOM
       • Trade off EOM functionality and ease-of-use to gain IA, maintainability, and mobility
       • Obtain ODAA approval for preferred data transfer alternatives
     • DSRC Technology
       • Hardware technology refresh cycle
         • DSRC typically three years
         • FNMOC 5-7 years
       • Software availability
         • DSRC a compute-engine
         • FNMOC requirements
           • Job management
           • Process control
           • Configuration management

  14. Summary
     • Leveraging remote HPC assets is part of a long-range strategy to deliver capability in a budget-constrained world
     • DoD Information Assurance requirements present unique challenges
     • By carefully choosing which jobs can run remotely, "cloud-like" computing is possible
