Enhancing Forecasting Systems: Reliability, Quality, and Effective Change Management

Forecast Revision Goals
• Improve Reliability, Fault Tolerance, Recovery
• Measure and Improve Quality
• Change Management, Configuration Management, Standards, Documentation
• Performance
• Flexibility
• System Monitoring, Maintenance
• Facilitate Collaboration with CORIE Researchers

Towards Reliable Forecasts
• Forecast monitoring team
  - Arun, Ethan, Paul
  - Science, systems, software
  - Team members cross-train in specialty
  - Oncall rotation
  - Monitoring and Alerting
  - Big Brother
  - Oversee Change Management

Managing Change
• Change and Configuration Management
  - Development, production environments
  - Deploy products from development to production
  - Version control using CVS
• Standards
  - Perl, C coding standards
  - CORIE.pm
  - libelio.a
• Documentation

Oncall
• 24/7
• Weekly rotation
• Respond to alerts received via e-mail and pager, and resolve problems, whatever it takes
• Oncall procedures page

Monitoring and Alerting
• In control and processing scripts
  - Problems with model forcings
  - Run fails to complete
  - Processing problems
• Big Brother
  - Monitors network connectivity, ping
  - Network protocols, e.g. HTTP, SSH
  - Disk, CPU
  - Specific processes, e.g. master_process.pl
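The disk and process checks listed above can be illustrated with a minimal sketch. This is not Big Brother configuration and not the team's code; the function names, the `Alert` type, and the 90% threshold are assumptions made for illustration only.

```python
import shutil
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Alert:
    check: str    # which check raised the alert, e.g. "disk"
    message: str  # human-readable description for e-mail or pager


def check_disk(path: str, max_used_fraction: float = 0.90, usage=None) -> Optional[Alert]:
    """Alert when the filesystem holding `path` is fuller than the threshold.

    `usage` may be supplied (anything with .total and .used) so the check
    can be exercised without touching a real disk; the 0.90 default is an
    assumed threshold, not a value from the slides.
    """
    if usage is None:
        usage = shutil.disk_usage(path)
    used_fraction = usage.used / usage.total
    if used_fraction > max_used_fraction:
        return Alert("disk", f"{path} is {used_fraction:.0%} full")
    return None


def check_process(name: str, running_commands: Iterable[str]) -> Optional[Alert]:
    """Alert when no running command line mentions `name`.

    `running_commands` is a list of command-line strings, however the
    caller chooses to collect them (for example from `ps` output).
    """
    if not any(name in command for command in running_commands):
        return Alert("process", f"{name} is not running")
    return None
```

For example, `check_process("master_process.pl", commands)` returns `None` while the daemon is visible in `commands` and an `Alert` otherwise.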

Measure and Improve Quality
• Error analysis
• 3 and 7 day error analysis (model data comparisons using database)
• Summarized values (averaged over all stations) to quantify forecast skill
• Comparisons with external forcings (river, wind (TBD))
• Comparisons (TBD)
  - Between forecasts
  - With near term hindcast
  - With field exercises
• Comparisons with verified data
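A minimal sketch of a per-station error summarized over all stations follows. The slides do not name the error metric, so root-mean-square error (RMSE) is an assumption here, as are the function names and the input layout (paired model and observed values per station).

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def rmse(model: Sequence[Optional[float]],
         observed: Sequence[Optional[float]]) -> Optional[float]:
    """Root-mean-square error over time steps where both values exist."""
    pairs = [(m, o) for m, o in zip(model, observed)
             if m is not None and o is not None]
    if not pairs:
        return None  # no overlapping data for this station
    return math.sqrt(sum((m - o) ** 2 for m, o in pairs) / len(pairs))


def summarize(per_station: Dict[str, Tuple[List[float], List[float]]]):
    """Return (RMSE per station, mean RMSE over stations that have data)."""
    scores = {name: rmse(model, observed)
              for name, (model, observed) in per_station.items()}
    valid = [score for score in scores.values() if score is not None]
    mean_score = sum(valid) / len(valid) if valid else None
    return scores, mean_score
```

Stations with no overlapping model and observed values are reported as `None` and left out of the average, so a telemetry gap at one station does not distort the summary.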

Forecast Systems and Data Flow

Databases
• PostgreSQL
• Amb105 – production DB server
• Amb104 – backup production DB server
• Amb36 – development DB server
• Ease of access via Perl DBI
• Automatic archiving of external data
• Telemetry (parallel with process on amb24)
• Verified data (TBD)
• Performance issues

Forecasts
• Reference (AKA Production)
• Experimental
• Development
• Near term hindcast

Reference Forecast
• Runs every day
• Controlled, infrequent changes
• Failure rate minimal, most stable forecast
• Atmospheric forcings from eta+osu
• Hosted on amb1018

Experimental Forecast
• Runs like production mode
• Changes are managed but allowed more frequently than for the reference forecast
• Failure rate can be higher
• Failed forecasts need to be updated
• Atmospheric forcings from eta only
• Hosted on amb1017

Development Forecast
• Does not run in production mode
• Minimal results stored (3 days)
• Test changes to be incorporated in ref/exp forecasts, e.g. model forcings
• Development environment for new products and scripts
• Hosted on amb1019

Hindcasts
• Runs once a week for past week
• Parameter files based on previously set database (currently database06)
• Runs based on week number
• River forcings from relational database
• Atmospheric forcings from locally stored NOAA archive
• Hosted on amb1020
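Selecting the past week by week number could look like the sketch below. The slides do not say how weeks are numbered, so ISO 8601 weeks (Monday to Sunday) are used purely as an assumption, and the function name is hypothetical.

```python
from datetime import date, timedelta


def previous_week(run_date: date):
    """Return (year, week number, first day, last day) of the week before run_date.

    Assumes ISO 8601 week numbering; the actual hindcast convention
    may differ.
    """
    this_monday = run_date - timedelta(days=run_date.weekday())
    start = this_monday - timedelta(days=7)   # Monday of the previous week
    end = start + timedelta(days=6)           # Sunday of the previous week
    iso_year, iso_week, _ = start.isocalendar()
    return iso_year, iso_week, start, end
```

A weekly job would call `previous_week(date.today())` and use the returned dates to select river and atmospheric forcings for the run.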

Forecast Forcings
• River forcings, amb1020, daily:
  - 7:45, 10:45, 13:45, 16:45 getforcings.pl (to DB)
• Atmospheric forcings, amb103, daily:
  - 00:05 get_eta.csh (to NFS)
  - 00:10 get_gfs_air.csh
• Atmospheric forcings, amb104, daily:
  - 02:00 get_avn.csh (to NFS)
  - 04:00 get_mrf.csh
  - 09:30 get_osu.csh

Forecast execution
• On each forecast system, daily:
  - 00:10 simlink.pl on local directory
  - 00:10 simlink.pl on NFS directory
  - 09:00 do_error_analysis.pl (processing)
  - 11:00 place_hdf_files_new.csh
  - 11:25 prep.pl
  - 11:35 checkinputs.pl
  - 12:00 start.pl

Forecast processing
• Master process, runs continuously as a daemon. Executes on local disk, looping over:
  - do_isolines.pl
  - do_ll_isolines.pl
  - do_transects.pl
  - do_hab_isolines.pl
  - do_plumevol.pl
  - do_intrusionlength.pl
  - extract_station_ADP.pl (from DB)
  - extract_station_CTD.pl (from DB)
  - do_stationextraction.pl
  - do_stationplots.pl
  - rsync to NFS
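The loop described above can be sketched as follows. This is not master_process.pl itself: invoking each script with `perl`, continuing after a failed step, and the 60-second pause between passes are all assumptions, and the final rsync to NFS is omitted.

```python
import subprocess
import time
from typing import Callable, List, Sequence

# Script names from the slide, in the order listed (rsync step omitted).
PROCESSING_STEPS = [
    "do_isolines.pl",
    "do_ll_isolines.pl",
    "do_transects.pl",
    "do_hab_isolines.pl",
    "do_plumevol.pl",
    "do_intrusionlength.pl",
    "extract_station_ADP.pl",
    "extract_station_CTD.pl",
    "do_stationextraction.pl",
    "do_stationplots.pl",
]


def run_step(script: str) -> bool:
    """Run one script with perl; True means exit status 0 (assumed invocation)."""
    result = subprocess.run(["perl", script], capture_output=True, text=True)
    return result.returncode == 0


def run_pass(steps: Sequence[str],
             runner: Callable[[str], bool] = run_step) -> List[str]:
    """Run every step once, in order, and return the names of failed steps."""
    failures = []
    for script in steps:
        try:
            ok = runner(script)
        except OSError:
            ok = False  # e.g. the interpreter could not be started
        if not ok:
            failures.append(script)  # keep going; later steps may still succeed
    return failures


def run_forever(pause_seconds: float = 60.0) -> None:
    """Daemon-style loop: repeat passes, reporting failures after each one."""
    while True:
        failed = run_pass(PROCESSING_STEPS)
        if failed:
            print("processing steps failed:", ", ".join(failed))
        time.sleep(pause_seconds)
```

Returning the list of failed steps from each pass gives the monitoring code a single place to hook in alerts for processing problems.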

Hindcast Processing
• Uses same scripts as forecasts
• Remove differences between hindcast and forecast processing (2 vs 7 days)
• Some plot parameter file differences

Develop and Deploy
• Check out module from CVS
• Modify or add code in a local copy
• CVS commit
• Deploy to development environment
• Deploy to experimental environment
• Deploy to reference environment
• Development web page

Going Forward
• Improve monitoring in processing codes
• Failover for forcings, climatology
• Revise relational databases (per Bill H.)
• Tune Big Brother (BB) thresholds and start paging
• Review current products
• Document procedures and products
• Migrate to new grid, quadrangles
• Forecast/forecast and forecast/hindcast comparisons using verified data
• Comparisons with external forcings
