ENSEMBLE FORECASTING – A NEW PARADIGM IN NUMERICAL WEATHER PREDICTION

Zoltan Toth Global Systems Division, ESRL/OAR/NOAA Acknowledgements: Isidora Jankov, Malaquias Pena, Yuanfu Xie, Paula McCaslin, Paul Schultz, Linda Wharton, Roman Krzysztofowicz, Yuejian Zhu, Andre Methot, Tom Hamill, Kathy Gilbert, et al.

INTRODUCTION • Share experience / perspective • 19 yrs at NCEP/EMC • Operational needs • Over a year at NOAA research lab (GSD) • Research opportunities • Review host of NWP research / development issues pursued at NOAA • Some closer, others further from operational use • Highlight collaborative opportunities • Research thrives on exchange of ideas

OUTLINE / SUMMARY • Why ensembles? • Knowledge about uncertainty • Complete the forecast with probabilistic information • User needs for covariance / scenarios • Capturing uncertainty in initial conditions • Focus on dynamical consistency in perturbations • Derive estimate of error variance in best analysis • New NWP modeling paradigm • Focus on ensemble (not single value) forecasting • Stochastically represent effect of unresolved scales • Choice of DA scheme • Variational or sequential? • Both can use ensemble-based covariances • Challenges in data assimilation • Careful attention to many details related to • Forward operators, control variables, DA/forecast cycle, moist constraints, dynamical consistency (initialization of forecast with imperfect models), etc • Ensemble-based covariances – “hybrid” method?

DATA ASSIMILATION / ENSEMBLE FORECASTING PREAMBLE • Objective • Probabilistic / ensemble estimate of current / future state of system • Ensemble of gridded multivariate 3D fields • Approach • Estimate initial state – Data assimilation • Represent via ensemble • Project initial ensemble into future – Ensemble forecasting • Capture model related uncertainty

NUMERICAL WEATHER PREDICTION (NWP) BASICS COMPONENTS OF NWP • Create initial condition reflecting state of the atmosphere, land, ocean • Create numerical model of atmosphere, land, ocean ANALYSIS OF ERRORS • Errors present in both initial conditions and numerical models • Coupled atmosphere / land / ocean dynamical system is chaotic • Any error amplifies exponentially until nonlinearly saturated • Error behavior is complex & depends on • Nature of instabilities • Nonlinear saturation IMPACT ON USERS • Analysis / forecast errors negatively impact users • Impact is user specific (user cost / loss situation) • Information on expected forecast errors needed for rational decision making • Spatial/temporal/cross-variable error covariance needed for many real life applications • How can we provide information on expected forecast errors?

WHAT INFORMATION USERS NEED • General characteristics of forecast users • Each user affected in specific way by • Various weather elements at • Different points in time & • Space • Requirements for optimal decision making for weather sensitive operation • Probability distributions for single variables • Lack of information on cross-correlations • Covariances needed across • Forecast variables, space, and time • Format of weather forecasts • Joint probability distributions • Provision of all joint distributions possibly needed by users is intractable • Encapsulate best forecast info into calibrated ensemble members • Possible weather scenarios • 6-Dimensional Data-Cube (6DDC) • 3 dimensions for space, 1 each for time, variable, and ensemble members • Provision of weather information • Ensemble members for sophisticated users • Other types of format derived from ensemble data • All forecast information fully consistent with calibrated ensemble data

HOW CAN WE REDUCE & ESTIMATE EXPECTED FORECAST ERRORS? STATISTICAL APPROACH • Statistically assess errors in past unperturbed forecasts (e.g., GFS, RUC) • Can correct for systematic errors in expected value • Can create probabilistic forecast information – e.g., MOS PoP • Limitation • Case dependent variations in skill not captured • Error covariance information practically not attainable DYNAMICAL APPROACH – Ensemble forecasting • Sample initial & model error space - Monte Carlo approach • Leverage DTC Ensemble Testbed (DET) efforts • Prepare multiple analyses / forecasts • Case dependent error estimates • Error covariance estimates • Limitation • Ensemble formation imperfect – not all initial / model errors represented DYNAMICAL-STATISTICAL APPROACH • Statistically post-process ensemble forecasts • Best of both worlds • How can we do that?

AVIATION EXAMPLE • Recovery of a carrier from weather related disruptions • Operational decisions depend on multitude of factors • Based on United / Hemispheres March 2009 article, p. 11-12 • Factors affecting operations • Weather – multiple parameters • Over large region / CONUS during coming few days • Federal regulations / aircraft limitations • Dispatchers / load planners • Aircraft availability • Scheduling / flight planning • Maintenance • Pre-location of spare parts & other assets where needed • Reservations • Rebooking of passengers • Customer service • Compensation of severely affected customers • How to design economically most viable operations? • Given goals / requirements / metrics / constraints

SELECTION OF OPTIMAL USER PROCEDURES • Generate ensemble weather scenarios e_i, i = 1, …, n • Assuming the weather is e_i, define the optimal operation procedure o_i • Assess cost/loss c_ij of using o_i under each weather scenario e_j • Select as optimum operation the o_i with minimum expected (mean) cost/loss c_i over e_1, …, e_n [Figure labels: EXPECTED COST, OPERATION PROCEDURES]
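The selection steps above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular user: the cost matrix values are made up, the function name `select_optimal_procedure` is hypothetical, and the ensemble members are assumed to be equally likely so that the expected cost is a plain mean.

```python
def select_optimal_procedure(cost):
    """Pick the procedure with the lowest mean cost over all scenarios.

    cost[i][j] is the cost/loss c_ij incurred when procedure o_i
    (designed for scenario e_i) is used and scenario e_j verifies.
    Assumption: all ensemble scenarios are equally likely.
    """
    expected = [sum(row) / len(row) for row in cost]
    best = min(range(len(cost)), key=lambda i: expected[i])
    return best, expected


# Hypothetical cost/loss matrix for n = 3 scenarios (arbitrary units).
cost = [
    [0.0, 8.0, 20.0],  # o_1: cheap if e_1 verifies, costly otherwise
    [3.0, 2.0, 12.0],  # o_2: moderate hedge
    [6.0, 6.0, 6.0],   # o_3: fully protected, same cost in every scenario
]
best, expected = select_optimal_procedure(cost)
print("optimal procedure index:", best)
print("expected costs:", expected)
```

With calibrated but unequal member weights, the mean would be replaced by a weighted mean; the selection logic stays the same.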

DATA ASSIMILATION BASICS • Two distinct objectives • Reproduce reality as faithfully as possible • May hinder NWP forecast application • Create initial condition leading to best NWP forecast • Initial state must be consistent with model dynamics • Technique • Probabilistic / ensemble estimate of current / future state of system • Ensemble of gridded multivariate 3D fields • Bayesian combination of “prior” & observations • Must have error estimates for both prior & observations • Basic functionalities (steps) needed • Relate observations to model variables • Forward operators • Combine information from various observing systems into “superob” • For each model variable • Accurate error estimates needed • Spread effect of observations across time/space/variables • Use dynamical constraints, ensemble-based covariances, etc • Combine prior and observationally based analysis • Use error estimates
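The Bayesian combination of prior and observation can be illustrated for a single variable. The sketch below assumes Gaussian, mutually uncorrelated errors and an observation of the model variable itself (identity forward operator); the function name `combine` and the numbers are illustrative only.

```python
def combine(prior, prior_var, obs, obs_var):
    """Scalar Bayesian (minimum-variance) combination of prior and observation.

    Assumptions: Gaussian errors, prior and observation errors uncorrelated,
    and the observation measures the model variable directly.
    """
    gain = prior_var / (prior_var + obs_var)   # weight given to the observation
    analysis = prior + gain * (obs - prior)    # analysis = prior + weighted innovation
    analysis_var = (1.0 - gain) * prior_var    # analysis error variance
    return analysis, analysis_var


# Hypothetical example: background 280 K (variance 4), observation 282 K (variance 1).
analysis, analysis_var = combine(prior=280.0, prior_var=4.0, obs=282.0, obs_var=1.0)
print(analysis, analysis_var)
```

The example shows why error estimates are needed for both inputs: the weights, and hence the analysis, depend entirely on the two variances.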

CHOICE OF DA SCHEME • Criteria • Actual or expected quality of performance • Results / expectations • Some indications that 4DVAR offers higher quality • Dynamically constrained increments not restricted to ensemble space • 4DVAR with ensemble-based covariance superior • Buehner et al – “hybrid” scheme • How to estimate error variance in 4DVAR analysis? Two approaches • Run ensemble-based DA • Very expensive • Errors estimated are those of the ensemble-based DA, not of 4DVAR • Estimates affected by DA methods/assumptions • Alternative approach • Based on basic assumptions independent of DA schemes • Under development / testing

ESTIMATING & REPRESENTING INITIAL CONDITION RELATED UNCERTAINTY • Objective • Make initial perturbations consistent with uncertainty in analysis • Two approaches available • Variational DA • Estimate uncertainty in 3/4DVAR analysis • Initialize ensemble with estimated analysis error variance • Ensemble-based DA • Use one of several ensemble-based DA schemes

Estimating analysis error variances for ensemble initialization
Malaquías Peña1, Isidora Jankov2 and Zoltan Toth3
1IMSG at EMC/NCEP/NOAA, 2CIRA at GSD, 3GSD/ESRL/NOAA. NOAA Earth System Research Laboratory

1. Introduction
Accurate estimates of analysis error variances are critical for the proper initialization of ensembles. This variance is the initial uncertainty that the ensemble perturbations try to mimic. Because of large computational costs, not all DA schemes explicitly compute analysis error variances (e.g., GSI does not). Furthermore, estimates of analysis errors derived via DA schemes are influenced by the assumptions used to create the analysis fields, resulting in a scheme-dependent analysis error. For example, in regions where observations are scarce or the DA scheme gives low weights to them, the analysis errors will be highly correlated with the first guess errors, making it difficult to estimate the true errors (Simmons and Hollingsworth, 2002). A methodology for the estimation of analysis and forecast errors is introduced here. The method is based on a few simple assumptions that are independent of any data assimilation method and provides error estimates with a range of uncertainty.

2. Concept
The perceived error (forecast minus analysis at the verifying time) variance is decomposed into the true analysis error variance and the true forecast error variance. Because F - A = (F - T) - (A - T), the decomposition reads

$$d^2 = \overline{(F-A)^2} = \overline{(F-T)^2} + \overline{(A-T)^2} - 2\rho\,\sqrt{\overline{(F-T)^2}}\,\sqrt{\overline{(A-T)^2}}$$

where d is the perceived root mean square error, F is the forecast, A is the analysis, T is the true state, ρ is the correlation between true forecast error and true analysis error, and the overbar denotes an average over the verification sample. Defining the true analysis error variance $f_0^2 \equiv \overline{(A-T)^2}$ and the true forecast error variance $f_{lead}^2 \equiv \overline{(F-T)^2}$, the perceived forecast error variances measured at each lead time can be estimated via the following set of equations, in which the subscript m denotes the m-th lead time:

$$d_m^2 = f_m^2 + f_0^2 - 2\rho_m f_m f_0, \qquad m = 1, \dots, M \qquad (1)$$

The goal is to estimate the true analysis and forecast error variances consistent with the perceived error variance observed at each lead time. This is expressed as the minimization of a cost function J that measures the weighted misfit between the observed perceived error variance and the estimated one, given by (1). The weighting function w_i ensures that the fit is best at the initial time, where we have most confidence in the measurements.

3. Assumptions
The following assumptions are used to simplify the problem:
1. Small initial errors grow exponentially and saturate following a logistic function. Therefore, the evolution of errors can be parameterized with a minimal number of parameters that can be obtained via observations of perceived errors. Departures from this evolution of errors will be attributed to model errors, which will be modeled with a continuous function.
2. At short lead times, errors are local. That is, we ignore advection of errors from neighboring gridpoints. At long lead times, errors at any gridpoint result from the influence of all surrounding gridpoints.
3. The correlation between analysis errors and forecast errors decreases on each analysis cycle at a power rate:

$$\rho_m = (\rho_1)^m, \qquad m = 2, \dots, M \qquad (2)$$

where ρ_1 is the correlation at 6h lead time, ρ_2 = (ρ_1)^2 is the correlation at 12h lead time, ρ_3 = ρ_1 ρ_2 is the correlation at 18h, etc. Only one parameter (ρ_1) needs to be determined.

4. Application to Ensemble Data Assimilation Systems
GSI-ENKF HYBRID EDAS: A schematic showing an Ensemble Data Assimilation scheme where two assimilation schemes, one flow-dependent and the other static, are run and combined to produce a hybrid analysis. In this scheme, the analysis error estimation (derived from the EnKF scheme) is fed into the ensemble generation scheme but does not reflect the analysis error from the GSI.
PROPOSED HYBRID EDAS: A schematic showing an EDAS scheme where the ensemble (based on ET) produces a flow-dependent error covariance matrix that is combined with the static covariance matrix generated by a variational DA (GSI) and, in turn, the analysis error variance is used to initialize the ensemble forecast scheme.
The method introduced in this paper allows a scheme-independent estimation of error covariances to initialize the ensemble.

5. Tests of the analysis error estimation method
SIMULATED FORECASTS WITH THE LORENZ 40 VARS: We apply the method to estimate analysis and forecast errors in the 3-variable Lorenz model (Lorenz 1963) under a perfect model scenario. Synthetic data are generated from the control run (nature) plus a random value. A 3DVar scheme is used to assimilate the data.
[Figure] Top panel. In blue: true error variance; in red: perceived error variance; in green: the modeled perceived error variance. Bottom panel. In black: anomaly correlation of analysis errors and forecast errors at different leads as generated from the model; in magenta: correlation using (2).
GFS 500hPa total energy error at two gridpoints
[Figure] Point in the extratropics. Left: estimate of true forecast error variance (f, blue) and the analysis error (f0) as estimated by the method, and the fitting curve (dhat) on the perceived error (d). Right panels: the diagnosed correlation and the fitting error. Note that the optimization procedure (the simplex method; Lagarias et al., 1998) produces a very good fit with the observed perceived errors.
[Figure] Point in the tropics. Same as above but for a point in the tropics. Compare the true forecast error variances on left panels and note that the true errors are underestimated in the tropics. Right panels: compare the correlation function and note that analysis and forecasts are much more correlated in the tropics than in the extratropics.

References
Lagarias, J. C., J. A. Reeds, M. H. Wright and P. E. Wright, 1998: Convergence properties of the Nelder-Mead simplex method in low dimensions. SIAM J. Optim., 9, 112-147.
Lorenz, E. N., 1963: Deterministic nonperiodic flow. J. Atmos. Sci., 20, 130-141.
Simmons, A. J. and A. Hollingsworth, 2002: Some aspects of the improvement in skill of numerical weather prediction. Quart. J. Roy. Meteor. Soc., 128, 647-677.
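The forward model behind the poster's fit can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the logistic form and its parameter names (`growth_rate`, `saturation`) are one common way to write assumption 1, lead time is counted in 6-h cycles, and the perceived variance is obtained from the identity d² = f² + f0² - 2ρ f f0 with the power-law correlation of assumption 3. The poster fits such parameters by minimizing a weighted misfit with the Nelder-Mead simplex method; only the forward evaluation is shown here.

```python
import math


def true_error_variance(f0_sq, growth_rate, saturation, lead):
    """Logistic growth of error variance from f0_sq toward the saturation value.

    Assumed parameterization: exponential growth at rate `growth_rate`
    for small errors, levelling off at `saturation`.
    """
    return saturation / (1.0 + (saturation / f0_sq - 1.0) * math.exp(-growth_rate * lead))


def perceived_error_variance(f0_sq, growth_rate, saturation, rho1, n_leads):
    """Modeled perceived (forecast minus analysis) error variance per lead time."""
    f0 = math.sqrt(f0_sq)
    modeled = []
    for m in range(1, n_leads + 1):
        fm_sq = true_error_variance(f0_sq, growth_rate, saturation, m)
        rho_m = rho1 ** m  # correlation decays as a power of the 6-h value
        modeled.append(fm_sq + f0_sq - 2.0 * rho_m * math.sqrt(fm_sq) * f0)
    return modeled


# Hypothetical parameter values, in arbitrary variance units.
d_hat = perceived_error_variance(f0_sq=1.0, growth_rate=0.3, saturation=100.0,
                                 rho1=0.8, n_leads=8)
print(d_hat)
```

Note that at short leads the perceived variance is smaller than the sum of the two true variances because analysis and forecast errors are still correlated, which is why the true analysis error cannot be read directly off forecast-minus-analysis statistics.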


ENSEMBLE INITIALIZATION • Objectives • Perturbation variance reflect analysis error variance • Covariance reflect error dynamics • Minimize noise in ensemble • Current approaches – Gaussian, linear • Ensemble DA (EnKF, ETKF, EnSRF, etc) • Need to control filter divergence, spurious correlations • Inflate variance, localize covariances • Noise added in process • Ensemble Transform method • Dynamical consistency • Need good error variance estimate • Ad hoc localization of variance (Rescaling)

CONTROLLING NOISE IN ENSEMBLE DATA ASSIMILATION Malaquias Peña1 and Zoltan Toth Environmental Modeling Center NCEP/NWS/NOAA 1SAIC at EMC/NCEP/NOAA Acknowledgements: Mozheng Wei, Takemasa Miyoshi, & Roman Krzysztofowicz Pena & Toth, NPG

Ensemble-based DA • DYNAMICS • Model projects initial state into the future • Analysis ensemble (Xa): initial conditions and error covariance (Pa) • Forecast ensemble (Xb) and error covariance (Pb = B) • STATISTICS • Define forecast error covariance B • Adjust B for DA applications • Merge Xb with observations (y), which carry their own error (co)variance • Negative impact of finite ensemble size • Sampling errors make B and Pa noisy, leading to a) spurious long-distance correlations b) filter divergence • Ad hoc solutions (NOISE) • Localization (e.g., Schur product) • Inflation (multiplicative noise) • Additive noise
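A toy illustration of the sampling problem and of two of the ad hoc remedies listed above. The sketch assumes a one-dimensional grid where distance is the index difference, and uses a Gaussian taper for the Schur (element-wise) product; operational systems typically use a compactly supported taper instead. All function names and numbers are illustrative.

```python
import math


def sample_covariance(members):
    """Sample covariance B from a list of ensemble state vectors."""
    n = len(members)
    dim = len(members[0])
    mean = [sum(m[k] for m in members) / n for k in range(dim)]
    pert = [[m[k] - mean[k] for k in range(dim)] for m in members]
    return [[sum(p[a] * p[b] for p in pert) / (n - 1) for b in range(dim)]
            for a in range(dim)]


def localize(B, length_scale):
    """Schur product of B with a distance-dependent taper (Gaussian, assumed)."""
    dim = len(B)
    return [[B[a][b] * math.exp(-0.5 * ((a - b) / length_scale) ** 2)
             for b in range(dim)] for a in range(dim)]


def inflate(B, factor):
    """Multiplicative inflation: scale all (co)variances by `factor`."""
    return [[factor * x for x in row] for row in B]


# Three members on a 4-point grid: far too few to sample B reliably.
members = [[1.0, 0.0, 2.0, 1.0],
           [3.0, 1.0, 0.0, 3.0],
           [2.0, 2.0, 1.0, 2.0]]
B = sample_covariance(members)
B_loc = localize(B, length_scale=1.0)
B_inf = inflate(B, factor=1.2)
print("raw covariance between points 0 and 3:", B[0][3])
print("localized covariance between points 0 and 3:", B_loc[0][3])
```

Here the raw covariance between the two most distant points is as large as the variances themselves, purely because of the small sample; the taper suppresses it while leaving the diagonal untouched.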

Traditional mitigation efforts Ad hoc noise is inserted in EnKF procedure to reduce negative effects of sampling error in B …but forecasts from noisy initial states have sub-optimal performance. ALTERNATIVE APPROACH • Dynamical cycling of ensemble perturbations • Avoid addition of noise into ensemble forecasts • Do not feed back noise into ensemble forecasts • Preserve relevant info on dynamics of system • Minimize statistical effects on forecasts • Manipulation of Pb should only minimally affect Pa • => Pb reflects dynamics of system

Hybrid ET Design 3 • After Hamill and Snyder, 2000; α = 0.5 • B^-1 exists • Eigenvalues of B [figure curves: Hybrid, NMC, ETKF] • Spectrum flatter than ETKF alone • The hybrid approach is a regularization strategy

ETR Design 3 • Statistics • Add small (10%) value to diagonal of B • [Figure panels: ET without regularization; ET with regularization] • By construction ET is rank deficient • Regularization allows B to be invertible • Retains flow-dependent covariance structure
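The effect of the regularization can be checked on a tiny example. The sketch below assumes that the "10%" refers to 10% of the mean ensemble variance (the slide does not say what the percentage is relative to) and uses a 2-member, 3-variable ensemble so that B is visibly rank deficient; `regularize` and `det3` are illustrative helper names.

```python
def sample_covariance(members):
    """Sample covariance B from a list of ensemble state vectors."""
    n = len(members)
    dim = len(members[0])
    mean = [sum(m[k] for m in members) / n for k in range(dim)]
    pert = [[m[k] - mean[k] for k in range(dim)] for m in members]
    return [[sum(p[a] * p[b] for p in pert) / (n - 1) for b in range(dim)]
            for a in range(dim)]


def regularize(B, fraction=0.10):
    """Add a small value to the diagonal of B, leaving off-diagonals unchanged.

    Assumption: the added value is `fraction` times the mean variance.
    """
    dim = len(B)
    mean_var = sum(B[k][k] for k in range(dim)) / dim
    return [[B[a][b] + (fraction * mean_var if a == b else 0.0)
             for b in range(dim)] for a in range(dim)]


def det3(M):
    """Determinant of a 3x3 matrix (cofactor expansion along the first row)."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


# Two members in a 3-variable state space: B has rank 1 and is singular.
members = [[1.0, 2.0, 0.0],
           [3.0, 0.0, 2.0]]
B = sample_covariance(members)
B_reg = regularize(B)
print("det(B) =", det3(B))
print("det(B regularized) =", det3(B_reg))
```

Because only the diagonal is changed, the off-diagonal (flow-dependent) covariance structure supplied by the ensemble is retained, which is the property highlighted on the slide.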

Forecast performance [Figure curves: regular ETKF with cycled noise; hybrid ETKF; 3DVAR; ETKF with noise not cycled; ETR]

WHAT WE LEARNT • Ensemble-based forecast error covariance has sampling error • Addition of random noise, inflation, or localization • Reduces rank deficiency (less ill-conditioned) BUT • Introduces noise wrt dynamics • If cycled => Suboptimal covariance & forecast performance • Alternatives tested: • Noise added to B to reduce rank deficiency is NOT cycled • Two sets of first guesses – effective but expensive • Ensemble Transform with Regularization of B (ETR) • Affects mainly the variance structure of Pa • Minimal effect on covariance • Superior performance with large ensemble • Effect is expected to be less with realistically small ensemble • Particle Data Assimilation Coming …

ENSEMBLE INITIALIZATION – NEW APPROACHES • Can rescaling method be revised? • To mimic background error reducing effect of DA? • Work in progress • Still Gaussian / linear approach • Quest for non-Gaussian ensemble formation • Needed for highly nonlinear situations • E.g., convective processes, tropical cyclone development • Positive impact expected both for • Ensemble performance • Data assimilation (cleaner covariances)

MODEL RELATED UNCERTAINTIES • Origin • Due to truncation / approximations • Finite spatial resolution • Finite time steps • Approximations in physics • Forecast impact • Random errors added at each time step • Can be considered only stochastically • Design of current generation models • Aimed at making best single (unperturbed) forecast • Minimize RMS error • Intentionally ignores effect of unresolved processes • Fine spatial scales • Fine time scales • Full physics • No stochastic representation of unresolved processes

NEW NWP MODELING PARADIGM • Ensemble application • Effect of unresolved processes must be represented • Otherwise ensemble cloud misses reality • Approach • Stochastically simulate (generate) variance equal to error associated with each process truncated /approximated in model • Major change in modeling approach • Focus on / test in ensemble • Instead of single unperturbed forecast • Build stochastic element to represent model related errors into each model component • Major effort needed

CURRENT METHODS TO REPRESENT MODEL RELATED ERRORS • Multi-model/physics (Houtekamer et al) • Ad hoc and pragmatic approach • Unique & distinct solutions • Cannot provide continuum of realizations • No scientific foundation • Giving up ideal of capturing nature in cognizant manner • Stochastic perturbations added (Buizza et al) • Formal (not fully informed) response to need • Random noise has very limited effect • Structured noise used at NCEP (Hou et al) • Stochastic physics (Teixeira et al) • Right approach? • Capture/simulate, not suppress effect of unresolved processes

REPRESENTING MODEL RELATED UNCERTAINTY: A STOCHASTIC PERTURBATION (SP) SCHEME (Dingchen Hou) • General approach: add a stochastic forcing term (S) into the tendencies of the model equations • Strategy: generate the S terms from (random) linear combinations of the conventional perturbation tendencies • Goal: represent effect of unresolved processes • Desired properties of forcing • 1. Applied to all variables • 2. Approximately balanced • 3. Smoothly varying in space and time • 4. Flow dependent • 5. Quasi-orthogonal • Results • Comparable RMSE • Increased spread • Reduced bias • Improved probabilistic performance • Reduced number of excessive outliers • [Figure legend: Operation; Operation + SP; Operation + optimal pp (upper limit)] • [Figure panel: example of combination coefficients]
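The strategy of building the forcing from random linear combinations of perturbation tendencies can be sketched as follows. This is a simplified illustration, not the NCEP scheme: it draws independent Gaussian weights for each member and normalizes them, whereas the scheme described above also requires the combination coefficients to vary smoothly in time and to be quasi-orthogonal. The names `stochastic_forcing` and `amplitude` are hypothetical.

```python
import random


def stochastic_forcing(control_tendency, member_tendencies, rng, amplitude=1.0):
    """Build one forcing vector S per member from perturbation tendencies.

    Perturbation tendency of member j = member tendency - control tendency.
    Each S is a random, unit-norm-weighted linear combination of them, so it
    lives in the space spanned by the ensemble's own (flow-dependent) tendencies.
    """
    n = len(member_tendencies)
    dim = len(control_tendency)
    pert = [[t[k] - control_tendency[k] for k in range(dim)]
            for t in member_tendencies]
    forcing = []
    for _ in range(n):
        w = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        w = [x / norm for x in w]  # normalize so the weights have unit length
        forcing.append([amplitude * sum(w[j] * pert[j][k] for j in range(n))
                        for k in range(dim)])
    return forcing


# Hypothetical tendencies for a 3-variable state and 4 perturbed members.
rng = random.Random(0)
control = [1.0, -0.5, 0.2]
members = [[1.1, -0.4, 0.1], [0.9, -0.6, 0.3], [1.2, -0.5, 0.2], [1.0, -0.7, 0.4]]
S = stochastic_forcing(control, members, rng, amplitude=0.5)
print(S)
```

Because S is formed from the members' own tendency differences, it inherits their flow dependence and applies to all variables at once, two of the desired properties listed above.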

FINE SCALE ENSEMBLE EXPERIMENTS TO CAPTURE MODEL RELATED UNCERTAINTY • Recognize importance of microphysics for moist processes • Capture model related forecast uncertainty • Hydrometeorological Testbed (HMT) ensemble • Two model cores, various microphysics schemes

Ensemble Prediction System Development for Aviation and other Applications Isidora Jankov1, Steve Albers1, Huiling Yuan3, Linda Wharton2, Zoltan Toth2, Tim Schneider4, Allen White4 and Marty Ralph4 1Cooperative Institute for Research in the Atmosphere (CIRA), Colorado State University, Fort Collins, CO, affiliated with NOAA/ESRL/Global Systems Division 2NOAA/ESRL/Global Systems Division 3Cooperative Institute for Research in Environmental Sciences (CIRES), University of Colorado, Boulder, CO, affiliated with NOAA/ESRL/Global Systems Division 4NOAA/ESRL/Physical Sciences Division

BACKGROUND • Objective • Develop fine scale ensemble forecast system • Application areas • Aviation (SF airport) • Winter precipitation (CA & OR coasts) • Summer fire weather (CA) • Potential user groups • Aviation industry, transportation, emergency and ecosystem management, etc

EXPERIMENTAL DESIGN 2009-2010 • Nested domain: • Outer/inner nest grid spacing 9 and 3 km, respectively • 6-h cycles, 120-hr forecasts for the outer nest and 12-hr forecasts for the inner nest • 9 members (listed in the following slide) • Mixed models, physics & perturbed boundary conditions from NCEP Global Ensemble • 2010-2011 season: everything stays the same except initial condition perturbations?

QPF Example of 24-h QPF 9-km resolution 9 members: ARW-TOM-GEP0 ARW-FER-GEP1 ARW-SCH-GEP2 ARW-TOM-GEP3 NMM-FER-GEP4 ARW-FER-GEP5 ARW-SCH-GEP6 ARW-TOM-GEP7 NMM-FER-GEP8

HMT QPF and PQPF 24-hr PQPF 48-hr forecast starting at 12 UTC, 18 January 2010 0.1 in. 1 in. 2 in.

Reliability of 24-h PQPF • Reliability diagrams of 24-h PQPF, 9-km resolution, Dec 2009 - Apr 2010 • Observed frequency vs forecast probability • Overforecast of PQPF • Similar performance for different lead times • Brier skill score (BSS) • Reference Brier score is Stage IV sample climatology • BSS shows skill at all thresholds for the 24-h lead time, but only for the 0.01 inch/24-h threshold beyond 24-h lead time OAR/ESRL/GSD/Forecast Applications Branch
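For reference, the quantities on this slide can be computed as sketched below. The sketch assumes the uncalibrated "democratic voting" PQPF (fraction of members at or above the threshold) and uses the sample climatology of the verifying observations as the reference forecast, as stated above; the data values are made up.

```python
def pqpf(member_precip, threshold):
    """Probability of precipitation at or above `threshold`: fraction of members."""
    return sum(1 for p in member_precip if p >= threshold) / len(member_precip)


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and outcome (0 or 1)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes):
    """BSS = 1 - BS / BS_ref, with sample climatology as the reference forecast.

    Undefined (division by zero) if the event always or never occurs in the sample.
    """
    climatology = sum(outcomes) / len(outcomes)
    bs_ref = brier_score([climatology] * len(outcomes), outcomes)
    return 1.0 - brier_score(probs, outcomes) / bs_ref


# Hypothetical 24-h accumulations (inches) from a 4-member ensemble at one point.
prob = pqpf([0.0, 0.2, 1.5, 0.05], threshold=0.1)
print("PQPF:", prob)

# Hypothetical forecast probabilities and observed occurrences over five cases.
probs = [0.9, 0.1, 0.7, 0.2, 0.6]
outcomes = [1, 0, 1, 0, 0]
print("BSS:", brier_skill_score(probs, outcomes))
```

A BSS above zero means the ensemble probabilities beat the climatological frequency; a forecast that always issues the climatological probability scores exactly zero.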

CYCLING FINE SCALE PERTURBATIONS • How to create dynamically conditioned fine scale perturbations consistent with forcing from global ensemble? • Current approach • Interpolate global perturbed initial conditions • Fine scale motions missing initially • Need to spin up • UKMet, Canadian, part of NCEP SREF etc ensembles • Cycle LAM perturbations

Initial Perturbations for HMT-10/11 • “Cycling” GEFS (or SREF) perturbations [Schematic labels: perturbations; LAM forecast driven by global analysis; global model analysis interpolated on LAM grid; forecast time axis 00Z, 06Z, 12Z]

Cloud Coverage, July 30 2010 00 UTC [Figure panels: LAPS, CYC, NOCYC at 00hr, 03hr, 06hr]

ON THE HORIZON: COUPLED DA – ENSEMBLE SYSTEMS • Analysis / forecast error estimation independent of DA schemes • Poster results • Rescaling of global perturbations consistent with DA • Nonlinear / non-Gaussian initial perturbations via Bayesian particle filters • Coupled with 4DVAR • Bayesian particle filter for nonlinear / non-Gaussian DA / ensemble forecast system

CHALLENGES IN DA - RELATE OBSERVATIONS TO MODEL VARIABLES • Critical step in relating reality to NWP model • Wide range of observing systems / instruments / sensors • Construct “forward operators” • Tedious but important work • Requires detailed knowledge about observing system and model • Examples - Relate • Radar reflectivity to • Convective processes in model • Radar radial wind to • 3-dimensional wind structure
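As a concrete example of a forward operator, the sketch below maps a model's 3-dimensional wind at one point onto the radial velocity a Doppler radar would see. It is deliberately simplified: it assumes azimuth measured clockwise from north, elevation above the horizontal, positive values pointing away from the radar, and it ignores hydrometeor fall speed, beam broadening and Earth curvature, all of which a real operator has to handle.

```python
import math


def radial_wind(u, v, w, azimuth_deg, elevation_deg):
    """Project model wind (u east, v north, w up) onto the radar beam direction.

    Simplifying assumptions: no hydrometeor fall speed, no beam broadening,
    flat Earth; positive result means motion away from the radar.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    horizontal = u * math.sin(az) + v * math.cos(az)  # wind along beam azimuth
    return horizontal * math.cos(el) + w * math.sin(el)


# Hypothetical model wind: 10 m/s westerly, no meridional or vertical motion.
print(radial_wind(10.0, 0.0, 0.0, azimuth_deg=90.0, elevation_deg=0.5))  # beam pointing east
print(radial_wind(10.0, 0.0, 0.0, azimuth_deg=0.0, elevation_deg=0.5))   # beam pointing north
```

The second call illustrates why the full 3-dimensional wind structure cannot be recovered from one radar alone: wind perpendicular to the beam is invisible to it.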

Intensity: WRF Katrina forecast by STMAS Wind Barb, Windspeed image, Pressure contour at 950mb Surface pressure

Track: WRF 20km Katrina forecast by STMAS Best track: every 6 hours WRF-ARW 72 hour fcst w/ Ferrier physics: every 3 hours

CHALLENGES IN DA - IMPROVE PRIOR / BACKGROUND FIELD • Background must contain all information available • Prior to latest observations • Ideally a short range NWP forecast from latest analysis • As used in global DA • Only limited success with Limited Area Model (LAM) applications • Full LAM DA/forecast cycling attempts fail • Noise around boundary conditions amplifies via cycling? • Current approach • Periodically cold-start LAM analysis cycle from global analysis • E.g., NAM, RUC at NCEP • Scientifically unsatisfactory, suboptimal performance • Promising experiments at GSD • Use lateral boundary as constraint in LAM analysis • Bring in dynamically consistent fine scale info from LAM background

Cycling Impact on STMAS analysis [Figure panels: with cycling; without cycling]

STMAS-WRF ARW cycling Impact

CHALLENGES IN DA - ENSURE CONSISTENCY WITH MODEL DYNAMICS 1 • Dynamically inconsistent information is lost - insult • Quick transitional process introduces additional errors – injury • Possible approaches • Balance constraints – widely used • Digital filter – e.g., Rapid Refresh cycle at GSD • 4DVAR – used in global DA • Additional constraints • Local Analysis and Prediction System (LAPS) “Hot Start” • “Conceptual” relationships among moist & other variables

Local Analysis and Prediction System (LAPS) Steve Albers, Dan Birkenheuer, Isidora Jankov Paul Schultz, Zoltan Toth, Yuanfu Xie, Linda Wharton OAR/ESRL/GSD/Forecast Applications Branch

LAPS DA-Ensemble System Data Data Ingest Intermediate data files Error Covariance Trans Traditional GSI Analysis Scheme STMAS3D Trans Post proc1 Post proc2 Post proc3 Model prep WRF-ARW MM5 WRF-NMM Probabilistic Post Processing Ensemble Forecast

FH FL LAPS HOT START INITIALIZATION Three-Dimensional Cloud Analysis METAR + FIRST GUESS

Cloud / Reflectivity / Precip Type (1km analysis) Obstructions to visibility along approach paths DIA

6-hr LAPS Diabatically initialized WRF-ARW forecast Analysis 13 June 2002 Developing Squall Line Animation
