Taiwan Central Weather Bureau GSI-based Hybrid Data Assimilation: Practical Implementation and impact on the NCEP GFS Daryl Kleist Environmental Modeling Center NOAA/NWS/NCEP With acknowledgements to John Derber (EMC), Dave Parrish (EMC), Jeff Whitaker (NOAA/ESRL), Kayo Ide (UMD), and Ricardo Todling (NASA/GMAO)
Variational Data Assimilation J : Penalty (Fit to background + Fit to observations + Constraints) x’ : Analysis increment (xa – xb) ; where xb is a background Bvar: Background error covariance H : Observations (forward) operator R : Observation error covariance (Instrument + representativeness) yo’ : Observation innovations Jc : Constraints (physical quantities, balance/noise, etc.) B is typically static and estimated a-priori/offline 2
Current flow-dependence • Although flow-dependent variances are used, confined to be a rescaling of fixed estimate based on time tendencies • No multivariate or length scale information used • Does not necessarily capture ‘errors of the day’ • Plots valid 00 UTC 12 September 2008
Kalman Filter in Var Setting Forecast Step Extended Kalman Filter Analysis • Analysis step in variational framework (cost function) • BKF: Time evolving background error covariance • AKF: Inverse [Hessian of JKF(x’)]
Motivation from KF • Problem: dimensions of AKF and BKF are huge, making this practically impossible for large systems (GFS for example). • Solution: sample and update using an ensemble instead of evolving AKF/BKF explicitly Forecast Step: Ensemble Perturbations Analysis Step:
What does Be gain us? • Allows for flow-dependence/errors of the day • Multivariate correlations from dynamic model • Quite difficult to incorporate into fixed error covariance models • Evolves with system, can capture changes in the observing network • More information extracted from the observations => better analysis => better forecasts 6
What does Be gain us? Temperature observation near warm front Bf Be 7
What does Be gain us? Surface pressure observation near “atmospheric river” First guess surface pressure (white contours) and precipitable water increment (A-G, red contours) after assimilating a single surface pressure observation (yellow dot) using Be. 8
What is “hybrid DA”? Simply put, linear combination of fixed and ensemble based B: Bf: Fixed background error covariance Be: Ensemble estimated background error covariance bf:Weighting factor for fixed contribution (0.25 means 25% fixed) be:Weighting factor for ensemble contribution (typically 1- bf) 9
Experiments with toy model • Lorenz ‘96 • 40 variable model, F=8.0, dt=0.025 (“3 hours”) • 4th order Runge-Kutta • OSSE: observations generated from truth run every 2*dt (“6 hours”) • [N(0,1)] • Experimental design • Assimilate single time level observations every 6 hours, at appropriate time, R=1.0 • F=7.8 (imperfect model) for DA runs
Analysis Error (50% observation coverage) 3DVAR bf = 0.7 bf = 0.3 ETKF • M (ensemble size) = 20, r (inflation factor) = 1.1 • Hybrid (small alpha) as good as/better than ETKF (faster spinup) • Hybrid (larger alpha) in between 3DVAR and ETKF
Sensitivity to b Analysis RMSE (x10) over 1800 cases • 50% observation coverage (M = 20, r = 1.1) • Improvement a near linear function of weighting parameter • Small enough weighting (on static error estimate) improves upon ETKF
GSI Hybrid [3D] EnVar(ignoring preconditioning for simplicity) • Incorporate ensemble perturbations directly into variational cost function through extended control variable • Lorenc (2003), Buehner (2005), Wang et. al. (2007), etc. bf & be: weighting coefficients for fixed and ensemble covariance respectively xt’: (total increment) sum of increment from fixed/static B (xf’) and ensemble B ak: extended control variable; :ensemble perturbations - analogous to the weights in the LETKF formulation L: correlation matrix [effectively the localization of ensemble perturbations]
Preconditioning Sidebar In default GSI minimization, double CG, inverses of B and L not needed and the solution is pre-conditioned by full B. Discussed later, this is why the GSI specifies the inverses of beta via the hybrid namelist. This formulation differs from the UKMO and Canadians, who use a square root formulation and apply the weights directly to the increment:
Hybrid with (global) GSI • Control variable has been implemented into GSI 3DVAR* • Full B preconditioning • Working on extensions to B1/2 preconditioned minimization options • Spectral filter for horizontal part of A • Eventually replace with (anisotropic) recursive filters • Recursive filter used for vertical • Dual resolution capability • Ensemble can be from different resolution than background/analysis (vertical levels are the exception) • Various localization options for A • Grid units or scale height • Level dependent (plans to expand) • Option to apply TLNMC (Kleist et al. 2009) to analysis increment *Acknowledgement: Dave Parrish for original implementation of extended control variable
Single Temperature Observation 3DVAR bf-1=0.0 bf-1=0.5
Single Pressure Observation 3DVAR bf-1=0.0 bf-1=0.5 Single ps observation (-2mb O-F, 1mb error) near center of Hurricane Ike
GSI Hybrid Configuration/Options • Several hybrid related parameters controlled via GSI namelist &hybrid_ensemble • Logical to turn on/off hybrid_ensemble option (l_hyb_ens) • Ensemble size (n_ens), resolution (jcap_ens, nlat_ens, nlon_ens) • Source of ensemble: GFS spectral, native model, etc. (regional_ensemble_option) • Weighting factor for static contribution to increment (beta1_inv) • Horizontal and vertical distances for localization, via L on augmented control variable (s_ens_h, s_ens_v) • Localization distances are the same for all variables since operating on a • Option to specify different localization distances as a function of vertical level (readin_localization) • Instead of single parameters, read in ascii file containing a value for each layer • Example for global in fix directory (global_hybens_locinfo.l64.txt) • Other regional options related to resolution, pseudo ensemble, etc.
So what’s the catch? • Need an ensemble that represents first guess uncertainty (background error) • This can mean O(50-100+) for NWP applications • Smaller ensembles have larger sampling error (rely more heavily on Bf) • Larger ensembles have increased computational expense • Updating the ensemble: In NCEP operations, we currently utilize an Ensemble Kalman Filter • EnKF is a standalone (i.e. separate) DA system that updates every ensemble member with information from observations each analysis time using the prior/posterior ensemble to represent the error covariances. Google “ensemble based atmospheric assimilation” for a good review article by Tom Hamill.
Dual-Res Coupled Hybrid Var/EnKF Cycling Generate new ensemble perturbations given the latest set of observations and first-guess ensemble member 1 forecast member 1 analysis T254L64 EnKF member update member 2 forecast member 2 analysis recenter analysis ensemble member 3 analysis member 3 forecast Ensemble contribution to background error covariance Replace the EnKF ensemble mean analysis GSI Hybrid Ens/Var high res forecast high res analysis T574L64 Previous Cycle Current Update Cycle
QC Codes Surface Cycle GFS HiRes “Prepbufr” Create Guess TC Relocation GSI Hybrid 3DVAR TC Vitals from NHC/JTWC Low Resolution GSI Pre-processor Ensemble Ensemble O-F 09hr forecasts EnKF Interpolate Hybrid Recenter about Analysis Hybrid Analysis GDAS Hybrid Cycle Observations Pre-processing Analysis Forecast Post-Processing Ingest Pressure Grib Data Dump Bufr Generation Graphics Verification
Initial Hybrid Var-EnKF GFS experiment • Model • GFS deterministic (T574L64; post July 2010 version – current operational version) • GFS ensemble (T254L64) • 80 ensemble members, EnKF update, GSI for observation operators • Observations • All operationally available observations (including radiances) • Includes early (GFS) and late (GDAS/cycled) cycles as in production • Dual-resolution/Coupled • High resolution control/deterministic component • Includes TC Relocation on guess • Ensemble is recentered every cycle about hybrid analysis • Discard ensemble mean analysis • Satellite bias corrections • Coefficients come from GSI/VAR • Parameter settings • 1/3 static B, 2/3 ensemble • Fixed localization: 800km & 1.5 scale heights • Test Period • 15 July 2010 – 15 October 2010 (first two weeks ignored for “spin-up”)
500 hPa Anom.Corr. Northern Hemisphere Southern Hemisphere
AC Frequency Distributions Northern Hemisphere Southern Hemisphere
Geopotential Height RMSE Northern Hemisphere Southern Hemisphere Significant reduction in mean height errors
Stratospheric Fits Improved fits to stratospheric observations
Tropical Winds Hybrid forecasts seem worse than control at certain levels, but ratio of forecast/analysis standard deviation reduced (analysis standard deviation increased in hybrid).
Forecast Fits to Obs (Tropical Winds) Forecasts from hybrid analyses fit observation much better.
Global Data AssimilationSystem Upgrade Implemented 22 May 2012 • Satellite radiance monitoring code • Allows quicker awareness of problems (run every cycle) • Monitoring software can automatically detect many problems • Post changes • Additional fields requested by forecasters (80m variables) • Partnership between research and operations • Hybrid system • Most of the impact comes from this change • Uses ensemble forecasts to help define background error • NPP (ATMS) assimilated • Quick use of data after launch • Use of GPSRO Bending Angle rather than refractivity • Allows use of more data (especially higher in atmos.) • Small positive impacts
Operational Configuration • Full B preconditioned double conjugate gradient minimization • Spectral filter for horizontal part of L • Eventually replace with (anisotropic) recursive filters • Recursive filter used for vertical • 0.5 scale heights • Same localization used in Hybrid (L) and EnSRF • TLNMC (Kleist et al. 2009) applied to total analysis increment* 32
What next? • Prototype dual-resolution, two-way coupled hybrid Var/EnKF system outperforms standard 3DVAR • Prior results with one-way hybrid not as good • Localization (currently: global/fixed parameter) • Level dependent • Flow dependent, adaptive, anisotropic • Weighting • What is optimal? Scale-dependent? • Adaptive • How should EnKF members be used for ensemble forecasting? • Comparisons with 4DVAR • Regular, hybrid, and ens4DVAR
What if you don’t have an EnKF? • In principle, any ensemble can be used • However, ensemble should represent well the forecast errors • GSI can ingest GFS (global spectral) ensemble to update regional models (WRF ARF/NMM) • Has been highly successful in NAM, RR, HWRF applications • 80 member GFS/EnKF 6h ensemble forecasts are archived at NCEP since May 2012 • Real-time ensemble should also be publicly available
Dual-Res Coupled Hybrid Var/EnKF Cycling Generate new ensemble perturbations given the latest set of observations and first-guess ensemble member 1 forecast member 1 analysis T254L64 EnKF member update member 2 forecast member 2 analysis recenter analysis ensemble member 3 analysis member 3 forecast Ensemble contribution to background error covariance Replace the EnKF ensemble mean analysis and inflate GSI Hybrid Ens/Var high res forecast high res analysis T574L64 Previous Cycle Current Update Cycle
Post EnKF Inflation(Whitaker and Hamill 2012) • Multiplicative inflation factor, r, (function of reduction of spread by assimilation of observations): • Additive inflation extracts quasi-balanced pseudo-random perturbations from database of lagged forecast pairs Inflation is an ad-hoc method for overcoming lack of consideration of model/system error within ensemble-filtering systems
Use of inflation and re-centering • Ensemble part of Hybrid DA includes: re-centering plus inﬂation • Evaluations in NASA/GEOS DAS suggest: • Hybrid approach provides noticeable improvements only when using additive inﬂation, i.e., EnKF alone doesn’t do it • Forecasts from EnKF analyses plus additive inﬂation result in mild spread within the background time window • It seems that much of the initial (analysis) spread can be simulated with additive inﬂation alone • Appreciable background spread is obtained in the latter case Question: how does hybrid-DA perform when the ensemble ﬁlter is dropped and an ensemble of analyses is created from simply additively inﬂating the central analysis?
No Filter Experiment Settings • 0.5o outer loops, 72 level, NASA GEOS • 32 member ensemble, 1.0o, 72 levels • GSI (3D) Hybrid, 50% ensemble/static • TLNMC applied to total increment • Standards vertical/horizontal localization for ensemble B • Additive perturbations from lagged forecast pairs (24-48h) • Experiment period: April 2012 • EnKF • 0.25 additive inflation, EnKF to update ensemble (cycled) • Filter free • 0.6 additive inflation, cold started each cycle from hybrid analysis (no EnKF, no cycling)
Evolution of 6hr Spread EnKF-basedhybrid Filter-Freehybrid
Spread within 9hr window EnKF-basedhybrid Filter-Freehybrid
Cycled Results 3DVAR EnKF-Hybrid FilterFree-Hybrid
Cycled Results 3DVAR EnKF-Hybrid FilterFree-Hybrid