EPS Diagnostic Tools

  1. EPS Diagnostic Tools Renate Hagedorn European Centre for Medium-Range Weather Forecasts

  2. Objective of diagnostic/verification tools: assess the quality of the forecast system, i.e. determine the skill and value of the forecast.
  • A forecast has skill if it predicts the observed conditions well according to some objective or subjective criterion.
  • A forecast has value if it helps the user to make better decisions than without knowledge of the forecast.
  • Forecasts with poor skill can be valuable (e.g. a forecast displaced in location that still prompts useful action); forecasts with high skill can be of little value (e.g. a correct forecast of blue sky over a desert).

  3. Ensemble Prediction System
  • 1 control run + 50 perturbed runs (TL399 L62) -> added dimension of ensemble members -> f(x,y,z,t,e)
  • How do we deal with the added dimension when interpreting, verifying and diagnosing EPS output?

  4. Individual members (“stamp maps”)

  5. EPSgrams: ensemble meteograms of Cloud Cover, Precipitation, 10m Wind and 2m Temperature, showing the median, the 25%-75% range and the min-max range of the ensemble at each lead time.

  6. Ensemble mean
  • The ensemble mean forecast is the average over all ensemble members [maps: Day+6 control vs. Day+6 ensemble mean]
  • It gives a smoother field than the deterministic forecast, but the same result cannot be achieved by simply filtering a deterministic forecast

  7. Ensemble mean
  • It gives a smoother field than the deterministic forecast, but the same result cannot be achieved by simply filtering a deterministic forecast [maps: Day+6 control (filtered) vs. Day+6 ensemble mean]
  • If the spread is large, the ensemble mean may be a very weak pattern and may not represent any of the possible evolutions (use a measure of ensemble spread!)
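
A minimal sketch, using hypothetical NumPy arrays rather than real EPS fields, of how the ensemble mean and an accompanying spread measure are computed along the added ensemble dimension of f(x,y,z,t,e):

```python
import numpy as np

# Hypothetical EPS output: 51 members of a 2m-temperature field on a small grid (one f(x,y,e) slice)
n_members, n_lat, n_lon = 51, 20, 30
rng = np.random.default_rng(0)
forecasts = rng.normal(loc=280.0, scale=2.0, size=(n_members, n_lat, n_lon))  # Kelvin

ens_mean = forecasts.mean(axis=0)     # average over all ensemble members
ens_spread = forecasts.std(axis=0)    # spread measure to be used together with the ensemble mean

print(ens_mean.shape, float(ens_spread.mean()))
```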

  8. Deterministic vs. probabilistic use of EPS: use the ensemble mean only, or make explicit use of the whole PDF
  • Probabilistic forecast verification has similarities to deterministic verification: Reliability <-> Bias, Resolution <-> ACC, Brier Score <-> RMS

  9. Why probabilities?
  • Open-air restaurant scenario: opening additional tables costs £20 extra and brings £100 extra income (if T > 25°C); the weather forecast gives a 30% probability for T > 25°C. What would you do?
  • Test the system for 100 days: 30 x (100 - 20) = +2400 on the days with T > 25°C, 70 x (0 - 20) = -1400 on the days with T < 25°C, i.e. +1000 overall.
  • Employing the extra waiter / opening the additional tables (spending £20) is beneficial when the probability for T > 25°C is greater than 20%.
  • The higher/lower the cost-loss ratio, the higher/lower the probability needed in order to benefit from acting on the forecast (see the sketch below).
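
The arithmetic behind this cost-loss reasoning, as a minimal Python sketch (the £20/£100 figures, the 30% probability and the 100-day test are taken from the slide; everything else is illustrative):

```python
# Open-air restaurant: acting (opening extra tables) costs 20 and pays 100 extra if T > 25 C
cost, income, p_event, n_days = 20.0, 100.0, 0.30, 100

# Expected result over 100 days if we always act on the 30% forecast
gain = p_event * n_days * (income - cost)   # 30 x (100 - 20) = +2400
loss = (1 - p_event) * n_days * (0 - cost)  # 70 x (0 - 20)   = -1400
print(gain + loss)                          # +1000 -> acting pays off on average

# Acting is beneficial whenever the forecast probability exceeds cost / income (here 0.2),
# i.e. the user's cost/loss ratio sets the probability threshold for taking action
print(cost / income)
```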

  10. Reliability
  • Take a sample of probabilistic forecasts, e.g. 30 days x 2200 grid points = 66,000 forecasts
  • How often was the event (T > 25°C) forecast with probability X?

  11. Reliability
  • Take a sample of probabilistic forecasts, e.g. 30 days x 2200 grid points = 66,000 forecasts
  • How often was the event (T > 25°C) forecast with probability X?
  [reliability diagram: observed frequency (0-100%) plotted against forecast probability (0-100%)]
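
A sketch of how the points of a reliability diagram can be built from a sample of forecast/observation pairs; the data are synthetic and the ten probability bins are an illustrative choice, not the operational setup:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 66000                                       # e.g. 30 days x 2200 grid points
p_fc = rng.random(n)                            # forecast probabilities of the event (T > 25 C)
obs = (rng.random(n) < p_fc**1.3).astype(int)   # synthetic, slightly over-confident "truth"

edges = np.linspace(0.0, 1.0, 11)               # ten forecast-probability bins
ibin = np.clip(np.digitize(p_fc, edges) - 1, 0, 9)

for i in range(10):
    sel = ibin == i
    if sel.any():
        # how often the event was observed when it was forecast with this probability
        print(f"fc {edges[i]:.1f}-{edges[i+1]:.1f}: obs frequency {obs[sel].mean():.2f}")
```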

  12. Reliability diagram: an over-confident model compared with the perfect model (the diagonal).

  13. Reliability diagram: an under-confident model compared with the perfect model (the diagonal).

  14. Reliability diagram: reliability score (the smaller, the better), illustrated for a perfect model and an imperfect model.

  15. Components of the Brier Score
  • Reliability: forecast probability vs. observed relative frequencies, Rel = (1/N) Σ_i n_i (f_i - o_i)²
  • with N = total number of cases, I = number of probability bins, n_i = number of cases in probability bin i, f_i = forecast probability in probability bin i, o_i = frequency of the event being observed when forecast with f_i

  16. Reliability diagram: reliability score (the smaller, the better) and resolution score (the bigger, the better). A curve staying close to the climatological frequency c has poor resolution; a curve departing clearly from c has good resolution.

  17. Components of the Brier Score
  • Reliability: forecast probability vs. observed relative frequencies, Rel = (1/N) Σ_i n_i (f_i - o_i)²
  • Resolution: ability to issue reliable forecasts close to 0% or 100%, Res = (1/N) Σ_i n_i (o_i - c)²
  • Uncertainty: variance of the observation frequency in the sample, Unc = c (1 - c)
  • with N = total number of cases, I = number of probability bins, n_i = number of cases in probability bin i, f_i = forecast probability in probability bin i, o_i = frequency of the event being observed when forecast with f_i, c = frequency of the event being observed in the whole sample
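
A minimal sketch of this reliability/resolution/uncertainty decomposition over probability bins, on synthetic data (bin count and variable names are illustrative):

```python
import numpy as np

def brier_decomposition(p_fc, obs, n_bins=10):
    """Return (reliability, resolution, uncertainty) of the Brier score, using probability bins."""
    p_fc = np.asarray(p_fc, dtype=float)
    obs = np.asarray(obs, dtype=float)
    N = len(obs)
    c = obs.mean()                                        # climatological frequency of the event
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ibin = np.clip(np.digitize(p_fc, edges) - 1, 0, n_bins - 1)

    rel = res = 0.0
    for i in range(n_bins):
        sel = ibin == i
        n_i = sel.sum()
        if n_i == 0:
            continue
        f_i = p_fc[sel].mean()                            # mean forecast probability in bin i
        o_i = obs[sel].mean()                             # observed frequency in bin i
        rel += n_i * (f_i - o_i) ** 2                     # reliability contribution
        res += n_i * (o_i - c) ** 2                       # resolution contribution
    return rel / N, res / N, c * (1.0 - c)                # uncertainty = c(1 - c)

rng = np.random.default_rng(2)
p = rng.random(5000)
o = (rng.random(5000) < p).astype(int)                    # a reliable synthetic forecast
rel, res, unc = brier_decomposition(p, o)
print(rel, res, unc, rel - res + unc)                     # the last value approximates the Brier score
```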

  18. Brier Score
  • The Brier score is a measure of the accuracy of probability forecasts: BS = (1/N) Σ_n (p_n - o_n)²
  • p is the forecast probability (fraction of members predicting the event)
  • o is the observed outcome (1 if the event occurs; 0 if the event does not occur)
  • BS varies from 0 (perfect deterministic forecasts) to 1 (perfectly wrong!)
  • Brier Score = Reliability - Resolution + Uncertainty
  • The Brier skill score (BSS) is a measure of skill relative to climatology: BSS = 1 - BS / BS_clim, where the reference forecast is p = frequency of the event in the climate sample
  • Positive (negative) BSS: better (worse) than the reference
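
The Brier score and skill score as a short, hedged sketch (the climatological reference is taken as the sample frequency of the event; the example numbers are made up):

```python
import numpy as np

def brier_score(p, o):
    # p: forecast probabilities (fraction of members predicting the event)
    # o: observed outcomes (1 if the event occurred, 0 if not)
    return float(np.mean((np.asarray(p, dtype=float) - np.asarray(o, dtype=float)) ** 2))

def brier_skill_score(p, o):
    o = np.asarray(o, dtype=float)
    bs = brier_score(p, o)
    bs_clim = brier_score(np.full_like(o, o.mean()), o)   # always forecast the climatological frequency
    return 1.0 - bs / bs_clim                             # positive: better than climatology

print(brier_skill_score([0.9, 0.1, 0.7, 0.2], [1, 0, 1, 0]))
```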

  19. Reliability: 2m temperature > 0°C, 1-month lead, start date May, 1980-2001

      Model     BSS      Rel-Sc   Res-Sc
      CERFACS   0.095    0.926    0.169
      CNRM      0.039    0.899    0.141
      ECMWF     0.039    0.899    0.140
      INGV     -0.001    0.877    0.123
      LODYC     0.047    0.893    0.153
      MPI       0.065    0.918    0.147
      UKMO     -0.064    0.838    0.099
      DEMETER   0.204    0.990    0.213

      (Rel-Sc and Res-Sc are the reliability and resolution components expressed as skill scores, so that BSS = Rel-Sc + Res-Sc - 1.)

  20. Brier Skill Score Europe: 850hPa Temperature, D+4

  21. Ranked Probability Score
  • Measures the quadratic distance between forecast and verification probabilities for several categories
  • It is the average Brier score across the range of the variable
  • The Ranked Probability Skill Score (RPSS) is a measure of skill relative to a reference forecast
  • Negative / positive RPSS: worse / better than the reference

  22. Brier Score -> Ranked Probability Score
  • The Brier Score is used for two-category (yes/no) situations (e.g. T > 15°C)
  • The RPS takes into account the ordered nature of the variable ("extreme errors" across categories are penalised more heavily)
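
A sketch of the RPS as the Brier score averaged over cumulative (ordered) categories, as described above; the 5°C-spaced thresholds and the synthetic ensemble are illustrative:

```python
import numpy as np

def rps(member_values, obs_value, thresholds):
    """RPS for one forecast: average Brier score over cumulative 'value <= threshold' events."""
    member_values = np.asarray(member_values, dtype=float)
    scores = []
    for t in thresholds:
        p = np.mean(member_values <= t)      # ensemble probability of the cumulative event
        o = float(obs_value <= t)            # observed outcome of the cumulative event
        scores.append((p - o) ** 2)
    return float(np.mean(scores))

# 51 hypothetical ensemble members of 2m temperature (deg C) and one observation
rng = np.random.default_rng(3)
members = rng.normal(18.0, 3.0, size=51)
print(rps(members, obs_value=21.0, thresholds=[5, 10, 15, 20, 25]))
```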

  23. Ranked Probability Skill Score Northern Hemisphere: 500hPa Geopotential

  24. Verification of a two-category (yes/no) situation
  • Compute the 2 x 2 contingency table for a set of cases: a = hits, b = false alarms, c = misses, d = correct rejections, n = a + b + c + d
  • Event probability: s = (a+c) / n
  • Probability of a forecast of occurrence: r = (a+b) / n
  • Frequency bias: B = (a+b) / (a+c)
  • Proportion correct: PC = (a+d) / n

  25. Example: Finley tornado forecasts (1884)
  • 2 x 2 contingency table: a = 28 (hits), b = 72 (false alarms), c = 23 (misses), d = 2680 (correct rejections), n = 2803
  • Event probability: s = (a+c) / n = 51/2803 = 0.018
  • Probability of a forecast of occurrence: r = (a+b) / n = 100/2803 = 0.036
  • Frequency bias: B = (a+b) / (a+c) = 100/51 = 1.961
  • Proportion correct: PC = (a+d) / n = 2708/2803 = 0.966, i.e. 96.6% accuracy

  26. Example: Finley tornado forecasts (1884) - "never forecast a tornado"
  • 2 x 2 contingency table: a = 0, b = 0, c = 51, d = 2752, n = 2803
  • Event probability: s = (a+c) / n = 51/2803 = 0.018
  • Probability of a forecast of occurrence: r = (a+b) / n = 0/2803 = 0.0
  • Frequency bias: B = (a+b) / (a+c) = 0/51 = 0.0
  • Proportion correct: PC = (a+d) / n = 2752/2803 = 0.982, i.e. 98.2% accuracy(!) - proportion correct rewards never forecasting the rare event
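
The contingency-table measures of slides 24-26 and 29-30 as a small helper function, checked against Finley counts (a = 28, b = 72, c = 23, d = 2680, reconstructed here from the scores quoted on the slides):

```python
def contingency_scores(a, b, c, d):
    """Scores from a 2 x 2 table: a hits, b false alarms, c misses, d correct rejections."""
    n = a + b + c + d
    return {
        "event probability s": (a + c) / n,
        "forecast probability r": (a + b) / n,
        "frequency bias B": (a + b) / (a + c),
        "proportion correct PC": (a + d) / n,
        "hit rate H": a / (a + c),
        "false alarm rate F": b / (b + d),
        "false alarm ratio FAR": b / (a + b) if (a + b) else float("nan"),
    }

# Finley tornado forecasts (1884): counts consistent with the scores quoted above
print(contingency_scores(28, 72, 23, 2680))          # PC ~ 0.966, H ~ 0.549, FAR ~ 0.72
# "Never forecast a tornado": a higher PC (~0.982) despite having no skill at all
print(contingency_scores(0, 0, 51, 2752)["proportion correct PC"])
```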

  27. Some Scores and Skill Scores

  28. Definition of a proper score
  • Consistency is one of the characteristics of a good forecast
  • Some scoring rules encourage forecasters to be inconsistent, e.g. some scores give better results when a forecast closer to climatology is issued rather than the actual forecast (e.g. reliability)
  • A scoring rule is strictly proper when the best scores are obtained if and only if the forecasts correspond with the forecaster's judgement
  • Examples of proper scores are the Brier Score and the Ignorance Score
  • Ignorance Score: IGN = -(1/N) Σ_n Σ_i p_(n,i,ver) ln p_(n,i,fc), where n runs over the forecast-verification pairs and i over the quantiles (see Roulston & Smith, 2002)
  • The minimum is attained only when p_fc = p_ver -> a proper score
  • The lower/higher the IGN, the better/worse the forecast system
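
A short sketch of the Ignorance score for categorical probability forecasts; since the verification probability is 1 for the observed category and 0 otherwise, the double sum reduces to the log of the forecast probability assigned to what actually happened (the data below are synthetic):

```python
import numpy as np

def ignorance(p_fc, obs_cat):
    """IGN = -(1/N) * sum over cases of ln(p_fc assigned to the observed category)."""
    p_fc = np.asarray(p_fc, dtype=float)
    obs_cat = np.asarray(obs_cat, dtype=int)
    n = len(obs_cat)
    return -np.mean(np.log(p_fc[np.arange(n), obs_cat]))

# Three cases, three categories (e.g. below / near / above normal)
p = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.1, 0.2, 0.7]])
obs = [0, 1, 2]                 # observed category of each case
print(ignorance(p, obs))        # lower is better; 0 only for a perfect, fully confident forecast
```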

  29. Verification of a two-category (yes/no) situation
  • Compute the 2 x 2 contingency table for a set of cases
  • Event probability: s = (a+c) / n
  • Probability of a forecast of occurrence: r = (a+b) / n
  • Frequency bias: B = (a+b) / (a+c)
  • Hit rate: H = a / (a+c)
  • False alarm rate: F = b / (b+d)
  • False alarm ratio: FAR = b / (a+b)

  30. Example: Finley tornado forecasts (1884)
  • Event probability: s = (a+c) / n = 0.018
  • Probability of a forecast of occurrence: r = (a+b) / n = 0.036
  • Frequency bias: B = (a+b) / (a+c) = 1.961
  • Hit rate: H = a / (a+c) = 0.549
  • False alarm rate: F = b / (b+d) = 0.026
  • False alarm ratio: FAR = b / (a+b) = 0.720

  31. Extension of the 2 x 2 contingency table to probabilistic forecasts

  32. Extension of the 2 x 2 contingency table to probabilistic forecasts: one contingency table per probability threshold, each giving one point in a plot of hit rate (0-1) against false alarm rate (0-1).

  33. ROC curve
  • The ROC curve is a plot of H against F for a range of probability thresholds (low thresholds towards the upper right, high thresholds towards the lower left)
  • The ROC area A (area under the ROC curve) is a skill measure: A = 0.5 (no skill), A = 1 (perfect deterministic forecast); the example curve shown has A = 0.83
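
A sketch of how the ROC curve and area follow from sweeping the probability threshold; the forecasts are synthetic and trapezoidal integration is one simple way to estimate the area:

```python
import numpy as np

def roc_curve(p_fc, obs, thresholds=np.linspace(0.0, 1.0, 21)):
    """Hit rate and false alarm rate for a range of probability thresholds."""
    p_fc = np.asarray(p_fc, dtype=float)
    obs = np.asarray(obs, dtype=bool)
    F, H = [], []
    for t in thresholds:
        warn = p_fc >= t                 # issue a "yes" forecast when the probability reaches t
        a = np.sum(warn & obs)           # hits
        b = np.sum(warn & ~obs)          # false alarms
        c = np.sum(~warn & obs)          # misses
        d = np.sum(~warn & ~obs)         # correct rejections
        H.append(a / (a + c))
        F.append(b / (b + d))
    return np.array(F), np.array(H)

rng = np.random.default_rng(4)
p = rng.random(20000)
o = rng.random(20000) < p                # synthetic, reliable probability forecasts
F, H = roc_curve(p, o)
A = np.trapz(H[::-1], F[::-1])           # ROC area: A = 0.5 no skill, A = 1 perfect
print(A)
```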

  34. ROC area

  35. ROCA vs. RPSS vs. BSS

  36. ROCSS vs. BSS
  • ROCSS or BSS > 0 indicates a skilful forecast system
  • Example: Northern extra-tropics, 500 hPa anomalies > 2σ (spring 2002), ROC skill score vs. Brier skill score (Richardson, 2005)

  37. Benefits for different users - decision making
  • A user (or "decision maker") is sensitive to a specific weather event
  • The user has a choice of two actions: do nothing and risk a potential loss L if the weather event occurs, or take preventative action at a cost C to protect against the loss L
  • No forecast information: either always take action or never take action
  • Deterministic forecast: act when adverse weather is predicted
  • Probability forecast: act when the probability of the specific event exceeds a certain threshold; this threshold depends on the user
  • Value V of a forecast: the savings made by using the forecast, normalised so that V = 1 for a perfect forecast and V = 0 for a forecast no better than climatology
  • The simplest possible case - but it shows many important features (see also Richardson, 2000)

  38. Decision making: the cost-loss model
  • The expected expense combines the fraction of occurrences of each outcome (hit, false alarm, miss, correct rejection) with its potential cost (C, C, L, 0)
  • Climate information - expense: the cheaper of always protecting (cost C) and never protecting (expected loss o L)
  • Perfect forecast - expense: o C (protect exactly when the event occurs)
  • Always use forecast - expense: cost C on every warning plus loss L on every miss
  • Value: V = (E_climate - E_forecast) / (E_climate - E_perfect)

  39. Decision making: the cost-loss model
  • With α = C/L, H = a/(a+c), F = b/(b+d), o = (a+c)/n
  • For a given weather event and forecast system, o, H and F are fixed; the value depends on C/L
  • The value is maximal when C/L = o, with Vmax = H - F
  • Example: Northern extra-tropics (winter 01/02), D+5 deterministic forecast of > 1mm precipitation
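
A sketch of the potential economic value as a function of the cost-loss ratio, following the expense comparison above (the H, F and o values are illustrative, not those behind the precipitation plots):

```python
import numpy as np

def potential_value(H, F, o, alpha):
    """Potential economic value V for a user with cost/loss ratio alpha = C/L."""
    e_climate = np.minimum(alpha, o)                 # cheaper of always / never protecting
    e_perfect = o * alpha                            # protect exactly when the event occurs
    e_forecast = F * (1 - o) * alpha + H * o * alpha + (1 - H) * o
    return (e_climate - e_forecast) / (e_climate - e_perfect)

# Illustrative numbers for a forecast system and event frequency
H, F, o = 0.6, 0.1, 0.3
alphas = np.linspace(0.05, 0.95, 19)                 # range of users' cost/loss ratios
V = potential_value(H, F, o, alphas)
print(V.max(), H - F)                                # maximum value, reached near alpha = o, equals H - F
```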

  40. Potential economic value: Northern extra-tropics (winter 01/02), D+5 forecast of > 1mm precipitation; deterministic forecast vs. EPS with probability thresholds p = 0.2, p = 0.5 and p = 0.8.

  41. Potential economic value: Northern extra-tropics (winter 01/02), forecast of > 1mm precipitation; EPS (each user chooses the most appropriate probability threshold) vs. control.
  • Results based on simple cost/loss models indicate that EPS probabilistic forecasts have a higher value than single deterministic forecasts

  42. Potential economic value: Northern extra-tropics (winter 01/02), D+5 forecast of > 20mm precipitation
  • BSS = 0.06 (a measure of overall value for all possible users)
  • ROCSS = 0.65 (closely linked to Vmax)

  43. Summary
  • There are different ways of incorporating the added dimension of the EPS (ensemble mean vs. whole PDF)
  • The ensemble mean is the best deterministic forecast, and it should be used together with a measure of spread
  • Verification of probability forecasts: different scores measure different aspects of forecast performance (Reliability / Resolution, Brier Score (BSS), RPS (RPSS), ROC, ...)
  • The perception of the usefulness of the ensemble may vary with the score used; it is important to understand the behaviour of different scores and choose appropriately
  • Potential economic value: decision making is user dependent; the cost-loss model is a simple illustration, but it shows many useful features

  44. References and further reading
  • ECMWF Newsletter (for updates on EPS performance)
  • Jolliffe, I.T. and D.B. Stephenson, 2003: Forecast Verification: A Practitioner's Guide in Atmospheric Science. Wiley, 240 pp.
  • Katz, R.W. and A.H. Murphy (eds.), 1997: Economic Value of Weather and Climate Forecasts. Cambridge University Press, 222 pp.
  • Palmer, T.N. and R. Hagedorn (eds.), 2006: Predictability of Weather and Climate. Cambridge University Press.
  • Richardson, D.S., 2000: Skill and relative economic value of the ECMWF Ensemble Prediction System. Q. J. R. Meteorol. Soc., 126, 649-668.
  • Roulston, M.S. and L.A. Smith, 2002: Evaluating probabilistic forecasts using information theory. Mon. Wea. Rev., 130, 1653-1660.
  • Wilks, D.S., 2006: Statistical Methods in the Atmospheric Sciences. 2nd ed., Academic Press, 627 pp.
