
Task 1 of PP Interpretation

Task 1 of PP Interpretation. 1.1 Further applications of boosting: This talk 1.2 Publication on boosting: Paper of Oliver Marchand submitted, but not yet published. Thunderstorm Prediction with Boosting: Verification and Implementation of a new Base Classifier. André Walser (MeteoSwiss)


Presentation Transcript


  1. Task 1 of PP Interpretation
     1.1 Further applications of boosting: this talk
     1.2 Publication on boosting: paper of Oliver Marchand submitted, but not yet published

  2. Thunderstorm Prediction with Boosting: Verification and Implementation of a New Base Classifier
     André Walser (MeteoSwiss), Martin Kohli (ETH Zürich, Semester Thesis)

  3. Overview
     • Boosting algorithm
     • Impact of learn data
     • Verification results
     • Mapping to probability forecast
     • New base classifier: decision tree

  4. Supervised Learning (diagram: Historic Data → Learner → Rules; New Data → Classifier → yes/no)

  5. Learn data
     DATA: COSMO-7 assimilation cycle; data for 79 SYNOP stations in Switzerland; at least one year, every hour; e.g. SI, CAPE, W, date, time
     LABEL: a thunderstorm "yes" if an appropriate ww-code was reported in the SYNOP or at least 3 lightning discharges were registered within 13.5 km of the station
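
The labelling rule on this slide can be sketched as a small Python function. This is only an illustration: the talk does not say which ww-codes count as "appropriate", so the set `THUNDER_WW` below (the WMO present-weather codes usually associated with thunderstorms) is an assumption, and `HourlyObs` is a hypothetical container, not part of the original system.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: ww codes treated as "appropriate" (thunderstorm-related);
# the talk does not list them.
THUNDER_WW = {17, 29} | set(range(91, 100))

@dataclass
class HourlyObs:
    """Hypothetical record for one station and one hour."""
    ww_code: Optional[int]   # SYNOP present-weather code, None if no message
    n_lightning: int         # lightning discharges registered within 13.5 km

def label(obs: HourlyObs) -> bool:
    """Return True ("yes") if the hour counts as a thunderstorm event."""
    reported = obs.ww_code in THUNDER_WW
    enough_lightning = obs.n_lightning >= 3
    return reported or enough_lightning
```

For example, `label(HourlyObs(ww_code=None, n_lightning=4))` is an event because of the lightning count alone, even though no SYNOP message is available for that hour.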

  6. AdaBoost Algorithm
     Input:
     • Weighted learn samples
     • Number of base classifiers M
     Iteration:
     1. determine base classifier G
     2. calculate error, weights w
     3. adapt the weights of falsely classified samples
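
The three iteration steps can be sketched in plain Python as follows. The transcript does not preserve the slide's formulas, so this uses the textbook discrete AdaBoost update with single-feature threshold classifiers; the function names, the label convention (+1 / -1) and the final `certainty` score (weighted share of "yes" votes, scaled to [0, 1]) are assumptions made for illustration, not necessarily what the MeteoSwiss implementation does.

```python
import math

def fit_stump(X, y, w):
    """Step 1: pick the threshold classifier with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for sign in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (sign if xi[j] > thr else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best  # (weighted error, feature index, threshold, sign)

def stump_predict(stump, x):
    _, j, thr, sign = stump
    return sign if x[j] > thr else -sign

def adaboost(X, y, M):
    """Train M base classifiers on samples X with labels y in {+1, -1}."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    ensemble = []
    for _ in range(M):
        stump = fit_stump(X, y, w)
        # Step 2: weighted error (clamped to avoid log(0)) and classifier weight
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Step 3: raise the weights of falsely classified samples, renormalise
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def certainty(ensemble, x):
    """Assumed score: weighted share of base classifiers voting "yes"."""
    total = sum(alpha for alpha, _ in ensemble)
    yes = sum(alpha for alpha, s in ensemble if stump_predict(s, x) == 1)
    return yes / total
```

With a toy data set such as `X = [[0.0], [1.0], [2.0], [3.0]]` and `y = [-1, -1, 1, 1]`, `adaboost(X, y, 3)` returns three weighted threshold classifiers, and `certainty` is high for large feature values and low for small ones.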

  7. Output of the Learn process
     • M base classifiers
     • Threshold classifier:

  8. AdaBoost Algorithm
     Input:
     • Weighted learn samples
     • Number of base classifiers M
     Iteration:
     1. determine base classifier G
     2. calculate error, weights w
     3. adapt the weights of falsely classified samples
     Classifier:

  9. Output of the Classifier: C_TSTORM (figure: panels for 17 UTC, 18 UTC and 19 UTC, annotated "Biased!")

  10. Reason: Inappropriate learn data…
      • SYNOP messages contain events and non-events, but are only available every 3 hours (most messages for 6, 12, 18 UTC).
      • Lightning data contain only events.

  11. New learn data sets
      • B – biased: SYNOP messages; only events from lightning data
      • F – full: SYNOP messages; all missing values are considered non-events
      • AL1 – at least 1: SYNOP messages; when lightning data shows at least 1 event, all non-missing values are considered non-events

  12. Without bias… (figure: panels for 17 UTC, 18 UTC and 19 UTC)

  13. Verification
      • POD and FAR for different C_TSTORM values between 0.3 and 0.6; FAR = False Alarms / #Alarms
      • Learn data: Model: COSMO-7 assimilation cycle, Jun 06 – May 07; Obs: B / AL1 / F
      • Verification data: Model: COSMO-7 forecasts, July 06 and May/June 07; Obs: F
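
As a rough illustration of the two scores, the sketch below computes POD and FAR for one C_TSTORM threshold. FAR follows the definition on the slide (false alarms divided by the number of alarms); POD is taken in its usual sense of hits divided by observed events, which the slide does not spell out. The function name and the convention that an alarm is issued when the value reaches the threshold are assumptions.

```python
def pod_far(c_values, observed, threshold):
    """Return (POD, FAR) when an alarm is issued for C_TSTORM >= threshold."""
    hits = misses = false_alarms = 0
    for c, obs in zip(c_values, observed):
        alarm = c >= threshold
        if alarm and obs:
            hits += 1
        elif alarm:
            false_alarms += 1
        elif obs:
            misses += 1
    n_events = hits + misses            # observed thunderstorms
    n_alarms = hits + false_alarms      # issued alarms
    pod = hits / n_events if n_events else float("nan")
    far = false_alarms / n_alarms if n_alarms else float("nan")
    return pod, far
```

Evaluating this for several thresholds between 0.3 and 0.6 gives one (POD, FAR) pair per threshold, which is how a curve like the ones in the verification slides could be traced.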

  14. Verification: earlier results
      • Results reported last year for 2005: POD = 72%, FAR = 34%
      • Unfortunately not realistic: the verification was done with obs data B

  15. July 2006 (figure: verification results; ~7% events; random forecast shown for comparison)

  16. 18 May – 24 June 2007

  17. Comparison with other systems
      • DWD Expert-System, period April 2006 – September 2006: POD = 0.346, FAR = 0.740

  18. Mapping to a probability forecast
      Polynomial fit in a reliability diagram: P_C_TSTORM

  19. Mapping to a probability forecast
      $$P_{C\_TSTORM}(x) = \begin{cases} 0 & \text{if } x \le 0.4 \\ a x^2 + b x + c & \text{if } 0.4 < x < 0.6 \\ a \cdot 0.6^2 + b \cdot 0.6 + c & \text{if } x \ge 0.6 \end{cases}$$
      Limited resolution: the system predicts probabilities only between 0 and ~40%
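
The piecewise mapping can be written as a short Python function. The fitted coefficients a, b and c are not given in the transcript, so the values used in the usage note are placeholders chosen only so that the output runs from 0 to 0.4; `make_probability_map` is a hypothetical name, and x is assumed to be the classifier output C_TSTORM.

```python
def make_probability_map(a, b, c, lo=0.4, hi=0.6):
    """Build the mapping from classifier output x to a probability.

    Below `lo` the probability is 0; between `lo` and `hi` it follows the
    quadratic a*x**2 + b*x + c; above `hi` it stays at the quadratic's
    value at `hi`.
    """
    def probability(x):
        if x <= lo:
            return 0.0
        xc = min(x, hi)          # hold the value reached at x = hi
        return a * xc * xc + b * xc + c
    return probability
```

With the placeholder coefficients `p = make_probability_map(0.0, 2.0, -0.8)`, the map returns 0 for x ≤ 0.4, rises through about 0.2 at x = 0.5 and stays at about 0.4 for x ≥ 0.6, which mirrors the limited range noted on the slide.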

  20. New Base Classifier: Decision Tree (diagram: threshold classifier 1 with leaves 1 and 0)

  21. New Base Classifier: Decision Tree (diagram: threshold classifier 1 at the root; its class 1 and class 0 branches lead to threshold classifier 2 and threshold classifier 3, each with leaves 0 and 1)
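
One way to read the diagram is as a depth-two tree built from three threshold classifiers: the root decides which of two further threshold classifiers gives the final answer. The sketch below follows that reading. The class names, the feature names and the threshold values are hypothetical, and which branch leads to which classifier is an assumption, since the transcript does not preserve the layout.

```python
from dataclasses import dataclass

@dataclass
class Threshold:
    """Threshold classifier: 1 if the feature exceeds the value, else 0."""
    feature: str
    value: float

    def __call__(self, sample: dict) -> int:
        return 1 if sample[self.feature] > self.value else 0

@dataclass
class DepthTwoTree:
    """Root threshold classifier that routes to one of two others."""
    root: Threshold
    if_class1: Threshold   # used when the root outputs class 1
    if_class0: Threshold   # used when the root outputs class 0

    def __call__(self, sample: dict) -> int:
        branch = self.if_class1 if self.root(sample) == 1 else self.if_class0
        return branch(sample)

# Hypothetical thresholds, for illustration only
tree = DepthTwoTree(
    root=Threshold("CAPE", 500.0),
    if_class1=Threshold("W", 0.1),
    if_class0=Threshold("W", 0.5),
)
```

Calling `tree({"CAPE": 800.0, "W": 0.2})` follows the class 1 branch, while `tree({"CAPE": 100.0, "W": 0.2})` follows the class 0 branch and applies the stricter second threshold. Unlike a single threshold classifier, such a tree can combine two indicators in one base classifier.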

  22. Decision Tree: Example

  23. Conclusions & Outlook
      • Boosting
        • is a simple, efficient and effective machine learning method for model post-processing
        • is completely general
        • can employ a number of redundant indicators
        • computes a certainty of the classification, which is mapped to a probability forecast
      • First verification results are promising; extended verification is required
      • Benefit of decision trees?
