
Integration and Processing of Environmental Data: ADMIRE Technology




Presentation Transcript


  1. Integration and Processing of Environmental Data: ADMIRE Technology. Ondrej Habala, CRISIS Seminar, 18.10.2011, ITMS 26240220060

  2. Goals • Accelerate access to and increase the benefits from data exploitation • Deliver consistent and easy-to-use technology for extracting information and knowledge • Cope with complexity, distribution, change and heterogeneity of services, data, and processes, through an abstract view of data mining and integration • Provide power to users and developers of data mining and integration processes

  3. ADMIRE Architecture: Separation of Concerns

  4. ADMIRE Architecture

  5. ADMIRE’s High-Level Architecture

  6. ADMIRE Gateways USMT

  7. DISPEL – Data Intensive Systems Process-Engineering Language • For data-intensive distributed systems • Connection point between complex application requests and complex enactment systems • Benefit: method development, engineering and evolution of supported practices can take place independently in each world • Describes enactment requests for streaming-data workflow processes • “Process-engineering time”: the process is transformed and optimized in preparation for the enactment period

  8. DISPEL: Simple Example
       String sql1 = "SELECT * FROM some_table";
       String sql2 = "SELECT * FROM table2";
       String resource = "128.18.128.255";
       SQLQuery query = new SQLQuery;
       |- sql1, sql2 -| => query.expression;
       |- resource -| => query.resource;
       Tee tee = new Tee;
       query.result => tee.connectInput;
     Slide callouts: "Creating streams of literals" (the |- ... -| expressions) and "Creating connections" (the => operator).
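For readers unfamiliar with DISPEL, the dataflow on this slide can be approximated in Python with generators. This is only an analogy, not DISPEL semantics or any ADMIRE API: `literal_stream`, `sql_query`, and `build_pipeline` are hypothetical names, and the query step merely labels its inputs instead of contacting a database.

```python
from itertools import tee as split_stream


def literal_stream(*values):
    """Analog of DISPEL's |- a, b -| : a finite stream of literal values."""
    yield from values


def sql_query(expressions, resources):
    """Stand-in for the SQLQuery element: pairs each expression with a resource.

    A real element would run the query; this sketch only returns a description.
    Reusing the single resource literal for every expression is an assumption.
    """
    resource = next(resources)
    for expression in expressions:
        yield f"result of [{expression}] on {resource}"


def build_pipeline():
    sql1 = "SELECT * FROM some_table"
    sql2 = "SELECT * FROM table2"
    resource = "128.18.128.255"
    # Create streams of literals and connect them to the query inputs.
    results = sql_query(literal_stream(sql1, sql2), literal_stream(resource))
    # Connect the query output to a tee, which duplicates the stream.
    return split_stream(results, 2)
```

Both iterators returned by `build_pipeline()` yield the same two result descriptions, which mirrors the role of the `Tee` element in the slide.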

  9. DISPEL – real use

  10. Application Studies: Deploying ADMIRE Technology in the Environmental Domain

  11. Flood Application: Data sets used in hydrological scenarios (FSKD 2010, Yantai, China, August 10-12)

  12. Orava scenario • Legend • Green area – Orava (part of northern Slovakia) • Blue – Orava reservoir and local rivers • Red dots – hydrological measurement stations • Notes • We are interested only in hydrological stations below the Orava reservoir • In our tests we use hydrological station 5830 (Tvrdosin)

  13. ORAVA – data mining concept • Targets – water level and temperature at a station below the reservoir • Predictors – rainfall amount (reservoir and station), air temperature (reservoir and station), reservoir discharge, reservoir temperature • Figure labels: "Targets of data mining", "Given in a schedule", "Predicted by a meteo model"

  14. ORAVA – data integration • Integration of data from: GRIB files, reservoirs • Inputs: time period of experiment, reservoir ID, list of hydro stations, geo coordinates
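The integration step above can be sketched as a join on the time axis. The following Python is a minimal illustration, not the ADMIRE workflow itself: the function name `integrate`, the record layout (dictionaries with `hour`, `station`, `rainfall`, `air_temp`, `reservoir`, `discharge` keys) and the integer hour index are all assumptions, and the geo-coordinate input is left out for brevity.

```python
def integrate(period, grib_rows, reservoir_rows, reservoir_id, station_ids):
    """Join hourly records from two sources into one row per (hour, station).

    period: (first_hour, last_hour), inclusive, as integer hour indices.
    grib_rows: dicts with keys "hour", "station", "rainfall", "air_temp".
    reservoir_rows: dicts with keys "hour", "reservoir", "discharge".
    Hours missing from the reservoir data are skipped, not interpolated.
    """
    first, last = period
    # Index the selected reservoir's discharge by hour for fast lookup.
    discharge = {
        row["hour"]: row["discharge"]
        for row in reservoir_rows
        if row["reservoir"] == reservoir_id
    }
    wanted = set(station_ids)
    integrated = []
    for row in grib_rows:
        hour = row["hour"]
        if not (first <= hour <= last) or row["station"] not in wanted:
            continue
        if hour not in discharge:
            continue
        integrated.append({**row, "discharge": discharge[hour]})
    integrated.sort(key=lambda r: (r["hour"], r["station"]))
    return integrated
```

Skipping incomplete hours keeps the sketch short; a real pipeline would more likely interpolate or flag the gaps during preprocessing.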

  15. ORAVA – data sets

  16. Orava Scenario: Integrated and preprocessed data [Figure: two panels, "Integrated raw data" and "Integrated preprocessed data", each plotted against Time [hours]]

  17. Orava Scenario: Water temperature prediction

  18. Orava Scenario: Water level prediction

  19. Orava Scenario: Data integration workflow

  20. Orava Scenario: Training workflow

  21. Orava Scenario: Prediction workflow

  22. Implementation Notes • We needed to write custom activities for certain data extraction tasks • Data integration was the most complex part of the scenario in terms of workflow design • Once all the processing elements (PEs) were in place, data integration was quite easy to write and modify in DISPEL • We used a composite PE to extract different types of quantities from meteorological GRIB files

  23. ADMIRE Architecture: Separation of Concerns

  24. Orava Scenario Portal

  25. Orava Scenario Portal

  26. Radar Scenario: Very short-term rainfall prediction from weather radar data

  27. Radar Scenario: Description • Very short-term rainfall prediction from weather radar data • Movement of areas with higher air moisture content, and thus also higher precipitation potential • Network of synoptic stations in Slovakia • 27 stations in Slovakia • Used data from the years 2007 and 2008 • Available variables: rainfall, humidity, radar reflectivity, atmospheric pressure and temperature values for each hour

  28. Radar Scenario: Main predictors and target variables • Overview of the main predictors and target variables in the Radar scenario • Green cells are predicted by the meteo model • Blue cells come from the model based on motion vectors • Yellow cells are the final target of data mining

  29. Radar Scenario: Attributes of the model • Isotonic regression model • 10-fold cross-validation • Hydro-meteorological performance
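As background on the model type named on this slide: isotonic regression fits a non-decreasing function to the data. A common way to compute it is the pool-adjacent-violators algorithm, sketched below in Python for a single predictor. This is a generic textbook version, not necessarily the implementation used in the scenario; the function name `isotonic_fit` is hypothetical.

```python
def isotonic_fit(xs, ys):
    """Fit a non-decreasing step function with pool-adjacent-violators.

    Returns (sorted_xs, fitted_ys), where fitted_ys is the least-squares
    non-decreasing fit to ys taken in order of increasing x.
    """
    pairs = sorted(zip(xs, ys))
    # Each block stores [sum of y, count]; its fitted value is the mean.
    blocks = []
    for _, y in pairs:
        blocks.append([y, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return [x for x, _ in pairs], fitted
```

For example, the targets 1, 3, 2, 4 (in order of increasing x) are fitted as 1, 2.5, 2.5, 4: the out-of-order pair 3, 2 is pooled into its mean.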

  30. RADAR model • Other tested models • Neural networks, SMOreg, linear regression, ... • Reached a correlation coefficient between 0.35 and 0.42 • Validation: 10-fold cross-validation • Problems in model creation: • The process is significantly stochastic • Some input variables/parameters (humidity) depend in turn on the output (rainfall) • The meteorological process is very sensitive • The radar reflectivity matrix represents the quantity of water in the atmosphere, not the exact rainfall rate in a specified area, as opposed to data from synoptic stations
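The validation figure quoted above, a correlation coefficient obtained under 10-fold cross-validation, can be illustrated with a short Python sketch. It assumes a single predictor, pools the out-of-fold predictions before computing the Pearson correlation, and uses a straight-line fit purely as a placeholder model. The names `pearson`, `cross_validated_correlation`, and `fit_line`, as well as the interleaved fold assignment, are assumptions of this sketch rather than details from the slides.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def cross_validated_correlation(xs, ys, fit, k=10):
    """Collect out-of-fold predictions over k folds and correlate them with ys.

    fit(train_xs, train_ys) must return a function mapping x to a prediction.
    Fold i holds out every k-th record starting at index i.
    """
    predictions = [None] * len(xs)
    for fold in range(k):
        held_out = set(range(fold, len(xs), k))
        train_xs = [x for i, x in enumerate(xs) if i not in held_out]
        train_ys = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_xs, train_ys)
        for i in held_out:
            predictions[i] = model(xs[i])
    return pearson(predictions, ys)


def fit_line(train_xs, train_ys):
    """Least-squares straight line, used here only as a placeholder model."""
    n = len(train_xs)
    mean_x = sum(train_xs) / n
    mean_y = sum(train_ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(train_xs, train_ys))
    sxx = sum((x - mean_x) ** 2 for x in train_xs)
    slope = sxy / sxx
    return lambda x: mean_y + slope * (x - mean_x)
```

A coefficient in the 0.35 to 0.42 range means the pooled predictions track the measured rainfall only weakly, consistent with the modelling problems listed on the slide.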

  31. Radar Scenario [Figure panels: "Training", "Forecast"]

  32. Radar Scenario: Motion vector computation

  33. SVP Scenario: Forecast of reservoir inflow based on temperature, precipitation and snow cover

  34. SVP Scenario: Structure of data • Two steps of prediction: • Step 1: copy the previous values of snow quantity and inflow volume: P(t) = S(t-1), I(t) = F(t-1) • Step 2: apply the trained models, the snow model first and then the inflow model: S(t) = f(P(t), R(t), E(t)), F(t) = h(I(t), S(t), E(t), R(t))
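The two-step scheme on this slide can be written as a short loop. The Python below is a minimal sketch under stated assumptions: the slide does not define R and E (given the scenario description they presumably carry precipitation and temperature, but that is a guess), the models `snow_model` and `inflow_model` are caller-supplied placeholders for f and h, and feeding each prediction back in as the next step's previous value is an assumption about how multi-step forecasts are produced.

```python
def forecast(snow0, inflow0, r_series, e_series, snow_model, inflow_model):
    """Roll the two-step prediction forward over the input series.

    snow0, inflow0: last known snow quantity S and inflow volume F.
    r_series, e_series: sequences standing in for R(t) and E(t).
    Returns two lists: the predicted S(t) and F(t).
    """
    snow_prev, inflow_prev = snow0, inflow0
    snow_out, inflow_out = [], []
    for r_t, e_t in zip(r_series, e_series):
        # Step 1: copy the previous values, P(t) = S(t-1) and I(t) = F(t-1).
        p_t, i_t = snow_prev, inflow_prev
        # Step 2: snow model first, then the inflow model that uses its output.
        s_t = snow_model(p_t, r_t, e_t)
        f_t = inflow_model(i_t, s_t, e_t, r_t)
        snow_out.append(s_t)
        inflow_out.append(f_t)
        # Assumption: predictions are fed back as the next "previous" values.
        snow_prev, inflow_prev = s_t, f_t
    return snow_out, inflow_out
```

The ordering matters because the inflow model h takes the freshly predicted S(t) as an input, which is why the slide applies the snow model first.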

  35. SVP Scenario: Models & Attributes • 10-fold cross-validation, 8760 records; models for inflow prediction • N-fold cross-validation, 8760 records; decision tree model M5P

  36. SVP Scenario: Data integration workflow

  37. SVP Scenario: Model training workflow

  38. SVP Scenario: Forecast workflow

  39. ADMIRE Tools • Registry client GUI • Process Designer • SKSA • Gateway Process Manager • DMI Model Visualizer

  40. Registry client GUI • Read-only access to the ADMIRE Registry • List PEs and view their properties • Search and sort PEs • Write access to the Registry is done via DISPEL documents

  41. Process Designer • Manage your DMI project (files, directories, project structure) • Select elements from the Registry • View the canonical (DISPEL) representation of your DMI process in real time • View the properties of your chosen elements • Edit your DMI process graphically

  42. Semantic Knowledge Sharing Assistant (SKSA) • Context the user works in (example: several reservoirs, one settlement) • Knowledge that may be useful in this context, previously entered by other users • Provides access to existing users' knowledge, sorting and selecting it automatically according to the user's current working context

  43. Gateway Process Manager • Keep track of running processes • Stop, pause, or cancel a process • View the process's source DISPEL • Access the process's results (if available) in several ways, raw or visualized

  44. DMI Model Visualizer: For data mining experts • Visualization of data mining models • Reads a Weka classifier object • Produces a PMML description of the model • Shows the PMML as a graphical tree

  45. Custom Application Portal: for end-users (domain experts)

  46. Thank you for your attention
