
UncertWeb lessons learnt


Presentation Transcript


  1. UncertWeb lessons learnt
  Dan Cornford and the UncertWeb team
  Computer Science, Aston University, Birmingham, United Kingdom
  UncertWeb Tools Workshop, 10 Jan 2013, Aston

  2. UncertWeb successes
  • Tools and other software:
    • hope you found them useful!
    • they will outlive the project
    • but maintaining online, web-based tools in a university is not easy once the authors have gone!
  • Applications:
    • four really strong examples, plus an integration example
    • but they don't always use the tools or infrastructure!
    • sometimes it is easier to re-invent the wheel, and not all tools are always needed!

  3. UncertWeb successes - II
  • Information models:
    • profiles for O&M and GML are supported by our tools
    • for spatially and temporally distributed data
    • UncertML provides a useful vocabulary
    • NetCDF scales to larger data sets when gridded
    • a unified approach (+ tool support) to representing all types of data would be very useful, but quite a challenge!
  • Infrastructure and services:
    • improvements to web service deployment suggested
    • brokering approach is a sound architecture
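
The "unified approach to representing all types of data" bullet can be illustrated with a toy tagged envelope. The field names below (`type`, `values`, `metadata`) and the `wrap` helper are invented for illustration; this is not the UncertWeb profile or any OGC encoding, just a minimal sketch of the idea that one representation (plus one codec) could cover scalars, series and more.

```python
import json

def wrap(kind, values, metadata=None):
    """Wrap any value in a minimal tagged envelope (hypothetical
    format, invented for illustration -- not an UncertWeb encoding)."""
    return {"type": kind, "values": values, "metadata": metadata or {}}

# A scalar and a time series share one representation and one codec.
scalar = wrap("scalar", 21.5, {"units": "degC"})
series = wrap("time-series", [20.1, 21.5, 22.0],
              {"units": "degC",
               "times": ["2013-01-10T09:00Z", "2013-01-10T10:00Z",
                         "2013-01-10T11:00Z"]})

for item in (scalar, series):
    assert json.loads(json.dumps(item)) == item  # round-trips losslessly
```

The challenge the slide alludes to is that real data (coverages, features, uncertain values) needs far richer metadata than this sketch carries.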

  4. Did we create the ‘Model Web’? • No – there are only a few models ‘out there’
    • model ‘interfaces’ and deployments are complicated and variable, and typically not designed to integrate
    • this is simply a problem – most models were not written for the web
    • each model is written uniquely, making automated exposure hard
  • Sometimes we used encodings that were too complicated (O&M / GML)
    • always use the simplest encoding – if it is a scalar, use a simple type; use O&M for, e.g., spatially / temporally referenced data
  • Standards / frameworks should be distilled from implementations
    • they should be developed during / after implementation, not before as a theoretical exercise

  5. Exposing models on the web • Exposing real models is hard
    • the UncertWeb profile provides a range of data types, but support tooling (e.g. conversion to/from commonly used types) and actual usage are still very low!
    • mapping from model inputs / outputs to encodings is time consuming, manual and complicated
    • it is not even always clear / unique how to encode all inputs / outputs
    • and which inputs / outputs are best exposed on the web interface?
  • We spent far more time on this than anticipated
    • very hard to automate!
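
The manual mapping step above can be sketched in a few lines. Everything here is hypothetical (`runoff_model`, the field names, the exposure table): the point is that someone must decide, by hand, which inputs appear on the web interface and how each is encoded.

```python
import json

def runoff_model(rainfall_mm, soil_coeff=0.4):
    """Hypothetical stand-in for a real environmental model."""
    return rainfall_mm * soil_coeff

# The manual mapping step the slide describes: decide which inputs are
# exposed on the web interface, and how each one is encoded.
EXPOSED_INPUTS = {"rainfall_mm": float}   # soil_coeff deliberately hidden

def handle_request(body):
    params = json.loads(body)
    kwargs = {name: cast(params[name])
              for name, cast in EXPOSED_INPUTS.items()}
    return json.dumps({"runoff_mm": runoff_model(**kwargs)})

print(handle_request('{"rainfall_mm": 10}'))  # -> {"runoff_mm": 4.0}
```

For a toy model this table is trivial to write; for a real model with hundreds of parameters, writing and maintaining it is exactly the time sink the slide describes.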

  6. UncertWeb encodings • For big data, ASCII and XML are too verbose – and so is JSON
    • binary encodings are needed for big models, and some models are simply not suited to web deployment
    • bring the model to the data? e.g. as done with R scripts?
    • when working with big data, avoid thin clients doing lots of data handling!
  • Use appropriately simple types for encoding model inputs and outputs
    • scalar, vector, matrix (+ type) cover most inputs
    • O&M / GML add value to spatial and temporal data
    • NetCDF is useful for gridded data
    • the future will ideally combine O&M and NetCDF (and UncertML)
    • some tools only support simple encodings anyway!
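
The verbosity point is easy to demonstrate: the same 10,000 doubles encoded as a JSON text array versus a packed binary array (a stdlib stand-in for the kind of binary encoding the slide calls for).

```python
import array
import json
import math

# 10,000 doubles, most with full-precision decimal representations.
values = [math.sqrt(i) for i in range(10000)]

text_bytes = len(json.dumps(values).encode("utf-8"))
binary_bytes = len(array.array("d", values).tobytes())  # 8 bytes/double

print(f"JSON: {text_bytes} bytes, binary: {binary_bytes} bytes")
assert binary_bytes == 8 * len(values)
assert text_bytes > binary_bytes  # JSON is several times larger here
```

On top of the size penalty, text encodings also cost parse time, which is why gridded outputs from big models are better shipped as NetCDF or similar binary formats.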

  7. UncertWeb Architecture • This was a challenge
    • the OGC stack is complex, and not well enough supported by complete implementations
    • we have defined our own profiles – this is a big contribution
    • WPS is too generic – provide a richer description (more metadata)
    • our proposed annotation of services needs further testing and refinement
  • Still no ‘universal’ modelling framework out there
    • the brokering approach is a good idea, but requires more work
    • a bit ‘chicken and egg’ – we hope good tools make it worth exposing models, but exposing models remains time consuming
    • diversity is likely to remain, as in programming languages but worse!

  8. Uncertainty in Environmental Data and Models • UncertML is useful and will outlive the project
    • might need one more tweak to be optimal!
    • it does not scale to massive data, and a better separation between the dictionary and the encoding would be a plus
    • the core idea is valid; the probabilistic approach is right
  • Quantifying uncertainty on all inputs and models proved challenging!
    • tools help, but managing the resulting uncertainties is not easy
    • many uncertainties were treated rather superficially
    • still a lot to do on quantifying uncertainties in models and data
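
As a sketch of the probabilistic approach: carry a named distribution instead of a bare number, and propagate it through a toy model by Monte Carlo. The record below is loosely in the spirit of UncertML's vocabulary of named distributions, but the field names and the model are invented, not the real UncertML schema.

```python
import random
import statistics

# A distribution record loosely inspired by UncertML's named
# distributions (illustrative structure only, not the real schema).
rainfall_mm = {"NormalDistribution": {"mean": 10.0, "variance": 4.0}}

def sample(dist, n, rng):
    """Draw n samples from a normal-distribution record."""
    p = dist["NormalDistribution"]
    return [rng.gauss(p["mean"], p["variance"] ** 0.5) for _ in range(n)]

# Propagate the input uncertainty through a toy model by Monte Carlo.
rng = random.Random(42)
runoff = [0.4 * r for r in sample(rainfall_mm, 5000, rng)]
print(statistics.mean(runoff))  # close to 0.4 * 10.0 = 4.0
```

Even this trivial example shows why managing uncertainties is not easy: the output is no longer a number but a sample set (or another distribution) that every downstream tool must also understand.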

  9. Research funding • Research funding is too short term – if you really want to support the “model web”:
    • encourage global cooperation, but find a leader to push this, and then fund them
    • make sure they have a vision, but also listen!
    • fund it for the long term – make a decision and stick with it, maybe with some heavy management!
    • involve companies who can actually develop and maintain commercial-strength software – be prepared to fund them
    • universities are good at developing prototypes, but cannot maintain things longer term
    • fund open source / free use ... commercial companies can then add value
    • fund things that integrate, rather than developing new solutions
    • the challenge now is one of software engineering in my view, not theory

  10. Open questions • Need to make it easier to expose models on the web
    • tools to convert to web formats (JSON / O&M / NetCDF)
  • Scalability, of both models and data models
    • consider chunking big data, parallel execution, and automated replication of model instances
  • Semantics and automating workflow composition
  • Reliability, maintainability, security
  • Tools to be further developed
    • Greenland (vis client) … others?
  • More complex real model workflows
    • user-driven real workflows will raise more issues!
    • time stepping, model discrepancy, calibration, data assimilation, uncertainty quantification
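
The chunking-plus-parallel-execution idea can be sketched with stdlib tools. `model_chunk` is a hypothetical stand-in for one replicated model instance; in a real deployment each chunk might go to a separate service rather than a local thread.

```python
from concurrent.futures import ThreadPoolExecutor

def chunks(seq, size):
    """Split a large input into fixed-size chunks."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

def model_chunk(chunk):
    """Hypothetical stand-in for one replicated model instance."""
    return sum(x * x for x in chunk)

data = list(range(100000))

# Each chunk could be dispatched to a separate (replicated) model
# instance; here a local thread pool plays that role.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(model_chunk, chunks(data, 10000)))

total = sum(partials)
assert total == sum(x * x for x in data)  # chunked == unchunked
```

This only works when the model decomposes cleanly over its input; models with global state or time stepping need the harder workflow machinery the slide lists.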

  11. What next? • Other initiatives: PURE Experimental Zone, EVO … other projects?
  • We’ll do our best to maintain and enhance the tools
    • all are open source, so you can take them on too!
    • we hope to have new projects in the future to further develop the tools and basic technology
    • linking with other frameworks, e.g. OpenMI, is attractive … but not funded!

  12. Summary • Keep in touch
    • we want the tools to be used, and we want to find out when they do and don’t work!
    • you can help – everything is open source
  • Thanks for attending
    • hope you found it useful, or at least interesting …

  The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement n° [248488].
