Harnessing Open Data in Agriculture: A Hands-On Tutorial for Building Data Workflows
Explore how to utilize open data infrastructures to enhance your agricultural data products. This tutorial, supported by EU projects, guides users through the process of creating a customizable dataset from existing data sources. Designed for both novice and advanced users, it illustrates the use of the agINFRA-powered RING panel to select data sources and execute complex workflows, enabling on-the-fly dataset downloads. Learn about configuring workflows, utilizing remote APIs, and accessing grid resources for data processing in agriculture.
Presentation Transcript
Open Data in Agriculture Hands-on with data infrastructures that can power your agricultural data products 12/12/2013 Athens, Greece Supported by EU projects
Tutorial on how to use from external tools or build new complex data processing workflows (example: data aggregation) Robert Lovas MTA SZTAKI, Hungary
Intro: user scenario • a user (information manager) wants to create a data set on-the-fly that will include specific filtered subsets of existing data sources, e.g. CIARD RING • E.g. covering a particular geographical region, time period, type of resources • use agINFRA-powered RING panel (Drupal module) to select desired data sources and relevant metadata properties, facets, formats • aggregation workflow executed when grid resources are available, notifying user to collect produced data set when ready • an advanced user wants to develop further the existing workflows or create new complex data processing workflows
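The selection step in this scenario can be pictured as a filter over data source records. The following is a minimal sketch only: the DataSource fields, the select_sources helper, and the sample records are hypothetical illustrations, not part of the RING panel or of any agINFRA API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class DataSource:
    """Hypothetical record describing one registered data source."""
    name: str
    region: str
    resource_type: str
    last_updated: date


def select_sources(
    sources: Iterable[DataSource],
    region: Optional[str] = None,
    resource_type: Optional[str] = None,
    start: Optional[date] = None,
    end: Optional[date] = None,
) -> List[DataSource]:
    """Keep the sources that match every filter that was actually given."""
    selected = []
    for source in sources:
        if region is not None and source.region != region:
            continue
        if resource_type is not None and source.resource_type != resource_type:
            continue
        if start is not None and source.last_updated < start:
            continue
        if end is not None and source.last_updated > end:
            continue
        selected.append(source)
    return selected


# Made-up sample records, used only to exercise the filter.
SAMPLE_SOURCES = [
    DataSource("source-a", "Europe", "bibliographic", date(2013, 3, 1)),
    DataSource("source-b", "Africa", "bibliographic", date(2012, 11, 15)),
    DataSource("source-c", "Europe", "educational", date(2013, 9, 30)),
]

if __name__ == "__main__":
    for source in select_sources(SAMPLE_SOURCES, region="Europe"):
        print(source.name)
```

Leaving a filter as None means "do not restrict on this property", which mirrors a panel where the user ticks only the facets of interest.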
Objectives This presentation aims to provide • an illustration of how agINFRA-powered data aggregation works, with a simple but powerful interface to • select targets, execute a complex workflow, and provide the downloadable outputs (data sets on the fly) • an overview of designing workflows for data processing, with user access and documentation details
The background Steps: • Configuration • Starting aggregation (workflow submission through Remote API) • Enacting the workflow (Init harvests, etc. doing one-by-one) • As a step of the workflow, uploading the harvested data to Grid Storage LFN (using Robot certificate) • Registering the dataset in agCrouchDB • Retrieving the HTTP location • Provide the downloadable datasets
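The order of these steps can be sketched as a small pipeline. This is an assumption-laden illustration, not the actual agINFRA implementation: every function name, the AggregationRun record, the LFN pattern, and the example URL are hypothetical, and grid storage and the data set registry are replaced by plain dictionaries. No real Remote API call, certificate handling, or grid transfer is performed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AggregationRun:
    """Hypothetical state carried through one aggregation run."""
    targets: List[str]
    log: List[str] = field(default_factory=list)
    harvested: Dict[str, list] = field(default_factory=dict)
    lfn: str = ""
    http_location: str = ""


def run_aggregation(
    targets: List[str],
    harvest: Callable[[str], list],
    storage: Dict[str, dict],
    registry: Dict[str, dict],
    base_url: str,
) -> AggregationRun:
    # 1. Configuration: record which targets were selected.
    run = AggregationRun(targets=list(targets))
    run.log.append("configured")

    # 2. Starting aggregation: stands in for submitting the workflow.
    run.log.append("submitted")

    # 3. Enacting the workflow: harvest the targets one by one.
    for target in run.targets:
        run.harvested[target] = harvest(target)
    run.log.append("harvested")

    # 4. Upload the harvested data under a logical file name (made-up pattern).
    run.lfn = "lfn:/example/dataset-%d" % len(storage)
    storage[run.lfn] = dict(run.harvested)
    run.log.append("uploaded")

    # 5. Register the data set so it can be found later.
    registry[run.lfn] = {"targets": list(run.targets)}
    run.log.append("registered")

    # 6. Retrieve an HTTP location the user can download from.
    run.http_location = base_url + run.lfn.split("lfn:", 1)[1]
    run.log.append("resolved")
    return run


if __name__ == "__main__":
    grid_storage, dataset_registry = {}, {}
    result = run_aggregation(
        ["t1", "t2"],
        lambda target: [target + "-record"],  # stand-in harvester
        grid_storage,
        dataset_registry,
        "http://example.invalid/data",
    )
    print(result.log)
    print(result.http_location)
```

Passing the harvester, storage, and registry in as arguments keeps the sketch testable without any grid access; a real deployment would replace them with the corresponding services.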
agINFRA Science Gateway for workflows Liferay-based gUSE/WS-PGRADE for creating and running workflows http://aginfra-portal.lpds.sztaki.hu/liferay-portal-6.0.5/ • or on-demand deployment • agINFRA VO is accessible (ca. 3500 CPU cores, 900 TB storage)
Documentation of gUSE For workflows Used from the GUI
New! Request form for a new instance of a workflow engine (+ workflows and other components) on the Cloud http://www.desktopgrid.hu/oc-public-aginfra
Thank you! Robert Lovas MTA SZTAKI Robert.Lovas@sztaki.mta.hu