
Support for the Full e-Experimentation Cycle in the Virtual Laboratory Infrastructure

Piotr Nowakowski (1), Eryk Ciepiela (1), Tomasz Gubała (1), Maciej Malawski (1, 2), Marian Bubak (1, 2). (1) ACC Cyfronet AGH, ul. Nawojki 11, 30-950 Kraków, Poland



  1. Support for the Full e-Experimentation Cycle in the Virtual Laboratory Infrastructure • Piotr Nowakowski (1), Eryk Ciepiela (1), Tomasz Gubała (1), Maciej Malawski (1, 2), Marian Bubak (1, 2) • (1) ACC Cyfronet AGH, ul. Nawojki 11, 30-950 Kraków, Poland • (2) Institute of Computer Science AGH, Mickiewicza 30, 30-059 Kraków, Poland • KUKDM’10, Zakopane, 18-19 March 2010

  2. Outline • Motivation • Problem definition • Scientific challenges • Iterative experimentation support • Experiment pipelines and traces • Sharing experiment data through Data Nets

  3. Motivation: e-Science Experiments, Data and Publications • Reproducible experiments, provenance in e-Science • Need to link publications with primary data (experimental data, algorithms, software, workflows, scripts) • Abundance of scientific software: jobs, workflows, services, components, scripts, experiment plans • Huge amount of scientific data consumed and produced by e-Science • Earth and life sciences, HEP, etc. • Large number of publications makes research difficult: • Computer Science: DBLP contains more than 2^20 = 1,048,576 publications, • PubMed stores ~17 million articles to date, • ACM Digital Library, ISI Web of Knowledge, Scopus, CiteSeer, arXiv, Google Scholar • Emergence of the Web 2.0-based Scientific Social Community (SSC) model

  4. Open Science & Science 2.0 • New means of scientific communication: • Wikis, blogs • Collaborative Web 2.0 technologies • New methods of conducting science: • e-Science, • In-silico experiments, • Exploratory applications • Democratization of science • Increasing role of openness

  5. Problem Definition • To construct a theoretical model facilitating open, collaborative e-experimentation, from experiment inception to publication of results, including primary scientific data • To develop a framework implementing the above model • To exploit the emerging solution in the context of existing HPC infrastructures and scientific collaboration

  6. Scientific Challenges • Theoretical: A common method for referencing primary data (experimental data, algorithms, software, workflows, scripts) as part of publications should be developed and integrated with modern e-Science infrastructures • Technological: An integrated architecture for storing, annotating, publishing, referencing and reusing primary data sources. This architecture should span existing virtual laboratory and grid computing systems

  7. Description of the Solution • Phase 1: Iterative experiment preparation • Phase 2: Experiment execution involving semantic storage of results and ensuring repeatability

  8. Experimentation Pipeline • The process of developing an experiment begins with drafting its specification • This is followed by iteratively constructing an experiment plan • Each prototype is tested by a specific research community, using tools provided by the PL-Grid virtual laboratory • Upon completion of tests, the experiment can be executed in production mode • Obtained results can be published along with the experiment plan (i.e. a set of operations which enable reenactment and validation of a given experiment)
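
The pipeline on this slide can be read as a small state machine. The sketch below is only an illustration of that reading: the stage names, the `ALLOWED` table and the `advance` function are hypothetical and are not part of the PL-Grid virtual laboratory. It assumes that a tested prototype either loops back for another iteration or moves on to production.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical stage names derived from the slide's bullet points."""
    SPECIFICATION = "specification"
    PROTOTYPE = "prototype"
    TESTING = "testing"
    PRODUCTION = "production"
    PUBLISHED = "published"


# Assumed transitions: testing may send the plan back for another prototype
# iteration, or release it for a production run.
ALLOWED = {
    Stage.SPECIFICATION: {Stage.PROTOTYPE},
    Stage.PROTOTYPE: {Stage.TESTING},
    Stage.TESTING: {Stage.PROTOTYPE, Stage.PRODUCTION},
    Stage.PRODUCTION: {Stage.PUBLISHED},
    Stage.PUBLISHED: set(),
}


def advance(current, target):
    """Return the target stage if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


# One iteration that fails testing once before reaching publication.
stage = Stage.SPECIFICATION
for nxt in (Stage.PROTOTYPE, Stage.TESTING, Stage.PROTOTYPE,
            Stage.TESTING, Stage.PRODUCTION, Stage.PUBLISHED):
    stage = advance(stage, nxt)
```

Keeping the transitions in a table makes the iterative loop between prototyping and testing explicit, and prevents a plan from being published without a production run.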

  9. Experiment Traces • An experiment trace consists of the following: • any input data provided by the experiment enactor; • all steps performed in order to transform this data into publishable scientific results (chronologically arranged); • the documentation of the experiment plan, prepared by a domain scientist (in the form of annotations and comments). • The outcome of this process will be easily manageable and readable, similarly to weblog entries • Our VL system will enable enrichment of individual data elements with provenance information, linking them to appropriate stages of the experiment
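
The three parts of a trace listed on this slide map naturally onto a small record type. The following is a minimal sketch under assumptions, not the virtual laboratory's actual data model: the class names `TraceStep` and `ExperimentTrace`, their fields, and the `render` output format are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TraceStep:
    """One transformation step, stamped with the time it was performed."""
    description: str
    timestamp: datetime
    annotation: str = ""  # comment written by the domain scientist


@dataclass
class ExperimentTrace:
    """Hypothetical container for inputs, ordered steps and plan documentation."""
    input_data: dict
    steps: list = field(default_factory=list)
    documentation: str = ""

    def record(self, description, annotation="", timestamp=None):
        """Append a step; default to the current UTC time if none is given."""
        if timestamp is None:
            timestamp = datetime.now(timezone.utc)
        self.steps.append(TraceStep(description, timestamp, annotation))

    def chronological(self):
        """Return the steps sorted by timestamp, oldest first."""
        return sorted(self.steps, key=lambda step: step.timestamp)

    def render(self):
        """Produce a weblog-like plain-text view of the trace."""
        lines = [f"Inputs: {self.input_data}", f"Plan notes: {self.documentation}"]
        for step in self.chronological():
            note = f" ({step.annotation})" if step.annotation else ""
            lines.append(f"{step.timestamp.isoformat()} {step.description}{note}")
        return "\n".join(lines)
```

Storing a timestamp with every step is what allows the trace to be replayed in order, which is the property the slide relies on for reenactment.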

  10. Sharing Primary Data: Data Nets • Data Net: unifying modern data storage mechanisms (relational databases, Grid-based file systems, Wiki pages, etc.) • A Data Net is a group of data entities linked by named relationships. Such relationships impose a structure upon the dataset and facilitate querying for entities
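
The definition above (entities linked by named relationships, queried through those relationships) can be illustrated with a tiny in-memory structure. This is a sketch only: the `DataNet` class, its method names and the example identifiers are hypothetical, and a real Data Net would delegate storage to the underlying databases, file systems or Wiki pages.

```python
from collections import defaultdict


class DataNet:
    """Hypothetical in-memory Data Net: entities plus named, directed links."""

    def __init__(self):
        self.entities = {}              # entity id -> attribute dict
        self.links = defaultdict(set)   # (source id, relation name) -> target ids

    def add_entity(self, entity_id, **attributes):
        """Register an entity together with free-form attributes."""
        self.entities[entity_id] = attributes

    def link(self, source_id, relation, target_id):
        """Connect two known entities with a named relationship."""
        if source_id not in self.entities or target_id not in self.entities:
            raise KeyError("both entities must be added before linking")
        self.links[(source_id, relation)].add(target_id)

    def related(self, source_id, relation):
        """Query: ids reachable from source_id via the given relationship."""
        return sorted(self.links.get((source_id, relation), set()))


# Entities living in different (assumed) storage back ends, joined by one net.
net = DataNet()
net.add_entity("paper-1", kind="publication", store="wiki page")
net.add_entity("result-7", kind="dataset", store="grid file system")
net.add_entity("plan-3", kind="experiment plan", store="relational database")
net.link("paper-1", "reports", "result-7")
net.link("result-7", "produced_by", "plan-3")
```

Here `net.related("paper-1", "reports")` leads from a publication to its primary data, which is the kind of link the motivation slide asks for.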

  11. References • W. Funika, D. Harezlak, D. Krol, M. Bubak; Environment for Collaborative Development and Execution of Virtual Laboratory Applications. In: M. Bubak, G.D.v. Albada, J. Dongarra, P.M.A. Sloot (Eds.), Proceedings ICCS 2008, Kraków, Poland, LNCS 5103, pp. 246-458, Springer 2008. • T. Gubala, M. Bubak, P.M.A. Sloot; Semantic Integration of Collaborative Research Environments. In: M. Cannataro (Ed.), Handbook of Research on Computational Grid Technologies for Life Sciences, Biomedicine and Healthcare, Information Science Reference, IGI Global, 2009. • M. Bubak, M. Malawski, T. Gubala, M. Kasztelnik, P. Nowakowski, D. Harezlak, T. Bartynski, J. Kocot, E. Ciepiela, W. Funika, D. Krol, B. Balis, M. Assel, A. Tirado Ramos; Virtual Laboratory for Collaborative Applications. In: M. Cannataro (Ed.), Handbook of Research on Computational Grid Technologies for Life Sciences, Biomedicine and Healthcare, chapter XXVII, pp. 531-551, IGI Global, 2009. • https://gs2.cyfronet.pl
