Multiscale Applications on European e-Infrastructures. Marian Bubak, AGH Krakow, PL and University of Amsterdam, NL, on behalf of the MAPPER Consortium, http://www.mapper-project.eu/. Chalmers e-Science Initiative Seminar, 2 Dec 2011.
About the speaker • 1. AGH University of Science and Technology (1919): 15 faculties, 36000 students, 4000 employees, http://www.agh.edu.pl/en • Academic Computer Centre CYFRONET AGH (1973): 120 employees, http://www.cyfronet.pl/en/ • Faculty of Electrical Engineering, Automatics, Computer Science and Electronics (1946): 4000 students, 400 employees, http://www.eaie.agh.edu.pl/ (plus 14 other faculties) • Department of Computer Science AGH (1980): 800 students, 70 employees, http://www.ki.agh.edu.pl/uk/index.htm • Distributed Computing Environments (DICE) Team, http://dice.cyfronet.pl • 2. University of Amsterdam, Institute for Informatics, Computational Science, http://www.science.uva.nl/~gvlam/wsvlam
DICE team (http://dice.cyfronet.pl) • Main research interests: • investigation of methods for building complex scientific collaborative applications and large-scale distributed computing infrastructures • elaboration of environments and tools for e-Science • development of a knowledge-based approach to services, components, and their semantic composition and integration
Plan • Multiscale applications • Multiscale modeling • Objectives of the MAPPER project • Programming and Execution Tools • MAPPER infrastructure • ISR application scenario • Summary
Vision • Distributed Multiscale Computing... • Strongly science driven, application pull • ...on existing and emerging European e-Infrastructures, ... • ... and exploiting as much as possible services and software developed in earlier (EU-funded) projects. • Strongly technology driven, technology push
Nature is multiscale • Natural processes are multiscale • 1 H2O molecule • A large collection of H2O molecules, forming H-bonds • A fluid called water, and, in solid form, ice.
Multiscale modeling • Scale Separation Map (figure: spatial scale on one axis, from Δx to L; temporal scale on the other, from Δt to T) • Nature acts on all the scales • We set the scales • And then decompose the multiscale system into single-scale subsystems • And their mutual coupling
From a Multiscale System to many Single-scale Systems (figure: scale separation map, spatial scale from Δx to L, temporal scale from Δt to T) • Identify the relevant scales • Design specific models which solve each scale • Couple the subsystems using a coupling method
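The idea of placing single-scale submodels on a scale separation map can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Scale` and `Submodel` classes, the `relation` helper and all numbers are hypothetical and are not part of any MAPPER tool. Each submodel occupies a range on each axis, from its resolution (Δx or Δt) to its extent (L or T), and two submodels are scale-separated on an axis when their ranges do not overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scale:
    """A range on one axis of the scale separation map (hypothetical helper)."""
    resolution: float  # smallest resolved step (Dx or Dt)
    extent: float      # total size covered (L or T)

    def overlaps(self, other: "Scale") -> bool:
        # Two ranges overlap unless one ends before the other begins.
        return self.resolution <= other.extent and other.resolution <= self.extent


@dataclass(frozen=True)
class Submodel:
    name: str
    space: Scale  # metres
    time: Scale   # seconds


def relation(a: Submodel, b: Submodel) -> str:
    """Classify how two single-scale submodels sit relative to each other."""
    in_space = a.space.overlaps(b.space)
    in_time = a.time.overlaps(b.time)
    if in_space and in_time:
        return "overlap in space and time"
    if in_space:
        return "temporal scale separation"
    if in_time:
        return "spatial scale separation"
    return "separated in space and time"


# Illustrative numbers only, not taken from any MAPPER application.
flow = Submodel("flow", Scale(1e-5, 1e-3), Scale(1e-6, 1.0))
growth = Submodel("growth", Scale(1e-5, 1e-3), Scale(3600.0, 2.6e6))
print(relation(flow, growth))
```

The relation between two submodels is what later suggests a coupling method, which is why the map is drawn before any coupling code is written.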
Why multiscale models? • There is simply no hope to computationally track complex natural processes at their finest spatio-temporal scales. • Even with the ongoing growth in computational power.
Multiscale computing • Inherently hybrid models are best serviced by different types of computing environments • When simulated in three dimensions, they usually require large scale computing capabilities. • Such large scale hybrid models require a distributed computing ecosystem, where parts of the multiscale model are executed on the most appropriate computing resource. • Distributed Multiscale Computing
Two paradigms (figures: two scale separation maps, each with spatial scale from Δx to L and temporal scale from Δt to T) • Loosely coupled: one single-scale model provides input to another; single-scale models are executed once; workflows • Tightly coupled: single-scale models call each other in an iterative loop; single-scale models may execute many times; dedicated coupling libraries are needed
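The difference between the two paradigms can be shown with a toy sketch. The two "models" below are hypothetical one-line functions chosen only to make the control flow visible; they do not represent any MAPPER application or coupling library.

```python
def coarse_model(state: float) -> float:
    """Hypothetical coarse-scale model: one update of the macroscopic state."""
    return 0.5 * state + 1.0


def fine_model(boundary: float) -> float:
    """Hypothetical fine-scale model: returns a correction for the coarse state."""
    return 0.1 * boundary


def loosely_coupled(x0: float) -> float:
    """Workflow style: each model runs exactly once, in sequence."""
    return fine_model(coarse_model(x0))


def tightly_coupled(x0: float, iterations: int) -> float:
    """Iterative style: the models exchange data inside a loop."""
    x = x0
    for _ in range(iterations):
        correction = fine_model(x)        # fine scale reads the coarse state
        x = coarse_model(x + correction)  # coarse scale consumes the correction
    return x


print(loosely_coupled(2.0))
print(tightly_coupled(2.0, iterations=10))
```

In the loosely coupled case a workflow engine is enough, because data flows one way. In the tightly coupled case both models stay alive and exchange data repeatedly, which is where a dedicated coupling library comes in.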
MAPPER: Multiscale APPlications on European e-infRastructures • Poznan Supercomputing and Networking Centre • Max-Planck Gesellschaft zur Foerderung der Wissenschaften E.V. • Chalmers Tekniska Högskola • University of Amsterdam • Akademia Gorniczo-Hutnicza im. Stanislawa Staszica w Krakowie • University College London • University of Geneva • University of Ulster • Ludwig-Maximilians-Universität München
Motivation: user needs • Fusion • VPH • Engineering • Computational Biology • Material Science • Distributed Multiscale Computing Needs
Applications • 7 applications from 5 scientific domains: fusion, virtual physiological human, hydrology, computational biology, nano material science • ... brought under a common generic multiscale computing framework: SSM, coupling topology, (x)MML, task graph, scheduling
Ambition • Develop computational strategies, software and services for distributed multiscale simulations across disciplines, exploiting existing and evolving European e-infrastructure • Deploy a computational science infrastructure • Deliver high quality components aiming at large-scale, heterogeneous, high performance multi-disciplinary multiscale computing • Advance the state of the art in high performance computing on e-infrastructures: enable distributed execution of multiscale models across e-Infrastructures
High level tools: objectives • Design and implement an environment for composing multiscale simulations from single scale models • encapsulated as scientific software components • distributed in various e-infrastructures • supporting the loosely coupled and tightly coupled paradigms • Support composition of simulation models: • using a scripting approach • by reusable “in-silico” experiments • Allow interaction between software components from different e-Infrastructures in a hybrid way • Measure efficiency of the tools
Requirements analysis • Focus on multiscale applications that are described as a set of connected, but independent single scale modules and mappers (converters) • Support describing such applications in a uniform (standardized) way to: • analyze application behavior • support switching between different versions of the modules with the same scale and functionality • support building different multiscale applications from the same modules (reusability) • Support computationally intensive simulation modules • requiring HPC or Grid resources • often implemented as parallel programs • Support tight (with loop), loose (without loop) and hybrid (both) connection modes
Overview of tools • MAPPER Memory (MaMe) - a semantics-aware persistence store to record metadata about models and scales • Multiscale Application Designer (MAD) - visual composition tool transforming high level MML description into executable experiment • GridSpace Experiment Workbench (EW) - execution and result management of experiments on e-infrastructures via interoperability layers (AHE, QCG)
Multiscale modeling language • Uniformly describes multiscale models and their computational implementation on an abstract level • Two representations: graphical (gMML), textual (xMML) • Includes description of • scale submodules • scaleless submodules (so-called mappers and filters) • ports and their operators (for indicating type of connections between modules) • coupling topology • implementation
Submodel execution loop in pseudocode:
    f := finit /* initialization */
    t := 0
    while not EC(f, t):
        Oi(f, t) /* intermediate observation */
        f := S(f, t) /* solving step */
        t += theta(f)
    end
    Of(f, t) /* final observation */
Corresponding symbols in gMML: finit, Oi, S, Of
Example for the In-stent Restenosis application • IC – initial conditions • DD – drug diffusion • BF – blood flow • SMC – smooth muscle cells
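The submodel execution loop on this slide translates almost line for line into runnable Python. The sketch below is an illustration, not MML or MUSCLE code: `run_submodel` and its parameter names are hypothetical, and the example submodel (explicit Euler for df/dt = -f with a fixed step of 0.25) is chosen only because it is easy to check by hand.

```python
from typing import Callable, List, Tuple


def run_submodel(
    f_init: float,
    end_condition: Callable[[float, float], bool],          # EC
    observe_intermediate: Callable[[float, float], None],   # Oi
    solve: Callable[[float, float], float],                 # S
    theta: Callable[[float], float],                        # time step
    observe_final: Callable[[float, float], Tuple[float, float]],  # Of
) -> Tuple[float, float]:
    """Generic submodel execution loop, mirroring the pseudocode operators."""
    f = f_init                       # initialization
    t = 0.0
    while not end_condition(f, t):
        observe_intermediate(f, t)   # intermediate observation
        f = solve(f, t)              # solving step
        t += theta(f)                # advance time
    return observe_final(f, t)       # final observation


# Example: explicit Euler for df/dt = -f with a fixed step of 0.25.
DT = 0.25
observed: List[Tuple[float, float]] = []

result = run_submodel(
    f_init=1.0,
    end_condition=lambda f, t: t >= 1.0,
    observe_intermediate=lambda f, t: observed.append((t, f)),
    solve=lambda f, t: f - DT * f,
    theta=lambda f: DT,
    observe_final=lambda f, t: (f, t),
)
print(result, len(observed))
```

In a coupled setting the two observation operators would send data to other submodels through ports, and `f_init` and `solve` would receive data from them; here they only record and return values.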
jMML library • Supports xMML analysis: • Detection of initial models • Constructing coupling topology (gMML) • Generating task graph • Deadlock detection • Generating Scale Separation Map • Supports Graphviz or PDF formats
MaMe - MAPPER Memory • Provides a rich, semantics-aware persistence store for other components to record information • Based on a well-defined domain model containing MAPPER metadata defined in MML • Other MAPPER tools store, publish and reuse such metadata throughout the entire Project and its Consortium • Provides a dedicated web interface for human users to browse and curate metadata
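To make the Graphviz output concrete, here is a minimal sketch of how a coupling topology could be serialized to the DOT language. This is not the jMML API: the `to_dot` function, the submodel names and the port labels are all hypothetical, and only standard DOT syntax is assumed.

```python
from typing import Iterable, Tuple

Edge = Tuple[str, str, str]  # (source submodel, target submodel, label)


def to_dot(edges: Iterable[Edge], name: str = "coupling_topology") -> str:
    """Render a coupling topology as a Graphviz DOT digraph (hypothetical helper)."""
    edges = list(edges)
    nodes = sorted({node for src, dst, _ in edges for node in (src, dst)})
    lines = [f"digraph {name} {{"]
    for node in nodes:
        lines.append(f'  "{node}";')
    for src, dst, label in edges:
        lines.append(f'  "{src}" -> "{dst}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


# Placeholder topology: two submodels that exchange data in both directions.
dot_text = to_dot([
    ("macro", "micro", "Oi -> finit"),
    ("micro", "macro", "Of -> S"),
])
print(dot_text)
```

The resulting text can be passed to the standard Graphviz `dot` command to produce a PDF or image.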
MAD: Application Designer • User friendly visual tool for composing multiscale applications • Supports importing application structure from xMML (section A and B) • Supports composing multiscale applications in gMML (section B) with additional graphical specific information - layout, color etc. (section C) • Transforms gMML into xMML • Performs MML analysis to identify its loosely and tightly coupled parts • Using information from MaMe and GridSpace EW, transforms gMML into executable formats with information needed for actual execution (section D) : • GridSpace Experiment • MUSCLE connection file (cxa.rb)
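One plausible way to identify loosely and tightly coupled parts is to look for cycles in the coupling graph: submodels that can reach each other form a tightly coupled group, and everything else is loosely coupled. The sketch below assumes this interpretation; it is not MAD's actual algorithm, and `coupled_parts` and the node names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def reachable(graph: Dict[str, Set[str]], start: str) -> Set[str]:
    """All nodes reachable from start by following at least one edge."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def coupled_parts(edges: Iterable[Tuple[str, str]]) -> List[Tuple[str, List[str]]]:
    """Group submodels into tightly coupled (cyclic) and loosely coupled parts."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    nodes: Set[str] = set()
    for src, dst in edges:
        graph[src].add(dst)
        nodes.update((src, dst))
    reach = {n: reachable(graph, n) for n in nodes}

    parts: List[Tuple[str, List[str]]] = []
    assigned: Set[str] = set()
    for n in sorted(nodes):
        if n in assigned:
            continue
        # Nodes that n can reach and that can reach n back share a cycle with n.
        group = {n} | {m for m in reach[n] if n in reach[m]}
        assigned |= group
        kind = "tight" if len(group) > 1 or n in reach[n] else "loose"
        parts.append((kind, sorted(group)))
    return parts


# Hypothetical application: an initializer, a two-model loop, a post-processor.
parts = coupled_parts([("init", "A"), ("A", "B"), ("B", "A"), ("B", "post")])
print(parts)
```

A tool could then hand each tight group to a coupling library as one job and schedule the loose parts as ordinary workflow steps around it.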
GridSpace Experiment Workbench • Supports execution of experiments on e-infrastructures via interoperability layers • Result management support • Uses the Interpreter-Executor model of computation: • Interpreter - a software package available on the infrastructure, usually programmatically accessible through a DSL or scripting language, e.g. MUSCLE, LAMMPS, CPMD • Executor - a common entity for hosts, clusters, grid brokers etc. capable of running Interpreters • Allows easy configuration of available executors and interpreters (figure: transforming example MML into an executable GS Experiment)
User environment • Registration of MML metadata: submodules and scales • Application composition: from MML to executable experiment • Execution of experiment using interoperability layer on e-infrastructure • Result management
E-infrastructure • Joint task force between MAPPER, EGI and PRACE • Collaborate with EGI and PRACE to introduce new capabilities and policies onto e-Infrastructures • Deliver new application tools, problem solving environments and services to meet end users' needs • Work closely with various end-user communities (involved directly in MAPPER) to perform distributed multiscale simulations and complex experiments (figure: timeline from 2011 to 2013 across Tier-0, Tier-1 and Tier-2 resources; labelled items: MAPPER Taskforce, taskforce established, MoU signed, 1st evaluation, 1st EU review, selected two apps on MAPPER e-Infrastructure (EGI and PRACE resources))
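The Interpreter-Executor model can be illustrated with a small sketch. The classes below are hypothetical and do not reproduce the GridSpace API: the command strings are placeholders, and `run` only builds a description of what would be launched instead of contacting any resource.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Interpreter:
    """A software package that can run a script (names are placeholders)."""
    name: str
    command: str  # hypothetical command used to start the package


@dataclass
class Executor:
    """A host, cluster or broker capable of running installed interpreters."""
    name: str
    interpreters: Dict[str, Interpreter] = field(default_factory=dict)

    def install(self, interpreter: Interpreter) -> None:
        self.interpreters[interpreter.name] = interpreter

    def run(self, interpreter_name: str, script: str) -> str:
        # Only describe the launch; a real executor would submit a job here.
        if interpreter_name not in self.interpreters:
            raise LookupError(f"{interpreter_name} is not available on {self.name}")
        interpreter = self.interpreters[interpreter_name]
        return f"[{self.name}] {interpreter.command} {script}"


cluster = Executor("cluster")
cluster.install(Interpreter("MUSCLE", "muscle"))
launch = cluster.run("MUSCLE", "cxa.rb")
print(launch)
```

Separating the two roles means an experiment names only the interpreter it needs, and the choice of executor can change without touching the experiment itself.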
MAPPER e-infrastructure • MAPPER pre-production infrastructure • Cyfronet, LMU/LRZ, PSNC, TASK, UCL, WCSS • Environment for developing, testing and deployment of MAPPER components • Central services • GridSpace, MAD, MaMe, monitoring, web-site • EGI-MAPPER-PRACE task force • SARA Huygens HPC system
2 scenarios in operation • loosely coupled DMC • tightly coupled DMC
In-stent restenosis • Coronary heart disease (CHD) remains the most common cause of death in Europe, being responsible for approximately 1.92 million deaths each year* • A stenosis is an abnormal narrowing of a blood vessel • Solution: a stent placed with a balloon angioplasty • Possible response, in 10% of the cases: abnormal tissue growth in the form of in-stent restenosis • Multiscale, multiphysics phenomenon involving physics, biology, chemistry, and medicine
In-stent restenosis model • A 3D model of in-stent restenosis (ISR3D) • why does it occur, when does it stop? • Ultimate goal: • Facilitate stent design • Effect of drug eluting stents • Models: • cells in the vessel wall; • blood in the lumen; • drug diffusion; and • most importantly their interaction • 3D model is computationally very expensive • 2D model has published results* *H. Tahir, A. G. Hoekstra et al., Interface Focus, 1(3), 365–373
Scale separation map • Four main submodels • Same spatial scale • Different temporal scale
Coupling topology • Model • is tightly coupled (excluding initial condition) • has a fixed number of synchronization points • has one instance per submodel
MML of ISR3D (figure; legend: submodel, mapper, start, stop; edge heads/tails: initialization, intermediate, finalization)
Demo: MAPPER Memory (MaMe) • Semantics-aware persistence store • Records MML-based metadata about models and scales • Supports exchanging and reusing MML metadata for • other MAPPER tools via REST interface • human users via dedicated Web interface (figure: ports and their operators)
Demo: GridSpace EW for ISR3D • Obtains the MAD-generated experiment containing a configuration file for the MUSCLE interpreter • Provides two executors for the MUSCLE interpreter • SSH on Polish NGI UI – cluster execution • QCG – multisite execution • Uses the QCG executor for running the MUSCLE interpreter on QCG and staging input/output files
# declare kernels which can be launched in the CxA
cxa.add_kernel('submodel_instance1', 'my.submodelA')
cxa.add_kernel('submodel_instance2', 'my.submodelB')
…
# configure connection scheme of the CxA
cs = cxa.cs
# configure unidirectional connection between kernels
cs.attach 'submodel_instance1' => 'submodel_instance2' do
  tie 'portA', 'portB'
  …..
end
…
Computing • ISR3D is implemented using the multiscale coupling library and environment (MUSCLE) • Contains Java, Fortran and C++ submodels • MUSCLE provides uniform communication between tightly coupled submodels • MUSCLE can be run on a laptop, a cluster or multiple sites.
QCG Role • Provides an interoperability layer between PRACE and EGI infrastructures • Co-allocates heterogeneous resources according to the requirements of a MUSCLE application using an advance reservation mechanism • Synchronizes the execution of application kernels in multi-cluster environment • Efficiently executes and manages tasks on EGI and UCL resources • Manages data transfers
ISR3D - conclusion • Before MAPPER, ISR2D ran fast enough, but ISR3D took too much execution time and a lot of time to couple • Now, ISR3D runs in a distributed fashion using the MAPPER tools and middleware • To get scientific results we will have to, and can, run many batch jobs • Done in the MeDDiCa EU project • Involves 1000s of runs • Also, the code can be parallelized to run faster
Summary • Elaboration of a concept of an environment supporting developers and users of multiscale applications for grid and cloud infrastructures • Design of the formalism for describing connections in multiscale simulations • Enabling access to e-infrastructures • Validation of the formalism against the structure of real applications by using tools • Proof of concept for transforming a high level formal description to actual execution using e-infrastructures
More about MAPPER • http://www.mapper-project.eu/ • http://dice.cyfronet.pl/