Federating cloud resources for building and execution of VPH applications
Marian Bubak, Department of Computer Science and Cyfronet, AGH Krakow, PL, and the VPH-Share Project team
dice.cyfronet.pl/projects/VPH-Share
www.vph-share.eu
VPH-Share (No 269978)
Co-authors
• AGH Krakow, Cyfronet: Piotr Nowakowski, Maciej Malawski, Marek Kasztelnik, Daniel Harezlak, Jan Meizner, Tomasz Bartynski, Tomasz Gubala, Bartosz Wilk, Wlodzimierz Funika
• University of Amsterdam: Spiros Koulouzis, Dmitry Vasunin, Reggie Cushing, Adam Belloum
• UCL London: David Chang, Stefan Zasada, Peter Coveney
• ATOS Research: Dario Ruiz Lopez, Rodrigo Diaz Rodriguez
• University of Sheffield: Susheel Varma
Basic functionality of cloud platform
[Diagram: cloud infrastructure for e-science with three roles. Developer: install any scientific application in the cloud. End user: access available applications and data in a secure manner. Administrator: manage cloud computing and storage resources. Other labels: Application, Managed application.]
• Install and configure each application service (which we call an Atomic Service) once, then use it multiple times in different workflows;
• Direct access to raw virtual machines is provided for developers, with a multitude of operating systems to choose from (IaaS solution);
• Install whatever you want (root access to cloud virtual machines);
• The cloud platform takes over management and instantiation of Atomic Services;
• Many instances of Atomic Services can be spawned simultaneously;
• Large-scale computations can be delegated from the PC to the cloud/HPC via a dedicated interface;
• Smart deployment: computations can be executed close to data (or the other way round).
Scientific objectives
• Investigating the applicability of the cloud computing model for complex scientific applications
• Optimization of resource allocation for scientific applications on clouds
• Resource management for services on heterogeneous resources
• Researching means of supporting urgent computing scenarios on distributed infrastructures
• Elaborating billing and accounting models
• Research on procedural and technical aspects of ensuring efficient yet secure data storage, transfer and processing
• Research on methods for component dependency management, composition and deployment
• Design of a domain-specific, consistent information representation model for the VPH-Share platform, its components and its operating procedures
Resource allocation management
• Management of the VPH-Share cloud features is done via the Cloud Facade, which provides a set of APIs for the Master Interface and any external application with the proper security credentials.
• Customized applications may directly interface the Cloud Facade via its RESTful APIs.
[Diagram labels: Admin, Developer, Scientist; VPH-Share Master Interface; external application; Cloud Facade client; VPH-Share Core Services Host; Cloud Facade (secure RESTful API); Atmosphere Management Service (AMS); Cloud Manager; Atmosphere Internal Registry (AIR); cloud stack plugins (JClouds); Development Mode; Generic Invoker; workflow management; computational cloud site on OpenStack/Nova with a head node, worker nodes and an image store (Glance); Amazon EC2; other CS.]
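A minimal sketch of how a customized application might address a RESTful facade of this kind. The slides do not give the Cloud Facade URL, routes, payload format or authentication header, so `FACADE_URL`, the `/instances` route, the JSON field and the `Bearer` header are all hypothetical. The request is only built, never sent.

```python
import json
import urllib.request

# Hypothetical base URL: the real Cloud Facade address is not given in the slides.
FACADE_URL = "https://cloudfacade.example.org/api"


def build_start_request(atomic_service_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request asking the facade to start one instance."""
    body = json.dumps({"atomic_service_id": atomic_service_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{FACADE_URL}/instances",  # assumed route
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed way of presenting the security credentials.
            "Authorization": f"Bearer {token}",
        },
    )


request = build_start_request("sensitivity-worker", token="TOKEN-FROM-AUTH-SERVICE")
print(request.get_method(), request.full_url)
```

Passing the built request to `urllib.request.urlopen` would send it. That step is left out here because the endpoint is invented.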
Cloud execution environment
• Private cloud sites deployed at CYFRONET, USFD and UNIVIE
• A survey of public IaaS cloud providers has been performed
• Performance and cost evaluation of EC2, RackSpace and SoftLayer
• A grant from Amazon has been obtained and @neuFuse services are deployed on Amazon resources
HPC execution environment
• Provides virtualized access to high performance execution environments
• Seamlessly provides access to high performance computing to workflows that require more computational power than clouds can provide
• Deploys and extends the Application Hosting Environment (AHE), which provides a set of web services to start and control applications on HPC resources
Application Hosting Environment
• Auxiliary component of the cloud platform, responsible for managing access to traditional (grid-based) high performance computing environments
• Provides a Web Service interface for clients
• Clients present a security token (obtained from the authentication service) and invoke the Web Service API of AHE to delegate computation to the grid
• On behalf of the client, AHE delegates credentials, instantiates computing tasks, polls for execution status and retrieves results
[Diagram labels: End user, Application, Workflow environment; user access layer: AHE Web Services (RESTlets), Tomcat container; resource client layer: Job Submission Service (OGSA BES / Globus GRAM), QCG Computing, RealityGrid SWS, WebDAV, GridFTP; grid resources running a Local Resource Manager (PBS, SGE, LoadLeveler etc.).]
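The "poll for execution status" step can be sketched as a simple bounded polling loop. This is an illustration only: the state names and the `get_status` callable are assumptions, not the AHE API, and a scripted sequence stands in for the remote status call.

```python
import time
from typing import Callable

# Hypothetical job states: the actual AHE state names are not given in the slides.
TERMINAL_STATES = {"FINISHED", "FAILED"}


def wait_for_job(get_status: Callable[[], str],
                 interval_s: float = 1.0,
                 max_polls: int = 100) -> str:
    """Call get_status() until it reports a terminal state or max_polls is used up."""
    for _ in range(max_polls):
        state = get_status()
        if state in TERMINAL_STATES:
            return state
        time.sleep(interval_s)
    raise TimeoutError("job did not reach a terminal state")


# Stand-in for a remote status query: returns a scripted sequence of states.
scripted = iter(["QUEUED", "RUNNING", "RUNNING", "FINISHED"])
final_state = wait_for_job(lambda: next(scripted), interval_s=0.0)
print(final_state)
```

The `max_polls` bound keeps a stuck job from blocking the workflow forever. Results would be retrieved only after a terminal state is returned.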
Data access for large binary objects
• The VPH-Share federated data storage module (LOBCDER) enables data sharing in the context of VPH-Share applications
• The module is capable of interfacing various types of storage resources and supports SWIFT cloud storage (support for Amazon S3 is under development)
• LOBCDER exposes a WebDAV interface and can be accessed by any DAV-compliant client. It can also be mounted as a component of the local client filesystem using any DAV-to-FS driver (such as davfs2).
[Diagram labels: LOBCDER host (149.156.10.143) with WebDAV servlet, REST interface, LOBCDER service backend, resource factory, resource catalogue, encryption keys, storage drivers (including SWIFT) and the SWIFT storage backend; core component host (vph.cyfronet.pl) with Auth service and ticket validation service; Data Manager Portlet (VPH-Share Master Interface component) for GUI-based access; Atomic Service Instance (10.100.x.x) with service payload (VPH-Share application component), LOBCDER mounted on the local FS (e.g. via davfs2); external host with a generic WebDAV client.]
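Because LOBCDER speaks WebDAV, a folder listing is a standard `PROPFIND` request. The sketch below only builds such a request. The base URL, the folder name and the use of a `Bearer` header for the ticket are assumptions, since the slides do not describe the URL layout or how the token is passed.

```python
import urllib.request

# Hypothetical endpoint: the slides name the host roles but not the WebDAV URL layout.
LOBCDER_DAV_URL = "https://lobcder.example.org/dav"


def build_listing_request(folder: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a WebDAV PROPFIND request for one folder level."""
    return urllib.request.Request(
        url=f"{LOBCDER_DAV_URL}/{folder.strip('/')}/",
        method="PROPFIND",
        headers={
            "Depth": "1",  # standard WebDAV header: the folder plus its direct children
            "Authorization": f"Bearer {token}",  # assumed way of passing the ticket
        },
    )


listing = build_listing_request("/patient-data/", token="TICKET")
print(listing.get_method(), listing.full_url)
```

Mounting with a DAV-to-FS driver such as davfs2 hides these requests behind ordinary file operations, which is the route the slide suggests for Atomic Service instances.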
Data reliability and integrity
• Provides a mechanism which keeps track of binary data stored in cloud infrastructure
• Monitors data availability
• Advises the cloud platform when instantiating Atomic Services
DRI Service
• A standalone application service, capable of autonomous operation
• Periodically verifies access to any datasets submitted for validation
• Capable of issuing alerts to dataset owners and system administrators in case of irregularities
[Diagram labels: LOBCDER with metadata extensions for DRI; DRI Service operations: register files, get metadata, migrate LOBs, get usage stats (etc.); validation policy; configurable validation runtime (registry-driven); runtime layer; extensible resource client layer; binary data registry (store and marshal data); end-user features (browsing, querying, direct access to data, checksumming); VPH Master Interface with data management portlet (with DRI management extensions); distributed cloud storage: OpenStack Swift, Cumulus, Amazon S3.]
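One validation pass of such a service could look roughly like the sketch below. It assumes a registry that stores a SHA-256 checksum per dataset and a `fetch` callable that returns `None` for unavailable objects. Neither detail comes from the slides, which mention checksumming but not the algorithm or the registry schema.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class RegistryEntry:
    name: str
    sha256: str  # checksum recorded when the dataset was registered (assumed schema)


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def validate(entries: Iterable[RegistryEntry],
             fetch: Callable[[str], Optional[bytes]]) -> List[str]:
    """Return the names of datasets that are unavailable or fail the checksum."""
    alerts = []
    for entry in entries:
        data = fetch(entry.name)  # None models an object that cannot be read
        if data is None or checksum(data) != entry.sha256:
            alerts.append(entry.name)
    return alerts


# In-memory stand-in for cloud storage.
store = {"scan-001": b"original bytes", "scan-002": b"tampered bytes"}
registry = [
    RegistryEntry("scan-001", checksum(b"original bytes")),
    RegistryEntry("scan-002", checksum(b"expected bytes")),
    RegistryEntry("scan-003", checksum(b"missing object")),
]
alerts = validate(registry, store.get)
print(alerts)  # ['scan-002', 'scan-003']
```

A periodic scheduler would run `validate` on the configured policy interval and forward the returned names to dataset owners and administrators.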
Security framework
• Provides a policy-driven access system for the security framework
• Provides an open-source access control solution based on fine-grained authorization policies
• Implements Policy Enforcement, Policy Decision and Policy Management
• Ensures privacy and confidentiality of eHealthcare data
• Capable of expressing eHealth requirements and constraints in security policies (compliance)
• Tailored to the requirements of public clouds
[Diagram labels: VPH clients (or any authorized user capable of presenting a valid security token): Developer, End user, Administrator, Application, Workflow management service; public internet; VPH Security Framework in front of the VPH Atomic Service Instances.]
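The split between Policy Decision and Policy Enforcement can be illustrated with a toy default-deny model. The roles, actions, resource paths and policy format below are invented for the example and do not reflect the actual VPH-Share policy language.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Policy:
    role: str
    action: str
    resource_prefix: str


# Hypothetical policies, for illustration only.
POLICIES = (
    Policy("developer", "deploy", "/atomic-services/"),
    Policy("end_user", "invoke", "/atomic-services/"),
    Policy("administrator", "manage", "/"),
)


def decide(role: str, action: str, resource: str) -> bool:
    """Policy Decision Point: permit only if some policy matches (default deny)."""
    return any(
        p.role == role and p.action == action and resource.startswith(p.resource_prefix)
        for p in POLICIES
    )


def enforce(roles: Iterable[str], action: str, resource: str,
            handler: Callable[[], str]) -> str:
    """Policy Enforcement Point: run the handler only when the decision is permit."""
    if not any(decide(role, action, resource) for role in roles):
        raise PermissionError(f"{action} on {resource} denied")
    return handler()


print(enforce(["end_user"], "invoke", "/atomic-services/oncosimulator", lambda: "invoked"))
```

In this sketch the roles are passed in directly. In a deployed system they would be extracted from the validated security token that the client presents.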
Example: sensitivity analysis application
• Problem: cardiovascular sensitivity study with 164 input parameters (e.g. vessel diameter and length)
• First analysis: 1,494,000 Monte Carlo runs (expected execution time on a PC: 14,525 hours)
• Second analysis: 5,000 runs per model parameter for each patient dataset; requires another 830,000 Monte Carlo runs per patient dataset for a total of four additional patient datasets, which results in 32,280 hours of calculation time on one personal computer
• Total: about 50,000 hours of calculation time on a single PC
• Solution: scale the application with cloud resources
VPH-Share implementation:
• Scalable workflow deployed entirely using VPH-Share tools and services
• Consists of a RabbitMQ server and a number of clients processing computational tasks in parallel, each registered as an Atomic Service
• The server and client Atomic Services are launched by a script which communicates directly with the Cloud Facade API
• Small-scale runs successfully completed, large-scale run in progress
[Diagram labels: Scientist; launcher script; secure API; Cloud Facade; Atmosphere Management Service (launches server and automatically scales workers); Atmosphere; Server AS (RabbitMQ); Worker AS (RabbitMQ); DataFluo; DataFluo Listener.]
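The run-time figures on this slide can be cross-checked with a few lines of arithmetic. The sketch assumes that every Monte Carlo run costs the same time as in the first analysis. The worker-count estimate at the end assumes perfect linear scaling, which is an idealization and not a measured result.

```python
# Figures taken from the slide.
RUNS_FIRST = 1_494_000
HOURS_FIRST = 14_525
RUNS_PER_DATASET = 830_000
EXTRA_DATASETS = 4

# Per-run cost implied by the first analysis.
seconds_per_run = HOURS_FIRST * 3600 / RUNS_FIRST

# Second analysis, assuming the same per-run cost.
runs_second = RUNS_PER_DATASET * EXTRA_DATASETS
hours_second = runs_second * seconds_per_run / 3600
total_hours = HOURS_FIRST + hours_second


def ideal_wall_clock_hours(workers: int) -> float:
    """Wall-clock time under perfect linear scaling (idealized assumption)."""
    return total_hours / workers


print(f"seconds per run: {seconds_per_run:.1f}")
print(f"second analysis: {runs_second:,} runs, {hours_second:,.0f} h")
print(f"total on one PC: {total_hours:,.0f} h")
print(f"100 workers, ideal: {ideal_wall_clock_hours(100):,.0f} h")
```

The implied cost is 35 seconds per run. The second analysis comes to roughly 32,280 hours, matching the slide, and the sum of both analyses is about 46,800 hours, which the slide rounds up to 50,000.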
Example: p-medicine OncoSimulator
• Deployment of the OncoSimulator Tool on VPH-Share resources:
• Uses a custom Atomic Service as the computational backend
• Features integration of data storage resources
• OncoSimulator AS also registered in the VPH-Share metadata store
[Diagram labels: P-Medicine users; P-Medicine Portal with OncoSimulator Submission Form and visualization window; VITRALL Visualization Service; VPH-Share Computational Cloud Platform with Cloud Facade, Atmosphere Management Service (AMS), AIR registry, Cloud HN and Cloud WN hosting OncoSimulator ASIs; actions: launch Atomic Services, store output, mount LOBCDER and select results for storage in the P-Medicine Data Cloud; LOBCDER Storage Federation; P-Medicine Data Cloud; storage resources.]
Selected publications
• R. Cushing, S. Koulouzis, A. Belloum and M. Bubak: Applying workflow as a service paradigm to application farming, Concurrency and Computation: Practice and Experience, DOI: 10.1002/cpe.3073, 2013
• R. Cushing, G. Putra, A. Belloum, S. Koulouzis, M. Bubak, C. de Laat: Distributed Computing on an Ensemble of Browsers, IEEE Internet Computing, Sept.-Oct. 2013, vol. 17 no. 5, pp. 54-61
• E. Deelman, G. Juve, M. Malawski, J. Nabrzyski: Hosted Science: Managing Computational Workflows in the Cloud, Parallel Processing Letters Vol. 23, No. 02, 2013
• P. Nowakowski, T. Bartynski, T. Gubala, D. Harezlak, M. Kasztelnik, M. Malawski, J. Meizner, M. Bubak: Cloud Platform for Medical Applications, eScience 2012
• S. Koulouzis, R. Cushing, A. Belloum and M. Bubak: Cloud Federation for Sharing Scientific Data, eScience 2012
• P. Nowakowski, T. Bartyński, T. Gubała, D. Harężlak, M. Kasztelnik, J. Meizner, M. Bubak: Managing Cloud Resources for Medical Applications, Cracow Grid Workshop 2012, Kraków, Poland, 22 October 2012
• M. Bubak, M. Kasztelnik, M. Malawski, J. Meizner, P. Nowakowski, and S. Varma: Evaluation of Cloud Providers for VPH Applications, CCGrid 2013 (2013)
• M. Malawski, K. Figiela, J. Nabrzyski: Cost Minimization for Computational Applications on Hybrid Cloud Infrastructures, FGCS 2013
M.Sc. Theses:
• Bartosz Wilk: Installation of Complex e-Science Applications on Heterogeneous Cloud Infrastructures, AGH University of Science and Technology, Kraków, Poland (August 2012), PTI award
• Krzysztof Styrc: Managing Data Reliability and Integrity in Federated Cloud Storage, AGH University of Science and Technology, Kraków, Poland (September 2013)
Exploitation
• Collaboration with the core teams of the VPH community
• Transfer of the elaborated methods and tools to PL-Grid
• Solutions and software artefacts used as teaching material at the Department of Computer Science AGH (Large scale systems)
• Already in use at the Collegium Medicum (prof. I. Roterman)
• New areas of usage, e.g. city science, computing on demand for early warning systems, a problem solving environment
• …
For more information…
• www.vph-share.eu: your one-stop entry to all VPH-Share functionality. You can log in with your BioMedTown account (available to all members of the VPH NoE)
• dice.cyfronet.pl: documentation, publications, links to manuals, videos, etc.
The VPH-Share Cyfronet Team is very grateful to the Management and staff members of the ACC Cyfronet AGH for continued support and encouragement.