Research Area Background

Presentation Transcript


  1. Research Area Background • Area: systems – applied computer science • Question: what to do? • Dr. Dan Reed, Vice President, Microsoft, in his keynote talk “Clouds: from Both Sides New” in Washington in 2011, stated (my interpretation): • University researchers should find a research niche because they do not have enough resources (human and financial) to compete against the mainstream research carried out by big companies SaaS Clouds Supporting HPC Biology Sciences - CMU CV July 2013

  2. Toward SaaS Clouds Supporting HPC Biology and Medicine Applications • Andrzej Goscinski • Service and Cloud Computing Lab • Senior Members: A. Wong, P. Church, M. Brock

  3. Biology and Medicine Needs • Biology and medicine specialists collect a lot of data • Many of them use only their workstations, desktops and even laptops to carry out data analysis • Many of them are not familiar with HPC • Many biology and medicine specialists do not program well and do not have system administration skills (and arguably should not need them) • Biology and medicine specialists would like to use computers to get analysis results quickly, without the burden of computing “jargon”

  4. Lab (Current) Research Aim • To study and develop a technology that simplifies the deployment, exposure, access and customization of HPC science applications in SaaS clouds • This technology forms the basis of research environments that enable science specialists to use HPC resources in clouds to run their computationally demanding software • easily • on demand • at reasonable cost, for the discovery of new and significant discipline knowledge

  5. The NIST Definition of Cloud Computing (NIST Special Publication 800-145, P. Mell and T. Grance, Sept 2011) • Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction • This cloud model is composed of five essential characteristics, three service models, and four deployment models

  6. NIST: Service Models • Infrastructure as a Service (IaaS) • The delivery of hardware resources as a service • Users are granted access to cloud infrastructure through virtual machines • Platform as a Service (PaaS) • Builds services on IaaS clouds to support cloud application deployment • Most cloud platforms consist of a high-level language and a well-defined Application Programming Interface • Software as a Service (SaaS) • Exposes applications designed to run on a cloud as services • Eliminates the need to install or run applications on the customer’s computer and is often cheaper than buying a full software licence

  7. NIST: Deployment Models • Public Clouds • Accessed by the general public • Allow users to rent resources such as computational time or storage as necessary • Private Clouds • Used exclusively by an organisation • Allow a specific service level agreement (SLA) to be made to ensure availability and security • Community Clouds • Used by a group of users that have shared concerns • Allow a shared mission with specific security and policy requirements • Hybrid Clouds • Combine cloud resources from two or more deployment models to accomplish a user’s goal

  8. NIST: Essential Characteristics • On-demand self-service • Broad network access • Resource pooling • Rapid elasticity • Measured service

  9. Characteristics of Clouds that Attract Business • Clients pay only for what they consume • Rather than spending money on buying, managing and upgrading servers, business administrators concentrate on the management of their applications • The required service is always there: availability is very high, which leads to short times from submission to completion of execution • Cloud computing provides opportunities to small businesses by giving them access to world-class systems that would otherwise be unaffordable • On the other hand, even small companies can export their specialized services to clients

  10. When Using Clouds, Additional Steps Must Be Carried Out Depending on the Service Model • IaaS - involves construction of a virtual cluster, and compilation and deployment of distributed software • These are system administrators’ jobs • PaaS - aimed at developers; provides users with a development environment and automates the deployment of resources • Limited access to development tools and languages • SaaS - users are able to access HPC applications through graphical interfaces; however, users are reliant on what cloud service providers have made available • Such software may have expensive licenses or not be readily available

  11. Cloud Trends {ChangeWave Investing Weekly Update (5/21/2013)} • Over the past 2.5 years the percentage of companies that say they are currently using public cloud computing services has climbed from 14% to 40%.

  12. Cloud Trends {ChangeWave Investing Weekly Update (5/21/2013)} • The results of the latest ChangeWave cloud survey point to continued growth for public, private and hybrid cloud computing • Within public cloud computing, software as a service (SaaS) remains the area with the fastest growth rate • When asked why their companies do not use cloud computing, respondents most often cite Security Concerns (41%), while 15% cite the Complexity of Integrating with Existing IT Infrastructure

  13. Cloud Trends {ChangeWave Investing Weekly Update (5/21/2013)}

  14. HPC vs. HPC Clouds vs. Discipline Specialists • Problem 1: HPC requires • powerful and expensive computational and data storage hardware • advanced middleware • sophisticated discipline-oriented applications • knowledgeable programmers and system managers • Clouds have been created for business ($$$), not to earn money from HPC ($) • Most HPC clouds are based on IaaS clouds enhanced by additional hardware and middleware to support HPC • Problem 2: the cost and time overheads in learning how to • prepare an HPC cloud and • properly install and configure applications in the underlying HPC facilities • Conclusion: if discipline specialists want to use HPC clouds for scientific discovery, they must also become system administrators and good programmers

  15. Clouds and HPC • A response to Problems 1 & 2 faced by discipline specialists lies in cloud computing • These days clouds can support some HPC workloads • Clouds are oriented to support High Scalability Computing (HSC) rather than HPC • Note: with the improvement of communication performance, clouds are becoming a major tool for HPC • Question: what kind of HPC applications could be executed on a cloud?

  16. HPC Clouds vs. Applications

  17. HPC Clouds vs. Discipline Specialists • Most HPC clouds are based on IaaS clouds enhanced by additional hardware and middleware to support HPC • Problem 2 again: the cost and time overheads in learning how to prepare an HPC cloud and its applications remain a problem • HPC cloud users are • presented with a set of virtual and physical servers • required to put the servers together to form the HPC facilities on which to run their software applications • The software applications must be properly installed and configured in the underlying HPC facilities • Conclusion: if discipline specialists want to use HPC clouds for scientific discovery, they must also become • system administrators and • good programmers

  18. Web-based Software Tools/Packages • In many areas of science, discipline specialists benefit from Web-based software tools • Software tools are easy to use and attractive to specialists through their discipline-oriented interfaces • scientific workflow systems (Galaxy) • Web portals for accessing grid resources (P-GRADE) • Web portals for science gateways such as HubZero • Observation: specialists appreciate easy-to-use, Web-based, discipline-oriented interfaces! Plenary "Cloud in Action" CLOUD 2013 panel

  19. HPC Applications Exposed as Services in SaaS Clouds • Use of clouds (ChangeWave Research) • Conclusion: discipline specialists could benefit most from the execution of their HPC applications if they are • exposed as services in SaaS clouds and • accessed through discipline (tool-based) interfaces

  20. Merging SaaS Cloud Services and Web Tools • Question: are we on the right track? • Yes, we are! • Giving users faster turnaround times on their experiments by using clouds is one of the major issues that a new version of the AGAVE software tool promises to address • AGAVE is a well-known and widely used Web-based software tool • AGAVE delivers science-as-a-service • Data is processed using analytics provided as SaaS services

  21. Direct Research Questions • How can scientists be enabled to deploy software applications in clouds? • How can clouds be made easy for discipline researchers to use when running HPC applications? • How can the customization and reuse of HPC applications in clouds be supported? • These three questions form the current research scope of our Lab • Our research aim again: develop a technology that • automatically creates a virtual machine (VM) • exposes an application as a service • deploys it on the VM • generates an easy-to-use interface – a Web form
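The last step of the research aim above, generating a Web form for an application service, can be illustrated with a minimal sketch. It assumes an application's inputs are described by a simple parameter list; the `Param` class, the `generate_web_form` function and the `/run` URL are hypothetical names chosen for illustration and are not part of the Lab's technology.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Param:
    """One input of a hypothetical HPC application service."""
    name: str            # form field name
    label: str           # text shown to the discipline specialist
    kind: str = "text"   # HTML input type, e.g. "text" or "number"
    default: str = ""    # pre-filled value


def generate_web_form(service_name, action_url, params):
    """Render a plain HTML form with one labelled input per parameter."""
    lines = [
        f'<form method="post" action="{escape(action_url)}">',
        f"  <h2>{escape(service_name)}</h2>",
    ]
    for p in params:
        name = escape(p.name)
        lines.append(f'  <label for="{name}">{escape(p.label)}</label>')
        lines.append(
            f'  <input type="{escape(p.kind)}" id="{name}" name="{name}" '
            f'value="{escape(p.default)}">'
        )
    lines.append('  <input type="submit" value="Run">')
    lines.append("</form>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative parameters only; a real service description would be richer.
    print(generate_web_form(
        "mpiBlast",
        "/run",
        [Param("cluster", "Cluster name"),
         Param("processes", "Number of processes", "number", "8")],
    ))
```

The point of the sketch is that the form is derived entirely from the parameter description, so a specialist never edits HTML or sees the command line behind it.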

  22. The Lab’s Initial Research • Web services, which are used to develop services, are stateless • Our response: stateful Web services • Service discovery and selection is a major barrier to the application of cloud computing (only simple catalogues are in use) • Our response: a dynamic broker based on attributed names • The application of HPC is unaffordable to small and medium research groups and institutions • Our response: the CaaS framework, which exposes a cluster as a service and makes it available within private and public clouds

  23. From IaaS/PaaS to SaaS with a Broker (M. Brock)

  24. From IaaS/PaaS to SaaS with a Broker (M. Brock) • The RVWS Framework • Allows the current activity and characteristics of resources to be exposed as services via WSDL documents • A compatible extension to existing Web standards • The Dynamic Broker • A discovery service that uses stateful WSDL documents • CaaS Infrastructure • Web service-based middleware for easy publishing, discovery and use of clusters • HPCynergy • A prototype private cloud built using CaaS for easy access to HPC resources and applications • HPC Hybrid Deakin (H2D) Cloud • Able to discover suitable resources from both public and private clouds to execute single applications that are too large for a single cluster • All tasks, such as parameter modification, data file splitting and monitoring of multiple applications, are handled on behalf of the user
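The idea of a dynamic broker that selects resources by their current state and characteristics can be sketched as attribute matching. This is only an illustration under stated assumptions: the `ResourceRecord` class, the attribute names (`free_nodes`, `software`) and the matching rule (numeric requirements mean "at least", everything else must be equal) are hypothetical and do not reproduce RVWS or the Dynamic Broker.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceRecord:
    """A published resource with its current (stateful) attributes."""
    name: str
    attributes: dict = field(default_factory=dict)


def matches(record, required):
    """Return True if the record satisfies every required attribute."""
    for key, wanted in required.items():
        have = record.attributes.get(key)
        if have is None:
            return False
        is_number = isinstance(wanted, (int, float)) and not isinstance(wanted, bool)
        if is_number:
            # Numeric requirement: the resource must offer at least this much.
            if not isinstance(have, (int, float)) or have < wanted:
                return False
        elif have != wanted:
            # Any other requirement: exact match.
            return False
    return True


def discover(records, required):
    """Return the names of all resources that satisfy the request."""
    return [r.name for r in records if matches(r, required)]


if __name__ == "__main__":
    clusters = [
        ResourceRecord("cluster-a", {"free_nodes": 4, "software": "mpiBlast"}),
        ResourceRecord("cluster-b", {"free_nodes": 16, "software": "mpiBlast"}),
    ]
    print(discover(clusters, {"free_nodes": 8, "software": "mpiBlast"}))
```

Because the attributes describe current state rather than a static catalogue entry, the same request can return different resources as load changes.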

  25. SaaS Cloud Supporting HPC Science Applications • [Figure: framework diagram showing the user, Web form access, service discovery (yes/no) against the HPC Application Service Registry, HPC application service publishing, service and Web form generation, HPC application deployment, the virtual machine image, and HPC resources in the IaaS cloud] • Three steps: • Deployment of HPC applications on IaaS clouds • Exposure of HPC application services • Access of HPC application services • Transforming complicated HPC applications into easy-to-use SaaS cloud services

  26. Using the Framework • To conduct scientific discovery by executing HPC applications on clouds, the discipline researcher contacts the HPC Application Service Registry • Scenario 1: the HPC application service of the researcher’s interest is found • The researcher selects the cloud service • Resources are selected automatically, and the application deployment service sets up and configures the cloud • The automated interface generation service constructs a user-friendly, discipline-specific interface for the requested HPC application service • The researcher accesses the cloud service through the provided interface • Scenario 2: the HPC application service of the researcher’s interest is not found, but the discipline researcher has programming and system administration skills and decides to deploy a new targeted HPC application in an IaaS cloud • The Automatic HPC Application Deployment System can automate parts of this process • The outcome is either • a virtual machine image containing a copy of the properly installed and configured HPC application or • a software service (consisting of input/output, invocation information and hardware requirements) which can be deployed on a virtual machine • Stage 1: the cloud service published in the HPC Application Service Registry is readily accessible in the IaaS cloud • The new cloud service generated by the Automatic HPC Application Deployment System is stored for future use in the HPC Application Service Registry • Stage 2: the user can employ the Automatic HPC Application Service and Web Form Generation System to automate the formation of an HPC Application Service exposing the HPC application • The HPC Application Service is abstracted by a user-friendly, discipline-specific interface that is published in the HPC Application Service Registry (see Scenario 1)
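The two scenarios above amount to a lookup in the registry followed by either direct access or deploy-then-publish. A minimal sketch of that control flow follows; the `ServiceRegistry` class, the `request_service` function and the URLs are hypothetical, and the `deploy` callback merely stands in for the deployment and Web form generation systems described in the slide.

```python
class ServiceRegistry:
    """A toy stand-in for the HPC Application Service Registry."""

    def __init__(self):
        self._services = {}

    def publish(self, app_name, interface_url):
        self._services[app_name] = interface_url

    def lookup(self, app_name):
        # Returns None when the service has not been published.
        return self._services.get(app_name)


def request_service(registry, app_name, can_deploy, deploy):
    """Follow Scenario 1 if the service exists, otherwise Scenario 2."""
    url = registry.lookup(app_name)
    if url is not None:
        return "scenario 1", url           # service found: use its interface
    if not can_deploy:
        return "not available", None       # no service and no deployment skills
    url = deploy(app_name)                 # deploy on IaaS and generate a form
    registry.publish(app_name, url)        # store for future use
    return "scenario 2", url


if __name__ == "__main__":
    registry = ServiceRegistry()
    fake_deploy = lambda name: f"https://example.org/forms/{name}"
    print(request_service(registry, "mpiBlast", True, fake_deploy))
    print(request_service(registry, "mpiBlast", False, fake_deploy))
```

The second call follows Scenario 1 even for a user without deployment skills, which is the reuse benefit the slide describes: one skilled researcher's deployment becomes available to everyone else through the registry.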

  27. Implementation of the HPC Cloud Framework (A. Wong) • Services provided across the cloud service stack: • Bottom (IaaS layer): Amazon EC2 was used to provide cloud infrastructure services • Middle (HPCaaS layer): an HPC software library was used to expose and access Amazon EC2 services • Top (SaaS layer): an HPC application service was developed and exposed as a tool in the Galaxy server

  28. The Galaxy Web-based Platform (A. Wong) • Galaxy provides a powerful feature for tool integration where each tool (application) is presented to users as a Web form

  29. An Interface to Access the HPC Cloud (A. Wong) • An HPC cluster was constructed whose compute instances support mpiBlast execution

  30. An Interface to Access the HPC Cloud (A. Wong) • A cluster of 8 nodes was constructed on Amazon EC2

  31. An Interface to Access mpiBlast (A. Wong) • mpiBlast was accessed by supplying parameters: cluster name, number of processes and other typical parameters
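How form parameters such as the cluster name and number of processes might be turned into an mpiBlast invocation can be sketched as follows. This is an assumption-laden illustration, not the Lab's implementation: the `BlastJob` class and `build_command` function are hypothetical, and the flags (`-p`, `-d`, `-i`, `-o`) follow the classic BLAST-style options that mpiBLAST is commonly run with, so they should be checked against the installed version.

```python
from dataclasses import dataclass


@dataclass
class BlastJob:
    """Values a user might supply through the Web form (illustrative)."""
    cluster_name: str   # which cluster to run on; not part of the command
    processes: int      # number of MPI processes
    program: str        # e.g. "blastn"
    database: str       # name of the formatted database
    query_file: str     # input sequences
    output_file: str    # where results are written


def build_command(job):
    """Build the argument list for an MPI launch of mpiblast."""
    if job.processes < 1:
        raise ValueError("processes must be at least 1")
    return [
        "mpirun", "-np", str(job.processes),
        "mpiblast",
        "-p", job.program,
        "-d", job.database,
        "-i", job.query_file,
        "-o", job.output_file,
    ]


if __name__ == "__main__":
    job = BlastJob("demo-cluster", 8, "blastn", "nt", "query.fas", "results.txt")
    print(" ".join(build_command(job)))
```

Returning an argument list rather than a single string keeps user-supplied values from being interpreted by a shell when the command is eventually launched.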

  32. An Interface to mpiBlast (A. Wong) • mpiBlast execution finished on Amazon EC2; its result file was transferred automatically to the Galaxy server for post-processing

  33. Uncinus: Cloud Deployment (P. Church) • Supports • Resource Allocation • Workflow Orchestration • Cloud Bursting • Genomics in the clouds • Gene Discovery • Personalized Genomics • Leverages EC2 to improve the speed and accuracy of analysis
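Cloud bursting, one of the features listed above, means sending work to a public cloud once local capacity is exhausted. The sketch below shows the simplest possible placement rule under that definition; the `place_jobs` function and the job names are hypothetical, and the slides do not describe the policy Uncinus actually uses.

```python
def place_jobs(jobs, private_slots):
    """Fill private capacity first, then burst the overflow to a public cloud."""
    if private_slots < 0:
        raise ValueError("private_slots must not be negative")
    placement = {"private": [], "public": []}
    for job in jobs:
        if len(placement["private"]) < private_slots:
            placement["private"].append(job)
        else:
            placement["public"].append(job)   # overflow: burst to public cloud
    return placement


if __name__ == "__main__":
    steps = [f"step-{i}" for i in range(1, 6)]
    print(place_jobs(steps, private_slots=3))
```

A real policy would also weigh data transfer cost and price, since moving large genomic data sets to a public cloud is not free.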

  34. Uncinus: Case Study (P. Church) • To identify genes transferred upon digestion of dairy products • Mother -> Child • An 8-step workflow was developed and run on Uncinus • Run on the following resources:

  35. Uncinus: Case Study (P. Church) • Cloud bursting improved performance • Workflow mode reduced run time by 8 hours

  36. Uncinus: Case Study (P. Church) • Results from the workflow found genes active during lactation and during digestion of dairy • Is this gene transfer or a reaction? Further work is needed

  37. Increasing Scalability – Hybrid Clouds • [Figure: a service request is published to a (distributed) service broker, Broker 1 to Broker N, which spans public clouds (compute cloud, storage cloud) and private clouds (compute cloud, storage cloud)]

  38. Solutions from Hybrid/Federated Clouds • Hybrid/Federated Cloud Management (FCM) Architecture • Recent work that provides a reference architecture consisting of brokering services • User requests are serviced by creating virtual appliances based on user request parameters and running them inside virtual machines • Appliances are stored in repositories and decomposed over time to support the creation of future appliances • Because virtual appliances contain the software stack from the operating system upwards, data transfer costs are high

  39. Solutions from Hybrid/Federated Clouds • There is also an (unnamed) toolkit for VM migration between clouds • Users are able to transfer VMs between public and private clouds to control load (manually or automatically) • However, the interface itself is primitive at best

  40. Conclusions • Clouds are moving from business to specialized research • HPC on clouds promises scalability, faster turnaround times, lower costs and services on demand • Discipline specialists should not be forced to become (good) programmers and system administrators • Easy-to-use, discipline-oriented interfaces are very important • Web tools offer discipline-oriented interfaces but are inflexible and do not support HPC widely • Combining HPC clouds and Web tools is the way forward • HPC applications exposed as services of a SaaS cloud and accessed using Web forms are the solution! • Hybrid clouds will grab the HPC market
