A Dynamic Service Deployment Infrastructure for Grid Computing or Why it’s good to be Jobless. Paul Watson School of Computing Science University of Newcastle, UK.
A Dynamic Service Deployment Infrastructure for Grid Computing, or Why it’s good to be Jobless • Paul Watson, School of Computing Science, University of Newcastle, UK • Thanks: Chris Fowler, Charles Kubicek, Arijit Mukherjee, John Colquhoun, Savas Parastatidis, Mark Hewitt, Isi Mitrani, Jennie Palmer, Rob Smith, Paul McKee & Mike Fisher
Data in Science • Bowker’s “Standard Scientific Model” [1] • Collect data • Publish papers • Gradually lose the original data [1] The New Knowledge Economy and Science and Technology Policy, G.C. Bowker, E1-30-03-05
Publishing data as well as papers • e-Science is trying to change this to: • Collect data • Publish data & papers e.g. SkyServer, OGSA-DAI publish databases through Web Services…
Problem: Moving Data • Databases are good at localising computation & data • But, often large amounts of data must still be transferred • this may severely limit the performance
Jobs: the Grid Solution? • Grid Computing offers remote job scheduling • Therefore, we could package the analysis code & data as a job and send it to compute resources close to the data • We decided to explore an alternative…
Why Jobs & Services? • Grid applications are being built from Web Services • But, if the computational requirements can’t be met by the service hosting environment then a job must be created and scheduled • Why do we need both jobs and services? • Dynasoar • a service-only approach to building grid applications • an infrastructure for the dynamic deployment of web services
Dynasoar Components • Web Service Provider (WSP) • exposes service endpoints • accepts the incoming SOAP message sent to the endpoint • chooses a Host Provider and passes the message to it • holds a copy of service code • Host Provider (HP) • manages computational resources (e.g. a cluster or a grid) • accepts the message from the WSP • dynamically deploys the service if necessary • processes the message and returns any response
Routing to an Existing Service Deployment A request for s2 is routed to an existing deployment of the service
Dynamic Service Deployment • A request for s4 cannot be met by an existing deployment of the service, so the Host Provider deploys it on demand • The deployed service remains in place and can be re-used, unlike a scheduled job
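The routing and on-demand deployment described on the last three slides can be sketched in a few lines of Python. This is only an illustrative model, not Dynasoar's implementation: the class names, the `handle` and `receive` methods, and the choice to pass the service code along with every message are all assumptions made for the example, and SOAP messages are reduced to plain strings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# A service is modelled as a plain function from request payload to response.
Service = Callable[[str], str]


@dataclass
class HostProvider:
    """Hypothetical Host Provider: keeps deployed services, deploys on demand."""
    deployed: Dict[str, Service] = field(default_factory=dict)
    deployments: int = 0  # counts how often a deployment actually happened

    def handle(self, name: str, payload: str, code: Service) -> str:
        if name not in self.deployed:
            # No existing deployment: deploy the code supplied by the WSP.
            self.deployed[name] = code
            self.deployments += 1
        # The deployment stays in place and serves this and later messages.
        return self.deployed[name](payload)


@dataclass
class WebServiceProvider:
    """Hypothetical WSP: holds service code and forwards messages to a host."""
    code_store: Dict[str, Service]
    host: HostProvider

    def receive(self, name: str, payload: str) -> str:
        # Choosing among several Host Providers is omitted; one host is assumed.
        return self.host.handle(name, payload, self.code_store[name])


hp = HostProvider()
wsp = WebServiceProvider(code_store={"s4": lambda msg: msg.upper()}, host=hp)
first = wsp.receive("s4", "hello")   # triggers deployment of s4
second = wsp.receive("s4", "again")  # re-uses the existing deployment
```

The second request finds `s4` already deployed, which is the behaviour the slide contrasts with job scheduling.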
Dynasoar Advantages • Simplicity: just services • Efficiency: a deployed service can process many messages • important if cost of deployment is high… e.g. VMs • Support a range of new e-science/ e-business models: • defining the interactions between the major components allows them to be distributed in a variety of ways
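The efficiency argument can be made concrete with a rough cost model. The symbols below are assumptions introduced for illustration, not taken from the talk: $D$ is the one-off cost of deploying a service, $c$ the cost of processing one message, and $n$ the number of messages. The model also assumes that a job-based approach pays the deployment cost for every request.

```latex
C_{\text{job}}(n) = n\,(D + c),
\qquad
C_{\text{service}}(n) = D + n\,c,
\qquad
\frac{C_{\text{service}}(n)}{n} = c + \frac{D}{n}
```

For example, with $D = 60$ s, $c = 1$ s and $n = 100$, the job model costs $100 \times 61 = 6100$ s while the service model costs $60 + 100 = 160$ s. The per-message overhead $D/n$ shrinks as more messages share one deployment, which matters most when $D$ is large, as with virtual machines.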
Dynamic Outsourcing • Biocorp are experts in writing bioinformatics services • They don’t want to manage their own compute resources • Therefore, they use Hosting Inc to process messages sent to their services
The National Grid Service as a Host Provider • A researcher writes their own services but does not have sufficient local compute resources • They deploy a local WSP, and configure it so that it sends messages to the National Grid Service • their services are then transparently deployed on the NGS as required
Brokers for Matching Web Service Providers to Host Providers • Selection on: • Price, • Performance, • Dependability,…
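One simple way a broker could apply these criteria is to filter Host Providers on performance and dependability bounds and then pick the cheapest. The sketch below assumes this particular policy; the `HostOffer` fields, the `choose_host` function, and the example providers and numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class HostOffer:
    name: str
    price: float         # cost per request, arbitrary units (assumed)
    latency_ms: float    # expected response time (assumed performance metric)
    availability: float  # fraction of time available (assumed dependability metric)


def choose_host(offers: List[HostOffer], max_latency_ms: float,
                min_availability: float) -> Optional[HostOffer]:
    """Return the cheapest offer meeting the performance and dependability bounds."""
    eligible = [o for o in offers
                if o.latency_ms <= max_latency_ms
                and o.availability >= min_availability]
    return min(eligible, key=lambda o: o.price) if eligible else None


offers = [
    HostOffer("campus-grid", price=0.0, latency_ms=900.0, availability=0.95),
    HostOffer("hosting-inc", price=2.0, latency_ms=200.0, availability=0.999),
    HostOffer("budget-host", price=1.0, latency_ms=300.0, availability=0.90),
]
best = choose_host(offers, max_latency_ms=500.0, min_availability=0.99)
```

Here only `hosting-inc` meets both bounds, so it is selected even though it is the most expensive offer. A real broker might instead weight the three criteria against each other.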
A Broker for e-Science (figure; labels: Local Campus Grid, National Grid Service)
Moving Computation to Data • The data owner provides compute resources close to a database • Researchers can write services and deploy them on their own WSP • The service is dynamically deployed close to the database when requests are sent to the WSP
Tripartite Security Model • The 3 actors can define policies (XACML) that Dynasoar enforces at run-time, e.g.: • Only use Host Providers trusted not to re-use the deployed service without payment • Accept only messages from WSPs trusted not to send malicious code • Only send the message to a HP trusted not to look at the contents
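The three example policies amount to a conjunction of trust checks before a message is routed. The sketch below models each policy as a plain allow-list rather than XACML. The `TrustPolicy` class, the `may_route` function, and the mapping of each policy to a particular actor (Consumer, WSP, HP) are interpretations made for the example.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class TrustPolicy:
    """Hypothetical allow-list; a real system would express this in XACML."""
    trusted: Set[str] = field(default_factory=set)

    def permits(self, party: str) -> bool:
        return party in self.trusted


def may_route(wsp: str, hp: str, consumer_trusts_hp: TrustPolicy,
              wsp_trusts_hp: TrustPolicy, hp_trusts_wsp: TrustPolicy) -> bool:
    """A message is routed only if all three actors' policies allow it."""
    return (consumer_trusts_hp.permits(hp)      # HP will not read the contents
            and wsp_trusts_hp.permits(hp)       # HP will not re-use code unpaid
            and hp_trusts_wsp.permits(wsp))     # WSP will not send malicious code


consumer_policy = TrustPolicy({"ngs", "hosting-inc"})
wsp_policy = TrustPolicy({"hosting-inc"})
hp_policy = TrustPolicy({"biocorp-wsp"})

allowed = may_route("biocorp-wsp", "hosting-inc",
                    consumer_policy, wsp_policy, hp_policy)
denied = may_route("biocorp-wsp", "ngs",
                   consumer_policy, wsp_policy, hp_policy)
```

In the second call the WSP's policy does not list `ngs`, so routing is refused even though the consumer trusts that host.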
Current Implementation • GridSHED Cluster Management (includes an algorithm to decide when to deploy extra copies of a service to meet performance requirements)
New Host Provider Architecture • Layered as a high-level infrastructure over the lower-level grid fabric • Uses the OMII Job Submission and Monitoring Service to provide a stable interface to different underlying fabrics • Newcastle Grid (Condor), National Grid Service, local clusters, …
Dynamic Service Grids Key to success: the availability of services for deployment
Current Work • Experimenting with Bioinformatics Services • Deploying Services in Virtual Machines • can encapsulate a complex service implementation environment • Use of QoS to guide decisions on where to deploy a service • When and where to deploy within Host Provider? • GridSHED project • Reproducible e-science
Conclusions • Grid applications can be built entirely from services • jobless grid computing • simpler conceptual model • performance improvements due to sharing the cost of service deployment over multiple requests • Dynasoar is built as a high-level infrastructure on top of existing grid fabrics • Separating the Web Service Provider from the Host Provider – with a well-defined interface – opens up a range of e-science/ e-business models