Status of Grid Technology/Middleware (e-Science, Cyberinfrastructure)
May 29 2003
Geoffrey Fox, Indiana University
Note: the terms Grid, e-Science Technology/Middleware, and Cyberinfrastructure are NOT distinguished.
What is a Grid I?
• Collaborative Environment (Ch2.2,18)
• Combining powerful resources, federated computing and a security structure (Ch38.2)
• Coordinated resource sharing and problem solving in dynamic multi-institutional virtual organizations (Ch6)
• Data Grids as Managed Distributed Systems for Global Virtual Organizations (Ch39)
• Distributed Computing or distributed systems (Ch2.2,10)
• Enabling Scalable Virtual Organizations (Ch6)
• Enabling use of enterprise-wide systems, and someday nationwide systems, that consist of workstations, vector supercomputers, and parallel supercomputers connected by local and wide area networks. Users will be presented the illusion of a single, very powerful computer, rather than a collection of disparate machines. The system will schedule application components on processors, manage data transfer, and provide communication and synchronization in such a manner as to dramatically improve application performance. Further, boundaries between computers will be invisible, as will the location of data and the failure of processors. (Ch10)
What is a Grid II?
• Supporting e-Science representing increasing global collaborations of people and of shared resources that will be needed to solve the new problems of Science and Engineering (Ch36)
• As infrastructure that will provide us with the ability to dynamically link together resources as an ensemble to support the execution of large-scale, resource-intensive, and distributed applications (Ch1)
• Makes high-performance computers superfluous (Ch6)
• Metasystems or metacomputing systems (Ch10,37)
• Middleware as the services needed to support a common set of applications in a distributed network environment (Ch6)
• Next Generation Internet (Ch6)
• Peer-to-peer Network (Ch10,18)
• Realizing the thirty-year dream of science fiction writers who have spun yarns featuring worldwide networks of interconnected computers that behave as a single entity (Ch10)
What is Grid Technology?
• Grids support distributed collaboratories or virtual organizations, integrating concepts from
  • The Web
  • Distributed Objects (CORBA, Java/Jini, COM)
  • Globus, Legion, Condor, NetSolve, Ninf and other High Performance Computing activities
  • Peer-to-peer Networks
• With perhaps the Web being the most important for “Information Grids” and Globus for “Compute Grids”
• We use the term Information Grids rather than the usual Data Grids, because “distributed file systems” (holding lots of data!) are handled in Compute Grids
Taxonomy of Grid Functionalities
Note: the term Data Grid is not used consistently in the community and so is avoided.
[Figure: Complexity Grid Computing Model. Data sources pass through distributed Filters, which massage data for simulation, and connect through OGSA-DAI Grid Services, other Grid and Web Services, and Grid Data Analysis, Control and Visualize components to an HPC Simulation. This type of Grid integrates with parallel computing as on TeraGrid.]
“Central” Architecture/Functionality/Style Gaps
[Figure: numbered architecture layers with Web Services (WS) attached. 1: Hosting Environment; 2: OGSI Web service Enhancements; 3: Permeating Principles and Policies; 4: Key OGSA Services; 5: OGSA-compliant System Grid Services; 6: Domain-Specific (Application) Grid Services. Labels: “Central Services And Architecture” (Central Gaps); “Modular” Services, natural for distributed teams (Specific Gaps).]
• Substantial comments on “hosting environments”, OGSI and “permeating principles”
• Agreement on Web service model
Permeating Principles and Policies
• Meta-data rich Message-linked Web Services as the permeating paradigm
• “User” Component Model such as “Enterprise JavaBean (EJB)” or .NET
• Service Management framework including a possible Factory mechanism
• High level Invocation Framework describing how you interact with system components
  • This could for example be used to allow the system to be built from either W3C or GGF style (OGSI) Web Services and to protect the user from changes in their specifications
• Security is a service but the need for fine grain selective authorization encourages
• Policy context that sets the rules for each particular Grid
  • Currently OGSA supports policies for routing, security and resource use
• The Grid Fabric or set of resources needs mechanisms to manage them. This includes automatic recording of meta-data and configuration of software
• Quality of service (QoS) for the Network, which implies performance monitoring and bandwidth reservation services
  • Challenging as end-to-end and not just backbone QoS is needed
• Messaging systems like MQSeries from IBM provide robustness from asynchronous delivery and can abstract destination and allow customization of content, such as converting between different interface specifications
• Messaging is built on transport mechanisms which can be used to support mechanisms to implement QoS and to virtualize ports
World Wide Grid Service Activities I
• Commercial activities, especially those of IBM, Avaki, Platform, Sun, Entropia and United Devices
• The GT2 and GT3 Globus Toolkits. Here we are effectively covering not just the Globus team but the major projects such as the NASA Information Power Grid that have blazed the trail of “productizing” Grids
  • Note that we can “already” see GT3 (Grid Service) like functionality from GT2 wrapped with the various (Java, Perl, Python, CORBA) CoG kits. So GT2 capabilities can be classified as Services
• Trillium (GriPhyn, iVDGL and PPDG) and NeesGrid; the major NSF (DoE for PPDG) projects in the USA
• Condor from the University of Wisconsin, which is being integrated into Grid services through the Trillium and NMI activities
• The NSF Middleware Initiative (NMI), packaging a suite of Globus, Condor and Internet2 software
  • This has overlaps with the VDT (Virtual Data Toolkit from GriPhyn)
World Wide Grid Service Activities II
• Unicore (GRIP), GridLab, the European Data Grid (EDG) and LCG (LHC Computing Grid)
  • Many other (20) EU Projects but these have most of the technology development
• Storage Resource Broker SRB-MCAT from SDSC
• The DoE Science Grid and related activities such as the Common Component Architecture (CCA) project
• Examination of services from a collection of portal projects in the US from Argonne, Indiana, Michigan, NCSA and Texas
  • This includes best practice discussion from Global Grid Forum in portals
• Review of contributions to the recent book Grid Computing: Making the Global Infrastructure a Reality, edited by Fran Berman, Geoffrey Fox and Tony Hey, John Wiley & Sons, Chichester, England, ISBN 0-470-85319-0, March 2003
  • This includes other major projects like Cactus, NetSolve, Ninf
• Some 6 Core and other application specific UK e-Science Projects
Categorization of Technical Gaps and Grid Services
(Section numbers refer to the Report, available Mid June)
• 8.1 Architecture and Style
• 8.2 Basic Technology Runtime and Hosting Environment
• 8.3 Security
• 8.4 Workflow
• 8.5 Notification
• 8.6 Meta-data
• 8.7 Information
• 8.8 Compute/File
• 8.9 Other
• 8.10 Portals, PSE’s
• 8.11 Network
[Figure labels: Grid Services are Application Specific, Resource Specific or Generic; Resources are Information and Compute.]
Categories of Worldwide Grid Services
1) Types of Grid: R3; Lightweight; P2P; Federation and Interoperability
2) Core Infrastructure and Hosting Environment: Service Management; Component Model; Service wrapper/Invocation; Messaging
3) Security Services: Certificate Authority; Authentication; Authorization; Policy
4) Workflow Services and Programming Model: Enactment Engines (Runtime); Languages and Programming; Compiler; Composition/Development
5) Notification Services
6) Metadata and Information Services: Basic including Registry; Semantically rich Services and meta-data; Information Aggregation (events); Provenance
7) Information Grid Services: OGSA-DAI/DAIT; Integration with compute resources; P2P and database models
8) Compute/File Grid Services: Job Submission; Job Planning Scheduling Management; Access to Remote Files, Storage and Computers; Replica (cache) Management; Virtual Data; Parallel Computing
9) Other services including: Grid Shell; Accounting; Fabric Management; Visualization Data-mining and Computational Steering; Collaboration
10) Portals and Problem Solving Environments
11) Network Services: Performance; Reservation; Operations
Features of Worldwide Grid Services
• UK activities have a strong web service and Information Grid emphasis
  • Important compute/file activities as well (White Rose, RealityGrid, UK part of EDG etc.)
• Non-UK activities are dominantly focused on compute/file Grids
  • Submit jobs in distributed UNIX shell (Gridshell) fashion
  • Gather data from instruments (accelerator, satellite, medical device); process in batch mode mapping between filesets
• Little emphasis on lightweight or R3 Grids but NSF in USA and EDG have aimed at better support and software quality
  • EDG has useful “tension” between technology and application focus working groups
  • NMI and even GT3 have changed packaging and added service view; have not changed “underlying” architecture for robustness
• Coordinated set of Portal activities in USA
• Little work on integrating parallel computing and Grid although TeraGrid in USA could change this
Central Gaps: Gaps in Grid Styles and Execution Environment
• Need for both robust (fault tolerant) and lightweight (suitable for small groups) Grid styles identified
• Peer-to-peer style supports smaller decentralized virtual organizations
• Note opportunities for modern middleware ideas to be used: lightweight, message-based
  • Note that Enterprise JavaBeans are not optimized for Science, which has high volume dataflow
• Federated Grid Architecture natural for integration of heterogeneous functionality, style and security
• Bioinformatics and other fields require integration of Information and Compute/File Grids
[Figure: Overlapping Heterogeneous Dynamic Grid Islands. Labels: R1, R2, Enterprise Grid, Dynamic light-weight Peer-to-peer Collaboration Training Grid (Students, Teacher), Information Grid, Compute Grid, Campus Grid.]
[Figure: (a) Layered OGSA Grid: Application Services and Core Services sharing one OGSA Interface. (b) Federated OGSA Grid: Grid-1 and Grid-2, each with its own Application Services and Core Services and its own OGSA or non-OGSA interface (Interface-1, Interface-2), linked by OGSA Mediation.]
Many Gaps in Generic Services
• Some gaps like Workflow and Notification are to make production versions of current projects
  • Just in UK, workflow from DAME, DiscoveryNet, EDG, Geodise, ICENI, myGrid, Unicore plus Cardiff, NEReSC ….
• RGMA and Semantic Grid offer improved meta-data and Information services compared to UDDI and MDS (Globus)
  • Need comprehensive federated Information service
• Security requires architecture supporting dynamic fine-grain authorization
• UK e-Science has pioneered Information Grids but gap is continuation of OGSA-DAI, integration with other services and P2P decentralized models
• Functionality of Compute/File Grids quite advanced but services probably not robust enough for LCG or Campus Grids
Gaps in Other Grid services
• Portals and User Interfaces: noted gap that many are not using Grid Computing Environment “best practice”, with component-based user interfaces matching component-based middleware
• Programming Models (using workflow runtime)
• Fabric Management (should be integrated with central service management and Information system), Computational Steering, Visualization, Datamining, Accounting, Gridmake, Debugging, Semantic Grid tools (consistent with Information system), Collaboration, Provenance
• Application-specific services
• Note new production central Infrastructure can support both research and production services of this type
PPPH: Paradigms Protocols Platforms and Hosting I
• We will start from the Web view and assert that the basic paradigm is
  • Meta-data rich Web Services communicating via messages
• These have some basic support from some runtime such as .NET, Jini (pure Java), Apache Tomcat+Axis (Web Service toolkit), Enterprise JavaBeans, WebSphere (IBM) or GT3 (Globus Toolkit 3)
  • These are the distributed equivalent of operating system functions as in UNIX Shell
  • Called Hosting Environment or platform
OGSA, OGSI & Hosting Environments
• Start with Web Services in a hosting environment
• Add OGSI to get a Grid service and a component model
• Add OGSA to get an Interoperable Grid, “correcting” differences in base platform and adding key functionalities
Functional Level above OGSA
• Systems Management and Automation
• Workload / Performance Management
• Security
• Availability / Service Management
• Logical Resource Management
• Clustering Services
• Connectivity Management
• Physical Resource Management
• Perhaps Data Access belongs here
OGSI: Open Grid Services Infrastructure
• http://www.gridforum.org/ogsi-wg
• It is a “component model” for web services
• It defines a set of behavior patterns that each OGSI service must exhibit
• Every “Grid Service” portType extends a common base type
  • Defines an introspection model for the service
  • You can query it (in a standard way) to discover
    • What methods/messages a port understands
    • What other port types the service provides
    • If the service is “stateful”, what the current state is
• A set of standard portTypes for
  • Message subscription and notification
  • Service collections
• Each service is identified by a URI called the “Grid Service Handle” (GSH)
  • GSHs are bound dynamically to Grid Service References (GSRs, typically WSDL documents)
  • A GSR may be transient. GSHs are fixed
  • Handle map services translate GSHs into GSRs
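The split between fixed handles and transient references can be pictured with a small lookup table. The sketch below is only an illustration of that indirection, not the OGSI API: the class names `HandleMap` and `GridServiceReference`, their methods, and the example.org addresses are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridServiceReference:
    """A GSR: how to reach a service right now; it may be replaced later."""
    endpoint: str
    wsdl_url: str


class HandleMap:
    """Toy handle map service: resolves fixed GSHs to their current GSRs."""

    def __init__(self):
        self._bindings = {}

    def bind(self, gsh, gsr):
        # Rebinding an existing GSH models a transient GSR being superseded.
        self._bindings[gsh] = gsr

    def resolve(self, gsh):
        if gsh not in self._bindings:
            raise LookupError(f"no reference bound to handle {gsh!r}")
        return self._bindings[gsh]


# Hypothetical handle and hosts, for illustration only.
GSH = "http://example.org/grid/handles/simulation-42"
handle_map = HandleMap()
handle_map.bind(GSH, GridServiceReference(
    "http://host-a.example.org/sim", "http://host-a.example.org/sim?wsdl"))
before = handle_map.resolve(GSH)

# The service moves; the handle held by clients does not change.
handle_map.bind(GSH, GridServiceReference(
    "http://host-b.example.org/sim", "http://host-b.example.org/sim?wsdl"))
after = handle_map.resolve(GSH)
```

A client that stores only `GSH` keeps working after the move, because it resolves the handle again instead of caching the endpoint.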
OGSI and Stateful Services
• Sometimes you can send a message to a service, get a result and that’s the end
  • This is a statefree service
• However most non-trivial services need state to allow persistent asynchronous interactions
• OGSI is designed to support Stateful services through two mechanisms
  • Information Port: where you can query for SDEs (Service Data Elements)
  • “Factories” that allow one to view a Service as a “class” (in an object-oriented language sense) and create separate instances for each Service invocation
• There are several interesting issues here
  • Difference between Stateful interactions and Stateful services
  • System or Service managed instances
Factories and OGSI
[Figure: numbered message sequences (1 to 4) for an Explicit Factory and an Implicit Factory.]
• Stateful interactions are typified by amazon.com, where messages carry correlation information allowing multiple messages to be linked together
  • Amazon preserves state in this fashion, which is in fact preserved in its database permanently
• Stateful services have state that can be queried outside a particular interaction
• Also note difference between implicit and explicit factories
  • Some claim that implicit factories scale, as each service manages its own instances and so does not need to worry about registering instances and lifetime management
• See WS-Addressing, largely from IBM and Microsoft: http://msdn.microsoft.com/webservices/default.aspx?pull=/library/en-us/dnglobspec/html/ws-addressing.asp
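The difference between explicit and implicit factories can be sketched in a few lines. This is a toy model under stated assumptions, not OGSI or WS-Addressing code: `CounterInstance`, `ExplicitFactory`, `ImplicitFactoryService` and the session names are hypothetical.

```python
import itertools


class CounterInstance:
    """A stateful service instance whose state can be queried at any time."""

    def __init__(self, handle):
        self.handle = handle
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

    def service_data(self):
        # State is visible outside any particular interaction.
        return {"handle": self.handle, "count": self.count}


class ExplicitFactory:
    """Clients ask the factory for an instance; the factory registers it."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.registry = {}

    def create(self):
        instance = CounterInstance(f"counter-{next(self._ids)}")
        self.registry[instance.handle] = instance
        return instance


class ImplicitFactoryService:
    """One endpoint: instances are created on demand from the correlation
    information carried in each message and managed by the service itself."""

    def __init__(self):
        self._instances = {}

    def handle(self, correlation_id, action):
        if correlation_id not in self._instances:
            self._instances[correlation_id] = CounterInstance(correlation_id)
        instance = self._instances[correlation_id]
        if action == "increment":
            return instance.increment()
        if action == "query":
            return instance.service_data()
        raise ValueError(f"unknown action {action!r}")


factory = ExplicitFactory()
a, b = factory.create(), factory.create()
a.increment()
a.increment()

service = ImplicitFactoryService()
service.handle("session-alice", "increment")
service.handle("session-bob", "increment")
service.handle("session-alice", "increment")
```

In the explicit case the caller holds a reference to a registered instance; in the implicit case the caller only ever talks to one service and repeats its correlation id.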
Two-level Programming I
[Figure: a Nugget and its Data.]
• The paradigm implicitly assumes a two-level Programming Model
• We make a Service (same as a “distributed object” or “computer program” running on a remote computer) using conventional technologies
  • C++, Java or Fortran Monte Carlo module
  • Data streaming from a sensor or Satellite
  • Specialized (JDBC) database access
• Such nuggets accept and produce data from users, files and databases
• The Grid is built by coordinating such nuggets, assuming we have solved the problem of programming the nugget
Two-level Programming II
[Figure: Nugget1, Nugget2, Nugget3 and Nugget4 linked together.]
• The Grid is discussing the linkage and distribution of the nuggets, with the only addition being runtime interfaces to the Grid as opposed to UNIX data streams
• Familiar from use of UNIX Shell, PERL or Python scripts to produce real applications from core programs
• Such interpretative environments are the single processor analog of Grid Programming or Workflow
• Some projects like GrADS from Rice University are looking at integration between nugget levels but the dominant effort looks at each level separately
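The two levels can be mimicked on a single processor. In this minimal sketch the three "nuggets" are ordinary functions standing in for real programs, and `run_workflow` plays the role of the script that links them; all names and the toy data are assumptions for illustration.

```python
from functools import reduce


# Level 1: "nuggets", ordinary programs written with conventional tools.
def monte_carlo_nugget(n):
    """Stand-in for a simulation module; deterministic for illustration."""
    return [(i * i) % 7 for i in range(n)]


def filter_nugget(values, threshold=2):
    """Stand-in for a filter that keeps only the interesting values."""
    return [v for v in values if v >= threshold]


def summary_nugget(values):
    """Stand-in for an analysis step."""
    return {"count": len(values), "total": sum(values)}


# Level 2: the linkage between nuggets, here a simple linear workflow.
def run_workflow(initial_input, nuggets):
    """Feed the output of each nugget into the next one."""
    return reduce(lambda data, nugget: nugget(data), nuggets, initial_input)


result = run_workflow(10, [monte_carlo_nugget, filter_nugget, summary_nugget])
```

The workflow level knows nothing about how each nugget is programmed; on a Grid the function calls would become messages to remote services.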
[Figure: layered Grid architecture. Labels: Portal Services, User Services, Grid Computing Environments, Application Service, Application Metadata, Middleware, System Services, “Core” Grid, Raw (HPC) Resources, Actual Application, Database.]
PPPH: Paradigms Protocols Platforms and Hosting II
• Self-describing programs/interfaces are key to scaling
  • Minimize amount of work system has to do
  • Hide as much as possible in services and applications
• Protocols describe (in “principle” at least) those rules that system obeys and uses to deliver information between services (processes)
• Interfaces tell the service what to do to interpret the results of communication
• HTTP is the dominant transport protocol of the Web
• HTML is the “interface” telling browser how to render
  • But you can extend interface to allow PDF, multimedia, PowerPoint using “helper applications” which are (with more or less convenience) “automatically” downloaded if not already available
  • “MIME types” essentially “self-describe” each interface
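A rough sketch of how a MIME type lets content "self-describe" its interface: the client looks up the type and hands the content to whichever helper is registered for it. Python's standard `mimetypes` module does the type lookup; the handler functions and the `HANDLERS` table are hypothetical stand-ins for a browser and its helper applications.

```python
import mimetypes


def render_html(name):
    """Stand-in for the browser's built-in HTML rendering."""
    return f"browser renders {name}"


def open_pdf(name):
    """Stand-in for a PDF helper application."""
    return f"PDF helper opens {name}"


# Hypothetical registry of helpers, keyed by MIME type.
HANDLERS = {
    "text/html": render_html,
    "application/pdf": open_pdf,
}


def dispatch(filename):
    """Pick a handler from the MIME type guessed for the file name."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    handler = HANDLERS.get(mime_type)
    if handler is None:
        return f"no helper registered for {mime_type}"
    return handler(filename)
```

Supporting a new kind of content means registering one more helper; the transport protocol is untouched, which is the scaling argument made above.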
Protocol/Interface Analogy with Web II
• HTTP and HTML are the analogies on the client side
• A “Web Service” generalizes a CGI Script on server side
  • CGI is essentially a Distributed Object technology allowing server to access an arbitrary program labeled by a URL plus an ugly syntax to specify name and parameters of program to run
  • Roughly WSDL (Web Service Description Language) is a better way to specify program name and its parameters
• Web uses other protocols: HTTPS for secure links and RTP etc. for multimedia (UDP) streams
  • These again are required to integrate system; codecs like MPEG are interfaces interpreted by client
• There are further protocols like H323 and SIP which will be replaced (IMHO) by HTTP plus RTP etc. We should minimize number of protocols to get maintainable systems
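The contrast between CGI's URL syntax and a declared interface can be sketched as follows. The URL, the `simulate` program and the `SIMULATE_DESCRIPTION` dictionary are invented for illustration; the dictionary only imitates the idea of a WSDL description (real WSDL is an XML document with a richer type system).

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical CGI-style call: program name and parameters packed into a URL.
CGI_URL = "http://example.org/cgi-bin/simulate?steps=100&model=ising"


def parse_cgi_call(url):
    """Recover program name and parameters from the URL syntax."""
    parts = urlsplit(url)
    program = parts.path.rsplit("/", 1)[-1]
    # parse_qs returns a list per key; every value is an untyped string.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return program, params


# Hypothetical WSDL-like description: the operation declares its typed inputs.
SIMULATE_DESCRIPTION = {
    "operation": "simulate",
    "inputs": {"steps": int, "model": str},
}


def bind_parameters(description, raw_params):
    """Convert raw string parameters using the declared input types."""
    return {name: convert(raw_params[name])
            for name, convert in description["inputs"].items()}


program, raw = parse_cgi_call(CGI_URL)
typed = bind_parameters(SIMULATE_DESCRIPTION, raw)
```

With CGI the caller must know out of band that `steps` is a number; with a description, the interface itself carries that information.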
PPPH: Paradigms Protocols Platforms and Hosting III
• There is a set of system capabilities which cannot be captured as standalone services and permeate the Grid
• Meta-data rich Message-linked Web Services is the permeating paradigm
• Component Model such as “Enterprise JavaBean (EJB)” or OGSI describes the formal structure of services; EJB if used lives inside OGSI in our Grids
• Invocation Framework describes how you interact with system
• Security in fine grain fashion to provide selective authorization (Globus and EDG WP6)
• Policy context describes rules for this particular Grid
• Transport mechanisms abstract concepts like ports and Quality of Service
• Messaging abstracts destination and customization of content
• Network (monitoring, performance): EDG WP7
• Fabric (resources): EDG WP4
Architecture in Pictures I
[Figure: ABSTRACT versus ACTUAL views. Abstract model: Services, OGSI, Messaging and the Invocation Framework. Actual: the Hosting Environment determines the physical model of Services, Messaging, Network and Resources.]
Architecture in Pictures II: OGSA Interoperable Grid
[Figure: OGSA Interfaces exposed by every OGSI Grid Service, over Messaging, Network Monitoring and Scheduling, Network and Resources.]
Architecture in Pictures III: OGSA Federated Grid
[Figure: Native Services, not necessarily OGSI, over Messaging, Network Monitoring and Scheduling, Network and Resources, with Mediation Services converting between OGSA and “native” services.]
Federation/Interoperability Problem?
[Figure: Standards Compliant InterGrids. A Federation Environment links a GT3 Grid, Jini Grid, IBM Grid, Avaki Grid and JXTA platform. Labels: AWS1, AWS2, AWS3, Service1, Service2, Resource1, Resource2.]
• Have a collection of Web Services running in Grids defined by different suppliers?
• Interoperability: “particular application Web Service of supplier X” can utilize “core service of supplier Y”
• Federation: “core service of supplier X” can be integrated with “core service of supplier Y” to provide an integration/amalgam that is also a realization of core service
• Need mediation to link different Grid Islands
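Federation through mediation can be sketched as an adapter that presents two suppliers' core services as one. Everything here is hypothetical: the two registry classes, their deliberately different interfaces, and the example.org addresses are assumptions used only to show where the mediation step sits.

```python
class SupplierXRegistry:
    """Core registry service of hypothetical supplier X."""

    def __init__(self, entries):
        self._entries = dict(entries)

    def find(self, name):
        return self._entries.get(name)


class SupplierYDirectory:
    """Core directory service of hypothetical supplier Y (different interface)."""

    def __init__(self, records):
        self._records = list(records)

    def lookup_service(self, query):
        return [r["url"] for r in self._records if r["id"] == query["id"]]


class MediatedRegistry:
    """Federation: X and Y together look like one registry with X's interface."""

    def __init__(self, x_registry, y_directory):
        self._x = x_registry
        self._y = y_directory

    def find(self, name):
        found = self._x.find(name)
        if found is not None:
            return found
        # Mediation step: translate the call and the result for supplier Y.
        matches = self._y.lookup_service({"id": name})
        return matches[0] if matches else None


federated = MediatedRegistry(
    SupplierXRegistry({"blast": "http://x.example.org/blast"}),
    SupplierYDirectory([{"id": "render", "url": "http://y.example.org/render"}]),
)
```

An application written against supplier X's `find` call can now reach services registered with supplier Y without being changed.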
[Figure: Federation Architecture. Several Grid Instances, each with its own Services and Resources, linked through Routing Nodes (R) and Mediation Nodes (M).]
Virtualization
• The Grid could and sometimes does virtualize various concepts
• Location: URI (Universal Resource Identifier) virtualizes URL
• Replica management (caching) virtualizes file location, generalized by GriPhyn virtual data concept
• Protocol: message transport and WSDL bindings virtualize transport protocol as a QoS request
• P2P or Publish-subscribe messaging virtualizes matching of source and destination services
• Semantic Grid virtualizes Knowledge as a meta-data query
• Brokering virtualizes resource allocation
• Virtualization implies references can be indirect
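As one illustration of an indirect reference, publish-subscribe messaging lets a source name a topic instead of a destination. The `TopicBroker` class and the topic names below are assumptions for this sketch and do not correspond to any particular Grid messaging product.

```python
from collections import defaultdict


class TopicBroker:
    """Toy broker: publishers name a topic, never a destination service."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver to whoever is subscribed now; return the delivery count."""
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = TopicBroker()
archive, display = [], []
broker.subscribe("sensor/temperature", archive.append)
broker.subscribe("sensor/temperature", display.append)

delivered = broker.publish("sensor/temperature", {"celsius": 21.5})
undelivered = broker.publish("sensor/pressure", {"kPa": 101.3})
```

The publisher never learns which services received the message, so destinations can be added or removed without touching the source.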
IFS: Interfaces and Functionality and Semantics I
• The Grid platform tries to minimize detail in protocols and maximize detail in interfaces to enhance scaling
• However rich meta-data and semantics are critical for correct and interesting operation
  • Put as much semantic interpretation as you can into specific services
  • Lack of Semantic interoperation is in fact the main weakness of today’s Grids and Web services
• Everything becomes a service (see example of education) whether system or application level
• There are some very important “Global Services”
  • Discovery (look up) and Registration of service metadata
  • Workflow
  • MetaSchedulers
IFS: Interfaces and Functionality and Semantics II
• There are many other generally important services
  • OGSA-DAI, the Database Service
  • Portal Service linked to by WSRP (Web services for Remote Portals)
  • Notification of events
  • Job submission
  • Provenance: interpret meta-data about history of data
  • File Interfaces
  • Sensor service: satellites …
  • Visualization
  • Basic brokering/scheduling
OGSA/OGSI Top Level View
[Figure: OGSA/OGSI layers. Labels: Domain-specific services; More specialized services: data replication, workflow, etc.; Broadly applicable services: registry, authorization, monitoring, data access, etc.; Models for resources & other entities; Other models; OGSI; Hosting Environment & Protocol Bindings; Hosting Environment; Transport Protocol.]
Chapters 7 to 9 of Book
http://www.gridforum.org/Meetings/ggf7/docs/default.htm
http://www.globusworld.org/globusworld_web/jw2_program_tut.htm
• OGSA is the set of “core” Grid services
  • Stuff you can’t live without
  • If you built a Grid you would need to invent these things
Open Grid Service Architecture
• OGSA-WG chaired by
  • Ian Foster, ANL and Univ. of Chicago
  • Jeff Nick, IBM
  • Dennis Gannon, IU
• Active Members from
  • IBM, Fujitsu, NEC, SUN, Hitachi, Avaki
  • Univ. of Mich, Chicago, Indiana (not much academic involvement)
OGSA Core Services I
• Registries and namespace bindings
  • Registry is a collection of services indexed by service metadata
    • “find me a service with property X.”
  • Directory is a map from a namespace to GSHs
    • A namespace is a human understandable version of a Grid Handle
• Queues
  • For building schedulers and resource brokers
  • Jobs and other requests are in queues
  • This is high-level messaging
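The three core services above can be sketched with plain data structures. This is a minimal illustration, assuming hypothetical class names (`Registry`, `Directory`), made-up handles of the form `gsh://...`, and a simple first-in, first-out job queue; none of it is OGSA-defined API.

```python
from collections import deque


class Registry:
    """Services indexed by metadata: 'find me a service with property X'."""

    def __init__(self):
        self._services = {}  # GSH -> metadata dictionary

    def register(self, gsh, **metadata):
        self._services[gsh] = metadata

    def find(self, **required):
        return sorted(
            gsh for gsh, meta in self._services.items()
            if all(meta.get(key) == value for key, value in required.items())
        )


class Directory:
    """Map from human-readable names in a namespace to GSHs."""

    def __init__(self):
        self._names = {}

    def bind(self, name, gsh):
        self._names[name] = gsh

    def lookup(self, name):
        return self._names[name]


registry = Registry()
registry.register("gsh://example/blast-1", kind="compute", cpus=64)
registry.register("gsh://example/store-1", kind="storage")
registry.register("gsh://example/blast-2", kind="compute", cpus=8)

directory = Directory()
directory.bind("/biology/blast", "gsh://example/blast-1")

# Queue: jobs wait here until a scheduler or broker hands them out.
job_queue = deque()
job_queue.append({"job": "align-genome", "target": directory.lookup("/biology/blast")})
job_queue.append({"job": "archive-results", "target": "gsh://example/store-1"})
next_job = job_queue.popleft()
```

A scheduler built on these pieces would pop a job, use the registry to find a suitable service, and forward the request.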
Security
• Base this on Web Services Security
• Authentication
  • 2-way. Who are you and who am I?
• Authorization
  • What am I authorized to use/see/modify?
• Accounting/Billing
  • (not really security; see monitoring)
• Privacy
• Group Access
  • Easily create a group to share access to a virtual Grid
  • Very complex issues related to services and message delivery
Common Resource Model
• Every resource on the grid that is manageable is represented by a service instance
• CRM is the Schema hierarchy that defines each resource (with its meta-data)
• Service for a resource presents its management interface to authorized parties
Policy Management
• Policy management services
  • Mechanism to publish policy and the services it applies to
  • Policy life-cycle management
• Policy languages exist for routing, security, resource use
Grid Service Orchestration
• Creating new services by composing other services
• Two types of Orchestration
  • Composition in space
    • One service directly invokes another
  • Composition in time
    • Managing the workflow
    • First do this
    • Then do this and that
    • When that is done do this
    • If something goes wrong do this
    • And so on…
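Both kinds of composition fit in a short sketch. The step functions, the `run_workflow` helper and its error-handling convention (stop at the first failure and record the handler's result) are assumptions chosen for illustration, not a description of any workflow engine.

```python
def fetch():
    """Stand-in for a data service."""
    return [3, 1, 2]


def sort_service(data):
    """Stand-in for a processing service."""
    return sorted(data)


def fetch_and_sort():
    # Composition in space: one service directly invokes another.
    return sort_service(fetch())


def failing_step():
    raise RuntimeError("resource unavailable")


def never_reached():
    return "unreachable"


def run_workflow(steps, on_error):
    """Composition in time: run steps in order; if one fails, run the handler."""
    log = []
    for name, step in steps:
        try:
            log.append((name, step()))
        except Exception as exc:
            log.append((name, on_error(name, exc)))
            break
    return log


log = run_workflow(
    [("prepare", fetch_and_sort),
     ("analyse", failing_step),
     ("publish", never_reached)],
    on_error=lambda name, exc: f"recovered from {name}: {exc}",
)
```

Here "first do this, then do that" is the order of the list, and "if something goes wrong do this" is the `on_error` callback.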
Data Services
• Distributed Data Access
• Data Caching
• Data Replication Services
• Metadata Catalog Services
• Storage Services
Metering Resource Consumption
• At what granularity do services report resource consumption?
• How do they report it?
• How are services metered?
Transactions
• Two threads/workflows must synchronize and agree they have done so before moving on
• Usually involves modification to two or more persistent states
• WS-Transaction has been “proposed”
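One common way to make two parties agree before either changes persistent state is a two-phase commit. The sketch below is a textbook-style toy, not WS-Transaction: the `Participant` class, its methods and the example states are hypothetical, and failures during the commit phase are ignored.

```python
class Participant:
    """Holds one persistent state and votes on proposed changes."""

    def __init__(self, name, state, can_commit=True):
        self.name = name
        self.state = state
        self._can_commit = can_commit
        self._pending = None

    def prepare(self, new_state):
        """Phase 1: stage the change and vote yes or no."""
        if not self._can_commit:
            return False
        self._pending = new_state
        return True

    def commit(self):
        """Phase 2: make the staged change permanent."""
        self.state, self._pending = self._pending, None

    def rollback(self):
        """Discard the staged change."""
        self._pending = None


def two_phase_commit(updates):
    """updates is a list of (participant, new_state) pairs."""
    prepared = []
    for participant, new_state in updates:
        if participant.prepare(new_state):
            prepared.append(participant)
        else:
            for earlier in prepared:
                earlier.rollback()
            return False
    for participant in prepared:
        participant.commit()
    return True


catalog = Participant("catalog", "v1")
storage = Participant("storage", "v1")
ok = two_phase_commit([(catalog, "v2"), (storage, "v2")])

busy = Participant("busy-store", "v1", can_commit=False)
refused = two_phase_commit([(catalog, "v3"), (busy, "v3")])
```

Either both states move to the new version or neither does, which is the agreement the slide asks for.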