
Web Service Grids for iSERVO

Web Service Grids for iSERVO. International Workshop on Geodynamics: Observation, Modeling and Computer Simulation, University of Tokyo, Japan, October 14, 2004. Geoffrey Fox, Community Grids Lab, Indiana University, gcf@indiana.edu.


Presentation Transcript


  1. Web Service Grids for iSERVO • International Workshop on Geodynamics: Observation, Modeling and Computer Simulation, University of Tokyo, Japan, October 14, 2004 • Geoffrey Fox, Community Grids Lab, Indiana University, gcf@indiana.edu

  2. e-Infrastructure • e-Infrastructure builds on the inevitably increasing performance of networks and computers, linking them together to support new flexible linkages between computers, data systems and people • Grids and peer-to-peer networks are the technologies that build e-Infrastructure • e-Infrastructure is called CyberInfrastructure in the USA • We imagine a sea of conventional local or global connections supported by the “ordinary Internet” • Phones, web page accesses, plane trips, hallway conversations • Conventional Internet technology manages billions of low multiplicity (one client to server) or broadcast links • On this we superimpose high value multi-way organizations (linkages) supported by Grids with optimized resources and system support, and supporting virtual (electronic) enterprises • Low multiplicity fully interactive real-time sessions • Resources such as databases supporting (larger) communities

  3. Web services • Web Services build loosely-coupled, distributed applications, (wrapping existing codes and databases) based on the SOA (service oriented architecture) principles. • Web Services interact by exchanging messages in SOAP format • The contracts for the message exchanges that implement those interactions are described via WSDL interfaces.

  4. What is a Grid? • You won’t find a clear description of what a Grid is and how it differs from a collection of Web Services • I see no essential reason that Grid Services have different requirements than Web Services • Geoffrey Fox, David Walker, e-Science Gap Analysis, June 30 2003. Report UKeS-2003-01, http://www.nesc.ac.uk/technical_papers/UKeS-2003-01/index.html. • Notice “service-building model” is like programming language – very personal! • Grids were once defined as “Internet Scale Distributed Computing” but this isn’t good as Grids depend as much, if not more, on data as on simulations • So Grids can be termed “Internet Scale Distributed Services” and represent a way of collecting services together to solve problems where special features and quality of service are needed.

  5. Community Resources • Grid Community databases have analogy to Television and the News Web that allow individuals to communicate instantly with each other via Web Pages and Headline News acting as proxies • N resources deposit information and N can view – Complexity O(N)

  6. Large and Small Grids • N resources in a community (N is billions for the world and 1000-10000 for many scientific fields) • Communities are arranged hierarchically with real work being done in “groups” of M resources – M could be 10-100 in e-Science • Metcalfe’s law: value of network grows like square of number of nodes M – we call Grids where this is true Metcalfe or M² Grids • Nature of Interaction depends on size of M or N • Shared Information O(N) Complexity Grids for largish N • Complexity M² Metcalfe Grids for smaller M < N • Grids must merge with peer-to-peer networks to support both Complexity O(N) and M² Systems
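The two scaling regimes on this slide can be made concrete with a small count of links. This is only a sketch under stated assumptions: a shared-information community is modelled as one link per resource to a common repository, and a Metcalfe group as one link per pair of members. The function names are hypothetical and do not come from the talk.

```python
def shared_info_links(n: int) -> int:
    """Links when n resources each talk to one shared repository: grows as O(N)."""
    return n


def metcalfe_links(m: int) -> int:
    """Pairwise links among m fully interacting members: m(m-1)/2, i.e. O(M^2)."""
    return m * (m - 1) // 2


if __name__ == "__main__":
    # Group sizes quoted on the slide for e-Science (M of 10 to 100).
    for m in (10, 100):
        print(f"M={m}: {metcalfe_links(m)} pairwise links")
    # A community size quoted on the slide for a scientific field.
    print(f"N=10000: {shared_info_links(10_000)} repository links")
```

Under these assumptions a group of 100 already needs about half as many links as a community of 10000 sharing one repository, which is one way to read the slide's point that the nature of interaction depends on the size of M or N.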

  7. M² Interactions • Superimpose M² “Grids” on the sea (heatbath) of O(N) “ordinary” interactions

  8. Geoscience Research and Education Grids [Diagram labels: Field Trip Data, Database, GIS Grid, Discovery Services, Repositories / Federated Databases, Streaming Data, Sensors, Sensor Grid, Database Grid, Research, Education, SERVOGrid, Compute Grid, Customization Services (From Research to Education), Data Filter Services, Research Simulations, Analysis and Visualization Portal, Education Grid, Computer Farm]

  9. Grids and Earthquake Science • Complexity N ≈ 1000 to 10000 Community resources building • Thousands of Data Servers of raw and curated data • Services filtering and mining data • Simulation Services • Visualization Services • Geographical Information Services • Registry and metadata Services • These services can support several communities • National and International earth science researchers • Emergency response and critical infrastructure planning and management • Web Services will harmonize different countries (SERVO to iSERVO) • Web Services will harmonize members of a community and between communities with common resources • Curation will bring data to interoperable certified form • National and International research collaborations analyzing particular ideas with many M² Complexity Grids • Typically many closely knit groups of say around M = 10-100 people and services

  10. (i)SERVO Web (Grid) Services • Programs: All applications wrapped as Services using proxy strategy • Job Submission: supports remote batch and shell invocations • Used to execute simulation codes (VC suite, GeoFEST, etc.), mesh generation (Akira/Apollo) and visualization packages (RIVA, GMT). • File management: • Uploading, downloading, backend crossloading (i.e. moving files between remote servers) • Remote copies, renames, etc. • Job monitoring • Workflow: Apache Ant-based remote service orchestration (NCSA) • Move towards a BPEL framework (can still implement with Ant) • Database services: support SQL queries • Expect simpler version of OGSA-DAI (“Web Service-DAI”) Grid Database • Data services: support interactions with XML-based fault and surface observation data. • For simulation generated faults (i.e. from Simplex) • XML data model being adopted for common formats with translation services to “legacy” formats. • Migrating to Geography Markup Language (GML) descriptions.

  11. Integration of Services • Use OGCE Grid Portal Architecture to allow importing of existing Grid Services and their user interfaces • Can expect GGF activities like OGSA to define/refine interfaces and projects around the world to produce more powerful services which can easily be added, replacing existing services • Geoscience Education Grid by transformations on research grid • Emergency Response and Planning Grids by adding real-time control/collaboration and GIS tools • These additions common to all crises [Diagram labels: GUI-1, Service-1, GUI-N, Service-N, Aggregation Portal]

  12. Each Service has its own portlet • Individual portlet for the Proxy Manager • Use tabs or choose different portlets to navigate through interfaces to different services • 2 Other Portlets

  13. Key Grid Features of iSERVO • The service model avoids a lot of the security complications that have caused trouble in other simulation-based Grids • We don’t support general computer logins from the portal – you can run GeoFEST but not rm -r * • Geographical Information Systems are a key set of generally useful services • Currently largely file based but streams will become more important • Data moves directly between services and is not necessarily written to and read from files • Must support high performance (fast) streams [Diagram labels: File, Filter Service, Filter Service; File based, Stream based]

  14. Data Deluged Science Computing Architecture • Distributed Filters massage data for simulation • Which is better use of money: more compute nodes or more sensors? [Diagram labels: Data, Filter, OGSA-DAI Grid Services, Analysis, Control, Visualize, Grid, HPC Simulation, Grid Data Assimilation, Other Grid and Web Services]

  15. Geographical Information Service (GIS) Data Formats and Services • The OpenGIS Consortium (OGC) is an international group that defines GIS data formats and services. • Main data format language is the XML-based GML. • Subdivided into schemas for drawing maps, representing features, observations, … • First Step: design GML schemas and build specialized Web Services for GPS and Earthquake data. • OGC also defines services. • Services include Web Feature Services and Web Map Services • Next Step: Implement OGC compatible Web Services for this problem, i.e. build a GIS Grid • Also build services to interact with QuakeTables Fault DB.

  16. QuakeTables + OGC Web Map Service Demo • Intend to build OGC compatible map and feature services supporting high performance simulations

  17. Integrating GIS Web and Feature Services • Need to support dynamic feature services with different access restrictions (especially in iSERVO) and with high performance streams [Diagram labels: WMS, IS (Grid Information Service), UDDI, WFS california fault data @gridnode1, WFS california river data @gridnode2, WFS california boundary data @gridnode3]

  18. Different Performance Issues for iSERVO • All systems are built of interlinked entities • Nature, Society, Grids and Parallel computing all link entities by messages • Most (all) complex systems have a hierarchical architecture • Grids link large macroscopic systems including sensors, databases, parallel computers • Parallel Computers consist of many desktop size nodes • Nodes have hierarchical memory structure with many cache levels • Systems have dimension $d \approx 2$ to $3$ • Communication bandwidth into a system of complexity $C$ is proportional to $C^{1-1/d}$ (Bandwidth$/C \propto C^{-1/d}$) • $C(\text{Grid Resource}) = M \, C(\text{Desktop})$ where $M \approx 1$ to $1000$ is typical number of nodes in simulation resource • Parallel Computers need gigabit or better internal node bandwidth and node to node latency of around a microsecond • Grids will have terabit bandwidth but latency is AT BEST a millisecond (nodes next to each other) and is better considered as 100 milliseconds or greater across countries • Need to improve Web Service technology as science needs more bandwidth than business!
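The bandwidth scaling on this slide can be checked with a few lines of arithmetic. The sketch below assumes only what the slide states: bandwidth per unit complexity scales as C^(-1/d), and a Grid resource has complexity M times that of a desktop, so the ratio relative to one desktop is M^(-1/d). The function name is hypothetical.

```python
def relative_bandwidth_per_complexity(m: float, d: float) -> float:
    """Bandwidth/C of a resource built from m desktops, relative to one desktop.

    Uses the slide's scaling Bandwidth/C ∝ C^(-1/d) with C = m * C(Desktop),
    which gives a ratio of m^(-1/d).
    """
    if m <= 0 or d <= 0:
        raise ValueError("m and d must be positive")
    return m ** (-1.0 / d)


if __name__ == "__main__":
    # Upper end of the slide's ranges: M = 1000 nodes, dimension d = 3.
    print(relative_bandwidth_per_complexity(1000, 3))
```

With M = 1000 and d = 3, the communication bandwidth available per unit of complexity is about one tenth of the single-desktop value, which illustrates why larger resources put more pressure on communication.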

  19. Two ways of Linking Modules • Method based linkage of classic programming • Message based Grid and Service linkage [Diagram labels: Service A and Service B linked by Messages, 0.1 to 1000 millisecond latency; Module A and Module B linked by Method Calls, 0.001 to 1 millisecond]

  20. Grid Programming Model • Application (level 1 Programming): Fortran, C++, Java (Method based) • Application Semantics (Metadata, Ontology), Level 2 “Programming”: Semantic Web (Message based) • Workflow (level 3) Programming of Services AND Streams: BPEL, HPSearch (Message based) • All SERVOGrid capabilities are built as Web Services with 3 level programming model [Diagram labels: Systems Metadata (Context, State), Basic WS-* Infrastructure, Web Service 1, WS 2, WS 3, WS 4]

  21. What is a Simple Service? • Take any system – it has multiple functionalities • We can implement each functionality as an independent distributed service • Or we can bundle multiple functionalities in a single service • Whether a functionality is an independent service or one of many method calls into a “glob of software”, we can always expose it as a Web Service by converting its interface to WSDL • Simple services are obtained by taking functionalities and making them as small as possible subject to the “rule of millisecond” • Using a message rather than a method call incurs an overhead of one millisecond (local) to hundreds of milliseconds (far apart) • Use compiled integration of functionalities ONLY when you require <1 millisecond interaction latency • Latency, not bandwidth, is the criterion
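The “rule of millisecond” amounts to a one-line decision. The sketch below is an illustration only: the function name and the default threshold parameter are hypothetical, with the 1 millisecond figure taken from the slide.

```python
def choose_linkage(required_latency_ms: float, threshold_ms: float = 1.0) -> str:
    """Pick how to link two functionalities under the rule of millisecond.

    Returns "method" (compile them together) only when the interaction needs
    latency below the threshold; otherwise returns "message" (separate services).
    """
    return "method" if required_latency_ms < threshold_ms else "message"


if __name__ == "__main__":
    print(choose_linkage(0.01))   # tight inner-loop coupling
    print(choose_linkage(100.0))  # coupling that tolerates wide-area latency
```

Note that the decision takes latency as its only input, matching the slide's remark that latency rather than bandwidth is the criterion.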

  22. Grids of Grids of Simple Services • Link via methods → messages → streams • Services and Grids are linked by messages • Internally to service, functionalities are linked by methods • A simple service is the smallest Grid • We are familiar with method-linked hierarchy: Lines of Code → Methods → Objects → Programs → Packages [Diagram labels: CPUs, Clusters, MPPs, Compute Resource Grids, Databases, Federated Databases, Data Resource Grids, Sensor, Sensor Nets, Methods, Services, Component Grids, Overlay and Compose Grids of Grids]

  23. Component Grids? • So we build collections of Web Services which we package as component Grids • Visualization Grid • Sensor Grid • Utility Computing Grid • Person (Community) Grid • Earthquake Simulation Grid • Control Room Grid • Crisis Management Grid • We build bigger Grids by composing component Grids using the Service Internet and Service Programming

  24. Critical Infrastructure (CI) Grids built as Grids of Grids of Services [Diagram labels: Electricity CIGrid, Flood CIGrid, Earthquake CIGrid, …; Flood Services and Filters, Earthquake Services; Portals, Collaboration Grid, Visualization Grid, Sensor Grid, GIS Grid, Compute Grid; Core Grid Services: Security, Notification, Workflow, Messaging, Data Access/Storage, Registry, Metadata; Physical Network]

  25. iSERVO Strategy • Agree on what (type of) resources and capabilities need to be put on the iSERVO Grid • Computers, instruments, databases, visualization, maps, job submittal …. • Agree on interfaces to resources from OGSA-DAI (databases) to particular data structures (GML/OpenGIS) – specify in XML • Implement Resources and Capabilities as Services • User Interface should be a portlet that can be integrated by the portal into web interface • Make certain overarching Grid capabilities such as workflow, federation and metadata are sufficient • SERVO Grid is a prototype of this strategy using several US sites rather than several countries • Can be naturally extended to iSERVO, education, emergency response by extending resources • Web Service Architecture ensures continued interoperability and extensibility

  26. Further iSERVO Challenges • Make everything a Service • Understand algorithms and implementation for data assimilation • Agree on security and access control policies • Think about Data Curation • Set up policies for observational data and criteria for inclusion in iSERVO data repositories • Think about Data Provenance • Generate and maintain metadata describing ownership, origins and transformations • Applies to both “experimental data” and results from simulations (visualizations) • Curation and Provenance are a change in research methodologies and require funding! • Education and Emergency Response/Planning are interesting offshoots of iSERVO

  27. Architecture of (Web Service) Grids • Grids built from Web Services communicating through an overlay network built in SOFTWARE on the “ordinary internet” at the application level • A new Internet built with SOAP messages replacing TCP packets • Grids provide the special quality of service (security, performance, fault-tolerance) and customized services needed for “distributed complex enterprises” • Developing Web Service compatible high bandwidth streaming transports • We need to work with Web Service community as they debate the 60 or so proposed Web Service specifications • Use Web Service Interoperability WS-I as “best practice” • Must add further specifications to support high performance • Database “Grid Services” for N plus N case • Streaming support for M² case

  28. Importance of SOAP • SOAP defines a very obvious message structure with a header and a body • The header contains information used by the “Internet operating system” • Destination, Source, Routing, Context, Sequence Number … • The message body is only used by the application and will never be looked at by “operating system” except to encrypt, compress it etc. • Much discussion in field revolves around what is in header! • e.g. WSRF adds a lot to header
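The header/body split can be illustrated with a minimal envelope built using Python's standard library. This is a sketch under assumptions: the envelope namespace is the SOAP 1.1 one, but the `urn:example:iservo` namespace, the header fields `Destination` and `SequenceNumber`, and the `Payload` element are hypothetical stand-ins and are not defined by any WS-* specification.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
EX_NS = "urn:example:iservo"  # hypothetical namespace, for illustration only


def build_envelope(destination: str, sequence_number: int, payload: str) -> bytes:
    """Build an envelope whose header carries routing data and whose body carries the payload."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, f"{{{EX_NS}}}Destination").text = destination
    ET.SubElement(header, f"{{{EX_NS}}}SequenceNumber").text = str(sequence_number)
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    ET.SubElement(body, f"{{{EX_NS}}}Payload").text = payload
    return ET.tostring(envelope, encoding="utf-8")


def read_header(message: bytes) -> dict:
    """Read only the header fields, as infrastructure would; the body is not inspected."""
    root = ET.fromstring(message)
    header = root.find(f"{{{SOAP_NS}}}Header")
    # Strip the "{namespace}" prefix from each tag to get the bare field name.
    return {child.tag.split("}", 1)[1]: child.text for child in header}


if __name__ == "__main__":
    msg = build_envelope("http://example.org/service", 7, "hello")
    print(read_header(msg))
```

The `read_header` function never touches the body, mirroring the slide's point that the body belongs to the application while the header serves the “Internet operating system”.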

  29. Web Services • Java is very powerful partly due to its many “frameworks” that generalize libraries e.g. • Java Media Framework • Java Database Connectivity JDBC • Web Services have a corresponding collection of specifications that represent critical features of the distributed operating systems for “Grids of Simple Services” • Some 60 active WS-* specifications for areas such as • a. Core Infrastructure Specifications • b. Service Discovery • c. Security • d. Messaging • e. Notification • f. Workflow and Coordination • g. Characteristics • h. Metadata and State • i. User Interfaces

  30. WS-I Interoperability • Critical underpinning of Grids and Web Services is the gradually growing set of specifications in the Web Service Interoperability Profiles • Web Services Interoperability (WS-I) Interoperability Profile 1.0a (http://www.ws-i.org) gives us XSD, WSDL 1.1, SOAP 1.1, UDDI in the basic profile and parts of WS-Security in their first security profile. • We imagine the “60 Specifications” being checked out and evolved in the cauldron of the real world, and occasionally best practice identifies a new specification to be added to WS-I, which gradually increases in scope • Note only 4.5 out of 60 specifications have “made it” in this definition

  31. Web Services Grids and WS-I+ • WS-I Interoperability doesn’t cover all the capabilities needed to support Grids • WS-I+ is designed as a minimal extension of WS-I to support “most current” Grids: it adds support for • Enhanced SOAP Addressing (WS-Addressing) • Fault tolerant (reliable) messaging • Workflow as in IBM-Microsoft standard BPEL • Security and Notification best practice and support will probably get added soon • There are Web Service frameworks here but various IBM v Microsoft v Globus differences to be resolved • UK OMII Open Middleware Infrastructure Institute is adopting this approach to support UK e-Science program • http://www.omii.ac.uk/

  32. Layered Architecture for Web Services and Grids [Diagram labels, top to bottom: Higher Level Services: Application Specific Grids, Generally Useful Services and Grids, Workflow WSFL/BPEL; Service Context: Service Management (“Context etc.”), Service Discovery (UDDI) / Information; Service Internet: Service Internet Transport → Protocol, Service Interfaces WSDL; Base Hosting Environment; Bit level Internet (OSI Stack): Protocol HTTP FTP DNS …, Presentation XDR …, Session SSH …, Transport TCP UDP …, Network IP …, Data Link / Physical]

  33. Working up from the Bottom • We have the classic (CISCO, Juniper ….) Internet routing the flood of ordinary packets in OSI stack architecture • Web Services build the “Service Internet” or IOI (Internet on Internet) with • Routing via WS-Addressing not IP header • Fault Tolerance (WS-RM not TCP) • Security (WS-Security/SecureConversation not IPSec/SSL) • Information Services (UDDI/WS-Context not DNS/Configuration files) • At message/web service level and not packet/IP address level • Software-based Service Internet possible as computers “fast” • Familiar from Peer-to-peer networks and built as a software overlay network defining Grid (analogy is VPN) • SOAP Header contains all information needed for the “Service Internet” (Grid Operating System) with SOAP Body containing information for Grid application service
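Routing at the message level rather than the packet level can be sketched as a small software router that dispatches on a logical address carried in the message header. This is an illustration only: the dictionary layout with a `"To"` header field stands in for an addressing header and is not the WS-Addressing XML format, and the class and address names are hypothetical.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


class OverlayRouter:
    """Routes messages by a logical address in the header, not by IP address."""

    def __init__(self) -> None:
        self._routes: Dict[str, Handler] = {}

    def register(self, logical_address: str, handler: Handler) -> None:
        """Associate a logical service address with the handler that serves it."""
        self._routes[logical_address] = handler

    def route(self, message: dict) -> str:
        """Look only at the header to pick a service, then hand it the body."""
        address = message["header"]["To"]
        handler = self._routes.get(address)
        if handler is None:
            raise LookupError(f"no service registered for {address!r}")
        return handler(message["body"])


if __name__ == "__main__":
    router = OverlayRouter()
    router.register("urn:example:simulation", lambda body: "ran " + body)
    print(router.route({"header": {"To": "urn:example:simulation"}, "body": "mesh-1"}))
```

Because the address is logical, the service behind it can move to another machine without the sender changing anything, which is the overlay-network property the slide compares to a VPN.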

  34. NaradaBrokering • Server-enhanced Messaging • NB supports messages and streams [Diagram labels: NaradaBrokering Broker Network, Audio/Video Conferencing Client, Computer, Modem, Server, Peers, Minicomputer, Firewall, Laptop computer, Workstation, PDA, Web Service B, Queues, Stream]

  35. NaradaBrokering and IOI • “Software Overlay Network” features • Support for Multiple Transport protocols • Support for multiple delivery mechanisms • Reliable Delivery • Exactly-once Delivery • Ordered Delivery • Optional Delivery optimization modules for different modes • Compression/Decompression of payloads with optional module • Coalescing/Fragmentation of payloads with optional module • NTP Time Service • Security Service • Performance Monitoring • Performance optimized routing with optional module • Support for WS-Reliability, WS-ReliableMessaging and their Federation
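Ordered and exactly-once delivery, two of the delivery mechanisms listed above, are commonly built from sequence numbers at the receiver. The sketch below shows that generic technique and is not NaradaBrokering's implementation; the class name and the assumption that sequence numbers start at 0 are hypothetical.

```python
class OrderedExactlyOnceReceiver:
    """Delivers each payload once, in sequence-number order, despite duplicates and reordering."""

    def __init__(self) -> None:
        self._next_seq = 0   # next sequence number the application expects
        self._pending = {}   # out-of-order messages held until the gap is filled

    def receive(self, seq: int, payload: str) -> list:
        """Accept one message and return the payloads that are now deliverable, in order."""
        if seq < self._next_seq or seq in self._pending:
            return []  # duplicate: already delivered or already buffered
        self._pending[seq] = payload
        delivered = []
        while self._next_seq in self._pending:
            delivered.append(self._pending.pop(self._next_seq))
            self._next_seq += 1
        return delivered


if __name__ == "__main__":
    receiver = OrderedExactlyOnceReceiver()
    print(receiver.receive(1, "b"))  # held back: message 0 has not arrived
    print(receiver.receive(0, "a"))  # releases both messages in order
    print(receiver.receive(1, "b"))  # duplicate is dropped
```

A reliable-delivery layer would add acknowledgements and retransmission on the sender side; this sketch covers only the receiver's bookkeeping.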

  36. Virtualizing Communication • Communication specified in terms of user goal and Quality of Service – not in choice of port number and protocol • Bit Internet Protocols have become overloaded e.g. MUST use UDP for A/V latency requirements but CAN’T use UDP as firewall will not support it • A given “Service Internet” communication can involve multiple transport protocols and multiple destinations – the latter possibly determined dynamically [Diagram labels: NB Brokers, NB Broker, A, B1, B2, B3, Fast Link, Firewall HTTP, Satellite UDP, Hand-Held Protocol, Dial-up Filter, Software Multicast, Client Filtering]

  37. Performance Monitoring • Every broker incorporates a Monitoring Service that monitors links originating from the node. • Every link measures and exposes a set of metrics • Average delays, jitters, loss rates, throughput. • Individual links can disable measurements for individual metrics or for the entire set. • Measurement intervals can also be varied • The Monitoring Service returns measured metrics to the Performance Aggregator.
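The four link metrics named above can be computed from simple per-interval samples. The sketch below is a generic illustration, not the Monitoring Service's actual code: the function name, its inputs, and the choice of jitter as the mean absolute difference between consecutive delays are all assumptions.

```python
from statistics import mean


def link_metrics(sent: int, delays_ms: list, bytes_received: int, interval_s: float) -> dict:
    """Summarize one measurement interval for a link.

    sent           -- number of messages sent in the interval
    delays_ms      -- one-way delay of each message that arrived, in milliseconds
    bytes_received -- total payload bytes that arrived
    interval_s     -- length of the measurement interval in seconds
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    received = len(delays_ms)
    # Jitter here is the mean absolute change between consecutive delays (an assumption).
    diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    return {
        "average_delay_ms": mean(delays_ms) if delays_ms else 0.0,
        "jitter_ms": mean(diffs) if diffs else 0.0,
        "loss_rate": (sent - received) / sent if sent else 0.0,
        "throughput_bytes_per_s": bytes_received / interval_s,
    }


if __name__ == "__main__":
    print(link_metrics(sent=4, delays_ms=[10.0, 12.0, 11.0], bytes_received=3000, interval_s=2.0))
```

Varying `interval_s` corresponds to the slide's note that measurement intervals can be changed; disabling a metric would simply mean omitting its entry.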

  38. Fast Web Service Communication I • The IOI application-level Internet allows one to optimize message streams; at the cost of “startup time”, Web Services can deliver the fastest possible interconnections with or without reliable messaging • Typical results from Grossman (UIC) comparing slow SOAP over TCP with binary and UDP transport (latter gains a factor of 1000) [Chart labels: Pure SOAP, SOAP over UDP, Binary over UDP; surviving values: 7020, 5.60]

  39. Fast Web Service Communication II • Mechanism only works for streams – sets of related messages • SOAP header in streams is constant except for sequence number (Message ID), time-stamp .. • One needs two types of new Web Service Specification • “WS-StreamNegotiation” to define how one can use WS-Policy to send messages at start of a stream to define the methodology for treating remaining messages in stream • “WS-FlexibleRepresentation” to define new encodings of messages

  40. Fast Web Service Communication III • Then use “WS-StreamNegotiation” to negotiate stream in Tortoise SOAP – ASCII XML over HTTP and TCP – • Deposit basic SOAP header through connection – it is part of context for stream (linking of 2 services) • Agree on firewall penetration, reliability mechanism, binary representation and fast transport protocol • Naturally transport UDP plus WS-RM • Use “WS-FlexibleRepresentation” to define encoding of a Fast transport (On a different port) with messages just having “FlexibleRepresentationContextToken”, Sequence Number, Time stamp if needed • RTP packets have essentially this structure • Could add stream termination status • Can monitor and control with original negotiation stream • Can generate different streams optimized for different end-points
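Once a stream has been negotiated, each fast message needs to carry only a context token, a sequence number and a time stamp in front of its payload. The sketch below shows one possible binary layout using Python's `struct` module. The field widths (two 32-bit unsigned integers and a 64-bit millisecond time stamp, in network byte order) are assumptions made for illustration; “WS-StreamNegotiation” and “WS-FlexibleRepresentation” are proposals named in the talk and do not define this encoding.

```python
import struct

# Assumed fixed header: context token (uint32), sequence number (uint32),
# time stamp in milliseconds (uint64), all in network byte order.
_HEADER = struct.Struct("!IIQ")


def encode_fast_message(context_token: int, seq: int, timestamp_ms: int, payload: bytes) -> bytes:
    """Prefix the payload with the compact per-message header."""
    return _HEADER.pack(context_token, seq, timestamp_ms) + payload


def decode_fast_message(packet: bytes) -> tuple:
    """Split a packet back into (context_token, seq, timestamp_ms, payload)."""
    token, seq, timestamp_ms = _HEADER.unpack_from(packet)
    return token, seq, timestamp_ms, packet[_HEADER.size:]


if __name__ == "__main__":
    packet = encode_fast_message(42, 7, 1097712000000, b"strain")
    print(len(packet), decode_fast_message(packet))
```

The constant parts of the SOAP header stay with the negotiated stream context, so the context token is all a receiver needs to look them up; the 16-byte header here replaces what would otherwise be a full ASCII XML header on every message.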

  41. IU SERVO Grid Contributions • NaradaBrokering provides streaming support • Fault Tolerance • Support for High Performance Streams • Basic Dynamic Information Environment • Notification • Good progress with GIS Grid with OGC compatible Web Map and Web Feature Services linked to pervasive Grid Information and workflow services • HPSearch provides programming and management model for streams and services • Supports multi-scale iterations (moving between different models implemented as different services) workflow and data assimilation
