
The Globus Toolkit™ and Its Application to GriPhyN

Carl Kesselman, Director of the Center for Grid Technologies, Information Sciences Institute, University of Southern California. Outline: overview of the Globus Toolkit; application of Globus to the virtual data problem (GriPhyN).

Presentation Transcript


  1. The Globus Toolkit™ and Its Application to GriPhyN • Carl Kesselman, Director of the Center for Grid Technologies, Information Sciences Institute, University of Southern California

  2. Outline • Overview of the Globus toolkit • Application of Globus to virtual data problem (GriPhyN) • Open Grid Services Architecture EO Grid Workshop

  3. Partial Acknowledgements • Open Grid Services Architecture design • Karl Czajkowski @ USC/ISI • Ian Foster, Steve Tuecke @ANL • Jeff Nick, Steve Graham, Jeff Frey @ IBM • Grid services collaborators at ANL • Kate Keahey, Gregor von Laszewski • Thomas Sandholm, Jarek Gawor, John Bresnahan • Globus Toolkit R&D also involves many fine scientists & engineers at ANL, USC/ISI, and elsewhere (see www.globus.org) • Strong links with many EU, UK, US Grid projects • Support from DOE, NASA, NSF, Microsoft EO Grid Workshop

  4. The Grid Problem • Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations

  5. Grid Computing Concept • New applications enabled by the coordinated use of geographically distributed resources • E.g., distributed collaboration, data access and analysis, distributed computing • Persistent infrastructure for Grid computing • E.g., certificate authorities and policies, protocols for resource discovery/access • Original motivation, and support, from high-end science and engineering; but has wide-ranging applicability EO Grid Workshop

  6. Grids: Why Now? • Moore’s law ⇒ highly functional end-systems • Ubiquitous Internet ⇒ universal connectivity • Network exponentials produce dramatic changes in geometry and geography • 9-month doubling: double Moore’s law! • 1986-2001: x340,000; 2001-2010: x4000? • New modes of working and problem solving emphasize teamwork, computation • New business models and technologies facilitate outsourcing

  7. The Grid World: Current Status • Dozens of major Grid projects in scientific & technical computing/research & education • Deployment, application, technology • Considerable consensus on key concepts and technologies • Open source Globus Toolkit™ a de facto standard for major protocols & services • Far from complete or perfect, but out there, evolving rapidly, and large tool/user base • Global Grid Forum a significant force • Industrial interest emerging rapidly EO Grid Workshop

  8. Layered Grid Architecture (By Analogy to Internet Architecture) • Application • Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services • Resource: “Sharing single resources”: negotiating access, controlling use • Connectivity: “Talking to things”: communication (Internet protocols) & security • Fabric: “Controlling things locally”: access to, & control of, resources • Internet Protocol Architecture, shown alongside for comparison: Application, Transport, Internet, Link

  9. Globus Toolkit • Globus Toolkit is the source of many of the protocols described in “Grid architecture” • Adopted by almost all major Grid projects worldwide as a source of infrastructure • Open source, open architecture framework encourages community development • Active R&D program continues to move technology forward • Developers at ANL, USC/ISI, NCSA, LBNL, and other institutions www.globus.org EO Grid Workshop

  10. Globus Toolkit Components Include … • Core protocols and services • Grid Security Infrastructure • Grid Resource Access & Management • MDS information & monitoring • GridFTP data access & transfer • Other services • Community Authorization Service • DUROC co-allocation service • Other Data Grid technologies • Replica catalog, replica management service

  11. Globus Toolkit Structure • [Diagram labels: GRAM with job managers, MDS, GridFTP, and a “???” component, each on top of GSI; compute resource, data resource, other service or application; service naming, soft state management, reliable invocation, notification]

  12. The Globus Toolkit in One Slide • Grid protocols (GSI, GRAM, …) enable resource sharing within virtual orgs; toolkit provides reference implementation (= Globus Toolkit services) • Protocols (and APIs) enable other tools and services for membership, discovery, data mgmt, workflow, … • [Diagram labels: GSI (Grid Security Infrastructure): user authenticates & creates proxy credential; GRAM (Grid Resource Allocation & Management): Gatekeeper (factory) creates user process #1 and user process #2 (with Proxy and Proxy #2) and registers them; MDS-2 (Meta Directory Service): Reporter (registry + discovery) and GIIS, the Grid Information Index Server (discovery), with soft state registration and enquiry; reliable remote invocation; other GSI-authenticated remote service requests to other services (e.g. GridFTP)]

  13. GriPhyN Project Goals • Amplify science productivity through the Grid • Provide powerful abstractions for scientists: datasets and transformations, not files and programs • Using a grid is harder than using a workstation. GriPhyN seeks to reverse this situation! • These goals challenge the boundaries of computer science in knowledge representation and distributed computing. • Apply these advances to major experiments • Not just developing solutions, but proving them through deployment

  14. GriPhyN Approach • Virtual Data • Tracking the derivation of experiment data with high fidelity • Transparency with respect to location and materialization • Automated grid request planning • Advanced, policy driven scheduling • Achieve this at peta-scale magnitude • We present here a vision that is still 3 years away, but the foundation is starting to come together

  15. Virtual Data • Track all data assets • Accurately record how they were derived • Encapsulate the transformations that produce new data objects • Interact with the grid in terms of requests for data derivations EO Grid Workshop

  16. Request Automation • Request Planning and Execution • High performance • Grid resources are used in efficient ways for high throughput and/or fast response • Based on policy • Policy specifies how resources should be used and how workloads should be treated • Fault tolerant • It’s a grid – so failures are normal • Transparent to the user • Make the grid like a workstation EO Grid Workshop

  17. GriPhyN Challenge Problem: CMS Event Reconstruction • Sites: Caltech workstation (master Condor job running at Caltech), Wisconsin (WI) Condor pool (secondary Condor job), NCSA Linux cluster, NCSA UniTree (GridFTP-enabled FTP server) • 2) Launch secondary job on WI pool; input files via Globus GASS • 3) 100 Monte Carlo jobs on Wisconsin Condor pool • 4) 100 data files transferred via GridFTP, ~1 GB each • 5) Secondary reports complete to master • 6) Master starts reconstruction jobs via Globus jobmanager on cluster • 7) GridFTP fetches data from UniTree • 8) Processed Objectivity database stored to UniTree • 9) Reconstruction job reports complete to master • Work of: Scott Koranda, Miron Livny, Vladimir Litvin, & others

  18. Why is this useful? • Easier to FIND the data • A disciplined approach to tracking massive amounts of data • Can PRODUCE and analyze data more easily • Automate details of data production • Can VALIDATE scientific results accurately • Can SHARE data more easily • Can produce and analyze MORE data FASTER • Leverage huge storage and computing resources

  19. Why is this hard? • Data derivation tracking • Diversity of transformations • Achieving fidelity of reproduction • Many modes of data storage • Automated request planning • Multiple levels of resource sharing and allocation policy • Faults are the norm in large grids • Resources are constantly in flux • An OS the size of the planet! • Peta-Scale performance level EO Grid Workshop

  20. The Virtual Data Model • Data suppliers publish data to the Grid • Users request raw or derived data from Grid, without needing to know • Where data is located • Whether data is stored or computed on demand • User and applications can easily determine • What it will cost to obtain data • Quality of derived data • Virtual Data Grid serves requests efficiently, subject to global and local policy constraints EO Grid Workshop

  21. GriPhyN: Virtual Data, Tracking Complex Dependencies • Dependency graph is: • Files: 8 < (1,3,4,5,7), 7 < 6, (3,4,5,6) < 2 • Programs: 8 < psearch, 7 < summarize, (3,4,5) < reformat, 6 < conv, (1,2) < simulate • [Diagram: simulate -t 10 … produces file1 and file2; reformat -f fz … produces file3, file4, file5; conv -I esd -o aod produces file6; summarize -t 10 … produces file7; psearch -t 10 … produces file8, the requested file]

  22. Re-creating Virtual Data • To re-create file8: step 1 • simulate > file1, file2

  23. Re-creating Virtual Data • To re-create file8: step 2 • Files 3, 4, 5, 6 are derived from file2 • reformat > file3, file4, file5 • conv > file6

  24. Re-creating Virtual Data • To re-create file8: step 3 • File7 depends on file6 • summarize > file7

  25. Re-creating Virtual Data • To re-create file8: final step • File8 depends on files 1, 3, 4, 5, 7 • psearch < file1, file3, file4, file5, file7 > file8
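The four re-creation steps above amount to walking the dependency graph backwards from the requested file and then running the producers in dependency order. A minimal Python sketch of that idea follows. The program and file names come from the slides; the `DERIVATIONS` table layout and the `plan` function are illustrative assumptions, not GriPhyN code.

```python
from graphlib import TopologicalSorter

# Derivations as listed on the slides: program -> (input files, output files).
DERIVATIONS = {
    "simulate": ((), ("file1", "file2")),
    "reformat": (("file2",), ("file3", "file4", "file5")),
    "conv": (("file2",), ("file6",)),
    "summarize": (("file6",), ("file7",)),
    "psearch": (("file1", "file3", "file4", "file5", "file7"), ("file8",)),
}


def plan(requested, existing=frozenset()):
    """Return the programs to run, in dependency order, to produce `requested`.

    Files listed in `existing` are treated as already materialized, so the
    programs that produce them are skipped.
    """
    producer = {out: prog for prog, (_, outs) in DERIVATIONS.items() for out in outs}
    needed = {}  # program -> set of programs that must finish first
    stack = [requested]
    while stack:
        name = stack.pop()
        if name in existing:
            continue
        prog = producer[name]
        if prog in needed:
            continue
        inputs = DERIVATIONS[prog][0]
        needed[prog] = {producer[i] for i in inputs if i not in existing}
        stack.extend(inputs)
    # TopologicalSorter expects node -> predecessors and yields predecessors first.
    return list(TopologicalSorter(needed).static_order())


order = plan("file8")  # starts with "simulate" and ends with "psearch"
```

Passing `existing={"file1", "file2"}` drops `simulate` from the plan, which is the "stored or computed on demand" choice the virtual data model leaves to the system.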

  26. GriPhyN/PPDG Data Grid Architecture • Application -> DAG (abstract) -> Planner -> DAG (concrete) -> Executor • Catalog Services: MCAT; GriPhyN catalogs • Monitoring: MDS • Info Services: MDS • Repl. Mgmt.: GDMP • Executor: DAGMAN, Kangaroo • Policy/Security: GSI, CAS • Reliable Transfer Service • Compute Resource: Globus GRAM • Storage Resource: GridFTP; GRAM; SRM

  27. (Evolving) View of Data Grid Stack • Publish-Subscribe Service (GDMP) • Reliable Replication • Storage Element Manager • Reliable File Transfer • Replica Location Service • Data Transport (GridFTP) • Local Repl Catalog (Flat or Hierarchical) • Storage Element

  28. Initial GriPhyN Virtual Data Implementation • Architecture of the system: Virtual Data Catalog (PostgreSQL), Virtual Data Language, VDL Interpreter (VDLI) • Grid testbed: job submission sites (ANL, SC, …) with Condor-G agent, Globus client, GridFTP client, GridFTP server, GSI, and local file storage; job execution sites (U of Chicago, U of Florida, U of Wisconsin) with Globus GRAM, GSI, GridFTP client, and a Condor pool • Production DAG of simulated CMS data: Simulate Physics, Simulate CMS Detector Response, Copy flat-file to OODBMS, Simulate Digitization of Electronic Signals

  29. Virtual Data Catalog Conceptual Data Structure • TRANSFORMATION: /bin/physapp1, version 1.2.3b(2), created on 12 Oct 1998, owned by physbld.orca • DERIVATION: ^paramlist, ^transformation • PARAMETER LIST: PARAMETER i filename1; PARAMETER p -g; PARAMETER E PTYPE=muon; PARAMETER O filename2 • FILE: LFN=filename1, PFN1=/store1/1234987, PFN2=/store9/2437218, PFN3=/store4/8373636, ^derivation • FILE: LFN=filename2, PFN1=/store1/1234987, PFN2=/store9/2437218, ^derivation
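One way to read the conceptual structure on this slide is as four record types linked by references: a file points to its derivation, and a derivation points to a parameter list and a transformation. The Python sketch below models that with dataclasses. The field values are the ones shown on the slide; the class names, field names, and the `provenance` helper are assumptions made for illustration, and the one-letter parameter tags are stored as-is without interpreting them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transformation:
    path: str
    version: str
    created: str
    owner: str


@dataclass(frozen=True)
class Parameter:
    tag: str    # one-letter tag from the slide (i, p, E, O), kept uninterpreted
    value: str


@dataclass(frozen=True)
class Derivation:
    transformation: Transformation
    params: tuple


@dataclass(frozen=True)
class FileEntry:
    lfn: str           # logical file name
    pfns: tuple        # physical file names (replicas)
    derivation: Derivation


physapp = Transformation("/bin/physapp1", "1.2.3b(2)", "12 Oct 1998", "physbld.orca")
derivation = Derivation(
    physapp,
    (
        Parameter("i", "filename1"),
        Parameter("p", "-g"),
        Parameter("E", "PTYPE=muon"),
        Parameter("O", "filename2"),
    ),
)
catalog = {
    "filename1": FileEntry(
        "filename1",
        ("/store1/1234987", "/store9/2437218", "/store4/8373636"),
        derivation,
    ),
    "filename2": FileEntry(
        "filename2", ("/store1/1234987", "/store9/2437218"), derivation
    ),
}


def provenance(lfn):
    """Return (program path, version, parameters) recorded for a logical file."""
    d = catalog[lfn].derivation
    return d.transformation.path, d.transformation.version, [
        (p.tag, p.value) for p in d.params
    ]
```

Looking up `provenance("filename2")` follows the `^derivation` and `^transformation` references from the slide to recover which program, version, and parameters are on record for that file.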

  30. Planner Decision Making • Planner considers: • Policy (fairly static, from CAS/SAS) • Grid resource status: state, load • Job (user/group) resource consumption history • Job profiles (resources over time) from Prophesy • [Diagram labels: policy, status, accounting info, and job profile feed the planner; Prophesy (predictor); records job usage; records job profiling data]

  31. Executor Example: Condor DAGMan • Directed Acyclic Graph Manager • Specify the dependencies between Condor jobs using DAG data structure • Manage dependencies automatically • (e.g., “Don’t run job ‘B’ until job ‘A’ has completed successfully.”) • Each job is a “node” in DAG • Any number of parent or children nodes • No loops • [Diagram: DAG of Job A, Job B, Job C, Job D] • Slide courtesy Miron Livny, U. Wisconsin

  32. Executor Example: Condor DAGMan (Cont.) • DAGMan acts as a “meta-scheduler” • Holds & submits jobs to the Condor queue at the appropriate times based on DAG dependencies • If a job fails, DAGMan continues until it can no longer make progress and then creates a “rescue” file with the current state of the DAG • When failed job is ready to be re-run, the rescue file is used to restore the prior state of the DAG • [Diagram: DAGMan feeding jobs A, B, C, D into the Condor job queue] • Slide courtesy Miron Livny, U. Wisconsin
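The meta-scheduler behaviour described on these two slides (submit a job only once its parents have succeeded, keep going after a failure until nothing else can run, then save the state for a later restart) can be sketched in a few lines of Python. This is an illustration of the idea only: `run_dag`, the diamond-shaped example DAG, and the use of a returned set in place of a rescue file are all assumptions, not DAGMan's implementation or file format.

```python
def run_dag(parents, run_job, done=()):
    """Run jobs in dependency order.

    parents: dict mapping each job to the set of jobs it depends on.
    run_job: callable taking a job name and returning True on success.
    done:    jobs already completed in an earlier run (the "rescue" state).

    Returns (completed, failed). `completed` can be passed back as `done`
    to resume without re-running finished jobs.
    """
    completed = set(done)
    failed = set()
    progress = True
    while progress:
        progress = False
        for job in sorted(parents):
            if job in completed or job in failed:
                continue
            if parents[job] <= completed:  # all parents finished successfully
                if run_job(job):
                    completed.add(job)
                else:
                    failed.add(job)
                progress = True
    return completed, failed


# Hypothetical diamond DAG over the four jobs shown on the slide.
DIAMOND = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
```

With job B failing, the first run completes A and C, records B as failed, and never starts D. A second call with `done` set to the first run's `completed` set runs only B and D.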

  33. DAG Usage • Abstract DAG • Represents user requests • Simplest case: request for one or more data products • Complex case: request execution of a chained set of applications • No file or execution locations need be present • Concrete DAG • Specifies any application invocations needed to derive data • Specifies locations of all invocations (to the site level) • Includes explicit job steps to move data
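The difference between the two DAG forms can be made concrete with a small sketch: the abstract form names only jobs and logical files, and the concrete form adds a site for every invocation plus explicit transfer steps. The Python below is a simplified illustration under assumed inputs. The `concretize` function, the site names, and the tuple layout are hypothetical; the job and file names are borrowed from the CMS pipeline slides.

```python
def concretize(abstract_jobs, site_of, replica_site):
    """Turn an abstract job list into concrete steps.

    abstract_jobs: list of (job, input files, output files), already in
                   dependency order, with no locations.
    site_of:       dict mapping job -> site chosen for it (planner output).
    replica_site:  dict mapping file -> site that currently holds it.

    Returns a list of ("transfer", file, src, dst) and ("run", job, site) steps.
    """
    location = dict(replica_site)  # file -> site holding the latest copy
    steps = []
    for job, inputs, outputs in abstract_jobs:
        site = site_of[job]
        for name in inputs:
            if location[name] != site:
                # Explicit job step to move data to where the job will run.
                steps.append(("transfer", name, location[name], site))
                location[name] = site
        steps.append(("run", job, site))
        for name in outputs:
            location[name] = site
    return steps


abstract = [
    ("cmsim", ["ntpl_file"], ["fz_file"]),
    ("writeHits", ["fz_file"], ["hits_db"]),
]
steps = concretize(
    abstract,
    site_of={"cmsim": "siteA", "writeHits": "siteB"},  # hypothetical sites
    replica_site={"ntpl_file": "siteA"},
)
```

Here `steps` contains a run of `cmsim` at `siteA`, a transfer of `fz_file` from `siteA` to `siteB`, and then a run of `writeHits` at `siteB`.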

  34. CMS Pipeline in VDL • Pipeline stages: pythia_input, pythia.exe, cmsim_input, cmsim.exe, writeHits, writeDigis

      begin v /usr/local/demo/scripts/cmkin_input.csh
        file i ntpl_file_path
        file i template_file
        file i num_events
        stdout cmkin_param_file
      end
      begin v /usr/local/demo/binaries/kine_make_ntpl_pyt_cms121.exe
        pre cms_env_var
        stdin cmkin_param_file
        stdout cmkin_log
        file o ntpl_file
      end
      begin v /usr/local/demo/scripts/cmsim_input.csh
        file i ntpl_file
        file i fz_file_path
        file i hbook_file_path
        file i num_trigs
        stdout cmsim_param_file
      end
      begin v /usr/local/demo/binaries/cms121.exe
        condor copy_to_spool=false
        condor getenv=true
        stdin cmsim_param_file
        stdout cmsim_log
        file o fz_file
        file o hbook_file
      end
      begin v /usr/local/demo/binaries/writeHits.sh
        condor getenv=true
        pre orca_hits
        file i fz_file
        file i detinput
        file i condor_writeHits_log
        file i oo_fd_boot
        file i datasetname
        stdout writeHits_log
        file o hits_db
      end
      begin v /usr/local/demo/binaries/writeDigis.sh
        pre orca_digis
        file i hits_db
        file i oo_fd_boot
        file i carf_input_dataset_name
        file i carf_output_dataset_name
        file i carf_input_owner
        file i carf_output_owner
        file i condor_writeDigis_log
        stdout writeDigis_log
        file o digis_db
      end

  35. GriPhyN CMS SC2001 Demo • Bandwidth Greedy Grid-enabled Object Collection Analysis for Particle Physics • http://pcbunn.cacr.caltech.edu/Tier2/Tier2_Overall_JJB.htm • [Diagram labels: Denver client; Caltech Tier2; San Diego Tier2; “tag” database of ~140,000 small objects; full event database of ~100,000 large objects; full event database of ~40,000 large objects; requests; parallel tuned GSI FTP] • Work of: Koen Holtman, J.J. Bunn, H. Newman, & others

  36. SDSS Galaxy Cluster Finding EO Grid Workshop

  37. Cluster-finding Data Pipeline • [Diagram: numbered pipeline stages over field, tsObj, brg, core, cluster, and catalog data products]

  38. Cluster-finding Grid Work of: Yong Zhao, James Annis, & others EO Grid Workshop

  39. GriPhyN-LIGO SC2001 Demo Work of: Ewa Deelman, Gaurang Mehta, Scott Koranda, & others EO Grid Workshop

  40. Globus Toolkit: Evaluation (+) • Good technical solutions for key problems, e.g. • Authentication and authorization • Resource discovery and monitoring • Reliable remote service invocation • High-performance remote data access • This & good engineering is enabling progress • Good quality reference implementation, multi-language support, interfaces to many systems, large user base, industrial support • Growing community code base built on tools EO Grid Workshop

  41. Globus Toolkit: Evaluation (-) • Protocol deficiencies, e.g. • Heterogeneous basis: HTTP, LDAP, FTP • No standard means of invocation, notification, error propagation, authorization, termination, … • Significant missing functionality, e.g. • Databases, sensors, instruments, workflow, … • Virtualization of end systems (hosting envs.) • Little work on total system properties, e.g. • Dependability, end-to-end QoS, … • Reasoning about system properties EO Grid Workshop

  42. Globus Toolkit Structure • [Diagram labels: GRAM with job managers, MDS, GridFTP, and a “???” component, each on top of GSI; compute resource, data resource, other service or application; service naming, soft state management, reliable invocation, notification] • Lots of good mechanisms, but (with the exception of GSI) not that easily incorporated into other systems

  43. Open Grid Services Architecture • Service orientation to virtualize resources • Define fundamental Grid service behaviors • Core set required, others optional • A unifying framework for interoperability & establishment of total system properties • Integration with Web services and hosting environment technologies • Leverage tremendous commercial base • Standard IDL accelerates community code • Delivery via open source Globus Toolkit 3.0 • Leverage GT experience, code, mindshare EO Grid Workshop

  44. “Web Services” • Increasingly popular standards-based framework for accessing network applications • W3C standardization; Microsoft, IBM, Sun, others • WSDL: Web Services Description Language • Interface Definition Language for Web services • SOAP: Simple Object Access Protocol • XML-based RPC protocol; common WSDL target • WS-Inspection • Conventions for locating service descriptions • UDDI: Universal Desc., Discovery, & Integration • Directory for Web services EO Grid Workshop

  45. Web Services Example: Database Service • WSDL definition for “DBaccess” portType defines operations and bindings, e.g.: • Query(QueryLanguage, Query, Result) • SOAP protocol • Client APIs in C, Java, Python, etc. can then be generated
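To make the “DBaccess” example more tangible, here is a rough Python sketch of what a client-facing interface for the Query operation might look like once a stub has been generated. Everything in it is an assumption for illustration: the `DBaccess` protocol class, the `SqliteDBaccess` stand-in, and the choice to return the result instead of passing it as an output parameter. It uses an in-memory SQLite database in place of a remote SOAP service, so it shows only the shape of the interface, not WSDL tooling.

```python
import sqlite3
from typing import Protocol


class DBaccess(Protocol):
    """Interface corresponding to the Query(QueryLanguage, Query, Result) operation."""

    def query(self, query_language: str, query: str) -> list:
        ...


class SqliteDBaccess:
    """Local stand-in for a service implementation (hypothetical)."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._conn = connection

    def query(self, query_language: str, query: str) -> list:
        if query_language.upper() != "SQL":
            raise ValueError(f"unsupported query language: {query_language}")
        return self._conn.execute(query).fetchall()


def count_rows(service: DBaccess, table: str) -> int:
    """Client code written against the interface, not the implementation."""
    return service.query("SQL", f"SELECT COUNT(*) FROM {table}")[0][0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER)")
conn.executemany("INSERT INTO events VALUES (?)", [(1,), (2,), (3,)])
service = SqliteDBaccess(conn)
```

Because `count_rows` depends only on the interface, the same client code would work with any implementation that offers a compatible `query` method.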

  46. Transient Service Instances • “Web services” address discovery & invocation of persistent services • Interface to persistent state of entire enterprise • In Grids, must also support transient service instances, created/destroyed dynamically • Interfaces to the states of distributed activities • E.g. workflow, video conf., dist. data analysis • Significant implications for how services are managed, named, discovered, and used • In fact, much of our work is concerned with the management of service instances EO Grid Workshop

  47. The Grid Service = Interfaces + Service Data • Interfaces: GridService plus other interfaces • Behaviors shown: service data access, explicit destruction, soft-state lifetime, notification, authorization, service creation, service registry, manageability, concurrency • Reliable invocation • Authentication • Service data elements • Implementation • Hosting environment/runtime (“C”, J2EE, .NET, …)
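Two of the behaviors named on this slide, explicit destruction and soft-state lifetime, are easy to illustrate together: a service instance stays registered only as long as someone keeps extending its lifetime, unless it is destroyed explicitly first. The Python sketch below is a toy model under stated assumptions. The `SoftStateRegistry` class and its method names are hypothetical, and time is passed in explicitly so the behavior is deterministic.

```python
class SoftStateRegistry:
    """Tracks transient service instances that expire unless refreshed."""

    def __init__(self):
        self._expiry = {}  # instance handle -> time at which it expires

    def register(self, handle, now, lifetime):
        """Create or refresh an instance; it stays alive until now + lifetime."""
        self._expiry[handle] = now + lifetime

    def destroy(self, handle):
        """Explicit destruction, independent of the remaining lifetime."""
        self._expiry.pop(handle, None)

    def alive(self, now):
        """Drop expired instances and return the handles still alive."""
        self._expiry = {h: t for h, t in self._expiry.items() if t > now}
        return sorted(self._expiry)


registry = SoftStateRegistry()
registry.register("analysis-1", now=0, lifetime=10)
registry.register("analysis-2", now=0, lifetime=5)
registry.register("analysis-2", now=4, lifetime=5)  # keep-alive: now expires at 9
```

The design point is that a crashed or disconnected client needs no cleanup protocol: if it stops refreshing, its instances simply time out.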

  48. Open Grid Services Architecture: Fundamental Structure • 1) WSDL conventions and extensions for describing and structuring services • Useful independent of “Grid” computing • 2) Standard WSDL interfaces & behaviors for core service activities • portTypes and operations => protocols

  49. WSDL Conventions & Extensions • portType (standard WSDL) • Define an interface: a set of related operations • serviceType (extensibility element) • List of port types: enables aggregation • serviceImplementation (extensibility element) • Represents actual code • service (standard WSDL) • instanceOf extension: map descr.->instance • compatibilityAssertion (extensibility element) • portType, serviceType, serviceImplementation EO Grid Workshop

  50. Structure of a Grid Service • Service description: portType elements (standard WSDL), serviceType, serviceImplementation, with compatibilityAssertion (cA) elements • Service instantiation: service elements (standard WSDL), each linked to its description by instanceOf
