Grid Basics CS-780-3 Notes

Courtesy of Chaman Singh Verma

Power Generation
Past: Until the end of the 19th century, power generation was a local luxury. Only the rich could generate electricity in their own backyards.
Present: We take electricity for granted without knowing its sources or the complexities of its distribution. We use the service and pay for it. Many countries provide a high quality of service.
Future: We want to break the 19th-century model in computer usage and provide a service model for computation and storage similar to power generation.

What is a Grid? Checklist
A Grid is a system that
• Coordinates resources that are not subject to centralized control (not for each single node)
• Uses standard, open, general-purpose protocols and interfaces
• Provides high quality of service
Reference: "What is the Grid?" by Ian Foster

Grid: A Virtual Organization
The Grid resource-sharing paradigm has greater scope than a P2P system. A Grid implicitly allows direct access to computers, software, data, and any other resources. Both providers and consumers define clearly what they will share, who can share, and the conditions under which sharing will take place. A set of individuals and/or institutions defined by such sharing rules forms what we call a Virtual Organization.

Grid: An Evolution, Not a Revolution
Source: IBM Grid Computing
The Grid can be seen as the latest and most complete evolution of more familiar developments.
• Like the Web, the Grid keeps complexity hidden: multiple users enjoy a single unified experience.
• Unlike the Web, it enables full collaboration toward real business goals.
• Like peer-to-peer, it allows users to share files.
• Unlike peer-to-peer, it shares not only files but everything that can be shared.
• Like clusters and distributed computing, it brings computing resources together.
• Unlike clusters and distributed computing, the Grid can be geographically distributed and heterogeneous.
• Like virtualization technologies, it enables virtualization of IT resources.
• Unlike virtualization technologies, it can enable virtualization of vast and disparate resources.

Originally Targeted Applications
What types of applications will the Grid be used for?
• Distributed Supercomputing
• High-throughput Computing: cracking cryptosystems
• On-demand Computing: NetSolve, large archives
• Data-Intensive Computing: Sloan Digital Sky Survey, weather forecasting
• Collaborative Computing: Insors, GriPhyN, SCIRun

Grid Problem Defined
• The Grid problem is defined as "coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations".
• The sharing raises many issues that were not addressed by distributed computing, for example:
  • How to structure flexible, transient relationships.
  • How to structure fine-grained access control over resources while respecting local and global policies.
  • How to agree on quality of service, scheduling, and co-allocation.

Top 500 Supercomputers (June 2003)
• Earth Simulator: NEC, Yokohama, 35.86 TFlops
• ASCI Q: LANL, Los Alamos, HP AlphaServer SC, 13.88 TFlops
• MCR Linux Cluster: LLNL, Livermore, 7.634 TFlops
• ASCI White: LLNL, Livermore, IBM SP Power3, 7.304 TFlops
• Seaborg: NERSC/LBNL, Berkeley, IBM SP Power3, 7.303 TFlops
Source: http://www.top500.org

Latest News (Nov 8, 2003)
• Virginia Tech's Big Mac has taken over the 3rd position. It consists of 1100 Macintosh PCs and is rated at 17 TFlops.

General Highlights from the Top 500 (June 2003)
• 157 systems reported peak performance above 1 TFlops.
• Total accumulated performance is 375 TFlops (up from 293 TFlops).
• Entry-level performance is 245.1 GFlops (up from 195.8 GFlops).
• A total of 119 systems (up from 56) use Intel processors.
• 149 systems are now labeled as clusters (up from 53).
• 23 of them are self-made (up from 14).
• Among the top 10, 7 are from the US, 2 from Japan, and 1 from France.

Economics and Control
The infrastructures are very expensive and require years of hard work. The sheer force of economics will require that these resources be kept under strict control and be optimally utilized. Freedom is often costly and chaotic. This is the starting point of what we call Grid Computing.

Changing Face of Enterprise Computing
• Most recent enterprise systems are collections of heterogeneous resources.
• Qualities of service traditionally associated with mainframe-centric computing are now essential to the effective conduct of e-business across distributed resources, inside as well as outside the enterprise.
• Recently there has been an upsurge of service providers (SPs) of various types, such as web-hosting SPs, storage SPs, and application SPs. All of these require standardization.

Bird's Eye View
In the next few slides, we will look at the broader picture, followed by technical details.

Web Services Architecture
• Universal Description, Discovery and Integration (UDDI) allows us to find Web Services that meet certain requirements.
• Web Services Description Language (WSDL): Web Services must be self-describing and should tell the invoker which operations they support and how to invoke them.
• Simple Object Access Protocol (SOAP): messages are passed between client and server using SOAP.
Note: UDDI, WSDL, SOAP and HTTP are just examples. Different implementations can use different technologies.
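The three roles described on this slide (find a service, read its description, exchange messages with it) can be illustrated with a small in-process sketch. Nothing here uses real UDDI, WSDL, or SOAP libraries: `Registry`, `ServiceDescription`, `invoke`, and the endpoint URL are hypothetical stand-ins chosen only to show the order of the steps.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ServiceDescription:
    """Stand-in for a WSDL document: what the service offers and where it lives."""
    name: str
    endpoint: str                      # hypothetical address, never contacted
    operations: Dict[str, Callable]    # operation name -> implementation


class Registry:
    """Stand-in for a UDDI registry: finds services offering a given operation."""

    def __init__(self) -> None:
        self._services: List[ServiceDescription] = []

    def publish(self, description: ServiceDescription) -> None:
        self._services.append(description)

    def find(self, operation: str) -> List[ServiceDescription]:
        return [s for s in self._services if operation in s.operations]


def invoke(description: ServiceDescription, operation: str, *args):
    """Stand-in for a SOAP exchange: build a request message, return a reply message."""
    request = {"operation": operation, "args": args}
    handler = description.operations[request["operation"]]
    return {"result": handler(*request["args"])}


registry = Registry()
registry.publish(ServiceDescription(
    name="Calculator",
    endpoint="http://example.org/calc",            # hypothetical
    operations={"add": lambda a, b: a + b},
))

found = registry.find("add")                        # 1. discover
offered = sorted(found[0].operations)               # 2. read the description
reply = invoke(found[0], "add", 2, 3)               # 3. invoke
print(found[0].name, offered, reply["result"])
```

In a real deployment each step would be a network call, and the description would be a document rather than a Python object.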

A Typical Web Service Invocation:

End User’s perspective

Stateless Machines
The above model is stateless: it cannot remember what was done from one invocation to the next. One client can interfere with another client's operations.

Factories
• The concept of factories solves the problems mentioned earlier.
• Factories make the Grid a stateful machine.
• Factories create transient services.
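A minimal sketch of the factory idea, assuming a toy service that keeps a running total. The class names and the handle format are hypothetical and are not taken from OGSI or the Globus Toolkit. The first part shows two clients disturbing each other on a shared instance; the second shows a factory handing each client its own transient instance.

```python
import itertools


class MathService:
    """A service that remembers a running total between invocations."""

    def __init__(self) -> None:
        self.total = 0

    def add(self, amount: int) -> int:
        self.total += amount
        return self.total


class MathServiceFactory:
    """Creates and destroys transient MathService instances on request."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._instances = {}

    def create(self) -> str:
        handle = f"math-instance-{next(self._ids)}"   # hypothetical handle format
        self._instances[handle] = MathService()
        return handle

    def lookup(self, handle: str) -> MathService:
        return self._instances[handle]

    def destroy(self, handle: str) -> None:
        del self._instances[handle]


# One shared instance: client B sees the effect of client A's call.
shared = MathService()
shared.add(5)                           # client A
shared_seen_by_b = shared.add(1)        # client B expected 1

# One instance per client: each keeps its own state.
factory = MathServiceFactory()
handle_a = factory.create()
handle_b = factory.create()
factory.lookup(handle_a).add(5)         # client A
own_seen_by_b = factory.lookup(handle_b).add(1)   # client B

factory.destroy(handle_a)               # instances are transient
print(shared_seen_by_b, own_seen_by_b)
```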

Web Service Application: Client and Server stubs are generated automatically from the specifications.

Technical Details
Service: A service is a network-enabled entity that provides a specific capability (for example, the ability to move files, create processes, or verify access rights).
Service = protocols + behavior
Grid services are defined by OGSA (Open Grid Services Architecture). (Open Grid Forum)
Grid services are specified by OGSI (Open Grid Services Infrastructure).
The Globus Toolkit is the most popular open implementation of OGSA.

Major Players in Grid Service World

Example from NetSolve
Suppose you want to multiply matrix A and matrix B, and there is a site that provides this facility. You may want to integrate the function directly into your software:
    request = netsolve("matmul", a, b)
    C = netsolve("wait", request)
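The request-then-wait pattern in the NetSolve example can be imitated locally. The sketch below does not use NetSolve: a thread pool from Python's standard library stands in for the remote site, and `matmul` is a plain-Python placeholder for the remote routine.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul(a, b):
    """Plain-Python matrix product, standing in for the remote routine."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


executor = ThreadPoolExecutor(max_workers=1)   # stands in for the remote site

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

request = executor.submit(matmul, a, b)   # like the "matmul" call: returns at once
# ... the caller is free to do other work here ...
c = request.result()                      # like the "wait" call: blocks for the answer
executor.shutdown()
print(c)
```

The point of the two-step form is that the caller can overlap its own work with the remote computation.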

Nature of Grid Architecture
Grid architecture is a set of protocols for the establishment, management, and use of dynamic, cross-organizational virtual organizations. The main issues in the architecture are
• Interoperability
• Standard protocols
• Services
• Application Programming Interfaces (APIs) and Software Development Kits (SDKs)

Hourglass Model
The narrow neck of the hourglass defines a small set of core abstractions and protocols. It consists of protocols for
• Connectivity
• Resource Management
These protocols must be chosen so as to capture the fundamental mechanisms of sharing across many different resource types.

Grid Architecture
• The Fabric layer implements the local, resource-specific operations that occur on specific resources.
• Connectivity protocols are concerned with communication and authentication.
• Resource protocols are concerned with negotiating access to individual resources.
• Collective protocols and services are concerned with coordinating the use of multiple resources.

General List of Services
• Performance guarantees
• Monitoring
• Adaptation
• Intrusion detection
• Resource management
• Accounting and payment
• Fault management
• Identity and authentication
• Authorization and policy
• Resource discovery
• Resource characterization
• Resource allocation
• Co-reservation, workflow
• High-speed data transfer
• Remote data access

Resource Management
At a minimum, the following resources should be available for query:
• Computational: mechanisms for starting programs, monitoring and controlling their execution, and making advance reservations; hardware and software characteristics; state information such as current load.
• Storage: mechanisms for putting and getting files; state information such as available space and bandwidth utilization.
• Network: mechanisms for controlling the resources allocated to network transfers; information about network characteristics and load.
• Code repositories: management of versioned source and object code (CVS style).
• Catalogs

Connectivity Layer
This layer defines the core communication and authentication protocols. Communication protocols enable the exchange of data between different fabric layers. They include transport, routing, and naming services. Authentication protocols build on communication services to provide cryptographically secure mechanisms for verifying the identity of users and resources.

Authentication Characteristics
• Single sign-on: a single "log on" should be sufficient for access to multiple grid resources.
• Delegation: a program can run on the user's behalf.
• Integration with local security, for example Kerberos or Unix security.
• User-based trust relationships: if a user uses services from multiple service providers at the same time, the security mechanism should not require the resource providers to cooperate and interact with each other.

Resource Layer
This layer is built on top of the communication protocols. It defines protocols for
• Secure negotiation
• Initiation
• Monitoring
• Control
• Accounting
• Payment for sharing resources

Resource Layer (continued)
• Information protocols are used to obtain information about the structure and state of a resource (current load, usage policy, configuration, etc.).
• Management protocols are used to negotiate access to a shared resource, specifying
  • Resource requirements, including advance reservation and quality of service
  • Operations to perform

Collective: Coordinating Multiple Resources
• Directory services: a user may query for a resource by name and/or by attributes such as type, availability, and load.
• Co-allocation, scheduling, and brokering services allow VO participants to request specific resources for a specific purpose and duration.
• Monitoring and diagnostic services allow monitoring for resource failures, attacks, overload, etc.
• Data replication services allow management of VO storage to maximize data access performance with respect to metrics such as response time, reliability, and cost.
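The directory-service bullet describes lookup by name and by attributes. A rough sketch of such a query follows; the `Resource` fields, the `query` function, and the catalog entries are all hypothetical, invented only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    name: str
    kind: str          # for example "compute" or "storage"
    available: bool
    load: float        # fraction of capacity in use, 0.0 to 1.0


def query(resources: List[Resource],
          name: Optional[str] = None,
          kind: Optional[str] = None,
          available: Optional[bool] = None,
          max_load: Optional[float] = None) -> List[Resource]:
    """Return the resources that satisfy every criterion that was supplied."""
    matches = []
    for r in resources:
        if name is not None and r.name != name:
            continue
        if kind is not None and r.kind != kind:
            continue
        if available is not None and r.available != available:
            continue
        if max_load is not None and r.load > max_load:
            continue
        matches.append(r)
    return matches


# Hypothetical catalog entries.
catalog = [
    Resource("cluster-a", "compute", True, 0.35),
    Resource("cluster-b", "compute", True, 0.90),
    Resource("archive-1", "storage", True, 0.10),
    Resource("cluster-c", "compute", False, 0.00),
]

lightly_loaded = query(catalog, kind="compute", available=True, max_load=0.5)
print([r.name for r in lightly_loaded])
```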

Collective (continued)
• Grid-enabled programming systems enable familiar programming models to be used in a Grid environment, using other grid services such as resource discovery and security. Example: Globus MPI.
• Workload management and collaboration services allow problem-solving environments.
• Software discovery allows selection of the best software implementation and execution platform. Examples: NetSolve and Ninf.
• Accounting and payment services gather usage information for the purposes of accounting and payment for the services.

Collective

OGSA
Building on both the Grid and Web Services communities, OGSA defines uniform service semantics, called Grid Services.
• OGSA defines a few persistent and many transient services.
• OGSA defines interfaces for managing Grid service instances: factory, registry, discovery, lifetime.
• OGSA defines interfaces and behavior for reliable invocation, lifetime management, discovery, authorization, notification, upgradeability, concurrency, and manageability.
• OGSA also defines WSDL interfaces and associated conventions.
• Protocols for reliable and secure management of distributed state.

Need for a Service-Oriented View
• It allows us to address the need for standard interface definitions, local/remote transparency, and adaptation to the local OS.
• It allows multiple protocol bindings to facilitate localized optimization of services.
• It simplifies virtualization, which in turn allows consistent resource access across multiple heterogeneous platforms.
• With a service-oriented view, we can partition the interoperability problem into two sub-problems: the definition of service interfaces, and the identification of protocols that can be used to invoke a particular interface.

Globus Toolkit
The Globus Toolkit is an open-architecture, open-source set of services and software libraries that support Grids and Grid applications. The toolkit addresses issues of security, information discovery, resource management, data management, communication, fault detection, and portability.
GRAM: Grid Resource Allocation and Management
MDS: Monitoring and Discovery Service (originally Metacomputing Directory Service)
GSI: Grid Security Infrastructure
This toolkit will be described in detail in the next presentation, so I will skip any further description.

Nature of Service
• Services are location transparent.
• Services are created and destroyed dynamically.
• Services are stateful. Every service is assigned a globally unique name, called a Grid Service Handle (GSH).
• Grid services can change during their lifetime (for example, to support new protocols).

Web Services
• Web Services are the basis for Grid services, which are the cornerstones of OGSA and OGSI.
• Web Services use simple Internet-based protocols to address heterogeneous distributed computing.
• Web Services define a technique for describing the software components to be accessed, methods for accessing them, and ways of discovering the components.
• Web Services are neutral with respect to language, programming model, and system software.

Upgradeability
• Services within complex systems must be independently upgradeable.
• Versioning and compatibility between services must be managed and expressed so that clients can discover not only specific service versions but also compatible services.
• OGSA defines conventions that allow us to identify when a service changes and when those changes are backward compatible with respect to interface and semantics.
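To make the idea of discovering compatible versions concrete, here is a small sketch. It does not reproduce OGSA's actual conventions, which the slide does not spell out. It assumes a hypothetical (major, minor) numbering in which a service is backward compatible when the major number matches and the minor number is equal or newer.

```python
from typing import List, Tuple

Version = Tuple[int, int]   # (major, minor); a hypothetical scheme, not OGSA's


def is_backward_compatible(offered: Version, required: Version) -> bool:
    """Assumed rule: same major version, equal or newer minor version."""
    return offered[0] == required[0] and offered[1] >= required[1]


def compatible_versions(available: List[Version], required: Version) -> List[Version]:
    """List every available version a client written for `required` could use."""
    return sorted(v for v in available if is_backward_compatible(v, required))


available = [(1, 0), (1, 2), (1, 5), (2, 0)]   # hypothetical deployed versions
usable = compatible_versions(available, (1, 2))
print(usable)
```

A client written against version (1, 2) would accept (1, 2) or (1, 5) under this assumed rule, but not (1, 0) or (2, 0).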

Some Myths (Misunderstandings) about Grid Computing
• The Grid is a next-generation Internet.
• The Grid is a source of free cycles.
• The Grid requires a distributed operating system.
• The Grid requires a new programming model.
• The Grid makes high-performance computing superfluous.

Distributed Computing Economics (Views of Jim Gray)
• The following items have roughly equivalent prices:
  • one database access
  • 10 bytes of Internet traffic
  • 100,000 instructions
  • 10 bytes of disk storage
  • a megabyte of disk bandwidth
• The break-even point is 10,000 instructions per byte of network traffic.
• This serves as a basis for doing cost-effective Internet-based computing, such as grid computing.

How are the numbers computed?
• A box with a 2 GHz CPU and 2 GB of RAM: $2,000
• A 200 GB disk with 100 accesses/s or 50 MB/s: $200
• A 1 Mbps WAN link: $100/month
• $1 is equivalent to:
  • 3.24 GB sent over the WAN (7.2 hours)
  • 100+ tera CPU instructions = 7.2 hours of CPU time
  • 1 GB of disk
  • 2.592 million database accesses (in 7.2 hours)
  • 1.296 terabytes of disk bandwidth (in 7.2 hours)
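The per-dollar figures on this slide can be re-derived with a few lines of arithmetic. The sketch assumes a 30-day month, so that $1 of a $100/month link buys 7.2 hours, and it follows the slide in applying the same 7.2-hour window to the CPU and disk figures without deriving why. The slide's "100+ tera" instruction count matches roughly two instructions per cycle, which is an inference here, not something the slide states.

```python
# Inputs taken from the slide.
LINK_COST_PER_MONTH = 100          # dollars for a 1 Mbps WAN link
LINK_BITS_PER_SECOND = 1_000_000
DISK_ACCESSES_PER_SECOND = 100
DISK_BYTES_PER_SECOND = 50_000_000
DISK_PRICE, DISK_CAPACITY_BYTES = 200, 200_000_000_000
CPU_HZ = 2_000_000_000

# Assumptions not stated on the slide.
SECONDS_PER_MONTH = 30 * 24 * 3600     # 30-day month
INSTRUCTIONS_PER_CYCLE = 2             # inferred from the "100+ tera" figure

seconds_per_dollar = SECONDS_PER_MONTH // LINK_COST_PER_MONTH   # 25,920 s = 7.2 h

wan_bytes = (LINK_BITS_PER_SECOND // 8) * seconds_per_dollar
db_accesses = DISK_ACCESSES_PER_SECOND * seconds_per_dollar
disk_bandwidth_bytes = DISK_BYTES_PER_SECOND * seconds_per_dollar
disk_bytes = DISK_CAPACITY_BYTES // DISK_PRICE
instructions = CPU_HZ * INSTRUCTIONS_PER_CYCLE * seconds_per_dollar

print(f"hours per dollar:    {seconds_per_dollar / 3600}")
print(f"WAN bytes:           {wan_bytes:,}")
print(f"database accesses:   {db_accesses:,}")
print(f"disk bandwidth:      {disk_bandwidth_bytes:,} bytes")
print(f"disk storage:        {disk_bytes:,} bytes")
print(f"CPU instructions:    {instructions:,}")
```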

Cycle-based Computing is Almost Free
• The accumulated cycles in SETI@Home amount to 54 Teraflops.
• Google freely provides a trillion searches a year from the largest database (2 petabytes).
• Hotmail freely carries a trillion e-mails per year.
• Amazon.com offers a free book search tool.
• Many well-known media sites offer free news.
• The maintenance costs paid are low and worthwhile.

What is SETI@Home?
• It uses millions of computers in homes and offices worldwide to analyze radio signals from space.
• SETI: the Search for Extraterrestrial Intelligence aims to detect intelligent life outside Earth.
• It uses a radio telescope to listen for (collect) narrow-bandwidth radio signals from space.
• Data analysis: (1) computing power spectra, (2) finding "candidate signals", (3) eliminating meaningless signals.
• Embarrassing parallelism: CPU and data intensive, but with infrequent communication. (The high-bandwidth interconnects of supercomputers are not necessary!)

Who is Paying for the "Free" Computing?
• Advertisers pay for it.
• Google, Hotmail, and Amazon.com collect $1 from a company if its site is visited 1,000 times via these "free" services: cost per thousand impressions (CPM).
• Big companies are eager to pay for maintenance.
• It is low-cost but very effective promotion.
• A Web site almost becomes the only "spokesman".
• SETI@Home relies on cycles donated worldwide.
• It provided 1,300 years of free computing on 2/3/03.

Cases for Grid Computing: at least 10,000 Ins/Byte
• A cryptographic search problem: only a few KBytes of input/output, but computing for days.
• A representative job submitted to SETI@Home: 12 hours of computing on 1/2 MByte of input.
• A CFD computation at Cornell: 7 years of computing for 100 MB of input and 10 GB of output.
• Making the animated movie Toy Story: a 200 MB image takes several hours to render (200,000-600,000 Ins/Byte).
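The 10,000 Ins/Byte threshold in the slide title can be applied as a simple ratio test. The sketch below assumes a processor rate of 2 billion instructions per second (one instruction per cycle on a 2 GHz CPU), which is an assumption made here for illustration. `instructions_per_byte` and `worth_distributing` are hypothetical helper names, and the second job is an invented counter-example.

```python
BREAK_EVEN_INS_PER_BYTE = 10_000           # threshold from the slide title
INSTRUCTIONS_PER_SECOND = 2_000_000_000    # assumed CPU rate


def instructions_per_byte(cpu_seconds: float, bytes_moved: int) -> float:
    """Compute intensity: instructions executed per byte sent over the network."""
    return cpu_seconds * INSTRUCTIONS_PER_SECOND / bytes_moved


def worth_distributing(cpu_seconds: float, bytes_moved: int) -> bool:
    """True when the job's compute intensity reaches the break-even point."""
    return instructions_per_byte(cpu_seconds, bytes_moved) >= BREAK_EVEN_INS_PER_BYTE


# SETI@Home job from the slide: 12 hours of computing on 1/2 MByte of input.
seti_ratio = instructions_per_byte(12 * 3600, 500_000)
seti_ok = worth_distributing(12 * 3600, 500_000)

# Hypothetical counter-example: 1 second of computing on 100 MB of data.
small_ok = worth_distributing(1, 100_000_000)

print(f"SETI@Home: {seti_ratio:,.0f} Ins/Byte, distribute: {seti_ok}")
print(f"1 s on 100 MB, distribute: {small_ok}")
```

Under the assumed CPU rate, the SETI@Home job sits several orders of magnitude above the threshold, while the data-heavy counter-example falls far below it.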

Grid Computing Should Follow the Economics
• Suitable applications can be very limited.
• A good fit: sending a GB over the Internet to save years of computing. It is not economical to send a KB if the result can be computed locally in a second.
• If Internet cost drops more slowly than Moore's Law, the analysis becomes stronger.
• Over the past 40 years, network cost has fallen much more slowly.
• Cluster computing has different economics:
  • Gbps Ethernet costs $200/port and delivers 50 MBps.
  • This is comparable to disk bandwidth cost and 10,000 times lower than Internet costs (so the CFD computation fits better on clusters).
