Grid Computing and Applications: An Overview. Subject Code: 433-498. Rajkumar Buyya, Grid Computing and Distributed Systems (GRIDS) Lab., The University of Melbourne, Melbourne, Australia. www.gridbus.org
Overview • Computing platforms and how the Grid is different. • Towards global (Grid) computing. • Grid resource management and scheduling. • Application development challenges. • Approaches to Grid computing. • Grid applications. • Grid projects in GRIDS Lab @ Melbourne. • Summary and conclusions.
Major Networking and Computing Technologies Introduced [Figure: timeline with year axis 1960, 1970, 1975, 1980, 1985, 1990, 1995, 2000. COMPUTING: Mainframes, Minicomputers, PCs, Workstations, Crays, MPPs, XEROX PARC worm, PC Clusters, WS Clusters, HTC, P2P, PDAs, Grids. NETWORKING: ARPANET, TCP/IP, Ethernet, Email, IETF, Internet Era, HTML, Mosaic, WWW Era, W3C, XML, Web Services.]
Internet: Past, Present, Future [Figure: number of hosts (millions, 0 to 140) versus year (1965 to 2010), with markers for TCP/IP, HTML, Mosaic, and XML. Annotation: "The 'Network Effect' kicks in, and the web goes critical".] Phases: 1. Packet Switching Networks 2. The Internet is Born 3. The World Wide Web 4. with XML 5. The Grid. Milestones: • 1969: 4 US universities linked to form ARPANET • 1972: First e-mail program created • 1976: Robert Metcalfe develops Ethernet • TCP/IP becomes core protocol • Domain Name System created • IETF created (1986) • HTML hypertext system created • CERN launches World Wide Web • NCSA launches Mosaic interface
Internet and WWW Growth [Figure: Internet Hosts and WWW Servers on a logarithmic scale (1 to 10,000,000) versus year (1969 to 2000).]
Installed base and growth rate for telephone lines, mobile phones, and Internet hosts, 1995

                      Installed, 1995                               1994-95 Growth Rates (%)
Income Group/Region   Phone Lines   Mobile Phones   Internet Hosts  Phone Lines   Mobile Phones   Internet Hosts
Lower Income          2.0           0.12            1.35            35.7          135.1           246.0
Lower-Middle          9.1           0.33            73.31           8.7           105.1           167.0
Upper-Middle          14.5          1.34            380.13          6.4           66.8            111.9
High                  53.2          8.70            10749.23        3.6           55.6            97.0
Africa                1.7           0.09            69.14           7.9           60.5            81.4
Americas              29.0          5.17            8359.58         5.4           42.3            91.5
Asia                  5.4           0.62            121.70          14.7          108.3           150.0
Europe                33.0          3.04            2732.24         3.6           59.5            112.2
Oceania               39.7          9.55            12845.55        4.0           85.7            88.8
World                 12.1          1.56            1661.89         7.0           60.4            97.8

Source: ACM, Nov. 97 (phones: International Telecommunication Union; hosts: Network Wizards)
Scalable HPC: Breaking Administrative Barriers? [Figure: PERFORMANCE versus Administrative Barriers.] Administrative Barriers: • Individual • Group • Department • Campus • State • National • Globe • Inter Planet • Universe. Platforms: Desktop; SMPs or SuperComputers; Local Cluster; Enterprise Cluster/Grid; Global Cluster/Grid; Inter Planet Cluster/Grid ??
Why Grids? Large-scale exploration needs them: killer applications. • Solving grand challenge applications using computer modeling, simulation and analysis: Aerospace; Internet & Ecommerce; Life Sciences; Digital Biology; CAD/CAM; Military Applications.
Cluster of Clusters - Hyperclusters [Figure: Cluster 1, Cluster 2, and Cluster 3 connected by a LAN/WAN; each cluster shows a Scheduler, a Master Daemon, an Execution Daemon, and Clients with Submit and Graphical Control.]
http://www.sun.com/hpc/ Grid: Towards Internet Computing for (Coordinated) Resource Sharing Grid enables: • Resource Sharing • Selection • Aggregation - Unification of geographically distributed resources
What is Grid? • A paradigm/infrastructure that enables the sharing, selection, & aggregation of geographically distributed (wide-area) resources: • Computers – PCs, workstations, clusters, supercomputers, laptops, notebooks, mobile devices, PDAs, etc.; • Software – e.g., ASPs renting expensive special purpose applications on demand; • Catalogued data and databases – e.g., transparent access to the human genome database; • Special devices/instruments – e.g., radio telescope – SETI@Home searching for life in the galaxy; • People/collaborators. [depending on their availability, capability, cost, and user QoS requirements] for solving large-scale problems/applications.
P2P/Grid Applications-Drivers • Distributed HPC (Supercomputing): • Computational science. • High-Capacity/Throughput Computing: • Large scale simulation/chip design & parameter studies. • Content Sharing (free or paid): • Sharing digital contents among peers (e.g., Napster). • Remote software access/renting services: • Application service providers (ASPs) & Web services. • Data-intensive computing: • Drug Design, Particle Physics, Stock Prediction... • On-demand, real-time computing: • Medical instrumentation & Mission Critical. • Collaborative Computing: • Collaborative design, Data exploration, education. • Service Oriented Computing (SOC): • Computing as Competitive Utility: New paradigm, new industries, and new business.
Building and Using Grids requires... • Services that make our systems Grid Ready! • Security mechanisms that permit resources to be accessed only by authorized users. • (New) programming tools that make our applications Grid Ready! • Tools that can translate the requirements of an application into requirements for computers, networks, and storage. • Tools that perform resource discovery, trading, composition, scheduling and distribution of jobs, and collect results.
A Typical Grid Computing Environment [Figure: Application, Grid Resource Broker, Grid Information Service, database, and resources R1, R2, R3, R4, R5, R6, ..., RN.]
Sources of Complexity in Resource Management for World Wide Computing • Size (large number of nodes, providers, consumers) • Heterogeneity of resources (PCs, workstations, clusters, and supercomputers) • Heterogeneity of fabric management systems (single system image OS, queuing systems, etc.) • Heterogeneity of fabric management policies • Heterogeneity of applications (scientific, engineering, and commerce) • Heterogeneity of application requirements (CPU, I/O, memory, and/or network intensive) • Heterogeneity in demand patterns • Geographic distribution and different time zones • Differing goals (producers and consumers have different objectives and strategies) • Insecure and unreliable environment
Traditional approaches to resource management are NOT useful for the Grid? • They use a centralised policy that needs • complete state information and • a common fabric management policy, or a decentralised consensus-based policy. • Due to too many heterogeneous parameters in the Grid, it is impossible to define: • a system-wide performance matrix and • a common fabric management policy that is acceptable to all. • So, we propose the usage of an “economics” paradigm for managing resources • proved successful in managing the decentralization and heterogeneity present in human economies! • We can easily leverage proven economic principles and techniques • Easy to regulate demand and supply • User-centric, scalable, adaptable, value-driven costing, etc. • Offers incentive (money?) for being part of the grid!
Grid Resource Management systems need to ensure/provide: • Site autonomy. • Heterogeneous resources and substrate: • Each resource can be different – SMPs, Clusters, Linux, UNIX, Windows, Intel, etc. • Resource owners have their own policies or scheduling mechanisms (Codine/Condor). • Extend policies, through resource brokers. • Resource allocation/co-allocation • Online control - can apps (Graphics) tolerate non-availability of a resource and adapt themselves?
Grid RMS to support: • Authentication (once). • Specify (code, resources, etc.). • Discover resources. • Negotiate authorization, acceptable use, cost, etc. • Acquire resources. • Schedule jobs. • Initiate computation. • Steer computation. • Access remote data-sets. • Collaborate with results. • Account for usage. [Figure: Domain 1 and Domain 2; the steps from "Discover resources" through "Steer computation" are listed a second time for the second domain.] Ack: Globus.
Resource Management Architecture [Figure: Application, RSL, Resource Brokers (RSL Specialization), Information Service - MDS, Resource Co-allocators, and three Local Resource Mgr components.]
Grid Computing Approaches [Figure labels: mix-and-match; Object-oriented; Internet/partial-P2P; Network enabled Solvers (NetSolve); Economy/Service-Oriented Grid Computing (Gridbus).]
Many Grid Projects & Initiatives • Australia: Nimrod-G, GridSim, Virtual Lab, Gridbus, DISCWorld, ..new coming up • Europe: UNICORE, MOL, UK eScience, Poland MC Broker, EU Data Grid, EuroGrid, MetaMPI, Dutch DAS, XW, JaWS • Japan: Ninf, DataFarm • Korea: N*Grid • USA: Globus, Legion, OGSA, Javelin, AppLeS, NASA IPG, Condor-G, Jxta, NetSolve, AccessGrid, and many more... • Cycle Stealing & .com Initiatives: Distributed.net, SETI@Home, ...; Entropia, UD, Parabon, ... • Public Forums: Global Grid Forum, P2P Working Group, IEEE TFCC, Grid & CCGrid conferences. http://www.gridcomputing.com
Many Testbeds? And who pays? [Figure labels: $grid; GUSTO; EcoGrid; Legion Testbed; NASA IPG.]
Types of Grid Applications • Sequential – dusty deck codes. • Data Parallel: • Synchronous – tightly coupled; • Loosely synchronous. • Asynchronous: • Irregular in time and space; • Difficult to parallelise to exploit the massive parallelism. • Embarrassingly Parallel.
Grid Applications-Drivers • Distributed HPC (Supercomputing): • Computational science. • High-throughput computing: • Large scale simulation/chip design & parameter studies. • Content Sharing: • Sharing digital contents among peers (e.g., Napster). • Remote software access/renting services: • Application service providers (ASPs). • Data-intensive computing: • Data mining, particle physics (CERN), Drug Design. • On-demand computing: • Medical instrumentation & network-enabled solvers. • Collaborative: • Collaborative design, data exploration, education.
Distributed Supercomputing (SF-Express/MPICH-G, Caltech) • SF-Express distributed interactive simulation. • 100K vehicles (2002 goal) using 13 computers, 1386 nodes, 9 sites. • Globus mechanisms for: resource allocation; distributed startup; I/O and configuration; security. [Figure labels: NCSA Origin, Caltech Exemplar, CEWES SP, Maui SP.] P. Messina et al., Caltech. http://www.globus.org/applications/
SF-Express Architecture [Figure: Local Simulation, Interest Mgmt., and Router components connected via MPI and IP.] • Create synthetic representations of interactive environments. • Scalability via interest management. • Starting point: • MPI and socket communication; • Hand startup.
High Throughput Computing (parameter sweep applications) • A study involving exploration of possible scenarios - i.e., execution of the same program for various design alternatives (data). • It consists of a large number of tasks (1000s). • Generally, no inter-task communication (task farming). • Large data files (MBytes+) and I/O constraints. • A large class of application areas: • Parameter explorations and simulations (Monte Carlo); • A large number of science, engineering, and commercial applications: astrophysics, drug design, neuroscience, network simulation, structural engineering, automobile crash simulation, aerospace modeling, financial risk analysis. • Condor, Nimrod/G, DesignDrug@Home, SETI@Home, FOLD@Home, Distributed.net.
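The task-farming pattern on this slide can be sketched in a few lines of Python. This is only an illustration on a single machine: `simulate` is a hypothetical stand-in for the real application program, the parameter names are invented for the example, and a local thread pool stands in for Grid nodes. Tools such as Condor or Nimrod/G have their own interfaces, which are not shown here.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def simulate(frequency_ghz: float, rain_mm_per_h: float) -> float:
    """Hypothetical stand-in for one run of the application program."""
    return frequency_ghz * 0.1 + rain_mm_per_h * 0.01


def parameter_sweep(frequencies, rain_rates, workers=4):
    """Run the same program once per parameter combination.

    The tasks are independent (no inter-task communication), so they can
    be farmed out to workers in any order.
    """
    tasks = list(product(frequencies, rain_rates))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda params: simulate(*params), tasks))
    return dict(zip(tasks, results))


if __name__ == "__main__":
    sweep = parameter_sweep([2.4, 5.0, 28.0], [0.0, 10.0, 50.0])
    print(len(sweep), "tasks completed")
```

Because each task depends only on its own parameters, the same structure carries over when the worker pool is replaced by remote resources.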
Ad Hoc Mobile Network Simulation: Network performance under different microwave frequencies and different weather conditions – uses Nimrod.
Drug Design: Data Intensive Computing on Grid • It involves screening millions of chemical compounds (molecules) in the Chemical DataBase (CDB) to identify those having potential to serve as drug candidates. [Figure labels: Molecules; Protein; Chemical Databases (legacy, in .MOL2 format).]
DesignDrug@Home Architecture: A Virtual Lab for “Molecular Modeling for Drug Design” on P2P Grid [Figure: Resource Broker (RB), Grid Info. Service, Grid Market Directory, Data Replica Catalogue, PDB1, PDB2, and Grid nodes each running a GTS (Grid Trade Server).] Messages shown: • “Screen 2K molecules in 30min. for $10” • “Give me list PDBs sources of type aldrich_300?” • “service providers?” • “service cost?” • (RB maps suitable Grid nodes and Protein DataBank) • “get mol.10 from pdb1 & screen it.” • “mol.10 please?” • “mol.5 please?”
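A request such as “Screen 2K molecules in 30min. for $10” combines a job count, a deadline, and a budget. The sketch below shows one simple way a broker could check such a request against advertised resources: a greedy, cheapest-first allocation. It is an assumption made for illustration, not the scheduling algorithm used by the broker in the slide, and the resource names, throughputs, and prices are invented.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    jobs_per_minute: float  # assumed throughput (hypothetical figure)
    cost_per_job: float     # assumed price in dollars (hypothetical figure)


def plan(resources, total_jobs, deadline_min, budget):
    """Greedy cost-minimising sketch.

    Fill the cheapest resources first, giving each no more jobs than it
    can finish before the deadline. Returns the allocation, its total
    cost, and whether all jobs fit within both deadline and budget.
    """
    allocation = {}
    remaining = total_jobs
    cost = 0.0
    for res in sorted(resources, key=lambda r: r.cost_per_job):
        if remaining == 0:
            break
        capacity = int(res.jobs_per_minute * deadline_min)
        jobs = min(capacity, remaining)
        if jobs > 0:
            allocation[res.name] = jobs
            cost += jobs * res.cost_per_job
            remaining -= jobs
    feasible = remaining == 0 and cost <= budget
    return allocation, cost, feasible


if __name__ == "__main__":
    nodes = [
        Resource("cheap", jobs_per_minute=20, cost_per_job=0.002),
        Resource("mid", jobs_per_minute=40, cost_per_job=0.005),
        Resource("fast", jobs_per_minute=100, cost_per_job=0.01),
    ]
    allocation, cost, feasible = plan(nodes, 2000, deadline_min=30, budget=10.0)
    print(allocation, round(cost, 2), feasible)
```

With these invented figures the two cheaper resources cannot finish all 2000 jobs in 30 minutes, so the remainder spills over to the most expensive one, and the plan is feasible only if the total still fits the budget.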
MEG (MagnetoEncephaloGraphy) Data Analysis on the Grid: Brain Activity Analysis [Collaboration with Osaka University, Japan] • Analysis of all pairs (64x64) of MEG data by shifting the temporal region of MEG data over time: 0 to 29750: 64x64x29750 jobs. • Life-electronics laboratory, AIST: • Provision of expertise in the analysis of brain function • Provision of MEG analysis. [Figure: 64-sensor MEG; numbered steps 1 to 5 covering Data Generation, Data Analysis, and Results; Nimrod-G with [deadline, budget, optimization preference]; World-Wide Grid.]
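The job count on this slide can be worked out directly. The snippet below takes the slide's own figures (64 sensors, 29750 time shifts) and enumerates job specifications one at a time, so the full list is never held in memory. The tuple layout `(sensor_a, sensor_b, shift)` is an assumption made for illustration.

```python
from itertools import islice, product

SENSORS = 64
TIME_SHIFTS = 29750  # taken from the slide's "64x64x29750 jobs"


def job_specs():
    """Yield one (sensor_a, sensor_b, shift) tuple per analysis job."""
    return product(range(SENSORS), range(SENSORS), range(TIME_SHIFTS))


sensor_pairs = SENSORS * SENSORS
total_jobs = sensor_pairs * TIME_SHIFTS
first_three = list(islice(job_specs(), 3))

if __name__ == "__main__":
    print(sensor_pairs, "sensor pairs,", total_jobs, "jobs")
    print(first_three)
```

At roughly 122 million independent jobs, this is the kind of workload that motivates distributing the analysis across many resources.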
Collaborative Engineering. Access GRID: http://www-fp.mcs.anl.gov/fl/accessgrid/ • Group to group interactions. • Human collaboration across distributed locations. • Remote visualizations, virtual meetings, seminars, etc. • Uses grid technologies for secure communication etc. • May have interaction with scientific apps. Rick Stevens & Team, ANL. [Figure: Components of an AG Node: Display Computer, Video Capture Computer, Audio Capture Computer, Control Computer, Echo Canceller, Mixer, and NETWORK, with RGB Video, NTSC Video, Digital Video, Analog Audio, and Digital Audio links.]
Parallelisation of Image Rendering • Image splitting (by rows, columns, and checker). • Each segment can be processed concurrently on a different node, and the image can be rendered as segments are completed.
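The three splitting schemes can be sketched as functions that return pixel regions. The function names and the `(row_start, row_end, col_start, col_end)` region format (end-exclusive) are choices made for this illustration, and "checker" is interpreted here as a grid of square tiles.

```python
def split_rows(height, width, parts):
    """Split the image into `parts` horizontal bands of near-equal height."""
    bounds = [height * i // parts for i in range(parts + 1)]
    return [(bounds[i], bounds[i + 1], 0, width) for i in range(parts)]


def split_columns(height, width, parts):
    """Split the image into `parts` vertical bands of near-equal width."""
    bounds = [width * i // parts for i in range(parts + 1)]
    return [(0, height, bounds[i], bounds[i + 1]) for i in range(parts)]


def split_checker(height, width, tile):
    """Split the image into square tiles of side `tile`.

    Tiles on the right and bottom edges may be smaller.
    """
    return [
        (r, min(r + tile, height), c, min(c + tile, width))
        for r in range(0, height, tile)
        for c in range(0, width, tile)
    ]


if __name__ == "__main__":
    print(split_rows(10, 8, 3))
    print(split_columns(10, 8, 2))
    print(len(split_checker(10, 8, 4)), "tiles")
```

Each region can be handed to a different node; since the regions do not overlap and together cover the image, results can be pasted into the final image in whatever order they arrive.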
Scheduling (needs load balancing) • Rendering each row takes a different amount of time depending on the nature of the image – e.g., rows across the sky take less time to render than those that intersect the interesting parts of the image. • Rendering apps can be implemented using MPI, PVM, or p-study tools like Nimrod, and then scheduled.
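A small simulation makes the load-balancing point concrete. It compares a static split (each worker gets a contiguous block of rows) with a dynamic work queue (the next row goes to whichever worker becomes free first). The per-row costs are invented for illustration, and neither function is tied to MPI, PVM, or Nimrod.

```python
import heapq


def static_makespan(row_costs, workers):
    """Finish time when each worker gets one contiguous block of rows."""
    n = len(row_costs)
    bounds = [n * i // workers for i in range(workers + 1)]
    return max(
        sum(row_costs[bounds[i]:bounds[i + 1]]) for i in range(workers)
    )


def dynamic_makespan(row_costs, workers):
    """Finish time when rows are pulled from a shared queue.

    A min-heap of worker finish times models "the next row goes to the
    worker that becomes free first".
    """
    finish_times = [0.0] * workers
    heapq.heapify(finish_times)
    for cost in row_costs:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + cost)
    return max(finish_times)


if __name__ == "__main__":
    # Hypothetical costs: four cheap "sky" rows, then four detailed rows.
    rows = [1, 1, 1, 1, 5, 5, 5, 5]
    print("static: ", static_makespan(rows, 2))
    print("dynamic:", dynamic_makespan(rows, 2))
```

With these figures the static split leaves one worker with all the expensive rows, while the work queue spreads them across both workers and finishes sooner.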
CERN Large Hadron Collider - a circular particle accelerator to be placed in a 27 km long tunnel in 2005.
Conclude with a comparison with the Electrical Grid... Where are we?