Simulation for Grid Computing Henri Casanova Univ. of California, San Diego casanova@cs.ucsd.edu
Grid Research • Grid researchers often ask questions in the broad area of distributed computing • Which scheduling algorithm is best for this application on a given Grid? • Which design is best for implementing a distributed Grid resource information service? • Which caching strategy is best for enabling a community of users to do distributed data analysis? • What are the properties of several resource management policies in terms of fairness and throughput? • What is the scalability of my publish-subscribe system under various levels of failures? • ... Grid Performance Workshop 2005
Grid Research • Analytical or Experimental? • Analytical Grid computing research • Some have developed purely analytical / mathematical models for Grid computing • makes it possible to prove interesting theorems • often too simplistic to convince practitioners • but sometimes useful for understanding principles in spite of dubious applicability • One often uncovers NP-complete problems anyway • e.g., routing, partitioning, scheduling problems • And one must run experiments • Experimental Grid computing research: based on experiments • This is how most published work is done Grid Performance Workshop 2005
Grid Computing as Science? • You can tell you are a scientific discipline if • You can read a paper, easily reproduce (at least a subset of) its results, and improve on them • You can tell a grad student “Here are the standard tools, go learn how to use them and come back in one month” • You can give a 1-hour seminar on widely accepted tools that are the basis for doing research in the area • We are not there today • But perhaps I can give a 1-hour seminar on emerging tools that could be the basis for doing research in the area, provided a few open questions are addressed • Need for standard ways to run “Grid experiments” Grid Performance Workshop 2005
Grid Experiments • Real-world experiments are good • Eminently “believable” • Demonstrate that the proposed approach can be implemented in practice But... • Can be time-intensive • Execution of “applications” for hours, days, months, ... • Can be labor-intensive • Entire application needs to be built and functional • including all design / algorithm alternatives • including all hooks for deployment • Isn’t it bad engineering practice to build many full-fledged solutions just to find out which one works best? Grid Performance Workshop 2005
Grid Experiments (2) • What experimental testbed? • My own little testbed • well-behaved, controlled, stable • often not representative of real Grids • Real grid platforms • (Still) challenging for many grid researchers to obtain • Not built as a computer scientist’s playpen • other users may disrupt experiments • other users may find experiments disruptive • Platform will experience failures that may disrupt the experiments • Platform configuration may change drastically while experiments are being conducted • Experiments are uncontrolled and unrepeatable • even if disruption from other users is part of the experiments, it prevents back-to-back runs of competitor designs / algorithms Grid Performance Workshop 2005
Grid Experiments (3) • Difficult to obtain statistically significant results on an appropriate testbed And to make things worse... • Experiments are limited to the testbed • What part of the results is due to idiosyncrasies of the testbed? • Extrapolations are possible, but rarely convincing • Must use a collection of testbeds... • Still limited exploration of “what if” scenarios • what if the network were different? • what if we were in 2015? • Difficult for others to reproduce results • Yet this is the basis for scientific advances! Grid Performance Workshop 2005
Simulation • Simulation can solve many (all) of these difficulties • No need to build a real system • Conduct controlled / repeatable experiments • In principle, no limits to experimental scenarios • Possible for anybody to reproduce results • ... • Simulation • “representation of the operation of one system (A) through the use of another (B)” • Computer simulation: B a computer program • Key question: Validation • correspondence between simulation and real-world Grid Performance Workshop 2005
Simulation in Computer Science • Microprocessor Design • A few standard “cycle-accurate” simulators are used extensively • http://www.cs.wisc.edu/~arch/www/tools.html • Possible to reproduce simulation results • Networking • A few standard “packet-level” simulators • NS-2, DaSSF, OMNeT++ • Well-known datasets for network topologies • Well-known generators of synthetic topologies • SSF standard: http://www.ssfnet.org/ • Possible to reproduce simulation results • Grid Computing? • None of the above up until a few years ago • Most people built their own “ad-hoc” solutions • Promising recent developments Grid Performance Workshop 2005
Simulation of Distributed Computing Platforms? • Simulation of parallel platforms used for decades • Most simulations have made drastic assumptions • Simplistic platform model • Processors perform work at some fixed rate (Flops) • Network links send data at some fixed rate • Topology is fully connected (no communication interference) or a bus (simple communication interference) • Communication and computation are perfectly overlappable • Simplistic application model • All computation is CPU intensive • Clear-cut communication and computation phases • Application is deterministic • Straightforward simulation in most cases • Just fill in a Gantt chart with a computer rather than by hand • No need for a “simulation standard” Grid Performance Workshop 2005
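Under these simplistic assumptions, the simulation really is just filling in a Gantt chart. A minimal Python sketch of that idea (the task graph, rates, and data sizes below are purely illustrative):

    # Minimal "fill in the Gantt chart" simulation under the simplistic model:
    # fixed compute rate, fixed link rate, fully connected topology, no contention.
    COMPUTE_RATE = 1e9      # flops per second, same on every processor
    LINK_RATE    = 1e7      # bytes per second on every link

    # task -> (flops, dependencies, bytes received from each dependency)
    tasks = {
        "t1": (2e9, [], {}),
        "t2": (4e9, ["t1"], {"t1": 5e7}),
        "t3": (3e9, ["t1"], {"t1": 2e7}),
        "t4": (1e9, ["t2", "t3"], {"t2": 1e7, "t3": 1e7}),
    }

    finish = {}
    for name in ["t1", "t2", "t3", "t4"]:          # topological order
        flops, deps, data = tasks[name]
        # a task starts once every predecessor has finished and its data has arrived
        start = max((finish[d] + data[d] / LINK_RATE for d in deps), default=0.0)
        finish[name] = start + flops / COMPUTE_RATE
        print(f"{name}: start={start:.2f}s finish={finish[name]:.2f}s")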
Grid Simulations? • Simple models: • perhaps justifiable for a switched dedicated cluster running a matrix multiply application • Hardly justifiable for grid platforms • Complex and wide-reaching network topologies • multi-hop networks, heterogeneous bandwidths and latencies • non-negligible latencies • complex bandwidth sharing behaviors • contention with other traffic • Overhead of middleware • Complex resource access/management policies • Interference of communication and computation Grid Performance Workshop 2005
Grid Simulations • Recognized as a critical area • Grid eXplorer (GdX) project (INRIA) • Build an actual scientific instrument • Databases of “experimental conditions” • 1000-node cluster for simulation/emulation • Visualization tools • Still in planning • What simulation technology??? • Two main questions • What does a “representative” Grid look like? • How does one do simulation on a synthetic representative Grid? Grid Performance Workshop 2005
Presentation Outline • Introduction • Simulation for Grid Computing? • Generating Synthetic Grids • Simulating Applications on Synthetic Grids • Current Work and Future Directions Grid Performance Workshop 2005
Why Synthetic Platforms? • Two goals of simulation: • Simulate platforms beyond the ones at hand • Perform sensitivity analyses • Need: synthetic platforms • Examine real platforms • Discover principles • Implement “platform generators” • Which simulation results go in my paper? • Results for a few “real” platforms • Results for many synthetic platforms Grid Performance Workshop 2005
Generation of Synthetic Grids • Three main elements • Network Topology • Graph • Bandwidths and latencies (backbone links, LANs) • Compute Resources • Workstations, data stores, servers, clusters • And other resources • “Background” Conditions • Load and unavailability • Failures • What is representative and tractable? Grid Performance Workshop 2005
Synthetic Network Topologies • The network community has wondered about the properties of the Internet topology for years • The Internet grows in a decentralized fashion with seemingly complex rules and incentives • Could it have a mathematical structure? • Could we then have generative models? • Three “generations of graph generators” • Random • Structural • Degree-based Grid Performance Workshop 2005
Random Topology Generators • Simplistic • Place N nodes randomly in a square • Vertices u, v connected with probability P(u,v) = p (a fixed constant) • Waxman [JSAC’88] • Place N nodes randomly in a CxC square • P(u,v) = α·e^(−d / (β·C·√2)), 0 < α, β ≤ 1 • d = Euclidean distance between u and v • First model to be widely used for network simulations • Others • Exponential, Locality-based, ... • Shortcoming: real networks have a non-random structure with some “hierarchy” Grid Performance Workshop 2005
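As an illustration, here is a rough Python sketch of a Waxman-style generator following the formula above; the alpha, beta, and square-size values are arbitrary choices, not values prescribed by the model:

    import math, random

    def waxman(n, alpha=0.4, beta=0.2, side=100.0, seed=0):
        """Toy Waxman generator: place n nodes in a side x side square and connect
        u,v with probability alpha * exp(-d / (beta * side * sqrt(2)))."""
        rng = random.Random(seed)
        pos = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
        edges = []
        for u in range(n):
            for v in range(u + 1, n):
                d = math.dist(pos[u], pos[v])
                if rng.random() < alpha * math.exp(-d / (beta * side * math.sqrt(2))):
                    edges.append((u, v))
        return pos, edges

    positions, links = waxman(50)
    print(len(links), "links among 50 nodes")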
Structural Generators • “... the primary structural characteristic affecting the paths between nodes in the Internet is the distribution between stub and transit domains... In other words, there is a hierarchy imposed on nodes...” [Zegura et al, 1997] • Both at the “AS level” (peering relationships) and at the “router level” (Layer 2) • Quickly became accepted wisdom: • Transit-Stub [Calvert et al., 1997] • Tiers [Doar, 1996] • GT-ITM [Zegura et al., 1997] Grid Performance Workshop 2005
Power-Laws! • In 1999, Faloutsos et al. [SIGCOMM’99] rocked topology generation with power laws • Results obtained both for AS-level and router-level information from real networks • Outdegree (number of edges leaving a node) • For each node v, compute its outdegree d_v • Node rank, r_v: the index of v in the order of decreasing degree • Nodes can have the same rank • Law: d_v is proportional to r_v^R (R is the rank exponent) • (plot: measured outdegree vs. rank) Grid Performance Workshop 2005
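A small Python sketch of how one can check this rank/outdegree law on a generated topology: sort nodes by decreasing degree and fit log(degree) against log(rank). This is a generic illustration, not the methodology of the Faloutsos paper:

    import math

    def rank_exponent(edges, num_nodes):
        """Estimate R in d_v ~ r_v^R by a least-squares fit on a log-log scale."""
        degree = [0] * num_nodes
        for u, v in edges:
            degree[u] += 1
            degree[v] += 1
        ranked = [(rank + 1, d) for rank, d in
                  enumerate(sorted(degree, reverse=True)) if d > 0]
        xs = [math.log(r) for r, _ in ranked]
        ys = [math.log(d) for _, d in ranked]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
                / sum((x - mx) ** 2 for x in xs))      # the rank exponent R

    # e.g., on the toy Waxman graph from the previous sketch:
    # print("estimated R =", rank_exponent(links, 50))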
Power-Laws! • Random generators do not agree with power laws • Structural generators do not agree with power laws • (plots: rank/outdegree for Waxman; GT-ITM, flat; GT-ITM, Transit-Stub) Grid Performance Workshop 2005
Degree-based Generators • New common wisdom: A topology that does not agree with power laws cannot be representative • Flurry of development of “power-law generators” after the Faloutsos paper • CMU power law graph generator [Palmer et al., 2000] • Inet [Jin et al., 2000] • BRITE [Medina et al., 2001] • PLRG [Aiello et al., 2000] • (plot: example BRITE-generated topology) Grid Performance Workshop 2005
Structure vs. Power-law • We know networks have structure AND power laws • Combine both? • GridG project [Dinda et al., SC’03] • Use a Tiers-generated topology • Add random links to satisfy the power laws • How about just using power laws? • Comparison paper [Tangmunarunkit et al., SIGCOMM’02] • Degree-based generators capture the large-scale structure of real networks very well, better than structural generators! • structural generators impose “too strict” a hierarchy • Hierarchy arises naturally from degree-based generators • e.g., backbone links • Works both at the AS level and at the router level Grid Performance Workshop 2005
Short story • What generator? • To model large networks (e.g., > 500 routers) • use degree-based generators • To model small networks (e.g., < 100 routers) • use structural generators • degree-based will not create any hierarchy Grid Performance Workshop 2005
Bandwidths, latencies, traffic, etc. • Topology generators only produce a graph • We need link characteristics as well • Option #1: Model physical characteristics • Some “models” in topology generators • Need to simulate background traffic • No accepted model for generating background traffic • Simulation can be very costly • Option #2: Model end-to-end performance • Models ([Lee, HCW’01]) or Measurements (NWS, ...) • Go from path modeling to link modeling? • Turns out to be a difficult question • DARPA workshop on network modeling Grid Performance Workshop 2005
Bandwidths, latencies, traffic, etc. • Maybe none of this matters? • Fiber inside the network mostly unused • Communication bottleneck is the local link • Appropriate tuning of TCP or better protocols should saturate the local link • Don’t care about topology at all! • Or maybe none of this matters for my application • No network contention Grid Performance Workshop 2005
Compute Resources • What resources do we put at the end-points? • Option #1: “ad-hoc” generalization • Look at the TeraGrid • Generate new sites based on existing sites • Option #2: Statistical modeling • Examine many production resources • Identify key statistical characteristics • Come up with a generative/predictive model Grid Performance Workshop 2005
Synthetic Clusters? • Many Grid resources are clusters • What is the “typical” distribution of clusters? • “Commodity Cluster synthesizer” [Kee et al., SC’04] • Examined 114 production clusters (10K+ procs) • Came up with statistical models • Validated model against a set of 191 clusters (10K+ procs) • Models allow “extrapolation” for future configurations • Models implemented in a resource generator Grid Performance Workshop 2005
Architecture/Clock Models • Current distribution of processor families • Linear fit between clock rate and release year within a processor family • Quadratic model for the fraction of processors released in a given year • Model future distributions and speeds Grid Performance Workshop 2005
Other models? • Other models • Cache size: grows logarithmically • Number of processors per node: log2-normal • Memory size: log2-normal • Number of nodes per cluster: log2-normal • Models were validated against a snapshot of ROCKS clusters • These clusters have been added to the “training” set • More clusters are added every month • GridG • Provide a generic framework in which such laws and other correlations can be encoded Grid Performance Workshop 2005
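For illustration, a minimal Python sketch of sampling one synthetic cluster from “log2-normal” distributions of this kind; the mu/sigma parameters are placeholders, not the values fitted in the Kee et al. study:

    import random

    def log2_normal(rng, mu, sigma):
        """Sample 2**X with X drawn from Normal(mu, sigma)."""
        return 2 ** rng.gauss(mu, sigma)

    rng = random.Random(42)
    cluster = {
        "nodes_per_cluster": max(1, round(log2_normal(rng, 5.0, 1.5))),
        "procs_per_node":    max(1, round(log2_normal(rng, 1.0, 0.5))),
        "memory_GB":         max(1, round(log2_normal(rng, 2.0, 1.0))),
    }
    print(cluster)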
Resource Availability / Workload • Probabilistic models • Naive: exponentially distributed availability and unavailability intervals • Sophisticated: Weibull distributions [Wolski et al.] • Traces • NWS, etc. • Desktop Grid resources [Kondo, SC’04] • Workload models • e.g., batch schedulers • Traces • Models [Feitelson, JPDC’03] • job inter-arrival times: Gamma • amount of work requested: Hyper-Gamma • number of processors requested: compound distribution (emphasizing powers of two and serial jobs) Grid Performance Workshop 2005
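A toy Python sketch in the spirit of such workload models, drawing Gamma-distributed inter-arrival times and work amounts; all distribution parameters below are placeholders rather than the published values:

    import random

    def synthetic_jobs(n, shape=2.0, scale=300.0, seed=1):
        """Generate n synthetic jobs: (arrival time, requested work, processors)."""
        rng = random.Random(seed)
        t, jobs = 0.0, []
        for _ in range(n):
            t += rng.gammavariate(shape, scale)      # inter-arrival time (seconds)
            work = rng.gammavariate(2.0, 1800.0)     # requested work (CPU-seconds)
            procs = 2 ** rng.randint(0, 6)           # powers of two dominate
            jobs.append((t, work, procs))
        return jobs

    for arrival, work, procs in synthetic_jobs(5):
        print(f"arrival={arrival:8.1f}s  work={work:8.1f}  procs={procs}")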
A Sample Synthetic Grid • Generate 5,000 routers with BRITE • Annotate latencies according to BRITE’s Euclidean distance method (scaling to obtain the desired network diameter) • Annotate bandwidths based on a set of end-to-end NWS measurements • Pick 30% of the end-points • Generate a cluster at each end-point according to Kee’s synthesizer for year 2006 • Model cluster load with Feitelson’s model, with a range of parameters for the random distributions • Model resource failures based on Inca measurements on the TeraGrid Grid Performance Workshop 2005
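A hypothetical end-to-end pipeline mirroring this recipe might look like the Python sketch below; every helper and value is an illustrative stand-in for the real tools (BRITE, NWS measurements, Kee's synthesizer, Feitelson's model, Inca traces):

    import random

    rng = random.Random(0)
    n_routers = 5000
    routers = list(range(n_routers))                         # BRITE would return a graph

    latency_ms = {r: rng.uniform(1.0, 50.0) for r in routers}           # placeholder annotation
    bandwidth_mbps = {r: rng.choice([10, 100, 1000]) for r in routers}  # "from measurements"

    endpoints = rng.sample(routers, int(0.30 * n_routers))   # pick 30% of the end-points
    sites = {e: {"cluster": {"nodes": 2 ** rng.randint(4, 9), "year": 2006},
                 "load_model": "feitelson",                  # stand-in label
                 "failure_trace": "inca-teragrid"}           # stand-in label
             for e in endpoints}
    print(len(sites), "synthetic sites generated")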
Synthetic Grid Generation • Still far from widely accepted standards • Many ongoing, promising efforts • Researchers have recognized this as an issue • Tools from networking can be reused • A few “Grid” tools are available • What is really needed • Repository of “Grid Measurements” • Repository of “Synthetic/Generated Grids” Grid Performance Workshop 2005
Presentation Outline • Introduction • Simulation for Grid Computing? • Generating Synthetic Grids • Simulating Applications on Synthetic Grids • Current Work and Future Directions Grid Performance Workshop 2005
Simulation Levels • Simulating applications on a synthetic platform? • Spectrum of simulation levels, from more abstract to less abstract: • Mathematical simulation: based solely on equations • Discrete-event simulation: abstraction of the system as a set of dependent actions and events (fine- or coarse-grain) • Emulation: trapping and virtualization of low-level application/system actions • Boundaries above are blurred (d.e. simulation ~ emulation) • A simulation can combine all paradigms at different levels Grid Performance Workshop 2005
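To make the middle of the spectrum concrete, here is a minimal generic discrete-event core in Python (an event queue processed in timestamp order); it is an illustration of the paradigm, not code from MicroGrid or SimGrid:

    import heapq

    class Simulator:
        """Tiny discrete-event core: events are (time, action) pairs processed in
        timestamp order; actions may schedule further events."""
        def __init__(self):
            self.now = 0.0
            self._queue = []
            self._counter = 0                      # tie-breaker for equal timestamps

        def schedule(self, delay, action):
            heapq.heappush(self._queue, (self.now + delay, self._counter, action))
            self._counter += 1

        def run(self):
            while self._queue:
                self.now, _, action = heapq.heappop(self._queue)
                action(self)

    def send_message(sim):
        print(f"[{sim.now:6.2f}s] message sent")
        sim.schedule(1.5, lambda s: print(f"[{s.now:6.2f}s] message received"))

    sim = Simulator()
    sim.schedule(0.0, send_message)
    sim.run()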
Simulation Options (from more abstract to less abstract) • Network • Macroscopic: flows in “pipes” • coarse-grain d.e. simulation + mathematical simulation • Microscopic: packet-level simulation • fine-grain d.e. simulation • Actual flows go through “some” network • emulation • CPU • Macroscopic: flows in a “pipe” • coarse-grain d.e. simulation + mathematical simulation • Microscopic: cycle-accurate simulation • fine-grain d.e. simulation • Virtualization via another CPU / virtual machine • emulation Grid Performance Workshop 2005
Simulation Options (from more abstract to less abstract) • Application • Macroscopic: application = analytical “flow” • Less macroscopic: sets of abstract “tasks” with resource needs and dependencies • coarse-grain d.e. simulation • Application specification or pseudo-code API • Virtualization • emulation of actual code with trapping of application-generated events • Two projects • MicroGrid (UCSD) • SimGrid (UCSD + IMAG + Univ. Nancy) Grid Performance Workshop 2005
MicroGrid • Set of simulation tools for evaluating middleware, applications, and network services for Grid systems • Applications are supported by emulation and virtualization • Actual application code is executed on “virtualized” resources • MicroGrid accounts for • CPU • network • (figure: the application runs on virtual resources that MicroGrid maps onto physical resources) Grid Performance Workshop 2005
MicroGrid Virtualization • Resource virtualization • “resource names” are virtualized • gethostname, sockets, Globus’ GIS, MDS, NWS, etc. • Time virtualization • Simulating the TeraGrid on a 4-node cluster • Simulating a 4-node cluster on the TeraGrid • CPU virtualization • Direct execution on (a fraction of) a physical resource • No application modification • Main challenge • Synchronization between real time and virtual time Grid Performance Workshop 2005
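As a toy illustration of the idea of mapping real elapsed time to virtual time (not of how MicroGrid actually implements it), the Python sketch below reports the host's measured run time scaled by an assumed speed ratio:

    import time

    SPEED_RATIO = 0.25          # assume the virtual resource is 4x faster than the host

    def virtualized(fn):
        """Report real elapsed time scaled into virtual time for the wrapped call."""
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            elapsed_real = time.perf_counter() - start
            print(f"real: {elapsed_real:.3f}s -> virtual: {elapsed_real * SPEED_RATIO:.3f}s")
            return result
        return wrapper

    @virtualized
    def busy_work(n):
        return sum(i * i for i in range(n))

    busy_work(1_000_000)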
MicroGrid Network • Packet-level simulation • Network calls are intercepted • Are sent to a network simulator that has been configured with the virtual network topology • Implemented with MaSSF • An MPI version of DaSSF [Liu et al., 2001] • Configured via the SSF standard (DML, etc.) • Protocol stacks implemented on top of MaSSF • TCP, UDP, BGP, etc. • Socket calls are trapped without application modification • real data is sent • delays are scaled to match the virtual time • Main challenges • scaling (20,000 routers on a 128-node cluster) • synchronization with computation Grid Performance Workshop 2005
MicroGrid in a Nutshell • Application: virtualization via trapping of application events (emulation) • Can have high overhead • But captures the overhead! • CPU: virtualization via another CPU (emulation) • Can be really slow • But hopefully accurate • Network: microscopic packet-level simulation (fine-grain discrete-event simulation) • Can be really slow for long transfers • But hopefully accurate Grid Performance Workshop 2005
SimGrid • Originally developed for scheduling research • Must be fast to allow for thousands of simulations • Application • No real application code is executed • Consists of tasks that have • dependencies • resource consumptions • Resources • No virtualization • A resource is defined by • a rate at which it does “work” • a fixed overhead that must be paid by each task • traces of the above if needed + failures • A task can use multiple resources • e.g., a data transfer over multiple links, or a computation that uses a disk and a CPU Grid Performance Workshop 2005
SimGrid • Uses a combination of mathematical simulation and coarse-grain discrete event simulation • Simple API to “specify” an application rather than having it already implemented • Fast simulation • Key issue: Resource sharing • In MicroGrid: resource sharing “emerges” out of the low-level emulation and simulation • Packets of different connections interleaved by routers • CPU cycles of different processes get slices of the CPU • Drawback: slow simulation • How can one do something faster that is still reasonable? • Come up with macroscopic models of resource sharing Grid Performance Workshop 2005
Resource Sharing in SimGrid • Macroscopic resource sharing can be easy • A CPU: CPU-bound processes/threads get a fair share of the CPU “in steady state” • Why go through the trouble of emulating CPU-bound processes? • Just say how many cycles they need, and “compute” how many cycles they get per second • Macroscopic resource sharing can be not so easy • The Network • Many end-points, routers, and links • Many end-to-end TCP flows? • How much bandwidth does each flow receive? Grid Performance Workshop 2005
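A small Python sketch of this macroscopic CPU model: tasks declare how many cycles they need, the CPU capacity is divided equally among the tasks still running, and finish times follow directly (demands and speed are illustrative):

    def fair_share_finish_times(demands_cycles, speed_cycles_per_s):
        """Simulate CPU-bound tasks sharing one CPU equally; when a task finishes,
        the survivors' share grows.  Returns task id -> finish time (seconds)."""
        remaining = dict(enumerate(demands_cycles))
        now, finish = 0.0, {}
        while remaining:
            share = speed_cycles_per_s / len(remaining)       # equal share per task
            tid, need = min(remaining.items(), key=lambda kv: kv[1])
            dt = need / share                                 # time to next completion
            now += dt
            for k in remaining:
                remaining[k] -= share * dt
            finish[tid] = now
            del remaining[tid]
        return finish

    print(fair_share_finish_times([2e9, 4e9, 6e9], 1e9))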
Bandwidth Sharing • (figure: bandwidth-sharing example with 10 Mb/s, 8 Mb/s, and 2 Mb/s links/flows) • Macroscopic TCP modeling is a field • “Fluid in pipe” analogy • Rule of thumb: the share a flow gets on its bottleneck link is inversely proportional to its round-trip time • It turns out that TCP in steady state implements a type of resource sharing called Max-Min Fairness Grid Performance Workshop 2005
Max-Min Fairness • Principle • Consider the set of all network links, L • c_l is the capacity of link l • Consider the set of all flows, R • a flow r = the subset of L it traverses • x_r is the bandwidth allocated to flow r • Bandwidth capacities are respected: ∀ l ∈ L, Σ_{r ∈ R : l ∈ r} x_r ≤ c_l • TCP in steady state is such that min_{r ∈ R} x_r is maximized • The above can be solved efficiently (with appropriate data structures) Grid Performance Workshop 2005
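One standard way to compute such an allocation is “progressive filling”: grow all flow rates together and freeze the flows crossing each link as it saturates. A simple Python sketch (links, capacities, and flows are illustrative, and this is not SimGrid's actual implementation):

    def max_min_fair(links, flows):
        """links: {link: capacity}; flows: {flow: set of links it crosses}.
        Progressive filling: repeatedly find the bottleneck link (smallest fair
        share), freeze its flows at that share, and remove the link."""
        alloc, cap = {}, dict(links)
        active = {f: set(ls) for f, ls in flows.items()}
        while active:
            share = {l: cap[l] / sum(1 for ls in active.values() if l in ls)
                     for l in cap if any(l in ls for ls in active.values())}
            if not share:
                break
            bottleneck = min(share, key=share.get)
            rate = share[bottleneck]
            for f in [f for f, ls in active.items() if bottleneck in ls]:
                alloc[f] = rate                       # freeze this flow at the fair share
                for l in active[f]:
                    cap[l] -= rate                    # its bandwidth is now committed
                del active[f]
        return alloc

    links = {"L1": 10.0, "L2": 8.0}                   # capacities in Mb/s
    flows = {"f1": {"L1"}, "f2": {"L1", "L2"}, "f3": {"L2"}}
    print(max_min_fair(links, flows))                 # {'f2': 4.0, 'f3': 4.0, 'f1': 6.0}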