Simulation for Grid Computing Henri Casanova Univ. of California, San Diego casanova@cs.ucsd.edu
Grid Research • Grid researchers often ask questions in the broad area of distributed computing • Which scheduling algorithm is best for this application on a given Grid? • Which design is best for implementing a distributed Grid resource information service? • Which caching strategy is best for enabling a community of users to do distributed data analysis? • What are the properties of several resource management policies in terms of fairness and throughput? • What is the scalability of my publish-subscribe system under various levels of failures? • ... Grid Performance Workshop 2005
Grid Research • Analytical or Experimental? • Analytical Grid computing research • Some have developed purely analytical / mathematical models for Grid computing • makes it possible to prove interesting theorems • often too simplistic to convince practitioners • but sometimes useful for understanding principles in spite of dubious applicability • One often uncovers NP-complete problems anyway • e.g., routing, partitioning, scheduling problems • And one must run experiments • Experimental Grid computing research: based on experiments • This is how most published work is done Grid Performance Workshop 2005
Grid Computing as Science? • You can tell you are a scientific discipline if • You can read a paper, easily reproduce (at least a subset of) its results, and improve on them • You can tell a grad student “Here are the standard tools, go learn how to use them and come back in one month” • You can give a 1-hour seminar on widely accepted tools that are the basis for doing research in the area • We are not there today • But perhaps I can give a 1-hour seminar on emerging tools that could be the basis for doing research in the area, provided a few open questions are addressed • Need for standard ways to run “Grid experiments” Grid Performance Workshop 2005
Grid Experiments • Real-world experiments are good • Eminently “believable” • Demonstrate that the proposed approach can be implemented in practice But... • Can be time-intensive • Execution of “applications” for hours, days, months, ... • Can be labor-intensive • Entire application needs to be built and functional • including all design / algorithm alternatives • including all hooks for deployment • Isn’t it bad engineering practice to build many full-fledged solutions just to find out which one works best? Grid Performance Workshop 2005
Grid Experiments (2) • What experimental testbed? • My own little testbed • well-behaved, controlled, stable • often not representative of real Grids • Real grid platforms • (Still) challenging for many grid researchers to obtain • Not built as a computer scientist’s playpen • other users may disrupt experiments • other users may find experiments disruptive • Platform will experience failures that may disrupt the experiments • Platform configuration may change drastically while experiments are being conducted • Experiments are uncontrolled and unrepeatable • even if disruption from other users is part of the experiments, it prevents back-to-back runs of competitor designs / algorithms Grid Performance Workshop 2005
Grid Experiments (3) • Difficult to obtain statistically significant results on an appropriate testbed And to make things worse... • Experiments are limited to the testbed • What part of the results is due to idiosyncrasies of the testbed? • Extrapolations are possible, but rarely convincing • Must use a collection of testbeds... • Still limited exploration of “what if” scenarios • what if the network were different? • what if we were in 2015? • Difficult for others to reproduce results • Yet this is the basis for scientific advances! Grid Performance Workshop 2005
Simulation • Simulation can solve many (all) of these difficulties • No need to build a real system • Conduct controlled / repeatable experiments • In principle, no limits to experimental scenarios • Possible for anybody to reproduce results • ... • Simulation • “representation of the operation of one system (A) through the use of another (B)” • Computer simulation: B a computer program • Key question: Validation • correspondence between simulation and real-world Grid Performance Workshop 2005
Simulation in Computer Science • Microprocessor Design • A few standard “cycle-accurate” simulators are used extensively • http://www.cs.wisc.edu/~arch/www/tools.html • Possible to reproduce simulation results • Networking • A few standard “packet-level” simulators • NS-2, DaSSF, OMNeT++ • Well-known datasets for network topologies • Well-known generators of synthetic topologies • SSF standard: http://www.ssfnet.org/ • Possible to reproduce simulation results • Grid Computing? • None of the above up until a few years ago • Most people built their own “ad-hoc” solutions • Promising recent developments Grid Performance Workshop 2005
Simulation of Distributed Computing Platforms? • Simulation of parallel platforms used for decades • Most simulations have made drastic assumptions • Simplistic platform model • Processors perform work at some fixed rate (Flops) • Network links send data at some fixed rate • Topology is fully connected (no communication interference) or a bus (simple communication interference) • Communication and computation are perfectly overlappable • Simplistic application model • All computation is CPU intensive • Clear-cut communication and computation phases • Application is deterministic • Straightforward simulation in most cases • Just fill in a Gantt chart with a computer rather than by hand • No need for a “simulation standard” Grid Performance Workshop 2005
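Under these simplistic assumptions, the simulation really is just filling in a Gantt chart. A minimal Python sketch of that idea (the task graph, rates, and data sizes below are purely illustrative):

    # Minimal "fill in the Gantt chart" simulation under the simplistic model:
    # fixed compute rate, fixed link rate, fully connected topology, no contention.
    COMPUTE_RATE = 1e9      # flops per second, same on every processor
    LINK_RATE    = 1e7      # bytes per second on every link

    # task -> (flops, dependencies, bytes received from each dependency)
    tasks = {
        "t1": (2e9, [], {}),
        "t2": (4e9, ["t1"], {"t1": 5e7}),
        "t3": (3e9, ["t1"], {"t1": 2e7}),
        "t4": (1e9, ["t2", "t3"], {"t2": 1e7, "t3": 1e7}),
    }

    finish = {}
    for name in ["t1", "t2", "t3", "t4"]:          # topological order
        flops, deps, data = tasks[name]
        # a task starts once every predecessor has finished and its data has arrived
        start = max((finish[d] + data[d] / LINK_RATE for d in deps), default=0.0)
        finish[name] = start + flops / COMPUTE_RATE
        print(f"{name}: start={start:.2f}s finish={finish[name]:.2f}s")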
Grid Simulations? • Simple models: • perhaps justifiable for a switched dedicated cluster running a matrix multiply application • Hardly justifiable for grid platforms • Complex and wide-reaching network topologies • multi-hop networks, heterogeneous bandwidths and latencies • non-negligible latencies • complex bandwidth sharing behaviors • contention with other traffic • Overhead of middleware • Complex resource access/management policies • Interference of communication and computation Grid Performance Workshop 2005
Grid Simulations • Recognized as a critical area • Grid eXplorer (GdX) project (INRIA) • Build an actual scientific instrument • Databases of “experimental conditions” • 1000-node cluster for simulation/emulation • Visualization tools • Still in planning • What simulation technology??? • Two main questions • What does a “representative” Grid look like? • How does one do simulation on a synthetic representative Grid? Grid Performance Workshop 2005
Presentation Outline • Introduction • Simulation for Grid Computing? • Generating Synthetic Grids • Simulating Applications on Synthetic Grids • Current Work and Future Directions Grid Performance Workshop 2005
Why Synthetic Platforms? • Two goals of simulation: • Simulate platforms beyond the ones at hand • Perform sensitivity analyses • Need: synthetic platforms • Examine real platforms • Discover principles • Implement “platform generators” • Which simulation results go in my paper? • Results for a few “real” platforms • Results for many synthetic platforms Grid Performance Workshop 2005
Generation of Synthetic Grids • Three main elements • Network Topology • Graph • Bandwidths and latencies (backbone links, LANs) • Compute Resources • Workstations, data stores, servers, clusters • And other resources • “Background” Conditions • Load and unavailability • Failures • What is representative and tractable? Grid Performance Workshop 2005
Synthetic Network Topologies • The network community has wondered about the properties of the Internet topology for years • The Internet grows in a decentralized fashion with seemingly complex rules and incentives • Could it have a mathematical structure? • Could we then have generative models? • Three “generations of graph generators” • Random • Structural • Degree-based Grid Performance Workshop 2005
Random Topology Generators • Simplistic • Place N nodes randomly in a square • Vertices u, v connected with probability P(u,v) = p (a fixed constant) • Waxman [JSAC’88] • Place N nodes randomly in a CxC square • P(u,v) = α·e^(−d / (β·C·√2)), 0 < α, β ≤ 1 • d = Euclidean distance between u and v • First model to be widely used for network simulations • Others • Exponential, Locality-based, ... • Shortcoming: real networks have a non-random structure with some “hierarchy” Grid Performance Workshop 2005
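As an illustration, here is a rough Python sketch of a Waxman-style generator following the formula above; the alpha, beta, and square-size values are arbitrary choices, not values prescribed by the model:

    import math, random

    def waxman(n, alpha=0.4, beta=0.2, side=100.0, seed=0):
        """Toy Waxman generator: place n nodes in a side x side square and connect
        u,v with probability alpha * exp(-d / (beta * side * sqrt(2)))."""
        rng = random.Random(seed)
        pos = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
        edges = []
        for u in range(n):
            for v in range(u + 1, n):
                d = math.dist(pos[u], pos[v])
                if rng.random() < alpha * math.exp(-d / (beta * side * math.sqrt(2))):
                    edges.append((u, v))
        return pos, edges

    positions, links = waxman(50)
    print(len(links), "links among 50 nodes")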
Structural Generators • “... the primary structural characteristic affecting the paths between nodes in the Internet is the distribution between stub and transit domains... In other words, there is a hierarchy imposed on nodes...” [Zegura et al, 1997] • Both at the “AS level” (peering relationships) and at the “router level” (Layer 2) • Quickly became accepted wisdom: • Transit-Stub [Calvert et al., 1997] • Tiers [Doar, 1996] • GT-ITM [Zegura et al., 1997] Grid Performance Workshop 2005
Power-Laws! • In 1999, Faloutsos et al. [SIGCOMM’99] rocked topology generation with power laws • Results obtained both for AS-level and router-level information from real networks • Outdegree (number of edges leaving a node) • For each node v, compute its outdegree d_v • Node rank, r_v: the index of v in the order of decreasing degree • Nodes can have the same rank • Law: d_v is proportional to r_v^R (R is the rank exponent) • (plot: measured outdegree vs. rank) Grid Performance Workshop 2005
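A small Python sketch of how one can check this rank/outdegree law on a generated topology: sort nodes by decreasing degree and fit log(degree) against log(rank). This is a generic illustration, not the methodology of the Faloutsos paper:

    import math

    def rank_exponent(edges, num_nodes):
        """Estimate R in d_v ~ r_v^R by a least-squares fit on a log-log scale."""
        degree = [0] * num_nodes
        for u, v in edges:
            degree[u] += 1
            degree[v] += 1
        ranked = [(rank + 1, d) for rank, d in
                  enumerate(sorted(degree, reverse=True)) if d > 0]
        xs = [math.log(r) for r, _ in ranked]
        ys = [math.log(d) for _, d in ranked]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
                / sum((x - mx) ** 2 for x in xs))      # the rank exponent R

    # e.g., on the toy Waxman graph from the previous sketch:
    # print("estimated R =", rank_exponent(links, 50))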
Power-Laws! • Random generators do not agree with power laws • Structural generators do not agree with power laws • (plots: rank/outdegree for Waxman; GT-ITM, flat; GT-ITM, Transit-Stub) Grid Performance Workshop 2005
Degree-based Generators • New common wisdom: A topology that does not agree with power laws cannot be representative • Flurry of development of “power-law generators” after the Faloutsos paper • CMU power law graph generator [Palmer et al., 2000] • Inet [Jin et al., 2000] • BRITE [Medina et al., 2001] • PLRG [Aiello et al., 2000] • (plot: example BRITE-generated topology) Grid Performance Workshop 2005
Structure vs. Power-law • We know networks have structure AND power laws • Combine both? • GridG project [Dinda et al., SC’03] • Use a Tiers-generated topology • Add random links to satisfy the power laws • How about just using power laws? • Comparison paper [Tangmunarunkit et al., SIGCOMM’02] • Degree-based generators capture the large-scale structure of real networks very well, better than structural generators! • structural generators impose “too strict” a hierarchy • Hierarchy arises naturally from degree-based generators • e.g., backbone links • Works both at the AS level and at the router level Grid Performance Workshop 2005
Short story • What generator? • To model large networks (e.g., > 500 routers) • use degree-based generators • To model small networks (e.g., < 100 routers) • use structural generators • degree-based will not create any hierarchy Grid Performance Workshop 2005
Bandwidths, latencies, traffic, etc. • Topology generators only produce a graph • We need link characteristics as well • Option #1: Model physical characteristics • Some “models” in topology generators • Need to simulate background traffic • No accepted model for generating background traffic • Simulation can be very costly • Option #2: Model end-to-end performance • Models ([Lee, HCW’01]) or Measurements (NWS, ...) • Go from path modeling to link modeling? • Turns out to be a difficult question • DARPA workshop on network modeling Grid Performance Workshop 2005
Bandwidths, latencies, traffic, etc. • Maybe none of this matters? • Fiber inside the network mostly unused • Communication bottleneck is the local link • Appropriate tuning of TCP or better protocols should saturate the local link • Don’t care about topology at all! • Or maybe none of this matters for my application • No network contention Grid Performance Workshop 2005
Compute Resources • What resources do we put at the end-points? • Option #1: “ad-hoc” generalization • Look at the TeraGrid • Generate new sites based on existing sites • Option #2: Statistical modeling • Examine many production resources • Identify key statistical characteristics • Come up with a generative/predictive model Grid Performance Workshop 2005
Synthetic Clusters? • Many Grid resources are clusters • What is the “typical” distribution of clusters? • “Commodity Cluster synthesizer” [Kee et al., SC’04] • Examined 114 production clusters (10K+ procs) • Came up with statistical models • Validated model against a set of 191 clusters (10K+ procs) • Models allow “extrapolation” for future configurations • Models implemented in a resource generator Grid Performance Workshop 2005
Architecture/Clock Models • Current distribution of processor families • Linear fit between clock rate and release year within a processor family • Quadratic model for the fraction of processors released in a given year • Model future distributions and speeds Grid Performance Workshop 2005
Other models? • Other models • Cache size: grows logarithmically • Number of processors per node: log2-normal • Memory size: log2-normal • Number of nodes per cluster: log2-normal • Models were validated against a snapshot of ROCKS clusters • These clusters have been added to the “training” set • More clusters are added every month • GridG • Provide a generic framework in which such laws and other correlations can be encoded Grid Performance Workshop 2005
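For illustration, a minimal Python sketch of sampling one synthetic cluster from “log2-normal” distributions of this kind; the mu/sigma parameters are placeholders, not the values fitted in the Kee et al. study:

    import random

    def log2_normal(rng, mu, sigma):
        """Sample 2**X with X drawn from Normal(mu, sigma)."""
        return 2 ** rng.gauss(mu, sigma)

    rng = random.Random(42)
    cluster = {
        "nodes_per_cluster": max(1, round(log2_normal(rng, 5.0, 1.5))),
        "procs_per_node":    max(1, round(log2_normal(rng, 1.0, 0.5))),
        "memory_GB":         max(1, round(log2_normal(rng, 2.0, 1.0))),
    }
    print(cluster)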
Resource Availability / Workload • Probabilistic models • Naive: exponentially distributed availability and unavailability intervals • Sophisticated: Weibull distributions [Wolski et al.] • Traces • NWS, etc. • Desktop Grid resources [Kondo, SC’04] • Workload models • e.g., batch schedulers • Traces • Models [Feitelson, JPDC’03] • job inter-arrival times: Gamma • amount of work requested: Hyper-Gamma • number of processors requested: compound distribution (emphasizing powers of two and serial jobs) Grid Performance Workshop 2005
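A toy Python sketch in the spirit of such workload models, drawing Gamma-distributed inter-arrival times and work amounts; all distribution parameters below are placeholders rather than the published values:

    import random

    def synthetic_jobs(n, shape=2.0, scale=300.0, seed=1):
        """Generate n synthetic jobs: (arrival time, requested work, processors)."""
        rng = random.Random(seed)
        t, jobs = 0.0, []
        for _ in range(n):
            t += rng.gammavariate(shape, scale)      # inter-arrival time (seconds)
            work = rng.gammavariate(2.0, 1800.0)     # requested work (CPU-seconds)
            procs = 2 ** rng.randint(0, 6)           # powers of two dominate
            jobs.append((t, work, procs))
        return jobs

    for arrival, work, procs in synthetic_jobs(5):
        print(f"arrival={arrival:8.1f}s  work={work:8.1f}  procs={procs}")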
A Sample Synthetic Grid • Generate 5,000 routers with BRITE • Annotate latencies according to BRITE’s Euclidean distance method (scaling to obtain the desired network diameter) • Annotate bandwidths based on a set of end-to-end NWS measurements • Pick 30% of the end-points • Generate a cluster at each end-point according to Kee’s synthesizer for year 2006 • Model cluster load with Feitelson’s model, with a range of parameters for the random distributions • Model resource failures based on Inca measurements on the TeraGrid Grid Performance Workshop 2005
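A hypothetical end-to-end pipeline mirroring this recipe might look like the Python sketch below; every helper and value is an illustrative stand-in for the real tools (BRITE, NWS measurements, Kee's synthesizer, Feitelson's model, Inca traces):

    import random

    rng = random.Random(0)
    n_routers = 5000
    routers = list(range(n_routers))                         # BRITE would return a graph

    latency_ms = {r: rng.uniform(1.0, 50.0) for r in routers}           # placeholder annotation
    bandwidth_mbps = {r: rng.choice([10, 100, 1000]) for r in routers}  # "from measurements"

    endpoints = rng.sample(routers, int(0.30 * n_routers))   # pick 30% of the end-points
    sites = {e: {"cluster": {"nodes": 2 ** rng.randint(4, 9), "year": 2006},
                 "load_model": "feitelson",                  # stand-in label
                 "failure_trace": "inca-teragrid"}           # stand-in label
             for e in endpoints}
    print(len(sites), "synthetic sites generated")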
Synthetic Grid Generation • Still far from widely accepted standards • Many ongoing, promising efforts • Researchers have recognized this as an issue • Tools from networking can be reused • A few “Grid” tools are available • What is really needed • Repository of “Grid Measurements” • Repository of “Synthetic/Generated Grids” Grid Performance Workshop 2005
Presentation Outline • Introduction • Simulation for Grid Computing? • Generating Synthetic Grids • Simulating Applications on Synthetic Grids • Current Work and Future Directions Grid Performance Workshop 2005
Simulation Levels • Simulating applications on a synthetic platform? • Spectrum of simulation levels, from more abstract to less abstract: • Mathematical simulation: based solely on equations • Discrete-event simulation: abstraction of the system as a set of dependent actions and events (fine- or coarse-grain) • Emulation: trapping and virtualization of low-level application/system actions • Boundaries above are blurred (d.e. simulation ~ emulation) • A simulation can combine all paradigms at different levels Grid Performance Workshop 2005
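To make the middle of the spectrum concrete, here is a minimal generic discrete-event core in Python (an event queue processed in timestamp order); it is an illustration of the paradigm, not code from MicroGrid or SimGrid:

    import heapq

    class Simulator:
        """Tiny discrete-event core: events are (time, action) pairs processed in
        timestamp order; actions may schedule further events."""
        def __init__(self):
            self.now = 0.0
            self._queue = []
            self._counter = 0                      # tie-breaker for equal timestamps

        def schedule(self, delay, action):
            heapq.heappush(self._queue, (self.now + delay, self._counter, action))
            self._counter += 1

        def run(self):
            while self._queue:
                self.now, _, action = heapq.heappop(self._queue)
                action(self)

    def send_message(sim):
        print(f"[{sim.now:6.2f}s] message sent")
        sim.schedule(1.5, lambda s: print(f"[{s.now:6.2f}s] message received"))

    sim = Simulator()
    sim.schedule(0.0, send_message)
    sim.run()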
Simulation Options (from more abstract to less abstract) • Network • Macroscopic: flows in “pipes” • coarse-grain d.e. simulation + mathematical simulation • Microscopic: packet-level simulation • fine-grain d.e. simulation • Actual flows go through “some” network • emulation • CPU • Macroscopic: flows in a “pipe” • coarse-grain d.e. simulation + mathematical simulation • Microscopic: cycle-accurate simulation • fine-grain d.e. simulation • Virtualization via another CPU / virtual machine • emulation Grid Performance Workshop 2005
Simulation Options (from more abstract to less abstract) • Application • Macroscopic: application = analytical “flow” • Less macroscopic: sets of abstract “tasks” with resource needs and dependencies • coarse-grain d.e. simulation • Application specification or pseudo-code API • Virtualization • emulation of actual code with trapping of application-generated events • Two projects • MicroGrid (UCSD) • SimGrid (UCSD + IMAG + Univ. Nancy) Grid Performance Workshop 2005
MicroGrid • Set of simulation tools for evaluating middleware, applications, and network services for Grid systems • Applications are supported by emulation and virtualization • Actual application code is executed on “virtualized” resources • MicroGrid accounts for • CPU • network • (figure: the application runs on virtual resources that MicroGrid maps onto physical resources) Grid Performance Workshop 2005
MicroGrid Virtualization • Resource virtualization • “resource names” are virtualized • gethostname, sockets, Globus’ GIS, MDS, NWS, etc. • Time virtualization • Simulating the TeraGrid on a 4-node cluster • Simulating a 4-node cluster on the TeraGrid • CPU virtualization • Direct execution on (a fraction of) a physical resource • No application modification • Main challenge • Synchronization between real time and virtual time Grid Performance Workshop 2005
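As a toy illustration of the idea of mapping real elapsed time to virtual time (not of how MicroGrid actually implements it), the Python sketch below reports the host's measured run time scaled by an assumed speed ratio:

    import time

    SPEED_RATIO = 0.25          # assume the virtual resource is 4x faster than the host

    def virtualized(fn):
        """Report real elapsed time scaled into virtual time for the wrapped call."""
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            elapsed_real = time.perf_counter() - start
            print(f"real: {elapsed_real:.3f}s -> virtual: {elapsed_real * SPEED_RATIO:.3f}s")
            return result
        return wrapper

    @virtualized
    def busy_work(n):
        return sum(i * i for i in range(n))

    busy_work(1_000_000)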
MicroGrid Network • Packet-level simulation • Network calls are intercepted • Are sent to a network simulator that has been configured with the virtual network topology • Implemented with MaSSF • An MPI version of DaSSF [Liu et al., 2001] • Configured via the SSF standard (DML, etc.) • Protocol stacks implemented on top of MaSSF • TCP, UDP, BGP, etc. • Socket calls are trapped without application modification • real data is sent • delays are scaled to match the virtual time • Main challenges • scaling (20,000 routers on a 128-node cluster) • synchronization with computation Grid Performance Workshop 2005
MicroGrid in a Nutshell • Application: virtualization via trapping of application events (emulation) • Can have high overhead • But captures the overhead! • CPU: virtualization via another CPU (emulation) • Can be really slow • But hopefully accurate • Network: microscopic packet-level simulation (fine-grain discrete-event simulation) • Can be really slow for long transfers • But hopefully accurate Grid Performance Workshop 2005
SimGrid • Originally developed for scheduling research • Must be fast to allow for thousands of simulations • Application • No real application code is executed • Consists of tasks that have • dependencies • resource consumptions • Resources • No virtualization • A resource is defined by • a rate at which it does “work” • a fixed overhead that must be paid by each task • traces of the above if needed + failures • A task can use multiple resources • e.g., a data transfer over multiple links, or a computation that uses a disk and a CPU Grid Performance Workshop 2005
SimGrid • Uses a combination of mathematical simulation and coarse-grain discrete event simulation • Simple API to “specify” an application rather than having it already implemented • Fast simulation • Key issue: Resource sharing • In MicroGrid: resource sharing “emerges” out of the low-level emulation and simulation • Packets of different connections interleaved by routers • CPU cycles of different processes get slices of the CPU • Drawback: slow simulation • How can one do something faster that is still reasonable? • Come up with macroscopic models of resource sharing Grid Performance Workshop 2005
Resource Sharing in SimGrid • Macroscopic resource sharing can be easy • A CPU: CPU-bound processes/threads get a fair share of the CPU “in steady state” • Why go through the trouble of emulating CPU-bound processes? • Just say how many cycles they need, and “compute” how many cycles they get per second • Macroscopic resource sharing can be not so easy • The Network • Many end-points, routers, and links • Many end-to-end TCP flows? • How much bandwidth does each flow receive? Grid Performance Workshop 2005
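A small Python sketch of this macroscopic CPU model: tasks declare how many cycles they need, the CPU capacity is divided equally among the tasks still running, and finish times follow directly (demands and speed are illustrative):

    def fair_share_finish_times(demands_cycles, speed_cycles_per_s):
        """Simulate CPU-bound tasks sharing one CPU equally; when a task finishes,
        the survivors' share grows.  Returns task id -> finish time (seconds)."""
        remaining = dict(enumerate(demands_cycles))
        now, finish = 0.0, {}
        while remaining:
            share = speed_cycles_per_s / len(remaining)       # equal share per task
            tid, need = min(remaining.items(), key=lambda kv: kv[1])
            dt = need / share                                 # time to next completion
            now += dt
            for k in remaining:
                remaining[k] -= share * dt
            finish[tid] = now
            del remaining[tid]
        return finish

    print(fair_share_finish_times([2e9, 4e9, 6e9], 1e9))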
Bandwidth Sharing • (figure: bandwidth-sharing example with 10 Mb/s, 8 Mb/s, and 2 Mb/s links/flows) • Macroscopic TCP modeling is a field • “Fluid in pipe” analogy • Rule of thumb: the share a flow gets on its bottleneck link is inversely proportional to its round-trip time • It turns out that TCP in steady state implements a type of resource sharing called Max-Min Fairness Grid Performance Workshop 2005
Max-Min Fairness • Principle • Consider the set of all network links, L • c_l is the capacity of link l • Consider the set of all flows, R • a flow r = the subset of L it traverses • x_r is the bandwidth allocated to flow r • Bandwidth capacities are respected: ∀ l ∈ L, Σ_{r ∈ R : l ∈ r} x_r ≤ c_l • TCP in steady state is such that min_{r ∈ R} x_r is maximized • The above can be solved efficiently (with appropriate data structures) Grid Performance Workshop 2005
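One standard way to compute such an allocation is “progressive filling”: grow all flow rates together and freeze the flows crossing each link as it saturates. A simple Python sketch (links, capacities, and flows are illustrative, and this is not SimGrid's actual implementation):

    def max_min_fair(links, flows):
        """links: {link: capacity}; flows: {flow: set of links it crosses}.
        Progressive filling: repeatedly find the bottleneck link (smallest fair
        share), freeze its flows at that share, and remove the link."""
        alloc, cap = {}, dict(links)
        active = {f: set(ls) for f, ls in flows.items()}
        while active:
            share = {l: cap[l] / sum(1 for ls in active.values() if l in ls)
                     for l in cap if any(l in ls for ls in active.values())}
            if not share:
                break
            bottleneck = min(share, key=share.get)
            rate = share[bottleneck]
            for f in [f for f, ls in active.items() if bottleneck in ls]:
                alloc[f] = rate                       # freeze this flow at the fair share
                for l in active[f]:
                    cap[l] -= rate                    # its bandwidth is now committed
                del active[f]
        return alloc

    links = {"L1": 10.0, "L2": 8.0}                   # capacities in Mb/s
    flows = {"f1": {"L1"}, "f2": {"L1", "L2"}, "f3": {"L2"}}
    print(max_min_fair(links, flows))                 # {'f2': 4.0, 'f3': 4.0, 'f1': 6.0}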