Contract-Based Load Management in Federated Systems - Balazinska et al. (2004)

Contract-Based Load Management in Federated Distributed Systems M. Balazinska, H. Balakrishnan, and M. Stonebraker April 7, 2004 Presented by Beomjoo Seo

Contents • Motivation • Related Work • Problem Description • Cost Model • Definition of Acceptable Allocation • Bounded Price Mechanism • Fixed Price Contracts • Price Range Contracts • Evaluation • acceptable allocation • Convergence speed • Stability • Prototype Experiments • Conclusion

Motivation • Many end-to-end services are being deployed over infrastructure that spans multiple control domains. • E-commerce, web hosting, application hosting • Federated operation helps absorb load spikes by sharing resources for common benefit. • But every organization is selfish and seeks to maximize its own profit. • Computational Economy • Participants provide resources and perform computing for each other in exchange for payment. • Bilateral agreements (Service Level Agreements, SLAs) between ISPs are popular. • An SLA provides different QoS and prices to different partners. • Bounded Price Mechanism • Private pairwise contracts negotiated offline between participants. • Privacy, service customization, price discrimination • Lightweight load management thanks to pre-negotiation; good (not optimal) system-wide load balance • Transfer load if processing it locally costs more than having a partner process it remotely.

Applicability to federated systems • Distributed stream processing applications • Data sources are distributed and managed by different organizations. • Large volumes of data • Well-defined operators • Examples: • Financial services (price feeds) • Medical applications (sensors attached to patients) • Infrastructure monitoring (computer networks, car traffic) • Military applications (target detection) • Medusa Project

Related Work 1 • Optimal/near-optimal allocation using gradient descent • Nodes exchange load among themselves, producing successively less costly allocations. • Contract-based: exchanges are directed by self-interest but do not produce an optimal allocation. • Participant selfishness becomes more important! • Mechanism Design (MD) • Agents report their costs to a central entity that computes the optimal allocation and compensating payments. • Algorithmic MD (AMD) • Additionally considers the computational complexity of MD implementations. • Distributed AMD (DAMD), such as BGP-based routing and cost sharing of multicast trees • Assumes participants correctly execute payment computations. • Contract-based: bilateral contracts

Related Work 2 • Computational Economies • Pricing: resource consumers have different price-to-performance preferences and are allocated a budget. • e.g., auctions, bids, price adjustment • But participants need to hold and participate in auctions for every load movement, inducing a large overhead. • Dynamic load changes lead to price changes and frequent reallocation. • Bartering • Peers securely exchange tickets that provide access to resources. • Contract-based: no specification of resource amounts • Service Level Agreements (SLAs) are widely used for web services and e-commerce • The contract model is a variant of an SLA • Free Collaboration

Problem Description 1 • The system consists of a set S of autonomous participants (nodes) and a set K of heterogeneous tasks. (S, K) • taskset_i: the set of tasks running at node i • Cost Model • The total cost function D_i (real-valued) of each participant i depends on the load imposed by taskset_i. • It is assumed to be an increasing and convex function. • E.g., the increased difficulty of offering low delay at higher load. • Objectives • Maximize each participant's utility. • No participant is overloaded. • If the whole system is overloaded, use as much of the available capacity as possible.

Cost Function for a single resource (processing)

Problem Description 2 • Definitions • Acceptable Allocation: it should satisfy • (1) No participant is above its capacity threshold. • (2) All participants are at or above their capacity thresholds if the total offered load exceeds the sum of the capacity thresholds. • Marginal Cost (MC) for node i: MC(u, taskset_i) • Incremental cost for node i of running task u given its current taskset_i • Contract C(i, j) • A price range [min, max] that constrains the runtime price paid by participant i for each unit of load given to j. • That is, the price per unit of load that i pays when migrating load to j. • It assumes that node j does not maliciously misreport the load consumed. • Algorithm • Migrate a task when its marginal cost per unit of load is higher than the price in a contract.
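
The definitions above can be sketched in a few lines of Python. This is only an illustration: the quadratic cost function and all function names are assumptions made for the example, not taken from the paper; any increasing, convex cost would serve.

```python
def total_cost(load):
    # Assumed cost function D_i: increasing and convex in the load.
    return load ** 2


def marginal_cost(task_load, taskset_loads):
    # MC(u, taskset): extra cost of running task u on top of the current taskset.
    current = sum(taskset_loads)
    return total_cost(current + task_load) - total_cost(current)


def should_migrate(task_load, other_loads, unit_price):
    # Migrate when the marginal cost per unit of load exceeds the contract price.
    return marginal_cost(task_load, other_loads) / task_load > unit_price


def is_acceptable(loads, thresholds):
    # loads[i] and thresholds[i] describe participant i.
    if sum(loads) > sum(thresholds):
        # Overloaded system: every participant is at or above its threshold.
        return all(l >= t for l, t in zip(loads, thresholds))
    # Otherwise: no participant is above its threshold.
    return all(l <= t for l, t in zip(loads, thresholds))
```

For example, with two tasks of load 1 already running, a third task of load 1 has a marginal cost of 9 - 4 = 5, so it would be migrated under a contract priced at 4 per unit of load but not under one priced at 6.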

Fixed-Price Contracts • Contract: FixedPrice • Migration Requestor • An overloaded participant selects a maximal set of tasks from its taskset_i that cost more to process locally than they would cost if processed by one of its partners, and offers them to that partner. • It exercises the lower-priced contracts first, in the hope of paying less and moving more tasks. • Migration Acceptor • Examines the higher unit-price offers first. • How to set the fixed price during contract negotiation • A node determines its maximum desired load level X and the corresponding MC per unit of load. • This MC is also the maximum unit price that the participant should accept for a contract. • Avoid too low a price, which leads to underutilization. • Needs a good estimate of the expected load level. • Needs to constrain runtime task selection because of privacy issues.
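
A minimal sketch of the requestor side under fixed-price contracts, assuming identical unit tasks and a hypothetical quadratic cost D(n) = n^2. The acceptor here simply takes a task while running one more task costs it less than the contract price; the rule that acceptors examine higher unit-price offers first is not modeled. All names are illustrative.

```python
def unit_mc(n_tasks, cost):
    # Marginal cost of the n-th identical unit task.
    return cost(n_tasks) - cost(n_tasks - 1)


def rebalance(local, partners, prices, cost=lambda n: n * n):
    """Shed unit tasks from an overloaded requestor.

    local    -- number of unit tasks at the requestor
    partners -- {partner name: number of unit tasks it already runs}
    prices   -- {partner name: fixed unit price of the contract}
    """
    partners = dict(partners)
    # Exercise the lower-priced contracts first.
    for name in sorted(prices, key=prices.get):
        price = prices[name]
        # Offer a task while running it locally costs more than the price,
        # and the partner accepts while one more task costs it less.
        while (local > 0
               and unit_mc(local, cost) > price
               and unit_mc(partners[name] + 1, cost) < price):
            local -= 1
            partners[name] += 1
    return local, partners
```

With 10 local tasks, partner B (2 tasks, price 8) and partner C (0 tasks, price 10), this sketch moves two tasks to B and three to C, leaving 5 tasks at the requestor.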

Fixed-Price Contract Algorithms

Load Movement of Fixed-Price Contracts • [Figure: nodes A, B, and C connected by two contracts; can load propagate to C? Result: not an acceptable allocation.]

Price Range Contracts • Fixed-price contracts cannot propagate load through a chain of identical contracts. • If all contracts are identical, a task can only propagate one hop away from its origin. • Thus, they cannot always produce acceptable allocations. • Contract: [FixedPrice - ∆; FixedPrice] • It allows a node to forward load from an overloaded partner to a more lightly loaded one.
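
The one-hop limitation can be illustrated with a toy simulation of a three-node chain. This is a sketch under loud assumptions that are not from the paper: identical unit tasks, a quadratic cost D(n) = n^2, and the rule that a task moves whenever some price inside the contract range lies strictly between the receiver's and the sender's marginal costs. With FixedPrice = 10, each node's threshold is 5 tasks (the fifth task costs 9, the sixth costs 11).

```python
def mc(n):
    # Marginal cost of the n-th unit task under the assumed cost D(n) = n**2.
    return n * n - (n - 1) * (n - 1)


def can_move(sender, receiver, low, high):
    # True if some price x in [low, high] satisfies
    # mc(receiver + 1) < x < mc(sender): both sides gain from the move.
    send_mc, recv_mc = mc(sender), mc(receiver + 1)
    return sender > 0 and recv_mc < send_mc and recv_mc < high and low < send_mc


def settle(loads, low, high):
    # Repeatedly move unit tasks between neighbors in a chain of contracts
    # until no further move is possible.
    loads = list(loads)
    moved = True
    while moved:
        moved = False
        for i in range(len(loads) - 1):
            for a, b in ((i, i + 1), (i + 1, i)):
                if can_move(loads[a], loads[b], low, high):
                    loads[a] -= 1
                    loads[b] += 1
                    moved = True
    return loads
```

Starting from loads [9, 5, 0], the fixed-price case settle(loads, 10, 10) leaves the first node overloaded, because the middle node is exactly at its threshold and has no incentive to forward. Widening the range to [8, 10] lets the middle node forward load down the chain, and in this toy setting every node ends at or below 5 tasks.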

Load Movements of Price Range Contracts • [Figure labels: Fixed Price, Another Contract, Range]

Derivation of minimal contract price-range • We assume a network of homogeneous nodes with identical contracts. • For a FixedPrice contract, define tasksetF as the maximum set of tasks that a node can handle before its per-unit-load marginal cost exceeds FixedPrice and triggers a load movement. • For [FixedPrice - ; FixedPrice], any task can travel two hops. • Similarly, for [FixedPrice - ; FixedPrice], any task can travel M hops. • A larger price range speeds up load movement but increases runtime overhead, price volatility, and the number of reallocations. • Thus, we need to keep the range as small as possible and extend it only enough to ensure convergence to acceptable allocations. • The minimal price range should be sized for the diameter M of the network of contracts.

Lemma on Minimal Price Range • In a network of homogeneous nodes, tasks, and contracts, to ensure convergence to acceptable allocations in underloaded systems, the unit price range in contracts must be at least: [FixedPrice - ; FixedPrice], where M is the diameter of the network of contracts and tasksetF is the maximum set of tasks that a node can handle before its per-unit-load marginal cost exceeds FixedPrice.

Properties • Theorem 2 • If nodes, contracts, and tasks are homogeneous, and contracts are set according to the Minimal Price Range, the final allocation is an acceptable allocation for underloaded systems and a nearly acceptable allocation for overloaded systems. • Complexity of Convergence Time • For N nodes with at most C contracts each, the fixed-price mechanism has a convergence time of O(1) in the best case and O(N+C) in the worst case if notifications are used. • Complexity of Communication Overhead • To converge, the fixed-price contracts mechanism imposes a best-case communication overhead of O(N) and a worst-case overhead of O(NC) if notifications are used.

System Implementation • Data Stream: a continuous sequence of attribute-value tuples • Operators: functions that transform input streams into output streams • Remote Definitions: migrate operators rather than processes. • When selecting operators to be moved, consider the data flow (not covered here). • [Figure labels: Source IP, Destination IP, Time, Protocols used]

Medusa Software Architecture • Aurora: Query Processor • Lookup: a client of a distributed catalog (information on streams, schemas and queries running in the system) • Brain: monitors local load condition periodically.

Evaluation • Simulation • CSIM • Use random topologies, varying the minimum number of contracts per node. • 995 nodes • Processing Time: 50 µs/tuple • Input Rate: 500 tuples/s (500 Kbps) • Output Link Capacity: 100 Mbps • Operator Utilization • CPU: 2.5% • Bandwidth: 4% • 4 Test Scenarios • Homogeneous: Fixed, Range • Heterogeneous: Random, Random Range • Limitations • Homogeneous tasks • No handling of multiple resources

Convergence to Acceptable Allocations

Convergence Speed

Stability under Changing Load • 50 nodes (min contracts: 3) • Examine a sudden load increase and decrease • Two experiment sets: 15% and 30% extra load • [Figure panels: Load increase, Load decrease, Operator movements]

Prototype Evaluation • Network intrusion detection query • Connection Traces: • MIT (1 hour): logs divided into 3 traces with 20x speed-up • Utah (1 day): logs divided into 4 traces with 8x speed-up • Use fixed-price contracts. • Load transfer delay: 75 ms • [Figure annotations: Utah load increase at 0; MIT load spike at 1; C0; C1]

Conclusion • Bounded-Price Mechanism • Each participant has pairwise contracts with partners. • Contracts specify a bounded range of unit prices for transferring load. • Suggestions • To shed as much excess load as possible, negotiate a relatively high fixed-price contract. • Minimize runtime overhead and negotiate lower prices later. • A node that tracks the current state of its partners should renegotiate with a small price range and make a small profit by forwarding load from its overloaded partners to its underloaded ones. • Applicable to Web services, computational grids, overlay-based computing platforms, and P2P? • Future Work • High availability • Customization of contracts

OverQoS: An Overlay based Architecture for Enhancing Internet QoS L. Subramanian, I. Stoica, H. Balakrishnan, and R.H. Katz April 7, 2004 Presented by Beomjoo Seo

Contents • Motivation • What is OverQoS ? (OverQoS architecture) • Notation • Controlled-Loss Virtual Link (CLVL) • Loss Recovery Mechanisms • FEC, ARQ, hybrid of FEC + ARQ • Sample Test Applications • Evaluation • Metrics: • Statistical loss/bandwidth guarantees, OverQoS cost, Stability/Fairness • Simulation • Wide-Area Network Experiments • Conclusion

Motivation • QoS-aware services (IntServ, DiffServ) have not been successfully deployed so far • because they require all the network elements between a source and a destination to be aware of QoS mechanisms. • because ISPs have no incentive to coordinate their deployment • So “Are there any meaningful QoS enhancements that can be provided in the Internet without requiring support from the IP routers?” • Address the problem at the overlay level rather than at the IP level. • Abstraction of the overlay link called Controlled-Loss Virtual Link (CLVL) • Features • Reducing or eliminating burst losses • Packet prioritization • Statistical bandwidth/loss guarantees

OverQoS Architecture • Overlay Network Components • Overlay Nodes : OverQoS routers • Overlay Edges (Virtual Link) • IP path between two overlay nodes • Controlled Loss Virtual Link (CLVL) between overlay nodes • Bundle: a stream of application data packets carried across the virtual link. • Topology • Overlay path is determined by RON. • End-to-end path is fixed • Challenges • Node placement and cross traffic • Fairness • Stability • Design Principles • Bundle loss control • Resource Management within a bundle

Notation • r: redundancy, between 0 (low) and 1 (high) • c = b(1 - r), where b needs a proper estimate • p: loss rate (measured over an encoding/decoding window) • Two of the parameters are marked as given factors.

Bundle Loss Model • Can we find a fixed q over a CLVL for a given period? • For a given q, what is the minimum amount of redundancy (r)? • The total traffic of a given CLVL consists of • The traffic of the bundle • The redundancy traffic required to achieve the target loss rate q • b(t): maximum traffic bound at time t • r(t): the fraction of redundancy traffic needed to achieve q • c(t): available bandwidth = b(t) * (1 - r(t)) • As long as the arrival rate of the bundle at the entry node does not exceed c(t), the packet loss rate across the virtual link will not exceed q, with high probability.
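
The relation c(t) = b(t) * (1 - r(t)) can be written as a small admission check. This is a sketch only; the function names and the example numbers are assumptions for illustration.

```python
def available_bandwidth(b, r):
    # c = b * (1 - r): bandwidth left for the bundle once a fraction r of
    # the maximum rate b is spent on redundancy traffic.
    if not 0.0 <= r < 1.0:
        raise ValueError("redundancy r must satisfy 0 <= r < 1")
    return b * (1.0 - r)


def within_guarantee(arrival_rate, b, r):
    # The statistical loss bound q is only expected to hold while the
    # bundle's arrival rate at the entry node stays at or below c.
    return arrival_rate <= available_bandwidth(b, r)
```

For instance, with b = 2000 Kbps and r = 0.25, the bundle may send up to 1500 Kbps before the loss guarantee is at risk.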

Controlled-Loss Virtual Link (CLVL) • Abstraction of an overlay link that aims to bound the bundle loss rate (q) to a small value. • Estimation of the maximum output rate, b • N-TCP pipe abstraction • N times the throughput of a single TCP connection on the virtual link. • Use MulTCP to emulate the behavior of N TCP connections. • TCP-like congestion control with • Advantage: responds quickly to congestion. • Disadvantage: may not provide smooth variations in the sending rate. • For smoother variations, use a smaller value of • Achieving the target loss rate, q • Design choice for packet recovery • FEC: quick loss recovery with high bandwidth overhead. • Retransmission (ARQ): high packet recovery time if the RTT is large. • Trade-off between recovery time and bandwidth overhead. • Combination of FEC and packet retransmission (ARQ)!
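
The N-TCP pipe can be illustrated with an AIMD window update in the style of MulTCP. The rule used here (grow by N segments per round trip, shrink by cwnd/(2N) on a loss) is the commonly cited description of MulTCP and is an assumption of this sketch, not a statement about the OverQoS implementation; the function name is hypothetical.

```python
def multcp_step(cwnd, n, loss):
    # One round trip of an AIMD window that emulates n TCP flows.
    if loss:
        # Only one of the n virtual flows halves its share of the window.
        return max(1.0, cwnd - cwnd / (2 * n))
    # Each of the n virtual flows adds one segment per round trip.
    return cwnd + n
```

With n = 1 this reduces to ordinary TCP behavior (add one segment per round trip, halve on loss); with n = 10, a window of 100 segments grows to 110 without loss and drops only to 95 after a loss.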

FEC+ARQ based CLVL construction • Pure ARQ • Retransmit any lost packet times. • Recovery time: RTT x • Pure FEC • Use Reed-Solomon codes • What is the minimum r for which q is achieved? • FEC offers loss protection within a window if the fraction of packets lost in the window, p, is less than the amount of redundancy added for that window. • Based on the histogram f(p), compute r for the next window at the entry node. • FEC + ARQ • FEC incurs a high overhead when p is bursty. • Restrict the number of retransmissions to at most one. • Expected bandwidth overhead: r1+G(r1)(1+r2) • Problem • Find r1 and r2 that minimize the expected bandwidth overhead subject to the target loss constraint: G(r1)G(r2) ≤ q • Practical Optimal Solution: r1=0 • Use FEC only to protect retransmitted packets • r1=0, r2=0?
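
The redundancy computation can be sketched as follows. The slide does not define G(r); this example assumes G(r) is the expected fraction of packets still lost after FEC with redundancy r, computed from the loss histogram f(p) by counting only windows where p is not below r. The histogram values, the 1% search grid, and all names are made up for illustration.

```python
def residual_loss(hist, r):
    # Assumed G(r): expected loss fraction left after FEC with redundancy r.
    # hist maps a per-window loss rate p to its observed frequency f(p).
    # FEC protects a window only if p < r, so windows with p >= r count.
    return sum(p * f for p, f in hist.items() if p >= r)


def min_fec_redundancy(hist, q, steps=100):
    # Pure FEC: smallest r on a grid of 1/steps with G(r) <= q.
    for k in range(steps + 1):
        r = k / steps
        if residual_loss(hist, r) <= q:
            return r
    return None


def fec_arq_overhead(hist, r1, r2):
    # Expected bandwidth overhead from the slide: r1 + G(r1) * (1 + r2).
    return r1 + residual_loss(hist, r1) * (1 + r2)


def meets_target(hist, r1, r2, q):
    # Target loss constraint from the slide: G(r1) * G(r2) <= q.
    return residual_loss(hist, r1) * residual_loss(hist, r2) <= q
```

For a hypothetical bursty histogram {0.0: 0.90, 0.02: 0.08, 0.10: 0.02} and q = 0.001, pure FEC needs r = 0.11 on this grid, while r1 = 0 and r2 = 0 already meet the constraint with an overhead equal to the average loss rate of 0.0036. Under these assumptions, that matches the observation that FEC is expensive when losses are bursty.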

Pure ARQ vs. Pure FEC vs. FEC + ARQ

Implementation • 5000 Lines of C • Communication: UDP socket • End-to-end service provision: propagation of bmax along CLVLs.

Two Sample Applications • Streaming • Streaming Audio • MPEG Streaming • Can smoothing losses and preferential packet recovery improve stream playback quality? Yes. • Multiplayer Game (Counterstrike) • Smoothing works well only during short bursty loss periods. • The CLVL fails to achieve the given q during congestion periods with very high loss rates, but the application still progresses. • [Figure labels: Perceptual Evaluation of Speech Quality; p = 2%; p = 3%; Heavy congestion; Peak SNR; p = 10%]

Evaluation • Evaluation Metrics • Loss guarantee? Bandwidth guarantee? OverQoS cost? • Fairness/Stability? • Methodology • Wide-Area Evaluation Testbeds: RON, PlanetLab • Construct 19 nodes: 6 E, 1 K, 1 Ca, 3 company, 8 behind-net • Exclude Internet2-connected nodes • Simulation • Use ns-2 v2.1b8 • Topology: a single congested link of 10 Mbps • Three different types of traffic loss patterns • Long-lived TCP • Self-similar traffic • Web traffic

Statistical Loss Guarantees • q = 0.001 where N=10 • Simulation • Satisfies q with 0.5% to 3.3% losses • Experiment • 83 of the 171 virtual links are lossy • at least 0.5% loss • Duration: 20 minutes to 1 hour • Sending rate: between 120 Kbps and 2 Mbps • FEC+ARQ: 80 achieved q, 3 missed • 2 failed because all packets transmitted along a virtual link were lost, • due to routing changes or link resets. • 1 failed because of a bimodal loss distribution • The probability that a window experiences zero loss is very high (very bursty losses).

Statistical Bandwidth Guarantees (Experiment) • Monitor 83 links over 7 working days • cmin is stable compared to cavg • [Figure annotations: u = 0.005; 0.35; median 0.4]

OverQoS Cost (Experiment) • Overhead • Run an N-TCP pipe with N=10 for each link, and q=0.1% • The overhead of FEC+ARQ (r1=0) is very close to the average loss rate • Delay • Sources of delay • The recovery process causes additional delay. • If packet ordering is required, a packet loss delays other packets. • End-to-end packet recovery is better than hop-by-hop recovery. • The additional delay from adding new nodes along a path is limited. • The additional delay also depends on the loss rate.

Fairness / Stability (Experiment) • OverQoS bundles compete • No cross-traffic • Cross-traffic consisting of five long-lived TCP flows (wget operations) • Observations • Bundles coexist with each other and with the background traffic. • The ratio of throughputs of the bundles is preserved across both scenarios.

Conclusions • Three QoS enhancements are demonstrated • Smoothing losses • Prioritization of packets within an aggregate • Statistical loss and bandwidth guarantees • Bandwidth overhead is minimal. • Future Work • To combine admission control and path selection. • Compute “best” path that satisfies a flow’s requirements at the admission time. • To determine the “optimal” placement of the OverQoS nodes in the network.

Related Work • Overlay-based Techniques • Edge-to-edge congestion control • Supports a limited range of bandwidth services using an overlay framework. • It requires modifications at all edge routers in a domain. • Service Overlay Network • Purchases bandwidth with certain QoS guarantees from network domains using SLAs and stitches them together to provide e2e QoS guarantees. • Relies on the underlying domains to meet specific QoS requirements
