Distributed Queuing and Beyond

Distributed Queuing and Beyond Srikanta Tirthapura Brown University

App 1: Distributed Multiplayer Game Buy horses Build castle

Easy Implementation: Centralized Build castle Scalability Problems?

Alternate Peer-to-Peer Implementation Build castle Multicast moves to all players

Concurrent moves ordered consistently B: Attack treasury A: Move jewels A Move jewels B A: Move jewels Attack treasury B: Attack treasury Needs Ordered Multicast!

App 2: Synchronizing Access to Mobile Object Object resides at one node at a time

Concurrent Requests for the object Need the file Me too! Me too! Me too!

Requests Queued

Object Passed Down the Queue

Distributed Queuing • Distributed Logical Queue in Network • Processors issue requests to join the queue • Request joins the Queue by locating and informing the current tail about itself

Join Queue = Inform Tail New tail

Minimal Knowledge about Queue • Each element of queue only knows its successor (if there is one) • Sometimes need to know the predecessor • In general, immediate neighbors • Can still do useful things with this much knowledge!

Application to Mobile Objects • Accesses (maybe concurrent) to the object are queued • Object passes from request to successor

Application to Ordered Multicast • Old solutions use counting (token numbers) to order messages • We propose using Queuing, which might be faster (Tirthapura, Herlihy and Wattenhofer, 2001)

Traditional solution to Ordered Multicast: Counting Counter Count = 2 Count = 1 Count = 3 Counting can be done in a distributed manner

Ordering Using Counting Count = 1 Count = 3 Count = 2

Our Solution: Ordered Multicast using Queuing Single Distributed Queue of Messages

Ordering using Queuing Behind none Behind green Behind red

Queuing vs Counting • Both can be done in a distributed fashion • Distributed Queuing might be faster than Distributed Counting • Evidence: Most existing counting techniques can be “short-circuited” to queue faster than they can count - Combining Trees, Combining Funnels

Fault Tolerance • Very important issue • No single solution fits all • Tradeoff: Scalability versus Strength of Fault Tolerance Guarantees • Self-stabilization(Tirthapura and Herlihy, DISC 2001)

This Talk • Examine Arrow Protocol, a popular solution for distributed queuing • Performs well in practice • No previous systematic analysis • We give the first theoretical explanation of the concurrent performance

Roadmap • Distributed Queuing Protocols • Our contribution: Theoretical Analysis of Arrow Protocol • Our contribution: Further Applications of Queuing • Related Work, Conclusions and Future Work

Centralized Queuing tail

Problems • Scalability: Central Processor is a bottleneck (“hot-spot”) • No locality: Repeated requests from the same node have to travel all the way to the central node and back

Arrow Protocol • Invented by Kerry Raymond (1989), applied to Distributed Directories by Herlihy and Demmer (1998) • Path Reversal on a Spanning Tree of the Network

Model: Undirected Graph

Initialization: Spanning Tree

Initialization: Arrows tail

New Request tail

Path Reversal tail

Experiments • Herlihy and Warres compared Home-based (centralized) and Arrow Protocols for object management [JavaGrande 99] • Arrow protocol outperformed home-based even for small system sizes • We did experiments on a larger scale (80 processor SP-2 machine, MPI)

Arrow vs Centralized: 100,000 enqueues

Can we explain this? • Is there a fundamental reason why it performs well under concurrency? • We provide one such explanation..

Previous Work: Sequential Case • Demmer and Herlihy (1998) and Peleg and Reshef (1998) analyzed the Sequential Case • Requests spaced very far apart in time, next request issued after previous one queued

Our Analysis: Concurrent Case • How does the protocol perform when many nodes attempt to join the queue at the same time? • We analyze the combined cost of all the requests (Tirthapura, Herlihy and Wattenhofer, ACM PODC 2001)

Model • Graph with Nodes = processors, Edges = Communication Links • Edge weight = Delay of Communication Link • Latency = time till the request is queued (time till it informs predecessor)

One-shot Concurrent Problem • All requests occur at same time • An arbitrary subset of nodes R (of size r) request to join the queue at time zero • No more requests added later

Cost Metric • l(v) is latency of node v’s request • Cost of protocol is sum of latencies

Example 3 concurrent requests

bad good

Queuing Order is Important • Good queuing orders produce short paths- Latency to join the queue is small - Distance to successor is small • r requests, r! possible queuing orders • How good an order does Arrow produce? • Remember that Arrow is distributed and has to order requests on the fly!

Competitive Analysis • Compare an on-line distributed algorithm to an optimal “off-line” algorithm • Off-line algorithm has global information and gets synchronization for free(No on-line algorithm can do better) • Competitive ratio is worst case ratio between cost of on-line and off-line

Our Theorem The competitive ratio of Arrow for one-shot case is upper bounded by (log r) s- r is number of concurrent requests - s is the stretch of the (pre-selected) spanning tree T w.r.t. G Stretch of a spanning tree is the overhead of routing on tree versus routing on graph.

Roadmap of proof Two Parts: • Upper bound on Cost of Arrow: Nearest Neighbor characterization of order of queuing • Lower bound on Cost of Off-line algorithm

Arrow: Concurrent Requests

Distributed Queuing and Beyond

Distributed Queuing and Beyond

Presentation Transcript

Queuing

Queuing Theory

Queuing Theory

Queuing Theory

Queuing Analysis

Queuing Theory

QueueTraffic and queuing theory

Distributed Computing Beyond The Grid

Queuing Theory

Queuing Theory

Queuing Theory

Queuing Theory

Queuing and Queue Management

Queuing

Queuing Theory

Queuing Theory

WLAN Throughput Improvement via Distributed Queuing MAC

Queuing

Queuing Systems

S6C10 - Queuing

Queuing Theory

Queuing Analysis