## The Load Distance Balancing Problem


Eddie Bortnikov (Yahoo!), Samir Khuller (Maryland), Yishay Mansour (Google), Seffi Naor (Technion)

**The Load-Distance Balancing Problem**
• Given n clients and k servers s1, s2, …, sk, we need to assign each client to a server.
• The cost for client i assigned to server sj is Cost(i) = Distance(i,sj) + Delay(j,Lj), where Delay(j,Lj) is a function of the number of clients Lj assigned to sj.
• Objective: minimize the maximum of Cost(i) over all clients i.

**An Example**
[Figure: servers s1, s2, s3 and clients A, B, C, D, E]
• Each server has its own delay function (it can be arbitrary; we only assume it is non-decreasing).
• Note that C is closer to s1 but prefers to attach to s2, since Dist(C,s2) + Delay(s2,2) < Dist(C,s1) + Delay(s1,3).
• The objective is to minimize the maximum cost over all clients.

**Related Work**
• There is a lot of research on locating facilities. Here the facilities are all given; we only have to compute an assignment of clients to facilities.
• The notion of capacities has been used for various covering problems such as Vertex Cover, K Centers, Facility Location, etc.

**Main Results**
• The problem is NP-hard.
• We develop a polynomial-time 2-approximation.
• We show that the bound 2 cannot be improved to 2-ε unless P=NP.
• When the distance function satisfies the triangle inequality, the hardness bound reduces to 5/3-ε.
• When all clients and servers are on a line, the problem is solvable in polynomial time.
• For Min Sum Cost(i), we can solve the problem in polynomial time using min-weight matching.

**NP-hardness by Exact Set Cover**
• Given N elements and a collection S of K sets (each set has size m), does there exist a subset S' of S such that each element belongs to exactly one set in S'? In other words, we need to pick exactly N/m sets from S that cover each element once.

**Example of Exact Cover**
• Here we have N=16 elements and 9 sets (m=4).
• The four blue sets form an exact cover, and we discard the five orange sets.

**Reduction from Exact Cover**
• Each element is a client. In addition, we create a collection of M(K-N/m) dummy clients.
• Subset Sj in S corresponds to server sj.
• Dist(dummy,server) = d1
• Dist(i,sj) = d2 if i ∈ Sj, otherwise ∞
• d2 >> d1
[Figure: server sj at distance d1 from the dummy clients and at distance d2 from the clients (elements) i in Sj]

**Reduction from Exact Cover (cont.)**
• The delay function of each server is a step function.
• Delay(j,Lj) = Δ-d2 when the load is at most m.
• Delay(j,Lj) = Δ-d1 when the load exceeds m but is at most M.
[Figure: step delay function with value Δ-d2 up to load m and value Δ-d1 up to load M]

**Reduction from Exact Cover (cont.)**
• Suppose there is a solution to Exact Cover; then there is a solution to the Load-Distance Balancing (LDB) problem with cost at most Δ.
• For each chosen subset Sj, the corresponding server sj gets m clients, each at distance d2. Since the delay is Δ-d2, the total cost is at most Δ.
• The dummies are all assigned to the remaining K-N/m servers; each gets M dummies, for a cost of d1 + (Δ-d1) = Δ.
• The proof in the other direction requires some work!

**Proof (cont.)**
• If the total cost does not exceed Δ, each server can support at most M dummy clients, and no more than m real clients.
• Suppose a server supports both real and dummy clients; then the total number of servers with real clients is K' > N/m.
• These serve at most mK'-N dummy clients, while the rest can serve only M(K-K') dummy clients.
• Adding the two shows that we can assign fewer than M(K-N/m) dummy clients if M>m: mK'-N + M(K-K') = MK - N - K'(M-m) < MK - N - (N/m)(M-m) = M(K-N/m).

**Hardness Results follow…**
• We can set d1=ε and d2=Δ-ε. A solution to Exact Cover exists if and only if a solution with cost Δ exists for LDB.
• If there is no solution to Exact Cover, then every solution to LDB has cost at least 2(Δ-ε).
• However, this construction violates the triangle inequality in the distance function.

**Hardness results with triangle inequality**
• We need to set d1=Δ/3 and d2=Δ-ε.
• With these parameters, the distance between a (real) client i and a server sj such that i is not in Sj is at least (5/3)Δ-ε.
• A solution to Exact Cover exists if and only if a solution with cost Δ exists for LDB.
• If there is no solution to Exact Cover, then every solution to LDB has cost at least (5/3)Δ-ε.

**Approximation Algorithm with factor 2**
• Suppose a solution exists with maximum cost Δ.
• For each server sj, we can compute an upper bound L*j on the number of clients that can be served with a delay of at most Δ.
• For each client i, we can compute the subset S*i of servers that are within distance Δ.
• Now it is just a flow problem to check whether an assignment exists in which each client i is assigned to a server from S*i and each sj has load at most L*j.
• Minimizing Δ gives a simple 2-approximation: in any such assignment, both the distance and the delay of every client are at most Δ, so its cost is at most 2Δ.

**All servers and clients on a line**
• Use dynamic programming!

**Minimizing the Sum of Costs**
• We reduce this to min-cost matching in a bipartite graph.
• Let G=(X,Y,E), where the nodes in X correspond to the n clients and there are nk nodes in Y: n nodes corresponding to each server.
• We ask for a min-cost matching to find a solution.

**Capacitated K Centers**
• The related problem of choosing K facilities has been considered (each client should be assigned to a nearby facility, and the load on the facility should not be too high): [Khuller & Sussman] (K,L,5R) or ((2/c)K, cL, 2R), related to K-centers clustering.

**Conclusions**
• Can we improve the 2-approximation when the triangle inequality holds?
• Can we improve the 2-approximation when the delay functions satisfy specific properties? What is a natural delay function?
• Are there other special cases that can be solved in polynomial time?
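
**Sketch: the factor-2 algorithm in code**
The steps on the "Approximation Algorithm with factor 2" slide can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the talk: the function names (`max_cost`, `feasible`, `two_approx`), the input layout (`dist[i][j]` as a matrix, `delays[j]` as a callable mapping a load to a delay), and the use of a simple augmenting-path matching with server capacities in place of a general max-flow routine. The candidate values of Δ are the distances and delay values themselves, since feasibility can only change at those thresholds.

```python
def max_cost(dist, delays, assignment):
    """Objective from the slides: max over clients of distance + delay."""
    loads = [assignment.count(j) for j in range(len(delays))]
    return max(dist[i][j] + delays[j](loads[j]) for i, j in enumerate(assignment))


def feasible(dist, delays, delta):
    """Return a client -> server list for threshold delta, or None."""
    n, k = len(dist), len(delays)
    # L*_j: largest load whose delay is at most delta (delays are non-decreasing).
    cap = []
    for j in range(k):
        load = 0
        while load < n and delays[j](load + 1) <= delta:
            load += 1
        cap.append(load)
    # S*_i: servers within distance delta of client i.
    allowed = [[j for j in range(k) if dist[i][j] <= delta] for i in range(n)]
    assigned = [[] for _ in range(k)]  # clients currently placed on each server

    def place(i, seen):
        # Augmenting-path search (an assumed stand-in for the flow computation).
        for j in allowed[i]:
            if seen[j]:
                continue
            seen[j] = True
            if len(assigned[j]) < cap[j]:
                assigned[j].append(i)
                return True
            # Server j is full: try to move one of its clients elsewhere.
            for idx, other in enumerate(assigned[j]):
                if place(other, seen):
                    assigned[j][idx] = i
                    return True
        return False

    for i in range(n):
        if not place(i, [False] * k):
            return None
    result = [None] * n
    for j, clients in enumerate(assigned):
        for i in clients:
            result[i] = j
    return result


def two_approx(dist, delays):
    """Binary-search the smallest feasible delta; cost is then at most 2 * delta."""
    n, k = len(dist), len(delays)
    candidates = sorted(
        {d for row in dist for d in row}
        | {delays[j](load) for j in range(k) for load in range(1, n + 1)}
    )
    best, lo, hi = None, 0, len(candidates) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        assignment = feasible(dist, delays, candidates[mid])
        if assignment is not None:
            best = (candidates[mid], assignment)
            hi = mid - 1
        else:
            lo = mid + 1
    return best


# Hypothetical instance: 3 clients, 2 servers, delay equal to the load.
example_dist = [[1, 4], [1, 4], [1, 2]]
example_delays = [lambda load: load, lambda load: load]
example_delta, example_assignment = two_approx(example_dist, example_delays)
```

On this small made-up instance the search settles on Δ=2, and the resulting assignment has maximum cost 3, which is within the 2Δ bound. Binary search is valid because feasibility is monotone in Δ: raising Δ can only enlarge each S*i and each L*j.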