
K-Server on Hierarchical Binary Trees






Presentation Transcript


  1. K-Server on Hierarchical Binary Trees Prof. Adam Meyerson, UCLA Joint work with Aaron Coté (UCLA) and Laura Poplawski (Northeastern)

  2. Outline of this Talk • K-Server Definitions and History • Examples, applications, competitive ratio, known results • Quasiconvexity: Solving K-Server Offline • Offline solution, relation to flow, useful theorems • A Hierarchical Problem • Defining binary hierarchical trees, divide and conquer • The Local Sub-problem • A special metrical task system, randomized greedy • Future work: Towards a General Randomized Algorithm • Non-binary hierarchical trees extend to finite metrics • Can we make this approach work for non-binary trees? • Infinite metrics and “online embeddings” • K-Server with random requests

  3. K-Server Example We are given k initial locations for “servers” in a metric space. Request locations arrive one at a time. As each request arrives, we must move some server to that location. The goal is to minimize the total distance traveled by servers!

  4. Why not Greedy? [Figure: a small instance comparing the Optimum and Greedy solutions and their costs; greedy repeatedly moves the nearest server and ends up paying far more.]
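The exact distances in the slide's figure are not recoverable from the transcript, but the failure mode is easy to reproduce. Here is a minimal sketch on a line metric (the `greedy_cost` helper and the specific numbers are illustrative assumptions, not the talk's example): two servers start at 0 and 10, and requests alternate between 4 and 6; greedy shuttles one server back and forth, while an offline solution pays a one-time cost of 8.

```python
def greedy_cost(servers, requests):
    """Total distance paid by the greedy strategy on the real line:
    always move the server closest to the current request."""
    servers = list(servers)
    total = 0
    for r in requests:
        # pick the closest server (ties broken by index)
        i = min(range(len(servers)), key=lambda j: abs(servers[j] - r))
        total += abs(servers[i] - r)
        servers[i] = r
    return total

# Servers start at 0 and 10; 20 requests alternate between 4 and 6.
# Greedy drags one server back and forth, paying 2 per alternation,
# while an offline solution moves 0 -> 4 and 10 -> 6 once for cost 8.
cost = greedy_cost([0, 10], [4, 6] * 10)   # 4 + 19 * 2 = 42
```

Lengthening the request sequence makes greedy's cost grow without bound while the offline cost stays 8, so greedy's competitive ratio is unbounded.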

  5. Measuring “Success” -- Competitive Ratio • Obviously we will not produce an optimum solution without knowledge of the future requests. • We need a way to measure the effectiveness of an algorithm. • The competitive ratio is the worst case (over all instances) ratio of the algorithm’s cost (total distance moved by servers) to the optimum. Ideally this would be small (close to 1). • CR = maxInstances I [cost of algorithm on I]/[cost of optimum on I]

  6. Applications of K-Server • Emergency Response • The servers are emergency vehicles (e.g. police cars). • Requests correspond to emergency locations/calls. • Goal is to minimize average (mean) response time. • Obviously we do not know where future emergencies will be! • Caching and Paging • The servers represent pages in memory/cache. • Distances represent difficulty of replacement. • Often this is modeled with a uniform metric. • But it makes sense for costs to vary if the pages are, e.g., images. • Reconfigurable Devices • Locations represent configurations. • Servers represent the various devices. • Distance measures difficulty of reconfiguration. • Requests are tasks that need to be performed.

  7. History and Known Results • Introduced by [Manasse-McGeoch-Sleator 1990] • n-1 competitive for n points, n-1 servers • Lower bound of k for deterministic algorithms • Competitive algorithm for any metric space [Fiat-Rabani-Ravid 1990] • 2k-1 competitive (Work Function) [Koutsoupias-Papadimitriou 1995] • More competitive deterministic algorithms for special metrics: • [Chrobak-Karloff-Payne-Vishwanathan 1990] • [Chrobak-Larmore 1991] • [Bartal-Koutsoupias 2000] • Randomized lower bound of log k, but no general o(k) upper bounds. • O(log k) uniform metric [Fiat-Karp-Luby-McGeoch-Sleator-Young 1991] • o(n) for equally spaced points on a line [Csaba-Lodha 2001] • poly-log k for widely-separated subspaces [Seiden 2001] • poly-log k if n=k+O(1) [Bartal-Blum-Burch-Tomkins 1997] • O(log k) weighted star [Bansal-Buchbinder-Naor 2007]

  8. Expected Competitive Ratio and Randomized Algorithms • How do we measure the competitive ratio for a randomized algorithm? • There are basically two ways, one is the Adaptive Adversary, where the request sequence is sensitive to the random choices of the algorithm. • However, the more usual one is the Oblivious Adversary, where the request sequence is independent of the randomization. Here the goal is to minimize: • ECR = maxInstances I E[cost of algorithm on I]/[cost of optimum on I] • For a wide range of problems, randomized algorithms have been shown to obtain expected competitive ratios against oblivious adversaries which are provably better than the best possible deterministic guarantee.

  9. The K-Server Conjectures • K-server conjecture: There exists a deterministic algorithm on any metric space with competitive ratio k. • Note that a 2k-1 competitive ratio is known. • There is a lower bound of k. • So the “gap” in this conjecture is just a factor of 2. • Tight for uniform metric. • Randomized k-server conjecture: There exists a randomized algorithm on any metric space with expected competitive ratio log k. • No general result is known that’s o(k). • There is a lower bound of log k. • Huge gap in this conjecture! • Tight for uniform metric.

  10. This Talk -- New Results • We will give a randomized algorithm for k-server on a special class of metrics: Hierarchical Binary Trees. • The expected competitive ratio is O(log Δ), assuming that the hierarchical edge weights decrease by a factor α = Ω(log Δ). • This is poly-logarithmic in a natural problem parameter (the diameter Δ). • While Hierarchical Binary Trees are definitely a special case, we observe that: • There are few poly-logarithmic randomized results even for restricted metrics. Exceptions are the uniform metric and n=k+O(1). [Fiat-Karp-Luby-McGeoch-Sleator-Young 1991] [Bartal-Blum-Burch-Tomkins 1997] • If we remove the “Binary” from the metric description, then embedding results will let us extend to general metrics! [Bartal 96] [Bartal 98] [Fakcharoenphol-Rao-Talwar 2003]

  11. Outline of this Talk • K-Server Definitions and History • Examples, applications, competitive ratio, known results • Quasiconvexity: Solving K-Server Offline • Offline solution, relation to flow, useful theorems • A Hierarchical Problem • Defining binary hierarchical trees, divide and conquer • The Local Sub-problem • A special metrical task system, randomized greedy • Future work: Towards a General Randomized Algorithm • Non-binary hierarchical trees extend to finite metrics • Can we make this approach work for non-binary trees? • Infinite metrics and “online embeddings” • K-Server with random requests

  12. Solving K-Server Offline We will build a graph G as follows. Start by creating a node for each initial server location and each request. Copy this set of nodes for before and after each request. [Figure: layers of node copies labeled “Before 1”, “After 1”, “Before 2”, “After 2”, …]

  13. Solving K-Server Offline Place a directed edge between any two nodes in consecutive copies. Give each edge a weight equal to the distance between the locations represented by its endpoints. [Figure: directed edges between the “Before i” and “After i” layers.]

  14. Solving K-Server Offline We now add special nodes s and t. Connect s to the initial server locations in “before 1” with weight-0, capacity-1 edges. Edges from all nodes in “after T” to t, with weight 0. [Figure: the layered graph with s connected to “Before 1” and all of “After T” connected to t.]

  15. Solving K-Server via Flow • We observe that any k-server solution corresponds to a flow on this graph. The flow into a node is equal to the number of servers at the corresponding location and time. • If the i’th request is at location xi then a feasible solution requires (at least) one unit of flow between the “before i” and the “after i” copies of location xi. • It follows that after constructing the layered graph G (which has polynomial size) we can solve K-Server optimally by computing the minimum cost s-t flow of value k which satisfies these flow lower bounds. This can be done in polynomial time. • Conversely, any such integral flow on graph G corresponds to a feasible k-server solution.
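The slide solves the offline problem with a polynomial min-cost flow computation. As a sanity check for tiny instances, the same optimum can be computed by brute-force dynamic programming over server configurations (exponential in general, so this is only an illustration; `offline_opt` and the line metric are assumptions, not the talk's construction):

```python
def offline_opt(dist, start, requests):
    """Exact offline k-server cost by dynamic programming over
    server configurations (sorted tuples, so co-located servers
    are allowed).  Exponential in general -- a sanity check only,
    not the polynomial min-cost-flow computation from the slide."""
    states = {tuple(sorted(start)): 0}
    for r in requests:
        nxt = {}
        for conf, cost in states.items():
            # try moving each server to the request
            for i, s in enumerate(conf):
                nc = tuple(sorted(conf[:i] + (r,) + conf[i + 1:]))
                c = cost + dist(s, r)
                if c < nxt.get(nc, float("inf")):
                    nxt[nc] = c
        states = nxt
    return min(states.values())

line = lambda a, b: abs(a - b)
# On the greedy-killer instance (servers at 0 and 10, requests
# alternating between 4 and 6), the offline optimum is 8.
best = offline_opt(line, (0, 10), [4, 6] * 10)   # 8
```

Each DP layer plays the role of one “before/after” copy in the flow graph; the flow formulation avoids the exponential configuration space.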

  16. Quasi-Convexity • The idea of quasi-convexity was introduced by: • [Koutsoupias-Papadimitriou 1995], [Koutsoupias 1999] • Let c(A) be the cost of the optimum k-server solution to some particular instance which ends with servers at locations A. • The quasi-convexity theorem states: • For any configurations A, B there exists a matching π: A → B such that for any partitioning of A into A1, A2 we have: • c(A1 ∪ π(A2)) + c(π(A1) ∪ A2) ≤ c(A) + c(B) • Further, if x ∈ A ∩ B then we can guarantee π(x) = x.

  17. Quasi-Convexity Picture Solution A has this final server location set. Solution B has this final server location set. There exists a matching π between these sets. If we swap any subset of server locations according to π, and find the best solutions A’ and B’ with servers at these new locations, then c(A’) + c(B’) ≤ c(A) + c(B).

  18. Quasi-Convexity Proof [Figure: the layered flow graph from node s through the “Before i”/“After i” layers to node t, with flows fA and fB from the initial configuration and the circulation fB − fA; black nodes are requests.]

  19. New Extensions/Applications of Quasi-Convexity • Comparing the cost of one extra request: • Let c(σ, k) be the optimum k-server cost for request sequence σ. • Let c(σr, k) be the optimum k-server cost with request r added. • Obviously c(σr, k) - c(σ, k) ≥ 0. • What happens if we have one extra server? • c(σr, k) - c(σ, k) ≥ c(σr, k+1) - c(σ, k+1) • The increase in cost is less! • This is intuitive, and can be proven using Quasi-Convexity.

  20. New Extensions/Applications of Quasi-Convexity • Let cx(X) be the cost of the cheapest x-server solution for a particular request sequence ending in configuration X. Let cx* be the cost of the optimum x-server solution on the same request sequence. • Consider any integer y. There exists a configuration Y satisfying: • |Y| = y • |X ∩ Y| = min(x, y) • cy(Y) ≤ cx(X) + cy* - cx* • In other words: there is a low-cost solution with y servers which is as similar to configuration X as possible.

  21. Outline of this Talk • K-Server Definitions and History • Examples, applications, competitive ratio, known results • Quasiconvexity: Solving K-Server Offline • Offline solution, relation to flow, useful theorems • A Hierarchical Problem • Defining binary hierarchical trees, divide and conquer • The Local Sub-problem • A special metrical task system, randomized greedy • Future work: Towards a General Randomized Algorithm • Non-binary hierarchical trees extend to finite metrics • Can we make this approach work for non-binary trees? • Infinite metrics and “online embeddings” • K-Server with random requests

  22. Hierarchical Binary Trees We will consider weighted binary trees with the following properties: All leaves are at the same depth. Edge weights decrease geometrically by a factor α per level. All initial server locations and requests are at leaves. [Figure: a depth-3 tree with edge weights α² at the root, α in the middle, and 1 at the leaves.]
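A sketch of the resulting metric (assuming, as one natural convention, that the deepest edges have weight 1 and weights grow by a factor α per level toward the root): the distance between two leaves is twice the total edge weight from a leaf up to their lowest common ancestor, since both leaves are at the same depth.

```python
def leaf_distance(u, v, L, alpha):
    """Distance between leaves u and v (0-indexed) of a perfect
    binary hierarchical tree of depth L.  Edges at the deepest
    level have weight 1 and weights grow by a factor alpha per
    level toward the root (i.e. they decrease by alpha going down)."""
    if u == v:
        return 0
    # depth of the lowest common ancestor = length of the shared
    # prefix of the leaves' L-bit addresses
    lca_depth = 0
    for d in range(L):
        bit = L - 1 - d
        if (u >> bit) & 1 != (v >> bit) & 1:
            break
        lca_depth += 1
    # both leaves climb to the LCA, so each edge weight is paid twice
    return 2 * sum(alpha ** (L - 1 - d) for d in range(lca_depth, L))

# Depth 3, alpha = 2: sibling leaves are at distance 2, and leaves
# at opposite ends of the tree are at distance 2 * (4 + 2 + 1) = 14.
```

This ultrametric-like structure is what lets each internal node reason only about how many servers sit in each of its subtrees, not where.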

  23. Multiple Decision-Makers At each node we run an online algorithm. The node “sees” requests in its subtrees. The node must partition servers between its subtrees. The number of available servers changes over time too! [Figure: a tree node with 5 servers available and a child node with 4 servers available.]

  24. How does it work? • Each decision maker is only responsible for partitioning servers between its subtrees. • It does not have to actually satisfy requests or determine locations of the servers. • We will assume that the “offline optimum” happens within each subtree -- the goal of the decision maker is just to decide a partition.

  25. Two types of Cost • What is the goal of the decision-maker? • It needs to minimize two types of cost. • Move Cost. This is the cost of moving servers from one subtree to the other, or moving servers out of the entire tree. Because of the hierarchical structure, this does not depend upon the actual locations of the servers. • Hit Cost. This is the cost of (optimally) satisfying the requests given the partitionings selected by the decision-maker. We would like the change in this cost to depend only on the current partition and request, not the entire history. • We will measure hit cost as the sum (over requests) of the increase in optimum cost caused by each request, assuming we always use the current partition.

  26. Hit Cost Formula • Let cT(σ, k) be the optimum cost of k-server on subtree T and request sequence σ. • Define σ[i] to be the request sequence terminated at request i. • Define ki to be the number of servers assigned to subtree T at request i. • Then the hit cost is given by: • HC = ∑i cT(σ[i], ki) - cT(σ[i-1], ki)

  27. Existence of Cheap Solution: Base Case • Let OPT be the cost of the optimum solution on a subtree given the number of servers in that subtree at each time. • The online decision maker at the root of this subtree must compute a partition of the available servers at each time. We want to prove that there exists such a sequence of partitions with low hit cost and move cost. • The obvious thing to do is to use the partition inherent in the optimum solution. This gives the optimum move cost, but the hit cost is not actually equal to the cost of moving things around within the tree: • HC = ∑i cT(σ[i], ki) - cT(σ[i-1], ki) • Note that if ki is the same at all times (no servers move) then this summation telescopes to equal the cost of moving things around within the tree.

  28. Existence of Cheap Solution: Inductive Step • Of course, in many cases the values of ki will change over time as the decision-maker moves servers from one child subtree to the other (or out of its subtree entirely). • The proof will be by induction on the number of such moves. Suppose we have: • Vector k: [5 5 6 6 5 5 5 4 4 3 3 3 4 4 4 4 5 5 5] • Vector k’: [6 6 6 6 5 5 5 4 4 3 3 3 4 4 4 4 5 5 5] • Note that vector k’ has one less move, so we can apply the inductive hypothesis.

  29. Existence of Cheap Solution: Completing the Proof • We basically need to show that the optimum cost for k is related to the optimum cost for k’. • The key is to construct a solution for k’ by modifying the solution for k. The solutions will be identical from the point when ki = ki’. • We need to construct the earlier server locations for k’. The key is to use Quasi-Convexity: • Consider any integer y. There exists a configuration Y satisfying: • |Y| = y • |X ∩ Y| = min(x, y) • cy(Y) ≤ cx(X) + cy* - cx*

  30. Building Hierarchical Solution • It is also necessary to show that a solution with good hit cost and move cost is actually a good solution; this can be done inductively in a similar way. • Assuming that we can find a solution online for a single decision-maker with approximately optimum hit cost and move cost, we will be able to obtain a good solution overall. • Of course, the single decision-maker is also acting online, so it is unlikely to guarantee HC≤HC* and MC≤MC*. • The tricky part is that we need to be very careful with the hit cost, since it can accumulate multiplicatively between levels. A constant competitive result is really not enough: we need HC≤HC* to get a constant competitive ratio, or HC≤HC*+MC* to get O(L), where L is the number of levels in the tree (L = O(log Δ)).

  31. Outline of this Talk • K-Server Definitions and History • Examples, applications, competitive ratio, known results • Quasiconvexity: Solving K-Server Offline • Offline solution, relation to flow, useful theorems • A Hierarchical Problem • Defining binary hierarchical trees, divide and conquer • The Local Sub-problem • A special metrical task system, randomized greedy • Future work: Towards a General Randomized Algorithm • Non-binary hierarchical trees extend to finite metrics • Can we make this approach work for non-binary trees? • Infinite metrics and “online embeddings” • K-Server with random requests

  32. The Local Problem • Consider the problem faced by a single decision-maker. • We have a set of states which correspond to the number of servers in the left subtree. • Switching from one state to another causes us to pay the “move cost.” • If we stay in a state when a request arrives, then we must pay a “hit cost.” • This is a metrical task system problem, but it has some useful special properties.

  33. Cost Vectors A cost is applied to each state. Here we have a request on the left subtree. Note the higher cost for fewer servers. [Figure: an example cost vector over the states, with the cost decreasing as more servers are assigned to the left subtree.]

  34. Cost Vector Property • Can these cost vectors be completely arbitrary? • The cost on a particular state x at request i is given by: • cx[i] = cT(σ[i], x) - cT(σ[i-1], x) • Note that this is just the added cost of one more request in an x-server problem. • We again apply Quasi-Convexity to see that more servers implies a smaller change in cost! • This implies that the cost vector will be either non-increasing (request on left subtree) or non-decreasing (request on right subtree).

  35. Work Function and a Randomized Local Algorithm • Let cx[i] be the cost applied to state x at request i. • We define the work function • wx[i] = min(wx[i-1] + cx[i], wx-1[i] + 2δ, wx+1[i] + 2δ) • Here δ is the distance to the root of the subtree. So this represents the cheapest way of ending in state x: by staying there from the prior request, or by moving there from one of the adjacent states. • Our algorithm is to select a random r ∈ [-1, 1]. We will then always stay in the state which minimizes wx[i] + 2δxr.
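A sketch of this local rule (the stripped symbol is presumably δ, the distance to the root; the fixed-point relaxation in `update_work_function` is an implementation choice, needed because wx[i] refers to its neighbors' values at the same step):

```python
def update_work_function(w, cost, delta):
    """Apply one cost vector to the local work function:
    w_x <- min(w_x + c_x, w_{x-1} + 2*delta, w_{x+1} + 2*delta).
    The neighbor terms refer to values at the *same* step, so we
    relax to a fixed point (shortest-path smoothing on the line of
    states; this terminates because each pass only lowers values)."""
    w = [wx + cx for wx, cx in zip(w, cost)]
    changed = True
    while changed:
        changed = False
        for x in range(len(w)):
            best = w[x]
            if x > 0:
                best = min(best, w[x - 1] + 2 * delta)
            if x + 1 < len(w):
                best = min(best, w[x + 1] + 2 * delta)
            if best < w[x]:
                w[x], changed = best, True
    return w

def choose_state(w, delta, r):
    """Stay in the state minimizing w_x + 2*delta*x*r, for the
    algorithm's random r in [-1, 1]."""
    return min(range(len(w)), key=lambda x: w[x] + 2 * delta * x * r)
```

For example, with three states, cost vector [5, 3, 0] and δ = 1, the work function [0, 0, 0] becomes [4, 2, 0]: state 1's raw cost 3 is undercut by moving from state 2 (0 + 2δ), and state 0's raw cost 5 by moving from state 1.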

  36. Algorithm Intuition • Say we were to just always stay in the state with minimum work function value. • Then our hit cost paid at each step will be the change in work function, which means our total hit cost paid is bounded by the work function value of the state we’re in. So we would have HC≤HC* + MC*. • The problem is that the move cost can blow up without bound if we don’t use randomization. The request sequence could be such that we keep switching which state has the minimum work function value by having very low-cost requests.

  37. Algorithm Demo [Figure: a plot of w(x) against state x, with slope bounded by 2δ; the initial state is 5.]

  38. Algorithm Demo Apply the random r (here r > 0). [Figure: the work function values tilted by adding 2δxr.]

  39. Algorithm Demo [Figure: the tilted work function, with the minimizing state highlighted.]

  40. Algorithm Demo Apply the new cost vector for the next request. Start at the lowest w(x) state; move to the new lowest w(x) state. [Figure: the work function before and after the new cost vector is applied.]

  41. Proof of Bounded Hit Cost • Let Xx[i] = wx[i] + 2δxr. • At any time that we have some server partition, the hit cost paid will look like the change in the Xx value. When we move from one partition to another, the Xx value cannot change (we can move at the moment the two values are equal). • So the total hit cost we pay is bounded by Xz[T] - Xa[0] where z is our algorithm’s final state and a is the initial state. • Let a* and z* be the initial and final state for the best solution. Then we can guarantee that a=a* and that Xz[T] ≤ Xz*[T]. So our HC is at most Xz*[T] - Xa*[0] = OPT + 2δz*r - 2δa*r = OPT + 2δr(z*-a*). • Since E[r]=0, this gives us expected hit cost at most OPT = HC*+MC*.

  42. Bounding Move Cost • We can split the cost vectors up into pieces, each of which applies equal cost to a run of consecutive states. • Such a cost vector can only cause us to move from a state whose cost is increased to a state whose cost is not increased. In fact, if the cost increase is ε-sized there is only one pair of states we can move between. • We need to show that the probability of moving times the cost of moving is bounded. We use a potential function: • Φ[t] = 2δk - ∑x |wx+1[t] - wx[t]| • We observe that the initial potential is zero, and the potential is never negative. • Applying a cost vector increases potential if cost is applied to the cheapest state, or reduces potential otherwise.
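With the reconstructed symbols (Φ for the potential, δ for the distance to the root, states 0..k), the nonnegativity claim is easy to check numerically; this sketch assumes that interpretation:

```python
def potential(w, delta):
    """Phi[t] = 2*delta*k - sum over x of |w_{x+1} - w_x|, where
    the states are 0..k.  Nonnegative because the work function is
    2*delta-Lipschitz over adjacent states (moving between them
    costs 2*delta), so each of the k consecutive differences is at
    most 2*delta."""
    k = len(w) - 1
    return 2 * delta * k - sum(abs(w[x + 1] - w[x]) for x in range(k))

# A flat work function has maximal potential 2*delta*k; a maximally
# tilted one (slope exactly 2*delta everywhere) has potential 0.
```

Moves happen when two tilted values cross, which requires a nearly flat stretch of the work function, i.e. high potential; charging moves against the potential drop bounds the expected move cost.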

  43. Outline of this Talk • K-Server Definitions and History • Examples, applications, competitive ratio, known results • Quasiconvexity: Solving K-Server Offline • Offline solution, relation to flow, useful theorems • A Hierarchical Problem • Defining binary hierarchical trees, divide and conquer • The Local Sub-problem • A special metrical task system, randomized greedy • Future work: Towards a General Randomized Algorithm • Non-binary hierarchical trees extend to finite metrics • Can we make this approach work for non-binary trees? • Infinite metrics and “online embeddings” • K-Server with random requests

  44. Who Cares about Binary Hierarchical Trees? • There is a sequence of results in metric embedding, showing that we can transform any finite metric space into a hierarchical (not necessarily binary) tree in a randomized way, such that the distances are distorted by an expected O(log n) factor. • [Bartal 1996] [Bartal 1998] [Fakcharoenphol-Rao-Talwar 2003] • Thus, if we could extend this k-server result to non-binary hierarchical trees, we would obtain an O(log n log² Δ) competitive algorithm for the problem on a general metric!

  45. Obstacles to Extending to Polynomial-Degree Trees • Why doesn’t the algorithm extend directly? • The obvious approach is to try to extend the local algorithm to work in the non-binary case. This removes some of the structure of the local case, since the states now represent partitionings of servers among potentially many children (not just two). Of course, the local problem is a metrical task system problem, and there do exist (in general) competitive metrical task system algorithms. • [Bartal-Blum-Burch-Tomkins 1997] • But these algorithms have competitive ratio poly-logarithmic in the number of states (which is now exponentially large)! • Since our hit cost will grow geometrically, we need an algorithm which guarantees HC≤HC*+MC* and simultaneously MC≤β(HC*+MC*). It appears to be possible to get this from the metrical task system results, but we will end up with β = Ω(n) (poly-log in the number of states).

  46. Obstacles to Extending to Polynomial-Degree Trees • Another possibility is to restructure the tree. We can transform a hierarchical non-binary tree into a hierarchical binary tree while approximately preserving distances. • The issue is that the value of α decreases dramatically. • Unfortunately, the algorithm is strongly dependent on this value. The main issue is when we were proving that there exists a solution with low hit cost and move cost in the hierarchical problem. • These inductive proofs did not actually establish HC*+MC* ≤ OPT. • Instead, they establish HC*+MC* ≤ O(1+1/α)·OPT. This may seem a small difference, but recall that the hit cost grows geometrically, multiplying by this O(1+1/α) term at every level. This gives a competitive ratio in the end of O((1+1/α)^L), which is fine if α = Ω(L) but not otherwise.

  47. Infinite Metrics • Of course, the underlying metric for k-server might be infinite (say the Euclidean plane). Even if we could solve the problem on non-binary trees, the metric embedding approach will not work on an infinite space. • The way around this is to consider the subspace consisting only of points where there is a request at some time. This will be finite, but of course we do not know the request locations in advance! • This suggests a new problem, that of “online embedding.” We are given points in a metric space one at a time, and must construct a tree embedding for the points we are given while respecting the previously constructed tree.

  48. Randomized Inputs • Another interesting direction involves K-Server with requests taken from some known random distribution. • If the distribution does not evolve over time then this is essentially the k-median problem and known solutions exist. • [Cogill-Lall 2006] • The more interesting case is if the distribution does evolve; for example say it is Markovian. • For the special case of k-server on a uniform metric (paging) there exists a constant-approximation for Markov distributions. • [Karlin-Phillips-Raghavan 2000] • No such result exists (to our knowledge) for k-server on a general metric, and this is an open problem. Interestingly, the analogous metrical task system problem is poly-time solvable!

  49. Conclusions • This talk gave a randomized competitive algorithm for a special case of the k-server problem. This is one of the first competitive algorithms (for any non-uniform metric with n>>k) with competitive ratio sublinear in natural problem parameters. • Perhaps more interesting than this result itself is the potential for future work. Extending the result to non-binary trees would already be substantial progress on the k-server problem. • Hopefully in the next few years we will see a poly-log competitive randomized algorithm for k-server on a general metric space, and a resolution of the randomized k-server conjecture.
