Approximations for hard problems
… just a few examples
… what is an approximation algorithm
… quality of approximations: from arbitrarily bad to arbitrarily good

Example: Scheduling.
Given: n jobs with processing times, m identical machines.
Problem: Schedule the jobs so as to minimize the makespan.
Algorithm: List scheduling.
Basic idea: from a list of jobs, schedule the next one as soon as a machine becomes free.
[Figure: jobs a–e placed greedily on machines 1–4.]
Good or bad?

[Figure: the same schedule with one more job f; job f starts at time S and finishes last, at time A.]
Job f finishes last, at time A. Compare with the makespan OPT of the best schedule: how?

(1) Job f must be scheduled somewhere in the best schedule, so A – S <= OPT.
(2) Up to time S, all machines were busy all the time, and job f was not yet included; OPT cannot beat full utilization, so S < OPT.
(3) Both together: A = (A – S) + S < 2 OPT.
This is a "2-approximation" (Graham, 1966).
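The argument above can be turned directly into code. A minimal sketch of list scheduling (function and variable names are mine, not from the slides), using a heap to find the machine that becomes free first:

```python
import heapq

def list_scheduling(processing_times, m):
    """Graham's list scheduling: assign each job, in list order,
    to the machine that becomes free first. The resulting makespan
    is less than 2 * OPT, as shown on the slide."""
    # Min-heap of (current finish time, machine index).
    machines = [(0, i) for i in range(m)]
    heapq.heapify(machines)
    assignment = []
    for p in processing_times:
        finish, i = heapq.heappop(machines)
        assignment.append(i)
        heapq.heappush(machines, (finish + p, i))
    makespan = max(t for t, _ in machines)
    return makespan, assignment

# Example: 5 jobs on 2 machines; list scheduling gets makespan 7,
# while the optimum ({3,3} vs {2,2,2}) achieves 6.
print(list_scheduling([3, 3, 2, 2, 2], 2))
```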
Approximations in more generality
P = set of problems with polynomial-time solutions.
NP = set of problems with polynomial-time certificates ("guess and verify").

Example Problem: Clique.
Given: graph G = (V,E); positive integer k.
Decide: does G contain a clique of size k?
Hardness: with respect to transformations in P.
A problem Π is NP-hard: every problem Π' in NP can be transformed into Π in polynomial time.
A problem Π is NP-complete: Π is NP-hard and Π is in NP.
Basis for transformations (reductions): 3-SAT.
Theorem: Clique is NP-complete.
Proof: (1) Clique is in NP: guess and verify.
(2) Clique is NP-hard, by reduction from 3-SAT:
literal = vertex; edge = compatibility (the two literals lie in different clauses and are not negations of each other).
Example clauses: (a or b or not c), (a or not b or not c), (d or b or c).
Requested clique size k = number of clauses.
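A small sketch of the reduction for the three example clauses above (the encoding of literals as pairs is my choice); it builds the compatibility graph and checks, by brute force, that a clique of size k = 3 exists, i.e., that the formula is satisfiable:

```python
from itertools import combinations

# Clauses from the slide: a literal is (variable, is_positive).
clauses = [
    [("a", True), ("b", True), ("c", False)],   # a or b or not c
    [("a", True), ("b", False), ("c", False)],  # a or not b or not c
    [("d", True), ("b", True), ("c", True)],    # d or b or c
]

# One vertex per literal occurrence: (clause index, literal).
vertices = [(i, lit) for i, cl in enumerate(clauses) for lit in cl]

def compatible(u, v):
    (i, (x, sx)), (j, (y, sy)) = u, v
    # Edge iff different clauses and the literals are not contradictory.
    return i != j and not (x == y and sx != sy)

edges = {frozenset((u, v))
         for u, v in combinations(vertices, 2) if compatible(u, v)}

def is_clique(vs):
    return all(frozenset((u, v)) in edges for u, v in combinations(vs, 2))

# The formula is satisfiable iff the graph has a clique of size k.
k = len(clauses)
sat = any(is_clique(vs) for vs in combinations(vertices, k))
print(sat)  # → True (e.g. a = True, b = True satisfies all clauses)
```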
Three flavors of the problem:
Decide whether G contains a clique of size k: decision problem, in NP.
Compute the largest value k for which G contains a clique of size k: value problem, more interesting.
Compute a largest clique in G: optimization problem, most realistic.
Polynomial relationship: IF the decision problem is in P, THEN the value problem is in P, THEN the optimization problem is in P.
Problem: Independent set.
Given: G = (V,E); positive integer bound k.
Decide: is there a subset V' of V of at least k vertices, such that no two vertices in V' share an edge?
Theorem: Independent set is NP-complete.
Proof: (1) it is in NP: guess and verify.
(2) it is NP-hard, by reduction from Clique: build the complement graph (edge ↔ no edge).
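The complement-graph reduction is a one-liner; a quick sketch (names are mine):

```python
from itertools import combinations

def complement_edges(n, edges):
    """Reduction Clique -> Independent set: complement the graph.
    A clique in G is exactly an independent set in the complement."""
    E = {frozenset(e) for e in edges}
    return [{u, v} for u, v in combinations(range(n), 2)
            if frozenset((u, v)) not in E]

# Triangle {0,1,2} plus edge (2,3): the clique {0,1,2} in G
# is an independent set in the complement.
G = [(0, 1), (0, 2), (1, 2), (2, 3)]
print(complement_edges(4, G))  # → [{0, 3}, {1, 3}]
```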
Problem: Minimum vertex cover.
Given: G = (V,E).
Minimize: find a smallest subset V' of V such that every edge in E has an incident vertex in V' ("is covered").
Theorem: Vertex cover is NP-hard.
Proof: by reduction from Independent set: v is in an independent set ⇔ v is not in the corresponding vertex cover.
Lots of hard problems; what can we do?
Solve the problem approximately…
Minimum vertex cover, first idea:
repeat greedily: pick the vertex that covers the largest number of edges not yet covered, and remove it and its incident edges.
Not too good … as we will see later.
Minimum vertex cover, second idea:
repeat greedily: pick both vertices of an arbitrary uncovered edge, and remove both together with their incident edges.
Great:
Theorem: this is a 2-approximation.
Proof: for every picked edge, any cover needs at least one of its two vertices; we take two.
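A minimal sketch of the second idea (function name is mine); the picked edges form a maximal matching, which is where the factor 2 comes from:

```python
def vertex_cover_2approx(edges):
    """Pick both endpoints of each edge not yet covered.
    Any optimum cover needs at least one endpoint per picked edge,
    so |cover| <= 2 * OPT."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

# Star graph: OPT = {0}, the greedy cover has 2 vertices.
print(vertex_cover_2approx([(0, 1), (0, 2), (0, 3)]))  # → {0, 1}
```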
Independent set … by the reduction we know … does not work:
assume a graph with 1000 vertices,
a minimum vertex cover of 499 vertices,
an approximate vertex cover of 998 vertices.
The maximum independent set has 501 vertices; the approximate independent set has only 2 vertices.
Polynomial reductions need not preserve approximability: be careful, and use special (approximation-preserving) reductions.
Decision versus optimization?
NPO = set of "NP optimization problems": roughly, problems where one can verify that a proposed solution is feasible, and compute its value, in polynomial time.
What is an approximation algorithm?
A is an approximation algorithm for a problem Π in NPO: for any input I, A runs in time polynomial in the length of I, and if I is a legal input, A outputs a feasible solution A(I).
The approximation ratio of A for Π on input I is value(A(I)) / value(OPT(I)); it is at least 1 for minimization problems and at most 1 for maximization problems.
What is the approximation ratio of an approximation algorithm?
… the maximum of this ratio over all inputs for minimization problems, and the minimum over all inputs for maximization problems
(and sometimes only the asymptotic ratio is of interest, for large problem instances).
Example Problem: k-center.
Given: G = (V,E) with E = V × V; cost c(i,j) for all edges (i,j) with i ≠ j; positive integer number k of clusters.
Compute a set S of k cluster centers, S a subset of V, such that the largest distance of any point to its closest cluster center is minimum.
Theorem: k-center is NP-complete (decision version).
Proof: Vertex cover reduces to Dominating set, which reduces to k-center.
Dominating set: for G = (V,E), find a smallest subset V' of V such that every vertex is either itself in V' or has a neighbor in V'.
[Figure: the reduction from Vertex cover (VC) to Dominating set (DS).]
[Figure: the reduction from Dominating set to k-center: edges of G get cost 1, non-edges get cost 2; the decision bound on the cluster radius is 1.]
Non-approximability
Theorem: finding a ratio-M approximation, for any fixed M, is NP-hard for k-center.
Proof: replace the cost 2 in the construction above by more than M.
Theorem: finding a ratio-less-than-2 approximation is NP-hard for k-center with the triangle inequality.
Proof: exactly as in the reduction above (the costs 1 and 2 satisfy the triangle inequality).
Theorem: a 2-approximation for k-center with the triangle inequality exists.
Proof: Gonzalez' algorithm.
Pick v1 arbitrarily as the first cluster center.
Pick v2 farthest from v1.
Pick v3 farthest from the closer of v1 and v2.
…
Pick vi farthest from the closest of v1, …, v(i-1).
… until vk is picked.
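Gonzalez' farthest-first traversal is short enough to sketch directly (names and the 1-D example are mine):

```python
def gonzalez(points, k, dist):
    """Farthest-first traversal: a 2-approximation for k-center
    when dist satisfies the triangle inequality."""
    centers = [points[0]]                    # v1: arbitrary first center
    # d[p] = distance from p to its closest chosen center so far.
    d = {p: dist(p, centers[0]) for p in points}
    while len(centers) < k:
        v = max(points, key=lambda p: d[p])  # farthest remaining point
        centers.append(v)
        for p in points:
            d[p] = min(d[p], dist(p, v))
    radius = max(d.values())
    return centers, radius

# Points on a line, k = 2: greedy radius 2, optimum radius 1
# (optimum centers would be 1 and 10).
pts = [0, 1, 2, 10, 11]
centers, r = gonzalez(pts, 2, lambda a, b: abs(a - b))
print(centers, r)  # → [0, 11] 2
```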
Example Problem: Traveling salesperson with the triangle inequality.
Given: G = (V,E) with E = V × V; cost c(i,j) for all edges (i,j) with i ≠ j.
Compute a round trip that visits each vertex exactly once and has minimum total cost.
Comment: this is NP-hard.

Approximation algorithm: find a minimum spanning tree; run around it and take shortcuts to avoid repeated visits.
Quality?
Theorem: this is a 2-approximation.
Proof: (1) an optimum tour OPT-TSP minus one edge is a spanning tree ST.
(2) The MST is not longer than any spanning tree.
(3) APX-TSP <= 2 MST <= 2 ST <= 2 OPT-TSP, since shortcuts only shorten the doubled tree by the triangle inequality.
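A self-contained sketch of this "double-tree" algorithm (names are mine): Prim's algorithm for the MST, then a preorder walk of the tree, which is exactly the run-around with shortcuts.

```python
import math

def tsp_double_tree(n, c):
    """MST-based 2-approximation for metric TSP: build an MST,
    walk it in preorder, and shortcut repeated visits.
    c(i, j) must satisfy the triangle inequality."""
    # Prim's algorithm: grow the MST from vertex 0.
    in_tree = {0}
    adj = {i: [] for i in range(n)}
    while len(in_tree) < n:
        u, v = min(((u, v) for u in in_tree
                    for v in range(n) if v not in in_tree),
                   key=lambda e: c(*e))
        adj[u].append(v); adj[v].append(u)
        in_tree.add(v)
    # Preorder walk of the MST = doubled tree with shortcuts.
    tour, stack, seen = [], [0], set()
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            tour.append(u)
            stack.extend(reversed(adj[u]))
    return tour

# Four corners of a unit square, Euclidean distances.
pts = [(0, 0), (0, 1), (1, 0), (1, 1)]
cost = lambda i, j: math.dist(pts[i], pts[j])
tour = tsp_double_tree(4, cost)
print(tour)
```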
Better quality? Christofides' algorithm:
• Find an MST.
• Find all odd-degree vertices V' in the MST. Comment: there is an even number of them.
• Find a minimum-cost perfect matching for V' in the subgraph of G induced by V' (no even-degree vertices present). Call this M.
• In MST + M, find an Euler circuit.
• Take shortcuts in the Euler circuit.
Theorem: Christofides' algorithm is a 1.5-approximation of TSP with the triangle inequality.
Proof:
1. MST <= TSP.
2. M <= TSP / 2 … as we will see next.
3. Shortcuts make the tour only shorter.
Proof of M <= TSP / 2:
• consider the subgraph induced by the odd-degree vertices V';
• restrict the optimum tour TSP to that subgraph by shortcutting past vertices outside V', giving sub-TSP(V');
• sub-TSP(V') <= TSP, by the triangle inequality;
• picking alternate edges of sub-TSP(V') gives two perfect matchings for V' (possible since |V'| is even), call them M1 and M2;
• the cheaper of M1 and M2 costs at most TSP / 2, and the minimum-cost matching M is no more expensive: M <= TSP / 2.
Notes:
• Christofides' algorithm may really be as bad as 1.5 (tight examples exist).
• It is unknown whether this is best possible.
• For Euclidean TSP, a better bound is known: any ratio above 1 can be achieved.
• For TSP without the triangle inequality, no fixed approximation ratio is possible.
Example Problem: Set cover.
Given: a universe U of elements e1, e2, …, en; a collection of subsets S1, S2, …, Sk of U; a nonnegative cost per Si.
Find a collection of the subsets that covers U with minimum total cost.
Idea for an approximation:
repeat: greedily choose a best set (cheapest per newly covered element), until all of U is covered.
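A minimal sketch of the greedy rule (names and the small instance are mine):

```python
def greedy_set_cover(universe, sets, cost):
    """Repeatedly pick the set with the cheapest price per newly
    covered element; the analysis below shows this is a
    (1 + ln n)-approximation."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Best set: minimum cost per newly covered element.
        i = min((i for i, s in enumerate(sets) if s & uncovered),
                key=lambda i: cost[i] / len(sets[i] & uncovered))
        chosen.append(i)
        uncovered -= sets[i]
    return chosen

U = {1, 2, 3, 4, 5}
S = [{1, 2, 3}, {2, 4}, {3, 4, 5}, {5}]
cost = [3, 1, 4, 1]
print(greedy_set_cover(U, S, cost))  # → [1, 3, 0], total cost 5
```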
Quality?
Consider one step of the iteration. The greedy algorithm has already selected some of the Sj; let C be the union of all sets selected so far. Suppose it chooses Si in this step. The price per new element of Si is cheapest: price(i) = cost(Si) / (number of new elements of Si) is minimum.
Consider the elements in the order in which they are covered.
Consider the elements in the order in which they are covered, and rename them accordingly: e1, e2, e3, …, ek, …, en.
Consider ek, and call the set in which ek is covered Si. What is the price of ek?
Bound the price from above by comparing with the optimum.
Instead of Si, the greedy algorithm could have picked any of the sets in OPT that have not been picked yet (there must be one, since OPT covers the still uncovered elements), but at which price?
The sets of OPT not picked yet cover all elements not covered yet, and their total cost is at most OPT.
Hence the average price across the uncovered elements U - C is at most OPT / |U - C|, so at least one of the sets in OPT not picked yet has price at most this average. This set could have been picked.
Hence the price of ek is at most OPT / |U - C|.
When the k-th element is covered, |U - C| is at least n - k + 1. Therefore price(ek) <= OPT / (n - k + 1) for each ek.
The sum of all prices gives the total cost of the greedy solution:
SUM(k=1,…,n) price(ek) <= SUM(k=1,…,n) OPT / (n - k + 1) = OPT · SUM(k=1,…,n) 1/k <= OPT (1 + ln n).
Theorem: greedy set cover gives a (1 + ln n)-approximation.
Notes: it can really be that bad (tight examples exist). That is also essentially the best possible approximation ratio for set cover.
Example Problem: Knapsack.
Given: n objects with weights and values, and a weight bound: positive integers w1, w2, …, wn (weights) and W (weight bound); positive integers v1, v2, …, vn (values).
Find a subset of the objects with total weight at most W and maximum total value.
… this is NP-hard.
An exact algorithm for knapsack
[Table: DP array A with rows j = 1, …, n and columns v' = 1, …, n·vmax.]
A(j,v') = smallest total weight of a subset of objects 1, …, j with total value exactly v'.
A(j,v') = smallest total weight of a subset of objects 1, …, j with total value exactly v'.
Inductively:
A(1,v) = w1 if v = v1, and infinity otherwise;
A(i+1,v) = min( A(i,v), A(i, v - v(i+1)) + w(i+1) ), where the second term only applies if v - v(i+1) >= 0.
… the result is: max v such that A(n,v) <= W.
… the runtime is: O(n² · vmax) … pseudopolynomial.
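A sketch of this DP (I use the equivalent base case A(0,0) = 0, A(0,v) = ∞ instead of the slide's A(1,·), and keep only one row at a time; function name is mine):

```python
def knapsack_exact(weights, values, W):
    """DP over values: A[v] = minimum weight of a subset of the
    objects seen so far with total value exactly v.
    Runtime O(n^2 * vmax): pseudopolynomial."""
    n = len(weights)
    INF = float("inf")
    total = sum(values)                  # total value <= n * vmax
    A = [0] + [INF] * total              # row for 0 objects
    for j in range(n):
        nxt = A[:]
        for v in range(values[j], total + 1):
            if A[v - values[j]] + weights[j] < nxt[v]:
                nxt[v] = A[v - values[j]] + weights[j]
        A = nxt
    # Result: the largest value achievable with weight at most W.
    return max(v for v in range(total + 1) if A[v] <= W)

# Best subset: objects 0 and 1 (weight 5, value 7).
print(knapsack_exact([2, 3, 4], [3, 4, 5], 5))  # → 7
```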
Pseudopolynomial? Polynomial if the numbers are small, i.e., if vmax is polynomial in the input length.
Idea: scale the values down, i.e., ignore their less significant digits.
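A sketch of the scaling idea. The choice of the scaling factor mu = ε·vmax/n follows the usual analysis of this approach and is not stated on the slide, and I use a brute-force exact solver on the scaled values purely for brevity; in the real algorithm one would run the DP above on them.

```python
import math

def knapsack_scaled(weights, values, W, eps):
    """Scaling sketch: divide all values by mu, drop the fractional
    part, and solve exactly on the scaled values. With mu as below,
    the DP's runtime becomes polynomial in n and 1/eps."""
    n = len(values)
    mu = eps * max(values) / n           # assumed scaling factor
    scaled = [math.floor(v / mu) for v in values]
    # Exact solver on scaled values (brute force, illustration only).
    best = max((s for s in range(1 << n)
                if sum(weights[i] for i in range(n) if s >> i & 1) <= W),
               key=lambda s: sum(scaled[i] for i in range(n) if s >> i & 1))
    return [i for i in range(n) if best >> i & 1]

# Same instance as before, with larger values to make scaling visible.
print(knapsack_scaled([2, 3, 4], [30, 41, 52], 5, 0.1))  # → [0, 1]
```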