Approximation Algorithms: The Subset-sum Problem

Approximation Algorithms: The Subset-sum Problem Victoria Manfredi (252a-ad) Smith College, CSC 252: Algorithms December 12, 2000

Introduction
• NP-completeness and approximation algorithms
• Notation associated with approximation algorithms
• The NP-complete subset-sum problem and the optimization problem associated with it
• Proof that the approximation algorithm for the optimization problem is a fully polynomial-time approximation scheme
Please note: the information and ideas in this presentation were gathered from various sources. For complete references, please see the last slide.

NP-Complete Problems
• Those problems that are both
  • in NP (nondeterministic polynomial time)
    • a proposed answer can be described in polynomial space
    • a proposed answer can be verified as correct or not in polynomial time
  • and NP-hard
    • solving the problem in polynomial time would imply that every other problem in NP can be solved in polynomial time
• No polynomial-time algorithm is currently known for any NP-complete problem

NP-Complete Problems
• Because NP-complete problems appear in many everyday problems that need to be solved (such as the travelling salesman problem in flight scheduling), they cannot be ignored.
• We therefore need an acceptable way to solve these problems: we want a polynomial-time algorithm that can do the job the exponential-time algorithm is doing. It is unlikely that we will find a polynomial-time algorithm for an NP-complete problem. If you did, you would have shown that P = NP, and you would of course then be rich and famous.
• The solution: approximation algorithms

Approximation Algorithms
• Say you are splitting a piece of cake with someone. Dividing it so that each of you gets exactly half the cake, right down to the last atom, would be pretty hard and would take an awfully long time (this is our exponential-time algorithm for the NP-complete problem), although it would do the job exactly right. But do we ever do this? No, we estimate and say that looks like about half.
• Approximation algorithms do the same sort of estimation in a more defined way, and are proven to give a good estimate in a reasonable (for example, polynomial) time

Approximation Algorithms
• When talking about approximation algorithms, there is some terminology we need to know:
  • Ratio bound
  • Relative error
  • Relative error bound
  • Approximation scheme
  • Polynomial-time approximation scheme
  • Fully polynomial-time approximation scheme

Ratio Bound
• Bounds how many times bigger the correct answer is than the answer from the approximation algorithm (if a max problem) OR how many times smaller the correct answer is (if a min problem)
• max(C/C*, C*/C) <= p(n), where C is the answer given by the approximation algorithm and C* is the correct (optimal) answer given by the exact algorithm
  • note: 0 < C* <= C if min and 0 < C <= C* if max
• p(n) is never less than one, because one of C/C* or C*/C is always greater than or equal to one
• p(n) is the function bounding how big the ratio between C and C* can be. It may depend on the input size n, hence p(n), but it may also be a constant, in which case it is written just p

Relative Error
• Is how far the answer from the approximation algorithm is from the correct answer, relative to the correct answer
• |C-C*| / C*
• For example, if the answer given by the approximation algorithm, C, equals 10 and the optimal answer given by the exact algorithm, C*, is 8 (we're doing a minimization), then
  • (10-8)/8 = 2/8 = 0.25 relative error
• Relative error is always either a positive number or zero (because of the absolute value in the equation). This makes sense because what would a negative relative error mean?
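The two measures above can be computed directly. The following is a minimal sketch; the function names `ratio` and `relative_error` are illustrative and not from the original slides.

```python
def ratio(c: float, c_star: float) -> float:
    """Return max(C/C*, C*/C); this is at least 1 for positive costs."""
    if c <= 0 or c_star <= 0:
        raise ValueError("costs must be positive")
    return max(c / c_star, c_star / c)


def relative_error(c: float, c_star: float) -> float:
    """Return |C - C*| / C*."""
    if c_star <= 0:
        raise ValueError("optimal cost must be positive")
    return abs(c - c_star) / c_star


# The minimization example from the slide: C = 10, C* = 8.
print(ratio(10, 8))           # 1.25
print(relative_error(10, 8))  # 0.25
```

For this example the ratio is exactly one plus the relative error, which is always the case for a minimization problem.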

Relative Error Bound
• Is a bound on how far the answer from the approximation algorithm is from the correct answer
• This bound can either be a function of the input size n, as in ε(n), with the relative error changing according to the size of n, or it can be a constant, as in just ε
• |C-C*| / C* <= ε(n)

Approximation Schemes
• Approximation scheme
  • approximation algorithm that takes a relative error bound, ε > 0, as well as the input data
• Polynomial-time approximation scheme
  • approximation scheme that, for any fixed ε, runs in time polynomial in the input size n
• Fully polynomial-time approximation scheme
  • approximation scheme that runs in time polynomial in both the input size n and 1/ε
• Why 1/ε? To capture how decreasing the relative error, ε, increases the running time

The Subset-sum Decision Problem
• The subset-sum problem is a decision problem that asks: given a set of numbers and a number x, can some subset of the numbers be added together to equal x?
• The subset-sum problem is based on the knapsack problem, but is simpler, although both are NP-complete. In the knapsack problem you're looking at both the size and the profit of the objects, while in the subset-sum problem you're just looking at the size of the objects
• Naïve solution: come up with all possible combinations of the numbers in the set, sum each one, and see if any of the resulting sums equals x
  • This is O(2^n), because a set of n numbers has 2^n subsets
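The naïve solution can be sketched as follows. This is a minimal illustration, assuming the input is a list of integers; the function name `subset_sum_exists` is made up for this example.

```python
from itertools import combinations


def subset_sum_exists(numbers, x):
    """Return True if some subset of numbers sums to exactly x.

    Brute force: tries every subset, so it examines up to 2^n of them.
    """
    for size in range(len(numbers) + 1):
        for subset in combinations(numbers, size):
            if sum(subset) == x:
                return True
    return False


print(subset_sum_exists([1, 4, 7], 12))  # True, since 1 + 4 + 7 = 12
print(subset_sum_exists([1, 4, 7], 6))   # False
```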

The Subset-sum Optimization Problem
• The optimization problem associated with the subset-sum problem asks: given a set of numbers and a number x, find the subset that sums to the largest number less than or equal to x.
• Since the decision problem associated with it is NP-complete, the optimization problem is NP-hard: an exact answer to the optimization problem would also answer the decision problem.
• Uses of a subset-sum algorithm: for example, packing a truck maximally
• The approximation algorithm presented here is for the optimization problem; an approximate answer does not, in general, settle the decision problem exactly

Solving Subset-sum Optimization Problem in Exponential Time
• Start with a target x, a set E = {0}, and the set to find the subset of, S = {s1, s2, …, sn}
• Define the set operation A + i on a set A = {a1, a2, …, ak} to equal {a1+i, a2+i, …, ak+i}
• Then do,
  • E1 = (E + s1) U E
  • E2 = (E1 + s2) U E1
  • …
  • En = (En-1 + sn) U En-1
• where each Ei is kept as a sorted list
• At each step, if any element in Ei is greater than x, remove it from the set
• At the end, the largest number in En is the answer
• Notice that the set En can grow exponentially

Solving Subset-sum Optimization Problem in Exponential Time
• x = 14, E = {0} and S = {1,4,7}
• Then,
  • E1 = {0+1} U {0} = {0,1}   Set size = 2
  • E2 = {0+4,1+4} U {0,1} = {0,1,4,5}   Set size = 4
  • E3 = {0+7,1+7,4+7,5+7} U {0,1,4,5} = {0,1,4,5,7,8,11,12}   Set size = 8
• 2 + 2^2 + 2^3 + … + 2^n = 2^(n+1) - 2 = O(2^n)
• We did obtain the correct answer (12), but we had to use an exponential amount of space in order to do so
• Note: in this example the space use doubles at each step; in other examples, this is not necessarily the case
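The exact procedure can be sketched in a few lines. This is an illustrative version, assuming non-negative integer inputs; the function name `exact_subset_sum` is made up here, and a Python set stands in for the sorted list.

```python
def exact_subset_sum(numbers, x):
    """Return the largest subset sum of numbers that does not exceed x.

    Builds every reachable sum, so the set can double at each step.
    """
    sums = {0}  # the starting set E = {0}
    for s in numbers:
        shifted = {e + s for e in sums}                # E_{i-1} + s_i
        sums = {e for e in sums | shifted if e <= x}   # union, then drop sums above x
    return max(sums)


print(exact_subset_sum([1, 4, 7], 14))  # 12
```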

Solving Subset-sum Optimization Problem in Polynomial Time
• How do we avoid exponential space use? Trim the set Ei at each step. Get rid of some larger numbers in the set by having smaller numbers represent them
• Trimming:
  • A number y can be represented by a smaller kept number z when our trimming parameter δ satisfies δ >= (y-z)/y, with 1 > δ > 0
  • To see if a number should stay or go: starting from the first number in the set, compare each number y with z, the last number kept. If z is greater than or equal to (1-δ) times y, then y can be removed. The first element of the set always stays.
• Trimming the set {3,5,6,8} with δ = 0.2, that is, 20% error, we get
  • keep 3 (the first element)
  • (5-3)/5 = 0.4 > 0.2, keep 5
  • (6-5)/6 ≈ 0.17 <= 0.2, don't keep 6 (let 5 represent it)
  • (8-5)/8 = 0.375 > 0.2, keep 8
• Final set {3,5,8}
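Trimming can be sketched as below. This is a minimal illustration, assuming each number is compared against the last number kept and that `delta` is passed as an exact fraction (floating-point values such as 0.2 can misbehave on boundary cases). The name `trim` is chosen here for illustration.

```python
from fractions import Fraction


def trim(sorted_values, delta):
    """Drop y whenever the last kept z satisfies (y - z) / y <= delta."""
    delta = Fraction(delta)
    kept = [sorted_values[0]]  # the first element always stays
    for y in sorted_values[1:]:
        z = kept[-1]
        if z < (1 - delta) * y:  # z is too far below y to represent it
            kept.append(y)
    return kept


print(trim([3, 5, 6, 8], Fraction(1, 5)))  # [3, 5, 8]
```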

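Solving Subset-sum Optimization Problem in Polynomial Time
• How do we choose δ? Remember ε for the relative error bound? We choose δ to be ε/n, where n is the number of elements in the set and 0 < ε < 1
• Looking at our example from before,
  • x = 14, T = {0}, S = {1,4,7}, n = 3, ε = 0.3 so δ = ε/n = 0.1
• Then (set sizes: now with trimming, before without),
  • M1 = {0+1} U {0} = {0,1}
  • T1: {0,1}   Set size = 2 (before: 2)
  • M2 = {0+4,1+4} U {0,1} = {0,1,4,5}
  • T2: {0,1,4,5}, nothing is trimmed because (5-4)/5 = 0.2 > 0.1   Set size = 4 (before: 4)
  • M3 = {0+7,1+7,4+7,5+7} U {0,1,4,5} = {0,1,4,5,7,8,11,12}
  • T3: {0,1,4,5,7,8,11} where 11 represents 12, because (12-11)/12 ≈ 0.083 <= 0.1   Set size = 7 (before: 8)
• We get 11 now, instead of 12, as the answer. But 11 is at least (1-ε) times 12, that is 8.4, so it is acceptable.
• A larger ε trims more: with ε = 0.6 (δ = 0.2), 4 represents 5 and 7 represents 8, giving T2 = {0,1,4} and T3 = {0,1,4,7,11}, with set sizes 2, 3, 5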
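Putting the merge and trim steps together gives the approximation algorithm. The sketch below assumes a non-empty list of positive integers and an ε given as a string or fraction so that δ = ε/n is computed exactly; the function names are illustrative, and sums above x are dropped after each trim.

```python
from fractions import Fraction


def trim(sorted_values, delta):
    """Keep y only if the last kept value is below (1 - delta) * y."""
    kept = [sorted_values[0]]
    for y in sorted_values[1:]:
        if kept[-1] < (1 - delta) * y:
            kept.append(y)
    return kept


def approx_subset_sum(numbers, x, epsilon):
    """Return a subset sum z <= x with z >= (1 - epsilon) * optimum."""
    delta = Fraction(epsilon) / len(numbers)  # trimming parameter
    trimmed = [0]
    for s in numbers:
        # Merge the current list with a copy shifted by s, then trim it.
        merged = sorted(set(trimmed) | {t + s for t in trimmed})
        trimmed = [v for v in trim(merged, delta) if v <= x]
    return trimmed[-1]


print(approx_subset_sum([1, 4, 7], 14, "0.3"))  # 11
```

The result 11 satisfies the guarantee for ε = 0.3, since the optimum is 12 and 0.7 × 12 = 8.4.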

Proof that Approximation Algorithm is a Fully Polynomial-time Approximation Scheme
• If we can prove that the approximation algorithm is a fully polynomial-time approximation scheme, we will be showing that
  • the algorithm runs in time polynomial in the input size n and in 1/ε
• We want to show this because it would mean that the approximation algorithm uses polynomial time/space instead of exponential and would therefore be a practical algorithm

Proof cont’d
• Our trimmed set from the approximation algorithm is a subset of the untrimmed set from the exact algorithm. That is, Ti is a subset of Ei
• This means that the answer we find using the approximation algorithm, some z, is the sum of a subset of the set we were given.
• If this is a good approximation algorithm, then z should be at least as big as (1-ε) times the answer we would get using the exact algorithm, which we'll call m. This comes from our max relative error equation, C*(1-ε) <= C, and we are looking for "at least as big" because this is a maximization problem

Proof cont’d
• We must therefore prove that z >= (1-ε)m. Recall from the relative error bound:
  • |C*-C| / C* <= ε(n) (C*-C instead of C-C* because this is a maximization)
  • |C*-C| / C* = (C*-C)/C* = C*/C* - C/C* <= ε(n)
  • 1 - C/C* <= ε(n)
  • 1 - ε(n) <= C/C*
• which gives C >= (1-ε)C*, since in our example ε is not a function of n (the size of the input). This is what we are working with, and it corresponds to z >= (1-ε)m
• We want to show that z and m are very close together. ε is between 0 and 1, so we want to show that z is at least, for example, 0.89*m (when ε = 0.11)

Proof cont’d
• Since δ was chosen to be ε/n, the relative error between a number in T and the number in M it represents is no more than ε/n at each step. Over the n steps, the relative error between the correct answer and the approximated answer will therefore be no more than ε.
• So, from (y-z)/y <= δ (see slide on Solving Subset-sum in Polynomial Time)
  • δ >= (y-z)/y = y/y - z/y = 1 - z/y
  • δ + z/y >= 1
  • z/y >= 1-δ
  • z >= (1-δ)y
  • y(1-δ) <= z

Proof cont’d
• And since we know y >= z, we get for a single trimming step
  • y(1-δ) <= z <= y
• and, with δ = ε/n, after i steps
  • y(1-ε/n)^i <= z <= y
• It can be shown by induction on i that for every y that was removed, there is a z still in the list that fits this inequality
• Let y* be the best answer. Then we get
  • y*(1-ε/n)^n <= z <= y*
• and the approximation algorithm gives the largest z that fits this

Proof cont’d
• By taking the derivative of the function in the above inequality, (1-ε/n)^n, with respect to n, we find that it is greater than zero. This means that when n increases, (1-ε/n)^n increases.
• At n = 1 the function equals 1-ε, so for all n >= 1 we get
  • 1-ε <= (1-ε/n)^n
• From this it follows that (1-ε)y* <= z, because from the previous inequality y*(1-ε/n)^n <= z <= y* we can now get
  • (1-ε)y* <= y*(1-ε/n)^n <= z <= y*
• and from this we get (1-ε)y* <= z
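The monotonicity claim can be checked numerically. The following is only a sanity check for one assumed value, ε = 0.3, not a proof; the name `shrink_factor` is made up for this illustration.

```python
def shrink_factor(epsilon, n):
    """Return (1 - epsilon/n)^n, the accumulated factor after n trims."""
    return (1 - epsilon / n) ** n


epsilon = 0.3
values = [shrink_factor(epsilon, n) for n in range(1, 11)]

# The factor starts at 1 - epsilon for n = 1 and increases with n.
for n, value in enumerate(values, start=1):
    print(n, round(value, 4))
```

The printed values rise from 1-ε toward e^(-ε), so they never fall below 1-ε.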

Proof cont’d
• So we have just proved that z >= (1-ε)m, meaning that the solution the approximation algorithm gives us is pretty close (within a factor of 1-ε) to the solution from the exponential-time algorithm
• We will now show that the approximation algorithm is a fully polynomial-time approximation scheme, thereby proving that it runs in polynomial time with respect to n and 1/ε instead of exponential time, while also yielding an answer reasonably close to the correct answer.

Proof cont’d
• A function is polylogarithmically bounded if f(n) = (log n)^O(1)
• In the same way, a function is polynomially bounded if f(n) = n^O(1); this is what we want to show for the length of our list
• The trimmed list T is what is growing, so we hope to prove that its length is polynomially bounded.
• Well, we know that after trimming, successive elements mi and mi+1 differ by a factor of more than 1/(1-δ), where δ = ε/n is the trimming parameter.
• So the number of elements in the list is at most about log_{1/(1-ε/n)} t, where 1/(1-ε/n) is the base of the logarithm and t, the target value (called x earlier), is what we're taking the log of

Proof cont’d
• Changing the base of the log using log_b a = log_c a / log_c b, we get
  $$\log_{1/(1-\varepsilon/n)} t = \frac{\ln t}{\ln\bigl(1/(1-\varepsilon/n)\bigr)} = \frac{\ln t}{\ln (1-\varepsilon/n)^{-1}} = \frac{\ln t}{-\ln(1-\varepsilon/n)}$$
• Since we know ln(1+x) <= x, taking x = -ε/n gives ln(1-ε/n) <= -ε/n, so
  $$\frac{\ln t}{-\ln(1-\varepsilon/n)} \le \frac{\ln t}{-(-\varepsilon/n)} = \frac{n \ln t}{\varepsilon}$$
• This is a polynomial with respect to n and 1/ε (ln t is at most proportional to the number of bits needed to write t), so our approximation algorithm is a fully polynomial-time approximation scheme
• Note: the proof presented here is from Cormen et al.
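The length bound can be checked empirically. The sketch below re-implements the merge-and-trim loop and compares the largest trimmed list against n ln t / ε + 2, where the extra 2 allows for the element 0 and the first positive element. The input numbers, target, and ε are hypothetical values chosen only for this illustration.

```python
import math
from fractions import Fraction


def trimmed_list_sizes(numbers, t, epsilon):
    """Return the length of the trimmed list after each input number."""
    delta = Fraction(epsilon) / len(numbers)
    current, sizes = [0], []
    for s in numbers:
        merged = sorted(set(current) | {v + s for v in current})
        kept = [merged[0]]
        for y in merged[1:]:
            if kept[-1] < (1 - delta) * y:  # keep y only if not represented
                kept.append(y)
        current = [v for v in kept if v <= t]
        sizes.append(len(current))
    return sizes


numbers = list(range(101, 121))  # 20 hypothetical inputs
t, epsilon = 1500, "0.5"
bound = len(numbers) * math.log(t) / float(Fraction(epsilon)) + 2
sizes = trimmed_list_sizes(numbers, t, epsilon)
print(max(sizes), "<=", round(bound, 1))
```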

Proof cont’d
• Theorem: There is no fully polynomial-time approximation scheme for a strongly NP-complete problem, unless NP = P (Theorem from Approximation Algorithms for NP-hard Problems)
• The reason we could prove that the approximation algorithm for the subset-sum problem is a fully polynomial-time approximation scheme is that subset-sum is a weakly NP-complete problem

Speedup: Some Times
Subset-sum problem - Comparison of Algorithms

Algorithm            Subset sum   Time
GS                   25554        0.05
DPS                  25557        240.24
APPROX_SUBSET_SUM    25436        12.31
DIOPHANT             25557        0.82

GS = Greedy, DPS = exponential time algorithm, APPROX_SUBSET_SUM = the algorithm I presented, and DIOPHANT = algorithm by the author of the web page
Source for this table: http://www.geocities.com/zabrodskyvlada/aat/a_suba.html#approx_subset_sum

Conclusion
• Approximation algorithms are one way to come up with an answer in a reasonable (polynomial) amount of time for an NP-complete problem

References: Basically All the Info in this Presentation Came from the Sources Below
• Sources:
  • Main source: Cormen, T.H., Leiserson, C.E., and Rivest, R.L. (1999), Introduction to Algorithms, Chapter 37.
  • Hochbaum, D.S. (1997), Approximation Algorithms for NP-hard Problems, Introduction, pp. 9-10 and pp. 359-365
  • Ileana
  • Lecture notes from class
  • http://www.geocities.com/zabrodskyvlada/aat/a_suba.html#approx_subset_sum
  • http://cse.hanyang.ac.kr/~jmchoi/class/1996-2/algorithm/classnote/node7.html
• What I did:
  • Web and library research
  • Asked Ileana :-)
