Approximation Algorithms: The Subset-sum Problem


Victoria Manfredi (252a-ad), Smith College, CSC 252: Algorithms, December 12, 2000

Introduction

NP-completeness and approximation algorithms

Notation associated with approximation algorithms

The NP-complete subset-sum problem and the optimization problem associated with it

Proof that the approximation algorithm for the optimization problem is a fully polynomial-time approximation scheme

Please note: the information and ideas in this presentation were gathered from various sources. For complete references, please see the last slide.

NP-Complete Problems

NP-complete problems are those that are both

in NP (nondeterministic polynomial time):

the answer can be described in polynomial space

the answer can be verified correct or not in polynomial time

and NP-hard:

solving the problem in polynomial time would imply that every other problem in NP can be solved in polynomial time

No polynomial-time algorithm is currently known for any NP-complete problem

NP-Complete Problems

Because NP-complete problems appear in many everyday problems that need to be solved (the travelling salesman problem in flight scheduling, for example), they cannot be ignored. We therefore need an acceptable way to solve them: we want a polynomial-time algorithm that can do the job the exponential-time algorithm is doing.

It is unlikely that we will find a polynomial-time algorithm that solves an NP-complete problem exactly. If you did, you would be showing that P = NP, and you would of course then be rich and famous.

The solution: approximation algorithms

Approximation Algorithms

Say you are splitting a piece of cake with someone. Dividing it so that each of you got exactly half the cake, right down to the last atom, would be pretty hard and would take an awfully long time (this is our exponential-time algorithm for the NP-complete problem), although it would do the job exactly right. But do we ever do this? No, we estimate and say that looks like about half.

Approximation algorithms do the same sort of estimation in a more defined way, and are proved to give a good estimate in a reasonable (polynomial, for example) time

Approximation Algorithms

When talking about approximation algorithms, there is some terminology we need to know:

Ratio bound

Relative error

Relative error bound

Approximation scheme

Polynomial-time approximation scheme

Fully polynomial-time approximation scheme

Ratio Bound

Relates to how much bigger the correct answer is than the answer from the approximation algorithm (if a max problem) OR how much smaller the correct answer is (if a min problem)

max(C/C*, C*/C) <= p(n)

where C is the optimization answer given by the approximation algorithm and C* is the correct (optimal) answer given by the exact exponential-time algorithm

note: 0 < C* <= C if min and 0 < C <= C* if max

p(n) is never less than one because one of C/C* or C*/C will always be greater than or equal to one

p(n) is the function bounding how big the ratio between C and C* can be. It may depend on the input size n, hence p(n), but it also may not, and is then just p

Relative Error

Is how far the answer from the approximation algorithm is from the correct answer

|C-C*| / C*

For example, if the optimization answer given by the approximation algorithm, C, equals 10 and the correct answer, C*, is 8 (we’re doing a minimization), then the relative error is (10-8)/8 = 2/8 = 0.25

Relative error is always either a positive number or zero (because of the absolute value in the equation). This makes sense because what would a negative relative error mean?
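The ratio and the relative error defined on these two slides can be computed directly. The following is a minimal sketch; the function names `ratio` and `relative_error` are made up for illustration, and both assume positive answers C and C*.

```python
def ratio(c: float, c_star: float) -> float:
    """Return max(C/C*, C*/C); this is always >= 1 for positive answers."""
    if c <= 0 or c_star <= 0:
        raise ValueError("C and C* must be positive")
    return max(c / c_star, c_star / c)


def relative_error(c: float, c_star: float) -> float:
    """Return |C - C*| / C*; this is never negative."""
    if c_star <= 0:
        raise ValueError("C* must be positive")
    return abs(c - c_star) / c_star


# The minimization example from the slide: C = 10, C* = 8.
print(ratio(10, 8))           # 1.25
print(relative_error(10, 8))  # 0.25
```

For a minimization problem the ratio equals one plus the relative error, which is why the two numbers above differ by exactly 1.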

Relative Error Bound

Is a bound on how far the answer from the approximation algorithm is from the correct answer

|C-C*| / C* <= ε(n)

This bound can either be a function of the input size n, as in ε(n), with the relative error changing according to the size of n, or it can be a constant, as in just ε

Approximation Schemes

Approximation scheme: approximation algorithm that also requires a relative error bound, ε > 0, as well as the input data

Polynomial-time approximation scheme: approximation scheme that runs in polynomial time in the input size n (for any fixed ε)

Fully polynomial-time approximation scheme: approximation scheme that runs in polynomial time in both the input size n and 1/ε

Why 1/ε? To capture how decreasing the relative error, ε, increases the running time

The Subset-sum Decision Problem

The subset-sum problem is a decision problem that asks: given a set of numbers and a number x, can some subset of the numbers in the set be added together to equal x?

The subset-sum problem is based on the knapsack problem, but is simpler, although both are NP-complete. In the knapsack problem you’re looking at both the size and the profit of the objects, while in the subset-sum problem you’re just looking at the size of the objects

Naïve solution: come up with all possible combinations of the numbers in the set, sum each one and see if any of the resulting sums equals x. With n numbers there are 2^n combinations, so O(2^n) sums have to be checked
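The naïve solution can be sketched in a few lines. This is only an illustration of the brute-force idea, not an algorithm from the sources; the function name `subset_sum_exists` is hypothetical.

```python
from itertools import combinations


def subset_sum_exists(numbers, x):
    """Try every subset of `numbers` and report whether one sums to x."""
    for size in range(len(numbers) + 1):
        # combinations() yields every subset of the given size,
        # so over all sizes we visit all 2^n subsets.
        for subset in combinations(numbers, size):
            if sum(subset) == x:
                return True
    return False


print(subset_sum_exists([1, 4, 7], 11))  # True, since 4 + 7 = 11
print(subset_sum_exists([1, 4, 7], 14))  # False
```

The empty subset is included, so a target of 0 is always reachable in this sketch.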

The Subset-sum Optimization Problem

The optimization problem associated with the subset-sum problem asks, given a set of numbers and a number x, determine the subset that sums to the largest number less than or equal to x. Since the decision problem associated with it is NP-complete, the optimization problem is at least as hard (NP-hard): an exact answer to the optimization problem also answers the decision problem.

Uses of subset-sum algorithm: for example, packing a truck maximally

The approximation algorithm is for both the subset-sum optimization problem and decision problem

Solving Subset-sum Optimization Problem in Exponential Time

Start with an x, a set E = {0} and the set to find the subset of, S = {s1,s2,…,sn}

Define the set operation S+i to equal {s1+i,s2+i,…,sn+i} (add i to every element of the set; the same operation is applied to the sets Ei below)

Then do,
E1 = (E + s1) U E
E2 = (E1 + s2) U E1
…
En = (En-1 + sn) U En-1
where each Ei is kept as a sorted list

At each step, if any element in Ei is greater than x, then remove that number from the set

At the end, the largest number in En is the answer

Notice that the set Ei can double in size at every step, so En grows exponentially

Solving Subset-sum Optimization Problem in Exponential Time

x = 14, E = {0} and S = {1,4,7}

Then,

E1 = {0+1} U {0} = {0,1}   Set size = 2
E2 = {0+4,1+4} U {0,1} = {0,1,4,5}   Set size = 4
E3 = {0+7,1+7,4+7,5+7} U {0,1,4,5} = {0,1,4,5,7,8,11,12}   Set size = 8

2 + 2^2 + 2^3 + … + 2^n = 2(2^n - 1)/(2-1) = 2^(n+1) - 2 = O(2^n)

We did obtain the correct answer (12) but we had to use an exponential amount of space in order to do so

Note, in this example the space use doubles at each step; in other examples, this is not necessarily the case
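The merge-and-discard steps from the last two slides can be sketched as follows. This is a minimal illustration assuming non-negative integers; the name `exact_subset_sum` is chosen here for convenience and is not from the slides.

```python
def exact_subset_sum(s, x):
    """Return the largest subset sum of `s` that does not exceed x."""
    sums = [0]  # E = {0}: the sum of the empty subset
    for value in s:
        shifted = [e + value for e in sums]         # E + s_i
        merged = sorted(set(sums) | set(shifted))   # (E + s_i) U E, kept sorted
        sums = [e for e in merged if e <= x]        # discard sums greater than x
    return max(sums)


# The example from the slide: x = 14 and S = {1, 4, 7}.
print(exact_subset_sum([1, 4, 7], 14))  # 12
```

The list `sums` can double on every pass through the loop, which is where the exponential space use comes from.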

Solving Subset-sum Optimization Problem in Polynomial Time

How do we avoid exponential space use? Trim the set Ei at each step. Get rid of some larger numbers in the set by having smaller numbers represent them

Trimming:

Our trimming parameter is δ, with 1 > δ > 0. A number y can be removed if a smaller number z that stays in the set satisfies δ >= (y-z)/y

To see if a number should stay or go: starting from the first number in the set, if the last number kept is greater than or equal to (1-δ) times the following number, then the following number can be removed. The first element of the set always stays.

Trimming the set {3,5,6,8} with δ = 0.2, that is, 20% error, we get

(5-3)/5 = 0.4 > 0.2   keep 5 (3, the first element, always stays)
(6-5)/6 ≈ 0.17 <= 0.2   don’t keep 6 (let 5 represent it)
(8-5)/8 = 0.375 > 0.2   keep 8 (compared with 5, the last number kept)
Final set {3,5,8}
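Trimming can be sketched as a single pass over the sorted list. The sketch below assumes the rule that a number y is dropped when the last kept number z satisfies (y-z)/y <= δ, which is the same as z >= (1-δ)y; the function name `trim` and its parameter names are illustrative.

```python
def trim(sorted_values, delta):
    """Drop every value that is within relative error `delta` of the last kept value."""
    if not sorted_values:
        return []
    kept = [sorted_values[0]]  # the first element always stays
    for y in sorted_values[1:]:
        # Keep y only if the last kept value is too far below it to represent it.
        if kept[-1] < (1 - delta) * y:
            kept.append(y)
    return kept


# The example from the slide: trim {3, 5, 6, 8} with a 20% error.
print(trim([3, 5, 6, 8], 0.2))  # [3, 5, 8]
```

Each value is compared with the last value kept rather than with its immediate neighbour, so a removed number never acts as a representative for a later one.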

Solving Subset-sum Optimization Problem in Polynomial Time

How do we choose δ? Remember ε from the relative error bound? We choose δ to be ε/n, where n is the number of elements in the set and 0 < ε < 1

Looking at our example from before,

x = 14, T = {0}, S = {1,4,7}, n = 3, ε = 0.3 so δ = ε/n = 0.1

Then,

M1 = {0+1} U {0} = {0,1}
T1: {0,1}   Set size = 2 (2 before)

M2 = {0+4,1+4} U {0,1} = {0,1,4,5}
T2: {0,1,4,5}   Set size = 4 (4 before); (5-4)/5 = 0.2 > 0.1, so 5 stays

M3 = {0+7,1+7,4+7,5+7} U {0,1,4,5} = {0,1,4,5,7,8,11,12}
T3: {0,1,4,5,7,8,11} where 11 rep 12, since (12-11)/12 ≈ 0.08 <= 0.1   Set size = 7 (8 before)

We get 11 now, instead of 12 as the answer. But 11 is at least (1-ε) times 12, that is 0.7 * 12 = 8.4, so it is acceptable.
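Putting merging, trimming and discarding together gives the approximation algorithm. The following is a sketch under the assumptions used on these slides (non-negative integers, δ = ε/n, a value is trimmed when the last kept value is at least (1-δ) times it); the function names are illustrative rather than taken from the sources.

```python
def trim(sorted_values, delta):
    """Drop every value that is within relative error `delta` of the last kept value."""
    kept = [sorted_values[0]]
    for y in sorted_values[1:]:
        if kept[-1] < (1 - delta) * y:
            kept.append(y)
    return kept


def approx_subset_sum(s, x, eps):
    """Return a subset sum z with (1 - eps) * optimum <= z <= optimum <= x."""
    if not s:
        return 0
    delta = eps / len(s)  # trimming parameter: epsilon divided by n
    t_list = [0]
    for value in s:
        merged = sorted(set(t_list) | {e + value for e in t_list})  # M_i
        t_list = [e for e in trim(merged, delta) if e <= x]         # T_i
    return max(t_list)


# The example from the slide: x = 14, S = {1, 4, 7}, epsilon = 0.3.
print(approx_subset_sum([1, 4, 7], 14, 0.3))  # 11
```

The exact answer for this input is 12, and 11 >= 0.7 * 12 = 8.4, so the result respects the relative error bound.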

Proof that Approximation Algorithm is a Fully Polynomial-time Approximation Scheme

If we can prove that the approximation algorithm is a fully polynomial-time approximation scheme, we will be showing that

the algorithm runs in polynomial time for input n and 1/ε

We want to show this because it would mean that the approximation algorithm is using polynomial time/space instead of exponential and would therefore be a practical algorithm

Proof cont’d

Our trimmed set from the approximation algorithm is a subset of the untrimmed set from the exponential-time algorithm. That is, Ti is a subset of Ei

This means that the answer we find using the approximation algorithm, some z, is the sum of a subset of the set we were given.

If this is a good approximation algorithm, then z should be at least as big as (1-ε) times the answer we would get using the exponential-time algorithm, which we’ll call m (because of our max relative error inequality C*(1-ε) <= C; we are looking for at least as big because this is a maximization problem)

Proof cont’d

We must therefore prove that z >= (1-ε)m

(If you remember from before, from relative error bounds)

|C*-C| / C* <= ε(n)   (C*-C instead of C-C* because max)
|C*-C| / C* = (C*-C)/C* = C*/C* - C/C* <= ε(n)
1 - C/C* <= ε(n)
1 - ε(n) <= C/C*

which gives C >= (1-ε)C*, since in our example ε is not a function of n (the size of the input). This is what we are working with and it corresponds with z >= (1-ε)m

We want to show that z and m are very close together. ε is between 0 and 1, so with ε = 0.11, for example, we want to show that z is at least 0.89*m

Proof cont’d

Since δ was chosen to be ε/n, the relative error between a number in T and the number in M it represents is no more than ε/n. The errors from the n trimming steps accumulate, and the following slides show that the relative error between the correct answer and the approximated answer will be no more than ε.

So, from (y-z)/y <= δ (see slide on Solving Subset-sum in Polynomial Time)

δ >= (y-z)/y = y/y - z/y = 1 - z/y
δ + z/y >= 1
z/y >= 1 - δ
z >= (1-δ)y
y(1-δ) <= z

Proof cont’d

And since we know z <= y, we get

y(1-δ) <= z <= y

after one trimming step. After i steps, with δ = ε/n, this becomes

y(1-ε/n)^i <= z <= y

It has been shown through induction on i that for every y in Ei there is a z in Ti that fits this inequality

Let y* be the best answer. Then we get

y*(1-ε/n)^n <= z <= y*

and the approximation algorithm gives the largest z that fits this

Proof cont’d

By taking the derivative of the function in the above inequality, (1-ε/n)^n, with respect to n, we find that it is greater than zero. This means that when n increases, (1-ε/n)^n increases. At n = 1 the two sides are equal, so when n > 1 we get

1-ε < (1-ε/n)^n

(the left side remains the same; the right side increases when n increases)

From this it follows that (1-ε)y* <= z, because from the previous inequality y*(1-ε/n)^n <= z <= y* we can now get

(1-ε)y* <= y*(1-ε/n)^n <= z <= y*

and from this we get (1-ε)y* <= z
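The claim that (1-ε/n)^n grows with n and never drops below 1-ε can be checked numerically. This is only a sanity check for one value of ε, not a proof; the helper name `lower_bound_factor` is made up for this sketch.

```python
def lower_bound_factor(eps, n):
    """Return (1 - eps/n)^n, the factor multiplying y* in the lower bound on z."""
    return (1 - eps / n) ** n


eps = 0.3
factors = [lower_bound_factor(eps, n) for n in range(1, 8)]

# Print n next to the factor so the growth is visible.
for n, factor in enumerate(factors, start=1):
    print(n, round(factor, 4))

# Each factor is larger than the one before, and none is below 1 - eps.
print(all(a < b for a, b in zip(factors, factors[1:])))
print(all(factor >= 1 - eps for factor in factors))
```

At n = 1 the factor is exactly 1-ε, which is why the inequality is only strict for larger n.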

Proof cont’d

So we have just proved that z >= (1-ε)m, meaning that the solution the approximation algorithm gives us is pretty close (within a factor of 1-ε) to the solution from the exponential time algorithm

We will now show that the approximation algorithm is a fully polynomial-time approximation scheme, thereby proving that it runs in polynomial time with respect to n and 1/ε instead of exponential time, while also yielding an answer reasonably close to the correct answer.

Proof cont’d

A function is polylogarithmically bounded if f(n) = (log n)^O(1)

We can use the same idea to show that a function is polynomially bounded: instead of getting a polylogarithmic answer we’ll get a polynomial answer

The trimmed list T is what is growing, so we hope to prove that its length is polynomially bounded. Well, we know that consecutive numbers in T differ by a factor of more than 1/(1-δ), where δ = ε/n was the trimming factor.

So our function is

f = log_(1/(1-ε/n)) t

where 1/(1-ε/n) is the base of the logarithm and t is what we’re taking the log of (the target sum, called x on the earlier slides; no number in the list is larger than it)

Proof cont’d

Changing the base of the log using log_b a = log_c a / log_c b, we get

log_(1/(1-ε/n)) t = log_e t / log_e(1/(1-ε/n))
                  = ln t / ln(1/(1-ε/n))
                  = ln t / ln((1-ε/n)^-1)
                  = ln t / (-1 * ln(1-ε/n))

Since we know ln(1+x) <= x, then if -ε/n is our x, we get ln(1-ε/n) <= -ε/n, and so

ln t / (-1 * ln(1-ε/n)) <= ln t / (-(-ε/n)) = (n * ln t)/ε

This is a polynomial with respect to n, 1/ε and ln t (which is proportional to the number of digits needed to write t), so our approximation algorithm is a fully polynomial-time approximation scheme
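The final inequality, log base 1/(1-ε/n) of t <= n ln t / ε, can be checked numerically for a few parameter choices. This is a sketch for illustration only; the function names and the sample values of n, ε and t are assumptions, not figures from the sources.

```python
import math


def list_length_log(n, eps, t):
    """Return the logarithm of t in base 1/(1 - eps/n)."""
    base = 1 / (1 - eps / n)
    return math.log(t) / math.log(base)


def list_length_bound(n, eps, t):
    """Return the polynomial upper bound n * ln(t) / eps."""
    return n * math.log(t) / eps


# Hypothetical sample parameters (n, epsilon, t).
for n, eps, t in [(3, 0.3, 14), (10, 0.1, 1000), (50, 0.5, 10**6)]:
    exact = list_length_log(n, eps, t)
    bound = list_length_bound(n, eps, t)
    print(n, eps, t, round(exact, 2), round(bound, 2), exact <= bound)
```

The bound is only slightly larger than the logarithm itself, because ln(1+x) is very close to x when x = -ε/n is small.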

Note: the proof presented here is from Cormen et al

Proof cont’d

Theorem: There is no fully polynomial-time approximation scheme for a strongly NP-complete problem, unless NP = P (Theorem from Approximation Algorithms for NP-hard Problems)

The reason we could prove that the approximation algorithm for the subset-sum problem is a fully polynomial-time approximation scheme is that subset-sum is a weakly NP-complete problem

Speedup: Some Times

Subset-sum problem - Comparison of Algorithms

Algorithm            Subset sum   Time
GS                   25554        0.05
DPS                  25557        240.24
APPROX_SUBSET_SUM    25436        12.31
DIOPHANT             25557        0.82

GS = Greedy, DPS = exponential time algorithm, APPROX_SUBSET_SUM = the algorithm I presented, and DIOPHANT = algorithm by the author of the web page

Source for this table: http://www.geocities.com/zabrodskyvlada/aat/a_suba.html#approx_subset_sum

Conclusion

Approximation algorithms are one way to come up with an answer in a reasonable (polynomial) amount of time for an NP-complete problem

References: Basically All the Info in this Presentation came from the below Sources

Sources:

Main source: Cormen, T.H., Leiserson, C.E., and Rivest, R.L. (1999), Introduction to Algorithms, Chapter 37.

Hochbaum, D.S. (1997), Approximation Algorithms for NP-hard Problems, Introduction, pp. 9-10 and pp. 359-365.

Ileana, lecture notes from class

http://www.geocities.com/zabrodskyvlada/aat/a_suba.html#approx_subset_sum

http://cse.hanyang.ac.kr/~jmchoi/class/1996-2/algorithm/classnote/node7.html

What I did:

Web and library research

Asked Ileana :-)
