Chapter 7 : NP-Completeness
Reference : Computers and Intractability: A Guide to the Theory of NP-Completeness
by Garey and Johnson, W. H. Freeman & Company, 1979
7.1 General Problems, Input Size and Time Complexity
Time complexity of algorithms :
polynomial time algorithm ("efficient algorithm") vs.
exponential time algorithm ("inefficient algorithm")
f(n) \ n        10              30              50
n               0.00001 sec     0.00003 sec     0.00005 sec
n^5             0.1 sec         24.3 sec        5.2 mins
2^n             0.001 sec       17.9 mins       35.7 yrs
Input size and running time of an algorithm, revisited
The running time of an algorithm is expressed in terms of the size of a problem
instance.
Example : for sorting algorithms, we use "n" as the input size; for graph
algorithms, we use a combination of |E| and |V|.
In a computer, we use binary bits to encode numbers.
Precisely, the running time should be expressed in terms of the total number of
input characters (total number of binary bits)
Example : for a sorting algorithm, the input is a set of n numbers {x1, x2, …, xn}
Let L be the largest number in {x1, x2, …, xn}. The size of L in binary is
lg L bits.
The input size is no more than (n * lg L) bits
For mergesort, the total running time for input size n, as given in Chapter 1.2,
is T(n) = n * lg n, where each step is a comparison of (lg L)-bit numbers
So, the running time in terms of the "real" input size m = n * lg L is
T(n * lg L) = (n * lg L) * lg n
            <= (n * lg L) * lg (n * lg L)
i.e. T(m) <= m lg m, substituting "n * lg L" for "m"
We obtain the same result.
In fact, most algorithms give the same running time results either way.
Let us consider the following prime number checking algorithm :
prime(int n) { // assume n > 2
    i = 2;
    while ( i < n ) {
        if (n % i == 0) return "n is not a prime";
        i++;
    }
    return "n is a prime";
}
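A direct Python transcription of this pseudocode (a sketch; the name `is_prime` and the step counter are ours, not from the notes) makes the point below concrete: the loop runs about n times, where n is the magnitude of the input, not its bit length.

```python
def is_prime(n):
    """Trial division exactly as in the pseudocode; assumes n > 2.
    Returns (answer, steps), where steps counts loop iterations."""
    steps = 0
    i = 2
    while i < n:
        steps += 1
        if n % i == 0:
            return False, steps   # i divides n, so n is not prime
        i += 1
    return True, steps

# For a prime input the loop runs n - 2 times, i.e. about 2^(input bits):
answer, steps = is_prime(97)
```

For n = 97 the input is 7 bits long, yet the loop iterates 95 times.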
The input size is lg n bits
The magnitude of the input is n
The total number of steps is at most n
The running time (in terms of the input size) is exponential, i.e. n = 2^(lg n)
steps, which is exponential in the input size of lg n bits
An algorithm whose running time is bounded by a polynomial function of the input
size and magnitude is called a pseudopolynomial time algorithm
The running time of the above algorithm is exponential time and also
pseudopolynomial time.
Another pseudopolynomial time algorithm : the O(nM) running time of the
0/1 Knapsack algorithm in Chapter 5.4
Intractable problem : a "hard" problem for a computer to solve. Most likely,
there is no polynomial time algorithm to solve the problem.
Decision problem: The solution to the problem is "yes" or "no". Most
optimization problems can be phrased as decision problems (still have the same
time complexity).
Example : Assume we have a decision algorithm X for the 0/1 Knapsack problem,
i.e. algorithm X returns "Yes" or "No" to the question "is there a solution with
profit >= P subject to knapsack capacity <= M?"
We can repeatedly run algorithm X for various profits to find an optimal
solution. Example : use binary search to get the optimal profit, with at most
lg(total profit) runs.
Min Bound Optimal Profit Max Bound
|_____________________|______________________|
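The binary search over profits can be sketched as follows, assuming a hypothetical decision oracle `knapsack_decision` (implemented here by brute force only so the sketch is self-contained; items are (weight, profit) pairs, and all names are ours).

```python
from itertools import combinations

def knapsack_decision(items, M, P):
    """Hypothetical decision algorithm X : is there a subset with
    total weight <= M and total profit >= P? (brute force stand-in)"""
    for r in range(len(items) + 1):
        for sub in combinations(items, r):
            if sum(w for w, p in sub) <= M and sum(p for w, p in sub) >= P:
                return True
    return False

def optimal_profit(items, M):
    """Binary search on the profit between 0 and the total profit,
    calling the oracle O(lg(total profit)) times."""
    lo, hi = 0, sum(p for _, p in items)
    while lo < hi:                      # invariant: optimum lies in [lo, hi]
        mid = (lo + hi + 1) // 2
        if knapsack_decision(items, M, mid):
            lo = mid                    # profit mid is achievable
        else:
            hi = mid - 1
    return lo

best = optimal_profit([(2, 3), (3, 4), (4, 5), (5, 6)], M=5)   # best is 7
```

With a polynomial-time oracle the whole search would be polynomial; the brute-force oracle above is only for illustration.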
7.2 The Classes of P and NP
The class P and Deterministic Turing Machine
Given a decision problem X, if there is a polynomial time Deterministic
Turing Machine program that solves X, then X belongs to P
Informally, there is a polynomial time algorithm to solve the problem
The class NP and Non-deterministic Turing Machine
Given a decision problem X, if there is a polynomial time Non-deterministic
Turing Machine program that solves X, then X belongs to NP
Given a decision problem X. For every instance I of X, (a) guess a solution S
for I, and (b) check "is S a solution to I?". If (a) and (b) can be done in
polynomial time, then X belongs to NP.
Obvious : P ⊆ NP, i.e. a problem in P does not need to "guess a solution". The
correct solution can be computed in polynomial time.
Some problems which are in NP, but may not be in P :
0/1 Knapsack Problem
PARTITION Problem : Given a finite set of positive integers Z.
Question : Is there a subset Z' of Z such that the sum of the numbers in Z'
equals the sum of the numbers in Z-Z', i.e. ∑Z' = ∑(Z-Z')?
Two-Processor Non-Preemptive Schedule Length Problem : Given a set of n
tasks T with processing times {p1, p2, …, pn}, two processors and a positive
number L.
Question : Is there a nonpreemptive schedule for T on two processors such
that the schedule length <= L?
How to make $1,000,000!!
One of the most important open problems in theoretical computer science :
Is P = NP ?
See : http://www.claymath.org/millennium/
Most likely "No". Currently, there are many known problems in NP, and no one
has been able to show that any one of them is in P.
7.3 NP-Complete Problems
Stephen Cook introduced the notion of NP-Complete problems. This makes the
question "P = NP ?" much more interesting to study.
The following are several important ideas presented by Cook :
1. Polynomial Transformation ("∝")
L1 ∝ L2 : there is a polynomial time transformation that transforms an
arbitrary instance of L1 to some instance of L2.
If L1 ∝ L2, then L2 in P implies L1 in P (or : L1 not in P ⇒ L2 not in P)
If L1 ∝ L2 and L2 ∝ L3, then L1 ∝ L3
2. Focus on the class of NP – decision problems only. Many intractable
problems, when phrased as decision problems, belong to this class.
3. L is NP-Complete if L ∈ NP and for all other L' ∈ NP, L' ∝ L
If a problem in NP-complete can be solved in polynomial time, then all
problems in NP can be solved in polynomial time.
If some problem in NP cannot be solved in polynomial time, then no problem
in NP-complete can be solved in polynomial time.
So, if an NP-complete problem is in P then P = NP;
if P != NP then all NP-complete problems are in NP-P
Question : how can we obtain the first NP-complete problem L?
4. Cook's Theorem : SATISFIABILITY is NP-Complete.
Instance : Given a set of variables U and a collection of clauses C over U.
Question : Is there a truth assignment for U that satisfies all clauses in C?
Example :
U = {x1, x2}
C1 = {(x1, ¬x2), (¬x1, x2)}
= (x1 OR NOT x2) AND (NOT x1 OR x2)
if x1 = x2 = True then C1 = True
C2 = (x1, x2)(x1, ¬x2)(¬x1) is not satisfiable
With Cook's Theorem, we have the following property :
Lemma : If L1 and L2 belong to NP, L1 is NP-complete, and L1 ∝ L2, then L2 is
NP-complete.
i.e. L1, L2 ∈ NP and for all other L' ∈ NP, L' ∝ L1 and L1 ∝ L2 ⇒ L' ∝ L2
So now, to prove a problem L to be NP-complete, we need to
show L is in NP
select a known NP-complete problem L'
construct a polynomial time transformation f from L' to L
prove the correctness of f and that f is a polynomial transformation
Some NP-complete problems :
SATISFIABILITY
0/1 Knapsack
PARTITION
Two-Processor Non-Preemptive Schedule Length
CLIQUE : Instance : An undirected graph G=(V, E) and a positive integer J <= |V|
Question : Does G contain a clique (complete subgraph) of size J or more?
7.4 Proving NP-Completeness Results
Example 1 : Show that the PARTITION problem is NP-complete.
Given that the Sum of Subsets (SS) problem is a known NPC problem, show that
the PARTITION problem is NPC.
SS Problem
Instance : Let A = {a1, a2, …, an} be a set of n positive integers.
Question : Given M, is there a subset A' ⊆ A such that ∑A' = M?
PARTITION Problem
Instance : Given a finite set Z of m positive integers.
Question : Is there a subset Z' ⊆ Z such that ∑Z' = ∑(Z-Z')?
PARTITION is in NP
guess a subset Z'            O(m) // or use choice(1,m)
verify ∑Z' = ∑(Z-Z')?        O(m)
Total                        O(m)
SS ∝ PARTITION
Given an arbitrary instance of SS, i.e. A = {a1, a2, …, an} and M,
construct an instance of PARTITION as follows :
Z = {b1, b2, …, bn, bn+1, bn+2}, a set of m = n+2 positive integers,
where
bi = ai for 1 <= i <= n
bn+1 = M + 1
bn+2 = ∑A + 1 - M
Note : ∑bi = 2∑A + 2. Also, the transformation can be done in
polynomial time (based on the input sizes of A and M)
To show the transformation is correct : the SS problem has a solution if and
only if the PARTITION problem has a solution.
If the SS problem has a solution, then the PARTITION problem has a solution
assume A' is the solution for the SS problem; then
Z' = A' ∪ {bn+2} and Z-Z' = (A-A') ∪ {bn+1}
∑Z' = M + ∑A + 1 - M = ∑A + 1 = ∑(Z-Z')
If the PARTITION problem has a solution, then the SS problem has a
solution
if Z' is the solution, then ∑Z' = ∑A + 1
exactly one of bn+2 or bn+1 ∈ Z' (since bn+1 + bn+2 = ∑A + 2 > ∑Z')
if bn+2 ∈ Z', then A' = Z' - {bn+2} and ∑A' = M
if bn+1 ∈ Z', then use Z - Z' to obtain A'
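As a sanity check, the transformation can be tested by brute force on a small instance (a sketch; all function names are ours, and the brute-force checkers are only for tiny inputs).

```python
from itertools import combinations

def has_subset_sum(A, M):
    """Does some subset of A sum to exactly M? (brute force check)"""
    return any(sum(c) == M for r in range(len(A) + 1)
               for c in combinations(A, r))

def has_partition(Z):
    """Can Z be split into two parts with equal sums? (brute force)"""
    total = sum(Z)
    return total % 2 == 0 and has_subset_sum(Z, total // 2)

def ss_to_partition(A, M):
    """The transformation from the notes: append M+1 and sum(A)+1-M."""
    return list(A) + [M + 1, sum(A) + 1 - M]

# Spot check: yes-instances map to yes-instances, no-instances to no-instances
A = [3, 5, 8, 9]
for M in range(1, sum(A) + 1):
    assert has_subset_sum(A, M) == has_partition(ss_to_partition(A, M))
```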
Example 2 : Show that the Traveling Salesman (TS) Problem is NP-complete.
Given that the Hamiltonian Circuit (HC) problem is a known NPC problem, show
that the TS problem is NPC.
Hamiltonian Circuit (HC) problem
Instance : Given an undirected graph G=(V, E)
Question : Does G contain a Hamiltonian circuit, i.e. a sequence <v1, v2, …, vn>
of all vertices in V which forms a simple cycle?
Traveling Salesman (TS) Problem
Instance : Given an undirected complete graph G=(V, E) with distance d(i,j) >= 0
for each edge (i,j), i ≠ j, and a positive integer B.
Question : Is there a tour of all cities (a simple cycle containing all
vertices) with total distance no more than B?
TS is in NP
guess a tour, i.e. sequence of all vertices O(V)
verify that it is a cycle covering all vertices and total distance <= B O(E)
HC ∝ TS
Given an arbitrary instance of HC, i.e. G=(V, E),
construct an instance of TS as follows :
G' = (V, E'), where (u,v) ∈ E' for all u, v ∈ V with u ≠ v
d(u,v) = 0 if (u,v) ∈ E
d(u,v) = 1 if (u,v) ∉ E
and B = 0
Note : The transformation can be done in polynomial time (based on
input size V and E)
To show the transformation is correct : The HC problem has a solution if and
only if the TS problem has a solution.
If HC problem has a solution, then TS problem has a solution
Assume <v1, v2, …, vn> is the solution for HC
It is a simple cycle which contains all vertices
Each edge (u,v) in this cycle has d(u,v) = 0
Total distance is 0 <= B
⇒ a solution for TS
If the TS problem has a solution, then the HC problem has a solution
A tour with total distance <= B = 0 uses only edges with d(u,v) = 0, i.e.
edges of E, so it is a Hamiltonian circuit in G.
Example 3 : Show that the Vertex Cover (VC) Problem is NP-complete
(Optional)
Given that the 3SAT problem is NPC, show that the VC problem is NPC.
3SAT Problem
Instance : Given a set of variables U = {u1, u2, …, un} and a collection of clauses
C = {c1, c2, …, cm} over U such that | ci | = 3 for 1 <= i <= m.
Question : Is there a truth assignment for U that satisfies all clauses in C?
Note : 3SAT problem is a restricted problem of SATISFIABILITY problem.
Vertex Cover (VC) Problem
Instance : Given an undirected graph G=(V, E) and a positive integer K <= |V|
Question : Is there a vertex cover of size K or less for G, i.e. a subset V' ⊆ V
such that |V'| <= K and, for each (u,v) ∈ E, at least one of u or v ∈ V'?
VC is in NP
guess a set of vertices V' ⊆ V                                        O(V)
verify that |V'| <= K and, for each (u,v) ∈ E, u ∈ V' or v ∈ V'       O(V*E)
3SAT ∝ VC
Given an arbitrary instance of 3SAT, i.e. U = {u1, u2, …, un} and C = {c1, c2, …,
cm}, construct an instance of VC as follows :
G = (V, E) and K = n+2m
V = Vu ∪ Vc
Vu = {u1t, u1f, u2t, u2f, …, unt, unf} and
Vc = {a11, a12, a13} ∪ {a21, a22, a23} ∪ … ∪ {am1, am2, am3}
E = Eu ∪ Ec ∪ Euc
Eu = {(u1t, u1f), (u2t, u2f), …, (unt, unf)}
Ec = {(a11, a12), (a12, a13), (a13, a11)} ∪ … ∪ {(am1, am2), (am2, am3),
(am3, am1)}
Assume ci = (xi, yi, zi) for 1 <= i <= m,
and find the corresponding literal vertices xi, yi, zi in Vu
Euc = {(x1, a11), (y1, a12), (z1, a13)} ∪ … ∪ {(xm, am1), (ym, am2), (zm, am3)}
|V| = 2n+3m and |E| = n+3m+3m
The transformation can be done in polynomial time (based on input sizes n
and m)
Example : U = {u1, u2, u3, u4} and C = {{u1, u3, u4}, {u1, u2, u4}}
[Figure : the variable vertices u1t, u1f, …, u4t, u4f in a row on top, and the
two clause triangles {a11, a12, a13} and {a21, a22, a23} below, with the edges
Eu, Ec, Euc as defined above]
Major Property : if there is a vertex cover set V' with |V'| <= K, then
(a) |V'| = n + 2m, and
(b) V' must include exactly 1 vertex from {uit, uif} for 1 <= i <= n in Vu and
exactly 2 vertices from {ai1, ai2, ai3} for 1 <= i <= m in Vc,
i.e. n vertices from Vu and 2m vertices from Vc
Look at the edges in Eu and Ec : a vertex cover set V' must include
at least 1 vertex from {uit, uif} for 1 <= i <= n, and
at least 2 vertices from {ai1, ai2, ai3} for 1 <= i <= m.
Since |V'| <= K = n + 2m ⇒ |V'| = K, with exactly these counts
To show the transformation is correct : the 3SAT problem has a solution if
and only if the VC problem has a solution.
If the VC problem has a solution, then the 3SAT problem has a solution
From the above property, V' contains n vertices from Vu and 2m vertices
from Vc
From Vu, the truth assignment for {u1, u2, …, un} in 3SAT is
ui = T if uit ∈ V'
ui = F if uif ∈ V', for 1 <= i <= n
To see that this is a solution for 3SAT :
we must show that for each ci = (xi, yi, zi), there is at least one literal
in {xi, yi, zi} which sets ci to TRUE, for 1 <= i <= m
From the above property, exactly 2 vertices from {ai1, ai2, ai3} are in V',
for 1 <= i <= m
They can cover only 2 of the 3 edges {(xi, ai1), (yi, ai2), (zi, ai3)} in Euc
assume the edge (xi, ai1) is not covered by the 2 vertices from {ai1, ai2, ai3}
then xi ∈ V', since V' is a vertex cover set
⇒ xi sets the clause ci to True, for 1 <= i <= m
If the 3SAT problem has a solution, then the VC problem has a solution
The vertex cover set V' with exactly n+2m vertices can be obtained as
follows :
From the truth assignment for {u1, u2, …, un} in 3SAT, we get n
vertices from Vu,
i.e. uit ∈ V' if ui = T; otherwise uif ∈ V', for 1 <= i <= n
This covers all edges in Eu and at least one edge in {(xi, ai1), (yi, ai2),
(zi, ai3)} for 1 <= i <= m
From Vc, include into V' the 2 vertices of each {ai1, ai2, ai3} whose Euc
edges are not already covered, for 1 <= i <= m. These 2 vertices cover all
edges {(ai1, ai2), (ai2, ai3), (ai3, ai1)} and also cover the edges in Euc
that were not covered previously.
Example 4 : Show that the Square Packing (SP) Problem is NP-complete
(Optional)
Motivation : truck loading, the design of VLSI chips, etc.
Square Packing Problem
Instance : Given a packing square S and a set of squares L = {s1, s2, ..., sn}
to be packed.
Question : Is there an orthogonal packing of L into S?
Note : orthogonal packing : the sides of the squares are parallel to the
vertical and horizontal axes
3-Partition Problem
Instance : Given a list A = {a1, a2, ..., a3z} of 3z positive integers such that
the sum of all numbers is zB and B/4 < ai < B/2 for each 1 <= i <= 3z.
Question : Can A be partitioned into z groups such that the sum of the
numbers in each group is B? Note : the bounds force each group to have
exactly 3 numbers
Proof : Refer to the research paper
Exercises
1. Use Vertex Cover Problem to show that the CLIQUE Problem is NP-complete.
2. Use PARTITION Problem to show that Two-Processor Nonpreemptive
Schedule Length Problem is NP-complete
3. Use 3SAT to show that the SET SPLITTING Problem is NP-complete
SET SPLITTING Problem
Given a finite set S = {a1, a2, …, am} and a collection C = {s1, s2, …, sk}
where si ⊆ S, 1 <= i <= k.
Question : Is there a partition of S into two subsets S1 and S2, i.e. S1 ∩ S2 = ∅
and S1 ∪ S2 = S, such that no si is entirely contained in S1 or S2?
4. Show the following packing problems are NP-complete
a set of squares into a larger rectangle.
a set of rectangles into a larger square.
Note : You should use different reductions.
Solution to problem # 1: Show that CLIQUE problem is NP-complete
Vertex cover problem : Given an undirected graph G=(V,E) and K <= |V|.
Question : Is there a vertex cover set V' ⊆ V such that |V'| <= K?
CLIQUE problem : Given an undirected graph G=(V,E) and J <= |V|.
Question : Does G contain a set V* ⊆ V such that |V*| >= J and the vertices
in V* form a complete subgraph (clique)?
a) Show that CLIQUE is in NP
1. Guess a set of vertices V*                O(V)
2. Check |V*| >= J                           O(V)
3. Check, for each pair u,v ∈ V*,            O(E)
   u ≠ v and (u,v) ∈ E
If both (2) and (3) are OK, return "Yes"; otherwise, return "No"
b) Show that Vertex Cover ∝ CLIQUE
Given an arbitrary instance of the vertex cover problem, i.e. G=(V,E) and K,
construct an instance of the CLIQUE problem :
G'=(V,E') and J = |V| - K
for each pair u,v ∈ V with u ≠ v :
if (u,v) ∈ E then (u,v) ∉ E'
if (u,v) ∉ E then (u,v) ∈ E'
Note : G' is called the complement graph of G
The transformation can be done in O(V*V)
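The complement-graph construction can be sketched in a few lines of Python (illustrative names; edges are stored as frozensets so (u,v) and (v,u) are the same edge).

```python
def complement(V, E):
    """Build G' = (V, E'): (u,v) is in E' iff u != v and (u,v) not in E."""
    E = {frozenset(e) for e in E}
    return {frozenset((u, v)) for u in V for v in V
            if u != v and frozenset((u, v)) not in E}

# Example: a path a-b-c; its complement has the single edge (a,c)
V = {"a", "b", "c"}
E = [("a", "b"), ("b", "c")]
assert complement(V, E) == {frozenset(("a", "c"))}
```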
To show the transformation is correct, we need to prove that the Vertex Cover
problem has a solution if and only if the constructed CLIQUE problem has a
solution.
Assume there is a vertex cover set V' ⊆ V such that |V'| <= K.
Let V" = V - V'; clearly
|V"| >= |V| - K
for any pair of vertices u,v ∈ V" with u ≠ v, (u,v) ∉ E; otherwise,
u ∈ V' or v ∈ V'.
Now, consider the constructed G' and V* = V"; clearly
|V"| >= |V| - K ⇒ |V*| >= J
for any pair of vertices u,v ∈ V* with u ≠ v, (u,v) ∈ E'
⇒ V* forms a clique in G' with |V*| >= J
Assume there is a clique V* in G' such that |V*| >= J.
|V*| >= J ⇒ |V*| >= |V| - K
for any pair of vertices u,v ∈ V* with u ≠ v, (u,v) ∈ E', since V*'s
vertices form a complete subgraph
Now, consider G and V' = V - V*; clearly
|V*| >= |V| - K ⇒ |V'| <= K
for any pair of vertices u,v ∈ V* with u ≠ v, (u,v) ∉ E
for any edge (a,b) ∈ E, either a ∉ V* or b ∉ V* ⇒ a ∈ V' or b ∈ V'
⇒ V' is a vertex cover set in G with |V'| <= K
Solution to problem # 2 : Show that the Two-Processor Nonpreemptive Schedule
Length (TNSL) problem is NP-complete
PARTITION problem : Given a set Z = {a1, a2, …, an} of n positive integers.
Question : Is there a subset Z' ⊆ Z such that ∑Z' = ∑(Z-Z')?
TNSL Problem : Given 2 processors, a set J of m jobs with processing times {p1,
p2, …, pm} and a positive integer L.
Question : Is there a nonpreemptive schedule for the m jobs on 2 processors
such that the schedule length <= L?
a) Show that the TNSL problem is in NP
1. Guess a set of jobs T ⊆ J to be scheduled on the 1st processor   O(m)
2. Check ∑T <= L                                                    O(m)
3. Check ∑(J-T) <= L                                                O(m)
Note : ∑T = total processing time of all jobs in T
If both (2) and (3) are OK, return "Yes"; otherwise, return "No"
b) Show that PARTITION ∝ TNSL
Given an arbitrary instance of the PARTITION problem, i.e. Z = {a1, a2, …, an},
construct an instance of the TNSL problem :
m = n jobs
pi = ai for 1 <= i <= n
L = (∑ai)/2
The transformation can be done in O(n)
To show the transformation is correct, we need to prove that the PARTITION
problem has a solution if and only if the constructed TNSL problem has a
solution.
Assume there is a set Z' such that ∑Z' = ∑(Z - Z')
Let T ⊆ J and T' = J - T.
For each ai ∈ Z', let job i ∈ T. Clearly
∑Z' = ∑(Z - Z') = (∑ai)/2 ⇒ ∑T = ∑T' = (∑pi)/2 = L
⇒ the jobs in T and the jobs in T' can be scheduled on the 1st and 2nd
processors respectively with schedule length = L
Assume there is a schedule with schedule length <= L.
Let T and T' be the sets of jobs that are scheduled on the 1st and 2nd
processors respectively.
Clearly both processors have no idle time from 0 to L, since the
total processing time ∑pi = 2L.
i.e. ∑T = ∑T' = (∑pi)/2 = L
Let ai ∈ Z' iff job i ∈ T ⇒ ∑Z' = ∑(Z - Z')
⇒ a solution to the PARTITION problem
7.5 Coping with NP-Complete Problems
NP-hard Problems
Note : Refer to Chapter 5 of Garey and Johnson
If L' ∝ L and L' is an NP-complete problem, then L is called an NP-hard problem.
All NPC problems are NP-hard.
There are some NP-hard decision problems that are apparently not in NP.
Example : the Kth Largest Subset Problem is not known to be in NP
Instance : Given a set of positive integers A = {a1, a2, …, an}, and two
non-negative numbers B <= ∑A and K <= 2^|A|.
Question : Are there at least K distinct subsets A' ⊆ A such that each subset
has total sum <= B?
Note : PARTITION problem ∝ Kth Largest Subset Problem
Pseudo-polynomial time algorithms
Note : Refer to Chapter 4 of Garey and Johnson
Some NP-complete problems may be solved in "polynomial" time (based on input
size and magnitude).
Example : PARTITION problem
Dynamic Programming Algorithm :
let B = (sum of the n integers)/2
construct a table of size (approx.) n x B
fill in the table row by row
for each row, add a new element
mark the sums of all possible subsets
if there is a subset with sum = B, stop.
Time Complexity : O(nB)
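A minimal sketch of this dynamic program in Python (names are ours; the n x B table is kept as a single reusable row of booleans, one "row" per added element).

```python
def can_partition(Z):
    """O(n*B) dynamic program from the notes, with B = sum(Z)//2:
    reachable[s] marks that some subset of the elements seen so far sums to s."""
    total = sum(Z)
    if total % 2:                        # odd total: no equal split exists
        return False
    B = total // 2
    reachable = [True] + [False] * B     # only the empty sum 0 at the start
    for a in Z:                          # "add a new element" = one table row
        for s in range(B, a - 1, -1):    # downward, so each a is used once
            if reachable[s - a]:
                reachable[s] = True
        if reachable[B]:                 # subset with sum = B found: stop
            return True
    return reachable[B]
```

Note the running time depends on the magnitude B, not just on n, which is exactly why this is pseudo-polynomial.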
NP-completeness in the strong sense
Note : Refer to Chapter 4 of Garey and Johnson
If L is an NP-complete problem in the strong sense, then L cannot be solved by a
pseudo-polynomial time algorithm unless P=NP
L is NP-complete in the strong sense if L' ∝ L, L' is NP-complete in the
strong sense, and L is in NP
Example : the 3-Partition problem is NPC in the strong sense.
Solving more restricted problems
If we restrict the problem L to a problem L',
i.e. L' is a special/restricted case of L,
then we may be able to solve L' in polynomial time.
For example : PARTITION problem
if we assume that each input integer ai <= n, where n is the number of input
integers, then the pseudo-polynomial time algorithm becomes a polynomial time
algorithm, i.e. O(n^3)
Chapter 8 : Approximation Algorithms
8.1 Introduction
In general, a computer cannot solve NPC problems efficiently
But, many NPC problems are too important to abandon
If a problem is an NPC problem, you may try to
find a pseudo-polynomial time algorithm if it is not NPC in the strong sense
solve restricted problems
find approximation algorithms (a.k.a. heuristics; usually a simple & fast
algorithm)
Let us consider optimization problems only
An algorithm A is an approximation algorithm for a problem L if, given any valid
instance I, it finds a solution A(I) for L and A(I) is "close" to the optimal
solution OPT(I).
[sometimes it is also nice to include : if I is an invalid instance (with no
solution), then it should return "no solution"].
Approximation ratio (or bound) α of an approximation algorithm A for problem L :
A(I)/OPT(I) <= α if L is a minimization problem and A(I) >= OPT(I) > 0
OPT(I)/A(I) <= α if L is a maximization problem and OPT(I) >= A(I) > 0
When you provide an algorithm (pseudo-polynomial/polynomial/heuristic), you
need to prove that it works correctly (of course, sometimes the proof is obvious)
For a heuristic, we also need to prove the performance of the algorithm.
You don't want to give a bad approximation algorithm that sometimes gives poor
performance. Also, you don't want to give a good approximation algorithm but
show only a loose bound (i.e. not a tight bound)
8.2 Vertex Cover (VC) problem
Optimization VC Problem : Given an undirected graph G=(V, E), find a
minimum vertex cover set V' for G, i.e. V' ⊆ V such that for each (u,v) ∈ E, at
least one of u or v ∈ V'.
Approximation VC Algorithm
// Input : a graph G using adjacency lists
// Output : a vertex cover set C
C = ∅
SE = E // initially, SE = E, i.e. the adjacency lists
while ( SE ≠ ∅ ) {
    delete an arbitrary edge (u,v) from SE // **
    C = C ∪ {u, v}
    delete all edges incident to either u or v from SE
}
return C
Running time : O(V+E)
Example :
[Figure : a graph on vertices a, b, c, d, e, f, g, and three iterations (1)-(3)
of the algorithm, each picking one remaining edge and deleting all edges
incident to its two endpoints]
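A sketch of this algorithm in Python (function name ours; it uses a plain edge list rather than adjacency lists, so this sketch does not achieve the O(V+E) bound of the notes).

```python
def approx_vertex_cover(V, E):
    """Matching-based 2-approximation from the notes: repeatedly take an
    arbitrary remaining edge (u, v), add both endpoints to C, and delete
    every edge touching u or v."""
    C = set()
    SE = [tuple(e) for e in E]
    while SE:
        u, v = SE.pop()                               # arbitrary edge (**)
        C |= {u, v}
        SE = [e for e in SE if u not in e and v not in e]
    return C

# Example: the 4-cycle a-b-c-d; the optimum cover has 2 vertices
E = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
C = approx_vertex_cover({"a", "b", "c", "d"}, E)
assert all(u in C or v in C for u, v in E)            # C is a vertex cover
assert len(C) <= 2 * 2                                # within twice optimal
```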
Obviously, the result set C is a vertex cover set
Theorem : The above approximation VC algorithm returns a vertex cover set C
such that |C| / |C*| <= 2, where C* is an optimal (minimum) vertex cover set.
We need to show :
(a) It is easy to show that there exists a G(V,E) such that |C| / |C*| = 2.
(b) For all G(V,E), |C| / |C*| <= 2
Refer to ** in the algorithm, and consider only the set Ē of edges deleted there
No two edges in Ē share an endpoint (Ē is a matching)
Assume |Ē| = K
C contains exactly the endpoints of the edges in Ē, so |C| = 2K
a minimum vertex cover set for Ē alone needs K vertices
⇒ a minimum vertex cover set C* for G has >= K vertices
So, |C| / |C*| <= 2
[Figure : two example graphs on vertices a, b, c, d, panels (1) and (2),
illustrating part (a), where the algorithm returns |C| = 2 |C*|]
8.3 Maximum Programs Stored (PS) Problem
Optimization PS Problem : Given a set of n programs and two storage devices, let
si be the amount of storage needed to store the i-th program, and let L be the
storage capacity of each disk. Determine the maximum number of these n programs
that can be stored on the two disks (without splitting a program across the
disks).
The decision PS problem is NPC : PARTITION ∝ PS (you should try this!)
Approximation PS Algorithm
// assume programs are sorted in nondecreasing order of program size
// i.e. s1 <= s2 <= … <= sn
i = 1; c = 0; // c counts the number of stored programs
for j = 1 to 2 {
    sum = 0
    while ( i <= n and sum + si <= L ) {
        store the i-th program on the j-th device
        sum += si
        i++; c++
    }
}
return c
Example :
L = 10, si = (2, 4, 5, 6)
Disk 1 : s1, s2 (2 + 4 = 6 <= 10)
Disk 2 : s3 (5 <= 10)
C = 3, while the optimal solution stores all 4 programs (s2, s4 on one disk
and s1, s3 on the other)
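A Python sketch of the algorithm (function name ours; it sorts internally, whereas the pseudocode assumes pre-sorted input, and the `i < n` check replaces the early return).

```python
def store_programs(sizes, L):
    """Fill two disks of capacity L greedily, taking programs in
    nondecreasing order of size; returns the count c of programs stored."""
    sizes = sorted(sizes)                # s1 <= s2 <= ... <= sn
    i, c = 0, 0
    for _disk in range(2):
        room = L
        while i < len(sizes) and sizes[i] <= room:
            room -= sizes[i]             # store program i on this disk
            i += 1
            c += 1
    return c

# The example from the notes: stores 3, while the optimum stores all 4
count = store_programs([2, 4, 5, 6], 10)   # count is 3
```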
Let C* be the optimal (maximum) number of programs that can be stored on the
two disks.
The above approximation PS algorithm gives a very good performance ratio :
C* <= C + 1, i.e. C*/C <= 1 + 1/C
i.e. the algorithm stores at most 1 program fewer than the optimal solution
Theorem : The above approximation PS algorithm returns a number C such that
C* <= C + 1, where C* is the optimal value.
(a) It is easy to show that there exist {s1, s2, …, sn} and L such that
C* = C + 1
The above example gave C* = C + 1
(b) For all {s1, s2, …, sn} and L, C* <= C + 1
Let us consider only one disk with capacity 2L.
It is obvious that we can store the maximum number of programs on this disk by
considering the programs in the order s1 <= s2 <= … <= sn
Let p be the maximum number of programs that can be stored on this single disk
Clearly p >= C* and s1 + s2 + … + sp <= 2L   (i)
Let j be the index such that
(s1 + s2 + … + sj) <= L and (s1 + s2 + … + sj+1) > L   (ii)
Obviously j <= p, and programs 1 through j are stored on the 1st disk by the
above approximation algorithm
By (i) & (ii), (sj+2 + sj+3 + … + sp) <= L
⇒ (sj+1 + sj+2 + … + sp-1) <= L, since the sizes are nondecreasing
⇒ at least the (j+1)-th, (j+2)-th, …, (p-1)-th programs fit on the 2nd disk,
so they are stored there by the above approximation algorithm
⇒ C >= j + (p-1-j) = p - 1 >= C* - 1. Done!
8.4 N-Processor Nonpreemptive Schedule Length Problem (Optional)
Given a set of n tasks and m processors, produce a non-preemptive schedule with
minimum schedule length.
The decision problem can easily be proved to be NP-complete.
Let us consider the following approximation LPT (largest processing time first)
algorithm :
Whenever a processor becomes free for assignment, assign to it the
task with the largest execution time among the available tasks.
Example :
T1 T2 T3 T4 T5 T6
10  8  7  5  3  1
Given m = 2 processors
The LPT rule gives :
0 8 10 15 16 18
--------------------------------------------------------------------------
P1 | T1 | T4 | T5 |
--------------------------------------------------------------------------
P2 | T2 | T3 | T6 |
--------------------------------------------------------------------------
P1 = {T1,T4,T5} and P2 = {T2,T3,T6} , SL = 18
Optimal schedule :
P1 = {T1,T3} and P2 = {T2,T4,T5,T6} , SL = 17
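The LPT rule can be sketched with a min-heap of processor free times (a sketch for identical processors; the function name is ours).

```python
import heapq

def lpt_schedule_length(times, m):
    """LPT: sort tasks by decreasing execution time and always assign the
    next task to the processor that becomes free first; returns the
    schedule length."""
    free = [0] * m                       # min-heap of processor free times
    heapq.heapify(free)
    for t in sorted(times, reverse=True):
        heapq.heapreplace(free, free[0] + t)   # give t to earliest-free proc
    return max(free)

# The example from the notes: SL(LPT) = 18 while SL(OPT) = 17
sl = lpt_schedule_length([10, 8, 7, 5, 3, 1], 2)   # sl is 18
```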
Theorem : SL(LPT)/SL(OPT) <= 4/3 - 1/(3m), i.e. the performance of LPT is at
most 33% worse than the optimal solution
Need to show :
(a) There exists a task set TS such that SL(LPT)/SL(OPT) = 4/3 - 1/(3m)
Consider the following 2m+1 tasks (for m an even number); the execution
times of the tasks are :
e(T2i-1) = e(T2i) = 2m-i for 1 <= i <= m
and e(T2m+1) = m.
Total execution time of the 2m+1 tasks :
e(T1) = e(T2) = 2m-1
e(T3) = e(T4) = 2m-2
: :
e(T2m-1) = e(T2m) = 2m - m = m
e(T2m+1) = m
Total = 2((2m-1) + (2m-2) + ... + (2m-m)) + m
      = 2(2m*m - (1+2+...+m)) + m
      = 4m^2 - m(m+1) + m
      = 3m^2
LPT rule produces the following schedule : SL(LPT) = 4m-1
0 3m-1 4m-1
| | |
______________________________________________________
p1 | T1 | T2m | T2m+1 |
p2 | T2 | T2m-1 |/////////////
p3 | T3 | T2m-2 |/////////////
p4 | T4 | T2m-3 |/////////////
: :
pi | Ti | T2m-(i-1) |/////////////
: :
pm-1 | Tm-1 | Tm+2 |/////////////
pm | Tm | Tm+1 |/////////////
Optimal schedule : SL(OPT) = 3m
0 3m
| |
______________________________________________________
p1 | T1 | T2m-2 |
p2 | T2 | T2m-3 |
p3 | T3 | T2m-4 |
p4 | T4 | T2m-5 |
: :
pi | Ti | T2m-(i+1) |
: :
pm-1 | Tm-1 | Tm |
pm | T2m-1 | T2m | T2m+1 |
We have :
SL(LPT)
------------ = (4m-1)/3m = 4/3 - 1/3m
SL(OPT)
(b) For any set of tasks TS, SL(LPT)/SL(OPT) <= 4/3 - 1/(3m)
For m = 1, LPT is optimal
Let m >= 2; the proof is by contradiction
Assume that SL(LPT)/SL(OPT) <= 4/3 - 1/(3m) is not always true.
Let TS = {T1, T2, ..., Tn} be the smallest set of tasks (least number of tasks)
that violates the bound, i.e. SL(LPT, TS)/SL(OPT, TS) > 4/3 - 1/(3m).
WLOG, assume e(T1) >= e(T2) >= ... >= e(Tn)
Goal : to show that this TS has SL(LPT, TS)/SL(OPT, TS) <= 4/3 - 1/(3m) ⇒
contradiction!
Let f(Tx) and s(Tx) be the finishing time and starting time of Tx ∈ TS in the
LPT schedule.
Claim 1 : f(Tn) = SL(LPT, TS) and f(Ti) < SL(LPT, TS) for 1 <= i <= n-1.
Proof :
if this is not true, then there is a task Tk, k < n, such that
f(Tk) = SL(LPT, TS).
Consider the set of the first k tasks of TS, TS' = {T1, T2, ..., Tk}; it is
clear that
SL(LPT, TS') >= SL(LPT, TS) and SL(OPT, TS') <= SL(OPT, TS)
⇒ SL(LPT, TS')/SL(OPT, TS') >= SL(LPT, TS)/SL(OPT, TS) > 4/3 - 1/(3m)
⇒ TS' is a smaller set of tasks violating the bound, contradicting our
assumption that TS is the smallest such set.
Therefore, claim 1 must be true.
Claim 2 : In an optimal schedule of TS, no processor executes more than 2
tasks, i.e. SL(OPT, TS) < 3 e(Tn)
Proof : Let P = e(T1) + e(T2) + … + e(Tn)
SL(OPT, TS) >= P/m   (i)
s(Tn) <= [e(T1) + e(T2) + … + e(Tn-1)] / m   (ii)
SL(LPT, TS)
= s(Tn) + e(Tn)   from claim 1
<= [e(T1) + e(T2) + … + e(Tn-1)] / m + e(Tn)   by (ii)
= P/m + (1 - 1/m) e(Tn)   (iii)
SL(LPT, TS) / SL(OPT, TS)
<= [P/m + (1 - 1/m) e(Tn)] / SL(OPT, TS)   by (iii)
= (P/m) / SL(OPT, TS) + [(1 - 1/m) e(Tn)] / SL(OPT, TS)
<= 1 + [(1 - 1/m) e(Tn)] / SL(OPT, TS)   by (i)
Since SL(LPT, TS)/SL(OPT, TS) > 4/3 - 1/(3m), we have
1 + [(1 - 1/m) e(Tn)] / SL(OPT, TS) > 4/3 - 1/(3m)
⇒ [(m-1) e(Tn)] / (m SL(OPT, TS)) > (m-1)/(3m)
⇒ 3 e(Tn) > SL(OPT, TS)
So, the optimal schedule length is less than 3 times the smallest execution
time e(Tn) ⇒ at most 2 tasks per processor (3 tasks would take >= 3 e(Tn))
Now that we have claim 1 and claim 2, we want to transform the optimal schedule
S into a schedule S' such that the schedule length never increases,
i.e. SL(S') <= SL(S).
Finally, we show that S' is an LPT schedule, i.e. SL(S') = SL(LPT).
This is the contradiction we sought, since we assumed
SL(LPT)/SL(OPT) > 4/3 - 1/(3m) but this shows
SL(LPT)/SL(OPT) <= 1.
Let us consider the following transformation operations :
Type I : Swap the positions of Tj and Ti on the same processor
P' | Ti | Tj |   where e(Tj) > e(Ti)
P' | Tj | Ti |   note : no change in SL
Type II : Move Tu to an earlier starting time
P' | Ti | Tu |
P" | Tj |        where e(Ti) > e(Tj)
P' | Ti |
P" | Tj | Tu |   note : no increase in SL
Type III : Swap the positions of Tu and Tv
P' | Ti | Tu |
P" | Tj | Tv |   where e(Ti) > e(Tj) and e(Tu) > e(Tv)
P' | Ti | Tv |
P" | Tj | Tu |   note : no increase in SL
Start with the optimal schedule S (note : at most 2 tasks on each processor)
Apply these three transformations exhaustively until none can be applied.
Note : all intermediate schedules are also optimal schedules, since the
schedule length does not increase
Rearrange all the processors according to their 1st task as follows :
All processors with one task are arranged as P1 to Ps
with e(Tk1) >= e(Tk2) >= ... >= e(Tks), and
All processors with two tasks are arranged as Ps+1 to Ps+t
with e(Ti1) >= e(Ti2) >= ... >= e(Tit)
P1 | Tk1 |/////////////////
P2 | Tk2 |////////////////////
: :
: :
Ps | Tks |////////////////////////
Ps+1 | Ti1 | Tj1 |///////////////
Ps+2 | Ti2 | Tj2 |/////////////////
: :
: :
Ps+t | Tit | Tjt |////////////////////
Since no type I operation applies ⇒ e(Tiz) >= e(Tjz) for 1 <= z <= t
Since no type II operation applies ⇒ e(Tks) >= e(Ti1)
Since no type III operation applies ⇒ e(Tjt) >= e(Tjt-1) >= ... >= e(Tj1)
⇒ e(Tk1) >= e(Tk2) >= ... >= e(Tks) >=
e(Ti1) >= e(Ti2) >= ... >= e(Tit) >=
e(Tjt) >= e(Tjt-1) >= ... >= e(Tj1)
Is this an LPT schedule? Yes, except in the following case :
P' | Ti' | Tj' |
P" | Ti" | Tj" |
where e(Ti') >= e(Ti") >= e(Tj") >= e(Tj') and e(Ti') > e(Ti") + e(Tj")
then LPT would look like
P' | Ti' |
P" | Ti" | Tj" | Tj' |
This cannot happen in the new final schedule by claim 2.
⇒ this new final schedule S' has SL(S') <= SL(S), and S' is an LPT schedule;
therefore
SL(LPT, TS) <= SL(OPT, TS) ⇒ contradiction to our assumption.
Note : It can be shown that at most a finite number of type I, II and III
operations are needed to transform S into S'.
8.5 Traveling Salesperson Problem
Not all NPC problems have polynomial time approximation algorithms!
Recall : Traveling Salesman (TS) Problem
Given an undirected complete graph G=(V, E) with distance d(i,j) >= 0 for each
edge (i,j), i ≠ j, find a tour of all cities (a simple cycle with all vertices)
with minimum total distance.
If P ≠ NP, then there is no polynomial time approximation algorithm with bound
α, where α >= 1 is any constant, for the TS problem.
Assume there is a polynomial time approximation algorithm A for the TS
problem with integer bound α.
We would like to show that there is then a polynomial time algorithm to solve
the Hamiltonian Circuit problem ⇒ P = NP ⇒ contradiction!
We prove this by using algorithm A to solve the HC problem
Let G=(V, E) be an instance of the HC problem
Construct G' = (V, E') for TS in polynomial time as follows :
(u,v) ∈ E' for all u, v ∈ V with u ≠ v
d(u,v) = α|V| + 1 if (u,v) ∉ E
d(u,v) = 1 if (u,v) ∈ E
If algorithm A returns a tour with total distance <= α|V|
it must have cost exactly |V|, since the weight of each edge is either 1 or
α|V| + 1, and a tour has |V| edges
⇒ it is a HC in G
If algorithm A returns a tour with total distance > α|V|
since the bound is α, the optimal solution must be > |V|
⇒ every tour contains at least one edge with cost α|V| + 1
⇒ there is no solution to the HC problem.
Therefore, algorithm A can solve the HC problem.
Note : With some restrictions, you may still find an approximation algorithm
for the TS problem.
If d(u,w) <= d(u,v) + d(v,w) for any three vertices u, v, w ∈ V (the triangle
inequality), then there is a polynomial time approximation algorithm (α = 2)
for the TS problem.
Skip the algorithm; just an example :
[Figure : (i) a minimum spanning tree (MST) of a graph on vertices a, b, c, d,
e, f, g, h; (ii) the tour obtained from a preorder walk ordering of the MST
in (i)]
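The (α = 2) algorithm illustrated above, build an MST and then visit the vertices in preorder, can be sketched in Python (all names ours; Prim's algorithm for the MST, with the distance given as a function).

```python
import heapq

def mst_preorder_tour(V, d):
    """2-approximation for metric TSP: build an MST (Prim's algorithm),
    then output the vertices in a preorder walk of the tree. The triangle
    inequality bounds the resulting tour by 2 * OPT."""
    V = list(V)
    root = V[0]
    in_tree, children = {root}, {v: [] for v in V}
    pq = [(d(root, v), root, v) for v in V if v != root]
    heapq.heapify(pq)
    while len(in_tree) < len(V):
        w, u, v = heapq.heappop(pq)
        if v in in_tree:                 # stale entry, skip
            continue
        in_tree.add(v)
        children[u].append(v)            # (u, v) becomes a tree edge
        for x in V:
            if x not in in_tree:
                heapq.heappush(pq, (d(v, x), v, x))
    tour, stack = [], [root]             # preorder walk of the MST
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

# Example: 4 points on a line, with distance = absolute difference
pts = {"a": 0, "b": 1, "c": 2, "d": 3}
tour = mst_preorder_tour(pts, lambda u, v: abs(pts[u] - pts[v]))
```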