
Approximation algorithms for NP-hard optimization problems

Philip N. Klein

Department of Computer Science

Brown University

Neal E. Young

Department of Computer Science

Dartmouth College

Chapter 34, Algorithms and Theory of Computation Handbook, © 1999 CRC Press

1 Introduction

In this chapter, we discuss approximation algorithms for optimization problems. An optimization problem consists in finding the best (cheapest, heaviest, etc.) element of a large set P, called the feasible region and usually specified implicitly, where the quality of elements of the set is evaluated using a function f(x), the objective function, usually something fairly simple. The element that minimizes (or maximizes) this function is said to be an optimal solution, and the value of the objective function at this element is the optimal value.

optimal value = min{f(x) | x ∈ P} (1)

An example of an optimization problem familiar to computer scientists is that of finding a minimum-cost spanning tree of a graph with edge costs. For this problem, the feasible region P, the set over which we optimize, consists of spanning trees; recall that a spanning tree is a set of edges that connects all the vertices but forms no cycles. The value f(T) of the objective function applied to a spanning tree T is the sum of the costs of the edges in the spanning tree.

The minimum-cost spanning tree problem is familiar to computer scientists because there are several good algorithms for solving it — procedures that, for a given graph, quickly determine the minimum-cost spanning tree. No matter what graph is provided as input, the time required for each of these algorithms is guaranteed to be no more than a slowly growing function of the number of vertices n and edges m (e.g. O(m log n)).
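To make the notion of such a fast procedure concrete, here is a minimal Python sketch of one of them, Kruskal's algorithm; the edge-list input format and the function name are our own choices for illustration.

    # Kruskal's algorithm: sort edges by cost, then greedily add any edge
    # that joins two different components (components tracked with union-find).
    def minimum_spanning_tree(n, edges):
        """n: number of vertices (labeled 0..n-1); edges: list of (cost, u, v)."""
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]   # path compression
                x = parent[x]
            return x

        tree = []
        for cost, u, v in sorted(edges):
            ru, rv = find(u), find(v)
            if ru != rv:                        # edge connects two components
                parent[ru] = rv
                tree.append((u, v, cost))
        return tree                             # edges of a minimum-cost spanning tree

    # Example: a 4-cycle with one extra diagonal edge.
    print(minimum_spanning_tree(4, [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 3, 0), (1, 0, 2)]))

Sorting the edges dominates the running time of this sketch, which matches the O(m log n) bound mentioned above.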

For most optimization problems, in contrast to the minimum-cost spanning tree problem, there is no known algorithm that solves all instances quickly in this sense. Furthermore, such an algorithm is not likely ever to be discovered, for many of these problems are NP-hard, and such an algorithm would imply that every problem in NP could be solved quickly (i.e. P=NP), which is considered unlikely.¹ One option in such a case is to seek an approximation algorithm — an algorithm that is guaranteed to run quickly (in time polynomial in the input size) and to produce a solution for which the value of the objective function is quantifiably close to the optimal value.

¹ For those unfamiliar with the theory of NP-completeness, see Chapters 33 and 34 or (Garey and Johnson, 1979).

Considerable progress has been made towards understanding which combinatorial-optimization problems can be approximately solved, and to what accuracy. The theory of NP-completeness can provide evidence not only that a problem is hard to solve precisely but also that it is hard to approximate to within a particular accuracy. Furthermore, for many natural NP-hard optimization problems, approximation algorithms have been developed whose accuracy nearly matches the best achievable according to the theory of NP-completeness. Thus optimization problems can be categorized according to the best accuracy achievable by a polynomial-time approximation algorithm for each problem.

This chapter, which focuses on discrete (rather than continuous) NP-hard optimization problems, is organized according to these categories; for each category, we describe a representative problem, an algorithm for the problem, and the analysis of the algorithm. Along the way we demonstrate some of the ideas and methods common to many approximation algorithms. Also, to illustrate the diversity of the problems that have been studied, we briefly mention a few additional problems as we go. We provide a sampling, rather than a compendium, of the field — many important results, and even areas, are not presented. In Section 12, we mention some of the areas that we do not cover, and we direct the interested reader to more comprehensive and technically detailed sources, such as the excellent recent book (Hochbaum, 1995). Because of limits on space for references, we do not cite the original sources for algorithms covered in (Hochbaum, 1995).

2 Underlying principles

Our focus is on combinatorial optimization problems, problems where the feasible region P is finite (though typically huge). Furthermore, we focus primarily on optimization problems that are NP-hard. As our main organizing principle, we restrict our attention to algorithms that are provably good in the following sense: for any input, the algorithm runs in time polynomial in the length of the input and returns a solution (i.e., a member of the feasible region) whose value (i.e., objective function value) is guaranteed to be near-optimal in some well-defined sense.² Such a guarantee is called the performance guarantee. Performance guarantees may be absolute, meaning that the additive difference between the optimal value and the value found by the algorithm is bounded. More commonly, performance guarantees are relative, meaning that the value found by the algorithm is within a multiplicative factor of the optimal value.

² An alternative to this worst-case analysis is average-case analysis. See Chapter 2.

When an algorithm with a performance guarantee returns a solution, it has implicitly discovered a bound on the exact optimal value for the problem. Obtaining such bounds is perhaps the most basic challenge in designing approximation algorithms. If one can't compute the optimal value, how can one expect to prove that the output of an algorithm is near it? Three common techniques are what we shall call witnesses, relaxation, and coarsening.

Intuitively, a witness encodes a short, easily verified proof that the optimal value is at least, or at most, a certain value. Witnesses play a role dual to that of feasible solutions. For example, for a maximization problem, where any feasible solution provides a lower bound on the optimal value, a witness would provide an upper bound on the optimal value. Typically, an approximation algorithm will produce not only a feasible solution, but also a witness. The performance guarantee is typically proven with respect to the two bounds — the upper bound provided by the witness and the lower bound provided by the feasible solution. Since the optimal value is between the two bounds, the performance guarantee also holds with respect to the optimal value.

Relaxation is another way to obtain a lower bound on the minimum value (or an upper bound in the case of a maximization problem). One formulates a new optimization problem, called a relaxation of the original problem, using the same objective function but a larger feasible region P′ that includes P as a subset. Because P′ contains P, any x ∈ P (including the optimal element x) belongs to P′ as well. Hence the optimal value of the relaxation, min{f(x) | x ∈ P′}, is less than or equal to the optimal value of the original optimization problem. The intent is that the optimal value of the relaxation should be easy to calculate and should be reasonably close to the optimal value of the original problem.

Linear programming can provide both witnesses and relaxations, and is therefore an important technique in the design and analysis of approximation algorithms. Randomized rounding is a general approach, based on the probabilistic method, for converting a solution to a relaxed problem into an approximate solution to the original problem.

To coarsen a problem instance is to alter it, typically restricting to a less complex feasible region or objective function, so that the resulting problem can be efficiently solved, typically by dynamic programming. For coarsening to be useful, the coarsened problem must approximate the original problem, in that there is a rough correspondence between feasible solutions of the two problems, a correspondence that approximately preserves cost. We use the term coarsening rather loosely to describe a wide variety of algorithms that work in this spirit.

3 Approximation algorithms with small additive error

3.1 Minimum-degree spanning tree

For our first example, consider a slight variant of the minimum-cost spanning tree problem, the minimum-degree spanning tree problem. As before, the feasible region P consists of spanning trees of the input graph, but this time the objective is to find a spanning tree whose degree is minimum. The degree of a vertex of a spanning tree (or, indeed, of any graph) is the number of edges incident to that vertex, and the degree of the spanning tree is the maximum of the degrees of its vertices. Thus minimizing the degree of a spanning tree amounts to finding the smallest integer k for which there exists a spanning tree in which each vertex has at most k incident edges.

Any procedure for finding a minimum-degree spanning tree in a graph could be used to find a Hamiltonian path in any graph that has one, for a Hamiltonian path is a degree-two spanning tree. (A Hamiltonian path of a graph is a path through that graph that visits each vertex of the graph exactly once.) Since it is NP-hard even to determine whether a graph has a Hamiltonian path, even determining whether the minimum-degree spanning tree has degree two is presumed to be computationally difficult.

3.2 An approximation algorithm for minimum-degree spanning tree

Nonetheless, the minimum-degree spanning-tree problem has a remarkably good approximation algorithm (Hochbaum, 1995, Ch. 7). For an input graph with m edges and n vertices, the algorithm requires time slightly more than the product of m and n. The output is a spanning tree whose degree is guaranteed to be at most one more than the minimum degree. For example, if the graph has a Hamiltonian path, the output is either such a path or a spanning tree of degree three.

Figure 1: On the left is an example input graph G. On the right is a spanning tree T that might be found by the approximation algorithm. The shaded circle indicates the nodes in the witness set S.

Figure 2: The figure on the left shows the r trees T1, . . . , Tr obtained from T by deleting the nodes of S. Each tree is indicated by a shaded region. The figure on the right shows that no edges of the input graph G connect different trees Ti.

Given a graph G, the algorithm naturally finds the desired spanning tree T of G. The algorithm also finds a witness — in this case, a set S of vertices proving that T's degree is nearly optimal. Namely, let k denote the degree of T, and let T1, T2, . . . , Tr be the subtrees that would result from T if the vertices of S were deleted. The following two properties are enough to show that T's degree is nearly optimal:

1. There are no edges of the graph G between distinct trees Ti, and

2. the number r of trees Ti is at least |S|(k − 1)− 2(|S| − 1).

To show that T's degree is nearly optimal, let Vi denote the set of vertices comprising subtree Ti (i = 1, . . . , r). Any spanning tree T∗ at all must connect up the sets V1, V2, . . . , Vr and the vertices y1, y2, . . . , y|S| ∈ S, and must use at least r + |S| − 1 edges to do so. Furthermore, since no edges go between distinct sets Vi, all these edges must be incident to the vertices of S.


Figure 3: The figure on the left shows an arbitrary spanning tree T∗ for the same input graph G. The figure on the right has r shaded regions, one for each subset Vi of nodes corresponding to a tree Ti in Figure 2. The proof of the algorithm's performance guarantee is based on the observation that at least r + |S| − 1 edges are needed to connect up the Vi's and the nodes in S.

Hence we obtain

    ∑{deg_T∗(y) | y ∈ S} ≥ r + |S| − 1
                         ≥ |S|(k − 1) − 2(|S| − 1) + |S| − 1
                         = |S|(k − 1) − (|S| − 1)                (2)

where deg_T∗(y) denotes the degree of y in the tree T∗. Thus the average of the degrees of vertices in S is at least (|S|(k − 1) − (|S| − 1)) / |S|, which is strictly greater than k − 2. Since the average of the degrees of vertices in S is greater than k − 2, it follows that at least one vertex of S has degree at least k − 1.

We have shown that for every spanning tree T∗, there is at least one vertex with degree at least k − 1. Hence the minimum degree is at least k − 1.
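The role that S plays as a witness can be made concrete with a small checker for the two properties above. The following Python sketch is purely illustrative; the edge-list input format and the function name are our own, not part of the algorithm of (Hochbaum, 1995).

    # Check that (T, S) certifies that the minimum degree is at least k - 1:
    # property 1: no edge of G joins two different components of T - S;
    # property 2: the number r of components of T - S is at least |S|(k-1) - 2(|S|-1).
    def verify_witness(graph_edges, tree_edges, S, k):
        S = set(S)
        # Label the components of T - S by a flood fill over tree edges avoiding S.
        adj = {}
        for u, v in tree_edges:
            if u not in S and v not in S:
                adj.setdefault(u, []).append(v)
                adj.setdefault(v, []).append(u)
        vertices = {w for e in graph_edges for w in e} - S
        comp, r = {}, 0
        for start in vertices:
            if start in comp:
                continue
            r += 1
            stack = [start]
            comp[start] = r
            while stack:
                u = stack.pop()
                for v in adj.get(u, []):
                    if v not in comp:
                        comp[v] = r
                        stack.append(v)
        # Property 1: every edge of G avoiding S stays inside a single component.
        property1 = all(comp[u] == comp[v]
                        for u, v in graph_edges if u not in S and v not in S)
        # Property 2: there are enough components relative to |S| and k.
        property2 = r >= len(S) * (k - 1) - 2 * (len(S) - 1)
        return property1 and property2

If both checks pass, the counting argument above applies verbatim, so any spanning tree of the graph must have a vertex of degree at least k − 1.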

We have not explained how the algorithm obtains both the spanning tree T and the set S of vertices, only how the set S shows that the spanning tree is nearly optimal. The basic idea is as follows. Start with any spanning tree T, and let d denote its degree. Let S be the set of vertices having degree d or d − 1 in the current spanning tree. Let T1, . . . , Tr be the subtrees comprising T − S. If there are no edges between these subtrees, the set S satisfies property 1 and one can show it also satisfies property 2; in this case the algorithm terminates. If on the other hand there is an edge between two distinct subtrees Ti and Tj, inserting this edge in T and removing another edge from T results in a spanning tree with fewer vertices having degree at least d − 1. Repeat this process on the new spanning tree; in subsequent iterations the improvement steps are somewhat more complicated but follow the same lines. One can prove that the number of iterations is O(n log n).

We summarize our brief sketch of the algorithm as follows: either the current set S is a witness to the near-optimality of the current spanning tree T, or there is a slight modification to the set and the spanning tree that improves them. The algorithm terminates after a relatively small number of improvements.

This algorithm is remarkable not only for its simplicity and elegance but also for the quality of the approximation achieved. As we shall see, for most NP-hard optimization problems, we must settle for approximation algorithms that have much weaker guarantees.


3.3 Other problems having small-additive-error algorithms

There are a few other natural combinatorial-optimization problems for which approximation algorithms with similar performance guarantees are known. Here are two examples:

Edge Coloring: Given a graph, color its edges with a minimum number of colors so that, for each vertex, the edges incident to that vertex all have different colors. For this problem, it is easy to find a witness. For any graph G, let v be the vertex of highest degree in G. Clearly one needs to assign at least deg_G(v) colors to the edges of G, for otherwise there would be two edges with the same color incident to v. For any graph G, there is an edge coloring using a number of colors equal to one plus the degree of G. The proof of this fact translates into a polynomial-time algorithm that approximates the minimum edge-coloring to within an additive error of 1.

Bin Packing: The input consists of a set of positive numbers less than 1. A solution is a partition of the numbers into sets summing to no more than 1. The goal is to minimize the number of blocks of the partition. There are approximation algorithms for bin packing that have very good performance guarantees. For example, the performance guarantee for one such algorithm is as follows: for any input set I of item weights, it finds a packing that uses at most OPT(I) + O(log² OPT(I)) bins, where OPT(I) is the number of bins used by the best packing, i.e. the optimal value.
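As a simple point of reference, here is a Python sketch of the classical first-fit-decreasing heuristic for bin packing. It is not the algorithm behind the bound just cited; it guarantees only a small constant-factor ratio, but it shows how easily a feasible packing can be produced. The function name and input format are our own.

    # First-fit decreasing: consider items largest first, place each into the
    # first bin that still has room, and open a new bin when none does.
    def first_fit_decreasing(items):
        capacity = []                     # capacity[i] = remaining room in bin i
        packing = []                      # packing[i] = items placed in bin i
        for x in sorted(items, reverse=True):
            for i, free in enumerate(capacity):
                if x <= free + 1e-12:     # item fits in an existing bin
                    capacity[i] -= x
                    packing[i].append(x)
                    break
            else:                         # no bin had room: open a new one
                capacity.append(1.0 - x)
                packing.append([x])
        return packing

    print(first_fit_decreasing([0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.3]))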

4 Randomized rounding and linear programming

A linear programming problem is any optimization problem in which the feasible region corresponds to assignments of values to variables meeting a set of linear inequalities and in which the objective function is a linear function. An instance is determined by specifying the set of variables, the objective function, and the set of inequalities. Linear programs are capable of representing a large variety of problems, have been studied for decades in combinatorial optimization, and have a tremendous literature (see e.g. Chapters 24 and 25 of this book). Any linear program can be solved — that is, a point in the feasible region maximizing or minimizing the objective function can be found — in time bounded by a polynomial in the size of the input.

A (mixed) integer linear programming problem is a linear programming problem augmented with additional constraints specifying that (some of) the variables must take on integer values. Such constraints make integer linear programming even more general than linear programming — in general, solving integer linear programs is NP-hard.

For example, consider the following balanced matching problem: The input is a bipartite graph G = (V, W, E). The goal is to choose an edge incident to each vertex in V (|V| edges in total), while minimizing the maximum load of (number of chosen edges adjacent to) any vertex in W. The vertices in V might represent tasks, the vertices in W might represent people, while the presence of edge {v, w} indicates that person w is competent to perform task v. The problem is then to assign each task to a person competent to perform it, while minimizing the maximum number of tasks assigned to any person.³

³ Typically, randomized rounding is applied to NP-hard problems, whereas the balanced matching problem here is actually solvable in polynomial time. We use it as an example for simplicity — the analysis captures the essential spirit of a similar analysis for the well-studied integer multicommodity flow problem. (A simple version of that problem is: "Given a network and a set of commodities (each a pair of vertices), choose a path for each commodity minimizing the maximum congestion on any edge.")

This balanced matching problem can be formulated as the following integer linear program:

    minimize    ∆
    subject to  ∑_{u∈N(v)} x(u, v) = 1    ∀v ∈ V
                ∑_{v∈N(u)} x(u, v) ≤ ∆    ∀u ∈ W
                x(u, v) ∈ {0, 1}          ∀(u, v) ∈ E.

Here N(x) denotes the set of neighbors of vertex x in the graph. For each edge (u, v), the variable x(u, v) determines whether the edge (u, v) is chosen. The variable ∆ measures the maximum load.

Relaxing the integrality constraints (i.e., replacing them as well as we can by linear inequalities) yields the linear program:

    minimize    ∆
    subject to  ∑_{u∈N(v)} x(u, v) = 1    ∀v ∈ V
                ∑_{v∈N(u)} x(u, v) ≤ ∆    ∀u ∈ W
                x(u, v) ≥ 0               ∀(u, v) ∈ E.

Rounding a fractional solution to a true solution. This relaxed problem can be solved in polynomial time simply because it is a linear program. Suppose we have an optimal solution x∗, where each x∗(e) is a fraction between 0 and 1. How can we convert such an optimal fractional solution into an approximately optimal integer solution? Randomized rounding is a general approach for doing just this (Motwani and Raghavan, 1995, Ch. 5).

Consider the following polynomial-time randomized algorithm to find an integer solution x from the optimal solution x∗ to the linear program (a code sketch follows the steps):

1. Solve the linear program to obtain a fractional solution x∗ of load ∆∗.

2. For each vertex v ∈ V :

(a) Choose a single edge incident to v at random, so that the probability that a given edge (u, v) is chosen is x∗(u, v). (Note that ∑_{u∈N(v)} x∗(u, v) = 1.)

(b) Let x(u, v)← 1.

(c) For all other edges (u′, v) incident to v, let x(u′, v)← 0.
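Here is a sketch of the rounding step (step 2) in Python. The dictionary-based representation of the fractional solution and the function name are our own choices; the LP of step 1 is assumed to have been solved already by some LP solver.

    import random

    # Randomly round a fractional balanced-matching solution x_star.
    # x_star maps each edge (u, v), with v in V and u a neighbor of v, to its
    # fractional value; for each v the values sum to 1, so they form a
    # probability distribution over the edges incident to v.
    def randomized_round(x_star, V):
        x = {edge: 0 for edge in x_star}
        for v in V:
            incident = [(u, w) for (u, w) in x_star if w == v]
            weights = [x_star[e] for e in incident]
            chosen = random.choices(incident, weights=weights, k=1)[0]
            x[chosen] = 1                  # exactly one edge incident to v is kept
        return x

    # Tiny example with tasks V = {"t1", "t2"} and people "p1", "p2" on the W side.
    frac = {("p1", "t1"): 0.5, ("p2", "t1"): 0.5, ("p2", "t2"): 1.0}
    print(randomized_round(frac, ["t1", "t2"]))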

The algorithm will always choose one edge incident to each vertex in V. Thus, x is a feasible solution to the original integer program. What can we say about the load? For any particular vertex u ∈ W, the load on u is ∑_{v∈N(u)} x(u, v). For any particular edge (u, v) ∈ E, the probability that x(u, v) = 1 is x∗(u, v). Thus the expected value of the load on a vertex u ∈ W is ∑_{v∈N(u)} x∗(u, v), which is at most ∆∗. This is a good start. Of course, the maximum load over all u ∈ W is likely to be larger. How much larger?

To answer this, we need to know more about the distribution of the load than just its expected value. The key fact to observe is that the load on any u ∈ W is a sum of independent {0, 1} random variables. This means it is not likely to deviate much from its expected value. Precise estimates come from standard bounds, called "Chernoff" or "Hoeffding" bounds, such as the following:

Theorem  Let X be a sum of independent {0, 1} random variables, and let µ > 0 be the expected value of X. Then for any ε > 0,

    Pr[X ≥ (1 + ε)µ] < exp(−µ min{ε, ε²}/3).

(See e.g. (Motwani and Raghavan, 1995, Ch. 4.1).) This is enough to analyze the performance guarantee of the algorithm. It is slightly complicated, but not too bad:

Claim  With probability at least 1/2, the maximum load induced by x exceeds the optimum by at most an additive error of

    max{ 3 ln(2m), √(3 ln(2m) ∆∗) },

where m = |W|.

Proof sketch: As observed previously, for any particular u ∈ W, the load on u is a sum of independent {0, 1} random variables with expectation bounded by ∆∗. Let ε be just large enough so that exp(−∆∗ min{ε, ε²}/3) = 1/(2m). By the Chernoff-type bound above, the probability that the load on u exceeds (1 + ε)∆∗ is then less than 1/(2m). Thus, by the naive union bound⁴, the probability that the maximum load on any u ∈ W is more than ∆∗(1 + ε) = ∆∗ + ε∆∗ is less than 1/2. The choice of ε makes ε∆∗ equal the expression in the statement of the claim. □

⁴ The probability that any of several events happens is at most the sum of the probabilities of the individual events.
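For completeness, here is the short calculation that the proof sketch glosses over, written out in the same notation. By the choice of ε, we have ∆∗ min{ε, ε²}/3 = ln(2m), i.e. min{ε, ε²} = 3 ln(2m)/∆∗. There are two cases:

    if ε ≥ 1:  min{ε, ε²} = ε,   so ε = 3 ln(2m)/∆∗     and ε∆∗ = 3 ln(2m);
    if ε ≤ 1:  min{ε, ε²} = ε²,  so ε = √(3 ln(2m)/∆∗)  and ε∆∗ = √(3 ln(2m) ∆∗).

The first case applies exactly when 3 ln(2m) ≥ ∆∗, which is also exactly when 3 ln(2m) ≥ √(3 ln(2m) ∆∗); hence in either case ε∆∗ = max{3 ln(2m), √(3 ln(2m) ∆∗)}, as claimed.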

Summary. This is the general randomized-rounding recipe:

1. Formulate the original NP-hard problem as an integer linear programming problem (IP).

2. Relax the program IP to obtain a linear program (LP).

3. Solve the linear program, obtaining a fractional solution.

4. Randomly round the fractional solution to obtain an approximately optimal integer solution.

5 Performance ratios and ρ-approximation

Relative (multiplicative) performance guarantees are more common than absolute (additive) performance guarantees. One reason is that many NP-hard optimization problems are rescalable: given an instance of the problem, one can construct a new, equivalent instance by scaling the objective function. For instance, the traveling salesman problem is rescalable — given an instance, multiplying the edge weights by any λ > 0 yields an equivalent problem with the objective function scaled by λ. For rescalable problems, the best one can hope for is a relative performance guarantee (Shmoys).

A ρ-approximation algorithm is an algorithm that returns a feasible solution whose objective function value is at most ρ times the minimum (or, in the case of a maximization problem, whose objective function value is at least ρ times the maximum). We say that the performance ratio of the algorithm is ρ.⁵

⁵ This terminology is the most frequently used, but one also finds alternative terminology in the literature. Confusingly, some authors have used the term 1/ρ-approximation algorithm or (1 − ρ)-approximation algorithm to refer to what we call a ρ-approximation algorithm.

6 Polynomial approximation schemes

The knapsack problem is an example of a rescalable NP-hard problem. An instance consists of a set of pairs of numbers (weight_i, profit_i), and the goal is to select a subset of pairs for which the sum of weights is at most 1 so as to maximize the sum of profits. (Which items should one put in a knapsack of capacity 1 so as to maximize profit?)

Since the knapsack problem is rescalable and NP-hard, we do not expect an approximation algorithm achieving, say, a fixed absolute error. One is therefore led to ask: what is the best performance ratio achievable by a polynomial-time approximation algorithm? In fact (assuming P ≠ NP), there is no such best performance ratio: for any given ε > 0, there is a polynomial approximation algorithm whose performance ratio is 1 + ε. The smaller the value of ε, however, the greater the running time of the corresponding approximation algorithm. Such a collection of approximation algorithms, one for each ε > 0, is called a (polynomial) approximation scheme.

Think of an approximation scheme as an algorithm that takes an additional parameter, the value of ε, in addition to the input specifying the instance of some optimization problem. The running time of this algorithm is bounded in terms of the size of the input and in terms of ε. For example, there is an approximation scheme for the knapsack problem that requires time O(n log(1/ε) + 1/ε⁴) for instances with n items. Below we sketch a much simplified version of this algorithm that requires time O(n³/ε). The algorithm works by coarsening.

The algorithm is given the pairs (weight_1, profit_1), . . . , (weight_n, profit_n), and the parameter ε. We assume without loss of generality that each weight is less than or equal to 1. Let profit_max = max_i profit_i. Let OPT denote the (unknown) optimal value. Since the item of greatest profit itself constitutes a solution, albeit not usually a very good one, we have profit_max ≤ OPT. In order to achieve a relative error of at most ε, therefore, it suffices to achieve an absolute error of at most ε·profit_max.

We transform the given instance into a coarsened instance by rounding each profit down to a multiple of K = ε·profit_max/n. In so doing, we reduce each profit by less than ε·profit_max/n. Consequently, since the optimal solution consists of no more than n items, the profit of this optimal solution is reduced by less than ε·profit_max in total. Thus the optimal value for the coarsened instance is at least OPT − ε·profit_max, which is in turn at least (1 − ε)·OPT. The corresponding solution, when measured according to the original profits, has value at least this much. Thus we need only solve the coarsened instance optimally in order to get a performance guarantee of 1 − ε.


Before addressing the solution of the coarsened instance, note that the optimal value is the sum of at most n profits, each at most profit_max. Thus OPT ≤ n²K/ε. The optimal value for the coarsened instance is therefore also at most n²K/ε.

To solve the coarsened instance optimally, we use dynamic programming. Note that for the coarsened instance, each achievable total profit can be written as i·K for some integer i ≤ n²/ε. The dynamic-programming algorithm constructs a ⌈n²/ε⌉ × (n + 1) table T[i, j] whose (i, j) entry is the minimum weight required to achieve profit i·K using a subset of the items 1 through j. The entry is infinity if there is no way to achieve that profit.

To fill in the table, the algorithm initializes the entry T[0, 0] to zero and the entries T[i, 0] for i > 0 to infinity, then executes the following step for j = 1, 2, . . . , n:

    For each i, set T[i, j] := min{ T[i, j − 1], weight_j + T[i − (profit_j/K), j − 1] },

where profit_j is the profit of item j in the rounded-down instance and the second term is considered only when i ≥ profit_j/K. A simple induction on j shows that the calculated values are correct. The optimal value for the coarsened instance is

    max{ i·K | T[i, n] ≤ 1 }.

The above calculates the optimal value for the coarsened instance; as usual in dynamic programming, a corresponding feasible solution can easily be computed if desired.
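The whole coarsen-then-dynamic-program scheme fits in a few lines of Python. The sketch below is ours; for brevity it returns only the (approximately optimal) profit value and collapses the table to a single column updated in place.

    # Knapsack approximation scheme by coarsening:
    # round profits down to multiples of K = eps * profit_max / n, then run the
    # minimum-weight-per-profit dynamic program on the rounded instance.
    def approx_knapsack(items, eps):
        """items: list of (weight, profit) with each weight <= 1; returns a profit
        that is achievable with total weight <= 1 and is >= (1 - eps) * optimum."""
        n = len(items)
        profit_max = max(p for _, p in items)
        K = eps * profit_max / n
        rounded = [int(p // K) for _, p in items]       # profit_j / K as an integer
        max_units = sum(rounded)                        # at most n^2 / eps
        INF = float("inf")
        # T[i] = minimum weight achieving rounded profit exactly i*K (items seen so far)
        T = [0.0] + [INF] * max_units
        for (w, _), r in zip(items, rounded):
            for i in range(max_units, r - 1, -1):       # downward scan: each item used once
                if T[i - r] + w < T[i]:
                    T[i] = T[i - r] + w
        best = max(i for i in range(max_units + 1) if T[i] <= 1)
        return best * K

    print(approx_knapsack([(0.4, 30), (0.5, 40), (0.3, 25), (0.2, 10)], eps=0.1))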

6.1 Other problems having polynomial approximation schemes

The running time of the knapsack approximation scheme depends polynomially on 1/ε. Such a scheme is called a fully polynomial approximation scheme. Most natural NP-complete optimization problems are strongly NP-hard, meaning essentially that the problems are NP-hard even when the numbers appearing in the input are restricted to be no larger in magnitude than the size of the input. For such a problem, we cannot expect a fully polynomial approximation scheme to exist (Garey and Johnson, 1979, §4.2). On the other hand, a variety of NP-hard problems in fixed-dimensional Euclidean space have approximation schemes. For instance, given a set of points in the plane:

Covering with Disks: Find a minimum set of area-1 disks (or squares, etc.) covering all the points (Hochbaum, 1995, §9.3.3).

Euclidean Traveling Salesman: Find a closed loop passing through each of the points and having minimum total arc length (Arora, 1996).

Euclidean Steiner Tree: Find a minimum-length set of segments connecting up all the points (Arora, 1996).

Similarly, many problems in planar graphs or graphs of fixed genus have polynomial approximation schemes (Hochbaum, 1995, §9.3.3). For instance, given a planar graph with weights assigned to its vertices:


Maximum-Weight Independent Set: Find a maximum-weight set of vertices, no two of which are adjacent.

Minimum-Weight Vertex Cover: Find a minimum-weight set of vertices such that every edge is incident to at least one of the vertices in the set.

The algorithms for the above problems use relatively sophisticated and varied coarsening techniques.

7 Constant-factor performance guarantees

We have seen that, assuming P ≠ NP, rescalable NP-hard problems do not have polynomial-time approximation algorithms with small absolute errors but may have fully polynomial approximation schemes, while strongly NP-hard problems do not have fully polynomial approximation schemes but may have polynomial approximation schemes. Further, there is a class of problems that do not have approximation schemes: for each such problem there is a constant c such that any polynomial-time approximation algorithm for the problem has relative error at least c (assuming P ≠ NP). For such a problem, the best one can hope for is an approximation algorithm with constant performance ratio.

Our example of such a problem is the vertex cover problem: given a graph G, find a minimum-size set C (a vertex cover) of vertices such that every edge in the graph is incident to some vertex in C. Here the feasible region P consists of the vertex covers in G, while the objective function is the size of the cover. Here is a simple approximation algorithm (Hochbaum, 1995):

1. Find a maximal independent set S of edges in G.

2. Let C be the vertices incident to edges in S.

(A set S of edges is independent if no two edges in S share an endpoint. The set S is maximal if no larger independent set contains S.) The reader may wish to verify that the set S can be found in linear time, and that because S is maximal, C is necessarily a cover.

What performance guarantee can we show? Since the edges in S are independent, any cover must have at least one vertex for each edge in S. Thus S is a witness proving that any cover has at least |S| vertices. On the other hand, the cover C has 2|S| vertices. Thus the cover returned by the algorithm is at most twice the size of the optimal vertex cover.
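The whole algorithm is a single pass over the edges. The Python sketch below is our own illustration; it takes the graph as a plain list of edges.

    # 2-approximate (unweighted) vertex cover: greedily build a maximal
    # independent set of edges, then take both endpoints of every chosen edge.
    def vertex_cover(edges):
        matched = set()                 # endpoints of edges already in S
        cover = set()
        for u, v in edges:
            if u not in matched and v not in matched:
                matched.update((u, v))  # edge (u, v) joins the independent set S
                cover.update((u, v))    # both endpoints go into the cover C
        return cover

    # Any cover needs one vertex per edge of S, so |cover| <= 2 * optimum.
    print(vertex_cover([(1, 2), (2, 3), (3, 4), (4, 1), (2, 4)]))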

The weighted vertex cover problem. The weighted vertex cover problem is a generalization of the vertex cover problem. An instance is specified by giving a graph G = (V, E) and, for each vertex v in the graph, a number wt(v) called its weight. The goal is to find a vertex cover minimizing the total weight of the vertices in the cover. Here is one way to represent the problem as an integer linear program:

    minimize    ∑_{v∈V} wt(v)·x(v)
    subject to  x(u) + x(v) ≥ 1    ∀{u, v} ∈ E
                x(v) ∈ {0, 1}      ∀v ∈ V.


There is one {0, 1}-variable x(v) for each vertex v representing whether v is in the cover or not, and there are constraints for the edges that model the covering requirement. The feasible region of this program corresponds to the set of vertex covers. The objective function corresponds to the total weight of the vertices in the cover. Relaxing the integrality constraints yields

    minimize    ∑_{v∈V} wt(v)·x(v)
    subject to  x(u) + x(v) ≥ 1    ∀{u, v} ∈ E
                x(v) ≥ 0           ∀v ∈ V.

This relaxed problem is called the fractional weighted vertex cover problem; feasible solutions to it are called fractional vertex covers.⁶

⁶ The reader may wonder whether additional constraints of the form x(v) ≤ 1 are necessary. In fact, assuming the vertex weights are non-negative, there is no incentive to make any x(v) larger than 1, so such constraints would be redundant.

Rounding a fractional solution to a true solution. By solving this linear program, an optimal fractional cover can be found in polynomial time. For this problem, it is possible to convert a fractional cover into an approximately optimal true cover by rounding the fractional cover in a simple way:

1. Solve the linear program to obtain an optimal fractional cover x∗.

2. Let C = { v ∈ V : x∗(v) ≥ 1/2 }.

The set C is a cover because for any edge, at least one of the endpoints must have fractional weight at least 1/2. The reader can verify that the total weight of vertices in C is at most twice the total weight of the fractional cover x∗. Since the fractional solution was an optimal solution to a relaxation of the original problem, this is a 2-approximation algorithm (Hochbaum, 1995).
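The two steps are short enough to show end to end. The Python sketch below is ours and assumes SciPy's linprog routine as the LP solver; the input format (a list of vertex weights plus an edge list) is also our own choice.

    from scipy.optimize import linprog

    # Weighted vertex cover by LP relaxation plus threshold rounding at 1/2.
    def weighted_vertex_cover(weights, edges):
        """weights: list of vertex weights (vertices are 0..n-1); edges: list of (u, v)."""
        n = len(weights)
        # One constraint x(u) + x(v) >= 1 per edge, written as -x(u) - x(v) <= -1.
        A_ub = [[-1 if i in (u, v) else 0 for i in range(n)] for u, v in edges]
        b_ub = [-1] * len(edges)
        res = linprog(c=weights, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n)
        # Round: keep every vertex whose fractional value is at least 1/2.
        return [v for v in range(n) if res.x[v] >= 0.5 - 1e-9]

    # Total weight of the result is at most twice the fractional optimum,
    # hence at most twice the weight of an optimal (integral) vertex cover.
    print(weighted_vertex_cover([1, 2, 3, 2], [(0, 1), (1, 2), (2, 3), (3, 0)]))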

For most problems, this simple kind of rounding is not sufficient. The previously discussed technique called randomized rounding is more generally useful.

Primal-dual algorithms — witnesses via duality. For the purposes of approximation, solving a linear program exactly is often unnecessary. One can often design a faster algorithm based on the witness technique, using the fact that every linear program has a well-defined notion of "witness". The witnesses for a linear program P are the feasible solutions to another related linear program called the dual of P.

Suppose our original problem is a minimization problem. Then for each point y in the feasible region of the dual problem, the value of the objective function at y is a lower bound on the optimal value of the original linear program. That is, any feasible solution to the dual problem is a possible witness — both for the original integer linear program and its relaxation. For the weighted vertex cover problem, the dual is the following:


    maximize    ∑_{e∈E} y(e)
    subject to  ∑_{e∋v} y(e) ≤ wt(v)    ∀v ∈ V
                y(e) ≥ 0                ∀e ∈ E.

A feasible solution to this linear program is called an edge packing. The constraints for the vertices are called packing constraints.

Recall the original approximation algorithm for the unweighted vertex cover problem: find a maximal independent set of edges S; let C be the vertices incident to edges in S. In the analysis, the set S was the witness.

Edge packings generalize independent sets of edges. This observation allows us to generalize the algorithm for the unweighted problem. Say an edge packing is maximal if, for every edge, one of the edge's vertices has its packing constraint met. Here is the algorithm:

1. Find a maximal edge packing y.

2. Let C be the vertices whose packing constraints are tight for y.

The reader may wish to verify that a maximal edge packing can easily be found in linear time and that the set C is a cover because y is maximal.
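One simple way to build a maximal edge packing is a single pass over the edges, raising y(e) on each edge until one of its endpoints becomes tight. The Python sketch below is our own illustration of this; the names and input format are ours.

    # Primal-dual 2-approximation for weighted vertex cover via a maximal edge packing.
    def primal_dual_vertex_cover(weights, edges):
        """weights: dict vertex -> wt(v); edges: list of (u, v) pairs."""
        slack = dict(weights)      # remaining room wt(v) - sum of y(e) over edges e containing v
        y = {}
        for u, v in edges:
            y[(u, v)] = min(slack[u], slack[v])   # raise y(e) until an endpoint is tight
            slack[u] -= y[(u, v)]
            slack[v] -= y[(u, v)]
        return {v for v in slack if slack[v] == 0}   # vertices with tight packing constraints

    print(primal_dual_vertex_cover({"a": 2, "b": 1, "c": 3}, [("a", "b"), ("b", "c"), ("a", "c")]))

Unlike the rounding algorithm above, this method never solves a linear program; the packing y it builds serves directly as the witness in the analysis that follows.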

What about the performance guarantee? Since only vertices whose packing constraints are tight are in C, and each edge has only two vertices, we have

    ∑_{v∈C} wt(v) = ∑_{v∈C} ∑_{e∋v} y(e) ≤ 2 ∑_{e∈E} y(e).

Since y is a solution to the dual, ∑_e y(e) is a lower bound on the weight of any vertex cover, fractional or otherwise. Thus, the algorithm is a 2-approximation algorithm.

Summary. This is the general primal-dual recipe:

1. Formulate the original NP-hard problem as an integer linear programming problem (IP).

2. Relax the program IP to obtain a linear program (LP).

3. Use the dual (DLP) of LP as a source of witnesses.

Beyond these general guidelines, the algorithm designer is still left with the task of figuring out how to find a good solution and witness. See (Hochbaum, 1995, Ch. 4) for an approach that works for a wide class of problems.

7.1 Other optimization problems with constant-factor approximations

Constant-factor approximation algorithms are known for problems from many areas. In this section, we describe a sampling of these problems. For each of the problems described here, there is no polynomial approximation scheme (unless P=NP); thus constant-factor approximation algorithms are the best we can hope for. For a typical problem, there will be a simple algorithm achieving a small constant factor, while there may be more involved algorithms achieving better factors. The factors known to be achievable typically come close to, but do not meet, the best lower bounds known (assuming P ≠ NP).

For the problems below, we omit discussion of the techniques used; many of the problems are solved using a relaxation of some form, and (possibly implicitly) the primal-dual recipe. Many of these problems have polynomial approximation schemes if restricted to graphs induced by points in the plane or constant-dimensional Euclidean space (see Section 6.1).

MAX-SAT: Given a propositional formula in conjunctive normal form (an "and" of "or"s of possibly negated Boolean variables), find a truth assignment to the variables that maximizes the number of clauses (groups of "or"ed variables in the formula) that are true under the assignment. A variant called MAX-3SAT restricts the formula to have three variables per clause. MAX-3SAT is a canonical example of a problem in the complexity class MAX-SNP (Hochbaum, 1995, §10.3).

MAX-CUT: Given a graph, partition the vertices of the input graph into two sets so as to maximize the number of edges with endpoints in distinct sets. For the MAX-CUT and MAX-SAT problems, the best approximation algorithms currently known rely on randomized rounding and a generalization of linear programming called semidefinite programming (Hochbaum, 1995, §11.3).

Shortest Superstring: Given a set of strings σ1, . . . , σk, find a minimum-length string containing all the σi's. This problem has applications in computational biology (Li, 1990; Blum et al., 1994).

K-Cluster: Given a graph with weighted edges and given a parameter k, partition the vertices into k clusters so as to minimize the maximum distance between any two vertices in the same cluster. For this and related problems see (Hochbaum, 1995, §9.4).

Traveling Salesman: Given a complete graph with edge weights satisfying the triangle inequality, find a minimum-length path that visits every vertex of the graph (Hochbaum, 1995, Ch. 8).

Edge and Vertex Connectivity: Given a weighted graph G = (V, E) and an integer k, find a minimum-weight edge set E′ ⊆ E such that between any pair of vertices, there are k edge-disjoint paths in the graph G′ = (V, E′). Similar algorithms handle the goal of k vertex-disjoint paths and the goal of augmenting a given graph to achieve a given connectivity (Hochbaum, 1995, Ch. 6).

Steiner Tree: Given an undirected graph with positive edge weights and a subset of the vertices called terminals, find a minimum-weight set of edges through which all the terminals (and possibly other vertices) are connected (Hochbaum, 1995, Ch. 8). The Euclidean version of the problem is "Given a set of points in Rⁿ, find a minimum-total-length union of line segments (with arbitrary endpoints) that is connected and contains all the given points."

Steiner Forest: Given a weighted graph and a collection of groups of terminals, find a minimum-weight set of edges through which every pair of terminals within each group is connected (Hochbaum, 1995, Ch. 4). The algorithm for this problem is based on a primal-dual framework that has been adapted to a wide variety of network design problems. See Section 8.1.


8 Logarithmic performance guarantees

When a constant-ratio performance guarantee is not possible, a slowly growing ratio is the next best thing. The canonical example of this is the set cover problem: Given a family of sets F over a universe U, find a minimum-cardinality set cover C — a collection of the sets that collectively contain all elements in U. In the weighted version of the problem, each set also has a weight, and the goal is to find a set cover of minimum total weight. This problem is important due to its generality. For instance, it generalizes the vertex cover problem.

Here is a simple greedy algorithm (a code sketch follows the steps):

1. Let C ← ∅.

2. Repeat until all elements are covered: add to C a set S maximizing

       (the number of elements in S not in any set in C) / wt(S).

3. Return C.
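The greedy rule translates directly into code. The Python sketch below handles the weighted version; the dictionary-based input format and the function name are our own.

    # Greedy weighted set cover: repeatedly take the set with the best ratio of
    # newly covered elements to weight.
    def greedy_set_cover(universe, sets, wt):
        """universe: iterable of elements; sets: dict name -> set of elements;
        wt: dict name -> positive weight. Returns a list of chosen set names."""
        uncovered = set(universe)
        cover = []
        while uncovered:
            best = max(sets, key=lambda s: len(sets[s] & uncovered) / wt[s])
            if not sets[best] & uncovered:
                raise ValueError("the given sets do not cover the universe")
            cover.append(best)
            uncovered -= sets[best]
        return cover

    sets = {"S1": {1, 2, 3}, "S2": {3, 4}, "S3": {4, 5, 6}, "S4": {1, 5}}
    print(greedy_set_cover(range(1, 7), sets, {"S1": 2, "S2": 1, "S3": 3, "S4": 1}))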

The algorithm has the following performance guarantee (Hochbaum, 1995, §3.2):

Theorem  The greedy algorithm for the weighted set cover problem is an H_s-approximation algorithm, where s is the maximum size of any set in F.

By definition H_s = 1 + 1/2 + 1/3 + ··· + 1/s; also, H_s ≤ 1 + ln s.

We will give a direct argument for the performance guarantee and then relate it to the general primal-dual recipe. Imagine that as the algorithm proceeds, it assigns charges to the elements as they are covered. Specifically, when a set S is added to the cover C, if there are k elements in S not previously covered, assign each such element a charge of wt(S)/k. Note that the total charge assigned over the course of the algorithm equals the weight of the final cover C.

Next we argue that the total charge assigned over the course of the algorithm is at most H_s times the weight of the optimal set cover. These two facts together prove the theorem.

Suppose we could prove that for any set T in the optimal cover C∗, the elements in T are assigned a total charge of at most wt(T)·H_s. Then we would be done, because every element is in at least one set in the optimal cover:

    ∑_{i∈U} charge(i) ≤ ∑_{T∈C∗} ∑_{i∈T} charge(i) ≤ ∑_{T∈C∗} wt(T)·H_s.

So, consider, for example, a set T = {a, b, c, d, e, f} with wt(T) = 3. For convenience, assume that the greedy algorithm covers elements in T in alphabetical order. What can we say about the charge assigned to a? Consider the iteration when a was first covered and assigned a charge. At the beginning of that iteration, T was not yet chosen and none of the 6 elements in T were yet covered. Since the greedy algorithm had the option of choosing T, whatever set it did choose resulted in a charge to a of at most wt(T)/|T| = 3/6.

What about the element b? When b was first covered, T was not yet chosen, and at least 5 elements in T remained uncovered. Consequently, the charge assigned to b was at most 3/5. Reasoning similarly, the elements c, d, e, and f were assigned charges of at most 3/4, 3/3, 3/2, and 3/1, respectively. The total charge to elements in T is at most

    3 × (1/6 + 1/5 + 1/4 + 1/3 + 1/2 + 1/1) = wt(T)·H_|T| ≤ wt(T)·H_s.

This line of reasoning easily generalizes to show that for any set T, the elements in T are assigned a total charge of at most wt(T)·H_s. □

Underlying duality. What role do duality and the primal-dual recipe play in the above analysis? A natural integer linear program for the weighted set cover problem is

    minimize    ∑_{S∈F} wt(S)·x(S)
    subject to  ∑_{S∋i} x(S) ≥ 1    ∀i ∈ U
                x(S) ∈ {0, 1}       ∀S ∈ F.

Relaxing this integer linear program yields the linear program

    minimize    ∑_{S∈F} wt(S)·x(S)
    subject to  ∑_{S∋i} x(S) ≥ 1    ∀i ∈ U
                x(S) ≥ 0            ∀S ∈ F.

A solution to this linear program is called a fractional set cover. The dual is

    maximize    ∑_{i∈U} y(i)
    subject to  ∑_{i∈S} y(i) ≤ wt(S)    ∀S ∈ F
                y(i) ≥ 0                ∀i ∈ U.

The inequalities for the sets are called packing constraints. A solution to this dual linear program is called an element packing. In fact, the "charging" scheme in the analysis is just an element packing y, where y(i) is the charge assigned to i divided by H_s. In this light, the previous analysis is simply constructing a dual solution and using it as a witness to show the performance guarantee.

8.1 Other problems having poly-logarithmic performance guarantees

Minimizing a Linear Function subject to a Submodular Constraint: This is a natural generalization of the weighted set cover problem. Rather than state the general problem, we give the following special case as an example: Given a family F of sets of n-vectors, with each set in F having a cost, find a subfamily of sets of minimum total cost whose union has rank n. A natural generalization of the greedy set cover algorithm gives a logarithmic performance guarantee (Nemhauser and Wolsey, 1988).

Vertex-Weighted Network Steiner Tree: Like the network Steiner tree problem described in Section 7.1, an instance consists of a graph and a set of terminals; in this case, however, the graph can have vertex weights in addition to edge weights. An adaptation of the greedy algorithm achieves a logarithmic performance ratio.


Network Design Problems: This is a large class of problems generalizing the Steiner forest problem (see Section 7.1). An example of a problem in this class is survivable network design: given a weighted graph G = (V, E) and a non-negative integer r_uv for each pair of vertices, find a minimum-cost set of edges E′ ⊆ E such that for every pair of vertices u and v, there are at least r_uv edge-disjoint paths connecting u and v in the graph G′ = (V, E′). A primal-dual approach, generalized from an algorithm for the Steiner forest problem, yields good performance guarantees for problems in this class. The performance guarantee depends on the particular problem; in some cases it is known to be bounded only logarithmically (Hochbaum, 1995, Ch. 4). For a commercial application of this work see (Mihail et al., 1996).

Graph Bisection: Given a graph, partition the nodes into two sets of equal size so as to minimize the number of edges with endpoints in different sets. An algorithm to find an approximately minimum-weight bisector would be remarkably useful, since it would provide the basis for a divide-and-conquer approach to many other graph optimization problems. In fact, a solution to a related but easier problem suffices.

Define a 1/3-balanced cut to be a partition of the vertices of a graph into two sets each containing at least one-third of the vertices; its weight is the total weight of edges connecting the two sets. There is an algorithm to find a 1/3-balanced cut whose weight is O(log n) times the minimum weight of a bisector. Note that this algorithm is not, strictly speaking, an approximation algorithm for any one optimization problem: the output of the algorithm is a solution to one problem while the quality of the output is measured against the optimal value for another. (We call this kind of performance guarantee a "bait-and-switch" guarantee.) Nevertheless, the algorithm is nearly as useful as a true approximation algorithm would be because in many divide-and-conquer algorithms the precise balance is not critical. One can make use of the balanced-cut algorithm to obtain approximation algorithms for many problems, including the following.

Optimal Linear Arrangement: Assign vertices of a graph to distinct integral points on the real number line so as to minimize the total length of edges.

Minimizing Time and Space for Sparse Gaussian Elimination: Given a sparse, positive-semidefinite linear system, the order in which variables are eliminated affects the time and storage space required for solving the system; choose an ordering to simultaneously minimize both the time and the storage space required.

Crossing Number: Embed a graph in the plane so as to minimize the number of edge crossings.

The approximation algorithms for the above three problems have performance guarantees that depend on the performance guarantee of the balanced-separator algorithm. It is not known whether the latter performance guarantee can be improved: there might be an algorithm for balanced separators that has a constant performance ratio.

There are several other graph-separation problems for which approximation algorithms are known, e.g. problems involving directed graphs. All these approximation algorithms for cut problems make use of linear-programming relaxation. See (Hochbaum, 1995, Ch. 5).


9 Multi-criteria problems

In many applications, there are two or more objective functions to be considered. There have been some approximation algorithms developed for such multi-criteria optimization problems (though much work remains to be done). Several problems in previous sections, such as the k-cluster problem described in Section 7.1, can be viewed as bi-criteria problems: there is a budget imposed on one resource (the number of clusters), and the algorithm is required to approximately optimize use of another resource (cluster diameter) subject to that budget constraint. Another example is scheduling unrelated parallel machines with costs: for a given budget on cost, jobs are assigned to machines in such a way that the cost of the assignment is under budget and the makespan of the schedule is nearly minimum.

Other approximation algorithms for bi-criteria problems use the bait-and-switch idea mentioned in Section 8.1. For example, there is a polynomial approximation scheme for a variant of the minimum-spanning-tree problem in which there are two unrelated costs per edge, say weight and length: given a budget L on length, the algorithm finds a spanning tree whose length is at most (1 + ε)L and whose weight is no more than the minimum weight of a spanning tree having length at most L (Ravi and Goemans, 1996).

10 Hard-to-approximate problems

For some optimization problems, worst-case performance guarantees are unlikely to be possible: it is NP-hard to approximate these problems even if one is willing to accept very poor performance guarantees. Following are some examples (Hochbaum, 1995, §10.5, 10.6).

Maximum Clique: Given a graph, find a largest set of vertices that are pairwise adjacent (see also (Hastad, 1996)).

Minimum Vertex Coloring: Given a graph, color the vertices with a minimum number of colors so that adjacent vertices receive distinct colors.

Longest Path: Given a graph, find a longest simple path.

Max Linear Satisfy: Given a set of linear equations, find a largest possible subset that is simultaneously satisfiable.

Nearest Codeword: Given a linear error-correcting code specified by a matrix, and given a vector, find the codeword closest in Hamming distance to the vector.

Nearest Lattice Vector: Given a set of vectors v1, . . . , vn and a vector v, find an integer linear combination of the vi that is nearest in Euclidean distance to v.


11 Research Issues and Summary

We have given examples of the techniques most frequently used to obtain approximation algorithms with provable performance guarantees: the use of witnesses, relaxation, and coarsening. We have categorized NP-hard optimization problems according to the performance guarantees achievable in polynomial time:

1. a small additive error,

2. a relative error of ε for any fixed positive ε,

3. a constant-factor performance guarantee,

4. a logarithmic- or polylogarithmic-factor performance guarantee,

5. no significant performance guarantee.

The ability to categorize problems in this way has been greatly aided by recent research developments in complexity theory. Novel techniques have been developed for proving the hardness of approximation of optimization problems. For many fundamental problems, we can state with considerable precision how good a performance guarantee can be achieved in polynomial time: known lower and upper bounds match or nearly match. Research towards proving matching bounds continues. In particular, for several problems for which there are logarithmic-factor performance guarantees (e.g. balanced cuts in graphs), researchers have so far not ruled out the existence of constant-factor performance guarantees.

Another challenge in research is methodological in nature. This chapter has presented methods of worst-case analysis: ways of universally bounding the error (relative or absolute) of an approximation algorithm. This theory has led to the development of many interesting and useful algorithms, and has proved useful in making distinctions between algorithms and between optimization problems. However, worst-case bounds are clearly not the whole story. Another approach is to develop algorithms tuned for a particular probability distribution of inputs, e.g. the uniform distribution. This approach is of limited usefulness because the distribution of inputs arising in a particular application rarely matches that for which the algorithm was tuned. Perhaps the most promising approach would address a hybrid of the worst-case and probabilistic models: the performance of an approximation algorithm would be defined as the probabilistic performance on a probability distribution selected by an adversary from among a large class of distributions. Blum (?) has presented an analysis of this kind in the context of graph coloring, and others (see (Hochbaum, 1995, 13.7) and (?)) have addressed similar issues in the context of on-line algorithms.

12 Defining Terms

ρ-approximation algorithm — An approximation algorithm that is guaranteed to find a solution whose value is at most (or at least, as appropriate) ρ times the optimum. The ratio ρ is the performance ratio of the algorithm.

absolute performance guarantee — An approximation algorithm with an absolute performance guarantee is guaranteed to return a feasible solution whose value differs additively from the optimal value by a bounded amount.


approximation algorithm — For solving an optimization problem. An algorithm that runs in time polynomial in the length of the input and outputs a feasible solution that is guaranteed to be nearly optimal in some well-defined sense called the performance guarantee.

coarsening — To coarsen a problem instance is to alter it, typically restricting to a less complex feasible region or objective function, so that the resulting problem can be efficiently solved, typically by dynamic programming. This is not standard terminology.

dual linear program — Every linear program has a corresponding linear program called the dual. For the linear program defined under linear program below, the dual is max_y {b · y : A^T y ≤ c and y ≥ 0}. For any solution x to the original linear program and any solution y to the dual, we have c · x ≥ (A^T y)^T x = y^T (Ax) ≥ y · b. For optimal x and y, equality holds. For a problem formulated as an integer linear program, feasible solutions to the dual of a relaxation of the program can serve as witnesses.
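
As a concrete illustration (an example we add here, not from the original chapter): take A = [[1, 1], [0, 1]], b = (3, 1), and c = (1, 2). The primal program is min {x1 + 2x2 : x1 + x2 ≥ 3, x2 ≥ 1, x ≥ 0}, with optimal value 4 at (x1, x2) = (2, 1); the dual is max {3y1 + y2 : y1 ≤ 1, y1 + y2 ≤ 2, y ≥ 0}, also with optimal value 4, at (y1, y2) = (1, 1). Any feasible dual solution, e.g. y = (1, 0) of value 3, certifies a lower bound on the primal optimum; this is the sense in which dual solutions serve as witnesses.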

feasible region — See optimization problem.

feasible solution — Any element of the feasible region of an optimization problem.

fractional solution — Typically, a solution to a relaxation of a problem.

fully polynomial approximation scheme — An approximation scheme in which the running time of Aε is bounded by a polynomial in the length of the input and 1/ε.
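
To make the definition concrete, here is a minimal sketch (ours, not from the chapter) of the textbook fully polynomial approximation scheme for the 0/1 knapsack problem: values are scaled down by K = ε·vmax/n, and a dynamic program over scaled values finds the lightest way to reach each scaled value. The function name and parameters below are our own.

    def knapsack_fptas(values, weights, capacity, eps):
        # Returns the value of a solution worth at least (1 - eps) * OPT.
        items = [(v, w) for v, w in zip(values, weights) if w <= capacity]
        if not items:
            return 0
        n = len(items)
        vmax = max(v for v, _ in items)
        if vmax == 0:
            return 0
        K = eps * vmax / n                    # scaling factor
        scaled = [int(v // K) for v, _ in items]
        V = sum(scaled)                       # at most about n^2 / eps
        INF = float("inf")
        min_wt = [0] + [INF] * V              # min_wt[j]: least weight with scaled value j
        true_val = [0] * (V + 1)              # original value of that same item set
        for (v, w), sv in zip(items, scaled):
            for j in range(V, sv - 1, -1):
                if min_wt[j - sv] + w < min_wt[j]:
                    min_wt[j] = min_wt[j - sv] + w
                    true_val[j] = true_val[j - sv] + v
        best = max(j for j in range(V + 1) if min_wt[j] <= capacity)
        return true_val[best]

The dynamic-programming table has O(n^2/ε) entries, so the running time is polynomial in both the input length and 1/ε, as the definition requires.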

integer linear program — A linear program augmented with additional constraints specifying that the variables must take on integer values. Solving such problems is NP-hard.

linear program — A problem expressible in the following form. Given an m × n real matrix A, an m-vector b, and an n-vector c, determine min_x {c · x : Ax ≥ b and x ≥ 0}, where x ranges over all n-vectors and the inequalities are interpreted component-wise (i.e. x ≥ 0 means that the entries of x are non-negative).
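
In practice such programs are solved with off-the-shelf solvers. The following sketch (ours, not from the chapter; it assumes the SciPy library) solves the same two-variable instance used under dual linear program above. Since SciPy's linprog expects "≤" constraints, Ax ≥ b is rewritten as -Ax ≤ -b, and the default variable bounds already enforce x ≥ 0.

    import numpy as np
    from scipy.optimize import linprog

    A = np.array([[1.0, 1.0],
                  [0.0, 1.0]])     # m x n constraint matrix
    b = np.array([3.0, 1.0])       # m-vector of right-hand sides
    c = np.array([1.0, 2.0])       # n-vector of objective coefficients

    # min c.x  subject to  Ax >= b, x >= 0
    res = linprog(c, A_ub=-A, b_ub=-b)
    print(res.x, res.fun)          # expected: roughly [2. 1.] and 4.0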

MAX-SNP — A complexity class consisting of problems that have constant-factor approximation algorithms, but no approximation schemes unless P=NP.

mixed integer linear program — A linear program augmented with additional constraints specifying that some of the variables must take on integer values. Solving such problems is NP-hard.

objective function — See optimization problem.

optimal solution — To an optimization problem. A feasible solution minimizing (or possibly maximizing) the value of the objective function.

optimal value — The minimum (or possibly maximum) value taken on by the objective function over the feasible region of an optimization problem.

optimization problem — An optimization problem consists of a set P, called the feasible region and usually specified implicitly, and a function f : P → R, the objective function.

performance guarantee — See approximation algorithm.

performance ratio — See ρ-approximation algorithm.

polynomial approximation scheme — A collection of algorithms {Aε : ε > 0}, where each Aε is a (1 + ε)-approximation algorithm running in time polynomial in the length of the input. There is no restriction on the dependence of the running time on ε.

randomized rounding — A technique that uses the probabilistic method to convert a solution to a relaxed problem into an approximate solution to the original problem.
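
A minimal sketch of the idea for set cover (ours, not from the chapter; the function and parameter names are hypothetical): given a fractional LP solution in which set i is chosen to "extent" x[i] in [0, 1], include each set independently with probability x[i]. Because the LP constraints force the fractional values covering any element to sum to at least 1, one round leaves a given element uncovered with probability at most 1/e; repeating about 2 ln n independent rounds covers everything with high probability, while the expected cost is about 2 ln n times the fractional cost.

    import math
    import random

    def randomized_round_set_cover(sets, costs, x, n_elements, seed=None):
        # sets[i]: the elements covered by set i; x[i]: its fractional LP value.
        rng = random.Random(seed)
        rounds = max(1, math.ceil(2 * math.log(max(n_elements, 2))))
        chosen = set()
        for _ in range(rounds):
            for i, prob in enumerate(x):
                if rng.random() < prob:
                    chosen.add(i)
        covered = set().union(*(sets[i] for i in chosen)) if chosen else set()
        cost = sum(costs[i] for i in chosen)
        # A complete algorithm would check coverage and patch any uncovered elements.
        return chosen, cost, covered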

relative performance guarantee — An approximation algorithm with a relative performance guarantee is guaranteed to return a feasible solution whose value is bounded by a multiplicative factor times the optimal value.

relaxation — A relaxation of an optimization problem with feasible region P is another optimization problem with feasible region P′ ⊃ P whose objective function is an extension of the original problem's objective function. The relaxed problem is typically easier to solve. Its optimal value provides a bound on the optimal value of the original problem.

rescalable — An optimization problem is rescalable if, given any instance of the problem and integer λ > 0, there is an easily computed second instance that is the same except that the objective function for the second instance is (element-wise) λ times the objective function of the first instance. For such problems, the best one can hope for is a multiplicative performance guarantee, not an absolute one.

semidefinite programming — A generalization of linear programming in which any subset of the variables may be constrained to form a semi-definite matrix. Used in recent results obtaining better approximation algorithms for cut, satisfiability, and coloring problems.

strongly NP-hard — A problem is strongly NP-hard if it is NP-hard even when any numbers appearing in the input are bounded by some polynomial in the length of the input.

triangle inequality — A complete weighted graph satisfies the triangle inequality if wt(u, v) ≤ wt(u, w) + wt(w, v) for all vertices u, v, and w. This will hold for any graph representing points in a metric space. Many problems involving edge-weighted graphs have better approximation algorithms if the problem is restricted to weights satisfying the triangle inequality.

witness — A structure providing an easily verified bound on the optimal value of an optimization problem. Typically used in the analysis of an approximation algorithm to prove the performance guarantee.

References

Arora, S. (1996). Polynomial time approximation scheme for Euclidean TSP and other geometric problems. In (IEEE, 1996), pages 2–11.

Blum, A., Jiang, T., Li, M., Tromp, J., and Yannakakis, M. (1994). Linear approximation of shortest superstrings. Journal of the ACM, 41(4):630–647.

Crescenzi, P. and Kann, V. (1995). A compendium of NP optimization problems. http://www.nada.kth.se/nada/theory/problemlist.html.

Garey, M. R. and Johnson, D. S. (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman and Company, New York.

Hastad, J. (1996). Clique is hard to approximate within n^{1−ε}. In (IEEE, 1996), pages 627–636.

Hochbaum, D. S., editor (1995). Approximation Algorithms for NP-hard Problems. PWS Publishing Co.

IEEE (1996). 37th Annual Symposium on Foundations of Computer Science, Burlington, Vermont.

Johnson, D. S. (1974). Approximation algorithms for combinatorial problems. Journal of Computer and System Sciences, 9:256–278.

Li, M. (1990). Towards a DNA sequencing theory (learning a string) (preliminary version). In 31st Annual Symposium on Foundations of Computer Science, volume I, pages 125–134, St. Louis, Missouri. IEEE.

Mihail, M., Shallcross, D., Dean, N., and Mostrel, M. (1996). A commercial application of survivable network design: ITP/INPLANS CCS network topology analyzer. In Proceedings of the Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, pages 279–287, Atlanta, Georgia.

Motwani, R. and Raghavan, P. (1995). Randomized Algorithms. Cambridge University Press.

Nemhauser, G. L. and Wolsey, L. A. (1988). Integer and Combinatorial Optimization. John Wiley and Sons, New York.

Ravi, R. and Goemans, M. X. (1996). The constrained minimum spanning tree problem. In Proc. 5th Scand. Worksh. Algorithm Theory, number 1097 in Lecture Notes in Computer Science, pages 66–75. Springer-Verlag.

Shmoys, D. B. Computing near-optimal solutions to combinatorial optimization problems. Series in Discrete Mathematics and Computer Science. AMS.

Further Information

For an excellent survey of the field of approximation algorithms, focusing on recent results and research issues, see the survey by David Shmoys (Shmoys). Further details on almost all of the topics in this chapter, including algorithms and hardness results, can be found in the definitive book edited by Dorit Hochbaum (Hochbaum, 1995). NP-completeness is the subject of the classic book by Michael Garey and David Johnson (Garey and Johnson, 1979). An article by Johnson anticipated many of the issues and methods subsequently developed (Johnson, 1974). Randomized rounding and other probabilistic techniques used in algorithms are the subject of an excellent text by Motwani and Raghavan (Motwani and Raghavan, 1995). As of this writing, a searchable compendium of approximation algorithms and hardness results, by Crescenzi and Kann, is available on-line (Crescenzi and Kann, 1995).
