Journal of Combinatorial Optimization, 5, 233–247, 2001
© 2001 Kluwer Academic Publishers. Manufactured in The Netherlands.

Approximation Algorithms for Bounded Facility Location Problems

PIOTR KRYSTA
ROBERTO SOLIS-OBA*
Max-Planck-Institut für Informatik, Saarbrücken, Germany

*Present address: Department of Computer Science, The University of Western Ontario, London, ON N6A 5B7, Canada.

Received December 1, 1999; Revised February 24, 2000; Accepted February 25, 2000

Abstract. The bounded $k$-median problem is to select in an undirected graph $G = (V, E)$ a set $S$ of $k$ vertices such that the distance from any vertex $v \in V$ to $S$ is at most a given bound $d$ and the average distance from vertices $V \setminus S$ to $S$ is minimized. We present randomized algorithms for several versions of this problem and we prove some inapproximability results. We also study the bounded version of the uncapacitated facility location problem and present extensions of known deterministic algorithms for the unbounded version.

Keywords: facility location, approximation algorithms, randomized algorithms, clustering

1. Introduction

The bounded $k$-median problem is to select in an undirected graph $G = (V, E)$ a set $S$ of $k$ vertices (called centers) such that the distance from any vertex $v \in V \setminus S$ to $S$ is at most a given bound $d$ and the average distance from vertices in $V \setminus S$ to $S$ is minimized. This is a natural generalization of the well-known $k$-median problem (Arora et al., 1998) in which it is desired to choose $k$ centers so as to minimize the average distance from vertices to centers. Consider for example the following situation. In some city it is desired to locate a set of $k$ fire departments so that the average distance that a fire crew must travel to get to any building in the city is minimized. Moreover, for safety reasons, it is also required that the maximum time needed to move a fire crew to any point in the city does not exceed a certain bound. Otherwise, when the fire crew finally reaches a fire site, the fire might have already destroyed everything there.

Let $G = (V, E)$ be a graph with minimum dominating set of size $k$. It is not difficult to see that when all edge lengths are equal to 1 and $d = 1$, the bounded $k$-median problem is equivalent to the minimum dominating set problem. The $k$-center problem consists in choosing $k$ centers so that the maximum distance from a vertex to its nearest center is minimized. The bounded $k$-median problem is also a generalization of the $k$-center problem.

Another related problem is the bounded uncapacitated facility location problem. Here, given a graph $G = (V, E)$ there is a set $F \subseteq V$ of possible locations for service facilities.


Each site $i \in F$ has a cost $f_i$ for opening a service facility there. The cost of servicing a request from a vertex $v \in V \setminus F$ is the distance from $v$ to its nearest facility. The placement of facilities must be such that the maximum cost of servicing a request is at most a given bound $d$. The goal is to find locations for the facilities so that the total cost of servicing the vertices $v \in V \setminus F$ plus the total cost for opening the facilities is minimized.

We can think of the above two problems as multi-criteria optimization problems (Lin and Xue, 1999; Marathe et al., 1998) in which it is desired to minimize the total cost of servicing the vertices, the maximum service cost, and (in the case of the bounded $k$-median problem) the number of centers. This is the view that we adopt in this paper since we present algorithms that might use more than the specified number of centers, or that violate the bound on the maximum service cost.

Although the bounded $k$-median problem has been studied before, all the previous work centers on exact algorithms for solving the problem in exponential time (Berman and Yang, 1991; Choi and Chaudhry, 1993; Chaudhry et al., 1995; Khumawala, 1973; Toregas et al., 1971). The $k$-median problem is a classical clustering problem with a large number of applications (see e.g., Jain and Dubes, 1981). Lin and Vitter (1992a) showed that the problem does not admit a polynomial time approximation scheme unless P = NP. They also designed an algorithm that for any value $\varepsilon > 0$ finds a solution of value $(1 + \varepsilon)$ times the optimum, but using $(1 + 1/\varepsilon)(\ln n + 1)k$ centers (Lin and Vitter, 1992a). For metric spaces Lin and Vitter (1992b) gave an algorithm that finds a solution of value $2(1 + \varepsilon)$ times the optimum using $(1 + 1/\varepsilon)k$ centers. Arora et al. (1998) designed a randomized algorithm for the problem when the vertices lie in the plane. For any value $\varepsilon > 0$ this algorithm finds with high probability a solution of value at most $1 + \varepsilon$ times the optimum in $O(n^{O(1 + 1/\varepsilon)})$ time. Very recently, Charikar et al. (1999) designed the first constant-factor approximation algorithm for the metric $k$-median problem.

The uncapacitated facility location problem is a classical problem of Operations Research (Cornuejols et al., 1990). Guha and Khuller (1998) have shown that the metric version of the problem is MAX SNP-hard, and that it cannot be approximated within a factor smaller than 1.463 of the optimum unless $\mathrm{NP} \subseteq \mathrm{DTIME}(n^{\mathrm{poly} \log n})$. Shmoys et al. (1997) presented an algorithm for the problem in metric spaces with performance ratio 3.16. Algorithms with better ratios of 2.41 and 1.74 were later given in Guha and Khuller (1998) and Chudak (1998). When the vertices of the graph lie in the plane Arora et al. (1998) designed a randomized algorithm with performance ratio $1 + \varepsilon$ and running time $O(n^{O(1 + 1/\varepsilon)})$, for any $\varepsilon > 0$. The bounded version of the uncapacitated facility location problem has also been studied by the Operations Research community and several exponential time algorithms for solving the problem exactly are known (Berman and Yang, 1991; Drezner, 1995).

The service cost of a vertex is defined as the distance from the vertex to its closest center. We prove that the bounded $k$-median problem is MAX SNP-hard even when all edge lengths are 1 and the bound on the service cost is $d = 2$. Moreover, by extending ideas of Guha and Khuller (1998) we prove that the optimum solution of the problem cannot be approximated in polynomial time within a factor smaller than 1.367 unless $\mathrm{NP} \subseteq \mathrm{DTIME}(n^{O(\log \log n)})$. Even if we allow the use of $1.2784k$ centers, we show that the problem is still inapproximable within 1.2784, unless $\mathrm{NP} \subseteq \mathrm{DTIME}(n^{O(\log \log n)})$.


We present a technique that allows us to design randomized approximation algorithms for several versions of the bounded $k$-median problem on graphs with unit edge lengths, minimum dominating set of size $k$, and $d = 1$: (i) an approximation algorithm with expected performance ratio $(4e + 6)/(4e + 1) \approx 1.4211$ that uses at most $2k$ centers and has maximum service cost $2d$; (ii) an approximation algorithm with expected performance ratio $(e + 6)/(e + 2) \approx 1.8478$ that uses at most $k$ centers, but has maximum service cost $3d$; (iii) an algorithm with expected performance ratio $(e + 5)/(e + 1) \approx 2.076$ with maximum service cost $3d$ when the vertices have weights $\{1, +\infty\}$. For the bounded $k$-median problem we also give a deterministic algorithm that produces a solution of value at most 1.5 times the optimum, that uses at most $2k$ centers, and with maximum service cost $2d$.

We also give a 1.4489-approximation algorithm for a fault tolerant version of the bounded $k$-median problem: the bounded 2-neighbor $k$-median problem. This algorithm improves on average the approximation ratio of the algorithm in Khuller et al. (1999) for the unbounded 2-neighbor $k$-center problem in the case of unit edge lengths.

For the case of arbitrary edge lengths, we extend algorithms of Chudak (1998), Shmoys et al. (1997), and of Lin and Vitter (1992b) for the $k$-median, and the capacitated and uncapacitated facility location problems. These algorithms have the same performance guarantees as the original ones, but they bound the maximum service cost of every vertex.

The rest of the paper is organized in the following way. In Section 2 we give precise definitions of the problems that we study. In Section 3 we present inapproximability results for the bounded $k$-median problem. In Section 4 we present some approximation algorithms for the bounded $k$-median problem and for the fault tolerant version of the problem when all edge lengths are equal to one. Finally, in Section 5 we describe approximation algorithms for the bounded uncapacitated facility location and for the bounded $k$-median problems with arbitrary edge lengths.

2. Preliminaries

Let $G = (V, E)$ be an undirected graph with $n$ vertices. Every edge $(i, j) \in E$ has a non-negative length $c_{ij}$. We assume that the edge lengths satisfy the triangle inequality. Without loss of generality we might assume that $G$ is a complete graph and that for every pair of vertices $i$ and $j$, $c_{ij}$ is equal to the length of a shortest path from $i$ to $j$. Given an integer value $k \le n$, and a value $d > 0$, the bounded $k$-median problem is to find a set $S$ of $k$ vertices (called centers) such that

1. for every vertex $v \in V$ the distance to its nearest center, $c(v, S)$, is at most $d$, and
2. the sum of distances from vertices in $V \setminus S$ to their nearest center, $\sum_{v \in V} c(v, S)$, is minimized.
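The two conditions above can be checked mechanically for any candidate set of centers. The following is a minimal sketch, not part of the original paper: the function names `metric_closure` and `evaluate_centers` are hypothetical, vertices are assumed to be numbered $0, \ldots, n-1$, and the shortest-path completion of the graph is computed with the Floyd-Warshall recurrence.

```python
from itertools import product

INF = float("inf")

def metric_closure(n, edges):
    """All-pairs shortest paths (Floyd-Warshall) on vertices 0..n-1.

    `edges` maps an undirected pair (i, j) to its non-negative length c_ij.
    """
    c = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for (i, j), length in edges.items():
        c[i][j] = c[j][i] = min(c[i][j], length)
    # product() varies its first component slowest, so m is the outer loop.
    for m, i, j in product(range(n), repeat=3):
        if c[i][m] + c[m][j] < c[i][j]:
            c[i][j] = c[i][m] + c[m][j]
    return c

def evaluate_centers(c, centers, d):
    """Return (feasible, total_cost) for a candidate center set S."""
    service = [min(c[v][s] for s in centers) for v in range(len(c))]
    return max(service) <= d, sum(service)

# Path 0 - 1 - 2 - 3 - 4 with unit edge lengths (illustrative data only).
path = {(i, i + 1): 1 for i in range(4)}
c = metric_closure(5, path)
print(evaluate_centers(c, {1, 3}, d=1))  # (True, 3)
print(evaluate_centers(c, {0, 4}, d=1))  # (False, 4)
```

Centers contribute zero to the sum, which is why summing over all of $V$ and summing over $V \setminus S$ give the same value.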

In the bounded uncapacitated facility location problem there is a set $F$ of vertices that can be selected as centers. Each vertex $i \in F$ has a cost $f_i$ for selecting it as a center. The problem consists in choosing a set $S$ of centers such that

1. for every vertex $v \in V$, $c(v, S) \le d$, and
2. the total cost of the solution, $\sum_{v \in V} c(v, S) + \sum_{u \in S} f_u$, is minimized.
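As an illustration of this objective, the sketch below evaluates a candidate set of open facilities. It is only an example under stated assumptions: the name `facility_location_cost`, the dictionary representation of the opening costs $f_i$, and the three-point instance are all hypothetical.

```python
def facility_location_cost(c, opening_cost, centers, d):
    """Return (feasible, total) for the bounded facility location objective.

    c            -- metric distance matrix, c[v][u]
    opening_cost -- dict mapping each candidate site in F to its cost f_i
    centers      -- chosen set S, a subset of the keys of opening_cost
    d            -- bound on the maximum service cost
    """
    service = [min(c[v][u] for u in centers) for v in range(len(c))]
    total = sum(service) + sum(opening_cost[u] for u in centers)
    return max(service) <= d, total

# Three collinear points at positions 0, 1, 2; sites 0 and 1 may be opened.
c = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
f = {0: 1.0, 1: 2.5}
print(facility_location_cost(c, f, {1}, d=1))  # (True, 4.5)
print(facility_location_cost(c, f, {0}, d=1))  # (False, 4.0)
```

The second call shows why the bound matters: opening only site 0 is cheaper in total but leaves a vertex at distance 2, so it is infeasible for $d = 1$.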

We say that a vertex $v \in V$ is serviced at distance $d'$ if the distance from $v$ to its nearest center is at most $d'$. An $(\alpha, \beta, \gamma)$-approximation algorithm for the bounded $k$-median problem is a polynomial time algorithm that finds a solution $S$ of cost at most $\alpha$ times the optimum, with maximum service cost $\beta d$, and that uses at most $\gamma k$ centers. An $(\alpha, \beta)$-approximation algorithm for the bounded uncapacitated facility location problem is defined in a similar way.

3. Inapproximability results

In this section we prove that the optimum solution of the bounded $k$-median problem cannot be approximated within a factor smaller than 1.367 in polynomial time unless $\mathrm{NP} \subseteq \mathrm{DTIME}(n^{O(\log \log n)})$. This result holds even when all edge lengths are equal to 1.

Let $X = \{x_1, \ldots, x_p\}$ be a finite set of elements and $S \subseteq X$. We say that set $S$ covers all elements $x_i \in S$. Given a family $\mathcal{S} = \{S_1, \ldots, S_\ell\}$ of sets such that $\bigcup_{i=1}^{\ell} S_i = X$, the minimum set covering problem asks for the smallest number of sets in $\mathcal{S}$ that cover $X$. Such a collection of sets is called a set cover. The following lemma follows from results of Guha and Khuller (1998).

Lemma 3.1. Let $(X, \mathcal{S})$ be an instance of the set covering problem with minimum cover of size $k$. If there is a polynomial time algorithm $A_{SC}$ that for some positive constant $\beta$ picks $\beta k$ sets of $\mathcal{S}$ covering more than $(1 - \frac{1}{e^{\beta}})|X|$ elements of $X$, then $\mathrm{NP} \subseteq \mathrm{DTIME}(|X|^{O(\log \log |X|)})$.

Let $G = (V, E)$ be an undirected graph with unit length edges. Given two vertices $u, v \in V$ we say that $u$ dominates $v$ if either $u = v$ or $u$ and $v$ are adjacent. A dominating set of $G$ is a set that dominates every vertex in $V$. Let the minimum size of a dominating set of $G$ be $k$.

Lemma 3.2. If there is a polynomial time algorithm $A_{Dom}$ that for any constants $\beta, \varepsilon > 0$, selects a set of $\beta k$ vertices that dominates $c'n$ vertices of $G$, with $c' > 1 + \varepsilon - \frac{1}{e^{\beta}}$, then $\mathrm{NP} \subseteq \mathrm{DTIME}(n^{O(\log \log n)})$.

Proof: We show that if algorithm $A_{Dom}$ exists then we can build algorithm $A_{SC}$ as described in Lemma 3.1. Let $(X, \mathcal{S})$ be an instance of the set cover problem with $X = \{x_1, \ldots, x_p\}$ and $\mathcal{S} = \{S_1, \ldots, S_\ell\}$. We build a graph $G = (V, E)$ that contains one vertex $S'_i$ for each set $S_i \in \mathcal{S}$ and $t = \lceil (\frac{1}{e^{\beta}} - \varepsilon)\ell/(\varepsilon p) \rceil$ vertices $x'_i$ for each element $x_i \in X$. There is an edge in $E$ from $S'_i$ to every copy of $x'_i$ if $x_i \in S_i$ and there is also an edge from $S'_i$ to every set $S'_j$, $i \ne j$. It is not hard to see that $G$ has a dominating set of size $k$ if and only if $X$ has a set cover of size $k$.
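The construction in this reduction can be sketched in a few lines. The code below is an illustration only, not the authors' implementation: the vertex labels `("S", i)` and `("x", a, copy)`, the function names, and the choice to keep at least one copy per element are assumptions made for the example.

```python
import math

def reduction_graph(p, sets, beta, eps):
    """Build the graph of the reduction as an adjacency dictionary.

    p    -- number of elements, X = {0, ..., p-1}
    sets -- list of the l subsets S_1, ..., S_l of X
    Vertices are ("S", i) for sets and ("x", a, copy) for element copies.
    """
    l = len(sets)
    # t = ceil((1/e^beta - eps) * l / (eps * p)); keeping at least one copy
    # per element is an assumption of this sketch.
    t = max(1, math.ceil((math.exp(-beta) - eps) * l / (eps * p)))
    adj = {("S", i): set() for i in range(l)}
    adj.update({("x", a, r): set() for a in range(p) for r in range(t)})
    for i, s in enumerate(sets):
        for j in range(l):               # the set vertices form a clique
            if i != j:
                adj[("S", i)].add(("S", j))
        for a in s:                      # S'_i is joined to every copy of x'_a
            for r in range(t):
                adj[("S", i)].add(("x", a, r))
                adj[("x", a, r)].add(("S", i))
    return adj, t

def dominates(adj, chosen):
    """True if every vertex is in `chosen` or adjacent to a member of it."""
    covered = set(chosen)
    for u in chosen:
        covered |= adj[u]
    return covered == set(adj)

sets = [{0, 1}, {1, 2}, {2, 3}]
adj, t = reduction_graph(4, sets, beta=1.0, eps=0.1)
print(dominates(adj, [("S", 0), ("S", 2)]))  # True: the two sets cover X
print(dominates(adj, [("S", 1)]))            # False: elements 0 and 3 are missed
```

In this toy instance a set of set-vertices is dominating exactly when the corresponding sets cover $X$, which is the correspondence the proof relies on.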

Assume that algorithm $A_{Dom}$ exists and that it chooses a set $U$ of $\beta k$ vertices dominating at least $c'|V|$ vertices of $G$. If $U$ contains a vertex $x'_i$, we can replace it by any vertex $S'_i$ such that $x_i \in S_i$. This change does not decrease the number of vertices dominated by $U$. Hence, we might assume that $U$ contains only vertices $S'_i$ and it dominates at least $c'(pt + \ell) - \ell$ vertices $x'_i$. This means that the corresponding subsets $S_i$ cover at least $(c'(pt + \ell) - \ell)/t = c'p - (1 - c')\ell/t$ elements from $X$. But by Lemma 3.1, $c'p - (1 - c')\ell/t \le (1 - \frac{1}{e^{\beta}})p$, and thus $c' \le 1 + \varepsilon - \frac{1}{e^{\beta}}$. □

Theorem 3.1. The bounded $k$-median problem cannot be approximated within a factor $\alpha < 1 + \frac{1}{e}$ of the optimum, unless $\mathrm{NP} \subseteq \mathrm{DTIME}(n^{O(\log \log n)})$. This is true even when all edge lengths are equal to 1 and $d = 2$.

Proof: Let $G = (V, E)$ be a graph with unit length edges and minimum dominating set of size $k$. Assume that there is an $\alpha$-approximation algorithm $A_\alpha$ for the bounded $k$-median problem that keeps all service costs no larger than $d = 2$. Run the algorithm on $G$ and let $S \subseteq V$ be the set of centers that it selects. Let $V_1 \subseteq V$ be the set of vertices at distance 1 from $S$ and let the number of vertices dominated by $S$ be $cn$, i.e. $|S \cup V_1| = cn$, where $c < 1$. Since $G$ has a dominating set of size $k$, there exists a solution to the bounded $k$-median problem of cost $n - k$. The above $\alpha$-approximation algorithm finds a solution of cost at most $\alpha(n - k)$. In this solution there are $cn - k$ vertices at distance 1 from $S$, and $n - cn$ vertices at distance 2 from $S$. Thus the cost of this solution is $cn - k + 2(n - cn) \le \alpha(n - k)$, and hence $\alpha \ge \frac{(2 - c)n - k}{n - k}$. Since by Lemma 3.2, $c \le 1 - \frac{1}{e}$, we have $(2 - c)n - k \ge (1 + \frac{1}{e})n - k \ge (1 + \frac{1}{e})(n - k)$, and so $\alpha \ge 1 + \frac{1}{e}$. □

Even if we allow an algorithm to use more than $k$ centers, it cannot approximate the value of the optimum solution to within arbitrary precision.

Corollary 3.1. If an algorithm for the bounded $k$-median problem is allowed to use up to $(1 + \varepsilon)k$ centers, where $0 \le \varepsilon \le 0.2784$, then the solution obtained by that algorithm cannot always be smaller than $1 + e^{-1-\varepsilon}$ times the optimum unless $\mathrm{NP} \subseteq \mathrm{DTIME}(n^{O(\log \log n)})$.

Proof: The proof follows the proof of Theorem 3.1. □

4. The uniform cost bounded k-median problem

Given any instance of the bounded $k$-median problem, we might assume without loss of generality that the maximum service cost $d$ is equal to the length of some edge in the graph. This is because for any set $S$ of $k$ centers, the vertex $v$ farthest from $S$ is at distance $c_{vu}$ for some vertex $u \in S$. Given a graph $G = (V, E)$, the $k$-center problem is to find the smallest distance $d'$ for which there is a set $S$ of at most $k$ vertices such that $c(v, S) \le d'$ for all vertices $v \in V$. Let $d^*_k$ be the value of the optimum solution to the $k$-center problem. The following result by Hochbaum and Shmoys (1986) shows that we should consider only those instances of the bounded $k$-median problem in which the bound $d$ is at least $2d^*_k$.

Lemma 4.1. No algorithm can approximate in polynomial time the value of the optimum solution for the $k$-center problem within a factor smaller than 2 unless P = NP.


In this section we restrict our attention to the bounded $k$-median problem in which the maximum service cost is $d = 1$, the input graph is connected, it has unit edge lengths and minimum dominating set of size $k$. We call this problem the uniform cost bounded $k$-median problem.

4.1. A simple algorithm

Hochbaum and Shmoys (1986) designed a simple 2-approximation algorithm for the $k$-center problem. The algorithm, which we call $A_{HS}$, finds the smallest edge length $c_{ij}$ for which a maximal set $S$ of centers at distance larger than $2c_{ij}$ from each other has size $|S| \le k$. Note that every vertex $v$ is at distance at most $2c_{ij}$ from its nearest center. For a proof that $c_{ij}$ is no larger than the value of the optimum solution the reader is referred to Hochbaum and Shmoys (1986).
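The description above translates into a short procedure. The following sketch is one possible reading of it, not the original authors' code: the names `maximal_spread_set` and `hochbaum_shmoys` are hypothetical, the maximal set is built greedily in vertex order (any maximal set would do), and the input is assumed to be a metric distance matrix on at least two vertices.

```python
def maximal_spread_set(c, radius):
    """Greedily pick a maximal set of vertices pairwise more than 2*radius apart."""
    centers = []
    for v in range(len(c)):
        if all(c[v][s] > 2 * radius for s in centers):
            centers.append(v)
    return centers

def hochbaum_shmoys(c, k):
    """Return (radius, centers) for the smallest edge length needing <= k centers."""
    n = len(c)
    candidates = sorted({c[i][j] for i in range(n) for j in range(i + 1, n)})
    for radius in candidates:
        centers = maximal_spread_set(c, radius)
        if len(centers) <= k:
            return radius, centers
    raise ValueError("no feasible radius; need k >= 1 and at least two vertices")

# Metric of the unit-length path 0 - 1 - 2 - 3 - 4 (illustrative data only).
c = [[abs(i - j) for j in range(5)] for i in range(5)]
print(hochbaum_shmoys(c, 2))  # (1, [0, 3])
print(hochbaum_shmoys(c, 1))  # (2, [0])
```

Because the chosen set is maximal, every vertex left out lies within twice the returned radius of some center, which is the property used in the text.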

Lemma 4.2. Algorithm $A_{HS}$ is a $((2 - \frac{k}{n-k}), 1, 1)$-approximation algorithm for the uniform cost bounded $k$-median problem.

Proof: The centers selected by the algorithm are at least at distance 3 from each other, and so every center has at least one unique non-center vertex at distance 1 from it. Hence, the value of the solution found by the algorithm is at most $k + 2(n - 2k) = 2n - 3k$, while the optimum solution has value at least $n - k$. Therefore, the performance ratio of the algorithm is $\frac{2n - 3k}{n - k} = 2 - \frac{k}{n - k}$. □

4.2. A randomized algorithm

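In this section we describe a randomized algorithm for the uniform cost bounded $k$-median problem. Let $N_j$ be the set of vertices (including $j$) at distance at most one from vertex $j$. We can describe the problem as the following integer linear program, IP.

\begin{align}
\text{Min} \quad & \sum_{i, j \in V} c_{ij} x_{ij} \tag{1} \\
\text{s.t.} \quad & \sum_{i \in N_j} x_{ij} = 1 && \forall j \in V \tag{2} \\
& x_{ij} \le y_i && \forall j \in V,\ \forall i \in N_j \tag{3} \\
& \sum_{i \in V} y_i \le k \tag{4} \\
& x_{ij}, y_i \in \{0, 1\} && \forall i, j \in V \tag{5}
\end{align}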

The meaning of the variables is as follows: $y_i = 1$ if and only if vertex $i$ is chosen as a center, and $x_{ij} = 1$ if and only if vertex $i$ is a center, $i \in N_j$, and $i$ is the closest center to $j$. Let LP be the linear program obtained by relaxing constraint (5) to $0 \le x_{ij}, y_i \le 1$.
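To make the role of the variables concrete, the sketch below builds the 0/1 vectors that correspond to a given set of centers and checks constraints (2) to (4). It is an illustration under assumptions: the function names are hypothetical, `x[i][j]` stores $x_{ij}$, and no LP solver is involved.

```python
def integral_solution(c, centers):
    """Build the 0/1 vectors (x, y) of IP that correspond to a center set."""
    n = len(c)
    y = [1 if i in centers else 0 for i in range(n)]
    x = [[0] * n for _ in range(n)]          # x[i][j] = 1 if center i serves j
    for j in range(n):
        nearest = min(centers, key=lambda i: c[i][j])
        x[nearest][j] = 1
    return x, y

def satisfies_ip(c, x, y, k, d=1):
    """Check constraints (2)-(4); N_j is the set of vertices within distance d of j."""
    n = len(c)
    for j in range(n):
        N_j = [i for i in range(n) if c[i][j] <= d]
        if sum(x[i][j] for i in N_j) != 1:       # constraint (2)
            return False
        if any(x[i][j] > y[i] for i in N_j):     # constraint (3)
            return False
    return sum(y) <= k                           # constraint (4)

def objective(c, x):
    """Value of objective (1)."""
    n = len(c)
    return sum(c[i][j] * x[i][j] for i in range(n) for j in range(n))

# Metric of the unit-length path 0 - 1 - 2 - 3 - 4 (illustrative data only).
c = [[abs(i - j) for j in range(5)] for i in range(5)]
x, y = integral_solution(c, {1, 3})
print(satisfies_ip(c, x, y, k=2), objective(c, x))   # True 3
x_bad, y_bad = integral_solution(c, {0, 4})
print(satisfies_ip(c, x_bad, y_bad, k=2))            # False
```

The second center set fails constraint (2) because vertex 2 has no center inside $N_2$, which is how the program encodes the bound $d = 1$.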

Let $\lambda \in [0, 1)$ be a fixed constant whose value will be specified later. Our algorithm is as follows. If $k > \frac{\lambda}{1 + \lambda} n$ we use algorithm $A_{HS}$ to select a set of centers. Otherwise, we solve LP and then round the solution using the following rounding procedure based on ideas from Shmoys et al. (1997) and Chudak (1998). Let $(x, y)$ be an optimal solution of LP. Without loss of generality (see Chudak, 1998) we may assume that $(x, y)$ is complete, i.e., if $x_{ij} > 0$, then $x_{ij} = y_i$, for every $i, j \in V$. Our algorithm chooses independently each vertex $i$ as a center with probability $y_i$. Let $S_1 = \{i \mid y_i > 0 \text{ and } i \text{ is selected as a center}\}$. This set of centers might not induce a feasible solution for the problem since some vertex might be far away from its closest center. To ensure that the maximum service cost is bounded we run algorithm $A_{HS}$ on the input graph. Let $S_2$ be the set of centers that it chooses. Note that in this solution all vertices are serviced at distance at most $2d$.
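The rounding step itself is short once a fractional solution is available. The sketch below assumes the vector $y$ has already been obtained from some LP solver and that the second-phase centers come from a separate $k$-center routine; the function name and its parameters are hypothetical and chosen only for this example.

```python
import random

def round_fractional_solution(y, backup_centers, rng):
    """Two-phase rounding: independent coin flips, then a safety net.

    y              -- fractional opening values y_i of a complete LP solution
    backup_centers -- centers returned by a k-center routine such as A_HS
    rng            -- a random.Random instance (passed in for reproducibility)
    Returns (S_1, S_1 union S_2).
    """
    # Vertex i joins S_1 with probability y_i, independently of the others.
    s1 = {i for i, y_i in enumerate(y) if y_i > 0 and rng.random() < y_i}
    s2 = set(backup_centers)
    return s1, s1 | s2

rng = random.Random(2001)
# Average size of S_1 over many trials, to compare with sum(y) = 2.
y = [0.5, 0.5, 0.0, 0.5, 0.5]
sizes = [len(round_fractional_solution(y, [0, 3], rng)[0]) for _ in range(2000)]
print(sum(sizes) / len(sizes))
```

Since the coin flips are independent, the expected size of $S_1$ equals $\sum_i y_i$, so the printed average should be close to 2.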

For every vertex $j$, let $C_j = \sum_{i \in V} c_{ij} x_{ij}$ denote the fractional cost of servicing it. Let $E(\mathrm{cost}(j))$ be the expected cost of servicing $j$ in the rounded solution.

Lemma 4.3. If $k \le \frac{\lambda}{1 + \lambda} n$, then for each $j \in V$, $C_j + x_{jj} = 1$, and $E(\mathrm{cost}(j)) \le C_j + q(C_j + x_{jj})$, where $q \le \frac{1}{e}$.

Proof: We show first that $C_j + x_{jj} = 1$. Note that for any vertex $i \ne j$, if $x_{ij} > 0$ then $c_{ij} = d = 1$. This follows from constraint (2) of LP and from the objective function of LP. Hence, $C_j + x_{jj} = \sum_{i \in V} c_{ij} x_{ij} + x_{jj} = \sum_{i \in V : x_{ij} > 0} x_{ij}$, since $c_{ij} = 1$ for all $x_{ij} > 0$, and $c_{jj} = 0$. This last expression is equal to 1 because of constraint (2) of LP.

Next, we show that $E(\mathrm{cost}(j)) \le C_j + q$ for some value $q \le \frac{1}{e}$. For simplicity let the neighbors of vertex $j$ be $N(j) = \{1, 2, \ldots, g, j\}$. Then, $c_{1j} = \cdots = c_{gj} = 1$, and $c_{jj} = 0$. The probability that no vertex from $N(j)$ is selected as a center is $q = (1 - y_j) \prod_{i=1}^{g} (1 - y_i)$; this is the probability that vertex $j$ is serviced at distance at most 2 by one of the centers selected in the second phase of the algorithm. The probability that $j$ is chosen as center is $y_j$, and the probability that at least one neighbor of $j$ is a center is $1 - \frac{q}{1 - y_j}$. Hence, the expected service cost of vertex $j$ is
$$E(\mathrm{cost}(j)) \le 0 \cdot y_j + 1 \cdot (1 - y_j)\left(1 - \frac{q}{1 - y_j}\right) + 2q = 1 - y_j - q + 2q = C_j + x_{jj} - y_j + q = C_j + q,$$
because the solution of the linear program is complete, and so $x_{jj} = y_j$.

Since $1 - x \le e^{-x}$ for all $x > 0$, and $y_i = x_{ij}$ for all $i \in N(j)$, then
$$q = (1 - y_j) \cdot \prod_{i=1}^{g} (1 - x_{ij}) \le \prod_{i \in \{1, \ldots, g, j\}} e^{-x_{ij}} = e^{\sum (-x_{ij})} = \frac{1}{e},$$
where the last equality follows from constraint (2) of LP. □
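The bound $q \le 1/e$ can be spot-checked numerically. The snippet below is only a sanity check on randomly generated values that sum to 1, not a proof; the function name `miss_probability` and the sampled inputs are made up for this illustration.

```python
import math
import random

def miss_probability(y_values):
    """Probability that none of the independent trials with success rates y succeeds."""
    q = 1.0
    for y_i in y_values:
        q *= 1.0 - y_i
    return q

rng = random.Random(0)
for _ in range(5):
    raw = [rng.random() for _ in range(rng.randint(1, 8))]
    y = [r / sum(raw) for r in raw]      # normalise so that the values sum to 1
    assert miss_probability(y) <= math.exp(-1) + 1e-12

print(miss_probability([0.25] * 4))      # 0.31640625, below 1/e
```

Spreading the unit of probability mass evenly over more neighbors pushes the miss probability up toward $1/e$ without exceeding it.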

Theorem 4.1. There is a randomized algorithm for the uniform cost bounded $k$-median problem, that produces a solution of expected value no larger than $(4e + 6)/(4e + 1)$ times the value of the optimum solution. This solution uses, with high probability, at most $2k$ centers, and services each vertex at distance at most 2.

Proof: We first argue for the bound on the number of centers. In the first phase the algorithm chooses independently each $i \in V$ as a center with probability $y_i$, and so by constraint (4) of LP, $E(|S_1|) = \sum_{i \in V} y_i \le k$. Note that this selection process constitutes independent Poisson trials. Using Chernoff bounds we can prove that the probability $\Pr[|S_1| > k]$ is at most $\left[ e / (1 + \frac{1}{k+1})^{k+2} \right]^{\frac{k}{k+1}}$ (see e.g., Theorem 4.1 in Motwani and Raghavan, 1995). Since $(1 + \frac{1}{k+1})^{k+2} > e(1 + \frac{1}{k+1})$, then $\Pr[|S_1| > k] < (1 - \frac{1}{k+2})^{\frac{k}{k+1}}$. If we repeat our algorithm $[(k + 1)(k + 2) \ln n]/k$ times then this probability is at most $1/n$. In the second phase of our algorithm $A_{HS}$ selects at most $k$ centers.


We now prove the bound on the approximation ratio. We consider two cases. If $k \le \frac{\lambda}{1 + \lambda} n$ then by Lemma 4.3, the expected cost of the solution is $E(\mathrm{cost}) \le \sum_{j \in V} [C_j + q(C_j + x_{jj})]$. Let $L = \sum_{j \in V} C_j$ be the value of an optimal solution of LP, then
$$E(\mathrm{cost}) \le (1 + q)L + q \sum_{j \in V} x_{jj} = (1 + q + \lambda q)L + q \sum_{j \in V} (x_{jj} - \lambda C_j).$$

Let $r = \sum_{j \in V} (x_{jj} - \lambda C_j)$. By Lemma 4.3, $C_j = 1 - x_{jj}$, so $r = \sum_{j \in V} ((1 + \lambda)x_{jj} - \lambda)$. From constraint (3), $x_{jj} \le y_j$ for each vertex $j$, thus $r \le \sum_{j \in V} ((1 + \lambda)y_j - \lambda)$. Now by constraint (4), $r \le (1 + \lambda)k - \lambda n$. Since $k \le \frac{\lambda}{1 + \lambda} n$, then $r \le 0$. Thus, $E(\mathrm{cost}) \le (1 + q + \lambda q)L$, and the expected performance ratio of the algorithm is $r_2 = 1 + q + \lambda q \le 1 + (1 + \lambda)\frac{1}{e}$.

On the other hand, if $k > \frac{\lambda}{1 + \lambda} n$ we use algorithm $A_{HS}$. The value of the solution that the algorithm finds is at most $2n - 6k$, and so the performance ratio of the algorithm in this case is $r_1 = (2n - 6k)/(n - k) < 2 - 4\lambda$. By choosing $\lambda$ so that $r_1 = r_2$, the performance ratio of the overall algorithm is $(4e + 6)/(4e + 1) \approx 1.421$. □
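The final balancing step can be written out explicitly. The short derivation below is a worked check added for the reader, using the two upper bounds $r_2 \le 1 + (1 + \lambda)/e$ and $r_1 < 2 - 4\lambda$ from the proof and equating them:

```latex
\begin{align*}
1 + \frac{1 + \lambda}{e} = 2 - 4\lambda
\;&\Longrightarrow\; \lambda \Bigl( 4 + \frac{1}{e} \Bigr) = 1 - \frac{1}{e}
\;\Longrightarrow\; \lambda = \frac{e - 1}{4e + 1} \approx 0.1447, \\
2 - 4\lambda \;&=\; 2 - \frac{4(e - 1)}{4e + 1} \;=\; \frac{4e + 6}{4e + 1} \;\approx\; 1.4211 .
\end{align*}
```

The resulting value of $\lambda$ lies in $[0, 1)$, as required by the algorithm.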

4.3. A greedy algorithm

Now we describe a simple greedy 1.5-approximation algorithm for the uniform cost bounded $k$-median problem that uses at most $2k$ centers. Let $S$ be a set of vertices and $N(S)$ be the neighbors of $S$. The algorithm has two phases, and in each one of them it selects $k$ centers. The first phase is as follows.

    S ← ∅
    while |S| < k do
        Add to S the vertex with largest number of neighbors not in N(S).
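A direct transcription of this first phase might look as follows. This is a sketch under assumptions rather than the authors' code: the graph is given as a dictionary of neighbor sets, ties are broken by dictionary order, and the name `greedy_first_phase` and the small example graph are hypothetical.

```python
def greedy_first_phase(adj, k):
    """Pick k vertices, each time maximising the number of newly covered neighbors.

    adj -- dict mapping each vertex to the set of its neighbors
    Returns (S, N(S)).
    """
    chosen, covered = [], set()              # covered plays the role of N(S)
    for _ in range(k):
        best = max((v for v in adj if v not in chosen),
                   key=lambda v: len(adj[v] - covered))
        chosen.append(best)
        covered |= adj[best]
    return chosen, covered

# Two stars joined by the edge 0 - 4 (illustrative data only).
adj = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0, 5, 6}, 5: {4}, 6: {4}}
S, N_S = greedy_first_phase(adj, 2)
print(S, sorted(N_S))  # [0, 4] [0, 1, 2, 3, 4, 5, 6]
```

On this instance the set $\{0, 4\}$ is itself a dominating set of size 2, and the greedy choice recovers it.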

Lemma 4.4. Let $G = (V, E)$ be a graph with a dominating set $S^*$ of size $k$. The above algorithm finds a set $S$ such that $|N(S)| \ge (n - k)/2$.

Proof: Partition the vertices $V \setminus S^*$ into $k$ disjoint groups $N_j$, $j \in S^*$, such that every vertex in $N_j$ is adjacent to vertex $j$. Index the groups in non-increasing order of size. Let $S_i$, $i = 0, \ldots, k$ be the set formed by the $i$ first vertices chosen by the algorithm. We show by induction on $i$ that $|N(S_i)| \ge \frac{1}{2} |\bigcup_{j \le i} N_j|$ for all $i$.

The claim trivially holds for $i = 1$. Assume that it also holds for all $\ell < i$. If $|N(S_{i-1}) \cap N_j| \ge \frac{1}{2}|N_j|$ for all $j = 1, \ldots, i$, then clearly the claim follows. Otherwise, there must be a vertex $v \in S^*$, $v \le i$, such that $|N(S_{i-1}) \cap N_v| \le \frac{1}{2}|N_v|$. Hence, by induction hypothesis, in the $i$-th iteration the algorithm selects a vertex $u$ with $|N(\{u\}) \cup N(S_{i-1})| \ge \frac{1}{2} |\bigcup_{\ell < i} N_\ell| + \frac{1}{2}|N_v|$. □

In the second phase of the algorithm we use $A_{HS}$ to select another set $S'$ of $k$ centers. The algorithm outputs the centers in $S \cup S'$.

Theorem 4.2. The above algorithm is a $(1.5, 2, 2)$-approximation algorithm for the uniform cost bounded $k$-median problem.


Proof: Observe that every vertex $v \in V \setminus S$ is at distance at most 2 from its nearest center. Moreover at least $(n - k)/2$ vertices are neighbors of centers, and so the value of the solution is at most $\frac{3(n - k)}{2}$. Since the optimum solution has value at least $n - k$ the claim follows. □

4.4. A randomized algorithm with maximum service cost d = 3

In this section we present a $(\frac{e + 6}{e + 2}, 3, 1)$-approximation algorithm for the uniform cost bounded $k$-median problem. This algorithm uses the clustering technique of Shmoys et al. (1997). The idea is to ensure for each vertex that there is always some center within distance 3 from it. The algorithm is as follows. If $k > \frac{\lambda}{\lambda + 1} n$ then use algorithm $A_{HS}$. Otherwise, find an optimal complete solution $(x, y)$ of LP. For each vertex $j \in V$, let $N(j) = \{i \in V \mid x_{ij} > 0\}$. Partition the set of vertices in the following manner. Choose any vertex $i \in V$ and create a cluster $Q_i$ that includes all vertices $j \in V$ such that $N(i) \cap N(j) \ne \emptyset$. Remove the vertices in $Q_i$ from $V$, and repeat this process until $V$ is empty. Let $C$ be the set of vertices that serve as indices for the clusters $Q_i$. Note that for every pair of vertices $i, j \in C$, $i \ne j$, $N(i) \cap N(j) = \emptyset$.
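The partitioning step can be sketched as follows. This is an illustrative reading of the clustering rule, with assumptions: the supports $N(j)$ are supplied directly as a dictionary (no LP is solved here), "choose any vertex" is implemented as "take the first remaining vertex", and the name `build_clusters` and the example data are hypothetical.

```python
def build_clusters(support):
    """Partition the vertices into clusters Q_i whose indices have disjoint N(i).

    support -- dict mapping each vertex j to N(j) = {i : x_ij > 0}; every N(j)
               is assumed non-empty, as constraint (2) of LP guarantees.
    Returns a dict mapping each cluster index i to the list of vertices in Q_i.
    """
    remaining = list(support)
    clusters = {}
    while remaining:
        i = remaining[0]                       # "choose any vertex": take the first
        members = [j for j in remaining if support[i] & support[j]]
        clusters[i] = members
        remaining = [j for j in remaining if j not in members]
    return clusters

# Hypothetical supports of a fractional solution on six vertices.
support = {0: {0, 1}, 1: {1, 2}, 2: {2, 3}, 3: {3, 4}, 4: {4, 5}, 5: {5}}
clusters = build_clusters(support)
print(clusters)  # {0: [0, 1], 2: [2, 3], 4: [4, 5]}
```

Here the cluster indices are $C = \{0, 2, 4\}$, and their supports $\{0, 1\}$, $\{2, 3\}$, $\{4, 5\}$ are pairwise disjoint, matching the remark at the end of the paragraph.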

Now we divide the vertices into two groups: $A = \bigcup_{j \in C} N(j)$ and $B = V \setminus A$. By constraint (2) of LP, $\sum_{i \in N(j)} x_{ij} = 1$. For every cluster $Q_j$ the algorithm randomly chooses a vertex in $N(j)$ as a center, selecting vertex $i \in N(j)$ with probability $x_{ij}$. Since the solution $(x, y)$ is complete then $x_{ij} = y_i$ for all $i \in N(j)$, and thus every vertex $i \in N(j)$ is chosen as a center with probability $y_i$. Next, the algorithm chooses independently every vertex $i \in B$ as a center with probability $y_i$. By linearity of expectation and by constraint (4) of LP the expected number of centers that this algorithm selects is $k$. Note that the probability that this algorithm chooses more than $k$ centers is no larger than the probability that the algorithm of Section 4.2 chooses more than $k$ centers during the first phase. Hence, using Chernoff bounds one can show that the probability that the algorithm chooses more than $k$ centers is at most $[e(1 + 1/(k + 1))^{-k-2}]^{k/(k+1)}$. Hence, by performing $[(k + 1)(k + 2) \ln n]/k$ independent runs of the algorithm, the probability of choosing more than $k$ centers is at most $1/n$.

Consider a vertex $j \in Q_i$. If some vertex $u \in N(j)$ is selected by the algorithm as a center, then vertex $j$ is serviced at distance 1. But if none of the vertices in $N(j)$ is a center then $j$ is serviced at distance at most 3. To see why, let $i' \in Q_i$ be the center that the algorithm chooses in cluster $Q_i$. Since $i'$ is a neighbor of $i$, and $j$ is at distance at most 2 from $i$, the claim follows. Let $B_j$ be a random variable denoting the distance from $j$ to the center chosen in cluster $Q_i$. Let $D_j$ denote the event that none of the vertices in $N(j)$ are centers.

Lemma 4.5. For each vertex $v$, $E(\mathrm{cost}(v)) \le C_v + 2q$, where $q \le \frac{1}{e}$.

Proof: For every vertex $j \in C$ let $N_{vj} = N(j) \cap N(v)$, and let $E_j$ be the event that one vertex in $N_{vj}$ is selected by the algorithm as center. Note that for any two vertices $i, j \in C$, $i \ne j$, $N_{vj} \cap N_{vi} = \emptyset$, and thus events $E_j$ and $E_i$ are independent. The probability of event $E_j$, $j \in C$ is $\sum_{i \in N_{vj}} y_i = \sum_{i \in N_{vj}} x_{iv}$.


For notational simplicity, for every vertex $i \in N(v)$ such that $i \notin N(j)$ for all $j \in C$, let $E_i$ be the event that vertex $i$ is selected as center. The probability of $E_i$ is $y_i = x_{iv}$. Because of the way in which the centers are selected, all above events $E_i$ are independent, and thus by linearity of expectation $\sum_i \Pr[E_i] = \sum_{i \in N(v)} x_{iv} = 1$.

The probability that none of the vertices in $N(v)$ is a center is
$$q = \prod_i (1 - \Pr[E_i]) \le \prod_i e^{-\Pr[E_i]} = e^{-\sum_i \Pr[E_i]} = \frac{1}{e}.$$
Therefore, the expected cost of servicing vertex $v$ is at most $0 \cdot \Pr[E_v] + 1 \cdot (1 - \Pr[E_v])\left(1 - \frac{q}{1 - \Pr[E_v]}\right) + 3q$ because if no vertex from $N(v)$ is center then there is some center at distance at most 3 from $v$. Simplifying we get $E(\mathrm{cost}(v)) \le 1 - y_v - q + 3q = C_v + 2q$ by Lemma 4.3. □

Theorem 4.3. The above algorithm for the uniform cost bounded $k$-median problem finds a solution of expected value no larger than $(e + 6)/(e + 2)$ times the optimum in which every vertex is serviced within distance 3. The number of centers in this solution is with high probability at most $k$.

Proof: If $k \le \frac{\lambda}{1 + \lambda} n$ then by Lemmas 4.5 and 4.3 the expected cost of the solution is $E(\mathrm{cost}) \le \sum_{j \in V} (C_j + 2q) = \sum_{j \in V} (C_j + 2q(C_j + x_{jj}))$. Let $L = \sum_{j \in V} C_j$ be the value of an optimum solution of linear program LP. Then,
$$E(\mathrm{cost}) \le (1 + 2q)L + 2q \sum_{j \in V} x_{jj} = (1 + 2q + 2\lambda q)L + 2q \sum_{j \in V} (x_{jj} - \lambda C_j) \le (1 + 2q + 2\lambda q)L,$$
because by the proof of Lemma 4.3, $\sum_{j \in V} (x_{jj} - \lambda C_j) \le 0$. Hence, the expected performance ratio of the algorithm in this case is $r_1 \le 1 + (1 + \lambda)\frac{2}{e}$.

If $k > \frac{\lambda}{1 + \lambda} n$ then algorithm $A_{HS}$ is used. This algorithm has performance ratio $r_2 \le 2 - \frac{k}{n - k} < 2 - \lambda$. Choosing $\lambda = \frac{e - 2}{e + 2}$ the expected performance ratio of the algorithm is $\frac{e + 6}{e + 2} \approx 1.848$. □
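The choice of $\lambda$ in this proof can be verified numerically. The helper below is a small check added for illustration, not part of the paper: `balance` is a hypothetical name, and it simply equates a bound of the form $1 + s(1 + \lambda)$ with a bound of the form $2 - h\lambda$.

```python
import math

def balance(slope_lp, slope_hs):
    """Solve 1 + slope_lp * (1 + lam) = 2 - slope_hs * lam for lam.

    Returns (lam, ratio), where ratio is the common value of both sides.
    """
    lam = (1 - slope_lp) / (slope_lp + slope_hs)
    return lam, 2 - slope_hs * lam

e = math.e
# Bounds of this proof: 1 + 2(1 + lam)/e for the LP case, 2 - lam for A_HS.
lam, ratio = balance(2 / e, 1)
print(round(lam, 4), round(ratio, 4))
assert abs(lam - (e - 2) / (e + 2)) < 1e-12
assert abs(ratio - (e + 6) / (e + 2)) < 1e-12
```

Both assertions confirm that equating the two case bounds yields exactly the value of $\lambda$ and the ratio stated in the proof.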

4.5. Weights on the vertices

Consider a graph $G = (V, E)$ with unit length edges and weights $w_i$ on the vertices. The weights can be either 1 or $\infty$. The weighted bounded $k$-median problem is to find a set $S$ of vertices such that $\sum_{i \in S} w_i \le k$, $d(v, S) \le d$ for each $v \in V$, and $\sum_{v \in V} d(v, S)$ is minimized. This problem is interesting because it allows us to exclude some vertices as possible centers. Here we study the case when $d = 3$. Our results improve on average the 3-approximation algorithm of Hochbaum and Shmoys (1986) for the weighted $k$-center problem in the special case of weights $\{1, +\infty\}$ on the vertices.

Let $k$ be the minimum weight $\sum_{i \in S} w_i$ of a dominating set $S$ of $G$. In this section, we extend the techniques of Section 4.4 to design a randomized $(2.076, 3, 1)$-approximation algorithm for the weighted bounded $k$-median problem. Let $A^W_{HS}$ be the 3-approximation algorithm of Hochbaum and Shmoys (1986) for the weighted $k$-center problem. This algorithm first chooses a set of centers just like in algorithm $A_{HS}$, and then it modifies the solution by replacing every center with its smallest weight neighbor.

Lemma 4.6. $A^W_{HS}$ is a $(3 - \frac{k}{n-k}, 3, 1)$-approximation algorithm for the weighted bounded $k$-median problem.


Proof: It is not difficult to see that every vertex is at distance at most 3 from its nearest center and that every center can be assigned at least one unique vertex at distance one from it. Then, the solution obtained by the algorithm has value at most $3(n - 2k) + k = 3n - 5k$, while the optimum solution has value at least $n - k$. Therefore, the performance ratio of the algorithm is at most $\frac{3n - 5k}{n - k} = 3 - \frac{2k}{n-k} \le 3 - \frac{k}{n-k}$. □

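We can describe the weighted bounded $k$-median problem as an integer linear program, that we call IP2. This is the same as integer program IP, but with constraint (4) replaced by $\sum_{i \in V} w_i y_i \le k$. For convenience we write IP2 here.

$$
\begin{array}{llll}
\text{Min} & \sum_{i,j \in V} c_{ij} x_{ij} & & \\
\text{s.t.} & \sum_{i \in N_j} x_{ij} = 1 & \forall j \in V & \\
& x_{ij} \le y_i & \forall j \in V,\ \forall i \in N_j & (6) \\
& \sum_{i \in V} w_i y_i \le k & & (7) \\
& x_{ij}, y_i \in \{0, 1\} & \forall i, j \in V & (8)
\end{array}
$$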

Let LP2 be the linear program relaxation of IP2 obtained by relaxing constraint (8). The algorithm for the weighted bounded $k$-median problem is like the algorithm of Section 4.4. If $k > \frac{\lambda}{\lambda+1} n$, for some value $\lambda$ to be specified later, then use algorithm $A^W_{HS}$. Otherwise, solve LP2 and find a complete solution $(x, y)$. Then, we define clusters like before, but this time we create clusters only around vertices $i$ with $y_i > 0$. Notice that $y_i = 0$ if $w_i = +\infty$.

Using similar arguments as those in Section 4.4 we can show that every vertex is serviced at distance at most 3 and that the expected total weight of the solution is at most $\sum_{i \in V} w_i y_i \le k$. Lemma 4.5 also holds for the weighted $k$-center problem. If $k \le \frac{\lambda}{\lambda+1} n$, then the expected cost of the solution is

$$E(\mathrm{cost}) \le \sum_{j \in V} \bigl(C_j + 2q(C_j + x_{jj})\bigr) = (1 + 2q + 2\lambda q)L_2 + 2q \sum_{j \in V} (x_{jj} - \lambda C_j),$$

where $L_2$ is the value of an optimal solution of LP2, and $\lambda \in [0, 1]$ is some constant to be defined later. By Lemma 4.3 $C_j = 1 - x_{jj}$, so

$$E(\mathrm{cost}) \le (1 + 2(1 + \lambda)q)L_2 + 2q \sum_{j \in V} \bigl((1 + \lambda)x_{jj} - \lambda\bigr).$$

Since the weight of every vertex is at least one, then $\sum_{j \in V} x_{jj} \le \sum_{j \in V} y_j \le \sum_{j \in V} w_j y_j \le k$, by constraints (6) and (7). Thus,

$$E(\mathrm{cost}) \le (1 + 2(1 + \lambda)q)L_2 + 2q\bigl((1 + \lambda)k - \lambda n\bigr) \le (1 + 2(1 + \lambda)q)L_2,$$

because $k \le \frac{\lambda}{\lambda+1} n$. By Lemma 4.3, $q \le \frac{1}{e}$, so the performance ratio of the algorithm in this case is $r_1 \le 1 + 2(1 + \lambda)\frac{1}{e}$.

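If $k > \frac{\lambda}{\lambda+1} n$, then $\frac{k}{n-k} > \lambda$, and the bound $\frac{3n - 5k}{n - k} = 3 - \frac{2k}{n-k}$ from the proof of Lemma 4.6 shows that the performance ratio of the algorithm is $r_2 < 3 - 2\lambda$.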

By choosing $\lambda = \frac{e-1}{e+1}$ the performance ratio of the algorithm is $\frac{e+5}{e+1} \approx 2.076$.
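The value of $\lambda$ can be recovered by equating the two bounds $1 + 2(1+\lambda)/e$ and $3 - 2\lambda$. The short derivation below is a worked check of that arithmetic and adds nothing beyond the bounds already stated.

```latex
\begin{align*}
1 + \frac{2(1+\lambda)}{e} = 3 - 2\lambda
  &\iff 1 + \lambda = e(1 - \lambda)
   \iff \lambda = \frac{e-1}{e+1}, \\
3 - 2\lambda
  &= \frac{3(e+1) - 2(e-1)}{e+1}
   = \frac{e+5}{e+1} \approx 2.076 .
\end{align*}
```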

Theorem 4.4. There is a randomized algorithm for the weighted bounded $k$-median problem, when the vertex weights are either 1 or $\infty$, that finds a solution of expected value at most $(e+5)/(e+1)$ times the optimum. The total cost of the centers in this solution is with high probability at most $k$. In this solution each vertex is serviced at distance at most 3. Moreover, for each vertex $v$, the probability that $v$ is serviced at distance 3 is at most $\frac{1}{e}$.

4.6. Fault tolerant problem

The bounded $p$-neighbor $k$-median problem is the following: given a value $d$, find a set $S$ of $k$ centers such that for each vertex $v \in V \setminus S$ there are at least $p$ centers in $S$ within distance $d$ of $v$, and $\sum_{v \in V \setminus S} d(v, S)$ is minimized, where $d(v, S)$ is the distance from $v$ to its closest center. We can also handle the case of $d(v, S)$ being a sum of the distances from $v$ to its $p$ closest centers. In this section, we are interested in the case when all edge lengths are 1 and $d = 1$.

The $p$-neighbor $k$-center problem consists in finding the smallest distance $d$ for which there is a set $S$ of $k$ vertices such that every vertex $v \in V \setminus S$ is at distance at most $d$ from $p$ vertices in $S$. Khuller et al. (1999) designed a 2-approximation algorithm for the problem, and when $p = 2$ it is not difficult to modify the solution that this algorithm finds so that the number of vertices within distance $d$ from their closest centers is at least $k/2$. The idea is that if the set $S_d$ of vertices within distance $d$ from $S$ has size smaller than $k/2$ then some of these vertices can be exchanged with vertices in $S$ to get a new solution in which the set $S_d$ is larger. Since the algorithm involves analyzing several cases and its description is slightly complicated, we omit it. We call this algorithm $A_N$.

The bounded $p$-neighbor $k$-median problem can be modeled as the integer program IP of Section 4.2, with the constraint (2) replaced by $\sum_{i \in N_j} x_{ij} = p$, $\forall j \in V$. If $k \le \frac{1-q}{3.5} n$, where $q$ is as defined in Lemma 4.3, then we solve the linear program relaxation LP of this integer program to obtain a complete solution, and then we round the solution as in Section 4.2 but using algorithm $A_N$ instead of algorithm $A_{HS}$. If $k > \frac{1-q}{3.5} n$ we use algorithm $A_N$ to choose the centers.

Using arguments similar to those of Section 4.2 we can show that when $k \le \frac{1-q}{3.5} n$, every vertex $j$ has expected service cost $E(\mathrm{cost}(j)) \le 1 - y_j + q$, where $q \le \frac{1}{e}$. The service cost of a vertex is the distance to its closest center. We note that the objective function of the linear program does not reflect this definition of service cost, which is very difficult to write as a linear function. However, this linear program can be used to get a good approximation for the solution of the problem, as we show now. The number of centers chosen by the algorithm is with high probability at most $2k$. Proceeding as in the proof of Theorem 4.1, we can show that $E(\mathrm{cost}) = \sum_{j \in V} E(\mathrm{cost}(j)) \le \sum_{j \in V} (1 - y_j + q) = (1 + q)n - k$. Therefore, the performance ratio of the algorithm in this case is at most $1 + \frac{qn}{n-k}$.

If $k > \frac{1-q}{3.5} n$ then we use algorithm $A_N$ to find a solution for the problem, but we allow $A_N$ to choose $2k$ centers. The cost of this solution is $0 \cdot 2k + \frac{k}{2} + 2(n - 2.5k) = 2n - 4.5k$. Therefore the performance ratio of the algorithm in this case is at most $\frac{2n - 4.5k}{n-k} = 2 - \frac{2.5k}{n-k}$. The overall performance ratio of the algorithm is approximately 1.4489.
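The threshold $\frac{1-q}{3.5} n$ is the value of $k$ at which the two bounds coincide. The following sketch evaluates both bounds as functions of $t = k/n$ with $q = 1/e$; the function names and the parametrization by $t$ are assumptions introduced here for illustration.

```python
import math


def ratio_rounding(t, q):
    """1 + q*n/(n - k), written in terms of t = k/n."""
    return 1.0 + q / (1.0 - t)


def ratio_an(t):
    """2 - 2.5*k/(n - k), written in terms of t = k/n."""
    return 2.0 - 2.5 * t / (1.0 - t)


q = 1.0 / math.e
t_star = (1.0 - q) / 3.5  # threshold on k/n used by the algorithm
overall = ratio_rounding(t_star, q)
```

The first bound grows with $t$ and the second shrinks, so the worst case over both branches is attained at the threshold, where both evaluate to roughly 1.449.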

Theorem 4.5. There is a randomized algorithm for the uniform cost bounded 2-neighbor $k$-median problem that finds a solution of expected value at most 1.4489 times the optimum. This solution has, with high probability, at most $2k$ centers, and every vertex is at distance at most 2 from 2 of the centers.


5. Facility location problems with arbitrary edge lengths

In this section, we consider the bounded uncapacitated facility location problem and the bounded $k$-median problem with arbitrary edge lengths. Given a graph $G = (V, E)$, let $f_i$ be the cost of selecting vertex $i$ as a center. The bounded uncapacitated facility location problem can be stated as the following integer program, that we call IP3.

$$
\begin{array}{lll}
\text{Min} & \sum_{i,j \in V} c_{ij} x_{ij} + \sum_{i \in V} f_i y_i & \\
\text{s.t.} & \sum_{i \in N_d(j)} x_{ij} = 1 & \forall j \in V \\
& x_{ij} \le y_i & \forall i, j \in V \\
& x_{ij}, y_i \in \{0, 1\} & \forall i, j \in V
\end{array}
\qquad (9)
$$

where $N_d(j) = \{i \in V \mid c_{ij} \le d\}$ and $c_{ij}$ is the distance from $i$ to $j$. We solve the linear programming relaxation of IP3, and when $f_i = 1$ for all vertices $i$, we can use ideas from Chudak (1998) and Guha and Khuller (1998) to round this solution to get an algorithm with the following performance ratio.

Theorem 5.1. If $f_i = 1$ for every vertex $i \in V$, then there is a deterministic $(3, 2)$-approximation algorithm for the bounded uncapacitated facility location problem.

Proof: We first solve LP3 optimally. We can assume that the solution is complete, i.e. $x_{ij} > 0 \Rightarrow x_{ij} = y_i$. Let $\alpha \in [0, 1]$ be a fixed constant to be specified later. Let $N(j) = \{i \in V \mid x_{ij} > 0\}$ and let $N_\alpha(j)$ be the smallest set of vertices $i \in N(j)$ that are closest to $j$ and such that their $x_{ij}$ values add up to at least $\alpha$. For a given vertex $j \in V$, let $i_0 \in N_\alpha(j)$ be the vertex farthest from $j$, and let $c_j(\alpha) = c_{i_0 j}$.

After solving the linear program we round the solution as follows. Find a vertex $j$ with the smallest $c_j(\alpha)$ value and select it as center. All vertices in the set $J_j = \{i \mid N_\alpha(j) \cap N_\alpha(i) \ne \emptyset\}$ are serviced by vertex $j$. Remove $j$ and all the vertices in $J_j$ from the graph and repeat the process until all vertices have been deleted.

Notice that by the triangle inequality the service cost of each vertex $i$ is at most $2c_i(\alpha)$. Since $\sum_{i \in N_\alpha(j)} y_i \ge \sum_{i \in N_\alpha(j)} x_{ij} \ge \alpha$ and the centers $j$ have disjoint sets $N_\alpha(j)$, the selection of these centers increases the total cost of the centers by at most a factor of $\frac{1}{\alpha}$. So the overall cost of the solution is at most

$$\frac{1}{\alpha} \sum_{i \in V} y_i + 2 \sum_{j \in V} c_j(\alpha).$$

Observe that for every vertex $j$,

$$\sum_{i \in V} c_{ij} x_{ij} \ge \sum_{i \in V :\, c_{ij} \le c_j(\alpha)} c_{ij} x_{ij} + \sum_{i \in V :\, c_{ij} > c_j(\alpha)} c_j(\alpha) x_{ij} \ge (1 - \alpha) c_j(\alpha).$$

Hence, the value of the solution is at most $\max\{\frac{1}{\alpha}, \frac{2}{1-\alpha}\} \bigl(\sum_{i \in V} y_i + \sum_{i,j \in V} c_{ij} x_{ij}\bigr)$. By choosing $\frac{1}{\alpha} = \frac{2}{1-\alpha}$, that is $\alpha = \frac{1}{3}$, we obtain a 3-approximation algorithm. Since $c_j(\alpha) \le d$ for every vertex $j$, the cost of servicing any vertex $j$ is at most $2c_j(\alpha) \le 2d$. □
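The rounding step of this proof can be sketched in a few lines. The code below is a hedged illustration under stated assumptions, not the authors' implementation: it takes a fractional assignment `x[i][j]` (the fraction of vertex `j` served by `i`) as given rather than solving LP3, breaks ties by vertex index, and uses a hypothetical 4-vertex path instance with a hand-picked fractional solution.

```python
def alpha_neighborhood(j, c, x, alpha, eps=1e-9):
    """Return (N_alpha(j), c_j(alpha)) for vertex j.

    N_alpha(j) is the shortest prefix of j's fractional servers, sorted
    by distance, whose x-values add up to at least alpha.
    """
    servers = sorted((i for i in range(len(c)) if x[i][j] > 0),
                     key=lambda i: c[i][j])
    chosen, mass = [], 0.0
    for i in servers:
        chosen.append(i)
        mass += x[i][j]
        if mass >= alpha - eps:
            break
    return set(chosen), c[chosen[-1]][j]


def round_solution(c, x, alpha):
    """Greedy clustering: repeatedly open the vertex with smallest c_j(alpha)."""
    n = len(c)
    hoods, radius = {}, {}
    for j in range(n):
        hoods[j], radius[j] = alpha_neighborhood(j, c, x, alpha)
    remaining = set(range(n))
    centers, served_by = [], {}
    while remaining:
        j = min(remaining, key=lambda v: (radius[v], v))
        centers.append(j)
        # J_j: remaining vertices whose alpha-neighborhood meets N_alpha(j)
        cluster = {i for i in remaining if hoods[i] & hoods[j]}
        for i in cluster:
            served_by[i] = j
        remaining -= cluster
    return centers, served_by, radius


# Hypothetical instance: path 0-1-2-3 with distances |i - j|.
c = [[abs(i - j) for j in range(4)] for i in range(4)]
x = [[0.5, 0.5, 0.0, 0.0],
     [0.5, 0.5, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.5],
     [0.0, 0.0, 0.5, 0.5]]
centers, served_by, radius = round_solution(c, x, alpha=0.6)
```

With $\alpha = 0.6$ in this example, every $N_\alpha(j)$ contains two vertices, and each vertex ends up served within $2c_i(\alpha)$, as the triangle inequality argument predicts. The proof itself uses $\alpha = 1/3$; a larger value is chosen here only so that the clusters are not trivial.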

If we relax the constraint on the maximum distance from a vertex to its closest center, we can use ideas of Section 4.4 to get an algorithm for the bounded uncapacitated facility location problem with arbitrary costs on the centers. The idea is to first build the dual for the linear program relaxation of IP3 and solve it optimally. Let $v_j$ be a dual variable corresponding to the primal constraint (9). We solve the linear program relaxation of IP3 and perform the clustering step of Section 4.4 but we create clusters around vertices $i$ with smallest value $v_i + C_i$. The rest of the algorithm is as in Section 4.4.

Theorem 5.2. (Chudak) There is a deterministic $(1.736, 3)$-approximation algorithm for the bounded uncapacitated facility location problem.

We can formulate the bounded $k$-median problem with arbitrary edge lengths as the linear program LP1 with $N_j$ replaced with $N_d(j)$. To design an approximation algorithm for this general problem we first solve the linear program, and then round the solution like we did in the proof of Theorem 5.1. The analysis of this algorithm is very similar to that of Theorem 5.1, hence we omit it. This algorithm finds a solution of value no more than $2/(1 - \alpha)$ times the optimum and it uses at most $(1/\alpha)k$ centers. Choosing $\alpha = \frac{1}{1+\varepsilon}$ for some value $\varepsilon > 0$ we obtain the following result.

Theorem 5.3. There is a deterministic $(2(1 + \frac{1}{\varepsilon}), 2, 1 + \varepsilon)$-approximation algorithm for the metric bounded $k$-median problem for any value $\varepsilon > 0$.
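The parameters of Theorem 5.3 follow by substituting $\alpha = \frac{1}{1+\varepsilon}$ into the bounds $2/(1-\alpha)$ and $(1/\alpha)k$. The derivation below is a worked check of that substitution only.

```latex
\begin{align*}
1 - \alpha &= 1 - \frac{1}{1+\varepsilon} = \frac{\varepsilon}{1+\varepsilon}, \\
\frac{2}{1-\alpha} &= \frac{2(1+\varepsilon)}{\varepsilon}
                    = 2\left(1 + \frac{1}{\varepsilon}\right), \\
\frac{1}{\alpha}\,k &= (1+\varepsilon)\,k .
\end{align*}
```

For instance, $\varepsilon = 1$ gives a solution of value at most 4 times the optimum using at most $2k$ centers.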

6. Conclusions

We have presented randomized and deterministic algorithms for the bounded $k$-median and the bounded facility location problems. These problems are natural generalizations of the classical $k$-median and the facility location problems. We view these problems as multi-criteria optimization problems where the objective functions are: (1) minimize the total service cost, (2) minimize the maximum service cost, and (3) minimize the number of centers. Our results are summarized in the following table.

| Problem | Randomized algorithm | Deterministic algorithm |
|---|---|---|
| Uniform bounded $k$-median | $(1.4211, 2, 2)$, $(1.8478, 3, 1)$ | $(2 - \frac{k}{n-k}, 1, 1)$, $(1.5, 2, 2)$ |
| Bounded $k$-median with $\{1, \infty\}$ weights | $(2.076, 3, 1)$ | $(3 - \frac{k}{n-k}, 3, 1)$ |
| Fault tolerant bounded $k$-median | $(1.4489, 2, 2)$ | |
| Metric bounded $k$-median | | $(2(1 + \frac{1}{\varepsilon}), 2, 1 + \varepsilon)$ |
| Bounded facility location with $f_i = 1$ | | $(3, 2)$ |
| Metric bounded facility location | | $(1.736, 3)$ |

Acknowledgment

First author was supported by Deutsche Forschungsgemeinschaft (DFG) as a member of the Graduiertenkolleg Informatik, University of Saarland, Germany.


References

S. Arora, P. Raghavan, and S. Rao, “Approximation schemes for Euclidean k-medians and related problems,” Proceedings of the 30th Annual ACM Symposium on Theory of Computing, 1998, pp. 106–113.

O. Berman and E.K. Yang, “Medi-center location problems,” Journal of the Operational Research Society, vol. 42, pp. 313–322, 1991.

M. Charikar, S. Guha, É. Tardos, and D.B. Shmoys, “A constant-factor approximation algorithm for the k-median problem,” in Proceedings of the 31st ACM Symposium on Theory of Computing, 1999.

S. Chaudhuri, N. Garg, and R. Ravi, “The p-neighbor k-center problem,” Information Processing Letters, vol. 65, pp. 131–134, 1998.

S.S. Chaudhry, I.C. Choi, and D.K. Smith, “Facility location with and without maximum distance constraints through the p-median problem,” International Journal of Operations and Production Management, vol. 15, pp. 75–81, 1995.

I.C. Choi and S.S. Chaudhry, “The p-median problem with maximum distance constraints: A direct approach,” Location Science, vol. 1, pp. 235–243, 1993.

F. Chudak, “Improved approximation algorithms for uncapacitated facility location,” in Integer Programming and Combinatorial Optimization, R.E. Bixby, E.A. Boyd, and R.Z. Rios-Mercado (Eds.), Lecture Notes in Computer Science, vol. 1412, 1998, pp. 180–194.

G. Cornuejols, G.L. Nemhauser, and L.A. Wolsey, “The uncapacitated facility location problem,” in Discrete Location Theory, P.B. Mirchandani and R.L. Francis (Eds.), Wiley: New York, 1990, pp. 119–171.

Z. Drezner (Ed.), Facility Location. A Survey of Applications and Methods, Springer-Verlag: New York, 1995.

S. Guha and S. Khuller, “Greedy strikes back: Improved facility location algorithms,” in Proceedings of the 9th Annual ACM-SIAM Symposium on Discrete Algorithms, 1998, pp. 649–657.

D.S. Hochbaum and D.B. Shmoys, “A unified approach to approximation algorithms for bottleneck problems,” Journal of the ACM, vol. 33, pp. 533–550, 1986.

A.K. Jain and R.C. Dubes, Algorithms for Clustering Data, Prentice Hall: Englewood, NJ, 1981.

S. Khuller, R. Pless, and Y.J. Sussmann, “Fault tolerant k-center problems,” Theoretical Computer Science, vol. 242, pp. 237–245, 2000.

B.M. Khumawala, “An efficient algorithm for the p-median problem with maximum distance constraint,” Geographical Analysis, vol. 5, pp. 309–321, 1973.

G. Lin and G. Xue, “Balancing shortest-path trees and Steiner minimum trees in the rectilinear plane,” in Proceedings of the 1999 IEEE International Symposium on Circuits and Systems (ISCAS’99), pp. 117–120, vol. VI, 1999.

J.H. Lin and J.S. Vitter, “ε-Approximations with minimum packing constraint violation,” in Proceedings 24th ACM Symposium on Theory of Computing, 1992a, pp. 771–782.

J.H. Lin and J.S. Vitter, “Approximation algorithms for geometric median problems,” Information Processing Letters, vol. 44, pp. 245–249, 1992b.

M.V. Marathe, R. Ravi, R. Sundaram, S.S. Ravi, D.J. Rosenkrantz, and H.B. Hunt III, “Bicriteria network design problems,” Journal of Algorithms, vol. 28, pp. 142–171, 1998.

R. Motwani and P. Raghavan, Randomized Algorithms, Cambridge University Press, New York, 1995.

D.B. Shmoys, É. Tardos, and K. Aardal, “Approximation algorithms for facility location problems,” in Proceedings of the 29th ACM Symposium on Theory of Computing, 1997, pp. 265–274.

C. Toregas, R. Swain, C. ReVelle, and L. Bergman, “The location of emergency service facilities,” Operations Research, vol. 19, pp. 1363–1373, 1971.