12
Discrete Applied Mathematics 143 (2004) 31 – 42 www.elsevier.com/locate/dam Approximation algorithms for the optimal p-source communication spanning tree Bang Ye Wu Department of Computer Science and Information Engineering, Shu-Te University, YanChau, Kaohsiung, Taiwan 824, ROC Received 14 August 2002; received in revised form 29 September 2003; accepted 8 October 2003 Abstract The computational complexity and the approximation algorithms of the optimal p-source communication spanning tree (p-OCT) problem were investigated. Let G be an undirected graph with nonnegative edge lengths. Given p vertices as sources and all vertices as destinations, and also given arbitrary requirements between sources and destinations, we investigated the problem how to construct a spanning tree of G such that the total communication cost from sources to destinations is minimum, where the communication cost from a source to a destination is the path length multiplied by their requirement. For any xed integer p ¿ 2, we showed that the problem is NP-hard even for metric graphs. For metric graphs of n vertices, we show a 2-approximation algorithm with time complexity O(n p1 ). For general graphs, we present a 3-approximation algorithm for the case of two sources. c 2003 Elsevier B.V. All rights reserved. Keywords: Approximation algorithms; Network design; Spanning trees 1. Introduction Consider the following optimum communication spanning tree (OCT) problem formulated by Hu [6]. Let G =(V; E; w) be an undirected graph with nonnegative edge length function w. We are also given requirement (u; v) for each pair u, v, of vertices. For any spanning tree T of G, the communication cost between two vertices is dened to be the requirement multiplied by the path length of the two vertices on T , and the communication cost of T is the total communication cost summed over all pairs of vertices. Our goal is to construct a spanning tree with minimum communication cost. That is, we want to nd a spanning tree T such that u;vV (u; v)dT (u; v) is minimized, where dT (u; v) is the distance between u and v on T . The requirements in the OCT problem are arbitrary nonnegative numbers. By restricting the requirements, several special cases of the problem have been studied. (u; v) = 1 for each u; v V . The problem is called the minimum routing cost spanning tree (MRCT) problem (also called the shortest total path length spanning tree problem), and is NP-hard [5,7]. The rst constant ratio approximation algorithm for the MRCT problem consists in constructing a shortest path tree and showing that it is a 2-approximation [9]. The approximation ratio was improved to ( 4 3 +j) for any xed j ¿ 0[13], and then further improved to a polynomial time approximation scheme (PTAS) [10]. An exact algorithm for the problem was also studied [4]. Another related result concerns how to construct a subgraph with small routing cost and small size [14]. (u; v)= r (u)r (v) for each u; v V , where r (v) is a given nonnegative vertex weight for each vertex v. This ver- sion of the problem is called the optimal product-requirement communication spanning tree (PROCT) problem. A 1.577-approximation algorithm for the PROCT problem was developed [11], and then improved to a PTAS [12]. E-mail address: [email protected] (B.Y. Wu). 0166-218X/$ - see front matter c 2003 Elsevier B.V. All rights reserved. doi:10.1016/j.dam.2003.10.002

Approximation algorithms for the optimal p-source communication spanning tree

Embed Size (px)

Citation preview

Page 1: Approximation algorithms for the optimal p-source communication spanning tree

Discrete Applied Mathematics 143 (2004) 31–42www.elsevier.com/locate/dam

Approximation algorithms for the optimal p-sourcecommunication spanning tree

Bang Ye WuDepartment of Computer Science and Information Engineering, Shu-Te University, YanChau, Kaohsiung, Taiwan 824, ROC

Received 14 August 2002; received in revised form 29 September 2003; accepted 8 October 2003

Abstract

The computational complexity and the approximation algorithms of the optimal p-source communication spanning tree(p-OCT) problem were investigated. Let G be an undirected graph with nonnegative edge lengths. Given p verticesas sources and all vertices as destinations, and also given arbitrary requirements between sources and destinations, weinvestigated the problem how to construct a spanning tree of G such that the total communication cost from sources todestinations is minimum, where the communication cost from a source to a destination is the path length multiplied bytheir requirement. For any 5xed integer p¿ 2, we showed that the problem is NP-hard even for metric graphs. For metricgraphs of n vertices, we show a 2-approximation algorithm with time complexity O(np−1). For general graphs, we presenta 3-approximation algorithm for the case of two sources.c© 2003 Elsevier B.V. All rights reserved.

Keywords: Approximation algorithms; Network design; Spanning trees

1. Introduction

Consider the following optimum communication spanning tree (OCT) problem formulated by Hu [6]. Let G=(V; E; w)be an undirected graph with nonnegative edge length function w. We are also given requirement �(u; v) for each pair u, v,of vertices. For any spanning tree T of G, the communication cost between two vertices is de5ned to be the requirementmultiplied by the path length of the two vertices on T , and the communication cost of T is the total communication costsummed over all pairs of vertices. Our goal is to construct a spanning tree with minimum communication cost. That is,we want to 5nd a spanning tree T such that

∑u; v∈V �(u; v)dT (u; v) is minimized, where dT (u; v) is the distance between

u and v on T .The requirements in the OCT problem are arbitrary nonnegative numbers. By restricting the requirements, several special

cases of the problem have been studied.

• �(u; v) = 1 for each u; v∈V . The problem is called the minimum routing cost spanning tree (MRCT) problem (alsocalled the shortest total path length spanning tree problem), and is NP-hard [5,7]. The 5rst constant ratio approximationalgorithm for the MRCT problem consists in constructing a shortest path tree and showing that it is a 2-approximation[9]. The approximation ratio was improved to ( 4

3 +j) for any 5xed j¿ 0 [13], and then further improved to a polynomialtime approximation scheme (PTAS) [10]. An exact algorithm for the problem was also studied [4]. Another relatedresult concerns how to construct a subgraph with small routing cost and small size [14].

• �(u; v) = r(u)r(v) for each u; v∈V , where r(v) is a given nonnegative vertex weight for each vertex v. This ver-sion of the problem is called the optimal product-requirement communication spanning tree (PROCT) problem. A1.577-approximation algorithm for the PROCT problem was developed [11], and then improved to a PTAS [12].

E-mail address: [email protected] (B.Y. Wu).

0166-218X/$ - see front matter c© 2003 Elsevier B.V. All rights reserved.doi:10.1016/j.dam.2003.10.002

Page 2: Approximation algorithms for the optimal p-source communication spanning tree

32 B.Y. Wu /Discrete Applied Mathematics 143 (2004) 31–42

mor

e ge

nera

l

p-source MRCT, arbitrary p

2-source MRCT

Optimum Communication spanning Tree

SROCTPROCT

MRCT p-source MRCT, fixed p

2-source OCT

p-source OCT, fixed p

Fig. 1. The relationships between OCT problems.

• �(u; v)=r(u)+r(v) for each u; v∈V , where r(v) is a given nonnegative vertex weight for each vertex v. This version ofthe problem is called the optimal sum-requirement communication spanning tree (SROCT) problem. A 2-approximationalgorithm for the SROCT problem was shown [11].

In the p-source MRCT (p-MRCT) problem, we are given p vertices as sources and all vertices (including the sources)as destinations. While the all-to-all distance is considered in the MRCT problem, the goal of the p-MRCT problem is tominimize the total distance from sources to destinations. The p-MRCT problem is a special case of the SROCT problem,in which the vertex weight of each source is one and the weights of all the other vertices are zeros. If there is only onesource, it is always possible to 5nd a spanning tree, called shortest-path tree, such that the path between the source andeach vertex is a shortest path on the given graph. Therefore both the 1-MRCT and 1-OCT problems are polynomial-timesolvable. However, the 2-MRCT problem was shown to be NP-hard even for metric graphs, and a PTAS for the problemwas also proposed [15]. A metric graph is a complete graph with edge weights satisfying the triangle inequality.

In this paper, we investigate the optimal p-source communication spanning tree (p-OCT) problem. Let G = (V; E; w)be the input graph. Given a set S ⊂ V of p vertices as sources and all vertices including the sources as destinations, thep-OCT of G is a spanning tree T of G such that the total communication cost, de5ned by

∑s∈S

∑v �(s; v)dT (s; v), is

minimum. The p-OCT problem is a special case of the OCT problem, and in the meantime it is also a generalizationof the p-MRCT problem. The previous result of the NP-hardness of the 2-MRCT problem only implies the NP-hardnessof the 2-OCT problem, but it is not suJcient to show the computational complexity of the p-MRCT problem for other5xed p. It was pointed out that there is a simple reduction to show the NP-hardness of the p-MRCT problem for anyeven integer p, but the complexity for any odd integer p is left open. In this paper, not surprisingly, we show theNP-hardness of the p-MRCT problem even for metric graphs and for any 5xed p¿ 2. Then, the p-OCT problem is alsoshown to be NP-hard by a straightforward reduction. The proof generalizes the previous result and the reduction is moresimple.

To approximate the p-OCT, we 5rst focused on the case of metric graphs. For metric graphs, we begin with a simple2-approximation algorithm for the 2-OCT and then generalize the algorithm to the case of p sources. For any 5xed integerp¿ 2, the algorithm 5nds a 2-approximation of the p-OCT of a metric graph in O(np−1) time. For general graphs, wepresent a 3-approximation algorithm for the 2-OCT.

The relationship of the diKerent versions of the OCT problems is illustrated in Fig. 1, and the currently best approxi-mation ratios are summarized in Table 1.

The remaining sections are organized as follows: In Section 2, some de5nitions and notations are given. The compu-tational complexity is shown in Section 3. The algorithm for the p-OCT of metric graphs is presented in Section 4, andthe approximation algorithms for the 2-OCT of general graphs is shown in Section 5. Finally concluding remarks are inSection 6.

2. Preliminaries

By G = (V; E; w), we denote a graph G with vertex set V , edge set E, and edge length (or edge weight) function w.In this paper, we consider only connected undirected graphs with nonnegative edge lengths, and an edge between verticesu and v is denoted by (u; v). A metric graph is a complete undirected graph and the edge lengths satisfy the triangleinequality. For any graph G, V (G) denotes its vertex set and E(G) denotes its edge set. Let w be the edge length function

Page 3: Approximation algorithms for the optimal p-source communication spanning tree

B.Y. Wu /Discrete Applied Mathematics 143 (2004) 31–42 33

Table 1The restrictions and currently best ratios of the OCT problems

Problem Restriction Ratio Reference

OCT Nonnegative O(log n log log n) [10]PROCT �(u; v) = r(u)r(v) PTAS [12]SROCT �(u; v) = r(u) + r(v) 2 [11]MRCT �(u; v) = 1 PTAS [10]2-MRCT �(u; v) = r(u) + r(v) PTAS [15]

r(s1) = r(s2) = 1r(v) = 0 for v �∈ {s1; s2}

p-OCT �(u; v) = 0 for u; v �∈ S 2 (metric graphs) This paper|S| = p is a constant

2-OCT �(u; v) = 0 for u; v �∈ {s1; s2} 3 (general graphs) This paper

of a graph G. For a subgraph H of G, we de5ne w(H)=w(E(H))=∑

e∈E(H) w(e). We shall also use n to denote |V (G)|when there is no ambiguity.

De�nition 1. Let G = (V; E; w) be a graph. For u; v∈V , SPG(u; v) denotes a shortest path between u and v on G. Theshortest path length is denoted by dG(u; v) = w(SPG(u; v)).

De�nition 2. Let H be a subgraph of G. For a vertex v∈V (G), we use dG(v; H) to denote the shortest distance from vto H , i.e., dG(v; H) = minu∈V (H) dG(v; u). The de5nition also includes the case in which H contains no edge.

Let G = (V; E; w) be a graph and m∈V . The shortest path tree rooted at m is a spanning tree T such that dT (m; v) =dG(m; v) for any v∈V . The shortest path tree problem has been well studied and eJcient algorithms for various familiesof graphs have been developed [2]. For example, Dijkstra’s algorithm [3] 5nds a shortest path tree for a graph withnonnegative weights in O(|V |2) time, and the time complexity can be improved to O(|E|+ |V |log|V |) by using Fibonacciheaps. A recent result for undirected graphs with positive integer weights is an O(|E|) algorithm [8]. Let M ⊂ V . Aforest F is a shortest path forest with roots M if dF (v;M) = dG(v;M) for any v∈V , i.e., each vertex is connected tothe closest root by a shortest path. A shortest path forest can be constructed by an algorithm similar to the shortest pathtree algorithm. First we create a dummy node and the multiple roots are connected to the dummy node by edges of zeroweight. Then a shortest path tree rooted at the dummy node is constructed, and the shortest path forest is obtained byremoving the dummy node and dummy edges. The time complexity is the same as the shortest path tree algorithm.

We now de5ne the communication cost.

De�nition 3. Let T be a spanning tree of a graph G and S = {s1; s2; : : : ; sp} ⊂ V (T ) be the set of given sources. For anyvertex v∈V (T ), the communication cost of v on T is de5ned by cT (v) =

∑pi=1 ri(v)dT (v; si), where ri(v) is the given

nonnegative requirement between si and v. The communication cost of T is de5ned by c(T ) =∑

v∈V (T ) cT (v).

De�nition 4. Given a graph, a set of p sources, and the requirements, the optimal p-source communication spanningtree (p-OCT) is a spanning tree with minimum communication cost. The problem of 5nding the p-OCT is called as thep-OCT problem.

3. The NP-hardness

In this section, we shall discuss the computational complexity of the p-OCT problem. First we de5ne a weightedversion of the 2-MRCT problem. By a transformation from the well-known satis5ability problem, we show the weighted2-MRCT problem is NP-hard. Then, we show that the p-MRCT problem for any 5xed p can be transformed from theweighted 2-MRCT problem, and the NP-hardness of the p-OCT problem is shown by a straightforward reduction.

We 5rst introduce the satis5ability problem. Let U = {u1; u2; : : : ; un} be a set of Boolean variables. A truth assignmentfor U is a function mapping each variable to TRUE or FALSE. If u is a variable in U , then u and Nu (the negation of u)are literals over U . A clause over U is a set of literals over U , which represents the disjunction of those literals and issatis5ed by a truth assignment if and only if at least one of its members is assigned TRUE. For a set X of clauses over

Page 4: Approximation algorithms for the optimal p-source communication spanning tree

34 B.Y. Wu /Discrete Applied Mathematics 143 (2004) 31–42

b1, b2, b3,.., bm

s1 s2

ai

... ...

ai+1a1 a2 a3

a1 a2 a3 ai ai+1

an

an

Fig. 2. The transformation from the SAT problem to the 2-MRCT(�) problem.

U , a truth assignment t is called a satisfying truth assignment for X if all the clauses in X are simultaneously satis5edby t. A set of clauses is satis5able if and only if there exists some satisfying truth assignment.

De�nition 5. Given a set U of variables and a set X of clauses over U , the SATISFIABILITY (SAT) problem consistsin determining if there is a satisfying truth assignment for X .

De�nition 6. Let G = (V; E; w) be a graph and s1; s2 ∈V be two given sources. For any integer �¿ 1, the 2-MRCT(�)problem consists in 5nding a spanning tree T of G such that the weighted routing cost c(T; �)=

∑v∈V (�dT (v; s1)+dT (v; s2))

is minimum.

We shall transform the SAT problem to the 2-MRCT(�) problem (Fig. 2). Given a set U = {u1; u2; : : : ; un} of Booleanvariables and a set X = {x1; x2; : : : xm} of clauses as an instance of the SAT problem, we construct a graph G = (V; E; w)as follows:

• Let A={ai; Nai|16 i6 n} and B={bi|16 i6m}. The vertex set V ={s1; s2}∪A∪B, in which s1 and s2 are the twosources. The vertices ai and Nai correspond to variable ui and negated variable Nu i, respectively, and bi correspondsto clause xi.

• The edge set E contains the following subsets:(1) E1 = {(s1; a1); (s1; Na1); (s2; an); (s2; Nan)}.(2) E2 =

⋃16i¡n{(ai; ai+1); (ai; Nai+1); ( Nai; ai+1); ( Nai; Nai+1)}.

(3) E3 = {(ai; bj)|ui ∈ xj;∀i; j} ∪ {( Nai; bj)| Nu i ∈ xj;∀i; j}.• For any e∈E1 ∪ E2, w(e) = 1. For any (ai; bj) or ( Nai; bj) in E3, the edge weight is L − (� − 1)=(� + 1)i, in which

L = (� + 1)mn.

We shall show that the SAT problem has a satisfying truth assignment if and only if there is a spanning tree T of Gsuch that the weighted routing cost c(T; �)6 ', where

' = (n + 1)2� + (n2 + 4n + 1) + ((� + 1)L + n + 1)m:

Proposition 1. If there is a truth assignment satisfying X , there exists a spanning tree Y of G such that c(Y; �) = '.

Proof. We may construct a corresponding spanning tree Y of G as follows.

• The path PY between the two sources has the form

(s1 = v0; v1; v2; : : : ; vn; s2);

in which, for 16 i6 n, vi = ai if ui is assigned true and vi = Nai otherwise.• For all 16 i6 n, if vi = ai, connect Nai to vi−1. Otherwise, connect ai to vi−1.• For all 16 i6m, connect bi to one of the vertices on PY . Such an edge always exists since at least one of the literals

in xi is assigned true.

Page 5: Approximation algorithms for the optimal p-source communication spanning tree

B.Y. Wu /Discrete Applied Mathematics 143 (2004) 31–42 35

For any v∈ {ai; Nai}, if v is on the path PY , dY (v; s1) = i and dY (v; s2) = n + 1 − i, and otherwise dY (v; s1) = i anddY (v; s2) = n + 3 − i. For any bi, since it is connected to some aj (a similar argument holds for Naj) on PY ,

�dY (bi; s1) + dY (bi; s2) = (� + 1)w(bi; aj) + �j + (n + 1 − j)

= (� + 1)L − (� − 1)j + (� − 1)j + (n + 1)

= (� + 1)L + n + 1:

Note that the cost depends only on whether bi is directly connected to path PY or not, but not on which vertex of thepath it is connected to.

The routing cost of Y is given by

c(Y; �)

=(� + 1)dY (s1; s2) +∑

v∈A

(�dY (v; s1) + dY (v; s2))

+∑

i

(�dY (bi; s1) + dY (bi; s2))

=(� + 1)(n + 1) + �∑

i6n

(2i) +∑

i6n

(2n + 4 − 2i) + m((� + 1)L + n + 1)

=(n + 1)2� + (n2 + 4n + 1) + ((� + 1)L + n + 1)m

=':

Proposition 2. Let T be an optimal solution of 2-MRCT(�) on G. If c(T; �)6 ', the path PT from s1 to s2 on T hasthe form (s1; v1; v2; : : : ; vn; s2), in which vi ∈ {ai; Nai} for 16 i6 n.

Proof. If the proposition was false, we would have that PT contains some vertex in B or that it contains more than nvertices in A. In the following, we show that both cases lead to contradictions.

• Suppose that PT contains some bi. This implies that dT (s1; s2)¿ 2L − n since dG(bi; s1)¿L and dG(bi; s2)¿L − n.For any vertex v∈A,

�dT (v; s1) + dT (v; s2)¿dT (s1; s2)¿ 2L − n:

For any vertex v∈B,

�dT (v; s1) + dT (v; s2)¿ �dG(v; s1) + dG(v; s2)¿ (� + 1)L − n:

Therefore, since L = (� + 1)mn,

c(T; �) ¿ 2n(2L − n) + m((� + 1)L − n)

= 4nL − 2n2 − mn + (� + 1)mL

¿ 4n2m� + n2m + (� + 1)mL:

Comparing with

' = (n + 1)2� + (n2 + 4n + 1) + ((� + 1)L + n + 1)m;

we have c(T; �)¿' when m; n¿ 2, and it is a contradiction.• Suppose that the path PT contains more than n vertices in A. It implies that dT (s1; s2)¿n+ 1 and that there exists the

smallest i such that both ai and Nai are on the path. By the de5nition of G, without loss of generality, we may assumethat the path PT = (: : : ; ai; ai+1; Nai; Nai+1; : : :). However, if we replace edge (ai+1; Nai) with (ai; Nai+1), we may obtainedanother spanning tree and its cost is smaller than T since the costs for ai+1 and Nai are decreased and the costs for anyother vertex is not increased.

Proposition 3. Let T be an optimal solution of 2-MRCT(�) on G. If c(T; �)6 ', there is a truth assignmentsatisfying X .

Page 6: Approximation algorithms for the optimal p-source communication spanning tree

36 B.Y. Wu /Discrete Applied Mathematics 143 (2004) 31–42

Proof. By Proposition 2, the path PT from s1 to s2 on T has the form (s1; v1; v2; : : : ; vn; s2), in which vi ∈ {ai; Nai} for16 i6 n. For any ai (a similar argument holds for Nai) on PT ,

�dT (ai; s1) + dT (ai; s2) = i� + n + 1 − i:

For any ai (a similar argument holds for Nai) not on PT ,

�dT (ai; s1) + dT (ai; s2)

= minv∈V (PT )

{(� + 1)dT (ai; v) + �dT (v; s1) + dT (v; s2)}

¿ (� + 1) + �(i − 1) + (n + 1 − (i − 1))

=i� + n + 3 − i:

The lower bound is obtained when ai is connected to vi−1 on PT . For any bi, similar to the proof of Proposition 1,

�dT (bi; s1) + dT (bi; s2)¿ (� + 1)L + n + 1:

The lower bound is obtained when bi is connected to some vertex on PT . Consequently,

c(T; �)

¿ (� + 1)(n + 1) +n∑

i=1

((i� + n + 1 − i) + (i� + n + 3 − i))

+((� + 1)L + n + 1)m

=(n + 1)2� + (n2 + 4n + 1) + ((� + 1)L + n + 1)m = ':

The lower bound is obtained when each vertex in B is connected to some vi on PT . It implies that, for each clause in X ,there is a literal assigned TRUE, and hence X is satis5able.

Theorem 1. For any 5xed integer �¿ 1, the 2-MRCT(�) problem is NP-hard.

Proof. By Propositions 1 and 3, we have transformed the SAT problem to the 2-MRCT(�) problem. Given an instance ofthe SAT problem, in polynomial time, we can construct an instance of the 2-MRCT(�) problem. If there is a polynomialtime algorithm 5nding the optimal solution of the 2-MRCT(�) problem, it can also be used to solve the SAT problem.Therefore the 2-MRCT(�) problem is NP-hard since the SAT problem is NP-complete [1,5].

The NP-hardness result can be easily extended to metric graphs.

Corollary 2. For any 5xed integer �¿ 1, the 2-MRCT(�) problem is NP-hard even for metric graphs.

Proof. Let G be the constructed graph in Theorem 1, and NG = (V; V × V; Nw) be the metric closure of G, that is,Nw(u; v) = dG(u; v) for all u; v∈V . Clearly NG is a metric graph. The transformation is similar to Propositions 1–3.

Let p¿ 1 be any 5xed integer. We can easily transform the 2-MRCT(p) problem to the p-MRCT by duplicating pcopies of the source s1. The next corollary is obvious and the proof is omitted.

Corollary 3. For any 5xed integer p¿ 1, the p-MRCT is NP-hard even for metric inputs.

Since the OCT problem includes the MRCT problem as a special case, we have the following result.

Corollary 4. For any 5xed integer p¿ 1, the p-OCT is NP-hard even for metric inputs.

4. Approximating p-OCT on metric graphs

In this section, we show a 2-approximation algorithm for the p-OCT problem with metric inputs. A metric graph is acomplete graph for which the edge between any pair of vertices is a shortest path. We start with a simple algorithm forthe case of two sources, and then generalize to p-OCT for any 5xed integer p.

Page 7: Approximation algorithms for the optimal p-source communication spanning tree

B.Y. Wu /Discrete Applied Mathematics 143 (2004) 31–42 37

4.1. Approximating the 2-OCT

To approximate the 2-OCT, our algorithm starts at the edge (s1; s2), and then inserts other vertices one by one inarbitrary order. In each iteration, we greedily connect a vertex v to either s1 or s2 depending on the communication cost.We shall show the approximation ratio is two. The algorithm is given in the following.

Algorithm A1Input: A metric graph G, two vertices s1; s2, and requirements r1(v); r2(v),for each vertex v∈V (G).Output: A spanning tree T of G.Initially T contains only one edge (s1; s2).For each vertex v∈V do/* Connect each v to either s1 or s2 */

If (r1(v) + r2(v))w(v; s1) + r2(v)w(s1; s2)6 (r1(v) + r2(v))w(v; s2) + r1(v)w(s1; s2)

insert edge (v; s1) into T ;else

insert edge (v; s2) into T ;endif

Output T .For convenience, we de5ne some notations.

De�nition 7. Let Y be the 2-OCT and P be the path between s1 and s2 on Y . We de5ne f1(v)=dY (v; s1)−dY (v; P) andf2(v) = dY (v; s2) − dY (v; P) for each vertex v.

The next lemma gives a formula of the optimal cost. The formula directly follows the above notations and the de5nitionof the communication cost. We omit the proof.

Lemma 5. Let Y be the 2-OCT and P be the path between s1 and s2 on Y . c(Y ) =∑

v∈V ((r1(v) + r2(v))dY (v; P) +r1(v)f1(v) + r2(v)f2(v)).

The next theorem shows that Algorithm A1 is an approximation algorithm. Here we assume that the input graph isalready in the memory.

Theorem 6. Algorithm A1 computes a 2-approximation of the 2-OCT of a metric graph in O(n) time.

Proof. Using adjacency lists to store the tree, the insertion of an edge takes only constant time. Therefore, the timecomplexity of the algorithm is obviously O(n). We now prove the performance ratio. Let Y be the 2-OCT and P be thepath between s1 and s2 on Y . By the triangle inequality, w(v; s1)6dY (v; s1) = dY (v; P) + f1(v). We have

(r1(v) + r2(v))w(v; s1) + r2(v)w(s1; s2)

6 (r1(v) + r2(v))(dY (v; P) + f1(v)) + r2(v)(f1(v) + f2(v))

=(r1(v) + r2(v))dY (v; P) + (f1(v)r1(v) + f2(v)r2(v)) + 2f1(v)r2(v)

=cY (v) + 2f1(v)r2(v):

That is, if v is connected to s1, the cost of v is increased by at most 2f1(v)r2(v). Similarly w(v; s2)6dY (v; s2)=dY (v; P)+f2(v) and we have

(r1(v) + r2(v))w(v; s2) + r1(v)w(s1; s2)

6 (r1(v) + r2(v))(dY (v; P) + f2(v)) + r1(v)(f1(v) + f2(v))

=(r1(v) + r2(v))dY (v; P) + (f1(v)r1(v) + f2(v)r2(v)) + 2f2(v)r1(v)

=cY (v) + 2f2(v)r1(v):

Page 8: Approximation algorithms for the optimal p-source communication spanning tree

38 B.Y. Wu /Discrete Applied Mathematics 143 (2004) 31–42

(a)

(b) (c)

: source : non-source

Fig. 3. (a) A tree with four sources, (b) the skeleton and (c) the reduced skeleton.

That is, if v is connected to s2, the cost of v is increased by at most 2f2(v)r1(v). Since the vertex v is connected toeither s1 or s2 by choosing the minimum of the two costs,

cT (v)6 cY (v) + min{2f1(v)r2(v); 2f2(v)r1(v)}:

Since the minimum of two number is no more than their weighted mean, we have

cY (v) + min{2f1(v)r2(v); 2f2(v)r1(v)}

6 cY (v) +r1(v)2

r1(v)2 + r2(v)22f1(v)r2(v) +

r2(v)2

r1(v)2 + r2(v)22f2(v)r1(v)

=cY (v) +2r1(v)r2(v)

r1(v)2 + r2(v)2× (f1(v)r1(v) + f2(v)r2(v)): (1)

Since r1(v)2 + r2(v)2 − 2r1(v)r2(v) = (r1(v) − r2(v))2¿ 0, we have

cT (v)6 cY (v) + f1(v)r1(v) + f2(v)r2(v)6 2cY (v): (2)

We have shown that cT (v)6 2cY (v) for any vertex v. Therefore c(T ) =∑

v∈V cT (v)6 2∑

v∈V cY (v) = 2c(Y ), and Tis a 2-approximation of the optimal.

4.2. The reduced skeleton of a tree

To analyze the approximation algorithm, we de5ne the reduced skeleton of a tree as follows.

De�nition 8. Let T be a spanning tree of a metric graph G and S ⊂ V (T ) be the set of sources. The S-skeleton of T is atree Y de5ned by

⋃u; v∈S SPT (u; v). The reduced S-skeleton of T is a tree X =(V (X ); E(X ); w) de5ned by the following:

• V (X ) is the union of the source set and the set of vertices whose degrees on the skeleton Y are not two.• For u; v∈V (X ), (u; v) ∈E(X ) if and only if the path SPY (u; v) contains no other vertex in V (X ).• For each e∈E(X ), w(e) = dG(u; v).

An example of the S-skeleton and reduced S-skeleton of a tree is illustrated in Fig. 3. The S-skeleton may be easilyobtained by repeatedly removing the non-source leaves from the tree, i.e., the leaves of the skeleton must be sources. Thereduced skeleton is obtained from the skeleton by eliminating the non-source vertices with degree two. Our approximationalgorithm tries to guess the reduced S-skeleton X of the OCT, and the other vertices are connected to one of the vertexof X by making the cost as small as possible. Similar to the case of two sources, it can be shown that the approximationratio is two.

When there are only two sources, the reduced skeleton is the edge between the two sources and can be easily determined.But the structure of the reduced skeleton becomes more complicated as the number of sources increases. The reducedskeleton is a tree spanning S and possibly some other vertices. In the next lemma, we show that the number of verticesof X is bounded by 2|S| − 2. In other words, there are at most |S| − 2 non-source vertices in X . For a constant p = |S|,in polynomial time one can check all trees spanning the p given sources and at most (p−2) non-source vertices in orderto guess X .

Page 9: Approximation algorithms for the optimal p-source communication spanning tree

B.Y. Wu /Discrete Applied Mathematics 143 (2004) 31–42 39

Lemma 7. Let X be the reduced S-skeleton of T . For any u; v∈V (X ), dX (u; v)6dT (u; v). Furthermore |V (X )|62|S| − 2.

Proof. By de5nition, the S-skeleton of T is the union of the path between any pair of vertices in S. For any two verticesof the skeleton, the distance on the skeleton remains the same as the one on the tree T . Since the graph G is metric, theedge weight is no more than the length of any path between the two endpoints. Consequently dX (u; v)6dT (u; v) for anyu; v∈V (X ).

Let n1 and n2 be the number of leaves and internal vertices, respectively. By de5nition, the leaf set of X is a subsetof S, and therefore n16 |S| and there are (|S| − n1) source vertices which are internal vertices. Since X is a tree, thenumber of edges is (n1 + n2 − 1) and the sum of degrees of all vertices is 2(n1 + n2 − 1). Since the reduced skeletoncontains no source vertex of degree two, we have

n1 + 3(n2 − (|S| − n1)) + 2(|S| − n1)6 2(n1 + n2 − 1)

and therefore,

n1 + n26 n1 + |S| − 26 2|S| − 2:

4.3. Approximating the p-OCT

In this subsection, we generalize the 2-approximation algorithm for the 2-OCT to the case of p sources, where p¿ 2is a constant. Our approximation algorithm is shown below.

Algorithm A2Input: A metric graph G = (V; E; w), a set S ⊂ V of p sources,and requirement ri(v) for each si ∈ S and v∈V .Output: A spanning tree T of G.For each set V1 of (p − 2) vertices in V \ S do

For each tree X with vertex set S ∪ V1 doInitially T = X ;For each vertex v ∈ V (X ) do

/* Connect each v to a vertex in V (X ) */For each u∈V (X ), compute

∑i ri(v)(w(v; u) + dX (u; si))

and choose the vertex u∗ minimizing the cost;Insert edge (v; u∗) into T ;

Compute the cost of T and keep the best tree found so far;Output T .

Theorem 8. For a metric graph, Algorithm A2 5nds a 2-approximation of the p-source OCT in O(np−1) time, wherep¿ 2 is a constant.

Proof. The algorithm tries each tree X spanning the p sources and (p− 2) other vertices. The total number of such treesis ( n−p

p−2 )(2p− 2)2p−4. For each X and each v∈V \V (X ), it takes O(p) time to determine a vertex u∗ ∈V (X ) and insert

edge (v; u∗). The total time complexity is therefore O(np−1) since p is a constant.Let Y be the optimal solution and NX be the reduced S-skeleton of Y . By Lemma 7, |V ( NX )|6 2p − 2. First we show

the approximation ratio for the case that |V ( NX )| = 2p − 2, and the other case, i.e. |V ( NX )|¡ 2p − 2, will be explainedlater. Since the algorithm tries all possible trees spanning the sources and (p − 2) other vertices, it must happen thatX = NX in some iteration. It is suJcient to show the approximation ratio of the tree constructed in this iteration becausethe algorithm outputs the best of the trees constructed in all iterations.

Let (u1; u2) ∈E(X ) and P= SPY (u1; u2). Removing the edges of P from Y , the tree is cut into several components. LetY1 and Y2 be the components containing u1 and u2 respectively, and B(u1; u2) =V (Y ) \ (V (Y1) ∪V (Y2)) be the set of thevertices not in Y1 or Y2. Also let S1 and S2 denote the set of the sources in T1 and T2, respectively. Since X is the reducedS-skeleton, S = S1 ∪ S2 and B(u1; u2) contains no source. The subsets V (X ) and B(u1; u2) for all (u1; u2) ∈E(X ) form apartition of the vertex set V (Y ). Since dT (u; v) = dY (u; v) for any u; v∈V (X ) by Lemma 7, we have cT (v) = cY (v) foreach v∈V (X ). We shall prove that, for any (u1; u2) ∈E(X ) and any vertex v∈B(u1; u2), cT (v)6 2cY (v), and thereforeT is a 2-approximation of Y .

Page 10: Approximation algorithms for the optimal p-source communication spanning tree

40 B.Y. Wu /Discrete Applied Mathematics 143 (2004) 31–42

To simplify the proof, we use the following notations. For v∈V , let R1 =∑

si∈S1ri(v) and R2 =

∑si∈S2

ri(v) bethe total requirements between v and all sources in S1 and S2, respectively. As in the previous subsection, we de5nef1(v) = dY (v; u1) − dY (v; P) and f2(v) = dY (v; u2) − dY (v; P). Since the vertex v is connected to one of the vertices inV (X ) by making the cost as small as possible, the cost cT (v) is no more than what it would be in the case that v isconnected to u1.

cT (v)6 (R1 + R2)w(v; u1) +∑

si∈S1

ri(v)dT (u1; si)

+R2(v)w(u1; u2) +∑

si∈S2

ri(v)dT (u2; si): (3)

Since w(v; u1)6dY (v; P) + f1(v) and w(u1; u2)6f1(v) + f2(v),

(R1 + R2)w(v; u1) + R2w(u1; u2)

6 (R1 + R2)dY (v; P) + (f1(v)R1 + f2(v)R2) + 2f1(v)R2: (4)

By de5nition, the cost cY (v) can be computed as follows:

cY (v) = (R1 + R2)dY (v; P) + f1(v)R1 + f2(v)R2

+∑

si∈S1

ri(v)dT (u1; si) +∑

si∈S2

ri(v)dT (u2; si): (5)

By Eqs. (3)–(5):

cT (v)6 cY (v) + 2f1(v)R2: (6)

Similarly, the cost cT (v) is no more than what it would be in the case that v is connected to u2, and we have

cT (v)6 cY (v) + 2f2(v)R1: (7)

By taking the minimum of the bounds in Eqs. (6) and (7), and using the similar technique of Eqs. (1) and (2) in theproof of Theorem 6, we obtain

cT (v)6 cY (v) + min{2f1(v)R2; 2f2(v)R1}6 2cY (v):

Therefore T is a 2-approximation of Y .Finally, we show the approximation ratio for the case |V ( NX )|¡ 2p−2, i.e. the reduced skeleton has less than (2p−2)

vertices. There exists X constructed in some iteration such that E( NX ) ⊂ E(X ) and, for each v∈V (X ) \ V ( NX ), v isconnected to the vertex of NX to minimize the cost of v. In other words, the case is included in the case |V ( NX )| = 2p− 2and the approximation ratio is two.

5. Approximating 2-OCT on general graphs

In this section, we present the approximation algorithm of the 2-OCT problem in the case that the input is a generalgraph. Our approximation algorithm consists in 5nding a shortest path between the two sources and then constructing ashortest path forest with all the vertices of the path as the multiple roots. The output tree is the union of the forest andthe path. The algorithm is exactly the same as the one which 5nds a 2-approximation of the 2-MRCT [15]. However, forthe 2-OCT problem, we shall show that the approximation ratio is three. The algorithm is listed below.

Algorithm A3Input: A graph G = (V; E; w), two sources s1; s2, and requirements r1(v); r2(v).Output: A spanning tree T of G.Find a shortest path X between s1 and s2 on G.Find the shortest path forest with multiple roots in V (X ).Let T be the union of the forest and X . Output T .We now show the approximation in the next lemma, in which T is the spanning tree obtained by the approximation

algorithm.

Page 11: Approximation algorithms for the optimal p-source communication spanning tree

B.Y. Wu /Discrete Applied Mathematics 143 (2004) 31–42 41

Lemma 9. Let Y be the 2-OCT. For any vertex v, cT (v)6 3cY (v).

Proof. Suppose that v is connected to the path X at vertex x of X . In other words, among the trees of the shortest pathforest, x is the root of the tree containing v. By the property of shortest path forests, dT (v; x)=dG(v; x)6dG(v; s1). SinceX is a shortest path between s1 and s2, dT (s1; x) = dG(s1; x)6dG(s1; v) + dG(v; x). Therefore,

dT (v; s1) = dT (v; x) + dT (x; s1)6 2dG(v; x) + dG(s1; v)6 3dG(v; s1):

Similarly, dT (v; s2)6 3dG(v; s2). By the de5nition of the communication cost:

cT (v) = r1(v)dT (v; s1) + r2(v)dT (v; s2)

6 3r1(v)dG(v; s1) + 3r2(v)dG(v; s2)

6 3(r1(v)dY (v; s1) + r2(v)dY (v; s2))

= 3cY (v):

The following theorem summarize our result for the 2-OCT problem on general graphs.

Theorem 10. The algorithm A3 computes a 3-approximation of the 2-OCT of a general graph G = (V; E; w) in O(|E| +|V |log |V |) time. The time complexity can be reduced to O(|E|) if the edge weights are integers.

Proof. By Lemma 9, cT (v)6 3cY (v) for any vertex v. Since c(T ) =∑

v cT (v), c(T )6 3c(Y ). The time complexityis dominated by the time of 5nding the shortest path forest. As mentioned in Section 2, the shortest path forest canbe found in O(|E| + |V |log |V |) time, and the time complexity can be reduced to O(|E|) if the edge weights are allintegers.

6. Concluding remarks

The main open question left in the paper is how to approximate the p-OCT of general graphs. Algorithm A2 onlyworks for metric graphs, and we did not 5nd a similar result for general graphs. The time complexity of Algorithm A2is O(np−1) which is not eJcient at all for large p. It would be interesting to 5nd eJcient algorithms to approximate thep-OCT with good ratio. Another interesting approach concerns the development of an approximation scheme, by whichone can control the trade-oK between the time complexity and the approximation ratio.

References

[1] S.A. Cook, The complexity of theorem-proving procedures, Proceedings of the Third Annual ACM Symposium on Theory ofComputing, 1971, pp. 151–158.

[2] T.H. Cormen, C.E. Leiserson, R.L. Rivest, Introduction to Algorithms, MIT Press, Cambridge, MA, 1994.[3] E.W. Dijkstra, A note on two problems in connexion with graphs, Numer. Math. 1 (1959) 269–271.[4] M. Fischetti, G. Lancia, P. Sera5ni, Exact algorithms for minimum routing cost trees, Networks 39 (2002) 161–173, DOI

10.1002/net.10022.[5] M.R. Garey, D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W.H. Freeman and Company,

San Francisco, 1979.[6] T.C. Hu, Optimum communication spanning trees, SIAM J. Comput. 3 (1974) 188–195.[7] D.S. Johnson, J.K. Lenstra, A.H.G. Rinnooy Kan, The complexity of the network design problem, Networks 8 (1978) 279–285.[8] M. Thorup, Undirected single source shortest paths with positive integer weights in linear time, J. ACM 46 (1999) 362–394.[9] R. Wong, Worst-case analysis of network design problem heuristics, SIAM J. Algebraic Discrete Methods 1 (1980) 51–63.

[10] B.Y. Wu, A polynomial time approximation scheme for the two-source minimum routing cost spanning trees, J. Algorithms 44(2002) 359–378 doi:10.1016/S0196-6774(02)00205-5.

[11] B.Y. Wu, K.M. Chao, C.Y. Tang, Approximation algorithms for some optimum communication spanning tree problems, DiscreteAppl. Math. 102 (2000) 245–266.

Page 12: Approximation algorithms for the optimal p-source communication spanning tree

42 B.Y. Wu /Discrete Applied Mathematics 143 (2004) 31–42

[12] B.Y. Wu, K.M. Chao, C.Y. Tang, Approximation algorithms for the shortest total path length spanning tree problem, Discrete Appl.Math. 105 (2000) 273–289.

[13] B.Y. Wu, K.M. Chao, C.Y. Tang, A polynomial time approximation scheme for optimal product-requirement communication spanningtrees, J. Algorithms 36 (2000) 182–204, doi:10.1006/jagm.2000.1088.

[14] B.Y. Wu, K.M. Chao, C.Y. Tang, Light graphs with small routing cost, Networks 39 (2002) 130–138, DOI 10.1002/net.10019.[15] B.Y. Wu, G. Lancia, V. Bafna, K.M. Chao, R. Ravi, C.Y. Tang, A polynomial time approximation scheme for minimum routing

cost spanning trees, SIAM J. Comput. 29 (2000) 761–778.