A linear-time algorithm to compute a MAD tree of an interval graph

l

f

Information Processing Letters 89 (2004) 255–259

www.elsevier.com/locate/ip

A linear-time algorithm to compute a MAD treeof an interval graph

Elias Dahlhausa, Peter Dankelmannb,1, R. Ravic,∗,2

a Department of Computer Science, University of Bonn, Bonn, Germanyb School of Mathematical and Statistical Sciences, University of Natal, Durban, South Africa

c Graduate School of Industrial Administration, Carnegie Mellon University, Pittsburgh, PA, USA

Received 9 June 2003; received in revised form 12 November 2003

Communicated by A.A. Bertossi

Abstract

The average distance of a connected graphG is the average of the distances between all pairs of vertices ofG. We present alinear time algorithm that determines, for a given interval graphG, a spanning tree ofG with minimum average distance (MADtree). Such a tree is sometimes referred to as a minimum routing cost spanning tree. 2003 Elsevier B.V. All rights reserved.

Keywords: Graph algorithms; Interval graphs; Spanning tree

1. Introduction tree of G (MAD tree in short) is a spanning tree o

d

er-

rch

nal2581

G of minimum average distance. MAD trees, also re-are

rksk ofach

ible.P-is

e-in

atarye

no-

al

erved

The average distance µ(G) of a finite, connectedgraphG = (V ,E) is the average over all unorderepairs of vertices of the distances,

µ(G) =(|V |

2

)−1 ∑{u,v}⊂V (G)

dG(u, v),

wheredG(u, v) denotes the distance between the vticesu andv. A minimum average distance spanning

* Corresponding author.E-mail address: [email protected] (R. Ravi).

1 Financial support by the South African National ReseaFoundation is gratefully acknowledged.

2 This material is based upon work supported by the NatioScience Foundation under grants CCR 0105548 and ITR 012(ALADDIN Center).

0020-0190/$ – see front matter 2003 Elsevier B.V. All rights resdoi:10.1016/j.ipl.2003.11.009

ferred to as minimum routing cost spanning trees,of interest in the design of communication netwo[6]. One is interested in designing a tree subnetwora given network, such that on average, one can reevery node from every other node as fast as possIn general, the problem of finding a MAD tree is Nhard [6]. A polynomial-time approximation schemedue to [1]. Hence it is natural to ask for which rstricted graph classes a MAD tree can be foundpolynomial time. In [3], an algorithm is exhibited thcomputes a MAD tree of a given distance-hereditgraph in linear time. In [5] it is shown that a MAD treof a given outerplanar graph can be found in polymial time.

In this paper, we show that for a given intervgraphG a MAD tree can be computed in time O(|E|)..

256 E. Dahlhaus et al. / Information Processing Letters 89 (2004) 255–259

If the interval representation ofG is known and theleft and right boundaries of the intervals are ordered,

ADin

sid-

alves

a

o

Consider the tree

′

the

, in

ceht

then a MAD tree ofG can be found in time O(|V |). InSection 2, we present some structural results on Mtrees. In Section 3, we sketch the algorithm, andSection 4, we give some concluding remarks.

2. Structure of MAD trees in interval graphs

We always assume that the graph under coneration is connected. Thedistance of a vertex v,dG(v), is defined as

∑x∈V dG(v, x). The total dis-

tance of a graphH denotedd(H) is the sum of allpairwise distances between nodes inH , i.e.,d(H) =∑

{u,v}⊂V (H) dH (u, v).A median vertex of G is a vertexc for whichdG(c)

is minimum. Theeccentricity of a vertexv is definedasexG(v) = maxw∈V dG(v,w).

The neighbourhood of a vertexv in G is defined intwo ways: theopen versionNG(v) = {u: uv ∈ E} andtheclosed versionNG[v] = {v} ∪NG(v).

The following lemma applies not only to intervgraphs but to all connected graphs. Part (i) improon a result in [2], which states that, ifT is a MADtree of a given connected graphG, then there existsvertexc in T such that every path inT starting atc isinduced inG. We now prove thatc can be chosen tbe a median vertex ofT .

Lemma 1. (i) If T is a MAD tree of G and c is amedian vertex of T then every T -path starting at c isan induced path in G (i.e., has no diagonals in G).

(ii) T and c can be chosen such that there is novertex c′ = c such that NG[c] is strictly contained inNG[c′].

Proof. (i) Let P = v1, . . . , vk be a path inT withc = v1. SupposeP is not induced inG. Then there arei andj , such thati < j − 1 andvivj ∈ E(G). Let T1andT2 be the connected components ofT − vj−1vjcontaining vj−1 and vj , respectively. It is known(see [2]) thatdT (c) � dT (vi) < dT (vj−1). Since thevertices inT2 are further away (inT ) from vi , thanfrom vj−1, we obtain

dT1(vi) < dT1(vj−1).

T = T − vj−1vj + vivj .

Since the distances between any two vertices aresame inT andT ′, unless one vertex is inT1 and theother vertex is inT2, we have

d(T ′) − d(T ) = ∣∣V (T2)∣∣(dT1(vi) − dT1(vj−1)

)< 0,

contradicting the minimality ofd(T ). HenceP is aninduced path inG.

(ii) If there is a vertexc′ with NG[c] ⊂ NG[c′]and NG[c] = NG[c′], then joining allT -neighboursof c to c′ instead ofc yields a spanning treeT ′ withd(T ′) � d(T ), in which c is an end vertex andc′ is amedian vertex. ✷

From now on we assume thatG is an intervalgraph. Each vertexv corresponds to an intervalI (v) =[l(v), r(v)], such that two verticesv and w areadjacent if and only ifI (v) ∩ I (w) = ∅.

Proposition 1. Let v0, . . . , vk be an induced path ofthe interval graph G. Then

(1) l(v1) < l(v2) < · · · < l(vk) and r(v0) < · · · <r(vk−1) or

(2) r(v1) > · · · > r(vk) and l(v0) > · · · > l(vk−1).

In the first case, we call such a path an R-paththe second case, we call it an L-path.

Consequently, each path of the MAD treeT of Gstarting at a median vertexc of T is an L-path or anR-path.

For a vertexv of G let h(v) be a neighbourx of vsuch thatr(x) is maximum; If r(v) = maxw∈V r(w),we sayh(v) is undefined. Similarly, letk(v) be aneighboury of v such thatl(y) is minimum. Again,if l(v) = minw∈V l(w), we say thatk(v) is undefined.We also defineh0(v) = k0(v) = v for all v. For i � 2,hi(v) is defined ash(hi−1(v)). Analogously, we defineki(v).

A component of a graph istrivial if it contains onlyone vertex, otherwise it is called nontrivial.

For a given vertexv of G and an integeri � 2 letV Ri (v) (V L

i (v)) be the set of all vertices at distani from v, whose intervals lie completely to the rig(left) of the interval of v. We also letV R

1 (v) =V L

1 (v) = NG(v).

E. Dahlhaus et al. / Information Processing Letters 89 (2004) 255–259 257

Theorem 1. If G is an interval graph then there is aMAD tree T of G with a median vertex c, such that for

o

of,

es

n

n

t

e

of

al

val

e

s

r

each i, 1 � i � exG(c),

(∗){

each v ∈ V Ri (c) is adjacent in T to hi−1(c),

each v ∈ V Li (c) is adjacent in T to ki−1(c).

Proof. Let T be a MAD tree ofG and let c be amedian vertex ofT , wherec is chosen according tLemma 1(ii).

By Lemma 1(i), eachG-neighbourv of c is also inT adjacent toc, since otherwise thec − v path inT

would have a diagonal. Hence(∗) holds fori = 1.Suppose that(∗) does not hold for somei. Let i be

the smallest such number. For the rest of this prowe omit the reference to the median vertexc in thenotation forV R’s.

We first show that only one vertex inV Ri−1 hasT -

neighbours inV Ri . Suppose that there exist vertic

vi, v′i ∈ V R

i , vi−1 = v′i−1 ∈ V R

i−1 such thatvivi−1,

v′iv

′i−1 ∈ E(T ). Since, by the minimality ofi, vi−1

andv′i−1 are adjacent inT to hi−2(c), we havevi = v′

i

(else there would be a cycle inT ). Without loss of gen-erality, we may assume thatr(vi−1) � r(v′

i−1). Let Bbe the component ofT − vi−1vi not containingc andlet B ′ be the component ofT − hi−2(c)v′

i−1 not con-tainingc. Moreover let

child(v′i−1) = {w ∈ B ′ | v′

i−1w ∈ ET }.Note that child(v′

i−1) is exactly the set of childreof v′

i−1 when T is rooted atc. Now child(v′i−1) ⊆

NG(vi−1) since every child ofv′i−1 is on an R-path

from c andr(vi−1) � r(v′i−1). Therefore,

T ′ = T − {v′i−1w | w ∈ child(v′

i−1)}

+ {vi−1w | w ∈ child(v′

i−1)}

is a spanning tree ofG. Since the distance betweeany two vertices are the same inT andT ′, unless oneof the vertices is inB and the other one is inB ′, thedifference between the total distances is

d(T ′) − d(T ) = −2∣∣V (B)

∣∣(∣∣V (B ′)∣∣ − 1

)< 0

contradicting the minimality ofd(T ). Hence at mosone vertexv in V R

i−1 hasT -neighbours inV Ri . Hence

only one vertex inV Ri−1 hasT -neighbours inV R

i . Sinceeach vertex inV R

i that is adjacent inG to a vertexv in Vi−1 is also adjacent inG to hi−1(c), we can

Fig. 1.

join each vertex inV Ri not to v, but tohi−1(c) with-

out increasing the total distance ofT . Analogously, wecan achieve that each vertex inV L

i is adjacent inT toki−1(c). HenceT satisfies condition(∗). ✷

Theorem 1 suggests the following polynomial timalgorithm. Fix a vertexc of G, determinehi(c) andki(c) for i = 1,2, . . . and construct a spanning treeG in which each vertex inV R

i (c) is adjacent tohi−1(c)

and each vertex inV Li (c) is adjacent toki−1(c) for

i � 1. Construct such a tree for eachc ∈ V (G).Among thosen trees select a tree with minimum totdistance. By Theorem 1, this tree is a MAD tree ofG.

Theorem 2. A MAD tree of an interval graph can bedetermined in polynomial time.

A set of intervals and the corresponding intergraphG is shown in Fig. 1. The verticesc, hi(c) andki(c), i = 1,2, are labelled. Thick lines indicate thedges of a MAD treeT satisfying the condition(∗) ofTheorem 1.

3. Linear-time computation of a MAD tree

We assume that an interval representation ofG =(V ,E) is known and that the left bordersl(v) andright bordersr(v), v ∈ V are sorted. As in the previousection, we assume thath(v) is a neighbourx ofv, such thatr(x) is maximum and thatk(v) is aneighboury of v, such thatl(y) is minimum. In [7] itis shown thath andk can be determined in time linea

258 E. Dahlhaus et al. / Information Processing Letters 89 (2004) 255–259

in the number of vertices (even in logarithmic time inparallel with a linear workload).

e

1

y

h

g

The numbers numL(v) and numR(v) can be deter-mined overall in O(n) time (see, e.g., [7]). For any

for

llof

he

ages

We consider any vertexv and assume that thmedian vertexc is left (right) ofv, i.e., thatr(c) � r(v)

(l(c) � l(v)).Let T be a MAD tree according to Theorem

rooted atc and letv = c be a vertex,v ∈ V Ri (c), say.

ConsiderTv , the subtree ofT rooted atv. Then ei-ther v is a leaf ofT and Tv is trivial, or v = hi(c)

for somei. In that case,Tv contains someG-neigh-bours ofv, in particularh(v) = hi+1(c), and all ver-tices inV R

i+2(c) = V R2 (v) (which are inT adjacent to

hi+1(c) = h(v)),V Ri+3(c) = V R

3 (v) (which are inT ad-jacent tohi+2(c) = h2(v)), and so on, as long as theare defined. The fact that the main part ofTv , namelyTh(v), only depends on whetherv is to the left or rightof c, but not on the actual choice ofc, is the key to ouralgorithm.

Definition 1. Let G be a connected interval grapwith h and k as defined above, andv ∈ V (G). Ifh(v) (k(v)) is not defined then letT R

v (T Lv ) be the

empty tree. Ifh(v) is defined then letT Rv be the tree

with vertex set{h(v)} ∪ ⋃j�2V

Rj (v), where a vertex

x ∈ V Rj (v) is adjacent tohj−1(v). Analogously, if

k(v) is defined then letT Lv be the tree with vertex

set {k(v)} ∪ ⋃j�2V

Lj (v), where a vertexx ∈ V L

j (v)

is adjacent tokj−1(v). Note that bothT Rv andT L

v donot containv!

Hence the treeT = Tc consists ofc as root, theneighbours ofc in the original graphG as neighboursin Tc, and T L

c and T Rc appended onk(c) and h(c)

respectively. We do not determine these treesT Rv

andT Lv explicitly. Instead, we compute the followin

quantities.

(1) the number of neighbours numR(v) (numL(v)) ofh(v) (k(v)) that are not neighbours ofv, i.e., thenumber of children ofh(v) (k(v)) in T R

v (T L(v)),(2) |T R

v | (|T Lv |), the number of vertices of the treeT R

v

(T Lv ),

(3) the total distancetR(v) (tL(v)) of T Rv (T L

v ), and(4) the distancedR(v) (dL(v)) of h(v) (k(v)) in T R

v

(T Lv ).

particularv, dR(v) can be determined in O(1) timeif dR(h(v)) and numR(v) are known. AlsotR(v) canbe determined in O(1) time if dR(v), numR(v) andtR(h(v)) are known. Analogous statements holdthe left counterparts.

In more detail, we proceed as follows.

Determine numR(v): If h(v) is not defined thennumR(v) = 0. Otherwise, numR(v) is the number ofverticesw, such thatrv < lw � rh(v). This can bedetermined in overall linear time.

Determine|T Rv |: If h(v) is not defined,|T R

v | = 0.Otherwise,∣∣T R

v

∣∣ = ∣∣T Rh(v)

∣∣ + numR(v).

Determine dR(v): If h(v) is not defined thendR(v) = 0. Otherwise

dR(v) = dR(h(v)

) + ∣∣T Rh(v)

∣∣ + numR(v) − 1.

Determine the distancetR(v): If h(v) is not definedthentR(v) = 0. Otherwise

tR(v) = tR(h(v)

) + dR(v)

+ (numR(v) − 1

)((numR(v) − 2

)+ 2

∣∣T R(h(v)

)∣∣ + dR(h(v)

)).

Determine the total distance ofTc: First we deter-mine the degreeδ(c) of c. This can be done in overatime O(n), for all c (one has to count the numberright borders between the left borderlc and the rightborderrc of c and the number of intervals passing tright border ofc). Then we definen(c) to be the totalnumber of neighbours ofc excluding itself as well ash(c) andk(c) if they are defined.

The total distance ofTc is(n(c)

)2 + tL(c)+ tR(c)

+ (1+ 2n(c)+ dL(c)

)∣∣T R(c)∣∣

+ (1+ 2n(c)+ dR(c)

)∣∣T L(c)∣∣ + 2

∣∣T Lc

∣∣∣∣T Rc

∣∣+ (

dL(c) + dR(c))(

1+ n(c)).

Thus we compute the numberstR(v) (tL(v)), dR(v)

(dL(v)), |T Lv | (|T R

v |), and numR(v) (numL(v)) inoverall linear time.

We obtain the total distances and thus the averdistances of the treesTc with assumed median vertice

E. Dahlhaus et al. / Information Processing Letters 89 (2004) 255–259 259

c in overall linear time, and we only have to select oneTc with minimum average distance. We determine this

n

ereD

thmighte

t

to v1, . . . , v4. An interval representation of this graphis as follows:v1 : [0,1], v2 : [0,3], v3 : [2,5], v4 :

e,

of

dge-orows of

g,ng8.of

AD131

rval

nar

lex-85.forns

particularTc explicitly using Theorem 1. Also this cabe done in linear time.

Theorem 3. If an interval representation of an intervalgraph G with n vertices is given and the left and theright boundaries l(v) and r(v) are sorted then a MADtree of G can be determined in O(n) time.

4. Conclusions

It remains an open problem to decide whether thexists a polynomial time algorithm to find a MAtree of avertex weighted interval graph. Ifw is a realvalued weight function on the vertex set ofG, then theaverage distance ofG with respect tow is defined (see[4]) as(w(V )

2

)−1 ∑{u,v}⊂V (G)

w(u)w(v)dG(x, y),

wherew(V ) is the total weight of the vertices inG.Theorem 1 does not hold (and hence the algori

presented above does not work) if there is a wefunction on the vertices ofG. To see this, consider thpath with four verticesv1, . . . , v4 of weight 1, 2, 1, 5together with a vertexv5 of weight 0, that is adjacen

[4,5], v5 : [0,5]. This graph has a unique MAD trethe pathv1, v2, v3, v4, v5. It is easy to check thatTdoes not contain a vertex satisfying the conclusionTheorem 1.

New techniques seem necessary to solve the eweighted counterpart of the MAD tree problem fInterval graphs. It would also be interesting to knwhether our algorithm can be extended to the classtrongly chordal graphs.

References

[1] B.Y. Wu, G. Lancia, V. Bafna, K.-M. Chao, R. Ravi, C.Y. TanA polynomial time approximation scheme for minimum routicost spanning trees, SIAM J. Comput. 29 (3) (1999) 761–77

[2] C.A. Barefoot, R.C. Entringer, L.A. Szekély, Extremal valuesdistances in trees, Discrete Appl. Math. 80 (1997) 37–56.

[3] E. Dahlhaus, P. Dankelmann, W. Goddard, H.C. Swart, Mtrees and distance-hereditary graphs, Discrete Appl. Math.(2003) 151–167.

[4] P. Dankelmann, Computing the average distance of an integraph, Inform. Process. Lett. 48 (1993) 311–314.

[5] P. Dankelmann, P. Slater, Average distance in outerplagraphs, Preprint.

[6] D.S. Johnson, J.K. Lenstra, A.H.G. Rinnooy-Kan, The compity of the network design problem, Networks 8 (1978) 279–2

[7] S. Olariu, J. Schwing, J. Zhang, Optimal parallel algorithmsproblems modeled by a family of intervals, IEEE Transactioon Parallel and Distributed Systems 3 (1992) 364–374.

Documents

A linear-time algorithm to compute a MAD tree of an interval graph