
Algorithms for finding paths with multiple constraints




Algorithms for Finding Paths with Multiple Constraints

Jeffrey M. Jaffe
IBM Thomas J. Watson Research Center, Yorktown Heights, New York 10598

Let G = (V, E) be a graph with weight function w: E → R⁺ and length function l: E → Z⁺. The problem of determining for u1, u2 ∈ V whether there is a path from u1 to u2 with weight at most W and length at most L is NP-complete. This paper gives two approaches to meeting or approximating the length and weight constraints. The first approach is to use a pseudopolynomial-time algorithm which determines whether a path meets the constraints. Its running time is O(n⁵b log nb), where n = |V| and b is the largest length or weight. If tables with O(n³b) entries are kept, then all instances of multiple constraints may be decided. Table size may be substantially decreased if one is willing to tolerate incorrect answers to rare instances. The algorithm is suitable for distributed execution. In the second approach, an objective function is defined which evaluates a path's distance from meeting the constraints. Polynomial-time algorithms attempt to find good paths in terms of the objective function. One algorithm is at most 1.62 times worse than optimal. A notion of "average worst-case behavior" is defined. The algorithm's "average" behavior is 1.51 times worse than optimal.

I. INTRODUCTION

Distributed algorithms for the shortest-path problem have attracted considerable interest recently [6]. Part of the reason for this interest is due to applications in message routing in computer networks [5]. If a message or a sequence of messages needs to be sent from one node to another, it is desirable to use the shortest route, least-cost route, or shortest-delay route; all are variants of the shortest-path problem.

To provide flexible service to network users, one might like to minimize some combination of several criteria, e.g., achieve relatively low delay at relatively low (dollar) cost. One possible criterion is to minimize some linear combination of delay and cost. This may be reformulated as another shortest-path problem and may be easily solved in polynomial time. Another formulation is to insist that the chosen path be within a given cost and a given delay. Determining whether such a "feasible" path exists is NP-complete [3], and it is this problem that we address here.

In Section II a pseudopolynomial-time algorithm is given for the problem. The running time is O(n⁵b log nb), where n is the number of vertices or nodes and b is the largest cost or delay (henceforth called length and weight). When executed in a distributed manner, O(n⁴b log nb) machine steps are required per network node.

Once this algorithm is executed, one may find paths that meet any set of feasible constraints. To find these paths, one requires tables with O(nb) entries per node pair. For any function f such that f(x) → ∞ as x → ∞ and f(x) < x, one may substitute tables of O(f(nb)) entries per node pair. This will result in incorrect answers to some instances of the problem, but the probability of error approaches 0 as nb goes to infinity (Section III). If one desires correct answers with fixed probability p, then one may use tables of fixed finite size, irrespective of nb.

NETWORKS, Vol. 14 (1984), 95-116. © 1984 John Wiley & Sons, Inc. CCC 0028-3045/84/010095-22$04.00

Time O(n⁵b log nb) may be too expensive for certain applications. The rest of the paper describes polynomial-time algorithms that approximate the constraints on the path. First one must define the notion of approximating the constraints. Section IV defines an objective function used to evaluate a path's distance from the constraints.

In Section V we use shortest-path algorithms to approximate constraints when a feasible path exists. An algorithm that uses very limited information is at most twice worse than optimal. A more complicated algorithm is at most 1.5 times worse than optimal. A measure is defined on the problem space which enables us to discuss the average performance of the algorithms. The average for the first algorithm is 1 + log 2 ≈ 1.69 and for the second is 3 − π/2 ≈ 1.43.

These algorithms, although guaranteed to be close to optimal when a feasible solution exists, may be unboundedly worse than optimal when none exists. This deficiency is overcome in Section VII with another algorithm. Its worst-case performance is ½(√5 + 1) ≈ 1.62 times worse than optimal, even when no feasible solution exists. The average behavior is approximately 1.51 times worse than optimal.

II. A PSEUDOPOLYNOMIAL TIME ALGORITHM

A graph G = (V, E) consists of a set V of vertices and a set E of edges. Associated with each edge e ∈ E is a length l(e) ∈ Z⁺ and a weight w(e) ∈ Z⁺. A path p is a sequence of distinct vertices u1, ..., uk such that (ui, ui+1) ∈ E for i = 1, ..., k − 1. The set {(ui, ui+1)} is called the edges of p. The length (weight) of p, L(p) [W(p)], is the sum of the lengths (weights) of the edges of p. There is a polynomial-time algorithm that, given any graph, length function, and pair of vertices, finds the minimum-length path between the vertices [1].

Consider the following decision problem:

Multiple Constraints on a Path (MCP). Given a graph G = (V, E); length and weight functions l, w on G; integers L, W; and u1, u2 ∈ V: is there a path p from u1 to u2 such that L(p) ≤ L and W(p) ≤ W?

This problem is NP-complete [3]. As explained in the Introduction, this problem has a number of important applications, among them that of finding a route in a computer network with cost at most L and delay at most W. Our first approach to MCP solves it in pseudopolynomial time. If b is the largest length or weight in the network, the algorithm runs in time bounded by O(n⁵b log nb).

We informally review some definitions [3]. An algorithm runs in polynomial time if its run time is polynomial in the length of the input. Typically, a number x is represented in log x bits, and an instance of MCP is of length between O(n) and O(n log b). An algorithm runs in pseudopolynomial time if its run time is polynomial in the length


of the input and the largest number in the input. The largest number is b. O(n⁵b log nb) is not polynomial in O(n log b), but it is polynomial in n and b. Thus the O(n⁵b log nb) algorithm is a pseudopolynomial-time algorithm.

A special case of this algorithm is noted in [3]. If all weights or all lengths are equal, then a polynomial-time algorithm exists.

The algorithm is presented here as a "distributed" algorithm (where the vertices are thought of as nodes in a network) with O(n⁴b log nb) computation done at each node. Since there are n nodes, O(n⁵b log nb) time is required for a centralized computation.

We first outline the algorithm. Let b be the largest weight in the graph. All loop-free paths between any two nodes in the graph have weight at most nb. Also, any path has integral weight. The strategy is to simultaneously find the minimal-length path between all pairs of nodes with weight at most 1, 2, ..., nb. Each node maintains a table of O(nb) entries for every other node, representing the shortest lengths for weights 1, ..., nb. This requires a total of O(n²b log nb) storage at each node, and O(n³b log nb) in the entire network, since each entry is of size at most log nb. (Each entry is of size log nb since it represents the length of a loop-free path.)

Using the tables, it is easy to solve any instance of MCP. Given u1, u2, L, and W, one checks the table at u1 for paths to u2 with weight at most W to find the minimal-length path. If its length exceeds L, then there is no feasible solution (i.e., no path meets the constraints); otherwise there is a feasible path.

Initially the tables have little information. Let L(u1, u2, k) be the entry in the table of u1 which will ultimately contain the length of the shortest path to u2 with weight at most k (u1 ≠ u2). Then L(u1, u2, k) is initialized according to information known locally to u1. In particular, L(u1, u2, k) = ∞ if (u1, u2) ∉ E, or if e = (u1, u2) ∈ E but w(e) > k. If e = (u1, u2) ∈ E and w(e) ≤ k, then L(u1, u2, k) = l(e). Thus L(u1, u2, k) is the length of the "best known" path of weight at most k. For convenience, we say that the initial values are the entries after the first iteration. With the above preliminaries, we give the algorithm to solve MCP:

Algorithm for Node u1 at ith Iteration (i ≥ 2)

1. For each entry L(u1, u2, k) that was changed at the (i − 1)th iteration: for each u3 such that e = (u1, u3) ∈ E with w(e) = w, send a message to u3: "L(u3, u2, k + w) ≤ L(u1, u2, k) + l(e)."
2. For each message received of the form "L(u1, u2, k) ≤ x":
   a. If L(u1, u2, k) is already less than x, ignore the message.
   b. If L(u1, u2, k) is currently greater than x, then L(u1, u2, k) := x.
   c. If L(u1, u2, k') > x for k' > k, then L(u1, u2, k') := x.

Basically, in Step 1 node u1 informs all neighbors about improved paths learned at the last iteration. In Step 2, node u1 learns about other improved paths from its neighbors.
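The same relaxation rounds can be carried out centrally on a shared table. Below is a minimal Python sketch under the paper's assumptions (undirected network, positive integer lengths and weights); the function name and edge-list format are illustrative, not from the paper:

```python
import math

def mcp_tables(n, edges):
    """Centralized sketch of the Section II table computation.

    n: number of nodes (0 .. n-1); edges: list of undirected
    (u, v, length, weight) tuples with positive integer length/weight.
    Returns L with L[u][t][k] = minimal length of a path from u to t
    of total weight at most k, for 0 <= k <= n*b.
    """
    b = max(w for (_, _, _, w) in edges)
    kmax = n * b
    INF = math.inf
    arcs = [(u, v, l, w) for (u, v, l, w) in edges]
    arcs += [(v, u, l, w) for (u, v, l, w) in edges]  # undirected
    L = [[[INF] * (kmax + 1) for _ in range(n)] for _ in range(n)]
    # "First iteration": initialize from locally known edges.
    for (u, v, l, w) in arcs:
        for k in range(w, kmax + 1):
            if l < L[u][v][k]:
                L[u][v][k] = l
    # Up to n - 1 further rounds of relaxation.
    for _ in range(n - 1):
        changed = False
        for (u, v, l, w) in arcs:  # u hears from its neighbor v
            for t in range(n):
                if t == u:
                    continue
                for k in range(w, kmax + 1):
                    # A path v -> t of weight <= k - w extends, via the
                    # edge (u, v), to a path u -> t of weight <= k.
                    cand = l + L[v][t][k - w]
                    if cand < L[u][t][k]:
                        L[u][t][k] = cand
                        changed = True
        if not changed:
            break
    return L
```

With positive lengths the minimum is always attained on a loop-free path, so the relaxation converges within n − 1 rounds, mirroring Theorem 1.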

There are certain assumptions made about the above algorithm. It is assumed that synchronization prevents a node from starting the (i + 1)th iteration before all nodes have completed the ith. The above algorithm has various inefficiencies, which are removed below. For now the simple form above is used. The correctness of this algorithm is encapsulated in Theorem 1.


Theorem 1. After at most n − 1 iterations of the above algorithm, each entry L(u1, u2, k), for u1, u2 ∈ V and 1 ≤ k ≤ nb, contains the length of the minimal-length path from u1 to u2 with weight at most k.

The proof of Theorem 1 is facilitated with the following. It is left to the reader to verify that if at any point in the algorithm L(u1, u2, k) = L, then there is a path of length L and weight at most k from u1 to u2.

Lemma 1. Assume that a minimal-length path from u1 to u2 with weight at most k traverses r edges. Then L(u1, u2, k) contains the length of this path after the rth iteration.

Note that all loop-free paths require at most n − 1 edges. Since table entries are never increased, Theorem 1 is immediate from Lemma 1, using the remark that preceded Lemma 1.

Proof of Lemma 1. By induction on r.

r = 1. Assume that the minimal-length path of weight at most k consists of a single edge e = (u1, u2). Then when initializing L, u1 sets L(u1, u2, k) := l(e). Thus the entry is correctly entered at the first iteration.

Inductive Step. Assume that the smallest (in terms of number of edges) minimal-length path from u1 to u2 with weight at most k traverses r > 1 edges. Assume that the first edge on the path is e = (u1, u3), with w(e) = w. Note that a minimal-length path from u3 to u2 with weight at most k − w is identical to the minimal-length path from u1 to u2 with weight at most k, with the edge e removed. For if there were a path of smaller length, it would produce a path of smaller length from u1 to u2 with weight at most k by attaching e.

Since the minimal-length path with weight at most k − w from u3 to u2 traverses r − 1 edges, L(u3, u2, k − w) contains the proper value after r − 1 iterations (induction). By the rth iteration, u3 sends a message to u1: "L(u1, u2, k) ≤ L(u3, u2, k − w) + l(e)." Using this information, u1 updates L(u1, u2, k) in the rth iteration, and the lemma (and theorem) is proved.

Running Time of the Algorithm

The number of machine steps is at most O(n²) times the running time per iteration per node (Theorem 1). In Step 1 of the algorithm each node sends out at most n messages (it has at most n neighbors) for each entry in its tables of size n²b. Since each message is of size O(log nb), Step 1 takes at most time O(n³b log nb). Similarly, each node receives at most O(n³b) messages. Unfortunately, in Step 2c each node may do O(nb) work per message, for a total of O(n⁴b²) machine steps per iteration.

The amount of time for Step 2c is reduced as follows. If a message "L(u1, u2, k) ≤ x" arrives, only L(u1, u2, k) is updated, but not L(u1, u2, k') for k' > k. After all O(n³b) messages have been handled, u1 makes one pass through its table to ensure L(u1, u2, k') ≤ L(u1, u2, k) if k' > k. Thus O(n²b) additional time is required, but each iteration requires only O(n³b log nb).


The tables actually solve more than MCP. For a given W, one may find the minimal-length path with weight at most W. One may also find the minimal-weight path with length at most L. This is accomplished by searching L(u1, u2, ·) for the smallest value of k with the property L(u1, u2, k) ≤ L. One may also minimize linear combinations of the form cL(p) + dW(p): if p minimizes cL(p) + dW(p), then p is also the minimum-length path with weight at most W(p).
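The minimal-weight query above is a single scan over one table row. A sketch (helper name and row representation are illustrative, assuming row[k] holds the minimal length for weight budget k, with ∞ meaning "no path"):

```python
import math

def min_weight_given_length(row, L_bound):
    """row[k] = minimal length of a path of weight at most k.
    Returns the smallest weight budget k whose minimal-length path
    also satisfies the length bound, or None if no path does."""
    for k, length in enumerate(row):
        if length <= L_bound:
            return k
    return None
```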

In computer network situations it is not enough to know whether some path satisfies the constraints. The path itself must be explicitly determined so that communication between nodes may take place on this path. This information does not exist in the tables as defined, but may be conveniently introduced as follows.

When u1 discovers that L(u1, u2, k) needs to be updated owing to a short path (in terms of length) that goes to u2 via u3, u1 associates with L(u1, u2, k) the entry "NEXT-NODE(u1, u2, k) = u3." The interpretation is that the minimal-length path from u1 to u2 with weight at most k has (u1, u3) as its first edge. Using this, it is easy to determine the entire path from u1 to u2. The path with weight at most k and length L(u1, u2, k) has as its first node (after u1) NEXT-NODE(u1, u2, k) = u3. The second node is NEXT-NODE(u3, u2, k − w(u1, u3)) = u4. The third is NEXT-NODE(u4, u2, k − w(u1, u3) − w(u3, u4)), and so on.
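The pointer chasing just described can be sketched as follows (the dictionary-based representation and names are illustrative):

```python
def reconstruct_path(next_node, weight, u1, u2, k):
    """Follow NEXT-NODE pointers from u1 toward u2, decreasing the
    remaining weight budget by each traversed edge's weight.

    next_node: dict (u, t, budget) -> first hop of the minimal-length
    path from u to t with weight at most budget.
    weight: dict (u, v) -> weight of edge (u, v).
    """
    path, budget = [u1], k
    while path[-1] != u2:
        here = path[-1]
        nxt = next_node[(here, u2, budget)]  # first hop of the best path
        budget -= weight[(here, nxt)]        # weight left for the suffix
        path.append(nxt)
    return path
```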

III. TABLE SIZE CONSIDERATIONS

Using the algorithm of the last section, one may solve an instance of MCP in time O(n⁵b log nb). This is too costly for most applications. However, the algorithm does more than solve one instance; it makes it easy (by table lookup) to solve all instances. For that reason, if many instances of MCP need to be solved, the time bound may be tolerable. Unfortunately, there is another problem with the efficiency of this algorithm: every node keeps a large table with O(nb) entries which lists possible paths to every other node.

In this section we discuss methods of economizing on table size. Table entries "which do not provide too much information" are dropped. This makes it impossible to correctly answer all instances of MCP; if there is a path of length L and weight W and all record of that path is destroyed, then that instance of MCP will be answered incorrectly. The main result of this section is that MCP will almost always be answered correctly even if only a small fraction of the tables is maintained. In particular, let r = nb and let f(r) be any nondecreasing function of r such that f(r) → ∞ as r → ∞. If one uses a particularly well chosen set of f(r) entries for each pair u1, u2 ∈ V, then with probability that approaches 1 (as r → ∞), MCP will be answered correctly (assuming all "possible" values of L and W are equally likely). In addition, if one is satisfied with a fixed probability of correctness (say 0.9999, for example), then only finitely many entries are needed, irrespective of r. In the sequel we focus on the table for one node pair, u1, u2 ∈ V.

For simplicity assume that b is both the largest length and largest weight. The region of possible instances of MCP may be pictorially described as in Figure 1. The broken line near the diagonal of the square delineates those instances that have a solution and those that do not.


FIG. 1. (The instance space: W on the horizontal axis and L on the vertical axis, each ranging over 1, ..., r = nb.)

Instances with W < 1 or L < 1 can never be solved, and those with W ≥ r or L ≥ r can be solved as simple shortest-path problems. Each separate horizontal line represents many entries in the table L(u1, u2, ·). For example, line A represents the fact that L(u1, u2, 1) = nb − 1 and L(u1, u2, 2) = nb − 1. One table optimization is to include only one listing in the table for each horizontal line and infer the rest. Thus, we would keep L(u1, u2, 1) = nb − 1, and since L(u1, u2, 2) is not kept in the table, we would infer L(u1, u2, 2) = L(u1, u2, 1).
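Because L(u1, u2, ·) is a nonincreasing step function of the weight budget, the "one listing per horizontal line" optimization is plain run-length coding. A sketch (names and representation are illustrative):

```python
import bisect
import math

def compress_row(row):
    """Keep one (k, length) pair per horizontal line of Figure 1:
    the first weight budget k at which each new minimal length appears."""
    kept = []
    for k, length in enumerate(row):
        if not kept or length != kept[-1][1]:
            kept.append((k, length))
    return kept

def lookup(kept, k):
    """Infer L(u1, u2, k): the last kept entry with key <= k."""
    i = bisect.bisect_right([kk for kk, _ in kept], k) - 1
    return kept[i][1] if i >= 0 else math.inf
```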

We use the pictorial representation to help explain how we further economize on table space. Consider Figure 2:

The horizontal lines in the interior of the two squares are identical, except that the two lines A, B in the figure on the left are replaced by the single line C on the right. Assume that the table on the left represents the result of the algorithm of Section II. Then the table on the right correctly answers all instances of MCP except for L = y and W = x. It thus saves one table entry, but is punished by giving the incorrect answer for one of the r² possible instances. Using such "table approximation" techniques permits us to prove the aforementioned results. For each node pair we replace a table of size nb with one much smaller. First a few definitions are given.

Fix a function f(r) < r, a nondecreasing function of r such that f(r) → ∞ as r → ∞.

For a function g: {1, ..., r} → {1, ..., r} such that g(i) ≥ g(j) if i < j, let h_g: {1, ..., r} × {1, ..., r} → {0, 1} be defined by h_g(w, l) = 1 iff l ≥ g(w). An f(r) approximation to g is any function g': {1, ..., r} → {1, ..., r} with at most f(r) elements in the range of g'. The number of domain elements (w, l) such that h_g(w, l) ≠ h_{g'}(w, l) is denoted |h_g − h_{g'}|.

FIG. 2.


FIG. 3. (A range element x of g: the horizontal line at L = x spans weights y, ..., y + b; the vertical line at W = y drops from g(y − 1) to x.)

The function g represents the table obtained from the algorithm of Section II, and h_g(w, l) = 1 iff some path has weight less than or equal to w and length less than or equal to l. An f(r) approximation g' is a table with at most f(r) entries. The quality of the approximation is measured by Prob(g, g') = (r² − |h_g − h_{g'}|)/r², the probability that an instance is answered correctly, assuming all values for L and W in {1, ..., r} are equally likely.

Theorem 2. Fix a function f(r) as above. One may devise an algorithm [which takes O(r log r) time] that, given any function g: {1, ..., r} → {1, ..., r}, produces an f(r) approximation to g, called g*, such that Prob(g, g*) ≥ 1 − 1/f(r).

Theorem 2 has the following informal content: any function g can be approximated quite closely by some f(r) approximation to g. In particular, as r → ∞, the fraction of domain elements on which h_g ≠ h_{g*} approaches 0 [since f(r) → ∞]. To prove Theorem 2, we explicitly construct the desired function g* for a given function g.

Proof of Theorem 2. Given a function g with m range elements, a function g' with m − 1 range elements is defined. The new function differs from g in only a few entries (in terms of |h_g − h_{g'}|).

Consider Figure 3, where x represents a range element of g. The semiperimeter of x is defined to be a + b: the sum of the lengths of the horizontal line with L = x and the vertical line with W = y [note that g(y) = g(y + 1) = ··· = g(y + b) = x].

Lemma 2. If g has m range elements, then some range element has semiperimeter that does not exceed ⌊2r/m⌋.

Proof of Lemma 2. The sum of the semiperimeters of all range elements is at most 2r. If all m semiperimeters exceeded ⌊2r/m⌋, then each would exceed 2r/m, and the sum of semiperimeters would exceed 2r. Hence at least one range element has semiperimeter of at most ⌊2r/m⌋.

Lemma 3. Assume g(y − 1) > x and g(y) = x. Define g': {1, ..., r} → {1, ..., r} by

g'(z) = g(z) if g(z) ≠ x;

g'(z) = g(y − 1) if g(z) = x


(i.e., eliminate x from the range of g'). Assume that the semiperimeter of x is s. Then

|h_g − h_{g'}| ≤ s²/4, if s is even;

|h_g − h_{g'}| ≤ (s² − 1)/4, if s is odd.

Proof. Refer to Figure 3. It is easy to see that |h_g − h_{g'}| = ab, where s = a + b. The largest possible value of ab given that s = a + b is s²/4, attained if a = b = s/2. Since a and b are integers, if s is odd, then the largest possible value of ab is ¼(s + 1)(s − 1), attained if a is ½(s + 1) and b is ½(s − 1). ∎

Returning to the proof of Theorem 2: given a function g = g_0, we find functions g_1, ..., g_k based on g such that g_k is an f(r) approximation to g. We will show that (r² − |h_g − h_{g_k}|)/r² ≥ q(r), where q(r) ≥ 1 − 1/f(r). This will prove Theorem 2 if we do this in time O(r log r).

The procedure to find g_{i+1} from g_i is straightforward. We attempt to minimize |h_{g_{i+1}} − h_{g_i}|. For each value y such that g_i(y − 1) > g_i(y), we investigate changing g_i by setting g_{i+1}(z) = g_i(y − 1) whenever g_i(z) = g_i(y). Of these at most r choices for g_{i+1}, one chooses the function which minimizes |h_{g_{i+1}} − h_{g_i}|.

If |range(g_i)| = m, then by Lemma 2 some range element has semiperimeter at most ⌊2r/m⌋. By Lemma 3 the difference |h_{g_{i+1}} − h_{g_i}| is at most ¼⌊2r/m⌋².

Since g = g_0 has at most r range elements, the value of |h_{g_0} − h_{g_k}| is at most

Σ_{i=0}^{r − f(r) − 1} ¼⌊2r/(r − i)⌋².   (1)

This follows from the fact that when going from r − i to r − i − 1 range elements, one causes at most ¼⌊2r/(r − i)⌋² incorrect instances of MCP. If the original function g has fewer than r range elements, then the summation has fewer terms, but in this upper-bound proof we may always add in extra terms.

In [5] it is shown that Eq. (1) is at most r²/[f(r) + 1] + r²/[f(r) + 1]². Thus

Prob(g, g_k) = (r² − |h_g − h_{g_k}|)/r² ≥ 1 − 1/[f(r) + 1] − 1/[f(r) + 1]² ≥ 1 − 1/f(r).   (2)

To review, one may find for any function g a “compact” good approximation. This may be done by successively removing “table entries” with small semiperimeter, i.e., by removing entries that “provide little information.”
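A quadratic-time sketch of this successive-removal procedure follows; the O(r log r) version replaces the linear scan with a priority queue of semiperimeters. The run-based representation and names are illustrative:

```python
def shrink_table(g, target_entries):
    """Greedy sketch of the Theorem 2 procedure. g is a nonincreasing
    list of table values; repeatedly eliminate the range value with the
    smallest semiperimeter (merging it into its left, larger neighbor)
    until at most target_entries distinct values remain."""
    # Represent g as runs [value, run_length], left to right.
    runs = []
    for v in g:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    while len(runs) > target_entries:
        # Semiperimeter of run i: horizontal extent plus the vertical
        # drop from the next-higher value on its left.
        best, best_sp = None, None
        for i in range(1, len(runs)):
            sp = runs[i][1] + (runs[i - 1][0] - runs[i][0])
            if best_sp is None or sp < best_sp:
                best, best_sp = i, sp
        # Rounding the merged entries up to the left value only errs
        # on a bounded rectangle of instances (Lemma 3).
        runs[best - 1][1] += runs[best][1]
        del runs[best]
    out = []
    for v, length in runs:
        out.extend([v] * length)
    return out
```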

The proof of Theorem 2 is completed with a discussion of running time. Clearly the minimal-semiperimeter entry may be found in O(r) time, but to do this O(r) times requires time O(r²). Since r = nb, r² = n²b², which may be too large. In particular, the algorithm of Section II took time O(n⁵b log nb), but if b is large, then O(b²) > O(n⁵b log nb).

To do the entire procedure in time O(r log r), the r initial entries are sorted by semiperimeter [time O(r log r)]. We then must perform the operations of min (to determine which element to remove), delete, and insert r times each (actually at each iteration two elements are deleted and one inserted). A data structure (such as a sorted list with pointers to the successive table entries) may be used which requires only O(log r) steps per operation. ∎

The proof of Theorem 2 does more than guarantee the existence of good f(r) approximations. It explicitly demonstrates how to build the small tables that lose little information.

Corollary. Let g: {1, ..., r} → {1, ..., r} be as above, and assume that a function g' is desired such that h_g and h_{g'} agree on p percent of their values (0 < p < 100). Such a function g', which has q(p) entries for some function q(p) of p alone, can be found in time O(r log r), for all r and all g: {1, ..., r} → {1, ..., r}.

Proof. Using the proof of Theorem 2, a function can be found which

(1) has q(p) entries, and
(2) agrees with g on 100(1 − {1/[q(p) + 1] + 1/[q(p) + 1]²}) percent of the entries.

It suffices to remark that some function q(p) has the property

100(1 − {1/[q(p) + 1] + 1/[q(p) + 1]²}) ≥ p   (3)

for all p. ∎

It can be seen from (3) that to achieve a probability p of correctness one needs roughly 100/(100 − p) entries. Thus to guarantee "almost certainty" one may need quite a large number of table entries. This should not be surprising, since every table entry that is eliminated causes the loss of some information.

One last issue is to find the paths that are generated by the shortened tables. Recall that to find a path through the graph, one merely had to collect a small amount of NEXT-NODE information from each node. Unfortunately, when the tables are shortened, the fact that most of the entries are removed may mean that there is not enough information to generate any paths.

Two methods of rectifying this are discussed. First, each node may store the entire path associated with each table entry that is not removed from the table. The second method is to keep additional table entries to aid path definition. In particular, if L(u1, u2, k) is to be kept in the table of u1 (as decided by the table-shortening algorithm) and u3 is on the minimal-length path from u1 to u2 with weight at most k, then an appropriate entry L(u3, u2, w) is kept at u3. This has the advantage of not only providing u1 with the needed path to u2, but also providing an additional path from u3 to u2.

Both methods require extra storage. In the worst case each path requires n vertices, and the total space per node pair is then n times the number of table entries. If the number of table entries is f(r) and b is large, then nf(r) may be far smaller than nb. Also, on average, "good" paths in networks are usually short, and fewer than n storage locations would be needed per path.


IV. APPROXIMATING THE MCP PROBLEM

The approaches of the last two sections may at times be inappropriate. If b is large, they are quite expensive in time and space. Even for small b, n⁵ is quite large for moderate values of n. An algorithm that approximates the constraints and runs quickly is thus appropriate.

To quickly find paths in a network, a natural algorithm is to minimize αL(p) + βW(p), where α, β ∈ Z⁺. To solve this efficiently one merely defines a length function l': E → Z⁺ by l'(e) = αl(e) + βw(e). The path that minimizes L'(p) also minimizes αL(p) + βW(p), since L'(p) = αL(p) + βW(p) for every p.
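A sketch of this reduction (undirected network assumed; the single-criterion shortest path is found with standard Dijkstra, which applies since all combined costs are positive):

```python
import heapq

def min_linear_combo(n, edges, alpha, beta, src, dst):
    """Reduce minimizing alpha*L(p) + beta*W(p) to a one-criterion
    shortest path with edge cost l'(e) = alpha*l(e) + beta*w(e)."""
    adj = [[] for _ in range(n)]
    for (u, v, l, w) in edges:
        c = alpha * l + beta * w
        adj[u].append((v, c))
        adj[v].append((u, c))  # undirected network
    dist = [float("inf")] * n
    dist[src] = 0
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, c in adj[u]:
            if d + c < dist[v]:
                dist[v] = d + c
                heapq.heappush(heap, (d + c, v))
    return dist[dst]
```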

An important question to ask is: what does it mean for an algorithm to approximate the constraints? For example, consider the minimization of αL(p) + βW(p). It is possible that an optimal path [according to αL(p) + βW(p)] violates the constraints while a suboptimal path satisfies them.

The following objective function f measures the amount by which a particular path violates the constraints. It has the additional property that if any feasible solutions to an instance of MCP exist, then all feasible solutions minimize f:

f(p) = max{L(p), L} + max{W(p), W}.   (4)

The best possible value of f is L + W, which is achieved only at feasible solutions. A path p is not considered any better if L(p) is much smaller than L, rather than slightly smaller. If a constraint is violated, the path's value depends linearly on the "extent of constraint violation." Any algorithm that minimizes f(p) solves MCP: there is a feasible solution if and only if some path p has f(p) = L + W.
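In code, the objective and the feasibility test it induces are one-liners. A sketch, where the closed form f(p) = max{L(p), L} + max{W(p), W} is the one matching the properties just stated (minimum L + W exactly at feasible paths, linear growth with any violation):

```python
def f_objective(path_len, path_wt, L, W):
    """f(p) = max(L(p), L) + max(W(p), W): equals L + W exactly when
    the path is feasible, and grows linearly with any violation."""
    return max(path_len, L) + max(path_wt, W)
```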

Note that solving MCP does not facilitate the minimization of f. Certainly, if a feasible solution exists then the minimum value of f is L + W. However, if no such solution exists, we know of no way to use MCP as a subroutine to minimize f in polynomial time. Despite the inequivalence between MCP and the function f, the function still seems to approximate the intention of MCP.

One might desire to generalize the function f to

f'(p) = α max{L(p), L} + β max{W(p), W},   (5)

where MCP has a solution iff f' is minimized at αL + βW. This is actually not a generalization but a special case of f. If all lengths are multiplied by α and all weights by β, then f(p) in the modified system equals f'(p) in the original system. If one desired to emphasize lengths and weights differently, this could be done either by using f' or by changing the scale of the lengths and weights in the network.

V. A LIMITED-INFORMATION ALGORITHM

As a means of introducing our techniques, we give a simple way to approximate f. This method does not use the values of L and W in the problem instance, but finds the same path for all L and W. A possible application is to provide a single path between two nodes in a network which performs reasonably well for different sets of constraints. The algorithm is to minimize L(p) + W(p). Assume W ≥ L.


Theorem 3. Fix a graph G = (V, E), length function l, weight function w, and bounds W, L as above. Let u1, u2 ∈ V, and let p* be the minimum-f path between u1 and u2. Let p' be the path from u1 to u2 that minimizes L(p) + W(p). Then f(p')/f(p*) ≤ 1 + W/(L + W) ≤ 2.

Proof. Define g(p) = L(p) + W(p). Clearly f(p) ≥ g(p) for all p. Assume p is not feasible, i.e., L(p) > L or W(p) > W. Then g(p) + max{L, W} ≥ f(p). This follows from g(p) = L(p) + W(p) and the fact that f(p) is either L(p) + W, W(p) + L, or W(p) + L(p).

If p' is feasible, then f(p')/f(p*) = 1 and the theorem is proved. Assume p' is not feasible. Then

f(p')/f(p*) ≤ [g(p') + max{L, W}]/f(p*) ≤ g(p')/g(p*) + W/(L + W) ≤ 1 + W/(L + W) ≤ 2.   (6)

The first inequality follows from f(p') ≤ g(p') + max{L, W}. The second follows from f(p*) ≥ g(p*), f(p*) ≥ L + W, and max{L, W} = W. The third inequality follows from g(p') ≤ g(p*), since p' minimizes g. ∎

Note that the bound of 1 + W/(L + W) is tight, in that for every pair (L, W) the algorithm does at times perform that poorly. To see this, consider the graph of Figure 4 with lengths and weights as labelled. The path that minimizes L(p) + W(p) is p = (A, B, C), with g(p) = L + W − ε and f(p) = L + 2W − ε. However, f(A, C) = L + W. Thus

f(p)/f(A, C) = (L + 2W − ε)/(L + W).   (7)

As ε → 0, the ratio approaches 1 + W/(L + W).

In evaluating a heuristic, it is useful to know its average behavior. For example, if a single path is used for many values of L and W, the average performance is of interest. Unfortunately, the average depends on "the average graph to be found in practice" or "average length and weight functions." Generally, such analyses must be made with knowledge of the particular application; e.g., if a graph represents a computer network, then one might assume that the average degree of a node is small. The following describes one general definition of average behavior.

FIG. 4.

Assume that L and W are random variables with joint probability density function h(L, W). Assume that the worst-case ratio of an algorithm A is perf(L, W) (in terms of minimizing f). Then the average worst-case performance of A is defined by

perf = ∫∫ perf(L, W) h(L, W) dL dW.   (8)

Since we do not know how L and W will be distributed in general, we use simple distributions just to get some feeling for average behavior. To do this, we first redefine the MCP problem. Assume that all lengths and weights are arbitrary real numbers, and that W and L are independent and uniformly distributed in [0, 1]. Thus (8) may be rewritten as

perf = ∫₀¹ ∫₀¹ perf(L, W) dL dW.   (9)

It is our contention that the above provides a reasonable measure of "average worst-case performance." In the sequel we use a different version of (9). Recall that perf(L, W) = perf(W, L) for the algorithm discussed above. It turns out that this is true for all algorithms in this paper (since it is essentially a renaming of L and W). Using perf(L, W) = perf(W, L), changing the order of integration, and renaming the variables,

perf = 2 ∫₀¹ ∫₀^W perf(L, W) dL dW.

(In the sequel, ln refers to the natural logarithm.)

Theorem 4. Let perf be as above {the average performance of minimizing L(p) + W(p), where L and W are uniform in [0, 1]}. Then perf = 1 + ln 2 ≈ 1.6931.

Proof. It suffices to calculate

∫₀¹ ∫₀^W [1 + W/(L + W)] dL dW = ∫₀¹ W(1 + ln 2) dW = (1 + ln 2)/2.

The value of perf is twice the above quantity. ∎

Having introduced the basic technique of approximating f by doing a shortest-path calculation and the basic method of evaluating average behavior, we turn to improved algorithms. The algorithms will use L and W to decide which shortest-path calculation to make.
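The closed form 1 + ln 2 can be checked by evaluating the doubled integral numerically; a midpoint-rule sketch (our sanity check, not part of the paper):

```python
import math

def avg_perf(ratio, n=400):
    """Midpoint-rule estimate of 2 * int_0^1 int_0^W ratio(L, W) dL dW."""
    total = 0.0
    for i in range(n):
        W = (i + 0.5) / n
        for j in range(n):
            L = W * (j + 0.5) / n           # L runs over (0, W)
            total += ratio(L, W) * (W / n)  # inner cell width is W/n
    return 2.0 * total / n                  # outer cell width is 1/n

perf = avg_perf(lambda L, W: 1.0 + W / (L + W))
assert abs(perf - (1.0 + math.log(2.0))) < 2e-3
```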

VI. AN IMPROVED ALGORITHM

The approach used here is to minimize L(p) + dW(p). The value d is chosen on the basis of L and W. First, an algorithm is described for choosing d if some feasible solution exists. (If a feasible solution does exist, one might be especially concerned about finding a reasonable approximation.) A different algorithm, which has better overall behavior if no feasible solution exists but is not as good when feasible solutions exist, is given in Section VII. The worst-case and average worst-case behavior of both algorithms is given.

Lemma 4. Let G, l, w, L, W, u1, u2, and p* be as in Theorem 3. Let p′ be the path from u1 to u2 that minimizes L(p) + dW(p). If p* is feasible, then

f(p′)/f(p*) ≤ 1 + max{dW/(L + W), (L/d)/(L + W)}.

Proof. The proof is divided into four cases.

Case 1. L(p′) ≤ L, W(p′) ≤ W. In this case f(p′) = f(p*).

Case 2. L(p′) > L, W(p′) > W. This case cannot happen, since it implies L(p′) + dW(p′) > L(p*) + dW(p*).

Case 3. L(p′) ≤ L, W(p′) > W. Define g(p) = L(p) + dW(p). Then g(p′) = L(p′) + dW(p′) and (1/d)g(p′) = (1/d)L(p′) + W(p′). Also f(p′) = L + W(p′). Thus f(p′) ≤ (1/d)g(p′) + L. Also g(p′) ≤ g(p*) ≤ L + dW. Combining these inequalities yields

f(p′)/f(p*) ≤ 1 + (L/d)/(L + W).   (13)

Case 4. L(p′) > L, W(p′) ≤ W. Then g(p′) = L(p′) + dW(p′), f(p′) = L(p′) + W, and f(p′) ≤ g(p′) + W. Thus

f(p′)/f(p*) ≤ 1 + dW/(L + W).   (14)

From (13) and (14), Lemma 4 follows immediately. ∎

For a given pair (L, W), the value of d that provides the best performance is the one that minimizes

max{dW/(L + W), (L/d)/(L + W)}.

Since the first term is an increasing function of d and the second a decreasing function of d, the max is minimized when they are equal, i.e., when dW = L/d or d = (L/W)^{1/2}. Thus the algorithm of minimizing L(p) + dW(p) gets best performance when using d = (L/W)^{1/2}. In that case

f(p′)/f(p*) ≤ 1 + (LW)^{1/2}/(L + W).

Theorem 5. Let G, l, w, L, W, u1, u2, and p* be as in Theorem 3, and p′ defined based on d = (L/W)^{1/2}. Then f(p′)/f(p*) ≤ 1 + (WL)^{1/2}/(W + L) ≤ 1.5.

Proof. Substitute d = (L/W)^{1/2} in

1 + max{dW/(W + L), (L/d)/(W + L)}

for the first inequality. To prove that 1/2 ≥ (WL)^{1/2}/(W + L), note (W^{1/2} − L^{1/2})² ≥ 0, thus W − 2W^{1/2}L^{1/2} + L ≥ 0, thus W + L ≥ 2(WL)^{1/2}, thus 1/2 ≥ (WL)^{1/2}/(W + L). ∎

Theorem 6. With notation as above, the average worst-case behavior of minimizing L(p) + (L/W)^{1/2}W(p) when some feasible solution exists is 3 − π/2 ≈ 1.4292.

Proof. It suffices to calculate (see [4])

∫₀¹ ∫₀^W [1 + (LW)^{1/2}/(L + W)] dL dW = 3/2 − π/4.   (15)

The integration with respect to L can be done using a table of integrals [2]. As above, perf is just twice (15). ∎
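A midpoint-rule evaluation of the integral (our sanity check, not the paper's) confirms the constant 3 − π/2:

```python
import math

n = 500
total = 0.0
for i in range(n):
    W = (i + 0.5) / n
    for j in range(n):
        L = W * (j + 0.5) / n
        total += (1.0 + math.sqrt(L * W) / (L + W)) * (W / n)
perf = 2.0 * total / n                  # doubling covers the region L > W
assert abs(perf - (3.0 - math.pi / 2.0)) < 2e-3
```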

Note that the bound of

1 + max{dW/(W + L), (L/d)/(W + L)}

is best possible, and thus Theorems 5 and 6 are best possible as far as algorithms that minimize L(p) + dW(p) are concerned. Consider the example of Figure 5.

FIG. 5.

Let p* = (A, B), p′ = (A, C, B). Then L(p*) + dW(p*) = L + dW while L(p′) + dW(p′) = L − ε + dW, but

f(p′)/f(p*) = [L(1 + 1/d) + W − ε/d]/(L + W) = 1 + [(L − ε)/d]/(L + W).

Thus, if L/d > dW the bound is achieved as ε → 0. Similarly, if dW > L/d, the bound is achieved if l(A, C) = L + dW − ε and w(A, C) = 0.

The above algorithm provides good approximations when a feasible solution exists. Unfortunately, the paths may be unboundedly worse than optimal if no feasible solution exists. This is illustrated with the following example.

Example. Let W = 1, L = ε², d = ε. The algorithm above attempts to minimize L(p) + εW(p), and when a feasible solution exists it is at most 1 + ε/(1 + ε²) times worse than optimal. Consider, however, Figure 6: path (A, B) has value 1 for L(p) + εW(p), while (A, C, B) has value 1 + ε. Thus (A, B) would be chosen by the algorithm. Now f(A, B) = ε² + 1/ε and f(A, C, B) = 2 + ε, so as ε → 0, f(A, B)/f(A, C, B) → (1/2)(1/ε) → ∞.
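The blow-up can be reproduced with a three-node instance. The edge labels below are our own choice (the exact labels of the paper's Figure 6 are not reproduced here; with ours, f(A, C, B) = 2 rather than 2 + ε), but the asymptotics are the same:

```python
def f(Lp, Wp, L, W):
    """Objective f(p) = max{L(p), L} + max{W(p), W}."""
    return max(Lp, L) + max(Wp, W)

def ratio(eps):
    """W = 1, L = eps**2, d = eps, as in the example.

    Hypothetical labels: path (A, B) has L(p) = 0, W(p) = 1/eps;
    path (A, C, B) has L(p) = 1, W(p) = 1.
    """
    L, W, d = eps * eps, 1.0, eps
    g_ab = 0.0 + d * (1.0 / eps)        # = 1
    g_acb = 1.0 + d * 1.0               # = 1 + eps
    assert g_ab < g_acb                 # so the algorithm picks (A, B)
    return f(0.0, 1.0 / eps, L, W) / f(1.0, 1.0, L, W)

assert ratio(0.5) > 1.0
assert ratio(0.1) > ratio(0.5)
assert ratio(1e-6) > 1e5                # unbounded as eps -> 0
```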

To rectify examples such as the above, we give an algorithm which has better performance when no feasible solution exists.

VII. A GENERAL ALGORITHM

This section provides a new algorithm to approximate MCP. It is at most (1/2)(√5 + 1) ≈ 1.618 times worse than optimal. The average worst-case behavior of this algorithm is

FIG. 6.


1.5103 times worse than optimal. The outline of the algorithm is identical to that of Section VI: one minimizes L(p) + dW(p). The parameter d is chosen to ensure good behavior even if no feasible solution exists. In the sequel, p* denotes any path that minimizes f between a given pair of vertices, and p′ any path that minimizes g(p) = L(p) + dW(p).

Lemma 5. (Notation as above.) If p′ is not feasible and 0 < d ≤ 1, then f(p′) ≤ max{g(p*) + W, L + g(p*)/d}.

Proof. Recall that g(p′) ≤ g(p*); that is, L(p′) + dW(p′) ≤ g(p*). The proof of Lemma 5 is divided into three cases.

Case 1. L(p′) > L, W(p′) ≤ W. In this case f(p′) = L(p′) + W ≤ g(p*) + W.

Case 2. L(p′) ≤ L, W(p′) > W. In this case f(p′) = L + W(p′) ≤ L + g(p*)/d.

Case 3. L(p′) > L, W(p′) > W. In this case f(p′) = L(p′) + W(p′) ≤ L(p′)/d + W(p′) ≤ g(p*)/d. ∎

Lemma 6. (Notation as above.) If 0 < d ≤ 1, then

f(p′)/f(p*) ≤ max{1 + dW/(L + W), [max{L(p*), L}/d + W + L]/[max{L(p*), L} + W]}.

Proof. The proof is divided into four cases.

Case 1. There is a feasible solution; thus L(p*) ≤ L and W(p*) ≤ W. In that case Lemma 6 is immediate from Lemma 4.

Case 2. L(p*) ≥ L, W(p*) ≤ W. Using Lemma 5 and f(p*) = L(p*) + W, we have

f(p′)/f(p*) ≤ max{[L(p*) + dW + W]/[L(p*) + W], [L + L(p*)/d + W]/[L(p*) + W]}
≤ max{1 + dW/(L + W), [L(p*)/d + W + L]/[L(p*) + W]}.


Case 3. L(p*) ≤ L, W(p*) > W. Using Lemma 5 and f(p*) = L + W(p*), we have

f(p′)/f(p*) ≤ max{[L + dW(p*) + W]/[L + W(p*)], [L + L/d + W(p*)]/[L + W(p*)]}.

It suffices to show

(a) [L + dW(p*) + W]/[L + W(p*)] ≤ 1 + dW/(L + W);
(b) [L + L/d + W(p*)]/[L + W(p*)] ≤ [L/d + W + L]/(L + W).

We prove (a); (b) is trivial. Note that W(W + L − Ld) ≤ W(p*)(W + L − Ld), since W(p*) ≥ W and W + L − Ld ≥ 0. From this it follows that (L + W)[L + W + dW(p*)] ≤ (L + W + dW)[L + W(p*)], which implies (a).

Case 4. L(p*) > L, W(p*) > W. Using Lemma 5 and f(p*) = L(p*) + W(p*), we have

f(p′)/f(p*) ≤ max{[L(p*) + dW(p*) + W]/[L(p*) + W(p*)], [L + L(p*)/d + W(p*)]/[L(p*) + W(p*)]}.

It suffices to show

(a) [L(p*) + dW(p*) + W]/[L(p*) + W(p*)] ≤ 1 + dW/(L + W);
(b) [L + L(p*)/d + W(p*)]/[L(p*) + W(p*)] ≤ [L(p*)/d + W + L]/[L(p*) + W].

Note that W[L + W − dL(p*)] ≤ W(p*)(L + W − dL), since W ≤ W(p*), L ≤ L(p*), and L + W − dL ≥ 0. From this it follows that [L(p*) + dW(p*) + W](L + W) ≤ (L + W + dW)[L(p*) + W(p*)], which implies (a).

Also note that L(p*)[W(p*) − W] ≤ [L + L(p*)/d][W(p*) − W], since d ≤ 1. From this follows

[L + L(p*)/d + W(p*)][L(p*) + W] ≤ [L + L(p*)/d + W][L(p*) + W(p*)],

and (b). ∎


Lemma 6 does not give a clear picture of how to choose d as a function of L and W, since the performance bound depends on L(p*). However, one may distill reasonable values of d to be used irrespective of L(p*). Inasmuch as the bounds of Theorems 5 and 6 were quite good, we first investigate using d = (L/W)^{1/2}.

This choice of d cannot always be used. As L/W → 0, d would go to 0 too. If L(p*) >> L, W, the ratio of Lemma 6 grows like 1/d; thus, the ratio approaches infinity. However, for a large class of (L, W) pairs, d may be chosen to equal (L/W)^{1/2} with the same performance as given in Theorem 5. The following lemma provides the first step.

Lemma 7. (Notation as above.) If d ≥ W/(L + W), then

max{1 + dW/(W + L), [max{L(p*), L}/d + W + L]/[max{L(p*), L} + W]} ≤ 1 + max{dW/(W + L), (L/d)/(W + L)}.   (16)

Proof. Note that Lemma 7 will permit us to use d = (L/W)^{1/2} whenever (L/W)^{1/2} ≥ W/(L + W), since the bound then equals that which was obtained in Lemma 4. Also note that if L ≥ L(p*), the two sides of Eq. (16) are equal. It suffices to show that if L(p*) > L, then

[L(p*)/d + W + L]/[L(p*) + W] ≤ [W + L + L/d]/(W + L).   (17)

Since d ≥ W/(L + W), we have W/d ≤ L + W, and (W/d)[L(p*) − L] ≤ (L + W)[L(p*) − L] [since L(p*) ≥ L]. From this it follows that

(L + W)[L(p*)/d + W + L] ≤ [L(p*) + W][W + L + L/d],

which results in (17). ∎

which results in ( 1 7). It follows from Lemma 7 that whenever (LIW)''' > W/(L t W), the worst-case per-

formance of optimizing ~ ( p ) t ( L / w ) ' I ~ ~ ( p ) is at most 1 t (wL) ' /~ / (w + L ) < 1.5 even if no feasible solution exists. W and L satisfy (L/W)'12 Z W/(L t W ) , when L312 t WL'I' - W312 > 0 iff 2.1482L Z W. Thus whenever W/L < 2.1482, the worst- case performance is at most 1 t (wL) ' /~ /(w t L).
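The constant 2.1482 is the positive root of r^{3/2} = 1 + r (set r = W/L in L^{3/2} + WL^{1/2} − W^{3/2} = 0 and divide by L^{3/2}); a quick bisection check of our own:

```python
def h(r):
    """h(r) = r**1.5 - r - 1; the threshold value of W/L is its positive root."""
    return r ** 1.5 - r - 1.0

lo, hi = 2.0, 3.0                       # h(2) < 0 < h(3)
for _ in range(60):                     # plain bisection
    mid = (lo + hi) / 2.0
    if h(mid) > 0.0:
        hi = mid
    else:
        lo = mid
assert abs(lo - 2.1482) < 1e-3
```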

Next, a method for choosing d is given when W/L > 2.1482.

Lemma 8. (Notation as above.) If 0 < d ≤ 1, then f(p′)/f(p*) ≤ max{1 + dW/(L + W), 1/d, (L + W)/W}.

Proof. Let L̂ = max{L(p*), L}. It suffices to show that

(L̂/d + W + L)/(L̂ + W) ≤ max{1/d, (L + W)/W}.   (18)


It shall be shown that if 1/d does not exceed the left-hand side of (18), then (L + W)/W does. Assume

1/d ≤ (L̂/d + W + L)/(L̂ + W).

Then W/d ≤ W + L, or W/(W + L) ≤ d. In that case

(L̂/d + W + L)/(L̂ + W) ≤ [L̂(L + W)/W + W + L]/(L̂ + W) = (L + W)(L̂ + W)/[W(L̂ + W)] = (L + W)/W. ∎

To optimize f(p′)/f(p*), we first choose d to minimize max{1 + dW/(L + W), 1/d}. Then it is shown that for the chosen value of d, if W/L > 2.1482, then 1/d = 1 + dW/(L + W) ≥ (L + W)/W. max{1 + dW/(L + W), 1/d} is minimized at 1 + dW/(L + W) = 1/d, i.e., at d² + d(L + W)/W − (L + W)/W = 0, or

d = (1/2){−(L + W)/W + [((L + W)/W)² + 4(L + W)/W]^{1/2}}.
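This root can be computed directly from the quadratic. As a check (the sample bounds are ours), the two terms of the max coincide at this d, and 1/d takes the closed form 1/2 + [1/4 + W/(L + W)]^{1/2}:

```python
import math

def balanced_d(L, W):
    """Positive root of d**2 + d*(L + W)/W - (L + W)/W = 0."""
    c = (L + W) / W
    return (-c + math.sqrt(c * c + 4.0 * c)) / 2.0

L, W = 1.0, 3.0                          # sample bounds (ours)
d = balanced_d(L, W)
assert 0.0 < d < 1.0
# the two terms of the max coincide at this d:
assert abs(1.0 / d - (1.0 + d * W / (L + W))) < 1e-12
# and 1/d has the closed form 1/2 + sqrt(1/4 + W/(L + W)):
assert abs(1.0 / d - (0.5 + math.sqrt(0.25 + W / (L + W)))) < 1e-12
```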

Lemma 9. (Notation as above.) Let W/L > 2.1482. Then

f(p′)/f(p*) ≤ 1 + [W/(L + W)] · (1/2){−(L + W)/W + [((L + W)/W)² + 4(L + W)/W]^{1/2}} = 1/d

if d is chosen by

d = (1/2){−(L + W)/W + [((L + W)/W)² + 4(L + W)/W]^{1/2}}.

Proof. By Lemma 8,

f(p′)/f(p*) ≤ max{1 + dW/(L + W), 1/d, (L + W)/W}.

The right-hand side of the inequality of Lemma 9 is just 1 + dW/(L + W) with the relevant value of d substituted. Also, at this choice of d, 1/d = 1 + dW/(L + W). It suffices to show that 1/2 + [1/4 + W/(L + W)]^{1/2} ≥ (L + W)/W.

Recall that W/L > 2.1482 implies L^{3/2} + WL^{1/2} < W^{3/2} (that is how 2.1482 was determined). Thus L³ + 2L²W + W²L < W³, which implies W³ > (L + W)³ − W(L + W)², which implies 1/4 + W/(L + W) > [(L + W)/W]² − (L + W)/W + 1/4, which implies [1/4 + W/(L + W)]^{1/2} > (L + W)/W − 1/2. The first implication follows from (L + W) − W = L, the second by dividing through by W²(L + W) and adding 1/4, and the third by taking square roots. ∎

Theorem 7. (Notation as above.) Suppose d is chosen to be (L/W)^{1/2} if W/L ≤ 2.1482 and {−(L + W) + [(L + W)² + 4W(L + W)]^{1/2}}/2W if W/L > 2.1482. Then

f(p′)/f(p*) ≤ (1/2)(1 + √5) ≈ 1.618.

Proof. If W/L ≤ 2.1482, then for d = (L/W)^{1/2} it has already been shown that f(p′)/f(p*) ≤ 1.5. It suffices to show that for W/L > 2.1482, 1/2 + [1/4 + W/(L + W)]^{1/2} ≤ 1.618. Note that the maximum value of 1/2 + [1/4 + W/(L + W)]^{1/2} occurs as W/L → ∞ and W/(L + W) → 1. This value is 1/2 + (5/4)^{1/2} = (1/2)(1 + √5). ∎

In summation, choosing d as indicated in Theorem 7 guarantees that the performance is at most 1.618 times worse than optimal. As W/L → ∞, d is chosen close to 0.618 and not 0, as indicated by the choice of d in Theorem 5. In particular, even if there is a feasible solution, the ratio may be as large as 1.618 using the new method of choosing d.

Using a standard trick, one may design an algorithm that simultaneously is at most 1.5 times worse than optimal when a feasible solution exists, and at most 1.618 times worse than optimal otherwise, even if W/L > 2.1482. Namely, one optimizes both L(p) + (L/W)^{1/2}W(p) and L(p) + dW(p) (d as in Theorem 7), and chooses the path with the smaller value of f.

The choice of d described in Theorem 7 is best possible. In Section VI it was already shown that d = (L/W)^{1/2} cannot be improved upon (if W/L ≤ 2.1482). It suffices to show that for any value of d, max{1 + dW/(L + W), 1/d} is achievable (if W/L > 2.1482). Since it has already been shown that 1 + dW/(L + W) is achievable, it suffices to show that 1/d is achievable. We prove achievability only if d ≤ 1; if d > 1, then 1 + dW/(L + W) > 1/d.

Consider Figure 7. f(A, B) = x + W and f(A, C, B) = L + (x − ε)/d. Since d ≤ 1, for sufficiently small ε and sufficiently large x, (A, B) is optimal. Nevertheless, L(A, B) + dW(A, B) = x ≥ x − ε = L(A, C, B) + dW(A, C, B). Thus the ratio f(A, C, B)/f(A, B) = [L + (x − ε)/d]/(x + W), which approaches 1/d as x → ∞.
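Theorem 7's selection of d and the bound it guarantees can be sketched directly (function names ours; the loop samples ratios W/L rather than proving the inequality):

```python
import math

PHI = (1.0 + math.sqrt(5.0)) / 2.0      # (1/2)(1 + sqrt(5)) = 1.618...
THRESHOLD = 2.1482

def choose_d(L, W):
    """d as in Theorem 7."""
    if W / L <= THRESHOLD:
        return math.sqrt(L / W)
    c = (L + W) / W
    return (-c + math.sqrt(c * c + 4.0 * c)) / 2.0

def bound(L, W):
    """The proven worst-case ratio for this choice of d."""
    if W / L <= THRESHOLD:
        return 1.0 + math.sqrt(L * W) / (L + W)
    return 0.5 + math.sqrt(0.25 + W / (L + W))   # = 1/d

for k in range(1, 200):                 # sampled check: bound <= 1.618 throughout
    assert bound(1.0, float(k)) <= PHI + 1e-12
assert abs(choose_d(1.0, 1e9) - (PHI - 1.0)) < 1e-3   # d -> 0.618 as W/L -> inf
```

For the "standard trick," one would run the shortest-path computation twice, once with d = (L/W)^{1/2} and once with d = choose_d(L, W), and keep whichever returned path has the smaller value of f.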

Next, the average worst-case performance of the above algorithm is calculated.

FIG. 7.

Theorem 8. (Notation as above.) The average worst-case performance of minimizing L(p) + dW(p), where d = (L/W)^{1/2} for W/L ≤ 2.1482 and

d = {−(L + W) + [(L + W)² + 4W(L + W)]^{1/2}}/2W for W/L ≥ 2.1482,

is

perf = 2 ∫₀¹ ∫_{W/2.1482}^{W} [1 + (LW)^{1/2}/(L + W)] dL dW + 2 ∫₀¹ ∫₀^{W/2.1482} {1/2 + [1/4 + W/(L + W)]^{1/2}} dL dW ≈ 1.5103.

Proof. The fact that perf equals the sum of the two indicated integrals follows from the values of perf(L, W) obtained in Lemma 9 and the discussion following Lemma 7. It suffices to evaluate the integrals. In [4] it is shown that the above expression is correct. ∎
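Evaluating the two integrals numerically (a midpoint-rule sanity check of ours, in place of the closed-form computation of [4]) reproduces the constant:

```python
import math

def worst_case(L, W):
    """perf(L, W) for Theorem 8's choice of d (here with L <= W)."""
    if W / L <= 2.1482:
        return 1.0 + math.sqrt(L * W) / (L + W)
    return 0.5 + math.sqrt(0.25 + W / (L + W))   # = 1/d, by Lemma 9

n = 600
total = 0.0
for i in range(n):
    W = (i + 0.5) / n
    for j in range(n):
        L = W * (j + 0.5) / n
        total += worst_case(L, W) * (W / n)
perf = 2.0 * total / n
assert abs(perf - 1.5103) < 5e-3
```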

VIII. RELATED PROBLEMS

There are a number of calculations that may be done without any new technical ideas, which are left to the reader. For example, one may look at the case of more than two constraints, or different distributions for L and W.

One might attempt to improve the running time of the algorithm of Section II. All-points shortest paths can be done in time O(n³) in a centralized manner [1]. Since we are finding nb paths, a time of O(n⁴b) may be expected, rather than O(n⁵b log nb). This might not be hard in a centralized manner, but is probably quite difficult in a distributed manner.

The most interesting technical problem is to find more sophisticated approximation algorithms or table-shortening techniques. Included may be methods to shorten all tables at once so that after an entry is removed it will never have to be placed back for the sake of other entries.

IX. SUMMARY

We have taken an NP-complete problem with important practical applications and shown how to solve it for all practical purposes. An algorithm solves it in polynomial time if the lengths and weights have a small range of values. We have indicated how to approximate the problem if one only has a small amount of time available.

The author acknowledges useful discussions with F. H. Moss.

References

[1] A. V. Aho, J. E. Hopcroft, and J. D. Ullman, The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading, MA (1974).

[2] H. B. Dwight, Tables of Integrals and Other Mathematical Data. Macmillan, New York (1961).

[3] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, San Francisco (1979).

[4] J. M. Jaffe, "Algorithms for finding paths with multiple constraints." IBM RC 8205, April 1980.

[5] P. M. Merlin and A. Segall, "A failsafe distributed routing protocol." IEEE Trans. Comm. COM-27 (1979) 1230-1237.

[6] W. D. Tajibnapis, "A correctness proof of a topology information maintenance protocol for a distributed computer network." Comm. ACM 20 (1977) 477-485.

Received May 22, 1980. Accepted April 21, 1983.