JOURNAL OF ALGORITHMS 6, 400-408 (1985)

A Theorem on the Expected Complexity of Dijkstra’s Shortest Path Algorithm

KOHEI NOSHITA

Department of Computer Science, Denkitusin University, Chofu, Tokyo 182, Japan

Received November 10, 1983; accepted June 26, 1984

The expected number of times the tentative distances assigned to nodes are updated in Dijkstra’s algorithm for the single-source shortest path problem is proved to be at most $n \log_e(2m/n)$, where $m$ and $n$ are the numbers of edges and nodes, respectively, of a given graph. The assumption on probability distributions is as general as Spira’s for the all-pair shortest path problem. As a corollary of this result, an efficient method for implementing the algorithm by means of a $\lfloor \log_e(2m/n) \rfloor$-ary heap is presented, which has the upper bound of $m + 2n \log_2 n \, \log_e(2m/n) / \log_2 \log_e(2m/n)$ binary comparisons on average on the decision tree model. © 1985 Academic Press, Inc.

1. INTRODUCTION

In this paper we investigate the expected behavior of Dijkstra’s algorithm [3] for the single-source shortest path problem for nonnegatively weighted undirected graphs. Throughout the paper, m and n will stand for the number of edges and nodes respectively of a given graph. We prove that the expected number of times of updating the values of tentative distances assigned to nodes is bounded by

$n \log_e \mu$,

where $\mu$ is defined to be $2m/n$, i.e., the average degree per node. This fact has been reported by the author [10] empirically, with some analysis for complete graphs and certain types of probability distributions. The theorem of the present paper holds for any given connected graph and for arbitrary probability distributions. This assumption is identical to Spira’s [12] for the all-pair shortest path problem.

As a corollary of our theorem, we present an expectedly efficient method for implementing Dijkstra’s algorithm, by employing the well-known $\beta$-heap

0196-6774/85 $3.00 Copyright © 1985 by Academic Press, Inc. All rights of reproduction in any form reserved.

method [1]. We propose to use a $\lfloor \log_e \mu \rfloor$-heap for representing the set of nodes with tentative distances. The expected complexity of this method is asymptotically

$m + 2n \log_2 n \, \log_e \mu / \log_2 \log_e \mu$

in terms of the number of binary comparisons performed in the algorithm. This result implies, for example, that the total number of comparisons is close to the number of edges for any relatively dense graph. In a later section of this paper, our expected-case upper bound will be compared in detail with the worst-case upper bound obtained by Johnson [6].

As a remark, we note that the technique used in the proof of our theorem is based on a new idea, which can be contrasted with several analyses recently published ([12, 2, 5]). They are all based on almost the same strategy for searching for shortest paths. See also [4] for a different approach. However, all those papers deal with the all-pair problem.

2. BASIC DEFINITIONS AND DIJKSTRA'S ALGORITHM

Let $G = (V, E)$ be an undirected graph. We may assume that $G$ is simple and connected. Clearly, $n - 1 \le m \le n(n - 1)/2$, where $m = |E|$ and $n = |V|$. The single-source shortest path problem is to determine all the shortest distances from a specified source node $s$ in $V$ to the other nodes, where the edge weight $w(e)$, a nonnegative real number, is given for each edge $e$ in $E$ as input together with $G$ and $s$. For example, $w([u, v])$ denotes the weight of edge $[u, v]$ in $E$.

Let $\delta^*(v)$ denote the shortest distance from $s$ to node $v$ in $V$. For the sake of our discussion, we describe the complete Dijkstra’s algorithm [3]:

 1  begin P := ∅; δ(s) := 0; T := {s};
 2    while T ≠ ∅ do
 3    begin
 4      FINDMIN: find the minimum value δ(u) in T; let u* be the node giving this minimum;
 5      δ*(u*) has now been determined to be δ(u*);
 6      P := P ∪ {u*}; T := T − {u*};
 7      UPDATE: for v such that [u*, v] in E and v not in P do
 8        if v in T
 9        then δ(v) := min{δ(v), δ*(u*) + w([u*, v])}
10        else begin T := T ∪ {v};
11          δ(v) := δ*(u*) + w([u*, v])
12        end
13    end
14  end

In the algorithm, $P$ stands for the set of nodes whose shortest distances have been determined, $T$ for the set of nodes with tentative distances, and $\delta(u)$ for the current tentative distance assigned to $u$ in $T$. For the basic results on the complexity of this algorithm, see, for example, [1, 8].
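As an illustration only, the listing above can be sketched in Python as follows. The function name `dijkstra_count_updates`, the edge-list representation, and the use of a lazy-deletion binary heap (Python's `heapq`) in place of the set $T$ are assumptions of this sketch and not part of the original description. The counter `updates[v]` is intended to record the strict decreases performed by the assignment of line 9.

```python
import heapq
from math import inf


def dijkstra_count_updates(n, edges, s):
    """Return (delta, updates) for an undirected graph on nodes 0..n-1.

    edges is a list of (u, v, weight) triples with nonnegative weights.
    updates[v] counts how often delta(v) was replaced by a strictly
    smaller value while v was already in T (line 9 of the listing).
    """
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    delta = [inf] * n       # tentative distances
    in_P = [False] * n      # P: nodes whose distances are final
    in_T = [False] * n      # T: nodes holding a tentative distance
    updates = [0] * n
    delta[s] = 0.0
    in_T[s] = True
    heap = [(0.0, s)]       # may contain stale entries (lazy deletion)

    while heap:
        d, u = heapq.heappop(heap)          # FINDMIN
        if in_P[u] or d > delta[u]:
            continue                        # skip a stale entry
        in_P[u] = True
        in_T[u] = False
        for v, w in adj[u]:                 # UPDATE
            if in_P[v]:
                continue
            cand = d + w
            if in_T[v]:
                if cand < delta[v]:         # line 9, strict decrease
                    delta[v] = cand
                    updates[v] += 1
                    heapq.heappush(heap, (cand, v))
            else:                           # lines 10-11, first insertion
                in_T[v] = True
                delta[v] = cand
                heapq.heappush(heap, (cand, v))
    return delta, updates
```

The first assignment of a tentative distance to a node (lines 10 and 11) is deliberately not counted, which matches the distinction the paper draws between insertion into $T$ and updating.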

Now we describe the probability distributions. For brevity, R will denote the set of nonnegative real numbers. Consider a random variable X over R, and choose an arbitrary probability density function f(x).

Let $m$ “mutually independent” random variables $W_1, W_2, \ldots, W_m$ correspond to the $m$ edges $e_1, e_2, \ldots, e_m$ in $E$ in this order, and let every random variable $W_i$ have the identical density function $f(x)$ ($1 \le i \le m$). Our assumption is that each weight $w(e_i)$ of edge $e_i$ has a value $w_i$ of $W_i$ ($1 \le i \le m$). Let

$W = (W_1, W_2, \ldots, W_m)$

denote a random variable over $R^m$. We call input $(G, s, w)$ for our shortest path problem a “random input,” where $G$ and $s$ are arbitrarily fixed and the edge weight function $w$ is given as assumed above. Clearly, our definition of random input is identical to Spira’s [12]. Note that we need not impose any restriction on our density function $f(x)$, as will be seen from the proofs in the next section. (The notion of random input may be generalized to the case of directed graphs by following the definition by Bloniarz [2].)
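A random input in the above sense might be generated as in the following sketch. The helper names `random_weights`, `exponential` and `uniform01`, the small example graph, and the two particular densities are hypothetical choices made for illustration; the paper only requires that the weights be mutually independent with one common density.

```python
import random


def random_weights(m, draw, rng):
    """Draw m mutually independent weights from one common distribution."""
    return [draw(rng) for _ in range(m)]


def exponential(rng):
    """One nonnegative sample with density exp(-x) on x >= 0."""
    return rng.expovariate(1.0)


def uniform01(rng):
    """One sample from the uniform density on [0, 1)."""
    return rng.random()


# G and s are fixed arbitrarily; only the weights are random.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
s = 0
rng = random.Random(1985)
w = random_weights(len(edges), exponential, rng)
random_input = (edges, s, w)
```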

3. EXPECTED NUMBER OF UPDATING TIMES

In this section we prove the main theorem of this paper on the expected behavior of Dijkstra’s algorithm. Throughout the discussion, G, s and f are assumed to be fixed arbitrarily.

DEFINITION 3.1. When the algorithm is executed for edge weights $w = (w_1, w_2, \ldots, w_m)$, $w_i$ in $R$ and $w(e_i) = w_i$ for $1 \le i \le m$, let $U_v(w)$ denote the total number of times of updating $\delta(v)$ in UPDATE (line 9), for node $v$ in $V$.

In other words, $U_v(w)$ denotes how many times the value of $\delta(v)$ has been replaced by a new smaller value in the assignment of line 9, when the algorithm terminates. Clearly, $U_s(w) = 0$.

We start to study the upper bound of the expectation of $U_v(W)$ for random input.

DEFINITION 3.2. Let $E[U_v]$ denote the expected value of $U_v$.

DEFINITION 3.3. $E_v = \{e \mid e = [u, v] \text{ in } E,\ u \text{ in } V\}$, $\mu_v = |E_v|$ (i.e., the degree of $v$).

Now we are ready to state the key lemma.

LEMMA 3.1. $E[U_v] \le \log_e \mu_v$.

Before proving this, we present the main theorem of this paper. We define the total number of updating times $U(w)$ by summing up $U_v(w)$ over all nodes in $V$.

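MAIN THEOREM. $E[U] \le n \log_e \mu$, where

$U(w) = \sum_{v \in V} U_v(w)$ and $\mu = 2m/n$.

Proof.

$E[U] = \sum_{v \in V} E[U_v] \le \sum_{v \in V} \log_e \mu_v = \log_e \prod_{v \in V} \mu_v$

$\le \log_e (2m/n)^n = n \log_e \mu$.

Here the first inequality is Lemma 3.1, and the second inequality is the arithmetic-geometric inequality applied to the degrees, whose sum is $\sum_{v \in V} \mu_v = 2m$. □

In the rest of this section we prove Lemma 3.1. First we need the following basic lemma. (See [9, pp. 94-99] for a related subject.)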

LEMMA 3.2. Let $z = (z_1, z_2, \ldots, z_\mu)$ be the values of $\mu$ independent identically distributed random variables, and let $M$ denote the number of indices $j$ ($> 1$) which satisfy

$z_i > z_j$ for all $i$ ($1 \le i < j$). (1)

Then the expected value $E(M)$ of $M$ satisfies

$E(M) \le \sum_{i=2}^{\mu} 1/i < \log_e \mu$.

Proof. For $2 \le j \le \mu$, let $M_j$ be the random variable whose value is 1 iff $j$ satisfies (1) and 0 otherwise. Clearly $M = \sum_j M_j$. To complete the proof we show that $E(M_j) \le 1/j$. For $1 \le i \le j$, let $A_{ij}$ be the event that $z_i$ is strictly smaller than all the $z_k$'s, $1 \le k \le j$, $k \ne i$. The events $\{A_{ij}\}_{i=1}^{j}$ are pairwise disjoint, and by symmetry they all have the same probability. Thus $E(M_j) = \mathrm{Prob}(A_{jj}) \le 1/j$. □
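Lemma 3.2 can be checked numerically. The sketch below, which is not part of the paper, counts the indices satisfying (1) for samples drawn from the uniform density on $[0, 1)$ and prints the sample mean next to $\sum_{i=2}^{\mu} 1/i$ and $\log_e \mu$. The function names, the choice of distribution, the value $\mu = 20$ and the number of trials are all assumptions of the sketch.

```python
import random
from math import log


def count_new_minima(z):
    """Number of indices j > 1 whose value is strictly below all earlier ones."""
    count = 0
    best = z[0]
    for value in z[1:]:
        if value < best:
            best = value
            count += 1
    return count


def harmonic_tail(mu):
    """Sum of 1/i for i = 2, ..., mu."""
    return sum(1.0 / i for i in range(2, mu + 1))


def estimate_expected_minima(mu, trials, rng):
    """Monte Carlo estimate of E(M) for mu i.i.d. uniform samples."""
    total = 0
    for _ in range(trials):
        total += count_new_minima([rng.random() for _ in range(mu)])
    return total / trials


rng = random.Random(0)
mu = 20
estimate = estimate_expected_minima(mu, 20000, rng)
print(estimate, harmonic_tail(mu), log(mu))
```

For a continuous distribution ties have probability zero, so the sample mean is expected to be close to the harmonic sum, which in turn lies below $\log_e \mu$.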

We assume that a node $v$ ($\ne s$) is arbitrarily chosen and fixed. Now we will observe how the tree representing shortest paths grows during the execution of the algorithm for input weights $(w_1, w_2, \ldots, w_m)$. Here we may assume that $(w_1, w_2, \ldots, w_\lambda)$ is the set of weights of edges which are not incident to node $v$, and that $(w_{\lambda+1}, \ldots, w_m)$ is for edges incident to $v$. Clearly, $\lambda = m - \mu_v$. For convenience, we shall use $\mu$ instead of $\mu_v$ until the end of this section. Define $w = (x, y)$, where

$x = (x_1, \ldots, x_\lambda) = (w_1, \ldots, w_\lambda)$,

and

$y = (y_1, \ldots, y_\mu) = (w_{\lambda+1}, \ldots, w_m)$.

Also similarly define

$X = (X_1, \ldots, X_\lambda)$ and $Y = (Y_1, \ldots, Y_\mu)$, where $W = (X, Y)$.

We now show that the expected value $E[U_v(X, Y) \mid X = x]$ of $U_v(X, Y)$ for any fixed given value $x$ of $X$ is less than or equal to $\log_e \mu$. This will clearly complete the proof of Lemma 3.1.

First we assume that $(x_1, \ldots, x_\lambda)$ is arbitrarily fixed (i.e., $w(e_i) = x_i$ for $e_i$ not in $E_v$ ($1 \le i \le \lambda$)) and that

$y_i = w(e_{\lambda+i}) = +\infty$ for $e_{\lambda+i}$ in $E_v$ ($1 \le i \le \mu$).

Then execute the algorithm for this input $(G, s, w)$. After the execution, we have obtained a sequence of nodes, denoted by $K_x$,

$K_x = (u_1, u_2, \ldots, u_h)$,

such that $u_i$ is the $i$th node having been moved from $T$ to $P$; namely, $u_i$ is the $i$th node of the growing tree rooted at $s$ which represents the shortest paths. Clearly $u_1 = s$ and $1 \le h \le n - 1$. We denote this tree by $H$. Obviously we have the following inequalities on the shortest distances:

$\delta^*(u_1) = 0 \le \delta^*(u_2) \le \cdots \le \delta^*(u_h)$.

We examine how each edge $e_i$ ($\lambda + 1 \le i \le m$) in $E_v$ is connected to this tree $H$. Some of the edges are incident with some $u_i$ in $K_x$. Let $k$ be the number of those edges $[u_i, v]$ in $E_v$. There may be other edges in $E_v$ which are not incident to any $u_i$.

Now we set the values of edge weights in $E_v$ to be $y$, i.e.,

$w(e_{\lambda+i}) = y_i$ in $R$ ($1 \le i \le \mu$),

as the $x_i$'s remain fixed as before ($1 \le i \le \lambda$). Let $(z_1, z_2, \ldots, z_k)$ be the sequence such that $z_p = w([u_i, v])$ for some edge $[u_i, v]$ in $E_v$ ($1 \le p \le k$), satisfying $p < q$ if $z_p = w([u_i, v])$ and $z_q = w([u_j, v])$ and $i < j$. Obviously, there exists some permutation $\pi$ on $(1, 2, \ldots, \mu)$, such that $(z_1, z_2, \ldots, z_k)$ is the initial subsequence of $\pi(y)$. Assume that, for a given $x$, $\pi$ is chosen and fixed. Now we execute the algorithm for this partly new input $(G, s, w)$. The algorithm determines a sequence of the form

$(u_1, u_2, \ldots, u_g, v, \ldots)$,

which represents the sequence of nodes whose shortest distances have been determined in this order. Here $u_g$ is the last node that has been added to this sequence before the shortest distance of $v$ is determined ($g \le h$). The most important point here is that the sequence $(u_1, u_2, \ldots, u_g)$ is always an initial subsequence of $K_x$ for any $y$. This fact is obvious by noting the way nodes are moved from $T$ to $P$. Note that $g$ depends on $y$.

We will see how the tentative distance $\delta(v)$ has been updated, until $v$ is put into the sequence as the next node of $u_g$, namely, until $v$ is moved from $T$ to $P$. When the shortest distance $\delta^*(u_j)$ is determined (line 5 in the algorithm) for any $j$ ($1 < j \le g$), the value $\delta(v)$ is updated (actually changed in line 9), if and only if the current $\delta(v)$ satisfies

$\delta(v) > \delta^*(u_j) + z_q$,

where $[u_j, v]$ is in $E_v$ and $z_q = w([u_j, v])$. The latter condition can be rewritten as follows:

(C1) For all $i$ ($1 \le i < j$) such that $[u_i, v]$ is in $E_v$,

$\delta^*(u_i) + z_{p_i} > \delta^*(u_j) + z_q$,

where $z_{p_i} = w([u_i, v])$. Hence we have the following necessary condition (C2) for (C1), since

$\delta^*(u_i) \le \delta^*(u_j)$ for any $i < j$ by the definition of $K_x$.

(C2) For all $i$ ($1 \le i < j$) such that $[u_i, v]$ is in $E_v$,

$z_{p_i} > z_q$,

where $z_{p_i} = w([u_i, v])$. Of course, (C2) does not always mean that actual updating happens to $\delta(v)$, because it is just a necessary condition. Still it is useful to obtain an “upper” bound. Clearly, the sequence of $z_{p_i}$'s, with $z_q$ as the last element, defined in (C2) constitutes an initial subsequence of $\pi(y)$. By applying Lemma 3.2 to $\pi(Y)$, we have

$E[U_v(X, Y) \mid X = x] = E[U_v(X, \pi(Y)) \mid X = x] \le \log_e \mu$,

since both $\pi(Y)$ and $Y$ are $\mu$ independent identical random variables, i.e., $\pi(Y) = Y$. This is the desired inequality.
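The bound of the main theorem can be compared with simulation. The following sketch is not taken from the paper: the graph generator `random_connected_graph`, the exponential weight density, the sizes $n = 200$ and $m = 1600$, and the number of trials are assumptions chosen for illustration. It counts strict decreases of tentative distances and prints the sample mean of $U$ next to $n \log_e(2m/n)$.

```python
import heapq
import random
from math import inf, log


def total_updates(n, edges, weights, s):
    """Total number of strict decreases of tentative distances (line 9)."""
    adj = [[] for _ in range(n)]
    for (u, v), w in zip(edges, weights):
        adj[u].append((v, w))
        adj[v].append((u, w))
    delta = [inf] * n
    done = [False] * n
    delta[s] = 0.0
    heap = [(0.0, s)]
    updates = 0
    while heap:
        d, u = heapq.heappop(heap)
        if done[u]:
            continue                      # stale heap entry
        done[u] = True
        for v, w in adj[u]:
            if done[v]:
                continue
            cand = d + w
            if cand < delta[v]:
                if delta[v] < inf:        # v was already in T: a real update
                    updates += 1
                delta[v] = cand
                heapq.heappush(heap, (cand, v))
    return updates


def random_connected_graph(n, m, rng):
    """A simple connected graph: a random spanning tree plus random extra edges."""
    edges = set()
    for v in range(1, n):
        edges.add((rng.randrange(v), v))
    while len(edges) < m:
        u, v = rng.sample(range(n), 2)
        edges.add((min(u, v), max(u, v)))
    return sorted(edges)


rng = random.Random(6)
n, m = 200, 1600
edges = random_connected_graph(n, m, rng)
trials = 200
mean_updates = sum(
    total_updates(n, edges, [rng.expovariate(1.0) for _ in edges], 0)
    for _ in range(trials)
) / trials
bound = n * log(2 * m / n)
print(mean_updates, bound)
```

Since (C2) is only a necessary condition for an update, the observed mean may lie well below the bound.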

4. IMPLEMENTATION OF THE ALGORITHM

As an application of our main theorem, we present an expectedly efficient method for implementing Dijkstra’s algorithm. The basic idea is to use a priority queue, in particular a $\beta$-heap, for representing the set $T$ [1], which was devised in [7, 6], and some others around 1972. Johnson [6] has derived the following worst-case upper bound, denoted by $J$, in terms of the number of binary comparisons on the decision tree model:

$J = n\beta \log_\beta n + m \log_\beta n$.

The first term of $J$ comes from FINDMIN, while the second is due to UPDATE. As is well known now, if we choose $\beta = m/n$, we can obtain the following upper bound

$2m \log n / \log(m/n)$.

The base 2 of log is assumed to be omitted. Also we may choose $\beta = (m/n)/\log(m/n)$, leading to an asymptotically better bound

$(1 + o(1))(m \log n / \log(m/n))$ if $m/n \to \infty$.

For the comparison below, we use $m \log n / \log(m/n)$ for the worst-case upper bound, denoted again by $J$. Based on the same idea as above, we have an upper bound in the expected case.
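The effect of the choice $\beta = m/n$ can be verified with a few lines of arithmetic. In this sketch the function name `johnson_bound` and the sample values of $n$ and $m$ are assumptions, and $\beta$ is treated as a real number without rounding.

```python
from math import isclose, log


def johnson_bound(n, m, beta):
    """J = n*beta*log_beta(n) + m*log_beta(n) for a beta-ary heap."""
    log_beta_n = log(n) / log(beta)
    return n * beta * log_beta_n + m * log_beta_n


n, m = 10_000, 1_000_000
beta = m / n
# With beta = m/n both terms equal m*log_beta(n), giving 2 m log n / log(m/n).
closed_form = 2 * m * log(n, 2) / log(m / n, 2)
print(johnson_bound(n, m, beta), closed_form)
```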

THEOREM 4.1. The upper bound of the number of comparisons by the $\lfloor \log_e \mu \rfloor$-heap method is

$m + 2n \log n \, \log_e \mu / \log \log_e \mu$

in the expected case for random input, where $\mu = 2m/n \ge 8$.

Proof. When we derive the worst-case upper bound above, $m \log_\beta n$ comparisons have been considered for UPDATE throughout the algorithm. In the expected case, however, we may replace this term by

$m + n \log_e \mu \, \log_\beta n$

from our main theorem. Note that the first term $m$ is due to one obligatory comparison in line 9. Choosing $\beta = \log_e \mu$, we obtain the upper bound:

$n\beta \log_\beta n + m + n \log_e \mu \, \log_\beta n = m + 2n \log n \, \log_e \mu / \log \log_e \mu$. □
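For readers who wish to check the last step, the substitution can be written out as follows. This is only a worked restatement of the identity above, using $\log_\beta n = \log n / \log \beta$ and ignoring the rounding of $\beta$ to an integer.

```latex
\begin{align*}
n\beta\log_\beta n + m + n\log_e\mu\,\log_\beta n
  &= m + n\,(\beta + \log_e\mu)\,\frac{\log n}{\log\beta} \\
  &= m + n\,(2\log_e\mu)\,\frac{\log n}{\log\log_e\mu}
     \qquad (\beta = \log_e\mu) \\
  &= m + \frac{2n\log n\,\log_e\mu}{\log\log_e\mu}.
\end{align*}
```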

Note that we may choose $\beta = \log_e \mu / \log \log_e \mu$ in a similar way as above, reducing the constant factor 2 of the second term to $(1 + o(1))$. Also note that in case $\mu < 8$, a simple binary heap can be used.
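One possible realization of the heap method is sketched below. It is not the author's implementation: the class `DaryHeap`, the position map used for decreasing keys, and the helper `heap_arity` (which falls back to a binary heap when $\lfloor \log_e \mu \rfloor < 2$) are assumptions of this sketch, and comparisons are not counted.

```python
from math import floor, log


class DaryHeap:
    """Min-heap of arity d with decrease_key, using a node -> index map."""

    def __init__(self, d):
        self.d = d
        self.items = []     # entries are [key, node]
        self.pos = {}       # node -> index in self.items

    def __len__(self):
        return len(self.items)

    def _swap(self, i, j):
        self.items[i], self.items[j] = self.items[j], self.items[i]
        self.pos[self.items[i][1]] = i
        self.pos[self.items[j][1]] = j

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // self.d
            if self.items[i][0] >= self.items[parent][0]:
                break
            self._swap(i, parent)
            i = parent

    def _sift_down(self, i):
        size = len(self.items)
        while self.d * i + 1 < size:
            first = self.d * i + 1
            last = min(first + self.d, size)
            child = min(range(first, last), key=lambda c: self.items[c][0])
            if self.items[child][0] >= self.items[i][0]:
                break
            self._swap(i, child)
            i = child

    def insert(self, node, key):
        self.items.append([key, node])
        self.pos[node] = len(self.items) - 1
        self._sift_up(len(self.items) - 1)

    def decrease_key(self, node, key):
        i = self.pos[node]
        self.items[i][0] = key
        self._sift_up(i)

    def pop_min(self):
        key, node = self.items[0]
        last = self.items.pop()
        del self.pos[node]
        if self.items:
            self.items[0] = last
            self.pos[last[1]] = 0
            self._sift_down(0)
        return node, key


def heap_arity(n, m):
    """floor(log_e(2m/n)), but never below 2 (binary heap for sparse graphs)."""
    return max(2, floor(log(2 * m / n)))


def dijkstra_with_dary_heap(n, edges, s):
    """Shortest distances from s; edges is a list of (u, v, weight)."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    heap = DaryHeap(heap_arity(n, len(edges)))
    final = {}                  # the set P with its distances
    tentative = {s: 0.0}        # delta for nodes that have entered T
    heap.insert(s, 0.0)
    while len(heap) > 0:
        u, d = heap.pop_min()   # FINDMIN
        final[u] = d
        for v, w in adj[u]:     # UPDATE
            if v in final:
                continue
            cand = d + w
            if v not in tentative:
                tentative[v] = cand
                heap.insert(v, cand)
            elif cand < tentative[v]:
                tentative[v] = cand
                heap.decrease_key(v, cand)
    return final
```

A larger arity makes each sift-up (used by UPDATE) shorter at the cost of a more expensive sift-down (used by FINDMIN), which is the trade-off that the choice of $\beta$ balances.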

Now we compare the two upper bounds in terms of the asymptotic number of comparisons. Let N denote our expected-case upper bound in the theorem.

Case 1. [$m = n \log n$],

$J = m \log n / \log \log n$,

$N = (1 + o(1))(2/\log e)\, m \log \log n / \log \log \log n$.

Case 2. [$m = n(\log n)^k$, where $k$ is constant ($> 1$)].

$J = (1/k)\, m \log n / \log \log n$,

$N = (1 + o(1))\, m$.

Case 3. [$m = n^{1+\lambda}$, where $\lambda$ is constant ($0 < \lambda < 1$)].

$J = (1/\lambda)\, m$,

$N = (1 + o(1))\, m$.

Note that, in the extreme cases, if $m = cn$, where $c$ is constant, both $J$ and $N$ are of order $n \log n$, and for complete graphs $J = N = (1 + o(1))\, m$.
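The two bounds can also be evaluated numerically for concrete sizes. The sketch below assumes the forms $J = m \log n / \log(m/n)$ and $N = m + 2n \log n \, \log_e \mu / \log \log_e \mu$ with logarithms to base 2 unless marked otherwise; the function names and the value $n = 2^{20}$ are chosen only for illustration. Because the three cases are asymptotic statements, the printed ratios for a single moderate $n$ are indicative at best, and lower-order terms may still dominate.

```python
from math import log, log2


def worst_case_J(n, m):
    """m log n / log(m/n), logarithms to base 2."""
    return m * log2(n) / log2(m / n)


def expected_case_N(n, m):
    """m + 2 n log n * log_e(mu) / log(log_e(mu)), with mu = 2m/n >= 8."""
    mu = 2 * m / n
    if mu < 8:
        raise ValueError("the bound is stated for mu = 2m/n >= 8")
    ln_mu = log(mu)
    return m + 2 * n * log2(n) * ln_mu / log2(ln_mu)


n = 2 ** 20                               # log n = 20
cases = [
    ("m = n log n", n * 20),
    ("m = n (log n)^2", n * 20 ** 2),
    ("m = n^(3/2)", n * 2 ** 10),
]
for label, m in cases:
    print(label, worst_case_J(n, m) / m, expected_case_N(n, m) / m)
```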

5. CONCLUDING REMARKS

The technique devised for proving the main theorem may be applied to other network problems. Prim’s algorithm [11] for the minimum spanning tree problem is an example (see [10]). It may be worthwhile for practitioners to note that our $\lfloor \log_e \mu \rfloor$-heap method can be efficiently implemented on conventional computers in terms of the total running time, as well as in terms of the number of comparisons as we have studied in this paper.

ACKNOWLEDGMENTS.

The author expresses his thanks to his colleague Hajime Machida and Professor Akihiro Nozaki of ICU for their discussion on this subject, and to the referee for suggesting a great many valuable improvements, by showing, among others, Lemma 3.2, which fully generalised the main result.

REFERENCES

1. A. V. AHO, J. E. HOPCROFT, AND J. D. ULLMAN, “The Design and Analysis of Computer Algorithms,” Addison-Wesley, Reading, Mass., 1974.

2. P. BLONIARZ, A shortest path algorithm with expected time $O(n^2 \log n \log^* n)$, SIAM J. Comput. 12, No. 3 (1983), 588-600.

3. E. W. DIJKSTRA, A note on two problems in connexion with graphs, Numer. Math. 1 (1959), 269-271.

4. M. L. FREDMAN, New bounds on the complexity of the shortest path problem, SIAM J. Comput. 5, No. 1 (1976), 83-89.

5. A. M. FRIEZE, “On Random Shortest Path Problems,” Technical Report, Department of Computer and Statistics, Queen Mary College, University of London, March 1982.

6. D. B. JOHNSON, Efficient algorithms for shortest paths in sparse networks, J. Assoc. Comput. Mach. 24, No. 1 (1977), 1-13.

7. E. L. JOHNSON, On shortest paths and sorting, in “Proc. ACM 25th Annual Conference,” Boston, Vol. 1, August 1972, pp. 510-517.

8. V. KLEE, Combinatorial optimization: What is the state of the art, Math. Oper. Res. 5, No. 1 (1980), 1-26.

9. D. E. KNUTH, “The Art of Computer Programming, Vol. 1, Fundamental Algorithms,” Addison-Wesley, Reading, Mass., 1968.

10. K. NOSHITA, E. MASUDA, AND H. MACHIDA, On the expected behavior of the Dijkstra’s shortest path algorithm for complete graphs, Inform. Process. Lett. 7, No. 5 (1978), 237-243.

11. R. C. PRIM, Shortest connection networks and some generalizations, BSTJ 36 (1957), 1389-1401.

12. P. M. SPIRA, A new algorithm for finding all shortest paths in a graph of positive arcs in average time $O(n^2 \log^2 n)$, SIAM J. Comput. 2, No. 1 (1973), 28-32.
