
University of Colorado, Boulder
CU Scholar
Aerospace Engineering Sciences Faculty Contributions
Aerospace Engineering Sciences

Spring 4-12-2015

Modifying Dijkstra's Algorithm to Solve Many Instances of SSSP in Linear Time
Michael Wilson Otte
University of Colorado at Boulder, [email protected]

Follow this and additional works at: http://scholar.colorado.edu/asen_facpapers

Part of the Artificial Intelligence and Robotics Commons, and the Theory and Algorithms Commons

This Technical Report is brought to you for free and open access by Aerospace Engineering Sciences at CU Scholar. It has been accepted for inclusion in Aerospace Engineering Sciences Faculty Contributions by an authorized administrator of CU Scholar. For more information, please contact [email protected].

Recommended Citation
Otte, Michael Wilson, "Modifying Dijkstra's Algorithm to Solve Many Instances of SSSP in Linear Time" (2015). Aerospace Engineering Sciences Faculty Contributions. 1.
http://scholar.colorado.edu/asen_facpapers/1


Modifying Dijkstra’s Algorithm to Solve Many Instances of SSSP in Linear Time

Michael Otte

University of Colorado at Boulder
Dept. Aerospace Engineering Sciences

Technical Report
12 April 2015

Abstract. We show that for graphs with positive edge weights the single-source shortest path planning problem (SSSP) can be solved using a novel partial ordering over nodes, instead of a full ordering, without sacrificing algorithmic correctness. The partial ordering we investigate is defined with respect to carefully chosen (but easy to calculate) “approximate” level-sets of shortest-path length. We present a family of easy-to-implement “approximate” priority heaps, based on an array of linked-lists, that create this partial ordering automatically when used directly by Dijkstra’s SSSP algorithm. For graphs G = (E, V) with positive edge lengths, and depending on which version of the heap is used, the resulting Dijkstra variant runs in either time O(|E| + |V| + K) with space O(|E| + |V| + ℓ_max/ℓ_min), or time O((|E| + |V|) log_w(ℓ_max/ℓ_min + 1)) with space O(|E| + (ℓ_max/ℓ_min) log_w⌈ℓ_max/ℓ_min⌉), where ℓ_min and ℓ_max are the minimum (non-zero) and maximum (finite) edge lengths, respectively, w is the word length of the computer being used (e.g., 32 or 64 in most cases), and K is a function of G such that K = O(|E|) for many common types of graphs (e.g., K = 1 for graphs with unit edge lengths). We also describe a linear-time pre-/post-processing procedure that extends these results to undirected graphs with non-negative edge weights. Thus, it is possible to solve many instances of SSSP in O(|E| + |V|); for these instances our method ties the fastest known runtime for SSSP, while having significantly smaller constant-factor overhead than previous methods. This work can be viewed as an extension of Dial’s SSSP algorithm that is able to handle floating point edge weights, yields faster runtime, and is based on new theoretical results.

1 Introduction

Finding the shortest path through a graph G = (E, V) of nodes V and edges E is a classic problem. The variation of the problem known as the “single-source shortest-path planning problem” (or SSSP) is concerned with finding all of the shortest-paths from a particular node s ∈ V to all nodes v ∈ V, where each edge ε ∈ E is associated with a length ‖ε‖. The first algorithm that solves SSSP was presented by Dijkstra in the 1950s and runs in O(|E| + |V|^2) time for the case of non-negative edge lengths [8]. Over the years,


more sophisticated priority heap data structures have reduced the runtime to O(|E| + |V| log |V|) for an algorithm presented by [10], for which the authors remark that it is the fastest time bound we can ever hope to achieve for general SSSP with non-negative edge weights (‖ε‖ ≥ 0). Undaunted, more recent work has yielded algorithms that boast even faster theoretical performance for subsets of SSSP.

[5] uses an approximate heap data structure to achieve runtime O(|E| + C|V|) in space O(|E| + C|V|) for the case of non-zero C-bounded integer edge weights (0 < ‖ε‖ < C < ∞). [22] presents an extension of [5] that runs in time O(|V| + |E| + L), where L is the length of the longest path. [3] extends [5] to require less space, O(|E| + |V| + C), as well as time O(|E| + |V|(B + C/B)) or time O(|E| + |V|(∆ + C/∆)), where B < C + 1 and ∆ are both user-defined parameters.

[19] presents a method for C-bounded integer edge weights (0 ≤ ‖ε‖ < C < ∞) that runs in O(|E| + |V|), and then extends this to C-bounded floating point edge weights in [20]. While both of the latter methods represent theoretical milestones, they use Atomic Heaps [11], which has led some to criticize the O(|E| + |V|) versions of [19, 20] as being impractical [2, 6, 15]. Indeed, Atomic Heaps were created mainly as a theoretical tool and their presentation in [11] involved a constant factor of 2^(12^20) for the sake of readability — the authors of [11] state “An alternative but less readable method circumvents this requirement. However, as already noted we are foregoing any pretense of practicality.” A second (more practical) variation is also presented in [19] that runs in the slightly increased time of O(log(C) + α(|E|, |V|)|E| + |V|), where α(|E|, |V|) is the inverse Ackermann function of |E| and |V|.

In the current paper we present a new and remarkably simple modification to Dijkstra’s algorithm that, for SSSP with positive edge weights (‖ε‖ > 0), yields a runtime of O(|E| + |V| + K), where K is a constant that depends on the instance of the problem being solved, and the |V| term is dropped in the case of connected graphs. In general, K ≤ min(d_max/ℓ_min, (ℓ_max/ℓ_min)|V|), where d_max is the length of the longest finite shortest-path in the final solution, and ℓ_min and ℓ_max are the minimum and maximum finite non-zero edge lengths in the graph, respectively. Tighter bounds likely exist for particular classes of graphs. Note that (assuming a finite number of nodes in the graph) infinite length edges and shortest-paths are allowed; however, they do not directly influence the runtime. We also present a slightly less-simple version of our heap that runs in time O((|E| + |V|) log_w(ℓ_max/ℓ_min + 1)), where w is the word size of the computer being used, e.g., 64. For many classes of graphs and instances of SSSP one or both of these variations run in linear time, and in many more cases they are faster and/or require significantly less overhead than other known methods. Finally, we also show how these results can be extended to SSSP with non-negative edge weights (‖ε‖ ≥ 0) for the special case of undirected graphs.

The contributions of this paper are threefold:


1. Theoretical: The presentation/analysis of a new variant of Dijkstra’s algorithm that ties the fastest known linear O(|E| + |V|) and O(|E|) time bounds for many instances of SSSP and SSSP over connected graphs, respectively.

2. Practical: The description of a relatively simple data structure and corresponding Dijkstra variant that can be implemented by anybody familiar with arrays and linked-lists.

3. Conceptual: The dissemination of a new insight about the SSSP that inspired these heap modifications, and which we hope will enable new and even better algorithms.

The rest of this paper is organized as follows. Section 2 provides an overview of the insight that led to the new modifications of Dijkstra’s algorithm, as well as a high-level overview of the method. In Section 3 we survey related work. In Sections 4 and 5 we define our nomenclature and formally introduce the SSSP problem, respectively. The data structures required for our modifications are presented in Section 6, and the analysis (of completeness, runtime, and runspace) of the resulting variant of Dijkstra’s algorithm in Section 7. The extension to undirected graphs with non-negative edge weights appears in Section 8. We conclude with a few remarks in Section 9 and a summary in Section 10. Algorithmic details of a bit-tree required for the “less-simple” heap modification appear in the appendix.

2 Intuition

Dijkstra’s algorithm works by incrementally building a “shortest-path-tree” S outward from s, one node at a time (Dijkstra’s algorithm appears in Algorithm 1). Each node that is not yet part of the growing S refines a “best-guess” D(v) of its actual shortest-path-length d(v), with the restriction that d(v) ≤ D(v). Dijkstra’s algorithm guarantees/requires that d(v) = D(v) for the node in V \ S with minimum D(v). In modern versions of the algorithm, a min-priority-heap H is used to keep track of D(v) values.

The min-priority-heap H is initialized to empty, best-guesses D(v) are initialized to ∞, parent pointers p(v) with respect to the shortest path tree S are initialized to NULL, and the start node s is given an actual distance of 0 from itself, lines 1-5, respectively.

Each iteration involves “processing” the node v ∈ V \ S with minimum D(v), lines 8-13. Such a node v is extracted from the heap on line 14 (in the first iteration we know to use s, line 6). Next, each neighbor u of v checks if d(v) + ‖(v, u)‖ < D(u) (i.e., if the distance from s through the shortest-path-tree to v plus the distance from v to u through edge (v, u) ∈ E is less than u’s current best-guess). If so, then u updates its best-guess and parent pointer to reflect the better path via v, lines 10-13. In other words, all neighbors u of v perform the update D(u) = min(D(u), d(v) + ‖(v, u)‖). The heap is adjusted to account for changing D(u) on line 13.

Dijkstra’s original algorithm is provably correct (see [8]), based on guaranteesthat the next node v processed at any step has the following properties:


Algorithm 1: Dijkstra(G, s)

Input: A graph G = (E, V) of node set V and edge set E, and a start node s ∈ V.
Output: Shortest path lengths d(v) and parent pointers p(v) with respect to the shortest path-tree S for all v ∈ V.

1   H = ∅
2   for all v ∈ V do
3       D(v) = ∞
4       p(v) = NULL                     /* S = ∅ */
5   D(s) = 0
6   v = s
7   while v ≠ NULL do
8       d(v) = D(v)                     /* S = S ∪ {v} */
9       for all u s.t. (v, u) ∈ E do
10          if d(v) + ‖(v, u)‖ < D(u) then
11              D(u) = d(v) + ‖(v, u)‖
12              p(u) = v
13              updateValue(H, u)
14      v = extractMin(H)
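For concreteness, Algorithm 1 can be sketched in Python using a standard binary heap with lazy deletion in place of updateValue (a hypothetical sketch, not the approximate heap developed in this paper; the adjacency-list encoding `adj` and function name are illustrative assumptions):

```python
import heapq

def dijkstra(adj, s):
    """Single-source shortest paths with a standard binary heap.

    adj: dict mapping every node to a list of (neighbor, edge_length)
         pairs, with non-negative edge lengths.
    Returns (d, p): shortest-path lengths and parent pointers.
    """
    d = {v: float('inf') for v in adj}
    p = {v: None for v in adj}
    d[s] = 0.0
    heap = [(0.0, s)]
    done = set()          # the growing shortest-path-tree S
    while heap:
        dv, v = heapq.heappop(heap)
        if v in done:
            continue      # stale entry (lazy deletion replaces updateValue)
        done.add(v)
        for u, w in adj[v]:
            if dv + w < d[u]:       # the relaxation of lines 10-13
                d[u] = dv + w
                p[u] = v
                heapq.heappush(heap, (d[u], u))
    return d, p
```

Lazy deletion pushes a fresh heap entry on every improvement and discards stale entries on extraction, which matches the behavior (though not the data structure) of updateValue/extractMin above.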

1. v ∈ V \ S.

2. Either v = s or v has some neighbor u such that u ∈ S.

3. D(v) ≤ D(v′), for all nodes v′ ∈ V \ S.

Thus, as has often been remarked, Dijkstra’s algorithm works by finding an ordering on d(v) for all v ∈ V. The priority heap data structure enforces (3) and determines this ordering.

The runtime of the heap is a major contributing factor to the overall runtime of Dijkstra’s algorithm. Indeed, every reduction in Dijkstra’s theoretical runtime bounds has been due to the discovery of better heap implementations, or heap implementations that are more amenable to the SSSP. Our paper is no exception to this trend. Before we describe the implementation details of the particular heap we use, we start by sharing the insight into the SSSP that led us to choose it.

Consider the 4-grids depicted in Figure 1. The start node s is located at the large black node. Nodes are colored in alternating colors based on the level-set of d they belong to. Note that all edges either connect nodes within a particular level-set, or connect the nodes of adjacent level-sets. Recall that Dijkstra allows us to break ties arbitrarily. This means that any node in a particular level-set may be processed before or after any other node in the same level-set without affecting the correctness of the algorithm. The only thing required for correctness is that all nodes in the k-th level-set are processed before any of those in the (k + 1)-th. Thus, for this simple case, we do not need to go through the trouble of calculating a full ordering on the nodes of V — it turns out that a partial


Fig. 1: A 4-connected grid graph (left) and a subgraph created by removing edges (right) are used to show the basic intuition behind our method. The start node s is located at the large black node. Nodes v of the same color and lightness are in the same level-set with respect to shortest-path length d(v). The original Dijkstra’s algorithm processes all nodes from a particular level-set (removes them from the heap) before any nodes of the next level-set are processed. Also, all nodes from a particular level-set are processed before any nodes of the same color but different lightness are added to the heap. We show how this idea can be generalized such that Dijkstra’s algorithm is still complete when nodes are processed according to a partial ordering induced by carefully chosen “approximate” level-sets (instead of using a full ordering on d(v) as usual).

ordering based on the level-sets of d is sufficient! This is exciting because partial orderings are much faster and easier to calculate than full orderings.

The aforementioned observation was first documented by [5] and used for the case where non-negative edge weights fall into a finite number of level-sets that are known a priori. However, this idea can be extended to cases where multiple nodes do not naturally fall into the same level-sets (the 4-grid is a special case where nodes fall into level-sets along integers because all edge lengths are 1). In fact, we show that we can get away with grouping nodes into approximately the same level-sets, as long as we take a few precautions. In particular, we can group together nodes u and v such that |d(u) − d(v)| < c_{u,v}, where c_{u,v} is a constant, as long as c_{u,v} is chosen such that nodes within each group will never be descendants of each other in any valid shortest-path-tree of the particular SSSP instance being solved. Taking this precaution allows us to process nodes in the top-most group in any order. Thus, we are always able to process the first node in the heap — even if it is not the one with the shortest path estimate! (This contrasts with [3], which must scan past each node in the top-most group up to ∆ times.)

While there are likely countless ways to choose the aforementioned constant c_{u,v} (each representing another Dijkstra variant), we choose to use the length of the shortest edge, c_{u,v} ≡ ℓ_min = min_{(v,u)∈E} ‖(v, u)‖. This method is straightforward to implement and analyze, and has fast theoretical runtime on many graphs and many instances of SSSP. The use of ℓ_min is motivated by the observation that, while processing v, it is impossible for any of its neighbors u to experience a best-guess update from D_1(u) to D_2(u) such that D_2(u) ≤ d(v) + ℓ_min (because the edge between u and v is at least as long as ℓ_min). Thus, by defining the 0th, 1st, 2nd, 3rd, etc. groups based on the d(v) that fall, respectively, into the ranges [0, 0], (0, ℓ_min], (ℓ_min, 2ℓ_min], (2ℓ_min, 3ℓ_min], . . ., we can guarantee that the members of a particular group cannot be descendants of each other in any shortest-path. The members of a particular group may be processed in any order, but all members of group k must be processed before we move on to group k + 1. Figure 2 depicts the intuition of this process and a formal proof is presented in Section 7.

Dijkstra’s algorithm processes nodes in the order of increasing heap-keys, and so we only need to guarantee that we have placed all of the appropriate nodes into the k-th approximate level-set before we start processing any nodes from that k-th approximate level-set. This is exactly what the heap we use does. The use of ℓ_min is also convenient because it can be found in a single O(|E|) pass over E.
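To make the grouping concrete, here is a hypothetical helper (the function name and the sample ℓ_min value are illustrative, not from the paper) that maps a best-guess D(v) onto the ranges [0, 0], (0, ℓ_min], (ℓ_min, 2ℓ_min], . . . described above:

```python
import math

def level_set(D_v, l_min):
    """Index of the approximate level-set for best-guess D_v.

    Realizes the half-open ranges [0, 0], (0, l_min], (l_min, 2*l_min], ...:
    bucket 0 holds only D_v == 0 (the start node).
    """
    return 0 if D_v == 0 else math.ceil(D_v / l_min)

# With l_min = 0.5, best-guesses 0, 0.3, 0.5, 0.7 fall into buckets 0, 1, 1, 2.
```

Note that 0.3 and 0.5 share a bucket even though their values differ; by the argument above neither can be the other's descendant in the shortest-path-tree, so they may be processed in either order.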

A related useful insight is that there are no (finite length) edges between nodes that end up in level-sets more distant than the maximum (finite) edge length ℓ_max = max_{(v,u)∈E s.t. ‖(v,u)‖<∞} ‖(v, u)‖. This means that the algorithm will interact with nodes in a contiguous band of only ℓ_max/ℓ_min + 1 approximate level-sets at any given time (in addition to the level-set at ∞), and the d(v)-values of the particular band of interest are non-decreasing as the algorithm runs. In Section 7 we show how this can be used to design a memory-efficient version of the heap that requires space O(|V| + ℓ_max/ℓ_min), allowing Dijkstra’s algorithm to run in space O(|E| + |V| + ℓ_max/ℓ_min).

3 Related Work

Dijkstra’s algorithm was originally presented in [8] with a runtime of O(|E| + |V|^2). Note that for some disconnected graphs it is possible that |E| < |V|; therefore, we choose to report runtimes in terms of both |E| and |V|, even though terms involving |V| are often omitted in the literature due to the fact that |V| = O(|E|) for connected graphs. A heap by [27] yields O(|E| log(|V|) + |V|), another by [10] gives O(|E| + |V| log |V|). New heaps by [11] yield O(|E| + |V| log |V| / log log |V|) but have the rather large constant factor overhead of 2^(12^20).

With regard to expected running times over randomizations, [12] gives expected time O(|E|√(log |V|) + |V|), and heaps by [21] yield the expected times of O(|E| log(log |V|) + |V|) and O(|E| + |V|(log |V|)^((1+ε)/2)). More recent heaps by [16] and [17] give expected times O(|E| + |V|(log |V| log log |V|)^(1/2)) and O(|E| + |V|(log |V|)^((1+ε)/3)), respectively.

The special case of SSSP in which there are z distinct positive edge lengths can be solved in time O(|E| + |V|) if z|V| ≤ 2|E|, and O(|E| log(z|V|/|E|) + |V|) otherwise, using [14]. SSSP on planar graphs can be solved in O(|V|√|V|) with a method by [9]; note that |E| = O(|V|) for planar graphs.


Fig. 2: Color depicts approximate level-sets of d (shortest-path lengths) from all nodes to s. Top: Edges in the shortest path tree are solid, while other edges are dashed. Bottom: A linear depiction of path-length level-sets, essentially the result of dangling the shortest-path tree by s and then setting it horizontally across the page (vertical distance in the bottom sub-figure is for illustrative purposes only; d is proportional to horizontal distance from s). We divide nodes into approximate level-sets (or buckets) B_k, depicted via different colors/repeats. The length of each bucket is defined by the minimum edge length ℓ_min, and each run of buckets (i.e., set of non-repeating colors) is no longer than ℓ_max + ℓ_min, where ℓ_max is the maximum edge length. By construction, no edge in the shortest path tree travels between two nodes in the same bucket and no edges travel beyond a single run. (The figure labels buckets B_0 through B_9 and B_∞, with s at the left and d_max, ℓ_max, and ℓ_min marked.)

[5] uses an approximate heap data structure to achieve runtime O(|E| + C|V|) in space O(|E| + C|V|) for the subset of SSSP with C-bounded integer edge weights (0 ≤ ‖ε‖ < C < ∞). This is extended by [22] to floating point edge weights with a runtime of O(|V| + |E| + L), where L is the length of the longest path. [5] is also extended by [3] to require less space, O(|E| + |V| + C), as well as time O(|E| + |V|(B + C/B)) or time O(|E| + |V|(∆ + C/∆)), for user-defined parameters B < C + 1 and ∆. The algorithms of [3, 5, 22] are arguably the most similar to our own. Indeed, the idea of using an array of linked-lists appears in [5], and [3] suggests using a looping structure to save space. The primary contribution of our work beyond [3, 5, 22] is a principled way to stratify nodes within the data structures such that: (1) better time and space bounds are achieved, (2) user parameters are eliminated, (3) both integer and floating point edge weights can be handled, and (4) the first node in the top bucket may always be processed (in contrast to [3], which may scan each node ∆ times). We also present an additional alternative data structure that yields better runtime when path-length level-sets are sparsely populated.

For the subset of SSSP involving an upper-bound C on non-infinite edge lengths, [19] remarks that O(|E| log(log C) + |V|) runtime can be achieved by running Dijkstra’s algorithm with the priority heaps presented by [13, 23, 24]. Subsequent heaps by [1], [4], and [17] respectively yield times of O(|E| + |V|√(log C)) and O(|E| + |V|(log C log log C)^(1/3)) (expected) and O(|E| + |V|(log C)^(1/4+ε)). Finally, [19] presents a method for the case of C-bounded integer edge weights that runs in O(|E| + |V|), and then extends this to C-bounded floating point edge weights in [20]; however, as mentioned earlier, these are mainly of theoretical (i.e., rather than practical) significance due to their use of Atomic Heaps. A variation of this algorithm achieves the slightly worse theoretical runtime bound of O(log(C) + α(|E|, |V|)|E| + |V|), where α(|E|, |V|) is the inverse Ackermann function of |E| and |V|.

[25, 26] present modifications to [20] that achieve better practical performance by removing the necessity of an “unvisited node structure” but do not change theoretical runtime bounds (or the constant factor). [18] performs an empirical evaluation showing that a simple binary heap outperforms state-of-the-art implementations on many practical problems, despite having worse runtime bounds.

It is worth mentioning that both [19, 20] and [25, 26] capitalize on the insight that the SSSP can often be solved using a partial ordering over nodes (instead of a full ordering) as long as the groups into which nodes are divided are guaranteed not to affect each other during processing. Moreover, they also make the observation that such a valid partial ordering can be created by requiring edges between each subset to have length at least δ. The main conceptual difference between these previous works and our current paper is in the details of the partial orderings that are used. These differences cause both (A) theoretical ramifications regarding the subset of SSSP for which a particular algorithm can achieve linear runtime, and (B) practical differences affecting ease of implementation and performance (including the fact that we do not suffer from the atomic heap overhead of [19, 20]). The particular partial ordering used in our current presentation is grounded in the notion of shortest-path-distance level-sets, and we believe this makes our method much easier to understand and analyze than the alternative partial orderings used in previous work (e.g., [19, 20] builds a dependency-tree of groups using a modified spanning-tree algorithm, and then processes nodes according to the relationships encoded in the dependency-tree).

It is also worth mentioning that our method is only applicable to SSSP with positive edge weights, as well as undirected graphs with non-negative weights if pre-/post-processing is used. Directed edges of length 0 currently break the algorithm, although we are optimistic that our method may eventually be extended to handle them.

4 Nomenclature

A graph G (either directed or undirected) is defined by its edge set E and vertex set V. We assume that both |E| and |V| are finite. Each edge ε_ij = (v_i, v_j) between two vertices v_i and v_j (or from v_i to v_j if the edge is directed) is assumed to have a predefined length (or edge-length, or cost) ‖ε_ij‖ = ‖(v_i, v_j)‖ such that 0 ≤ ‖ε_ij‖ ≤ ∞ iff ε_ij ∈ E. We follow the standard practice of defining ‖ε_ij‖ ≡ ∞ if ε_ij ∉ E, but also discuss an alternative in which such edges are assumed not to exist in Section 9.

A path P(v_i, v_j) is an ordered sequence of edges ε_1, . . . , ε_ℓ such that ε_1 = (v_i, v_1), ε_k = (v_{k−1}, v_k) for all k ∈ {2, . . . , ℓ − 1}, and ε_ℓ = (v_{ℓ−1}, v_j), where the intermediate vertices v_k ∈ V. The shortest path P*(v_i, v_j) is the shortest possible path from v_i to v_j. Formally,

P*(v_i, v_j) ≡ arg min_{P(v_i,v_j)} Σ_{ε ∈ P(v_i,v_j)} ‖ε‖

We are primarily interested in paths from nodes v to a particular “start-node” s, and define d(v) to be the length of the shortest possible path from v to s:

d(v) ≡ min_{P(v,s)} Σ_{ε ∈ P(v,s)} ‖ε‖

Each node maintains a non-increasing “best-guess” D(v) of its shortest path length, where d(v) ≤ D(v). We define the maximum-minimum finite path length as d_max = max_{v ∈ V \ {v′ | d(v′) = ∞}} d(v).

Dijkstra’s algorithm works by incrementally building a “shortest-path-tree” S outward from s. Our variation of Dijkstra’s algorithm relies on knowledge of two quantities that can be obtained using a single O(|E|) pass over E. Namely, ℓ_max is the length of the longest non-infinite edge in E and ℓ_min is the length of the shortest non-zero edge in E:

ℓ_max = max_{ε ∈ E \ {ε′ | ‖ε′‖ = ∞}} ‖ε‖
ℓ_min = min_{ε ∈ E \ {ε′ | ‖ε′‖ = 0}} ‖ε‖          (1)

The most basic implementations of our method assume that E contains no zero-length edges, E ∩ {ε′ | ‖ε′‖ = 0} = ∅, and so ℓ_min = min_{ε ∈ E} ‖ε‖. However, in Section 8 we present an extension to non-negative edge lengths for the case of undirected graphs, for which the more general definition of ℓ_min from Equation 1 must be used.
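As a sketch of that single O(|E|) pass (the function name and the list-of-lengths input are illustrative assumptions, not the paper's notation):

```python
import math

def edge_length_bounds(lengths):
    """One pass over the edge lengths:
    l_min = shortest non-zero length (Equation 1's general definition),
    l_max = longest finite length."""
    l_min, l_max = math.inf, 0.0
    for l in lengths:
        if l != 0 and l < l_min:
            l_min = l
        if l != math.inf and l > l_max:
            l_max = l
    return l_min, l_max

# e.g. lengths 0, 0.5, 2.0, inf  ->  l_min = 0.5, l_max = 2.0
```

Zero-length and infinite-length edges are skipped by the respective bounds, matching the set subtractions in Equation 1.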

We define β = ⌈ℓ_max/ℓ_min⌉, and use β to pick the circumference of a circular array that is used in our heap data structures. A particular heap data structure is denoted H.

The heaps we present allow Dijkstra’s algorithm to solve SSSP in O(|E| + |V| + K) and O((|E| + |V|) log_w(ℓ_max/ℓ_min + 1)), respectively, where w is the word size of the computer being used (currently 32 or 64 in most computers). K is a constant depending on the problem being solved; in particular, it is the number of empty approximate level-sets of d between the 0th level-set and the approximate level-set containing d_max.

We shall often refer to the approximate level-sets, as well as the linked lists that our heap uses to store them, as “buckets” and denote the k-th bucket B_k.

5 Problem

The shortest path planning problem for positive edge weights is defined as follows:

Given G = (V, E) such that ‖ε‖ > 0 for all ε ∈ E, and a particular node s ∈ V, then for all v ∈ V, find the shortest path P*(s, v).

The shortest path planning problem for undirected graphs with non-negative edge weights is defined:

Given G = (V, E) such that ‖ε‖ ≥ 0 for all ε ∈ E and for all (u, v) ∈ E there exists (v, u) ∈ E such that ‖(u, v)‖ = ‖(v, u)‖, and a particular node s ∈ V, then for all v ∈ V, find the shortest path P*(s, v).

By convention, either problem is considered solved once we have produced a data structure containing both:

1. The shortest-path lengths d(v) for all v ∈ V from s.

2. The shortest path tree that can be used to extract the shortest path from s to any v (at least for any v such that d(v) < ∞).

For example, the latter can be accomplished by storing the parent of each node with respect to S, allowing each shortest path to be extracted by following back-pointers in the fashion of gradient descent from v to s and then reversing the result.
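A minimal sketch of this back-pointer extraction (hypothetical helper; `p` maps each node to its parent in S, with the start node mapped to None):

```python
def extract_path(p, s, v):
    """Recover the shortest path from s to v by walking parent
    pointers from v back to s, then reversing the result."""
    path = []
    while v is not None:
        path.append(v)
        v = p[v]          # step one edge closer to s
    path.reverse()
    # If the walk did not terminate at s, v was unreachable from s.
    return path if path and path[0] == s else None
```

For example, with parent pointers p = {'s': None, 'a': 's', 'b': 'a'}, the call extract_path(p, 's', 'b') walks b → a → s and reverses to give the path s, a, b.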

The reverse (i.e., sink) search that involves finding all paths to s (instead of from s) can be solved using basically the same algorithm, except that the roles played by in- and out-neighbors are swapped and the extracted path is not reversed.


6 “Approximate-Heap” Data Structure

There are three variants of the heap data structure that we present. The first is mainly used to provide an introduction of concepts and as an analytical tool. The second and third yield the time bounds of O(|E| + |V| + K) and O((|E| + |V|) log_w(ℓ_max/ℓ_min + 1)), respectively, and are easy and “less easy” to implement, respectively.

All three heap variations are combinations of two or three simple and widely used data structures. The general idea is to store an array of buckets, where each bucket is implemented as a doubly-linked list. The basic forms of Heap 1 and Heap 2 have previously been described in [5] and [3], respectively; however, the particular stratification that is used is unique to our work and is the source of our method’s benefits.

6.1 Heap 1: an introduction and analytical tool

The first and simplest variant heap H is a standard array of L heads of doubly linked lists. We shall refer to the corresponding lists as buckets and denote them B_0 through B_{L−1}. We shall also use the convention that nodes that have never been added to the heap are defined to be in an implicit bucket B_∞. Heap 1 is useful as a theoretical tool for analysis, and may have applications outside of SSSP, but it is not quite space efficient enough for direct use with Dijkstra’s algorithm (Heap 2, presented in the next section, fixes the latter problem).

Let us assume, for the sake of the current discussion, that we are provided with d_max a priori, where d_max is the length of the longest finite shortest-length path (i.e., we can ignore any infinite length paths when calculating d_max). The assumption of a priori knowledge of d_max will be dropped in Heap 2. The array only needs to store L = ⌈d_max/ℓ_min⌉ + 1 linked list heads (any node that is ever supposed to be moved into B_{h≥L} can remain in B_∞, since (by construction of the heap) that node must eventually receive information about a path short enough to cause insertion into one of the buckets B_0 through B_{L−1}).

B_k holds nodes that currently believe they belong to the k-th approximate level-set, i.e., all v ∈ V \ S such that (k − 1)ℓ_min < D(v) ≤ kℓ_min. This enforces a partial ordering of nodes v ∈ V based on D(v) with the convenient property that it is impossible for edges ε_ji from nodes v_j ∈ B_h to decrease D(v_i) for nodes v_i ∈ B_k such that k < h (since doing so would require ‖ε_ji‖ < ℓ_min, which is impossible) — as we prove formally in Section 7.

Dijkstra’s original algorithm requires an initial insertion of all nodes into the heap. The use of the implicit B∞ means that we can skip this step (this small modification does not affect runtime bounds and has been used by others in the past)¹. Thus, membership in B∞ can be determined in O(1) time and does

¹ In practice, this can be achieved either by allowing nodes to store their status internally or by using an additional length-|V| array of pointers that are initialized to null, point to the list-node that holds each node currently in a linked list, and are reset to null when that node is extracted from the queue.


Algorithm 2: updateValue(H, v) for basic version (Heap 1)

Input: Heap H (of type Heap 1) and a node v.
Output: Updates the position of v within H based on D(v), adding v to H if necessary.

1  k = currentBucketID(v);
2  if k < ∞ and (k − 1)ℓmin < D(v) and D(v) ≤ kℓmin then
3      return;
4  if k < ∞ then
5      RemoveFromList(Bk, v);
6  k = ⌊D(v)/ℓmin⌋;
7  AddToListFront(Bk, v);
8  if k < k̄ then
9      k̄ = k;

not change our results in any fundamental way. We ignore these non-critical complications for now, but discuss an alternative variation in Section 9.

In general, any priority heap must implement the two functions extractMin(H) and updateValue(H, v). extractMin(H) is traditionally responsible for extracting the node with the lowest key value. In our case the key value is D(v), and we design the heap such that it returns a node v in the same approximate level set as the node u with the lowest key value; formally, extractMin(H) returns v such that v, u ∈ Bk̄ and u = arg min_{v′∈H} D(v′). updateValue(H, v) updates the location of v within H; for example, after D(v) has been modified.

The updateValue(H, v) and extractMin(H) operations for Heap 1 are described in Algorithms 2 and 3, respectively. We assume that we are provided with a doubly-linked-list data structure that is able to add and remove a node v to/from a list B in O(1) time using AddToListFront(B, v) and RemoveFromList(B, v), and able to remove the front node of a list in O(1) time using PopList(B) (which returns null if the list is empty). The subroutine currentBucketID(v) returns the index of the bucket that currently contains v and can be implemented using either the table described above or by allowing v to store the information internally; in either case the cost of calling it is O(1). The subroutine IsEmpty(B) returns true if the list is empty and false if it is not; this is clearly O(1). Finally, the heap also maintains the index k̄ of the first non-empty bucket, and k̄ is initialized to 0.

updateValue(H, v) is presented in Algorithm 2. It starts by checking whether v is already in a bucket, and returns if v is already in the correct bucket (lines 1-3). If v is in an incorrect bucket, then it is removed from that bucket (lines 4-5). Next, the index of the correct bucket for v is calculated (line 6), and v is added to the appropriate linked list (line 7). Finally, we update k̄ if v has been placed into an earlier bucket than Bk̄ (lines 8-9). In Section 7 we prove that k̄ is non-decreasing, and so if this heap is used with Dijkstra's algorithm


Algorithm 3: extractMin(H) for basic version (Heap 1)

Input: Heap H (of type Heap 1).
Output: Node v such that D(v) − min_{u∈H} D(u) < ℓmin.

1  while k̄ < L and IsEmpty(Bk̄) do
2      k̄ = k̄ + 1;
3  if k̄ == L then
4      return NULL;
5  return PopList(Bk̄);

then k < k̄ never evaluates to true. That final check is included here only for the purpose of presenting a data structure that is correct in general.

extractMin(H) is presented in Algorithm 3. It starts by finding the earliest non-empty bucket (lines 1-2). If no data is contained then it returns null (lines 3-4); otherwise it removes the node from the front of the earliest non-empty bucket and returns that node (line 5).
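As a concrete illustration of Algorithms 2 and 3, the following Python sketch implements the two operations (class and method names are ours; Python deques stand in for the doubly-linked lists, so removing a node from the middle of a bucket is O(n) here rather than the O(1) a true doubly-linked list with stored list-node pointers gives):

```python
from collections import deque
import math

class Heap1:
    """Illustrative sketch of Heap 1, following Algorithms 2 and 3."""

    def __init__(self, d_max, l_min):
        self.l_min = l_min
        self.L = math.ceil(d_max / l_min) + 1       # number of bucket heads
        self.buckets = [deque() for _ in range(self.L)]
        self.where = {}     # node -> bucket index; absence models B_infinity
        self.k_bar = 0      # index of the first possibly non-empty bucket

    def update_value(self, v, d_v):                 # cf. Algorithm 2
        k_new = math.floor(d_v / self.l_min)
        k_old = self.where.get(v)
        if k_old == k_new:
            return                                  # already in the right bucket
        if k_old is not None:
            self.buckets[k_old].remove(v)           # leave the wrong bucket
        self.buckets[k_new].appendleft(v)           # join the right bucket
        self.where[v] = k_new
        self.k_bar = min(self.k_bar, k_new)         # keep k_bar at first bucket

    def extract_min(self):                          # cf. Algorithm 3
        while self.k_bar < self.L and not self.buckets[self.k_bar]:
            self.k_bar += 1                         # skip empty buckets
        if self.k_bar == self.L:
            return None
        v = self.buckets[self.k_bar].popleft()
        del self.where[v]
        return v
```

Note that any node in the front bucket may be returned; the level-set width guarantees this is safe for Dijkstra's algorithm even though it is not a strict minimum.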

6.2 Heap 2: a ring array of lists

Let β = ⌈ℓmax/ℓmin⌉. We observe that it is impossible for the processing of vi ∈ Bk to cause any neighbor vj of vi to be moved to any bucket beyond array index k + β (since doing so would require ‖εij‖ > ℓmax, which is impossible) — this is formally proved in Section 7.

This means that only β + 1 contiguous bucket levels are ever needed at any time during the algorithm's execution. Thus, we can modify the data structure to be more space efficient by simply defining bucket Bk ≡ Bmod(k, β+1). Another way of thinking about this is that the array is a loop that overlaps itself every β + 1 positions.

While the runtimes for extractMin and updateValue remain essentially unchanged (now requiring an additional modulo operation), we end up using significantly less space. The fact that we can use Bk ≡ Bmod(k, β+1) is also convenient given that we do not usually know dmax a priori, but β = ⌈ℓmax/ℓmin⌉ can be calculated in time O(|E|) using a single pass over E.

Readers who dislike assuming that modulo and division require O(1) time are referred to Section 9, where we describe how the modulo and division operations can be replaced with bit-shift and multiplication operations (and float-to-integer conversion in the case that edge lengths are floating-point numbers).

Versions of updateValue(H, v) and extractMin(H) that are modified for use with Heap 2 appear in Algorithms 4 and 5, respectively. The main differences vs. Heap 1 are related to the looping nature of the algorithm, in particular the calculation of k (lines 4.1-4.3) and the check for the next non-empty bucket (lines 5.1-5.4).

Heap 2 is complete when properly used in conjunction with Dijkstra's algorithm solving SSSP for positive edge weights (this is proved in Section 7), but may not be complete in other applications.


Fig. 3: Node v1 is removed from the heap H using extractMin(H) (left vs. right). During the processing of v1 its neighbor v7 is added to the heap using updateValue(H, v7) (right). Both Heap 1 and Heap 2 use an array of doubly-linked lists, i.e., buckets B. Heap 2 defines Bk ≡ Bmod(k, β+1) such that buckets at “index” k loop back around (loop arrow). The ‘v’ graphic indicates the current position where this looping occurs; note that it moves down as k̄ increases.

Fig. 4: Node v1 is removed from the heap H using extractMin(H) (left vs. right). Yellow represents a reduced D(v) for v, while light blue represents movement into or within the heap. v1 has neighbors v3, v5, v6, v7, v8. Processing v1 cannot reduce D(v3) when v1, v3 ∈ B7, and so v3 is not moved (this happens because of the particular way we have defined the approximate level-set width of buckets). D(v5) is reduced; however (as we show in Section 7), this reduction will never be enough to place v5 into the same bucket as v1 (thus, v5 will remain in the next-lowest bucket B8≡2). D(v6) is reduced such that v6 is moved to B8≡2. Node v7 is added to H at B9≡0 and v8 is added to H at B8≡2.

6.3 Heap 3: skipping empty buckets using a bit-tree

We observe that extractMin(H) in Heaps 1 and 2 may end up checking an excessive number of empty buckets when the distribution of edge lengths is such that many approximate level sets of d are empty (lines 3.1-3.2 and 5.2-5.3). For example, this is likely to happen when ℓmin ≪ ℓmax. Heap 3 takes precautions to avoid unnecessary checks of empty buckets.

The basic idea is to use a bit-tree to track which buckets are empty and which are not. A bit-tree is a word-based tree of binary flags. In our case, leaf nodes are mapped to the empty vs. non-empty status of a particular bucket (and each bucket has a corresponding leaf node in the bit-tree). A ‘1’ located


Algorithm 4: updateValue(H, v) for ring array version (Heap 2)

Input: Heap H (of type Heap 2) and a node v.
Output: Updates the position of v within H based on D(v), adding v to H if necessary.

1  kold = currentBucketID(v);
2  k = mod(⌊D(v)/ℓmin⌋, β + 1);
3  if kold == k then
4      return;
5  if kold < ∞ then
6      RemoveFromList(Bkold, v);
7  AddToListFront(Bk, v);

Algorithm 5: extractMin(H) for ring array version (Heap 2)

Input: Heap H (of type Heap 2).
Output: Node v such that D(v) − min_{u∈H} D(u) < ℓmin.

1  kf = mod(k̄ − 1, β + 1);
2  while k̄ ≠ kf and IsEmpty(Bk̄) do
3      k̄ = mod(k̄ + 1, β + 1);
4  if IsEmpty(Bk̄) then
5      return NULL;
6  return PopList(Bk̄);

in the i-th position of the word stored at bit-tree node τ indicates that one of the descendants (buckets, in our case) of τ's i-th child is non-empty. The specific details of the bit-tree implementation are presented in the appendix; however, it is important to know that the bit-tree provides the following accessor functions:

– markPositionNonEmpty(k) informs the bit-tree that Bk is non-empty in time O(logw(β + 1)).

– markPositionEmpty(k) informs the bit-tree that Bk is empty in time O(logw(β + 1)).

– nextNonEmptyPosition(k) returns the index h of the next non-empty bucket after (and including) Bk. If Bk is non-empty then the runtime is O(1), and otherwise the runtime is O(logw(β + 1)).

See Section 7 for the derivation of the runtimes associated with our application of the bit-tree; recall that w is the word length of the computer being used.

The versions of updateValue(H, v) and extractMin(H) that are modified for use with the new data structure appear in Algorithms 6 and 7, respectively. Note that nextNonEmptyPosition(k) ignores the ability to loop around the array, which is why it is called a second time on line 7.3.
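For illustration, a small Python sketch of such a bit-tree is given below (our own names and a toy word size; a real implementation would use machine words and hardware bit-scan instructions to obtain the O(logw(β + 1)) bounds quoted above):

```python
class BitTree:
    """Illustrative bit-tree over n bucket positions with word size w."""

    def __init__(self, n, w=4):
        self.w = w
        self.levels = []                  # levels[0] holds the leaf bits
        words = n
        while True:
            words = (words + w - 1) // w
            self.levels.append([0] * words)
            if words == 1:
                break

    def _ffs(self, word, start):
        """Lowest set bit index >= start in word, or None."""
        word >>= start
        while word:
            if word & 1:
                return start
            word >>= 1
            start += 1
        return None

    def mark_nonempty(self, k):           # one bit set per level
        for level in self.levels:
            level[k // self.w] |= 1 << (k % self.w)
            k //= self.w

    def mark_empty(self, k):              # clear upward until an ancestor
        for level in self.levels:         # still has a non-empty descendant
            level[k // self.w] &= ~(1 << (k % self.w))
            if level[k // self.w]:
                break
            k //= self.w

    def next_nonempty(self, k):
        """First non-empty position >= k, else None (no wrap-around, which
        is why Algorithm 7 makes a second call starting from 0)."""
        pos, lvl = k, 0
        while lvl < len(self.levels):
            word_i = pos // self.w
            if word_i >= len(self.levels[lvl]):
                return None               # past the end of this level
            b = self._ffs(self.levels[lvl][word_i], pos % self.w)
            if b is not None:
                pos = word_i * self.w + b
                while lvl > 0:            # descend back to the leaf level
                    lvl -= 1
                    pos = pos * self.w + self._ffs(self.levels[lvl][pos], 0)
                return pos
            pos, lvl = word_i + 1, lvl + 1
        return None
```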


Fig. 5: A bit-tree is used to facilitate faster extractMin(H) operations in cases where K is large (i.e., when there are many empty buckets). Each node in the bit-tree is associated with an integer of w bits, where the i-th significant bit represents whether or not any of that node's i-th child's descendants (self inclusive) are associated with non-empty buckets. In this figure w = 3. This allows us to find the next non-empty bucket in O(logw(ℓmax/ℓmin + 1)) time. The bit-tree is described in more detail in the appendix.

Algorithm 6: updateValue(H, v) for ring array with bit-tree version (Heap 3)

Input: Heap H (of type Heap 3) and a node v.
Output: Updates the position of v within H based on D(v), adding v to H if necessary.

1  kold = currentBucketID(v);
2  k = mod(⌊D(v)/ℓmin⌋, β + 1);
3  if kold == k then
4      return;
5  if kold < ∞ then
6      RemoveFromList(Bkold, v);
7  AddToListFront(Bk, v);
8  markPositionNonEmpty(k);
9  if kold < ∞ and IsEmpty(Bkold) then
10     markPositionEmpty(kold);

7 Analysis

7.1 Correctness of Heap 1

We now show that Dijkstra's algorithm is correct (terminates in finite time and produces a shortest path for all nodes v ∈ V) when using our heaps. In this


Algorithm 7: extractMin(H) for ring array with bit-tree version (Heap 3)

Input: Heap H (of type Heap 3).
Output: Node v such that D(v) − min_{u∈H} D(u) < ℓmin.

1  k̄ = nextNonEmptyPosition(k̄);
2  if k̄ == ∞ then
3      k̄ = nextNonEmptyPosition(0);
4  if k̄ == ∞ then
5      return NULL;
6  v = PopList(Bk̄);
7  if IsEmpty(Bk̄) then
8      markPositionEmpty(k̄);
9  return v;

subsection we consider Heap 1, and then extend the results to Heap 2 in the next subsection (the extension to Heap 3 is implied by the extension to Heap 2). The only significant difference between our version of Dijkstra's algorithm and the original is the heap data structure that is used. Therefore, a sufficient proof for the correctness of our modified Dijkstra's algorithm is to demonstrate that a node v is removed from the top of our heap only if d(v) = D(v), i.e., only if the shortest path from v to s has already been computed.

In order to achieve this, we require a few intermediate results regarding the way that nodes in different buckets interact.

Lemma 1. If vi and vj are currently in the same bucket Bk, then D(vi) cannot be decreased via any edge εji from node vj to node vi.

Proof (by contradiction). If using εji decreases D(vi) then clearly D(vi) > D(vj) + ‖εji‖, which can be rearranged as D(vi) − D(vj) > ‖εji‖. By construction ‖εji‖ ≥ ℓmin, and so it follows that D(vi) − D(vj) > ℓmin. However, vi and vj are in the same bucket, and thus D(vi) − D(vj) < ℓmin by construction, which is a contradiction. ⊓⊔

Lemma 2. If vi is in bucket Bk and vj is in bucket Bh, where k < h, then D(vi) cannot be decreased via edge εji from node vj to node vi.

Proof. By construction D(v) < D(u) for all u, v such that v ∈ Bk and u ∈ Bh, where h > k. Also by construction all edge lengths, including ‖εji‖, are non-negative. Therefore, D(vi) < D(vj) + ‖εji‖. ⊓⊔

Recall that “processing” vi is the act of removing vi from the heap and updating its neighbors v ∈ {vj | (vi, vj) ∈ E} with respect to any path-length decreases that can be achieved via edges from vi. Processing vi reduces the path length of its neighbor vj if and only if D(vi) + ‖εij‖ < D(vj). A reduced path length at vj may decrease the bucket in which vj resides or move vj into an


initial bucket if it is not already in a bucket. The reasoning in the following lemma is very similar to that in Lemma 1.

Lemma 3. Processing vi ∈ Bk cannot move vj into Bk.

Proof (by contradiction). In order for vj to move into Bk, its path length (after processing vi and moving vj into Bk) must be D(vj) = D(vi) + ‖εij‖. Rearranging gives D(vj) − D(vi) = ‖εij‖. By construction ‖εij‖ ≥ ℓmin, and substituting gives D(vj) − D(vi) ≥ ℓmin. However, vi and vj would then be in the same bucket Bk, and thus D(vj) − D(vi) < ℓmin by construction, which is a contradiction. ⊓⊔

It is important to note that vj may already be in Bk before vi is processed; Lemma 3 just guarantees that its presence in Bk is not caused by processing vi.

Lemma 4. Processing vi ∈ Bk cannot move vj into Bh, where h < k.

Proof. By construction all edge lengths are non-negative, including ‖εij‖. Moving vj into a bucket Bh such that h < k and vi ∈ Bk (immediately prior to the processing of vi) would require ‖εij‖ < 0. ⊓⊔

Note that Lemmas 3 and 4 together guarantee that we are able to process buckets in the order of increasing bucket index.

We are now ready to prove our main result.

Theorem 1. v ∈ Bk is removed from Heap 1 only if d(v) = D(v).

Proof. (By induction).

Base case: B0 contains s and nothing else (since all other nodes are at least as far from s as ℓmin). s has the correct shortest-path length d(s) = D(s) = 0 by definition, and is processed first by construction.

Inductive step on index k: Assuming nodes v′ from buckets B0 through Bk−1 have correctly calculated d(v′) = D(v′), and Bk−1 is now empty, we must prove that all nodes v in Bk are guaranteed to have d(v) = D(v). More formally, we must demonstrate that d(v′) = D(v′) for all v′ ∈ {u | u ∈ Bh ∧ 0 ≤ h < k} guarantees d(v) = D(v) for all v ∈ Bk.

Lemmas 1 and 2 guarantee that when any v ∈ Bk is processed there are no nodes in the heap (including Bk) that are currently able to reduce D(v). Lemmas 3 and 4 guarantee that all future heap additions and reshufflings involving any node u′ ∈ {u | u ∈ Bh ∧ h ≥ k} cannot move u′ into bucket Bk or a lower bucket Bh, where h < k. This is necessary for the inductive step to work, but more importantly it guarantees that, for all v ∈ Bk, no combination of future heap operations will ever yield a node u′ that could have reduced D(v).

Finally, because all nodes v′ ∈ {v ∈ Bh | 0 ≤ h < k} were processed correctly (via the inductive assumption), we can conclude that all edges from any v ∈ Bk to v′ ∈ {v ∈ Bh | 0 ≤ h < k} must have been evaluated and, by construction, the best of them has been used to calculate D(v). Thus, d(v) = D(v) for all v ∈ Bk even before any member of Bk is processed. ⊓⊔


The correctness of Dijkstra's algorithm using our data structure is a corollary that follows from Theorem 1 and the original completeness proof of Dijkstra's algorithm.

Corollary 1. Dijkstra’s algorithm using Heap 1 is correct.

7.2 Correctness of Heap 2

When using Dijkstra's algorithm with Heap 1, during the processing of node v ∈ Bk it is impossible that any of v's neighbors will require reshuffling into Bh, where h > k + β. This is formalized in the following theorem, and implies that we are able to use the ring-based array of Heap 2 (and, by extension, Heap 3).

Theorem 2. At most β + 1 levels are needed simultaneously when using Heap 1 with Dijkstra's algorithm.

Proof (by contradiction). If we do need more than β + 1 levels, then at some point during the algorithm's execution it must be the case that we are processing a node vi with a neighbor vj such that: vi is in bucket Bk (prior to processing), and vj is in some Bh, where h ≥ k + β + 1 (after processing). Thus D(vi) ∈ ((k − 1)ℓmin, kℓmin] and D(vj) ∈ ((k + β)ℓmin, ∞), and so ‖εij‖ = D(vj) − D(vi) > βℓmin. However, βℓmin ≥ ℓmax ≥ ‖εij‖ also by construction, which is a contradiction. ⊓⊔

Corollary 2. Dijkstra’s algorithm using Heap 2 and Heap 3 is correct.

7.3 Runtime and runspace of Heap 1 and Heap 2

We assume a RAM computational model that is similar to the architectures used by most modern digital computers. We assume that addition, subtraction, multiplication, division, modulo, comparison, type conversion, and pointer operations are all O(1). See Section 9 for an alternative version that replaces division and modulo with integer bit shifts (and float-to-integer casting if edge weights are floating-point numbers).

Finding the values ℓmin and ℓmax can be achieved in time O(|E|) by scanning all edges.

Each call to updateValue(H, v) requires O(1) time. In particular, we access a bucket (or skip this step if v is not currently in the heap), calculate a new bucket index, and then add the node to a doubly-linked list. Calculating the index requires a floating-point division and a (truncating) conversion of a floating-point number to an integer. In Heap 2 we also require a modulo operation.

There are two cases for extractMin(H). In the first, there is at least one element in the current ‘top’ bucket; this requires an array look-up to find the bucket's list head and the removal of a node from the head of a doubly-linked list. In the second, we must move down the bucket array until we find a populated bucket. In the worst case, most buckets are empty, and over the entire run of the algorithm look-up operations require a cumulative time of K = O(dmax/ℓmin) to move down the array (since there are that many array positions in Heap 1, and the index of the bucket used for processing is non-decreasing). Alternatively, we observe that Theorem 2 guarantees that there will never be more than β empty buckets in a row, at least until the heap is empty. Thus we also know that K = O(β|V|) = O((ℓmax/ℓmin)|V|). Together, these observations provide the following combined bound on K:

K = O(min{dmax/ℓmin, (ℓmax/ℓmin)|V|}).

updateValue(H, v) is called at most |E| times, while extractMin(H) is called at most |V| times. Combining results (and remembering to account for the traversal over empty buckets), we find that the running time for Dijkstra's algorithm using either Heap 1 or Heap 2 is O(|E| + |V| + K).

The total space required by Heap 1 is an array of size ⌈dmax/ℓmin⌉ plus a collection of doubly-linked lists that cumulatively require no more than |V| storage containers, for a heap space requirement of O(|V| + dmax/ℓmin), and thus a total of O(|E| + |V| + dmax/ℓmin) for the modified version of Dijkstra's algorithm using Heap 1.

Heap 2 reduces this by using a ring array that only requires ⌈ℓmax/ℓmin⌉ + 1 positions. Thus the total space required for Heap 2 is O(|V| + ℓmax/ℓmin), and so the modified version of Dijkstra's algorithm using Heap 2 requires space O(|E| + |V| + ℓmax/ℓmin).

We note that the number of list containers stored at any particular time is no greater than the number of nodes in β + 1 contiguous level-set buckets, which could be used to calculate tighter bounds in some special cases.

7.4 Runtime and runspace of Heap 3

Heap 3 uses a bit-tree, which additionally assumes that the word operations ‘shift by x bits’ and ‘return the location of the first non-zero bit’ each run in O(1). It provides the subroutines markPositionNonEmpty(k), markPositionEmpty(k), and nextNonEmptyPosition(k). The first two run in time O(logw(β + 1)) (in the worst case, they each involve moving from a leaf of the bit-tree to its root, flipping one bit at each level, and the depth of the tree is logw(β + 1)). There are two cases for the third: if Bk is non-empty then the runtime is O(1), and otherwise the runtime is O(logw(β + 1)) (the worst case involves moving from a leaf to the root and then back down to another leaf). The first two are each called once in updateValue(H, v), while the latter is called at most twice in extractMin(H). All together, this means that Dijkstra's algorithm using Heap 3 runs in time O((|E| + |V|) logw(ℓmax/ℓmin + 1)).

Heap 3 also requires additional space vs. Heaps 1 and 2. In particular, O((ℓmax/ℓmin) logw⌈ℓmax/ℓmin⌉) words are used to store the word-length tree, for a total of O(|V| + (ℓmax/ℓmin) logw⌈ℓmax/ℓmin⌉) for Heap 3 alone and O(|E| + (ℓmax/ℓmin) logw⌈ℓmax/ℓmin⌉) for Dijkstra's algorithm using Heap 3.


8 Undirected graphs with non-negative edge weights

In the case of undirected graphs, Dijkstra's algorithm using any of the aforementioned heaps can be extended to handle non-negative edges using straightforward pre-processing and post-processing routines that run in time O(|E| + |V|). The key insight is that, for undirected graphs, if there exists a zero-length path between two nodes v and u, then v and u have the same shortest-path lengths vs. all nodes in V. Thus, we may treat such v and u as a single node for the purposes of SSSP, allowing us to solve instead a dual of the original problem that contains no zero-length edges.

In pre-processing (Algorithm 11, presented in the appendix) we essentially combine each v ∈ V with all nodes that it can reach in 0 distance, i.e., all u such that ‖P∗(v, u)‖ = 0, into a “meta-node” v̄. The resulting set of meta-nodes is V̄, and the resulting set of edges between such meta-nodes is Ē. This can be done in both time and space O(|E| + |V|) by using one level of abstraction that adds O(1) operations, vs. the basic version of the algorithm, each time we touch a meta-node or an edge between meta-nodes (in particular, one or two pointer operations). Note that |V̄| ≤ |V| and |Ē| ≤ |E|.

Instead of solving the original problem (V, E), we run Dijkstra's algorithm using either Heap 2 or Heap 3 on the dual problem involving (V̄, Ē) while simultaneously ignoring any edges of the form (v̄, v̄). The latter removes zero-length edges from consideration without affecting the dual's solution.

Post-processing (Algorithm 12, presented in the appendix) involves “unpacking” each v̄ ∈ V̄ by transferring d(v̄) to d(v) for all v that were combined into v̄ during pre-processing. All sub-paths that move through a sequence of nodes v1, v2, . . . , vi, such that v1, v2, . . . , vi are all part of the same v̄ ∈ V̄, have length 0. This fact allows us to recover shortest-path parent pointers with respect to the original problem in a single O(|E| + |V|) pass over the original graph and the dual's shortest-path tree. For each of the dual's meta-nodes v̄ ∈ V̄ we perform a breadth-first search restricted to zero-length edges (and thus over only the sub-graph containing the nodes from which v̄ was created). Even though one (restricted) breadth-first search is performed per v̄ ∈ V̄, we only touch each original edge ε ∈ E once during post-processing, since each v ∈ V is part of exactly one v̄ ∈ V̄, and so post-processing also takes time and space O(|E| + |V|).
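The grouping step of the pre-processing can be sketched as follows (an illustrative snippet with our own names, not Algorithm 11 itself): nodes connected by zero-length edges are merged into meta-nodes via breadth-first searches restricted to zero-length edges.

```python
from collections import deque

def collapse_zero_edges(nodes, edges):
    """Merge nodes joined by zero-length undirected edges into meta-nodes.

    `edges` is a list of (u, v, length) triples; returns node -> meta-node id.
    """
    zero_adj = {v: [] for v in nodes}
    for u, v, length in edges:
        if length == 0:                    # keep only zero-length edges
            zero_adj[u].append(v)
            zero_adj[v].append(u)
    meta, next_id = {}, 0
    for start in nodes:
        if start in meta:
            continue
        meta[start] = next_id              # start a new meta-node
        queue = deque([start])
        while queue:                       # BFS over zero-length edges;
            u = queue.popleft()            # each edge is touched once,
            for v in zero_adj[u]:          # so O(|E| + |V|) overall
                if v not in meta:
                    meta[v] = next_id
                    queue.append(v)
        next_id += 1
    return meta
```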

9 Remarks

9.1 Regarding infinite edge lengths

There are two slightly different notions of “infinite” edge weight that the methods presented above have conflated for notational convenience and for the sake of using standard practice. In particular, ‘no edge exists between two nodes’ has been combined with ‘an edge of infinite length exists between two nodes’ due to the definition ‖ε‖ ≡ ∞ for ε ∉ E.

In cases where this distinction is important, we can regain the distinction by storing two separate “external” buckets B∞ and B∅ instead of just one, and


using instead ‖ε‖ ≡ ∅ for ε ∉ E. We implicitly assume that un-inserted nodes reside in B∅. If an infinite-length edge connects v and u, where v ∈ B∅ and u is processed, then we place v into B∞ (which should now be implemented as a linked list instead of an array). Finally, after all nodes have been processed from the normal heap H, we process the nodes in B∞, while moving any newly connected nodes from B∅ to B∞ (the latter is allowed because all remaining best-path lengths are either infinite or undefined).

This modification handles infinite-length edges as a separate special case that is guaranteed to run in O(|E| + |V|) time. Therefore, much of the analysis is unaffected. The only technical detail worth mentioning (indeed, a convention we have been using throughout our discussion) is that we must carefully define ℓmax as the longest finite-length edge, and dmax as the longest finite-length shortest-path.

9.2 Regarding division and modulo operations

On computational architectures where division takes ω(1) time, we can store the reciprocal of ℓmin instead of ℓmin, and then perform multiplication instead of division. The reciprocal only needs to be calculated once; and so, assuming we have some way of computing the reciprocal in time O(|E| + |V|), this modification does not affect our runtime analysis.

On (binary) computational architectures where the modulo operation takes ω(1) time, we can replace the modulo operation with bit-shifting operations by replacing β with β = 2^n − 1, where 2^n is the smallest power of 2 such that 2^n − 1 ≥ ⌈ℓmax/ℓmin⌉. This makes the circular array in Heaps 2 and 3 slightly longer than necessary, so that its length can be a power of two, but causes all modulo operations to have the form mod(x, 2^n), where n is constant. For integers on standard binary architectures, mod(x, 2^n) can be implemented using the left << and right >> shift operations as follows: mod(x, 2^n) ≡ x − ((x >> n) << n), which uses O(1) time, assuming that bit shifts and subtraction are O(1).

In cases where we want to remove the modulo operation and there are floating-point edge weights, we need to convert x = ⌊D(v)/ℓmin⌋ to an integer from a float; thus, this strategy still requires that such a casting operation runs in time O(1). With the above assumptions, these modifications do not affect our asymptotic runtime and space bounds, because we are simply using an array of length β < 2⌈ℓmax/ℓmin⌉ instead of β = ⌈ℓmax/ℓmin⌉.
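The shift-based modulo identity is easy to check directly (a toy snippet; in practice this would be applied to machine integers in a compiled language):

```python
def mod_pow2(x, n):
    """mod(x, 2**n) using only shifts and subtraction, per the identity
    mod(x, 2^n) = x - ((x >> n) << n); x must be a non-negative integer
    (floating-point keys are first cast to integers in the scheme above)."""
    return x - ((x >> n) << n)
```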

9.3 Regarding runtime

For the subset of SSSP instances involving strictly positive edge weights, as well as those involving undirected graphs with non-negative edges (Section 8), the methods we present tie the previous best known bounds on runtime whenever K ≤ |V| + |E|. This happens whenever dmax/ℓmin ≤ |V| + |E| and also whenever (ℓmax/ℓmin)|V| ≤ |E| (and other cases may also exist).

The constant factors that affect practical performance are easy to calculate (or bound) and give direct insight into when the algorithm should be expected


to perform well or poorly. For example, the method will perform better as the relative difference between ℓmax and ℓmin decreases.

As previously remarked, K is the total number of approximate level sets of d that turn out to be empty. For many types of graphs K is guaranteed to be small. The simplest case is when all edge weights are the same, in which case the algorithm runs in linear time. This happens for problems in which path length is defined by the number of edges in the path (e.g., social degrees-of-separation). It also happens on 4-grids and other uniform lattices, which are frequently used for path-planning.
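In the uniform-edge-weight case every bucket holds exactly one level set and K = 0, so the bucket heap degenerates into a FIFO queue; the following sketch (our own code, not the paper's) shows this special case:

```python
from collections import deque

def unit_weight_sssp(adj, s):
    """SSSP when every edge has the same length: with K = 0 the bucket heap
    reduces to a plain FIFO queue and Dijkstra's algorithm becomes
    breadth-first search, running in O(|E| + |V|). `adj` maps each node to
    its list of neighbors; distances are reported in hop counts."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:          # first visit gives the shortest path
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist
```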

In some practical cases of interest (e.g., trajectory-library-based robotic motion planning) we also have the power to design the graph that we intend to plan over. In these cases we can manually enforce that edge lengths are “close enough” to each other that the algorithm runs at a desired speed (for example, within a few orders of magnitude of each other).

On the other hand, there are many types of graphs that are not well suited (at least theoretically) to our method. For example, using Heap 3 with graphs that are created by sampling points in space uniformly at random (e.g., randomly sampled disc-graphs) will tend to produce long-term (but relatively slow) performance degradation as the number of nodes increases, since the ratio ℓmax/ℓmin is expected to increase without bound as |V| approaches infinity in that case. Heaps 1 and 2 will suffer due to decreasing ℓmin, but will also benefit from the fact that the chance that a particular level set is empty approaches 0, in the limit, as |V| approaches infinity. A particularly bad case for any version of our heap happens when a graph simultaneously has relatively few edges but edge lengths at different orders of magnitude.

9.4 Potential for parallelization

The idea of processing nodes in parallel based on any partial ordering that guarantees best-path-length independence between subsets of nodes is, in general, amenable to parallelization on multi-core architectures. In order to ensure such a parallel implementation is correct, synchronization between processors is (only) necessary after each bucket is emptied. For our data structure this will obviously work best in cases where each bucket contains many nodes and/or ℓmax and ℓmin are similar.

10 Summary

We have presented a family of new heap data structures that enable Dijkstra's algorithm to solve the SSSP with positive edge weights in either time O(|E| + |V| + K) and space O(|E| + |V| + ℓmax/ℓmin), or time O((|E| + |V|) logw(ℓmax/ℓmin + 1)) and space O(|E| + (ℓmax/ℓmin) logw⌈ℓmax/ℓmin⌉), where ℓmin and ℓmax are problem-dependent constants and w is an architecture-dependent constant. We derive the bound K = O(min{dmax/ℓmin, (ℓmax/ℓmin)|V|}). Thus, our method is able to solve many instances

Page 25: Modifying Dijkstra's Algorithm to Solve Many Instances of ... · Dijkstra’s algorithm works by incrementally building a \shortest-path-tree" S outward from s, one node at a time

of SSSP in linear time, e.g., whenever dmax

`min≤ |V | + |E| and also whenever

`max

`min|V | ≤ |E|.

The method works because, as we prove in Section 7, it is possible to solve the positive-weight SSSP using only a partial ordering over nodes instead of a full ordering, as long as we choose that partial ordering carefully; in particular, we use approximate level sets based on the shortest-path-lengths of nodes. Even though shortest-path-lengths are initially unknown for all nodes but the start node, we can correctly calculate the partial ordering on-the-fly if the width of each approximate level set is bounded by the graph's shortest edge length. Nodes within an approximate level set can be processed in any order by a standard implementation of Dijkstra's algorithm and still yield the correct solution. The method can be extended to non-negative edges for the special case of undirected graphs, using pre-/post-processing to solve a dual of the original problem that ignores zero-length edges.
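A minimal sequential sketch of this idea follows, with the graph as a Python dict and every edge weight assumed to be at least l_min (function and variable names here are hypothetical, not taken from the paper's implementation). Because any relaxation out of the current level set lands in a strictly later one, the nodes of a bucket may be processed in arbitrary order:

```python
import math
from collections import defaultdict

def dijkstra_level_sets(adj, s, l_min):
    """adj: dict u -> list of (v, weight); every weight >= l_min > 0 (assumed).
    Buckets of width l_min act as approximate level sets of path length."""
    dist = {s: 0.0}
    buckets = defaultdict(list)       # bucket index -> nodes in that level set
    buckets[0].append(s)
    done = set()
    b, max_b = 0, 0
    while b <= max_b:
        for u in buckets[b]:          # any processing order within a bucket works
            if u in done or math.floor(dist[u] / l_min) != b:
                continue              # stale entry left by an earlier relaxation
            done.add(u)
            for v, w in adj.get(u, []):
                nd = dist[u] + w
                if nd < dist.get(v, math.inf):
                    dist[v] = nd
                    k = math.floor(nd / l_min)   # lands strictly after bucket b
                    buckets[k].append(v)
                    max_b = max(max_b, k)
        b += 1
    return dist
```

Since d(v) = d(u) + w ≥ d(u) + ℓmin, a node relaxed from bucket b always falls in bucket b + 1 or later, so the inner loop never grows the bucket it is iterating over.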

The heaps that we present store the approximate level sets in an array of buckets, where each bucket is a doubly linked list. We show that only a small and calculable subset of contiguous buckets is ever in use at one time, and so we can use a looping array that saves considerable space. In particularly challenging cases many empty buckets exist (which may slow down heap extraction operations); therefore, the "less-simple" version uses a bit-tree to quickly determine the next non-empty bucket.
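The looping-array trick can be sketched as follows, under the assumption that every key inserted is within span bucket-widths of the smallest unfinished key, so live bucket indices always fit inside the circular window (class and method names are ours, for illustration only):

```python
import math

class LoopingBuckets:
    """Fixed-size circular array of buckets. Assumes every inserted key lies
    within span * width of the smallest unfinished key, so at most span + 1
    contiguous buckets are live at any moment and indices can wrap around."""
    def __init__(self, width, span):
        self.width = width                   # bucket width (<= shortest edge length)
        self.size = span + 2                 # room for all simultaneously live buckets
        self.slots = [[] for _ in range(self.size)]
    def bucket_index(self, key):
        return math.floor(key / self.width)  # logical (unwrapped) bucket index
    def insert(self, key, item):
        self.slots[self.bucket_index(key) % self.size].append((key, item))
    def take_bucket(self, b):
        """Remove and return the contents of logical bucket b."""
        slot = self.slots[b % self.size]
        self.slots[b % self.size] = []
        return slot
```

Once logical bucket b has been emptied, its physical slot b mod size is free to be reused by logical bucket b + size, which is why the array needs only O(ℓmax/ℓmin) slots rather than one per possible path length.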

The heaps that we present are easy to implement, and the resulting modification of Dijkstra's algorithm is efficient in both time and space, and easy to analyze. We hope that the concept of level-set based partial orderings will increase the overall understanding of the SSSP and perhaps even be useful in other domains.

Acknowledgments

This work was supported by the Control Science Center of Excellence at the Air Force Research Laboratory (AFRL), the National Science Foundation (NSF) grant IIP-1161029, and the Center for Unmanned Aircraft Systems. This work was completed while the author was "in residence" at AFRL. We would also like to acknowledge Erik Komendera, who independently discovered but never published a data structure similar to Heap 1 while working on an unrelated problem, and thank him for his discussions involving it. The author would also like to thank Dmitry Yershov and Wheeler Ruml for pointing out relevant related work.

References

1. Ravindra K Ahuja, Kurt Mehlhorn, James Orlin, and Robert E Tarjan. Faster algorithms for the shortest path problem. Journal of the ACM (JACM), 37(2):213–223, 1990.
2. Yasuhito Asano and Hiroshi Imai. Practical efficiency of the linear time algorithm for the single source shortest path problem. Journal of the Operations Research Society of Japan, 43:431–447, 2000.
3. Boris V Cherkassky, Andrew V Goldberg, and Tomasz Radzik. Shortest paths algorithms: Theory and experimental evaluation. Mathematical Programming, 73(2):129–174, 1996.
4. Boris V Cherkassky, Andrew V Goldberg, and Craig Silverstein. Buckets, heaps, lists, and monotone priority queues. SIAM Journal on Computing, 28(4):1326–1346, 1999.
5. Robert B Dial. Algorithm 360: Shortest-path forest with topological ordering [H]. Communications of the ACM, 12(11):632–633, 1969.
6. Cornelius Diekmann. An unusual way of solving the SSSP problem. 2010.
7. Henry Gordon Dietz. The Aggregate Magic Algorithms. Technical report, University of Kentucky.
8. Edsger W Dijkstra. A note on two problems in connexion with graphs. Numerische Mathematik, 1(1):269–271, 1959.
9. Greg N Frederickson. Fast algorithms for shortest paths in planar graphs, with applications. SIAM Journal on Computing, 16(6):1004–1022, 1987.
10. Michael L Fredman and Robert Endre Tarjan. Fibonacci heaps and their uses in improved network optimization algorithms. Journal of the ACM (JACM), 34(3):596–615, 1987.
11. Michael L Fredman and Dan E Willard. Trans-dichotomous algorithms for minimum spanning trees and shortest paths. In Foundations of Computer Science, 1990. Proceedings., 31st Annual Symposium on, pages 719–725. IEEE, 1990.
12. Michael L Fredman and Dan E Willard. Surpassing the information theoretic bound with fusion trees. Journal of Computer and System Sciences, 47(3):424–436, 1993.
13. Kurt Mehlhorn and Stefan Näher. Bounded ordered dictionaries in O(log log N) time and O(n) space. Information Processing Letters, 35(4):183–189, 1990.
14. James B Orlin, Kamesh Madduri, K Subramani, and Matthew Williamson. A faster algorithm for the single source shortest path problem with few distinct positive lengths. Journal of Discrete Algorithms, 8(2):189–198, 2010.
15. Nick Pruhs. Implementation of Thorup's linear time algorithm for undirected single-source shortest paths with positive integer weights. B.S. thesis, University of Kiel, Kiel, Germany, 2009.
16. Rajeev Raman. Priority queues: Small, monotone and trans-dichotomous. In Algorithms–ESA '96, pages 121–137. Springer, 1996.
17. Rajeev Raman. Recent results on the single-source shortest paths problem. ACM SIGACT News, 28(2):81–87, 1997.
18. K Subramani and Kamesh Madduri. Two-level heaps: a new priority queue structure with applications to the single source shortest path problem. Computing, 90(3-4):113–130, 2010.
19. Mikkel Thorup. Undirected single-source shortest paths with positive integer weights in linear time. Journal of the ACM (JACM), 46(3):362–394, 1999.
20. Mikkel Thorup. Floats, integers, and single source shortest paths. Journal of Algorithms, 35(2):189–201, 2000.
21. Mikkel Thorup. On RAM priority queues. In Proceedings of the Seventh Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 59–67, 1996.
22. John N Tsitsiklis. Efficient algorithms for globally optimal trajectories. IEEE Transactions on Automatic Control, 40(9), 1995.
23. Peter van Emde Boas. Preserving order in a forest in less than logarithmic time and linear space. Information Processing Letters, 6(3):80–82, 1977.
24. Peter van Emde Boas, Robert Kaas, and Erik Zijlstra. Design and implementation of an efficient priority queue. Mathematical Systems Theory, 10(1):99–127, 1976.
25. Yusi Wei and Shojiro Tanaka. An improved Thorup shortest paths algorithm with a modified component tree. In Natural Computation (ICNC), 2013 Ninth International Conference on, pages 1160–1165. IEEE, 2013.
26. Yusi Wei and Shojiro Tanaka. Improvement of Thorup shortest path algorithm by reducing the depth of a component tree. Journal of Advances in Computer Networks, 2(2), 2014.
27. John William Joseph Williams. Heapsort. Communications of the ACM, 7:347–348, 1964.

Bit-Tree Implementation

Each tree-node τ in the bit-tree portion of the looping word-length based tree data structure T has the following fields (we shall denote the "field of" relationship using a dot '.'):

– A flag L indicating whether the node is a leaf node or not, with value true or false respectively.

– A word B of bits of word-length w, where w is determined by the computer hardware being used. We shall denote the i-th bit in B as B[i].

– An array C of w pointers to the w children of the current tree-node (these are to other tree-nodes if this node is internal, and to a particular bucket if the node is a leaf). We shall index into the array using '[·]'.

– A pointer p to the parent of the current tree-node.

– An integer ip that represents this node's location in its parent's C.

– Leaf nodes also store k, the index of the bucket they are associated with.

Additionally, we store two arrays A and P that each have one element for every leaf node in the bit-tree. We enforce that A[k] == 1 if Bk is nonempty, and A[k] == 0 otherwise. P[k] contains a pointer to the leaf node that is associated with Bk.

The tree is designed such that there is one path from the root τ0 of the tree to each of the β + 1 buckets B in the bucket array. Let τd,k denote the tree-node at depth d along the path from the root to bucket Bk. If a particular bucket Bk is nonempty, then we enforce the condition that τd,k.B[i] = 1, where i is the position of the pointer to τd+1,k that is stored in τd,k.C.

This has the advantage of letting us quickly determine when none of the descendant buckets of τ have any nodes in them (i.e., because τd,k.B = 0 in that case). In the event that descendant buckets of τ are nonempty, it also allows us to quickly determine the earliest tree-child of τ that also has nonempty descendant buckets, i.e., fnz(τd,k.B), using the 'first non-zero bit' primitive, which we shall denote fnz(·). Note that we assume bit indexing that starts at 1 (if architecture indexing starts at 0, then we need only add one to the result, which requires O(1) time).

Many common architectures (e.g., x86, ARM, SPARC) implement a 'count trailing zeros' primitive, ctz(x), that runs in O(1) time and which


Algorithm 8: nextNonEmptyPosition(T, k)

Input: Bit-tree T = (A, P, ∪τ) of bit-word-nodes τ and integer k.
Output: The next non-empty position of T after (and including) k.

 1  if A[k] == 1 then
 2      return k
 3  τ = P[k]
 4  while τ ≠ τ0 do
 5      inext = ((τ.p.B >> τ.ip) << τ.ip)   /* mask off bits before position τ.ip */
 6      if inext > 0 then
 7          break
 8      τ = τ.p
 9  if τ == τ0 then
10      return ∞
11  τ = τ.p.C[fnz(inext)]   /* fnz(·) = index of the first non-zero bit */
12  while τ.L ≠ 1 do
13      τ = τ.C[fnz(τ.B)]
14  return τ.k

can be used to implement the first-non-zero-bit function fnz(x) as follows: fnz(x) = w − ctz(x). For architectures that have neither a 'first non-zero bit' nor a 'count trailing zeros' primitive, the function fnz(x) can be implemented in software with runtime O(log2 w); see [7] for details. In the latter case, where fnz(x) is implemented in software (and not hardware), the overall runtime of Dijkstra's algorithm using Heap 3 becomes O((|E| + |V|) log2(w) log_w(ℓmax/ℓmin + 1)). Although this represents a slightly larger architecture dependent constant, it does not change the overall validity of our qualitative claim of 'linear time performance on many graphs with small overhead'.
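For example, the software fallback for ctz can be written as a binary search over halves of the word, which performs O(log2 w) steps. The sketch below (a hedged illustration in Python, not the exact routine of [7]) assumes a word length of W = 64:

```python
W = 64  # assumed word length

def ctz(x):
    """Count trailing zeros of a nonzero W-bit word in O(log2 W) steps,
    for architectures without a hardware primitive."""
    assert x != 0
    n = 0
    shift = W // 2
    while shift:
        if (x & ((1 << shift) - 1)) == 0:  # low half is all zero: skip past it
            x >>= shift
            n += shift
        shift //= 2
    return n
```

Each iteration halves the search window, so a 64-bit word needs only six mask-and-shift steps.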

We now describe in more detail the subroutines nextNonEmptyPosition(k), markPositionEmpty(k), and markPositionNonEmpty(k) that are used to interact with the bit-tree in the bit-tree version of our data structure. The i-th bit of the integer k is denoted k[i].

nextNonEmptyPosition(k) appears in Algorithm 8. A quick check is used to see if the bucket at index k is nonempty, and if so then k is returned directly, lines 1-2. If Bk is empty, then we walk up the tree using parent pointers until we find a parent that has a later child with non-empty descendants or we reach the bit-tree root, lines 3-8. Note that '>>' and '<<' are the decreasing and increasing bit-shift operators, respectively. If we have reached the root, then we know that there were no nonempty buckets after Bk, and so we return ∞, lines 9-10. Otherwise, we move down through the branches of the tree toward the first non-empty bucket, lines 11-13, and return it on line 14.

markPositionEmpty(k) is presented in Figure 9. A quick check is used to testif the position k is already marked as empty, and if so then the algorithm returnsimmediately (lines 1-2). The position is marked as empty on line 4 (where ‘⊗’ is


Algorithm 9: markPositionEmpty(T, k)

Input: Bit-tree T = (A, P, ∪τ) of bit-word-nodes τ and integer k.
Output: Marks position k of T as empty.

1  if A[k] == 0 then
2      return
3  A[k] = 0
4  τ = P[k]
5  while τ ≠ τ0 do
6      τ.p.B = τ.p.B ⊗ (1 << τ.ip)
7      if τ.p.B > 0 then
8          return
9      τ = τ.p

Algorithm 10: markPositionNonEmpty(T, k)

Input: Bit-tree T = (A, P, ∪τ) of bit-word-nodes τ and integer k.
Output: Marks position k of T as nonempty.

 1  if A[k] == 1 then
 2      return
 3  A[k] = 1
 4  τ = P[k]
 5  while τ ≠ τ0 do
 6      if τ.p.B > 0 then
 7          τ.p.B = τ.p.B | (1 << τ.ip)
 8          return
 9      τ.p.B = τ.p.B | (1 << τ.ip)
10      τ = τ.p

the exclusive or operation) and then we walk up the bit tree to the root of thelargest sub-tree that is now empty due to the removal of Bk, while recording thelatter, lines 4-9. Note that we can stop as soon as we find a parent with othernon-empty descendants (lines 7-8).

markPositionNonEmpty(k) is presented in Algorithm 10 and is similar to markPositionEmpty(k), except that we record information marking Bk as full instead of empty. We can stop walking up the bit-tree as soon as we find a parent with other non-empty descendants (after marking that the path we have taken up the tree is now non-empty as well), lines 6-8.
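A toy two-level analogue of the bit-tree (a single summary word over an array of leaf words, rather than a full w-ary tree) is enough to illustrate all three operations; the class, its names, and the two-level simplification are ours, not the paper's, and bits are 0-indexed from the least significant side for brevity:

```python
W = 8  # toy word length for illustration

def fnz_low(x):
    """0-based index of the lowest set bit, or -1 if x == 0."""
    return (x & -x).bit_length() - 1

class TwoLevelBitIndex:
    """Hypothetical two-level analogue of the bit-tree: a summary word marks
    which leaf words are nonzero, so a next-nonempty query touches two words."""
    def __init__(self, n_buckets):
        self.leaves = [0] * ((n_buckets + W - 1) // W)
        self.summary = 0
    def mark_nonempty(self, k):                  # cf. Algorithm 10
        self.leaves[k // W] |= 1 << (k % W)
        self.summary |= 1 << (k // W)
    def mark_empty(self, k):                     # cf. Algorithm 9
        self.leaves[k // W] &= ~(1 << (k % W))
        if self.leaves[k // W] == 0:             # whole leaf word drained
            self.summary &= ~(1 << (k // W))
    def next_nonempty(self, k):                  # cf. Algorithm 8
        """Smallest marked position >= k, or None if no such position."""
        word, bit = k // W, k % W
        rest = self.leaves[word] >> bit << bit   # keep bits at or after k
        if rest:
            return word * W + fnz_low(rest)
        later = self.summary >> (word + 1) << (word + 1)
        if later == 0:
            return None
        word = fnz_low(later)
        return word * W + fnz_low(self.leaves[word])
```

A real implementation would stack logw(β + 1) such levels, but the mask-then-find-first-bit pattern at each level is the same as in Algorithms 8-10.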

Pre-/post-processing undirected graphs with zero-length edges

This section presents subroutines that can be used to pre-/post-process an undirected graph with zero-length edges such that a dual problem (without zero-length edges) can be solved instead of the original problem (with zero-length edges). These subroutines appear in Algorithms 11 and 12, respectively.


Algorithm 11: preProcess(V, E) for the extension of our method to undirected graphs with non-negative weights.

Input: Node set V and edge set E of the graph G = (V, E) with non-negative edge weights.
Output: Node set V̄ of a dual problem in which zero-length edges can be ignored. The dual's edge set Ē is implicitly defined by V, E, and V̄.

 1  V̄ ; /* array of |V| linked-list heads */
 2  n = 0
 3  for v ∈ V do
 4      n = n + 1
 5      V̄[n] = 0
 6      v.i = 0 ; /* position of V̄ containing v */
 7  n = 0 ; /* counts non-empty V̄ elements */
 8  QFIFO ; /* FIFO queue */
 9  for v ∈ V do
10      if v.i == 0 then
11          n = n + 1
12          AddToListFront(V̄[n], v)
13          v.i = n
14          u = v
15          while u ≠ NULL do
16              for (u, w) ∈ E do
17                  if ‖(u, w)‖ == 0 and w.i == 0 then
18                      AddToListFront(V̄[n], w)
19                      w.i = n
20                      PushBack(QFIFO, w)
21              u = PopFront(QFIFO)
22  V̄ = V̄[1 to n]
23  return V̄

Each v ∈ V of the original problem is combined, along with all nodes u that it can reach in 0 distance (all u such that ‖P*(v, u)‖ = 0), into a "meta-node" v̄ in the dual. Each meta-node contains a particular subset of nodes from the original problem that are zero-distance from each other (and contains all such nodes). Each node from the original problem is associated with exactly one meta-node in the dual. We assume that each node v maintains an internal integer field v.i that stores the (index of the) particular meta-node v̄ in the dual problem that v is associated with.

With some abuse of notation, v ∈ v̄ denotes that pre-processing has placed v into the meta-node v̄. Meta-nodes are stored in an array of linked lists, also denoted V̄, such that each linked list V̄[n] stores all of the (original) nodes in a particular meta-node v̄.
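The grouping step can be sketched as follows: a simplified Python stand-in for the meta-node construction that returns only the v.i assignment, using breadth-first search restricted to zero-length edges (names are hypothetical, and edges are given as undirected (u, v, length) triples):

```python
from collections import deque

def contract_zero_edges(nodes, edges):
    """Group nodes connected by zero-length undirected edges into meta-nodes.
    Returns a dict node -> meta-node index (1-based), mirroring the v.i field."""
    zero_adj = {v: [] for v in nodes}
    for u, v, w in edges:                 # undirected edges as (u, v, length)
        if w == 0:
            zero_adj[u].append(v)
            zero_adj[v].append(u)
    meta = {v: 0 for v in nodes}          # 0 == not yet assigned
    n = 0
    for v in nodes:
        if meta[v] == 0:
            n += 1                        # open a new meta-node
            meta[v] = n
            q = deque([v])
            while q:                      # BFS restricted to zero-length edges
                u = q.popleft()
                for x in zero_adj[u]:
                    if meta[x] == 0:
                        meta[x] = n
                        q.append(x)
    return meta
```

Each zero-length edge is examined at most twice (once per direction), matching the linear-time accounting of the pre-processing step.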


Algorithm 12: postProcess(V, E, s, V̄) for the extension of our method to undirected graphs with non-negative weights.

Input: Node set V and edge set E of the graph G = (V, E) with non-negative edge weights and start node s of the original problem; the dual problem's node set V̄ as calculated in the pre-processing step, assuming that the SSSP has been solved for the dual problem.
Output: The solution to the original SSSP problem (the shortest-path tree S and path-length values are stored implicitly in V and E).

 1  for all v ∈ V do
 2      d(v) = d(V̄[v.i]) ; /* unpack path-lengths */
 3      p(v) = NULL
 4  unpackParents(s, E)
 5  for n = 1, . . . , |V̄| do
 6      if s.i == n then
 7          continue
 8      v̄ = V̄[n]
 9      v = memberWithBestParent(v̄, E)
10      unpackParents(v, E)

Pre-processing is accomplished using preProcess(V, E) in Algorithm 11. The subroutine works by walking the graph (V, E). If a node v is found that is not yet associated with a meta-node in the dual, then a new meta-node v̄ = V̄[n] is created (lines 10-13) and an internal breadth-first search over zero-length edges (lines 14-21) is performed to find all other nodes that belong in v̄. It is important to note that each edge in E is touched at most twice during the entire execution of preProcess(V, E), once each in the (u, v) and (v, u) directions. The latter property is guaranteed by the fact that each node v ∈ V can belong to at most one meta-node v̄ ∈ V̄; by construction, all nodes associated with a particular meta-node v̄ are found (using the restricted breadth-first search) as soon as any node associated with v̄ is discovered, and each direction of each edge (v, u) appears in a single restricted breadth-first search.

Post-processing is accomplished using postProcess(V, E, s, V̄) in Algorithm 12. It involves "unpacking" each v̄ ∈ V̄ by transferring d(v̄) to d(v) for all v ∈ v̄ (line 2). All sub-paths that move through a sequence of nodes v1, v2, . . . , vi such that v1, v2, . . . , vi ∈ v̄ have length 0. This fact allows us to recover shortest-path parent pointers with respect to the original problem in a single O(|E| + |V|) pass over the original graph and the dual's shortest-path tree. The details of the latter are delegated to the subroutines unpackParents(v, E) and memberWithBestParent(v̄, E), each of which cumulatively touches each edge in E at most twice.
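The per-meta-node unpacking of parent pointers amounts to a breadth-first search over zero-length edges that assigns each newly reached node its BFS predecessor as its shortest-path parent. A minimal Python sketch, assuming a hypothetical zero_adj map that lists only zero-length neighbors:

```python
from collections import deque

def unpack_parents(v, zero_adj, parent):
    """BFS over zero-length edges from v; every node reached for the first
    time takes its BFS predecessor as shortest-path parent. The seed v is
    assumed to already have its parent set (e.g., by memberWithBestParent)."""
    q = deque([v])
    while q:
        u = q.popleft()
        for w in zero_adj.get(u, []):
            if parent.get(w) is None:   # not yet attached to the tree
                parent[w] = u
                q.append(w)
```

Because every edge traversed has length zero, any such parent assignment preserves the (already unpacked) path lengths, so the choice of BFS order does not matter.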

unpackParents(v, E) is called once per meta-node v̄ ∈ V̄ and uses a breadth-first search restricted to zero-length edges to find parent pointers among nodes within the same meta-node v̄. The restriction to zero-length edges means that


Algorithm 13: unpackParents(v, E)

Input: A node v and edge set E of the original problem, assuming path-lengths have already been unpacked but not parent pointers.
Output: Unpacks the parent pointers for all original nodes in the dual's meta-node v̄ such that v was part of v̄.

1  QFIFO ; /* empty FIFO queue */
2  u = v
3  while u ≠ NULL do
4      for (u, w) ∈ E s.t. ‖(u, w)‖ = 0 do
5          if p(w) == NULL then
6              p(w) = u
7              PushBack(QFIFO, w)
8      u = PopFront(QFIFO)

Algorithm 14: v = memberWithBestParent(v̄, E)

Input: A meta-node v̄ and edge set E of the original problem, assuming path-lengths have already been unpacked but not parent pointers.
Output: The original node v in the dual's meta-node v̄ such that v was part of v̄ and v has a parent u within the parent meta-node ū = p(v̄). Also unpacks the parent pointer for v.

1  for v ∈ v̄ do
2      for (v, u) ∈ E do
3          if u.i == p(v̄) then
4              p(v) = u
5              return v
6  return NULL

each search is constrained to consider only those nodes associated with a particular meta-node.

memberWithBestParent(v̄, E) is used to find a particular node v ∈ v̄ that may use a parent outside of v̄ in a valid shortest-path tree. In other words, it finds v such that v ∈ v̄, (u, v) ∈ E, and u ∈ ū where ū = p(v̄). This routine also "unpacks" the appropriate parent pointer for v (line 4).