Single Source Shortest Path Problem Dijkstra’s Algorithm

Preview:

Citation preview

Outline

1 Single Source Shortest Path Problem

2 Dijkstra’s Algorithm

3 Bellman-Ford Algorithm

4 All Pairs Shortest Path (APSP) Problem

5 Floyd-Warshall Algorithm

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 1 / 35

Single Source Shortest Path (SSSP) Problem

Single Source Shortest Path ProblemInput: A directed graph G = (V,E); an edge weight function w : E → R, and astart vertex s ∈ V.Find: for each vertex u ∈ V, δ(s, u) = the length of the shortest path from s tou, and the shortest s→ u path.

There are several different versions:

G can be directed or undirected.

All edge weights are 1.

All edge weights are positive.

Edge weights can be positive or negative, but there are no cycles withnegative total weight (why?).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 2 / 35

Single Source Shortest Path (SSSP) Problem

Single Source Shortest Path ProblemInput: A directed graph G = (V,E); an edge weight function w : E → R, and astart vertex s ∈ V.Find: for each vertex u ∈ V, δ(s, u) = the length of the shortest path from s tou, and the shortest s→ u path.

There are several different versions:

G can be directed or undirected.

All edge weights are 1.

All edge weights are positive.

Edge weights can be positive or negative, but there are no cycles withnegative total weight (why?).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 2 / 35

Single Source Shortest Path (SSSP) Problem

Single Source Shortest Path ProblemInput: A directed graph G = (V,E); an edge weight function w : E → R, and astart vertex s ∈ V.Find: for each vertex u ∈ V, δ(s, u) = the length of the shortest path from s tou, and the shortest s→ u path.

There are several different versions:

G can be directed or undirected.

All edge weights are 1.

All edge weights are positive.

Edge weights can be positive or negative, but there are no cycles withnegative total weight (why?).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 2 / 35

Single Source Shortest Path (SSSP) Problem

Single Source Shortest Path ProblemInput: A directed graph G = (V,E); an edge weight function w : E → R, and astart vertex s ∈ V.Find: for each vertex u ∈ V, δ(s, u) = the length of the shortest path from s tou, and the shortest s→ u path.

There are several different versions:

G can be directed or undirected.

All edge weights are 1.

All edge weights are positive.

Edge weights can be positive or negative, but there are no cycles withnegative total weight (why?).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 2 / 35

Single Source Shortest Path (SSSP) Problem

Single Source Shortest Path ProblemInput: A directed graph G = (V,E); an edge weight function w : E → R, and astart vertex s ∈ V.Find: for each vertex u ∈ V, δ(s, u) = the length of the shortest path from s tou, and the shortest s→ u path.

There are several different versions:

G can be directed or undirected.

All edge weights are 1.

All edge weights are positive.

Edge weights can be positive or negative, but there are no cycles withnegative total weight (why?).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 2 / 35

SSSP Problem

Note 1: There are natural applications where edge weights can benegative.

Note 2: If G has a cycle C with negative total weight, then we can just goaround C to decrease the δ(s, ∗) indefinitely.

Then the problem is not well defined. We will need an algorithm todetect such condition.

For the case when w(e) = 1 for all edges, we have shown that theproblem can be solved by BFS in Θ(n + m) time.

We next discuss algorithms for more general cases.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 3 / 35

SSSP Problem

Note 1: There are natural applications where edge weights can benegative.

Note 2: If G has a cycle C with negative total weight, then we can just goaround C to decrease the δ(s, ∗) indefinitely.

Then the problem is not well defined. We will need an algorithm todetect such condition.

For the case when w(e) = 1 for all edges, we have shown that theproblem can be solved by BFS in Θ(n + m) time.

We next discuss algorithms for more general cases.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 3 / 35

SSSP Problem

Note 1: There are natural applications where edge weights can benegative.

Note 2: If G has a cycle C with negative total weight, then we can just goaround C to decrease the δ(s, ∗) indefinitely.

Then the problem is not well defined. We will need an algorithm todetect such condition.

For the case when w(e) = 1 for all edges, we have shown that theproblem can be solved by BFS in Θ(n + m) time.

We next discuss algorithms for more general cases.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 3 / 35

SSSP Problem

Note 1: There are natural applications where edge weights can benegative.

Note 2: If G has a cycle C with negative total weight, then we can just goaround C to decrease the δ(s, ∗) indefinitely.

Then the problem is not well defined. We will need an algorithm todetect such condition.

For the case when w(e) = 1 for all edges, we have shown that theproblem can be solved by BFS in Θ(n + m) time.

We next discuss algorithms for more general cases.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 3 / 35

SSSP Problem

Note 1: There are natural applications where edge weights can benegative.

Note 2: If G has a cycle C with negative total weight, then we can just goaround C to decrease the δ(s, ∗) indefinitely.

Then the problem is not well defined. We will need an algorithm todetect such condition.

For the case when w(e) = 1 for all edges, we have shown that theproblem can be solved by BFS in Θ(n + m) time.

We next discuss algorithms for more general cases.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 3 / 35

SSSP Problem: Positive Edge Weight

We consider the case where G is directed and w(e) ≥ 0 for all e ∈ E. If G isundirected, the algorithm is almost identical.

General Description:

Each vertex u ∈ V has a variable d[u], which is an upper bound of δ(s, u).

During the execution, we keep a set S ⊆ V.

For each u ∈ S, d[u] = δ(s, u) has been computed. Initially S contains sonly and d[s] = δ(s, s) = 0.

The vertices in V − S are stored in a priority queue Q. d[u] is the keyvalue for Q.

In each iteration, the vertex in Q with min d[u] value is included into S.

For vertex v ∈ Q where u→ v ∈ E, d[v] is updated.

When Q is empty, the algorithm stops.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 4 / 35

SSSP Problem: Positive Edge Weight

We consider the case where G is directed and w(e) ≥ 0 for all e ∈ E. If G isundirected, the algorithm is almost identical.

General Description:

Each vertex u ∈ V has a variable d[u], which is an upper bound of δ(s, u).

During the execution, we keep a set S ⊆ V.

For each u ∈ S, d[u] = δ(s, u) has been computed. Initially S contains sonly and d[s] = δ(s, s) = 0.

The vertices in V − S are stored in a priority queue Q. d[u] is the keyvalue for Q.

In each iteration, the vertex in Q with min d[u] value is included into S.

For vertex v ∈ Q where u→ v ∈ E, d[v] is updated.

When Q is empty, the algorithm stops.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 4 / 35

SSSP Problem: Positive Edge Weight

We consider the case where G is directed and w(e) ≥ 0 for all e ∈ E. If G isundirected, the algorithm is almost identical.

General Description:

Each vertex u ∈ V has a variable d[u], which is an upper bound of δ(s, u).

During the execution, we keep a set S ⊆ V.

For each u ∈ S, d[u] = δ(s, u) has been computed. Initially S contains sonly and d[s] = δ(s, s) = 0.

The vertices in V − S are stored in a priority queue Q. d[u] is the keyvalue for Q.

In each iteration, the vertex in Q with min d[u] value is included into S.

For vertex v ∈ Q where u→ v ∈ E, d[v] is updated.

When Q is empty, the algorithm stops.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 4 / 35

SSSP Problem: Positive Edge Weight

We consider the case where G is directed and w(e) ≥ 0 for all e ∈ E. If G isundirected, the algorithm is almost identical.

General Description:

Each vertex u ∈ V has a variable d[u], which is an upper bound of δ(s, u).

During the execution, we keep a set S ⊆ V.

For each u ∈ S, d[u] = δ(s, u) has been computed. Initially S contains sonly and d[s] = δ(s, s) = 0.

The vertices in V − S are stored in a priority queue Q. d[u] is the keyvalue for Q.

In each iteration, the vertex in Q with min d[u] value is included into S.

For vertex v ∈ Q where u→ v ∈ E, d[v] is updated.

When Q is empty, the algorithm stops.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 4 / 35

SSSP Problem: Positive Edge Weight

We consider the case where G is directed and w(e) ≥ 0 for all e ∈ E. If G isundirected, the algorithm is almost identical.

General Description:

Each vertex u ∈ V has a variable d[u], which is an upper bound of δ(s, u).

During the execution, we keep a set S ⊆ V.

For each u ∈ S, d[u] = δ(s, u) has been computed. Initially S contains sonly and d[s] = δ(s, s) = 0.

The vertices in V − S are stored in a priority queue Q. d[u] is the keyvalue for Q.

In each iteration, the vertex in Q with min d[u] value is included into S.

For vertex v ∈ Q where u→ v ∈ E, d[v] is updated.

When Q is empty, the algorithm stops.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 4 / 35

SSSP Problem: Positive Edge Weight

We consider the case where G is directed and w(e) ≥ 0 for all e ∈ E. If G isundirected, the algorithm is almost identical.

General Description:

Each vertex u ∈ V has a variable d[u], which is an upper bound of δ(s, u).

During the execution, we keep a set S ⊆ V.

For each u ∈ S, d[u] = δ(s, u) has been computed. Initially S contains sonly and d[s] = δ(s, s) = 0.

The vertices in V − S are stored in a priority queue Q. d[u] is the keyvalue for Q.

In each iteration, the vertex in Q with min d[u] value is included into S.

For vertex v ∈ Q where u→ v ∈ E, d[v] is updated.

When Q is empty, the algorithm stops.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 4 / 35

SSSP Problem: Positive Edge Weight

We consider the case where G is directed and w(e) ≥ 0 for all e ∈ E. If G isundirected, the algorithm is almost identical.

General Description:

Each vertex u ∈ V has a variable d[u], which is an upper bound of δ(s, u).

During the execution, we keep a set S ⊆ V.

For each u ∈ S, d[u] = δ(s, u) has been computed. Initially S contains sonly and d[s] = δ(s, s) = 0.

The vertices in V − S are stored in a priority queue Q. d[u] is the keyvalue for Q.

In each iteration, the vertex in Q with min d[u] value is included into S.

For vertex v ∈ Q where u→ v ∈ E, d[v] is updated.

When Q is empty, the algorithm stops.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 4 / 35

SSSP Problem: Positive Edge Weight

We consider the case where G is directed and w(e) ≥ 0 for all e ∈ E. If G isundirected, the algorithm is almost identical.

General Description:

Each vertex u ∈ V has a variable d[u], which is an upper bound of δ(s, u).

During the execution, we keep a set S ⊆ V.

For each u ∈ S, d[u] = δ(s, u) has been computed. Initially S contains sonly and d[s] = δ(s, s) = 0.

The vertices in V − S are stored in a priority queue Q. d[u] is the keyvalue for Q.

In each iteration, the vertex in Q with min d[u] value is included into S.

For vertex v ∈ Q where u→ v ∈ E, d[v] is updated.

When Q is empty, the algorithm stops.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 4 / 35

Priority Queue

To implement the algorithm, we need a data structure.

Priority QueueA Priority Queue is a data structure Q. It consists of a set of items. Each itemhas a key. The data structure supports the following operations.

Insert(Q, x): insert an item x into Q.

Extract-Min(Q): remove and return the item with minimum key value.

Min(Q): return the item with minimum key value.

Decrease-Key(Q, x, k): decrease the key value of an item x to k.

By using a Heap data structure, priority queue can be implemented so that:

Min(Q) takes O(1) time.

All other three operations take O(log n) time (n is the number of items inQ.)

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 5 / 35

Priority Queue

To implement the algorithm, we need a data structure.

Priority QueueA Priority Queue is a data structure Q. It consists of a set of items. Each itemhas a key. The data structure supports the following operations.

Insert(Q, x): insert an item x into Q.

Extract-Min(Q): remove and return the item with minimum key value.

Min(Q): return the item with minimum key value.

Decrease-Key(Q, x, k): decrease the key value of an item x to k.

By using a Heap data structure, priority queue can be implemented so that:

Min(Q) takes O(1) time.

All other three operations take O(log n) time (n is the number of items inQ.)

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 5 / 35

Priority Queue

To implement the algorithm, we need a data structure.

Priority QueueA Priority Queue is a data structure Q. It consists of a set of items. Each itemhas a key. The data structure supports the following operations.

Insert(Q, x): insert an item x into Q.

Extract-Min(Q): remove and return the item with minimum key value.

Min(Q): return the item with minimum key value.

Decrease-Key(Q, x, k): decrease the key value of an item x to k.

By using a Heap data structure, priority queue can be implemented so that:

Min(Q) takes O(1) time.

All other three operations take O(log n) time (n is the number of items inQ.)

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 5 / 35

Outline

1 Single Source Shortest Path Problem

2 Dijkstra’s Algorithm

3 Bellman-Ford Algorithm

4 All Pairs Shortest Path (APSP) Problem

5 Floyd-Warshall Algorithm

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 6 / 35

Dijkstra’s Algorithm

Main Data structures:

G: Adjacency List Representation.

For Each vertex u ∈ V:

Adj[u]: the adjacency list for ud[u]: An upper bound for δ(s, u)π[u]: indicates the first vertex after u in the shortest s→ u path

S: A set that holds the finished vertices.

Q: A priority queue that holds the vertices not in S.

Initialize(G, s)

1 for each u ∈ V do

2 d[u] =∞; π[u] = NIL;

3 d[s] = 0

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 7 / 35

Dijkstra’s Algorithm

Main Data structures:

G: Adjacency List Representation.

For Each vertex u ∈ V:

Adj[u]: the adjacency list for ud[u]: An upper bound for δ(s, u)π[u]: indicates the first vertex after u in the shortest s→ u path

S: A set that holds the finished vertices.

Q: A priority queue that holds the vertices not in S.

Initialize(G, s)

1 for each u ∈ V do

2 d[u] =∞; π[u] = NIL;

3 d[s] = 0

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 7 / 35

Dijkstra’s Algorithm

Main Data structures:

G: Adjacency List Representation.

For Each vertex u ∈ V:

Adj[u]: the adjacency list for ud[u]: An upper bound for δ(s, u)π[u]: indicates the first vertex after u in the shortest s→ u path

S: A set that holds the finished vertices.

Q: A priority queue that holds the vertices not in S.

Initialize(G, s)

1 for each u ∈ V do

2 d[u] =∞; π[u] = NIL;

3 d[s] = 0

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 7 / 35

Dijkstra’s Algorithm

Main Data structures:

G: Adjacency List Representation.

For Each vertex u ∈ V:

Adj[u]: the adjacency list for ud[u]: An upper bound for δ(s, u)π[u]: indicates the first vertex after u in the shortest s→ u path

S: A set that holds the finished vertices.

Q: A priority queue that holds the vertices not in S.

Initialize(G, s)

1 for each u ∈ V do

2 d[u] =∞; π[u] = NIL;

3 d[s] = 0

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 7 / 35

Dijkstra’s Algorithm

Relax(u, v,w(∗))1 if d[v] > d[u] + w(u→ v) do

2 d[v] = d[u] + w(u→ v)

3 π[v] = u

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 8 / 35

Dijkstra’s Algorithm

Dijkstra(G, s,w(∗))1 Initialize(G, s)

2 S← ∅3 Q← V

4 while Q 6= ∅ do

5 u← Extract-Min(Q)

6 S← S ∪ {u}7 for each v ∈ Adj[u] do

8 Relax(u, v,w(∗))9 end for

10 end while

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 9 / 35

Dijkstra’s Algorithm: Example

101

2 39

6

25 4

7

101

2 39

6

25 4

7

101

2 39

6

25 4

7

101

2 39

6

25 4

7

101

2 39

6

25 4

7

101

2 39

6

25 4

7

0s

5

8

7

9

vertices in S π value

0s

5

8

7

9

9 d value

0s

88

88

0s

88

10

5

0s

5

8 14

7

0s

5

8

7

13

The number inside each circle indicates the d value.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 10 / 35

Dijkstra’s Algorithm: Analysis

Initialize: Θ(n)

Relax This is actually the decrease-key operation of the priority queue,which takes O(log n) time.

Line 1: Θ(n)

Line 2: Initialize an empty set takes O(1) time.

Line 3: Insert n items into Q, Θ(n) time.

Line 4: While loop (not counting the time for the for loop, lines 7-9):

The loop iterates n times. (Q has n items in it initially. Each iterationremoves one item from Q. Nothing is added into it. The loop stopswhen Q is empty.)In the loop body, Extract-Min takes O(log n) time. The line 6 takesO(1) time.Thus the total run time of the while loop (not counting lines 7-9) isO(n log n).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 11 / 35

Dijkstra’s Algorithm: Analysis

The total runtime of the lines 7-9:Each entry in Adj[u] is processed once.When it is processed, we call Relax once.Thus the processing of each entry takes O(log n) time.There are a total of Θ(m) entries in all Adj[u]’s (m is the number ofedges in G).So the the total run time for lines 7-9 is: O(m log n) time.

Since this term (O(m log n)) dominates all other terms, the wholealgorithm takes O(m log n) time.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 12 / 35

Dijkstra’s Algorithm: Analysis

The total runtime of the lines 7-9:Each entry in Adj[u] is processed once.When it is processed, we call Relax once.Thus the processing of each entry takes O(log n) time.There are a total of Θ(m) entries in all Adj[u]’s (m is the number ofedges in G).So the the total run time for lines 7-9 is: O(m log n) time.

Since this term (O(m log n)) dominates all other terms, the wholealgorithm takes O(m log n) time.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 12 / 35

SSSP Problem: Negative Edge Weight

If the edge weight of G can be negative, Dijkstra’s algorithm doesn’t work:

−23

2 2

−2

3

−22

3

2

−2

3

3 3 35 5

0s

88

0s

3

0s

2

3

0s

3

2

vertices in S π value 9 d value

8 8 1

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 13 / 35

Outline

1 Single Source Shortest Path Problem

2 Dijkstra’s Algorithm

3 Bellman-Ford Algorithm

4 All Pairs Shortest Path (APSP) Problem

5 Floyd-Warshall Algorithm

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 14 / 35

Bellman-Ford AlgorithmBellman-Ford(G, s,w(∗))

1 Initialize(G, s)

2 for i = 1 to n do

3 for each e = (u, v) ∈ E do

4 Relax(u, v,w(∗))5 for each e = (u, v) ∈ E do

6 if d[v] > d[u] + w(u→ v) output “G has a negative cycle”

7 d[u] is the length of the shortest s→ u path for each u ∈ V

Analysis:

This time, we don’t need Extract-Min operation. So we don’t needpriority queue anymore. Relax now takes O(1) time.

The loop iterates n · m times. The loop body takes O(1) time. Thus thealgorithm takes Θ(nm) time.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 15 / 35

Bellman-Ford AlgorithmBellman-Ford(G, s,w(∗))

1 Initialize(G, s)

2 for i = 1 to n do

3 for each e = (u, v) ∈ E do

4 Relax(u, v,w(∗))5 for each e = (u, v) ∈ E do

6 if d[v] > d[u] + w(u→ v) output “G has a negative cycle”

7 d[u] is the length of the shortest s→ u path for each u ∈ V

Analysis:

This time, we don’t need Extract-Min operation. So we don’t needpriority queue anymore. Relax now takes O(1) time.

The loop iterates n · m times. The loop body takes O(1) time. Thus thealgorithm takes Θ(nm) time.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 15 / 35

Bellman-Ford AlgorithmBellman-Ford(G, s,w(∗))

1 Initialize(G, s)

2 for i = 1 to n do

3 for each e = (u, v) ∈ E do

4 Relax(u, v,w(∗))5 for each e = (u, v) ∈ E do

6 if d[v] > d[u] + w(u→ v) output “G has a negative cycle”

7 d[u] is the length of the shortest s→ u path for each u ∈ V

Analysis:

This time, we don’t need Extract-Min operation. So we don’t needpriority queue anymore. Relax now takes O(1) time.

The loop iterates n · m times. The loop body takes O(1) time. Thus thealgorithm takes Θ(nm) time.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 15 / 35

Bellman-Ford Algorithm: Correctness Proof

Why Bellman-Ford algorithm works?

Path-Relaxation PropertyLet G = (V,E) be a directed graph with edge weight function w(∗) and thestarting vertex s. Consider any shortest path P = 〈v0, v1, . . . , vk〉 from s = v0 toa vertex vk. If G is initialized by Initialize(G, s) and then a sequence ofrelaxation steps occurs that includes, in order, relaxations of the edgesv0 → v1, v1 → v2, . . . , vk−1 → vk, then d[vk] = δ(s, vk) after these relaxationsand at all times afterward. This property holds no matter what other edgerelaxations occur.

Proof: We show by induction that after the ith edge of path P is relaxed, wehave d[vi] = δ(s, vi).Base case i = 0: Before any edges of P have been relaxed, from theInitialization, we have d[v0] = d[s] = 0 = δ(s, s). Because the relaxation neverincreases the d[∗] value, d[s] = 0 always holds. So the statement is true forthe base case.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 16 / 35

Bellman-Ford Algorithm: Correctness Proof

Why Bellman-Ford algorithm works?

Path-Relaxation PropertyLet G = (V,E) be a directed graph with edge weight function w(∗) and thestarting vertex s. Consider any shortest path P = 〈v0, v1, . . . , vk〉 from s = v0 toa vertex vk. If G is initialized by Initialize(G, s) and then a sequence ofrelaxation steps occurs that includes, in order, relaxations of the edgesv0 → v1, v1 → v2, . . . , vk−1 → vk, then d[vk] = δ(s, vk) after these relaxationsand at all times afterward. This property holds no matter what other edgerelaxations occur.

Proof: We show by induction that after the ith edge of path P is relaxed, wehave d[vi] = δ(s, vi).

Base case i = 0: Before any edges of P have been relaxed, from theInitialization, we have d[v0] = d[s] = 0 = δ(s, s). Because the relaxation neverincreases the d[∗] value, d[s] = 0 always holds. So the statement is true forthe base case.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 16 / 35

Bellman-Ford Algorithm: Correctness Proof

Why Bellman-Ford algorithm works?

Path-Relaxation PropertyLet G = (V,E) be a directed graph with edge weight function w(∗) and thestarting vertex s. Consider any shortest path P = 〈v0, v1, . . . , vk〉 from s = v0 toa vertex vk. If G is initialized by Initialize(G, s) and then a sequence ofrelaxation steps occurs that includes, in order, relaxations of the edgesv0 → v1, v1 → v2, . . . , vk−1 → vk, then d[vk] = δ(s, vk) after these relaxationsand at all times afterward. This property holds no matter what other edgerelaxations occur.

Proof: We show by induction that after the ith edge of path P is relaxed, wehave d[vi] = δ(s, vi).Base case i = 0: Before any edges of P have been relaxed, from theInitialization, we have d[v0] = d[s] = 0 = δ(s, s). Because the relaxation neverincreases the d[∗] value, d[s] = 0 always holds. So the statement is true forthe base case.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 16 / 35

Bellman-Ford Algorithm: Correctness Proof

Proof (continued):Induction step: Assume d[vi−1] = δ(s, vi−1), and we examine the relaxation ofthe edge vi−1 → vi. Because P is the shortest s→ vi path, after the relaxationof the edge vi−1 → vi, we will have d[vi] = δ(s, vi). Again, because relaxationnever increases d[∗] value, d[vi] = δ(s, vi) remains valid afterward.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 17 / 35

Bellman-Ford Algorithm: Correctness Proof

Lemma 24.2Let G = (V,E) be an directed graph with edge weight function w(∗) and thestarting vertex s. Assuming G has no negative-weight cycles. Then after|V| − 1 iterations of the for loop of Bellman-Ford algorithm, we haved[v] = δ(s, v) for all vertices in v that are reachable from s.

Proof: Consider any vertex v that is reachable from s. Let P = 〈v0, v1, . . . , vk〉be the shortest path from s = v0 to v = vk. Because G has no negative-weightcycles, P contains no cycles. Thus P has at most |V| − 1 edges, namelyk ≤ |V| − 1.Each of the |V| − 1 iterations of the for loop relaxes all |E| edges. Among theedges relaxed in the ith iteration is the edge vi−1 → vi. According to thePath-Relaxation Property, d[v] = d[vk] = δ(s, vk) = δ(s, v).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 18 / 35

Bellman-Ford Algorithm: Correctness Proof

Lemma 24.2Let G = (V,E) be an directed graph with edge weight function w(∗) and thestarting vertex s. Assuming G has no negative-weight cycles. Then after|V| − 1 iterations of the for loop of Bellman-Ford algorithm, we haved[v] = δ(s, v) for all vertices in v that are reachable from s.

Proof: Consider any vertex v that is reachable from s. Let P = 〈v0, v1, . . . , vk〉be the shortest path from s = v0 to v = vk. Because G has no negative-weightcycles, P contains no cycles. Thus P has at most |V| − 1 edges, namelyk ≤ |V| − 1.

Each of the |V| − 1 iterations of the for loop relaxes all |E| edges. Among theedges relaxed in the ith iteration is the edge vi−1 → vi. According to thePath-Relaxation Property, d[v] = d[vk] = δ(s, vk) = δ(s, v).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 18 / 35

Bellman-Ford Algorithm: Correctness Proof

Lemma 24.2Let G = (V,E) be an directed graph with edge weight function w(∗) and thestarting vertex s. Assuming G has no negative-weight cycles. Then after|V| − 1 iterations of the for loop of Bellman-Ford algorithm, we haved[v] = δ(s, v) for all vertices in v that are reachable from s.

Proof: Consider any vertex v that is reachable from s. Let P = 〈v0, v1, . . . , vk〉be the shortest path from s = v0 to v = vk. Because G has no negative-weightcycles, P contains no cycles. Thus P has at most |V| − 1 edges, namelyk ≤ |V| − 1.Each of the |V| − 1 iterations of the for loop relaxes all |E| edges. Among theedges relaxed in the ith iteration is the edge vi−1 → vi. According to thePath-Relaxation Property, d[v] = d[vk] = δ(s, vk) = δ(s, v).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 18 / 35

Bellman-Ford Algorithm: Correctness Proof

Correctness of Bellman-Ford Algorithm:

If G contains no negative cycle, then by Lemma 24.2, the algorithmcomputes δ(s, v) for all v reachable from s.

If G has a negative-weight cycle C that is reachable from s, then for anyvertex v on C, δ(s, v) = −∞. So the condition of the if statement at line 6will be true for such vertex v. The algorithm will correctly output “G hasa negative cycle”.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 19 / 35

Bellman-Ford Algorithm: Correctness Proof

Correctness of Bellman-Ford Algorithm:

If G contains no negative cycle, then by Lemma 24.2, the algorithmcomputes δ(s, v) for all v reachable from s.

If G has a negative-weight cycle C that is reachable from s, then for anyvertex v on C, δ(s, v) = −∞. So the condition of the if statement at line 6will be true for such vertex v. The algorithm will correctly output “G hasa negative cycle”.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 19 / 35

Outline

1 Single Source Shortest Path Problem

2 Dijkstra’s Algorithm

3 Bellman-Ford Algorithm

4 All Pairs Shortest Path (APSP) Problem

5 Floyd-Warshall Algorithm

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 20 / 35

All Pairs Shortest Path (APSP) Problem

All Pairs Shortest Path (APSP) ProblemInput: A directed graph G = (V,E) and a weight function w : E → R.Output: for each pair u, v ∈ V, find δ(u, v) = the length of the shortest pathfrom u to v, and the shortest u→ v path.

If w(e) = 1 for all e ∈ E:Call BFS n times, once for each vertex u.Total runtime: Θ(n(n + m)).

If w(e) ≥ 0 for all e ∈ E:Call Dijkstra’s algorithm n times, once for each vertex u.Total runtime: Θ(nm log n).

If w(e) can be negative:Call Bellman-Ford algorithm n times, once for each vertex u.Total runtime: Θ(n2m). Since m = Θ(n2) in the worst case, theruntime can be Θ(n4).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 21 / 35

All Pairs Shortest Path (APSP) Problem

All Pairs Shortest Path (APSP) ProblemInput: A directed graph G = (V,E) and a weight function w : E → R.Output: for each pair u, v ∈ V, find δ(u, v) = the length of the shortest pathfrom u to v, and the shortest u→ v path.

If w(e) = 1 for all e ∈ E:Call BFS n times, once for each vertex u.Total runtime: Θ(n(n + m)).

If w(e) ≥ 0 for all e ∈ E:Call Dijkstra’s algorithm n times, once for each vertex u.Total runtime: Θ(nm log n).

If w(e) can be negative:Call Bellman-Ford algorithm n times, once for each vertex u.Total runtime: Θ(n2m). Since m = Θ(n2) in the worst case, theruntime can be Θ(n4).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 21 / 35

All Pairs Shortest Path (APSP) Problem

All Pairs Shortest Path (APSP) ProblemInput: A directed graph G = (V,E) and a weight function w : E → R.Output: for each pair u, v ∈ V, find δ(u, v) = the length of the shortest pathfrom u to v, and the shortest u→ v path.

If w(e) = 1 for all e ∈ E:Call BFS n times, once for each vertex u.Total runtime: Θ(n(n + m)).

If w(e) ≥ 0 for all e ∈ E:Call Dijkstra’s algorithm n times, once for each vertex u.Total runtime: Θ(nm log n).

If w(e) can be negative:Call Bellman-Ford algorithm n times, once for each vertex u.Total runtime: Θ(n2m). Since m = Θ(n2) in the worst case, theruntime can be Θ(n4).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 21 / 35

All Pairs Shortest Path (APSP) Problem

All Pairs Shortest Path (APSP) ProblemInput: A directed graph G = (V,E) and a weight function w : E → R.Output: for each pair u, v ∈ V, find δ(u, v) = the length of the shortest pathfrom u to v, and the shortest u→ v path.

If w(e) = 1 for all e ∈ E:Call BFS n times, once for each vertex u.Total runtime: Θ(n(n + m)).

If w(e) ≥ 0 for all e ∈ E:Call Dijkstra’s algorithm n times, once for each vertex u.Total runtime: Θ(nm log n).

If w(e) can be negative:Call Bellman-Ford algorithm n times, once for each vertex u.Total runtime: Θ(n2m). Since m = Θ(n2) in the worst case, theruntime can be Θ(n4).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 21 / 35

APSP Problem: Negative Edge weight

We will try to improve the algorithm for the last case.

Since we need to compute δ(u, v) for all u, v ∈ V, we will use adjacencymatrix representation for G.

Let w[1..n, 1..n] be a 2D array:

w[i, j] = wij =

0 if i = jw[i, j] if i 6= j and i→ j ∈ E∞ if i 6= j and i→ j 6∈ E

We want to compute an array D[1..n, 1..n] such that:

D[i, j] = dij = δ(i, j)

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 22 / 35

APSP Problem: Negative Edge weight

We will try to improve the algorithm for the last case.

Since we need to compute δ(u, v) for all u, v ∈ V, we will use adjacencymatrix representation for G.

Let w[1..n, 1..n] be a 2D array:

w[i, j] = wij =

0 if i = jw[i, j] if i 6= j and i→ j ∈ E∞ if i 6= j and i→ j 6∈ E

We want to compute an array D[1..n, 1..n] such that:

D[i, j] = dij = δ(i, j)

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 22 / 35

APSP Problem: Negative Edge weight

We will try to improve the algorithm for the last case.

Since we need to compute δ(u, v) for all u, v ∈ V, we will use adjacencymatrix representation for G.

Let w[1..n, 1..n] be a 2D array:

w[i, j] = wij =

0 if i = jw[i, j] if i 6= j and i→ j ∈ E∞ if i 6= j and i→ j 6∈ E

We want to compute an array D[1..n, 1..n] such that:

D[i, j] = dij = δ(i, j)

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 22 / 35

APSP Problem: Negative Edge weight

We assume G contains no negative cycles for now.

Define: d(t)ij = the length of the shortest i→ j path that contains at most t

edges.

Then:

d(0)ij =

{0 if i = j∞ if i 6= j

d(1)ij = w[i, j] =

0 if i = jw[i, j] if i 6= j and i→ j ∈ E∞ if i 6= j and i→ j 6∈ E

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 23 / 35

APSP Problem: Negative Edge weight

We assume G contains no negative cycles for now.Define: d(t)

ij = the length of the shortest i→ j path that contains at most tedges.

Then:

d(0)ij =

{0 if i = j∞ if i 6= j

d(1)ij = w[i, j] =

0 if i = jw[i, j] if i 6= j and i→ j ∈ E∞ if i 6= j and i→ j 6∈ E

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 23 / 35

APSP Problem: Negative Edge weight

We assume G contains no negative cycles for now.Define: d(t)

ij = the length of the shortest i→ j path that contains at most tedges.

Then:

d(0)ij =

{0 if i = j∞ if i 6= j

d(1)ij = w[i, j] =

0 if i = jw[i, j] if i 6= j and i→ j ∈ E∞ if i 6= j and i→ j 6∈ E

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 23 / 35

APSP Problem: Negative Edge weight

We assume G contains no negative cycles for now.Define: d(t)

ij = the length of the shortest i→ j path that contains at most tedges.

Then:

d(0)ij =

{0 if i = j∞ if i 6= j

d(1)ij = w[i, j] =

0 if i = jw[i, j] if i 6= j and i→ j ∈ E∞ if i 6= j and i→ j 6∈ E

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 23 / 35

APSP Problem: Negative Edge weight

Since G contains no negative cycles, we have:

δ(i, j) = d(n−1)ij = d(n)

ij = d(n+1)ij . . .

This is because:

If the shortest i→ j path P contains ≥ n edges, it must contains a cycleC (since G has only n vertices).

Since G has no negative cycles, we can delete C from P, withoutincreasing the length, to get another i→ j path P′ with fewer edges.

So the shortest path contains at most n− 1 edges.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 24 / 35

APSP Problem: Negative Edge weight

Since G contains no negative cycles, we have:

δ(i, j) = d(n−1)ij = d(n)

ij = d(n+1)ij . . .

This is because:

If the shortest i→ j path P contains ≥ n edges, it must contains a cycleC (since G has only n vertices).

Since G has no negative cycles, we can delete C from P, withoutincreasing the length, to get another i→ j path P′ with fewer edges.

So the shortest path contains at most n− 1 edges.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 24 / 35

APSP Problem: Negative Edge weight

Since G contains no negative cycles, we have:

δ(i, j) = d(n−1)ij = d(n)

ij = d(n+1)ij . . .

This is because:

If the shortest i→ j path P contains ≥ n edges, it must contains a cycleC (since G has only n vertices).

Since G has no negative cycles, we can delete C from P, withoutincreasing the length, to get another i→ j path P′ with fewer edges.

So the shortest path contains at most n− 1 edges.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 24 / 35

APSP Problem: Negative Edge weight

All we have to do is to compute d(n−1)ij .

We need to find a recursive formula for d(n−1)ij :

d(t)ij = min{d(t−1)

ij︸ ︷︷ ︸(1)

,min{ d(t−1)ik + W[k, j] | 1 ≤ k ≤ n}︸ ︷︷ ︸

(2)

}

Case (1) The shortest i→ j path actually only contains t − 1 edges, so itslength is d(t−1)

ij .

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 25 / 35

APSP Problem: Negative Edge weight

All we have to do is to compute d(n−1)ij .

We need to find a recursive formula for d(n−1)ij :

d(t)ij = min{d(t−1)

ij︸ ︷︷ ︸(1)

,min{ d(t−1)ik + W[k, j] | 1 ≤ k ≤ n}︸ ︷︷ ︸

(2)

}

Case (1) The shortest i→ j path actually only contains t − 1 edges, so itslength is d(t−1)

ij .

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 25 / 35

APSP Problem: Negative Edge weight

All we have to do is to compute d(n−1)ij .

We need to find a recursive formula for d(n−1)ij :

d(t)ij = min{d(t−1)

ij︸ ︷︷ ︸(1)

,min{ d(t−1)ik + W[k, j] | 1 ≤ k ≤ n}︸ ︷︷ ︸

(2)

}

Case (1) The shortest i→ j path actually only contains t − 1 edges, so itslength is d(t−1)

ij .

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 25 / 35

APSP Problem: Negative Edge weight

All we have to do is to compute d(n−1)ij .

We need to find a recursive formula for d(n−1)ij :

d(t)ij = min{d(t−1)

ij︸ ︷︷ ︸(1)

,min{ d(t−1)ik + W[k, j] | 1 ≤ k ≤ n}︸ ︷︷ ︸

(2)

}

Case (1) The shortest i→ j path actually only contains t − 1 edges, so itslength is d(t−1)

ij .

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 25 / 35

APSP Problem: Negative Edge weight

Case (2) The shortest i→ j path P contains t edges, Let k be the vertex on Pright before reaching j.The weight of the last edge is W[k, j]. The portion P′ of P from i to k is theshortest i→ k path containing at most t − 1 edges. The length of P′ is d(t−1)

ik .(Do you realize that this is the Optical Substructure Property for this problem?We are using dynamic programming!)

i jk

W[k,j]

d(t−1)ik

The term (1) can be re-written as d(t−1)ij + 0 = d(t−1)

ij + W[j, j]. It can beincluded into the term (2). Thus:

d(t)ij = min

1≤k≤n{ d(t−1)

ik + W[k, j]}

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 26 / 35

APSP Problem: Negative Edge weight

Case (2) The shortest i→ j path P contains t edges, Let k be the vertex on Pright before reaching j.The weight of the last edge is W[k, j]. The portion P′ of P from i to k is theshortest i→ k path containing at most t − 1 edges. The length of P′ is d(t−1)

ik .(Do you realize that this is the Optical Substructure Property for this problem?We are using dynamic programming!)

i jk

W[k,j]

d(t−1)ik

The term (1) can be re-written as d(t−1)ij + 0 = d(t−1)

ij + W[j, j]. It can beincluded into the term (2). Thus:

d(t)ij = min

1≤k≤n{ d(t−1)

ik + W[k, j]}

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 26 / 35

APSP Problem: Negative Edge weight

For t = 1, 2, . . . define:D(t) = (d(t)

ij )1≤i,j≤n

Then D(1) = (d(1)ij )1≤i,j≤n = W[1..n, 1..n] = the input adjacency matrix.

Matrix Operator ⊗Let A = (aij) and B = (bij) be two n× n matrices. Define:

C = (cij) = A⊗ B

wherecij = min

1≤k≤n{aik + bkj}

It is easy to see:

D(t) = D(t−1) ⊗W

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 27 / 35

APSP Problem: Negative Edge weight

For t = 1, 2, . . . define:D(t) = (d(t)

ij )1≤i,j≤n

Then D(1) = (d(1)ij )1≤i,j≤n = W[1..n, 1..n] = the input adjacency matrix.

Matrix Operator ⊗Let A = (aij) and B = (bij) be two n× n matrices. Define:

C = (cij) = A⊗ B

wherecij = min

1≤k≤n{aik + bkj}

It is easy to see:

D(t) = D(t−1) ⊗W

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 27 / 35

APSP Problem: Negative Edge weight

For t = 1, 2, . . . define:D(t) = (d(t)

ij )1≤i,j≤n

Then D(1) = (d(1)ij )1≤i,j≤n = W[1..n, 1..n] = the input adjacency matrix.

Matrix Operator ⊗Let A = (aij) and B = (bij) be two n× n matrices. Define:

C = (cij) = A⊗ B

wherecij = min

1≤k≤n{aik + bkj}

It is easy to see:

D(t) = D(t−1) ⊗W

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 27 / 35

APSP Problem: Negative Edge weight

Observations

If G has no negative cycles, then D(n−1) = D(n) = D(n+1) . . .

If D(n−1) = D(n), then D(n+1) = D(n) ⊗W = D(n−1) ⊗W = D(n). Thus:D(n−1) = D(n) = D(n+1) . . .

If G has a negative cycle C, let i, j be two vertices in on C. Thenδ(i, j) = −∞. Therefore, D(t)

ij will go to −∞ when t→∞. Thus, in thiscase D(n−1) 6= D(n).

Hence G contains a negative cycle if and only if D(n−1) 6= D(n).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 28 / 35

APSP Problem: Negative Edge weight

Observations

If G has no negative cycles, then D(n−1) = D(n) = D(n+1) . . .

If D(n−1) = D(n), then D(n+1) = D(n) ⊗W = D(n−1) ⊗W = D(n). Thus:D(n−1) = D(n) = D(n+1) . . .

If G has a negative cycle C, let i, j be two vertices in on C. Thenδ(i, j) = −∞. Therefore, D(t)

ij will go to −∞ when t→∞. Thus, in thiscase D(n−1) 6= D(n).

Hence G contains a negative cycle if and only if D(n−1) 6= D(n).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 28 / 35

APSP Problem: Negative Edge weight

Observations

If G has no negative cycles, then D(n−1) = D(n) = D(n+1) . . .

If D(n−1) = D(n), then D(n+1) = D(n) ⊗W = D(n−1) ⊗W = D(n). Thus:D(n−1) = D(n) = D(n+1) . . .

If G has a negative cycle C, let i, j be two vertices in on C. Thenδ(i, j) = −∞. Therefore, D(t)

ij will go to −∞ when t→∞. Thus, in thiscase D(n−1) 6= D(n).

Hence G contains a negative cycle if and only if D(n−1) 6= D(n).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 28 / 35

APSP Problem: Negative Edge weight

Observations

If G has no negative cycles, then D(n−1) = D(n) = D(n+1) . . .

If D(n−1) = D(n), then D(n+1) = D(n) ⊗W = D(n−1) ⊗W = D(n). Thus:D(n−1) = D(n) = D(n+1) . . .

If G has a negative cycle C, let i, j be two vertices in on C. Thenδ(i, j) = −∞. Therefore, D(t)

ij will go to −∞ when t→∞. Thus, in thiscase D(n−1) 6= D(n).

Hence G contains a negative cycle if and only if D(n−1) 6= D(n).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 28 / 35

APSP Problem: Negative Edge weight

SimpleAPSA(W) (W is the input adjacency matrix.)

1 D(1) = W

2 for t = 2 to n do

3 compute D(t) = D(t−1) ⊗W

4 if D(n−1) = D(n) output solution matrix D(n−1)

5 else output “G contains negative cycles”

Analysis:

⊗ takes Θ(n3) time.

The loop iterates n times. So total runtime is Θ(n4).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 29 / 35

APSP Problem: Negative Edge weight

SimpleAPSA(W) (W is the input adjacency matrix.)

1 D(1) = W

2 for t = 2 to n do

3 compute D(t) = D(t−1) ⊗W

4 if D(n−1) = D(n) output solution matrix D(n−1)

5 else output “G contains negative cycles”

Analysis:

⊗ takes Θ(n3) time.

The loop iterates n times. So total runtime is Θ(n4).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 29 / 35

APSP Problem: Negative Edge weight

SimpleAPSA(W) (W is the input adjacency matrix.)

1 D(1) = W

2 for t = 2 to n do

3 compute D(t) = D(t−1) ⊗W

4 if D(n−1) = D(n) output solution matrix D(n−1)

5 else output “G contains negative cycles”

Analysis:

⊗ takes Θ(n3) time.

The loop iterates n times. So total runtime is Θ(n4).

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 29 / 35

APSP Problem

We can do better than this. It can be shown ⊗ is associative. Namely:

A⊗ (B⊗ C) = (A⊗ B)⊗ C

So we can compute D(t) by repeated squaring: D(2) = D(1) ⊗ D(1),D(4) = D(2) ⊗ D(2), D(8) = D(4) ⊗ D(4) . . .

FasterAPSA(W)

1 k = dlog2(n− 1)e (k is the smallest integer such that 2k ≥ (n− 1).)

2 Compute D(2), D(4), D(8) . . .D(2k), D(2k+1) by repeated squaring.

3 if D(2k) = D(2k+1) output solution matrix D(2k)

4 else output “G contains negative cycles”

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 30 / 35

APSP Problem

We can do better than this. It can be shown ⊗ is associative. Namely:

A⊗ (B⊗ C) = (A⊗ B)⊗ C

So we can compute D(t) by repeated squaring: D(2) = D(1) ⊗ D(1),D(4) = D(2) ⊗ D(2), D(8) = D(4) ⊗ D(4) . . .

FasterAPSA(W)

1 k = dlog2(n− 1)e (k is the smallest integer such that 2k ≥ (n− 1).)

2 Compute D(2), D(4), D(8) . . .D(2k), D(2k+1) by repeated squaring.

3 if D(2k) = D(2k+1) output solution matrix D(2k)

4 else output “G contains negative cycles”

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 30 / 35

APSP Problem

We can do better than this. It can be shown ⊗ is associative. Namely:

A⊗ (B⊗ C) = (A⊗ B)⊗ C

So we can compute D(t) by repeated squaring: D(2) = D(1) ⊗ D(1),D(4) = D(2) ⊗ D(2), D(8) = D(4) ⊗ D(4) . . .

FasterAPSA(W)

1 k = dlog2(n− 1)e (k is the smallest integer such that 2k ≥ (n− 1).)

2 Compute D(2), D(4), D(8) . . .D(2k), D(2k+1) by repeated squaring.

3 if D(2k) = D(2k+1) output solution matrix D(2k)

4 else output “G contains negative cycles”

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 30 / 35

APSP Problem

Analysis:

D(2k) = D(2k+1) implies D(n−1) = D(n) = · · · = D(2k) = · · · = D(2k+1). So inthis case D(2k) = D(n−1) is the solution matrix.

D(2k) 6= D(2k+1) implies D(n−1) 6= D(n) and hence G has negative cycles.

We call ⊗ k = log2 n times. So this algorithm takes Θ(n3 log n) time.

Can we do better?

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 31 / 35

APSP Problem

Analysis:

D(2k) = D(2k+1) implies D(n−1) = D(n) = · · · = D(2k) = · · · = D(2k+1). So inthis case D(2k) = D(n−1) is the solution matrix.

D(2k) 6= D(2k+1) implies D(n−1) 6= D(n) and hence G has negative cycles.

We call ⊗ k = log2 n times. So this algorithm takes Θ(n3 log n) time.

Can we do better?

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 31 / 35

APSP Problem

Analysis:

D(2k) = D(2k+1) implies D(n−1) = D(n) = · · · = D(2k) = · · · = D(2k+1). So inthis case D(2k) = D(n−1) is the solution matrix.

D(2k) 6= D(2k+1) implies D(n−1) 6= D(n) and hence G has negative cycles.

We call ⊗ k = log2 n times. So this algorithm takes Θ(n3 log n) time.

Can we do better?

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 31 / 35

APSP Problem

Analysis:

D(2k) = D(2k+1) implies D(n−1) = D(n) = · · · = D(2k) = · · · = D(2k+1). So inthis case D(2k) = D(n−1) is the solution matrix.

D(2k) 6= D(2k+1) implies D(n−1) 6= D(n) and hence G has negative cycles.

We call ⊗ k = log2 n times. So this algorithm takes Θ(n3 log n) time.

Can we do better?

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 31 / 35

Outline

1 Single Source Shortest Path Problem

2 Dijkstra’s Algorithm

3 Bellman-Ford Algorithm

4 All Pairs Shortest Path (APSP) Problem

5 Floyd-Warshall Algorithm

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 32 / 35

APSP Problem: Floyd-Warshall Algorithm

We can improve by other ideas. We redefine:

d(t)ij = the length of the shortest i→ j path with

all intermediate vertices in {1, 2, . . . , t}

Thend(0)

ij = W[i, j]

d(0)ij is the length of the shortest i→ j path with all intermediate vertices

in {1, 2, . . . , 0} = ∅. So such path has no any intermediate vertices, andmust be the edge i→ j (if it exists.)

As before, we need to derive a recursive formula.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 33 / 35

APSP Problem: Floyd-Warshall Algorithm

We can improve by other ideas. We redefine:

d(t)ij = the length of the shortest i→ j path with

all intermediate vertices in {1, 2, . . . , t}

Thend(0)

ij = W[i, j]

d(0)ij is the length of the shortest i→ j path with all intermediate vertices

in {1, 2, . . . , 0} = ∅. So such path has no any intermediate vertices, andmust be the edge i→ j (if it exists.)

As before, we need to derive a recursive formula.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 33 / 35

APSP Problem: Floyd-Warshall Algorithm

We can improve by other ideas. We redefine:

d(t)ij = the length of the shortest i→ j path with

all intermediate vertices in {1, 2, . . . , t}

Thend(0)

ij = W[i, j]

d(0)ij is the length of the shortest i→ j path with all intermediate vertices

in {1, 2, . . . , 0} = ∅. So such path has no any intermediate vertices, andmust be the edge i→ j (if it exists.)

As before, we need to derive a recursive formula.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 33 / 35

APSP Problem: Floyd-Warshall Algorithm

We can improve by other ideas. We redefine:

d(t)ij = the length of the shortest i→ j path with

all intermediate vertices in {1, 2, . . . , t}

Thend(0)

ij = W[i, j]

d(0)ij is the length of the shortest i→ j path with all intermediate vertices

in {1, 2, . . . , 0} = ∅. So such path has no any intermediate vertices, andmust be the edge i→ j (if it exists.)

As before, we need to derive a recursive formula.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 33 / 35

APSP Problem: Floyd-Warshall Algorithm

d(t)ij = min{d(t−1)

ij︸ ︷︷ ︸(1)

, d(t−1)it + d(t−1)

tj︸ ︷︷ ︸(2)

}

Case (1): The shortest i→ j path P with all intermediate vertices in{1, 2, . . . , t} does not pass the vertex t. So all its intermediate verticesare in {1, 2, . . . (t − 1)}. Thus the length of P is d(t−1)

ij

Case (2): The shortest i→ j path P with all intermediate vertices in{1, 2, . . . , t} does pass the vertex t. The first part of P is the shortesti→ t path with all intermediate vertices in {1, 2, . . . , (t − 1)}. The lengthis d(t−1)

it . Similarly the length of the second part of P is d(t−1)tj .

i jd(t−1)

d(t−1)

t

ittj

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 34 / 35

APSP Problem: Floyd-Warshall Algorithm

d(t)ij = min{d(t−1)

ij︸ ︷︷ ︸(1)

, d(t−1)it + d(t−1)

tj︸ ︷︷ ︸(2)

}

Case (1): The shortest i→ j path P with all intermediate vertices in{1, 2, . . . , t} does not pass the vertex t. So all its intermediate verticesare in {1, 2, . . . (t − 1)}. Thus the length of P is d(t−1)

ij

Case (2): The shortest i→ j path P with all intermediate vertices in{1, 2, . . . , t} does pass the vertex t. The first part of P is the shortesti→ t path with all intermediate vertices in {1, 2, . . . , (t − 1)}. The lengthis d(t−1)

it . Similarly the length of the second part of P is d(t−1)tj .

i jd(t−1)

d(t−1)

t

ittj

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 34 / 35

APSP Problem: Floyd-Warshall Algorithm

d(t)ij = min{d(t−1)

ij︸ ︷︷ ︸(1)

, d(t−1)it + d(t−1)

tj︸ ︷︷ ︸(2)

}

Case (1): The shortest i→ j path P with all intermediate vertices in{1, 2, . . . , t} does not pass the vertex t. So all its intermediate verticesare in {1, 2, . . . (t − 1)}. Thus the length of P is d(t−1)

ij

Case (2): The shortest i→ j path P with all intermediate vertices in{1, 2, . . . , t} does pass the vertex t. The first part of P is the shortesti→ t path with all intermediate vertices in {1, 2, . . . , (t − 1)}. The lengthis d(t−1)

it . Similarly the length of the second part of P is d(t−1)tj .

i jd(t−1)

d(t−1)

t

ittj

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 34 / 35

APSP Problem: Floyd-Warshall AlgorithmFloyd-Warshall(W)

1 D(0) = W2 for t = 1 to n do3 Compute D(t) from D(t−1) by using the above formula.4 return D(n)

Analysis:

By definition, d(n)ij is the length of the shortest i→ j path with all

intermediate vertices in {1, 2, . . . , n}. This is really not a restriction. Sod(n)

ij is the length of the shortest i→ j path.

Thus D(n) is the solution matrix.

D(t) has n2 entries in it. Each entry is min of two terms. So each entry ofD(t) takes O(1) time.

D(t) can be computed from D(t−1) in Θ(n2) time.

The whole algorithm takes Θ(n3) time.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 35 / 35

APSP Problem: Floyd-Warshall AlgorithmFloyd-Warshall(W)

1 D(0) = W2 for t = 1 to n do3 Compute D(t) from D(t−1) by using the above formula.4 return D(n)

Analysis:

By definition, d(n)ij is the length of the shortest i→ j path with all

intermediate vertices in {1, 2, . . . , n}. This is really not a restriction. Sod(n)

ij is the length of the shortest i→ j path.

Thus D(n) is the solution matrix.

D(t) has n2 entries in it. Each entry is min of two terms. So each entry ofD(t) takes O(1) time.

D(t) can be computed from D(t−1) in Θ(n2) time.

The whole algorithm takes Θ(n3) time.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 35 / 35

APSP Problem: Floyd-Warshall AlgorithmFloyd-Warshall(W)

1 D(0) = W2 for t = 1 to n do3 Compute D(t) from D(t−1) by using the above formula.4 return D(n)

Analysis:

By definition, d(n)ij is the length of the shortest i→ j path with all

intermediate vertices in {1, 2, . . . , n}. This is really not a restriction. Sod(n)

ij is the length of the shortest i→ j path.

Thus D(n) is the solution matrix.

D(t) has n2 entries in it. Each entry is min of two terms. So each entry ofD(t) takes O(1) time.

D(t) can be computed from D(t−1) in Θ(n2) time.

The whole algorithm takes Θ(n3) time.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 35 / 35

APSP Problem: Floyd-Warshall AlgorithmFloyd-Warshall(W)

1 D(0) = W2 for t = 1 to n do3 Compute D(t) from D(t−1) by using the above formula.4 return D(n)

Analysis:

By definition, d(n)ij is the length of the shortest i→ j path with all

intermediate vertices in {1, 2, . . . , n}. This is really not a restriction. Sod(n)

ij is the length of the shortest i→ j path.

Thus D(n) is the solution matrix.

D(t) has n2 entries in it. Each entry is min of two terms. So each entry ofD(t) takes O(1) time.

D(t) can be computed from D(t−1) in Θ(n2) time.

The whole algorithm takes Θ(n3) time.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 35 / 35

APSP Problem: Floyd-Warshall AlgorithmFloyd-Warshall(W)

1 D(0) = W2 for t = 1 to n do3 Compute D(t) from D(t−1) by using the above formula.4 return D(n)

Analysis:

By definition, d(n)ij is the length of the shortest i→ j path with all

intermediate vertices in {1, 2, . . . , n}. This is really not a restriction. Sod(n)

ij is the length of the shortest i→ j path.

Thus D(n) is the solution matrix.

D(t) has n2 entries in it. Each entry is min of two terms. So each entry ofD(t) takes O(1) time.

D(t) can be computed from D(t−1) in Θ(n2) time.

The whole algorithm takes Θ(n3) time.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 35 / 35

APSP Problem: Floyd-Warshall AlgorithmFloyd-Warshall(W)

1 D(0) = W2 for t = 1 to n do3 Compute D(t) from D(t−1) by using the above formula.4 return D(n)

Analysis:

By definition, d(n)ij is the length of the shortest i→ j path with all

intermediate vertices in {1, 2, . . . , n}. This is really not a restriction. Sod(n)

ij is the length of the shortest i→ j path.

Thus D(n) is the solution matrix.

D(t) has n2 entries in it. Each entry is min of two terms. So each entry ofD(t) takes O(1) time.

D(t) can be computed from D(t−1) in Θ(n2) time.

The whole algorithm takes Θ(n3) time.

c©Hu Ding (Michigan State University) CSE 331 Algorithm and Data Structures 35 / 35

Recommended