Data Structures Graphs. Graph… ADT? Not quite an ADT… operations not clear A formalism for...

Preview:

Citation preview

Data StructuresGraphs

Graph… ADT?• Not quite an ADT…

operations not clear

• A formalism for representing relationships between objectsGraph G = (V,E)– Set of vertices:V = {v1,v2,…,vn}

– Set of edges:E = {e1,e2,…,em} where each ei connects twovertices (vi1,vi2)

2

Han

Leia

Luke

V = {Han, Leia, Luke}E = {(Luke, Leia), (Han, Leia), (Leia, Han)}

Examples of Graphs

• The web– Vertices are webpages– Each edge is a link from one page to another

• Call graph of a program– Vertices are subroutines– Edges are calls and returns

• Social networks– Vertices are people– Edges connect friends

3

Graph DefinitionsIn directed graphs, edges have a direction:

In undirected graphs, they don’t (are two-way):

v is adjacent to u if (u,v) E

4

Han

Leia

Luke

Han

Leia

Luke

Weighted Graphs

5

20

30

35

60

Mukilteo

Edmonds

Seattle

Bremerton

Bainbridge

Kingston

Clinton

Each edge has an associated weight or cost.

Paths and Cycles• A path is a list of vertices {v1, v2, …, vn} such

that (vi, vi+1) E for all 0 i < n.• A cycle is a path that begins and ends at the

same node.

6

Seattle

San FranciscoDallas

Chicago

Salt Lake City

• p = {Seattle, Salt Lake City, Chicago, Dallas, San Francisco, Seattle}

Path Length and Cost• Path length: the number of edges in the path• Path cost: the sum of the costs of each edge

7

Seattle

San FranciscoDallas

Chicago

Salt Lake City

3.5

2 2

2.5

3

22.5

2.5

length(p) = 5 cost(p) = 11.5

More Definitions:Simple Paths and Cycles

A simple path repeats no vertices (except that the first can also be the last):p = {Seattle, Salt Lake City, San Francisco, Dallas}p = {Seattle, Salt Lake City, Dallas, San Francisco, Seattle}

A cycle is a path that starts and ends at the same node:p = {Seattle, Salt Lake City, Dallas, San Francisco, Seattle}p = {Seattle, Salt Lake City, Seattle, San Francisco, Seattle}

A simple cycle is a cycle that is also a simple path (in undirected graphs, no edge can be repeated)

8

Trees as Graphs

• Every tree is a graph with some restrictions:

–the tree is directed–there are no cycles

(directed or undirected)

–there is a directed path from the root to every node

9

A

B

D E

C

F

HG

Directed Acyclic Graphs (DAGs)

DAGs are directed graphs with no (directed) cycles.

10

main()

add()

access()

mult()

read()

Aside: If program call-graph is a DAG, then all procedure calls can be in-lined

Rep 1: Adjacency Matrix

A |V| x |V| array in which an element (u,v) is true if and only if there is an edge from u to v

11

Han

Leia

Luke

Han Luke Leia

Han

Luke

Leia

Space requirements?

Runtimes:Iterate over vertices?Iterate over edges?Iterate edges adj. to vertex?Existence of edge?

Rep 2: Adjacency List

A |V|-ary list (array) in which each entry stores a list (linked list) of all adjacent vertices

12

Han

Leia

LukeHan

Luke

Leia

Space requirements?

Runtimes:Iterate over vertices?Iterate over edges?Iterate edges adj. to vertex?Existence of edge?

Some Applications:Moving Around Washington

13

What’s the shortest way to get from Seattle to Pullman?Edge labels:

Some Applications:Moving Around Washington

14

What’s the fastest way to get from Seattle to Pullman?Edge labels:

Some Applications:Reliability of Communication

15

If Wenatchee’s phone exchange goes down,can Seattle still talk to Pullman?

Some Applications:Bus Routes in Downtown Seattle

16

If we’re at 3rd and Pine, how can we get to1st and University using Metro?

How about 4th and Seneca?

Application: Topological SortGiven a directed graph, G = (V,E), output all

the vertices in V such that no vertex is output before any other vertex with an edge to it.

17

CSE 142 CSE 143

CSE 321

CSE 341

CSE 378

CSE 326

CSE 370

CSE 403

CSE 421

CSE 467

CSE 451

CSE 322

Is the output unique?

Topological Sort: Take One

1. Label each vertex with its in-degree (# of inbound edges)

2. While there are vertices remaining:a. Choose a vertex v of in-degree zero; output

vb. Reduce the in-degree of all vertices adjacent

to vc. Remove v from the list of vertices

18

Runtime:

void Graph::topsort(){ Vertex v, w; labelEachVertexWithItsIn-degree();

for (int counter=0; counter < NUM_VERTICES; counter++){

v = findNewVertexOfDegreeZero(); v.topologicalNum = counter; for each w adjacent to v w.indegree--; }}

19

Topological Sort: Take Two

1. Label each vertex with its in-degree2. Initialize a queue Q to contain all in-degree

zero vertices3. While Q not empty

a. v = Q.dequeue; output vb. Reduce the in-degree of all vertices adjacent to vc. If new in-degree of any such vertex u is zero

Q.enqueue(u)

20

Runtime:

Note: could use a stack, list, set, box, … instead of a queue

void Graph::topsort(){ Queue q(NUM_VERTICES); int counter = 0; Vertex v, w;

labelEachVertexWithItsIn-degree();

q.makeEmpty(); for each vertex v if (v.indegree == 0) q.enqueue(v);

while (!q.isEmpty()){ v = q.dequeue(); v.topologicalNum = ++counter; for each w adjacent to v if (--w.indegree == 0) q.enqueue(w); }}

21

intialize thequeue

get a vertex withindegree 0

insert neweligiblevertices

Runtime:

Graph ConnectivityUndirected graphs are connected if there is a path between

any two vertices

Directed graphs are strongly connected if there is a path from any one vertex to any other

Directed graphs are weakly connected if there is a path between any two vertices, ignoring direction

A complete graph has an edge between every pair of vertices

22

Graph Traversals• Breadth-first search (and depth-first search) work

for arbitrary (directed or undirected) graphs - not just mazes!– Must mark visited vertices so you do not go into an

infinite loop!

• Either can be used to determine connectivity:– Is there a path between two given vertices?– Is the graph (weakly) connected?

• Which one:– Uses a queue?– Uses a stack?– Always finds the shortest path (for unweighted graphs)?

23

CSE 326: Data StructuresGraph Traversals

James Fogarty

Autumn 2007

Graph Connectivity• Undirected graphs are connected if there is a path between

any two vertices

• Directed graphs are strongly connected if there is a path from any one vertex to any other

• Directed graphs are weakly connected if there is a path between any two vertices, ignoring direction

• A complete graph has an edge between every pair of vertices

25

Graph Traversals• Breadth-first search (and depth-first search) work

for arbitrary (directed or undirected) graphs - not just mazes!– Must mark visited vertices. Why?– So you do not go into an infinite loop! It’s not a

tree.

• Either can be used to determine connectivity:– Is there a path between two given vertices?– Is the graph (weakly/strongly) connected?

• Which one:– Uses a queue?– Uses a stack?– Always finds the shortest path (for unweighted graphs)?26

The Shortest Path Problem• Given a graph G, edge costs ci,j, and vertices

s and t in G, find the shortest path from s to t.

• For a path p = v0 v1 v2 … vk

– unweighted length of path p = k (a.k.a. length)

– weighted length of path p = i=0..k-1 ci,i+1 (a.k.a cost)

– Path length equals path cost when ?

27

Single Source Shortest Paths (SSSP)

• Given a graph G, edge costs ci,j, and vertex s, find the shortest paths from s to all vertices in G.

– Is this harder or easier than the previous problem?

28

All Pairs Shortest Paths (APSP)

• Given a graph G and edge costs ci,j, find the shortest paths between all pairs of vertices in G.

– Is this harder or easier than SSSP?

– Could we use SSSP as a subroutine to solve this?

29

Depth-First Graph Search

• DFS( Start, Goal_test)• push(Start, Open);• repeat• if (empty(Open)) then return fail;• Node := pop(Open);• if (Goal_test(Node)) then return Node;• for each Child of node do• if (Child not already visited) then push(Child, Open);• Mark Node as visited;• end

30

Open – Stack

Criteria – Pop

Breadth-First Graph Search

• BFS( Start, Goal_test)• enqueue(Start, Open);• repeat• if (empty(Open)) then return fail;• Node := dequeue(Open);• if (Goal_test(Node)) then return Node;• for each Child of node do• if (Child not already visited) then enqueue(Child,

Open);• Mark Node as visited;• end

31

Open – Queue

Criteria – Dequeue (FIFO)

Comparison: DFS versus BFS• Depth-first search

–Does not always find shortest paths–Must be careful to mark visited vertices, or you could go into an infinite loop if there is a cycle

• Breadth-first search–Always finds shortest paths – optimal solutions–Marking visited nodes can improve efficiency, but even without doing so search is guaranteed to terminate

–Is BFS always preferable?

32

DFS Space Requirements

• Assume:– Longest path in graph is length d– Highest number of out-edges is k

• DFS stack grows at most to size dk– For k=10, d=15, size is 150

33

BFS Space Requirements

• Assume – Distance from start to a goal is d – Highest number of out edges is k BFS

• Queue could grow to size kd

– For k=10, d=15, size is 1,000,000,000,000,000

34

Conclusion

• For large graphs, DFS is hugely more memory efficient, if we can limit the maximum path length to some fixed d.– If we knew the distance from the start to the

goal in advance, we can just not add any children to stack after level d

– But what if we don’t know d in advance?

35

Iterative-Deepening DFS (I)• Bounded_DFS(Start, Goal_test, Limit)• Start.dist = 0;• push(Start, Open);• repeat• if (empty(Open)) then return fail;• Node := pop(Open);• if (Goal_test(Node)) then return Node;• if (Node.dist Limit) then return fail;• for each Child of node do• if (Child not already i-visited) then • Child.dist := Node.dist + 1;• push(Child, Open);• Mark Node as i-visited;• end

36

Iterative-Deepening DFS (II)• IDFS_Search(Start, Goal_test)

• i := 1;

• repeat

• answer := Bounded_DFS(Start, Goal_test, i);

• if (answer != fail) then return answer;

• i := i+1;

• end

37

Analysis of IDFS

• Work performed with limit < actual distance to G is wasted – but the wasted work is usually small compared to amount of work done during the last iteration

38

1

( )d

i d

i

k O k

Ignore low order terms!

Same time complexity as BFS

Same space complexity as (bounded) DFS

Saving the Path

• Our pseudocode returns the goal node found, but not the path to it

• How can we remember the path?– Add a field to each node, that points to the

previous node along the path– Follow pointers from goal back to start to

recover path

39

Example

40

Seattle

San FranciscoDallas

Salt Lake City

Example (Unweighted Graph)

41

Seattle

San FranciscoDallas

Salt Lake City

Example (Unweighted Graph)

42

Seattle

San FranciscoDallas

Salt Lake City

Graph Search, Saving Path• Search( Start, Goal_test, Criteria)• insert(Start, Open);• repeat• if (empty(Open)) then return fail;• select Node from Open using Criteria;• if (Goal_test(Node)) then return Node;• for each Child of node do• if (Child not already visited) then• Child.previous := Node;• Insert( Child, Open );• Mark Node as visited;• end

43

Weighted SSSP:The Quest For Food

44

Vending Machine in EE1

ALLENHUB

Delfino’s

Ben & Jerry’sNeelam’sCedars

Coke Closet

Home

Schultzy’s

Parent’s Home

Café Allegro

10The Ave

U Village

350

375

40

25

3515

25

15,356

35

28575

70365

350

Can we calculate shortest distance to all nodes from Allen Center?

Weighted SSSP:The Quest For Food

45

Vending Machine in EE1

ALLENHUB

Delfino’s

Ben & Jerry’sNeelam’sCedars

Coke Closet

Home

Schultzy’s

Parent’s Home

Café Allegro

10The Ave

U Village

5

375

40

25

3515

25

15,356

35

28575

70365

350

Can we calculate shortest distance to all nodes from Allen Center?

Edsger Wybe Dijkstra (1930-2002)

• Invented concepts of structured programming, synchronization, weakest precondition, and "semaphores" for controlling computer processes. The Oxford English Dictionary cites his use of the words "vector" and "stack" in a computing context.

• Believed programming should be taught without computers• 1972 Turing Award• “In their capacity as a tool, computers will be but a ripple on the

surface of our culture. In their capacity as intellectual challenge, they are without precedent in the cultural history of mankind.”

46

General Graph Search Algorithm

• Search( Start, Goal_test, Criteria)• insert(Start, Open);• repeat• if (empty(Open)) then return fail;• select Node from Open using Criteria;• if (Goal_test(Node)) then return Node;• for each Child of node do• if (Child not already visited) then Insert( Child,

Open );• Mark Node as visited;• end

47

Open – some data structure (e.g., stack, queue, heap)

Criteria – some method for removing an element from Open

Shortest Path for Weighted Graphs

• Given a graph G = (V, E) with edge costs c(e), and a vertex s V, find the shortest (lowest cost) path from s to every vertex in V

• Assume: only positive edge costs

48

Dijkstra’s Algorithm for Single Source Shortest Path

• Similar to breadth-first search, but uses a heap instead of a queue:– Always select (expand) the vertex that has a

lowest-cost path to the start vertex

• Correctly handles the case where the lowest-cost (shortest) path to a vertex is not the one with fewest edges

49

CSE 326: Data StructuresDijkstra’s Algorithm

Dijkstra, Edsger Wybe

Legendary figure in computer science; was a professor at University of Texas.

Supported teaching introductory computer courses without computers (pencil and paper programming)

Supposedly wouldn’t (until very late in life) read his e-mail; so, his staff had to print out messages and put them in his box.

51

E.W. Dijkstra (1930-2002)

1972 Turning Award Winner, Programming Languages, semaphores, and …

Dijkstra’s Algorithm: Idea

Adapt BFS to handle weighted graphs

Two kinds of vertices:– Finished or known

vertices• Shortest distance

hasbeen computed

– Unknown vertices• Have tentative

distance

52

Dijkstra’s Algorithm: Idea

At each step:1) Pick closest unknown

vertex

2) Add it to known vertices

3) Update distances

53

Dijkstra’s Algorithm: Pseudocode

Initialize the cost of each node to

Initialize the cost of the source to 0

While there are unknown nodes left in the graphSelect an unknown node b with the lowest costMark b as knownFor each node a adjacent to b

a’s cost = min(a’s old cost, b’s cost + cost of (b, a))a’s prev path node = b

54

Important Features

• Once a vertex is made known, the cost of the shortest path to that node is known

• While a vertex is still not known, another shorter path to it might still be found

• The shortest path itself can found by following the backward pointers stored in node.path

55

Dijkstra’s Algorithm in action

56

A B

DC

F H

E

G

0

2 2 3

110 2

3

111

7

1

9

2

4

Vertex Visited? Cost Found by

A 0

B ??

C ??

D ??

E ??

F ??

G ??

H ??

Dijkstra’s Algorithm in action

57

A B

DC

F H

E

G

0 2

4

1

2 2 3

110 2

3

111

7

1

9

2

4

Vertex Visited? Cost Found by

A Y 0

B <=2 A

C <=1 A

D <=4 A

E ??

F ??

G ??

H ??

Dijkstra’s Algorithm in action

58

A B

DC

F H

E

G

0 2

4

1

12

2 2 3

110 2

3

111

7

1

9

2

4

Vertex Visited? Cost Found by

A Y 0

B <=2 A

C Y 1 A

D <=4 A

E <=12 C

F ??

G ??

H ??

Dijkstra’s Algorithm in action

59

A B

DC

F H

E

G

0 2 4

4

1

12

2 2 3

110 2

3

111

7

1

9

2

4

Vertex Visited? Cost Found by

A Y 0

B Y 2 A

C Y 1 A

D <=4 A

E <=12 C

F <=4 B

G ??

H ??

Dijkstra’s Algorithm in action

60

A B

DC

F H

E

G

0 2 4

4

1

12

2 2 3

110 2

3

111

7

1

9

2

4

Vertex Visited? Cost Found by

A Y 0

B Y 2 A

C Y 1 A

D Y 4 A

E <=12 C

F <=4 B

G ??

H ??

Dijkstra’s Algorithm in action

61

A B

DC

F H

E

G

0 2 4 7

4

1

12

2 2 3

110 2

3

111

7

1

9

2

4

Vertex Visited? Cost Found by

A Y 0

B Y 2 A

C Y 1 A

D Y 4 A

E <=12 C

F Y 4 B

G ??

H <=7 F

Dijkstra’s Algorithm in action

62

A B

DC

F H

E

G

0 2 4 7

4

1

12

8

2 2 3

110 2

3

111

7

1

9

2

4

Vertex Visited? Cost Found by

A Y 0

B Y 2 A

C Y 1 A

D Y 4 A

E <=12 C

F Y 4 B

G <=8 H

H Y 7 F

Dijkstra’s Algorithm in action

63

A B

DC

F H

E

G

0 2 4 7

4

1

11

8

2 2 3

110 2

3

111

7

1

9

2

4

Vertex Visited? Cost Found by

A Y 0

B Y 2 A

C Y 1 A

D Y 4 A

E <=11 G

F Y 4 B

G Y 8 H

H Y 7 F

Dijkstra’s Algorithm in action

64

A B

DC

F H

E

G

0 2 4 7

4

1

11

8

2 2 3

110 2

3

111

7

1

9

2

4

Vertex Visited? Cost Found by

A Y 0

B Y 2 A

C Y 1 A

D Y 4 A

E Y 11 G

F Y 4 B

G Y 8 H

H Y 7 F

Your turn

V Visited? Cost Found by

v0

v1

v2

v3

v4

v5

v6 65

v3

v6

v1

v2 v4

v5

v0s

1

2

2

2

1

1 1

5 3

5

6

10

Dijkstra’s Alg: Implementation

Initialize the cost of each node to Initialize the cost of the source to 0

While there are unknown nodes left in the graphSelect the unknown node b with the lowest cost

Mark b as known

For each node a adjacent to b

a’s cost = min(a’s old cost, b’s cost + cost of (b, a))

a’s prev path node = b (if we updated a’s cost)

67

What data structures should we use?

Running time?

void Graph::dijkstra(Vertex s){ Vertex v,w;

Initialize s.dist = 0 and set dist of all other vertices to infinity

while (there exist unknown vertices, find the one b with the smallest distance)

b.known = true;

for each a adjacent to b if (!a.known) if (b.dist + weight(b,a) < a.dist){

a.dist = (b.dist + weight(b,a)); a.path = b; } }}

68

Sounds like deleteMin on

a heap…

Sounds like adjacency

listsSounds like

decreaseKey

Running time: O(|E| log |V|) – there are |E| edges to examine, and each one causes a heap operation of time O(log |V|)

Dijkstra’s Algorithm: Summary

• Classic algorithm for solving SSSP in weighted graphs without negative weights

• A greedy algorithm (irrevocably makes decisions without considering future consequences)

• Intuition for correctness:– shortest path from source vertex to itself is 0– cost of going to adjacent nodes is at most edge weights– cheapest of these must be shortest path to that node– update paths for new node and continue picking cheapest path

69

Correctness: The Cloud Proof

How does Dijkstra’s decide which vertex to add to the Known set next?• If path to V is shortest, path to W must be at least as long

(or else we would have picked W as the next vertex)• So the path through W to V cannot be any shorter! 70

The Known Cloud

V

Next shortest path from inside the known cloud

W

Better path to V? No!

Source

Correctness: Inside the CloudProve by induction on # of nodes in the cloud:

Initial cloud is just the source with shortest path 0

Assume: Everything inside the cloud has the correct shortest path

Inductive step: Only when we prove the shortest path to some node v (which is not in the cloud) is correct, we add it to the cloud

71

When does Dijkstra’s algorithm not work?

The Trouble with Negative Weight Cycles

72

A B

C D

E

210

1-5

2

What’s the shortest path from A to E?

Problem?

Dijkstra’s vs BFSAt each step:

1) Pick closest unknown vertex

2) Add it to finished vertices

3) Update distances

Dijkstra’s Algorithm

At each step:1) Pick vertex from queue

2) Add it to visited vertices

3) Update queue with neighbors

Breadth-first Search

73

Some Similarities:

Single-Source Shortest Path• Given a graph G = (V, E) and a single

distinguished vertex s, find the shortest weighted path from s to every other vertex in G.

All-Pairs Shortest Path:• Find the shortest paths between all

pairs of vertices in the graph.

• How?

74

Analysis

• Total running time for Dijkstra’s:O(|V| log |V| + |E| log |V|) (heaps)

What if we want to find the shortest path from each point to ALL other points?

75

Dynamic Programming

Algorithmic technique that systematically records the answers to sub-problems in a table and re-uses those recorded results (rather than re-computing them).

Simple Example: Calculating the Nth Fibonacci number.

Fib(N) = Fib(N-1) + Fib(N-2)

76

Floyd-Warshallfor (int k = 1; k =< V; k++)

for (int i = 1; i =< V; i++)

for (int j = 1; j =< V; j++)

if ( ( M[i][k]+ M[k][j] ) < M[i][j] )M[i][j] = M[i][k]+ M[k][j]

77

Invariant: After the kth iteration, the matrix includes the shortest paths for all pairs of vertices (i,j) containing only vertices 1..k as intermediate vertices

a b c d e

a 0 2 - -4 -

b - 0 -2 1 3

c - - 0 - 1

d - - - 0 4

e - - - - 0

78

b

c

d e

a

-4

2-2

1

31

4

Initial state of the matrix:

M[i][j] = min(M[i][j], M[i][k]+ M[k][j])

a b c d e

a 0 2 0 -4 0

b - 0 -2 1 -1

c - - 0 - 1

d - - - 0 4

e - - - - 079

b

c

d e

a

-4

2-2

1

31

4

Floyd-Warshall - for All-pairs shortest path

Final Matrix Contents

CSE 326: Data StructuresSpanning Trees

A Hidden Tree

81

Start

End

Spanning Tree in a Graph

82

Vertex = routerEdge = link between routers

Spanning tree - Connects all the vertices - No cycles

Undirected Graph

• G = (V,E)– V is a set of vertices (or nodes)– E is a set of unordered pairs of vertices

83

12

3

4

56

7

V = {1,2,3,4,5,6,7}E = {{1,2},{1,6},{1,5},{2,7},{2,3}, {3,4},{4,7},{4,5},{5,6}}

2 and 3 are adjacent2 is incident to edge {2,3}

Spanning Tree Problem

• Input: An undirected graph G = (V,E). G is connected.

• Output: T contained in E such that– (V,T) is a connected graph– (V,T) has no cycles

84

Spanning Tree Algorithm

85

ST(i: vertex) mark i; for each j adjacent to i do if j is unmarked then Add {i,j} to T; ST(j);end{ST}

MainT := empty set;ST(1);end{Main}

Example of Depth First Search

86

12

3

4

5

6

7

ST(1)

Example Step 16

87

12

3

4

5

6

7

ST(1)

{1,2} {2,7} {7,5} {5,4} {4,3} {5,6}

Minimum Spanning TreesGiven an undirected graph G=(V,E), find

a graph G’=(V, E’) such that:– E’ is a subset of E– |E’| = |V| - 1– G’ is connected– is minimal

Applications: wiring a house, power grids, Internet connections

88

'),(

cEvu

uv

G’ is a minimum spanning tree.

Minimum Spanning Tree Problem

• Input: Undirected Graph G = (V,E) and a cost function C from E to the reals. C(e) is the cost of edge e.

• Output: A spanning tree T with minimum total cost. That is: T that minimizes

89

Te

eCTC )()(

Best Spanning Tree

• Each edge has the probability that it won’t fail

• Find the spanning tree that is least likely to fail

90

12

3

4

56

7

.80 .75.95

.50.95 1.0

.85

.84

.80

.89

Example of a Spanning Tree

91

12

3

4

56

7

.80 .75.95

.50.95 1.0

.85

.84

.80

.89

Probability of success = .85 x .95 x .89 x .95 x 1.0 x .84 = .5735

Minimum Spanning Tree Problem

• Input: Undirected Graph G = (V,E) and a cost function C from E to the reals. C(e) is the cost of edge e.

• Output: A spanning tree T with minimum total cost. That is: T that minimizes

92

Te

eCTC )()(

Reducing Best to Minimum

93

Let P(e) be the probability that an edge doesn’t fail.Define:

))((log)( 10 ePeC

Minimizing Te

eC )(

is equivalent to maximizing Te

eP )(

because

Te

eC

Te

eC

Te

eP)(

)( 1010)(

Example of Reduction

94

12

3

4

56

7

.80 .75.95

.50.95 1.0

.85

.84

.80

.89

12

3

4

56

7

.097 .125.022

.301.022 .000

.071

.076

.097

.051

Best Spanning Tree Problem Minimum Spanning Tree Problem

Find the MST

95

47

1 5

9

2

Find the MST

96

A

C

B

D

FH

G

E

1

76

5

11

4

12

13

23

9

10

4

Two Different Approaches

97

Prim’s AlgorithmLooks familiar!

Kruskals’s AlgorithmCompletely different!

Prim’s algorithm

Idea: Grow a tree by adding an edge from the “known” vertices to the “unknown” vertices. Pick the edge with the smallest weight.

98

G

v

known

Prim’s Algorithm for MSTA node-based greedy algorithm

Builds MST by greedily adding nodes

1. Select a node to be the “root”• mark it as known• Update cost of all its neighbors

2. While there are unknown nodes left in the grapha. Select an unknown node b with the smallest cost

from some known node ab. Mark b as knownc. Add (a, b) to MSTd. Update cost of all nodes adjacent to b

99

Find MST using Prim’s

V Kwn Distance path

v1

v2

v3

v4

v5

v6

v7100

v4

v7

v2

v3 v5

v6

v1

Start with V1

2

2

5

4

7

1 10

4 6

3

8

1

Your Turn

Order Declared Known:

V1

Prim’s Algorithm Analysis

Running time:

Same as Dijkstra’s: O(|E| log |V|)

Correctness:

Proof is similar to Dijkstra’s

101

Kruskal’s MST Algorithm

Idea: Grow a forest out of edges that do not create a cycle. Pick an edge with the smallest weight.

102

G=(V,E)

v

Kruskal’s Algorithm for MSTAn edge-based greedy algorithm

Builds MST by greedily adding edges

1. Initialize with• empty MST• all vertices marked unconnected• all edges unmarked

2. While there are still unmarked edgesa. Pick the lowest cost edge (u,v) and mark itb. If u and v are not already connected, add (u,v) to

the MST and mark u and v as connected to each other

103Doesn’t it sound familiar?

Example of Kruskal 1

104

1

6

5

4

7

2

33

34 0

2 2

1

3

{7,4} {2,1} {7,5} {5,6} {5,4} {1,6} {2,7} {2,3} {3,4} {1,5} 0 1 1 2 2 3 3 3 3 4

1 3

Data Structures for Kruskal

• Sorted edge list

• Disjoint Union / Find– Union(a,b) - union the disjoint sets named by

a and b– Find(a) returns the name of the set containing

a

105

{7,4} {2,1} {7,5} {5,6} {5,4} {1,6} {2,7} {2,3} {3,4} {1,5} 0 1 1 2 2 3 3 3 3 4

Example of DU/F 1

106

1

6

5

4

7

2

33

34 0

2 2

1

3

{7,4} {2,1} {7,5} {5,6} {5,4} {1,6} {2,7} {2,3} {3,4} {1,5} 0 1 1 2 2 3 3 3 3 4

1 3

7

1

3Find(5) = 7Find(4) = 7

Example of DU/F 2

107

1

6

5

4

7

2

33

34 0

2 2

1

3

{7,4} {2,1} {7,5} {5,6} {5,4} {1,6} {2,7} {2,3} {3,4} {1,5} 0 1 1 2 2 3 3 3 3 4

1 3

7

1

3

Find(1) = 1Find(6) = 7

Example of DU/F 3

108

1

6

5

4

7

2

33

34 0

2 2

1

3

{7,4} {2,1} {7,5} {5,6} {5,4} {1,6} {2,7} {2,3} {3,4} {1,5} 0 1 1 2 2 3 3 3 3 4

1 3

7

3

Union(1,7)

Kruskal’s Algorithm with DU / F

109

Sort the edges by increasing cost;Initialize A to be empty;for each edge {i,j} chosen in increasing order do u := Find(i); v := Find(j); if not(u = v) then add {i,j} to A; Union(u,v);

Kruskal codevoid Graph::kruskal(){ int edgesAccepted = 0; DisjSet s(NUM_VERTICES);

while (edgesAccepted < NUM_VERTICES – 1){ e = smallest weight edge not deleted yet; // edge e = (u, v) uset = s.find(u); vset = s.find(v); if (uset != vset){ edgesAccepted++; s.unionSets(uset, vset); } }}

110

2|E| finds

|V| unions

|E| heap ops

Find MST using Kruskal’s

111

A

C

B

D

F H

G

E

2 2 3

21

4

10

8

194

2

7

Total Cost:

• Now find the MST using Prim’s method.• Under what conditions will these methods give the same result?

Recommended