Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
2/29/2020
1
Fundamental AlgorithmsTopic 5: Minimum Spanning Trees
By Adam Sheffer
Leonhard Euler
Problem: Designing a Network
Problem. We wish to rebuild the communication network of the college.
◦ We have a list of routers, and the cost of connecting every pair of routers (some connections might be impossible).
◦ We wish to obtain a connected network, while minimizing the total cost.
2
1
2
2/29/2020
2
A Solution?
We build a graph 𝐺 = (𝑉, 𝐸). What are the vertices of 𝑉?
◦ A vertex for every router.
What are the edges of 𝐸?
◦ An edge between every two routers that can be connected.
Is the graph directed?
◦ No.
Where are the connection costspresented in the graph?
Weighted Graphs
For graph 𝐺 = (𝑉, 𝐸), we can define a weight function on the edges
𝑤:𝐸 → ℝ.◦ The weight of an edge 𝑒 is denoted 𝑤(𝑒).
1.3 2
0
-1-3
5𝜋
3
4
2/29/2020
3
Back to Designing a Network
We have a graph 𝐺 = 𝑉, 𝐸 whose vertices represent the routers and edgesrepresent possible connections.
How are the costs related to the graph?
◦ The weight of every edge should be the corresponding cost.
5
5
91
1
2
Example: Network Design
What’s the cheapest solution in this case?
5
5
91
1
2
5
1
1
Network cost: 9 2
5
6
2/29/2020
4
Spanning Trees
A spanning tree is a tree that contains every vertex of the graph.
A graph can contain many distinct spanning trees.
The Weight of a Spanning Tree
The weight 𝑤(𝑇) of a spanning tree 𝑇 is the sum of its edge weights.
1
11
2
22
2
2
5
8
1
11
2
22
2
2
5
8
1
11
2
22
2
2
5
8
𝑤 𝑇 = 13𝑤 𝑇 = 19𝑤 𝑇 = 16
7
8
2/29/2020
5
Minimum Spanning Tree
A minimum spanning tree (or MST) is a spanning tree of a minimum weight.
1
11
2
22
2
2
5
8
1
11
2
22
2
2
5
8
1
11
2
22
2
2
5
8
Routers and MSTs
In the network design problem we are looking for an MST of the graph.
5
5
91
1
2
9
10
2/29/2020
6
MSTs Are Also Useful For…
Genetic research
Galaxies research
Cacti species in the desert
Tree Research in CUNY
11
12
2/29/2020
7
Find an MST
𝑠
𝑎 𝑐
𝑑
𝑔𝑒
16
2
1
The green numbers are theedge weights.
2
2
21
3
1
The Size of a Spanning Tree
Given a connected graph with 𝑛 vertices and 𝑚 edges, how many edges are in a spanning tree?
◦ Exactly 𝑛 − 1.
13
14
2/29/2020
8
MSTs and BFS Trees
Discuss in groups.
◦ Draw a connected graph that contains a BFS tree that is not an MST.
◦ True or False: In every connected graph there exists a BFS tree that is not an MST.
Paths in a Spanning Tree
Claim. In any spanning tree there is exactly one path between every two vertices.
◦ There is a path between every two vertices. Otherwise, the graph is not connected.
15
16
2/29/2020
9
Spanning Trees: Unique Paths
Claim. In any spanning tree 𝑇 there is exactlyone path between every two vertices.
◦ Assume for contradiction that there are two paths 𝑃, 𝑄 in 𝑇 between vertices 𝑠 and 𝑡.
◦ 𝑢 – the last common vertex before the paths 𝑃,𝑄 split (when traveling from 𝑠 to 𝑡).
◦ 𝑣 – the first vertex common to both paths after 𝑢.
◦ The portions of 𝑃 and 𝑄between 𝑢 and 𝑣 form a cycle. Contradiction!
𝑠 𝑢 𝑣 𝑡
Spanning Trees: Removing an Edge
Claim. Removing any edge of a spanning tree splits it into a forest of two trees.
Proof. 𝑒 = (𝑢, 𝑣) – the removed edge.
◦ 𝑒 is the unique path between 𝑢 and 𝑣. After removing 𝑒 there is no path between 𝑢 and 𝑣. So at least two trees.
◦ A vertex whose unique path to 𝑢contains 𝑢, 𝑣 remains connected to 𝑣. Every other vertex remains connected to 𝑢. So exactly two trees.
17
18
2/29/2020
10
Spanning Trees: Adding an Edge
Claim. Adding an edge to a spanning tree creates exactly one cycle.
Proof.
◦ We add the edge 𝑒 = 𝑢, 𝑣 .
◦ There was already a path between 𝑢 and 𝑣, and adding 𝑒 turns it into a cycle.
◦ Assume for contradiction that adding 𝑒 closes two cycles.
◦ Each cycle must contain a different path between 𝑣 and 𝑢. But only one such path exists! Contradiction!
How Many MSTs Are There?
Can a graph have more than one MST?
◦ Yes!
1
11
2
22
2
2
4
8
1
11
2
22
2
2
4
8
1
11
2
22
2
2
4
8
19
20
2/29/2020
11
A Lot of MSTs
1 1 1 1
1
1
1
1
1111
1
1
1
1
1
𝑉 MSTs𝑉 𝑉 −2 MSTs
(This is not so simple to show.)
𝟏All weights
Group Exercise
Discuss in groups.
◦ A complete graph has an edge between every two vertices.
◦ Consider the complete graph where every edge has weight 1.
◦ How many MSTs can you prove exist in this graph?
◦ Can you show that there are at least ( V !)/2 MSTs?
𝟏All weights
21
22
2/29/2020
12
A Moment of Nonsense
What is the origin of the word algorithm?
It is named after a person! Who?
Al Gore? Al-Khwārizmī!
?23
A Moment of Nonsense
What other major mathematical concept has a very similar origin?
Al Khwarizmi’s book is called
Al-Kitāb al-mukhtaṣar fī hīsāb al-ğabrwa’l-muqābala
24
23
24
2/29/2020
13
Prim’s MST Algorithm
Prim’s algorithm for finding an MST:
◦ Input: a connected graph 𝐺 = 𝑉, 𝐸 and weight function 𝑤:𝐸 → ℝ.
◦ First, choose an arbitrary vertex 𝑟 to be the root and set 𝑇 ← 𝑟.
◦ As long as 𝑇 is not a spanning tree:
Find the lightest edge 𝑒 that connects a vertex of 𝑇 to a vertex 𝑣 not in 𝑇. Add 𝑣 and 𝑒 to 𝑇.
Example: Prim’s Algorithm
26
4
2
2
5
3
14
2
2
5
3
14
2
2
5
3
14
2
2
5
3
1
25
26
2/29/2020
14
Data Structure: Priority Queue
A priority queue stores “objects” (in our case – vertices). Each object has a priority.
Supports the operations:
◦ Insert – inserts an object to the queue.
◦ Remove – removes the object with the highest priority from the queue.
◦ Update the priority of an existing element.
Prim’s Algorithm in More Detail
28
𝑄 - priority queueaccording to the values of 𝑘𝑒𝑦[𝑣].
As in BFS, 𝜋[𝑣] is theparent of 𝑣 in the tree.
𝑘𝑒𝑦 𝑣 is the current best weight for connecting 𝑣 to the tree.
27
28
2/29/2020
15
Another Prim Run
r
ba
ce
d
3 2
2
4 35
8 5
r
ba
ce
d
3 2
2
4 35
8 5
r
ba
ce
d
3 2
2
4 35
8 5
ba
ce
3 2
2
4 35
8 5
r r
ba
ce
3 2
2
4 35
8 5d
ba
ce
3 2
2
4 35
8 5d d
r
𝑄
Try Running Prim on Your Own
1 1
1
1
324
4
3
100 80
4
4
29
30
2/29/2020
16
Prim’s Running Time
What is the running time of Prim’s algorithm?
◦ It depends on the running time of the priority queue!
Priority queue running time.
◦ Standard implementation: When there are 𝑛elements in queue, removal takes 𝑂 log 𝑛steps. Other operations take 𝑂 1 .
◦ We will not study the details in this course.
Prim’s Running Time (2)
Priority queue running time.
◦ 𝑛 elements in queue: removal takes 𝑂 log 𝑛steps. Other operations take 𝑂 1 .
Running time of Prim’s algorithm.
◦ Graph 𝐺 = 𝑉, 𝐸 .
◦ Initialization steps: 𝑂 𝑉 .
◦ 𝑉 removals from 𝑄: 𝑂(|𝑉| log 𝑉 ).
◦ Constant number of operations for every edge: 𝑂( 𝐸 ).
◦ Total running time: 𝑂 |𝑉| log 𝑉 + |𝐸| .
31
32
2/29/2020
17
Group Exercise
Graph 𝐺 = 𝑉, 𝐸 with edge weights.
◦ Consider a connected graph containing all the vertices of 𝑉 and any number of edges of 𝐸.
◦ A Minimum Spanning Graph of 𝐺 is such a connected graph of minimum weight.
◦ Similar to an MST, but may contain cycles.
Problem.
a) Find an MST that is not an MSG.
b) Design an alg for finding an MSG in 𝐺. Running time: 𝑂( 𝑉 log 𝑉 + 𝐸 )
Some History
Prim’s algorithm was published in 1957.
A Czech mathematician named Borůvkadescribed an efficient MST algorithm in 1926.
◦ Graph algorithms was not yet a research field.
◦ Borůvka was just trying to find an efficient way to build the electric network of some geographic area.
Otakar Borůvka
33
34
2/29/2020
18
Kruskal’s Algorithm
Kruskal’s MST algorithm:
◦ Receive graph 𝐺 = 𝑉, 𝐸
◦ Create graph 𝐺′ = 𝑉, 𝐸′ , where 𝐸′ is empty (no edges in 𝐺′).
◦ Sort edges of 𝐸 according to their weight.
◦ Go over the edges by increasing order of weight:
If 𝑒 connects two connected components of 𝐺 ′, insert 𝑒 to 𝐸′.
◦ Return 𝐺′ as the MST.
Kruskal Example
1 11
1
324
4
300
100 80
6
4
That’s a heavy tree!
35
36
2/29/2020
19
Kruskal Example (end)
1 11
24
4
80
That’s a heavy tree!
Running Kruskal on Your Own
1 1
3
3
325
3
2
4 5
4
5
37
38
2/29/2020
20
Kruskal’s Running Time
Running time of Kruskal’s algorithm.
◦ Graph 𝐺 = 𝑉, 𝐸 .
◦ Create graph 𝐺′ with no edges: 𝑂 𝑉 .
◦ Sort the edges: 𝑂( 𝐸 log 𝐸 ).
◦ For every edge of 𝐸:
Check if it connects vertices from the same component. Can use BFS: 𝑂 |𝑉| per edge.
Total: 𝑂 𝑉 ⋅ 𝐸 .
◦ Total running time: 𝑂 𝑉 ⋅ 𝐸 .
40
That’s a horrible
time. Adam, you’re fired!
Leonhard Euler
39
40
2/29/2020
21
Data Structure: Disjoint Sets
Holds disjoint sets of elements.
Supports operations:
◦ Create new set containing one new element.
◦ Given element 𝑥, find the set containing 𝑥.
◦ Merge two sets into one.
Merge
Disjoint Sets Running Time
𝛼 𝑛 – inverse Ackerman function.
◦ Grows MUCH slower than log 𝑛, but is still not a constant.
Total number of elements in the sets: 𝑛.
◦ Create new set containing one new element: 𝑂 1 .
◦ “Average” running time of one merge/find
operation 𝑂 𝛼 𝑛 .
41
42
2/29/2020
22
𝛼 𝑛 Grows SLOWLY
To get log2 𝑛 ≥ 4, set 𝑛 ≥ 24 = 16.
To get log2 log2 𝑛 ≥ 4, set
𝑛 ≥ 224= 216 ≈ 65,000.
To get 𝛼 𝑛 ≥ 4, set 𝑛 > 10220,000
.
Estimated number of particles in the visible universe: 1086.
For all practical purposes, 𝛼 𝑛 ≤ 4.
Disjoint Sets and Kruskal
Problem. How can we use disjoint sets in Kruskal’s algorithm?
◦ Create a different set for each vertex.
◦ When inspecting edge 𝑒, check if both vertices of 𝑒 are in the same set.
◦ If not, add 𝑒 to tree and merge the two sets.
+
43
44
2/29/2020
23
Example: Kruskal with Disjoint Sets
45
Revised Running Time
Running time of Kruskal’s algorithm.
◦ Graph 𝐺 = 𝑉, 𝐸 .
◦ Create graph 𝐺′ with no edges: 𝑂 𝑉 .
◦ Create disjoint set for each vertex: 𝑂 𝑉 .
◦ Sort the edges: 𝑂( 𝐸 log 𝐸 ).
◦ For every edge of 𝐸:
Check if it connects vertices from same set, and possibly merge both sets: 𝑂 𝛼( 𝑉 ) per edge “on average”.
Total: 𝑂 |𝐸|𝛼 𝑉 .
◦ Total time: 𝑂 𝐸 (log 𝐸 + 𝛼 𝑉 ) + 𝑉 .
45
46
2/29/2020
24
Simplifying the Running Time
We obtained the running time𝑂 𝐸 (log 𝐸 + 𝛼 𝑉 ) + 𝑉 .
If 𝐸 < 𝑉 − 1, the graph is not connected, so no spanning trees.
◦ We may assume |𝐸| ≥ 𝑉 − 1.
◦ 𝐸 ⋅ 𝛼 |𝑉| + 𝑉 = 𝑂 |𝐸| log 𝐸 .
The Kruskal running time is 𝑂 𝐸 log 𝐸 .
48
Euler is Feeling Better
47
48
2/29/2020
25
Running Time Comparison
Kruskal: 𝑂 |𝐸| log 𝐸 .
Prim: 𝑂 𝐸 + 𝑉 log 𝑉 .
Which is more efficient?
◦ The same when 𝐸 = 𝑂 𝑉 .
◦ When 𝐸 is larger, Prim is better.
Faster MST Algorithms
In 2000, Chazelle found an algorithm for finding an MST with running time
𝑂 𝐸 ⋅ 𝛼 𝑉 .
◦ This algorithm only works when the weights are integers.
◦ It remains unknown whether there exists an MST algorithm with running time 𝑂 𝐸 .
◦ Too advanced for this course.
Bernard Chazelle
49
50
2/29/2020
26
Algorithm Correctness
One can prove that both Prim and Kruskal always correctly find an MST.
◦ In this course, we do not study proofs of correctness for these algorithms.
Group Exercise
Discuss in Groups.
◦ We are given a graph 𝐺 = (𝑉, 𝐸) with a weight function 𝑤:𝐸 → ℝ.
◦ Every edge is colored either blue or red.
◦ Describe an efficient alg. for finding, among the MSTs of 𝐺, an MST that contains a maximum number of red edges.
◦ (A difficult problem!)1 1
1
1
324
4
300
100 80
4
51
52
2/29/2020
27
Bad Solution
Algorithm.
◦ Go over every MST of 𝐺, count how many red edges it contains.
◦ Keep MST with a maximum number of edges.
Issue.
◦ A graph can have 𝑉 𝑉 −2 MSTs.
◦ Going over every MST can take
Ω 𝑉 𝑉 −2 steps. 1 1
1
1
324
4
300
100 80
4
First Solution
Let 𝜀 > 0 be a very small number.
Increase the weight of every blue edge by 𝜀.
Find MST in the revised graph.
1 1
1
1
32
4
4
4
300
100 80
4
1 1
3
4
300
+1100 80
+1
+4
+4+4 +2
53
54
2/29/2020
28
Algorithm Correctness
Claim. Algorithm always returns an MST.
◦ 𝑚 – weight of MST.
◦ 𝑚2 – weight of lightest spanning tree that is not an MST.
◦ Adding 𝜀’s can increase the weight of an MST by at most 𝑉 − 1 ⋅ 𝜀.
◦ When 𝜀 is sufficiently small:𝑚 + 𝑉 − 1 ⋅ 𝜀 < 𝑚2.
Algorithm Correctness (cont.)
𝑚 – weight of MST.
After adding 𝜀’s, the weight of an MST with 𝑘 blue edges is 𝑚 + 𝑘 ⋅ 𝜀.
◦ The algorithm returns MST with maximum number of red edges!
55
56
2/29/2020
29
Running Time
Adding 𝜀 to every blue edge: 𝑂 𝐸 .
Prim’s algorithm: 𝑂 𝐸 + 𝑉 log 𝑉 .
Total: 𝑂 𝐸 + 𝑉 log 𝑉 .
1 1
1
1
32
4
4
4
300
100 80
4
1 1
3
4
300
+1100 80
+1
+4
+4+4 +2
A More Elegant Solution
Using 𝜀 can be tricky.
◦ How do we know when 𝜀 is sufficiently small?
◦ Tiny numbers might cause accuracy issues.
We can have the same alg. without 𝜀’s.
◦ Run Prim’s algorithm with a single change.
◦ When comparing two edges with the same weight, consider a red edgeto be lighter than a blue edge.
57
58
2/29/2020
30
1 1
1
1
324
4
300
100 80
6
4
Example with Kruskal
Kruskal Leads to Different Results
𝐺 = 𝑉, 𝐸 – graph with edge weights.
The MST returned by Kruskal depends on the arbitrary order chosen for edges of same weight.
◦ In the graph below, Kruskal has two possible results.
3 3
2
3 3
2
3 3
2
59
60
2/29/2020
31
Group Exercise
Explain the following claim.
◦ 𝐺 = 𝑉,𝐸 – graph with edge weights.
◦ For each MST 𝑇 of 𝐺, there is a valid order of the edges of 𝐸 such that Kruskal returns 𝑇.
◦ Hint. How is this related to the previous problem?
3 3
2
3 3
2
3 3
2
The End
61
62