Chapter 5
Greedy Algorithms
Optimization Problems
Optimization problem: a problem of finding the best solution from all feasible solutions.
Two common techniques:
Greedy algorithms (local choices)
Dynamic programming (global view)
Greedy Algorithms
Greedy algorithms typically consist of
A set of candidate solutions
A function that checks whether a set of candidates is feasible
A selection function indicating, at a given time, which is the most promising candidate not yet used
An objective function giving the value of a solution; this is the function we are trying to optimize
Step by Step Approach
Initially, the set of chosen candidates is empty.
At each step, add to this set the best remaining candidate; this is guided by the selection function.
If the increased set is no longer feasible, remove the candidate just added; otherwise it stays.
Each time the set of chosen candidates is increased, check whether the current set now constitutes a solution to the problem.
When a greedy algorithm works correctly, the first solution found in this way is always optimal.
Examples of Greedy Algorithms Graph Algorithms
Breadth-First Search (shortest paths for unweighted graphs)
Dijkstra's (shortest path) algorithm
Minimum Spanning Trees
Data compression: Huffman coding
Scheduling: activity selection, minimizing time in system, deadline scheduling
Other heuristics: coloring a graph, Traveling Salesman, set covering
Elements of Greedy Strategy
Greedy-choice property: A global optimal solution can be arrived at by making locally optimal (greedy) choices
Optimal substructure: an optimal solution to the problem contains within it optimal solutions to sub-problems.
Be able to demonstrate that if A is an optimal solution containing s1, then the set A' = A - {s1} is an optimal solution to a smaller problem without s1.
Analysis
The selection function is usually based on the objective function; they may be identical. But often there are several plausible ones.
At every step, the procedure chooses the best candidate, without worrying about the future. It never changes its mind: once a candidate is included in the solution, it is there for good; once a candidate is excluded, it’s never considered again.
Greedy algorithms do NOT always yield optimal solutions, but for many problems they do.
Huffman Coding
Huffman codes: a very effective technique for compressing data, typically saving 20-90%.
ASCII table
Coding problem: consider a data file of 100,000 characters.
You can safely assume that there are many a, e, i, o, u, blanks, and newlines, and few q, x, z's.
We want to store it compactly.
Solutions:
Fixed-length code, e.g. ASCII, 8 bits per character.
Variable-length code, e.g. Huffman code (can take advantage of the relative frequency of letters to save space).
Example
Char  Frequency  Code  Total bits
E     120        000   360
L     42         001   126
D     42         010   126
U     37         011   111
C     32         100   96
M     24         101   72
K     7          110   21
Z     2          111   6
Total: 918 bits

• Fixed-length code: with 8 characters, we need 3 bits for each char.
Example (cont.)
[Figure: complete binary tree for the fixed-length code, with leaf frequencies E:120, D:42, L:42, U:37, C:32, M:24, K:7, Z:2 and edges labeled 0/1]

Char: E   L   D   U   C   M   K   Z
Code: 000 001 010 011 100 101 110 111
Example (cont.)
Variable-length code (can take advantage of the relative frequency of letters to save space): Huffman codes.
Huffman Tree Construction (1)
1. Associate each char with a weight (= frequency) to form a subtree of one node (char, weight).
2. Group all subtrees to form a forest.
3. Sort subtrees by ascending weight of subroots.
4. Merge the first two subtrees (the ones with lowest weights).
5. Assign the new subroot a weight equal to the sum of the weights of its two children.
6. Repeat 3, 4, 5 until only one tree remains in the forest.
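The six steps above map naturally onto a binary heap. The following is an illustrative Python sketch (function and variable names are my own, not from the slides); the total cost of 785 bits matches the example that follows, though tie-breaking may yield different individual codes of the same lengths.

```python
import heapq

def huffman_codes(freqs):
    """Build a Huffman code from a {char: frequency} map.

    Each heap entry is (weight, tiebreak, tree); a tree is either a
    leaf character or a (left, right) pair of subtrees.
    """
    heap = [(w, i, ch) for i, (ch, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)   # two lightest subtrees
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, count, (t1, t2)))  # merged subroot
        count += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):        # internal node: recurse 0/1
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                              # leaf: record its code
            codes[tree] = prefix or "0"
    walk(heap[0][2], "")
    return codes

freqs = {"E": 120, "L": 42, "D": 42, "U": 37, "C": 32, "M": 24, "K": 7, "Z": 2}
codes = huffman_codes(freqs)
total = sum(freqs[c] * len(codes[c]) for c in freqs)
print(total)  # 785 (vs. 918 for the fixed-length code)
```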
Huffman Tree Construction (2), (3)
[Figures: successive merge steps of the Huffman tree construction]
Assigning Codes
Char  Freq  Code    Bits
E     120   0       120
D     42    101     126
L     42    110     126
U     37    100     111
C     32    1110    128
M     24    11111   120
K     7     111101  42
Z     2     111100  12
Total: 785 bits

Compare with 918 bits for the fixed-length code: about 15% less.
Huffman Coding Tree
Coding and Decoding
Fixed-length code:
Char: E   L   D   U   C   M   K   Z
Code: 000 001 010 011 100 101 110 111
DEED: 010 000 000 010
MUCK: 101 011 100 110

Huffman code:
Char: E  L    D    U    C     M      K       Z
Code: 0  110  101  100  1110  11111  111101  111100
DEED: 101 0 0 101
MUCK: 11111 100 1110 111101
Prefix codes
A set of codes is said to satisfy the prefix property if no code in the set is a prefix of another. Such codes are called prefix codes.
Huffman codes are prefix codes.
Char: E  L    D    U    C     M      K       Z
Code: 0  110  101  100  1110  11111  111101  111100
LZW
A universal lossless data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch.
It is the basis of many PC utilities that claim to "double the capacity of your hard drive", of the Unix file compression utility compress, of the GIF image format, ...
Universal coding schemes, like LZW, do not require advance knowledge of the data and can build such knowledge on the fly.
LZW
LZW compression uses a code table, with 4096 as a common choice for the number of table entries.
Codes 0-255 = the ASCII table.
Initially the code table contains only these first 256 entries, with the remainder of the table being blank.
Compression is achieved by using codes 256 through 4095 to represent sequences of bytes; LZW identifies such sequences and adds them to the code table.
Decoding is achieved by taking each code from the compressed file and translating it through the code table, which is rebuilt in the same manner as during encoding.
LZW code table
[Figure: a code table being extended with entries such as BA, AB, BAA at codes 256, 257, 258, ...]
LZW compression
1. Initialize table with single-character strings
2. P = first input character
3. WHILE not end of input stream
4.     C = next input character
5.     IF P + C is in the string table
6.         P = P + C
7.     ELSE
8.         output the code for P
9.         add P + C to the string table
10.        P = C
11. END WHILE
12. output code for P
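The pseudocode above can be sketched in Python as follows. This is an illustrative version assuming character-oriented input; the 4096-entry cap mentioned earlier is omitted for brevity.

```python
def lzw_compress(data):
    """LZW compression following the pseudocode above.

    The table starts with the 256 single-byte strings; new
    sequences are assigned codes 256, 257, ...
    """
    table = {chr(i): i for i in range(256)}
    next_code = 256
    p = ""
    out = []
    for c in data:
        if p + c in table:          # extend the current match
            p = p + c
        else:                       # emit P, learn P + C, restart at C
            out.append(table[p])
            table[p + c] = next_code
            next_code += 1
            p = c
    if p:
        out.append(table[p])        # flush the final match
    return out

print(lzw_compress("BABAABAAA"))  # [66, 65, 256, 257, 65, 260]
```

The output matches the encoder trace of Example 1 below.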
LZW - Example
Example 1: compress the string BABAABAAA.
Example 1: LZW Compression Step 1
Input: BABAABAAA   After this step: P = A
String table: BA = 256
Encoder output: 66 (for B)
Example 1: LZW Compression Step 2
Input: BABAABAAA   After this step: P = B
String table: BA = 256, AB = 257
Encoder output: 66, 65 (for B, A)
Example 1: LZW Compression Step 3
Input: BABAABAAA   After this step: P = A
String table: BA = 256, AB = 257, BAA = 258
Encoder output: 66, 65, 256 (for B, A, BA)
Example 1: LZW Compression Step 4
Input: BABAABAAA   After this step: P = A
String table: BA = 256, AB = 257, BAA = 258, ABA = 259
Encoder output: 66, 65, 256, 257 (for B, A, BA, AB)
Example 1: LZW Compression Step 5
Input: BABAABAAA   After this step: P = A
String table: BA = 256, AB = 257, BAA = 258, ABA = 259, AA = 260
Encoder output: 66, 65, 256, 257, 65 (for B, A, BA, AB, A)
Example 1: LZW Compression Step 6
Input: BABAABAAA   Input exhausted with P = AA, so output its code.
Final encoder output: 66, 65, 256, 257, 65, 260
LZW Decompression
The LZW decompressor creates the same string table during decompression.
It starts with the first 256 table entries initialized to single characters.
The string table is updated for each character in the input stream, except the first one.
Decoding achieved by reading codes and translating them through the code table being built.
LZW Decompression
1  Initialize table with single-character strings
2  OLD = first input code
3  output translation of OLD
4  WHILE not end of input stream
5      NEW = next input code
6      IF NEW is not in the string table
7          S = translation of OLD
8          S = S + C
9      ELSE
10         S = translation of NEW
11     output S
12     C = first character of S
13     add translation of OLD + C to the string table
14     OLD = NEW
15 END WHILE
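A Python sketch of the decompression pseudocode (illustrative, mirroring the compressor above). The one subtle case, a code not yet in the table, can only arise for a string of the form OLD + first character of OLD.

```python
def lzw_decompress(codes):
    """LZW decompression following the pseudocode above.

    Rebuilds the same string table the compressor built, one
    step behind the encoder.
    """
    table = {i: chr(i) for i in range(256)}
    next_code = 256
    it = iter(codes)
    old = next(it)
    result = table[old]
    for new in it:
        if new in table:
            s = table[new]
        else:                       # code not yet in the table
            s = table[old] + table[old][0]
        result += s
        table[next_code] = table[old] + s[0]   # add OLD + C
        next_code += 1
        old = new
    return result

print(lzw_decompress([66, 65, 256, 257, 65, 260]))  # BABAABAAA
```

This recovers the input of Example 1 from its compressed output, exercising the not-in-table case at code 260.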
Example 2: LZW Decompression 1
Example 2: Use LZW to decompress the output sequence of
Example 1:
<66><65><256><257><65><260>.
LZW Decompression Step 1
<66><65><256><257><65><260>
OLD = 66, output B.  NEW = 65: S = A, output A; C = A; add BA = 256; OLD = 65.
Decoder output so far: B A   String table: BA = 256
LZW Decompression Step 2
<66><65><256><257><65><260>
NEW = 256: S = BA, output BA; C = B; add AB = 257; OLD = 256.
Decoder output so far: B A BA   String table: BA = 256, AB = 257
LZW Decompression Step 3
<66><65><256><257><65><260>
NEW = 257: S = AB, output AB; C = A; add BAA = 258; OLD = 257.
Decoder output so far: B A BA AB   String table: BA = 256, AB = 257, BAA = 258
LZW Decompression Step 4
<66><65><256><257><65><260>
NEW = 65: S = A, output A; C = A; add ABA = 259; OLD = 65.
Decoder output so far: B A BA AB A   String table: BA = 256, AB = 257, BAA = 258, ABA = 259
LZW Decompression Step 5
<66><65><256><257><65><260>
NEW = 260 is not yet in the table: S = translation of OLD + C = AA; output AA; add AA = 260.
Decoder output: B A BA AB A AA = BABAABAAA   String table: BA = 256, AB = 257, BAA = 258, ABA = 259, AA = 260
This algorithm compresses repetitive sequences of data well. Since the codewords are 12 bits, any single encoded character will expand the data size rather than reduce it.
Advantages of LZW over Huffman:
LZW requires no prior information about the input data stream.
LZW can compress the input stream in one single pass.
LZW is very simple, allowing fast execution.
Notes on LZW
Improving LZW
What happens when the dictionary gets too large (i.e., when all 4096 locations have been used)?
Here are some options usually implemented:
Simply stop adding entries and use the table as it is.
Throw the dictionary away when it reaches a certain size.
Throw the dictionary away when it is no longer effective at compression.
Incorporate the advantages of variable-length coding schemes.
Chapter 3
Decompositions of Graphs
Graphs
Graph
Depth-first search in undirected graphs
Depth-first search in directed graphs
Forward edge, cross edge, back edge
DAG
Strongly connected components
Union-Find
Graph
Graph G = (V, E) V = set of vertices E = set of edges ⊆ (V×V)
Representation of Graphs
Two standard ways:
Adjacency lists
Adjacency matrix
(also: objects and pointers)
[Figure: an undirected graph with vertices a(1), b(2), c(3), d(4) and edges a-b, a-c, a-d, b-c, c-d]

Adjacency matrix:
   1 2 3 4
1  0 1 1 1
2  1 0 1 0
3  1 1 0 1
4  1 0 1 0

Adjacency lists:
a: b, c, d
b: a, c
c: a, b, d
d: a, c
Adjacency Lists
Consists of an array Adj of |V| lists, one list per vertex. For u ∈ V, Adj[u] consists of all vertices adjacent to u.
For the directed version of the example (edges a→b, a→c, a→d, b→c, c→d):
a: b, c, d
b: c
c: d
d: (empty)

For the undirected example:
a: b, c, d
b: a, c
c: a, b, d
d: a, c
If weighted, store weights also in adjacency lists.
Adjacency Matrix
A |V| × |V| matrix A. Number the vertices from 1 to |V| in some arbitrary manner. A is then given by: A[i, j] = 1 if (i, j) ∈ E, and 0 otherwise.

Directed example:
   1 2 3 4
1  0 1 1 1
2  0 0 1 0
3  0 0 0 1
4  0 0 0 0

Undirected example:
   1 2 3 4
1  0 1 1 1
2  1 0 1 0
3  1 1 0 1
4  1 0 1 0
A = AT for undirected graphs.
Adjacency Matrix vs Adjacency List
Adjacency matrix: uses a fixed amount of space that depends on the number of vertices, not on the number of edges. The presence of an edge between two vertices can be checked immediately, but finding all neighbors of a vertex requires scanning its entire row.

Adjacency list: represents only the edges that actually exist; no space is reserved for absent edges. It is used more often than the adjacency matrix.
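A small Python sketch of building an adjacency-list representation (the helper function is my own, applied to the 4-vertex example graph a, b, c, d):

```python
from collections import defaultdict

def build_adj_list(edges, directed=False):
    """Adjacency-list representation: one list of neighbors per vertex."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)   # undirected: store the edge both ways
    return adj

# The 4-vertex example graph: edges a-b, a-c, a-d, b-c, c-d
adj = build_adj_list([("a", "b"), ("a", "c"), ("a", "d"),
                      ("b", "c"), ("c", "d")])
print(dict(adj))
# {'a': ['b', 'c', 'd'], 'b': ['a', 'c'], 'c': ['a', 'b', 'd'], 'd': ['a', 'c']}
```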
DFS Depth-first search in undirected graphs
[Figure: DFS on an undirected graph, showing tree edges and back edges (non-tree edges)]
DFS
Depth-first search in directed graphs
DFS
Implementation: a stack (explicit, or via recursion).
Cost: O(|V| + |E|)
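A minimal stack-based DFS sketch in Python, assuming adjacency lists (function name and example graph are illustrative). Each vertex is pushed once and each adjacency list scanned once, giving the O(|V| + |E|) cost stated above.

```python
def dfs(adj, start):
    """Iterative depth-first search with an explicit stack."""
    visited = []
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        visited.append(u)
        # push neighbors in reverse so they pop in list order
        for v in reversed(adj.get(u, [])):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return visited

adj = {"a": ["b", "c", "d"], "b": ["a", "c"],
       "c": ["a", "b", "d"], "d": ["a", "c"]}
print(dfs(adj, "a"))  # ['a', 'b', 'c', 'd']
```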
DFS
DAG
Directed acyclic graphs
Strongly connected components
In a directed graph, u and v are strongly connected iff there is a path from u to v and a path from v to u.
[Figure: a directed graph with 5 strongly connected components]
Breadth-First Traversal
A breadth-first traversal visits a vertex and then each of the vertex's neighbors before advancing.
Cost: O(|V| + |E|)
Breadth-First Traversal
[Figure: BFS wavefront from source S; vertices are labeled with their level (1, 2, 3) and are in one of three states: undiscovered, discovered, finished]
Implementation?
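A queue is the natural implementation. Below is a Python sketch; the adjacency lists encode one consistent reading of the example graph with vertices r through y (an assumption, since the figure itself did not survive extraction), and the resulting distances match the example that follows.

```python
from collections import deque

def bfs(adj, s):
    """Queue-based BFS; returns the level (distance) of each
    reachable vertex. Each vertex enters the queue once and each
    edge is scanned once: O(|V| + |E|)."""
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj.get(u, []):
            if v not in dist:        # undiscovered
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

adj = {"s": ["w", "r"], "w": ["s", "t", "x"], "r": ["s", "v"],
       "t": ["w", "x", "u"], "x": ["w", "t", "u", "y"],
       "v": ["r"], "u": ["t", "x", "y"], "y": ["x", "u"]}
print(bfs(adj, "s"))
# {'s': 0, 'w': 1, 'r': 1, 't': 2, 'x': 2, 'v': 2, 'u': 3, 'y': 3}
```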
Example (BFS)
[Figure: BFS from source s on a graph with vertices r, s, t, u, v, w, x, y; the slides show the queue Q and the distance labels as they evolve]
Queue contents step by step:
s → w r → r t x → t x v → x v u → v u y → u y → y → (empty)
Final distances: s = 0; r = w = 1; t = x = v = 2; u = y = 3
The edges followed during the traversal form the breadth-first tree.
BFS : Application
Solves the shortest path problem for unweighted graphs: the level labels computed by BFS are the shortest path distances from the source s.
[Figure: the BFS distances from s on the example graph]
Chapter 4
Paths in Graphs
Dijkstra’s algorithm (4.4)
An adaptation of BFS.

Example
[Figure: a directed graph with source s and vertices u, v, x, y; edge weights 10, 1, 9, 2, 4, 6, 5, 2, 3, 7 as labeled]
Evolution of the distance estimates (s, u, v, x, y):
Initially: (0, ∞, ∞, ∞, ∞)
After processing s: (0, 10, ∞, 5, ∞)
After processing x (estimate 5): (0, 8, 14, 5, 7)
After processing y (estimate 7): (0, 8, 13, 5, 7)
After processing u (estimate 8): (0, 8, 9, 5, 7)
After processing v (estimate 9): done; final distances (0, 8, 9, 5, 7)
Dijkstra’s Algorithm
Assumes no negative-weight edges.
Maintains a set S of vertices whose shortest path from s has been determined.
Repeatedly selects u in V - S with minimum shortest path estimate (greedy choice). Store V - S in a priority queue Q.
Cost: O((|V| + |E|) log |V|) with a binary heap.
Dijkstra’s Algorithm
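A Python sketch with a binary-heap priority queue. The adjacency list below is one consistent reading of the example figure's edge weights (an assumption on my part, since the figure did not survive extraction); its final distances agree with the trace above.

```python
import heapq

def dijkstra(adj, s):
    """Dijkstra's algorithm. adj maps each vertex to a list of
    (neighbor, weight) pairs; all weights must be non-negative.
    Cost: O((|V| + |E|) log |V|) with a binary heap."""
    dist = {s: 0}
    done = set()
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue                 # stale heap entry
        done.add(u)
        for v, w in adj.get(u, []):
            if v not in dist or d + w < dist[v]:
                dist[v] = d + w      # relax edge (u, v)
                heapq.heappush(pq, (dist[v], v))
    return dist

# One reading of the example graph's weights
adj = {"s": [("u", 10), ("x", 5)],
       "u": [("v", 1), ("x", 2)],
       "x": [("u", 3), ("v", 9), ("y", 2)],
       "y": [("v", 6), ("s", 7)],
       "v": [("y", 4)]}
print(dijkstra(adj, "s"))  # distances s=0, u=8, v=9, x=5, y=7
```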
Minimal Spanning Trees
Minimal Spanning Tree (MST) Problem:
Input: an undirected, connected graph G.
Output: the subgraph of G that keeps the vertices connected and has minimum total cost (the sum of the values of the edges in the subset is minimized).
MST Example
Step by Step Greedy Approach
Initially, the set of chosen candidates is empty.
At each step, add to this set the best remaining candidate; this is guided by the selection function.
If the increased set is no longer feasible, remove the candidate just added; otherwise it stays.
Each time the set of chosen candidates is increased, check whether the current set now constitutes a solution to the problem.
When a greedy algorithm works correctly, the first solution found in this way is optimal.
MST Example
Greedy Algorithms
Kruskal's algorithm. Start with T = ∅. Consider edges in ascending order of cost. Insert edge e into T unless doing so would create a cycle.
Prim's algorithm. Start with some root node s and greedily grow a tree T from s outward. At each step, add the cheapest edge e to T that has exactly one endpoint in T.
Reverse-Delete algorithm. Start with T = E. Consider edges in descending order of cost. Delete edge e from T unless doing so would disconnect T.
All three algorithms produce an MST.
Kruskal’s MST Algorithm
1. Choose each vertex to be its own MST;
2. merge the two MSTs that have the cheapest edge between them;
3. repeat step 2 until there are no trees left to merge.
Implementation: Union-Find, O(|E| log |V|).
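The merge step is exactly a union-find union. An illustrative Python sketch (the edge list and vertex numbering are a made-up example, not from the slides):

```python
def kruskal(n, edges):
    """Kruskal's MST via union-find; edges are (cost, u, v) with
    vertices numbered 0..n-1. Cost: O(|E| log |V|)."""
    parent = list(range(n))

    def find(x):                     # find root, halving paths as we go
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for cost, u, v in sorted(edges):  # ascending order of cost
        ru, rv = find(u), find(v)
        if ru != rv:                  # no cycle: merge the two trees
            parent[ru] = rv
            mst.append((cost, u, v))
    return mst

edges = [(4, 0, 1), (2, 0, 2), (5, 1, 2), (10, 1, 3), (3, 2, 3)]
mst = kruskal(4, edges)
print(sum(c for c, _, _ in mst))  # 9
```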
Prim’s MST Algorithm
A greedy algorithm.
1. Choose any vertex N to be the MST;
2. grow the tree by picking the least-cost edge connected to any vertex in the MST;
3. repeat step 2 until the MST includes all the vertices.
Prim’s MST demo
http://www-b2.is.tokushimau.ac.jp/~ikeda/suuri/dijkstra/PrimApp.shtml?demo1
O(|E| log |V|)
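A heap-based Python sketch of Prim's algorithm (the example graph is made up for illustration; it has the same MST cost as the Kruskal example above):

```python
import heapq

def prim(adj, start):
    """Prim's MST: grow a tree from start, always adding the
    cheapest edge with exactly one endpoint in the tree.
    O(|E| log |V|) with a binary heap."""
    in_tree = {start}
    pq = [(w, start, v) for v, w in adj[start]]
    heapq.heapify(pq)
    mst = []
    while pq:
        w, u, v = heapq.heappop(pq)
        if v in in_tree:
            continue                 # both endpoints already in the tree
        in_tree.add(v)
        mst.append((u, v, w))
        for x, wx in adj[v]:
            if x not in in_tree:
                heapq.heappush(pq, (wx, v, x))
    return mst

adj = {"a": [("b", 4), ("c", 2)], "b": [("a", 4), ("c", 5), ("d", 10)],
       "c": [("a", 2), ("b", 5), ("d", 3)], "d": [("b", 10), ("c", 3)]}
print(sum(w for _, _, w in prim(adj, "a")))  # 9
```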
Coin Changing
Goal. Given currency denominations 1, 5, 10, 25, 100, devise a method to pay an amount to a customer using the fewest number of coins.
Ex: 34¢.
Cashier's algorithm. At each iteration, add the coin of the largest value that does not take us past the amount to be paid.
Ex: $2.89.
Coin-Changing: Greedy Algorithm
Cashier's algorithm. At each iteration, add coin of the largest value that does not take us past the amount to be paid.
Q. Is cashier's algorithm optimal?
Sort coin denominations by value: c1 < c2 < … < cn.

S ← ∅
while (x ≠ 0) {
    let k be the largest integer such that ck ≤ x
    if no such k exists
        return "no solution found"
    x ← x - ck
    S ← S ∪ {k}          // coins selected
}
return S
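The cashier's algorithm is short enough to sketch directly in Python (illustrative names; the postal-denomination call previews the counterexample discussed next):

```python
def cashier(x, denominations=(100, 25, 10, 5, 1)):
    """Cashier's algorithm: repeatedly take the largest coin ≤ x.
    Optimal for U.S. coinage, but not for arbitrary denominations."""
    coins = []
    for c in sorted(denominations, reverse=True):
        while x >= c:
            x -= c
            coins.append(c)
    if x != 0:
        return None                  # no solution with these denominations
    return coins

print(cashier(34))   # [25, 5, 1, 1, 1, 1]
print(cashier(289))  # [100, 100, 25, 25, 25, 10, 1, 1, 1, 1]
# Postal denominations: greedy gives 5 coins for 140, but 70 + 70 is better
print(cashier(140, (1, 10, 21, 34, 37, 44, 70, 100)))  # [100, 37, 1, 1, 1]
```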
Coin-Changing: Analysis of Greedy Algorithm
Observation. Greedy algorithm is sub-optimal for US postal denominations: 1, 10, 21, 34, 37, 44, 70, 100, 350, 1225, 1500.
Counterexample. 140¢. Greedy: 100, 37, 1, 1, 1. Optimal: 70, 70.
Greedy algorithm failed!
Coin-Changing: Analysis of Greedy Algorithm
Theorem. Greedy is optimal for U.S. coinage: 1, 5, 10, 25, 100.
Proof (by induction on x).
Let ck be the kth smallest coin. Consider the optimal way to change x with ck ≤ x < ck+1: greedy takes coin k. We claim that any optimal solution must also take coin k:
if not, it needs enough coins of type c1, …, ck-1 to add up to x;
the table below shows no optimal solution can do this.
The problem then reduces to coin-changing x - ck cents, which, by induction, is optimally solved by the greedy algorithm.

k   ck    All optimal solutions must satisfy   Max value of coins 1, …, k-1 in any OPT
1   1     P ≤ 4                                -
2   5     N ≤ 1                                4
3   10    N + D ≤ 2                            4 + 5 = 9
4   25    Q ≤ 3                                20 + 4 = 24
5   100   no limit                             75 + 24 = 99
Coin-Changing: Analysis of Greedy Algorithm (cont.)
Theorem. Greedy is optimal for U.S. coinage: 1, 5, 10, 25, 100. Consider the optimal way to change x with ck ≤ x < ck+1: greedy takes coin k, and any optimal solution must also take coin k.

Without the nickel (denominations 1, 10, 25, 100), the corresponding table becomes:

k   ck    All optimal solutions must satisfy   Max value of coins 1, …, k-1 in any OPT
1   1     P ≤ 9                                -
2   10    P + D ≤ 8                            9
3   25    Q ≤ 3                                40 + 4 = 44
4   100   no limit                             75 + 44 = 119

Here the argument breaks down: at k = 3, the smaller coins can add up to 44 ≥ 25, so an optimal solution need not take the quarter. Indeed, greedy fails without the nickel: for 30¢ it pays 25 + 1 + 1 + 1 + 1 + 1 (6 coins) instead of 10 + 10 + 10 (3 coins).
Kevin’s problem
Knapsack Problem
Knapsack Problem
• 0-1 knapsack: A thief robbing a store finds n items; the ith item is worth vi dollars and weighs wi pounds, where vi and wi are integers. He wants to take as valuable a load as possible, but he can carry at most W pounds. Which items should he take? (Solved by dynamic programming.)
• Fractional knapsack: Same setup, but the thief can take fractions of items instead of making a binary (0-1) choice for each item.
Fractional Knapsack Problem
Greedy method: repeatedly add the item with the maximum ratio vi/wi.

Item  Value  Weight
1     1      1
2     6      2
3     18     5
4     22     6
5     28     7
Let W = 11.

Answer for the 0-1 knapsack problem: OPT = {4, 3}, value = 22 + 18 = 40.
The fractional greedy does better: take all of item 5 (weight 7) and 4/6 of item 4, for value 28 + 22·(4/6) ≈ 42.7 > 40.
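A Python sketch of the greedy fractional-knapsack rule, applied to the item table above (function name is my own):

```python
def fractional_knapsack(items, W):
    """Greedy fractional knapsack: take items in decreasing
    value/weight ratio, splitting the last one if needed."""
    total = 0.0
    remaining = W
    for value, weight in sorted(items, key=lambda it: it[0] / it[1],
                                reverse=True):
        if remaining <= 0:
            break
        take = min(weight, remaining)    # how much of this item fits
        total += value * take / weight
        remaining -= take
    return total

items = [(1, 1), (6, 2), (18, 5), (22, 6), (28, 7)]  # (value, weight)
print(fractional_knapsack(items, 11))  # 42.666... > 40, the 0-1 optimum
```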
Activity-selection Problem
Activity-selection Problem
Input: Set S of n activities, a1, a2, …, an.
si = start time of activity i; fi = finish time of activity i.
Output: Subset A with the maximum number of compatible activities. Two activities are compatible if their intervals don't overlap.
[Figure: activities a through h drawn on a timeline from 0 to 11]
Interval Scheduling: Greedy Algorithms
Greedy template. Consider jobs in some order. Take each job provided it's compatible with the ones already taken.
Earliest start time: consider jobs in ascending order of start time sj.
Earliest finish time: consider jobs in ascending order of finish time fj.
Shortest interval: consider jobs in ascending order of interval length fj - sj.
Fewest conflicts: for each job, count the number of conflicting jobs cj; schedule in ascending order of cj.
…
Greedy algorithm. Consider jobs in increasing order of finish time. Take each job provided it's compatible with the ones already taken.
Implementation. O(n log n). Remember job j* that was added last to A. Job j is compatible with A if sj ≥ fj*.
Interval Scheduling: Greedy Algorithm
Sort jobs by finish times so that f1 ≤ f2 ≤ ... ≤ fn.

A ← ∅
for j = 1 to n {
    if (job j compatible with A)
        A ← A ∪ {j}          // jobs selected
}
return A
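The algorithm is a few lines in Python; the job list here is a made-up example, with jobs as (start, finish) pairs:

```python
def interval_schedule(jobs):
    """Earliest-finish-time greedy: sort by finish time, take each
    job compatible with the last one taken. O(n log n)."""
    A = []
    last_finish = float("-inf")
    for s, f in sorted(jobs, key=lambda j: j[1]):
        if s >= last_finish:         # compatible with jobs taken so far
            A.append((s, f))
            last_finish = f          # remember the last job added
    return A

jobs = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 8), (5, 9), (6, 10), (8, 11)]
print(interval_schedule(jobs))  # [(1, 4), (5, 7), (8, 11)]
```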
Interval Scheduling: Analysis
Theorem. The greedy algorithm is optimal.
Proof (by contradiction):
Assume greedy is not optimal, and let's see what happens.
Let i1, …, ik denote the jobs selected by greedy.
Let j1, …, jm denote the jobs in an optimal solution with i1 = j1, i2 = j2, …, ir = jr for the largest possible value of r.
Why not replace job jr+1 with job ir+1? Since greedy picks the compatible job with the earliest finish time, job ir+1 finishes no later than jr+1. Swapping jr+1 for ir+1 keeps the solution optimal but makes it agree with greedy on the first r + 1 jobs, contradicting the maximality of r.
Weighted Interval Scheduling
Weighted interval scheduling problem:
Job j starts at sj, finishes at fj, and has weight or value vj.
Two jobs are compatible if they don't overlap.
Goal: find a maximum-weight subset of mutually compatible jobs.
[Figure: weighted jobs a through h on a timeline from 0 to 11]
Greedy algorithm?
Weighted Interval Scheduling
Cost?
Set Covering - one of Karp's 21 NP-complete problems
Given: a set of elements B (the universe), and a set S of n sets {Si} whose union equals B.
Output: a cover of B, i.e. a subset of S whose union = B.
Cost: the number of sets picked.
Goal: a minimum-cost cover.

Example: how many Walmart centers should Walmart build in Ohio? For each town t, let St = {towns within 30 miles of t}; a Walmart center at t covers all towns in St.
Set Covering - Greedy approach
while (not all elements covered)
    pick the Si with the largest number of uncovered elements

Claim: if the optimal cover has k sets, greedy uses at most k ln n sets.
Proof: Let nt be the number of elements not covered after t iterations. The remaining elements are covered by the k optimal sets, so some set covers at least nt/k of them:
nt+1 ≤ nt - nt/k = nt(1 - 1/k), hence nt+1 ≤ n0(1 - 1/k)^(t+1) ≤ n0 e^-(t+1)/k,
and nt < 1 when t = k ln n.
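An illustrative Python sketch of the greedy rule, with a toy version of the Walmart example (the towns and coverage sets are made up):

```python
def greedy_set_cover(universe, sets):
    """Greedy set cover: repeatedly pick the set covering the most
    still-uncovered elements. Uses at most ~k ln n sets when the
    optimum uses k."""
    uncovered = set(universe)
    cover = []
    while uncovered:
        # the set with the largest number of uncovered elements
        best = max(sets, key=lambda s: len(uncovered & set(s)))
        if not uncovered & set(best):
            break                    # remaining elements cannot be covered
        cover.append(best)
        uncovered -= set(best)
    return cover

towns = {1, 2, 3, 4, 5, 6}
ranges = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 4}]
print(greedy_set_cover(towns, ranges))  # [{1, 2, 3}, {4, 5, 6}]
```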
Dynamic programming