Synapseindia dotnet development chapter 8-0 dynamic programming

Chapter 8Chapter 8

Dynamic ProgrammingDynamic Programming

2

Dynamic Programming

DDynamic Programming ynamic Programming is a general algorithm design technique is a general algorithm design technique for solving problems defined by or formulated as recurrences with for solving problems defined by or formulated as recurrences with overlapping subinstancesoverlapping subinstances

• Invented by American mathematician Richard Bellman in the Invented by American mathematician Richard Bellman in the 1950s to solve optimization problems and later assimilated by CS1950s to solve optimization problems and later assimilated by CS

• “ “Programming” here means “planning”Programming” here means “planning”

• Main idea:Main idea:- set up a recurrence relating a solution to a larger instance to set up a recurrence relating a solution to a larger instance to

solutions of some smaller instancessolutions of some smaller instances- solve smaller instances once- solve smaller instances once- record solutions in a table record solutions in a table - extract solution to the initial instance from that tableextract solution to the initial instance from that table

3

Example: Fibonacci numbers

• Recall definition of Fibonacci numbers:Recall definition of Fibonacci numbers:

FF((nn)) = F = F((nn-1)-1) + F + F((nn-2)-2)FF(0)(0) = = 00FF(1)(1) = = 11

• Computing the Computing the nnthth Fibonacci number recursively (top-down): Fibonacci number recursively (top-down):

FF((nn))

FF((n-n-1) 1) + F + F((n-n-2)2)

FF((n-n-2) 2) + F+ F((n-n-3) 3) FF((n-n-3) 3) + F+ F((n-n-4)4)

......

4

Example: Fibonacci numbers (cont.)

Computing the Computing the nnthth Fibonacci number using bottom-up iteration and Fibonacci number using bottom-up iteration and recording results:recording results:

FF(0)(0) = = 00 FF(1)(1) = = 11 FF(2)(2) = = 1+0 = 11+0 = 1 … … FF((nn-2) = -2) = FF((nn-1) = -1) = FF((nn) = ) = FF((nn-1)-1) + F + F((nn-2)-2)

Efficiency:Efficiency: - time- time - space- space

0

1

1

. . .

F(n-2)

F(n-1)

F(n)

n n

What if we solve it recursively?

5

Examples of DP algorithms

• Computing a binomial coefficientComputing a binomial coefficient

• Longest common subsequenceLongest common subsequence

• Warshall’s algorithm for transitive closureWarshall’s algorithm for transitive closure

• Floyd’s algorithm for all-pairs shortest pathsFloyd’s algorithm for all-pairs shortest paths

• Constructing an optimal binary search treeConstructing an optimal binary search tree

• Some instances of difficult discrete optimization problems:Some instances of difficult discrete optimization problems: - traveling salesman- traveling salesman - knapsack- knapsack

6

Computing a binomial coefficient by DP

Binomial coefficients are coefficients of the binomial formula:Binomial coefficients are coefficients of the binomial formula:((a + ba + b))nn = = CC((nn,0),0)aannbb0 0 + . . . + + . . . + CC((nn,,kk))aan-kn-kbbkk + . . . + . . . + CC((nn,,nn))aa00bbnn

Recurrence: Recurrence: CC((nn,,kk) = ) = CC((n-n-1,1,kk) + ) + CC((nn-1,-1,kk-1) for -1) for n > k n > k > 0> 0 CC((nn,0) = 1, ,0) = 1, CC((nn,,nn) = 1 for ) = 1 for n n ≥≥ 0 0 Value of Value of CC((nn,,kk) can be computed by filling a table:) can be computed by filling a table:

0 1 2 . . . 0 1 2 . . . kk-1 -1 kk 0 10 1 1 1 11 1 1 .. .. .. n-n-1 1 CC((n-n-1,1,kk-1) -1) CC((n-n-1,1,kk) ) nn C C((nn,,kk) )

7

Computing C(n,k): pseudocode and analysis

Time efficiency: Time efficiency: ΘΘ ((nknk))

Space efficiency: Space efficiency: ΘΘ ((nknk))

8

Knapsack Problem by DP

Given n items of integer weights: w1 w2 … wn

values: v1 v2 … vn

a knapsack of integer capacity W

find most valuable subset of the items that fit into the knapsack

Consider instance defined by first i items and capacity j (j ≤ W).

Let V[i,j] be optimal value of such an instance. Then max {V[i-1,j], vi + V[i-1,j- wi]} if j- wi ≥ 0

V[i,j] = V[i-1,j] if j- wi < 0

Initial conditions: V[0,j] = 0 and V[i,0] = 0

{

9

Knapsack Problem by DP (example)

Example: Knapsack of capacity W = 5item weight value 1 2 $12 2 1 $10 3 3 $20 4 2 $15 capacity j

0 1 2 3 4 5 0

w1 = 2, v1= 12 1

w2 = 1, v2= 10 2

w3 = 3, v3= 20 3

w4 = 2, v4= 15 4 ?

0 0 0

0 0 12

0 10 12 22 22 22

0 10 12 22 30 32

0 10 15 25 30 37

Backtracing finds the actual optimal subset, i.e. solution.

10

Knapsack Problem by DP (pseudocode)

Algorithm DPKnapsack(w[1..n], v[1..n], W)var V[0..n,0..W], P[1..n,1..W]: intfor j := 0 to W do

V[0,j] := 0 for i := 0 to n do V[i,0] := 0 for i := 1 to n do

for j := 1 to W doif w[i] ≤ j and v[i] + V[i-1,j-w[i]] > V[i-1,j]

thenV[i,j] := v[i] + V[i-1,j-w[i]]; P[i,j] := j-

w[i]else

V[i,j] := V[i-1,j]; P[i,j] := jreturn V[n,W] and the optimal subset by backtracing

Running time and space: O(nW).

11

Longest Common Subsequence (LCS)

A subsequence of a sequence/string S is obtained by deleting zero or more symbols from S. For example, the following are some subsequences of “president”: pred, sdn, predent. In other words, the letters of a subsequence of S appear in order in S, but they are not required to be consecutive.

The longest common subsequence problem is to find a maximum length common subsequence between two sequences.

12

LCS

For instance,

Sequence 1: president

Sequence 2: providence

Its LCS is priden.

president

providence

13

LCS

Another example:

Sequence 1: algorithm

Sequence 2: alignment

One of its LCS is algm.

14

How to compute LCS?

Let A=a1a2…am and B=b1b2…bn . len(i, j): the length of an LCS between

a1a2…ai and b1b2…bj

With proper initializations, len(i, j) can be computed as follows.

,

. and 0, if)),1(),1,(max(

and 0, if1)1,1(

,0or 0 if0

),(

≠>−−=>+−−

===

ji

ji

bajijilenjilen

bajijilen

ji

jilen

15

procedure LCS-Length(A, B)

1. for i ← 0 to m do len(i,0) = 0

2. for j ← 1 to n do len(0,j) = 0

3. for i ← 1 to m do

4. for j ← 1 to n do

5. if ji ba = then

=

+−−=" "),(

1)1,1(),(

jiprev

jilenjilen

6. else if )1,(),1( −≥− jilenjilen

7. then

=

−=" "),(

),1(),(

jiprev

jilenjilen

8. else

=

−=" "),(

)1,(),(

jiprev

jilenjilen

9. return len and prev

16

i j 0 1 p

2 r

3 o

4 v

5 i

6 d

7 e

8 n

9 c

10 e

0 0 0 0 0 0 0 0 0 0 0 0

1 p 2

0 1 1 1 1 1 1 1 1 1 1

2 r 0 1 2 2 2 2 2 2 2 2 2

3 e 0 1 2 2 2 2 2 3 3 3 3

4 s 0 1 2 2 2 2 2 3 3 3 3

5 i 0 1 2 2 2 3 3 3 3 3 3

6 d 0 1 2 2 2 3 4 4 4 4 4

7 e 0 1 2 2 2 3 4 5 5 5 5

8 n 0 1 2 2 2 3 4 5 6 6 6

9 t 0 1 2 2 2 3 4 5 6 6 6

Running time and memory: O(mn) and O(mn).

17

procedure Output-LCS(A, prev, i, j)

1 if i = 0 or j = 0 then return

2 if prev(i, j)=” “ then

−−−

ia

jiprevALCSOutput

print

)1,1,,(

3 else if prev(i, j)=” “ then Output-LCS(A, prev, i-1, j)

4 else Output-LCS(A, prev, i, j-1)

The backtracing algorithm

18

i j 0 1 p

2 r

3 o

4 v

5 i

6 d

7 e

8 n

9 c

10 e

0 0 0 0 0 0 0 0 0 0 0 0

1 p 2

0 1 1 1 1 1 1 1 1 1 1

2 r 0 1 2 2 2 2 2 2 2 2 2

3 e 0 1 2 2 2 2 2 3 3 3 3

4 s 0 1 2 2 2 2 2 3 3 3 3

5 i 0 1 2 2 2 3 3 3 3 3 3

6 d 0 1 2 2 2 3 4 4 4 4 4

7 e 0 1 2 2 2 3 4 5 5 5 5

8 n 0 1 2 2 2 3 4 5 6 6 6

9 t 0 1 2 2 2 3 4 5 6 6 6

Output: priden

19

Warshall’s Algorithm: Transitive Closure

• Computes the transitive closure of a relationComputes the transitive closure of a relation

• Alternatively: existence of all nontrivial paths in a digraphAlternatively: existence of all nontrivial paths in a digraph

• Example of transitive closure:Example of transitive closure:

3

42

1

0 0 1 01 0 0 10 0 0 00 1 0 0

0 0 1 01 1 11 1 10 0 0 011 1 1 11 1

3

42

1

20

Warshall’s Algorithm

Constructs transitive closure Constructs transitive closure TT as the last matrix in the sequence as the last matrix in the sequence of of nn-by--by-n n matrices matrices RR(0)(0), … ,, … , RR((kk)), … ,, … , RR((nn)) wherewhereRR((kk))[[ii,,jj] = 1 iff there is nontrivial path from ] = 1 iff there is nontrivial path from ii to to jj with only the with only the first first k k vertices allowed as intermediate vertices allowed as intermediate Note that Note that RR(0) (0) = = A A (adjacency matrix)(adjacency matrix),, RR((nn))

= T = T (transitive closure)(transitive closure)

3

42

13

42

13

42

13

42

1

R(0)

0 0 1 01 0 0 10 0 0 00 1 0 0

R(1)

0 0 1 01 0 11 10 0 0 00 1 0 0

R(2)

0 0 1 01 0 1 10 0 0 01 1 1 1 11 1

R(3)

0 0 1 01 0 1 10 0 0 01 1 1 1

R(4)

0 0 1 01 11 1 10 0 0 01 1 1 1

3

42

1

21

Warshall’s Algorithm (recurrence)

On theOn the k- k-th iteration, the algorithm determines for every pair of th iteration, the algorithm determines for every pair of vertices vertices i, ji, j if a path exists from if a path exists from i i andand j j with just vertices 1,…,with just vertices 1,…,k k allowedallowed asas intermediateintermediate

RR((kk-1)-1)[[i,ji,j]] (path using just 1 ,…, (path using just 1 ,…,k-k-1)1) RR((kk))[[i,ji,j] =] = oror

RR((kk-1)-1)[[i,ki,k] and ] and RR((kk-1)-1)[[k,jk,j]] (path from (path from i i to to kk and from and from kk to to jj using just 1 ,…,using just 1 ,…,k-k-1)1)

i

j

k

{

Initial condition?

22

Warshall’s Algorithm (matrix generation)

Recurrence relating elements Recurrence relating elements RR((kk)) to elements of to elements of RR((kk-1)-1) is: is:

RR((kk))[[i,ji,j] = ] = RR((kk-1)-1)[[i,ji,j] or] or ((RR((kk-1)-1)[[i,ki,k] and ] and RR((kk-1)-1)[[k,jk,j])])

It implies the following rules for generating It implies the following rules for generating RR((kk)) from from RR((kk-1)-1)::

Rule 1Rule 1 If an element in row If an element in row i i and column and column jj is 1 in is 1 in RR((k-k-1)1), , it remains 1 in it remains 1 in RR((kk))

Rule 2 Rule 2 If an element in row If an element in row i i and column and column jj is 0 in is 0 in RR((k-k-1)1),, it has to be changed to 1 in it has to be changed to 1 in RR((kk)) if and only if if and only if the element in its row the element in its row ii and column and column kk and the element and the element in its column in its column jj and row and row kk are both 1’s in are both 1’s in RR((k-k-1)1)

23

Warshall’s Algorithm (example)

3

42

1 0 0 1 01 0 0 10 0 0 00 1 0 0

R(0) =

0 0 1 01 0 1 10 0 0 00 1 0 0

R(1) =

0 0 1 01 0 1 10 0 0 01 1 1 1

R(2) =

0 0 1 01 0 1 10 0 0 01 1 1 1

R(3) =

0 0 1 01 1 1 10 0 0 01 1 1 1

R(4) =

24

Warshall’s Algorithm (pseudocode and analysis)

Time efficiency: Time efficiency: ΘΘ ((nn33))

Space efficiency: Matrices can be written over their predecessorsSpace efficiency: Matrices can be written over their predecessors

(with some care), so it’s (with some care), so it’s ΘΘ((nn^2).^2).

25

Floyd’s Algorithm: All pairs shortest paths

Problem: In a weighted (di)graph, find shortest paths betweenProblem: In a weighted (di)graph, find shortest paths between every pair of vertices every pair of vertices

Same idea: construct solution through series of matrices Same idea: construct solution through series of matrices DD(0)(0), …,, …, D D ((nn)) using increasing subsets of the vertices allowed using increasing subsets of the vertices allowed as intermediate as intermediate

Example:Example: 3

42

14

16

1

5

3

0 ∞ 4 ∞ 1 0 4 3 ∞ ∞ 0 ∞6 5 1 0

26

Floyd’s Algorithm (matrix generation)

On theOn the k- k-th iteration, the algorithm determines shortest paths th iteration, the algorithm determines shortest paths between every pair of vertices between every pair of vertices i, j i, j that use only vertices among 1,that use only vertices among 1,…,…,k k as intermediateas intermediate

DD((kk))[[i,ji,j] = min {] = min {DD((kk-1)-1)[[i,ji,j], ], DD((kk-1)-1)[[i,ki,k] + ] + DD((kk-1)-1)[[k,jk,j]}]}

i

j

k

DD((kk-1)-1)[[i,ji,j]]

DD((kk-1)-1)[[i,ki,k]]

DD((kk-1)-1)[[k,jk,j]]

Initial condition?

27

Floyd’s Algorithm (example)

0 ∞ 3 ∞ 2 0 ∞ ∞∞ 7 0 16 ∞ ∞ 0

D(0) =

0 ∞ 3 ∞ 2 0 5 ∞∞ 7 0 16 ∞ 9 0

D(1) =

0 ∞ 3 ∞2 0 5 ∞9 7 0 16 ∞ 9 0

D(2) =

0 10 3 42 0 5 69 7 0 16 16 9 0

D(3) =

0 10 3 42 0 5 67 7 0 16 16 9 0

D(4) =

31

3

2

6 7

4

1 2

28

Floyd’s Algorithm (pseudocode and analysis)

Time efficiency: Time efficiency: ΘΘ ((nn33))

Space efficiency: Matrices can be written over their predecessorsSpace efficiency: Matrices can be written over their predecessors

Note: Works on graphs with negative edges but without negative cycles. Note: Works on graphs with negative edges but without negative cycles. Shortest paths themselves can be found, too. Shortest paths themselves can be found, too. How?How?

If D[i,k] + D[k,j] < D[i,j] then P[i,j] k

Since the superscripts k or k-1 make no difference to D[i,k] and D[k,j].

29

Optimal Binary Search Trees

Problem: Given Problem: Given n n keys keys aa1 1 < …< < …< aan n and probabilities and probabilities pp11,, …, …, ppnn

searching for them, find a BST with a minimumsearching for them, find a BST with a minimum

average number of comparisons in successful search. average number of comparisons in successful search.

Since total number of BSTs with Since total number of BSTs with n n nodes is given by C(2nodes is given by C(2nn,,nn)/)/((nn+1), which grows exponentially, brute force is hopeless. +1), which grows exponentially, brute force is hopeless.

Example: What is an optimal BST for keys Example: What is an optimal BST for keys AA, , BB,, C C, and , and D D withwith search probabilities 0.1, 0.2, 0.4, and 0.3, respectively? search probabilities 0.1, 0.2, 0.4, and 0.3, respectively?

D

A

B

C

Average # of comparisons = 1*0.4 + 2*(0.2+0.3) + 3*0.1 = 1.7

30

DP for Optimal BST Problem

Let Let CC[[i,ji,j] be minimum average number of comparisons made in ] be minimum average number of comparisons made in T[T[i,ji,j], optimal BST for keys ], optimal BST for keys aaii < …< < …< aajj ,, where 1 ≤ where 1 ≤ i i ≤ ≤ j j ≤ ≤ n. n.

Consider optimal BST among all BSTs with some Consider optimal BST among all BSTs with some aak k ((i i ≤ ≤ k k ≤≤ jj ) )

as their root; T[as their root; T[i,ji,j] is the best among them. ] is the best among them.

a

OptimalBST for

a , ..., a

OptimalBST for

a , ..., ai

k

k-1 k+1 j

CC[[i,ji,j] =] =

min {min {ppk k · · 1 +1 +

∑ ∑ ppss (level (level aas s in T[in T[i,k-i,k-1] +1)1] +1) ++

∑ ∑ ppss (level (level aass in T[in T[k+k+11,j,j] +1)}] +1)}

i i ≤ ≤ k k ≤≤ jj

s s == ii

k-k-11

s =s =k+k+11

jj

31

goal0

0

C[i,j]

0

1

n+1

0 1 n

p 1

p2

np

i

j

DP for Optimal BST Problem (cont.)

After simplifications, we obtain the recurrence for After simplifications, we obtain the recurrence for CC[[i,ji,j]:]:

CC[[i,ji,j] = ] = min {min {CC[[ii,,kk-1] + -1] + CC[[kk+1,+1,jj]} + ∑ ]} + ∑ ppss forfor 1 1 ≤≤ i i ≤≤ j j ≤≤ nn

CC[[i,ii,i] = ] = ppi i for 1 for 1 ≤≤ i i ≤≤ j j ≤≤ nn

s s == ii

jj

i i ≤≤ k k ≤≤ jj

32

Example: key A B C D

probability 0.1 0.2 0.4 0.3

The tables below are filled diagonal by diagonal: the left one is filled The tables below are filled diagonal by diagonal: the left one is filled using the recurrence using the recurrence CC[[i,ji,j] = ] = min {min {CC[[ii,,kk-1] + -1] + CC[[kk+1,+1,jj]} + ∑ ]} + ∑ pps , s , CC[[i,ii,i] = ] = ppi i ;;

the right one, for trees’ roots, records the right one, for trees’ roots, records kk’s values giving the minima’s values giving the minima 00 11 22 33 44

11 00 .1.1 .4.4 1.11.1 1.71.7

22 00 .2.2 .8.8 1.41.4

33 00 .4.4 1.01.0

44 00 .3.3

55 00

00 11 22 33 44

11 11 22 33 33

22 22 33 33

33 33 33

44 44

55

i i ≤ ≤ k k ≤≤ jj s s == ii

jj

optimal BSToptimal BST

B

A

C

D

i i jji i jj

33

Optimal Binary Search Trees

34

Analysis DP for Optimal BST Problem

Time efficiency: Time efficiency: ΘΘ((nn33) but can be reduced to ) but can be reduced to ΘΘ((nn22)) by takingby taking advantage of monotonicity of entries in the advantage of monotonicity of entries in the root table, i.e., root table, i.e., RR[[i,ji,j] is always in the range ] is always in the range between between RR[[i,ji,j-1] and R[-1] and R[ii+1,j]+1,j]

Space efficiency: Space efficiency: ΘΘ((nn22))

Method can be expanded to include unsuccessful searchesMethod can be expanded to include unsuccessful searches

Technology

Synapseindia dotnet development chapter 8-0 dynamic programming