Lecture 8: Dynamic Programming
Shang-Hua Teng
Longest Common Subsequence
• Biologists need to measure how similar strands of DNA are to determine how closely related an organism is to another.
• They do this by considering DNA as strings of letters A,C,G,T and then comparing similarities in the strings.
• Formally they look at common subsequences in the strings.
• Example: X = AGTCAACGTT, Y = GTTCGACTGTG
• Both S = AGTG and S’ = GTCACGT are subsequences of X and of Y
• How do we find these efficiently?
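A minimal sketch of what "subsequence" means here, checked on the two strings from the example. The helper name `is_subsequence` is an illustrative choice, not something the slides define.

```python
def is_subsequence(s: str, t: str) -> bool:
    """Return True if s can be obtained from t by deleting zero or more characters."""
    it = iter(t)
    # `ch in it` consumes the iterator up to the first match,
    # so the characters of s must appear in t in the same order.
    return all(ch in it for ch in s)


X = "AGTCAACGTT"
Y = "GTTCGACTGTG"
for s in ("AGTG", "GTCACGT"):
    print(s, is_subsequence(s, X), is_subsequence(s, Y))
```

Both candidate strings pass the check against X and against Y, which is what makes them common subsequences.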
Brute Force
• If |X| = m and |Y| = n, then there are $2^m$ subsequences of X; we must compare each with Y (n comparisons)
• So the running time of the brute-force algorithm is $O(n\,2^m)$
• Notice that the LCS problem has optimal substructure: solutions of subproblems are parts of the final solution.
• Subproblems: “find LCS of pairs of prefixes of X and Y”
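The brute-force idea described above can be sketched as follows. This is only an illustration of why the approach is exponential; the function names are hypothetical, and the search order (longest candidates first) is one possible choice.

```python
from itertools import combinations


def is_subsequence(s: str, t: str) -> bool:
    it = iter(t)
    return all(ch in it for ch in s)


def lcs_brute_force(x: str, y: str) -> str:
    """Enumerate subsequences of x, longest first, and return the first
    one that is also a subsequence of y. Up to 2^m candidates are tried."""
    for length in range(len(x), -1, -1):
        for idx in combinations(range(len(x)), length):
            cand = "".join(x[i] for i in idx)
            if is_subsequence(cand, y):   # O(n) check per candidate
                return cand
    return ""


print(lcs_brute_force("ABCB", "BDCAB"))
```

Even for strings of a few dozen characters the number of candidates makes this impractical, which motivates the dynamic-programming formulation.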
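Some observations
• Let $X = \langle x_1, x_2, \ldots, x_m \rangle$ and $Y = \langle y_1, y_2, \ldots, y_n \rangle$ be sequences, and let $Z = \langle z_1, z_2, \ldots, z_k \rangle$ be any LCS of $X$ and $Y$.
• If $x_m = y_n$, then $z_k = x_m = y_n$ and $Z_{k-1}$ is an LCS of $X_{m-1}$ and $Y_{n-1}$.
• If $x_m \ne y_n$, then $z_k \ne x_m$ implies that $Z$ is an LCS of $X_{m-1}$ and $Y$.
• If $x_m \ne y_n$, then $z_k \ne y_n$ implies that $Z$ is an LCS of $X$ and $Y_{n-1}$.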
Setup
• First we’ll find the length of the LCS; along the way we will leave “clues” for finding the subsequence itself.
• Define Xi, Yj to be the prefixes of X and Y of length i and j respectively.
• Define c[i,j] to be the length of LCS of Xi and Yj
• Then the length of LCS of X and Y will be c[m,n]
The recurrence
• The subproblems overlap: to find c[i, j] we need the values c[i-1, j-1], c[i, j-1] and c[i-1, j]
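$$c[i,j] = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0 \\ c[i-1,j-1] + 1 & \text{if } i,j > 0 \text{ and } x_i = y_j \\ \max(c[i,j-1],\, c[i-1,j]) & \text{if } i,j > 0 \text{ and } x_i \ne y_j \end{cases}$$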
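As a sketch, the recurrence can be transcribed almost literally into a memoized recursive function. The name `lcs_length` and the use of `functools.lru_cache` are illustrative choices; the slides themselves go on to fill the table bottom-up instead.

```python
from functools import lru_cache


def lcs_length(x: str, y: str) -> int:
    @lru_cache(maxsize=None)
    def c(i: int, j: int) -> int:
        # c(i, j) = length of an LCS of the prefixes x[:i] and y[:j]
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:          # x_i == y_j (Python strings are 0-indexed)
            return c(i - 1, j - 1) + 1
        return max(c(i, j - 1), c(i - 1, j))

    return c(len(x), len(y))


print(lcs_length("ABCB", "BDCAB"))
```

Because each pair (i, j) is computed once and cached, the work is proportional to the number of table entries; for long inputs a bottom-up loop avoids Python's recursion limit.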
LCS Algorithm
• First we’ll find the length of the LCS. Later we’ll modify the algorithm to find the LCS itself.
• Recall that Xi and Yj are the prefixes of X and Y of length i and j respectively
• And that c[i,j] is the length of the LCS of Xi and Yj
• Then the length of the LCS of X and Y will be c[m,n]
$$c[i,j] = \begin{cases} c[i-1,j-1] + 1 & \text{if } x[i] = y[j] \\ \max(c[i,j-1],\, c[i-1,j]) & \text{otherwise} \end{cases}$$
LCS recursive solution
• We start with i = j = 0 (empty substrings of x and y)
• Since X0 and Y0 are empty strings, their LCS
is always empty (i.e. c[0,0] = 0)
• LCS of empty string and any other string is empty, so for every i and j: c[0, j] = c[i,0] = 0
LCS recursive solution
• When we calculate c[i,j], we consider two cases:
• First case: x[i] = y[j]: one more symbol in strings X and Y matches, so the length of the LCS of Xi and Yj equals the length of the LCS of the smaller strings Xi-1 and Yj-1, plus 1
LCS recursive solution
• Second case: x[i] != y[j]
• As the symbols don’t match, our solution is not improved, and the length of LCS(Xi, Yj) is the same as before (i.e. the maximum of LCS(Xi, Yj-1) and LCS(Xi-1, Yj))
LCS Example
We’ll see how the LCS algorithm works on the following example:
• X = ABCB
• Y = BDCAB
What is the Longest Common Subsequence of X and Y?
LCS(X, Y) = BCB
X = A B C B
Y = B D C A B
LCS Example (0)
X = ABCB; m = |X| = 4
Y = BDCAB; n = |Y| = 5
Allocate array c[5,6] (rows i = 0..m, columns j = 0..n)
LCS Example (1)
for i = 1 to m: c[i,0] = 0
LCS Example (2)
for j = 0 to n: c[0,j] = 0
Table after initialization ("." = not yet computed):
j     |    0   1   2   3   4   5
Yj    |        B   D   C   A   B
i  Xi |
0     |    0   0   0   0   0   0
1  A  |    0   .   .   .   .   .
2  B  |    0   .   .   .   .   .
3  C  |    0   .   .   .   .   .
4  B  |    0   .   .   .   .   .
LCS Example (3)
case i=1 and j=1: A != B, but c[0,1] >= c[1,0], so c[1,1] = c[0,1] = 0, and b[1,1] = "↑"
(b[i,j] records where the value came from: "↖" diagonal, "↑" above, "←" left)
LCS Example (4)
case i=1 and j=2: A != D, but c[0,2] >= c[1,1], so c[1,2] = c[0,2] = 0, and b[1,2] = "↑"
LCS Example (5)
case i=1 and j=3: A != C, but c[0,3] >= c[1,2], so c[1,3] = c[0,3] = 0, and b[1,3] = "↑"
LCS Example (6)
case i=1 and j=4: A = A, so c[1,4] = c[0,3]+1 = 1, and b[1,4] = "↖"
LCS Example (7)
case i=1 and j=5: A != B, and this time c[0,5] < c[1,4], so c[1,5] = c[1,4] = 1, and b[1,5] = "←"
Table after row i=1 ("." = not yet computed):
j     |    0   1   2   3   4   5
Yj    |        B   D   C   A   B
i  Xi |
0     |    0   0   0   0   0   0
1  A  |    0   0   0   0   1   1
2  B  |    0   .   .   .   .   .
3  C  |    0   .   .   .   .   .
4  B  |    0   .   .   .   .   .
LCS Example (8)
case i=2 and j=1: B = B, so c[2,1] = c[1,0]+1 = 1, and b[2,1] = "↖"
LCS Example (9)
case i=2 and j=2: B != D and c[1,2] < c[2,1], so c[2,2] = c[2,1] = 1, and b[2,2] = "←"
LCS Example (10)
case i=2 and j=3: B != C and c[1,3] < c[2,2], so c[2,3] = c[2,2] = 1, and b[2,3] = "←"
LCS Example (11)
case i=2 and j=4: B != A and c[1,4] = c[2,3], so c[2,4] = c[1,4] = 1, and b[2,4] = "↑"
LCS Example (12)
case i=2 and j=5: B = B, so c[2,5] = c[1,4]+1 = 2, and b[2,5] = "↖"
Table after row i=2 ("." = not yet computed):
j     |    0   1   2   3   4   5
Yj    |        B   D   C   A   B
i  Xi |
0     |    0   0   0   0   0   0
1  A  |    0   0   0   0   1   1
2  B  |    0   1   1   1   1   2
3  C  |    0   .   .   .   .   .
4  B  |    0   .   .   .   .   .
LCS Example (13)
case i=3 and j=1: C != B and c[2,1] > c[3,0], so c[3,1] = c[2,1] = 1, and b[3,1] = "↑"
LCS Example (14)
case i=3 and j=2: C != D and c[2,2] = c[3,1], so c[3,2] = c[2,2] = 1, and b[3,2] = "↑"
LCS Example (15)
case i=3 and j=3: C = C, so c[3,3] = c[2,2]+1 = 2, and b[3,3] = "↖"
LCS Example (16)
case i=3 and j=4: C != A and c[2,4] < c[3,3], so c[3,4] = c[3,3] = 2, and b[3,4] = "←"
LCS Example (17)
case i=3 and j=5: C != B and c[2,5] = c[3,4], so c[3,5] = c[2,5] = 2, and b[3,5] = "↑"
Table after row i=3 ("." = not yet computed):
j     |    0   1   2   3   4   5
Yj    |        B   D   C   A   B
i  Xi |
0     |    0   0   0   0   0   0
1  A  |    0   0   0   0   1   1
2  B  |    0   1   1   1   1   2
3  C  |    0   1   1   2   2   2
4  B  |    0   .   .   .   .   .
LCS Example (18)
case i=4 and j=1: B = B, so c[4,1] = c[3,0]+1 = 1, and b[4,1] = "↖"
LCS Example (19)
case i=4 and j=2: B != D and c[3,2] = c[4,1], so c[4,2] = c[3,2] = 1, and b[4,2] = "↑"
LCS Example (20)
case i=4 and j=3: B != C and c[3,3] > c[4,2], so c[4,3] = c[3,3] = 2, and b[4,3] = "↑"
LCS Example (21)
case i=4 and j=4: B != A and c[3,4] = c[4,3], so c[4,4] = c[3,4] = 2, and b[4,4] = "↑"
LCS Example (22)
case i=4 and j=5: B = B, so c[4,5] = c[3,4]+1 = 3, and b[4,5] = "↖"
Completed table:
j     |    0   1   2   3   4   5
Yj    |        B   D   C   A   B
i  Xi |
0     |    0   0   0   0   0   0
1  A  |    0   0   0   0   1   1
2  B  |    0   1   1   1   1   2
3  C  |    0   1   1   2   2   2
4  B  |    0   1   1   2   2   3
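The cell-by-cell walk-through above can be reproduced with a short bottom-up loop. This is a sketch that fills only the length table c; the function name `lcs_table` is an illustrative choice.

```python
def lcs_table(x: str, y: str):
    """Return the (m+1) x (n+1) table c, where c[i][j] is the LCS length
    of the prefixes x[:i] and y[:j]."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]   # row 0 and column 0 stay 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c


for row in lcs_table("ABCB", "BDCAB"):
    print(row)
```

Each printed row corresponds to one row i of the example table, with c[4][5] holding the final length.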
LCS Algorithm Running Time
• The LCS algorithm calculates the value of each entry of the array c[m,n]
• So the running time is clearly O(mn), as each entry is computed in a constant number of steps.
• Now how to get at the solution?
• We use the arrows we created to guide us.
• We simply follow the arrows back to the base case 0
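Finding LCS
Completed table with arrows:
j     |    0   1   2   3   4   5
Yj    |        B   D   C   A   B
i  Xi |
0     |    0   0   0   0   0   0
1  A  |    0  ↑0  ↑0  ↑0  ↖1  ←1
2  B  |    0  ↖1  ←1  ←1  ↑1  ↖2
3  C  |    0  ↑1  ↑1  ↖2  ←2  ↑2
4  B  |    0  ↖1  ↑1  ↑2  ↑2  ↖3
Start at c[4,5] and follow the arrows; each "↖" contributes one symbol:
• (4,5) "↖": output B, go to (3,4)
• (3,4) "←": go to (3,3)
• (3,3) "↖": output C, go to (2,2)
• (2,2) "←": go to (2,1)
• (2,1) "↖": output B, go to (1,0) and stop
Finding LCS (2)
LCS (reversed order): B C B
LCS (straight order): B C B (this string turned out to be a palindrome)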
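A sketch of the whole procedure in Python: fill c and the arrow table b, then follow the arrows back from the bottom-right corner. The arrow symbols and the tie-breaking rule (prefer "up" when c[i-1][j] >= c[i][j-1]) are assumptions taken from the worked example; the function name `lcs_with_arrows` is illustrative.

```python
DIAG, UP, LEFT = "↖", "↑", "←"


def lcs_with_arrows(x: str, y: str):
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[""] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j], b[i][j] = c[i - 1][j - 1] + 1, DIAG
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j], b[i][j] = c[i - 1][j], UP
            else:
                c[i][j], b[i][j] = c[i][j - 1], LEFT

    # Follow the arrows back from (m, n); symbols are collected in reverse order.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if b[i][j] == DIAG:
            out.append(x[i - 1])
            i, j = i - 1, j - 1
        elif b[i][j] == UP:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out)), b


subseq, arrows = lcs_with_arrows("ABCB", "BDCAB")
print(subseq)
```

The traceback visits at most m + n cells, so recovering the subsequence adds only O(m + n) on top of the O(mn) table fill.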
LCS-Length(X, Y)
m = length(X), n = length(Y)
for i = 1 to m do c[i, 0] = 0
for j = 0 to n do c[0, j] = 0
for i = 1 to m do
    for j = 1 to n do
        if ( xi == yj ) then
            c[i, j] = c[i - 1, j - 1] + 1
            b[i, j] = "↖"
        else if c[i - 1, j] >= c[i, j - 1] then
            c[i, j] = c[i - 1, j]
            b[i, j] = "↑"
        else
            c[i, j] = c[i, j - 1]
            b[i, j] = "←"
return c and b
Two Common Issues
Optimal substructure
Overlapping subproblems
Optimal Substructure
• A problem exhibits optimal substructure if an optimal solution contains optimal solutions to its sub-problems.
• Build an optimal solution from optimal solutions to sub-problems
• Example - Matrix-chain multiplication: An optimal parenthesization of $A_i A_{i+1} \cdots A_j$ that splits the product between $A_k$ and $A_{k+1}$ contains within it optimal solutions to the problems of parenthesizing $A_i A_{i+1} \cdots A_k$ and $A_{k+1} A_{k+2} \cdots A_j$.
Illustration of Optimal Substructure
A1A2A3A4A5A6A7A8A9
Suppose ((A1A2)(A3((A4A5)A6))) ((A7A8)A9) is optimal
Minimal cost = Cost(A1..6) + Cost(A7..9) + p0 p6 p9
Then ((A1A2)(A3((A4A5)A6))) must be optimal for A1A2A3A4A5A6
Otherwise, if ((A1(A2A3))((A4A5)A6)) is optimal for A1A2A3A4A5A6
Then ((A1(A2A3))((A4A5)A6)) ((A7A8)A9) will be better than
((A1A2)(A3((A4A5)A6))) ((A7A8)A9)
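To make the cost decomposition concrete, here is a small memoized sketch of the matrix-chain cost. It assumes the usual convention that matrix A_i has dimensions p[i-1] x p[i]; the dimension lists in the usage lines are hypothetical, since the slides give no numbers.

```python
from functools import lru_cache


def matrix_chain_cost(p):
    """Minimum number of scalar multiplications to compute A_1 ... A_n,
    where A_i is a p[i-1] x p[i] matrix."""
    p = tuple(p)
    n = len(p) - 1

    @lru_cache(maxsize=None)
    def cost(i: int, j: int) -> int:
        if i == j:
            return 0
        # Split between A_k and A_{k+1}: both halves must themselves be optimal.
        return min(cost(i, k) + cost(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))

    return cost(1, n)


print(matrix_chain_cost([10, 100, 5, 50]))
```

The term p[i-1] * p[k] * p[j] plays the role of p0 p6 p9 in the illustration: it is the cost of multiplying the two sub-products once each has been computed optimally.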
Recognizing subproblems
• Show that a solution to the problem consists of making a choice. Making the choice leaves one or more sub-problems to be solved.
• Suppose that for a given problem, the choice that leads to an optimal solution is available.
• Notice something in common with a greedy solution; more on this later.
Dynamic vs. Greedy
• Dynamic programming uses optimal substructure in a bottom-up fashion
– First find optimal solutions to subproblems and, having solved the subproblems, find an optimal solution to the problem
• Greedy algorithms use optimal substructure in a top-down fashion
– First make a choice (the choice that looks best at the time) and then solve the resulting subproblem
Overlapping Subproblems
• Divide-and-Conquer is suitable when generating brand-new problems at each step of the recursion.
• Dynamic-programming algorithms take advantage of overlapping subproblems by solving each subproblem once and then storing the solution in a table where it can be looked up when needed, using constant time per lookup
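One way to see the overlap is to count how often the LCS recursion is entered with and without a lookup table. This is an illustrative sketch; the input strings are hypothetical and chosen so that no symbols match, which forces the plain recursion to branch at every step.

```python
from functools import lru_cache


def count_calls(x: str, y: str):
    """Return (calls made by plain recursion, calls made with memoization)."""
    calls = {"naive": 0, "memo": 0}

    def naive(i, j):
        calls["naive"] += 1
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:
            return naive(i - 1, j - 1) + 1
        return max(naive(i, j - 1), naive(i - 1, j))

    @lru_cache(maxsize=None)
    def memo(i, j):
        calls["memo"] += 1   # body runs only on a cache miss, once per distinct (i, j)
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:
            return memo(i - 1, j - 1) + 1
        return max(memo(i, j - 1), memo(i - 1, j))

    naive(len(x), len(y))
    memo(len(x), len(y))
    return calls["naive"], calls["memo"]


naive_calls, memo_calls = count_calls("AAAAAAAA", "BBBBBBBB")
print(naive_calls, memo_calls)
```

The memoized count can never exceed the number of table entries, (m+1)(n+1), while the plain recursion revisits the same (i, j) pairs many times.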
Assembly Line
Problem Definition
e1, e2: time to enter assembly lines 1 and 2
x1, x2: time to exit assembly lines 1 and 2
ti,j: time to transfer from assembly line 1→2 or 2→1
ai,j: processing time at each station
Time between adjacent stations is 0
$2^n$ possible solutions
Optimal Substructure
• We want the fastest way through the factory (from the starting point)
– What is the fastest possible way through S1,1 (similar for S2,1)?
• Only one way: enter line 1 (time e1) and go through S1,1
– The fastest possible way through S1,j for j=2, 3, ..., n (similar for S2,j)
• S1,j-1 → S1,j : T1,j-1 + a1,j, where T1,j-1 is the fastest time through S1,j-1
– If the fastest way through S1,j is through S1,j-1, it must have taken a fastest way through S1,j-1
• Similar for S2,j-1
• An optimal solution contains optimal solutions to sub-problems: optimal substructure.
S1,1 = 2 + 7 = 9
S2,1 = 4 + 8 = 12
S1,2 = min(S1,1 + 9, S2,1 + 2 + 9) = min(9 + 9, 12 + 2 + 9) = min(18, 23) = 18
S2,2 = min(S1,1 + 2 + 5, S2,1 + 5) = min(9 + 2 + 5, 12 + 5) = min(16, 17) = 16
S1,3 = min(S1,2 + 3, S2,2 + 1 + 3) = min(18 + 3, 16 + 1 + 3) = min(21, 20) = 20
S2,3 = min(S1,2 + 3 + 6, S2,2 + 6) = min(18 + 3 + 6, 16 + 6) = min(27, 22) = 22
Formal Setup
• fi[j]: fastest possible time to go from starting point through station Si,j
• The fastest time to go all the way through the factory: f* = min(f1[n] + x1, f2[n] + x2)
• Boundary conditions:
– f1[1] = e1 + a1,1
– f2[1] = e2 + a2,1
Setup Contd..
• The fastest time to go through Si,j (for j=2,..., n)
• f1[j] = min(f1[j-1] + a1,j, f2[j-1] + t2,j-1 + a1,j)
• f2[j] = min(f2[j-1] + a2,j, f1[j-1] + t1,j-1 + a2,j)
• li[j]: the line number whose station j-1 is used in a fastest way through Si,j (i=1, 2, and j=2, 3,..., n)
– l*: the line whose station n is used in a fastest way through the entire factory