5/5 - (1 vote)

Lecture 7: Dynamic Programming II
The University of Sydney Page 1

General techniques in this course
Greedy algorithms [Lecture 3]
Divide & Conquer algorithms [Lectures 4 and 5]
Dynamic programming algorithms [Lecture 6 and today] Network flow algorithms [18 Apr and 2 May]
The University of Sydney
Page 2

Algorithmic Paradigms
Greedy. Build up a solution incrementally, myopically optimizing some local criterion.
Divide-and-conquer. Break up a problem into two sub- problems, solve each sub-problem independently, and combine solution to sub-problems to form solution to original problem.
Dynamic programming. Break up a problem into a series of overlapping sub-problems, and build up solutions to larger and larger sub-problems.
The University of Sydney Page 3

Main Idea: Dynamic Programming
Recurrence equation relating optimal solution in terms optimal solutions to smaller subproblems
Given optimal solutions to smaller subproblems, how to construct solution to original problem?
The University of Sydney
Page 4

Key steps: Dynamic programming
1. Define subproblems 2. Find recurrences
3. Solve the base cases
4. Transform recurrence into an efficient algorithm
The University of Sydney Page 5

Recap: Weighted Interval Scheduling
Job j starts at sj, finishes at fj, and has weight vj.
Two jobs compatible if they dont overlap.
Goal: find maximum value subset of mutually compatible jobs.
b
a
c
d
e
f
g
h
0 1 2 3 4 5 6 7 8 9 10 11 The University of Sydney
Time
Page 6

Recap: Weighted Interval Scheduling
Notation. Label jobs by finishing time: f1 f2 . . . fn .
Def. p(j) = largest index i 0 kilograms and has value vi > 0. KnapsackhascapacityofWkilograms.
Goal: fill knapsack so as to maximize total value.
W = 11
Item
Value
Weight
1
1
1
2
6
2
3
18
5
4
22
6
5
28
7
The University of Sydney
Page 14

Recap: Dynamic Programming Step 1 Step 1: Define subproblems
OPT(i, w) = max profit with subset of items 1, , i with weight limit w.
The University of Sydney
Page 15

Recap: Dynamic Programming Step 2 Step 2: Find recurrences
Case 1: OPT does not select item i.
OPT selects best of { 1, 2, , i-1 } using weight limit w
Case 2: OPT selects item i.
newweightlimit=wwi
OPT selects best of { 1, 2, , i1 } using this new weight limit
OPT(i,w) = max { vi + OPT (i-1,w-wi), OPT(i-1,w) }
The University of Sydney
Page 16

Recap: Dynamic Programming Step 2 Step 2: Find recurrences
Case 1: OPT does not select item i.
OPT selects best of { 1, 2, , i-1 } using weight limit w
Case 2: OPT selects item i.
newweightlimit=wwi
OPT selects best of { 1, 2, , i1 } using this new weight limit
If wi>w
OPT(i,w) = OPT (i-1,w)
otherwise
OPT(i,w) = max { vi + OPT (i-1,w-wi), OPT(i-1,w) }
The University of Sydney
Page 17

Recap: Dynamic Programming Step 3 Step 3: Solve the base cases
OPT(0,w) = 0
i0 if i=0 OPT(i,w)=iiOPT(i-1,w) if i>0andwi >w
iimax{OPT(i-1,w),vi+OPT(i-1,w-wi)} otherwise
The University of Sydney
Page 18
Donemore or less

Knapsack Problem: Bottom-Up
Knapsack. Fill up an (n+1)-by-(W+1) array.
Input: n, w1,,wN, v1,,vN for w = 0 to W
M[0, w] = 0
for i = 1 to n
for w = 1 to W
if (wi > w)
M[i, w] = M[i-1, w]
else
M[i, w] = max {M[i-1, w], vi + M[i-1, w-wi ]} return M[n, W]
The University of Sydney Page 19

Knapsack Algorithm
W+1
0
1
2
3
4
5
6
7
8
9
10
11

0
0
0
0
0
0
0
0
0
0
0
0
{1}
0
1
1
1
1
1
1
1
1
1
1
1
{ 1, 2 }
0
1
6
7
7
7
7
7
7
7
7
7
{ 1, 2, 3 }
0
1
6
7
7
18
19
24
25
25
25
25
{ 1, 2, 3, 4 }
0
1
6
7
7
18
22
24
28
29
29
40
{ 1, 2, 3, 4, 5 }
0
1
6
7
7
18
22
28
29
34
34
40
n+1
Item
Value
Weight
1
1
1
2
6
2
3
18
5
4
22
6
5
28
OPT: {4,3}
value = 22 + 18 = 40
W = 11
The University of Sydney
7
Page 20

Longest Common Subsequence
Given two sequences X[1..n] and Y[1..m], find the longest subsequences X of X and Y of Y such that X = Y.
Example 1 X = BANANAS Y = KATANA
LCS(X,Y) = AANA
Example 2 X = DYNAMIC
Y = PROGRAMMING
LCS(X,Y) = AMI The University of Sydney
Page 21

Longest Common Subsequence: Matching Definition
Given two sequences X[1..n] and Y[1..m], find the longest non- crossing matching between elements of X and elements of Y
Example 1 X = BANANAS Y = KATANA
LCS(X,Y) = AANA. Matching: X[2] Y[2], X[4] Y[4], X[5] Y[5], X[6] Y[6].
Example 2 X = DYNAMIC
Y = PROGRAMMING
LCS(X,Y) = AMI. Matching: X[4] Y[6], X[5] Y[7], X[6] Y[8],
Note: if crossings allowed, then can add X[3] Y[10]. The University of Sydney
Page 22

Longest Common Subsequence: Applications
Measures similarity of two strings Bioinformatics
Merging in version control
The University of Sydney Page 23

LCS Dynamic Programming: Step 1 Step 1: Define subproblems
OPT(i,j) = length of LCS(X[1..i], Y[1..j]). Subproblems: finding LCS of prefixes of X and Y
Example
X = BANANAS Y = KATANA
OPT(6, 4) = length of LCS(BANANA, KATA) = 2 The University of Sydney
Page 25

LCS Dynamic Programming: Step 2
Notations:
OPT(i,j) = length of LCS(X[1..i], Y[1..j]). Step 2: Finding recurrences
Case 1: X[i] = Y[j].
Leave X[i] unmatched and take LCS(X[1..i-1], Y[1..j]). OR Leave Y[j] unmatched and take LCS(X[1..i], Y[1..j-1]).
OPT(i,j)= max{OPT(i-1,j), OPT(i,j-1)}
Example
X[1..5] = BANAN Y[1..6] = KATANA
X[5] unmatched: LCS = LCS(X[1..4], Y[1..6]) = ANA
Y[6] unmatched: LCS = LCS(X[1..5], Y[1..5]) = AAN The University of Sydney
Page 26

LCS Dynamic Programming: Step 2
Notations:
OPT(i,j) = length of LCS(X[1..i], Y[1..j]). Step 2: Finding recurrences
Case 1: X[i] = Y[j].
Leave X[i] unmatched and take LCS(X[1..i-1], Y[1..j]). OR Leave Y[j] unmatched and take LCS(X[1..i], Y[1..j-1]).
OPT(i,j)= max{OPT(i-1,j), OPT(i,j-1)}
Example
X[1..3] = BAN Y[1..6] = KATANA
X[3] unmatched: LCS(X[1..2], Y[1..6]) = A
Y[6] unmatched: LCS(X[1..3], Y[1..5]) = AN The University of Sydney
Page 28

LCS Dynamic Programming: Step 2
Notations:
OPT(i,j) = length of LCS(X[1..i], Y[1..j]). Step 2: Finding recurrences
Case 1: X[i] = Y[j].
Leave X[i] unmatched and take LCS(X[1..i-1], Y[1..j]). OR Leave Y[j] unmatched and take LCS(X[1..i], Y[1..j-1]).
OPT(i,j)= max{OPT(i-1,j), OPT(i,j-1)}
Example
X[1..3] = BAN Y[1..6] = KATANA
X[3] unmatched: LCS(X[1..2], Y[1..6]) = A
Y[6] unmatched: LCS(X[1..3], Y[1..5]) = AN
The University of Sydney
Page 29

LCS Dynamic Programming: Step 2
Notations:
OPT(i,j) = length of LCS(X[1..i], Y[1..j]). Step 2: Finding recurrences
Case 2: X[i] = Y[j].
Match X[i] to Y[j] and take LCS(X[1..i-1], Y[1..j-1]) + X[i] OR
Leave X[i] unmatched or Y[j] unmatched
OPT(i,j) = max{OPT(i-1, j-1) + 1, OPT(i-1, j), OPT(i, j-1)
Example
X[1..6] = BANANA Y[1..6] = KATANA
Match X[6] Y[6]: LCS = LCS(X[1..5], Y[1..5]) + A = AANA
X[6] unmatched: LCS = LCS(X[1..5], Y[1..6]) = AAN
Y[6] unmatched: LCS = LCS(X[1..6], Y[1..5]) = AAN The University of Sydney
Page 30

LCS Dynamic Programming: Step 3 Step 3: Solving the base cases
OPT(i,0) = 0 for all i, OPT(0,j) = 0 for all j
The University of Sydney
Page 31

LCS Dynamic Programming: Algorithm
INPUT: n, m, X[1..n], Y[1..m]
for j = 0 to m
M[0,j] = 0
for i = 0 to n
M[i,0] = 0
for i = 1 to n
for j = 1 to m
if (X[i] = Y[j])
M[i,j] = max(M[i-1,j-1] + 1, M[i-1,j], M[i,j-1])
else
M[i,j] = max(M[i-1,j], M[i,j-1])
return M[n,m]
The University of Sydney Page 32

LCS Dynamic Programming: Analysis
INPUT: n, m, X[1..n], Y[1..m]
for j = 0 to m
M[0,j] = 0
for i = 0 to n
M[i,0] = 0
for i = 1 to n
for j = 1 to m
O(n + m)
O(nm)
if (X[i] = Y[j])
M[i,j] = max(M[i-1,j-1] + 1, M[i-1,j], M[i,j-1])
else
M[i,j] = max(M[i-1,j], M[i,j-1])
return M[n,m] Running time: O(nm)
Space: O(nm)
See 3927 Lecture 6 for Sequence Alignment
The University of Sydney
Page 33

6.5 RNA Secondary Structure
Dynamic programming over intervals
The University of Sydney Page 34

RNA (Ribonucleic acid) Secondary Structure
RNA. String B = b1b2bn over alphabet { A, C, G, U }.
Secondary structure. RNA is single-stranded so it tends to loop
back and form base pairs with itself. This structure is essential
for understanding behavior of molecule.
CA AA AU
GC CGUAA G
Ex: GUCGAUUGAGCGAAUGUAACAACGUGGCUACGGCGAGA UGAUUA
G
ACGCU CGCGAGC
G
G AU
The University of Sydney
complementary base pairs: A-U, C-G G
Page 35

RNA Secondary Structure
Secondary structure. A set of pairs S = { (bi, bj) } that satisfy: [Watson-Crick.] S is a matching and each pair in S is a Watson-Crick
complement: A-U, U-A, C-G, or G-C.
[No sharp turns.] The ends of each pair are separated by at least 4
intervening bases. If (bi, bj) I S, then i < j – 4. [Non-crossing.] If (bi, bj) and (bk, bl) are two pairs in S, then we cannothave i < k < j < l. Free energy. Usual hypothesis is that an RNA molecule will form the secondary structure with the optimum total free energy.approximated by number of base pairs Goal. Given an RNA molecule B = b1b2…bn, find a secondary structure S that maximizes the number of base pairs. The University of SydneyPage 3636 RNA Secondary Structure: ExamplesGGGGG CUGGCU CGCGCUAUAUAG UAUAUAbase pairAUGUGGCCAU AUGGGGCAU AGUUGGCCAU 4ok sharp turn crossing The University of SydneyPage 37 RNA Secondary Structure: Subproblems First attempt (Step 1). OPT(j) = maximum number of base pairs in a secondary structure of the substring b1b2…bj.1match bt and bitj Difficulty (in Step 2). Results in two sub-problems. Finding secondary structure in: b1b2…bt-1. Finding secondary structure in: bt+1bt+2…bj-1.OPT(t-1)The University of Sydney need more sub-problems Page 38 Dynamic Programming Over Intervals Step 1 Step 1: Define subproblemsOPT(i, j) = maximum number of base pairs in a secondary structure of the substringbibi+1…bj.The University of SydneyPage 39 Dynamic Programming Over Intervals Step 2 Notation. OPT(i, j) = maximum number of base pairs in asecondary structure of the substring bibi+1…bj. Step 2: Find recurrencesThe University of SydneyPage 40ijCase 1. Base bj is not involved in a pair. OPT(i, j) = OPT(i, j-1) Dynamic Programming Over Intervals Step 2 Notation. OPT(i, j) = maximum number of base pairs in asecondary structure of the substring bibi+1…bj. Step 2: Find recurrencesThe University of SydneyPage 41imatch bt and bitjCase2. Basebjpairswithbt forsomeit M[w] + cvw) {
M[v] M[w] + cvw
successor[v] w }
} }
If no M[w] value changed in iteration i, stop.
}
}
Analysis: Q(mn) time, Q(n) working space. The University of Sydney Page 64

Shortest Paths: Practical Improvements
Practical improvements
Maintain only one array M[v] = shortest v-t path that we have
found so far.
No need to check edges of the form (v, w) unless M[w] changed in previous iteration.
Theorem: Throughout the algorithm, M[v] is length of some v-t path, and after i rounds of updates, the value M[v] is no larger than the length of shortest v-t path using i edges.
Overall impact
Working space: O(n).
Total space (including input): O(m+n)
Running time: O(mn) worst case, but substantially faster in practice.
The University of Sydney
Page 65
65

6.3 Segmented Least Squares
The University of Sydney Page 66

Segmented Least Squares
Least squares.
Foundational problem in statistic and numerical analysis.
Given n points in the plane: (x1, y1), (x2, y2) , . . . , (xn, yn).
Find a line y = ax + b that minimizes the sum of the squared error:
y
SSE = an (yi -axi -b)2 i =1
x
Solution. Calculus min error is achieved when
The University of Sydney Page 67
a=naixiyi -(aixi)(ai yi), b=ai yi -aaixi
n a x 2 (a x )2 n ii ii
O(n)

Segmented Least Squares
Segmented least squares.
Points lie roughly on a sequence of several line segments.
Givennpointsintheplane(x1,y1),(x2,y2),,(xn,yn)with x1 < x2 < … < xn, find a sequence of lines that minimizes f(x). Question. What’s a reasonable choice for f(x) to balance accuracy and complexity?number of linesy The University of Sydneyx Page 68 Segmented Least Squares Segmented least squares. Points lie roughly on a sequence of several line segments. Givennpointsintheplane(x1,y1),(x2,y2),…,(xn,yn)withx1 < x2 < … < xn, find a sequence of lines that minimizes: the sum of the sums of the squared errors E in each segment the number of lines L Tradeoff function: E + cL, for some constant c > 0.
y
The University of Sydney
x Page 69

Segmented Least Squares
Segmented least squares.
Points lie roughly on a sequence of several line segments.
Givennpointsintheplane(x1,y1),(x2,y2),,(xn,yn)with
x1 < x2 < … < xn, find a partition into segments that minimizes: the sum of the sums of the squared errors E in each segment the number of segments L Tradeoff function: E + cL, for some constant c > 0.
y
The University of Sydney
x Page 70

Dynamic Programming: Multiway Choice Step 1 Step 1: Define subproblems
OPT(j) = minimum cost for points p1, p2 , . . . , pj.
The University of Sydney Page 71

Dynamic Programming: Multiway Choice Step 2
Notations:
OPT(j)=minimumcostforpointsp1,p2 ,,pj.
e(i,j) =minimumsumofsquaresforpointspi,pi+1 ,,pj.
Step 2: Finding recurrences
Lastsegmentusespointspi,pi+1 ,,pj forsomei. Cost = e(i, j) + c + OPT(i-1).
OPT(j) = min { e(i,j) + c + OPT (i-1) } 1ij
The University of Sydney Page 72

Dynamic Programming: Multiway Choice Step 3 Step 3: Solving the base cases
OPT(0) = 0
OPT(j) = (0 if j = 0 min1ije(i,j)+c+OPT(i1) ifj>0
The University of Sydney
Page 73

Segmented Least Squares: Algorithm
O(n2) iterations
O(n) iterations
INPUT: n, (p1,,pn), c
Segmented-Least-Squares() {
M[0] = 0
for j = 1 to n
for i = 1 to j
compute the least square error eij for the segment pi,, pj
O(n)
O(n)
for j = 1 to n
M[j]=min1 i j(eij +c+M[i-1])
return M[n] }
OPT(j) = (0 if j = 0 min1ije(i,j)+c+OPT(i1) ifj>0
The University of Sydney
Page 74

Segmented Least Squares: Algorithm
2 O(n )
iterations
O(n) iterations
for j = 1 to n fori=1toj
compute the least square error eij for the segment pi,, pj
for j = 1 to n
M[j]=min1 i j(eij +c+M[i-1])
O(n)
O(n)
INPUT: n, (p1,,pn), c
Segmented-Least-Squares() {
M[0] = 0
The University of Sydney
Page 75
return M[n] }
Running time: O(n3) Space: O(n2)

Dynamic Programming Summary I
3 steps:
1. Definesubproblems
2. Findrecurrences
3. Solvethebasecases
4. Transformrecurrenceintoanefficientalgorithm [usually bottom-up]
The University of Sydney
Page 76
76

Dynamic Programming Summary II
1D dynamic programming
Weighted interval scheduling
Segmented Least Squares
Maximum-sum contiguous subarray Longest increasing subsequence
2D dynamic programming Knapsack
Shortest path
Longest common subsequence
Dynamic programming over intervals RNA Secondary Structure
The University of Sydney
Page 77

.. Interpunctio Verborum Redux recurrence is T(n) = T(n 1) + T(n 2) + O(n), which still has the solution
T(n)=O(n).
. Interpunctio Verborum Redux
For our next dynamic programming algorithm, lets consider the text segmenta- tion problem from the previous chapter. We are given a string A[1 .. n] and a subroutine IW that determines whether a given string is a word (whatever that means), and we want to know whether A can be partitioned into a sequence of words.
We solved this problem by defining a function Splittable(i) that returns T if and only if the sux A[i .. n] can be partitioned into a sequence of words. We need to compute Splittable(1). This function satisfies the recurrence
8><>:T if i > n Splittable(i) = Wn IsWord(i, j) ^ Splittable( j + 1) otherwise
j=i
where IsWord(i, j) is shorthand for IsWord(A[i .. j]). This recurrence translates directly into a recursive backtracking algorithm that calls the IW subroutine O(2n) times in the worst case.
But for any fixed string A[1..n], there are only n dierent ways to call the recursive function Splittable(i)one for each value of i between 1 and n + 1and only O(n2) dierent ways to call IW(i, j)one for each pair (i, j) such that 1 i j n. Why are we spending exponential time computing only a polynomial amount of stu?
Each recursive subproblem is specified by an integer between 1 and n + 1, so we can memoize the function Splittable into an array SplitTable[1 .. n + 1]. Each subproblem Splittable(i) depends only on results of subproblems Splittable(j) where j > i, so the memoized recursive algorithm fills the array in decreasing index order. If we fill the array in this order deliberately, we obtain the dynamic programming algorithm shown in Figure .. The algorithm makes O(n2) calls to IW, an exponential improvement over our earlier backtracking algorithm.
. The Pattern: Smart Recursion
In a nutshell, dynamic programming is recursion without repetition. Dynamic programming algorithms store the solutions of intermediate subproblems, often but not always in some kind of array or table. Many algorithms students

. DP
S(A[1 .. n]): SplitTable[n + 1] T
for i n down to 1 SplitTable[i] F
for j i to n
if IW(i, j) and SplitTable[ j + 1]
SplitTable[i] T return SplitTable[1]

Figure .. Interpunctio verborum velox
(and instructors, and textbooks) make the mistake of focusing on the table because tables are easy and familiarinstead of the much more important (and dicult) task of finding a correct recurrence. As long as we memoize the correct recurrence, an explicit table isnt really necessary, but if the recurrence is incorrect, we are well and truly hosed.
Dynamic programming algorithms are almost always developed in two distinct stages.
. Formulate the problem recursively. Write down a recursive formula or algorithm for the whole problem in terms of the answers to smaller subproblems. This is the hard part. A complete recursive formulation has two parts:
(a) Specification. Describe the problem that you want to solve recursively, in coherent and precise Englishnot how to solve that problem, but what problem youre trying to solve. Without this specification, it is impossible, even in principle, to determine whether your solution is correct.
(b) Solution. Give a clear recursive formula or algorithm for the whole problem in terms of the answers to smaller instances of exactly the same problem.
. Build solutions to your recurrence from the bottom up. Write an algo- rithm that starts with the base cases of your recurrence and works its way up to the final solution, by considering intermediate subproblems in the correct order. This stage can be broken down into several smaller, relatively mechanical steps:
(a) Identify the subproblems. What are all the dierent ways your re- cursive algorithm can call itself, starting with some initial input? For example, the argument to RF is always an integer between 0 and n.
Dynamic programming is not about lling in tables. Its about smart recursion!

(b) Choose a memoization data structure. Find a data structure that can store the solution to every subproblem you identified in step (a). This is usually but not always a multidimensional array.
(c) Identify dependencies. Except for the base cases, every subproblem depends on other subproblemswhich ones? Draw a picture of your data structure, pick a generic element, and draw arrows from each of the other elements it depends on. Then formalize your picture.
(d) Find a good evaluation order. Order the subproblems so that each one comes after the subproblems it depends on. You should consider the base cases first, then the subproblems that depends only on base cases, and so on, eventually building up to the original top-level problem. The dependencies you identified in the previous step define a partial order over the subproblems; you need to find a linear extension of that partial order. Be careful!
(e) Analyze space and running time. The number of distinct subproblems determines the space complexity of your memoized algorithm. To compute the total running time, add up the running times of all possible subproblems, assuming deeper recursive calls are already memoized. You can actually do this immediately after step (a).
(f) Write down the algorithm. You know what order to consider the subproblems, and you know how to solve each subproblem. So do that! If your data structure is an array, this usually means writing a few nested for-loops around your original recurrence, and replacing the recursive calls with array look-ups.
Of course, you have to prove that each of these steps is correct. If your recurrence is wrong, or if you try to build up answers in the wrong order, your algorithm wont work!
. Warning: Greed is Stupid
If were incredibly lucky, we can bypass all the recurrences and tables and so forth, and solve the problem using a greedy algorithm. Like a backtracking algorithm, a greedy algorithm constructs a solution through a series of decisions, but it makes those decisions directly, without solving at any recursive subproblems. While this approach seems very natural, it almost never works; optimization problems that can be solved correctly by a greedy algorithm are quite rare. Nevertheless, for many problems that should be solved by backtracking or dynamic programming, many students first intuition is to apply a greedy strategy.
For example, a greedy algorithm for the longest increasing subsequence problem might look for the smallest element of the input array, accept that
.. Warning: Greed is Stupid

Reviews

There are no reviews yet.

Only logged in customers who have purchased this product may leave a review.

Whatsapp Us

[SOLVED] CS data structure Bioinformatics algorithm Lecture 7: Dynamic Programming II

Reviews

Whatsapp Us

[SOLVED] CS data structure Bioinformatics algorithm Lecture 7: Dynamic Programming II

Reviews

Related products

[Solved] Modify your first program to print a table of the words

[Solved] Python program that plays a guess-the-number game with magic lists

[Solved] Indel

[Solved] Car.py

[Solved] Python Assignment-Financial Products and Markets

[SOLVED] Naughty Receiver – Reliable Data Transfer