
Dynamic Programming



  1. Dynamic Programming • Divide and Conquer and Greedy Algorithms are powerful techniques in situations that fit their strengths • Dynamic Programming can usually be used in a broader set of applications • DP uses some graph algorithm techniques in a specific fashion • Some call Dynamic Programming and Linear Programming (next chapter) the "Sledgehammers" of algorithmic tools • "Programming" in these names does not come from writing code as we normally think of it • These names were given before modern computers, when "programming" carried the meaning of "planning" CS 312 – Dynamic Programming

  2. Divide and Conquer [recursion-tree figure: A branches into B and C, whose subtrees repeat subproblems such as E, F, and G] Note the redundant computations CS 312 – Dynamic Programming

  3. Dynamic Programming [same recursion tree as the previous slide] Start solving sub-problems at the bottom CS 312 – Dynamic Programming

  4. Dynamic Programming [recursion tree annotated with stored results: solutionE, solutionF, solutionG, solutionB] • Find the proper ordering for the subtasks • Build a table of results as we go • That way we do not have to recompute any intermediate results CS 312 – Dynamic Programming

  5. Dynamic Programming [figure: the tree of repeated subproblems collapses to one node per subproblem – A, B, C, E, F, G, H] CS 312 – Dynamic Programming

  6. Fibonacci Series • 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, … • Exponential time if we just implement the recursive definition directly • DP approach: Build a table with dependencies, store and use intermediate results – O(n) CS 312 – Dynamic Programming
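
To make the contrast concrete, here is a minimal Python sketch (not from the slides; names are illustrative) of the direct recursion versus the bottom-up DP table:

    def fib_naive(n):
        # Direct recursion: recomputes the same subproblems, exponential time
        if n < 2:
            return n
        return fib_naive(n - 1) + fib_naive(n - 2)

    def fib_dp(n):
        # Bottom-up table: each entry computed once from earlier entries, O(n)
        if n < 2:
            return n
        table = [0] * (n + 1)
        table[1] = 1
        for i in range(2, n + 1):
            table[i] = table[i - 1] + table[i - 2]
        return table[n]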

  7. Example – Longest Increasing Subsequence • 5 2 8 6 3 6 9 7 • 2 3 6 7 • Consider the sequence as a graph of n nodes • What algorithm would you use to find the longest increasing subsequence? CS 312 – Dynamic Programming

  8. Example – Longest Increasing Subsequence • 5 2 8 6 3 6 9 7 • 2 3 6 7 • Consider the sequence as a graph of n nodes • What algorithm would you use to find the longest increasing subsequence? • Could try all possible paths • 2^n possible paths (why?) • There are fewer increasing paths • Complexity is n·2^n • Very expensive because lots of work is done multiple times • sub-paths are repeatedly checked CS 312 – Dynamic Programming

  9. Example – Longest Increasing Subsequence • Could represent the sequence as a DAG with edges corresponding to increasing values • The problem is then just finding the longest path in the DAG • DP approach – solve in terms of smaller subproblems with memory • L(j) is the length of the longest path (increasing subsequence) ending at j • (plus one since we are counting nodes in this problem) • Any node could be the last node in the longest path so we check each one • Build a table to track values and avoid recomputes – Complexity? – Space?

  10. Example – Longest Increasing Subsequence • Complexity: O(n·average_indegree), which is worst case O(n^2) • Memory complexity? – must store intermediate results to avoid recomputes: O(n) • Assumes a topologically ordered DAG, which would also be O(n^2) to create • Note that for our longest increasing subsequence problem we get the length, but not the path • Markovian assumption – not dependent on history, just current/recent states • Can fix this (ala Dijkstra) by also saving prev(j) each time we find the max L(j) so that we can reconstruct the longest path • Why not use divide and conquer style recursion? CS 312 – Dynamic Programming
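
A minimal Python sketch of the L(j) relation with prev pointers (illustrative, not the course code); it runs in O(n^2) time with O(n) extra space and returns one longest increasing subsequence:

    def longest_increasing_subsequence(a):
        # L[j] = length of the longest increasing subsequence ending at a[j]
        n = len(a)
        L = [1] * n
        prev = [None] * n                  # back pointers, ala Dijkstra
        for j in range(n):
            for i in range(j):             # edges (i, j) exist when a[i] < a[j]
                if a[i] < a[j] and L[i] + 1 > L[j]:
                    L[j] = L[i] + 1
                    prev[j] = i
        j = max(range(n), key=lambda k: L[k])   # any node could end the path
        path = []
        while j is not None:               # walk the back pointers
            path.append(a[j])
            j = prev[j]
        return path[::-1]

For the sequence 5 2 8 6 3 6 9 7 this returns one of the length-4 answers, e.g. 2 3 6 9 (ties such as 2 3 6 7 are equally valid).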

  11. Example – Longest Increasing Subsequence • Recursive version is exponential (lots of redundant work) • Versus an efficient divide and conquer that cuts the problem size by a significant amount at each call and minimizes redundant work • This case just goes from a problem of size n to size n-1 at each call • Why not use divide and conquer style recursion? CS 312 – Dynamic Programming

  12. When is Dynamic Programming Efficient? • Anytime we have a collection of subproblems such that: • There is an ordering on the subproblems, and a relation that shows how to solve a subproblem given the answers to "smaller" subproblems, that is, subproblems that appear earlier in the ordering • The problem becomes an implicit DAG with each subproblem represented by a node, with edges giving dependencies • Is there just one order to solve it? – Any linearization works • Does the longest increasing subsequence algorithm fit this? • Ordering is in the for loop – an appropriate linearization: finish L(1) before starting L(2), etc. • Relation is L(j) = 1 + max{L(i) : (i,j) ∈ E} CS 312 – Dynamic Programming

  13. When is Dynamic Programming Optimal? • DP is optimal when the optimality property is met • First make sure the solution is correct • The optimality property: An optimal solution to a problem is built from optimal solutions to sub-problems • Question to consider: Can we divide the problem into sub-problems such that the optimal solutions to each of the sub-problems combine into an optimal solution for the entire problem? CS 312 – Dynamic Programming

  14. When is Dynamic Programming Optimal? • The optimality property: An optimal solution to a problem is built from optimal solutions to sub-problems • Consider the Longest Increasing Subsequence algorithm • Is L(1) optimal? • As you go through the ordering, does the relation always lead to an optimal intermediate solution? • Note that the optimal path from j to the end is independent of how we got to j (Markovian) • Thus choosing the longest incoming path must be optimal • This is not always the case for arbitrary problems CS 312 – Dynamic Programming

  15. Dynamic Programming and Memory • Trade off some memory complexity for storing intermediate results so as to avoid recomputes • How much memory? • Depends on the variables in the relation • Just one variable requires a vector: L(j) = 1 + max{L(i) : (i,j) ∈ E} • A two-variable relation L(i,j) would require a 2-d array, etc. CS 312 – Dynamic Programming

  16. Another Example – Binomial Coefficient • How many ways can we choose k items from a set of size n (n choose k)? • Divide and Conquer? • Is there an appropriate ordering and relation for DP? • Recurrence: C(n,k) = C(n-1,k-1) + C(n-1,k), with base cases C(n,0) = C(n,n) = 1 CS 312 – Dynamic Programming

  17. Unwise Recursive Method for C(5,3) [recursion tree: C(5,3) calls C(4,2) and C(4,3); those call C(3,1), C(3,2), C(3,2), C(3,3); then C(2,0), C(2,1), C(2,1), C(2,2), C(2,1), C(2,2); down to C(1,0), C(1,1) and the base-case 1s – note that C(3,2), C(2,1), and others are computed multiple times] CS 312 – Dynamic Programming

  18. Wiser Method – No Recomputes [DAG of distinct subproblems, each computed once: C(5,3); C(4,2), C(4,3); C(3,1), C(3,2), C(3,3); C(2,0), C(2,1), C(2,2); C(1,0), C(1,1)] CS 312 – Dynamic Programming

  19. Recurrence Relation to Table • Figure out the variables and use them to index the table • Figure out the base case(s) and put it/them in the table first • Show the DAG dependencies and fill out the table until we get to the desired answer • Let's do it for C(5,3) CS 312 – Dynamic Programming

  20. DP Table = C(5,3) CS 312 – Dynamic Programming

  21. DP Table = C(5,3) CS 312 – Dynamic Programming

  22. DP Table = C(5,3) What is the complexity? CS 312 – Dynamic Programming

  23. DP Table = C(5,3) What is the complexity? Number of cells (table size) × complexity to compute each cell CS 312 – Dynamic Programming
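
A small Python sketch of the table fill for C(n,k) (illustrative names), using the recurrence C(i,j) = C(i-1,j-1) + C(i-1,j) with base cases C(i,0) = 1; the complexity is the number of cells, O(nk), times O(1) per cell:

    def binomial(n, k):
        # C[i][j] = i choose j, built row by row like the DP table
        C = [[0] * (k + 1) for _ in range(n + 1)]
        for i in range(n + 1):
            C[i][0] = 1                  # base case: one way to choose nothing
            for j in range(1, min(i, k) + 1):
                C[i][j] = C[i - 1][j - 1] + C[i - 1][j]
        return C[n][k]

    print(binomial(5, 3))   # 10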

  24. DP Table = C(5,3) [completed table – the first column is all 1s and the entries form the rows of Pascal's triangle] • Notice a familiar pattern? CS 312 – Dynamic Programming

  25. Pascal's Triangle • Blaise Pascal (1623-1662) • Built one of the first mechanical calculators (the Pascaline) • Religious philosopher • Mathematician and physicist • Pascal's Triangle is a geometric arrangement of the binomial coefficients in a triangle • Pascal's Triangle holds many other mathematical patterns

  26. Edit Distance • A natural measure of similarity between two strings is the extent to which they can be aligned, or matched up • Example alignments of TACO with TEXCO: T-ACO over TEXCO (the gap in TACO matches deleting the E from TEXCO, i.e. TACO over TXCO), or TA-CO over TEXCO • "-" indicates a gap (insertion) • Note that an insert from the point of view of one string is the same as a delete from the point of view of the other (compare T-ACO/TEXCO with TACO/TXCO above) • We'll just say insert from now on to keep it simple • The Edit Distance between two strings is the minimum number of edits to convert one string into the other: insert (delete) or substitute • What is the edit distance of the above example? • What is our algorithm to calculate edit distance? • The number of possible alignments grows exponentially with string length n, so we try DP to solve it efficiently CS 312 – Dynamic Programming

  27. DP approach to Edit Distance • Two things to consider • Is there an ordering on the subproblems, and a relation that shows how to solve a subproblem given the answers to "smaller" subproblems, that is, subproblems that appear earlier in the ordering? • Is it the case that an optimal solution to a problem is built from optimal solutions to sub-problems? CS 312 – Dynamic Programming

  28. DP approach to Edit Distance • Assume two strings x and y of length m and n respectively • Consider the edit subproblem E(i,j) = E(x[1…i], y[1…j]) • For x = "taco" and y = "texco", E(2,3) = E("ta", "tex") • What is E(1,1) for this problem? And in general? • Would our approach be optimal for E(1,1)? • The final solution would then be E(m,n) • This notation gives a natural way to start from small cases and build up to larger ones • Now, we need a relation to solve E(i,j) in terms of smaller problems CS 312 – Dynamic Programming

  29. DP Edit Distance Approach • Start building a table • What are the base cases? • What is the relationship of the next open cell to previous cells? • Back pointer – note that it never changes: Markovian property • E(i,j) = ? CS 312 – Dynamic Programming

  30. [Edit distance table with rows indexed by i and columns indexed by j] What are the 3 options for filling a cell? CS 312 – Dynamic Programming

  31. DP Edit Distance Approach • Start building a table • What are the base cases? • What is the relationship of the next open cell to previous cells? • Back pointer – note that it never changes: Markovian property • E(i,j) = min[diff(i,j) + E(i-1,j-1), 1 + E(i-1,j), 1 + E(i,j-1)] • Intuition: the current cell is computed from the preceding adjacent cells • Diagonal is a match or substitution • Coming from the top cell represents an insert into the top word • i.e. a delete from the left word • Coming from the left cell represents an insert into the left word • i.e. a delete from the top word CS 312 – Dynamic Programming

  32. Possible Alignments • If we consider an empty cell E(i,j), there are only three possible alignments (e.g. E(2,2) = E("ta", "te")) • x[i] aligned with "-": cost = 1 + E(i-1,j) – top cell, insert into top word • E("ta","te") leads to an alignment ending in a over "-", with cost 1 + E("t","te") • y[j] aligned with "-": cost = 1 + E(i,j-1) – left cell, insert into left word • E("ta","te") leads to an alignment ending in "-" over e, with cost 1 + E("ta","t") • x[i] aligned with y[j]: cost = diff(i,j) + E(i-1,j-1) • E("ta","te") leads to an alignment ending in a over e, with cost 1 + E("t","t") (a substitution, since a ≠ e) • Thus E(i,j) = min[1 + E(i-1,j), 1 + E(i,j-1), diff(i,j) + E(i-1,j-1)] CS 312 – Dynamic Programming

  33. Edit Distance Algorithm • E(i,j) = min[1 + E(i-1,j), 1 + E(i,j-1), diff(i,j) + E(i-1,j-1)] • Note that we could use different penalties for insert and substitution based on whatever goals we have • Answers fill in a 2-d table • Any computation order is all right as long as E(i-1,j), E(i,j-1), and E(i-1,j-1) are computed before E(i,j) • What are the base cases? (x is any integer ≥ 0): • E(0,x) = x, example: E("", "rib") = 3 (3 inserts) • E(x,0) = x, example: E("ri", "") = 2 (2 inserts) • If we want to recover the edit sequence found, we just keep a back pointer to the previous minimum as we grow the table CS 312 – Dynamic Programming

  34. Edit Distance Algorithm
  for i = 0,1,2,…,m: E(i,0) = i    // base case: length of the x prefix
  for j = 0,1,2,…,n: E(0,j) = j    // base case: length of the y prefix
  for i = 1,2,…,m
      for j = 1,2,…,n
          E(i,j) = min[1 + E(i-1,j), 1 + E(i,j-1), diff(i,j) + E(i-1,j-1)]
  return E(m,n)
  What is the Complexity?
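
A direct Python transcription of the algorithm above (a sketch; variable names are illustrative):

    def edit_distance(x, y):
        m, n = len(x), len(y)
        # E[i][j] = edit distance between the prefixes x[0..i) and y[0..j)
        E = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            E[i][0] = i                              # base case: i inserts
        for j in range(n + 1):
            E[0][j] = j                              # base case: j inserts
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                diff = 0 if x[i - 1] == y[j - 1] else 1
                E[i][j] = min(1 + E[i - 1][j],       # x[i] aligned with "-"
                              1 + E[i][j - 1],       # y[j] aligned with "-"
                              diff + E[i - 1][j - 1])  # match or substitute
        return E[m][n]

    print(edit_distance("EXPONENTIAL", "POLYNOMIAL"))  # 6, as on the next slide

Complexity: O(mn) cells with O(1) work per cell.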

  35. Edit Distance Example and DAG • This is a weighted DAG with weights of 0 and 1. We can just find the least-cost path in the DAG to retrieve the optimal edit sequence(s) • Down arrows are insertions into "Polynomial" with cost 1 • Right arrows are insertions into "Exponential" with cost 1 • Diagonal arrows are either matches (dashed) with cost 0 or substitutions with cost 1 • Edit distance of 6: EXPONEN-TIAL aligned over --POLYNOMIAL • Can set costs arbitrarily based on goals CS 312 – Dynamic Programming

  36. Space Requirements • The basic table is m × n, which is O(n^2) assuming m and n are similar • What order options can we use to calculate cells? • But do we really need to use O(n^2) memory? • How can we implement edit distance using only O(n) memory? • What about prev pointers and extracting the actual alignment? CS 312 – Dynamic Programming
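
One way to answer the O(n) memory question: each row of the table depends only on the previous row, so we can keep just two rows. A minimal Python sketch (illustrative; it returns only the score, which is why extracting the actual alignment is the harder question the slide raises):

    def edit_distance_linear_space(x, y):
        m, n = len(x), len(y)
        prev = list(range(n + 1))          # row i = 0: E(0, j) = j
        for i in range(1, m + 1):
            cur = [i] + [0] * n            # E(i, 0) = i
            for j in range(1, n + 1):
                diff = 0 if x[i - 1] == y[j - 1] else 1
                cur[j] = min(1 + prev[j], 1 + cur[j - 1], diff + prev[j - 1])
            prev = cur                     # discard rows older than one
        return prev[n]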

  37. Gene Sequence Alignment X=ACGCTC Y=ACTTG CS 312 – Dynamic Programming

  38. Needleman-Wunsch Algorithm • Gene Sequence Alignment is a type of Edit Distance ACGCT-C A--CTTG • Uses the Needleman-Wunsch Algorithm • This is just edit distance with a different cost weighting • You will use Needleman-Wunsch in your project • Costs (typical Needleman-Wunsch costs are shown): • Match: cmatch = -3 (a reward) • Insertion into x (= deletion from y): cindel = 5 • Insertion into y (= deletion from x): cindel = 5 • Substitution of a character from x into y (or from y into x): csub = 1 • You will use the above costs in your HW and project • Does that change the base cases? CS 312 – Dynamic Programming
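
A sketch of how the weighted costs change the recurrence (illustrative only, not the project code); note that the base cases now scale by the indel cost:

    MATCH, INDEL, SUB = -3, 5, 1   # typical Needleman-Wunsch costs from the slide

    def nw_score(x, y):
        m, n = len(x), len(y)
        E = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            E[i][0] = i * INDEL            # base case: i indels, not just i
        for j in range(n + 1):
            E[0][j] = j * INDEL
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                d = MATCH if x[i - 1] == y[j - 1] else SUB
                E[i][j] = min(INDEL + E[i - 1][j],
                              INDEL + E[i][j - 1],
                              d + E[i - 1][j - 1])
        return E[m][n]

So yes, the base cases change: E(i,0) = i·cindel and E(0,j) = j·cindel rather than i and j.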

  39. Gene Alignment Project • You will implement two versions (using Needleman-Wunsch) • One gives the match score in O(n^2) time and O(n) space and does not extract the actual alignment • The other will extract the alignment and will be O(n^2) in time and space • You will align 10 supplied real gene sequences with each other (100/2 = 50 alignments) • atattaggtttttacctacc • caggaaaagccaaccaact • You will only align the first 5000 bases in each taxa • Some values are given to you for debugging purposes; your other results will be used to test your code's correctness CS 312 – Dynamic Programming

  40. Knapsack • Given items x1, x2,…, xn • each with weight wi and value vi • find the set of items which maximizes the total value Σi xivi • under the constraint that the total weight of the items, Σi xiwi, does not exceed a given W • Many resource problems follow this pattern • Task scheduling with a CPU • Allocating files to memory/disk • Bandwidth on a network connection, etc. • There are two variations depending on whether an item can be chosen more than once (repetition) CS 312 – Dynamic Programming

  41. Knapsack Approaches W = 10 • Will greedy always work? • Exponential number of item combinations • 2^n for Knapsack without repetition – why? • Many more for Knapsack with repetition • How about DP? • Always ask: what are the subproblems? CS 312 – Dynamic Programming

  42. Knapsack with Repetition • Two types of subproblems possible • consider knapsacks with less capacity • consider fewer items • Define K(w) = maximum value achievable with a knapsack of capacity w • Final answer is K(W) • Subproblem relation – if the optimal K(w) solution includes item i, then removing i leaves the optimal solution K(w – wi) • Can only contain i if wi ≤ w • Thus K(w) = max over {i : wi ≤ w} of [K(w – wi) + vi] • Note that it is not dependent on an n-1 type recurrence (unlike edit distance) CS 312 – Dynamic Programming

  43. Knapsack with Repetition Algorithm W = 10
  K(0) = 0
  for w = 1 to W
      K(w) = max over {i : wi ≤ w} of [K(w – wi) + vi]
  return K(W)
  Build Table – Table size? – Do example
  Complexity is ? CS 312 – Dynamic Programming

  44. Knapsack with Repetition Algorithm W = 10
  K(0) = 0
  for w = 1 to W
      K(w) = max over {i : wi ≤ w} of [K(w – wi) + vi]
  return K(W)
  Build Table – Table size? Complexity is O(nW)
  Insight: W can get very large, while the input size n needed to write W down is only proportional to logb(W); thus W ≈ b^n, which makes O(nW) = O(n·b^n) – exponential in the input size n
  More on complexity issues in Ch. 8 CS 312 – Dynamic Programming
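
A minimal Python sketch of the algorithm (illustrative names); the table has W+1 cells and each takes O(n) to compute, giving O(nW):

    def knapsack_with_repetition(W, weights, values):
        # K[w] = maximum value achievable with capacity w
        K = [0] * (W + 1)
        for w in range(1, W + 1):
            K[w] = max([K[w - wi] + vi
                        for wi, vi in zip(weights, values) if wi <= w],
                       default=0)          # no item fits: value 0
        return K[W]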

  45. Recursion and Memoization
  Plain recursive version:
  function K(w)
      if w = 0: return 0
      return max over {i : wi ≤ w} of [K(w – wi) + vi]
  Memoized version:
  function K(w)
      if w = 0: return 0
      if K(w) is in the hash table: return K(w)
      K(w) = max over {i : wi ≤ w} of [K(w – wi) + vi]
      insert K(w) into the hash table
      return K(w)
  DP version:
  K(0) = 0
  for w = 1 to W: K(w) = max over {i : wi ≤ w} of [K(w – wi) + vi]
  return K(W)
  • The plain recursive version could do lots of redundant computations, plus the overhead of recursion • However, what if we insert all intermediate computations into a hash table – Memoize • Usually we still solve all the same subproblems with memoized recursion as with normal DP (e.g. edit distance) • For knapsack we might avoid unnecessary computations in the DP table because w is decremented by wi (more than 1) each time • Still O(nW) but with better constants than DP for some cases

  46. Recursion and Memoization • Insight: When can DP gain efficiency by recursively starting from the final goal and only solving those subproblems required for the specific goal? • If we knew exactly which subproblems were needed for the specific goal we could have done a more direct (best-first) approach • With DP, we do not know which of the subproblems are needed, so we compute all that might be needed • However, in some cases the final solution will never require that certain previous table cells be computed • For example, if there are 3 items in the knapsack, with weights 50, 80, and 100, we could do recursive DP and avoid computing K(75), K(76), K(77), etc., which could never be necessary but would have been calculated by the standard DP algorithm • Would this approach help us for Edit Distance? CS 312 – Dynamic Programming

  47. Knapsack without Repetition • Our relation now has to track which items are available • K(w,j) = maximum value achievable given capacity w and only considering items 1,…, j • Means only items 1,…, j are available, but we actually just use some subset • Final answer is K(W,n) • Express the relation as: either the jth item is in the solution or not • K(w,j) = max[K(w – wj, j-1) + vj, K(w, j-1)] • If wj > w then ignore the first case • Base cases? CS 312 – Dynamic Programming

  48. Knapsack without Repetition • Our relation now has to track which items are available • K(w,j) = maximum value achievable given capacity w and only considering items 1,…, j • Means only items 1,…, j are available, but we actually just use some subset • Final answer is K(W,n) • Express the relation as: either the jth item is in the solution or not • K(w,j) = max[K(w – wj, j-1) + vj, K(w, j-1)] • If wj > w then ignore the first case • Base cases? • Running time is still O(Wn), and the table is W+1 by n+1 CS 312 – Dynamic Programming
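
A Python sketch of the two-variable relation (illustrative names); the table is (W+1) × (n+1) as noted above:

    def knapsack_no_repetition(W, weights, values):
        n = len(weights)
        # K[w][j] = best value with capacity w using only items 1..j
        # Base cases: K[w][0] = 0 (no items) and K[0][j] = 0 (no capacity)
        K = [[0] * (n + 1) for _ in range(W + 1)]
        for j in range(1, n + 1):
            wj, vj = weights[j - 1], values[j - 1]
            for w in range(1, W + 1):
                K[w][j] = K[w][j - 1]                      # skip item j
                if wj <= w:                                # or take item j
                    K[w][j] = max(K[w][j], K[w - wj][j - 1] + vj)
        return K[W][n]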

  49. Knapsack without Repetition Table? W = 10 CS 312 – Dynamic Programming

  50. Knapsack without Repetition Example W = 10
