This lecture introduces dynamic programming through the Longest Common Subsequence (LCS) problem. A subsequence may skip characters of a string, whereas a substring must use consecutive characters. The LCS problem asks for the longest sequence common to two strings and has applications in fields such as DNA testing. Dynamic programming applies to problems with simple subproblems, subproblem optimality, and overlapping subproblems, and the lecture shows how the technique computes the LCS efficiently.
CSC 213 Lecture 19: Dynamic Programming and LCS
Subsequences (§ 11.5.1) • A subsequence of a string $x_0 x_1 x_2 \dots x_{n-1}$ is a string of the form $x_{i_1} x_{i_2} \dots x_{i_k}$, where $i_j < i_{j+1}$ • This is not the same thing as a substring! • Subsequences can skip letters in the string • Substrings must use consecutive letters • Example string: ABCDEFGHIJK • Subsequence (& substring): DEFGH • Subsequence (& NOT substring): ACEFHJ • Not subsequence or substring: DAGH
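The difference between the two definitions can be checked mechanically. The sketch below is only an illustration: the function names `is_subsequence` and `is_substring` are made up here and do not come from the lecture.

```python
def is_subsequence(candidate, text):
    """Return True if candidate can be formed by deleting letters from text."""
    remaining = iter(text)
    # Each `in` test consumes the iterator up to the matched letter,
    # so later letters of candidate must appear later in text.
    return all(letter in remaining for letter in candidate)


def is_substring(candidate, text):
    """Return True if candidate appears as consecutive letters of text."""
    return candidate in text


text = "ABCDEFGHIJK"
for candidate in ("DEFGH", "ACEFHJ", "DAGH"):
    print(candidate, is_subsequence(candidate, text), is_substring(candidate, text))
```

Run on the slide's example string, this reports DEFGH as both, ACEFHJ as a subsequence only, and DAGH as neither.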
Longest Common Subsequence (LCS) Problem • Given two strings X and Y, find a longest subsequence common to both X and Y • Applications in DNA testing (Σ = {A, C, G, T}) • Example: the LCS of ABCDEFG and XZACKDFWGH is ACDFG
Dynamic Programming • Some problems appear hard • There does not seem to be a simple solution • They seem to require a brute-force approach: evaluate every candidate solution • This means constantly reevaluating a lot of options • Ultimately, this takes exponential time, $O(2^n)$ • For a class of problems, however, the solution is: Dynamic Programming
Dynamic Programming • Works for problems with: • Simple subproblems: can be defined using only a few simple variables • Subproblem optimality: can define how to solve the problem using the subproblem solutions • Subproblem overlap: subproblems overlap such that the solution to a first subproblem can (help) solve later subproblems
How Not to Solve LCS in your Lifetime • Brute-force solution: • List all subsequences of X • Check each subsequence to see if it is also a subsequence of Y • Return the longest one of these • Analysis: • If X has length n, it has $2^n$ subsequences • While waiting, you can not only get coffee, but could first fly to Colombia and pick the beans!
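To make the cost concrete, here is a minimal sketch of the brute-force idea. It is an illustration rather than code from the lecture; `brute_force_lcs` and `is_subsequence` are hypothetical names, and the search goes from longest to shortest so it can stop at the first hit instead of listing everything first.

```python
from itertools import combinations


def is_subsequence(candidate, text):
    """Return True if the letters of candidate appear in text in the same order."""
    remaining = iter(text)
    return all(letter in remaining for letter in candidate)


def brute_force_lcs(x, y):
    """Try subsequences of x from longest to shortest; return the first found in y."""
    for length in range(len(x), -1, -1):
        # combinations() keeps the letters in their original order,
        # so every tuple it yields is a subsequence of x.
        for letters in combinations(x, length):
            if is_subsequence(letters, y):
                return "".join(letters)
    return ""  # not reached: the empty subsequence always matches


print(brute_force_lcs("ABCDEFG", "XZACKDFWGH"))  # prints ACDFG
```

With only 7 letters in X there are at most 128 candidates, so this finishes instantly; each extra letter doubles the worst-case number of candidates, which is why the approach stops being usable for long strings.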
How to Solve LCS Quickly • If X and Y are each 1 character long, the LCS has length 0 or 1 • If we then add 1 character to both X and Y, the LCS length increases by at most 1 • Note that we do not need to rescan the first character
Dynamic-Programming Solution • Use an array L to hold solutions to subproblems • L[i,j] stores the length of the LCS of X[0..i] and Y[0..j] • Define the array to include an index of -1 • L[-1,*] holds the LCS length for X[0..-1] = “” • L[*,-1] holds the LCS length for Y[0..-1] = “” • L[-1,*] = 0 and L[*,-1] = 0 since there are no characters to match!
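Most languages do not offer an array index of -1 with this meaning (in Python, `L[-1]` silently refers to the last element instead). One common workaround, sketched here with hypothetical helper names `make_table`, `get`, and `put`, is to allocate one extra row and column and shift every index by one.

```python
def make_table(n, m):
    """Allocate an (n+1) x (m+1) table of zeros.

    Row 0 and column 0 stand in for index -1, so they already hold
    the boundary values L[-1, *] = 0 and L[*, -1] = 0.
    """
    return [[0] * (m + 1) for _ in range(n + 1)]


def get(table, i, j):
    """Read L[i, j] using the slide's indexing, where i and j may be -1."""
    return table[i + 1][j + 1]


def put(table, i, j, value):
    """Write L[i, j] using the slide's indexing."""
    table[i + 1][j + 1] = value


table = make_table(3, 4)   # X of length 3, Y of length 4
put(table, 0, 0, 1)
print(get(table, -1, 2), get(table, 0, 0))  # prints 0 1
```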
Dynamic-Programming Solution • Solve for remaining L[i,j] as follows: • If $x_i = y_j$, then L[i,j] = L[i-1, j-1] + 1 • I.e., one more than the previous solution • If $x_i \neq y_j$, then L[i,j] = max(L[i-1, j], L[i, j-1]) • I.e., use however well we did before • Final result will be stored in L[n-1, m-1], where n and m are the lengths of X and Y
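The two rules can be transcribed almost literally as a recursive function with caching. This top-down form is a sketch for illustration, not the array-based method the lecture builds; `lcs_length` and the inner function `L` are names chosen here, and `functools.lru_cache` is what keeps overlapping subproblems from being recomputed.

```python
from functools import lru_cache


def lcs_length(x, y):
    """Length of a longest common subsequence of x and y."""

    @lru_cache(maxsize=None)
    def L(i, j):
        if i < 0 or j < 0:
            # One of the prefixes is empty, so nothing can match.
            return 0
        if x[i] == y[j]:
            # Matching last letters: one more than the smaller problem.
            return L(i - 1, j - 1) + 1
        # Mismatch: keep the better result after dropping one last letter.
        return max(L(i - 1, j), L(i, j - 1))

    return L(len(x) - 1, len(y) - 1)


print(lcs_length("ABCDEFG", "XZACKDFWGH"))  # prints 5
```

Because it recurses, this version is limited by Python's recursion depth on long inputs; filling the table iteratively avoids that.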
LCS Algorithm
Algorithm LCS(String X, String Y):
    for i ← -1 to n-1
        L[i, -1] ← 0
    for j ← 0 to m-1
        L[-1, j] ← 0
    for i ← 0 to n-1
        for j ← 0 to m-1
            if x_i = y_j then
                L[i, j] ← L[i-1, j-1] + 1
            else
                L[i, j] ← max(L[i-1, j], L[i, j-1])
    return L
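A runnable version of the pseudocode might look like the sketch below. Two things in it are assumptions that go beyond the slides: the table is shifted by one so that row 0 and column 0 play the role of index -1, and a backward walk through the finished table is added to recover the letters of an LCS rather than only the table of lengths.

```python
def lcs(x, y):
    """Return one longest common subsequence of x and y."""
    n, m = len(x), len(y)
    # table[i + 1][j + 1] holds L[i, j]; row 0 and column 0 are the
    # all-zero boundary that the pseudocode stores at index -1.
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            if x[i] == y[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])

    # Walk back from L[n-1, m-1], collecting a letter at every match.
    letters = []
    i, j = n - 1, m - 1
    while i >= 0 and j >= 0:
        if x[i] == y[j]:
            letters.append(x[i])
            i, j = i - 1, j - 1
        elif table[i][j + 1] >= table[i + 1][j]:
            i -= 1      # L[i-1, j] is at least as good: drop x[i]
        else:
            j -= 1      # L[i, j-1] is better: drop y[j]
    return "".join(reversed(letters))


print(lcs("ABCDEFG", "XZACKDFWGH"))  # prints ACDFG
```

The nested loops visit each of the n·m table cells once, so the running time is O(nm) instead of the exponential cost of trying every subsequence.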