
Prof. Swarat Chaudhuri

COMP 482: Design and Analysis of Algorithms. Prof. Swarat Chaudhuri. Spring 2012 Lecture 17. Q1: Longest palindromic subsequence. Give an algorithm to find the longest subsequence of a given string A that is a palindrome. “amantwocamelsacrazyplanacanalpanama”. Q1-a: Palindromes (contd.).



  1. COMP 482: Design and Analysis of Algorithms Prof. Swarat Chaudhuri Spring 2012 Lecture 17

  2. Q1: Longest palindromic subsequence
     • Give an algorithm to find the longest subsequence of a given string A that is a palindrome.
     • “amantwocamelsacrazyplanacanalpanama”

  3. Q1-a: Palindromes (contd.)
     • Every string can be decomposed into a sequence of palindromes.
     • Give an efficient algorithm to compute the smallest number of palindromes that makes up a given string.

  4. 6.5 RNA Secondary Structure

  5. RNA Secondary Structure
     • RNA. String B = b_1 b_2 … b_n over alphabet { A, C, G, U }.
     • Secondary structure. RNA is single-stranded, so it tends to loop back and form base pairs with itself. This structure is essential for understanding the behavior of the molecule.
     • Ex: GUCGAUUGAGCGAAUGUAACAACGUGGCUACGGCGAGA
     • Complementary base pairs: A-U, C-G.
     [Figure: the example sequence folded into its secondary structure.]

  6. RNA Secondary Structure
     • Secondary structure. A set of pairs S = { (b_i, b_j) } that satisfy:
        • [Watson-Crick.] S is a matching and each pair in S is a Watson-Crick complement: A-U, U-A, C-G, or G-C.
        • [No sharp turns.] The ends of each pair are separated by at least 4 intervening bases. If (b_i, b_j) ∈ S, then i < j - 4.
        • [Non-crossing.] If (b_i, b_j) and (b_k, b_l) are two pairs in S, then we cannot have i < k < j < l.
     • Free energy. The usual hypothesis is that an RNA molecule will form the secondary structure with the optimum total free energy, approximated here by the number of base pairs.
     • Goal. Given an RNA molecule B = b_1 b_2 … b_n, find a secondary structure S that maximizes the number of base pairs.
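The three conditions on this slide can be checked mechanically. The following is a minimal sketch, not part of the lecture: the function name `is_secondary_structure` and the representation of S as 1-indexed position pairs (i, j) are assumptions made for illustration.

```python
# Watson-Crick complements as ordered pairs of bases.
COMPLEMENTS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def is_secondary_structure(b, pairs):
    """Return True if `pairs` (1-indexed (i, j) tuples) satisfies the slide's conditions on string b."""
    used = set()
    for i, j in pairs:
        if not (1 <= i < j <= len(b)):
            return False
        # Watson-Crick: the two bases must be complementary.
        if (b[i - 1], b[j - 1]) not in COMPLEMENTS:
            return False
        # No sharp turns: i < j - 4.
        if not i < j - 4:
            return False
        # Matching: each position appears in at most one pair.
        if i in used or j in used:
            return False
        used.update((i, j))
    # Non-crossing: no two pairs with i < k < j < l.
    for i, j in pairs:
        for k, l in pairs:
            if i < k < j < l:
                return False
    return True
```

For instance, on the hypothetical string "GAAAAACU" the pairs (1, 7) and (2, 8) are each individually fine but cross each other, so the check rejects them.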

  7. RNA Secondary Structure: Examples
     • Examples. [Figure: three example pairings, labeled "ok", "sharp turn", and "crossing".]

  8. RNA Secondary Structure: Subproblems
     • First attempt. OPT(j) = maximum number of base pairs in a secondary structure of the substring b_1 b_2 … b_j.
     • Difficulty. Matching b_t and b_n results in two sub-problems:
        • Finding secondary structure in b_1 b_2 … b_{t-1}: this is OPT(t-1).
        • Finding secondary structure in b_{t+1} b_{t+2} … b_{n-1}: this is not of the form OPT(j), so we need more sub-problems.

  9. Dynamic Programming Over Intervals
     • Notation. OPT(i, j) = maximum number of base pairs in a secondary structure of the substring b_i b_{i+1} … b_j.
     • Case 1. If i ≥ j - 4.
        • OPT(i, j) = 0 by the no-sharp-turns condition.
     • Case 2. Base b_j is not involved in a pair.
        • OPT(i, j) = OPT(i, j-1)
     • Case 3. Base b_j pairs with b_t for some i ≤ t < j - 4.
        • The non-crossing constraint decouples the resulting sub-problems.
        • OPT(i, j) = 1 + max_t { OPT(i, t-1) + OPT(t+1, j-1) }, where the max is taken over t such that i ≤ t < j - 4 and b_t and b_j are Watson-Crick complements.
     • Remark. Same core idea as in the CKY algorithm to parse context-free grammars.
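The three cases can be evaluated directly as a memoized recursion, taking the larger of Case 2 and Case 3 whenever Case 1 does not apply. This is a sketch under assumptions: the name `max_pairs_top_down` is hypothetical, positions are 1-indexed as on the slide, and the recursion depth grows with the string length, so it is only suitable for short inputs.

```python
from functools import lru_cache

COMPLEMENTS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def max_pairs_top_down(b):
    """Evaluate OPT(1, n) for the RNA string b by memoized recursion."""
    n = len(b)

    @lru_cache(maxsize=None)
    def opt(i, j):
        # Case 1: interval too short for any pair (also covers empty intervals).
        if i >= j - 4:
            return 0
        # Case 2: b_j is not involved in a pair.
        best = opt(i, j - 1)
        # Case 3: b_j pairs with b_t for some i <= t < j - 4.
        for t in range(i, j - 4):
            if (b[t - 1], b[j - 1]) in COMPLEMENTS:
                best = max(best, 1 + opt(i, t - 1) + opt(t + 1, j - 1))
        return best

    return opt(1, n)
```

On the string "ACCGGUAGU" this returns 2, for example by pairing positions (1, 9) and (2, 8).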

  10. Bottom Up Dynamic Programming Over Intervals
     • Q. What order to solve the sub-problems?
     • A. Do shortest intervals first.

        RNA(b_1, …, b_n) {
           for k = 5, 6, …, n-1
              for i = 1, 2, …, n-k
                 j = i + k
                 Compute M[i, j] using recurrence
           return M[1, n]
        }

     • Running time. O(n^3).
     [Figure: table of M[i, j] for i = 1, …, 4 and j = 6, …, 9, filled in by increasing interval length.]
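The loop order on this slide can be sketched in runnable form as follows. The function name `max_pairs_bottom_up` and the padded list-of-lists table are assumptions for illustration; entries for intervals with fewer than 5 intervening positions are simply left at 0.

```python
COMPLEMENTS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def max_pairs_bottom_up(b):
    """Fill M[i][j] = OPT(i, j) by increasing interval length k = j - i."""
    n = len(b)
    if n == 0:
        return 0
    # 1-indexed table with padding so M[i][i-1] and M[t+1][j-1] always exist.
    M = [[0] * (n + 2) for _ in range(n + 2)]
    for k in range(5, n):                  # k = 5, 6, ..., n-1
        for i in range(1, n - k + 1):      # i = 1, 2, ..., n-k
            j = i + k
            best = M[i][j - 1]             # b_j unpaired
            for t in range(i, j - 4):      # b_j paired with b_t
                if (b[t - 1], b[j - 1]) in COMPLEMENTS:
                    best = max(best, 1 + M[i][t - 1] + M[t + 1][j - 1])
            M[i][j] = best
    return M[1][n]
```

There are O(n^2) table entries and each takes O(n) work for the scan over t, which matches the O(n^3) bound stated on the slide.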

  11. 6.8 Shortest Paths

  12. Shortest Paths
     • Shortest path problem. Given a directed graph G = (V, E) with edge weights c_vw (negative weights allowed), find the shortest path from node s to node t.
     • Ex. Nodes represent agents in a financial setting, and c_vw is the cost of a transaction in which we buy from agent v and sell immediately to w.
     [Figure: example weighted directed graph from s to t, including negative edge weights.]

  13. Shortest Paths: Failed Attempts
     • Dijkstra. Can fail if there are negative edge costs.
     • Re-weighting. Adding a constant to every edge weight can fail.
     [Figure: two small example graphs, one for each failed attempt.]

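  14. Shortest Paths: Negative Cost Cycles
     • Negative cost cycle. A directed cycle W with c(W) < 0. [Figure: a cycle with edge costs -6, -4, and 7.]
     • Observation. If some path from s to t contains a negative cost cycle, there does not exist a shortest s-t path; otherwise, there exists one that is simple.
     [Figure: an s-t path passing through a cycle W with c(W) < 0.]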

  15. Shortest Paths: Dynamic Programming
     • Def. OPT(i, v) = length of shortest v-t path P using at most i edges.
     • Case 1: P uses at most i-1 edges.
        • OPT(i, v) = OPT(i-1, v)
     • Case 2: P uses exactly i edges.
        • If (v, w) is the first edge, then OPT uses (v, w) and then selects the best w-t path using at most i-1 edges, so OPT(i, v) = min over edges (v, w) ∈ E of { OPT(i-1, w) + c_vw }.
     • Remark. By the previous observation, if there are no negative cycles, then OPT(n-1, v) = length of shortest v-t path.

  16. Shortest Paths: Implementation
     • Analysis. Θ(mn) time, Θ(n^2) space.
     • Finding the shortest paths. Maintain a "successor" for each table entry.

        Shortest-Path(G, t) {
           foreach node v ∈ V
              M[0, v] ← ∞
           M[0, t] ← 0

           for i = 1 to n-1
              foreach node v ∈ V
                 M[i, v] ← M[i-1, v]
              foreach edge (v, w) ∈ E
                 M[i, v] ← min { M[i, v], M[i-1, w] + c_vw }
        }
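A runnable sketch of the table-based procedure on this slide. The function name `shortest_path_table` and the graph representation (a list of nodes plus a dict mapping each edge (v, w) to its cost c_vw) are assumptions chosen for illustration.

```python
import math


def shortest_path_table(nodes, edges, t):
    """Return the table M as a list of dicts, where M[i][v] = OPT(i, v)."""
    n = len(nodes)
    first_row = {v: math.inf for v in nodes}
    first_row[t] = 0
    M = [first_row]
    for i in range(1, n):
        # Case 1: reuse the best path with at most i-1 edges.
        row = dict(M[i - 1])
        # Case 2: take edge (v, w) first, then the best w-t path with at most i-1 edges.
        for (v, w), cost in edges.items():
            row[v] = min(row[v], M[i - 1][w] + cost)
        M.append(row)
    return M
```

With the hypothetical edges s→a (2), a→t (3), s→b (4), b→a (-3), s→t (6), the entry for s improves from 6 to 5 to 4 as i goes from 1 to 3, since the cheapest route s→b→a→t needs three edges.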

  17. Shortest Paths: Practical Improvements
     • Practical improvements.
        • Maintain only one array M[v] = length of the shortest v-t path that we have found so far.
        • No need to check edges of the form (v, w) unless M[w] changed in the previous iteration.
     • Theorem. Throughout the algorithm, M[v] is the length of some v-t path, and after i rounds of updates, the value M[v] is no larger than the length of the shortest v-t path using ≤ i edges.
     • Overall impact.
        • Memory: O(m + n).
        • Running time: O(mn) worst case, but substantially faster in practice.

  18. Bellman-Ford: Efficient Implementation

        Push-Based-Shortest-Path(G, s, t) {
           foreach node v ∈ V {
              M[v] ← ∞
              successor[v] ← ∅
           }

           M[t] ← 0
           for i = 1 to n-1 {
              foreach node w ∈ V {
                 if (M[w] has been updated in previous iteration) {
                    foreach node v such that (v, w) ∈ E {
                       if (M[v] > M[w] + c_vw) {
                          M[v] ← M[w] + c_vw
                          successor[v] ← w
                       }
                    }
                 }
              }
              If no M[w] value changed in iteration i, stop.
           }
        }
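The push-based procedure can be sketched in runnable form as below. The function name, the dict-based graph representation, and the use of a set to track which nodes changed in the previous round are assumptions for illustration; `None` stands in for the empty successor.

```python
import math


def push_based_shortest_path(nodes, edges, t):
    """Return (M, successor) where M[v] is the length of a shortest v-t path, assuming no negative cycles."""
    # incoming[w] lists (v, c_vw) for every edge (v, w).
    incoming = {w: [] for w in nodes}
    for (v, w), cost in edges.items():
        incoming[w].append((v, cost))

    M = {v: math.inf for v in nodes}
    successor = {v: None for v in nodes}
    M[t] = 0
    updated = {t}                      # nodes whose M value changed in the previous round
    for _ in range(1, len(nodes)):     # at most n-1 rounds
        changed = set()
        for w in updated:
            for v, cost in incoming[w]:
                if M[v] > M[w] + cost:
                    M[v] = M[w] + cost
                    successor[v] = w
                    changed.add(v)
        if not changed:                # early termination rule
            break
        updated = changed
    return M, successor
```

Following `successor` from any node with a finite M value traces out a shortest path to t, which is the role the "successor" entries play on the earlier implementation slide.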

  19. Dynamic Programming Summary
     • Recipe.
        • Characterize structure of problem.
        • Recursively define value of optimal solution.
        • Compute value of optimal solution.
        • Construct optimal solution from computed information.
     • Dynamic programming techniques.
        • Binary choice: weighted interval scheduling.
        • Multi-way choice: segmented least squares.
        • Adding a new variable: knapsack.
        • Dynamic programming over intervals: RNA secondary structure.
     • Top-down vs. bottom-up: different people have different intuitions.
     • Note. The Viterbi algorithm for HMMs also uses DP to optimize a maximum likelihood tradeoff between parsimony and accuracy.
     • Note. The CKY parsing algorithm for context-free grammars has a similar structure.

  20. 6.10 Negative Cycles in a Graph

  21. Detecting Negative Cycles
     • Lemma. If OPT(n, v) = OPT(n-1, v) for all v, then no negative cycles.
     • Pf. Bellman-Ford algorithm.
     • Lemma. If OPT(n, v) < OPT(n-1, v) for some node v, then (any) shortest path P from v to t contains a cycle W. Moreover, W has negative cost.
     • Pf. (by contradiction)
        • Since OPT(n, v) < OPT(n-1, v), we know P has exactly n edges.
        • By the pigeonhole principle, P must contain a directed cycle W.
        • Deleting W yields a v-t path with < n edges ⇒ W has negative cost.
     [Figure: a path from v to t containing a cycle W with c(W) < 0.]

  22. Detecting Negative Cycles
     • Theorem. Can detect a negative cost cycle in O(mn) time.
        • Add a new node t and connect all nodes to t with a 0-cost edge.
        • Check if OPT(n, v) = OPT(n-1, v) for all nodes v.
           • If yes, then no negative cycles.
           • If no, then extract a cycle from the shortest path from v to t.
     [Figure: example graph with 0-cost edges from every node to the new node t.]

  23. Detecting Negative Cycles: Summary
     • Bellman-Ford. O(mn) time, O(m + n) space.
        • Run Bellman-Ford for n iterations (instead of n-1).
        • Upon termination, Bellman-Ford successor variables trace a negative cycle if one exists.
        • See p. 288 for improved version and early termination rule.
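One way the detection procedure could look in code is sketched below. It follows the slides' outline (add a sink t with 0-cost edges, run n rounds, trace successor pointers) using the single-array variant of Bellman-Ford; the function name `find_negative_cycle`, the dict-based graph, and the choice to start the trace from the last node updated in round n are assumptions, not details given in the lecture.

```python
import math


def find_negative_cycle(nodes, edges):
    """Return a list of nodes forming a negative-cost cycle (in edge order), or None."""
    sink = object()                          # hypothetical fresh node t
    all_nodes = list(nodes) + [sink]
    all_edges = dict(edges)
    for v in nodes:
        all_edges[(v, sink)] = 0             # 0-cost edge from every node to t

    M = {v: math.inf for v in all_nodes}
    successor = {v: None for v in all_nodes}
    M[sink] = 0
    n = len(all_nodes)

    last = None
    for _ in range(n):                       # n rounds instead of n-1
        last = None
        for (v, w), cost in all_edges.items():
            if M[w] + cost < M[v]:
                M[v] = M[w] + cost
                successor[v] = w
                last = v
        if last is None:
            return None                      # a full round with no update: no negative cycle

    # Some value still changed in round n. Walk n successor steps from the
    # last updated node to land on the cycle, then read the cycle off.
    x = last
    for _ in range(n):
        x = successor[x]
    cycle = [x]
    y = successor[x]
    while y != x:
        cycle.append(y)
        y = successor[y]
    return cycle
```

The early return is the same termination rule used in the push-based version: once a whole round changes nothing, later rounds cannot change anything either.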

  24. Q2. Arbitrage
     • Arbitrage is the use of discrepancies in currency exchange rates to transform one unit of a currency into more than one unit of the same currency. For example, suppose that 1 US dollar buys 0.7 British pound, 1 British pound buys 9.5 French francs, and 1 French franc buys 0.16 US dollar. Then, by converting currencies, a trader can start with 1 US dollar and buy 0.7 × 9.5 × 0.16 = 1.064 US dollars, thus turning a profit of 6.4 percent.
     • Suppose that we are given n currencies c_1, …, c_n and an n × n table R of exchange rates, such that one unit of currency c_i buys R[i, j] units of currency c_j.
        • Give an efficient algorithm to determine whether or not there exists a sequence of currencies (c_{i_1}, …, c_{i_k}) such that R[i_1, i_2] × R[i_2, i_3] × … × R[i_{k-1}, i_k] × R[i_k, i_1] > 1.
        • Give an efficient algorithm to print out such a sequence if one exists. Analyze the running time of your algorithm.
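To make the condition in the question concrete, the sketch below only evaluates the product of rates around a given sequence of currencies; it does not search for such a sequence, which is the exercise itself. The helper name `cycle_product` and the representation of R as a dict keyed by (i, j) are assumptions, and only the three rates quoted in the example are filled in.

```python
import math


def cycle_product(R, cycle):
    """Multiply R[(i, j)] over consecutive currencies in `cycle`, wrapping back to the start."""
    product = 1.0
    for i, j in zip(cycle, cycle[1:] + cycle[:1]):
        product *= R[(i, j)]
    return product


# Rates from the slide's example: US dollar -> British pound -> French franc -> US dollar.
USD, GBP, FRF = 0, 1, 2
R = {(USD, GBP): 0.7, (GBP, FRF): 9.5, (FRF, USD): 0.16}
profit_factor = cycle_product(R, [USD, GBP, FRF])
```

Here `profit_factor` comes out at about 1.064, so this particular sequence satisfies the "> 1" condition.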

  25. Q3. Number of shortest paths
     • Suppose we have a directed graph with costs on the edges. The costs may be positive or negative, but every cycle in the graph has a strictly positive cost. We are also given two nodes v, w. Give an efficient algorithm that computes the number of shortest v-w paths in G.
