Less Than Matching

Presentation Transcript


  1. Less Than Matching. Orgad Keller. Modified by Ariel Rosenfeld.

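  2. Less Than Matching • Input: A text $T = t_1 \dots t_n$ and a pattern $P = p_1 \dots p_m$ over an alphabet $\Sigma$ with an order relation $\leq$. • Output: All locations $i$ where $p_j \leq t_{i+j-1}$ for every $j = 1, \dots, m$. • Can we use the regular methods? Algorithms 2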
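A brute-force reference for the problem statement can be sketched as follows. It assumes the matching condition is $p_j \leq t_{i+j-1}$ for every $j$ (the convention the binary-alphabet example later in the slides is consistent with), uses 0-based locations, and the function name `less_than_match_naive` is a hypothetical name chosen for illustration.

```python
def less_than_match_naive(text, pattern):
    """Return all 0-based locations i where pattern[j] <= text[i + j] for every j.

    This is the O(n * m) baseline, not the method developed in the slides.
    """
    n, m = len(text), len(pattern)
    return [
        i
        for i in range(n - m + 1)
        if all(pattern[j] <= text[i + j] for j in range(m))
    ]


# Any symbols with a total order work (integers here).
print(less_than_match_naive([3, 5, 2, 4, 6], [2, 4]))  # [0, 2, 3]
```

Such a baseline is mainly useful for checking the faster algorithms on small inputs.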

  3. Transitivity • The less-than relation is in fact transitive, but that is not enough for us: $a \leq b$ and $c \leq b$ does not imply anything about the relation between $a$ and $c$.

  4. Approach • A good approach for solving Pattern Matching problems is sometimes to solve, in this order: • The problem for a binary alphabet $\Sigma = \{0, 1\}$. • The problem for a bounded alphabet $\Sigma$. • The problem for an unbounded alphabet $\Sigma$.

  5. Binary Alphabet • The only case that prevents a match at location $i$ is the case where, for some $j$: $p_j = 1$ and $t_{i+j-1} = 0$. • This is equivalent to: $p_j \cdot \overline{t_{i+j-1}} = 1$. • So how can we solve this case?

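  6. Binary Alphabet • So if $\sum_{j=1}^{m} p_j \cdot \overline{t_{i+j-1}} > 0$, there is no match at $i$. • We can calculate $\overline{T}$ (the bitwise complement of $T$). • Then we'll calculate $\overline{T} \otimes P^R$ ($P^R$ is $P$ reversed) using FFT. • We'll return all locations $i$ where the result is $0$.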

  7. Example • $P = 0101$, $T = 0101001110$ • $P^R = 1010$, $\overline{T} = 1010110001$
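The binary-alphabet step on this example can be sketched in code as below. This is a minimal illustration, not the slides' own implementation: it assumes a mismatch is a position with pattern bit 1 over text bit 0, uses 0-based locations, and includes a small textbook recursive FFT so the block needs only the standard library. The names `fft`, `convolve`, and `binary_less_than_match` are hypothetical.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def convolve(x, y):
    """Convolution of two integer lists, computed via FFT and rounded."""
    out_len = len(x) + len(y) - 1
    size = 1
    while size < out_len:
        size *= 2
    fx = fft([complex(v) for v in x] + [0j] * (size - len(x)))
    fy = fft([complex(v) for v in y] + [0j] * (size - len(y)))
    prod = fft([a * b for a, b in zip(fx, fy)], invert=True)
    return [round(v.real / size) for v in prod[:out_len]]


def binary_less_than_match(text, pattern):
    """0-based locations where no pattern 1 is aligned with a text 0."""
    n, m = len(text), len(pattern)
    t_bar = [1 - t for t in text]   # complement of the text
    p_rev = pattern[::-1]           # reversed pattern
    c = convolve(t_bar, p_rev)
    # c[i + m - 1] equals the sum over j of t_bar[i + j] * pattern[j],
    # i.e. the number of mismatches when the pattern starts at i.
    return [i for i in range(n - m + 1) if c[i + m - 1] == 0]


T = [int(ch) for ch in "0101001110"]
P = [int(ch) for ch in "0101"]
print(binary_less_than_match(T, P))  # [0, 5]
```

In 1-based terms the pattern matches at locations 1 and 6. A production version would more likely call an optimized FFT library than a recursive one.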

  8. (Figure only; not preserved in the transcript.)

  9. Example (continued) • $P = 0101$, $T = 0101001110$

  10. What just happened? • A recap in terms of $\overline{T}$ and $P^R$ (the expressions are not preserved in the transcript).

  11. Complexity • Time: $O(n \log m)$.

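  12. Bounded Alphabet • We need reductions to the binary alphabet. • For each $\sigma \in \Sigma$ we'll define $T_\sigma$ and $P_\sigma$ by thresholding: $T_\sigma[i] = 1$ if $t_i \geq \sigma$ and $0$ otherwise; $P_\sigma[j] = 1$ if $p_j \geq \sigma$ and $0$ otherwise. • We notice that $T_\sigma, P_\sigma$ are binary.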

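  13. Bounded Alphabet • Theorem: $P$ (less than) matches $T$ at location $i$ if and only if, for every $\sigma \in \Sigma$, $P_\sigma$ (less than) matches $T_\sigma$ at location $i$. • Proof: $P$ does not match $T$ at $i$ iff there is a $j$ with $p_j > t_{i+j-1}$. That is true iff there is a $\sigma$ (for example $\sigma = p_j$) with $P_\sigma[j] = 1$ and $T_\sigma[i+j-1] = 0$, meaning that $P_\sigma$ does not (less than) match $T_\sigma$ at location $i$.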

  14. Bounded Alphabet • So for each $\sigma \in \Sigma$, we'll run the binary alphabet algorithm on $T_\sigma, P_\sigma$. • We'll return only the locations that matched in all iterations. • Time: $O(|\Sigma| \, n \log m)$.
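The reduction to the binary alphabet can be sketched as follows. Two assumptions are made for illustration: the binary strings are built by thresholding (a bit is 1 when the symbol is at least $\sigma$), which is one definition that satisfies the theorem, and the binary step is a direct counting stand-in rather than the FFT-based one. All function names are hypothetical and locations are 0-based.

```python
def binary_less_than_match(text, pattern):
    """Stand-in for the FFT-based binary step: counts (p=1, t=0) pairs directly."""
    n, m = len(text), len(pattern)
    return {
        i
        for i in range(n - m + 1)
        if sum(pattern[j] * (1 - text[i + j]) for j in range(m)) == 0
    }


def bounded_less_than_match(text, pattern):
    """Run the binary algorithm once per symbol; keep locations matching every time."""
    n, m = len(text), len(pattern)
    alphabet = sorted(set(text) | set(pattern))
    surviving = set(range(n - m + 1))
    for sigma in alphabet:
        # Threshold both strings at sigma (assumed definition of T_sigma, P_sigma).
        t_sigma = [1 if t >= sigma else 0 for t in text]
        p_sigma = [1 if p >= sigma else 0 for p in pattern]
        surviving &= binary_less_than_match(t_sigma, p_sigma)
    return sorted(surviving)


print(bounded_less_than_match([3, 5, 2, 4, 6], [2, 4]))  # [0, 2, 3]
```

With the FFT-based binary step substituted in, each iteration costs $O(n \log m)$, which gives the $|\Sigma|$ factor in the running time.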

  15. Problem • This can be worse than the naïve $O(nm)$ algorithm when $|\Sigma|$ is large. • What about an unbounded alphabet? • We present an improvement on the next slides.

  16. Abrahamson-Kosaraju Method • First, use the segment splitting trick. Therefore we can assume $n = 2m$. • For each location $i$ in the text, we'll produce a triplet $\langle \sigma, T, i \rangle$, where $\sigma = t_i$. • For each location $j$ in the pattern, we'll produce a triplet $\langle \sigma, P, j \rangle$, where $\sigma = p_j$. • We now have $n + m = 3m$ triplets all together.

  17. Abrahamson-Kosaraju Method • We'll hold all triplets together. • Sort all triplets according to symbol. • We'll define a symbol that has more than $\sqrt{m}$ triplets as a "frequent symbol". • There are $O(\sqrt{m})$ frequent symbols. • Put all frequent symbols' triplets aside.

  18. Abrahamson-Kosaraju Method • Split the non-frequent symbols' triplets, in sorted order, into groups of size $\Theta(\sqrt{m})$, subject to the rule on the next slide.

  19. Abrahamson-Kosaraju Method • The rule is that there can't be two triplets of the same symbol in different groups.

  20. Abrahamson-Kosaraju Method • For each such group, choose the symbol of the first triplet in the group as the group's representative. • For instance, in the previous example, the representatives of group 1 and group 2 are the symbols of their respective first triplets. • There are $O(\sqrt{m})$ representatives all together.
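The triplet construction, the frequent/non-frequent split, and the choice of representatives can be sketched as below. This is one possible greedy realization under stated assumptions, not necessarily the slides' exact procedure: the frequency threshold and target group size are both taken to be $\lfloor\sqrt{m}\rfloor$, a group is closed only at a symbol boundary once it has reached that size, and the later step of splitting a group around a frequent symbol is not included. The name `build_groups` and the `'T'`/`'P'` tags are hypothetical.

```python
from collections import Counter
from math import isqrt


def build_groups(text, pattern):
    """Return (frequent symbols, groups of non-frequent triplets, representatives)."""
    # One triplet (symbol, origin, 0-based position) per text and pattern location.
    triplets = [(s, "T", i) for i, s in enumerate(text)]
    triplets += [(s, "P", j) for j, s in enumerate(pattern)]
    triplets.sort(key=lambda tr: tr[0])  # sort by symbol only

    threshold = max(1, isqrt(len(pattern)))  # assumed threshold: sqrt(m)
    counts = Counter(sym for sym, _, _ in triplets)
    frequent = sorted(sym for sym, c in counts.items() if c > threshold)

    groups, current = [], []
    for tr in triplets:
        if counts[tr[0]] > threshold:
            continue  # frequent symbols' triplets are put aside
        # Close a full group only when the symbol changes, so that no symbol
        # ends up with triplets in two different groups.
        if current and len(current) >= threshold and current[-1][0] != tr[0]:
            groups.append(current)
            current = []
        current.append(tr)
    if current:
        groups.append(current)

    representatives = [g[0][0] for g in groups]  # symbol of the first triplet
    return frequent, groups, representatives


frequent, groups, reps = build_groups([5, 1, 5, 2, 3, 5, 4, 7], [5, 2, 6, 1])
print(frequent)
print(reps)
```

Because a non-frequent symbol has at most $\sqrt{m}$ triplets, each group produced this way holds fewer than $2\sqrt{m}$ triplets, so the number of groups stays $O(\sqrt{m})$.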

  21. Abrahamson-Kosaraju Method • To sum up: • $O(\sqrt{m})$ frequent symbols. • $O(\sqrt{m})$ representatives of non-frequent symbols. • We'll swap each non-frequent symbol in the pattern and text with its representative. • Now our text and pattern are over an $O(\sqrt{m})$-sized alphabet.

  22. Abrahamson-Kosaraju Method • We want to run our bounded-alphabet algorithm over the new text and pattern to count the mismatches between symbols of different groups. • But we have a problem: • Let's say some symbol is frequent, but its value lies strictly inside the range of symbols covered by a single group (group 2 in the example).

  23. Abrahamson-Kosaraju Method • The representative of group 2 is smaller than the frequent symbol, but the group also contains a symbol that is greater than the frequent symbol.

  24. Abrahamson-Kosaraju Method • In that case we'll split group 2 into two groups with their own representatives. • Since we perform at most $O(\sqrt{m})$ such splits (one per frequent symbol), we still have $O(\sqrt{m})$ representatives.

  25. Abrahamson-Kosaraju Method • We can now run our bounded-alphabet algorithm over the new text and pattern in $O(m\sqrt{m} \log m)$. • But we still haven't handled comparisons between two non-frequent symbols that are in the same group.

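  26. Abrahamson-Kosaraju Method • We'll do so naively in each group: • For each triplet $\langle \sigma, T, i \rangle$ in the group • For each triplet of the form $\langle \sigma', P, j \rangle$ in the group, if $\sigma' > \sigma$, then add an error at location $i - j + 1$. • Time: $O(\sqrt{m})$ groups with $O(\sqrt{m} \cdot \sqrt{m})$ pairs each, so $O(m\sqrt{m})$.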
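The naive in-group correction can be sketched as follows. It assumes triplets are stored as `(symbol, origin, position)` tuples with origin `'T'` or `'P'` and 0-based positions, so the affected location is `i - j` rather than $i - j + 1$. The name `in_group_errors` and its parameters are hypothetical.

```python
def in_group_errors(group, n_locations):
    """Count, per pattern start location, mismatches between symbols of one group.

    group: list of (symbol, origin, position) triplets from a single group.
    n_locations: number of candidate start locations, i.e. n - m + 1.
    """
    errors = [0] * n_locations
    text_triplets = [tr for tr in group if tr[1] == "T"]
    pattern_triplets = [tr for tr in group if tr[1] == "P"]
    for sym_t, _, i in text_triplets:
        for sym_p, _, j in pattern_triplets:
            loc = i - j  # start location that aligns pattern[j] with text[i]
            # A pattern symbol larger than the text symbol under it is a mismatch.
            if sym_p > sym_t and 0 <= loc < n_locations:
                errors[loc] += 1
    return errors


# Text symbol 3 at position 4 and pattern symbol 4 at position 1 share a group.
print(in_group_errors([(3, "T", 4), (4, "P", 1)], 5))  # [0, 0, 0, 1, 0]
```

These per-location counts would be added to the counts from the run over representatives; a location is reported as a match only when its total is zero.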

  27. Running Time • For one segment: • Sorting the triplets and representatives: $O(m \log m)$. • Running the algorithm: $O(m\sqrt{m} \log m)$. • Correcting results (adding in-group errors): $O(m\sqrt{m})$. • Overall for one segment: $O(m\sqrt{m} \log m)$. • Overall for all $O(n/m)$ segments: $O(n\sqrt{m} \log m)$.

  28. Running Time • We can improve to $O(n\sqrt{m \log m})$. • Left as an exercise.