Reverse Factor Algorithm. Speeding up two string-matching algorithms, Algorithmica, Vol. 12, 1994, pp. 247-267. CROCHEMORE, M., CZUMAJ, A., GASIENIEC, L., JAROMINEK, S., LECROQ, T., PLANDOWSKI, W. and RYTTER, W. Advisor: Prof. R. C. T. Lee. Speaker: L. C. Chen.
Rule 1: The Suffix to Prefix Rule • For a window to have any chance of matching the pattern, there must be a suffix of the window that is equal to a prefix of the pattern.
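Basic Ideas • Open a window W of size |P| in the text T. • Find the longest suffix of W that is also a prefix of the pattern P. Case 1: The whole window W is equal to P. Match! [Figure: window W of length |P| in T, aligned with P.]
Case 2: A proper suffix of W of length k is a prefix of P. Move W by |P| - k so that this suffix is aligned with the prefix of P. [Figure: W in T before and after the shift.] Case 3: If there is no such suffix, move W by its full length |P|. [Figure: W moved |P| positions to the right in T.]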
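The three cases can be sketched in Python as follows. This is a minimal illustration that finds the suffix by brute-force comparison rather than with the suffix automaton. The function name `window_shift` and the variable `k` are assumptions made for this example and do not come from the slides.

```python
from typing import Tuple


def window_shift(window: str, pattern: str) -> Tuple[bool, int]:
    """Return (matched, shift) for one window of length len(pattern).

    Brute-force stand-in for the automaton scan: it compares suffixes of
    the window against prefixes of the pattern directly.
    """
    m = len(pattern)
    matched = window == pattern  # Case 1: the whole window equals the pattern

    # k = length of the longest *proper* suffix of the window that is
    # also a prefix of the pattern (0 if there is none).
    k = 0
    for length in range(m - 1, 0, -1):
        if window[-length:] == pattern[:length]:
            k = length
            break

    # Case 2 (k > 0): shift by m - k to align the suffix with the prefix.
    # Case 3 (k == 0): no such suffix, so shift by the full length m.
    return matched, m - k


print(window_shift("GCATCGCA", "GCAGAGAG"))  # (False, 5)
print(window_shift("GCAGAGAG", "GCAGAGAG"))  # (True, 7)
print(window_shift("TTTTTTTT", "GCAGAGAG"))  # (False, 8)
```

Only proper suffixes are used for the shift, so a matching window still moves forward by at least one position.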
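Preprocessing phase • T = GCATCGCAGAGAGTATACAGTACG • P = GCAGAGAG • We construct the suffix automaton S of P^R, the reversal of P. • L(S): the language accepted by S, that is, the set of all suffixes of P^R. These are exactly the prefixes of the pattern read from right to left. [Figure: Suffix Automaton, with states labeled 0 to 8 and transitions on A, C and G.]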
Preprocessing: Construct a Suffix Tree • P^R: the reversal of P. [Figure: suffix tree of P^R, with labels 1 to 8.]
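As a rough illustration of the preprocessing step, the sketch below builds a suffix automaton for the reversed pattern using the standard online construction with suffix links. It is not necessarily the construction shown in the slides, and all names (`build_suffix_automaton`, `walk`, `is_factor`, `is_suffix`) are assumptions made for this example.

```python
def build_suffix_automaton(s):
    """Build the suffix automaton of s; return (transitions, terminal_states)."""
    nxt = [{}]      # nxt[state][char] -> target state
    link = [-1]     # suffix links
    length = [0]    # length of the longest string reaching each state
    last = 0
    for ch in s:
        cur = len(nxt)
        nxt.append({})
        link.append(-1)
        length.append(length[last] + 1)
        p = last
        while p != -1 and ch not in nxt[p]:
            nxt[p][ch] = cur
            p = link[p]
        if p == -1:
            link[cur] = 0
        else:
            q = nxt[p][ch]
            if length[p] + 1 == length[q]:
                link[cur] = q
            else:
                # Split state q by cloning it with a shorter length.
                clone = len(nxt)
                nxt.append(dict(nxt[q]))
                link.append(link[q])
                length.append(length[p] + 1)
                while p != -1 and nxt[p].get(ch) == q:
                    nxt[p][ch] = clone
                    p = link[p]
                link[q] = clone
                link[cur] = clone
        last = cur
    # Terminal states lie on the suffix-link path starting at the last state.
    terminal = set()
    p = last
    while p != -1:
        terminal.add(p)
        p = link[p]
    return nxt, terminal


def walk(nxt, word):
    """Follow word from the initial state; return the state reached or None."""
    state = 0
    for ch in word:
        state = nxt[state].get(ch)
        if state is None:
            return None
    return state


def is_factor(automaton, word):
    """True if word is a substring of the string the automaton was built from."""
    return walk(automaton[0], word) is not None


def is_suffix(automaton, word):
    """True if word is a suffix of the string the automaton was built from."""
    state = walk(automaton[0], word)
    return state is not None and state in automaton[1]


pattern = "GCAGAGAG"
automaton = build_suffix_automaton(pattern[::-1])  # automaton of P^R
print(is_suffix(automaton, "ACG"))  # True: "GCA" is a prefix of P
print(is_factor(automaton, "GAG"))  # True: "GAG" occurs in P
print(is_suffix(automaton, "GAG"))  # False: "GAG" is not a prefix of P
```

Reading a window from right to left and feeding the characters to this automaton tells us, at each step, whether the scanned suffix still occurs in P (a transition exists) and whether it is a prefix of P (the state is terminal).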
• Find the longest suffix of W that is also a prefix of the pattern. [Figures: T aligned with P.]
A Whole Example • T = GCATCGCAGAGAGTATACAGTACG • P = GCAGAGAG • First attempt: the window is GCATCGCA. Its longest suffix that is a prefix of P is GCA (length 3). Shift by: 5 (8 - 3)
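Second attempt: the window is GCAGAGAG, which matches P. The longest proper suffix of the window that is a prefix of P is G (length 1). Shift by: 7 (8 - 1)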
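Third attempt: the window is GTATACAG. Its longest suffix that is a prefix of P is G (length 1). Shift by: 7 (8 - 1). After this shift the window would run past the end of T, so the search ends.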
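The example can be replayed with a short Python sketch. To keep it small, this version replaces the suffix automaton with Python's built-in substring test (`in`) and `str.startswith`, so it illustrates the window scan and the shifts, not the O(m) preprocessing. The function name `reverse_factor_search` and the returned list of shifts are assumptions made for this illustration.

```python
def reverse_factor_search(text, pattern):
    """Return (match_positions, shifts), scanning each window right to left.

    Simplification: the test `suffix in pattern` stands in for following
    a transition of the suffix automaton of the reversed pattern.
    """
    m, n = len(pattern), len(text)
    matches, shifts = [], []
    pos = 0
    while pos <= n - m:
        window = text[pos:pos + m]
        k = 0  # longest proper suffix of the window that is a prefix of pattern
        j = 1  # length of the suffix currently being examined
        # Extend the suffix to the left while it still occurs in the pattern.
        while j <= m and window[m - j:] in pattern:
            if pattern.startswith(window[m - j:]):
                if j == m:
                    matches.append(pos)  # the whole window equals the pattern
                else:
                    k = j
            j += 1
        shift = m - k
        shifts.append(shift)
        pos += shift
    return matches, shifts


T = "GCATCGCAGAGAGTATACAGTACG"
P = "GCAGAGAG"
matches, shifts = reverse_factor_search(T, P)
print(matches)  # [5]  (0-based start of the match)
print(shifts)   # [5, 7, 7]
```

The three shifts agree with the three attempts above, and the single match is found on the second attempt.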
Conclusion • The preprocessing phase takes O(m) time, where m = |P|. • The searching phase takes O(mn) time in the worst case, where n = |T|.