This presentation covers property testing of regular tree languages with efficient probabilistic algorithms. It describes ε-testers for regular languages of words and of ranked trees under the edit distance with moves. It reviews the history of testers, from self-testers and correctors for linear algebra and robust characterizations of polynomials to testers for graph properties. The main results are that regular word languages and regular ranked-tree languages are testable, with applications to the verification of XML files.
Property testing of Tree Regular Languages
Frédéric Magniez, LRI, CNRS
Michel de Rougemont, LRI, University Paris II
Property testing of Tree Regular Languages
1. Tester for regular words with the Edit Distance with Moves
2. Tester for ranked regular trees with the Tree-Edit Distance with Moves
Testers on a class K
Let F be a property on a class K of structures U. An ε-tester for F is a probabilistic algorithm A such that:
• If U |= F, A accepts
• If U is ε-far from F, A rejects with high probability
• Time(A) is independent of n
(Goldreich, Goldwasser, Ron 1996; Rubinfeld, Sudan 1994)
A tester usually implies a linear-time corrector.
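The two acceptance conditions can be written as a formula. This is only a sketch under assumptions that the slide does not state: n is taken to be the size of U, dist(U, F) is the minimal number of elementary operations turning U into a structure satisfying F, and the threshold 2/3 stands in for "with high probability".

```latex
\begin{align*}
U \models F &\;\Longrightarrow\; \Pr[\,A(U)\ \text{accepts}\,] = 1,\\
\mathrm{dist}(U, F) > \varepsilon\, n &\;\Longrightarrow\; \Pr[\,A(U)\ \text{rejects}\,] \ge 2/3 .
\end{align*}
```

Any constant above 1/2 would serve in place of 2/3, since repeating the test a constant number of times amplifies the rejection probability.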
History of Testers
• Self-testers and correctors for linear algebra, Blum & Kannan 1989
• Robust characterizations of polynomials, R. Rubinfeld, M. Sudan, 1994
• Testers for graph properties: k-colorability, Goldreich et al. 1996
• Graph properties have testers, Alon et al. 1999
• Regular languages have testers, Alon et al. 2000s
• Testers for regular tree languages, Magniez and de Rougemont, ICALP 2004
Edit distance on Words
1. Classical Edit Distance: Insertions, Deletions, Modifications
2. Edit Distance with Moves: 0111000011110011001 → 0111011110000011001 (one move of the block 1111)
3. Edit Distance with Moves generalizes to Trees
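The gap between the two distances can be checked on the slide's pair of words. The sketch below is illustrative only: `levenshtein` and `move` are hypothetical helper names, and a move is assumed to mean cutting one contiguous block and re-inserting it elsewhere at unit cost.

```python
def levenshtein(u: str, v: str) -> int:
    """Classical edit distance: insertions, deletions, modifications."""
    prev = list(range(len(v) + 1))
    for i, a in enumerate(u, start=1):
        cur = [i]
        for j, b in enumerate(v, start=1):
            cur.append(min(prev[j] + 1,               # delete a
                           cur[j - 1] + 1,            # insert b
                           prev[j - 1] + (a != b)))   # modify a into b
        prev = cur
    return prev[-1]


def move(w: str, start: int, length: int, dest: int) -> str:
    """Cut w[start:start+length] and re-insert it at index dest of the remainder."""
    block = w[start:start + length]
    rest = w[:start] + w[start + length:]
    return rest[:dest] + block + rest[dest:]


W1 = "0111000011110011001"
W2 = "0111011110000011001"

# Moving the block 1111 three positions to the left turns W1 into W2.
moved = move(W1, start=8, length=4, dest=5)
print(moved == W2)
print(levenshtein(W1, W2))
```

A single move suffices here, whereas the classical distance is at least 2: the words have the same length and differ in more than one position.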
Testers on words
Simpler proof which generalizes to regular trees.
L is a regular language and A an automaton for L.
Admissible Z = …
A word W is Z-feasible if there are two states …
[Figure: automaton A with its init and accept states]
The Tester
Tester. Input: W, A, ε
For every admissible path Z: … else REJECT.
Theorem: Tester(W, A, ε) is an ε-tester for L(A).
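The body of the tester is not recoverable from the slide, so the following is not the authors' algorithm. It is a coarser sampling test written under explicit assumptions: it ignores admissible paths Z, it calls a subword feasible when it labels a run through states that lie on some accepting run of A, and the sample count and subword length (`4 / eps`, `2 / eps`) are placeholder constants. All function names are hypothetical. It illustrates only the one-sided behaviour (words of L(A) are never rejected) with a number of samples independent of n.

```python
import random


def useful_states(delta, init, accepting):
    """States reachable from init from which an accepting state is reachable."""
    def closure(seeds, edges):
        seen, todo = set(seeds), list(seeds)
        while todo:
            q = todo.pop()
            for r in edges.get(q, ()):
                if r not in seen:
                    seen.add(r)
                    todo.append(r)
        return seen

    fwd, bwd = {}, {}
    for (q, _letter), r in delta.items():
        fwd.setdefault(q, set()).add(r)
        bwd.setdefault(r, set()).add(q)
    return closure({init}, fwd) & closure(set(accepting), bwd)


def feasible(u, delta, states):
    """True if u labels a run that stays inside `states` (assumed notion)."""
    current = set(states)
    for a in u:
        current = {delta[(q, a)] for q in current if (q, a) in delta} & states
        if not current:
            return False
    return True


def sketch_tester(w, delta, init, accepting, eps, rng=random):
    """Sample subwords of w; reject as soon as one is infeasible."""
    if not w:
        return init in accepting
    states = useful_states(delta, init, accepting)
    length = max(1, int(2 / eps))    # placeholder subword length
    samples = max(1, int(4 / eps))   # placeholder sample count, independent of len(w)
    for _ in range(samples):
        i = rng.randrange(len(w))
        if not feasible(w[i:i + length], delta, states):
            return False             # REJECT
    return True                      # ACCEPT


# Partial DFA for (ab)*: state 0 is initial and accepting.
DELTA = {(0, "a"): 1, (1, "b"): 0}
print(sketch_tester("ab" * 50, DELTA, 0, {0}, eps=0.1))
```

Every subword of a word in L(A) labels part of an accepting run, so the sketch always accepts such words. Rejecting every ε-far word needs the finer analysis over admissible paths that the slides refer to.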
Proof schema of the Tester
Theorem: Regular words are testable.
• Robustness lemma: If W is ε-far from L, then for every admissible path Z, there exists … such that the number of Z-infeasible subwords …
• Splitting lemma: If W is far from L, there are many disjoint infeasible subwords.
• Amplifying lemma: If there are many infeasible subwords, there are many short ones.
Merging
Merging lemma: Let Z be an admissible path, and let F be a Z-feasible cut of size h’. Then …
[Figure: a word cut into pieces labelled by components C]
• Take each word and split it along its connected components, removing single letters.
• Rearrange all the words of the same component in its Z-order.
• Add gluing words to obtain W’ in L.
Splitting
Splitting lemma: If Z is an admissible path and W a word such that dist(W, L) > h, then W has …
Proof by contraposition: …
Tree-Edit-Distance
• Deletion: edge
• Insertion: node and label
• Tree edit distance with moves (the example trees differ by 1 move)
[Figure: example trees with nodes a to f illustrating a deletion, an insertion and a move]
The distance problem is NP-complete and non-approximable.
Tree-Edit-Distance on binary trees
Binary trees: distance with moves allows permutations.
Distance(T1, T2) = 4
m-Distance(T1, T2) = 2
Tree automata
Transitions:
• (q0, q0) → q1
• (q0, q1) → q1
• (q1, q1) → q2
• (q1, q0) → q2
• (q2, -) → q2
• (-, q2) → q2
[Figure: a binary tree whose nodes are labelled by the states q0, q1, q2 of a run]
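The transition table above can be run bottom-up on a binary tree. The sketch rests on assumptions the slide does not spell out: leaves are assigned q0, "-" is read as "any state", q2 is taken as the accepting state, and trees carry no labels. The names `step`, `run` and `accepts` are hypothetical.

```python
# A binary tree is either None (a leaf) or a pair (left, right).

TABLE = {
    ("q0", "q0"): "q1",
    ("q0", "q1"): "q1",
    ("q1", "q0"): "q2",
    ("q1", "q1"): "q2",
}


def step(left, right):
    """State of a node given the states of its two children."""
    if left == "q2" or right == "q2":
        return "q2"            # rules (q2, -) -> q2 and (-, q2) -> q2
    return TABLE[(left, right)]


def run(tree):
    """Bottom-up run; leaves are assumed to start in q0."""
    if tree is None:
        return "q0"
    left, right = tree
    return step(run(left), run(right))


def accepts(tree, accepting=frozenset({"q2"})):
    """Assumption: q2 is the only accepting state."""
    return run(tree) in accepting


cherry = (None, None)                 # a node with two leaves
print(run(cherry))                    # (q0, q0) -> q1
print(run((cherry, cherry)))          # (q1, q1) -> q2
print(accepts((None, cherry)))        # (q0, q1) -> q1, not accepting
```

With these rules, once q2 appears at some node it propagates to the root, so the run is deterministic and defined on every binary tree.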
Infeasible subtrees
Fact. If … then the number of infeasible subtrees of constant size is O(n).
Tester for regular Trees
Tester. Input: T, A, ε
Theorem: Tester(T, A, ε) is an ε-tester for L(A).
Proof schema of the Tester
Theorem: Regular trees are testable.
• Robustness lemma: If T is ε-far from L, then for every admissible path Z, there exists … such that the number of Z-infeasible i-subtrees …
• Splitting lemma: If T is far from L, there are many disjoint infeasible subtrees.
• Amplifying lemma: If there are many infeasible subtrees, there are many small ones.
Splitting and Merging
• Splitting and Merging on words [Figure: a word cut into pieces labelled C]
• Splitting and Merging on trees
Splitting and Merging trees
[Figure: a tree split into connected components labelled C, D and E, and the corrected tree]
Conclusion
• Verification is hard.
• Approximate verification can be feasible.
• Testers and correctors for regular words
• Tester for regular trees
• Corrector for regular trees
• Unranked trees: XML files
• Applications: constant-time algorithm for Edit Distance with Moves (Fischer, Magniez, de Rougemont)