Learning from Text

Presentation Transcript


  1. Learning from Text • Colin de la Higuera, University of Nantes • Zadar, August 2010

  2. Acknowledgements • Laurent Miclet, Jose Oncina, Tim Oates, Anne-Muriel Arigon, Leo Becerra-Bonache, Rafael Carrasco, Paco Casacuberta, Pierre Dupont, Rémi Eyraud, Philippe Ezequel, Henning Fernau, Jean-Christophe Janodet, Satoshi Kobayashi, Thierry Murgue, Frédéric Tantini, Franck Thollard, Enrique Vidal, Menno van Zaanen, ... • http://pagesperso.lina.univ-nantes.fr/~cdlh/ • http://videolectures.net/colin_de_la_higuera/

  3. Outline • Motivations, definition and difficulties • Some negative results • Learning k-testable languages from text • Learning k-reversible languages from text • Conclusions • http://pagesperso.lina.univ-nantes.fr/~cdlh/slides/ (Chapters 8 and 11)

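  4. 1 Identification in the limit • [Diagram: a class of languages $\mathcal{L}$; a class of grammars $\mathcal{G}$; the naming function $\mathbb{L}: \mathcal{G} \to \mathcal{L}$; presentations $\mathrm{Pres} \subseteq \mathbb{N} \to X$ with $\mathrm{yields}: \mathrm{Pres} \to \mathcal{L}$; a learner $a$] • $\mathbb{L}(a(\varphi)) = \mathrm{yields}(\varphi)$ • $\varphi(\mathbb{N}) = \psi(\mathbb{N}) \Rightarrow \mathrm{yields}(\varphi) = \mathrm{yields}(\psi)$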

  5. Learning from text • Only positive examples are available • Danger of over-generalization: why not return $\Sigma^*$? • The problem is "basic": • Negative examples might not be available • Or they might be heavily biased: near-misses, absurd examples… • Baseline: all the rest is learning with help

  6. GI as a search problem • [Figure: the search space, starting from the PTA]

  7. Questions? • Data is unlabelled… • Is this a clustering problem? • Is this a problem posed in other settings?

  8. 2 The theory • Gold 67: No super-finite class can be identified from positive examples (or text) only • Necessary and sufficient conditions for learning • Literature: • inductive inference, • ALT series, …

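  9. Limit point • A class $\mathcal{L}$ of languages has a limit point if there exists an infinite sequence $(L_n)_{n\in\mathbb{N}}$ of languages in $\mathcal{L}$ such that $L_0 \subsetneq L_1 \subsetneq \dots \subsetneq L_n \subsetneq \dots$, and there exists another language $L \in \mathcal{L}$ such that $L = \bigcup_{n\in\mathbb{N}} L_n$ • $L$ is called a limit point of $\mathcal{L}$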

  10. $L$ is a limit point • [Figure: nested languages $L_0, L_1, L_2, L_3, \dots, L_i, \dots$ inside $L$]

  11. Theorem • If $\mathcal{L}$ admits a limit point, then $\mathcal{L}$ is not learnable from text • Proof: Let $s^i$ be a presentation in length-lex order for $L_i$, and $s$ be a presentation in length-lex order for $L$. Then $\forall n\in\mathbb{N}\ \exists i : \forall k \le n,\ s^i_k = s_k$ • Note: having a limit point is a sufficient condition for non-learnability, not a necessary one
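As an illustration that is not in the slides, here is a standard example of a limit point, assuming the one-letter alphabet $\Sigma=\{a\}$ and a class $\mathcal{L}$ that contains every $L_n$ below together with $a^*$:

```latex
\[
L_n = \{a^i : 0 \le i \le n\}, \qquad
L_0 \subsetneq L_1 \subsetneq \dots \subsetneq L_n \subsetneq \dots, \qquad
L = \bigcup_{n\in\mathbb{N}} L_n = a^{*} .
\]
```

Every finite prefix of a presentation of $a^*$ is also a prefix of a presentation of some $L_n$, so a learner that converges on $a^*$ can be misled on $L_n$. This is the situation the theorem rules out.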

  12. Mincons classes • A class is mincons if there is an algorithm which, given a sample $S$, builds a $G \in \mathcal{G}$ such that $S \subseteq L \subseteq L(G) \Rightarrow L = L(G)$ • I.e. there is a unique minimum (for inclusion) consistent grammar

  13. Accumulation point (Kapur 91) • A class $\mathcal{L}$ of languages has an accumulation point if there exists an infinite sequence $(S_n)_{n\in\mathbb{N}}$ of sets such that $S_0 \subseteq S_1 \subseteq \dots \subseteq S_n \subseteq \dots$, and $L = \bigcup_{n\in\mathbb{N}} S_n \in \mathcal{L}$, and for any $n\in\mathbb{N}$ there exists a language $L_n'$ in $\mathcal{L}$ such that $S_n \subseteq L_n' \subsetneq L$ • The language $L$ is called an accumulation point of $\mathcal{L}$

  14. $L$ is an accumulation point • [Figure: nested sets $S_0, S_1, S_2, S_3, \dots, S_n$ inside $L$, with $L_n'$ lying between $S_n$ and $L$]

  15. Theorem (for mincons classes) • $\mathcal{L}$ admits an accumulation point iff $\mathcal{L}$ is not learnable from text

  16. Infinite elasticity • If a class of languages has a limit point, there exists an infinite ascending chain of languages $L_0 \subsetneq L_1 \subsetneq \dots \subsetneq L_n \subsetneq \dots$ • This property is called infinite elasticity

  17. Infinite elasticity • [Figure: points $x_0, x_1, x_2, x_3, \dots, x_i, x_{i+1}, x_{i+2}, x_{i+3}, x_{i+4}$ spread over the ascending chain of languages]

  18. Finite elasticity • $\mathcal{L}$ has finite elasticity if it does not have infinite elasticity

  19. Theorem (Wright) • If $\mathcal{L}(\mathcal{G})$ has finite elasticity and is mincons, then $\mathcal{G}$ is learnable.

  20. Tell-tale sets • [Figure: the tell-tale set $T_G$ with elements $x_1, x_2, x_3, x_4$ inside $L(G)$; a language $L(G')$ is marked as forbidden]

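  21. Theorem (Angluin) • $\mathcal{G}$ is learnable iff there is a computable partial function $\psi: \mathcal{G}\times\mathbb{N} \to \Sigma^*$ such that: • $\forall n\in\mathbb{N}$, $\psi(G,n)$ is defined iff $G\in\mathcal{G}$ and $L(G)\neq\emptyset$ • $\forall G\in\mathcal{G}$, $T_G=\{\psi(G,n): n\in\mathbb{N}\}$ is a finite subset of $L(G)$ called a tell-tale subset • $\forall G,G'\in\mathcal{G}$, if $T_G \subseteq L(G')$ then $L(G') \not\subset L(G)$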

  22. Proposition (Kapur 91) • A language $L$ in $\mathcal{L}$ has a tell-tale subset iff $L$ is not an accumulation point (for mincons classes)

  23. Summarizing • Many alternative ways of proving that identification in the limit is not feasible • Methodological-philosophical discussion • We still need practical solutions

  24. 3 Learning k-testable languages • P. García and E. Vidal. Inference of K-testable languages in the strict sense and applications to syntactic pattern recognition. Pattern Analysis and Machine Intelligence, 12(9):920–925, 1990 • P. García, E. Vidal, and J. Oncina. Learning locally testable languages in the strict sense. In Workshop on Algorithmic Learning Theory (ALT 90), pages 325–338, 1990

  25. Definition • Let $k>0$; a k-testable language in the strict sense (k-TSS) is defined by a 5-tuple $Z_k=(\Sigma, I, F, T, C)$ with: • $\Sigma$ a finite alphabet • $I, F \subseteq \Sigma^{k-1}$ (allowed prefixes of length $k-1$ and suffixes of length $k-1$) • $T \subseteq \Sigma^{k}$ (allowed segments) • $C \subseteq \Sigma^{<k}$ (the allowed strings of length less than $k$) • Note that $I \cap F = C \cap \Sigma^{k-1}$

  26. The k-testable language is $L(Z_k) = \big((I\Sigma^* \cap \Sigma^*F) \setminus \Sigma^*(\Sigma^k \setminus T)\Sigma^*\big) \cup C$ • Strings (of length at least $k$) have to use a good prefix and a good suffix of length $k-1$, and all substrings of length $k$ have to belong to $T$. Strings of length less than $k$ should be in $C$ • Or: $\Sigma^k \setminus T$ defines the prohibited segments • Key idea: use a window of size $k$

  27. An example (2-testable) • $I=\{a\}$, $F=\{a\}$, $T=\{aa, ab, ba\}$, $C=\{\lambda, a\}$ • [Figure: the corresponding automaton over $\{a, b\}$]

  28. Window language • By sliding a window of size 2 over a string we can parse: • ababaaababababaaaab: OK • aaabbaaaababab: not OK
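The window idea can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `windows_ok` and `accepts` are hypothetical, `""` stands for the empty string $\lambda$, and `accepts` assumes that $I \cap F = C \cap \Sigma^{k-1}$ holds, as noted in the definition. The sliding window covers the segment condition only; prefixes, suffixes and short strings are checked separately.

```python
def windows_ok(w, T, k):
    """True if every length-k window of w belongs to the allowed segments T."""
    return all(w[i:i + k] in T for i in range(len(w) - k + 1))


def accepts(w, I, F, T, C, k):
    """Membership in L(Z_k), assuming I ∩ F = C ∩ Σ^(k-1)."""
    if len(w) < k:
        # Short strings are listed explicitly in C.
        return w in C
    prefix = w[:k - 1]
    suffix = w[len(w) - (k - 1):]  # written this way so that k = 1 gives ""
    return prefix in I and suffix in F and windows_ok(w, T, k)


# The 2-testable example from the slides.
I, F, T, C = {"a"}, {"a"}, {"aa", "ab", "ba"}, {"", "a"}
print(windows_ok("ababaaababababaaaab", T, 2))
print(windows_ok("aaabbaaaababab", T, 2))
print(accepts("aba", I, F, T, C, 2))
```

The second string fails the window check because it contains the segment bb, which is not in $T$.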

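  29. The hierarchy of k-TSS languages • $k\text{-TSS}(\Sigma)=\{L \subseteq \Sigma^*: L \text{ is k-TSS}\}$ • All finite languages are in $k\text{-TSS}(\Sigma)$ if $k$ is large enough! • $k\text{-TSS}(\Sigma) \subset [k+1]\text{-TSS}(\Sigma)$ • $(ba^k)^* \in [k+1]\text{-TSS}(\Sigma)$ • $(ba^k)^* \notin k\text{-TSS}(\Sigma)$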

  30. A language that is not k-testable • [Figure: an automaton over $\{a, b\}$]

  31. k-TSS inference • Given a sample $S$, $a_{k\text{-TSS}}(S) = Z_k$ where $Z_k=(\Sigma(S), I(S), F(S), T(S), C(S))$ and • $\Sigma(S)$ is the alphabet used in $S$ • $C(S)=\Sigma(S)^{<k} \cap S$ • $I(S)=\Sigma(S)^{k-1} \cap \mathrm{Pref}(S)$ • $F(S)=\Sigma(S)^{k-1} \cap \mathrm{Suff}(S)$ • $T(S)=\Sigma(S)^{k} \cap \{v: uvw \in S\}$

  32. Example • $S=\{a, aa, abba, abbbba\}$ • Let $k=3$ • $\Sigma(S)=\{a, b\}$ • $I(S)=\{aa, ab\}$ • $F(S)=\{aa, ba\}$ • $C(S)=\{a, aa\}$ • $T(S)=\{abb, bbb, bba\}$ • $L(a_{3\text{-TSS}}(S)) = a + aa + abbb^*a$ (the string $aba$ is excluded, since $aba \notin T(S)$)
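The inference step amounts to collecting short strings, prefixes, suffixes and segments from the sample. Below is a minimal Python sketch of that computation; the function name `infer_ktss` and the use of Python sets of strings are assumptions made for illustration, not the authors' implementation.

```python
def infer_ktss(sample, k):
    """Return (alphabet, I, F, T, C) computed from a finite sample of strings."""
    alphabet = {ch for w in sample for ch in w}
    # C(S): strings of the sample shorter than k.
    C = {w for w in sample if len(w) < k}
    # I(S) and F(S): prefixes and suffixes of length k-1 of sample strings.
    I = {w[:k - 1] for w in sample if len(w) >= k - 1}
    F = {w[len(w) - (k - 1):] for w in sample if len(w) >= k - 1}
    # T(S): every substring of length k occurring in the sample.
    T = {w[i:i + k] for w in sample for i in range(len(w) - k + 1)}
    return alphabet, I, F, T, C


alphabet, I, F, T, C = infer_ktss({"a", "aa", "abba", "abbbba"}, 3)
print(sorted(I), sorted(F), sorted(T), sorted(C))
```

On the sample of the example with $k=3$, this reproduces the sets $I(S)$, $F(S)$, $T(S)$ and $C(S)$ listed on the slide.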

  33. Building the corresponding automaton • Each string in $I \cup C$ and $\mathrm{Pref}(I \cup C)$ is a state • Each substring of length $k-1$ of strings in $T$ is a state • $\lambda$ is the initial state • Add a transition labeled $b$ from $u$ to $ub$ for each state $ub$ • Add a transition labeled $b$ from $au$ to $ub$ for each $aub$ in $T$ • Each state/substring that is in $F$ is a final state • Each state/substring that is in $C$ is a final state

  34. Running the algorithm • $S=\{a, aa, abba, abbbba\}$ • $I=\{aa, ab\}$, $F=\{aa, ba\}$, $T=\{abb, bbb, bba\}$, $C=\{a, aa\}$ • [Figure: the resulting automaton, with states $\lambda$, $a$, $aa$, $ab$, $bb$, $ba$ and transitions labeled $a$ and $b$]
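The construction rules above can be sketched as follows. This is a hedged illustration: the names `build_automaton` and `run` are hypothetical, states are represented by the strings that name them, `""` stands for $\lambda$, and the transition table is a plain dictionary.

```python
def build_automaton(I, F, T, C):
    """Return (states, initial, finals, delta) with delta[(state, symbol)] = state."""
    # Prefix states: every prefix (including "") of a string in I ∪ C.
    prefix_states = {w[:i] for w in I | C for i in range(len(w) + 1)}
    prefix_states.add("")  # the initial state always exists
    delta = {}
    # Transition u --b--> ub between prefix states.
    for s in prefix_states:
        if s:
            delta[(s[:-1], s[-1])] = s
    # Transition au --b--> ub for every segment aub in T.
    for seg in T:
        delta[(seg[:-1], seg[-1])] = seg[1:]
    states = prefix_states | set(delta.values()) | {q for q, _ in delta}
    finals = states & (F | C)
    return states, "", finals, delta


def run(automaton, w):
    """Run the deterministic automaton on w and report acceptance."""
    _, q, finals, delta = automaton
    for ch in w:
        if (q, ch) not in delta:
            return False
        q = delta[(q, ch)]
    return q in finals


A = build_automaton({"aa", "ab"}, {"aa", "ba"}, {"abb", "bbb", "bba"}, {"a", "aa"})
print(sorted(A[0]))
```

For the running example this yields the six states shown in the figure; the string aba is rejected because no segment aba belongs to $T$.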

  35. Properties (1) • $S \subseteq L(a_{k\text{-TSS}}(S))$ • $L(a_{k\text{-TSS}}(S))$ is the smallest k-TSS language that contains $S$ • If there is a smaller one, some prefix, suffix or substring has to be absent

  36. Properties (2) • $a_{k\text{-TSS}}$ identifies any k-TSS language in the limit from polynomial data • Once all the prefixes, suffixes and substrings have been seen, the correct automaton is returned • If $Y \subseteq S$, $L(a_{k\text{-TSS}}(Y)) \subseteq L(a_{k\text{-TSS}}(S))$

  37. Properties (3) • $L(a_{(k+1)\text{-TSS}}(S)) \subseteq L(a_{k\text{-TSS}}(S))$ • In $I_{k+1}$ (resp. $F_{k+1}$ and $T_{k+1}$) there are fewer allowed prefixes (resp. suffixes or substrings) than in $I_k$ (resp. $F_k$ and $T_k$) • $\forall k > \max_{x\in S}|x|$, $L(a_{k\text{-TSS}}(S)) = S$ • Because for a large $k$, $T_k(S)=\emptyset$
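These properties can be checked empirically on a small sample by enumerating all strings up to a bounded length. The sketch below is only a bounded sanity check, not a proof; the helper names (`infer_ktss`, `accepts`, `language_up_to`) and the length bound are assumptions chosen for illustration.

```python
from itertools import product


def infer_ktss(sample, k):
    """Compute (I, F, T, C) from a finite sample."""
    I = {w[:k - 1] for w in sample if len(w) >= k - 1}
    F = {w[len(w) - (k - 1):] for w in sample if len(w) >= k - 1}
    T = {w[i:i + k] for w in sample for i in range(len(w) - k + 1)}
    C = {w for w in sample if len(w) < k}
    return I, F, T, C


def accepts(w, Z, k):
    """Membership test for the k-TSS language described by Z = (I, F, T, C)."""
    I, F, T, C = Z
    if len(w) < k:
        return w in C
    windows = (w[i:i + k] for i in range(len(w) - k + 1))
    return (w[:k - 1] in I and w[len(w) - (k - 1):] in F
            and all(x in T for x in windows))


def language_up_to(sample, k, max_len):
    """All strings of length <= max_len in the language inferred from sample."""
    alphabet = sorted({ch for w in sample for ch in w})
    Z = infer_ktss(sample, k)
    words = ("".join(p) for n in range(max_len + 1)
             for p in product(alphabet, repeat=n))
    return {w for w in words if accepts(w, Z, k)}


S = {"a", "aa", "abba", "abbbba"}
# The inferred languages shrink as k grows, and equal S once k exceeds 6.
print([len(language_up_to(S, k, 8)) for k in range(1, 8)])
```

With $k=7$, larger than the longest string of the sample, the enumerated language is exactly $S$.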

  38. 4 Learning k-reversible languages from text • D. Angluin. Inference of reversible languages. Journal of the Association for Computing Machinery, 29(3):741–765, 1982

  39. The k-reversible languages • The class was proposed by Angluin (1982) • The class is identifiable in the limit from text • The class is composed of the regular languages that can be accepted by a DFA whose reverse is deterministic with a look-ahead of $k$

  40. Let $A=(\Sigma, Q, \delta, I, F)$ be an NFA; we denote by $A^T=(\Sigma, Q, \delta^T, F, I)$ the reversal automaton, with: $\delta^T(q,a)=\{q'\in Q: q\in\delta(q',a)\}$
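A minimal Python sketch of the reversal, assuming the transition function is stored as a dictionary mapping `(state, symbol)` to a set of states; the function name `reverse_nfa` and this representation are illustrative choices, not taken from the slides.

```python
def reverse_nfa(alphabet, states, delta, initials, finals):
    """Return the reversal automaton; delta maps (state, symbol) to a set of states."""
    delta_t = {}
    for (q, a), targets in delta.items():
        for target in targets:
            # q --a--> target in A becomes target --a--> q in A^T.
            delta_t.setdefault((target, a), set()).add(q)
    # Initial and final states swap roles.
    return alphabet, states, delta_t, finals, initials


# A small hypothetical automaton (not the one drawn on the slides).
delta = {(0, "a"): {1}, (1, "b"): {1, 2}}
print(reverse_nfa({"a", "b"}, {0, 1, 2}, delta, {0}, {2})[2])
```

Reversing twice gives back the original transition table, which is a quick sanity check of the definition.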

  41. [Figure: an automaton $A$ with states 0 to 4 over $\{a, b\}$, and its reversal $A^T$]

  42. Some definitions • $u$ is a k-successor of $q$ if $|u|=k$ and $\delta(q,u)\neq\emptyset$ • $u$ is a k-predecessor of $q$ if $|u|=k$ and $\delta^T(q,u^T)\neq\emptyset$ • $\lambda$ is 0-successor and 0-predecessor of any state

  43. [Figure: the automaton $A$, states 0 to 4] • $aa$ is a 2-successor of 0 and 1 but not of 3 • $a$ is a 1-successor of 3 • $aa$ is a 2-predecessor of 3 but not of 1
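The sets of k-successors and k-predecessors of a state can be computed by walking $k$ steps forward or backward. The sketch below uses a small hypothetical automaton, since the transitions of the automaton in the figure cannot be read from this transcript; the function names and the dictionary representation are assumptions.

```python
def k_successors(delta, q, k):
    """All strings u with |u| = k and delta(q, u) non-empty."""
    frontier = {("", q)}
    for _ in range(k):
        frontier = {(u + a, r)
                    for (u, p) in frontier
                    for (p2, a), targets in delta.items() if p2 == p
                    for r in targets}
    return {u for u, _ in frontier}


def k_predecessors(delta, q, k):
    """All strings u with |u| = k that label some path ending in q."""
    frontier = {("", q)}
    for _ in range(k):
        frontier = {(a + u, p)
                    for (u, r) in frontier
                    for (p, a), targets in delta.items() if r in targets}
    return {u for u, _ in frontier}


# Hypothetical automaton: 0 -a-> 1, 1 -a-> 2, 1 -b-> 1, 2 -b-> 0.
delta = {(0, "a"): {1}, (1, "a"): {2}, (1, "b"): {1}, (2, "b"): {0}}
print(k_successors(delta, 0, 2), k_predecessors(delta, 2, 2))
```

With $k=0$ both functions return the set containing only the empty string, matching the remark that $\lambda$ is 0-successor and 0-predecessor of any state.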

  44. An NFA is deterministic with look-ahead $k$ if $\forall q,q'\in Q$ with $q\neq q'$: $\big[(q,q'\in I) \lor (q,q'\in\delta(q'',a))\big] \land (u \text{ is a k-successor of } q) \land (v \text{ is a k-successor of } q') \Rightarrow u\neq v$

  45. Prohibited: • [Figure: two distinct states 1 and 2, both reached by $a$ from the same state, that share a k-successor $u$ with $|u|=k$]

  46. Example • [Figure: an automaton with states 0 to 4 over $\{a, b\}$] • This automaton is not deterministic with look-ahead 1 but is deterministic with look-ahead 2
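The look-ahead condition says that two distinct competing states (both initial, or both reached from the same state by the same symbol) must have disjoint sets of k-successors. A minimal sketch of that check follows; the automaton is a hypothetical one built to show the same behaviour as the example, not the automaton of the figure, and the function names are illustrative.

```python
from itertools import combinations


def k_successors(delta, q, k):
    """All strings u with |u| = k and delta(q, u) non-empty."""
    frontier = {("", q)}
    for _ in range(k):
        frontier = {(u + a, r)
                    for (u, p) in frontier
                    for (p2, a), targets in delta.items() if p2 == p
                    for r in targets}
    return {u for u, _ in frontier}


def is_deterministic_with_lookahead(delta, initials, k):
    """Competing states must not share any k-successor."""
    competing = [set(initials)] + [set(t) for t in delta.values()]
    for group in competing:
        for q, q2 in combinations(group, 2):
            if k_successors(delta, q, k) & k_successors(delta, q2, k):
                return False
    return True


# Hypothetical NFA: states 1 and 2 are both reached from 0 by a.
delta = {(0, "a"): {1, 2}, (1, "a"): {3}, (2, "a"): {4},
         (3, "a"): {3}, (4, "b"): {4}}
print(is_deterministic_with_lookahead(delta, {0}, 1))
print(is_deterministic_with_lookahead(delta, {0}, 2))
```

Here states 1 and 2 both have $a$ as a 1-successor, but their 2-successors ($aa$ and $ab$) differ, so one extra symbol of look-ahead resolves the choice.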

  47. k-reversible automata • $A$ is k-reversible if $A$ is deterministic and $A^T$ is deterministic with look-ahead $k$ • Example: [Figure: two automata with states 0, 1, 2 over $\{a, b\}$, one labeled "deterministic", the other "deterministic with look-ahead 1"]

  48. Notations • $RL(\Sigma, k)$ is the set of all k-reversible languages over alphabet $\Sigma$ • $RL(\Sigma)$ is the set of all k-reversible languages over alphabet $\Sigma$ (i.e. for all values of $k$) • $a_{k\text{-RL}}$ is the learning algorithm we describe

  49. Properties • There are some regular languages that are not in $RL(\Sigma)$ • $RL(\Sigma,k) \supsetneq RL(\Sigma,k-1)$

  50. Violation of k-reversibility • Two states $q, q'$ violate the k-reversibility condition if • they violate the deterministic condition: $q,q'\in\delta(q'',a)$, or • they violate the look-ahead condition: • $q,q'\in F$, $\exists u\in\Sigma^k$: $u$ is k-predecessor of both $q$ and $q'$ • $\exists u\in\Sigma^k$, $\delta(q,a)=\delta(q',a)$ and $u$ is k-predecessor of both $q$ and $q'$
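The violation conditions can be turned into a search for offending pairs of states. The sketch below is an illustration under stated assumptions: the names `k_predecessors` and `violating_pairs` are hypothetical, transitions are stored as sets of target states, and the condition $\delta(q,a)=\delta(q',a)$ is read as "$q$ and $q'$ share an $a$-successor", which coincides with equality when the automaton is deterministic.

```python
from itertools import combinations


def k_predecessors(delta, q, k):
    """All strings u with |u| = k that label some path ending in q."""
    frontier = {("", q)}
    for _ in range(k):
        frontier = {(a + u, p)
                    for (u, r) in frontier
                    for (p, a), targets in delta.items() if r in targets}
    return {u for u, _ in frontier}


def violating_pairs(states, delta, finals, k):
    """Pairs of distinct states that break determinism or the look-ahead condition."""
    pairs = set()
    # Determinism: two states reached from the same state by the same symbol.
    for targets in delta.values():
        pairs.update(frozenset(p) for p in combinations(targets, 2))
    symbols = {a for (_, a) in delta}
    for q, q2 in combinations(states, 2):
        if not (k_predecessors(delta, q, k) & k_predecessors(delta, q2, k)):
            continue  # no common k-predecessor, so no look-ahead violation
        both_final = q in finals and q2 in finals
        # Assumption: "same successor" means the two target sets intersect.
        same_target = any(delta.get((q, a), set()) & delta.get((q2, a), set())
                          for a in symbols)
        if both_final or same_target:
            pairs.add(frozenset((q, q2)))
    return pairs


# Hypothetical prefix tree for the sample {ab, bb}; states 2 and 4 are final.
delta = {(0, "a"): {1}, (1, "b"): {2}, (0, "b"): {3}, (3, "b"): {4}}
print(violating_pairs({0, 1, 2, 3, 4}, delta, {2, 4}, 1))
```

In this hypothetical tree the two final states share the 1-predecessor $b$, so they violate 1-reversibility; with $k=2$ their predecessors ($ab$ and $bb$) differ and no pair is reported.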
