Properties of Context-Free Languages

Properties of Context-Free Languages. Juan Carlos Guzmán CS 6413 Theory of Computation Southern Polytechnic State University. Summary. Normal Forms Pumping Lemma Closure Properties Decision Properties. Normal Forms. Recall that many different grammars generate the same language


Presentation Transcript


  1. Properties of Context-Free Languages Juan Carlos Guzmán CS 6413 Theory of Computation Southern Polytechnic State University

  2. Summary • Normal Forms • Pumping Lemma • Closure Properties • Decision Properties

  3. Normal Forms • Recall that many different grammars generate the same language • We would like to restrict the form of the productions of the CFG • Chomsky Normal Form • Greibach Normal Form • Tasks to accomplish • Eliminate useless symbols • Eliminate ε-productions • Eliminate unit productions

  4. Grammar Transformations • We are about to present a series of transformations on grammars • You should consider each of them as a “transformation function” T: Grammar → Grammar

  5. Elimination of Useless Symbols • Let G = (V,T,P,S) • X ∈ V is useful if there exist α, β, and w ∈ T* such that S ⇒* αXβ ⇒* w • Two considerations: • X must generate strings • X ⇒* v for some v ∈ T* • X must be reachable from S • S ⇒* αXβ

  6. Elimination of Non-Generating Symbols (Tg) • Let G = (V,T,P,S) be a CFG • G’ = (V’ ∪ {S},T,P’,S), where • V’ is the least set such that V’ = { A | (A → α) ∈ P ∧ α ∈ (T ∪ V’)* } • P’ = { (A → α) | (A → α) ∈ P ∧ A ∈ V’ ∧ α ∈ (T ∪ V’)* } • G’ contains only generating symbols

  7. Example • G = ({S,A,B,C},{a,b},P,S), where • P = { S → a | A, A → AB | BCA | a, B → b, C → ACA | BCB } • V’ = {S,A,B} • G’ = ({S,A,B},{a,b},P’,S), where • P’ = { S → a | A, A → AB | a, B → b }
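
The fixed-point definition of V’ can be sketched in Python. This is a minimal illustration, not the author's code: the grammar encoding (a dict from each variable to a list of right-hand sides, each a tuple of symbols) and the function names are assumptions made here. Unlike the slide's G’, the sketch does not force the start symbol into the result.

```python
def generating_symbols(productions, terminals):
    """Return the variables that derive at least one terminal string."""
    generating = set()
    changed = True
    while changed:  # iterate until no new variable is added (least fixed point)
        changed = False
        for head, bodies in productions.items():
            if head in generating:
                continue
            for body in bodies:
                if all(s in terminals or s in generating for s in body):
                    generating.add(head)
                    changed = True
                    break
    return generating


def remove_non_generating(productions, terminals):
    """Drop non-generating variables and every production that mentions one."""
    keep = generating_symbols(productions, terminals)
    return {
        head: [b for b in bodies if all(s in terminals or s in keep for s in b)]
        for head, bodies in productions.items()
        if head in keep
    }


# Grammar of the example slide: C never derives a terminal string.
P = {
    "S": [("a",), ("A",)],
    "A": [("A", "B"), ("B", "C", "A"), ("a",)],
    "B": [("b",)],
    "C": [("A", "C", "A"), ("B", "C", "B")],
}
print(sorted(generating_symbols(P, {"a", "b"})))
print(remove_non_generating(P, {"a", "b"}))
```

The production A → BCA disappears together with C, because its right-hand side mentions a non-generating variable.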

  8. Elimination of Non-Reachable Symbols (Tr) • Let G = (V,T,P,S) be a CFG • G’ = (V’,T,P’,S), where • V’ = {S} ∪ { B | (A → αBβ) ∈ P ∧ A ∈ V’ } • P’ = { (A → α) | (A → α) ∈ P ∧ A ∈ V’ } • G’ contains only reachable symbols

  9. Example • G = ({S,A,B,C},{a,b},P,S), where • P = { S → a | A, A → AB | a, B → b, C → ACA | BCB } • V’ = {S,A,B} • G’ = ({S,A,B},{a,b},P’,S), where • P’ = { S → a | A, A → AB | a, B → b }
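
Reachability is a graph search from S. The following is a rough sketch under an assumed encoding (a dict from variable to a list of right-hand-side tuples, with the dict keys serving as the variable set); the function names are hypothetical.

```python
def reachable_symbols(productions, start):
    """Return the variables reachable from the start symbol."""
    reachable = {start}
    worklist = [start]
    while worklist:
        head = worklist.pop()
        for body in productions.get(head, []):
            for symbol in body:
                # Only variables (keys of the dict) are tracked here.
                if symbol in productions and symbol not in reachable:
                    reachable.add(symbol)
                    worklist.append(symbol)
    return reachable


def remove_unreachable(productions, start):
    """Keep only the productions whose head is reachable from start."""
    keep = reachable_symbols(productions, start)
    return {h: list(bodies) for h, bodies in productions.items() if h in keep}


# Grammar of the example slide: nothing reachable from S mentions C.
P = {
    "S": [("a",), ("A",)],
    "A": [("A", "B"), ("a",)],
    "B": [("b",)],
    "C": [("A", "C", "A"), ("B", "C", "B")],
}
print(sorted(reachable_symbols(P, "S")))
print(remove_unreachable(P, "S"))
```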

  10. Useful Symbols • Remove • non-generating symbols • non-reachable symbols

  11. Elimination of ε-Productions (Tε) • Let G = (V,T,P,S) be a CFG • Vε is the least set such that Vε = { A | (A → α) ∈ P ∧ α ∈ Vε* } (the nullable variables) • G’ = (V,T,P’,S), where • P’ = { A → α0X1α1…Xkαk | (A → α0B1α1…Bkαk) ∈ P ∧ for all 1 ≤ i ≤ k, Bi ∈ Vε ∧ Xi ∈ {ε, Bi} ∧ for all 0 ≤ i ≤ k, αi ∈ (T ∪ (V − Vε))* ∧ |α0X1α1…Xkαk| > 0 } • G’ does not contain ε-productions and generates L(G) − {ε}

  12. Example • G = ({S},{a,b},P,S), where • P = { S → aSbS | bSaS | ε } • Vε = {S} • G’ = ({S},{a,b},P’,S), where • P’ = { S → aSbS | aSb | abS | ab | bSaS | bSa | baS | ba } • Note that G’ does not generate ε
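
The construction of P’ amounts to keeping or dropping each nullable occurrence independently and discarding the empty result. A small sketch follows, assuming a dict-of-tuples grammar encoding in which ε is the empty tuple; the names are illustrative only.

```python
from itertools import product


def nullable_variables(productions):
    """Return the variables that derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            # An empty body () is trivially "all nullable".
            if head not in nullable and any(
                all(s in nullable for s in body) for body in bodies
            ):
                nullable.add(head)
                changed = True
    return nullable


def remove_epsilon(productions):
    """Build productions without ε-bodies that generate L(G) minus {ε}."""
    nullable = nullable_variables(productions)
    result = {}
    for head, bodies in productions.items():
        new_bodies = []
        for body in bodies:
            # Each nullable occurrence may be kept (s,) or dropped ().
            options = [((s,), ()) if s in nullable else ((s,),) for s in body]
            for choice in product(*options):
                candidate = tuple(s for part in choice for s in part)
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result


# Grammar of the example slide: S -> aSbS | bSaS | ε
P = {"S": [("a", "S", "b", "S"), ("b", "S", "a", "S"), ()]}
print(["".join(body) for body in remove_epsilon(P)["S"]])
```

The number of generated bodies is exponential in the number of nullable occurrences per right-hand side, which is one reason to shorten right-hand sides before this step in practice.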

  13. Elimination of Unit Productions (Tu) • Let G = (V,T,P,S) be a CFG • Let Up be the least set such that Up = { (A,A) | A ∈ V } ∪ { (A,C) | (A,B) ∈ Up ∧ (B → C) ∈ P ∧ C ∈ V } • G’ = (V,T,P’,S), where • P’ = { A → α | (A,B) ∈ Up ∧ (B → α) ∈ P ∧ α ∉ V } • G’ does not contain unit productions and generates L(G)

  14. Example • G = ({E,T,F},{+,*,(,),a},P,E), where • P = { E → E+T | T, T → T*F | F, F → a | (E) } • Up = {(E,E),(E,T),(E,F),(T,T),(T,F),(F,F)} • G’ = ({E,T,F},{+,*,(,),a},P’,E), where • P’ = { E → E+T | T*F | a | (E), T → T*F | a | (E), F → a | (E) }
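
The unit-pair set Up and the resulting P’ can be computed as below. This is only a sketch: the dict-of-tuples grammar encoding and the function names are assumptions, and a body counts as a unit production when it is a single symbol that is also a key of the dict.

```python
def unit_pairs(productions):
    """Return all pairs (A, B) such that A derives B using unit productions only."""
    pairs = {(a, a) for a in productions}
    changed = True
    while changed:
        changed = False
        for a, b in list(pairs):
            for body in productions.get(b, []):
                if len(body) == 1 and body[0] in productions:
                    pair = (a, body[0])
                    if pair not in pairs:
                        pairs.add(pair)
                        changed = True
    return pairs


def remove_unit(productions):
    """For every unit pair (A, B), give A each non-unit body of B."""
    result = {a: [] for a in productions}
    for a, b in sorted(unit_pairs(productions)):
        for body in productions[b]:
            is_unit = len(body) == 1 and body[0] in productions
            if not is_unit and body not in result[a]:
                result[a].append(body)
    return result


# Expression grammar of the example slide.
P = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("a",), ("(", "E", ")")],
}
print(sorted(unit_pairs(P)))
print(remove_unit(P))
```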

  15. Summary of Transformations • Given a CFG G, we can obtain a new grammar G’ such that • no ε-productions • no unit productions • no useless symbols • by transforming the original grammar in this order: Tr ∘ Tg ∘ Tu ∘ Tε (function composition, so Tε is applied first and Tr last)

  16. Results of the Transformations • After the transformations • the grammars do not have useless symbols (and associated productions) • their productions (A → α) are not • ε-productions • unit productions • Therefore, α must satisfy • |α| > 1, or • α ∈ T

  17. Implications for Transformed Grammars • Transformed grammars have some nice properties • No unit productions • No ε-productions • However, they produce “bushy” trees

  18. Chomsky Normal Form • Any CFG whose language does not contain ε can be transformed so that each of its productions is of the form • A → BC, where A, B, C ∈ V • A → a, where A ∈ V ∧ a ∈ T • The idea behind CNF is to obtain grammars whose parse trees are binary trees

  19. Chomsky Normal Form • Productions of grammars not yet in CNF, but already transformed, are of the following forms • A → X1…Xk, k > 1, all Xi ∈ T ∪ V, or • A → a, a ∈ T • We need to further transform the first kind of productions so that • the right-hand side consists only of variables, and • long right-hand sides are broken into chains of productions

  20. Chomsky Normal Form • Transformations • For every terminal a that appears on a RHS of length 2 or more • Create a new variable A with the production A → a • Replace a in all such productions with A • Replace every production A → B1…Bk (k > 2) with • A → B1C1 • C1 → B2C2 • … • Ck−2 → Bk−1Bk • where C1, …, Ck−2 are new variables
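
The two steps above can be sketched as follows, assuming the input grammar already has no ε-productions and no unit productions. The encoding (dict from variable to a list of right-hand-side tuples) and the fresh-variable names `T_a` and `C1`, `C2`, … are hypothetical and are assumed not to clash with existing variables.

```python
def to_cnf(productions, terminals):
    """Convert a grammar without ε- and unit productions to Chomsky Normal Form."""
    result = {}
    term_var = {}  # terminal -> fresh variable standing for it
    counter = 0    # numbering for the fresh chain variables C1, C2, ...

    def var_for(terminal):
        if terminal not in term_var:
            term_var[terminal] = f"T_{terminal}"
        return term_var[terminal]

    for head, bodies in productions.items():
        for body in bodies:
            # Step 1: in bodies of length >= 2, replace terminals by variables.
            if len(body) >= 2:
                body = tuple(var_for(s) if s in terminals else s for s in body)
            # Step 2: break bodies longer than 2 into a chain of binary productions.
            current = head
            while len(body) > 2:
                counter += 1
                fresh = f"C{counter}"
                result.setdefault(current, []).append((body[0], fresh))
                current, body = fresh, body[1:]
            result.setdefault(current, []).append(body)

    # Productions for the variables introduced in step 1.
    for terminal, variable in term_var.items():
        result.setdefault(variable, []).append((terminal,))
    return result


# Illustrative input (not from the slides): S -> aSb | ab
cnf = to_cnf({"S": [("a", "S", "b"), ("a", "b")]}, {"a", "b"})
for head, bodies in cnf.items():
    print(head, "->", " | ".join(" ".join(b) for b in bodies))
```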

  21. Example

  22. Greibach Normal Form • All productions must be of the form • A → aB1…Bk, k ≥ 0 • Note that each derivation step is associated with the generation of a terminal • This translates nicely to PDA’s where each movement of the automaton will be guided by the recognition of an input character • To convert to GNF • Order the variables (A1 … An) • Modify the production set so that • Ai → Ajγ implies that i ≤ j • remove left recursion, i.e., Ai → Ajγ implies that i < j • Ai → aγ • Ai → aγ, γ ∈ V* • The algorithm resembles matrix triangularization • It appears in the 1st edition of our book

  23. Relation Between Height and Yield of a CNF Parse Tree • Note that tree nodes of grammars in CNF are • binary nodes for productions (A → BC) • unit terminal nodes for productions (A → a) • The yield of a complete CNF parse tree of height n is of size 2^(n−1) or less • [Figure: parse tree rooted at S of height n; the top n−1 levels are binary, giving at most 2^(n−1) variables at the bottom, each producing one terminal of the yield a1 a2 a3 … at]

  24. Pumping Lemma • Let L be a context-free language. Then there exists a constant n (which depends on L) such that for every string z in L such that |z| ≥ n, we can break z into five strings, z = uvwxy, such that: • |vwx| ≤ n • vx ≠ ε • For all i ≥ 0, the string uv^i wx^i y is also in L

  25. Pumping Lemma • In plain words • For any context-free language • Words of large size will contain a substring • Somewhere in the middle • Not null, not too big • That substring can itself be broken into three pieces vwx • v not null or x not null • v and x can be “pumped” (together) over and over again • The new words are guaranteed in the language • How large the words must be in order to be considered “large” depends on the actual language

  26. Pumping Lemma – Proof • Find a CNF for the language • The size of the word relates to the height of the tree • [Figure: a longest root-to-leaf path A0, A1, A2, …, Ak ending in the terminal a]

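  27. Pumping Lemma – Proof • Find a CNF for the language • For large words, a variable must be repeated • Note: Ai = Aj, i < j • [Figure: parse tree rooted at S with the repeated variable Ai = Aj on one path; the yield is split into u v w x y, with w generated by the lower occurrence and vwx by the upper one]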

  28. Related Strings • The strings • uwy • uvvwxxy • uv^i wx^i y, for any i ≥ 0 • are also in the language
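
To make the pumped strings concrete, here is a small check on the language { a^k b^k | k ≥ 0 }, chosen here purely as an illustration (it is not taken from the slides). The decomposition of z is picked by hand; the helper names are hypothetical.

```python
def in_language(s):
    """Membership test for { a^k b^k | k >= 0 }."""
    half = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * half + "b" * half


def pump(u, v, w, x, y, i):
    """Return u v^i w x^i y."""
    return u + v * i + w + x * i + y


z = "aaabbb"
# A decomposition that pumps one 'a' together with one 'b'.
u, v, w, x, y = "aa", "a", "", "b", "bb"
assert u + v + w + x + y == z

for i in range(4):
    s = pump(u, v, w, x, y, i)
    print(i, s, in_language(s))

# Pumping only the a's (x empty) leaves the language for i != 1.
print(in_language(pump("aa", "a", "", "", "bbb", 2)))
```

The lemma only promises that some decomposition works for every long enough string; the last line shows that an arbitrary one need not.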

  29. How about ε? • If the language contains ε • The transformations remove ε from the grammar • Therefore you get a different language!!! • CNF is not defined for languages with ε • If a language contains ε • A new grammar can be given, which generates the same language • ε will be generated in one derivation • All other productions comply with CNF

  30. Closure Properties • Context-free languages are closed under • Substitution • Regular Operators • Homomorphism • Reversal • Intersection with regular language • Inverse homomorphism

  31. Substitution • A substitution is an operation which replaces characters with strings • These strings are pulled from a particular language

  32. Substitution: Formally • Let Σ be an alphabet • Let La be a language associated with each a ∈ Σ • s(a) = La • s(a1a2…an) = s(a1)s(a2)…s(an) = La1La2…Lan • s(L) = ∪ { s(w) | w ∈ L }

  33. Substitution • CFL’s are closed under substitution with CFL’s • Let G = (V,Σ,P,S), such that L(G) = L • Let Ga = (Va,Ta,Pa,Sa), such that L(Ga) = La, for each a ∈ Σ (with all variable sets renamed to be pairwise disjoint) • Let G’ = (V’,T’,P’,S) where • V’ = V ∪ (∪a∈Σ Va) • T’ = ∪a∈Σ Ta • P’ = (∪a∈Σ Pa) ∪ P’’, where • P’’ is all productions of P, where each terminal a was replaced by the corresponding Sa • G’ generates s(L)

  34. Example • G = ({S},{0,1},P,S), where • P = { S → SS | 0S1 | ε } • L0 = {(} • L1 = {)} • Or • L0 = 0* • L1 = 1*
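
The grammar construction for substitution can be sketched for the first choice, L0 = {(} and L1 = {)}. The encoding (dict from variable to right-hand-side tuples, ε as the empty tuple), the function name, and the variable names `S0` and `S1` are assumptions made for this illustration; variable sets are assumed disjoint.

```python
def substitute(productions, sub_grammars):
    """Build P' for s(L).

    sub_grammars maps each terminal a to a pair (Sa, Pa): the start symbol
    and the productions of a grammar for La.
    """
    result = {}
    # P'': every terminal a in the original productions becomes Sa.
    for head, bodies in productions.items():
        result[head] = [
            tuple(sub_grammars[s][0] if s in sub_grammars else s for s in body)
            for body in bodies
        ]
    # Add the productions Pa of every substituted grammar.
    for _start, prods in sub_grammars.values():
        for head, bodies in prods.items():
            result.setdefault(head, []).extend(bodies)
    return result


G = {"S": [("S", "S"), ("0", "S", "1"), ()]}
subs = {
    "0": ("S0", {"S0": [("(",)]}),  # L0 = { ( }
    "1": ("S1", {"S1": [(")",)]}),  # L1 = { ) }
}
print(substitute(G, subs))
```

With this choice the resulting grammar generates the balanced-parentheses strings, the image of L(G) under s.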

  35. Closure Under Regular Operators • CFL’s are closed under • Union • Concatenation • Closure (*), and positive closure(+)

  36. Closure Under Homomorphism • CFL’s are closed under homomorphism • This is a special case of substitution • Substitution with a single string

  37. Reversal • CFL’s are closed under reversal • Just reverse the right-hand side of every production
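
Reversing a grammar is a one-liner under a dict-of-tuples encoding (an assumption of this sketch, as is the function name). The sample grammar is illustrative and not from the slides.

```python
def reverse_grammar(productions):
    """Reverse the right-hand side of every production."""
    return {
        head: [tuple(reversed(body)) for body in bodies]
        for head, bodies in productions.items()
    }


# S -> aSb | ab generates a^k b^k; the reversed grammar generates b^k a^k.
G = {"S": [("a", "S", "b"), ("a", "b")]}
print(reverse_grammar(G))
```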

  38. Intersection with a Regular Language • CFL’s are not closed under intersection • They are closed under intersection with a regular language
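
A standard counterexample for plain intersection, given here as an illustration rather than as the slide's own example: both languages below are context-free, but their intersection is the classic non-context-free language that the pumping lemma rules out.

```latex
L_1 = \{\, a^n b^n c^m \mid n, m \ge 1 \,\}, \qquad
L_2 = \{\, a^m b^n c^n \mid n, m \ge 1 \,\}, \qquad
L_1 \cap L_2 = \{\, a^n b^n c^n \mid n \ge 1 \,\}
```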

  39. Inverse Homomorphism • CFL’s are closed under inverse homomorphism

  40. Decision Properties of CFL’s • Complexity to transform grammars to PDA’s, and within PDA’s • Complexity of transformation to CNF • Testing Emptiness of CFL’s • Testing Membership in a CFL
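
Membership can be tested with the CYK dynamic-programming algorithm once the grammar is in CNF; the slide does not name an algorithm, so CYK is a choice made for this sketch. The grammar encoding and the hand-written CNF grammar for { a^k b^k | k ≥ 1 } are assumptions for illustration.

```python
def cyk(productions, start, word):
    """Return True if the CNF grammar derives word (cubic time in len(word))."""
    n = len(word)
    if n == 0:
        return False  # a pure CNF grammar cannot generate the empty string
    # table[i][l] holds the variables that derive word[i:i+l].
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        for head, bodies in productions.items():
            if (ch,) in bodies:
                table[i][1].add(head)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for head, bodies in productions.items():
                    for body in bodies:
                        if len(body) == 2 and body[0] in left and body[1] in right:
                            table[i][length].add(head)
    return start in table[0][n]


# Hand-written CNF grammar for { a^k b^k | k >= 1 }.
G = {
    "S": [("A", "C"), ("A", "B")],
    "C": [("S", "B")],
    "A": [("a",)],
    "B": [("b",)],
}
for w in ["ab", "aabb", "aab", "abab"]:
    print(w, cyk(G, "S", w))
```

Emptiness, by contrast, needs no parsing: L(G) is empty exactly when the start symbol is not a generating symbol.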

  41. Undecidable Problems • Is a given CFG G ambiguous? • Is a given CFL L inherently ambiguous? • Is the intersection of two CFL’s empty? • Are two CFL’s the same? • Is a given CFL equal to Σ*, where Σ is the alphabet of the language?
