1 / 26

Chomsky Normal Form

Chomsky Normal Form. Chomsky Normal Form (CNF). Definition If a CFG has only productions of the form Non-Terminal  string of exactly two non-terminals OR Non-Terminal  one terminal It is said to be in Chomsky Normal Form. Chomsky Normal Form (CNF).

dmcguire
Télécharger la présentation

Chomsky Normal Form

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Chomsky Normal Form

  2. Chomsky Normal Form (CNF) Definition If a CFG has only productions of the form Non-Terminal  string of exactly two non-terminals OR Non-Terminal  one terminal It is said to be in Chomsky Normal Form.

  3. Chomsky Normal Form (CNF) • It is to be noted that any CFG can be converted to be in CNF, if the null productions and unit productions are removed. • Also if a CFG contains nullable productions as well, then the corresponding new production are also to be added.

  4. Theorem 25 • If L is a language generated by some CFG, then there exist another CFG that generate all the non-null words of L, all of whose productions are of one of the two basic forms: Non-Terminal  string of only Non-terminals Or Non-Terminal  one Terminal

  5. Proof • The proof is a constructive algorithm by converting a given CFG in a CFG of the production of designed format. • Let us suppose in given CFG the non-terminal are S, X1, X2, X3, … . Let us also assume {a,b} are two terminals. STEP-1 We now add two new non-terminals A and B and the productions A  a B  b

  6. Proof Theorem 25 contd… STEP-2 now for every previous production involving terminals, we replace each ‘a’ with the non-terminal A and each ‘b’ with B. e.g. X3 X4aX1SbbX7a Becomes X3  X4AX1SBBX7A Which is a string of solid non-terminals X6  abaab Becomes X6 ABAAB

  7. S  X1 | X2aX2 | aSb | b X1  X2X2 | b X2  aX2 | aaX2 S  X1 | X2AX2 | ASB | B X1  X2X2 | B X2  AX2 | AAX2 A  a B  b Example

  8. Example • Right sides of other production may contain terminals and non-terminals e.g. N  aM | NaMb • We need to have string of solid non-terminal on right side we introduce now products as X a Y  b And apply them to all rules getting N  XM | NXMY

  9. Proof contd. • Introduce new non-terminals to restrict the length to 2 N  XM X  a N  NR1 Y  b R1 XR2 R2  MY

  10. Theorem 26 Theorem 26: For any context-free language L, the non-Λ words of L can be generated by a grammar in which all productions are in CNF.

  11. Proof of Theorem 26 • The proof will be by constructive algorithm. • From Theorems 23 and 24, we know that there is a CFG for L (or for all L except Λ) that has no Λ-productions and no unit productions. • Suppose further that we have made the CFG fit the form specified in Theorem 25. • For productions of the form Nonterminal  one terminal we leave them alone. • For each production of the form Nonterminal  string of Nonterminals we use the following expansion that involves new nonterminals R1, R2, ...

  12. Proof of Theorem 26 contd. • For instance, the production S  X1X2X3X8 should be replaced by S  X1R1 where R1 X2R2 and where R2 X3X8. • Thus, we convert a production with long string of non-terminals into a sequence of productions with exactly two non-terminals. • If we need to convert another production, we introduce new R’s with different subscripts. • The new grammar generates the same language as the old grammar with the possible exception of the word Λ.

  13. Remarks • Theorem 26 has enabled a form of algebra of productions by introducing non-terminals R1, R2, R3 … • The derivation tree associated with a derivation in Chomsky Normal Form grammar is a binary tree.

  14. Example S  aSa | bSb | a | b | aa | bb S  AR2 R2  SA S  BR1 R1 SB S  AA S  BB S  a S  b A  a B  b S  ASA S  BSB S  AA S  BB

  15. Example Consider the following CFG S → ABAB A → a|Λ B → b|Λ Here S → ABAB is nullable production and A → Λ, B → Λ are null productions. Removing the null productions A → Λ and B → Λ, and introducing the new productions as S → BAB|AAB|ABB|ABA|AA|AB|BA|BB|A|B Now S → A|B are unit productions to be eliminated as shown below S → A gives S → a (using A → a) S → B gives S → b (using B → b)

  16. Example Thus the new resultant CFG takes the form S → BAB|AAB|ABB|ABA|AA|AB|BA|BB|a|b, A → a, B → b. Introduce the nonterminal C where C → AB, so that S → BAB|AAB|ABB|ABA|AA|AB|BA|BB|a|b S → BC|AC|CB|CA|AA|C|BA|BB|a|b A → a B → b C → AB is the CFG in CNF.

  17. Leftmost Derivation Definition: • The leftmost non-terminal in a working string is the first non-terminal that we encounter when we scan the string from left to right. Example: In the string abNbaXYa, the leftmost non terminal is N. Definition: • If a word w is generated by a CFG such that at each step in the derivation, a production is applied to the leftmost non-terminal in the working string, then this derivation is called a leftmost derivation.

  18. Example: Left derivation S  aSX | b X  Xb | a aababa S  aSX  aaSXX  aabXX  aabXbX  aababX  aababa

  19. Parsing Definition: the process of finding the derivation of word generated by particular grammar is called parsing. There are different parsing techniques, containing the following three • Top down parsing. • Bottom up parsing. • Parsing technique for particular grammar of arithmetic expression.

  20. Top-Down Parsing • The parse tree is created top to bottom. • Top-down parser • Recursive-Descent Parsing • Backtracking is needed (If a choice of a production rule does not work, we backtrack to try other alternatives.) • It is a general parsing technique, but not widely used. • Not efficient • Predictive Parsing • no backtracking • efficient • needs a special form of grammars (LL(1) grammars). • Recursive Predictive Parsing is a special form of Recursive Descent parsing without backtracking. • Non-Recursive (Table Driven) Predictive Parser is also known as LL(1) parser.

  21. Recursive-Descent Parsing (uses Backtracking) • Backtracking is needed. • It tries to find the left-most derivation. S  aBc B  bc | b S S input: abc a Bc a B c b c b fails, backtrack

  22. Left recursion • Any productions of the form A  Aα A  BA where B*ε Are called left recursive productions

  23. Left Recursion Removal • Consider the grammar A  Aa | b • Halting condition of top-down parsing depend upon the generation of terminal prefixes to discover dead ends. • Repeated application of above rule fail to generate a prefix that can terminate the parse.

  24. Removing left recursion • For each rule of the form • A -> A α1 | A α2| ………| A αn | β 1 |β 2 | …β n • A is a left-recursive nonterminal. • α is a sequence of nonterminals and terminals. • β is a sequence of nonterminals and terminals that does not start with A. • Replace the A-production by the production: • A -> β 1 A’ |….| β n A’ • And create a new nonterminal • A’ -> Λ| α1 A’ |….|αn A’

  25. Removing left recursion • To remove left recursion from A, the A rules are divided into two groups. Left recursive and others A  AU1 | AU2 | AU3 | ….| AUj A  V1 | V2 | V3 | … | Vk Solution: A  V1 | V2 | V3 | … | Vk| V1Z | V2Z | … | VkZ Z  u1Z | u2Z | u3Z | … | ujZ | u1 | u2 | u3 | … | uj

  26. A  Aa | b Solution: A  bZ | b Z  aZ | a A  Aa | Ab | b | c A  bZ | cZ | b | c Z  aZ | bZ | a | b Example

More Related