This chapter explores the simplification of context-free grammars (CFGs) and the introduction of normal forms, specifically focusing on methods such as eliminating useless variables, λ-productions, and unit productions. It covers Chomsky Normal Form (CNF) and provides insights into detecting cycles, nullable variables, and the CYK membership algorithm for efficient parsing. With practical examples and exercises, readers will gain a foundational understanding of how to transform CFGs into simpler and more manageable forms while retaining their generative capabilities.
CS 3240 – Chapter 6 Normal Forms for Context-Free Grammars
Topics
• 6.1: Simplifying Grammars
    • Substitution
    • Removing useless variables
    • Removing λ
    • Removing unit productions
• 6.2: Normal Forms
    • Chomsky Normal Form (CNF)
• 6.3: A Membership Algorithm
    • CYK Algorithm
    • An example of bottom-up parsing
CS 3240 - Normal Forms for Context-Free Languages
Simplifying Context-Free Grammars
• Variables in CFGs can often be eliminated
• If a variable is not recursive, you can substitute its rules wherever it appears
• See next slide
Substitution Example
    A ➞ a | aaA | abBc
    B ➞ abbA | b
Just substitute directly for B:
    A ➞ a | aaA | ababbAc | abbc
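The substitution step can be sketched in Python. This is only an illustration, under an encoding that is not part of the slides: a grammar is a dict from a variable to a list of rule bodies, each body a string in which uppercase letters are variables and lowercase letters are terminals. The function name `substitute` is hypothetical.

```python
from itertools import product

def substitute(grammar, var):
    """Inline every rule of `var` wherever it occurs, then drop `var`.

    Assumes `var` is not recursive and is not the start variable.
    """
    alternatives = grammar[var]
    if any(var in body for body in alternatives):
        raise ValueError(f"{var} is recursive; it cannot be substituted away")
    result = {}
    for head, bodies in grammar.items():
        if head == var:
            continue
        new_bodies = []
        for body in bodies:
            parts = body.split(var)  # fixed pieces between occurrences of var
            # Choose one alternative of var for each occurrence.
            for choice in product(alternatives, repeat=len(parts) - 1):
                pieces = [parts[0]]
                for alt, part in zip(choice, parts[1:]):
                    pieces += [alt, part]
                new_body = "".join(pieces)
                if new_body not in new_bodies:
                    new_bodies.append(new_body)
        result[head] = new_bodies
    return result

grammar = {"A": ["a", "aaA", "abBc"], "B": ["abbA", "b"]}
print(substitute(grammar, "B"))
# {'A': ['a', 'aaA', 'ababbAc', 'abbc']}
```

The output matches the slide: the two rules for B are spliced into abBc.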
Detecting Cycles and Useless Variables
• A variable is useless if:
    • It can’t be reached from the start variable, or
    • It never leads to a terminal string (due to endless recursion)
• Both problems can be detected with a dependency graph
• See next slide
Useless Variables
    S ➞ aSb | A | λ
    A ➞ aA
A is useless (non-terminating), leaving:
    S ➞ aSb | λ
Useless Variables
    S ➞ A
    A ➞ aA | λ
    B ➞ bA
B is useless (non-reachable), leaving:
    S ➞ A
    A ➞ aA | λ
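Both checks from the two slides above can be sketched in Python. The encoding is an assumption made for illustration (dict of variable to list of string bodies, uppercase = variable, lowercase = terminal, empty string = λ), and the function names are hypothetical. Non-terminating variables are pruned first, then unreachable ones, since dropping a non-terminating variable can cut other variables off from S.

```python
def generating(grammar):
    """Variables that can derive some string of terminals."""
    gen = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in gen:
                continue
            # A body works if every variable in it is already known to terminate.
            if any(all(not s.isupper() or s in gen for s in body) for body in bodies):
                gen.add(head)
                changed = True
    return gen

def reachable(grammar, start):
    """Variables reachable from `start` in the dependency graph."""
    seen = {start}
    stack = [start]
    while stack:
        head = stack.pop()
        for body in grammar.get(head, []):
            for s in body:
                if s.isupper() and s not in seen:
                    seen.add(s)
                    stack.append(s)
    return seen

def remove_useless(grammar, start="S"):
    gen = generating(grammar)
    # Drop non-terminating variables and every body that mentions one.
    pruned = {
        head: [b for b in bodies if all(not s.isupper() or s in gen for s in b)]
        for head, bodies in grammar.items()
        if head in gen
    }
    reach = reachable(pruned, start)
    return {head: bodies for head, bodies in pruned.items() if head in reach}

print(remove_useless({"S": ["aSb", "A", ""], "A": ["aA"]}))
# {'S': ['aSb', '']}
print(remove_useless({"S": ["A"], "A": ["aA", ""], "B": ["bA"]}))
# {'S': ['A'], 'A': ['aA', '']}
```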
Exercise
Simplify the following:
    S ➞ aS | A | C
    A ➞ a
    B ➞ aa
    C ➞ aCb
A Dependency Graph [figure]
Exercise
Simplify the following grammar:
    S ➞ AB | AC
    A ➞ aAb | bAa | a
    B ➞ bbA | aaB | AB
    C ➞ abCa | aDb
    D ➞ bD | aC
Nullable Variables
• Any variable that can eventually terminate in the empty string is said to be nullable
• Note: a variable may be indirectly nullable
• In general: if A ➞ V1V2…Vn, and all the Vi are nullable, then A is also nullable
• See Theorem 6.3
Finding Nullable Variables
Consider the following grammar:
    S ➞ a | Xb | aYa
    X ➞ Y | λ
    Y ➞ b | X
Which variables are nullable?
How can we substitute the effect of λ before removing it?
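One way to find the nullable variables is a fixed-point loop over the rule "if every symbol of some body is nullable, so is the head". A minimal Python sketch, assuming a dict-of-strings encoding that is not part of the slides (uppercase = variable, lowercase = terminal, empty string = λ); `nullable_variables` is a hypothetical name.

```python
def nullable_variables(grammar):
    """Return the set of variables that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # The empty body "" passes trivially; a terminal is never nullable.
            if any(all(s in nullable for s in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable

grammar = {"S": ["a", "Xb", "aYa"], "X": ["Y", ""], "Y": ["b", "X"]}
print(sorted(nullable_variables(grammar)))
# ['X', 'Y']
```

X is nullable directly (X ➞ λ) and Y indirectly (Y ➞ X). S is not nullable because each of its bodies contains a terminal.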
Removing λ
• First find all nullable variables
• Then substitute (A + λ) for every nullable variable A, and expand
• Then remove λ everywhere from the grammar
• What’s left is equivalent to the original grammar
    • except the empty string may be lost
    • we won’t worry about that
Removing λ: Example
Consider the following grammar (again):
    S ➞ a | Xb | aYa
    X ➞ Y | λ
    Y ➞ b | X
How can we substitute the effect of λ before removing it?
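The (A + λ) substitution amounts to keeping or dropping each nullable symbol in every body, then discarding any empty body. A rough Python sketch follows, assuming the same illustrative encoding (dict of string bodies, empty string = λ) and taking the nullable set as an input. Here the set is {X, Y}, since X ➞ λ and Y ➞ X. The name `remove_lambda` is hypothetical.

```python
from itertools import product

def remove_lambda(grammar, nullable):
    """Expand every body over its nullable symbols and drop all λ-rules."""
    result = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for body in bodies:
            # Each nullable symbol is either kept or erased: (A + λ).
            options = [(s, "") if s in nullable else (s,) for s in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result

grammar = {"S": ["a", "Xb", "aYa"], "X": ["Y", ""], "Y": ["b", "X"]}
print(remove_lambda(grammar, nullable={"X", "Y"}))
# {'S': ['a', 'Xb', 'b', 'aYa', 'aa'], 'X': ['Y'], 'Y': ['b', 'X']}
```

The result still contains the unit productions X ➞ Y and Y ➞ X, which is why unit removal comes after this step.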
Remove Nulls
    S → aSbS | bSaS | λ

    S → aSa | bSb | X
    X → aYb | bYa
    Y → aY | bY | λ
Removing Unit Productions
• Unit productions often occur in chains
    • A ➞ B ➞ C
• Must maintain the effect of B and C when substituting for A throughout
• Procedure:
    • Find all unit chains
    • Rebuild the grammar by:
        • Keeping all non-unit productions
        • Keeping only the effect of all unit productions/chains
Removing Unit Productions: Example
    S ➞ A | bb
    A ➞ B | b
    B ➞ S | a
Note that S ⇒* {A,B}, A ⇒* {B,S}, B ⇒* {S,A}
Giving:
    S ➞ bb | b | a     // Added non-unit part of A and B
    A ➞ b | a | bb     // Added non-unit part of B and S
    B ➞ a | bb | b     // Added non-unit part of S and A
Unit Rule Removal Procedure
1) Determine all variables reachable by unit rules for each variable
2) Keep all non-unit rules
3) Substitute non-unit rules in place of each variable reachable by unit productions
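The three steps can be sketched in Python as follows. The encoding (dict of string bodies, uppercase = variable) and the names `is_unit`, `unit_closure` and `remove_units` are assumptions made for illustration, and the grammar is assumed to have had its λ-rules removed already.

```python
def is_unit(body):
    """A unit production has a body that is exactly one variable."""
    return len(body) == 1 and body.isupper()

def unit_closure(grammar, var):
    """Step 1: variables reachable from `var` through unit rules only."""
    seen = set()
    stack = [var]
    while stack:
        head = stack.pop()
        for body in grammar[head]:
            if is_unit(body) and body != var and body not in seen:
                seen.add(body)
                stack.append(body)
    return seen

def remove_units(grammar):
    result = {}
    for head in grammar:
        # Step 2: keep the head's own non-unit rules.
        bodies = [b for b in grammar[head] if not is_unit(b)]
        # Step 3: add the non-unit rules of every unit-reachable variable.
        for other in sorted(unit_closure(grammar, head)):
            for b in grammar[other]:
                if not is_unit(b) and b not in bodies:
                    bodies.append(b)
        result[head] = bodies
    return result

grammar = {"S": ["A", "bb"], "A": ["B", "b"], "B": ["S", "a"]}
for head, bodies in remove_units(grammar).items():
    print(head, "->", " | ".join(bodies))
```

On the example grammar S ➞ A | bb, A ➞ B | b, B ➞ S | a, every variable ends up with the same three bodies a, b and bb, as on the example slide. The order of alternatives within a rule may differ.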
Why Remove Nulls First?
    S ➞ aA
    A ➞ BB
    B ➞ aBb | λ
Now remove nulls and see what happens… (Also see the solution for #15 in 6.1)
Remove Nulls and Units
    S ➞ AB
    A ➞ B
    B ➞ aB | BB | λ
Simplification Summary
• Do things in the following recommended order:
    1) Remove nulls
    2) Remove unit productions
    3) Remove useless variables
    4) Simplify by substitution as desired
Chomsky Normal Form (Section 6.2)
• Very important for our purposes
• All CNF rules are of one of the following two forms:
    • A ➞ c (a single terminal)
    • A ➞ XY (exactly two variables)
• Must begin the transformation after simplifying the grammar (removing λ, all unit productions, useless variables, etc.)
Chomsky Normal Form: Example
Convert to CNF:
    S ➞ bA | aB
    A ➞ bAA | aS | a
    B ➞ aBB | bS | b
(NOTE: already has no nulls or units)
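A common way to carry out the conversion has two steps: give each terminal that appears in a longer body its own variable, then split bodies longer than two into a chain of pairs. The Python sketch below follows that approach under stated assumptions. The input encoding (dict of string bodies), the tuple output format, and the fresh variable names `T_a`, `T_b`, `X1`, `X2` are all choices made here for illustration, not notation from the slides.

```python
def to_cnf(grammar):
    """Convert a grammar with no λ-rules and no unit rules to CNF.

    Input bodies are strings (uppercase = variable, lowercase = terminal).
    Output bodies are tuples of symbol names, because the fresh variables
    have multi-character names. Assumes those names are not already in use.
    """
    cnf = {}
    counter = 0

    def add(head, body):
        cnf.setdefault(head, [])
        if body not in cnf[head]:
            cnf[head].append(body)

    for head, bodies in grammar.items():
        for body in bodies:
            if len(body) == 1:
                # Already of the form A -> c (unit rules are assumed gone).
                add(head, (body,))
                continue
            # Step 1: replace each terminal c in a long body by a variable T_c.
            symbols = []
            for s in body:
                if s.islower():
                    add("T_" + s, (s,))
                    symbols.append("T_" + s)
                else:
                    symbols.append(s)
            # Step 2: break bodies longer than two into a chain of pairs.
            current = head
            while len(symbols) > 2:
                counter += 1
                fresh = f"X{counter}"
                add(current, (symbols[0], fresh))
                current, symbols = fresh, symbols[1:]
            add(current, tuple(symbols))
    return cnf

grammar = {"S": ["bA", "aB"], "A": ["bAA", "aS", "a"], "B": ["aBB", "bS", "b"]}
for head, bodies in to_cnf(grammar).items():
    print(head, "->", " | ".join(" ".join(body) for body in bodies))
```

For the example grammar this yields, among others, A ➞ T_b X1 with X1 ➞ A A in place of A ➞ bAA. Other CNF grammars for the same language are equally valid.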
Exercise
Convert the following grammar to CNF:
    S ➞ abAB
    A ➞ bAB | λ
    B ➞ BAa | A | λ
Exercise
Convert the following grammar to CNF:
    S ➞ aS | bS | B
    B ➞ bb | C | λ
    C ➞ cC | λ
Greibach Normal Form
• Single terminal character followed by zero or more variables (cV*, c ∈ Σ)
    • V → a
    • V → aBCD…
• λ allowed only in S → λ
• Sometimes need to make up new variable names
Greibach Example 1
    S → AB
    A → aA | bB | b
    B → b
Substitute for A in first rule (i.e., add B to each rule for A):
    S → aAB | bBB | bB
The other rules are okay
Greibach Example 2
    S → abSb | aa
Add rules to generate a and b:
    S → aBSB | aA
    A → a
    B → b
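The shape cV* is easy to check mechanically. The short Python sketch below tests the grammars from the two Greibach examples before and after the rewrite. The encoding (dict of string bodies, uppercase = variable, lowercase = terminal, empty string = λ) and the name `is_gnf` are assumptions made for illustration.

```python
def is_gnf(grammar, start="S"):
    """True if every body is one terminal followed by zero or more variables."""
    for head, bodies in grammar.items():
        for body in bodies:
            if body == "":
                # λ is tolerated only as S -> λ.
                if head != start:
                    return False
            elif not (body[0].islower() and all(s.isupper() for s in body[1:])):
                return False
    return True

before_1 = {"S": ["AB"], "A": ["aA", "bB", "b"], "B": ["b"]}
after_1 = {"S": ["aAB", "bBB", "bB"], "A": ["aA", "bB", "b"], "B": ["b"]}
before_2 = {"S": ["abSb", "aa"]}
after_2 = {"S": ["aBSB", "aA"], "A": ["a"], "B": ["b"]}

print(is_gnf(before_1), is_gnf(after_1))  # False True
print(is_gnf(before_2), is_gnf(after_2))  # False True
```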
The CF Membership Problem (Section 6.3)
• The “parsing” problem
    • How do we know if a string is generated by a given grammar?
• Bottom-up parsing (CYK Algorithm)
    • An example of dynamic programming
    • Requires Chomsky Normal Form (CNF)
    • Start by considering A ➞ c rules
    • Build up the parse tree from there
A Parsing Example
    S ➞ XY
    X ➞ XA | a | b
    Y ➞ AY | a
    A ➞ a
Does this grammar generate “baaa”?
Parse Trees for “baaa” [figure]
CNF yields binary trees. (Can you find a third parse tree?)
Parsing “baaa”
    S ➞ XY
    X ➞ XA | a | b
    Y ➞ AY | a
    A ➞ a
Stage 1:
    b ⇐ X
    a ⇐ X, Y, A
Stage 2:
    ba ⇐ X(X,Y,A) = XX, XY, XA ⇐ S, X
    aa ⇐ (X,Y,A)(X,Y,A) = XX, XY, XA, YX, YY, YA, AX, AY, AA ⇐ S, X, Y
Stage 3:
    baa ⇐ ba a ⇐ (S,X)(X,Y,A) = SX, SY, SA, XX, XY, XA ⇐ S, X
    baa ⇐ b aa ⇐ X(S,X,Y) = XS, XX, XY ⇐ S
    aaa ⇐ aa a ⇐ (S,X,Y)(X,Y,A) = _________
    aaa ⇐ a aa ⇐ (X,Y,A)(S,X,Y) = _________
Stage 4 (you finish…):
    baaa ⇐ b aaa ⇐ __________
    baaa ⇐ ba aa ⇐ __________
    baaa ⇐ baa a ⇐ __________
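The staged procedure above is the CYK algorithm: stage k fills in, for every substring of length k, the set of variables that derive it, trying every split into a left and right part. A minimal Python sketch, assuming a CNF grammar encoded as a dict of string bodies (one lowercase terminal or two uppercase variables). The encoding and the names `cyk_table` and `generates` are illustrative choices, not from the slides.

```python
def cyk_table(grammar, word):
    """table[(i, length)] = set of variables deriving word[i:i+length]."""
    n = len(word)
    table = {}
    # Stage 1: substrings of length 1 come from A -> c rules.
    for i, c in enumerate(word):
        table[(i, 1)] = {head for head, bodies in grammar.items() if c in bodies}
    # Later stages: try every split point of each longer substring.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            cell = set()
            for k in range(1, length):
                left = table[(i, k)]
                right = table[(i + k, length - k)]
                for head, bodies in grammar.items():
                    for body in bodies:
                        if len(body) == 2 and body[0] in left and body[1] in right:
                            cell.add(head)
            table[(i, length)] = cell
    return table

def generates(grammar, word, start="S"):
    if not word:
        return False  # a CNF grammar without S -> λ cannot derive λ
    return start in cyk_table(grammar, word)[(0, len(word))]

grammar = {"S": ["XY"], "X": ["XA", "a", "b"], "Y": ["AY", "a"], "A": ["a"]}
print(generates(grammar, "baaa"))  # True
```

The table has one cell per substring and each cell tries every split point, so the running time grows as n³ in the length of the string for a fixed grammar.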
CYK Algorithm: Tabular Form [four slides showing the table stage by stage, starting with Stage 1 and Stage 2; tables not reproduced]
Exercise
Does the following grammar generate “abbaab”?
    S ➞ SAB | λ
    A ➞ aA | λ
    B ➞ bB | λ