Discussion #5: LL(1) Grammars & Table-Driven Parsing


Topics • Approaches to Parsing • Full backtracking • Deterministic • Simple LL(1), table-driven parsing • Improvements to simple LL(1) grammars

Prefix Expression Grammar
• Consider the following grammar (which yields prefix expressions for binary operators):
E → N | OEE
O → + | - | * | /
N → 0 | 1 | 2 | 3 | 4
• Here, prefix expressions associate an operator with the next two operands.
* + 2 3 4 ≡ (* (+ 2 3) 4) ≡ (2 + 3) * 4 = 20
* 2 + 3 4 ≡ (* 2 (+ 3 4)) ≡ 2 * (3 + 4) = 14
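The two worked examples can be checked with a small recursive evaluator. This is only a sketch: the function name `eval_prefix` and the whitespace-separated token format are assumptions made for illustration, not part of the slides.

```python
def eval_prefix(tokens):
    """Evaluate a prefix expression given as a list of one-character tokens."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,   # true division; yields a float
    }

    def expr(pos):
        # Mirrors E -> N | O E E and returns (value, next position).
        tok = tokens[pos]
        if tok in ops:
            left, pos = expr(pos + 1)    # first operand
            right, pos = expr(pos)       # second operand
            return ops[tok](left, right), pos
        return int(tok), pos + 1         # a digit (N)

    value, pos = expr(0)
    if pos != len(tokens):
        raise ValueError("trailing input after a complete expression")
    return value


print(eval_prefix("* + 2 3 4".split()))  # 20
print(eval_prefix("* 2 + 3 4".split()))  # 14
```

Each operator consumes exactly the next two complete operands, which is why no parentheses are needed.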

Top-Down Parsing with Backtracking
Input: * + 3 4 2
E → N | OEE
O → + | - | * | /
N → 0 | 1 | 2 | 3 | 4
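A minimal sketch of full backtracking for this grammar, assuming single-character tokens. The names `GRAMMAR`, `parse`, and the `tried` counter are illustrative choices, not from the slides. The parser expands the leftmost symbol and tries every alternative in order until one leads to success.

```python
GRAMMAR = {
    "E": [["N"], ["O", "E", "E"]],
    "O": [["+"], ["-"], ["*"], ["/"]],
    "N": [["0"], ["1"], ["2"], ["3"], ["4"]],
}


def parse(stack, tokens, tried):
    """Return True if the symbols in `stack` (leftmost first) derive `tokens`.

    `tried` is a one-element list that counts attempted productions.
    """
    if not stack:
        return not tokens                  # success only if input is used up
    top, rest = stack[0], stack[1:]
    if top not in GRAMMAR:                 # terminal: must match next token
        return bool(tokens) and tokens[0] == top and parse(rest, tokens[1:], tried)
    for rhs in GRAMMAR[top]:               # non-terminal: try each alternative
        tried[0] += 1
        if parse(rhs + rest, tokens, tried):
            return True
    return False                           # every alternative failed: backtrack


count = [0]
print(parse(["E"], list("*+342"), count), "productions tried:", count[0])
```

The counter makes the cost of guessing visible: many productions are attempted and abandoned before the right ones are found.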

What are the obvious problems? • We never know what production to try. • It appears to be terribly inefficient—and it is. • Are there grammars for which we can always know what rule to choose? Yes! • Characteristics: • Only single symbol look ahead • Given a non-terminal and a current symbol, we always know which production rule to apply

LL(1) Parsers • An LL parser parses the input from Left to right, and constructs a Leftmost derivation of the sentence. • An LL(k) parser uses k tokens of look-ahead. • LL(1) parsers, although fairly restrictive, are attractive because they only need to look at the current non-terminal and the next token to make their parsing decisions. • LL(1) parsers require LL(1) grammars.

Simple LL(1) Grammars
For simple LL(1) grammars all rules have the form
A → a1α1 | a2α2 | … | anαn
where
• ai is a terminal, 1 <= i <= n
• ai ≠ aj for i ≠ j, and
• αi is a sequence of terminals and non-terminals or is empty, 1 <= i <= n

Creating Simple LL(1) Grammars
Why is this not a simple LL(1) grammar?
E → N | OEE
O → + | - | * | /
N → 0 | 1 | 2 | 3 | 4
How can we change it to simple LL(1)? By making all production rules of the form:
A → a1α1 | a2α2 | … | anαn
Thus,
E → 0 | 1 | 2 | 3 | 4 | +EE | -EE | *EE | /EE
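The simple LL(1) conditions can be checked mechanically. The sketch below assumes each grammar symbol is a single character and that a grammar is a dictionary from non-terminal to a list of right-hand-side strings. The helper name `is_simple_ll1` is hypothetical.

```python
def is_simple_ll1(grammar):
    """Check that every rule has the form A -> a1 alpha1 | ... | an alphan."""
    nonterminals = set(grammar)
    for alternatives in grammar.values():
        leading = [rhs[:1] for rhs in alternatives]
        # Every alternative must begin with a terminal (so it is not empty).
        if any(a == "" or a in nonterminals for a in leading):
            return False
        # The leading terminals must be pairwise distinct.
        if len(set(leading)) != len(leading):
            return False
    return True


original = {
    "E": ["N", "OEE"],
    "O": ["+", "-", "*", "/"],
    "N": ["0", "1", "2", "3", "4"],
}
rewritten = {"E": ["0", "1", "2", "3", "4", "+EE", "-EE", "*EE", "/EE"]}

print(is_simple_ll1(original))   # False
print(is_simple_ll1(rewritten))  # True
```

The original grammar fails because both alternatives for E begin with a non-terminal (N and O). Substituting the alternatives of N and O into E removes the problem.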

Example: LL(1) Parsing
E → (1)0 | (2)1 | (3)2 | (4)3 | (5)4 | (6)+EE | (7)-EE | (8)*EE | (9)/EE
• Input * + 2 3 4: Success! Output = 8 6 3 4 5
• Input - 2 * 3: Fail! After rules 7, 3, 8, 4 the input is exhausted while the stack still holds an E.
[Figure: stack contents at each step of the two parses]

Simple LL(1) Parse Table
A parse table is defined as follows:
(V ∪ {#}) × (VT ∪ {#}) → {(β, i), pop, accept, error}
where
• β is the right side of production number i
• # marks the end of the input string (# ∉ V)
If A ∈ (V ∪ {#}) is the symbol on top of the stack and a ∈ (VT ∪ {#}) is the current input symbol, then ACTION(A, a) =
• pop, if A = a for a ∈ VT
• accept, if A = # and a = #
• (aα, i), which means "pop A, then push aα and output i" (A → aα is the ith production)
• error, otherwise
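The ACTION definition translates almost directly into a driver loop. The following is a sketch under the assumption that symbols are single characters and that productions are numbered from 1 in the order given. The names `build_table`, `parse`, and `RULES` are illustrative, and the grammar is the nine-rule prefix grammar from the parsing example.

```python
def build_table(productions):
    """Map (non-terminal, leading terminal) to (right side, rule number)."""
    table = {}
    for i, (lhs, rhs) in enumerate(productions, start=1):
        table[(lhs, rhs[0])] = (rhs, i)       # ACTION(A, a) = (a alpha, i)
    return table


def parse(productions, start, text):
    """Return the list of rule numbers used, or None on error."""
    table = build_table(productions)
    stack = ["#", start]                      # top of stack is the list's end
    tokens = list(text) + ["#"]               # '#' marks the end of the input
    pos, output = 0, []
    while True:
        top, tok = stack[-1], tokens[pos]
        if top == "#" and tok == "#":
            return output                     # accept
        if (top, tok) in table:
            rhs, i = table[(top, tok)]
            stack.pop()                       # pop A ...
            stack.extend(reversed(rhs))       # ... push a alpha, a on top
            output.append(i)
        elif top == tok and top != "#":
            stack.pop()                       # pop: terminal matches input
            pos += 1
        else:
            return None                       # error


RULES = [("E", r) for r in ["0", "1", "2", "3", "4", "+EE", "-EE", "*EE", "/EE"]]
print(parse(RULES, "E", "*+234"))  # [8, 6, 3, 4, 5]
print(parse(RULES, "E", "-2*3"))   # None
```

Every step is a single table lookup on (top of stack, current input symbol), so no production is ever tried and abandoned.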

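Parse Table
E → (1)0 | (2)1 | (3)2 | (4)3 | (5)+EE | (6)*EE
Rows: top of stack, from V ∪ {#}. Columns: current input symbol, from VT ∪ {#}. All blank entries are error.

      0       1       2       3       +         *         #
E     (0,1)   (1,2)   (2,3)   (3,4)   (+EE,5)   (*EE,6)
0     pop
1             pop
2                     pop
3                             pop
+                                     pop
*                                               pop
#                                                         accept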

Simple LL(1): More Restrictive than Necessary • Simple LL(1) grammars are very easy and efficient to parse but also very restrictive. • The good news: we can achieve the same desirable results without being so restrictive. • How? We only need to retain the restriction that single-symbol lookahead uniquely determines which rule to use.

Relaxing Simple LL(1) Restrictions
• Consider the following grammar, which is not simple LL(1):
E → (1)N | (2)OEE
O → (3)+ | (4)*
N → (5)0 | (6)1 | (7)2 | (8)3
• What are the problem rules? (1) & (2)
• Observe that it is possible to distinguish between rules 1 and 2.
• N leads to {0, 1, 2, 3}
• O leads to {+, *}
• {0, 1, 2, 3} ∩ {+, *} = ∅
• Thus, if we see 0, 1, 2, or 3 we choose (1), and if we see + or *, we choose (2).

LL(1) Grammars
• FIRST(α) = { a | α ⇒* aβ and a ∈ VT }
• A grammar is LL(1) if for all rules of the form A → α1 | α2 | … | αn the sets FIRST(α1), FIRST(α2), …, and FIRST(αn) are pair-wise disjoint; that is, FIRST(αi) ∩ FIRST(αj) = ∅ for i ≠ j
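FIRST sets and the disjointness test can be computed with a few lines of code. This sketch assumes single-character symbols and a grammar with no empty right sides and no left recursion, as in the example grammars here. A general implementation would also have to handle those cases. The names `first`, `is_ll1`, and `G` are illustrative.

```python
def first(symbols, grammar):
    """FIRST of a non-empty symbol string (no empty rules, no left recursion)."""
    head = symbols[0]
    if head not in grammar:                   # terminal: it is its own FIRST
        return {head}
    result = set()
    for rhs in grammar[head]:                 # non-terminal: union over rules
        result |= first(rhs, grammar)
    return result


def is_ll1(grammar):
    """Check that the FIRST sets of each rule's alternatives are disjoint."""
    for alternatives in grammar.values():
        seen = set()
        for rhs in alternatives:
            f = first(rhs, grammar)
            if seen & f:                      # overlap: lookahead is ambiguous
                return False
            seen |= f
    return True


G = {"E": ["N", "OEE"], "O": ["+", "*"], "N": ["0", "1", "2", "3"]}
print(sorted(first("N", G)))    # ['0', '1', '2', '3']
print(sorted(first("OEE", G)))  # ['*', '+']
print(is_ll1(G))                # True
```

Because no right side is empty, FIRST of a string depends only on its first symbol, which keeps the recursion short.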

E (1)N | (2)OEEO  (3)+ | (4)*N  (5)0 | (6)1 | (7)2 | (8)3 For (A, a), we select (, i) if a  FIRST() and  is the right hand side of rule i. VT{#} V{#}

What does 2 4 2 3 1 6 1 7 1 8 mean?
E → (1)N | (2)OEE
O → (3)+ | (4)*
N → (5)0 | (6)1 | (7)2 | (8)3
2 4 2 3 1 6 1 7 1 8 defines a parse tree via a preorder traversal:

E (2)OEE
  O (4)*
  E (2)OEE
    O (3)+
    E (1)N
      N (6)1
    E (1)N
      N (7)2
  E (1)N
    N (8)3
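To make the preorder reading concrete, the sketch below rebuilds the tree from the rule numbers and reads the input string back off its leaves. The tuple-based tree representation and the names `build_tree` and `leaves` are assumptions chosen for brevity.

```python
RULES = {1: ("E", "N"), 2: ("E", "OEE"), 3: ("O", "+"), 4: ("O", "*"),
         5: ("N", "0"), 6: ("N", "1"), 7: ("N", "2"), 8: ("N", "3")}
NONTERMINALS = {"E", "O", "N"}


def build_tree(numbers):
    """Rebuild the parse tree from rule numbers listed in preorder."""
    it = iter(numbers)

    def node(symbol):
        if symbol not in NONTERMINALS:
            return symbol                     # terminal leaf
        lhs, rhs = RULES[next(it)]            # next rule expands this node
        if lhs != symbol:
            raise ValueError("rule does not expand " + symbol)
        return (symbol, [node(s) for s in rhs])

    tree = node("E")
    if next(it, None) is not None:
        raise ValueError("unused rule numbers")
    return tree


def leaves(tree):
    """Concatenate the terminal leaves from left to right."""
    if isinstance(tree, str):
        return tree
    return "".join(leaves(child) for child in tree[1])


tree = build_tree([2, 4, 2, 3, 1, 6, 1, 7, 1, 8])
print(leaves(tree))  # *+123
```

Each rule number expands the leftmost unexpanded non-terminal, so the sequence is exactly a leftmost derivation of * + 1 2 3.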
