Chap6 LR Parsing

Recall some terminology:
• phrase: α ∈ (Vt ∪ Vn)*, A ∈ Vn, A ⇒+ α
• simple phrase: α ∈ (Vt ∪ Vn)*, A ∈ Vn, A → α
• handle of a sentential form: the leftmost simple phrase of the sentential form
• e.g. [Figure: parse tree rooted at S for an example sentential form, with the handle and a simple phrase marked]

Shift-reduce parser: two operations
• shift input onto the parse stack until the handle is identified; then
• reduce the handle to the LHS nonterminal
e.g. Given the grammar
    P → begin S end $
    S → a ; S
    S → begin S end ; S
    S → λ
and the input
    begin a ; begin a ; end ; end $
the handles, in the order they are reduced, are (trace this example):
    λ
    a ; S
    λ
    begin S end ; S
    a ; S
    begin S end $

Looking at the tree
[Figure: parse tree for begin a ; begin a ; end ; end $, with P → begin S end $ at the root and the productions S → a ; S, S → begin S end ; S and S → λ below it (bottom-up tree construction).] (p. 159)

Looking at the derivation
    P ⇒ begin S end $
      ⇒ begin a ; S end $
      ⇒ begin a ; begin S end ; S end $
      ⇒ begin a ; begin S end ; λ end $
      ⇒ begin a ; begin a ; S end ; end $
      ⇒ begin a ; begin a ; λ end ; end $
(This is a rightmost derivation. The RHS introduced at each step is the handle of the resulting sentential form.)

Two questions:
1. Have we reached the end of a handle, and how long is the handle?
2. Which nonterminal does the handle reduce to?
We use tables to answer these questions:
• ACTION table
• GOTO table
We first show how to use the tables, then how to construct them.

1. How does the LR(0) parser work?
2. How are the action and goto tables constructed?
3. Is LR(0) parsing correct?
4. LR(1)
5. SLR(1)
6. LALR(1)

LR parsers are driven by two tables:
• action table, which specifies what action to take (shift, reduce, accept or error)
• goto table, which specifies the state transitions
We push states, rather than symbols, onto the stack.
Each state represents a subtree of the parse tree.

shift-reduce-driver  /* look ahead 1 token */
{
    push( start_state );
    T := scanner();
    do {
        S := state on top of stack;
        switch( action(S,T) )
            case shift:
                push( state(S,T) );
                T := scanner();
                break;
            case reduce i:
                m := length of RHS of prod. i;
                pop( m );
                S := state on top of stack after popping;
                X := LHS of prod. i;
                push( state(S,X) );
                break;
            case accept: ...
            case error: ...
        end
    } forever
}
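The driver above can be sketched in runnable form. This is a minimal illustration, not the textbook's tables: the grammar is G1 (S → E $, E → E + T | T, T → id | ( E )), the state numbers and the `PRODS`, `GOTO`, `ACTION` and `parse` names are assumptions of this sketch, and the tables were built by hand for an LR(0) machine, so the action depends on the state alone and the lookahead token is only used to pick the shift target.

```python
# Hand-built LR(0) tables for G1. State numbers are this sketch's own,
# not those of any figure in the text.
PRODS = {
    1: ("S", ("E", "$")),
    2: ("E", ("E", "+", "T")),
    3: ("E", ("T",)),
    4: ("T", ("id",)),
    5: ("T", ("(", "E", ")")),
}
GOTO = {
    (0, "E"): 1, (0, "T"): 2, (0, "id"): 3, (0, "("): 4,
    (1, "$"): 5, (1, "+"): 6,
    (4, "E"): 7, (4, "T"): 2, (4, "id"): 3, (4, "("): 4,
    (6, "T"): 8, (6, "id"): 3, (6, "("): 4,
    (7, ")"): 9, (7, "+"): 6,
}
ACTION = {
    0: "shift", 1: "shift", 4: "shift", 6: "shift", 7: "shift",
    2: ("reduce", 3), 3: ("reduce", 4), 8: ("reduce", 2), 9: ("reduce", 5),
    5: "accept",
}


def parse(tokens):
    """Run the shift-reduce driver; return production numbers in reduction order."""
    stack = [0]          # stack of states, starting with the start state
    pos = 0              # index of the lookahead token
    reductions = []
    while True:
        state = stack[-1]
        act = ACTION[state]
        if act == "accept":
            return reductions
        if act == "shift":
            token = tokens[pos] if pos < len(tokens) else None
            if (state, token) not in GOTO:       # blank table entry: error
                raise SyntaxError(f"unexpected {token!r} in state {state}")
            stack.append(GOTO[(state, token)])
            pos += 1
        else:
            _, i = act
            lhs, rhs = PRODS[i]
            del stack[len(stack) - len(rhs):]    # pop one state per RHS symbol
            stack.append(GOTO[(stack[-1], lhs)]) # goto on the LHS nonterminal
            reductions.append(i)


print(parse(["id", "+", "(", "id", ")", "$"]))
```

Slicing with `len(stack) - len(rhs)` rather than a negative index keeps the pop correct for a λ-production, whose RHS has length 0.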

Parsing Example

6.2 LR parsers
• LR(1): left-to-right scanning, rightmost derivation (in reverse), 1-token lookahead
• LR parsers are deterministic: no backup, no retry
• LR(k) parsers decide the next action by examining the tokens already shifted and at most k lookahead tokens.
• LR(k) is the most powerful of the deterministic bottom-up parsers with at most k lookahead tokens.

Use the four small grammars to motivate the construction of LR tables.

6.2.1 LR(0) tables
• A production has the form A → X1 X2 ... Xj.
• By adding a dot, we get an item (configuration), e.g.
    A → · X1 X2 ... Xj
    A → X1 · X2 ... Xj
    ...
    A → X1 X2 ... Xj ·
• The · indicates how much of a RHS has been shifted onto the stack.

• An item with the · at the end of the RHS,
    A → X1 X2 ... Xj ·
  indicates that the RHS has been recognized and should be reduced to the LHS.
• An item with the · at the beginning of the RHS, i.e.
    A → · X1 X2 ... Xj
  predicts that the RHS will be shifted onto the stack.

• An LR(0) state is a set of items.
• This means that the actual configuration of the LR(0) parser is described by one of the items in the set.
• The close operation: if there is an item A → α·Bβ in the set, then add all items of the form B → ·γ to the set (one for each production B → γ), and repeat until nothing new is added.
• The initial state is close( { S → ·α$ } ), where S is the start symbol.
• Show the construction for the grammar
    S → E $
    E → E + T
    E → T
    T → id
    T → ( E )
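The close operation can be sketched as a worklist loop. This is a minimal illustration under stated assumptions: an item is represented as a hypothetical `(lhs, rhs, dot)` tuple, and `GRAMMAR`, `NONTERMINALS` and `close` are names chosen for this sketch.

```python
# The grammar from the slide, one (LHS, RHS) pair per production.
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("id",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def close(items):
    """Return the closure of a set of LR(0) items (lhs, rhs, dot)."""
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot = work.pop()
        # Only an item with the dot in front of a nonterminal B adds anything.
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs2, rhs2 in GRAMMAR:
                if lhs2 == rhs[dot]:
                    new_item = (lhs2, rhs2, 0)   # B -> . gamma
                    if new_item not in result:
                        result.add(new_item)
                        work.append(new_item)
    return frozenset(result)


initial_state = close({("S", ("E", "$"), 0)})
for lhs, rhs, dot in sorted(initial_state):
    print(lhs, "->", " ".join(rhs[:dot] + ("·",) + rhs[dot:]))
```

For this grammar the initial state contains all five productions with the dot at the far left, because E and T are both reachable from the dot in S → ·E $.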

Close operation example

LR(0) Parsing
For example, given grammar G2:
    S' → S $
    S → ID |

Build goto function from CFSM

To construct the action table of LR(0) parsers, we use the following rule:

• The state diagram is called the characteristic finite state machine (CFSM) of the grammar.
• The CFSM is the goto table of LR(0) parsers.
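Building the CFSM amounts to repeatedly advancing the dot over one symbol and closing the result until no new state appears. The sketch below assumes the grammar S → E $, E → E + T | T, T → id | ( E ) and items represented as hypothetical `(lhs, rhs, dot)` tuples. The names `close`, `goto` and `build_cfsm` and the state numbering are this sketch's own and will not match the figures in the text.

```python
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("id",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def close(items):
    """Closure of a set of LR(0) items."""
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs2, rhs2 in GRAMMAR:
                if lhs2 == rhs[dot] and (lhs2, rhs2, 0) not in result:
                    result.add((lhs2, rhs2, 0))
                    work.append((lhs2, rhs2, 0))
    return frozenset(result)


def goto(state, symbol):
    """Advance the dot over `symbol` in every item that allows it, then close."""
    moved = {(lhs, rhs, dot + 1)
             for lhs, rhs, dot in state
             if dot < len(rhs) and rhs[dot] == symbol}
    return close(moved)


def build_cfsm():
    """Return (states, edges); edges maps (state index, symbol) to a state index."""
    states = [close({("S", ("E", "$"), 0)})]
    edges = {}
    for state in states:                      # the list grows while we iterate
        symbols = {rhs[dot] for _, rhs, dot in state if dot < len(rhs)}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
            edges[(states.index(state), symbol)] = states.index(target)
    return states, edges


states, edges = build_cfsm()
print(len(states), "states,", len(edges), "transitions")
```

The `edges` dictionary is exactly the goto table: terminals and nonterminals are treated alike, which is why the same table serves both shifts and the transition taken after a reduction.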

Action table of LR(0)
1. If state S contains the item B → ρ· , then action[S] = reduce with B → ρ.
2. If state S contains an item B → α·aβ, where a ∈ Vt, then action[S] = shift.
3. If state S contains the item S → α$· , then action[S] = accept.
4. Otherwise, action[S] = error.
* Show the action table for the previous example.

Constructing the LR(0) machine
    S → E $
    E → E + T
    E → T
    T → id
    T → ( E )
The initial state is close( { S → ·E $ } )
Figure 6.11 CFSM for G1
Figure 6.12 action table for G1

Consider G1:
    S → E $
    E → E + T | T
    T → ID | ( E )
CFSM for G1

Two kinds of conflicts:
1. shift-reduce conflict: there exists a state S such that action(S) can be either shift or reduce. In this case, the parser does not know whether to shift or to reduce.
2. reduce-reduce conflict: there exists a state S such that action(S) contains two reduce entries. In this case, the parser does not know which production to use in the reduction.
A grammar is LR(0) iff there is no conflict in the action table.
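The LR(0) action rules and the two kinds of conflict can be checked mechanically on a single item set. The following is a minimal sketch: items are hypothetical `(lhs, rhs, dot)` tuples, the terminal set is the one of the expression grammar with * used later in the chapter, and `lr0_actions` and `conflict_kind` are names introduced here for illustration.

```python
TERMINALS = {"+", "*", "ID", "(", ")", "$"}


def lr0_actions(state, start="S"):
    """Collect every action the LR(0) rules allow in one state."""
    actions = set()
    for lhs, rhs, dot in state:
        if dot == len(rhs):
            if lhs == start:
                actions.add(("accept",))            # S -> alpha $ .
            else:
                actions.add(("reduce", lhs, rhs))   # B -> rho .
        elif rhs[dot] in TERMINALS:
            actions.add(("shift",))                 # B -> alpha . a beta
    return actions


def conflict_kind(actions):
    """Name the conflict in a state's action set, or return None."""
    reduces = [a for a in actions if a[0] == "reduce"]
    if len(reduces) > 1:
        return "reduce-reduce"
    if reduces and ("shift",) in actions:
        return "shift-reduce"
    return None


# An inadequate state: E -> T .  together with  T -> T . * P
inadequate = {("E", ("T",), 1), ("T", ("T", "*", "P"), 1)}
# An adequate state: only  T -> P .
adequate = {("T", ("P",), 1)}

print(conflict_kind(lr0_actions(inadequate)))
print(conflict_kind(lr0_actions(adequate)))
```

A state with no applicable rule yields an empty action set, which corresponds to the error entry. When a state has both kinds of conflict, this sketch reports only the reduce-reduce one.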

Few practical grammars are LR(0).
1. For instance, consider any λ-production A → λ. If A can also generate a nonempty terminal string, then there must be a shift-reduce conflict. Suppose b ∈ First(A):
    α · b y   becomes   α A · b y   (reduce to A), or
    α · b y   becomes   α b · y     (shift)
2. Also consider operator precedence:
    id + id · + id   becomes   E · + id          (reduce to E), or
    id + id · * id   becomes   id + id * · id    (shift)
   (remember: no lookahead!)

6.3 LR(1) parsing
• An LR(1) item has the form
    A → X1 X2 ... Xi · Xi+1 ... Xj, l
  where l ⊆ Vt ∪ { λ }
• l is the set of terminals that may follow A in some context.
• close(S)
    { for each item in S do
        if the item is B → δ·Aρ, l
        then add A → ·γ, First(ρl)
             for each production with LHS A (i.e. A → γ)
    }

Ex.
    S → E $
    E → E + T
    E → T
    T → ID
    T → ( E )
First(S) = First(E) = First(T) = { ID, ( }
close( S → ·E$, { λ } ) = {
    S → ·E$,    { λ }
    E → ·E+T,   { $ + }
    E → ·T,     { $ + }
    T → ·ID,    { $ + }
    T → ·(E),   { $ + }
}
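The example above can be reproduced with a short sketch of the LR(1) close operation. The representation is an assumption of this sketch: each item carries a single lookahead, `(lhs, rhs, dot, lookahead)`, so the slide's item E → ·T, { $ + } appears as two tuples. The names `first_sets`, `first_of` and `close1` are hypothetical.

```python
LAMBDA = "λ"   # stands for the empty string and for the initial lookahead

G1 = [
    ("S", ("E", "$")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("ID",)),
    ("T", ("(", "E", ")")),
]


def first_of(symbols, first):
    """First of a symbol string; contains LAMBDA if the whole string can vanish."""
    out = set()
    for x in symbols:
        fx = first[x] if x in first else {x}   # a terminal is its own First
        out |= fx - {LAMBDA}
        if LAMBDA not in fx:
            return out
    out.add(LAMBDA)
    return out


def first_sets(grammar):
    """Fixed-point computation of First for every nonterminal."""
    first = {lhs: set() for lhs, _ in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            new = first_of(rhs, first)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def close1(items, grammar, first):
    """Closure of a set of LR(1) items (lhs, rhs, dot, lookahead)."""
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot, la = work.pop()
        if dot < len(rhs) and rhs[dot] in first:      # dot before nonterminal A
            looks = first_of(rhs[dot + 1:], first)    # First(rho) ...
            if LAMBDA in looks:
                looks = (looks - {LAMBDA}) | {la}     # ... extended by l
            for lhs2, rhs2 in grammar:
                if lhs2 == rhs[dot]:
                    for a in looks:
                        item = (lhs2, rhs2, 0, a)
                        if item not in result:
                            result.add(item)
                            work.append(item)
    return result


FIRST = first_sets(G1)
state0 = close1({("S", ("E", "$"), 0, LAMBDA)}, G1, FIRST)
print(len(state0), "single-lookahead items")
```

Grouping the tuples by production and dot position gives back the set-valued lookaheads used on the slides.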

Constructing the LR(1) machine
• A state is a set of items.
• Transition on x: if a state contains the item A → β·xρ, l then its successor under x contains A → βx·ρ, l (then close the set).
• The starting state is close( S → ·E$, { λ } ).
• Ex.
    S → E $
    E → E + T
    E → T
    T → T * P
    T → P
    P → ID
    P → ( E )

Figure 6.16 LR(1) machine for G3

Action table of LR(1)
1. If state S contains the item B → ρ·, { a }, then action[S,a] = reduce with B → ρ.
2. If state S contains an item B → α·aβ, l, where a ∈ Vt, then action[S,a] = shift.
3. If state S contains the item S → α·$, { λ }, then action[S,$] = accept.
4. Otherwise, action[S,x] = error.
* Show the action table for the previous example.

Figure 6.17 LR(1) action table for G3

G is LR(1) iff there is no conflict in the action table.

Compare LR(0) and LR(1)
G3:
    S → E $
    E → E + T | T
    T → T * P | P
    P → ID | ( E )
In the LR(0) machine:
    S0: S → ·E$
        E → ·E+T
        E → ·T
        T → ·T*P
        T → ·P
        P → ·ID
        P → ·(E)
    S0 on T goes to { E → T· , T → T·*P }: shift-reduce conflict!
G3 is not LR(0).

In the LR(1) machine:
    S0: S → ·E$      { λ }
        E → ·E+T     { $ + }
        E → ·T       { $ + }
        T → ·T*P     { $ + * }
        T → ·P       { $ + * }
        P → ·ID      { $ + * }
        P → ·(E)     { $ + * }
    S0 on T goes to
    S7: E → T·       { $ + }
        T → T·*P     { $ + * }
    S7 on * goes to
    S8: T → T*·P     { $ + * }
        P → ·ID      { $ + * }
        P → ·(E)     { $ + * }
action[S7,+] = reduce
action[S7,$] = reduce
action[S7,*] = shift
No conflict! G3 is LR(1).
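The S7 row of the action table can be derived mechanically from the two items of S7. The following is a small sketch: items are hypothetical `(lhs, rhs, dot, lookaheads)` tuples with a set of lookaheads, and `lr1_actions` is a name chosen here.

```python
TERMINALS = {"+", "*", "ID", "(", ")", "$"}


def lr1_actions(state, terminals, start="S"):
    """Map each lookahead token to the set of actions the LR(1) rules allow."""
    table = {}
    for lhs, rhs, dot, lookaheads in state:
        if dot == len(rhs):
            # B -> rho . , {a}: reduce only on the item's own lookaheads
            for a in lookaheads:
                table.setdefault(a, set()).add(("reduce", lhs, rhs))
        elif lhs == start and rhs[dot] == "$":
            table.setdefault("$", set()).add(("accept",))
        elif rhs[dot] in terminals:
            # B -> alpha . a beta: shift on the terminal after the dot
            table.setdefault(rhs[dot], set()).add(("shift",))
    return table


# State S7 of the LR(1) machine for G3, as listed on the slide.
S7 = [
    ("E", ("T",), 1, {"$", "+"}),
    ("T", ("T", "*", "P"), 1, {"$", "+", "*"}),
]

row = lr1_actions(S7, TERMINALS)
conflicts = {a for a, acts in row.items() if len(acts) > 1}
for token in sorted(row):
    print(token, sorted(row[token]))
print("conflicts:", conflicts)
```

Tokens that do not appear in the returned dictionary are error entries. The lookahead set of the shift item T → T·*P plays no part in this row, since it only matters once the dot reaches the end of that item.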

6.4 SLR(1)
• LR(1) is very powerful.
• However, the goto and action tables of LR(1) are too big because the machine has too many states.
• Two alternatives to LR(1):
    SLR(1): LR(0) machine + lookahead
    LALR(1): merge states of the LR(1) machine

• SLR(1) uses the LR(0) machine to define the goto table.
• Only the action table is different:
1. If state S contains the item B → ρ· , then action[S,a] = reduce with B → ρ for all a ∈ Follow(B).
2. If state S contains an item B → α·aβ, where a ∈ Vt, then action[S,a] = shift.
3. If state S contains the item S → α·$ , then action[S,$] = accept.
4. Otherwise, action[S,a] = error.
* Show the action table for the previous example.

Follow(T) = { + * ) $ }
Follow(E) = { + ) $ }
Follow(P) = { + * ) $ }
Note states 7, 10, and 11.
Figure 6.18 CFSM for G3
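With these Follow sets, the SLR(1) rule resolves the LR(0) conflict between E → T· and T → T·*P. The sketch below hard-codes the Follow sets listed above rather than computing them. Items are hypothetical `(lhs, rhs, dot)` tuples, `slr1_actions` is a name introduced here, and the accept rule is left out because the state shown does not need it.

```python
TERMINALS = {"+", "*", "ID", "(", ")", "$"}
FOLLOW = {                       # copied from the Follow sets for G3
    "E": {"+", ")", "$"},
    "T": {"+", "*", ")", "$"},
    "P": {"+", "*", ")", "$"},
}


def slr1_actions(state):
    """Map each lookahead token to the actions allowed by the SLR(1) rules."""
    table = {}
    for lhs, rhs, dot in state:
        if dot == len(rhs):
            # B -> rho . : reduce on every token in Follow(B)
            for a in FOLLOW[lhs]:
                table.setdefault(a, set()).add(("reduce", lhs, rhs))
        elif rhs[dot] in TERMINALS:
            # B -> alpha . a beta : shift on a
            table.setdefault(rhs[dot], set()).add(("shift",))
    return table


# The inadequate LR(0) item set { E -> T . , T -> T . * P }.
state = {("E", ("T",), 1), ("T", ("T", "*", "P"), 1)}
table = slr1_actions(state)
conflicts = {a for a, acts in table.items() if len(acts) > 1}

for token in sorted(table):
    print(token, sorted(table[token]))
print("conflicts:", conflicts)
```

The conflict disappears because * is not in Follow(E). The same state would remain inadequate if the shifted terminal also belonged to the Follow set of the reduced nonterminal.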

Figure 6.19 SLR(1) action function for G3

• G is SLR(1) iff there is no conflict in the action table.
• Show example on pp. 164-165.
• For inadequate states (7 and 11), we use Follow sets to resolve conflicts.
• For adequate states, we may or may not use Follow sets.
• Ex. state 10:
    LR(0) always reduces
    SLR(1) ??
• Trade-off:
    early error detection
    smaller tables

6.4.2
• The lookahead in the LR(1) machine is computed from the context, whereas the lookahead in SLR(1) is taken from the Follow sets.
• LR(1) lookaheads are more precise.

Some grammars are LR(1) but not SLR(1).
Ex. G4:
    E → ( L , E )
    E → S
    S → ID
    S → ( S )
    L → L , E
    L → E
Part of the LR(0) machine:
    State 0: E → ·(L,E)
             E → ·S
             S → ·ID
             S → ·(S)
    State 0 on ( goes to
    State 1: E → (·L,E)
             S → (·S)
             L → ·L,E
             L → ·E
             E → ·(L,E)
             E → ·S
             S → ·ID
             S → ·(S)
    State 1 on S goes to
    State 2: S → (S·)
             E → S·

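• Now Follow(E) = { ) ... }
• So in SLR(1):
    action[2,)] = shift     (from S → (S·))
    action[2,)] = reduce    (from E → S·, since ) ∈ Follow(E))
• There is a conflict!
• This grammar cannot be SLR(k) for any k.
• This grammar is LR(1). In the LR(1) machine the item E → S· in this state carries only the lookahead "," so:
    action[2,,] = reduce
    action[2,)] = shift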
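The claim about Follow(E) can be checked by computing the Follow sets of G4 directly. This is a minimal sketch under stated assumptions: G4 is used as listed, without an augmenting production, so no end marker appears in any Follow set, and the First computation relies on G4 having no λ-productions. The names `first_sets` and `follow_sets` are this sketch's own.

```python
G4 = [
    ("E", ("(", "L", ",", "E", ")")),
    ("E", ("S",)),
    ("S", ("ID",)),
    ("S", ("(", "S", ")")),
    ("L", ("L", ",", "E")),
    ("L", ("E",)),
]
NONTERMS = {lhs for lhs, _ in G4}


def first_sets(grammar):
    """First sets, assuming no λ-productions (true for G4)."""
    first = {nt: set() for nt in NONTERMS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            x = rhs[0]
            new = first[x] if x in NONTERMS else {x}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def follow_sets(grammar, first):
    """Follow sets by fixed-point iteration, again assuming no λ-productions."""
    follow = {nt: set() for nt in NONTERMS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            for i, x in enumerate(rhs):
                if x not in NONTERMS:
                    continue
                if i + 1 < len(rhs):
                    nxt = rhs[i + 1]
                    new = first[nxt] if nxt in NONTERMS else {nxt}
                else:
                    new = follow[lhs]      # x ends the RHS: inherit Follow(LHS)
                if not new <= follow[x]:
                    follow[x] |= new
                    changed = True
    return follow


FIRST = first_sets(G4)
FOLLOW = follow_sets(G4, FIRST)
# State 2 holds S -> (S.) , which shifts ")", and E -> S. , which SLR(1)
# reduces on every token in Follow(E).
slr_conflict = ")" in FOLLOW["E"]
print(sorted(FOLLOW["E"]), slr_conflict)
```

Follow(E) contains ")" because of E → ( L , E ), even though that context can never occur when the parser is in state 2. This is the imprecision that the LR(1) lookaheads avoid.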

Lookahead in SLR(1) is inexact.
Ex. state 4 in Fig. 6.16, with Follow(T) = { * $ + ) }
In LR(1):
    action[4,$] = reduce with T → P
    action[4,+] = reduce with T → P
    action[4,*] = reduce with T → P
    action[14,)] = reduce with T → P
    action[14,+] = reduce with T → P
    action[14,*] = reduce with T → P
By construction, in SLR(1), in state 4 of Fig. 6.18:
    action[4,$] = reduce with T → P
    action[4,+] = reduce with T → P
    action[4,*] = reduce with T → P
    action[4,)] = reduce with T → P

• Since SLR(1) uses inexact lookahead, some unnecessary conflicts may occur.
• Solution:
    change the grammar
    change the parser (e.g. LALR(1))

6.5 LALR(1)
• Merge states of the LR(1) machine that differ only in the lookaheads.
• Ex. Consider the LR(1) machine of G3 on p. 160 (Fig. 6.16).
1. States 13 and 15 are combined into state 10:
    13: P → (E)·    { $ + * }
    15: P → (E)·    { ) + * }
    10: P → (E)·    { $ + * ) }
2. States 4 and 14 are combined into state 4:
    4:  T → P·      { $ + * }
    14: T → P·      { ) + * }
    4:  T → P·      { $ + * ) }

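3. States 3 and 17 are combined into state 3:
    3:  E → E+·T    { $ + }
        T → ·T*P    { $ + * }
        T → ·P      { $ + * }
        P → ·ID     { $ + * }
        P → ·(E)    { $ + * }
    17: E → E+·T    { ) + }
        T → ·T*P    { ) + * }
        T → ·P      { ) + * }
        P → ·ID     { ) + * }
        P → ·(E)    { ) + * }
    3:  E → E+·T    { $ + ) }
        T → ·T*P    { $ + * ) }
        T → ·P      { $ + * ) }
        P → ·ID     { $ + * ) }
        P → ·(E)    { $ + * ) }
4. states 11, 20
5. states 12, 16
6. states 7, 19
7. states 9, 22
8. states 8, 21
9. states 6, 18
10. states 5, 10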
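The merging step can be sketched as grouping LR(1) states by their core, that is, by their items with the lookaheads stripped, and taking the union of the lookaheads. The representation is an assumption of this sketch: each state is a list of hypothetical `(lhs, rhs, dot, lookaheads)` tuples, and only the four states from examples 1 and 2 are entered, copied from the lists above.

```python
def merge_by_core(lr1_states):
    """Group LR(1) states with identical cores and union their lookaheads.

    Returns a list of (state names, {core item: merged lookahead set}).
    """
    groups = {}
    for name, items in lr1_states.items():
        core = frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in items)
        names, lookaheads = groups.setdefault(core, ([], {}))
        names.append(name)
        for lhs, rhs, dot, la in items:
            lookaheads.setdefault((lhs, rhs, dot), set()).update(la)
    return list(groups.values())


P_PAREN = ("P", ("(", "E", ")"), 3)   # P -> ( E ) .
T_P = ("T", ("P",), 1)                # T -> P .

LR1_STATES = {
    13: [P_PAREN + ({"$", "+", "*"},)],
    15: [P_PAREN + ({")", "+", "*"},)],
    4:  [T_P + ({"$", "+", "*"},)],
    14: [T_P + ({")", "+", "*"},)],
}

merged = merge_by_core(LR1_STATES)
for names, lookaheads in merged:
    for item, la in lookaheads.items():
        print(names, item, sorted(la))
```

Because the core determines the outgoing transitions, merged states have consistent goto entries. Only the lookahead sets grow, which is where new reduce-reduce conflicts can appear.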

Figure 6.22 LALR(1) machine for G3
23 states in LR(1), 11 states in LALR(1)

Construct the action table of LALR(1) (same as LR(1)):
1. If state S contains the item B → ρ·, { a }, then action[S,a] = reduce with B → ρ.
2. If state S contains an item B → α·aβ, L, then action[S,a] = shift.
3. If state S contains the item S → α·$, { λ }, then action[S,$] = accept.
4. Else action[S,a] = error.
G is LALR(1) iff there is no conflict in the action table.
