Chapter 3-3

Chapter 3-3 Chang Chi-Chung 2015.07.03

Bottom-Up Parsing • LR methods (Left-to-right scan, Rightmost derivation in reverse) • LR(0), SLR, Canonical LR = LR(1), LALR • Rightmost derivation • No right recursion • Unambiguous grammar • Other special cases: • Shift-reduce parsing • Operator-precedence parsing

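Bottom-Up Parsing • Grammar: E → E + T | T, T → T * F | F, F → ( E ) | id • Rightmost derivation (read left to right): E ⇒ T ⇒ T * F ⇒ T * id ⇒ F * id ⇒ id * id • Reduction (read right to left): a bottom-up parser traces this rightmost derivation in reverse, reducing id * id back to E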

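LR(0) • An LR parser makes shift-reduce decisions by maintaining states that keep track of where it is in a parse. • An item of a grammar G is a production of G with a dot at some position of the body. • Example: the production A → X Y Z yields four items: A → ．X Y Z, A → X．Y Z, A → X Y ．Z, A → X Y Z ． • In the item A → X．Y Z, the part before the dot (X) corresponds to symbols already on the stack, and the part after the dot (Y Z) to what is expected next: further derivations that match the coming input strings. • Note that the production A → ε has only one item, [A → •]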
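The definition of an item can be illustrated with a few lines of Python. This is only a sketch: the function name `lr0_items` and the use of "." for the dot are choices made here, not part of the slides.

```python
def lr0_items(head, body):
    """Return every LR(0) item of the production head -> body as a string.

    body is a list of grammar symbols; an empty list stands for epsilon.
    """
    items = []
    # The dot can sit at any of len(body) + 1 positions in the body.
    for dot in range(len(body) + 1):
        symbols = list(body[:dot]) + ["."] + list(body[dot:])
        items.append(f"{head} -> {' '.join(symbols)}")
    return items


for item in lr0_items("A", ["X", "Y", "Z"]):
    print(item)
# An epsilon production has exactly one item.
print(lr0_items("A", []))
```

A production with n symbols in its body therefore has n + 1 items.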

LR(0) • Canonical LR(0) Collection • One collection of sets of LR(0) items • Provides the basis for constructing a DFA that is used to make parsing decisions. • LR(0) automaton • Constructing the canonical LR(0) collection for a grammar requires: • An augmented grammar • If G is a grammar with start symbol S, then G’ is the augmented grammar for G with new start symbol S’ and new production S’ → S • Closure function • Goto function

Shift-Reduce Parsing • Shift-reduce parsing is a form of bottom-up parsing • A stack holds grammar symbols • An input buffer holds the rest of the string to be parsed. • Shift-reduce parser actions • shift • reduce • accept • error

Shift-Reduce Parsing • Shift • Shift the next input symbol onto the top of the stack. • Reduce • The right end of the string to be reduced must be at the top of the stack. • Locate the left end of the string within the stack and decide with what nonterminal to replace the string. • Accept • Announce successful completion of parsing • Error • Discover a syntax error and call recovery routine

Conflicts During Shift-Reduce Parsing • Conflict types • shift-reduce • reduce-reduce • Shift-reduce and reduce-reduce conflicts are caused by • The limitations of the LR parsing method (even when the grammar is unambiguous) • Ambiguity of the grammar

Use of the LR(0) Automaton

Function Closure • If I is a set of items for a grammar G, then closure(I) is the set of items constructed from I. • Create closure(I) by the two rules: • Add every item in I to closure(I). • If A → α．Bβ is in closure(I) and B → γ is a production, then add the item B → ．γ to closure(I). Apply this rule until no more new items can be added to closure(I). • Items are divided into two classes • Kernel items • The initial item S’ → ．S, and all items whose dots are not at the left end. • Nonkernel items • All items with their dots at the left end, except for S’ → ．S

Example • The grammar G: E’ → E, E → E + T | T, T → T * F | F, F → ( E ) | id • Let I = { E’ → ．E }, then closure(I) = { E’ → ．E, E → ．E + T, E → ．T, T → ．T * F, T → ．F, F → ．( E ), F → ．id }
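The closure rules can be sketched in Python as a worklist loop. The representation is an assumption made for this sketch: an item is a tuple (head, body, dot), the grammar is a dict from nonterminal to a list of bodies, and the names `GRAMMAR`, `closure` and `show` are illustrative.

```python
# Expression grammar, augmented with E' -> E.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    """Return the LR(0) closure of a set of items (head, body, dot)."""
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        # Only an item with the dot in front of a nonterminal B adds anything.
        if dot < len(body) and body[dot] in grammar:
            for gamma in grammar[body[dot]]:
                new_item = (body[dot], gamma, 0)  # B -> . gamma
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def show(item):
    """Format an item with '.' marking the dot position."""
    head, body, dot = item
    return f"{head} -> {' '.join(body[:dot] + ('.',) + body[dot:])}"


I0 = closure({("E'", ("E",), 0)}, GRAMMAR)
for item in sorted(I0):
    print(show(item))
```

Starting from the single kernel item E' -> . E, the loop adds the six nonkernel items, giving the seven items of closure(I).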

Exercise • The grammar G: E’ → E, E → E + T | T, T → T * F | F, F → ( E ) | id • Let I = { E → E +．T }. Find closure(I).

Function Goto • Function Goto(I, X) • I is a set of items • X is a grammar symbol • Goto(I, X) is defined to be the closure of the set of all items [A → αX．β] such that [A → α．Xβ] is in I. • The Goto function is used to define the transitions in the LR(0) automaton for a grammar.

Example • The grammar G • E’ → E • E → E + T | T • T → T * F | F • F → ( E ) | id • I = { E’ → E．, E → E．+ T } • Goto(I, +) = { E → E +．T, T → ．T * F, T → ．F, F → ．( E ), F → ．id }
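A minimal Python sketch of Goto, under the same assumed representation (items as (head, body, dot) tuples, grammar as a dict); the helper names are illustrative only.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    """LR(0) closure of a set of items (head, body, dot)."""
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in grammar:
            for gamma in grammar[body[dot]]:
                new_item = (body[dot], gamma, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {
        (head, body, dot + 1)
        for (head, body, dot) in items
        if dot < len(body) and body[dot] == symbol
    }
    return closure(moved, grammar)


# I = { E' -> E . , E -> E . + T }
I = {("E'", ("E",), 1), ("E", ("E", "+", "T"), 1)}
for head, body, dot in sorted(goto(I, "+", GRAMMAR)):
    print(head, "->", " ".join(body[:dot] + (".",) + body[dot:]))
```

Only E -> E . + T has the dot in front of +, so the kernel of the result is E -> E + . T and the remaining four items come from the closure.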

Constructing the LR(0) Collection
1. The grammar is augmented with a new start symbol S’ and production S’ → S
2. Initially, set C = closure({[S’ → •S]}) (this is the start state of the DFA)
3. For each set of items I ∈ C and each grammar symbol X ∈ (N ∪ T) such that GOTO(I, X) ∉ C and GOTO(I, X) ≠ ∅, add the set of items GOTO(I, X) to C
4. Repeat 3 until no more sets can be added to C
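The construction can be sketched in Python as follows, applied to the expression grammar used in the earlier examples. The data layout (items as (head, body, dot), states as frozensets) and the function names are assumptions of this sketch, not something the slides prescribe.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    result, worklist = set(items), list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in grammar:
            for gamma in grammar[body[dot]]:
                new_item = (body[dot], gamma, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar):
    moved = {(h, b, d + 1) for (h, b, d) in items if d < len(b) and b[d] == symbol}
    return closure(moved, grammar)


def canonical_collection(grammar, start):
    """Return (states, transitions) of the LR(0) automaton."""
    symbols = sorted(
        set(grammar) | {s for bodies in grammar.values() for body in bodies for s in body}
    )
    states = [closure({(start, grammar[start][0], 0)}, grammar)]  # step 2
    transitions = {}
    # The list grows while we iterate, so it doubles as the worklist (steps 3-4).
    for i, state in enumerate(states):
        for x in symbols:
            target = goto(state, x, grammar)
            if not target:  # GOTO(I, X) is empty
                continue
            if target not in states:
                states.append(target)
            transitions[(i, x)] = states.index(target)
    return states, transitions


states, transitions = canonical_collection(GRAMMAR, "E'")
print(len(states), "sets of items,", len(transitions), "transitions")
```

For this grammar the loop should stop with 12 sets of items; the numbering of the sets depends on the order in which symbols are tried, so it need not match any particular diagram.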

Model of an LR Parser

Structure of the LR Parsing Table • The parsing table consists of two parts: • A parsing-action function ACTION • A goto function GOTO • The ACTION function, ACTION[i, a], takes a state i and a terminal a (or $) and has one of four forms: • Shift j, where j is a state. • The parser shifts input a onto the stack, but uses state j to represent a. • Reduce A → β. • The parser reduces β on the top of the stack to head A. • Accept • The parser accepts the input and finishes parsing. • Error

Structure of the LR Parsing Table (1) • The GOTO function, GOTO[Ii, A], is defined on sets of items. • If GOTO[Ii, A] = Ij, then GOTO also maps a state i and a nonterminal A to state j.

LR-Parser Configurations
• Configuration ( = LR parser state): (s0 s1 s2 … sm, ai ai+1 … an $), where s0 s1 s2 … sm is the stack and ai ai+1 … an $ is the remaining input. Each state si represents a grammar symbol Xi, so the stack corresponds to $ X1 X2 … Xm.
• If action[sm, ai] = shift s, then push s (representing ai) and advance the input: (s0 s1 s2 … sm, ai ai+1 … an $) ⇒ (s0 s1 s2 … sm s, ai+1 … an $)
• If action[sm, ai] = reduce A → β and goto[sm-r, A] = s with r = |β|, then pop r symbols and push s (representing A): (s0 s1 s2 … sm, ai ai+1 … an $) ⇒ (s0 s1 s2 … sm-r s, ai ai+1 … an $)
• If action[sm, ai] = accept, then stop
• If action[sm, ai] = error, then attempt recovery
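The four moves can be sketched as a small table-driven loop in Python. The table below is an assumption of this sketch: it was worked out by hand for the tiny grammar 1. S → E, 2. E → id + E, 3. E → id, with state numbers chosen here, and error handling is reduced to raising an exception instead of attempting recovery.

```python
# Hand-built SLR table for: 1. S -> E   2. E -> id + E   3. E -> id
# (state numbers are this sketch's own and need not match any diagram).
ACTION = {
    (0, "id"): ("shift", 2),
    (1, "$"): ("accept",),
    (2, "+"): ("shift", 3),
    (2, "$"): ("reduce", 3),
    (3, "id"): ("shift", 2),
    (4, "$"): ("reduce", 2),
}
GOTO = {(0, "E"): 1, (3, "E"): 4}
PRODUCTIONS = {2: ("E", 3), 3: ("E", 1)}  # number -> (head, body length)


def lr_parse(tokens):
    """Run the LR driver; return the production numbers used, in reduce order."""
    stack = [0]                      # stack of states, s0 at the bottom
    tokens = list(tokens) + ["$"]    # input with end marker
    pos = 0
    reductions = []
    while True:
        act = ACTION.get((stack[-1], tokens[pos]))
        if act is None:              # empty entry means error
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if act[0] == "shift":
            stack.append(act[1])     # push s and advance the input
            pos += 1
        elif act[0] == "reduce":
            head, r = PRODUCTIONS[act[1]]
            del stack[len(stack) - r:]                # pop r = |beta| states
            stack.append(GOTO[(stack[-1], head)])     # push goto[s(m-r), A]
            reductions.append(act[1])
        else:                        # accept
            return reductions


print(lr_parse(["id", "+", "id"]))
```

The returned list is a rightmost derivation in reverse: for id + id the parser first reduces the last id by production 3 and then the whole string by production 2.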

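Example LR Parse Table • Grammar: 1. E → E + T  2. E → T  3. T → T * F  4. T → F  5. F → ( E )  6. F → id • The table has one row per state and two groups of columns, action and goto. • In the action columns, an entry such as s5 means shift & goto 5, and an entry such as r1 means reduce by production #1.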

Example • Grammar: 0. S → E  1. E → E + T  2. E → T  3. T → T * F  4. T → F  5. F → ( E )  6. F → id

SLR Grammars • SLR (Simple LR): a simple extension of LR(0) shift-reduce parsing • SLR eliminates some conflicts by populating the parsing table with reductions A → α only on symbols in FOLLOW(A) • Example grammar: S → E, E → id + E, E → id • State I0: S → •E, E → •id + E, E → •id • State I2: E → id•+ E, E → id• • In I2: shift on +; FOLLOW(E) = {$}, thus reduce on $ • Edge labels in the diagram: goto(I0,id), goto(I3,+)

SLR Parsing Table • Reductions do not fill entire rows • Otherwise the same as LR(0) • Grammar: 1. S → E  2. E → id + E  3. E → id • In the state containing E → id•+ E and E → id•: shift on +; FOLLOW(E) = {$}, thus reduce on $

Constructing SLR Parsing Tables
1. Augment the grammar with S’ → S
2. Construct the collection C = {I0, I1, …, In} of sets of LR(0) items
3. State i is constructed from Ii:
 • If [A → α•aβ] ∈ Ii and goto(Ii, a) = Ij, then set action[i, a] = shift j
 • If [A → α•] ∈ Ii, then set action[i, a] = reduce A → α for all a ∈ FOLLOW(A) (apply only if A ≠ S’)
 • If [S’ → S•] is in Ii, then set action[i, $] = accept
4. If goto(Ii, A) = Ij, then set goto[i, A] = j (this fills the goto table)
5. Repeat 3-4 until no more entries are added
6. The initial state i is the Ii holding item [S’ → •S]

Example SLR Grammar and LR(0) Items
• Augmented grammar: 0. C’ → C  1. C → A B  2. A → a  3. B → a
• I0 = closure({[C’ → •C]}), I1 = goto(I0, C) = closure({[C’ → C•]}), …
• State I0 (start): C’ → •C, C → •A B, A → •a
• State I1 = goto(I0, C) (final): C’ → C•
• State I2 = goto(I0, A): C → A•B, B → •a
• State I3 = goto(I0, a): A → a•
• State I4 = goto(I2, B): C → A B•
• State I5 = goto(I2, a): B → a•

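Example SLR Parsing Table
• Grammar: 0. C’ → C  1. C → A B  2. A → a  3. B → a
• State I0: C’ → •C, C → •A B, A → •a
• State I1: C’ → C•
• State I2: C → A•B, B → •a
• State I3: A → a•
• State I4: C → A B•
• State I5: B → a•
• DFA transitions (state 0 is the start state): from 0 on C to 1, on A to 2, on a to 3; from 2 on B to 4, on a to 5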
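The table-filling rules for SLR can be sketched in Python for this small grammar. Several things here are assumptions of the sketch: the FOLLOW sets are written in by hand (FOLLOW(C) = {$}, FOLLOW(A) = {a}, FOLLOW(B) = {$}) rather than computed, the symbol order is chosen so that the generated state numbers line up with I0 to I5 above, and all names are illustrative.

```python
GRAMMAR = {"C'": [("C",)], "C": [("A", "B")], "A": [("a",)], "B": [("a",)]}
PRODUCTIONS = [("C'", ("C",)), ("C", ("A", "B")), ("A", ("a",)), ("B", ("a",))]
SYMBOLS = ["C", "A", "B", "a"]  # order chosen so states come out as I0..I5
FOLLOW = {"C": {"$"}, "A": {"a"}, "B": {"$"}}  # worked out by hand (assumption)


def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                new_item = (body[dot], gamma, 0)
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return frozenset(result)


def goto(items, x):
    return closure({(h, b, d + 1) for (h, b, d) in items if d < len(b) and b[d] == x})


def collection():
    states, edges = [closure({("C'", ("C",), 0)})], {}
    for i, state in enumerate(states):  # list grows while iterating
        for x in SYMBOLS:
            target = goto(state, x)
            if target:
                if target not in states:
                    states.append(target)
                edges[(i, x)] = states.index(target)
    return states, edges


def slr_table(states, edges):
    action, goto_table = {}, {}

    def set_action(key, value):
        # Two different entries for one cell mean the grammar is not SLR.
        if action.get(key, value) != value:
            raise ValueError(f"conflict in action{key}")
        action[key] = value

    for (i, x), j in edges.items():
        if x in GRAMMAR:
            goto_table[(i, x)] = j            # nonterminal: goto entry
        else:
            set_action((i, x), ("shift", j))  # terminal: shift entry
    for i, state in enumerate(states):
        for head, body, dot in state:
            if dot == len(body):              # completed item A -> alpha .
                if head == "C'":
                    set_action((i, "$"), ("accept",))
                else:
                    for a in FOLLOW[head]:
                        set_action((i, a), ("reduce", PRODUCTIONS.index((head, body))))
    return action, goto_table


states, edges = collection()
action, goto_table = slr_table(states, edges)
for key in sorted(action):
    print(key, action[key])
```

Because A → a is reduced only on a and B → a only on $, the two completed items never compete for the same table cell, and no conflict is reported.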

SLR and Ambiguity
• Every SLR grammar is unambiguous, but not every unambiguous grammar is SLR (it may still be LR(1))
• Consider for example the unambiguous grammar S → L = R | R, L → * R | id, R → L
• FOLLOW(R) = {=, $}
• I0: S’ → •S, S → •L=R, S → •R, L → •*R, L → •id, R → •L
• I1: S’ → S•
• I2: S → L•=R, R → L•
• I3: S → R•
• I4: L → *•R, R → •L, L → •*R, L → •id
• I5: L → id•
• I6: S → L=•R, R → •L, L → •*R, L → •id
• I7: L → *R•
• I8: R → L•
• I9: S → L=R•
• In state 2, action[2,=] = s6 (from S → L•=R) and action[2,=] = r5 (reduce by R → L, because = is in FOLLOW(R)). This is a shift-reduce conflict, so the grammar has no SLR parsing table.
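The FOLLOW set behind the conflict can be checked with a short fixed-point computation. This sketch assumes a grammar without ε-productions (true for this example), which lets FIRST of a string be read off its first symbol; the function names are illustrative.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("L", "=", "R"), ("R",)],
    "L": [("*", "R"), ("id",)],
    "R": [("L",)],
}


def first_sets(grammar):
    """FIRST for each nonterminal; assumes there are no epsilon-productions."""
    first = {n: set() for n in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                x = body[0]
                new = first[x] if x in grammar else {x}
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def follow_sets(grammar, start):
    """FOLLOW for each nonterminal; assumes there are no epsilon-productions."""
    first = first_sets(grammar)
    follow = {n: set() for n in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, x in enumerate(body):
                    if x not in grammar:
                        continue
                    if i + 1 < len(body):
                        nxt = body[i + 1]   # symbol right after x
                        new = first[nxt] if nxt in grammar else {nxt}
                    else:
                        new = follow[head]  # x ends the body
                    if not new <= follow[x]:
                        follow[x] |= new
                        changed = True
    return follow


follow = follow_sets(GRAMMAR, "S'")
print("FOLLOW(R) =", sorted(follow["R"]))
print("conflict on '=' in I2:", "=" in follow["R"])
```

Since = ends up in FOLLOW(R), the SLR rule places a reduce by R → L in the same cell as the shift on = required by S → L•=R.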

LR(1) Grammars • SLR is too simple • LR(1) parsing uses lookahead to avoid unnecessary conflicts in the parsing table • LR(1) item = LR(0) item + lookahead • LR(0) item: [A → α•β] • LR(1) item: [A → α•β, a]

SLR Versus LR(1) • Split the SLR states by adding LR(1) lookahead • Unambiguous grammar: S → L = R | R, L → * R | id, R → L • The SLR state I2: S → L•=R, R → L• is split by lookahead • The item S → L•=R gives action[2,=] = s6 • The item R → L• should not be reduced on =, because no right-sentential form begins with R =

LR(1) Items • An LR(1) item [A → α•β, a] contains a lookahead terminal a, meaning: α is already on top of the stack, and we expect to see βa • For items of the form [A → α•, a], the lookahead a is used to reduce A → α only if the next input is a • For items of the form [A → α•β, a] with β ≠ ε, the lookahead has no effect

The Closure Operation for LR(1) Items
1. Start with closure(I) = I
2. If [A → α•Bβ, a] ∈ closure(I), then for each production B → γ in the grammar and each terminal b ∈ FIRST(βa), add the item [B → •γ, b] to closure(I) if not already there
3. Repeat 2 until no new items can be added

The Goto Operation for LR(1) Items
1. For each item [A → α•Xβ, a] ∈ I, add the set of items closure({[A → αX•β, a]}) to goto(I, X) if not already there
2. Repeat step 1 until no more items can be added to goto(I, X)

Example • The grammar G • S’ → S • S → C C • C → cC | d • Let I = { [S’ → •S, $] } • I0 = closure(I) = { [S’ → •S, $], [S → •C C, $], [C → •cC, c/d], [C → •d, c/d] } • goto(I0, S) = closure({ [S’ → S•, $] }) = { [S’ → S•, $] } = I1
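The LR(1) closure and goto operations can be sketched in Python for this grammar. The sketch assumes there are no ε-productions, so FIRST(βa) is FIRST of the first symbol of β, or {a} when β is empty; items are represented as (head, body, dot, lookahead) tuples, and every name below is illustrative.

```python
GRAMMAR = {"S'": [("S",)], "S": [("C", "C")], "C": [("c", "C"), ("d",)]}


def first_of(symbol, grammar, visiting=frozenset()):
    """FIRST of a single symbol; assumes no epsilon-productions."""
    if symbol not in grammar:
        return {symbol}                 # a terminal is its own FIRST set
    if symbol in visiting:
        return set()                    # already being expanded on this path
    result = set()
    for body in grammar[symbol]:
        result |= first_of(body[0], grammar, visiting | {symbol})
    return result


def closure(items, grammar):
    """LR(1) closure of a set of items (head, body, dot, lookahead)."""
    result, worklist = set(items), list(items)
    while worklist:
        head, body, dot, a = worklist.pop()
        if dot < len(body) and body[dot] in grammar:
            beta = body[dot + 1:]
            # FIRST(beta a): first symbol of beta, or a itself if beta is empty.
            lookaheads = first_of(beta[0], grammar) if beta else {a}
            for gamma in grammar[body[dot]]:
                for b in lookaheads:
                    new_item = (body[dot], gamma, 0, b)
                    if new_item not in result:
                        result.add(new_item)
                        worklist.append(new_item)
    return frozenset(result)


def goto(items, x, grammar):
    moved = {(h, b, d + 1, a) for (h, b, d, a) in items if d < len(b) and b[d] == x}
    return closure(moved, grammar)


I0 = closure({("S'", ("S",), 0, "$")}, GRAMMAR)
for head, body, dot, a in sorted(I0):
    print(f"[{head} -> {' '.join(body[:dot] + ('.',) + body[dot:])}, {a}]")
I1 = goto(I0, "S", GRAMMAR)
```

Each item with lookahead c/d in the slide notation appears here as two separate tuples, one per lookahead, so I0 holds six tuples.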

Exercise • The grammar G • S’ → S • S → C C • C → cC | d • Let I= { (S → C •C, $) } • I2 = closure(I) = ? • I3 = goto(I2, c) = ?

Construction of the sets of LR(1) Items
1. Augment the grammar with a new start symbol S’ and production S’ → S
2. Initially, set C = closure({[S’ → •S, $]}) (this is the start state of the DFA)
3. For each set of items I ∈ C and each grammar symbol X ∈ (N ∪ T) such that goto(I, X) ∉ C and goto(I, X) ≠ ∅, add the set of items goto(I, X) to C
4. Repeat 3 until no more sets can be added to C
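A possible Python sketch of this construction, run on the grammar S’ → S, S → C C, C → cC | d. It assumes no ε-productions (so FIRST(βa) depends only on the first symbol of β), represents items as (head, body, dot, lookahead) tuples, and uses illustrative names throughout.

```python
GRAMMAR = {"S'": [("S",)], "S": [("C", "C")], "C": [("c", "C"), ("d",)]}


def first_of(symbol, grammar, visiting=frozenset()):
    """FIRST of a single symbol; assumes no epsilon-productions."""
    if symbol not in grammar:
        return {symbol}
    if symbol in visiting:
        return set()
    result = set()
    for body in grammar[symbol]:
        result |= first_of(body[0], grammar, visiting | {symbol})
    return result


def closure(items, grammar):
    result, worklist = set(items), list(items)
    while worklist:
        head, body, dot, a = worklist.pop()
        if dot < len(body) and body[dot] in grammar:
            beta = body[dot + 1:]
            lookaheads = first_of(beta[0], grammar) if beta else {a}
            for gamma in grammar[body[dot]]:
                for b in lookaheads:
                    new_item = (body[dot], gamma, 0, b)
                    if new_item not in result:
                        result.add(new_item)
                        worklist.append(new_item)
    return frozenset(result)


def goto(items, x, grammar):
    moved = {(h, b, d + 1, a) for (h, b, d, a) in items if d < len(b) and b[d] == x}
    return closure(moved, grammar)


def lr1_collection(grammar, start):
    """Return the list of LR(1) item sets and the transition map."""
    symbols = sorted(
        set(grammar) | {s for bodies in grammar.values() for body in bodies for s in body}
    )
    states = [closure({(start, grammar[start][0], 0, "$")}, grammar)]  # step 2
    transitions = {}
    for i, state in enumerate(states):  # the growing list is the worklist
        for x in symbols:
            target = goto(state, x, grammar)
            if target:
                if target not in states:
                    states.append(target)
                transitions[(i, x)] = states.index(target)
    return states, transitions


states, transitions = lr1_collection(GRAMMAR, "S'")
print(len(states), "sets of LR(1) items")
```

Sets with the same LR(0) items but different lookaheads are kept apart, which is why the LR(1) collection is larger than the LR(0) collection of the same grammar; here it should contain 10 sets.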

LR(1) Automaton

Construction of the Canonical LR(1) Parsing Tables
1. Augment the grammar with S’ → S
2. Construct the collection C = {I0, I1, …, In} of sets of LR(1) items
3. State i of the parser is constructed from Ii:
 • If [A → α•aβ, b] ∈ Ii and goto(Ii, a) = Ij, then set action[i, a] = shift j
 • If [A → α•, a] ∈ Ii, then set action[i, a] = reduce A → α (apply only if A ≠ S’)
 • If [S’ → S•, $] is in Ii, then set action[i, $] = accept
4. If goto(Ii, A) = Ij, then set goto[i, A] = j
5. Repeat 3 until no more entries are added
6. The initial state i is the Ii holding item [S’ → •S, $]

Example • The grammar G • S’ → S • S → C C • C → cC | d

Example Grammar and LR(1) Items • Unambiguous LR(1) grammar: S → L = R | R, L → * R | id, R → L • Augment with S’ → S • LR(1) items (next slide)

I0: [S’ → •S, $] goto(I0,S)=I1; [S → •L=R, $] goto(I0,L)=I2; [S → •R, $] goto(I0,R)=I3; [L → •*R, =/$] goto(I0,*)=I4; [L → •id, =/$] goto(I0,id)=I5; [R → •L, $] goto(I0,L)=I2
I1: [S’ → S•, $]
I2: [S → L•=R, $] goto(I2,=)=I6; [R → L•, $]
I3: [S → R•, $]
I4: [L → *•R, =/$] goto(I4,R)=I7; [R → •L, =/$] goto(I4,L)=I8; [L → •*R, =/$] goto(I4,*)=I4; [L → •id, =/$] goto(I4,id)=I5
I5: [L → id•, =/$]
I6: [S → L=•R, $] goto(I6,R)=I9; [R → •L, $] goto(I6,L)=I10; [L → •*R, $] goto(I6,*)=I11; [L → •id, $] goto(I6,id)=I12
I7: [L → *R•, =/$]
I8: [R → L•, =/$]
I9: [S → L=R•, $]
I10: [R → L•, $]
I11: [L → *•R, $] goto(I11,R)=I13; [R → •L, $] goto(I11,L)=I10; [L → •*R, $] goto(I11,*)=I11; [L → •id, $] goto(I11,id)=I12
I12: [L → id•, $]
I13: [L → *R•, $]

Example LR(1) Parsing Table • Grammar: 1. S’ → S  2. S → L = R  3. S → R  4. L → * R  5. L → id  6. R → L

LALR(1) Grammars • LR(1) parsing tables have many states • LALR(1) parsing (Look-Ahead LR) combines LR(1) states to reduce table size • Less powerful than LR(1) • Will not introduce shift-reduce conflicts, because shifts do not use lookaheads • May introduce reduce-reduce conflicts, but seldom does so for grammars of programming languages • SLR and LALR tables for a grammar always have the same number of states, which is typically far fewer than the LR(1) table has. • For a language like C: SLR and LALR > 100 states, LR(1) > 1000 states

Constructing LALR Parsing Tables • Two ways • Construction of the LALR parsing table from the sets of LR(1) items. • Union the states that share the same core • Requires much space and time • Construction of the LALR parsing table from the sets of LR(0) items • Efficient • Used in practice.

Example LALR(1) LR(1)

Constructing LALR(1) Parsing Tables
1. Construct sets of LR(1) items
2. Combine LR(1) sets with sets of items that share the same first part (the same LR(0) items, differing only in lookahead)
• I4: [L → *•R, =], [R → •L, =], [L → •*R, =], [L → •id, =]
• I11: [L → *•R, $], [R → •L, $], [L → •*R, $], [L → •id, $]
• Combined: [L → *•R, =/$], [R → •L, =/$], [L → •*R, =/$], [L → •id, =/$], where =/$ is shorthand for two items in the same set
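The merge step can be sketched in Python by first building the LR(1) sets for the grammar S → L = R | R, L → * R | id, R → L and then grouping them by their first parts. This follows the space-hungry route through the full LR(1) collection, not the efficient LR(0)-based method; it assumes no ε-productions, and the names (`merge_by_core` in particular) are illustrative.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("L", "=", "R"), ("R",)],
    "L": [("*", "R"), ("id",)],
    "R": [("L",)],
}


def first_of(symbol, grammar, visiting=frozenset()):
    """FIRST of a single symbol; assumes no epsilon-productions."""
    if symbol not in grammar:
        return {symbol}
    if symbol in visiting:
        return set()
    result = set()
    for body in grammar[symbol]:
        result |= first_of(body[0], grammar, visiting | {symbol})
    return result


def closure(items, grammar):
    result, worklist = set(items), list(items)
    while worklist:
        head, body, dot, a = worklist.pop()
        if dot < len(body) and body[dot] in grammar:
            beta = body[dot + 1:]
            lookaheads = first_of(beta[0], grammar) if beta else {a}
            for gamma in grammar[body[dot]]:
                for b in lookaheads:
                    new_item = (body[dot], gamma, 0, b)
                    if new_item not in result:
                        result.add(new_item)
                        worklist.append(new_item)
    return frozenset(result)


def goto(items, x, grammar):
    moved = {(h, b, d + 1, a) for (h, b, d, a) in items if d < len(b) and b[d] == x}
    return closure(moved, grammar)


def lr1_collection(grammar, start):
    symbols = sorted(
        set(grammar) | {s for bodies in grammar.values() for body in bodies for s in body}
    )
    states = [closure({(start, grammar[start][0], 0, "$")}, grammar)]
    for state in states:  # the growing list is the worklist
        for x in symbols:
            target = goto(state, x, grammar)
            if target and target not in states:
                states.append(target)
    return states


def merge_by_core(states):
    """Union all LR(1) sets whose items agree once lookaheads are dropped."""
    merged = {}
    for state in states:
        core = frozenset((h, b, d) for (h, b, d, _) in state)
        merged.setdefault(core, set()).update(state)
    return [frozenset(items) for items in merged.values()]


lr1_states = lr1_collection(GRAMMAR, "S'")
lalr_states = merge_by_core(lr1_states)
print(len(lr1_states), "LR(1) sets ->", len(lalr_states), "LALR(1) sets")
```

A full LALR construction would also redirect the goto transitions to the merged sets; the sketch only shows how the sets themselves collapse.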

Example LALR(1) Grammar • Unambiguous LR(1) grammar: S → L = R | R, L → * R | id, R → L • Augment with S’ → S • LALR(1) items (next slide)

I0: [S’ → •S, $] goto(I0,S)=I1; [S → •L=R, $] goto(I0,L)=I2; [S → •R, $] goto(I0,R)=I3; [L → •*R, =/$] goto(I0,*)=I4; [L → •id, =/$] goto(I0,id)=I5; [R → •L, $] goto(I0,L)=I2
I1: [S’ → S•, $]
I2: [S → L•=R, $] goto(I2,=)=I6; [R → L•, $]
I3: [S → R•, $]
I4: [L → *•R, =/$] goto(I4,R)=I7; [R → •L, =/$] goto(I4,L)=I9; [L → •*R, =/$] goto(I4,*)=I4; [L → •id, =/$] goto(I4,id)=I5
I5: [L → id•, =/$]
I6: [S → L=•R, $] goto(I6,R)=I8; [R → •L, $] goto(I6,L)=I9; [L → •*R, $] goto(I6,*)=I4; [L → •id, $] goto(I6,id)=I5
I7: [L → *R•, =/$]
I8: [S → L=R•, $]
I9: [R → L•, =/$] (shorthand for the two items [R → L•, =] and [R → L•, $])

Example LALR(1) Parsing Table • Grammar: 1. S’ → S  2. S → L = R  3. S → R  4. L → * R  5. L → id  6. R → L

LL, SLR, LR, LALR Summary • LL parse tables • Nonterminals × terminals → productions • Computed using FIRST/FOLLOW • LR parsing tables • LR states × terminals → shift/reduce actions • LR states × nonterminals → goto state transitions • Computed using closure/goto • A grammar is • LL(1) if its LL(1) parse table has no conflicts • SLR if its SLR parse table has no conflicts • LR(1) if its LR(1) parse table has no conflicts • LALR(1) if its LALR(1) parse table has no conflicts

LL, SLR, LR, LALR Grammars • Diagram of grammar classes: LR(0) ⊂ SLR ⊂ LALR(1) ⊂ LR(1), and LL(1) ⊂ LR(1)
