
Compiler Structures




  1. Compiler Structures 241-437, Semester 1, 2011-2012. 6. Bottom-up (LR) Parsing • Objective • describe bottom-up (LR) parsing using shift-reduce and parse tables • explain how LR parse tables are generated

  2. Overview 1. What is an LR Parser? 2. Bottom-up using Shift-Reduce 3. Building an LR Parser 4. Generating the Parse Table 5. LR Conflicts 6. LL, SLR, LR, LALR Grammars

  3. Compiler pipeline: Source Program, then the Front End (Lexical Analyzer, Syntax Analyzer, Semantic Analyzer, Int. Code Generator), then Intermediate Code, then the Back End (Code Optimizer, Target Code Generator), then Target Lang. Prog. In this lecture: the Syntax Analyzer, but concentrating on bottom-up parsing.

  4. 1. What is an LR Parser? • An LR parser reads its input tokens from Left-to-right and produces a Rightmost derivation (traced in reverse, because each reduction undoes one derivation step). • The parse tree is built bottom-up, starting from the leaves and working upwards to the start symbol.

  5. LR in Action: parse "a b b c d e"
     Grammar:
       S => a A B e
       A => A b c | b
       B => d
     The tree corresponds to a rightmost derivation (each line rewrites the rightmost non-terminal of the line above):
       S
       a A B e
       a A d e
       a A b c d e
       a b b c d e
     Reducing a sentence (the same steps in reverse; each reduced substring matches a production's right-hand side):
       a b b c d e
       a A b c d e
       a A d e
       a A B e
       S
     [Figure: four snapshots of the parse tree growing bottom-up over the leaves a b b c d e]

  6. LR(k) Parsing • The k is the number of input tokens that are looked at when deciding which production to use. • e.g. LR(0), LR(1) • We'll be using a variation of LR(0) parsing in this chapter.

  7. LR versus LL • LR can deal with more complex (powerful) grammars than LL (top-down parsers). • LR can detect errors quicker than LL. • LR parsers can be implemented very efficiently, but they're difficult to build by hand (unlike LL parsers).

  8. 2. Bottom-up using Shift-Reduce • The usual way of implementing bottom-up parsing is by using shift-reduce: • ‘shift’ means read in a new input token, and push it onto a stack • ‘reduce’ means to group several symbols into a single non-terminal • by choosing a production to use 'backwards' • the symbols are popped off the stack, and the production's non-terminal is pushed onto it

  9. Shift-Reduce Parsing
     Grammar:
       S => a A B e
       A => A b c | b
       B => d

     Stack        Input            Action
     $            a b b c d e $    Shift
     $ a          b b c d e $      Shift
     $ a b        b c d e $        Reduce A => b
     $ a A        b c d e $        Shift
     $ a A b      c d e $          Shift
     $ a A b c    d e $            Reduce A => A b c
     $ a A        d e $            Shift
     $ a A d      e $              Reduce B => d
     $ a A B      e $              Shift
     $ a A B e    $                Reduce S => a A B e
     $            $
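
The trace above can be replayed mechanically. The following Python is a minimal sketch, not part of the original slides: the function name `replay` and the way actions are encoded (the string "shift", or a `(head, body)` pair for a reduce) are assumptions made for illustration. The action sequence is supplied by hand, because a bare shift-reduce loop has no way of choosing between shift and reduce; that choice is what the parse table provides later. The final reduce here leaves S on the stack rather than emptying it.

```python
def replay(tokens, actions):
    """Replay hand-chosen shift/reduce actions; return (stack, rest, trace)."""
    stack, rest, trace = [], list(tokens), []
    for act in actions:
        if act == "shift":
            label = "Shift"
        else:
            head, body = act
            label = f"Reduce {head} => {' '.join(body)}"
        trace.append((" ".join(["$"] + stack), " ".join(rest + ["$"]), label))
        if act == "shift":
            stack.append(rest.pop(0))      # read one token and push it
        else:
            # Assumes a non-empty body (the example grammar has no empty rules).
            if tuple(stack[-len(body):]) != tuple(body):
                raise ValueError(f"{body} is not on top of the stack")
            del stack[-len(body):]         # pop the production's right-hand side
            stack.append(head)             # push its non-terminal
    return stack, rest, trace


A_b = ("A", ("b",))
A_Abc = ("A", ("A", "b", "c"))
B_d = ("B", ("d",))
S_aABe = ("S", ("a", "A", "B", "e"))
ACTIONS = ["shift", "shift", A_b, "shift", "shift", A_Abc,
           "shift", B_d, "shift", S_aABe]

final_stack, remaining, trace = replay("a b b c d e".split(), ACTIONS)
for stack_txt, input_txt, label in trace:
    print(f"{stack_txt:<12}{input_txt:<16}{label}")
```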

  10. 3. Building an LR Parser • The standard way of writing a shift-reduce LR parser is to generate a parse table for the grammar, and 'plug' that into a standard LR parser framework. • The table has two main parts: actions and gotos.

  11. 3.1. Inside an LR Parser
      • input tokens: a1 a2 … ai … an $
      • stack (push; pop), from top to bottom: Xm sm, Xm-1 sm-1, …, X0 s0
      • the LR Parser uses the input, the stack, and the parse table to produce the output (parse tree)
      • parse table (you create this bit): actions and gotos
      • X is a terminal or non-terminal, s = state
      • possible actions are shift, reduce, accept, error
      • gotos involve state changes

  12. Parse Table for the Example
      Grammar:
        1: S => a A B e
        2: A => A b c
        3: A => b
        4: B => d

              Action part               Goto part
      State | a   b   c   d   e   $   | S   A   B
      ------+-------------------------+------------
        0   | s1                      |
        1   |     s3                  |     2
        2   |     s5      s6          |         4
        3   |     r3      r3          |
        4   |                 s7      |
        5   |         s8              |
        6   |                 r4      |
        7   |                     acc |
        8   |     r2      r2          |

      s means shift to that state
      r means reduce by that numbered production

  13. 3.2. Table Algorithm
      push(<$,0>);    /* push <symbol,state> pair */
      currToken = scanner();
      while (1) {
        <x,state> = pair on top of stack;
        /* 4 branches for the four possible actions that can be in a table cell */
        if (action[state, currToken] == <shift newState>) {
          push(<currToken, newState>);
          currToken = scanner();
        }
        : :
      continued

  14.   else if (action[state, currToken] == <reduce ruleNum>) {
          A --> β is rule number ruleNum;
          bodySize = numElements(β);
          pop bodySize pairs off stack;
          state' = state part of pair on top of stack;
          push(<A, goto[state', A]>);
        }
        : :
      continued

  15.   else if (action[state, currToken] == accept) {
          S --> β is the start symbol production;
          bodySize = numElements(β);
          pop bodySize pairs off stack;
          state' = state part of pair on top of stack;
          if (state' == 0)
            break;    // success; can now stop
          else
            error();
        }
        else
          error();
      }  // of while loop
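
The table algorithm can be collected into one runnable sketch. The Python below is an illustration under stated assumptions, not the course's own code: the dictionary layout (`ACTION`, `GOTO`, `RULES`) and the function name `lr_parse` are invented here, while the table entries are copied from the example parse table (rules 1: S => a A B e, 2: A => A b c, 3: A => b, 4: B => d). Empty table cells are treated as errors.

```python
# Rule number -> (head, number of symbols in the body)
RULES = {1: ("S", 4), 2: ("A", 3), 3: ("A", 1), 4: ("B", 1)}

ACTION = {
    (0, "a"): ("shift", 1),
    (1, "b"): ("shift", 3),
    (2, "b"): ("shift", 5), (2, "d"): ("shift", 6),
    (3, "b"): ("reduce", 3), (3, "d"): ("reduce", 3),
    (4, "e"): ("shift", 7),
    (5, "c"): ("shift", 8),
    (6, "e"): ("reduce", 4),
    (7, "$"): ("accept", 1),
    (8, "b"): ("reduce", 2), (8, "d"): ("reduce", 2),
}
GOTO = {(1, "A"): 2, (2, "B"): 4}


def lr_parse(tokens):
    """Return the rule numbers used, in order, or raise SyntaxError."""
    stack = [("$", 0)]                    # <symbol, state> pairs
    rest = list(tokens) + ["$"]
    pos, used = 0, []
    while True:
        state = stack[-1][1]
        tok = rest[pos]
        kind, arg = ACTION.get((state, tok), ("error", None))
        if kind == "shift":
            stack.append((tok, arg))
            pos += 1
        elif kind == "reduce":
            head, size = RULES[arg]
            del stack[len(stack) - size:]             # pop the body
            exposed = stack[-1][1]                    # state' in the slides
            stack.append((head, GOTO[(exposed, head)]))
            used.append(arg)
        elif kind == "accept":
            head, size = RULES[arg]
            del stack[len(stack) - size:]
            if stack[-1][1] != 0:
                raise SyntaxError("accept did not return to state 0")
            used.append(arg)
            return used
        else:
            raise SyntaxError(f"unexpected {tok!r} in state {state}")


print(lr_parse("a b b c d e".split()))    # [3, 2, 4, 1]
```

The returned list is the rightmost derivation in reverse: rule 3 is applied first and the start rule last.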

  16. 3.3. Table Parsing Example
      Grammar:
        S => a A B e
        A => A b c | b
        B => d

      Stack              Input            Action
      $0                 a b b c d e $    Shift 1
      $0,a1              b b c d e $      Shift 3
      $0,a1,b3           b c d e $        Reduce A => b (pop 1 pair; state' == 1; push(A,goto(1, A)) = push(A,2))
      $0,a1,A2           b c d e $        Shift 5
      $0,a1,A2,b5        c d e $          Shift 8
      $0,a1,A2,b5,c8     d e $            Reduce A => A b c (pop 3 pairs; state' == 1; push(A,goto(1, A)) = push(A,2))
      $0,a1,A2           d e $            Shift 6
      $0,a1,A2,d6        e $              Reduce B => d
      $0,a1,A2,B4        e $              Shift 7
      $0,a1,A2,B4,e7     $                Accept S => a A B e
      $0                 $

  17. 3.4. The LR Parse Stack • The parse stack holds the branches of the tree being built bottom-up. • For example, the stack $0,a1,A2,b5,c8 represents four branches: the leaf a, the subtree A (over the first b), and the leaves b and c. continued

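  18. The next stack, $0,a1,A2, represents two branches: the leaf a, and the subtree A built from A b c (whose inner A is over the first b). Later, $0,a1,A2,B4,e7 represents four branches: the leaf a, the same A subtree, the subtree B (over d), and the leaf e. continued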

  19. 4. Generating the Parse Table • The example parse table was generated using the SLR (simple LR) algorithm • an extension of LR(0) which uses the grammar's FOLLOW() sets • Other LR algorithms can also be used to make a parse table: • e.g. LR(1), LALR(1)

  20. Supporting Techniques • SLR table generation makes use of three techniques: • LR(0) items • the closure() function • the goto() function • I'll explain each one first, before the table generation algorithm.

  21. 4.1. LR(0) Items
      • An LR(0) item is a grammar production with a • at some position of the right-hand side.
      • So, a production A => X Y Z has four items:
          A => • X Y Z
          A => X • Y Z
          A => X Y • Z
          A => X Y Z •
      • Production A => ε has one item: A => •
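
One possible encoding of LR(0) items, shown as a small sketch: an item is a `(head, body, dot)` triple, where `dot` is the index of the • in the body. The names `lr0_items` and `show` are assumptions for illustration and do not come from the slides.

```python
def lr0_items(head, body):
    """All LR(0) items of head => body, as (head, body, dot) triples."""
    body = tuple(body)
    return [(head, body, dot) for dot in range(len(body) + 1)]


def show(item):
    """Render an item in the slide notation, e.g. 'A => X • Y Z'."""
    head, body, dot = item
    symbols = list(body[:dot]) + ["•"] + list(body[dot:])
    return f"{head} => {' '.join(symbols)}"


for item in lr0_items("A", ["X", "Y", "Z"]):
    print(show(item))
print(show(lr0_items("A", [])[0]))    # the single item of an empty production
```

A production with n symbols in its body always yields n + 1 items, which is why the empty production has exactly one.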

  22. 4.2. The closure() Function • The closure() function generates a set of LR(0) items. • Assume that the grammar only has one production for the start symbol S, S => β (β is the production body). • The initial closure set is: closure( { S => • β } ) continued

  23. • If A => α • B β is in the set, then for each production B => γ, add the item B => • γ to the set, if it's not already there. • Repeat until no new items can be added to the set.

  24. Example use of closure()
      Grammar:
        S --> E
        E --> E+T | T
        T --> T*F | F
        F --> (E)
        F --> id
      closure({ S --> • E }) is built up in steps:
        start:          { S --> • E }
        add E --> • …:  { S --> • E, E --> • E+T, E --> • T }
        add T --> • …:  { S --> • E, E --> • E+T, E --> • T, T --> • T*F, T --> • F }
        add F --> • …:  { S --> • E, E --> • E+T, E --> • T, T --> • T*F, T --> • F, F --> • (E), F --> • id }
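
The closure() rule can be sketched with a worklist. This Python version is an assumption-laden illustration, not the lecture's code: items are `(head, body, dot)` triples, the grammar is a dictionary from non-terminal to a list of bodies, and the name `GRAMMAR` is chosen here.

```python
GRAMMAR = {
    "S": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    """Add B => • γ for every non-terminal B that appears right after a dot."""
    result = set(items)
    work = list(items)
    while work:                            # repeat until nothing new is added
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in grammar:
            for prod in grammar[body[dot]]:
                new = (body[dot], prod, 0)
                if new not in result:
                    result.add(new)
                    work.append(new)
    return frozenset(result)


start = closure({("S", ("E",), 0)}, GRAMMAR)
for head, body, dot in sorted(start):
    print(head, "-->", " ".join(body[:dot] + ("•",) + body[dot:]))
```

Returning a `frozenset` is a design choice: it lets closure sets be compared for equality and reused as states later.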

  25. 4.3. The goto() Function
      Diagram: In --X--> In+1
      • goto(In, X) takes as input an existing closure set In, and a terminal/non-terminal symbol X.
      • The output is a new closure set In+1:
      • for each item A => α • X β in In, add closure({ A => α X • β }) to In+1
      • repeat until no more items can be added to In+1
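
A sketch of goto() in Python, under the same assumptions as before (items as `(head, body, dot)` triples, grammar as a dictionary; the function names are illustrative). It moves the dot over X in every item that allows it and takes the closure of the moved items together, which gives the same set as taking the closure item by item and merging. The grammar used is S => A B, A => a, B => b.

```python
GRAMMAR = {"S": [("A", "B")], "A": [("a",)], "B": [("b",)]}


def closure(items, grammar):
    """Add B => • γ for every non-terminal B that appears right after a dot."""
    result, work = set(items), list(items)
    while work:
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in grammar:
            for prod in grammar[body[dot]]:
                new = (body[dot], prod, 0)
                if new not in result:
                    result.add(new)
                    work.append(new)
    return frozenset(result)


def goto(item_set, symbol, grammar):
    """Move the dot over `symbol` wherever possible, then close the result."""
    moved = {(h, b, d + 1) for (h, b, d) in item_set
             if d < len(b) and b[d] == symbol}
    return closure(moved, grammar)


I0 = closure({("S", ("A", "B"), 0)}, GRAMMAR)
I1 = goto(I0, "A", GRAMMAR)    # { S => A • B, B => • b }
I2 = goto(I0, "a", GRAMMAR)    # { A => a • }
print(sorted(I1), sorted(I2))
```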

  26. goto() Example 1
      • Grammar:
          S => A B    // rule 1, for start symbol
          A => a
          B => b
      • Initial state I0 = closure( { S => • A B } ) = { S => • A B, A => • a }
      continued

  27. • goto( I0, A) = closure( { S => A • B } ) = { S => A • B, B => • b } // call it I1
      • goto( I0, a) = closure( { A => a • } ) = { A => a • } // call it I2
      Transitions so far: I0 --A--> I1, I0 --a--> I2
      continued

  28. • goto( I1, B) = closure( { S => A B • } ) = { S => A B • } // call it I3
      • this is the end of the S production, so I3 is the end state
      • goto( I1, b) = closure( { B => b • } ) = { B => b • } // call it I4
      Transitions: I0 --A--> I1, I0 --a--> I2, I1 --B--> I3 (end state), I1 --b--> I4

  29. goto() Example 2
      • Grammar:
          S => a A B e    // rule 1, for start symbol
          A => A b c | b
          B => d
      • Initial state I0 = closure( { S => • a A B e } ) = { S => • a A B e }
      continued

  30. • goto( I0, a) = closure( { S => a • A B e } ) = { S => a • A B e, A => • A b c, A => • b } // call it I1
      Transitions so far: I0 --a--> I1
      continued

  31. • goto( I1, A) = closure( { S => a A • B e, A => A • b c } ) = { S => a A • B e, A => A • b c, B => • d } // call it I2
      • goto( I1, b) = closure( { A => b • } ) = { A => b • } // call it I3
      Transitions so far: I0 --a--> I1, I1 --A--> I2, I1 --b--> I3
      continued

  32. • goto( I2, B) = closure( { S => a A B • e } ) = { S => a A B • e } // call it I4
      • Others
        • I5: { A => A b • c }
        • I6: { B => d • }
        • I7: { S => a A B e • } // end of start symbol rule
        • I8: { A => A b c • }
      Transitions: I0 --a--> I1, I1 --A--> I2, I1 --b--> I3, I2 --B--> I4, I2 --b--> I5, I2 --d--> I6, I4 --e--> I7, I5 --c--> I8
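
Applying goto() repeatedly until no new sets appear gives every state and transition for the example 2 grammar. The sketch below is illustrative only: `build_states` and the data layout are assumptions, and states are numbered in order of discovery, visiting each state's symbols in sorted order. For this grammar that order reproduces the numbering I0 to I8 used in the slides, but that is a property of this example, not a general guarantee.

```python
GRAMMAR = {
    "S": [("a", "A", "B", "e")],
    "A": [("A", "b", "c"), ("b",)],
    "B": [("d",)],
}


def closure(items, grammar):
    result, work = set(items), list(items)
    while work:
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in grammar:
            for prod in grammar[body[dot]]:
                new = (body[dot], prod, 0)
                if new not in result:
                    result.add(new)
                    work.append(new)
    return frozenset(result)


def goto(item_set, symbol, grammar):
    moved = {(h, b, d + 1) for (h, b, d) in item_set
             if d < len(b) and b[d] == symbol}
    return closure(moved, grammar)


def build_states(grammar, start="S"):
    """Return (states, edges); edges maps (state number, symbol) -> state number."""
    states = [closure({(start, grammar[start][0], 0)}, grammar)]
    edges = {}
    i = 0
    while i < len(states):                 # states grows while we scan it
        symbols = sorted({b[d] for (_, b, d) in states[i] if d < len(b)})
        for x in symbols:
            target = goto(states[i], x, grammar)
            if target not in states:
                states.append(target)
            edges[(i, x)] = states.index(target)
        i += 1
    return states, edges


states, edges = build_states(GRAMMAR)
print(len(states), "states")
for (src, sym), dst in sorted(edges.items()):
    print(f"I{src} --{sym}--> I{dst}")
```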

  33. 4.4. Using goto() to make a Table • The columns of the table should be the grammar's terminals, $, and non-terminals. • The rows should be the numbers 0, 1, …, n of the closure sets I0, I1, …, In. • what we've been calling states

  34. Stage 1 • In stage 1, we add the shift, goto, and accept entries to the table. • action[i, a] gets <shift j> if goto(Ii,a) = Ij • goto[ i, A ] gets j if goto( Ii, A) == Ij continued

  35. • action[i, $] gets accept if S => β • is in Ii (there must be only one S rule)

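  36. Example Grammar 1 (stage 1)
      Grammar:
        S --> A B
        A --> a
        B --> b
      Transitions: I0 --A--> I1, I0 --a--> I2, I1 --B--> I3, I1 --b--> I4

              action[]        goto[]
      State | a   b   $   | S   A   B
      ------+-------------+------------
        0   | s2          |     1
        1   |     s4      |         3
        2   |             |
        3   |         acc |
        4   |             |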

  37. Stage 2 • In stage 2, we add the reduce and error entries to the table. • action[i, a] gets <reduce ruleNum> if [A => α •] is in Ii and A is not S and the terminal a is in FOLLOW(A) and A => α is rule number ruleNum continued

  38. After filling the table cells with shift, goto, accept, and reduce actions, any remaining empty cells will trigger an error() call.
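
Stage 1 and stage 2 can be sketched together for the example 1 grammar. This is a hedged illustration rather than the lecture's algorithm verbatim: the inputs (`EDGES`, `COMPLETE`, `FOLLOW`) are written out by hand from the slides instead of being computed, all names are assumptions, and the function silently overwrites a cell if two actions compete for it, so it does not report conflicts.

```python
# Example grammar 1, with the slide's rule numbers.
RULES = {1: ("S", ("A", "B")), 2: ("A", ("a",)), 3: ("B", ("b",))}
# goto() results: (state, symbol) -> state
EDGES = {(0, "A"): 1, (0, "a"): 2, (1, "B"): 3, (1, "b"): 4}
# States holding a completed item (dot at the end): state -> rule number
COMPLETE = {2: 2, 3: 1, 4: 3}
# FOLLOW sets, taken from the slides rather than computed here
FOLLOW = {"A": {"b"}, "B": {"$"}}


def build_table(rules, edges, complete, follow, start="S"):
    nonterminals = {head for head, _ in rules.values()}
    action, goto_tab = {}, {}
    # Stage 1: shift and goto entries come straight from the transitions.
    for (state, sym), target in edges.items():
        if sym in nonterminals:
            goto_tab[(state, sym)] = target
        else:
            action[(state, sym)] = ("shift", target)
    for state, rule in complete.items():
        head, _ = rules[rule]
        if head == start:
            action[(state, "$")] = ("accept", rule)      # stage 1: accept
        else:
            for tok in follow[head]:                      # stage 2: reduce
                action[(state, tok)] = ("reduce", rule)
    return action, goto_tab      # any missing cell means error


action, goto_tab = build_table(RULES, EDGES, COMPLETE, FOLLOW)
for cell in sorted(action):
    print(cell, action[cell])
```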

  39. Finishing the Example Table • The reduce states are the state boxes at the leaves of the closure graph. • but exclude the end state • For the example 1 grammar (transitions I0 --A--> I1, I0 --a--> I2, I1 --B--> I3, I1 --b--> I4), there are two boxes at the leaves: I2 and I4.

  40. I2 Reduction (grammar: S --> A B; A --> a; B --> b) • I2 = { A => a • } • A => a is rule number 2 • FOLLOW(A) == FIRST(B) = { b } • So action[ 2, b ] gets <reduce 2>

  41. I4 Reduction (grammar: S --> A B; A --> a; B --> b) • I4 = { B => b • } • B => b is rule number 3 • FOLLOW(B) = { $ } • So action[ 4, $ ] gets <reduce 3>

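  42. Adding Reduce Entries
      Grammar:
        S --> A B
        A --> a
        B --> b
      Transitions: I0 --A--> I1, I0 --a--> I2, I1 --B--> I3, I1 --b--> I4

              action[]        goto[]
      State | a   b   $   | S   A   B
      ------+-------------+------------
        0   | s2          |     1
        1   |     s4      |         3
        2   |     r2      |
        3   |         acc |
        4   |         r3  |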

  43. Using the Example 1 Table
      Grammar:
        S --> A B
        A --> a
        B --> b

      Stack         Input    Action
      $0            a b $    Shift 2
      $0,a2         b $      Reduce 2 (A --> a): pop 1 pair; state' = 0; push(A, goto(0,A)) == push(A,1);
      $0,A1         b $      Shift 4
      $0,A1,b4      $        Reduce 3 (B --> b): pop 1 pair; state' = 1; push(B, goto(1,B)) == push(B,3);
      $0,A1,B3      $        Accept (S --> A B)
      $0            $

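  44. 4.5. Example Grammar 2, Stage 1
      Grammar:
        S --> a A B e
        A --> A b c | b
        B --> d
      Transitions: I0 --a--> I1, I1 --A--> I2, I1 --b--> I3, I2 --B--> I4, I2 --b--> I5, I2 --d--> I6, I4 --e--> I7, I5 --c--> I8

              action[]                    goto[]
      State | a   b   c   d   e   $   | S   A   B
      ------+-------------------------+------------
        0   | s1                      |
        1   |     s3                  |     2
        2   |     s5      s6          |         4
        3   |                         |
        4   |                 s7      |
        5   |         s8              |
        6   |                         |
        7   |                     acc |
        8   |                         |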

  45. Reduce States • For the example 2 grammar, there are three boxes at the leaves: I3, I6, and I8.

  46. I3 Reduction (grammar: S --> a A B e; A --> A b c; A --> b; B --> d) • I3 = { A => b • } • A => b is rule number 3 • FOLLOW(A) = {b} ∪ FIRST(B) = {b, d} • So action[ 3, b ] and action[ 3, d ] get <reduce 3>

  47. I6 Reduction (grammar: S --> a A B e; A --> A b c; A --> b; B --> d) • I6 = { B => d • } • B => d is rule number 4 • FOLLOW(B) = {e} • So action[ 6, e ] gets <reduce 4>

  48. I8 Reduction (grammar: S --> a A B e; A --> A b c; A --> b; B --> d) • I8 = { A => A b c • } • A => A b c is rule number 2 • FOLLOW(A) = {b, d} • So action[ 8, b ] and action[ 8, d ] get <reduce 2>
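
The FOLLOW sets used in the I3, I6, and I8 reductions can be computed by iterating to a fixed point. The sketch below is a simplified version that assumes the grammar has no empty (ε) productions, which holds for the example 2 grammar but not in general; the function names `first_sets` and `follow_sets` are illustrative.

```python
GRAMMAR = {
    "S": [("a", "A", "B", "e")],
    "A": [("A", "b", "c"), ("b",)],
    "B": [("d",)],
}


def first_sets(grammar):
    """FIRST for each non-terminal; assumes no empty productions."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                x = body[0]
                new = first[x] if x in grammar else {x}
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def follow_sets(grammar, start="S"):
    """FOLLOW for each non-terminal; assumes no empty productions."""
    first = first_sets(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, x in enumerate(body):
                    if x not in grammar:
                        continue               # only non-terminals have FOLLOW
                    if i + 1 < len(body):      # something comes after x
                        nxt = body[i + 1]
                        new = first[nxt] if nxt in grammar else {nxt}
                    else:                      # x ends the body
                        new = follow[head]
                    if not new <= follow[x]:
                        follow[x] |= new
                        changed = True
    return follow


follow = follow_sets(GRAMMAR)
print(follow)
```

Here FOLLOW(A) picks up b from A => A b c and d from FIRST(B) in S => a A B e, matching the sets used for the reduce entries.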

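  49. Adding Reduce Entries
      Grammar:
        S --> a A B e
        A --> A b c | b
        B --> d
      Transitions: I0 --a--> I1, I1 --A--> I2, I1 --b--> I3, I2 --B--> I4, I2 --b--> I5, I2 --d--> I6, I4 --e--> I7, I5 --c--> I8

              action[]                    goto[]
      State | a   b   c   d   e   $   | S   A   B
      ------+-------------------------+------------
        0   | s1                      |
        1   |     s3                  |     2
        2   |     s5      s6          |         4
        3   |     r3      r3          |
        4   |                 s7      |
        5   |         s8              |
        6   |                 r4      |
        7   |                     acc |
        8   |     r2      r2          |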

  50. 5. LR Conflicts • An LR conflict occurs when a cell in the action part of the parse table contains more than one action. • There are two kinds of conflict: • shift/reduce and reduce/reduce • Conflicts appear because of: • grammar ambiguity • limitations of the SLR parsing method (even when the grammar is unambiguous)
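
A table builder can detect conflicts at the moment it fills a cell. The helper below is a hypothetical sketch (the name `add_action`, the message format, and the state numbers in the usage lines are all made up for illustration): it stores an action unless the cell already holds a different one, in which case it reports which kind of conflict occurred.

```python
def add_action(table, state, token, new):
    """Store `new` in action[state, token]; return a conflict message or None."""
    old = table.get((state, token))
    if old is None or old == new:
        table[(state, token)] = new
        return None
    # Two different actions want the same cell: classify the conflict.
    kind = "reduce/reduce" if old[0] == new[0] == "reduce" else "shift/reduce"
    return f"{kind} conflict in action[{state}, {token}]: {old} vs {new}"


table = {}
# Hypothetical cells, of the sort an ambiguous grammar such as
# E => E + E | id produces: after E + E, both shifting + and reducing fit.
print(add_action(table, 4, "+", ("shift", 3)))     # None: cell was empty
print(add_action(table, 4, "+", ("reduce", 1)))    # shift/reduce conflict
print(add_action(table, 6, "$", ("reduce", 2)))    # None
print(add_action(table, 6, "$", ("reduce", 3)))    # reduce/reduce conflict
```

The first action stays in the cell when a conflict is reported; a real generator would have to pick a resolution policy or reject the grammar.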
