Learn how to construct efficient parsers through bottom-up syntax analysis, including LR(1) grammars and parser table construction. Study different solutions like SLR(1) and CLR(1) in the context of generating syntax trees or error reports for various types of grammars.
Bottom-Up Syntax Analysis
Mooly Sagiv
html://www.math.tau.ac.il/~msagiv/courses/wcc01.html
Textbook: Modern Compiler Implementation in C, Chapter 3
Efficient Parsers
• Pushdown automata
• Deterministic
• Report an error as soon as the input is not a prefix of a valid program
• Not usable for all context-free grammars
[Figure: bison takes a context-free grammar and produces a parser or reports “ambiguity errors”; the parser turns tokens into a parse tree]
Kinds of Parsers
• Top-Down (Predictive Parsing), LL
  • Construct parse tree in a top-down manner
  • Find the leftmost derivation
  • For every non-terminal and token predict the next production
• Bottom-Up, LR
  • Construct parse tree in a bottom-up manner
  • Find the rightmost derivation in reverse order
  • For every potential right-hand side and token decide when a production is found
Bottom-Up Syntax Analysis
• Input
  • A context-free grammar
  • A stream of tokens
• Output
  • A syntax tree or error
• Method
  • Construct parse tree in a bottom-up manner
  • Find the rightmost derivation in reverse order
  • For every potential right-hand side and token decide when a production is found
  • Report an error as soon as the input is not a prefix of a valid program
Plan
• Pushdown automata
• Bottom-up parsing (given a parser table)
• Constructing the parser table
• Interesting non-LR grammars
Pushdown Automaton
[Figure: pushdown automaton with an input tape "u t w $", a control unit driven by the parser-table, and a stack (labels V and $)]
Bottom-Up Parser Actions
• reduce A → β
  • Pop |β| symbols from the stack
  • Apply the associated action
  • Push a symbol goto[top, A] on the stack
• shift X
  • Push X onto the stack
  • Advance the input
• accept
  • Parsing is complete
• error
  • Report an error
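The four actions can be sketched as a table-driven loop. This is a minimal Python illustration, not the course's implementation: the table is hand-built for the grammar S → a S b | ε, the state numbers and the names ACTION, GOTO, PRODUCTIONS and parse are assumptions made for this example, and the stack holds only states rather than symbols.

```python
# Hypothetical SLR(1) table for  S -> a S b | epsilon  (state numbering is an assumption).
PRODUCTIONS = {
    "r1": ("S", 3),  # S -> a S b   : pop 3 entries
    "r2": ("S", 0),  # S -> epsilon : pop nothing
}
ACTION = {
    (0, "a"): ("shift", 1), (0, "b"): ("reduce", "r2"), (0, "$"): ("reduce", "r2"),
    (1, "a"): ("shift", 1), (1, "b"): ("reduce", "r2"), (1, "$"): ("reduce", "r2"),
    (2, "b"): ("shift", 3),
    (3, "b"): ("reduce", "r1"), (3, "$"): ("reduce", "r1"),
    (4, "$"): ("accept", None),
}
GOTO = {(0, "S"): 4, (1, "S"): 2}


def parse(tokens):
    """Return the reductions performed (a rightmost derivation in reverse)."""
    stack = [0]                      # stack of automaton states
    tokens = list(tokens) + ["$"]    # append the end marker
    pos = 0
    reductions = []
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:            # error: no table entry for (state, token)
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        kind, arg = entry
        if kind == "shift":          # push the new state and advance the input
            stack.append(arg)
            pos += 1
        elif kind == "reduce":       # pop |rhs| entries, then push goto[top, lhs]
            lhs, length = PRODUCTIONS[arg]
            if length:
                del stack[-length:]
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(arg)
        else:                        # accept
            return reductions
```

Under these assumptions, parse("aabb") returns the reductions in the order they are found, and parse("aab") raises SyntaxError once the end marker arrives while a b is still expected.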
A Parser Table for S → a S b | ε
• Manual construction?
The Challenge
• How to construct a parser-table from a given grammar
• LR(1) grammars
  • Left to right scanning
  • Rightmost derivations (reverse)
  • 1 token of look-ahead
• Different solutions
  • Operator precedence
  • SLR(1): Simple LR(1)
  • CLR(1): Canonical LR(1)
  • LALR(1): Look Ahead LR(1)
    • Yacc, Bison, JCUP
Grammar Hierarchy
[Figure: nested grammar classes, Non-ambiguous CFG ⊇ CLR(1) ⊇ LALR(1) ⊇ SLR(1), with LL(1) also inside CLR(1)]
Constructing an SLR parsing table
• Add a production S’ → S$
• Construct a finite automaton accepting “valid stack symbols”
• The states of the automaton become the states of the parsing table
• Determine shift operations
• Determine goto operations
• Construct reduce entries by analyzing the grammar
A Finite Automaton for S’ → S$, S → a S b | ε
[Figure: automaton with states 0, 1, 2, 3, 4 and transitions labeled a, a, S, b, S]
Constructing a Finite Automaton
• NFA
  • For X → X1 X2 … Xn
    • [X → X1 X2 … Xi . Xi+1 … Xn]
    • “prefixes of rhs (handles)”
    • X1 X2 … Xi is at the top of the stack and we expect Xi+1 … Xn
  • The initial state [S’ → .S$]
  • δ([X → X1 … Xi . Xi+1 … Xn], Xi+1) = [X → X1 … Xi Xi+1 . Xi+2 … Xn]
  • For every production Xi+1 → α: δ([X → X1 X2 … Xi . Xi+1 … Xn], ε) = [Xi+1 → .α]
• Convert into DFA
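The construction can be sketched in Python for the running grammar S’ → S$, S → a S b | ε. This is an illustration under stated assumptions: an item is a pair (production index, dot position), the function names are invented for this example, and the ε-moves and the NFA-to-DFA conversion are folded into one closure step instead of building the NFA explicitly. The end marker $ is not shifted, an assumption chosen so that the result has the same five states as the automaton on the slides; the state numbers it produces need not match the slides' numbering.

```python
# Productions as (lhs, rhs) pairs; index 0 is the added production S' -> S $.
GRAMMAR = [("S'", ("S", "$")), ("S", ("a", "S", "b")), ("S", ())]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Apply the epsilon moves: add [B -> . gamma] whenever the dot precedes B."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def build_dfa():
    """Return (state -> number, (number, symbol) -> number) for the item DFA."""
    start = closure({(0, 0)})        # contains the initial item [S' -> . S $]
    index = {start: 0}
    edges = {}
    work = [start]
    while work:
        state = work.pop(0)
        # Symbols right after a dot; "$" is left out (treated as accept, not shifted).
        symbols = sorted({GRAMMAR[p][1][d] for p, d in state
                          if d < len(GRAMMAR[p][1])} - {"$"})
        for sym in symbols:
            target = goto(state, sym)
            if target not in index:
                index[target] = len(index)
                work.append(target)
            edges[(index[state], sym)] = index[target]
    return index, edges


index, edges = build_dfa()
```

Each key of index is one DFA state, that is, a set of items; edges records the shift transitions (on tokens) and goto transitions (on non-terminals).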
NFA for S’ → S$, S → a S b | ε
[Figure: NFA over the items [S’ → .S$], [S’ → S.$], [S → .aSb], [S → a.Sb], [S → aS.b], [S → aSb.], [S → .], with transitions labeled a, S, b, S]
DFA
[Figure: DFA obtained from the NFA, with states
  {[S’ → .S$], [S → .aSb], [S → .]}
  {[S → a.Sb], [S → .aSb], [S → .]}
  {[S → aS.b]}
  {[S → aSb.]}
  {[S’ → S.$]}
and transitions labeled S, a, b, S, a]
Filling reduce entries
• For an item [A → α.] we need to know the tokens that can follow A in a derivation from S’
• Follow(A) = {t | S’ ⇒* αAtβ}
• See the textbook for an algorithm for constructing Follow from a given grammar
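One common way to compute Follow is a fixed-point iteration over the productions. The Python sketch below is an illustration and not necessarily the textbook's algorithm; the function name follow_sets and the grammar encoding as (lhs, rhs) pairs are assumptions made for this example.

```python
# Running grammar: S' -> S $ ; S -> a S b | epsilon
GRAMMAR = [("S'", ("S", "$")), ("S", ("a", "S", "b")), ("S", ())]


def follow_sets(grammar):
    """Compute Follow for every non-terminal by fixed-point iteration."""
    nonterminals = {lhs for lhs, _ in grammar}
    nullable = set()
    first = {n: set() for n in nonterminals}
    follow = {n: set() for n in nonterminals}

    def first_of(symbols):
        """First set of a symbol string, and whether it can derive epsilon."""
        out = set()
        for sym in symbols:
            if sym not in nonterminals:   # a token starts the string here
                out.add(sym)
                return out, False
            out |= first[sym]
            if sym not in nullable:
                return out, False
        return out, True

    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            firsts, derives_eps = first_of(rhs)
            if derives_eps and lhs not in nullable:
                nullable.add(lhs)
                changed = True
            if not firsts <= first[lhs]:
                first[lhs] |= firsts
                changed = True
            for i, sym in enumerate(rhs):
                if sym in nonterminals:
                    # Tokens that can start what comes after sym ...
                    after, rest_eps = first_of(rhs[i + 1:])
                    if rest_eps:          # ... plus Follow(lhs) if the rest can vanish
                        after |= follow[lhs]
                    if not after <= follow[sym]:
                        follow[sym] |= after
                        changed = True
    return follow


FOLLOW = follow_sets(GRAMMAR)
```

For the running grammar, FOLLOW["S"] is the set containing b and $, which agrees with the Follow(S) shown on the slides.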
Filling reduce entries for S’ → S$, S → a S b | ε
• Follow(S) = {b, $}
[Figure: the DFA annotated with reduce entries; states containing a completed item for S get a reduce action on the tokens in Follow(S)]
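Interesting Non SLR(1) Grammar
• S’ → S$
• S → L = R | R
• L → *R | id
• R → L
• Follow(R) = {$, =}
• Since = is in Follow(R), the state {[S → L.=R], [R → L.]} has a shift/reduce conflict on =
[Figure: partial DFA. The initial state {[S’ → .S$], [S → .L=R], [S → .R], [L → .*R], [L → .id], [R → .L]} moves on L to {[S → L.=R], [R → L.]}, which moves on = to {[S → L=.R], [R → .L], [L → .*R], [L → .id]}]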
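The reason this grammar is not SLR(1) can be checked mechanically. The Python sketch below is an illustration under stated assumptions: the Follow computation is a simplified one that is only valid for grammars without ε-productions (true of this grammar), and the state {[S → L.=R], [R → L.]} is written down by hand rather than generated.

```python
GRAMMAR = [
    ("S'", ("S", "$")),
    ("S", ("L", "=", "R")),
    ("S", ("R",)),
    ("L", ("*", "R")),
    ("L", ("id",)),
    ("R", ("L",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def follow_sets():
    """Follow sets by fixed-point iteration; assumes no epsilon-productions."""
    first = {n: set() for n in NONTERMINALS}
    follow = {n: set() for n in NONTERMINALS}

    def first_sym(sym):
        return first[sym] if sym in NONTERMINALS else {sym}

    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            # (target set, source set) pairs: target must contain source.
            updates = [(first[lhs], first_sym(rhs[0]))]
            for i, sym in enumerate(rhs):
                if sym in NONTERMINALS:
                    src = first_sym(rhs[i + 1]) if i + 1 < len(rhs) else follow[lhs]
                    updates.append((follow[sym], src))
            for target, src in updates:
                if not src <= target:
                    target |= src
                    changed = True
    return follow


follow = follow_sets()
shift_tokens = {"="}          # from the item [S -> L . = R]
reduce_tokens = follow["R"]   # SLR(1) reduces by R -> L on every token in Follow(R)
conflicts = shift_tokens & reduce_tokens
```

A non-empty conflicts set means the SLR(1) table would need both a shift and a reduce entry for the same state and token; here it contains the token =.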
LR(1) Parser
• Item [A → α.β, t]
  • α is at the top of the stack and we are expecting βt
• LR(1) State
  • Sets of items
• LALR(1) State
  • Merge LR(1) states whose items differ only in the look-ahead
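The LALR(1) merging step can be sketched as grouping LR(1) states by their core, meaning the items with the look-ahead dropped. In this Python illustration the item encoding (production, dot position, look-ahead), the function name merge_by_core and the three sample states are all assumptions written by hand for the example; they are not the output of a full LR(1) construction.

```python
from collections import defaultdict


def merge_by_core(lr1_states):
    """Union all LR(1) states that share the same core (items without look-ahead)."""
    merged = defaultdict(set)
    for state in lr1_states:
        core = frozenset((prod, dot) for prod, dot, _ in state)
        merged[core] |= state
    return [frozenset(items) for items in merged.values()]


# Hand-written sample states: two share the core {[L -> id .]}, one does not.
state_a = frozenset({("L -> id", 1, "="), ("L -> id", 1, "$")})
state_b = frozenset({("L -> id", 1, "$")})
state_c = frozenset({("R -> L", 1, "$")})

lalr_states = merge_by_core([state_a, state_b, state_c])
```

With these sample inputs, state_a and state_b collapse into a single state and state_c stays separate, so lalr_states has two elements.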
Interesting Non LR(1) Grammars
• Ambiguous
  • Arithmetic expressions
  • Dangling-else
• Common derived prefix
  • A → B1 a b | B2 a c
  • B1 → ε
  • B2 → ε
• Optional non-terminals
  • St → OptLab Ass
  • OptLab → id : | ε
  • Ass → id := Exp
Summary
• LR is a powerful technique
• Generates efficient parsers
• Generation tools exist
  • Bison, yacc, CUP
• But some grammars need to be tuned
  • Shift/Reduce conflicts
  • Reduce/Reduce conflicts
  • Efficiency of the generated parser