More LR Parsing and Bison

CPSC 388, Ellen Walker, Hiram College. Topics beyond SLR(1): SLR(k) parsing, with multiple-token lookahead (for shifts) and multiple-token follow information (for reductions); general LR(1) parsing, which includes lookaheads in the DFA construction; and LALR(1) parsing.


Presentation Transcript


  1. More LR Parsing and Bison
     CPSC 388
     Ellen Walker
     Hiram College

  2. More than SLR(1)
     • SLR(k) parsing
       • Multiple-token lookahead (for shifts) and multiple-token follow information (for reductions)
     • General LR(1) parsing
       • Include lookaheads in DFA construction
     • LALR(1) parsing
       • Simplified state diagram for general LR(1)
       • What YACC / Bison uses

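  3. LALR: LR(0) + Lookahead
     • NFA states are [LR(0) item, lookahead]
       • Examples: [S -> .(S), $], [S -> .a, )]
       • The symbol after the comma is the lookahead: the first token that may follow the RHS
     • Building the DFA
       • [S -> .(S), $] --(--> [S -> (.S), $]  (same as SLR; the lookahead is unchanged)
       • [S -> (.S), $] --ε--> [S -> .(S), )]  (propagate the lookahead)
       • Add an ε-transition for every S-rule and every token in First(γ a), where γ is what follows S in the original item and a is that item's lookahead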
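The lookahead propagation on this slide can be sketched as a closure computation over [item, lookahead] pairs. The code below is only an illustration for the slide's toy grammar S -> ( S ) | a. The names `Production`, `kGrammar`, `Item`, `firstOf` and `closure` are made up for this sketch, and it assumes the grammar has no ε-productions and no left recursion, so First(γ a) reduces to First of the first symbol of γ, or {a} when γ is empty.

```cpp
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Toy grammar from the slide: S -> ( S ) | a. Uppercase letters are nonterminals.
struct Production { char lhs; std::string rhs; };
const std::vector<Production> kGrammar = {{'S', "(S)"}, {'S', "a"}};

// An [item, lookahead] pair: production index, dot position, lookahead token.
using Item = std::tuple<int, int, char>;

bool isNonterminal(char c) { return c >= 'A' && c <= 'Z'; }

// First set of one symbol. Assumption: no epsilon-productions, no left recursion.
std::set<char> firstOf(char symbol) {
    if (!isNonterminal(symbol)) return {symbol};
    std::set<char> result;
    for (const Production& p : kGrammar) {
        if (p.lhs != symbol) continue;
        std::set<char> f = firstOf(p.rhs[0]);
        result.insert(f.begin(), f.end());
    }
    return result;
}

// For each [A -> alpha . B gamma, a], add [B -> . beta, b] for every b in First(gamma a).
std::set<Item> closure(std::set<Item> items) {
    std::vector<Item> work(items.begin(), items.end());
    while (!work.empty()) {
        const Item item = work.back();
        work.pop_back();
        const int prod = std::get<0>(item);
        const int dot = std::get<1>(item);
        const char la = std::get<2>(item);
        const std::string& rhs = kGrammar[prod].rhs;
        const int len = static_cast<int>(rhs.size());
        if (dot >= len || !isNonterminal(rhs[dot])) continue;
        // gamma is whatever follows the nonterminal after the dot.
        std::set<char> lookaheads =
            (dot + 1 < len) ? firstOf(rhs[dot + 1]) : std::set<char>{la};
        for (int i = 0; i < static_cast<int>(kGrammar.size()); ++i) {
            if (kGrammar[i].lhs != rhs[dot]) continue;
            for (char b : lookaheads) {
                Item next{i, 0, b};
                if (items.insert(next).second) work.push_back(next);
            }
        }
    }
    return items;
}
```

Calling `closure({Item{0, 1, '$'}})` starts from [S -> (.S), $] and adds items for both S-rules with lookahead `)`, which matches the ε-transition shown on the slide.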

  4. YACC / Bison
     • “Yet another Compiler-Compiler”
     • Given CFG, automatically creates LALR table
     • Using bison:
       • Input file: grammar.y
       • Output file: grammar.tab.c
       • Generic main reads lines, executes rules

  5. Structure of a Bison File
       Definitions, including direct code in %{ %}
       %%
       Rules of the grammar, with actions
       %%
       Additional code, e.g. int main() { return yyparse(); }

  6. Example: Expression Calculator
     • Rules describe the usual grammar
       • S’ -> exp
       • exp -> exp + term | exp - term | term
       • term -> term * factor | factor
       • factor -> NUMBER | ( exp )
     • Associated actions execute the arithmetic

  7. Bison Rule Syntax
     • LHS followed by :
     • Each alternative is followed by its action; alternatives are separated by |
     • ; after the last action
     • Example
         factor : NUMBER      {$$ = $1;}
                | '(' exp ')' {$$ = $2;}
                ;

  8. Bison Actions
     • Rules include actions in { } (code)
     • Predefined variables:
       • $$: value of the rule's result (type YYSTYPE, int by default)
       • $1: value of the first RHS symbol, $2: value of the second RHS symbol, etc.
     • Example
         exp : exp '+' term {$$ = $1 + $3;}
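One way to picture $$, $1 and $3 is as positions on the parser's value stack at the moment a rule is reduced. The sketch below is a simplified model, not the code Bison generates. The function name `reduceAdd` and the use of a plain `std::vector<int>` as the value stack are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Models the reduction  exp : exp '+' term {$$ = $1 + $3;}
// The values of the three RHS symbols are the top three stack entries,
// with $1 deepest and $3 on top.
void reduceAdd(std::vector<int>& values) {
    assert(values.size() >= 3);
    const std::size_t n = values.size();
    const int d1 = values[n - 3];  // $1: value of exp
    const int d3 = values[n - 1];  // $3: value of term ($2 belongs to '+', unused)
    const int result = d1 + d3;    // $$
    values.resize(n - 3);          // pop the values of the RHS symbols
    values.push_back(result);      // push $$ as the value of the new exp
}
```

For example, a stack holding 7, 2, 0, 5 (where 2, 0, 5 are the values for exp, '+', term) becomes 7, 7 after the reduction.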

  9. Bison and Flex together
     • Define tokens in the definitions section:
       • %token ID <val>  (<val> is a placeholder for the token's integer code)
       • Choose values > 256
     • Make sure lex.yy.c and yy.tab.c agree on the token ID definitions
       • #define ID val
     • Compile both together
       • g++ -o myparser yy.tab.c lex.yy.c -lfl

  10. Flex for Bison
      • Each rule should return a token type
        • E.g. return NUMBER;
      • In addition, a token value can be saved in the global variable yylval
        • E.g. yylval = myAtoI(yytext);

  11. Mixing Characters and Tokens
      • Don’t assign token values < 256
      • Allow characters to be their own tokens (rule):
          .        return(yytext[0]);
      • Or be specific:
          [-+*()]  return(yytext[0]);
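The contract between the scanner and the parser described on the last two slides can be imitated by hand. The sketch below is not flex output. `Lexer`, its `value` member (standing in for yylval) and the token code 258 for NUMBER are assumptions chosen for illustration. It returns NUMBER for digit strings, the character's own code for anything else, and 0 at end of input, as yylex does.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

const int NUMBER = 258;  // assumed token code, above the character range

struct Lexer {
    std::string input;
    std::size_t pos;
    int value;  // plays the role of yylval

    explicit Lexer(const std::string& text) : input(text), pos(0), value(0) {}

    // Returns NUMBER, a character code below 256, or 0 at end of input.
    int next() {
        while (pos < input.size() && (input[pos] == ' ' || input[pos] == '\t')) ++pos;
        if (pos >= input.size()) return 0;
        const unsigned char c = static_cast<unsigned char>(input[pos]);
        if (std::isdigit(c)) {
            value = 0;
            while (pos < input.size() &&
                   std::isdigit(static_cast<unsigned char>(input[pos]))) {
                value = value * 10 + (input[pos] - '0');
                ++pos;
            }
            return NUMBER;  // token type; the number itself is in value
        }
        ++pos;
        return c;  // like the flex rule:  . return(yytext[0]);
    }
};
```

Because character codes stay below 256 and named tokens sit above that range, a grammar can refer to '+' or '(' directly and to NUMBER by name without the two colliding.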

  12. A Few Gotchas
      • Bison (and flex) like tabs, not spaces
      • Beware of commenting out a closing } with //
      • C (++) requires functions to be declared before they are used
        • Copy signatures to top of file
        • “extern” for functions and variables defined in other files

  13. Bison Individual Homework
      • Use Bison to parse and interpret simple LISP-like commands
        • (cons a (cons b nil)) => (a b)
        • (cons (cons a nil) (cons b nil)) => ((a) b)
        • (car (cons a (cons b nil))) => a
        • (cdr (cons a (cons b nil))) => (b)
        • (cdr (cdr (cons a (cons b nil)))) => nil
      • See handout for details

  14. Error Handling in Bottom-Up (BU) Parsing
      • Error = blank entry in parsing table
      • To give specific error messages
        • Many error entries (but bigger table!)
      • Detect error before reducing when possible
        • LR(1) is better than SLR(1) here

  15. Recovery
      • Panic mode:
        • Pop states from the stack until the parse can be restarted
        • Advance input until a legal transition is available
      • Error productions
        • Treat “error” as a pseudotoken
        • Rules indicate how much to throw away
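The two panic-mode steps, combined with the "error" pseudotoken, can be sketched as a single routine over the state stack and the remaining input. This is a simplified model under stated assumptions, not the generated parser's actual code. `ErrorMove`, `recover` and the table layout (a map from a state to the state reached on "error" plus the tokens that are legal there) are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Hypothetical table entry: what happens when a state shifts "error".
struct ErrorMove {
    int target;                  // state reached by shifting the error pseudotoken
    std::set<int> resumeTokens;  // tokens with a legal action in that state
};

// Returns true if parsing can resume; pos is left at the token to resume on.
bool recover(std::vector<int>& stack,
             const std::map<int, ErrorMove>& errorMoves,
             const std::vector<int>& tokens,
             std::size_t& pos) {
    // Step 1: pop states until one has a transition on "error".
    while (!stack.empty() && errorMoves.count(stack.back()) == 0) stack.pop_back();
    if (stack.empty()) return false;  // no state can handle the error

    const ErrorMove& move = errorMoves.at(stack.back());
    stack.push_back(move.target);  // shift the error pseudotoken

    // Step 2: discard input until a token with a legal transition appears.
    while (pos < tokens.size() && move.resumeTokens.count(tokens[pos]) == 0) ++pos;
    return pos < tokens.size();
}
```

With a stack of states 0, 3, 7, an error move defined only for state 0, and end of input (token 0) as the only resume token, the routine pops 7 and 3, shifts "error" from state 0, and skips input up to the end marker.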

  16. Error Example
        command : exp   {cout << $1 << endl;}
                | error {yyerror("bad cmd");}
                ;
      • Once a command is in error, the parser will
        • Perform the error action
        • Delete tokens until a legal follow of command ($ here)
