1 / 10

Bottom-up parsing

Bottom-up parsing. Synthesize tree from fragments Automaton performs two actions: shift : push next symbol on stack reduce : replace symbols on stack Automaton synthesizes ( reduces ) when end of a production is recognized

senona
Télécharger la présentation

Bottom-up parsing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Bottom-up parsing • Synthesize tree from fragments • Automaton performs two actions: • shift: push next symbol on stack • reduce: replace symbols on stack • Automaton synthesizes (reduces) when end of a production is recognized • States of automaton encode synthesis so far, and expectation of pending non-terminals • Automaton has potentially large set of states • Technique more general than LL (k) (C) Edmond Schonberg, New-York University

  2. LR (k) parsing • Left-to-right, rightmost derivation with k-token lookahead. • Most general parsing technique for deterministic grammars. • In general, not practical: tables too large (10^6 states for C++, Ada). • Common subsets: SLR, LALR (1). (C) Edmond Schonberg, New-York University

  3. The states of the LR(0) automaton • An item is a point within a production, indicating that part of the production has been recognized: • A a . B b , • seen the expansion of a, expect to see expansion of B • A state is a set of items • Transition within states are determined by terminals and non-terminals • Parsing tables are built from automaton: • action: shift / reduce depending on next symbol • goto: change state depending on synthesized non-terminal (C) Edmond Schonberg, New-York University

  4. Building LR (0) states • If a state includes: A a . B b • it also includes every state that is the start of B: B . X Y Z • Informally: if I expect to see B next, I expect to see anything that B can start with, and so on: X . G H I • States are built by closure from individual items. (C) Edmond Schonberg, New-York University

  5. A grammar of expressions: initial state • E’ E • E E + T | T; -- left-recursion ok here. • T T * F | F; • F id | (E) • S0 = { E’ .E, E .E + T, E .T, F .id, F . ( E ) , T .T * F, T .F} (C) Edmond Schonberg, New-York University

  6. Adding states • If a state has itemA a .a b, and the next symbol in the input is a, we shifta on the stack and enter a state that contains item • A a a.b (as well as all other items brought in by closure) • if a state has as item A a. , this indicates the end of a production: reduce action. • If a state has an item A a .N b, then after a reduction that find an N, go to a state with A a N. b (C) Edmond Schonberg, New-York University

  7. The LR (0) states for expressions • S1 = { E’ E., E E. + T } • S2 = { E T., T T. * F } • S3 = { T F. } • S4 = { F (. E), } + S0 (by closure) • S5 = { F id. } • S6 = { E E +. T, T .T * F, T .F, F .id, F .(E)} • S7 = { T T *. F, F .id, F .(E)} • S8 = { F (E.), E E.+ T} • S9 = { E E + T., T T.* F} • S10 = { T T * F.}, S11 = {F (E).} (C) Edmond Schonberg, New-York University

  8. Building SLR tables • An arc between two states labeled with a terminal is a shift action. • An arc between two states labeled with a non-terminal is a goto action. • if a state contains an item A a. , (a reduce item) • the action is to reduce by this production, for all terminals in Follow (A). • If there are shift-reduce conflicts or reduce-reduce conflicts, more elaborate techniques are needed. (C) Edmond Schonberg, New-York University

  9. LR (k) parsing • Canonical LR (1): annotate each item with its own follow set: • (A -> a a.b , f ) • f is a subset of the follow set of A, because it is derived from a single specific production for A • A state that includes A -> a a.b is a reduce state only if next symbol is in f: fewer reduce actions, fewer conflicts, technique is more powerful than SLR (1) • Generalization: use sequences of k symbols in f • Disadvantage: state explosion: impractical in general, even for LR (1) (C) Edmond Schonberg, New-York University

  10. LALR (1) • Compute follow set for a small set of items • Tables no bigger than SLR (1) • Same power as LR (1), slightly worse error diagnostics • Incorporated into yacc, bison, etc. (C) Edmond Schonberg, New-York University

More Related