This lecture and tutorial cover LR Shift-Reduce Parsers, how they compare with LL parsers, and the JavaCup parser generator. Learn how to construct an LR Shift-Reduce parser, which combines a parsing table (a DFSM) with a stack of states and symbols. The parser reads the input string, shifts tokens onto the stack, and uses the parsing table to decide whether to shift or reduce. We cover the generation steps, including creating the finite state machine and building the parsing table from it, along with the concepts of items and closures.
Formal Aspects: Term 2, Week 4
• LECTURE: LR “Shift-Reduce” Parsers. The JavaCup Parser-Generator CREATES LR “Shift-Reduce” Parsers. They are very commonly used and can handle a larger class of grammars than LL parsers.
• TUTORIAL: How to use, and how to create, a Shift-Reduce Parser
LR “Shift-Reduce” Parsers
• An LR shift-reduce parser consists of a Parsing Table (a DFSM) PLUS a Stack of States and Symbols. States are numbered in the Table, and Symbols are tokens or non-terminals.
• The Parser is given a string which it has to parse. It shifts the tokens from the string onto the stack.
[Figure: Tokens are fed to the PARSING TABLE, which determines the ACTIONS ON THE STACK; the STACK holds States and Symbols.]
LR “Shift-Reduce” Parser - The Start
• Assume the string T1 T2 T3 T4 ..... Tn is input.
• The first token T1 from the Left of the string is input to the Table together with state 1. The Table is used to find out what to do: SHIFT or REDUCE.
EXAMPLE:
Stack 1: state 1
INPUT T1 .... consult table => SHIFT T1, move to state X
Stack 2 (top first): state X, T1, state 1
LR “Shift-Reduce” Parsers - General Workings
• Given a symbol and a state input to the Table, carry out the following (see PAGE 60 in Appel’s book):
• Sn (means “Shift symbol, move to state n”): Put the symbol onto the top of the stack; put the new state number n on top of it.
• Rk (means “Reduce with rule k”): Match the RHS of rule k with the top of the stack and REMOVE all the matched symbols together with their states; push the LHS of rule k onto the top of the stack; input the LHS of rule k plus the state now below it to the Table to find the state to push on top.
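The shift and reduce steps above can be sketched as a small driver loop. This is a minimal Python sketch, not JavaCup's implementation. The grammar (S ::= ( L ) | x, L ::= S | L , S), its rule numbering, the hand-built TABLE and the names parse and reduce_row are all assumptions made for illustration.

```python
# Assumed toy grammar (hypothetical rule numbers):
#   1: S ::= ( L )    2: S ::= x    3: L ::= S    4: L ::= L , S
# RULES maps a rule number to (LHS, number of symbols in the RHS).
RULES = {1: ("S", 3), 2: ("S", 1), 3: ("L", 1), 4: ("L", 3)}
TOKENS = ["(", ")", "x", ",", "$"]


def reduce_row(k):
    """A row that says 'reduce with rule k' under every token column."""
    return {tok: ("r", k) for tok in TOKENS}


# Hand-built table: ("s", n) = shift and move to state n, ("r", k) = reduce
# with rule k, ("g", n) = goto state n, ("a", None) = accept.
TABLE = {
    1: {"(": ("s", 3), "x": ("s", 2), "S": ("g", 4)},
    2: reduce_row(2),
    3: {"(": ("s", 3), "x": ("s", 2), "S": ("g", 7), "L": ("g", 5)},
    4: {"$": ("a", None)},
    5: {")": ("s", 6), ",": ("s", 8)},
    6: reduce_row(1),
    7: reduce_row(3),
    8: {"(": ("s", 3), "x": ("s", 2), "S": ("g", 9)},
    9: reduce_row(4),
}


def parse(tokens):
    """Return the rule numbers used, in order, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]   # "$" marks the end of the input
    stack = [1]                     # state 1, then symbol, state, symbol, ...
    used = []
    pos = 0
    while True:
        state, tok = stack[-1], tokens[pos]
        action = TABLE[state].get(tok)
        if action is None:
            raise SyntaxError(f"unexpected {tok!r} in state {state}")
        kind, arg = action
        if kind == "s":             # Sn: push the symbol, then state n
            stack += [tok, arg]
            pos += 1
        elif kind == "r":           # Rk: pop the RHS, push the LHS, then goto
            lhs, length = RULES[arg]
            del stack[len(stack) - 2 * length:]  # each symbol has a state on top
            goto = TABLE[stack[-1]].get(lhs)
            if goto is None:
                raise SyntaxError(f"no goto for {lhs} in state {stack[-1]}")
            stack += [lhs, goto[1]]
            used.append(arg)
        else:                       # accept
            return used
```

For example, parse("(x,x)") treats each character as one token and returns the list of rules in the order they were reduced, which is a rightmost derivation read backwards.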
To Create an LR(1) Parser
• We will now go through the steps required to BUILD a shift-reduce parser
• This method is embedded in JavaCup
Jargon 1: ITEM
• An ITEM is one of the grammar’s production rules with a “DOT” somewhere in its Right Hand Side.
• The DOT represents a notional parsing position: the symbols to the left of the DOT have already been seen.
• Example items from Grammar 3.1:
E ::= (.S,E)
E ::= (S,.E)
S ::= .S;S
S ::= .id := E
Jargon 2: Closure of an Item
• The CLOSURE of an item R (or set of items) is the smallest set C of items such that
• (1) C contains R, AND
• (2) IF there is a member of C of the form X ::= w .Y z where Y is a non-terminal, then ALL the defining production rules of Y must appear in C with the DOT at the start of their RHS.
• E.g. closure( E ::= (.S,E) ) is the set containing:
E ::= (.S,E)
S ::= .S ; S
S ::= .id := E
S ::= .print (L)
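The closure definition above can be sketched with a simple worklist. This is a minimal Python sketch under stated assumptions: an item is represented as a tuple (LHS, RHS, dot position), and the GRAMMAR dictionary is a transcription of Grammar 3.1 from Appel's book. Only the S rules and E ::= (S,E) appear on these slides, so the remaining productions are an assumption. The names closure and show are hypothetical.

```python
# Assumed transcription of Grammar 3.1: non-terminal -> list of right-hand sides.
GRAMMAR = {
    "S": [("S", ";", "S"), ("id", ":=", "E"), ("print", "(", "L", ")")],
    "E": [("id",), ("num",), ("E", "+", "E"), ("(", "S", ",", "E", ")")],
    "L": [("E",), ("L", ",", "E")],
}


def closure(items):
    """Smallest set containing `items` that satisfies condition (2)."""
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot = work.pop()
        # Is the DOT directly in front of a non-terminal Y?
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            y = rhs[dot]
            for production in GRAMMAR[y]:
                new_item = (y, production, 0)   # DOT at the start of the RHS
                if new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return result


def show(item):
    """Render an item in the slide notation, e.g. 'E ::= ( . S , E )'."""
    lhs, rhs, dot = item
    return f"{lhs} ::= " + " ".join(rhs[:dot] + (".",) + rhs[dot:])


start = ("E", ("(", "S", ",", "E", ")"), 1)     # E ::= (.S,E)
for item in sorted(closure({start})):
    print(show(item))
```

The worklist stops once no new item can be added, which is what makes the result the smallest such set.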
LR “Shift-Reduce” Parsers - Generation
• TWO STAGE PROCESS:
• 1: CREATE A FINITE STATE MACHINE WITH NODES = SETS OF ITEMS and ARCS ANNOTATED WITH NON-TERMINALS OR TOKENS
• 2: CREATE A PARSING TABLE FROM THE MACHINE
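1: CREATING THE FINITE STATE MACHINE
• To generate a new state from an old one:
newstate(w: SYMBOL, S: OLDSTATE) = closure( set of items of the form Z ::= .... w. .... where Z ::= .... .w .... is a member of S )
• That is: take every item of S that has w immediately after the DOT, move the DOT past w, and take the closure of the result.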
ALGORITHM TO CREATE FSM
• T = set of STATES in the FSM, E = set of TRANSITIONS
E = { } ; T = { closure( S’ ::= .S$ ) } ;
repeat
    for each state S in T
        for each item ‘Z ::= .... .w ....’ in S
            add newstate(w,S) to T
            add S --w--> newstate(w,S) to E
        end for
    end for
until E and T do not change
• NB ‘ACCEPT’ STATE OF FSM = newstate($, anystate)
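The algorithm above can be sketched directly in Python. This is a minimal sketch under stated assumptions: the toy grammar (S' ::= S $, S ::= ( L ) | x, L ::= S | L , S) is chosen for illustration, items are tuples (LHS, RHS, dot position), states are frozensets of items, and transitions on $ are not followed because they lead to the ACCEPT state. The names closure, newstate and build_fsm mirror the slide terminology but are otherwise hypothetical.

```python
# Assumed toy grammar: non-terminal -> list of right-hand sides.
GRAMMAR = {
    "S'": [("S", "$")],
    "S": [("(", "L", ")"), ("x",)],
    "L": [("S",), ("L", ",", "S")],
}


def closure(items):
    """Add 'Y ::= .rhs' for every non-terminal Y that follows a DOT."""
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for production in GRAMMAR[rhs[dot]]:
                new_item = (rhs[dot], production, 0)
                if new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)


def newstate(w, state):
    """Move the DOT past w in every item that allows it, then close."""
    moved = {(lhs, rhs, dot + 1)
             for (lhs, rhs, dot) in state
             if dot < len(rhs) and rhs[dot] == w}
    return closure(moved)


def build_fsm():
    """Return (start state, set of states T, set of transitions E)."""
    start = closure({("S'", ("S", "$"), 0)})
    T, E = {start}, set()
    changed = True
    while changed:                       # repeat ... until E and T do not change
        changed = False
        for S in list(T):
            for (lhs, rhs, dot) in S:
                if dot == len(rhs) or rhs[dot] == "$":
                    continue             # complete item, or the accept move
                w = rhs[dot]
                J = newstate(w, S)
                if J not in T:
                    T.add(J)
                    changed = True
                if (S, w, J) not in E:
                    E.add((S, w, J))
                    changed = True
    return start, T, E


start, T, E = build_fsm()
print(len(T), "states,", len(E), "transitions")
```

Using frozensets for states means two states with the same items compare equal, so the loop terminates as soon as no new set of items appears.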
2: TO CREATE THE TABLE FROM THE FSM
• 1. NUMBER THE STATES 1, 2, 3, ...
• 2. For a state m (reached by some transition n ---- x ----> m) that contains an item of the form Z ::= ... w. with the DOT at the end: put ‘reduce k’ all along row m under every token column, where k is the number of the rule Z ::= ... w
• Otherwise:
• 3. For a transition n ---- x ----> m where x is a token, put ‘Shift m’ in row n column x
• 4. For a transition n ---- Y ----> m where Y is a non-terminal, put ‘goto m’ in row n column Y
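The table-filling rules above can be sketched as follows. This is a minimal Python sketch under stated assumptions: the numbered FSM is written out by hand for a toy grammar (1: S ::= ( L ), 2: S ::= x, 3: L ::= S, 4: L ::= L , S, with start rule S' ::= S $), and the names TRANSITIONS, COMPLETE_ITEMS, ACCEPT_STATE and build_table are hypothetical. The check for an already filled cell is an addition for illustration; it reports the case where two rules compete for the same table entry.

```python
TOKENS = ["(", ")", "x", ",", "$"]
NONTERMINALS = {"S", "L"}

# Hand-numbered FSM for the assumed toy grammar: (from state, symbol, to state).
TRANSITIONS = [
    (1, "x", 2), (1, "(", 3), (1, "S", 4),
    (3, "x", 2), (3, "(", 3), (3, "L", 5), (3, "S", 7),
    (5, ")", 6), (5, ",", 8),
    (8, "x", 2), (8, "(", 3), (8, "S", 9),
]
# States holding an item with the DOT at the end -> number of that rule.
COMPLETE_ITEMS = {2: 2, 6: 1, 7: 3, 9: 4}
# The state containing S' ::= S.$ accepts on the end marker.
ACCEPT_STATE = 4


def build_table(n_states=9):
    table = {n: {} for n in range(1, n_states + 1)}

    def put(row, col, action):
        # A cell that is already filled differently means a conflict.
        if col in table[row] and table[row][col] != action:
            raise ValueError(f"conflict in state {row} on {col!r}")
        table[row][col] = action

    # Step 2: 'reduce k' all along the row of a state with a complete item.
    for state, rule in COMPLETE_ITEMS.items():
        for tok in TOKENS:
            put(state, tok, ("r", rule))
    # Steps 3 and 4: shift on tokens, goto on non-terminals.
    for n, x, m in TRANSITIONS:
        put(n, x, ("g", m) if x in NONTERMINALS else ("s", m))
    put(ACCEPT_STATE, "$", ("a", None))
    return table


table = build_table()
for state in sorted(table):
    print(state, table[state])
```

If a state contained both a complete item and an outgoing token transition, put would raise ValueError, which corresponds to a shift-reduce conflict in the grammar.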
LR Parsers - Summary
• In this lecture we have seen HOW LR parsers work and HOW they can be automatically created from a grammar specification.
• NB
• LR means: parse the string from Left to right, producing a Rightmost derivation in reverse. The parse tree is built bottom-up, from the leaves towards the root.
• “Most” LR parsers are “LR(1)”: the “1” means they look at the next 1 token in the string to decide what to do.