LANGUAGE TRANSLATORS: WEEK 4
LECTURE: Recursive Descent Parsers, Table Driven Recursive Descent Parsers
TUTORIAL: Creating LL parsers using Tables
NB: Terminal = Token!
CREATING PARSERS
Assume you have a grammar G and a string S of tokens of G. How can you tell whether G can generate S? How can you extract the structure of S if it can be generated by G?
Example grammar G, with tokens a, b, c, d:
1. Z ::= d
2. Z ::= X Y Z
3. Y ::= b
4. Y ::= c
5. X ::= Y
6. X ::= a Z
Example string: 'bbd'
CREATING "RECURSIVE DESCENT" PARSERS
Method outline:
(1) Each non-terminal N becomes a method N.
(2) The set of productions for a non-terminal N is used to define N's body:
  - the RHS of a production forms a statement sequence in which non-terminals are method calls and tokens are matched against the input;
  - alternative productions for the same non-terminal form if/case statements.
(3) Each method 'consumes' tokens: it takes a token list as input and outputs a shorter list.
RECURSIVE DESCENT PARSER FOR G
(NB: this is rough, e.g. it does not handle failure!)

Method Z(in: S: list, out: S3: list)
  if head S == 'd' then record(1); S3 = tail S;
  else record(2); call X(S, S1); call Y(S1, S2); call Z(S2, S3)
END

Method Y(in: S: list, out: S1: list)
  if head S == 'b' then record(3); S1 = tail S;
  else if head S == 'c' then record(4); S1 = tail S;
END

Method X(in: S: list, out: S1: list)
  if head S == 'a' then record(6); call Z(tail S, S1);
  else record(5); call Y(S, S1)
END

PARSER_for_G(S) = call Z(S, OUT); if OUT is the empty list then SUCCESS.
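The pseudocode above could be sketched in Python roughly as follows. This is a minimal illustration, not the lecture's own code: the names `parse_Z`, `parse_Y`, `parse_X`, `parse` and `ParseError` are invented here, `record(n)` is modelled as appending to a list, and the failure handling is an addition that the slide explicitly leaves out.

```python
class ParseError(Exception):
    """Raised when the token list cannot be generated by G (an addition)."""


def parse_Z(s, out):
    # Z ::= d (rule 1) | X Y Z (rule 2)
    if s[:1] == ["d"]:
        out.append(1)
        return s[1:]
    out.append(2)
    s1 = parse_X(s, out)
    s2 = parse_Y(s1, out)
    return parse_Z(s2, out)


def parse_Y(s, out):
    # Y ::= b (rule 3) | c (rule 4)
    if s[:1] == ["b"]:
        out.append(3)
        return s[1:]
    if s[:1] == ["c"]:
        out.append(4)
        return s[1:]
    raise ParseError(f"expected 'b' or 'c', got {s[:1]}")


def parse_X(s, out):
    # X ::= Y (rule 5) | a Z (rule 6)
    if s[:1] == ["a"]:
        out.append(6)
        return parse_Z(s[1:], out)
    out.append(5)
    return parse_Y(s, out)


def parse(tokens):
    """Return the rule numbers recorded while parsing, or raise ParseError."""
    out = []
    rest = parse_Z(list(tokens), out)
    if rest:  # success only if every token was consumed
        raise ParseError(f"unconsumed tokens: {rest}")
    return out


print(parse("bbd"))  # [2, 5, 3, 3, 1]
```

The recorded rule numbers give the leftmost derivation of 'bbd': Z => XYZ => YYZ => bYZ => bbZ => bbd.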
Definition of 'LL' Parser:
A Recursive Descent Parser of this kind is termed LL(1) because
-- It parses the input from Left to right;
-- It calls the methods corresponding to non-terminals in a Left to right fashion (so it builds a Leftmost derivation);
-- It looks at the next 1 Token in the string to decide which branch of the parser to take
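PROBLEMS with the RD PARSER
BUT the RD parser won't work unless G is LL(1), which requires that
(i) G is NOT ambiguous
(ii) G is NOT left recursive
(iii) for EVERY two of G's productions of the form X ::= W1, X ::= W2, it is the case that First(W1) and First(W2) have no common element
(When a right-hand side is nullable, Follow(X) must also be disjoint from the First sets of the other alternatives.)
[ e.g. adding an extra rule 7: Z ::= b breaks (iii), because b is also in First(XYZ), so on token b the parser cannot choose between rules 2 and 7 ]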
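Condition (iii) can be checked mechanically. The sketch below is one possible way to do it, assuming the grammar is stored as a list of (left-hand side, right-hand side) pairs; the names `first_sets`, `first_of` and `conflicts` are made up for this example. It only tests the First/First condition, not ambiguity or left recursion in general, and not the Follow condition for nullable rules.

```python
GRAMMAR = [
    ("Z", ["d"]),            # rule 1
    ("Z", ["X", "Y", "Z"]),  # rule 2
    ("Y", ["b"]),            # rule 3
    ("Y", ["c"]),            # rule 4
    ("X", ["Y"]),            # rule 5
    ("X", ["a", "Z"]),       # rule 6
]


def first_sets(rules):
    """Compute First(N) and nullability for each non-terminal by iteration."""
    nonterminals = {lhs for lhs, _ in rules}
    first = {n: set() for n in nonterminals}
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            before = (len(first[lhs]), lhs in nullable)
            for sym in rhs:
                if sym not in nonterminals:  # a token: it starts the string
                    first[lhs].add(sym)
                    break
                first[lhs] |= first[sym]
                if sym not in nullable:
                    break
            else:
                nullable.add(lhs)            # every symbol was nullable
            changed |= before != (len(first[lhs]), lhs in nullable)
    return first, nullable


def first_of(rhs, first, nullable):
    """First(w) for a right-hand side w given as a list of symbols."""
    result = set()
    for sym in rhs:
        if sym not in first:                 # token
            result.add(sym)
            break
        result |= first[sym]
        if sym not in nullable:
            break
    return result


def conflicts(rules):
    """Return pairs of rule numbers (1-based) that violate condition (iii)."""
    first, nullable = first_sets(rules)
    bad = []
    for i, (lhs1, rhs1) in enumerate(rules, start=1):
        for j, (lhs2, rhs2) in enumerate(rules, start=1):
            if i < j and lhs1 == lhs2:
                common = first_of(rhs1, first, nullable) & first_of(rhs2, first, nullable)
                if common:
                    bad.append((i, j))
    return bad


print(conflicts(GRAMMAR))                    # []
print(conflicts(GRAMMAR + [("Z", ["b"])]))   # [(2, 7)]
```

With the original six rules no pair clashes; adding rule 7 makes rules 2 and 7 both start with b.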
TABLE DRIVEN PARSERS
Rather than translating a grammar straight into a program, it's much better to translate it into a "machine" or "table", forming a "table-driven parser". Translating a grammar into a table is a good form of analysis as well as one step towards a parser:
GRAMMAR => TABLE => PARSER
TO AUTOMATICALLY CONSTRUCT AN LL(1) PARSING TABLE
The table is of size i x j: it has i columns corresponding to G's i tokens, and j rows corresponding to G's j non-terminals. Each entry in the table holds zero or more production rules; if any entry holds more than one rule, G is not LL(1).
METHOD:
1. Enter the rule N ::= w in row N, column m, for each m in First(w).
2. For each rule N ::= w, find out whether w is nullable. If w is nullable, also enter N ::= w in row N, column m, for each m in the set Follow(N).
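The two-step method above might be sketched as follows. This is an illustration under stated assumptions: the helper names `analyse` and `build_table`, the dictionary layout of `RULES`, and the end-of-input marker `$` (needed for Follow of the start symbol) are all choices made for this example and do not come from the slides.

```python
from collections import defaultdict

RULES = {
    1: ("Z", ["d"]),
    2: ("Z", ["X", "Y", "Z"]),
    3: ("Y", ["b"]),
    4: ("Y", ["c"]),
    5: ("X", ["Y"]),
    6: ("X", ["a", "Z"]),
}
END = "$"  # assumed end-of-input marker, not part of the slides


def analyse(rules, start):
    """Return (first_of, follow), computed by fixed-point iteration."""
    nts = {lhs for lhs, _ in rules.values()}
    first = {n: set() for n in nts}
    follow = {n: set() for n in nts}
    follow[start].add(END)
    nullable = set()

    def first_of(symbols):
        """First set of a symbol string, and whether the string is nullable."""
        out = set()
        for sym in symbols:
            if sym not in nts:               # token
                return out | {sym}, False
            out |= first[sym]
            if sym not in nullable:
                return out, False
        return out, True

    def size():
        return (sum(len(s) for s in first.values())
                + sum(len(s) for s in follow.values()) + len(nullable))

    before = -1
    while size() != before:                  # repeat until nothing grows
        before = size()
        for lhs, rhs in rules.values():
            f, empty = first_of(rhs)
            first[lhs] |= f
            if empty:
                nullable.add(lhs)
            for i, sym in enumerate(rhs):
                if sym in nts:
                    f, empty = first_of(rhs[i + 1:])
                    follow[sym] |= f
                    if empty:
                        follow[sym] |= follow[lhs]
    return first_of, follow


def build_table(rules, start):
    """Map (non-terminal, token) to the list of rule numbers entered there."""
    first_of, follow = analyse(rules, start)
    table = defaultdict(list)
    for number, (lhs, rhs) in rules.items():
        tokens, empty = first_of(rhs)        # step 1: columns in First(w)
        if empty:                            # step 2: nullable w adds Follow(N)
            tokens = tokens | follow[lhs]
        for tok in tokens:
            table[lhs, tok].append(number)
    return dict(table)


table = build_table(RULES, "Z")
print(table["Z", "b"])  # [2]
```

A cell holding more than one rule number signals a conflict: building the table with the extra rule 7 (Z ::= b) would put both 2 and 7 in row Z, column b.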
TO RUN AN LL(1) PARSING TABLE
...on a string in its language (Exercise: adjust it to make it execute on any string)

Program Run-parsing-table:
  % assume the input string is consumed a token at a time
  M = special (start) symbol;
  a = get_token;
  call Loop(M, a);
end

Loop(in: M, in/out: a)
  1. Apply the production P in row M, column a;
  2. For each of the symbols w in the RHS of P, in order:
       if w is a non-terminal: call Loop(w, a);
       else if w is a token: a = get_token
end
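A possible Python rendering of this loop is sketched below, with the table for G written out by hand. The names `run_table`, `TABLE` and `END` are assumptions of this example. It also attempts the exercise: the checks that raise `SyntaxError` are additions so that the loop can be run on strings that are not in the language.

```python
RULES = {
    1: ("Z", ["d"]),
    2: ("Z", ["X", "Y", "Z"]),
    3: ("Y", ["b"]),
    4: ("Y", ["c"]),
    5: ("X", ["Y"]),
    6: ("X", ["a", "Z"]),
}

# LL(1) table for G: (non-terminal, lookahead token) -> rule number
TABLE = {
    ("Z", "d"): 1, ("Z", "a"): 2, ("Z", "b"): 2, ("Z", "c"): 2,
    ("Y", "b"): 3, ("Y", "c"): 4,
    ("X", "b"): 5, ("X", "c"): 5, ("X", "a"): 6,
}
NON_TERMINALS = {"Z", "Y", "X"}
END = "$"  # assumed end-of-input marker, not part of the slides


def run_table(tokens, start="Z"):
    """Parse tokens using TABLE; return the rule numbers applied, in order."""
    stream = iter(list(tokens) + [END])
    lookahead = next(stream)                 # a = get_token
    applied = []

    def loop(m):
        nonlocal lookahead
        rule = TABLE.get((m, lookahead))     # row M, column a
        if rule is None:                     # empty cell: added error check
            raise SyntaxError(f"no rule for {m} on {lookahead!r}")
        applied.append(rule)
        for w in RULES[rule][1]:
            if w in NON_TERMINALS:
                loop(w)
            elif w == lookahead:             # token matches: consume it
                lookahead = next(stream)
            else:                            # added error check
                raise SyntaxError(f"expected {w!r}, got {lookahead!r}")

    loop(start)
    if lookahead != END:                     # added check for leftover input
        raise SyntaxError(f"unexpected trailing token {lookahead!r}")
    return applied


print(run_table("bbd"))  # [2, 5, 3, 3, 1]
```

The same rule sequence comes out as for the hand-written recursive descent methods, which is expected since both follow the same LL(1) decisions.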
SUMMARY
LL parsers are top-down: they start at the special (start) symbol and try to parse a string from left to right. We have seen how to construct two kinds of LL(1) parser: the recursive descent method and the table driven method.
NEXT WEEK: LR table driven parsing, and how JavaCup constructs its parsers