1 / 69

Lecture 4: LL Parsing

Lecture 4: LL Parsing. CS 540 George Mason University. Parsing. Syntatic/semantic structure. Syntatic structure. tokens. Scanner (lexical analysis). Parser (syntax analysis). Semantic Analysis (IC generator). Code Generator. Source language. Target language. Code

ismail
Télécharger la présentation

Lecture 4: LL Parsing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture 4: LL Parsing CS 540 George Mason University

  2. Parsing Syntatic/semantic structure Syntatic structure tokens Scanner (lexical analysis) Parser (syntax analysis) Semantic Analysis (IC generator) Code Generator Source language Target language Code Optimizer • Syntax described formally • Tokens organized into syntax tree that describes structure • Error checking Symbol Table CS 540 Spring 2010 GMU

  3. Top Down (LL) Parsing P P begin SS end SS  S ; SS SS  e S  simplestmt S  begin SS end begin simplestmt ; simplestmt ; end CS 540 Spring 2010 GMU

  4. Top Down (LL) Parsing P P begin SS end SS  S ; SS SS  e S  simplestmt S  begin SS end SS begin simplestmt ; simplestmt ; end CS 540 Spring 2010 GMU

  5. Top Down (LL) Parsing P P begin SS end SS  S ; SS SS  e S  simplestmt S  begin SS end SS SS S begin simplestmt ; simplestmt ; end CS 540 Spring 2010 GMU

  6. Top Down (LL) Parsing P P begin SS end SS  S ; SS SS  e S  simplestmt S  begin SS end SS SS S begin simplestmt ; simplestmt ; end CS 540 Spring 2010 GMU

  7. Top Down (LL) Parsing P P begin SS end SS  S ; SS SS  e S  simplestmt S  begin SS end SS SS SS S S begin simplestmt ; simplestmt ; end CS 540 Spring 2010 GMU

  8. Top Down (LL) Parsing P P begin SS end SS  S ; SS SS  e S  simplestmt S  begin SS end SS SS SS S S begin simplestmt ; simplestmt ; end CS 540 Spring 2010 GMU

  9. Top Down (LL) Parsing P beginSSend  beginS; SS end  beginsimplestmt;SSend  beginsimplestmt; S ; SS end  beginsimplestmt; simplestmt ; SSend  beginsimplestmt; simplestmt ; end P begin SS end SS  S ; SS SS  e S  simplestmt S  begin SS end 1 P 2 SS 4 SS 3 SS 6 S 5 S e begin simplestmt ; simplestmt ; end CS 540 Spring 2010 GMU

  10. Grammar S S S  a B | b C B  b b C C  c c Two strings in the language: abbcc and bcc Can choose between them based on the first character of the input. b C a B c c b b C c c CS 540 Spring 2010 GMU

  11. LL(k) parsing also known as the lookahead • Process input k symbols at a time. • Initially, ‘current’ non-terminal is start symbol. • Algorithm • Loop until no more input • Given next k input tokens and ‘current’ non-terminal T, choose a rule R (T  …) • For each element X in rule R from left to right, if X is a non-terminal, we will need to ‘expand’ X else if symbol X is a terminal, see if next input symbol matches X; if so, update from the input • Typically, we consider LL(1) CS 540 Spring 2010 GMU

  12. Two Approaches • Recursive Descent parsing • Code tailored to the grammar • Table Driven – predictive parsing • Table tailored to the grammar • General Algorithm Both algorithms driven by the tokens coming from the lexer. CS 540 Spring 2010 GMU

  13. Writing a Recursive Descent Parser • Generate a procedure for each non-terminal. Use next token from yylex() (lookahead) to choose (PREDICT) which production to ‘mimic’. • for non-terminal X, call procedure X() • for terminals X, call ‘match(X)’ Ex: B  b C D B() { if (lookahead == ‘b’) { match(‘b’); C(); D(); } else … } CS 540 Spring 2010 GMU

  14. Writing a Recursive Descent Parser Also need the following: match(symbol) { if (symbol == lookahead) lookahead = yylex() else error() } main() { lookahead = yylex(); S(); /* S is the start symbol */ if (lookahead == EOF) then accept else reject } error() { … } CS 540 Spring 2010 GMU

  15. S() { if (lookahead == a ) { match(a);B(); } S  a B else if (lookahead == b) { match(b); C(); } S  b C else error(“expecting a or b”); } B() { if (lookahead == b) {match(b); match(b); C();} B  b b C else error(); } C() { if (lookahead == c) { match(c) ; match(c) ;} C  c c else error(); } Back to grammar CS 540 Spring 2010 GMU

  16. Parsing abbcc S abbcc Remaining input: Call S() from main() S() { if (lookahead == a ) { match(a);B(); } S  a B else if (lookahead == b) { match(b); C(); } S  b C else error(“expecting a or b”); } CS 540 Spring 2010 GMU

  17. Parsing abbcc S bbcc Remaining input: a B Call B() from A(): B() { if (lookahead == b) {match(b); match(b); C();} B  b b C else error(); } CS 540 Spring 2010 GMU

  18. Parsing abbcc S cc Remaining input: a B Call C() from B(): C() { if (lookahead == c) { match(c) ; match(c) ;} C  c c else error(); } b b C CS 540 Spring 2010 GMU

  19. Parsing abbcc S Remaining input: a B b b C c c CS 540 Spring 2010 GMU

  20. How do we find the lookaheads? • Can compute PREDICT sets from FIRST and FOLLOW for LL(1) parsing: • PREDICT(A  a) = (FIRST(a) – {e}) FOLLOW(A) if e in FIRST(a) = FIRST(a) if e not in FIRST(a) NOTE: e never in PREDICT sets For LL(k) grammars, the PREDICT sets for the productions associated with a given non-terminal must be disjoint. CS 540 Spring 2010 GMU

  21. Example FIRST(F) = {(,id} FIRST(T) = {(,id} FIRST(E) = {(,id} FIRST(T’) = {*,e} FIRST(E’) = {+,e} FOLLOW(E) = {$,)} FOLLOW(E’) = {$,)} FOLLOW(T) = {+$,)} FOLLOW(T’) = {+,$,)} FOLLOW(F) = {*,+,$,)} Assume E is the start symbol CS 540 Spring 2010 GMU

  22. E() { if (lookahead in {(,id } ) { T(); E_prime(); } E  T E’ else error(“E expecting ( or identifier”); } E_prime() { if (lookahead in {+}) {match(+); T(); E_prime();} E’  + T E’ else if (lookahead in {),end_of_file}) return; E’  e else error(“E_prime expecting +, ) or end of file”); } T() { if (lookahead in {(,id}) { F(); T_prime(); } T  F T’ else error(“T expecting ( or identifier”); } CS 540 Spring 2010 GMU

  23. T_prime() { if (lookahead in {*}) {match(*); F(); T_prime();} T’  * F T’ else if (lookahead in {+,),end_of_file}) return; T’  e else error(“T_prime expecting *, ) or end of file”); } F() { if (lookahead in {id}) match(id); F  id else if (lookahead in {(} ) { match( ( ); E(); match ( ) ); } F  ( E ) else error(“F expecting ( or identifier”);} CS 540 Spring 2010 GMU

  24. Parsing a + b * c E Remaining input: a+b*c CS 540 Spring 2010 GMU

  25. Parsing a + b * c E Remaining input: a+b*c T E’ E() { if (lookahead in {(,id } ) { T(); E_prime(); } else error(“E expecting ( or identifier”); } CS 540 Spring 2010 GMU

  26. Parsing a + b * c E Remaining input: a+b*c TE’ F T’ T() { if (lookahead in {(,id } ) { F(); T_prime(); } else error(“T expecting ( or identifier”); } CS 540 Spring 2010 GMU

  27. Parsing a + b * c E Remaining input: +b*c TE’ FT’ F() { if (lookahead in {id } ) match(id) else if (lookahead in { ( } { match( ( ); E(); match( ) ); } else error(“F expecting ( or identifier”); } id a CS 540 Spring 2010 GMU

  28. Parsing a + b * c E Remaining input: +b*c TE’ F T’ T_prime() { if (lookahead in {*}) {match(*); F(); T_prime();}’ else if (lookahead in {+,),end_of_file}) return; else error(“T_prime expecting *, ) or end of file”); } } id a e CS 540 Spring 2010 GMU

  29. Parsing a + b * c E Remaining input: b*c T E’ F T’ +T E’ E_prime() { if (lookahead in {+}) {match(+); T(); E_prime();}’ else if (lookahead in {),end_of_file}) return; else error(“E_prime expecting *, ) or end of file”); } } id a e CS 540 Spring 2010 GMU

  30. Parsing a + b * c E Remaining input: b*c T E’ F T’ +TE’ T() { if (lookahead in {(,id } ) { F(); T_prime(); } else error(“T expecting ( or identifier”); } id a F T’ e CS 540 Spring 2010 GMU

  31. Parsing a + b * c E Remaining input: *c T E’ F T’ +TE’ F() { if (lookahead in {id } ) match(id) else if (lookahead in { ( } { match( ( ); E(); match( ) ); } else error(“F expecting ( or identifier”); } id a F T’ e id b CS 540 Spring 2010 GMU

  32. Parsing a + b * c E Remaining input: c T E’ F T’ +T E’ T_prime() { if (lookahead in {*}) {match(*); F(); T_prime();}’ else if (lookahead in {+,),end_of_file}) return; else error(“T_prime expecting *, ) or end of file”); } } id a F T’ e id b *F T’ CS 540 Spring 2010 GMU

  33. Parsing a + b * c E Remaining input: T E’ F T’ +TE’ F() { if (lookahead in {id } ) match(id) else if (lookahead in { ( } { match( ( ); E(); match( ) ); } else error(“F expecting ( or identifier”); } id a F T’ e id b *F T’ id c CS 540 Spring 2010 GMU

  34. Parsing a + b * c E Remaining input: T E’ F T’ +TE’ T_prime() { if (lookahead in {*}) {match(*); F(); T_prime();}’ else if (lookahead in {+,),end_of_file}) return; else error(“T_prime expecting *, ) or end of file”); } } id a F T’ e id b * F T’ e id c CS 540 Spring 2010 GMU

  35. Parsing a + b * c E Remaining input: T E’ F T’ + T E’ E_prime() { if (lookahead in {+}) {match(+); T(); E_prime();}’ else if (lookahead in {),end_of_file}) return; else error(“E_prime expecting *, ) or end of file”); } } id a F T’ e e id b * F T’ e id c CS 540 Spring 2010 GMU

  36. Stacks in Recursive Descent Parsing E • Runtime stack • Procedure activations correspond to a path in parse tree from root to some interior node E’ T F id b CS 540 Spring 2010 GMU

  37. Two Approaches • Recursive Descent parsing • Code tailored to the grammar • Table Driven – predictive parsing • Table tailored to the grammar • General Algorithm Both algorithms driven by the tokens coming from the lexer. CS 540 Spring 2010 GMU

  38. LL(1) Predictive Parse Tables An LL(1) Parse table is a mapping T: Vn x Vt production P or error • For all productions A  a do For each terminal t in Predict(A a), T[A][t] = A  a • Every undefined table entry is an error. CS 540 Spring 2010 GMU

  39. Using LL(1) Parse Tables ALGORITHM INPUT: token sequence to be parsed, followed by ‘$’ (end of file) DATA STRUCTURES: • Parse stack: Initialized by pushing ‘$’ and then pushing the start symbol • Parse table T CS 540 Spring 2010 GMU

  40. Algorithm: Predictive Parsing similar to ‘match’ push($); push(start_symbol); lookahead = yylex() repeat X = pop(stack) if X is a terminal symbol or $ then if X = lookahead then lookahead = yylex() else error() else /* X is non-terminal */ if T[X][lookahead] = X  Y1 Y2 …Ym push(Ym) … push (Y1) else error() until X = $ token similar to ‘mimic’ CS 540 Spring 2010 GMU

  41. Example CS 540 Spring 2010 GMU

  42. CS 540 Spring 2010 GMU

  43. Assume E is the start symbol CS 540 Spring 2010 GMU

  44. CS 540 Spring 2010 GMU

  45. CS 540 Spring 2010 GMU

  46. CS 540 Spring 2010 GMU

  47. CS 540 Spring 2010 GMU

  48. CS 540 Spring 2010 GMU

  49. CS 540 Spring 2010 GMU

  50. CS 540 Spring 2010 GMU

More Related