Lesson 4

Presentation Transcript

1. Lesson 4 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg

2. Outline • Recursive descent parsers • Left recursion • Left factoring

3. Recursive descent parsing

4. Writing a recursive descent parser • Straightforward once the grammar is written in an appropriate form: • For each nonterminal: create a function • The function represents the expectation of that nonterminal in the input • Each such function should choose a grammar production, i.e., a right-hand side (RHS), based on the lookahead token • It should then process the chosen RHS • Terminals are “matched”: match(IF); match(LEFT_PARENTHESIS); … match(RIGHT_PARENTHESIS); … • For nonterminals, their corresponding “expectation functions” are called

5. The function match() • Helper function to consume terminals: void match(int expected_lookahead) { if (lookahead == expected_lookahead) lookahead = nextToken(); else error(); } (assumes tokens are represented as ints)

6. Recursive descent example • Grammar for a subset of the language “types in Pascal”: type → ^ id | array [ simple ] of type | simple simple → integer | char | num dotdot num • Examples of “programs”: ^ my_type array[ 1..10 ] of Integer array[ Char ] of 72..98

7. Recursive descent example void type() { switch (lookahead) { case '^': match('^'); match(ID); break; case ARRAY: match(ARRAY); match('['); simple(); match(']'); match(OF); type(); break; default: simple(); } } void simple() { switch (lookahead) { case INTEGER: match(INTEGER); break; case CHAR: match(CHAR); break; case NUM: match(NUM); match(DOTDOT); match(NUM); break; default: error(); } }
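The parser on this slide can be turned into something compilable with a little scaffolding. The following is a minimal sketch, not the course's reference code: the numeric token codes, the END marker, the pre-tokenized input array, the had_error flag (used in place of an aborting error()), and the entry point parse_type() are all assumptions introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed token codes: single characters stand for themselves,
   named tokens start at 256. END marks the end of the input. */
enum { END = 0, ID = 256, ARRAY, OF, INTEGER, CHAR, NUM, DOTDOT };

static const int *input;  /* hypothetical pre-tokenized input, END-terminated */
static int lookahead;     /* current token */
static int had_error;     /* set by error() instead of aborting */

/* Advance to the next token; stays put once END is reached. */
static int nextToken(void) { if (*input != END) ++input; return *input; }

static void error(void) { had_error = 1; }

/* Consume one terminal, as on the match() slide. */
static void match(int expected_lookahead) {
    if (lookahead == expected_lookahead) lookahead = nextToken();
    else error();
}

/* simple -> integer | char | num dotdot num */
static void simple(void) {
    switch (lookahead) {
    case INTEGER: match(INTEGER); break;
    case CHAR:    match(CHAR); break;
    case NUM:     match(NUM); match(DOTDOT); match(NUM); break;
    default:      error();
    }
}

/* type -> ^ id | array [ simple ] of type | simple */
static void type(void) {
    switch (lookahead) {
    case '^':
        match('^'); match(ID);
        break;
    case ARRAY:
        match(ARRAY); match('['); simple(); match(']'); match(OF); type();
        break;
    default:
        simple();
    }
}

/* Returns 1 if the whole token sequence is derivable from type, else 0. */
int parse_type(const int *tokens) {
    assert(tokens != NULL);
    input = tokens;
    lookahead = *input;
    had_error = 0;
    type();
    return !had_error && lookahead == END;
}
```

Under these assumptions, parse_type() accepts a token array such as { ARRAY, '[', NUM, DOTDOT, NUM, ']', OF, INTEGER, END } and rejects one where the dotdot is missing. A real parser would get its tokens from a lexer and would report errors more helpfully than a single flag.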

8. Exercise (1) List the calls made by the previous recursive descent parser on the input string array [ num dotdot num ] of integer To get you started: type() match(ARRAY) match('[') simple() ...

9-20. [Animated parse tree, built up step by step over twelve slides] • Grammar: type → ^ id | array [ simple ] of type | simple simple → integer | char | num dotdot num • Input: array [ num dotdot num ] of integer • Node labels shown as the tree grows: type, then simple and type, and on the last two slides a second simple

21. Left recursion

22. The problem with left recursion • Left-recursive grammar: A → A α | β • Problematic for recursive descent parsing • Infinite recursion

23. The problem with left recursion • The left-recursive expression grammar: expr → expr + num | expr - num | num • Parser code: void expr() { if (lookahead != NUM) expr(); match('+'); …

24. Eliminating left recursion • Left-recursive grammar: A → A α | β • Rewritten grammar: A → β M M → α M | ε
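As an illustration, the rewrite can be applied to the expression grammar from the previous slide. Taking β = num and letting α range over + num and - num gives expr → num rest, rest → + num rest | - num rest | ε. The sketch below parses that rewritten grammar. The nonterminal name rest, the token codes, the END marker, the had_error flag, and the entry point parse_expr() are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed token codes: '+' and '-' stand for themselves. */
enum { END = 0, NUM = 256 };

static const int *input;  /* hypothetical pre-tokenized input, END-terminated */
static int lookahead;
static int had_error;

static int nextToken(void) { if (*input != END) ++input; return *input; }

static void error(void) { had_error = 1; }

static void match(int expected_lookahead) {
    if (lookahead == expected_lookahead) lookahead = nextToken();
    else error();
}

/* rest -> + num rest | - num rest | epsilon */
static void rest(void) {
    if (lookahead == '+' || lookahead == '-') {
        match(lookahead);  /* consume the operator */
        match(NUM);
        rest();
    }
    /* otherwise: epsilon production, consume nothing */
}

/* expr -> num rest
   No call to expr() happens before a token is consumed,
   so the infinite recursion is gone. */
static void expr(void) {
    match(NUM);
    rest();
}

/* Returns 1 if the whole token sequence is a valid expression, else 0. */
int parse_expr(const int *tokens) {
    assert(tokens != NULL);
    input = tokens;
    lookahead = *input;
    had_error = 0;
    expr();
    return !had_error && lookahead == END;
}
```

Every recursive call in rest() is preceded by a successful match of an operator, so the recursion depth is bounded by the number of tokens.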

25. Exercise (2) Remove the left recursion from the following grammar for formal parameter lists in C: list → par | list , par par → int id int and id are tokens that represent the keyword int and identifiers, respectively. Hint: what is α and what is β in this case?

26. Left factoring

27. The problem • Recall: how does a predictive parser choose a production body? • What if the lookahead token matches more than one such production body?

28. The problem • Problematic grammar: list → num | num , list • If lookahead = num, what to expect?

29. Left factoring • The previous grammar, list → num | num , list becomes list → num list’ list’ → ε | , list
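With the common prefix num factored out, a single lookahead token is enough to pick a production: after num, a comma selects list’ → , list and anything else selects ε. A minimal sketch of a parser for the factored grammar follows. The function name list_tail (standing in for list’), the token codes, the END marker, the had_error flag, and the entry point parse_list() are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed token codes: ',' stands for itself. */
enum { END = 0, NUM = 256 };

static const int *input;  /* hypothetical pre-tokenized input, END-terminated */
static int lookahead;
static int had_error;

static int nextToken(void) { if (*input != END) ++input; return *input; }

static void error(void) { had_error = 1; }

static void match(int expected_lookahead) {
    if (lookahead == expected_lookahead) lookahead = nextToken();
    else error();
}

static void list(void);

/* list' -> , list | epsilon */
static void list_tail(void) {
    if (lookahead == ',') {
        match(',');
        list();
    }
    /* otherwise: epsilon production, consume nothing */
}

/* list -> num list' */
static void list(void) {
    match(NUM);
    list_tail();
}

/* Returns 1 if the whole token sequence is a valid list, else 0. */
int parse_list(const int *tokens) {
    assert(tokens != NULL);
    input = tokens;
    lookahead = *input;
    had_error = 0;
    list();
    return !had_error && lookahead == END;
}
```

The decision that was ambiguous in the original grammar (both alternatives start with num) is postponed until after num has been consumed, where the lookahead distinguishes the two cases.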

30. Exercise (3) Perform left factoring on the following grammar for declarations of variables and functions in C: decl → int id ; | int id ( pars ) ; pars → ...

31. Conclusion • Recursive descent parsers • Left recursion • Left factoring

32. Next time • The sets first and follow • Defining LL(1) grammars • Non-recursive top-down parser • Handling syntax errors