Understand top-down parsing, LL(k) and LR(k) grammars, table-driven and recursive descent parsers at FCI, Cairo University. Explore parsing hierarchy and example of LL(1) parser. Learn about predictive top-down parsing and how to build a parsing table.
Cairo University • FCI Compilers CS419 Lecture 13: Syntax Analysis: Top-Down Parsing (Prerequisites) Dr. Hussien Sharaf, Dr. Mohammad Nassef, Department of Computer Science, Faculty of Computers and Information, Cairo University Welcome to a journey to
Parsing Top-down versus bottom-up • Top-down parser: • starts at the root of the derivation tree • picks a production and tries to match the input • may require backtracking • some grammars are backtrack-free (predictive) • Bottom-up parser: • starts at the leaves • starts in a state valid for legal first tokens • as input is consumed, changes state to encode possibilities (recognize valid prefixes) • uses a stack to store both state and sentential forms
Parsing Hierarchy of grammar classes: TD/BU • LL(k): • Left-to-right, Leftmost derivation, k tokens lookahead • LR(k): • Left-to-right, Rightmost derivation, k tokens lookahead • SLR: • Simple LR (uses “follow sets”) • LALR: • LookAhead LR (uses “lookahead sets”) http://en.wikipedia.org/wiki/LL_parser …
Top-down parsing • A parser is top-down if it discovers the parse tree from top to bottom. • A top-down parser corresponds to a preorder traversal of the parse tree. • A leftmost derivation is applied at each step. [Figure: parse tree with nodes S, a, A, b, c, d]
Top-down parsing • Main idea: • Start at the root, grow towards leaves • Pick a production and try to match input • May need to backtrack
Predictive Top-down Parsing • Based on predicting the correct production to choose for further expansion of the leftmost non-terminal in hand. • Backtracking that results from predicting wrong productions should be avoided! • Backtracking can be avoided by removing ambiguity and applying left factoring. • Predictive top-down parsers can be implemented in 2 ways: • Recursive Descent Parser • Table-driven Parser
Recursive-descent top-down parser • Grammar structure is directly translated into program structure. • Uses recursive functions to implement predictive parsing. • Each nonterminal in the grammar is implemented by a function in the program. • Each such function looks at the next input symbol in order to choose one of the productions for the nonterminal.
The right-hand side of the chosen production is then used for parsing in the following way: • A terminal on the right-hand side is matched against the next input symbol. • If they match, we move on to the following input symbol and the next symbol on the right hand side, otherwise an error is reported. • A nonterminal on the right-hand side is handled by calling the corresponding function and, after this call returns, continuing with the next symbol on the right-hand side. • When there are no more symbols on the right-hand side, the function returns. Recursive-descent top-down parser
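The rules above can be sketched in C for a very small grammar. This is only an illustration under stated assumptions: the grammar S → ( S ) | ε is chosen here as an example, and the names `parse_parens`, `match_char`, `input`, `pos` and `ok` are hypothetical, not taken from the lecture.

```c
#include <stdbool.h>
#include <stddef.h>

/* Parser state (hypothetical names): the input text, the index of the
   next input symbol, and a flag that is cleared on the first error. */
static const char *input;
static size_t pos;
static bool ok;

/* A terminal on the right-hand side: compare it with the next input
   symbol, move on if they match, otherwise report an error. */
static void match_char(char expected) {
    if (ok && input[pos] == expected)
        pos++;
    else
        ok = false;
}

/* One function per nonterminal. Assumed grammar: S -> ( S ) | epsilon */
static void S(void) {
    if (!ok)
        return;
    if (input[pos] == '(') {   /* the next symbol selects S -> ( S ) */
        match_char('(');       /* terminal: match it                 */
        S();                   /* nonterminal: call its function     */
        match_char(')');       /* terminal: match it                 */
    }
    /* any other symbol selects S -> epsilon, which consumes nothing */
}

/* Returns true if the whole text can be derived from S. */
bool parse_parens(const char *text) {
    input = text;
    pos = 0;
    ok = true;
    S();
    return ok && input[pos] == '\0';
}
```

Calling `parse_parens("(())")` returns true, while `parse_parens("(()")` returns false because the final `match_char(')')` finds the end of the input instead.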
Table-driven top-down parser • We encode the selection of productions into a table instead of in the program text. • A simple non-recursive program uses this table and a stack to perform the parsing. • The table is cross-indexed by nonterminal and terminal. • For each such pair, it contains the production (if any) that is chosen for that nonterminal when that terminal is the next input symbol.
CS416 Compiler Design LL(1) Parser – Example of parsing table
stack      input     output
$E         id+id$    E → TE'
$E'T       id+id$    T → FT'
$E'T'F     id+id$    F → id
$E'T'id    id+id$
$E'T'      +id$      T' → ε
$E'        +id$      E' → +TE'
$E'T+      +id$
$E'T       id$       T → FT'
LL parsing using table-driven technique • Consists of: • Parsing Stack: holds grammar symbols (non-terminals and tokens). • Parsing Table: specifies the parser actions (Match, Predict, Accept, Error). • Driver Function: interacts with the parser stack, the parsing table, and the scanner. [Figure: block diagram with Scanner, Next token, Parser Driver, Parsing Stack, Parsing Table, Output]
Top-down parser actions • Predict: predict a production and apply it in a derivation step. • Match: match the top of the parser stack with the next input token. • Accept: parsed successfully. • Error: failure.
Example 1 • Consider the following grammar: S → ( S ) | ε • To parse (()), we follow these steps:
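A table-driven parse of this example can be sketched as follows. It is a minimal sketch, not the lecture's own code: the names `ll1_parse`, `table_S` and `STACK_MAX` are hypothetical, and the table entries are derived from FIRST(S) = { (, ε } and FOLLOW(S) = { ), $ } for the grammar S → ( S ) | ε.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define STACK_MAX 256

/* Parsing table row for nonterminal S, indexed by the lookahead token.
   Returns the right-hand side to push, or NULL for an error entry. */
static const char *table_S(char lookahead) {
    switch (lookahead) {
    case '(': return "(S)";   /* predict S -> ( S )   */
    case ')':
    case '$': return "";      /* predict S -> epsilon */
    default:  return NULL;    /* error entry          */
    }
}

/* Driver: returns true on Accept and false on Error.
   If trace is true, every parser action is printed. */
bool ll1_parse(const char *text, bool trace) {
    char stack[STACK_MAX];
    int top = 0;
    size_t n = strlen(text), i = 0;

    stack[top++] = '$';                   /* bottom-of-stack marker */
    stack[top++] = 'S';                   /* start symbol           */

    for (;;) {
        char x = stack[top - 1];          /* symbol on top of the stack */
        bool at_end = (i == n);
        char a = at_end ? '$' : text[i];  /* next input token */

        if (x == '$') {                   /* stack is empty */
            if (trace) puts(at_end ? "Accept" : "Error");
            return at_end;
        }
        if (x == 'S') {                   /* nonterminal: consult the table */
            const char *rhs = table_S(a);
            size_t len = rhs ? strlen(rhs) : 0;
            if (rhs == NULL || top - 1 + (int)len > STACK_MAX) {
                if (trace) puts("Error");
                return false;
            }
            top--;                        /* pop S */
            for (size_t k = len; k > 0; k--)
                stack[top++] = rhs[k - 1];   /* push the RHS in reverse */
            if (trace)
                printf("Predict S -> %s\n", len ? "( S )" : "epsilon");
        } else if (!at_end && x == a) {   /* terminal equal to the input */
            top--;
            i++;
            if (trace) printf("Match %c\n", a);
        } else {
            if (trace) puts("Error");
            return false;
        }
    }
}
```

With tracing enabled, `ll1_parse("(())", true)` performs the actions Predict S -> ( S ), Match (, Predict S -> ( S ), Match (, Predict S -> epsilon, Match ), Match ), Accept.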
Which of the following strings are in the language represented by the given CFG? Exercise
Which of the following is a valid derivation of the given grammar?
Which of the following is a valid parse tree of the given grammar?
Designing a table-driven LL(1) top-down parser • Steps: • Elimination of Ambiguity • Elimination of Left Recursion • Left Factoring • Drawing Transition Diagram (Optional) • Applying First/Follow operators • Building the Parsing Table • Parsing the given statements
Problems facing LL(1) parsers • Existence of left recursion. • Unhandled left factoring. • Both problems prevent any LL parser from deciding deterministically which rule should be fired (which production should be chosen).
Left-most derivation After choosing a rule/production to expand, replace/expand its non-terminals left to right
Right-most derivation After choosing a rule/production to expand, replace/expand its non-terminals right to left Same parse tree!
How to choose a rule/production? • Does it matter to choose a different rule to start with? • What is the effect of that? • How to avoid this?
Example • This string has 2 parse trees:
Ambiguity • In general, we try to eliminate ambiguity by rewriting the grammar. • Example: E → E+E | E*E | (E) | id becomes: E → E+T | T, T → T*F | F, F → (E) | id • Can you build a parse tree for id*id+id?
How to choose a rule/production? • A grammar is ambiguous if it has more than one parse tree for some string • Equivalently, there is more than one right-most or left-most derivation for some string
Idea: • A statement appearing between a then and an else must be matched Elimination of ambiguity (cont.)
Grammar problems • Because we try to generate a leftmost derivation by scanning the input from left to right, grammars of the form A → A x may cause endless recursion. • Such grammars are called left-recursive, and they must be transformed if we want to use a top-down parser.
Left recursion • A grammar is left recursive if, for a non-terminal A, there is a derivation A ⇒+ A α • There are three types of left recursion: • direct (A → A x) • indirect (A → B C, B → A) • hidden (A → B A, B → ε)
Left Recursion • Applying the method of a recursive descent parser to a production A → A x would lead to the function
void A() {
    A();
    // process the rest of the right-hand side (never reached)
}
which leads to infinite recursion.
A grammar is left recursive if it has a non-terminal A such that there is a derivation A→ Aα • Top down parsing methods can’t handle left-recursive grammars • A simple rule for direct left recursion elimination: • For a rule like: • A → A α|β • We may replace it with • A → β A’ • A’ → α A’| ɛ Elimination of left recursion
Eliminating Left Recursion • Left recursion in a production may be removed by transforming the grammar in the following way. • Replace A → A α | β with A → β A', A' → α A' | ε
Left recursion • To eliminate direct left recursion, replace A → A α1 | A α2 | ... | A αm | β1 | β2 | ... | βn with A → β1 B | β2 B | ... | βn B, B → α1 B | α2 B | ... | αm B | ε
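The replacement rule can also be applied mechanically. The sketch below is one possible way to do so, under stated assumptions: productions are plain strings of space-separated symbols, the new nonterminal is written A' (the B of the rule above), there is at least one non-recursive alternative, and no alternative is just A itself. The function names `eliminate_left_recursion`, `append` and `is_left_recursive` are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Append s to the NUL-terminated buffer out without exceeding cap bytes. */
static void append(char *out, size_t cap, const char *s) {
    size_t used = strlen(out);
    if (used + 1 < cap)
        strncat(out, s, cap - used - 1);
}

/* An alternative is left-recursive if its first symbol is A. */
static int is_left_recursive(const char *alt, const char *A) {
    size_t alen = strlen(A);
    return strncmp(alt, A, alen) == 0 && alt[alen] == ' ';
}

/* Rewrites  A  -> A a1 | ... | A am | b1 | ... | bn
   as        A  -> b1 A' | ... | bn A'
             A' -> a1 A' | ... | am A' | epsilon
   and writes both new productions into out, one per line.
   Returns 1 if direct left recursion was found, 0 otherwise
   (in which case out is left empty). */
int eliminate_left_recursion(const char *A, const char *alts[], int n,
                             char *out, size_t cap) {
    int found = 0, first = 1;
    if (cap == 0)
        return 0;
    out[0] = '\0';

    for (int i = 0; i < n; i++)
        if (is_left_recursive(alts[i], A))
            found = 1;
    if (!found)
        return 0;

    /* A -> b A' for every non-recursive alternative b */
    append(out, cap, A);
    append(out, cap, " ->");
    for (int i = 0; i < n; i++) {
        if (is_left_recursive(alts[i], A))
            continue;
        append(out, cap, first ? " " : " | ");
        append(out, cap, alts[i]);
        append(out, cap, " ");
        append(out, cap, A);
        append(out, cap, "'");
        first = 0;
    }
    append(out, cap, "\n");

    /* A' -> a A' for every recursive alternative A a, plus epsilon */
    append(out, cap, A);
    append(out, cap, "' ->");
    for (int i = 0; i < n; i++) {
        if (!is_left_recursive(alts[i], A))
            continue;
        append(out, cap, " ");
        append(out, cap, alts[i] + strlen(A) + 1);  /* skip "A " */
        append(out, cap, " ");
        append(out, cap, A);
        append(out, cap, "' |");
    }
    append(out, cap, " epsilon\n");
    return 1;
}
```

For instance, passing `"E"` with the alternatives `"E + T"` and `"T"` fills the buffer with the two lines `E -> T E'` and `E' -> + T E' | epsilon`.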
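Example: Eliminating Left Recursion • Original productions: A → A α | β • New productions: A → β A', A' → α A' | ε • Under the original productions, a derivation of β α α α is A ⇒ A α ⇒ A α α ⇒ A α α α ⇒ β α α α. • Under the new productions, a derivation of β α α α is A ⇒ β A' ⇒ β α A' ⇒ β α α A' ⇒ β α α α A' ⇒ β α α α.
Example: Eliminating Left Recursion • Consider the left recursive grammar: E → E + T | T, T → T * F | F, F → (E) | id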
Left Recursion • What happens if an expression grammar is left recursive? • This can lead to non-termination in a top-down parser. • For a top-down parser, any recursion must be right recursion. • We would like to convert the left recursion to right recursion. • Non-termination is a bad property in any part of a compiler. (from Cooper & Torczon)
Example: Eliminating Left Recursion • To convert this left-recursive grammar into a right-recursive one: E → E + T | T, T → T * F | F, F → (E) | id • We need to apply the preceding transformation to each left-recursive rule/production: A → A α | β becomes A → β A', A' → α A' | ε
Example: Eliminating Left Recursion • Apply the transformation to E: E → T E', E' → + T E' | ε. • Then apply the transformation to T: T → F T', T' → * F T' | ε.
Example: Eliminating Left Recursion • Now the grammar is: E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → (E) | id
Example: Eliminating Left Recursion • In a recursive-descent parser implementation, the function for the productions E' → + T E' | ε would look like this:
void Eprime() {
    if (token == PLUS) {
        match(PLUS);
        T();
        Eprime();
    }
    /* otherwise choose E' → ε and consume nothing */
    return;
}
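For context, a complete recursive-descent recognizer for the right-recursive expression grammar might look like the sketch below. The scanner, the `Token` type, and the names `advance`, `match`, `failed` and `parse_expr` are assumptions made for this illustration; only the shape of `Eprime` follows the slide. The scanner assumes the identifier token is written literally as `id`.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed grammar (left recursion already removed):
   E  -> T E'          E' -> + T E' | epsilon
   T  -> F T'          T' -> * F T' | epsilon
   F  -> ( E ) | id                                  */

typedef enum { ID, PLUS, TIMES, LPAREN, RPAREN, END, BAD } Token;

static const char *src;   /* remaining input text        */
static Token token;       /* current lookahead token     */
static bool failed;       /* set on the first mismatch   */

/* Minimal scanner (hypothetical): skips blanks, reads one token. */
static void advance(void) {
    while (*src == ' ')
        src++;
    if (*src == '\0') { token = END; return; }
    if (strncmp(src, "id", 2) == 0) { src += 2; token = ID; return; }
    switch (*src++) {
    case '+': token = PLUS;   break;
    case '*': token = TIMES;  break;
    case '(': token = LPAREN; break;
    case ')': token = RPAREN; break;
    default:  token = BAD;    break;
    }
}

/* Match a terminal against the lookahead, or record an error. */
static void match(Token expected) {
    if (token == expected)
        advance();
    else
        failed = true;
}

static void E(void);

static void F(void) {
    if (token == LPAREN) { match(LPAREN); E(); match(RPAREN); }
    else match(ID);
}

static void Tprime(void) {
    if (token == TIMES) { match(TIMES); F(); Tprime(); }
    /* otherwise T' -> epsilon */
}

static void T(void) { F(); Tprime(); }

static void Eprime(void) {
    if (token == PLUS) { match(PLUS); T(); Eprime(); }
    /* otherwise E' -> epsilon */
}

static void E(void) { T(); Eprime(); }

/* Returns true if text is a sentence of the grammar. */
bool parse_expr(const char *text) {
    src = text;
    failed = false;
    advance();
    E();
    return !failed && token == END;
}
```

Because every recursive call is preceded by a successful `match`, each level of recursion consumes at least one token, so the parser terminates on any input; this is exactly what the left-recursive version could not guarantee.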