Top-Down Parsing

Top-Down Parsing
• Identify a leftmost derivation for an input string.
• Why? By always replacing the leftmost non-terminal via a production rule, we are guaranteed to develop the parse tree left to right, consistent with scanning the input.
• Example: A ⇒ aBc ⇒ adDc ⇒ adec (scan a, scan d, scan e, scan c: accept!)
• Recursive-descent parsing concepts
• Predictive parsing
• Recursive / brute-force technique
• Non-recursive / table-driven
• Error recovery
• Implementation

Top-Down Parsing • From Grammar to Parser, take I

Recursive Descent Parsing
• General category of top-down parsing.
• Choose a production rule based on the input symbol.
• May require backtracking to correct a wrong choice.
• Example: S → c A d, A → ab | a, input: cad
• [Figure: successive parse trees for input cad. S is expanded to c A d; A is first expanded with A → ab, the b fails to match d, and the parser backtracks to try A → a, which succeeds.]
• Problem: backtracking
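The backtracking behavior on the cad example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`parse_S`, `parse_A`, `accepts`) are hypothetical, and generators are just one convenient way to express "try the next alternative"; the slides do not prescribe an implementation.

```python
# Grammar from the example: S -> c A d ; A -> a b | a
# Each parse function is a generator that yields every input position at
# which its non-terminal could end. Asking the generator for its next value
# is the backtracking step: the input position is reset and the next
# alternative is tried.

def parse_A(s, i):
    # Alternative 1: A -> a b
    if s.startswith("ab", i):
        yield i + 2
    # Alternative 2: A -> a (reached when alternative 1 fails here or later)
    if s.startswith("a", i):
        yield i + 1

def parse_S(s, i):
    # S -> c A d
    if s.startswith("c", i):
        for j in parse_A(s, i + 1):      # each j is one way A can match
            if s.startswith("d", j):
                yield j + 1

def accepts(s):
    # Accept only if some parse of S consumes the whole input.
    return any(end == len(s) for end in parse_S(s, 0))

print(accepts("cad"))   # A -> ab fails on 'd', so A -> a is used
print(accepts("cabd"))  # A -> ab succeeds directly
```

The cost of this approach is that work done under a wrong alternative is thrown away, which is what the predictive techniques that follow try to avoid.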

Top-Down Parsing • From Grammar to Parser, take II

Predictive Parsing
• Backtracking is bad!
• To eliminate backtracking, what must we ensure about the grammar?
• no left recursion
• left factoring has been applied
• When the grammar satisfies these conditions, the current input symbol together with the current non-terminal (frequently) uniquely determines the production to apply.
• Utilize transition diagrams. For each non-terminal of the grammar:
• 1. Create an initial and a final state.
• 2. If A → X1X2…Xn is a production, add a path from the initial to the final state with edges X1, X2, … , Xn.
• Once the transition diagrams have been developed, convert them into code in a straightforward way: one procedure per diagram, possibly recursive.

Transition Diagrams
• Recall the earlier grammar and its associated transition diagrams:
  E → TE’
  E’ → + TE’ | ε
  T → FT’
  T’ → * FT’ | ε
  F → ( E ) | id
• [Figure: transition diagrams. E: 0 -T-> 1 -E’-> 2. E’: 3 -+-> 4 -T-> 5 -E’-> 6, plus 3 -ε-> 6. T: 7 -F-> 8 -T’-> 9. T’: 10 -*-> 11 -F-> 12 -T’-> 13, plus 10 -ε-> 13. F: 14 -(-> 15 -E-> 16 -)-> 17, plus 14 -id-> 17.]
• Unlike their lexical equivalents, each edge represents a token.
• A transition implies: if the edge is a token, match it against the input; otherwise call the procedure for the non-terminal.
• How are transition diagrams used? Are ε-moves a problem? Can we simplify transition diagrams? Why is simplification critical?

How are Transition Diagrams Used?
main() { TD_E(); }
TD_E() { TD_T(); TD_E’(); }
TD_E’() { token = get_token(); if token = ‘+’ then { TD_T(); TD_E’(); } else unget() /* ε-move: put the token back and return */ }
TD_T() { TD_F(); TD_T’(); }
TD_T’() { token = get_token(); if token = ‘*’ then { TD_F(); TD_T’(); } else unget() /* ε-move: put the token back and return */ }
TD_F() { token = get_token(); if token = ‘(’ then { TD_E(); match(‘)’); } else if token.value <> id then { error + EXIT } else ... }
• What happened to ε-moves? They become the “else unget() and terminate” branch.
• NOTE: not all error conditions have been represented.
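A runnable version of these procedures might look like the following Python sketch. It assumes the input has already been tokenized into a list of strings such as "id", "+" and "(". The `Parser` class, its `peek`/`match` helpers and the `ParseError` exception are hypothetical names introduced for illustration. Peeking at the lookahead replaces the get_token()/unget() pair, so an ε-move is simply "return without consuming anything".

```python
class ParseError(Exception):
    """Raised when the token stream does not match the grammar."""

class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.peek() != expected:
            raise ParseError(f"expected {expected!r}, got {self.peek()!r}")
        self.pos += 1

    def E(self):            # E -> T E'
        self.T()
        self.E_prime()

    def E_prime(self):      # E' -> + T E' | epsilon
        if self.peek() == "+":
            self.match("+")
            self.T()
            self.E_prime()
        # otherwise: epsilon-move, consume nothing

    def T(self):            # T -> F T'
        self.F()
        self.T_prime()

    def T_prime(self):      # T' -> * F T' | epsilon
        if self.peek() == "*":
            self.match("*")
            self.F()
            self.T_prime()
        # otherwise: epsilon-move, consume nothing

    def F(self):            # F -> ( E ) | id
        if self.peek() == "(":
            self.match("(")
            self.E()
            self.match(")")
        elif self.peek() == "id":
            self.match("id")
        else:
            raise ParseError(f"unexpected token {self.peek()!r}")

def parse(tokens):
    parser = Parser(tokens)
    parser.E()
    parser.match("$")       # all input must be consumed
    return True

print(parse(["id", "+", "id", "*", "id"]))
```

Each procedure decides which production to use from a single lookahead token, so no backtracking is needed.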

How can Transition Diagrams be Simplified?
• [Figure, step 1: the original E’ diagram, 3 -+-> 4 -T-> 5 -E’-> 6, plus 3 -ε-> 6.]
• [Figure, step 2: the recursive E’ edge leaving state 5 is replaced by an ε-edge from 5 back to state 3.]
• [Figure, step 3: state 5 is eliminated, so the T edge leaving state 4 returns to state 3.]
• [Figure, step 4: the simplified E’ diagram shown next to the E diagram, 0 -T-> 1 -E’-> 2.]
• [Figure, step 5: the simplified E’ diagram is substituted for the E’ edge of E, giving 0 -T-> 3 -+-> 4 -T-> 3, plus 3 -ε-> 6.]

Additional Transition Diagram Simplifications
• Similar steps for T and T’.
• [Figure: simplified transition diagrams for T (states 7, 10, 13; edges F, *, ε), T’ (states 10, 11, 13; edges *, F, ε) and F (states 14 to 17; edges (, E, ), id).]
• Why is simplification important? How does the code change?
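One plausible answer to "How does the code change?" is that the tail-recursive calls to E’ and T’ become loops, so the separate E’ and T’ procedures disappear. The Python sketch below assumes that reading of the simplified diagrams. The function name `parse_expression` and the list-of-strings token format are illustrative choices, not something the slides specify.

```python
def parse_expression(tokens):
    """Parse E -> T E', E' -> + T E' | eps, T -> F T', T' -> * F T' | eps,
    F -> ( E ) | id, with the E' and T' recursion turned into loops."""
    toks = list(tokens) + ["$"]   # "$" marks end of input
    pos = 0

    def factor():                 # F -> ( E ) | id
        nonlocal pos
        if toks[pos] == "(":
            pos += 1
            expr()
            if toks[pos] != ")":
                raise SyntaxError(f"expected ')', got {toks[pos]!r}")
            pos += 1
        elif toks[pos] == "id":
            pos += 1
        else:
            raise SyntaxError(f"unexpected token {toks[pos]!r}")

    def term():                   # T: F, then loop on '*'
        nonlocal pos
        factor()
        while toks[pos] == "*":
            pos += 1
            factor()

    def expr():                   # E: T, then loop on '+'
        nonlocal pos
        term()
        while toks[pos] == "+":
            pos += 1
            term()

    expr()
    if toks[pos] != "$":
        raise SyntaxError(f"unexpected token {toks[pos]!r}")
    return True

print(parse_expression(["id", "*", "(", "id", "+", "id", ")"]))
```

Fewer procedures and fewer recursive calls mean less call overhead and a shallower stack, which is one reason the simplification matters in practice.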

Top-Down Parsing • From Grammar to Parser, take III

Motivating Table-Driven Parsing
1. Scan the input left to right.
2. Find a leftmost derivation.
• Grammar: E → TE’, E’ → +TE’ | ε, T → id
• Input: id + id $ ($ is the terminator)
• Derivation: E ⇒
• Processing stack:

Non-Recursive / Table Driven
• [Figure: a predictive parsing program reads the input (string + terminator, e.g. a + b $), consults the parsing table M[A,a], manipulates a stack (e.g. X Y Z $, where $ is the empty-stack symbol) and produces output.]
• Stack: holds non-terminal and terminal symbols of the CFG.
• Parsing table M[A,a]: what action the parser should take based on stack / input.
• General parser behavior (X: top of stack, a: current input):
• 1. When X = a = $: halt, accept, success.
• 2. When X = a ≠ $: POP X off the stack, advance the input, go to 1.
• 3. When X is a non-terminal, examine M[X,a]:
• if it is an error, call the recovery routine;
• if M[X,a] = {X → UVW}, POP X and PUSH W, V, U (so that U ends up on top). DO NOT expend any input.

Algorithm for Non-Recursive Parsing
Set ip (the input pointer) to point to the first symbol of w$;
repeat
  let X be the top stack symbol and a the symbol pointed to by ip;
  if X is a terminal or $ then
    if X = a then pop X from the stack and advance ip
    else error()
  else /* X is a non-terminal */
    if M[X,a] = X → Y1Y2…Yk then begin
      pop X from the stack;
      push Yk, Yk-1, … , Y1 onto the stack, with Y1 on top;
      output the production X → Y1Y2…Yk /* may also execute other code based on the production used */
    end
    else error()
until X = $ /* stack is empty */

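Example
Our well-worn example!
  E → TE’
  E’ → + TE’ | ε
  T → FT’
  T’ → * FT’ | ε
  F → ( E ) | id
Table M (rows: non-terminals; columns: input symbols; - marks an error entry):
Non-terminal  id          +           *           (           )           $
E             E → TE’     -           -           E → TE’     -           -
E’            -           E’ → +TE’   -           -           E’ → ε      E’ → ε
T             T → FT’     -           -           T → FT’     -           -
T’            -           T’ → ε      T’ → *FT’   -           T’ → ε      T’ → ε
F             F → id      -           -           F → (E)     -           -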

Trace of Example
STACK       INPUT            OUTPUT
$E          id + id * id$
$E’T        id + id * id$    E → TE’
$E’T’F      id + id * id$    T → FT’
$E’T’id     id + id * id$    F → id
$E’T’       + id * id$
$E’         + id * id$       T’ → ε
$E’T+       + id * id$       E’ → +TE’
$E’T        id * id$
$E’T’F      id * id$         T → FT’
$E’T’id     id * id$         F → id
$E’T’       * id$
$E’T’F*     * id$            T’ → *FT’
$E’T’F      id$
$E’T’id     id$              F → id
$E’T’       $
$E’         $                T’ → ε
$           $                E’ → ε
Expend input: each row after the first with no OUTPUT entry is reached by matching the terminal on top of the stack and advancing the input.
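The table-driven algorithm and table M for the expression grammar can be put together in a short Python sketch. The names `TABLE`, `NONTERMINALS` and `ll1_parse` are illustrative assumptions. The table is encoded as a dictionary from (non-terminal, input symbol) to a right-hand side, with missing keys playing the role of error entries, and error handling is reduced to raising `SyntaxError` rather than calling a recovery routine.

```python
# Parsing table M for the expression grammar; () stands for an epsilon body.
TABLE = {
    ("E", "id"): ("T", "E'"),       ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"),  ("E'", ")"): (),  ("E'", "$"): (),
    ("T", "id"): ("F", "T'"),       ("T", "("): ("F", "T'"),
    ("T'", "+"): (),                ("T'", "*"): ("*", "F", "T'"),
    ("T'", ")"): (),                ("T'", "$"): (),
    ("F", "id"): ("id",),           ("F", "("): ("(", "E", ")"),
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def ll1_parse(tokens, start="E"):
    """Return the list of productions used, in leftmost-derivation order."""
    inp = list(tokens) + ["$"]      # input string plus terminator
    stack = ["$", start]            # top of stack is the end of the list
    ip = 0
    output = []
    while True:
        X, a = stack[-1], inp[ip]
        if X not in NONTERMINALS:   # X is a terminal or $
            if X != a:
                raise SyntaxError(f"expected {X!r}, got {a!r}")
            if X == "$":
                return output       # X = a = $: accept
            stack.pop()             # match terminal, expend input
            ip += 1
        else:
            rhs = TABLE.get((X, a))
            if rhs is None:         # error entry in the table
                raise SyntaxError(f"no entry M[{X}, {a}]")
            stack.pop()
            stack.extend(reversed(rhs))   # push Yk ... Y1, Y1 on top
            output.append(f"{X} -> {' '.join(rhs) or 'epsilon'}")

for production in ll1_parse(["id", "+", "id", "*", "id"]):
    print(production)
```

Run on id + id * id, the productions are emitted in the same order as the OUTPUT column of the trace.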

Leftmost Derivation for the Example
The leftmost derivation for the example is as follows:
E ⇒ TE’ ⇒ FT’E’ ⇒ id T’E’ ⇒ id E’ ⇒ id + TE’ ⇒ id + FT’E’ ⇒ id + id T’E’ ⇒ id + id * FT’E’ ⇒ id + id * id T’E’ ⇒ id + id * id E’ ⇒ id + id * id

What’s the Missing Puzzle Piece?
Constructing the Parsing Table M!
• 1st: Calculate First & Follow for the grammar.
• 2nd: Apply the construction algorithm for the parsing table (we’ll see this shortly).
Basic Tools:
• First: Let α be a string of grammar symbols. First(α) is the set of every terminal that appears leftmost in α or in any string derived from α. NOTE: If α ⇒* ε, then ε is in First(α).
• Follow: Let A be a non-terminal. Follow(A) is the set of terminals a that can appear directly to the right of A in some sentential form (S ⇒* αAaβ, for some α and β). NOTE: If S ⇒* αA, then $ is in Follow(A).

Motivation Behind First & Follow
• First: used to help find the appropriate production to apply, given the top-of-the-stack non-terminal and the current input symbol.
  Example: If A → α and a is in First(α), then when a = input, replace A with α on the stack. (a is one of the first symbols of α, so when A is on the stack and a is the input, POP A and PUSH α.)
• Follow: used when First has a conflict, to resolve choices, or when First gives no suggestion. When α = ε or α ⇒* ε, what follows A dictates the next choice to be made.
  Example: If A → α and b is in Follow(A), then when α ⇒* ε and b is the input symbol, we expand A with α, which will eventually derive ε, after which b follows! (α ⇒* ε, i.e., First(α) contains ε.)

An example.
• Grammar: S → a B C d, B → CB | ε | S a, C → b
• Input: abbd
STACK   INPUT   OUTPUT
$S      abbd$

Computing First(X) : All Grammar Symbols
• 1. If X is a terminal, First(X) = {X}.
• 2. If X → ε is a production rule, add ε to First(X).
• 3. If X is a non-terminal and X → Y1Y2…Yk is a production rule:
• Place First(Y1), excluding ε, in First(X).
• If Y1 ⇒* ε, place First(Y2) in First(X).
• If Y2 ⇒* ε, place First(Y3) in First(X).
• …
• If Yk-1 ⇒* ε, place First(Yk) in First(X).
• NOTE: As soon as some Yi does not derive ε, stop.
• Repeat the above steps until no more elements are added to any First( ) set.
• Checking “Yj ⇒* ε?” essentially amounts to checking whether ε belongs to First(Yj).
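The rules above amount to a fixed-point iteration, which can be sketched in Python as follows. The grammar encoding (a dictionary from each non-terminal to a list of right-hand-side tuples, with the empty tuple for an ε-production) and the names `compute_first`, `GRAMMAR` and `EPS` are assumptions made for this illustration. The grammar used is the expression grammar from the earlier slides.

```python
EPS = "ε"   # marker for the empty string; assumed not to be a terminal

def compute_first(grammar):
    """grammar maps each non-terminal to a list of right-hand sides,
    each a tuple of symbols; the empty tuple () is an epsilon production."""
    first = {nt: set() for nt in grammar}

    def first_of(sym):
        # Rule 1: for a terminal X, First(X) = {X}.
        return first[sym] if sym in grammar else {sym}

    changed = True
    while changed:                        # repeat until nothing is added
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                before = len(first[nt])
                all_derive_eps = True
                for sym in rhs:           # Rule 3: walk Y1, Y2, ..., Yk
                    f = first_of(sym)
                    first[nt] |= f - {EPS}
                    if EPS not in f:      # Yi cannot derive epsilon: stop
                        all_derive_eps = False
                        break
                if all_derive_eps:        # Rule 2, and every Yi nullable
                    first[nt].add(EPS)
                if len(first[nt]) != before:
                    changed = True
    return first

GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",)],
}

for nt, symbols in compute_first(GRAMMAR).items():
    print(nt, sorted(symbols))
```

Testing "does Yi derive ε?" by checking whether ε is currently in First(Yi) is safe here because the outer loop reruns until no set changes.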

Computing First(X) : All Grammar Symbols - continued
• Informally, suppose we want to compute
  First(X1 X2 … Xn) = First(X1)
    “+” First(X2) if ε is in First(X1)
    “+” First(X3) if ε is in First(X2)
    …
    “+” First(Xn) if ε is in First(Xn-1)
• Note 1: Only add ε to First(X1 X2 … Xn) if ε is in First(Xi) for all i.
• Note 2: For First(X1), if X1 → Z1 Z2 … Zm, then we need to compute First(Z1 Z2 … Zm)!
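The informal “+” rule for a string of symbols can be sketched as a small Python function. It assumes the First sets of the individual non-terminals are already known; the values hard-coded in `FIRST` are those of the expression grammar used in the earlier slides. The names `first_of_sequence`, `FIRST` and `EPS` are hypothetical.

```python
EPS = "ε"   # marker for the empty string

# First sets of the expression grammar's non-terminals, taken as given here.
FIRST = {
    "E":  {"(", "id"},
    "E'": {"+", EPS},
    "T":  {"(", "id"},
    "T'": {"*", EPS},
    "F":  {"(", "id"},
}

def first_of_sequence(symbols, first=FIRST):
    """Compute First(X1 X2 ... Xn) from the First sets of single symbols."""
    result = set()
    for sym in symbols:
        f = first.get(sym, {sym})   # a terminal's First set is itself
        result |= f - {EPS}         # the "+" step, never copying epsilon
        if EPS not in f:            # Xi cannot vanish: later symbols
            return result           # contribute nothing
    result.add(EPS)                 # Note 1: every Xi can derive epsilon
    return result

print(sorted(first_of_sequence(["T", "E'"])))    # only First(T) contributes
print(sorted(first_of_sequence(["E'", "T'"])))   # both can vanish
```

The early return is what keeps First(E’) out of First(TE’): T cannot derive ε, so the scan stops at T.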

Example 1
Given the production rules:
  S → i E t S S’ | a
  S’ → e S | ε
  E → b
Verify that
  First(S) = { i, a }
  First(S’) = { e, ε }
  First(E) = { b }

Example 2
  E → TE’
  E’ → + TE’ | ε
  T → FT’
  T’ → * FT’ | ε
  F → ( E ) | id
Computing First for:
• First(E): First(TE’) = First(T) “+” First(E’), but not First(E’), since T does not derive ε; so First(E) = First(T).
• First(T): First(FT’) = First(F) “+” First(T’), but not First(T’), since F does not derive ε; so First(T) = First(F).
• First(F): First((E)) “+” First(id) = “(” and “id”.
Overall:
  First(E) = First(T) = First(F) = { ( , id }
  First(E’) = { + , ε }
  First(T’) = { * , ε }
