Understanding First and Follow Sets in Compiler Theory

Lesson 5 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg

Outline • The sets first and follow • Non-recursive predictive parsing • Handling syntax errors

The sets first and follow

Motivation
• A grammar that is problematic for predictive parsing:
    stmt → func_call | loop
    func_call → id ( args ) ;
    loop → while ( expr ) block
         | for ( expr ; expr ; expr ) block

Motivation
    stmt → func_call | loop
    func_call → id ( args ) ;
    loop → while ( expr ) block
         | for ( expr ; expr ; expr ) block
• FIRST(func_call) = { id }
• FIRST(loop) = { while, for }
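
The two FIRST sets above are enough for a parser to pick the right alternative of stmt from a single lookahead token. A minimal sketch of that decision, where the function name `predict_stmt` and the dictionary layout are illustrative assumptions rather than anything from the lesson:

```python
# FIRST sets of the two alternatives of stmt, as given on the slide.
FIRST = {
    "func_call": {"id"},
    "loop": {"while", "for"},
}

def predict_stmt(lookahead):
    """Pick the alternative of stmt whose FIRST set contains the lookahead."""
    for alternative, first_set in FIRST.items():
        if lookahead in first_set:
            return alternative
    raise SyntaxError(f"no alternative of stmt starts with {lookahead!r}")

print(predict_stmt("while"))  # loop
print(predict_stmt("id"))     # func_call
```

Because the two sets are disjoint, at most one alternative can match, so the choice never needs backtracking.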

FIRST(α)
• Simple case: α starts with a terminal a: FIRST(α) = { a }
• Harder case: α starts with a nonterminal A
  • Must examine what A can produce
• If α ⇒* ε then ε ∊ FIRST(α)

Computing FIRST(X)
• Start with Ø
• If X is a terminal then add X and return
• If X ⇒* ε then add ε
• For all rules X → Y1 Y2 ... Yk do
  • For all Yi, where i = 1..k, do
    • Add FIRST(Yi) except for ε
    • If ε is not in FIRST(Yi) then break
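
The steps above can be sketched in Python. This version does not recurse per symbol; it applies the same rules to all nonterminals repeatedly until no set grows, which is a common way to cope with rules that refer to each other. The grammar is the expression grammar from the lesson's example 4.30; the dictionary encoding (an empty list for an ε-production) and the name `compute_first` are assumptions made for this sketch.

```python
EPS = "ε"

# Each nonterminal maps to its right-hand sides; [] stands for an ε-production.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}

def compute_first(grammar):
    """Return FIRST sets for all nonterminals of the grammar."""
    first = {nt: set() for nt in grammar}          # start with Ø
    changed = True
    while changed:                                 # repeat until nothing grows
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                before = len(first[nt])
                all_nullable = True
                for sym in rhs:
                    # A terminal's FIRST set is just the terminal itself.
                    sym_first = first[sym] if sym in grammar else {sym}
                    first[nt] |= sym_first - {EPS}
                    if EPS not in sym_first:
                        all_nullable = False
                        break
                if all_nullable:                   # every Yi can vanish, or rhs is ε
                    first[nt].add(EPS)
                if len(first[nt]) != before:
                    changed = True
    return first

for nt, symbols in compute_first(GRAMMAR).items():
    print(f"FIRST({nt}) =", sorted(symbols))
```

The loop is guaranteed to stop because the sets only ever grow and the number of terminals is finite.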

FIRST example (4.30 in the book)
• Grammar:
    E → T E'
    E' → + T E' | ε
    T → F T'
    T' → * F T' | ε
    F → ( E ) | id
• FIRST sets:
    FIRST(E) = { (, id }
    FIRST(T) = { (, id }
    FIRST(F) = { (, id }
    FIRST(E') = { +, ε }
    FIRST(T') = { *, ε }

FIRST example (4.30 in the book)
• Grammar:
    E → T E'
    E' → + T E' | ε
    T → F T'
    T' → * F T' | ε
    F → ( E ) | id
E ⇒ T E' ⇒ F T' E' ⇒ ( E ) T' E' ⇒ …
E ⇒ T E' ⇒ F T' E' ⇒ id T' E' ⇒ …
FIRST(E) = { (, id } seems correct!

FIRST example (4.30 in the book)
• Grammar:
    E → T E'
    E' → + T E' | ε
    T → F T'
    T' → * F T' | ε
    F → ( E ) | id
E' ⇒ + T E' ⇒ …
+ ∈ FIRST(E') seems correct!
T' ⇒ * F T' ⇒ …
* ∈ FIRST(T') seems correct!

Exercise (1)
• Compute FIRST(K) and FIRST(M):
    K → K , i : M
    K → i : M
    M → M , i
    M → i
• Compute FIRST(S), FIRST(A), and FIRST(B):
    S → 1 A : A
    S → 0 : A
    A → A B
    A → ε
    B → 0
    B → 1

FOLLOW(A)
• “What can follow A?”
• Example grammar:
    S → a A b A c
    A → d | e
• FOLLOW(A) = { b, c }

Computing FOLLOW(A)
• Start with Ø
• If A is the start symbol then add $
• For all rules B → α A β do
  • Add everything except ε from FIRST(β)
• For all rules B → α A, or B → α A β where ε ∊ FIRST(β), do
  • Add everything from FOLLOW(B)
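
A sketch of these rules in Python, again iterating until no set changes. To stay self-contained it hard-codes the expression grammar of example 4.30 together with the FIRST sets the lesson lists for it; the encoding, the helper `first_of_string`, and the name `compute_follow` are assumptions of this sketch.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],   # [] stands for an ε-production
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
# FIRST sets as listed in the lesson for this grammar.
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}

def first_of_string(symbols):
    """FIRST(β) for a sequence of grammar symbols β."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})   # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)                         # β is empty or can derive ε
    return result

def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}  # start with Ø
    follow[start].add(END)                  # $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue            # FOLLOW is only kept for nonterminals
                    beta_first = first_of_string(rhs[i + 1:])
                    new = beta_first - {EPS}
                    if EPS in beta_first:   # nothing (or only ε) after sym
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

for nt, symbols in compute_follow(GRAMMAR, "E").items():
    print(f"FOLLOW({nt}) =", sorted(symbols))
```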

FOLLOW example (4.30 in the book)
• Grammar:
    E → T E'
    E' → + T E' | ε
    T → F T'
    T' → * F T' | ε
    F → ( E ) | id
• FOLLOW sets:
    FOLLOW(E) = { $, ) }
    FOLLOW(E') = { $, ) }
    FOLLOW(T) = { +, $, ) }
    FOLLOW(T') = { +, $, ) }
    FOLLOW(F) = { *, +, $, ) }

FOLLOW example (4.30 in the book)
• Grammar:
    E → T E'
    E' → + T E' | ε
    T → F T'
    T' → * F T' | ε
    F → ( E ) | id
E $ ⇒ T E' $ ⇒ F T' E' $ ⇒ ( E ) T' E' $ ⇒ ( T E' ) T' E' $ ⇒ …
FOLLOW(E) = { $, ) } seems correct!
FOLLOW(E') = { $, ) } seems correct!

FOLLOW example (4.30 in the book)
• Grammar:
    E → T E'
    E' → + T E' | ε
    T → F T'
    T' → * F T' | ε
    F → ( E ) | id
E $ ⇒ T E' $ ⇒ T + T E' $ ⇒ T + T $ ⇒ T + F T' $ ⇒ T + ( E ) T' $ ⇒ T + ( T E' ) T' $ ⇒ T + ( T ) T' $ ⇒ …
FOLLOW(T) = { +, $, ) } seems correct!

Exercise (2)
• Compute FOLLOW(K) and FOLLOW(M):
    K → K , i : M
    K → i : M
    M → M , i
    M → i
• Compute FOLLOW(S), FOLLOW(A), and FOLLOW(B):
    S → 1 A : A
    S → 0 : A
    A → A B
    A → ε
    B → 0
    B → 1

LL(1) grammars
• Not left-recursive
• Not ambiguous
• For all A → α | β:
  • FIRST(α) ∩ FIRST(β) = Ø
  • If ε ∊ FIRST(α) then FOLLOW(A) ∩ FIRST(β) = Ø
  • If ε ∊ FIRST(β) then FOLLOW(A) ∩ FIRST(α) = Ø
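
The three set conditions can be checked mechanically for every pair of alternatives. The sketch below covers only those pairwise conditions; it does not separately test for left recursion or ambiguity. It uses the grammar of example 4.30 with the FIRST and FOLLOW sets listed in the lesson; the function names and the conflict labels are assumptions of this sketch.

```python
from itertools import combinations

EPS = "ε"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],   # [] stands for an ε-production
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"+", "$", ")"}, "T'": {"+", "$", ")"},
    "F": {"*", "+", "$", ")"},
}

def first_of_string(symbols, first):
    """FIRST of a sequence of grammar symbols."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})   # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def ll1_conflicts(grammar, first, follow):
    """List (nonterminal, kind) for every violated pairwise condition."""
    conflicts = []
    for head, alternatives in grammar.items():
        for alpha, beta in combinations(alternatives, 2):
            fa = first_of_string(alpha, first)
            fb = first_of_string(beta, first)
            if fa & fb:
                conflicts.append((head, "FIRST/FIRST"))
            if EPS in fa and follow[head] & fb:
                conflicts.append((head, "FIRST/FOLLOW"))
            if EPS in fb and follow[head] & fa:
                conflicts.append((head, "FIRST/FOLLOW"))
    return conflicts

print(ll1_conflicts(GRAMMAR, FIRST, FOLLOW))  # []
```

An empty list means no pair of alternatives violates the conditions, which is what one expects for this grammar.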

Non-recursive predictive parsing

Types of top-down parsers • Predictive recursive descent parsers • Lab 1 • General recursive descent parsers • Non-recursive predictive parsers

Non-recursive predictive parsers • Keeps a stack of expected symbols • Loops: • Pop a symbol X • If X is a terminal, match with lookahead • If X is a nonterminal, predict and push
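
The loop described above can be sketched as follows. The table is hand-written for the expression grammar of example 4.30 and should be read as an assumption of this sketch, not as the table shown in the lecture; the function name `parse` and the choice to return the applied productions are likewise illustrative.

```python
END = "$"

NONTERMINALS = {"E", "E'", "T", "T'", "F"}

# (nonterminal, lookahead) -> right-hand side to push; [] stands for ε.
TABLE = {
    ("E", "id"): ["T", "E'"],   ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", END): [],
    ("T", "id"): ["F", "T'"],   ("T", "("): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", "+"): [], ("T'", ")"): [], ("T'", END): [],
    ("F", "id"): ["id"],        ("F", "("): ["(", "E", ")"],
}

def parse(tokens, start="E"):
    """Return the productions applied, or raise SyntaxError."""
    tokens = list(tokens) + [END]
    pos = 0
    stack = [END, start]              # expected symbols, top at the end
    applied = []
    while stack:
        top = stack.pop()             # pop a symbol X
        lookahead = tokens[pos]
        if top in NONTERMINALS:       # nonterminal: predict and push
            rhs = TABLE.get((top, lookahead))
            if rhs is None:
                raise SyntaxError(f"unexpected {lookahead!r} while expanding {top}")
            applied.append((top, rhs))
            stack.extend(reversed(rhs))   # leftmost symbol ends up on top
        elif top == lookahead:        # terminal: match with lookahead
            pos += 1
        else:
            raise SyntaxError(f"expected {top!r}, found {lookahead!r}")
    return applied

for head, rhs in parse(["id", "*", "id"]):
    print(head, "→", " ".join(rhs) or "ε")
```

The right-hand side is pushed in reverse so that its leftmost symbol is the next one popped, which is what makes the parser produce a leftmost derivation.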

Parse table • Encodes predictions:

Demonstration Parse the string id * id using the previous parse table

Constructing the parse table
• For each rule A → α do
  • For each terminal a in FIRST(α) do
    • Write A → α in position M[A, a]
  • If ε is in FIRST(α) then
    • For each element b in FOLLOW(A) do
      • Add A → α in position M[A, b]
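
A sketch of this construction in Python, using the grammar of example 4.30 and the FIRST and FOLLOW sets listed in the lesson. The dictionary encoding, the helper `first_of_string`, and the decision to raise an error when two rules land in the same cell are assumptions of this sketch.

```python
EPS = "ε"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],   # [] stands for an ε-production
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"+", "$", ")"}, "T'": {"+", "$", ")"},
    "F": {"*", "+", "$", ")"},
}

def first_of_string(symbols):
    """FIRST(α) for a sequence of grammar symbols α."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})   # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def build_table(grammar, follow):
    """Return M as a dict: (nonterminal, terminal) -> right-hand side."""
    table = {}
    for head, alternatives in grammar.items():
        for rhs in alternatives:
            rhs_first = first_of_string(rhs)
            targets = rhs_first - {EPS}          # terminals a in FIRST(α)
            if EPS in rhs_first:
                targets |= follow[head]          # plus every b in FOLLOW(A)
            for terminal in targets:
                if (head, terminal) in table:
                    raise ValueError(f"conflict at M[{head}, {terminal}]")
                table[head, terminal] = rhs
    return table

table = build_table(GRAMMAR, FOLLOW)
for (head, terminal), rhs in sorted(table.items()):
    print(f"M[{head}, {terminal}] = {head} → {' '.join(rhs) or 'ε'}")
```

A cell that would receive two rules signals that the grammar is not LL(1), which is why the sketch stops with an error instead of overwriting the entry.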

Handling syntax errors

Types of errors • Lexical • Syntactic • Semantic • Logical

Handling errors • Point out the spot • Tell the reason • Try to recover and proceed compiling • Do not generate code

Recovery strategies • Panic mode • Phrase-level • Error productions • Global correction

Panic mode • Discard until synchronizing token • What are good synchronizing tokens? • Properties: • Simple and fast • Might miss errors in discarded input
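
A minimal sketch of the discarding step. The function name `skip_to_sync` and the token list are made up for illustration; choosing the statement terminator ";" as the synchronizing token is one common option, and the FOLLOW set of the nonterminal being expanded is another.

```python
def skip_to_sync(tokens, pos, sync_tokens):
    """Discard tokens from index pos until a synchronizing token or end of input.

    Returns the index where parsing can resume.
    """
    while pos < len(tokens) and tokens[pos] not in sync_tokens:
        pos += 1
    return pos

# Hypothetical token stream with an error detected at index 2.
tokens = ["id", "(", "+", "+", ")", ";", "while"]
resume = skip_to_sync(tokens, 2, {";"})
print(resume, tokens[2:resume])  # 5 ['+', '+', ')']
```

Everything between the error and the synchronizing token is dropped without being checked, which is exactly why errors inside the discarded input can go unreported.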

Phrase-level • Try to “fix” the input • Replace a comma by a semicolon • Delete or insert a semicolon • …

Error productions • Anticipate common errors • Add productions for these • One variant supported in Bison

Global correction • Try to find alternative parse tree • Minimize corrections • Too costly

Conclusion • The sets first and follow • Definition of LL(1) grammars • Non-recursive predictive parsing • Handling syntax errors

Next time • Code generation using syntax-directed translation • Lexical analysis
