CSCI 3130: Automata theory and formal languages
LR(1) grammars
Andrej Bogdanov
The Chinese University of Hong Kong, Fall 2010
http://www.cse.cuhk.edu.hk/~andrejb/csc3130
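
LR(0) parsing review

Grammar: A → aAb | ab

A parser generator takes a CFG G and produces a “PDA” for parsing G. It reports an error if G is not LR(0).

States of the parsing automaton (from the figure):
  1: A → •aAb, A → •ab
  2: A → a•Ab, A → a•b, A → •aAb, A → •ab
  3: A → aA•b
  4: A → aAb•
  5: A → ab•
Transitions: 1 --a--> 2, 2 --a--> 2, 2 --A--> 3, 2 --b--> 5, 3 --b--> 4

Motivation: fast parsing for programming languages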
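
Parsing computer programs

  if (n == 0) { return x; } else { return x + 1; }

[Figure: parse tree of the statement. The root Statement expands to if ParExpression Statement else Statement; ParExpression expands to (Expression), and each inner Statement expands to a Block.]

Most programming language CFGs are not LR(0)!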

LR(0) parsing review

Grammar: A → aAb | ab
Input: aabb

  stack    state  action
  (empty)  1      S
  1        2      S
  12       2      S
  122      5      R
  12       3      S
  123      4      R

[Figure: the five-state automaton for this grammar and the parse tree built for aabb.]
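
The trace above can be replayed with a small table-driven sketch. The dictionaries `SHIFT`, `GOTO` and `REDUCE` and the function `parse` are hypothetical names chosen for illustration; the state numbers follow the trace, and acceptance is assumed to happen when a reduction returns to state 1 with no input left.

```python
# Hypothetical encoding of the five-state LR(0) automaton for A -> aAb | ab.
SHIFT = {(1, "a"): 2, (2, "a"): 2, (2, "b"): 5, (3, "b"): 4}  # moves on terminals
GOTO = {(2, "A"): 3}                                           # moves on the variable A
REDUCE = {4: ("A", 3), 5: ("A", 2)}  # state -> (left-hand side, length of right-hand side)

def parse(word):
    """Return the list of (stack, state, action) rows, or raise ValueError."""
    path, pos, trace = [1], 0, []      # path = stack of states, current state last
    while True:
        state = path[-1]
        stack = "".join(str(s) for s in path[:-1])
        if state in REDUCE:
            trace.append((stack, state, "R"))
            lhs, length = REDUCE[state]
            del path[-length:]         # pop one state per symbol of the right-hand side
            if (path[-1], lhs) in GOTO:
                path.append(GOTO[(path[-1], lhs)])
            elif path == [1] and pos == len(word):
                return trace           # assumed acceptance condition
            else:
                raise ValueError("input continues after a complete parse")
        elif pos < len(word) and (state, word[pos]) in SHIFT:
            trace.append((stack, state, "S"))
            path.append(SHIFT[(state, word[pos])])
            pos += 1
        else:
            raise ValueError("no valid move at position %d" % pos)

for row in parse("aabb"):
    print(row)
```

Keeping the current state on top of a single list makes a reduction a plain slice deletion; the printed rows show the stack without its top element, as in the table.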

Meaning of LR(0) items

Item: A → α•Xβ (• marks the focus; the part to its right is the undiscovered part)

NFA transitions to:
  X → •γ: shift focus to the subtree rooted at X (if X is a nonterminal)
  A → αX•β: move past the subtree rooted at X

Outline of LR(0) parsing algorithm

• An LR(0) parser has two kinds of actions:
    shift (S): no complete item is valid
    reduce (R): there is one valid item, and it is complete
• What if:
    some valid items are complete, some not: S / R conflict
    more than one valid complete item: R / R conflict
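
The case analysis above can be written as a short function. This is only a sketch: the item encoding `(lhs, rhs, dot)` and the name `classify` are assumptions made for illustration, and the list of valid items is assumed to be nonempty.

```python
def classify(items):
    """items: the valid LR(0) items, each a tuple (lhs, rhs, dot position)."""
    complete = [(lhs, rhs) for lhs, rhs, dot in items if dot == len(rhs)]
    if not complete:
        return ["shift"]                      # no complete item is valid
    if len(items) == 1:
        return ["reduce"]                     # one valid item, and it is complete
    problems = []
    if len(complete) < len(items):
        problems.append("S/R conflict")       # some items complete, some not
    if len(complete) > 1:
        problems.append("R/R conflict")       # more than one complete item
    return problems

# State 2 of the automaton for A -> aAb | ab: every item is incomplete.
print(classify([("A", "aAb", 1), ("A", "ab", 1), ("A", "aAb", 0), ("A", "ab", 0)]))
# State 5: the single item A -> ab. is complete.
print(classify([("A", "ab", 2)]))
```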

Hierarchy of context-free grammars

context-free grammars: CYK algorithm (slow)
  LR(1) grammars: allow some conflicts; conflicts can be resolved by lookahead
    LR(0) grammars: LR(0) parsing algorithm

A CFG that is not LR(0)

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

input: a
valid LR(0) items at the start, to be updated as the input is read:
  S → •A, S → •Bc
  A → •aA, A → •a
  B → •a, B → •ab

A CFG that is not LR(0)

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

input: a, next symbol hidden (peek inside!)
valid LR(0) items:
  A → a•A, A → a•
  B → a•, B → a•b
  A → •aA, A → •a

S/R, R/R conflicts! Possible actions: R(4), R(5), S(6)

[Figure: three possible parse trees for inputs starting with a, two through S → A and one through S → Bc.]

Lookahead

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

input: a, next symbol a (peek inside!)
valid LR(0) items:
  A → a•A, A → a•
  B → a•, B → a•b
  A → •aA, A → •a

action: shift
The parse tree must look like this: S ⇒ A ⇒ aA ⇒ aa…

Lookahead

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

input: a a, next symbol a (peek inside!)
valid LR(0) items:
  A → a•A, A → a•
  A → •aA, A → •a

action: shift
The parse tree must look like this: S ⇒ A ⇒ aA ⇒ aaA ⇒ aaa…

Lookahead

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

input: a a a, next symbol ε (end of input)
valid LR(0) items:
  A → a•A, A → a•
  A → •aA, A → •a

action: reduce
The parse tree must look like this: S ⇒ A ⇒ aA ⇒ aaA ⇒ aaa

LR(0) items vs. LR(1) items

Grammar: A → aAb | ab
LR(0) item: A → a•Ab
LR(1) item: [A → a•Ab, b]

[Figure: partial parse trees illustrating each kind of item.]

LR(1) items

[A → α•β, x]
[A → α•β, ε]

[Figure: partial parse trees for the two items, the first with the symbol x following the subtree rooted at A.]

Generating an LR(1) parser

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

CFG → NFA → DFA + stack
  NFA: states are LR(1) items
  DFA + stack: may have S/R, R/R conflicts

A CFG is LR(1) if conflicts can always be resolved with one symbol of lookahead.

NFA for LR(0) parsing

Notation:
  a, b: terminals
  A, B, C: variables
  α, β, δ: mixed strings
  X: terminal or variable

Transitions:
  q0 --ε--> S → •α             for every LR(0) item S → •α
  A → α•Xβ --X--> A → αX•β     for every LR(0) item A → α•Xβ
  A → α•Cβ --ε--> C → •δ       for every pair of LR(0) items A → α•Cβ, C → •δ
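
The three rules can be turned into a short generator of NFA transitions. This is a sketch under stated assumptions: productions are given as `(lhs, rhs)` pairs with single-character symbols, an item is encoded as `(lhs, rhs, dot)`, an ε-move is labelled with the empty string, and the name `lr0_nfa` is made up for illustration.

```python
def lr0_nfa(productions, start):
    """Return the set of transitions (source, label, target) of the LR(0) NFA."""
    variables = {lhs for lhs, _ in productions}
    edges = set()
    for lhs, rhs in productions:
        if lhs == start:
            edges.add(("q0", "", (lhs, rhs, 0)))            # q0 --ε--> S -> .α
        for dot in range(len(rhs)):
            item, X = (lhs, rhs, dot), rhs[dot]
            edges.add((item, X, (lhs, rhs, dot + 1)))       # move the dot past X
            if X in variables:
                for lhs2, rhs2 in productions:
                    if lhs2 == X:
                        edges.add((item, "", (X, rhs2, 0)))  # ε-move into a production of X
    return edges

for edge in sorted(lr0_nfa([("A", "aAb"), ("A", "ab")], "A"), key=str):
    print(edge)
```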

NFA for LR(1) parsing

Notation:
  a, b: terminals
  A, B, C: variables
  α, β, δ: mixed strings
  X: terminal or variable

Transitions:
  q0 --ε--> [S → •α, ε]                for every item S → •α
  [A → α•Xβ, x] --X--> [A → αX•β, x]   for every LR(1) item [A → α•Xβ, x]
  [A → α•Cβ, x] --ε--> [C → •δ, y]     for every LR(1) item [A → α•Cβ, x], production C → δ, and every y in FIRST(βx)
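
A minimal sketch of these rules, generating only the items reachable from q0. Several things here are assumptions rather than part of the slides: the end-of-input lookahead (written ε above) is encoded as the marker `"$"`, so that FIRST(βx) is never taken of an empty string; the grammar is assumed to have no ε-productions; and the names `END` and `lr1_nfa` are hypothetical.

```python
END = "$"  # assumed end-of-input marker; the slides write this lookahead as ε

def lr1_nfa(productions, start):
    """Return the transitions (source, label, target); label "" is an ε-move.
    Items are tuples (lhs, rhs, dot, lookahead). Assumes no ε-productions."""
    variables = {lhs for lhs, _ in productions}
    first = {v: set() for v in variables}
    changed = True
    while changed:  # fixed point for FIRST of each variable
        changed = False
        for lhs, rhs in productions:
            new = first[rhs[0]] if rhs[0] in variables else {rhs[0]}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True

    def first_of(symbols):  # never empty here: the string always ends with a lookahead
        return first[symbols[0]] if symbols[0] in variables else {symbols[0]}

    start_items = [(lhs, rhs, 0, END) for lhs, rhs in productions if lhs == start]
    edges = {("q0", "", item) for item in start_items}
    seen, todo = set(start_items), list(start_items)
    while todo:
        item = todo.pop()
        lhs, rhs, dot, x = item
        if dot == len(rhs):
            continue                                   # complete item: no outgoing moves
        X = rhs[dot]
        targets = [(X, (lhs, rhs, dot + 1, x))]        # move the dot past X
        if X in variables:
            for y in first_of(rhs[dot + 1:] + x):      # every y in FIRST(βx)
                targets += [("", (X, delta, 0, y)) for l2, delta in productions if l2 == X]
        for label, target in targets:
            edges.add((item, label, target))
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return edges

G = [("S", "A"), ("S", "Bc"), ("A", "aA"), ("A", "a"), ("B", "a"), ("B", "ab")]
print(len({target for _, _, target in lr1_nfa(G, "S")}), "reachable LR(1) items")
```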
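
Explaining the transitions

[A → α•Xβ, x] --X--> [A → αX•β, x]
[Figure: the focus moves past the subtree rooted at X; the lookahead x after A is unchanged.]

[A → α•Cβ, x] --ε--> [C → •δ, y], where y ∈ FIRST(βx)
[Figure: the focus shifts to the subtree rooted at C; y is the symbol that follows that subtree.]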

FIRST sets

S → A (1) | cB (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

The transition [A → α•Cβ, x] --ε--> [C → •δ, y] exists for every y in FIRST(βx).

  γ     FIRST(γ)
  a     {a}
  A     {a}
  S     {a, c}
  cA    {c}
  BA    {a}
  ε     ∅

FIRST(γ) is the set of all leftmost terminals in derivations γ ⇒ ...
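
The table can be reproduced with a fixed-point computation. This is a sketch that assumes the grammar has no ε-productions (true for the grammar on this slide), so only the first symbol of each right-hand side matters; `first_sets` and `first_of` are names chosen for illustration.

```python
def first_sets(productions):
    """FIRST(V) for each variable V. Assumes no ε-productions."""
    variables = {lhs for lhs, _ in productions}
    first = {v: set() for v in variables}
    changed = True
    while changed:                       # repeat until no set grows
        changed = False
        for lhs, rhs in productions:
            head = rhs[0]
            new = first[head] if head in variables else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first

def first_of(gamma, first):
    """FIRST of a mixed string γ; the empty string gives the empty set, as in the table."""
    if not gamma:
        return set()
    return set(first.get(gamma[0], {gamma[0]}))

G = [("S", "A"), ("S", "cB"), ("A", "aA"), ("A", "a"), ("B", "a"), ("B", "ab")]
first = first_sets(G)
for gamma in ["a", "A", "S", "cA", "BA", ""]:
    print(repr(gamma), sorted(first_of(gamma, first)))
```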
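
Example: Constructing the NFA

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

First transitions:
  q0 --ε--> [S → •A, ε]
  q0 --ε--> [S → •Bc, ε]
  [S → •A, ε] --A--> [S → A•, ε]
  [S → •A, ε] --ε--> [A → •aA, ε]
  [S → •A, ε] --ε--> [A → •a, ε]
  [S → •Bc, ε] --B--> [S → B•c, ε]
  [S → •Bc, ε] --ε--> [B → •a, c]
  [S → •Bc, ε] --ε--> [B → •ab, c]
  . . .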
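
Example: Constructing the NFA

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

All transitions:
  q0 --ε--> [S → •A, ε]
  q0 --ε--> [S → •Bc, ε]
  [S → •A, ε] --A--> [S → A•, ε]
  [S → •A, ε] --ε--> [A → •aA, ε]
  [S → •A, ε] --ε--> [A → •a, ε]
  [A → •aA, ε] --a--> [A → a•A, ε]
  [A → a•A, ε] --A--> [A → aA•, ε]
  [A → a•A, ε] --ε--> [A → •aA, ε]
  [A → a•A, ε] --ε--> [A → •a, ε]
  [A → •a, ε] --a--> [A → a•, ε]
  [S → •Bc, ε] --B--> [S → B•c, ε]
  [S → B•c, ε] --c--> [S → Bc•, ε]
  [S → •Bc, ε] --ε--> [B → •a, c]
  [S → •Bc, ε] --ε--> [B → •ab, c]
  [B → •a, c] --a--> [B → a•, c]
  [B → •ab, c] --a--> [B → a•b, c]
  [B → a•b, c] --b--> [B → ab•, c]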
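
Example: Convert NFA to DFA

S → A | Bc
A → aA | a
B → a | ab

Legend: shift variable, shift terminal, reduce

DFA states (8 in total):
  1: [S → •A, ε], [S → •Bc, ε], [A → •aA, ε], [A → •a, ε], [B → •a, c], [B → •ab, c]
  2: [A → a•A, ε], [A → •aA, ε], [A → •a, ε], [A → a•, ε], [B → a•b, c], [B → a•, c]
  3: [A → a•A, ε], [A → •aA, ε], [A → •a, ε], [A → a•, ε]
  6: [S → B•c, ε]
  7: [S → Bc•, ε]
  8: [B → ab•, c]
  The remaining two states are {[S → A•, ε]} and {[A → aA•, ε]}.

Transitions: 1 --a--> 2, 2 --a--> 3, 3 --a--> 3, 2 --b--> 8, 1 --B--> 6, 6 --c--> 7, 1 --A--> {[S → A•, ε]}, 2 --A--> {[A → aA•, ε]}, 3 --A--> {[A → aA•, ε]}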
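
The conversion itself is the usual subset construction with ε-closure. The sketch below is generic and not specific to this grammar: it takes any set of transitions `(source, label, target)` with `""` as the ε label, and the name `determinize` is an assumption. To keep the example small it is run on the LR(0) NFA for A → aAb | ab, with items written as plain strings.

```python
def determinize(edges, start="q0"):
    """Subset construction. Returns (start state, {(state, label): state})."""
    def closure(states):
        result, todo = set(states), list(states)
        while todo:                                   # follow ε-moves
            s = todo.pop()
            for src, label, dst in edges:
                if src == s and label == "" and dst not in result:
                    result.add(dst)
                    todo.append(dst)
        return frozenset(result)

    first = closure({start})
    dfa, seen, todo = {}, {first}, [first]
    while todo:
        S = todo.pop()
        for label in {l for src, l, _ in edges if src in S and l}:
            T = closure({dst for src, l, dst in edges if src in S and l == label})
            dfa[(S, label)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    return first, dfa

nfa = {
    ("q0", "", "A->.aAb"), ("q0", "", "A->.ab"),
    ("A->.aAb", "a", "A->a.Ab"), ("A->a.Ab", "A", "A->aA.b"), ("A->aA.b", "b", "A->aAb."),
    ("A->a.Ab", "", "A->.aAb"), ("A->a.Ab", "", "A->.ab"),
    ("A->.ab", "a", "A->a.b"), ("A->a.b", "b", "A->ab."),
}
start, dfa = determinize(nfa)
print(len({start} | set(dfa.values())), "DFA states")
```

Each DFA state is a frozenset of NFA states, so it can be used directly as a dictionary key.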

Example: Resolving conflicts by lookahead

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

Legend: shift variable, shift terminal, reduce

State 2: [A → a•A, ε], [A → •aA, ε], [A → •a, ε], [B → a•b, c], [A → a•, ε], [B → a•, c]
State 3: [A → a•A, ε], [A → •aA, ε], [A → •a, ε], [A → a•, ε]

  next   action in state 2   action in state 3
  a      shift               shift
  b      shift               error
  c      reduce B            error
  ε      reduce A            reduce A
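
The two action tables can be derived mechanically from the items of each state. In this sketch an item is a tuple `(lhs, rhs, dot, lookahead)`, the end-of-input lookahead ε is encoded as the assumed marker `"$"`, and the function name `action` is hypothetical.

```python
END = "$"  # assumed end-of-input marker; the slides write this lookahead as ε

def action(items, next_symbol):
    """Decide the action of one DFA state for a given next input symbol."""
    can_shift = any(dot < len(rhs) and rhs[dot] == next_symbol
                    for lhs, rhs, dot, x in items)
    reduces = [lhs for lhs, rhs, dot, x in items
               if dot == len(rhs) and x == next_symbol]
    if can_shift and not reduces:
        return "shift"
    if len(reduces) == 1 and not can_shift:
        return "reduce " + reduces[0]
    if not can_shift and not reduces:
        return "error"
    return "conflict"        # lookahead does not resolve the choice

state2 = [("A", "aA", 1, END), ("A", "aA", 0, END), ("A", "a", 0, END),
          ("B", "ab", 1, "c"), ("A", "a", 1, END), ("B", "a", 1, "c")]
state3 = [("A", "aA", 1, END), ("A", "aA", 0, END), ("A", "a", 0, END),
          ("A", "a", 1, END)]
for symbol in ["a", "b", "c", END]:
    print(symbol, action(state2, symbol), action(state3, symbol))
```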

Example: Reconstruct the parse tree

Input: abc

  stack    state  action
  (empty)  1      S
  1        2      S
  12       8      R
  1        6      S
  16       7      R

[Figure: the eight-state DFA from the previous example, and the reconstructed parse tree with S → Bc and B → ab over the input abc.]
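
The whole pipeline can be sketched as a small parser that builds DFA states on demand and records the reductions, from which the parse tree can be read off bottom-up. This is an illustration under assumptions, not the course's implementation: `"$"` stands for the end-of-input lookahead ε, the grammar has no ε-productions, symbols are single characters, and `make_parser` is a made-up name.

```python
END = "$"  # assumed end-of-input marker; the slides write this lookahead as ε

def make_parser(productions, start):
    variables = {lhs for lhs, _ in productions}
    first = {v: set() for v in variables}
    changed = True
    while changed:  # FIRST sets of the variables (no ε-productions assumed)
        changed = False
        for lhs, rhs in productions:
            new = first[rhs[0]] if rhs[0] in variables else {rhs[0]}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True

    def closure(items):  # follow the ε-moves of the LR(1) NFA
        items = set(items)
        todo = list(items)
        while todo:
            lhs, rhs, dot, x = todo.pop()
            if dot < len(rhs) and rhs[dot] in variables:
                rest = rhs[dot + 1:] + x
                ys = first[rest[0]] if rest[0] in variables else {rest[0]}
                for lhs2, delta in productions:
                    for y in ys if lhs2 == rhs[dot] else ():
                        if (lhs2, delta, 0, y) not in items:
                            items.add((lhs2, delta, 0, y))
                            todo.append((lhs2, delta, 0, y))
        return frozenset(items)

    def goto(state, X):  # DFA move on symbol X
        return closure({(l, r, d + 1, x) for l, r, d, x in state
                        if d < len(r) and r[d] == X})

    def parse(word):
        """Return the list of reductions in the order they are applied."""
        word += END
        path = [closure({(l, r, 0, END) for l, r in productions if l == start})]
        pos, reductions = 0, []
        while True:
            state, a = path[-1], word[pos]
            reduces = [(l, r) for l, r, d, x in state if d == len(r) and x == a]
            can_shift = any(d < len(r) and r[d] == a for l, r, d, x in state)
            if len(reduces) + can_shift != 1:
                raise ValueError("error or unresolved conflict at position %d" % pos)
            if can_shift:
                path.append(goto(state, a))
                pos += 1
            else:
                lhs, rhs = reduces[0]
                reductions.append(lhs + " -> " + rhs)
                del path[-len(rhs):]
                if lhs == start and len(path) == 1 and a == END:
                    return reductions
                path.append(goto(path[-1], lhs))

    return parse

parse = make_parser([("S", "A"), ("S", "Bc"), ("A", "aA"), ("A", "a"),
                     ("B", "a"), ("B", "ab")], "S")
print(parse("abc"))
```

For abc the two recorded reductions correspond to the two R rows of the table above.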