This comprehensive guide explores the intricacies of compiler design, focusing on parsing and the generation of machine code. It examines lexical analysis, the grammar of programming languages, and the various phases of parsing, including derivation trees and the significance of syntactic structures. Key concepts such as useless productions, nullable variables, and unit productions are discussed, along with techniques for optimizing grammars. By understanding the principles outlined here, readers will gain insights into efficient compiler construction and algorithmic parsing methods.
Compiler: translates a program into machine code

Program:
  v = 5;
  if (v>5)
    x = 12 + v;
  while (x !=3) {
    x = x - 3;
    v = 10;
  }
  ......

Machine Code:
  Add v,v,0
  cmp v,5
  jmplt ELSE
  THEN: add x, 12,v
  ELSE:
  WHILE: cmp x,3
  ...
Compiler: input program -> lexical analyzer -> parser -> output machine code
A parser knows the grammar of the programming language
Parser grammar:
  PROGRAM -> STMT_LIST
  STMT_LIST -> STMT; STMT_LIST | STMT;
  STMT -> EXPR | IF_STMT | WHILE_STMT | { STMT_LIST }
  EXPR -> EXPR + EXPR | EXPR - EXPR | ID
  IF_STMT -> if (EXPR) then STMT | if (EXPR) then STMT else STMT
  WHILE_STMT -> while (EXPR) do STMT
The parser finds the derivation of a particular input
  Grammar: E -> E + E | E * E | INT
  Input: 10 + 2 * 5
  Derivation: E => E + E => E + E * E => 10 + E * E => 10 + 2 * E => 10 + 2 * 5
The derivation corresponds to a derivation tree
  Derivation: E => E + E => E + E * E => 10 + E * E => 10 + 2 * E => 10 + 2 * 5
  Derivation tree:
          E
        / | \
       E  +  E
       |   / | \
      10  E  *  E
          |     |
          2     5
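Machine code is generated from the derivation tree
  Derivation tree: E with children E + E; the left E yields 10, the right E has children E * E yielding 2 and 5
  Machine code:
    mult a, 2, 5
    add b, 10, a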
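The step from derivation tree to machine code can be sketched as a post-order walk: emit code for the operands first, then one instruction for the operator. This is only a minimal sketch. The nested-tuple tree encoding, the function name `gen_code`, and the temporary names `a`, `b`, ... are assumptions made for illustration; only the `mult`/`add` instruction format comes from the slide.

```python
def gen_code(tree, code, temps):
    """Post-order walk over an expression tree.

    tree  : an int leaf, or a tuple (op, left, right) -- hypothetical encoding
    code  : list that collects the emitted instructions
    temps : iterator that yields fresh temporary names
    Returns the name (or literal) that holds the subtree's value.
    """
    if isinstance(tree, int):          # leaf: an INT literal
        return str(tree)
    op, left, right = tree
    left_val = gen_code(left, code, temps)     # operands first ...
    right_val = gen_code(right, code, temps)
    dest = next(temps)                          # ... then the operator
    mnemonic = {"+": "add", "*": "mult"}[op]
    code.append(f"{mnemonic} {dest}, {left_val}, {right_val}")
    return dest


# Tree for 10 + 2 * 5, with * grouped below + as in the slide.
tree = ("+", 10, ("*", 2, 5))
code = []
gen_code(tree, code, iter("abcdefgh"))
print("\n".join(code))
# mult a, 2, 5
# add b, 10, a
```

The multiplication is emitted first because it sits deeper in the tree, which is how the tree shape fixes the evaluation order.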
Parser: takes a grammar and an input string, and produces a derivation
Example: Parser
  Given an input, which derivation does the parser produce?
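Exhaustive Search
  Phase 1: to find a derivation of the input string, list all possible derivations of length 1
  Phase 2: extend the derivations kept from Phase 1 by one more step
  Phase 3: extend the derivations kept from Phase 2 by one more step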
Final result of exhaustive search (top-down parsing): the parser outputs a derivation of the input
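The phases above can be sketched as a breadth-first search over leftmost derivations, where phase n holds all surviving derivations of length n. This is a sketch under stated assumptions, not the slides' own procedure: it reuses the grammar E -> E + E | E * E | INT from earlier (with the terminal written `int`), and it prunes sentential forms that are longer than the input or whose terminal prefix already disagrees with it. The length pruning is only valid because this grammar has no λ-productions.

```python
from collections import deque

# Grammar from the earlier slide; each right-hand side is a tuple of symbols.
GRAMMAR = {"E": [("E", "+", "E"), ("E", "*", "E"), ("int",)]}


def exhaustive_parse(grammar, start, target):
    """Return a leftmost derivation of target as a list of sentential forms,
    or None if no derivation exists."""
    target = tuple(target)
    queue = deque([((start,), [(start,)])])
    seen = {(start,)}
    while queue:
        form, derivation = queue.popleft()
        if form == target:
            return derivation
        # Position of the leftmost variable, if any.
        i = next((j for j, s in enumerate(form) if s in grammar), None)
        if i is None:
            continue  # all terminals, but not the input: dead end
        for rhs in grammar[form[i]]:
            new = form[:i] + rhs + form[i + 1:]
            if len(new) > len(target):   # forms never shrink (no λ-productions)
                continue
            if new[:i] != target[:i]:    # terminal prefix must match the input
                continue
            if new not in seen:
                seen.add(new)
                queue.append((new, derivation + [new]))
    return None


steps = exhaustive_parse(GRAMMAR, "E", ["int", "+", "int", "*", "int"])
for form in steps:
    print(" ".join(form))
```

Even with pruning, the number of forms kept per phase can grow quickly, which is the cost the next slides quantify.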
Time complexity of exhaustive search
  Suppose there are no productions of the form A -> λ or A -> B (a single variable on the right-hand side)
  Number of phases for string w: at most 2|w|
  For a grammar with k rules:
    Time for phase 1: k possible derivations
    Time for phase 2: k^2 possible derivations
    Time for phase 2|w|: k^(2|w|) possible derivations
  Total time needed for string w: k + k^2 + ... + k^(2|w|) (phase 1 + phase 2 + ... + phase 2|w|)
  Extremely bad!!!
There exist faster algorithms for specialized grammars
  S-grammar: every production has the form A -> a x, where a is a symbol (terminal) and x is a string of variables
  Each pair (A, a) appears at most once
S-grammar example: Each string has a unique derivation
For S-grammars: in exhaustive search parsing there is only one choice in each phase
  Time for a phase: 1
  Total time for parsing string w: |w|
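Because each pair (variable, terminal) selects at most one production, an S-grammar can be parsed with one table lookup per input symbol. The sketch below illustrates that idea. The example grammar S -> aS | bSS | c is a hypothetical one chosen for illustration (the slide's own example did not survive extraction), and the table layout and the name `parse_s_grammar` are likewise assumptions.

```python
# Hypothetical S-grammar: S -> aS | bSS | c
# Keyed by the pair (variable, terminal); the value is the string of variables x.
RULES = {
    ("S", "a"): ("S",),
    ("S", "b"): ("S", "S"),
    ("S", "c"): (),
}


def parse_s_grammar(rules, start, w):
    """Parse w in |w| steps. Returns the list of productions used, or None."""
    stack = [start]            # variables still to be expanded, leftmost on top
    used = []
    for a in w:
        if not stack:
            return None        # input left over but nothing to expand
        var = stack.pop()
        if (var, a) not in rules:
            return None        # no production for this pair: reject
        rhs = rules[(var, a)]
        used.append((var, a, rhs))
        stack.extend(reversed(rhs))   # leftmost variable ends up on top
    return used if not stack else None


print(parse_s_grammar(RULES, "S", "abcc"))
```

Each input symbol triggers exactly one lookup, so the loop runs |w| times, matching the total time stated above.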
For general context-free grammars: there exists a parsing algorithm that parses a string w in time proportional to |w|^3
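A Substitution Rule
  Substitute B: replace an occurrence of B on a right-hand side by each right-hand side of B's productions. The result is an equivalent grammar.
  In general: given A -> x B z and B -> y1 | y2 | ... | yn, substituting B gives A -> x y1 z | x y2 z | ... | x yn z, which is an equivalent grammar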
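The substitution rule can be sketched as a small grammar transformation. In this sketch a grammar is a dictionary from a variable to its list of right-hand sides, each right-hand side is a string, and uppercase letters are variables. That encoding, the function name `substitute`, and the example grammar are assumptions for illustration. The sketch also assumes the substituted variable differs from the head of the production being rewritten.

```python
def substitute(grammar, head, rhs, var):
    """Replace the production head -> rhs by one production per
    right-hand side of var, substituting the first occurrence of var in rhs."""
    assert head != var, "the rule is applied with two different variables"
    i = rhs.index(var)
    expanded = [rhs[:i] + y + rhs[i + 1:] for y in grammar[var]]
    result = {v: list(ps) for v, ps in grammar.items()}   # copy, do not mutate
    result[head] = [p for p in result[head] if p != rhs]
    for p in expanded:
        if p not in result[head]:
            result[head].append(p)
    return result


# Hypothetical example grammar.
grammar = {
    "S": ["aB"],
    "A": ["aaA", "abBc"],
    "B": ["aA", "b"],
}
new_grammar = substitute(grammar, "A", "abBc", "B")
print(new_grammar["A"])
# ['aaA', 'abaAc', 'abbc']
```

The productions of B are left in place, so the function can be applied again to other occurrences of B.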
Useless Productions
  Some derivations never terminate: a production that can only appear in such derivations is a useless production
  Another grammar: a production whose variable is not reachable from S is also a useless production
In general: if S =>* x A y =>* w, where w is a string of terminals in L(G), then variable A is useful. Otherwise, variable A is useless.
A production is useful if all its variables are useful
Removing Useless Productions
  Example Grammar:
  First: find all variables that produce strings with only terminals, round by round (Round 1, Round 2, ...) until no new variable is found
  Keep only the variables that produce terminal strings
  Second: find all variables reachable from S (using a dependency graph); the others are not reachable
  Keep only the variables reachable from S: this gives the final grammar
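The two steps above can be sketched as follows. The grammar encoding (dictionary of string right-hand sides, uppercase letters as variables, the empty string for λ), the function name `remove_useless`, and the example grammar S -> aS | A | C, A -> a, B -> aa, C -> aCb are assumptions for illustration, since the slide's own example grammar did not survive extraction.

```python
def remove_useless(grammar, start):
    """Step 1: keep variables that derive a terminal string.
    Step 2: keep variables reachable from the start variable."""

    def all_ok(rhs, good):
        # Every symbol is a terminal or an already accepted variable.
        return all((not s.isupper()) or s in good for s in rhs)

    # Step 1: round-by-round fixed point.
    generating = set()
    changed = True
    while changed:
        changed = False
        for var, rhss in grammar.items():
            if var not in generating and any(all_ok(r, generating) for r in rhss):
                generating.add(var)
                changed = True
    kept = {
        var: [r for r in rhss if all_ok(r, generating)]
        for var, rhss in grammar.items()
        if var in generating
    }

    # Step 2: walk the dependency graph from the start variable.
    reachable, todo = {start}, [start]
    while todo:
        var = todo.pop()
        for rhs in kept.get(var, []):
            for s in rhs:
                if s in kept and s not in reachable:
                    reachable.add(s)
                    todo.append(s)
    return {var: rhss for var, rhss in kept.items() if var in reachable}


# Hypothetical example grammar.
grammar = {
    "S": ["aS", "A", "C"],
    "A": ["a"],
    "B": ["aa"],
    "C": ["aCb"],
}
print(remove_useless(grammar, "S"))
# {'S': ['aS', 'A'], 'A': ['a']}
```

Here C is dropped in step 1 because it never produces a terminal string, and B is dropped in step 2 because nothing reachable from S mentions it. The order matters: removing C first can make further variables unreachable.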
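Nullable Variables
  Nullable Variable: a variable A with A =>* λ
Removing Nullable Variables
  Example Grammar: identify the nullable variable, then substitute λ for it in every production where it appears (keeping the original production) and delete the λ-production
  The result is the final grammar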
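Removing nullable variables can be sketched in two parts: first find the nullable variables by a fixed point, then, for every production, add each variant obtained by dropping some of its nullable symbols. The encoding (string right-hand sides, uppercase variables, the empty string standing for λ), the function names, and the example grammar S -> aMb, M -> aMb | λ are assumptions for illustration. If the start variable is itself nullable, this sketch loses λ from the language.

```python
from itertools import product


def nullable_variables(grammar):
    """Variables A with A =>* λ (λ is written as the empty string)."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, rhss in grammar.items():
            if var not in nullable and any(
                all(s in nullable for s in rhs) for rhs in rhss
            ):
                nullable.add(var)
                changed = True
    return nullable


def remove_nullable(grammar):
    """Return a grammar without λ-productions."""
    nullable = nullable_variables(grammar)
    result = {}
    for var, rhss in grammar.items():
        new = []
        for rhs in rhss:
            # Each nullable symbol is either kept or replaced by λ.
            options = [(s, "") if s in nullable else (s,) for s in rhs]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate and candidate not in new:   # skip λ and repeats
                    new.append(candidate)
        result[var] = new
    return result


# Hypothetical example grammar; "" stands for λ.
grammar = {"S": ["aMb"], "M": ["aMb", ""]}
print(nullable_variables(grammar))   # {'M'}
print(remove_nullable(grammar))
# {'S': ['aMb', 'ab'], 'M': ['aMb', 'ab']}
```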
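Unit-Productions
  Unit Production: A -> B (a single variable on both sides)
Removing Unit Productions
  Observation: A -> A is removed immediately
  Substitute the remaining unit productions, then remove repeated productions to obtain the final grammar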
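One way to remove unit productions is sketched below. Note that it uses the unit-closure formulation rather than the step-by-step substitution of the slides: for each variable A it collects every variable reachable through unit productions alone, then gives A all non-unit productions of those variables. The resulting grammar can look different from the one obtained by substitution while generating the same language. The encoding, the function name, and the example grammar S -> aA, A -> a | B, B -> A | bb are assumptions for illustration.

```python
def remove_unit_productions(grammar):
    """Return a grammar without productions of the form A -> B."""

    def is_unit(rhs):
        return len(rhs) == 1 and rhs.isupper()

    result = {}
    for head in grammar:
        # Variables reachable from head using unit productions only.
        closure, todo = {head}, [head]
        while todo:
            var = todo.pop()
            for rhs in grammar.get(var, []):
                if is_unit(rhs) and rhs not in closure:
                    closure.add(rhs)
                    todo.append(rhs)
        # head inherits every non-unit production of its closure.
        new = []
        for var in grammar:                 # grammar order keeps output stable
            if var in closure:
                for rhs in grammar[var]:
                    if not is_unit(rhs) and rhs not in new:   # no repeats
                        new.append(rhs)
        result[head] = new
    return result


# Hypothetical example grammar.
grammar = {"S": ["aA"], "A": ["a", "B"], "B": ["A", "bb"]}
print(remove_unit_productions(grammar))
# {'S': ['aA'], 'A': ['a', 'bb'], 'B': ['a', 'bb']}
```

A production A -> A never survives because unit right-hand sides are skipped, and repeated productions are filtered as they are collected. After this step B is no longer reachable from S, which is why removing useless variables is worth running afterwards.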
Removing All
• Step 1: Remove Nullable Variables
• Step 2: Remove Unit-Productions
• Step 3: Remove Useless Variables