Lecture # 4

Lecture # 4 Chapter 1 (Left over Topics) Chapter 3 (continue)

Left over Topics of Chapter 1 • What is the Analysis/Synthesis Model of Compilation? • Symbol Table Management • Error Detection and Reporting • What is meant by grouping of compilation phases into Front End and Back End? • What is meant by Single/Multiple Passes? • What Compiler Construction Tools are available?

The Analysis-Synthesis Model of Compilation • There are two parts to compilation: • Analysis determines the operations implied by the source program which are recorded in a tree structure • Synthesis takes the tree structure and translates the operations therein into the target program

Another way.. • Analysis: breaks the source program into constituent pieces and creates intermediate representation • Synthesis: constructs target program from the intermediate representation • The first three phases namely: Lexical Analysis, Syntax Analysis and Semantic Analysis form the analysis part • The last three phases form the Synthesis part

Symbol Table Management • An essential function of a compiler is to record the identifiers used in the source program and to collect information about various attributes of each identifier • A symbol table is a data structure containing an entry for each identifier with fields for the attributes of the identifier
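As a rough illustration of the data structure described above, here is a minimal sketch of a symbol table. The class names (SymbolEntry, SymbolTable) and the particular attribute fields are assumptions made for this example only; a real compiler would typically also track scopes, storage, and more.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SymbolEntry:
    """One symbol-table entry; the attribute fields are illustrative only."""
    name: str
    kind: str                            # e.g. "variable", "function", "array"
    type_name: Optional[str] = None
    line_declared: Optional[int] = None


class SymbolTable:
    """Maps each identifier to its entry (flat, single scope; hypothetical)."""

    def __init__(self):
        self._entries = {}

    def insert(self, entry):
        """Record a new identifier; return False if it is already present."""
        if entry.name in self._entries:
            return False
        self._entries[entry.name] = entry
        return True

    def lookup(self, name):
        """Return the entry for name, or None if it was never recorded."""
        return self._entries.get(name)


table = SymbolTable()
table.insert(SymbolEntry("count", "variable", "int", 3))
table.insert(SymbolEntry("total", "function", "float", 7))
print(table.lookup("count"))
```

A dictionary keyed by identifier name is used here because lookups happen in every phase; other organizations (for example, one table per scope) are equally plausible.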

Error Detection and Reporting • Each phase of the compiler can encounter errors. • After detecting an error, the compiler must deal with it so that compilation can proceed. • The lexical analyzer detects errors where characters do not form a token • Errors where the token stream violates the syntax are detected by syntax analysis • If the program tries to add two identifiers, one of which is the name of a function and the other an array, then Semantic Analysis will report an error

Section 1.5: The Grouping of Phases • Compiler phases are grouped into front and back ends: • Front end: analysis (machine independent) • Back end: synthesis (machine dependent) • The front end focuses on understanding the source program and the back end focuses on mapping programs to the target machine.

Compiler Passes • A collection of phases is done only once (single pass) or multiple times (multi pass) • Single pass: usually requires everything to be defined before being used in the source program • Multi pass: the compiler may have to keep the entire program representation in memory

Section 1.6: Compiler-Construction Tools • Software development tools are available to implement one or more compiler phases • Scanner generators (Lex and Flex) • Parser generators (Yacc and Bison) • Syntax-directed translation engines • Automatic code generators • Data-flow engines • For further details, see http://dinosaur.compilertools.net/ (COP5621 Fall 2009)

ANTLR 3.x Project for Compiler Construction • This is a project built using Eclipse; the source code and all the class files are available in Java. This helps students create a compiler project on the fly. • Its C# libraries are also available. • I will try to take a lab session and discuss it • Its tutorials and videos are available at the following address: http://www.vimeo.com/groups/29150/videos

Recap of the last lecture • Difference: Skeletal Source Program → Preprocessor → Source Program → Compiler → Target Assembly Program → Assembler → Relocatable Object Code → Linker (with Libraries and Relocatable Object Files) → Absolute Machine Code

Recap We discussed: • What are regular expressions? How do we write them? • RE → NFA (Thompson’s construction) • NFA → DFA (Subset construction)

The Subset Construction Algorithm
Initially, ε-closure(s0) is the only state in Dstates and it is unmarked
while there is an unmarked state T in Dstates do
    mark T
    for each input symbol a ∈ Σ do
        U := ε-closure(move(T, a))
        if U is not in Dstates then
            add U as an unmarked state to Dstates
        end if
        Dtran[T, a] := U
    end do
end do
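The algorithm above can be sketched in Python as follows. This is only an illustration: the NFA encoding (a dict of dicts, with the empty string standing for ε) and the function names are assumptions of this sketch, and the sample NFA is the usual Thompson-style NFA for (a|b)*abb with states 0 to 10.

```python
from collections import deque

EPS = ""  # label used in this sketch for ε-transitions (an assumption)


def eps_closure(nfa, states):
    """All NFA states reachable from `states` using only ε-transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in nfa.get(s, {}).get(EPS, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def move(nfa, states, symbol):
    """All NFA states reachable from `states` by one transition on `symbol`."""
    return {t for s in states for t in nfa.get(s, {}).get(symbol, ())}


def subset_construction(nfa, start, alphabet):
    """Return (dstates, dtran); each DFA state is a frozenset of NFA states."""
    s0 = eps_closure(nfa, {start})
    dstates = [s0]                 # discovery order; index 0 is the start state
    dtran = {}
    unmarked = deque([s0])
    while unmarked:
        T = unmarked.popleft()     # taking T out of the queue "marks" it
        for a in alphabet:
            U = eps_closure(nfa, move(nfa, T, a))
            if U not in dstates:
                dstates.append(U)
                unmarked.append(U)
            dtran[(T, a)] = U
    return dstates, dtran


# Thompson-style NFA for (a|b)*abb; state 10 is the accepting state.
nfa = {
    0: {EPS: [1, 7]}, 1: {EPS: [2, 4]}, 2: {"a": [3]}, 3: {EPS: [6]},
    4: {"b": [5]}, 5: {EPS: [6]}, 6: {EPS: [1, 7]},
    7: {"a": [8]}, 8: {"b": [9]}, 9: {"b": [10]},
}
dstates, dtran = subset_construction(nfa, 0, "ab")
accepting = [T for T in dstates if 10 in T]
print(len(dstates), "DFA states")   # prints "5 DFA states"
```

A DFA state is accepting when it contains an accepting NFA state, which is why `accepting` is computed by a membership test on each subset.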

Subset Construction Example • [Figure: NFA with states 0 to 8, transitions on a and b, and accepting actions a1, a2, a3, together with the resulting DFA] • Dstates: A = {0,1,3,7}, B = {2,4,7}, C = {8}, D = {7}, E = {5,8}, F = {6,8}

Today’s Lecture • How can we minimize a DFA? (Hopcroft’s Algorithm) • What are the important states of an NFA? • How to convert from a Regular Expression to DFA directly?

Section 3.9: Minimization of DFA • What do we want to achieve?

Hopcroft’s Algorithm Pg 142 • Input: A DFA M with set of states S, set of input symbols Σ, a transition function, start state s0, and set of accepting states F • Output: A DFA M’ accepting the same language as M and having as few states as possible

Algorithm 3.6 • Method: • Step 1: Construct an initial partition P of the states with two groups: the accepting states (F) and the non-accepting states (S-F) • Step 2: Apply the following procedure (Construction of Pnew) to construct a new partition (Pnew)

Procedure for Pnew construction • For each group G of P do partition G into subgroups such that two states s and t are in the same subgroup if and only if for all input symbols a, states s and t have transitions on a to states in the same group of P • Replace G in Pnew by the set of all subgroups formed

Algorithm 3.6 (continued) • Step 3: If Pnew = P, proceed to step 4. Otherwise repeat step 2 with P = Pnew • Step 4: Choose one state in each group of the final partition as the representative of that group; the representatives are the states of M’ • Step 5: If M’ has a dead state or an unreachable state, remove those states (A dead state is a non-accepting state that has transitions to itself on all inputs. An unreachable state is any state not reachable from the start state) • Step 6: Complete
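Steps 1 to 4 of Algorithm 3.6 can be sketched in Python as shown below. This is an illustrative sketch, not the textbook's code: the function name `minimize`, the dict-based transition table, and the use of frozensets as names for the merged states are all assumptions. Step 5 (removing dead and unreachable states) is left out. The sample DFA is assumed to be the usual five-state DFA for (a|b)*abb with states A to E.

```python
def minimize(states, alphabet, delta, start, accepting):
    """Partition-refinement minimization of a complete DFA.

    delta maps (state, symbol) -> state and must be total.
    Returns (groups, new_delta, new_start, new_accepting); each state of
    the minimized DFA is a frozenset of original states.
    """
    accepting = set(accepting)
    # Step 1: initial partition {F, S - F}; an empty group is dropped.
    partition = [g for g in (frozenset(accepting),
                             frozenset(set(states) - accepting)) if g]
    while True:
        group_of = {s: i for i, g in enumerate(partition) for s in g}
        new_partition = []
        for g in partition:
            # Step 2: s and t stay together iff, for every input symbol,
            # they move to states in the same group of the old partition.
            buckets = {}
            for s in g:
                key = tuple(group_of[delta[(s, a)]] for a in alphabet)
                buckets.setdefault(key, set()).add(s)
            new_partition.extend(frozenset(b) for b in buckets.values())
        # Step 3: refinement only splits groups, so equal sizes mean Pnew = P.
        if len(new_partition) == len(partition):
            break
        partition = new_partition
    # Step 4: any member of a group serves as its representative.
    group_of = {s: g for g in partition for s in g}
    new_delta = {(g, a): group_of[delta[(next(iter(g)), a)]]
                 for g in partition for a in alphabet}
    new_accepting = {g for g in partition if g & accepting}
    return partition, new_delta, group_of[start], new_accepting


# Assumed transition table of the five-state DFA for (a|b)*abb.
delta = {
    ("A", "a"): "B", ("A", "b"): "C",
    ("B", "a"): "B", ("B", "b"): "D",
    ("C", "a"): "B", ("C", "b"): "C",
    ("D", "a"): "B", ("D", "b"): "E",
    ("E", "a"): "B", ("E", "b"): "C",
}
groups, new_delta, new_start, new_accepting = minimize(
    "ABCDE", "ab", delta, "A", {"E"})
print(sorted(sorted(g) for g in groups))
# prints [['A', 'C'], ['B'], ['D'], ['E']]
```

On this input the refinement merges A and C, which behave identically on every input string, and leaves B, D, and E as separate states.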

Example # 1 • The DFA for (a|b)*abb

Example # 1 (Applying Minimization)

Example # 2 • The DFA for a(b|c)*

Example #2: Applying Minimization

Example # 3 • Minimize the following DFA:

Example 3: Minimized

Example # 4 • Minimize the following DFA: [Figure: DFA with start state A and states A, B, C, D, E over the input symbols a and b]

Section 3.9: From Regular Expression to DFA Directly • The “important states” of an NFA are those with a non-ε outgoing transition, that is, if move({s}, a) ≠ ∅ for some a, then s is an important state • The subset construction algorithm uses only the important states when it determines ε-closure(move(T, a))

From Regular Expression to DFA Directly (Algorithm) • Augment the regular expression r with a special end symbol # to make accepting states important: the new expression is r# • Construct a syntax tree for r# • Traverse the tree to construct functions nullable, firstpos, lastpos, and followpos

From Regular Expression to DFA Directly: Syntax Tree of (a|b)*abb# • [Figure: syntax tree built from concatenation nodes, a closure (*) node, and an alternation (|) node] • Position numbers (for leaves): a = 1, b = 2, a = 3, b = 4, b = 5, # = 6

From Regular Expression to DFA Directly: Annotating the Tree • nullable(n): true if the subtree at node n generates a language that includes the empty string • firstpos(n): the set of positions that can match the first symbol of a string generated by the subtree at node n • lastpos(n): the set of positions that can match the last symbol of a string generated by the subtree at node n • followpos(i): the set of positions that can follow position i in the tree

From Regular Expression to DFA Directly: Annotating the Tree

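From Regular Expression to DFA Directly: Syntax Tree of (a|b)*abb# (annotated with firstpos and lastpos; only the * node is nullable) • Root concatenation with #: firstpos {1, 2, 3}, lastpos {6} • Leaf # (position 6): firstpos {6}, lastpos {6} • Concatenation with b (position 5): firstpos {1, 2, 3}, lastpos {5} • Leaf b (position 5): firstpos {5}, lastpos {5} • Concatenation with b (position 4): firstpos {1, 2, 3}, lastpos {4} • Leaf b (position 4): firstpos {4}, lastpos {4} • Concatenation with a (position 3): firstpos {1, 2, 3}, lastpos {3} • Leaf a (position 3): firstpos {3}, lastpos {3} • Closure *: firstpos {1, 2}, lastpos {1, 2} • Alternation |: firstpos {1, 2}, lastpos {1, 2} • Leaf a (position 1): firstpos {1}, lastpos {1} • Leaf b (position 2): firstpos {2}, lastpos {2}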

From Regular Expression to DFA Directly: followpos
for each node n in the tree do
    if n is a cat-node with left child c1 and right child c2 then
        for each i in lastpos(c1) do
            followpos(i) := followpos(i) ∪ firstpos(c2)
        end do
    else if n is a star-node then
        for each i in lastpos(n) do
            followpos(i) := followpos(i) ∪ firstpos(n)
        end do
    end if
end do
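The tree annotation and the followpos rules can be combined in one post-order traversal, sketched below. The tuple encoding of the syntax tree and the function name `annotate` are assumptions of this sketch; the nullable, firstpos, and lastpos rules used are the standard ones for leaf, alternation, concatenation, and star nodes.

```python
def annotate(node, followpos):
    """Return (nullable, firstpos, lastpos) of node; fill followpos in place.

    Nodes are tuples: ("leaf", symbol, position), ("or", left, right),
    ("cat", left, right), or ("star", child).
    """
    kind = node[0]
    if kind == "leaf":
        pos = node[2]
        followpos.setdefault(pos, set())
        return False, {pos}, {pos}
    if kind == "or":
        n1, f1, l1 = annotate(node[1], followpos)
        n2, f2, l2 = annotate(node[2], followpos)
        return n1 or n2, f1 | f2, l1 | l2
    if kind == "cat":
        n1, f1, l1 = annotate(node[1], followpos)
        n2, f2, l2 = annotate(node[2], followpos)
        for i in l1:                       # cat-node rule for followpos
            followpos[i] |= f2
        first = f1 | f2 if n1 else f1
        last = l1 | l2 if n2 else l2
        return n1 and n2, first, last
    if kind == "star":
        _, f, l = annotate(node[1], followpos)
        for i in l:                        # star-node rule for followpos
            followpos[i] |= f
        return True, f, l
    raise ValueError("unknown node kind: %r" % (kind,))


def leaf(symbol, pos):
    return ("leaf", symbol, pos)


# Syntax tree of (a|b)*abb# with leaf positions 1 to 6.
tree = ("star", ("or", leaf("a", 1), leaf("b", 2)))
for symbol, pos in [("a", 3), ("b", 4), ("b", 5), ("#", 6)]:
    tree = ("cat", tree, leaf(symbol, pos))

followpos = {}
nullable, firstpos, lastpos = annotate(tree, followpos)
for pos in sorted(followpos):
    print(pos, sorted(followpos[pos]))
```

For this tree the root has firstpos {1, 2, 3} and lastpos {6}, and followpos is {1, 2, 3} for positions 1 and 2, {4} for 3, {5} for 4, {6} for 5, and empty for 6.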

From Regular Expression to DFA Directly: Algorithm
s0 := firstpos(root) where root is the root of the syntax tree
Dstates := {s0} and is unmarked
while there is an unmarked state T in Dstates do
    mark T
    for each input symbol a ∈ Σ do
        let U be the set of positions that are in followpos(p) for some position p in T, such that the symbol at position p is a
        if U is not empty and not in Dstates then
            add U as an unmarked state to Dstates
        end if
        Dtran[T, a] := U
    end do
end do
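A minimal Python sketch of this construction follows. The function name `build_dfa` and the two lookup tables are assumptions of the sketch; the followpos values are written out by hand for (a|b)*abb# with positions 1 to 6 rather than computed.

```python
def build_dfa(first_root, followpos, symbol_at, alphabet):
    """Build DFA states (sets of positions) and transitions from followpos."""
    s0 = frozenset(first_root)
    dstates = [s0]                 # discovery order; index 0 is the start state
    dtran = {}
    unmarked = [s0]
    while unmarked:
        T = unmarked.pop()         # removing T from the work list "marks" it
        for a in alphabet:
            # Union of followpos(p) over positions p in T labeled with a.
            U = frozenset(q for p in T if symbol_at[p] == a
                          for q in followpos[p])
            if U and U not in dstates:
                dstates.append(U)
                unmarked.append(U)
            dtran[(T, a)] = U
    return dstates, dtran


# Hand-written tables for (a|b)*abb#.
symbol_at = {1: "a", 2: "b", 3: "a", 4: "b", 5: "b", 6: "#"}
followpos = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: {5}, 5: {6}, 6: set()}

dstates, dtran = build_dfa({1, 2, 3}, followpos, symbol_at, "ab")
# A state is accepting if it contains the position of the end marker #.
accepting = [T for T in dstates if 6 in T]
for T in dstates:
    print(sorted(T), {a: sorted(dtran[(T, a)]) for a in "ab"})
```

The run yields four states, {1,2,3}, {1,2,3,4}, {1,2,3,5}, and {1,2,3,6}, with the last one accepting because it contains position 6.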

From Regular Expression to DFA Directly: Example • [Figure: followpos graph over positions 1 to 6, and the resulting DFA over inputs a and b with start state {1,2,3} and states {1,2,3,4}, {1,2,3,5}, {1,2,3,6}]

Time-Space Tradeoffs
