Compiler Construction
This document provides key announcements and reminders for Compiler Construction students, including the extension of PA4 until the end of the exam period and the upcoming PA5 bonus exercise to be released post-semester. Key topics covered include parsing techniques, lexical and syntax analysis, and semantic checks. Students are encouraged to review past exams and focus on issues concerning grammar types, such as LL(1) and LR(0). Important aspects of project development, including representation of scopes and type-checking in semantic analysis, are also highlighted.
Compiler Construction Recap
Announcements • PA4: extension until end of exam period • Not a single day more, for any reason! • PA5: bonus exercise • Will be posted immediately after semester ends
Exam • 18/02/2013 at 9:00 • Past exams on my website • Open book • Possible questions • Extend IC • Parsing • Register allocation • …
Scanning
Issues in lexical analysis: • Language changes: • New keywords • New operators • New meta-language features (e.g., annotations)
Example program:
    // An example program
    class Hello {
        boolean state;
        static void main(string[] args) {
            Hello h = new Hello();
            boolean s = h.rise();
            Library.printb(s);
            h.setState(false);
        }
        boolean rise() {
            boolean oldState = state;
            state = true;
            return oldState;
        }
        void setState(boolean newState) {
            state = newState;
        }
    }
Token stream: CLASS,CLASS_ID(Hello),LB,BOOLEAN,ID(state),SEMI …
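To make the token stream above concrete, here is a minimal hand-written scanner sketch in Java. It is not the course's JFlex scanner: the class name TinyScanner, the regular expression, the rule that capitalized identifiers become CLASS_ID, and the token name RB are all assumptions made for illustration, and only the handful of tokens that appear on the slide are handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: covers only the tokens shown on the slide.
class TinyScanner {
    // Whitespace and line comments are skipped; group 1 is a word, group 2 is punctuation.
    private static final Pattern TOKEN =
        Pattern.compile("\\s+|//[^\\n]*|([A-Za-z_][A-Za-z0-9_]*)|([{};])");

    static List<String> scan(String src) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(src);
        int pos = 0;
        while (pos < src.length()) {
            m.region(pos, src.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("lexical error at offset " + pos);
            }
            if (m.group(1) != null) {
                tokens.add(classifyWord(m.group(1)));
            } else if (m.group(2) != null) {
                tokens.add(classifyPunct(m.group(2)));
            }
            pos = m.end();
        }
        return tokens;
    }

    // Keywords first; otherwise capitalized names are assumed to be class identifiers.
    private static String classifyWord(String w) {
        switch (w) {
            case "class":   return "CLASS";
            case "boolean": return "BOOLEAN";
            default:
                return Character.isUpperCase(w.charAt(0))
                    ? "CLASS_ID(" + w + ")" : "ID(" + w + ")";
        }
    }

    private static String classifyPunct(String p) {
        switch (p) {
            case "{": return "LB";
            case "}": return "RB";   // RB is an assumed name, not on the slide
            default:  return "SEMI";
        }
    }

    public static void main(String[] args) {
        System.out.println(scan("// An example program\nclass Hello { boolean state; }"));
    }
}
```

Adding a new keyword in this sketch means adding one case to classifyWord, which is the kind of language change the slide lists under lexical analysis.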
Parsing and AST
Token stream: CLASS,CLASS_ID(Hello),LB,BOOLEAN,ID(state),SEMI …
Parser uses the token stream and generates a derivation tree
Issues in syntax analysis: • Grammars: LL(1), LR(0) • Ambiguity
[Figure: derivation tree with nodes prog, class_list, class, field_method_list, field, method, type, ID(state), BOOLEAN, …]
Parsing and AST • Should know the difference between a derivation tree and an AST • Know how to build an AST from input
Token stream: CLASS,CLASS_ID(Hello),LB,BOOLEAN,ID(state),SEMI …
Parser uses the token stream and generates a derivation tree; the syntax tree is built during parsing
[Figure: derivation tree (prog, class_list, class, field_method_list, field, method, type, ID(state), BOOLEAN, …) next to the corresponding AST: ProgAST with classList; ClassAST with fieldList and methodList; FieldAST[0] (type: BoolType, name: state); MethodAST[0], MethodAST[1], MethodAST[2]]
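Question: Parsing • Is the following grammar LR(0)?
    S -> B $
    B -> id P | id ( E ]
    P -> epsilon | ( E )
    E -> B | B,E
Answer: no. After reading id, the parser state contains the complete item P -> epsilon together with items that shift '(' (from B -> id ( E ] and P -> ( E )), which is a shift/reduce conflict. An epsilon production does not by itself rule out LR(0); here it is the source of the conflict.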
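The LR(0) question above can be checked mechanically. The following Java sketch builds the LR(0) item set reached after shifting id and tests it for a shift/reduce conflict. The encoding of items as (production, dot) pairs and all names (Lr0ConflictDemo, closure, goTo) are choices made for this illustration; it is not a full parser generator.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: LR(0) closure/goto for the grammar in the question.
class Lr0ConflictDemo {
    // Each production is {lhs, rhs...}; an empty right-hand side encodes epsilon.
    static final String[][] G = {
        {"S", "B", "$"},
        {"B", "id", "P"},
        {"B", "id", "(", "E", "]"},
        {"P"},
        {"P", "(", "E", ")"},
        {"E", "B"},
        {"E", "B", ",", "E"},
    };
    static final Set<String> NONTERMINALS = Set.of("S", "B", "P", "E");

    // An item is List.of(productionIndex, dotPosition).
    static String nextSymbol(List<Integer> item) {
        String[] prod = G[item.get(0)];
        int pos = item.get(1) + 1;               // +1 skips the left-hand side
        return pos < prod.length ? prod[pos] : null;
    }

    static Set<List<Integer>> closure(Set<List<Integer>> kernel) {
        Set<List<Integer>> items = new LinkedHashSet<>(kernel);
        Deque<List<Integer>> work = new ArrayDeque<>(kernel);
        while (!work.isEmpty()) {
            String next = nextSymbol(work.pop());
            if (next == null || !NONTERMINALS.contains(next)) continue;
            for (int p = 0; p < G.length; p++) {
                if (G[p][0].equals(next)) {
                    List<Integer> added = List.of(p, 0);
                    if (items.add(added)) work.push(added);
                }
            }
        }
        return items;
    }

    static Set<List<Integer>> goTo(Set<List<Integer>> state, String symbol) {
        Set<List<Integer>> kernel = new LinkedHashSet<>();
        for (List<Integer> item : state) {
            if (symbol.equals(nextSymbol(item))) {
                kernel.add(List.of(item.get(0), item.get(1) + 1));
            }
        }
        return closure(kernel);
    }

    // Conflict: some item is complete (reduce) while another expects a terminal (shift).
    static boolean hasShiftReduceConflict(Set<List<Integer>> state) {
        boolean reduce = false, shift = false;
        for (List<Integer> item : state) {
            String next = nextSymbol(item);
            if (next == null) reduce = true;
            else if (!NONTERMINALS.contains(next)) shift = true;
        }
        return reduce && shift;
    }

    public static void main(String[] args) {
        Set<List<Integer>> start = closure(Set.of(List.of(0, 0)));
        System.out.println("conflict after id: " + hasShiftReduceConflict(goTo(start, "id")));
    }
}
```

The state reached on id contains P -> . (complete, because the right-hand side is empty) and P -> . ( E ), so the check reports a conflict, while the start state reports none.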
Other possible questions • Is the following grammar in LR(k)? • Build a parser for given grammar • Run an input string using your parser • …
Semantic analysis • Representing scopes • Type checking • Semantic checks
[Figure: AST (ProgAST, classList, ClassAST, fieldList, methodList, FieldAST[0] type: BoolType, MethodAST[0], MethodAST[1], MethodAST[2]) annotated with the scopes (Program), (Hello), and (setState)]
Semantic conditions • What is checked at compile time, and what is checked at runtime?
Question: IC language • Support Java override annotation inside comments • // @Override • Annotation is written above a method to indicate that it overrides a method in a superclass • Describe the phases in the compiler affected by the change and the changes themselves
Legal program:
    class A {
        void rise() {…}
    }
    class B extends A {
        // @Override
        void rise() {…}
    }
Illegal program (ris does not override any method of A):
    class A {
        void rise() {…}
    }
    class B extends A {
        // @Override
        void ris() {…}
    }
Answer • The change affects the lexical analysis, syntax analysis and semantic analysis • Does not affect later phases • User-level semantic condition
Changes to scanner • Add pattern for @Override inside comment state patterns • Add Java action code to comments: instead of not returning any token, we now return a token for the annotation
    boolean override=false;
    %%
    <INITIAL> "//"        { override=false; yybegin(comment); }
    <comment> "@Override" { override=true; }
    <comment> \n          { yybegin(INITIAL); if (override) return new Token(…,override,…); }
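The comment-state idea can also be sketched without a scanner generator. The Java code below is a hand-written approximation of the rules on the slide, not the JFlex specification itself: the class name OverrideCommentScanner and the token name OVERRIDE as a plain string are assumptions, and everything outside line comments is simply skipped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: two-state scanner that reports annotated line comments.
class OverrideCommentScanner {
    static List<String> scanComments(String src) {
        List<String> tokens = new ArrayList<>();
        boolean inComment = false;   // plays the role of the <comment> state
        boolean override = false;    // the flag from the slide
        int i = 0;
        while (i < src.length()) {
            if (!inComment) {
                if (src.startsWith("//", i)) {
                    inComment = true;
                    override = false;
                    i += 2;
                } else {
                    i++;             // ordinary code: left to the regular rules
                }
            } else if (src.startsWith("@Override", i)) {
                override = true;
                i += "@Override".length();
            } else if (src.charAt(i) == '\n') {
                if (override) tokens.add("OVERRIDE");
                inComment = false;   // back to the initial state
                i++;
            } else {
                i++;                 // any other comment character is ignored
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scanComments("class B extends A {\n // @Override\n void rise() {}\n}\n"));
    }
}
```

As on the slide, the token is produced when the comment's terminating newline is seen, so a plain comment yields nothing and an annotated one yields a single token.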
Changes to parser and AST
    method -> static type name params '{' mbody '}'
            | type name params '{' mbody '}'
            | OVERRIDE type name params '{' mbody '}'
Add a Boolean flag to the method AST node to indicate that the method is annotated
Changes to semantic analysis • Suppose we have an override annotation above a method m in class A • We check the following semantic conditions: • Class A extends a superclass (otherwise it does not make sense to override a method) • Traverse the superclasses of A by going up the class hierarchy until we find the first method m, and check that it has the same signature as A.m • If we fail to find such a method, we report an error
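The semantic check described above could look roughly like the following Java sketch. The ClassInfo structure, the representation of a signature as a string, and the error messages are all invented for illustration; a real IC compiler would work on its own symbol tables and type objects.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical class descriptor: method name -> signature (as a string).
class ClassInfo {
    final String name;
    final ClassInfo superClass;   // null when the class extends nothing
    final Map<String, String> methods = new HashMap<>();

    ClassInfo(String name, ClassInfo superClass) {
        this.name = name;
        this.superClass = superClass;
    }

    ClassInfo method(String methodName, String signature) {
        methods.put(methodName, signature);
        return this;
    }
}

class OverrideCheck {
    // Returns an error message, or empty if the annotated method is a legal override.
    static Optional<String> check(ClassInfo cls, String method) {
        if (cls.superClass == null) {
            return Optional.of(cls.name + " has no superclass");
        }
        String signature = cls.methods.get(method);
        // Walk up the hierarchy and stop at the first class that declares the method.
        for (ClassInfo c = cls.superClass; c != null; c = c.superClass) {
            String inherited = c.methods.get(method);
            if (inherited != null) {
                return inherited.equals(signature)
                    ? Optional.empty()
                    : Optional.of("signature of " + method + " differs from " + c.name + "." + method);
            }
        }
        return Optional.of("no method " + method + " in any superclass of " + cls.name);
    }

    public static void main(String[] args) {
        ClassInfo a = new ClassInfo("A", null).method("rise", "void()");
        ClassInfo b = new ClassInfo("B", a).method("ris", "void()");
        System.out.println(check(b, "ris").orElse("ok"));
    }
}
```

Run on the two programs from the question, the check accepts B.rise and rejects B.ris because no superclass declares ris.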
Question: IC language Add constructors to IC (must be called)
Answer • Treat the constructor as a function, and call it when an object is allocated • Lexical analysis: nothing • Parsing: AST node for constructor • Semantic analysis: • Check that every class has a constructor • Actual/formal compatibility • IR/code generation: • Call the constructor on allocation
Translation to IR • Accept annotated AST and translate functions into lists of instructions • Compute offsets for fields and virtual methods • Issues: dispatch tables, weighted register allocation
Question: IR • Give the method tables for Rectangle and Square
    class Shape {
        boolean isShape() {return true;}
        boolean isRectangle() {return false;}
        boolean isSquare() {return false;}
        double surfaceArea() {…}
    }
    class Rectangle extends Shape {
        double surfaceArea() {…}
        boolean isRectangle() {return true;}
    }
    class Square extends Rectangle {
        boolean isSquare() {return true;}
    }
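Answer (assuming offsets follow the declaration order in Shape and overriding methods reuse the inherited slot)
Method table for Rectangle:
    0: Shape.isShape
    1: Rectangle.isRectangle
    2: Shape.isSquare
    3: Rectangle.surfaceArea
Method table for Square:
    0: Shape.isShape
    1: Rectangle.isRectangle
    2: Square.isSquare
    3: Rectangle.surfaceArea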
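The usual layout rule for method tables (copy the parent's table, reuse the slot of an overridden method, append new methods at the end) can be sketched in Java as follows. VTableBuilder and the "Class.method" string entries are illustrative assumptions; the slot order assumes methods are numbered in declaration order, which the slides do not state explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of dispatch-table construction for single inheritance.
class VTableBuilder {
    // Entries have the form "Class.method"; the list index is the method's offset.
    static List<String> build(List<String> parentTable, String cls, List<String> declared) {
        List<String> table = new ArrayList<>(parentTable);   // inherit all parent slots
        for (String method : declared) {
            int slot = -1;
            for (int i = 0; i < table.size(); i++) {
                if (methodName(table.get(i)).equals(method)) {
                    slot = i;
                    break;
                }
            }
            String entry = cls + "." + method;
            if (slot >= 0) {
                table.set(slot, entry);   // override: same offset, new implementation
            } else {
                table.add(entry);         // new method: next free offset
            }
        }
        return table;
    }

    static String methodName(String entry) {
        return entry.substring(entry.indexOf('.') + 1);
    }

    public static void main(String[] args) {
        List<String> shape = build(List.of(), "Shape",
            List.of("isShape", "isRectangle", "isSquare", "surfaceArea"));
        List<String> rectangle = build(shape, "Rectangle", List.of("surfaceArea", "isRectangle"));
        List<String> square = build(rectangle, "Square", List.of("isSquare"));
        System.out.println(rectangle);
        System.out.println(square);
    }
}
```

Because an override keeps the inherited offset, a call through a Shape reference can use the same offset regardless of the object's dynamic class.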
Question: IR • Suppose we wish to provide type information at runtime • Similar to instanceof in Java • x instanceof A returns true iff x is exactly of type A (in Java it can also be subtype of A) • Describe the changes in runtime organization needed to support this operator and the translation to IR
Answer • Use the pointer to the dispatch table as the type indicator • Translate x instanceof A as
    Move x,R0
    MoveField R0.0,R0
    Compare R0,_DV_A
• If we want to support the Java operator • Represent the type hierarchy at runtime and generate code to search up the hierarchy • Keep ancestor info for each type to enable constant-time checking
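The three variants in the answer can be compared in a small Java model. TypeDesc stands in for a runtime type descriptor (for example, a dispatch table with a parent link); the ancestors array is one possible way to keep "ancestor info" for constant-time checks, chosen here as an assumption since the slides do not specify the encoding.

```java
// Hypothetical runtime type descriptor, one per class.
class TypeDesc {
    final String name;
    final TypeDesc parent;        // null for a root class
    final TypeDesc[] ancestors;   // ancestors[d] is the ancestor at depth d; the last entry is this type

    TypeDesc(String name, TypeDesc parent) {
        this.name = name;
        this.parent = parent;
        int depth = (parent == null) ? 0 : parent.ancestors.length;
        this.ancestors = new TypeDesc[depth + 1];
        if (parent != null) {
            System.arraycopy(parent.ancestors, 0, this.ancestors, 0, depth);
        }
        this.ancestors[depth] = this;
    }

    int depth() {
        return ancestors.length - 1;
    }
}

class InstanceOfDemo {
    // Exact-type test: a single pointer comparison, like Compare R0,_DV_A.
    static boolean exact(TypeDesc objType, TypeDesc a) {
        return objType == a;
    }

    // Java-style test by searching up the hierarchy: time grows with the depth.
    static boolean byWalk(TypeDesc objType, TypeDesc a) {
        for (TypeDesc t = objType; t != null; t = t.parent) {
            if (t == a) return true;
        }
        return false;
    }

    // Java-style test in constant time: one bounds check and one comparison.
    static boolean byAncestorTable(TypeDesc objType, TypeDesc a) {
        int d = a.depth();
        return d < objType.ancestors.length && objType.ancestors[d] == a;
    }

    public static void main(String[] args) {
        TypeDesc shape = new TypeDesc("Shape", null);
        TypeDesc rectangle = new TypeDesc("Rectangle", shape);
        TypeDesc square = new TypeDesc("Square", rectangle);
        System.out.println(exact(square, rectangle) + " " + byAncestorTable(square, rectangle));
    }
}
```

The ancestor table costs one extra array per class, proportional to the class's depth, in exchange for a check whose cost does not depend on the hierarchy height.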
Register allocation • Sethi-Ullman • Can only handle expressions without side effects • Global register allocation • IR registers are treated as local variables • When we have an actual spill we use the stack
Weighted register allocation • Can save registers by reordering subtree computations • Label each node with its weight • Weight = number of registers needed • Leaf weight known • Internal node weight • w(left) > w(right) then w = w(left) • w(right) > w(left) then w = w(right) • w(right) = w(left) then w = w(left) + 1 • Choose heavier child as first to be translated • Have to check that there are no side effects
Weighted reg. alloc. example: R0 := TR[a+b[5*c]]
Phase 1: - check absence of side-effects in expression tree - assign weight to each AST node
    + (W=2)
        a (W=1)
        array access (W=2)
            base: b (W=1)
            index: * (W=1)
                5 (W=0)
                c (W=1)
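The weight-labeling phase for this example can be sketched in Java. ExprNode and WeightLabeler are names invented here, and the leaf weights (0 for a constant, 1 for a variable) are an assumption taken from the weights shown in the example rather than a general rule stated in the slides. The side-effect check is omitted.

```java
// Hypothetical binary expression-tree node.
class ExprNode {
    final String label;
    final ExprNode left, right;   // both null for a leaf
    int weight;

    ExprNode(String label) {
        this(label, null, null);
    }

    ExprNode(String label, ExprNode left, ExprNode right) {
        this.label = label;
        this.left = left;
        this.right = right;
    }
}

class WeightLabeler {
    // Labels the whole subtree bottom-up and returns the weight of its root.
    static int label(ExprNode n) {
        if (n.left == null && n.right == null) {
            // Assumed leaf weights: constants need no register, variables need one.
            boolean constant = !n.label.isEmpty() && n.label.chars().allMatch(Character::isDigit);
            n.weight = constant ? 0 : 1;
        } else {
            int l = label(n.left);
            int r = label(n.right);
            // Equal weights need one extra register; otherwise the heavier side dominates.
            n.weight = (l == r) ? l + 1 : Math.max(l, r);
        }
        return n.weight;
    }

    // The child to translate first; call only after label() has run.
    static ExprNode heavierChild(ExprNode n) {
        return n.right.weight > n.left.weight ? n.right : n.left;
    }

    public static void main(String[] args) {
        // a + b[5*c]; "[]" stands for the array-access node (base, index).
        ExprNode index = new ExprNode("*", new ExprNode("5"), new ExprNode("c"));
        ExprNode access = new ExprNode("[]", new ExprNode("b"), index);
        ExprNode root = new ExprNode("+", new ExprNode("a"), access);
        System.out.println("root weight = " + label(root));
        System.out.println("translate first: " + heavierChild(root).label);
    }
}
```

For the example tree the array access is the heavier child of +, so it is translated before a.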
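Reminder: R0 := TR[a+(b+(c*d))]
Left child first: translation using all optimizations shown until now uses 3 registers
Right child first: managed to save two registers
[Figure: the expression tree annotated with target registers. Left child first: outer + and a in R0, inner + and b in R1, * and c in R2. Right child first: both + nodes and * in R0.]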
Sethi Ullman • What type of tree is worst case for SU with respect to the tree’s height?
Sethi Ullman • What type of tree is best case for SU with respect to the tree’s height?
Sethi Ullman • What type of tree maximizes the ratio between registers allocated by traversing the tree left-to-right and right-to-left?