Lecture 7 Semantic Analysis

Lecture 7 Semantic Analysis Xiaoyin Wang CS 5363 Programming Languages and Compilers

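Where We Are • Source code (character stream): if (b == 0) a = b; • Lexical Analysis produces the token stream: if ( b == 0 ) a = b ; • Syntax Analysis (Parsing) produces the abstract syntax tree (AST): an if node whose children are == (over b and 0) and = (over a and b) • Semantic Analysis works on this AST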

AST Decoration • Before performing code generation, we should do some preparation at the AST level • Static code analysis, e.g., type inference, detection of undefined variables, etc. • Scope analysis, e.g., global, class, function, and smaller compilation scopes • A symbol table to support the analyses

AST Data Structure
    abstract class Expr { }
    class Add extends Expr { ... Expr e1, e2; }
    class Num extends Expr { ... int value; }
    class Id extends Expr { ... String name; }

Could add AST analysis to each class, but…
    abstract class Expr { …; /* state variables for visitA */ }
    class Add extends Expr { ... Expr e1, e2; void visitA() { …; e1.visitA(); …; e2.visitA(); … } }
    class Num extends Expr { ... int value; void visitA() { … } }
    class Id extends Expr { ... String name; void visitA() { … } }

Undesirable Approach to AST Analysis
    abstract class Expr { …; /* state variables for visitA */ …; /* state variables for visitB */ }
    class Add extends Expr { ... Expr e1, e2;
        void visitA() { …; e1.visitA(); …; e2.visitA(); … }
        void visitB() { …; e2.visitB(); …; e1.visitB(); … } }
    class Num extends Expr { ... int value; void visitA() { … } void visitB() { … } }
    class Id extends Expr { ... String name; void visitA() { … } void visitB() { … } }

Undesirable Approach to AST Computation • The problem with this approach is that it builds every semantic action into the AST classes themselves. • Type checking • Code generation • Optimization • Each class would have to implement each “action” as a separate method.

Visitor Methodology for AST Traversal • Visitor pattern: separate the data structure definition (e.g., AST) from the algorithms that traverse the structure (e.g., name resolution code, type checking code, etc.). • Define a Visitor interface for all AST traversal types, i.e., code generation, type checking, etc. • Extend each AST class with a method that accepts any Visitor (by calling it back) • Code each traversal as a separate class that implements the Visitor interface

Visitor Interface
    interface Visitor { void visit(Add e); void visit(Num e); void visit(Id e); }
    class TypeCheckVisitor implements Visitor { public void visit(Add e) {…} public void visit(Num e) {…} public void visit(Id e) {…} }
    class CodeGenVisitor implements Visitor { public void visit(Add e) {…} public void visit(Num e) {…} public void visit(Id e) {…} }

Accept methods • Inside each accept method, the declared (static) type of this is the subclass in which the method occurs, so overload resolution of v.visit(this) selects the matching visit method of Visitor v.
    abstract class Expr { … abstract public void accept(Visitor v); }
    class Add extends Expr { … public void accept(Visitor v) { v.visit(this); } }
    class Num extends Expr { … public void accept(Visitor v) { v.visit(this); } }
    class Id extends Expr { … public void accept(Visitor v) { v.visit(this); } }

Visitor Methods • For each kind of traversal, implement the Visitor interface, e.g.,
    class PostfixOutputVisitor implements Visitor {
        public void visit(Add e) { e.e1.accept(this); e.e2.accept(this); System.out.print("+"); }
        public void visit(Num e) { System.out.print(e.value); }
        public void visit(Id e) { System.out.print(e.name); }
    }
• To traverse expression e:
    PostfixOutputVisitor v = new PostfixOutputVisitor();
    e.accept(v);
• Dynamic dispatch on e'.accept(…) invokes the accept method of the appropriate AST subclass, which eliminates case analysis on AST subclasses
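The pieces from the last few slides can be put together into one runnable sketch. The wrapper class VisitorDemo, the constructors, the toPostfix helper, and the use of a StringBuilder instead of System.out are assumptions added here so the example is self-contained and checkable; they are not part of the slides.

```java
// Minimal sketch of the visitor pattern from the slides.
// VisitorDemo, the constructors, and toPostfix are illustrative assumptions.
class VisitorDemo {
    interface Visitor {
        void visit(Add e);
        void visit(Num e);
        void visit(Id e);
    }

    abstract static class Expr {
        abstract void accept(Visitor v);
    }

    static class Add extends Expr {
        final Expr e1, e2;
        Add(Expr e1, Expr e2) { this.e1 = e1; this.e2 = e2; }
        // 'this' has static type Add here, so visit(Add) is selected.
        @Override void accept(Visitor v) { v.visit(this); }
    }

    static class Num extends Expr {
        final int value;
        Num(int value) { this.value = value; }
        @Override void accept(Visitor v) { v.visit(this); }
    }

    static class Id extends Expr {
        final String name;
        Id(String name) { this.name = name; }
        @Override void accept(Visitor v) { v.visit(this); }
    }

    // Emits the expression in postfix order: operands first, then the operator.
    static class PostfixOutputVisitor implements Visitor {
        final StringBuilder out = new StringBuilder();
        @Override public void visit(Add e) {
            e.e1.accept(this);
            e.e2.accept(this);
            out.append("+ ");
        }
        @Override public void visit(Num e) { out.append(e.value).append(' '); }
        @Override public void visit(Id e) { out.append(e.name).append(' '); }
    }

    // Hypothetical helper: runs the traversal and returns the collected text.
    static String toPostfix(Expr e) {
        PostfixOutputVisitor v = new PostfixOutputVisitor();
        e.accept(v);
        return v.out.toString().trim();
    }

    public static void main(String[] args) {
        Expr e = new Add(new Id("x"), new Add(new Num(1), new Num(2)));
        System.out.println(toPostfix(e)); // prints: x 1 2 + +
    }
}
```

Adding another traversal means writing another Visitor implementation; the Expr classes stay unchanged.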

Visitor Interface (2)
    interface Visitor {
        Object visit(Add e, Object inh);
        Object visit(Num e, Object inh);
        Object visit(Id e, Object inh);
    }
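One way to read this second interface is that the return value carries a result computed bottom-up and inh carries information passed top-down (an inherited attribute). The sketch below follows that reading, which is an assumption, and evaluates an expression with a variable environment passed as inh. EvalDemo, EvalVisitor, and eval are hypothetical names.

```java
import java.util.Map;

// Sketch: a visitor that returns a value and receives an inherited argument.
// Treating 'inh' as a variable environment is an assumption for illustration.
class EvalDemo {
    interface Visitor {
        Object visit(Add e, Object inh);
        Object visit(Num e, Object inh);
        Object visit(Id e, Object inh);
    }

    abstract static class Expr {
        abstract Object accept(Visitor v, Object inh);
    }

    static class Add extends Expr {
        final Expr e1, e2;
        Add(Expr e1, Expr e2) { this.e1 = e1; this.e2 = e2; }
        @Override Object accept(Visitor v, Object inh) { return v.visit(this, inh); }
    }

    static class Num extends Expr {
        final int value;
        Num(int value) { this.value = value; }
        @Override Object accept(Visitor v, Object inh) { return v.visit(this, inh); }
    }

    static class Id extends Expr {
        final String name;
        Id(String name) { this.name = name; }
        @Override Object accept(Visitor v, Object inh) { return v.visit(this, inh); }
    }

    static class EvalVisitor implements Visitor {
        // The same environment is handed down unchanged to both operands.
        @Override public Object visit(Add e, Object inh) {
            int left = (Integer) e.e1.accept(this, inh);
            int right = (Integer) e.e2.accept(this, inh);
            return left + right;
        }
        @Override public Object visit(Num e, Object inh) { return e.value; }
        @Override public Object visit(Id e, Object inh) {
            @SuppressWarnings("unchecked")
            Map<String, Integer> env = (Map<String, Integer>) inh;
            Integer v = env.get(e.name);
            if (v == null) throw new IllegalStateException("undefined variable: " + e.name);
            return v;
        }
    }

    // Hypothetical helper wrapping the untyped visitor call.
    static int eval(Expr e, Map<String, Integer> env) {
        return (Integer) e.accept(new EvalVisitor(), env);
    }

    public static void main(String[] args) {
        Expr e = new Add(new Id("x"), new Num(2));
        System.out.println(eval(e, Map.of("x", 40))); // prints: 42
    }
}
```

The Object-typed signature keeps a single interface for all traversals at the cost of casts; a generic Visitor<R, A> would be a type-safe alternative.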

Semantic Analysis/Checking • Semantic analysis: the final part of the analysis half of compilation • afterwards comes the synthesis half of compilation • Purposes: • perform the final checks on the legality of the input program that lexical and syntactic checking “missed” • name resolution, type checking, break statement inside a loop, ... • “understand” the program well enough to do synthesis • Typical goal: relate assignments to, and references of, a particular variable

Types • What is a type? • The notion varies from language to language • Consensus • A set of values • A set of operations on those values • Classes are one instantiation of the modern notion of type

Why Do We Need Type Systems? Consider the assembly language fragment add $r1, $r2, $r3 What are the types of $r1, $r2, $r3?

Types and Operations • Most operations are legal only for values of some types • It doesn’t make sense to add a function pointer and an integer in C • It does make sense to add two integers • But both have the same assembly language implementation!

Type Systems • A language’s type system specifies which operations are valid for which types • The goal of type checking is to ensure that operations are used with the correct types • Enforces intended interpretation of values, because nothing else will! • Type systems provide a concise formalization of the semantic checking rules

What Can Types Do For Us? • Can detect certain kinds of errors • Arithmetic errors • Memory errors: • Reading from an invalid pointer, etc. • Calling methods on the wrong kind of object

Type Checking Overview • Three kinds of languages: • Statically typed: All or almost all checking of types is done as part of compilation (C, Java, Cool) • Dynamically typed: Almost all checking of types is done as part of program execution (Scheme, Python) • Untyped: No type checking (machine code)

Type Inference • Type Checking is the process of checking that the program obeys the type system • Often involves inferring types for parts of the program • Some people call the process type inference when inference is necessary

Why Rules of Inference? • Inference rules have the form If Hypothesis is true, then Conclusion is true • Type checking computes via reasoning If E1 and E2 have certain types, then E3 has a certain type • Rules of inference are a compact notation for “If-Then” statements

From English to an Inference Rule • The notation is easy to read (with practice) • Start with a simplified system and gradually add features • Building blocks • Symbol ⇒ is “if-then” • x:T is “x has type T”

Notation for Inference Rules • By tradition, inference rules are written with the hypotheses above a horizontal line and the conclusion below it • Type rules have hypotheses and conclusions of the form: |- e : T • |- means “we can prove that . . .”

Two Rules (3 is an integer) [Int] [Add]
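The rule figures for this slide are not preserved in the text. Assuming they are the standard integer-literal and integer-addition rules (an assumption, consistent with the later slide that types 1 + 2), they can be written in LaTeX as follows; \text requires the amsmath package.

```latex
\[
\frac{i \text{ is an integer}}{\vdash i : \mathrm{Int}}\;[\mathrm{Int}]
\qquad\qquad
\frac{\vdash e_1 : \mathrm{Int} \qquad \vdash e_2 : \mathrm{Int}}
     {\vdash e_1 + e_2 : \mathrm{Int}}\;[\mathrm{Add}]
\]
```

Hypotheses sit above the line and the conclusion below it; [Int] has only a side condition, so it can close a branch of a proof.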

Two Rules (Cont.) • These rules give templates describing how to type integers and + expressions • By filling in the templates, we can produce complete typings for expressions • We can fill the template with ANY expression!

Example: 1 + 2
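The derivation for this example is not preserved in the text. A sketch of it, assuming the usual integer-only [Int] and [Add] rules, applies [Int] to each literal and then [Add] to combine the results (\dfrac and \text require amsmath):

```latex
\[
\frac{\dfrac{1 \text{ is an integer}}{\vdash 1 : \mathrm{Int}}\,[\mathrm{Int}]
      \qquad
      \dfrac{2 \text{ is an integer}}{\vdash 2 : \mathrm{Int}}\,[\mathrm{Int}]}
     {\vdash 1 + 2 : \mathrm{Int}}\,[\mathrm{Add}]
\]
```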

Soundness • A type system is sound if • Whenever |- e : T • Then e evaluates to a value of type T • We only want sound rules • But some sound rules are better than others; here’s one that’s not very useful: (i is an integer)

Type Checking Proofs • Type checking proves facts e : T • One type rule is used for each kind of expression • In the type rule used for a node e: • The hypotheses are the proofs of types of e’s sub-expressions • The conclusion is the proof of type of e

Rules for Constants [Bool] [String] (s is a string constant)

Object Creation Example (T denotes a class with parameterless constructor) [New]

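Typing: Example • Typing for 0.1 + 2 * 3, shown as an annotated tree: the root + has type Float • its left child 0.1 has type Float • its right child * has type Int, with children 2 : Int and 3 : Int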

Typing Derivations • The typing reasoning can be expressed as a tree: • The root of the tree is the whole expression • Each node is an instance of a typing rule • Leaves are the rules with no hypotheses

A Problem • What is the type of a variable reference? [Var] (x is an identifier) • This rule does not have enough information to give a type. • We need a hypothesis of the form “we are in the scope of a declaration of x with type T”

A Solution: Put more information in the rules! • A type environment gives types for free variables • A type environment is a mapping from Identifiers to Types • A variable is free in an expression if: • The expression contains an occurrence of the variable that refers to a declaration outside the expression

Type Environments Let O be a function from Identifiers to Types The sentence O |- e : T is read: Under the assumption that variables in the current scope have the types given by O, it is provable that the expression e has the type T

Modified Rules The type environment is added to the earlier rules: [Int] (i is an integer) [Add]

New Rules And we can write new rules: [Var] (if O(x) = T)
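The rules with a type environment map fairly directly onto a recursive checker: one method per expression kind, with the environment O passed down as a map. The sketch below is one possible rendering, not the course's implementation. TypeCheckDemo, TypeVisitor, typeOf, the string representation of types, the Int-only [Add] rule, and the use of IllegalStateException for type errors are all assumptions.

```java
import java.util.Map;

// Sketch of a type checker for the Add/Num/Id AST under a type environment O.
// All names below and the string encoding of types are illustrative assumptions.
class TypeCheckDemo {
    abstract static class Expr {
        abstract String accept(TypeVisitor v, Map<String, String> env);
    }

    static class Add extends Expr {
        final Expr e1, e2;
        Add(Expr e1, Expr e2) { this.e1 = e1; this.e2 = e2; }
        @Override String accept(TypeVisitor v, Map<String, String> env) { return v.visit(this, env); }
    }

    static class Num extends Expr {
        final int value;
        Num(int value) { this.value = value; }
        @Override String accept(TypeVisitor v, Map<String, String> env) { return v.visit(this, env); }
    }

    static class Id extends Expr {
        final String name;
        Id(String name) { this.name = name; }
        @Override String accept(TypeVisitor v, Map<String, String> env) { return v.visit(this, env); }
    }

    static class TypeVisitor {
        // [Add]: O |- e1 : Int and O |- e2 : Int give O |- e1 + e2 : Int
        String visit(Add e, Map<String, String> env) {
            String t1 = e.e1.accept(this, env);
            String t2 = e.e2.accept(this, env);
            if (!t1.equals("Int") || !t2.equals("Int")) {
                throw new IllegalStateException("+ expects Int operands, got " + t1 + " and " + t2);
            }
            return "Int";
        }

        // [Int]: an integer literal has type Int, with no hypotheses
        String visit(Num e, Map<String, String> env) { return "Int"; }

        // [Var]: O |- x : T if O(x) = T
        String visit(Id e, Map<String, String> env) {
            String t = env.get(e.name);
            if (t == null) throw new IllegalStateException("undeclared variable: " + e.name);
            return t;
        }
    }

    // Hypothetical entry point: returns T such that O |- e : T, or throws.
    static String typeOf(Expr e, Map<String, String> env) {
        return e.accept(new TypeVisitor(), env);
    }

    public static void main(String[] args) {
        Expr e = new Add(new Id("x"), new Num(1));
        System.out.println(typeOf(e, Map.of("x", "Int"))); // prints: Int
    }
}
```

Each recursive call corresponds to a hypothesis of the rule being applied, so a successful run traces out the typing derivation tree.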

Subtyping • Define a relation X <= Y on classes to say that: • An object of type X could be used when one of type Y is acceptable, or equivalently • X conforms with Y • This means that X is a subclass of Y

Dynamic And Static Types • The dynamic type of an object is the class C that is used in the “new C” expression that creates the object • A run-time notion • Even languages that are not statically typed have the notion of dynamic type • The static type of an expression is a notion that captures all possible dynamic types the expression could take • A compile-time notion

Dynamic and Static Types (Cont.) • In early type systems the set of static types corresponds directly with the dynamic types • Soundness theorem: for all expressions E, dynamic_type(E) = static_type(E) (in all executions, E evaluates to values of the type inferred by the compiler) • This gets more complicated in advanced type systems

Dynamic and Static Types • A variable of static type A can hold values of static type B, if B <= A
    class A extends Object: …
    class B extends A: …
    def Main():
        x: A        # x has static type A
        x = A()     # here, x's value has dynamic type A
        …
        x = B()     # here, x's value has dynamic type B
        …

Dynamic and Static Types • Soundness theorem: for all E, dynamic_type(E) <= static_type(E) • Why is this OK? • For E, the compiler uses static_type(E) (call it C) • All operations that can be used on an object of type C can also be used on an object of type C' <= C • Such as fetching the value of an attribute • Or invoking a method on the object • Subclasses can only add attributes or methods • Methods can be redefined, but only with the same type!
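The same distinction can be observed directly in Java, where a variable's static type is fixed by its declaration while the dynamic type of the value it holds can be any subclass. The class names A and B follow the slides; StaticDynamicDemo, who, extra, and dynamicTypeAfterAssignments are hypothetical names for illustration.

```java
// Sketch: static type vs. dynamic type, with dynamic_type(x) <= static_type(x).
class StaticDynamicDemo {
    static class A {
        String who() { return "A"; }
    }

    static class B extends A {
        // Redefined with the same type as in A.
        @Override String who() { return "B"; }
        // Subclasses may add methods; these are invisible through static type A.
        String extra() { return "only in B"; }
    }

    static String dynamicTypeAfterAssignments() {
        A x = new A();   // static type A, dynamic type A
        x = new B();     // static type is still A, dynamic type is now B
        // x.extra();    // rejected at compile time: extra() is not declared in A
        return x.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        A x = new B();
        System.out.println(dynamicTypeAfterAssignments()); // prints: B
        System.out.println(x.who());                       // prints: B (dynamic dispatch)
    }
}
```

Every operation the compiler allows on x is one declared for A, and B supports all of those, which is why allowing the dynamic type to be a subtype of the static type is safe.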

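Assignment • More uses of subtyping. The slide shows two rules, one for languages with assignment expressions and one for assignment statements; the rule recoverable here is the one for assignment statements: • Hypotheses: O |- id : T0, O |- e1 : T1, T1 <= T0 • Conclusion: O |- id = e1; : void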

Conditional Expression • Consider: e0 ? e1 : e2 in C • The result can be either e1 or e2 • The dynamic type is either e1’s or e2’s type • The best we can do statically is the smallest common supertype of the types of e1 and e2

If-Then-Else example • Consider the class hierarchy with P at the top and A and B below it • … and the expression C ? new A : new B • Its type should allow for the dynamic type to be either A or B • Smallest supertype is P

Least Upper Bounds • lub(X,Y), the least upper bound of X and Y, is Z if • X <= Z ∧ Y <= Z (Z is an upper bound) • X <= Z' ∧ Y <= Z' ⇒ Z <= Z' (Z is least among upper bounds) • Typically, the least upper bound of two types is their least common ancestor in the inheritance tree
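For a single-inheritance hierarchy, the least common ancestor can be found by collecting the ancestors of one class and then walking up from the other until a shared one appears. The following is a minimal sketch under that assumption. LubDemo, the PARENT table, and the extra class C are hypothetical; P, A, and B follow the example hierarchy from the slides.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch: lub as the least common ancestor in a single-inheritance tree.
class LubDemo {
    // Hypothetical hierarchy, child -> parent. "Object" is the root (no entry).
    static final Map<String, String> PARENT = Map.of(
            "P", "Object",
            "A", "P",
            "B", "P",
            "C", "A");

    // X <= Y holds when Y is reachable from X by following parent links.
    static boolean isSubtype(String x, String y) {
        for (String t = x; t != null; t = PARENT.get(t)) {
            if (t.equals(y)) return true;
        }
        return false;
    }

    static String lub(String x, String y) {
        Set<String> ancestorsOfX = new HashSet<>();
        for (String t = x; t != null; t = PARENT.get(t)) {
            ancestorsOfX.add(t);
        }
        // The first ancestor of y that is also an ancestor of x is the lowest one.
        for (String t = y; t != null; t = PARENT.get(t)) {
            if (ancestorsOfX.contains(t)) return t;
        }
        throw new IllegalArgumentException("no common ancestor for " + x + " and " + y);
    }

    public static void main(String[] args) {
        System.out.println(lub("A", "B")); // prints: P
        System.out.println(lub("C", "A")); // prints: A
    }
}
```

With interfaces or multiple inheritance a unique least upper bound may not exist, so this approach relies on the hierarchy being a tree.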

If-Then-Else Revisited [If-Then-Else]
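The rule figure for this slide is not preserved in the text. A plausible form, assuming the condition must have type Bool and the result type is the least upper bound of the two branch types, is sketched below in LaTeX for the C-style conditional used earlier. The exact surface syntax and the name Bool are assumptions; \text requires amsmath.

```latex
\[
\frac{O \vdash e_0 : \mathrm{Bool} \qquad O \vdash e_1 : T_1 \qquad O \vdash e_2 : T_2}
     {O \vdash (e_0 \mathbin{?} e_1 : e_2) \;:\; \mathrm{lub}(T_1, T_2)}\;[\text{If-Then-Else}]
\]
```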

Symbol Tables Key data structure during semantic analysis, code generation Stores info about the names used in program • a map (table) from names to info about them • each symbol table entry is a binding • a declaration adds a binding to the map • a use of a name looks up binding in the map • report a type error if none found

The Symbol Table [Figure: the Lexical Analyzer, Syntax Analyzer, Semantic Analyzer, and Code Generator, together with the Symbol Table] • When identifiers are found, they will be entered into a symbol table, which will hold all relevant information about identifiers. • This information will be used later by the semantic analyzer and the code generator.

Symbol Table Entries • We will store the following information about identifiers. • The name (as a string). • The data type. • The block level. • Its scope (global, local, or parameter). • Its offset from the base pointer (for local variables and parameters only).
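A symbol table holding these fields can be sketched as a stack of per-block maps: a declaration adds a binding to the innermost map, and a lookup searches from the innermost block outward. This is one common design, not necessarily the one used in the course. SymbolTableDemo, Entry, Kind, and all method names are hypothetical, and the offset is left null for globals since the slide limits it to locals and parameters.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Sketch of a block-structured symbol table; all names are illustrative assumptions.
class SymbolTableDemo {
    enum Kind { GLOBAL, LOCAL, PARAMETER }

    static final class Entry {
        final String name;       // the name, as a string
        final String type;       // the data type
        final int blockLevel;    // nesting depth of the declaring block (0 = outermost)
        final Kind kind;         // global, local, or parameter
        final Integer offset;    // offset from the base pointer; null for globals

        Entry(String name, String type, int blockLevel, Kind kind, Integer offset) {
            this.name = name;
            this.type = type;
            this.blockLevel = blockLevel;
            this.kind = kind;
            this.offset = offset;
        }
    }

    static final class SymbolTable {
        // Head of the deque is the innermost open scope.
        private final Deque<Map<String, Entry>> scopes = new ArrayDeque<>();

        SymbolTable() { enterScope(); }

        void enterScope() { scopes.push(new HashMap<>()); }

        void exitScope() { scopes.pop(); }

        int blockLevel() { return scopes.size() - 1; }

        // A declaration adds a binding to the innermost scope.
        Entry declare(String name, String type, Kind kind, Integer offset) {
            Entry e = new Entry(name, type, blockLevel(), kind, offset);
            if (scopes.peek().putIfAbsent(name, e) != null) {
                throw new IllegalStateException("redeclared in same block: " + name);
            }
            return e;
        }

        // A use looks up the binding, innermost scope first; null if none found.
        Entry lookup(String name) {
            for (Map<String, Entry> scope : scopes) {
                Entry e = scope.get(name);
                if (e != null) return e;
            }
            return null;
        }
    }

    public static void main(String[] args) {
        SymbolTable table = new SymbolTable();
        table.declare("x", "int", Kind.GLOBAL, null);
        table.enterScope();
        table.declare("x", "float", Kind.LOCAL, -4);
        System.out.println(table.lookup("x").type); // prints: float
        table.exitScope();
        System.out.println(table.lookup("x").type); // prints: int
    }
}
```

A null result from lookup is the point at which the semantic analyzer would report an error for an undeclared name.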
