
Lecture 7 Semantic Analysis

This lecture covers semantic analysis, the compiler phase that follows lexical analysis and syntax analysis. It discusses the Visitor methodology for AST traversal, type systems and type checking rules, and symbol tables.



Presentation Transcript


  1. Lecture 7 Semantic Analysis Xiaoyin Wang CS 5363 Programming Languages and Compilers

  2. Where We Are • Source code (character stream): if (b == 0) a = b; • Lexical Analysis produces the token stream: if ( b == 0 ) a = b ; • Syntax Analysis (Parsing) produces the abstract syntax tree (AST): an if node whose children are == (over b and 0) and = (over a and b) • Next: Semantic Analysis

  3. AST Decoration • Before performing code generation, we should do some preparation at the AST level: • Static code analysis, e.g., type inference, undefined variables, etc. • Scope analysis, e.g., global, class, function, and smaller compilation scopes • A symbol table to support the analyses

  4. AST Data Structure abstract class Expr { } class Add extends Expr { ... Expr e1, e2; } class Num extends Expr { ... int value; } class Id extends Expr { ... String name; }

  5. Could add AST Analysis to class, but… abstract class Expr { …; /* state variables for visitA */ } class Add extends Expr { ... Expr e1, e2; void visitA() { …; this.e1.visitA(); …; this.e2.visitA(); … } } class Num extends Expr { ... int value; void visitA() {…} } class Id extends Expr { ... String name; void visitA() {…} }

  6. Undesirable Approach to AST Analysis abstract class Expr { …; /* state variables for visitA */ …; /* state variables for visitB */ } class Add extends Expr { ... Expr e1, e2; void visitA() { …; this.e1.visitA(); …; this.e2.visitA(); … } void visitB() { …; this.e2.visitB(); …; this.e1.visitB(); … } } class Num extends Expr { ... int value; void visitA() {…} void visitB() {…} } class Id extends Expr { ... String name; void visitA() {…} void visitB() {…} }

  7. Undesirable Approach to AST Computation • The problem with this approach is that it mixes different semantic actions into the AST classes: • Type checking • Code generation • Optimization • Each class would have to implement each “action” as a separate method.

  8. Visitor Methodology for AST Traversal • Visitor pattern: separate the data structure definition (e.g., the AST) from the algorithms that traverse the structure (e.g., name resolution code, type checking code, etc.). • Define a Visitor interface for all AST traversal types, i.e., code generation, type checking, etc. • Extend each AST class with a method that accepts any Visitor (by calling it back) • Code each traversal as a separate class that implements the Visitor interface

  9. Visitor Interface interface Visitor { void visit(Add e); void visit(Num e); void visit(Id e); } class TypeCheckVisitor implements Visitor { public void visit(Add e) {…} public void visit(Num e) {…} public void visit(Id e) {…} } class CodeGenVisitor implements Visitor { public void visit(Add e) {…} public void visit(Num e) {…} public void visit(Id e) {…} }

  10. Accept methods • The declared type of this is the subclass in which it occurs. • Overload resolution of v.visit(this); invokes the appropriate visit function in Visitor v. abstract class Expr { … abstract public void accept(Visitor v); } class Add extends Expr { … public void accept(Visitor v) { v.visit(this); } } class Num extends Expr { … public void accept(Visitor v) { v.visit(this); } } class Id extends Expr { … public void accept(Visitor v) { v.visit(this); } }

  11. Visitor Methods • For each kind of traversal, implement the Visitor interface, e.g., class PostfixOutputVisitor implements Visitor { public void visit(Add e) { e.e1.accept(this); e.e2.accept(this); System.out.print("+"); } public void visit(Num e) { System.out.print(e.value); } public void visit(Id e) { System.out.print(e.name); } } • To traverse expression e: PostfixOutputVisitor v = new PostfixOutputVisitor(); e.accept(v); • Dynamic dispatch: e’.accept invokes the accept method of the appropriate AST subclass and eliminates case analysis on AST subclasses
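A minimal runnable sketch of the postfix traversal described on this slide. The class and field names (Expr, Add, Num, Id, e1, e2, value, name) follow the AST slides; the constructors, the StringBuilder buffer used in place of System.out.print, the spaces between tokens, and the wrapper class PostfixDemo are assumptions added so the example is self-contained and its output can be inspected.

```java
// Sketch only: constructors, the output buffer, and PostfixDemo are illustrative assumptions.
class PostfixDemo {
    interface Visitor {
        void visit(Add e);
        void visit(Num e);
        void visit(Id e);
    }

    static abstract class Expr {
        abstract void accept(Visitor v);
    }

    static class Add extends Expr {
        final Expr e1, e2;
        Add(Expr e1, Expr e2) { this.e1 = e1; this.e2 = e2; }
        @Override void accept(Visitor v) { v.visit(this); } // static type of this is Add
    }

    static class Num extends Expr {
        final int value;
        Num(int value) { this.value = value; }
        @Override void accept(Visitor v) { v.visit(this); } // static type of this is Num
    }

    static class Id extends Expr {
        final String name;
        Id(String name) { this.name = name; }
        @Override void accept(Visitor v) { v.visit(this); } // static type of this is Id
    }

    // Emits operands first, then the operator (postfix order).
    static class PostfixVisitor implements Visitor {
        final StringBuilder out = new StringBuilder();
        @Override public void visit(Add e) {
            e.e1.accept(this);
            e.e2.accept(this);
            out.append("+ ");
        }
        @Override public void visit(Num e) { out.append(e.value).append(' '); }
        @Override public void visit(Id e) { out.append(e.name).append(' '); }
    }

    static String postfix(Expr e) {
        PostfixVisitor v = new PostfixVisitor();
        e.accept(v); // dynamic dispatch selects the accept of the node's class
        return v.out.toString().trim();
    }

    public static void main(String[] args) {
        Expr e = new Add(new Add(new Num(1), new Id("x")), new Num(2));
        System.out.println(postfix(e)); // 1 x + 2 +
    }
}
```

Adding another traversal (say, an infix printer) would mean writing one more Visitor implementation, with no change to Add, Num, or Id.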

  12. Visitor Interface (2) interface Visitor { Object visit(Add e, Object inh); Object visit(Num e, Object inh); Object visit(Id e, Object inh); }
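One way to read this second interface: the return value carries a result computed bottom-up, and inh carries information passed down from the parent. The sketch below uses it for a small evaluator, with a variable environment as the inherited value. The evaluator, the Map-based environment, the accept signature, and the class name EvalDemo are assumptions for illustration and do not come from the slides.

```java
import java.util.Map;

// Sketch only: EvalDemo, EvalVisitor, and the environment encoding are illustrative assumptions.
class EvalDemo {
    interface Visitor {
        Object visit(Add e, Object inh);
        Object visit(Num e, Object inh);
        Object visit(Id e, Object inh);
    }

    static abstract class Expr {
        abstract Object accept(Visitor v, Object inh);
    }

    static class Add extends Expr {
        final Expr e1, e2;
        Add(Expr e1, Expr e2) { this.e1 = e1; this.e2 = e2; }
        @Override Object accept(Visitor v, Object inh) { return v.visit(this, inh); }
    }

    static class Num extends Expr {
        final int value;
        Num(int value) { this.value = value; }
        @Override Object accept(Visitor v, Object inh) { return v.visit(this, inh); }
    }

    static class Id extends Expr {
        final String name;
        Id(String name) { this.name = name; }
        @Override Object accept(Visitor v, Object inh) { return v.visit(this, inh); }
    }

    static class EvalVisitor implements Visitor {
        @Override public Object visit(Add e, Object inh) {
            // The same inherited environment is handed to both children.
            int left = (Integer) e.e1.accept(this, inh);
            int right = (Integer) e.e2.accept(this, inh);
            return left + right;
        }
        @Override public Object visit(Num e, Object inh) { return e.value; }
        @SuppressWarnings("unchecked")
        @Override public Object visit(Id e, Object inh) {
            Map<String, Integer> env = (Map<String, Integer>) inh;
            Integer v = env.get(e.name);
            if (v == null) throw new IllegalStateException("undefined variable: " + e.name);
            return v;
        }
    }

    static int eval(Expr e, Map<String, Integer> env) {
        return (Integer) e.accept(new EvalVisitor(), env);
    }

    public static void main(String[] args) {
        Expr e = new Add(new Id("x"), new Num(2));
        System.out.println(eval(e, Map.of("x", 40))); // 42
    }
}
```

The Object-typed parameters keep one interface usable for every traversal, at the cost of casts inside each visitor.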

  13. Semantic Analysis/Checking • Semantic analysis: the final part of the analysis half of compilation • afterwards comes the synthesis half of compilation • Purposes: • perform final checking of the legality of the input program, catching what was “missed” by lexical and syntactic checking • name resolution, type checking, break stmt in loop, ... • “understand” the program well enough to do synthesis • Typical goal: relate assignments to, and references of, a particular variable

  14. Types • What is a type? • The notion varies from language to language • Consensus • A set of values • A set of operations on those values • Classes are one instantiation of the modern notion of type

  15. Why Do We Need Type Systems? • Consider the assembly language fragment: add $r1, $r2, $r3 • What are the types of $r1, $r2, $r3?

  16. Types and Operations • Most operations are legal only for values of some types • It doesn’t make sense to add a function pointer and an integer in C • It does make sense to add two integers • But both have the same assembly language implementation!

  17. Type Systems • A language’s type system specifies which operations are valid for which types • The goal of type checking is to ensure that operations are used with the correct types • Enforces intended interpretation of values, because nothing else will! • Type systems provide a concise formalization of the semantic checking rules

  18. What Can Types do For Us? • Can detect certain kinds of errors • Arithmetic errors • Memory errors: • Reading from an invalid pointer, etc. • Calling methods from wrong object

  19. Type Checking Overview • Three kinds of languages: • Statically typed: All or almost all checking of types is done as part of compilation (C, Java, Cool) • Dynamically typed: Almost all checking of types is done as part of program execution (Scheme, Python) • Untyped: No type checking (machine code)

  20. Type Inference • Type Checking is the process of checking that the program obeys the type system • Often involves inferring types for parts of the program • Some people call the process type inference when inference is necessary

  21. Why Rules of Inference? • Inference rules have the form: If Hypothesis is true, then Conclusion is true • Type checking computes via reasoning: If E1 and E2 have certain types, then E3 has a certain type • Rules of inference are a compact notation for “If-Then” statements

  22. From English to an Inference Rule • The notation is easy to read (with practice) • Start with a simplified system and gradually add features • Building blocks • Symbol ⇒ is “if-then” • x:T is “x has type T”

  23. Notation for Inference Rules • By tradition, inference rules are written with the hypotheses above a horizontal line and the conclusion below it • Type rules have hypotheses and conclusions of the form: |- e : T • |- means “we can prove that . . .”

  24. Two Rules (3 is an integer) [Int] [Add]
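The bodies of the two rules named on this slide can be sketched as follows, assuming the usual integer-only formulation in which both operands of + must have type Int. The metavariables i, e_1, and e_2 are notation chosen here; a language that also allows Float operands would need additional rules.

```latex
\[
\frac{i \text{ is an integer}}{\vdash i : \mathit{Int}} \;[\text{Int}]
\qquad
\frac{\vdash e_1 : \mathit{Int} \qquad \vdash e_2 : \mathit{Int}}{\vdash e_1 + e_2 : \mathit{Int}} \;[\text{Add}]
\]
```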

  25. Two Rules (Cont.) • These rules give templates describing how to type integers and + expressions • By filling in the templates, we can produce complete typings for expressions • We can fill the template with ANY expression!

  26. Example: 1 + 2
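A derivation for this example, assuming an integer-constant rule (any integer literal has type Int) and an addition rule that requires both operands to have type Int. The two leaves are instances of the constant rule, and the root is an instance of the addition rule:

```latex
\[
\frac{\dfrac{1 \text{ is an integer}}{\vdash 1 : \mathit{Int}}
      \qquad
      \dfrac{2 \text{ is an integer}}{\vdash 2 : \mathit{Int}}}
     {\vdash 1 + 2 : \mathit{Int}}
\]
```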

  27. Soundness • A type system is sound if • Whenever |- e : T • Then e evaluates to a value of type T • We only want sound rules • But some sound rules are better than others; here’s one that’s not very useful: (i is an integer)

  28. Type Checking Proofs • Type checking proves facts e : T • One type rule is used for each kind of expression • In the type rule used for a node e: • The hypotheses are the proofs of types of e’s sub-expressions • The conclusion is the proof of type of e

  29. Rules for Constants [Bool] [String] (s is a string constant)

  30. Object Creation Example (T denotes a class with parameterless constructor) [New]

  31. Typing: Example • Typing for 0.1 + 2 * 3, annotated bottom-up on the expression tree: • 2 : Int and 3 : Int, so 2 * 3 : Int • 0.1 : Float • The root 0.1 + (2 * 3) : Float

  32. Typing Derivations • The typing reasoning can be expressed as a tree: • The root of the tree is the whole expression • Each node is an instance of a typing rule • Leaves are the rules with no hypotheses

  33. A Problem • What is the type of a variable reference? [Var] (x is an identifier) • This rule does not have enough information to give a type. • We need a hypothesis of the form “we are in the scope of a declaration of x with type T”

  34. A Solution: Put more information in the rules! • A type environment gives types for free variables • A type environment is a mapping from Identifiers to Types • A variable is free in an expression if: • The expression contains an occurrence of the variable that refers to a declaration outside the expression

  35. Type Environments • Let O be a function from Identifiers to Types • The sentence O |- e : T is read: “Under the assumption that variables in the current scope have the types given by O, it is provable that the expression e has the type T”

  36. Modified Rules The type environment is added to the earlier rules: [Int] (i is an integer) [Add]
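With the environment O threaded through, the integer and addition rules can be sketched as below. This assumes the integer-only form of [Add]; the environment plays no role in [Int] and is simply passed unchanged to the subexpressions in [Add].

```latex
\[
\frac{i \text{ is an integer}}{O \vdash i : \mathit{Int}} \;[\text{Int}]
\qquad
\frac{O \vdash e_1 : \mathit{Int} \qquad O \vdash e_2 : \mathit{Int}}{O \vdash e_1 + e_2 : \mathit{Int}} \;[\text{Add}]
\]
```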

  37. New Rules And we can write new rules: [Var] (if O(x) = T)
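The rules with an environment translate fairly directly into a recursive checker: one case per kind of expression, with the environment O represented as a map from identifiers to types. The sketch below is an illustration under assumptions, not the lecture's implementation: the Type enum, the TypeError exception, the integer-only treatment of +, and the use of instanceof tests (instead of a visitor, for brevity) are all choices made here.

```java
import java.util.Map;

// Sketch only: Type, TypeError, and TypeCheckDemo are illustrative assumptions.
class TypeCheckDemo {
    enum Type { INT, BOOL }

    static abstract class Expr { }
    static class Num extends Expr {
        final int value;
        Num(int value) { this.value = value; }
    }
    static class Id extends Expr {
        final String name;
        Id(String name) { this.name = name; }
    }
    static class Add extends Expr {
        final Expr e1, e2;
        Add(Expr e1, Expr e2) { this.e1 = e1; this.e2 = e2; }
    }

    static class TypeError extends RuntimeException {
        TypeError(String message) { super(message); }
    }

    // Computes T such that O |- e : T, or throws TypeError if no rule applies.
    static Type typeOf(Map<String, Type> env, Expr e) {
        if (e instanceof Num) {
            return Type.INT;                          // [Int]: no hypotheses on subexpressions
        }
        if (e instanceof Id) {                        // [Var]: O(x) = T
            String name = ((Id) e).name;
            Type t = env.get(name);
            if (t == null) throw new TypeError("undeclared identifier: " + name);
            return t;
        }
        if (e instanceof Add) {                       // [Add]: both operands must be Int
            Add a = (Add) e;
            Type t1 = typeOf(env, a.e1);
            Type t2 = typeOf(env, a.e2);
            if (t1 != Type.INT || t2 != Type.INT) {
                throw new TypeError("+ expects Int operands, got " + t1 + " and " + t2);
            }
            return Type.INT;
        }
        throw new TypeError("unknown expression kind");
    }

    public static void main(String[] args) {
        Map<String, Type> env = Map.of("x", Type.INT, "flag", Type.BOOL);
        System.out.println(typeOf(env, new Add(new Id("x"), new Num(1)))); // INT
        try {
            typeOf(env, new Add(new Id("flag"), new Num(1)));
        } catch (TypeError err) {
            System.out.println("type error: " + err.getMessage());
        }
    }
}
```

Each recursive call corresponds to proving a hypothesis of the rule, and each return corresponds to its conclusion.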

  38. Subtyping • Define a relation X <= Y on classes to say that: • An object of type X could be used when one of type Y is acceptable, or equivalently • X conforms with Y • This means that X is a subclass of Y

  39. Dynamic And Static Types • The dynamic type of an object is the class C that is used in the “new C” expression that creates the object • A run-time notion • Even languages that are not statically typed have the notion of dynamic type • The static type of an expression is a notion that captures all possible dynamic types the expression could take • A compile-time notion

  40. Dynamic and Static Types (Cont.) • In early type systems the set of static types corresponds directly with the dynamic types • Soundness theorem: for all expressions E, dynamic_type(E) = static_type(E) (in all executions, E evaluates to values of the type inferred by the compiler) • This gets more complicated in advanced type systems

  41. Dynamic and Static Types • A variable of static type A can hold values of static type B, if B <= A • Example: class A extends Object: … class B extends A: … def Main(): x: A (x has static type A) • x = A() … (here, x’s value has dynamic type A) • x = B() … (here, x’s value has dynamic type B)
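The same situation can be reproduced in Java as a rough sketch. The class names A and B follow the slide; the who method, the dynamicTypeName helper, and the wrapper class StaticDynamicDemo are assumptions added so that the dynamic type becomes observable.

```java
// Sketch only: who(), dynamicTypeName(), and StaticDynamicDemo are illustrative assumptions.
class StaticDynamicDemo {
    static class A {
        String who() { return "A"; }
    }

    static class B extends A {
        @Override String who() { return "B"; } // redefined with the same type
    }

    // The parameter has static type A; its dynamic type is only known at run time.
    static String dynamicTypeName(A x) {
        return x.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        A x = new A();                          // static type A, dynamic type A
        System.out.println(dynamicTypeName(x)); // A
        x = new B();                            // accepted by the compiler because B <= A
        System.out.println(dynamicTypeName(x)); // B
        System.out.println(x.who());            // B: the call dispatches on the dynamic type
    }
}
```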

  42. Dynamic and Static Types • Soundness theorem: ∀ E. dynamic_type(E) <= static_type(E) • Why is this OK? • For E, the compiler uses static_type(E) (call it C) • All operations that can be used on an object of type C can also be used on an object of type C’ <= C • Such as fetching the value of an attribute • Or invoking a method on the object • Subclasses can only add attributes or methods • Methods can be redefined, but with the same type!

  43. Assignment • More uses of subtyping: the slide shows one rule for languages with assignment expressions and one for assignment statements • Rule for assignment statements: • Hypotheses: O |- id : T0, O |- e1 : T1, T1 <= T0 • Conclusion: O |- id = e1; : void

  44. Conditional Expression • Consider: e0 ? e1 : e2 in C • The result can be either e1 or e2 • The dynamic type is either e1’s or e2’s type • The best we can do statically is the smallest supertype of the types of e1 and e2

  45. If-Then-Else example • Consider the class hierarchy in which P is a superclass of both A and B • … and the expression C ? new A : new B • Its type should allow for the dynamic type to be either A or B • The smallest supertype is P

  46. Least Upper Bounds • lub(X,Y), the least upper bound of X and Y, is Z if • X <= Z ∧ Y <= Z (Z is an upper bound) • X <= Z’ ∧ Y <= Z’ ⇒ Z <= Z’ (Z is least among upper bounds) • Typically, the least upper bound of two types is their least common ancestor in the inheritance tree
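For a single-inheritance tree, the least common ancestor can be found by collecting the ancestors of X and then walking up from Y until one of them is hit. The sketch below assumes classes are identified by name and stored in a child-to-parent map; the class LubDemo, its methods, and the choice of Object as the root above P are illustrative assumptions.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch only: LubDemo and the name-based hierarchy encoding are illustrative assumptions.
class LubDemo {
    // Maps each class to its direct superclass; the root has no entry.
    private final Map<String, String> parent = new HashMap<>();

    void addClass(String name, String superName) {
        parent.put(name, superName);
    }

    // Least upper bound of x and y: their least common ancestor in the tree.
    String lub(String x, String y) {
        Set<String> ancestorsOfX = new HashSet<>();
        for (String c = x; c != null; c = parent.get(c)) {
            ancestorsOfX.add(c); // includes x itself, since X <= X
        }
        for (String c = y; c != null; c = parent.get(c)) {
            if (ancestorsOfX.contains(c)) {
                return c; // first hit walking upward from y is the lowest common ancestor
            }
        }
        throw new IllegalArgumentException("no common ancestor for " + x + " and " + y);
    }

    // Hierarchy from the If-Then-Else example: P is a superclass of A and B.
    static LubDemo example() {
        LubDemo h = new LubDemo();
        h.addClass("P", "Object");
        h.addClass("A", "P");
        h.addClass("B", "P");
        return h;
    }

    public static void main(String[] args) {
        LubDemo h = example();
        System.out.println(h.lub("A", "B")); // P
        System.out.println(h.lub("A", "P")); // P
        System.out.println(h.lub("A", "A")); // A
    }
}
```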

  47. If-Then-Else Revisited [If-Then-Else]
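Using the least upper bound, the conditional rule can be sketched as follows. The if-then-else concrete syntax and the requirement that the condition have type Bool are assumptions made here; the result type is the lub of the two branch types.

```latex
\[
\frac{O \vdash e_0 : \mathit{Bool} \qquad O \vdash e_1 : T_1 \qquad O \vdash e_2 : T_2}
     {O \vdash \texttt{if } e_0 \texttt{ then } e_1 \texttt{ else } e_2 \;:\; \mathrm{lub}(T_1, T_2)}
\;[\text{If-Then-Else}]
\]
```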

  48. Symbol Tables • Key data structure during semantic analysis and code generation • Stores info about the names used in the program • a map (table) from names to info about them • each symbol table entry is a binding • a declaration adds a binding to the map • a use of a name looks up its binding in the map • report a type error if none found
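A small sketch of such a map, extended with nested scopes so that an inner declaration can shadow an outer one. The stack-of-maps design, the use of a plain String as the stored info, and all names (SymbolTableDemo, enterScope, exitScope, declare, lookup) are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Sketch only: the scope stack and method names are illustrative assumptions.
class SymbolTableDemo {
    // Innermost scope is at the head of the deque.
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    SymbolTableDemo() {
        enterScope(); // global scope
    }

    void enterScope() { scopes.push(new HashMap<>()); }

    void exitScope() { scopes.pop(); }

    // A declaration adds a binding; returns false if the name already exists in this scope.
    boolean declare(String name, String type) {
        return scopes.peek().putIfAbsent(name, type) == null;
    }

    // A use looks up the binding, innermost scope first; null means "none found".
    String lookup(String name) {
        for (Map<String, String> scope : scopes) {
            String type = scope.get(name);
            if (type != null) return type;
        }
        return null;
    }

    public static void main(String[] args) {
        SymbolTableDemo table = new SymbolTableDemo();
        table.declare("x", "Int");
        table.enterScope();
        table.declare("x", "String");          // shadows the global x
        System.out.println(table.lookup("x")); // String
        table.exitScope();
        System.out.println(table.lookup("x")); // Int
        System.out.println(table.lookup("y")); // null: the checker would report an error here
    }
}
```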

  49. The Symbol Table • (Diagram: Lexical Analyzer, Syntax Analyzer, Semantic Analyzer, and Code Generator, together with the Symbol Table.) • When identifiers are found, they are entered into a symbol table, which holds all relevant information about them. • This information is used later by the semantic analyzer and the code generator.

  50. Symbol Table Entries • We will store the following information about identifiers. • The name (as a string). • The data type. • The block level. • Its scope (global, local, or parameter). • Its offset from the base pointer (for local variables and parameters only).
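The fields listed above could be grouped into one entry class along these lines. The field names, the Scope enum, the string encoding of the data type, and the use of a nullable Integer for the offset are assumptions; the slide specifies only which pieces of information are stored.

```java
// Sketch only: names and representations are illustrative assumptions.
class SymbolEntryDemo {
    enum Scope { GLOBAL, LOCAL, PARAMETER }

    static final class SymbolEntry {
        final String name;      // the name, as a string
        final String dataType;  // the data type
        final int blockLevel;   // the block level
        final Scope scope;      // global, local, or parameter
        final Integer offset;   // offset from the base pointer; null for globals

        SymbolEntry(String name, String dataType, int blockLevel, Scope scope, Integer offset) {
            if (scope == Scope.GLOBAL && offset != null) {
                throw new IllegalArgumentException("globals have no base-pointer offset");
            }
            if (scope != Scope.GLOBAL && offset == null) {
                throw new IllegalArgumentException("locals and parameters need an offset");
            }
            this.name = name;
            this.dataType = dataType;
            this.blockLevel = blockLevel;
            this.scope = scope;
            this.offset = offset;
        }

        @Override public String toString() {
            return name + " : " + dataType + " (level " + blockLevel + ", " + scope
                    + (offset == null ? "" : ", offset " + offset) + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(new SymbolEntry("count", "int", 0, Scope.GLOBAL, null));
        System.out.println(new SymbolEntry("n", "int", 1, Scope.PARAMETER, 8));
    }
}
```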
