
Types and Programming Languages

Types and Programming Languages. Lecture 5. Simon Gay Department of Computing Science University of Glasgow. 2006/07. A Practical Interlude. We want to understand how to convert the formal specification of a type system into an implemented typechecker.


Presentation Transcript


  1. Types and Programming Languages Lecture 5 Simon Gay Department of Computing Science University of Glasgow 2006/07

  2. A Practical Interlude We want to understand how to convert the formal specification of a type system into an implemented typechecker. We will build typecheckers for the simple expression language and the simple functional language, and use them as the basis for implementations of more complex type systems later. The process is fairly straightforward, but we need to take care with some details:
     • correct handling of variables and environments
     • production of useful error messages.
     We will use Java for our implementations. You might find it interesting to look at Pierce’s implementations in OCaml. Types and Programming Languages Lecture 5 - Simon Gay

  3. Implementing Typechecking Typechecking consists of traversing the AST, checking that the typing rules are obeyed. This requires establishing the type of each expression. Later stages of compilation require type information so the type of each expression must be stored. Variables must be matched with declarations - so scoping rules are also checked. The process is often called contextual analysis - perhaps this means more than typechecking, but we’ll just refer to typechecking. The process of establishing the type of every expression is sometimes called elaboration.

  4. Example: Typechecking a Triangle Program The following simple program is written in Triangle, a Pascal-like language defined by Watt and Brown in their book Programming Language Processors in Java.
         let
           var n : Integer;
           var c : Char
         in begin
           c := '&';
           n := n + 1
         end

  5. Example: Typechecking a Triangle Program Its abstract syntax tree: [Figure: the AST, rooted at Program with a single LetCommand child. The LetCommand has a SequentialDeclaration (two VarDec nodes, declaring n : Integer and c : Char, each with an Ident and a SimpleT) and a SequentialCommand (two AssignCommand nodes). The first AssignCommand assigns the CharExpr '&' to the SimpleV c; the second assigns the BinaryExpression n + 1 (a VnameExpr, an Op and an IntExpr) to the SimpleV n.]

  6. Example: Typechecking a Triangle Program Traversal: [Figure: the same AST, annotated to show the traversal.]

  7. Example: Typechecking a Triangle Program Checking: [Figure: the same AST with types attached. The three SimpleV nodes are annotated c : char, n : int and n : int; CharExpr : char, VnameExpr : int, IntExpr : int and BinaryExpression : int. The command nodes are marked as checked.]

  8. Implementing Typechecking The details depend on the representation of ASTs, which in turn depends partly on the implementation language. For example, in a functional language we define a datatype corresponding to the abstract syntax of the language. In ML the datatype for SEL might look like this:
         datatype expr = IntLit of int
                       | BoolLit of bool
                       | Eq of expr * expr
                       | Plus of expr * expr
                       | And of expr * expr
                       | Cond of expr * expr * expr

  9. Implementing Typechecking (ML example) The typechecker is a function from expr to … what? We can define another datatype for elaborated ASTs. In general this must represent the type of every expression, but in SEL the only expression whose type is not obvious is the conditional:
         datatype ty = Int | Bool
         datatype typed_expr = IntLit of int
                             | BoolLit of bool
                             | Eq of typed_expr * typed_expr
                             | Plus of typed_expr * typed_expr
                             | And of typed_expr * typed_expr
                             | Cond of typed_expr * typed_expr * typed_expr * ty
     Compare this with Pierce’s approach.

  10. Implementing Typechecking (ML example) The typechecker is a function from expr to typed_expr * ty.
         fun check (IntLit n) = (IntLit n, Int)
           | check (BoolLit b) = (BoolLit b, Bool)
           | check (Eq(e,f)) =
               let val (e',t) = check e
                   val (f',u) = check f
               in if (t = Int) andalso (u = Int)
                  then (Eq(e',f'), Bool)
                  else error
               end
           | ...
           | check (Cond(c,e,f)) =
               let val (c',t) = check c
                   val (e',u) = check e
                   val (f',v) = check f
               in if (t = Bool) andalso (u = v)
                  then (Cond(c',e',f',u), u)
                  else error
               end
     What do we do at error? We also need to consider the different error cases.

  11. Implementing Typechecking We’re going to use Java, which means that
     • we use an object-oriented representation of ASTs
     • we don’t need to rebuild the elaborated AST because we can store type information by updating the original AST or another data structure
     • we have more choice about how to implement AST traversal.
     The representation of ASTs uses a natural OO style:
     • define an abstract class for each kind of phrase
     • define a class for each specific way of constructing a phrase.
     Watt and Brown’s book describes this in detail.

  12. Classes for the Simple Expression Language
         abstract class Expr { }
         class IntLitExpr extends Expr { int value; }
         class BoolLitExpr extends Expr { boolean value; }
         class EqExpr extends Expr { Expr left, right; }

  13. Classes for the Simple Expression Language
         class PlusExpr extends Expr { Expr left, right; }
         class AndExpr extends Expr { Expr left, right; }
         class CondExpr extends Expr { Expr cond, then_br, else_br; }

  14. Implementing Tree Traversal: instanceof One possibility is to copy the functional language approach and implement a case-analysis on the class of an Expr object.
         Type check(Expr e) {
           if (e instanceof IntLitExpr)
             return representation of type int
           else if (e instanceof BoolLitExpr)
             return representation of type bool
           else if (e instanceof EqExpr) {
             Type t = check(((EqExpr)e).left);
             Type u = check(((EqExpr)e).right);
             if (t == representation of type int && u == representation of type int)
               return representation of type bool
             ...
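
The case analysis above can be completed into a runnable sketch. Everything the slides leave open is an assumption here: the "representation of type int/bool" is taken to be an enum called Type, the AST classes are given constructors, and a hypothetical TypeError exception is thrown at the first error. The course implementation may well differ.

```java
// Assumption: types are represented by enum constants, so == comparison works.
enum Type { INT, BOOL }

// AST classes as on the slides, with constructors added for convenience.
abstract class Expr { }
class IntLitExpr extends Expr { int value; IntLitExpr(int value) { this.value = value; } }
class BoolLitExpr extends Expr { boolean value; BoolLitExpr(boolean value) { this.value = value; } }
class EqExpr extends Expr { Expr left, right; EqExpr(Expr l, Expr r) { left = l; right = r; } }
class PlusExpr extends Expr { Expr left, right; PlusExpr(Expr l, Expr r) { left = l; right = r; } }
class AndExpr extends Expr { Expr left, right; AndExpr(Expr l, Expr r) { left = l; right = r; } }
class CondExpr extends Expr {
    Expr cond, then_br, else_br;
    CondExpr(Expr c, Expr t, Expr e) { cond = c; then_br = t; else_br = e; }
}

// Hypothetical error type: checking stops at the first error.
class TypeError extends RuntimeException {
    TypeError(String message) { super(message); }
}

class InstanceofChecker {
    // Returns the type of e, or throws TypeError if e is ill-typed.
    static Type check(Expr e) {
        if (e instanceof IntLitExpr) {
            return Type.INT;
        } else if (e instanceof BoolLitExpr) {
            return Type.BOOL;
        } else if (e instanceof EqExpr) {
            expect(Type.INT, check(((EqExpr) e).left), "left operand of ==");
            expect(Type.INT, check(((EqExpr) e).right), "right operand of ==");
            return Type.BOOL;
        } else if (e instanceof PlusExpr) {
            expect(Type.INT, check(((PlusExpr) e).left), "left operand of +");
            expect(Type.INT, check(((PlusExpr) e).right), "right operand of +");
            return Type.INT;
        } else if (e instanceof AndExpr) {
            expect(Type.BOOL, check(((AndExpr) e).left), "left operand of &");
            expect(Type.BOOL, check(((AndExpr) e).right), "right operand of &");
            return Type.BOOL;
        } else if (e instanceof CondExpr) {
            CondExpr c = (CondExpr) e;
            expect(Type.BOOL, check(c.cond), "condition");
            Type t = check(c.then_br);
            expect(t, check(c.else_br), "else branch");  // both branches must agree
            return t;
        }
        throw new IllegalArgumentException("unknown expression class");
    }

    private static void expect(Type wanted, Type found, String where) {
        if (found != wanted) {
            throw new TypeError(where + ": expected " + wanted + ", found " + found);
        }
    }
}
```

Throwing on the first error keeps the sketch short, but it reports only one problem per run, which is one reason to prefer an approach that records errors and carries on.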

  15. Implementing Tree Traversal: instanceof This approach leads to a messy nested if, which can’t be converted into a switch because Java (as of this 2006/07 lecture) has no mechanism for switching on the class of an object. Also this technique is not very object-oriented: instead of explicitly using instanceof, we prefer to arrange for analysis of an object’s class to be done via the built-in mechanisms of overloading and dynamic method dispatch.

  16. Implementing Tree Traversal: Visitor Pattern A more object-oriented approach is to use the visitor design pattern. (See Watt and Brown for more details.) A visitor class implements the Visitor interface, and therefore contains a method for each kind of expression:
         interface Visitor {
           void visitIntLitExpr(IntLitExpr e);
           void visitBoolLitExpr(BoolLitExpr e);
           void visitEqExpr(EqExpr e);
           void visitPlusExpr(PlusExpr e);
           void visitAndExpr(AndExpr e);
           void visitCondExpr(CondExpr e);
         }

  17. Implementing Tree Traversal: Visitor Pattern The abstract class Expr contains a visit method:
         abstract class Expr {
           abstract void visit(Visitor v);
         }
     and each class defines visit so that the appropriate method from the Visitor object is called:
         class EqExpr extends Expr {
           Expr left, right;
           void visit(Visitor v) {
             v.visitEqExpr(this);
           }
         }

  18. Implementing Tree Traversal: Visitor Pattern The typechecker is defined as a class which implements the Visitor interface. (The methods must be declared public, because interface methods are implicitly public.)
         class Checker implements Visitor {
           public void visitIntLitExpr(IntLitExpr e) {
             store the type Int in association with e
           }
           ...
           public void visitCondExpr(CondExpr e) {
             e.cond.visit(this);
             e.then_br.visit(this);
             e.else_br.visit(this);
             inspect the types of cond, then_br, else_br, and store type of e
           }
         }
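
A runnable sketch of the visitor-based checker follows, cut down to four kinds of expression to keep it short. The slides do not say how a type is stored "in association with" a node; here that is assumed to be an IdentityHashMap from nodes to a hypothetical Type enum, and errors are assumed to be collected as strings in a list.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

enum Type { INT, BOOL }  // assumed representation of types

interface Visitor {
    void visitIntLitExpr(IntLitExpr e);
    void visitBoolLitExpr(BoolLitExpr e);
    void visitPlusExpr(PlusExpr e);
    void visitCondExpr(CondExpr e);
}

abstract class Expr { abstract void visit(Visitor v); }

class IntLitExpr extends Expr {
    int value;
    IntLitExpr(int value) { this.value = value; }
    void visit(Visitor v) { v.visitIntLitExpr(this); }
}
class BoolLitExpr extends Expr {
    boolean value;
    BoolLitExpr(boolean value) { this.value = value; }
    void visit(Visitor v) { v.visitBoolLitExpr(this); }
}
class PlusExpr extends Expr {
    Expr left, right;
    PlusExpr(Expr left, Expr right) { this.left = left; this.right = right; }
    void visit(Visitor v) { v.visitPlusExpr(this); }
}
class CondExpr extends Expr {
    Expr cond, then_br, else_br;
    CondExpr(Expr c, Expr t, Expr e) { cond = c; then_br = t; else_br = e; }
    void visit(Visitor v) { v.visitCondExpr(this); }
}

class Checker implements Visitor {
    // A node with no entry in types is one whose type could not be established.
    final Map<Expr, Type> types = new IdentityHashMap<>();
    final List<String> errors = new ArrayList<>();

    public void visitIntLitExpr(IntLitExpr e) { types.put(e, Type.INT); }
    public void visitBoolLitExpr(BoolLitExpr e) { types.put(e, Type.BOOL); }

    public void visitPlusExpr(PlusExpr e) {
        e.left.visit(this);
        e.right.visit(this);
        Type l = types.get(e.left), r = types.get(e.right);
        // Report only operands whose own type is known, to avoid cascading errors.
        if (l != null && l != Type.INT) errors.add("left operand of + has type " + l);
        if (r != null && r != Type.INT) errors.add("right operand of + has type " + r);
        if (l == Type.INT && r == Type.INT) types.put(e, Type.INT);
    }

    public void visitCondExpr(CondExpr e) {
        e.cond.visit(this);
        e.then_br.visit(this);
        e.else_br.visit(this);
        Type c = types.get(e.cond), t = types.get(e.then_br), f = types.get(e.else_br);
        if (c != null && c != Type.BOOL) errors.add("condition has type " + c);
        if (t != null && f != null && t != f) errors.add("branches have types " + t + " and " + f);
        if (c == Type.BOOL && t != null && t == f) types.put(e, t);
    }
}
```

Calling root.visit(new Checker()) on the root of a tree fills in types for every well-typed subexpression and leaves one message in errors for each violation found.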

  19. Implementing Typechecking: Tools If we want to implement a typechecker (for SEL or SFL, say) then we also need a parser. It is convenient to use an automated tool to generate as much as possible of the front-end machinery. We will use SableCC, a compiler construction tool developed at McGill University in Canada. SableCC is given an annotated grammar, and generates
     • Java class definitions to represent syntax trees supporting the use of visitors
     • a parser
     • a more flexible (in some ways) version of the visitor pattern

  20. A SableCC Grammar for SEL A grammar for SEL, suitable for SableCC, begins with a specification of tokens:
         Package sel;
         Helpers
           digit = ['0' .. '9'];
           tab = 9;
           cr = 13;
           lf = 10;
           space = ' ';
           graphic = [[32 .. 127] + tab];
         Tokens
           blank = (space | tab | cr | lf)* ;
           comment = '//' graphic* (cr | lf);
           int = digit digit*;
           plus = '+';
           and = '&';
           eq = '==';
           if = 'if';
           then = 'then';
           else = 'else';
           true = 'true';
           false = 'false';
           lparen = '(';
           rparen = ')';
         Ignored Tokens
           blank, comment;

  21. A SableCC Grammar for SEL Followed by the productions:
         Productions
           expression = {term} term
                      | {plus} [left]:term plus [right]:term
                      | {and} [left]:term and [right]:term
                      | {eq} [left]:term eq [right]:term
                      | {cond} if [cond]:expression
                               then [then_branch]:expression
                               else [else_branch]:expression;
           term = {int_lit} int
                | {bool_lit} bool
                | {exp} lparen expression rparen;
           bool = {true} true
                | {false} false;

  22. A SableCC Grammar for SEL Exercise: Draw a parse tree for the expression 1 + (2 + 3) . Why are the brackets necessary and why has the grammar been defined in a way that makes them necessary?

  23. Syntax Tree Classes for SEL For each non-terminal in the grammar, SableCC generates an abstract class, for example:
         abstract class PExpression extends Node {}
     where Node is a pre-defined class of syntax tree nodes which provides some general functionality. Similarly we get abstract classes PTerm and PBool. The names of these classes are systematically generated from the names of the non-terminals.

  24. Syntax Tree Classes for SEL For each production, SableCC generates a class, for example:
         class APlusExpression extends PExpression {
           PTerm _left_;
           PTerm _right_;
           public void apply(Switch sw) {
             ((Analysis) sw).caseAPlusExpression(this);
           }
         }
     There are also set and get methods for _left_ and _right_, constructors, and other housekeeping methods which we won’t use.

  25. Using SableCC’s Visitor Pattern The main way of using SableCC’s visitor pattern is to define a class which extends DepthFirstAdapter. By overriding the methods inAPlusExpression or outAPlusExpression etc. we can specify code to be executed when entering or leaving each node during a depth-first traversal of the syntax tree. If we want to modify the order of traversal then we can override caseAPlusExpression etc. but this is often not necessary. The in and out methods return void, but the class provides
         Hashtable in, out;
     which we can use to store types of expressions.

  26. Typechecking SEL We define class Checker extends DepthFirstAdapter and override the out methods. We use the out Hashtable to store and retrieve the type of each expression, using methods setOut and getOut. We represent types by means of an abstract class Type with subclasses IntType and BoolType. Errors are added to an ErrorTable by creating an object of the right error class. At the end of typechecking, errors are reported.

  27. Typechecking SEL: PlusExpression
         public void outAPlusExpression(APlusExpression node) {
           Type leftType = (Type)getOut(node.getLeft());
           Type rightType = (Type)getOut(node.getRight());
           if (leftType != null) {
             if (!(leftType instanceof IntType)) {
               errorTable.add(node.getPlus().getLine(),
                              new PlusLeftError(leftType.name()));
             }
           }
           if (rightType != null) {
             if (!(rightType instanceof IntType)) {
               errorTable.add(node.getPlus().getLine(),
                              new PlusRightError(rightType.name()));
             }
           }
           if ((leftType instanceof IntType) && (rightType instanceof IntType)) {
             setOut(node, new IntType());
           }
         }
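
The method above relies on Type, IntType, PlusLeftError and an ErrorTable, none of which the slides show. A minimal sketch of what such supporting classes could look like is given below. Only the names Type, IntType, BoolType, ErrorTable, PlusLeftError, PlusRightError, name() and add(line, error) come from the slides; the TypeError base class, message(), size(), get() and the message wording are hypothetical, and the course implementation may be organised quite differently.

```java
import java.util.ArrayList;
import java.util.List;

// Types as an abstract class with one subclass per type, as described on slide 26.
abstract class Type { abstract String name(); }
class IntType extends Type { String name() { return "int"; } }
class BoolType extends Type { String name() { return "bool"; } }

// Hypothetical common base class for the error classes.
abstract class TypeError { abstract String message(); }

class PlusLeftError extends TypeError {
    private final String found;
    PlusLeftError(String found) { this.found = found; }
    String message() { return "left operand of + has type " + found + ", expected int"; }
}

class PlusRightError extends TypeError {
    private final String found;
    PlusRightError(String found) { this.found = found; }
    String message() { return "right operand of + has type " + found + ", expected int"; }
}

// Collects errors during the traversal so that they can all be reported at the end.
class ErrorTable {
    private final List<String> entries = new ArrayList<>();

    void add(int line, TypeError error) {
        entries.add("line " + line + ": " + error.message());
    }

    int size() { return entries.size(); }
    String get(int i) { return entries.get(i); }

    void report() {
        for (String entry : entries) System.out.println(entry);
    }
}
```

Recording errors in a table rather than throwing lets the traversal continue, so a single run can report several independent type errors.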

  28. The SEL Typechecker An implementation of a typechecker for SEL can be found on the course web page. You should study the implementation, the accompanying notes, and Worksheet 3. Any questions about the implementation of the typechecker can be dealt with in a future tutorial.

  29. Implementing an SFL Typechecker An implementation of a typechecker for the Simple Functional Language can be found on the course web page and is described in the accompanying notes. You should study them in comparison with the SEL typechecker. The typechecker is based on the SEL typechecker, with two main differences:
     • expressions are typechecked with respect to an environment, so we need an implementation of environments
     • function definitions and function applications must be checked, and type information for functions must be stored in the environment.
     There are of course some changes to the grammar, including the fact that there is now syntax for the types int and bool.

  30. Implementing Environments An environment is essentially a lookup table, indexed by strings (identifier names) and containing two kinds of entry:
     • variable with type
     • function name with parameter types and result type
     We can use a Hashtable.

  31. Nested Scopes We must deal with nesting of scopes. Even though SFL does not have nested functions, there is still a global scope (containing type information for all functions) and a local scope within each function. The class Env implements a stack of Hashtables. To look up a variable or function name, first look in the Hashtable on top of the stack. If it is not there, keep looking down the stack. We will be able to use the same Env class for environments in languages with full scope nesting.

  32. Example: Nested Scopes
         { int x; bool b;
           { float x; int y;
             code…x…y…b…
           }
           code…x…b…
         }
     [Figure: a stack of two Hashtables, with x : float, y : int on top and x : int, b : bool below; lookup searches from the top of the stack downwards.]
     • openScope( ) creates a new Hashtable on the stack
     • closeScope( ) removes the top Hashtable
     • put(String n, EnvEntry e), get(String n)
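
The stack of Hashtables can be sketched as follows. The method names openScope, closeScope, put and get and the class names Env and EnvEntry are from the slides; the contents of EnvEntry (a single description string here), the use of ArrayDeque as the stack, the choice to open a global scope in the constructor, and returning null for an unknown name are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Hashtable;

// Hypothetical entry: the slides name EnvEntry but do not show its fields.
class EnvEntry {
    final String description;
    EnvEntry(String description) { this.description = description; }
}

class Env {
    // The head of the deque is the top of the stack, i.e. the innermost scope.
    private final Deque<Hashtable<String, EnvEntry>> scopes = new ArrayDeque<>();

    Env() { openScope(); }  // assumption: an Env starts with a global scope

    void openScope() { scopes.push(new Hashtable<String, EnvEntry>()); }

    void closeScope() { scopes.pop(); }

    // Adds the entry to the innermost scope, shadowing any outer declaration.
    void put(String n, EnvEntry e) { scopes.peek().put(n, e); }

    // Searches from the innermost scope outwards; returns null if n is undeclared.
    EnvEntry get(String n) {
        for (Hashtable<String, EnvEntry> scope : scopes) {
            EnvEntry e = scope.get(n);
            if (e != null) return e;
        }
        return null;
    }
}
```

Replaying the example above: after put("x", …int) and put("b", …bool) in the outer scope, then openScope() and put("x", …float), a get("x") finds the float entry and get("b") still finds the outer one; after closeScope(), get("x") finds the int entry again.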

  33. Mutual Recursion The SEL typechecker makes a single traversal of the syntax tree. If we want to typecheck SFL in a single pass, then in order to support mutually recursive functions we need to follow Pascal:
         function f(x:int):int; forward;
         function g(x:int):int
         begin g := f(x); end;
         function f(x:int):int
         begin f := g(x); end;
     or Standard ML:
         fun f(x:int) = g(x)
         and g(x:int) = f(x)
     Instead, to stick closely to the formal definition of SFL, we use two passes: the first just looks at function definitions and builds an initial environment containing their type information.
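
The two-pass idea can be illustrated with a deliberately tiny model. This is not the SFL checker: FunDef is a hypothetical stand-in in which each function has one parameter and a body consisting of a single call callee(x) on that parameter, and types are plain strings. The point is only that the first pass records every header before any body is checked, so f and g may refer to each other in either order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, drastically simplified function definition:
// name(x : paramType) : resultType, with body "callee(x)".
class FunDef {
    final String name, paramType, resultType, callee;
    FunDef(String name, String paramType, String resultType, String callee) {
        this.name = name;
        this.paramType = paramType;
        this.resultType = resultType;
        this.callee = callee;
    }
}

class TwoPassChecker {
    // Returns a list of error messages; an empty list means the program typechecks.
    static List<String> check(List<FunDef> program) {
        // Pass 1: look only at the headers and build the initial environment.
        Map<String, FunDef> env = new HashMap<>();
        for (FunDef f : program) {
            env.put(f.name, f);
        }

        // Pass 2: check each body against the complete environment.
        List<String> errors = new ArrayList<>();
        for (FunDef f : program) {
            FunDef g = env.get(f.callee);
            if (g == null) {
                errors.add(f.name + ": undefined function " + f.callee);
                continue;
            }
            if (!g.paramType.equals(f.paramType)) {
                errors.add(f.name + ": argument of " + g.name + " has type "
                        + f.paramType + ", expected " + g.paramType);
            }
            if (!g.resultType.equals(f.resultType)) {
                errors.add(f.name + ": body has type " + g.resultType
                        + ", expected " + f.resultType);
            }
        }
        return errors;
    }
}
```

With a single pass over the definitions in source order, checking the body of f would fail because g is not yet in the environment; that is the problem the forward declaration and the and keyword solve in Pascal and Standard ML respectively.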

  34. Making SFL More Powerful We have a formal definition of the syntax, operational semantics and type system of SFL and we have proved that the type system is sound. Our design of the language itself was rather ad hoc, and we have seen that functions in SFL lack flexibility. To make SFL look more like a real functional language, we need to build on a suitable theoretical foundation: the lambda calculus (λ-calculus). When we have seen how to introduce functions properly, we’ll go on to look at structured data types (e.g. records).

  35. Exercise for Tutorial The aim of next week’s tutorial is to ensure that you understand the SEL typechecker. Please work through the exercises, which have the following main tasks, in advance.
     • Using SableCC to build the syntax tree classes for SEL, then compiling and testing the typechecker.
     • Understanding the structure of directories and files containing the SableCC-generated classes, the Checker class, and the error-reporting mechanism.
     • Adding a new operator to the SEL grammar, using SableCC to rebuild the generated classes, extending Checker and defining appropriate new error classes.
     In the tutorial we will discuss these exercises and any further details of the SEL example.
