Programming Languages 2nd edition Tucker and Noonan

Chapter 4: Names
"The first step toward wisdom is calling things by their right names." (Anon., Chinese proverb)

4.1 Syntactic Issues
4.2 Variables
4.3 Scope
4.4 Symbol Table
4.5 Resolving References
4.6 Dynamic Scoping
4.7 Visibility
4.8 Overloading
4.9 Lifetime

Binding
• Binding establishes an association between an entity (such as a variable) and a property (such as its value or type).
• A binding is static if the association occurs before run time.
• A binding is dynamic if the association occurs at run time.
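A minimal Python sketch of a dynamic binding, where the type associated with a name is established at run time and can change. The names x, first_type, and second_type are illustrative only. In C, by contrast, a declaration such as int x; binds the type statically.

```python
# Dynamic binding: the type associated with the name x is established
# at run time and can change as the program executes.
x = 42
first_type = type(x).__name__    # type bound when the assignment runs

x = "forty-two"                  # same name, now bound to a string
second_type = type(x).__name__

print(first_type, second_type)
```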

Names and Bindings
• Names (identifiers) refer to variables, constants, and functions.
• Names are bound to objects when they are declared.
  • Declaring variables: int jim;
  • Declaring functions: int fun1(float, float);
• Questions:
  • Implicit or explicit declarations? (Fortran versus most other languages)
  • Static or dynamic binding? (Most languages versus Python, Perl, …)
  • Variable bindings versus others…

Names and Bindings
• Reserved words: predefined names that have a special meaning.
  • Useful, but can be overdone; e.g., COBOL has about 300, including some you might like to use as identifiers (COUNT, DAY, LENGTH, …).
• Name resolution: the process of associating a name with its declaration.
• Declaration versus definition

Syntactic Issues for Names
• Lexical rules for names (identifiers).
• Case sensitivity?
  • C-like languages: yes
  • Early languages, including Fortran: no
• Restrictions on allowed characters?
• Use of special characters to convey information, e.g., Perl:
  • $number: a scalar value (integer, float, string, boolean)
  • @number: number is an array (but $number[0] is a single array element)

Variables
• A variable is a binding of a name to a (virtual) memory address.
• Variables have 4 basic bindings:
  • Name
  • Address
  • Type
  • Value
• Variables also have scope and lifetime.

Two Meanings of a Variable Name
• L-value: use of a variable name to denote its memory address. Ex: x = …
• R-value: use of a variable name to denote its value. Ex: … = … x …
• The R-value is x dereferenced.

Some languages only recognize the L-value meaning; they require explicit dereferencing of variables.
• Example (ML): x := !y + 1 // !y => value of y
• Pointer dereferencing in C/C++ is similar:
    int x, y;
    int *p;
    x = *p;   // the R-value of p is an address; x gets the value stored at that address
    *p = y;   // the R-value of y is assigned to the memory location referenced by p

Characteristics of Names (1)
• The scope of a name: the collection of statements that can reference the name. Names usually must be unique within a scope.
• The lifetime of a variable name: the execution time interval during which memory is allocated to the object it is bound to.

Characteristics of Names (2)
Names are not always unique within a scope:
• Visibility refers to the possibility that a name may be re-declared within the normal scope of an existing declaration, thus hiding previous instances of the name.
• Overloading permits name resolution for functions and operators to be based on the number or type of parameters, so a single name can have two meanings within a single scope.

Scope
• The scope of a name is the collection of statements that can access the name binding.
• Static scoping: a name is bound to a collection of statements according to its position in the source program. The scope can be determined statically (before run time).
• Most modern languages use static (or lexical) scoping.

Static scoping can be handled at compile time by a simple scan of the code.
• Two different scopes are either nested or disjoint.
• In disjoint scopes, the same name can be bound to different entities without any interference between the two definitions.
• What constitutes a scope?
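A short Python sketch of nested versus disjoint scopes under static scoping. The function and variable names are made up for illustration. The reference to count inside inner is resolved from the textual nesting alone, and the two value variables never interfere because their scopes are disjoint.

```python
def outer():
    count = 10                 # declared in outer's scope

    def inner():
        # Nonlocal reference: inner is textually nested in outer,
        # so count resolves to outer's declaration.
        return count + 1

    return inner()


def first():
    value = "a"                # scope of first is disjoint from second
    return value


def second():
    value = "b"                # same name, different entity
    return value


print(outer(), first(), second())
```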

              Algol    C        Java     Ada
Package       n/a      n/a      yes      yes
Class         n/a      n/a      nested   yes
Function      nested   yes      yes      nested
Block         nested   nested   nested   nested
For Loop      no       no       yes      automatic
A compilation unit also defines a scope.

The scope in which a name is defined or declared is called its defining scope.
• A reference to a name is nonlocal if it occurs in a scope that is nested in the defining scope; otherwise, it is local.
• Most languages forbid or strictly limit forward references (the use of a name before its declaration).

Example Scopes in C – Figure 4.1
 1 void sort (float a[ ], int size) {
 2     int i, j;
 3     for (i = 0; i < size; i++)
 4         for (j = i + 1; j < size; j++)
 5             if (a[j] < a[i]) {
 6                 float temp;
 7                 temp = a[i];   // temp: local ref
 8                 a[i] = a[j];   // a, i, j: non-local
 9                 a[j] = temp;
10             }
11 }
The inner { } delimit a nested scope.
• The defining scope of a and size is lines 2 – 10 (between the outer curly braces).
• The defining scope of i and j is the same block, but no references to them are permitted until after they are declared.
• The defining scope of temp is lines 6 – 10, with references only permitted after the declaration.

// for-loop scope in C++ & Java
for (int i = 0; i < 10; i++) {
    System.out.println(i);
    ...
}
... i ...   // invalid reference to i
The for-loop scope extends only to the end of the body of the for loop.

Global Scope
• Some languages (C, C++, Python, …) allow a program unit to consist of several functions, with variable definitions occurring outside of the functions.
• Global variables have global scope (they are non-local to all functions in the program unit).
Example code with a global variable (t is global in this program):
int t;
void main() {
    int i, j;
    t = 10;
    i = t;
    j = t * i;
    cout << j;
    . . .
}
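The same idea can be sketched in Python, which the slide lists among the languages with global variables. This is an illustrative translation, not part of the original example. Note one Python-specific assumption: assigning to a global inside a function requires a global declaration, otherwise the assignment would create a new local.

```python
t = 0                # defined outside any function: global scope


def main():
    global t         # needed in Python to assign to the global t
    t = 10
    i = t            # i and j are local to main
    j = t * i
    return j


result = main()
print(result)
```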

Symbol Table – Static Scoping
• A symbol table is a data structure built as part of semantic analysis that allows the translator to keep track of each declared name and its bindings.
• The data structure can be any implementation of a dictionary or set, where the name is the key and its binding is the value.
• In Java, the Dictionary class is the abstract parent of any class, such as Hashtable, that maps keys to values.

Contents of Symbol Table
• Name + bindings
• For names that refer to variables, the bindings include:
  • Address (relative location in this module)
  • Type: simple or structured
  • Dimensions, if an array
  • . . .
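One possible way to picture a symbol table entry is a dictionary keyed by name, with a small record holding the bindings. This is a sketch only: the Binding class, its field names, and the addresses and array size below are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Binding:
    """Bindings recorded for a variable name (hypothetical layout)."""
    type_name: str                                  # e.g. "int" or "float"
    address: int                                    # relative location in the module
    dimensions: Optional[Tuple[int, ...]] = None    # set only for arrays


# The symbol table maps each declared name (key) to its bindings (value).
symbol_table = {}
symbol_table["size"] = Binding("int", 0)
symbol_table["a"] = Binding("float", 4, dimensions=(100,))

print(symbol_table["a"])
```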

Symbol Tables
• Assumption: a name is unique within its local scope (we ignore overloading for simplicity).
• Assumption: no forward references to variables.
• One symbol table entry for each name:
  • name – bindings

Resolving References
• Name-binding pairs are entered in the symbol table when they are defined; e.g., when a variable or function is declared.
• When a name reference occurs during syntax or semantic processing, use the symbol table to determine the binding.
• Possible bindings:
  • name-virtual address
  • name-type
  • name-parameter list (for a function)

Symbol Tables
• A separate dictionary (symbol table) is kept for each local scope.
• During parsing and semantic analysis, bindings for local variables are established by querying the dictionary for the current scope.
• What about non-local references?
  • Use a stack of scopes.

 1 int h, i;
 2 void B(int w) {
 3     int j, k;
 4     i = 2*w;
 5     w = w+1;
 6     ...
 7 }
 8 void A (int x, int y) {
 9     float i, j;
10     B(h);
11     i = 3;
12     ...
13 }
14 void main() {
15     int a, b;
16     h = 5; a = 3; b = 2;
17     A(a, b);
18     B(h);
19     . . .
20 }

Creating/Managing the Symbol Table
• Each time a scope is entered (during translation), push a new dictionary onto the stack.
• Each time a scope is exited (by entering a disjoint scope; e.g., completing a function or block), pop a dictionary off the top of the stack.
• For each name declared in the scope, generate an appropriate binding and enter the name+binding pair into the dictionary that is on the top of the stack.

Name resolution: given a name reference, search the dictionary (symbol table) on top of the stack:
• If found, return the binding.
• Otherwise, repeat the process on the next dictionary down in the stack.
• If the name is not found in any dictionary on the stack, report an error: undefined name.
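The push, pop, declare, and resolve steps can be sketched as a small Python class. This is a minimal illustration under simplifying assumptions: ScopeStack, UndefinedName, and the method names are invented here, and each binding is just the line number of the declaration, taken from the sort function of Figure 4.1.

```python
class UndefinedName(Exception):
    """Raised when a name is not found in any dictionary on the stack."""


class ScopeStack:
    def __init__(self):
        self.stack = []                    # last element is the top of the stack

    def enter_scope(self):
        self.stack.append({})              # push a new dictionary

    def exit_scope(self):
        return self.stack.pop()            # pop the dictionary for the exited scope

    def declare(self, name, binding):
        self.stack[-1][name] = binding     # enter the pair in the top dictionary

    def resolve(self, name):
        for table in reversed(self.stack): # search from the top downward
            if name in table:
                return table[name]
        raise UndefinedName(name)


scopes = ScopeStack()
scopes.enter_scope()                       # outermost scope
scopes.declare("sort", 1)
scopes.enter_scope()                       # body of sort
for name, line in [("a", 1), ("size", 1), ("i", 2), ("j", 2)]:
    scopes.declare(name, line)
scopes.enter_scope()                       # inner block opened on line 5
scopes.declare("temp", 6)
at_line_7 = (scopes.resolve("temp"), scopes.resolve("a"))
scopes.exit_scope()                        # leave the inner block

try:
    scopes.resolve("temp")
    temp_after_exit = "found"
except UndefinedName:
    temp_after_exit = "undefined"

print(at_line_7, temp_after_exit)
```

Searching from the top of the stack downward is what makes an inner declaration take precedence over an outer one with the same name.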

Comments
• There is one symbol table for each scope in the program.
• At any time, only some of the symbol tables are on the stack.
• Only symbol tables for currently active scopes are on the stack. They represent the current local scope and the scopes that enclose it.
• For compiled languages, symbol tables are only used during translation; for interpreted languages they must be present at run time.

Symbol Table Stacks
 1 void sort (float a[ ], int size) {
 2     int i, j;
 3     for (i = 0; i < size; i++)
 4         for (j = i + 1; j < size; j++)
 5             if (a[j] < a[i]) {
 6                 float temp;
 7                 temp = a[i];
 8                 a[i] = a[j];
 9                 a[j] = temp;
10             }
11 }
The inner { } delimit a nested scope.
At lines 4 and 11:
  ST2: <size, 1> <a, 1> <i, 2> <j, 2>
  ST1: <sort, 1>
At line 7:
  ST3: <temp, 6>
  ST2: <size, 1> <a, 1> <i, 2> <j, 2>
  ST1: <sort, 1>

Resolving References – Static Scope
• The referencing environment for a name is its defining scope and all nested (inner) scopes.
• The referencing environment defines the set of statements that can validly reference a name.
• A valid non-local reference is to a name defined in a nesting (outer) scope.

Resolving names in programs with nested and disjoint scopes
 1 int h, i;
 2 void B(int w) {
 3     int j, k;
 4     i = 2*w;
 5     w = w+1;
 6     ...
 7 }
 8 void A (int x, int y) {
 9     float i, j;
10     B(h);
11     i = 3;
12     ...
13 }
14 void main() {
15     int a, b;
16     h = 5; a = 3; b = 2;
17     A(a, b);
18     B(h);
19     ...
20 }

Symbol Table Stack in Function B:
  FB: <w, 2> <j, 3> <k, 3>
  Outer: <h, 1> <i, 1> <B, 2> <A, 8> <main, 14>
Symbol Table Stack in Function A:
  FA: <x, 8> <y, 8> <i, 9> <j, 9>
  Outer: <h, 1> <i, 1> <B, 2> <A, 8> <main, 14>
Symbol Table Stack in Function main:
  Main: <a, 15> <b, 15>
  Outer: <h, 1> <i, 1> <B, 2> <A, 8> <main, 14>

Resolution of nonlocal and multiply defined references
Line                Reference   Declaration
4 (in B)            i           1
10 (in A)           h           1
11 (in A)           i           9
16, 18 (in main)    h           1
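The resolutions in this table can be checked with a small Python sketch. It assumes each symbol table is a plain dictionary mapping a name to the line of its declaration in the listing, and that each stack is a list with the innermost table first. The helper name resolve and the variable names are illustrative.

```python
# One dictionary per scope: name -> line number of the declaration.
outer = {"h": 1, "i": 1, "B": 2, "A": 8, "main": 14}
fb = {"w": 2, "j": 3, "k": 3}
fa = {"x": 8, "y": 8, "i": 9, "j": 9}
fmain = {"a": 15, "b": 15}


def resolve(name, stack):
    """Search the tables from innermost to outermost."""
    for table in stack:
        if name in table:
            return table[name]
    raise KeyError("undefined name: " + name)


# Each stack lists the innermost (top) table first.
in_B = [fb, outer]
in_A = [fa, outer]
in_main = [fmain, outer]

print(resolve("i", in_B))      # reference on line 4
print(resolve("h", in_A))      # reference on line 10
print(resolve("i", in_A))      # reference on line 11
print(resolve("h", in_main))   # references on lines 16 and 18
```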

How the Compiler Uses Symbol Tables
• Symbols and bindings are (probably) entered by the lexer or parser.
• The information is used during type/semantic analysis and possibly code generation; e.g., the symbol table is used for type-checking statements and determining relative program addresses.
• Symbol tables are not present during execution for compiled languages.

Dynamic Scoping
Not generally used. When we talk about scope in this class, assume static scoping unless otherwise specified.
• In dynamic scoping, a name is bound to its most recent declaration based on the program’s call history.
• Used by early Lisp, APL, Snobol, and Perl (optional).
• The symbol table for each scope is built at compile time but managed at run time.
• A scope is pushed onto the stack when it is entered and popped off when it is exited during execution.
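A Python simulation of dynamic scoping for the h/i/A/B/main program shown earlier. Python itself is statically scoped, so this sketch models the run-time stack explicitly as a list of dictionaries. The function resolve_dynamic and the frame contents are assumptions made for illustration. Under dynamic scoping, the reference to i on line 4 of B depends on who called B.

```python
def resolve_dynamic(name, call_stack):
    """Return the most recent declaration of name in the call history."""
    for frame in reversed(call_stack):     # most recently entered scope first
        if name in frame:
            return frame[name]
    raise NameError(name)


# Declarations visible in each scope, described by text labels.
globals_frame = {"h": "int h (line 1)", "i": "int i (line 1)"}
main_frame = {"a": "int a (line 15)", "b": "int b (line 15)"}
a_frame = {"x": "int x (line 8)", "y": "int y (line 8)",
           "i": "float i (line 9)", "j": "float j (line 9)"}
b_frame = {"w": "int w (line 2)", "j": "int j (line 3)", "k": "int k (line 3)"}

# B called from A (line 10): A's scope is still on the stack.
via_A = resolve_dynamic("i", [globals_frame, main_frame, a_frame, b_frame])

# B called directly from main (line 18): A's scope is not on the stack.
via_main = resolve_dynamic("i", [globals_frame, main_frame, b_frame])

print(via_A)
print(via_main)
```

With static scoping both calls would resolve i to the declaration on line 1, because B is not nested inside A.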

Characteristics of Names
• Scope
  • Generally determined by the static structure of the program
  • Lines of code where the name can be referenced
  • The symbol table records name/attribute bindings, based on the current scope
• Lifetime
• Visibility
• Overloading

Visibility
• A name is visible (when a reference is made) if its referencing environment (its defining scope and all nested scopes) includes the reference and if the name has not been re-declared in an inner scope.
• A name re-declared in an inner scope effectively hides the outer declaration for the extent of that scope.
• For example, the definition of i in function A hides the global definition.
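Hiding can be sketched in Python with a global i and a function that re-declares it. This loosely mirrors the i in function A from the example program; the values are made up for illustration.

```python
i = 1                # outer (global) declaration


def A():
    i = 3.0          # re-declaration: hides the global i inside A
    return i


def B():
    return i         # no local i, so the global i is visible here


print(A(), B(), i)
```

The assignment inside A creates a separate local variable, so the global i is unchanged after the call.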

 1 int h, i;
 2 void B(int w) {
 3     int j, k;
 4     i = 2*w;
 5     w = w+1;
 6     ...
 7 }
 8 void A (int x, int y) {
 9     float i, j;
10     B(h);
11     i = 3;
12     ...
13 }
14 void main() {
15     int a, b;
16     h = 5; a = 3; b = 2;
17     A(a, b);
18     B(h);
19     ...
20 }
A name is visible (when a reference is made) if its referencing environment (its defining scope and all nested scopes) includes the reference and the name has not been re-declared in an inner scope.

Some languages provide a way to reference certain “hidden” identifiers; e.g., Java’s “this”:
public class Student {
    private String name;
    public Student (String name, ...) {
        this.name = name;
        ...
    }
}

Function Nesting
• Used in Ada, Pascal, and a few other languages.
• See the Ada example.
• Nested functions invoke visibility rules to distinguish between different definitions of the same name.

-- Ada
procedure Main is
   x : Integer;
   procedure p1 is
      x : Float;
      procedure p2 is
      begin
         ... x ...
      end p2;
   begin
      ... x ...
   end p1;
   procedure p3 is
   begin
      ... x ...
   end p3;
begin
   ... x ...
end Main;
Which x is referenced in p2? In p1? In p3? In Main?
While C-like languages do not nest functions, they do nest blocks.
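The same nesting can be imitated with nested Python functions to see which declaration each reference reaches. This is an analogy, not Ada: the strings stand in for the Integer and Float variables, and the function names simply mirror the procedures.

```python
def main():
    x = "Main.x (Integer)"

    def p1():
        x = "p1.x (Float)"         # hides Main's x inside p1 and p2

        def p2():
            return x               # nearest enclosing declaration: p1's x

        return x, p2()

    def p3():
        return x                   # p3 is not nested in p1: Main's x

    in_p1, in_p2 = p1()
    return {"p1": in_p1, "p2": in_p2, "p3": p3(), "Main": x}


answers = main()
print(answers)
```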

Overloading
• Overloading uses the number or type of parameters or operands to distinguish among identical function names or operators.
• Examples:
  • +, -, *, / can be float or int operations.
  • + can be float or int addition in Java, C, and C++; it is also string concatenation in Java and in C++ (for std::string).

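Overloading Examples
• Java overloads print and println to print many different types (boolean, char, int, ...); see the PrintStream class. C++ similarly overloads the << operator for stream output. C does not support function overloading.
• Java also allows an instance variable and a method to have the same name.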

public class PrintStream extends FilterOutputStream {
    ...
    public void print(boolean b);
    public void print(char c);
    public void print(int i);
    public void print(long l);
    public void print(float f);
    public void print(double d);
    public void print(char[ ] s);
    public void print(String s);
    public void print(Object obj);
}
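Python has no compile-time overloading, but the idea of choosing an implementation by argument type can be approximated with functools.singledispatch from the standard library. This sketch is only an analogy to the Java print methods: the function name show is hypothetical, and the dispatch happens at run time on the type of the first argument only.

```python
from functools import singledispatch


@singledispatch
def show(value):
    # Fallback, comparable to print(Object obj)
    return "object: " + repr(value)


@show.register
def _(value: bool):
    return "boolean: " + str(value).lower()


@show.register
def _(value: int):
    return "int: " + str(value)


@show.register
def _(value: str):
    return "string: " + value


print(show(True), show(7), show("hi"), show(2.5))
```

One name, show, stands for several implementations; the argument type decides which one runs.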

Modula (no overloading)
• Different library functions for each type:
  • Read( ) for characters
  • ReadReal( ) for floating point
  • ReadInt( ) for integers
  • ReadString( ) for strings

Overloading in Programmer-Defined Types and Functions
• Ada: among the first widely used languages to allow programmer-defined overloaded operators and functions.
• C++: follows the Ada model (both operators and functions can be overloaded).
• Java: overloading for methods only.

Lifetime
• The lifetime of a variable is the time interval during which the variable has been allocated a block of memory.
• The earliest languages used static allocation for all memory (no stack or heap).
• Algol introduced the notion that memory for a variable should be allocated on entry to, and deallocated on exit from, the scope in which the variable is defined.

Lifetime in a Statically Allocated Language
• Statically allocated storage has a lifetime equal to the execution time of the program.
• There is only one copy of a function and its parameters or local variables.
• Function variables retain values between calls (unless re-initialized).
• Storage allocation needs are determined at compile time and never change.

Non-Static Allocation in C/C++, Java, and Other Modern Languages
• By default, storage for local variables is allocated and deallocated as they go in and out of scope.
  • Stack-based implementation
• C/C++: explicitly declaring a local variable static overrides the default.
• Variables declared at global (compilation unit) scope are statically allocated.
• Java also allows class variables to be declared static.
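The difference in lifetime can be imitated in Python. This is a loose analogy, since Python has no static locals: a function attribute stands in for a C static local variable. Both function names are invented for the sketch.

```python
def count_calls_local():
    calls = 0                      # fresh local on every call; gone on return
    calls += 1
    return calls


def count_calls_static():
    # The function attribute outlives each call, like a C static local.
    count_calls_static.calls += 1
    return count_calls_static.calls


count_calls_static.calls = 0       # initialized once, before the first call

local_results = [count_calls_local() for _ in range(3)]
static_results = [count_calls_static() for _ in range(3)]
print(local_results, static_results)
```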

More about memory allocation when we discuss functions and function implementation.

Preview First test: One week from today. Topics: • Imperative and OO languages • Chapters 1 – 4 (partial) • Python CS 524 students: Term Project Scope will be reduced. You will be given the assignment soon.
