Some details of implementation

As part of (or as an extension of) type-checking:
• Each declaration d(x) is associated with a type → type & size information. The compiler decides on
  • Address (see below)
  • Size of storage
• Each use u(x) is associated with some d(x) – the compiler generates code to access the allocated address (for read/write/…)

On address allocation:
• For the global block – fixed addresses
• For functions:
  • Many calls → many allocations
  • Allocated at run-time
  • Address allocation and access code are relative to a (yet unknown) base address (assumed to be stored in a register at run-time)
If the total size for parameters is M, the compiler
• associates M with the compiled function
• generates code to allocate M at run-time when the function is called
(the estimate M is modified below)
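
A minimal sketch of how a compiler might assign relative addresses to parameters and compute M. The size table and the function name layout are assumptions made for illustration; alignment is ignored.

```python
# Hypothetical sizes in bytes; real sizes depend on the target machine.
SIZES = {"int": 4, "double": 8, "ref": 8}

def layout(params):
    """Give each (name, type) parameter a relative address; return (offsets, M)."""
    offsets = {}
    next_free = 0
    for name, typ in params:
        offsets[name] = next_free      # address relative to the unknown base
        next_free += SIZES[typ]
    return offsets, next_free          # next_free is the total size M

offsets, M = layout([("a", "int"), ("b", "double"), ("f", "ref")])
print(offsets, M)
```

The offsets are fixed at compile time; only the base address differs from call to call.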

At run-time, when a function value is applied:
• A memory chunk S of size M is allocated
• The values of the arguments are stored:
  • In S (in the right positions), or
  • Elsewhere (in the heap); references are stored in S
    • Argument values may be of variable size (inconvenient for implementation)
    • Another reason – next chapter
• The base address of S is put into a register, valid for this activation
• Access for resolution of a use is by relative addressing → one machine instruction
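
The run-time steps above can be simulated roughly as follows. The Machine class, its bump allocator, and the one-slot-per-argument layout are assumptions of this sketch, not a description of a real machine.

```python
class Machine:
    """Toy machine: a flat memory, a bump allocator, and one base register."""

    def __init__(self, size):
        self.memory = [None] * size
        self.next_free = 0
        self.base = None                 # the "base register"

    def call(self, M, args):
        # Allocate a chunk S of size M and point the base register at it.
        self.base = self.next_free
        self.next_free += M
        # Store each argument at its compile-time offset within S.
        for offset, value in enumerate(args):
            self.memory[self.base + offset] = value
        return self.base

    def load(self, offset):
        # Relative addressing: base register + fixed offset.
        return self.memory[self.base + offset]

m = Machine(16)
b1 = m.call(2, [10, 20])
first = m.load(1)
b2 = m.call(2, [30, 40])
second = m.load(1)
print(b1, first, b2, second)
```

The same access code (offset 1) reads a different cell in each activation because only the base register changes.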

Extension for other blocks (let/let*/letrec/letrec*):
• The compiler also knows the types & sizes of variables declared locally in a function. It also associates a relative address with each such variable
• Name conflicts (variable x in different blocks) are no problem: each d(x) is associated with a distinct address (essentially, variables are renamed to distinct names, namely the addresses)
• Each use u(x) is associated with one d(x) – relative access code is generated (scope rules are used here)
• The size M for a function & the allocated chunk S include the parameters & all locals (same for the global block)
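
A sketch of the renaming idea, assuming every variable takes one slot and a block is represented as a hypothetical triple (declared names, uses, sub-blocks). Each declaration gets a fresh slot, and each use is resolved by the scope rules to exactly one slot.

```python
def assign_slots(params, body):
    """Return (M, resolved) where resolved lists (used name, slot) pairs."""
    counter = 0
    resolved = []

    def visit(block, scope):
        nonlocal counter
        names, uses, sub_blocks = block
        scope = dict(scope)            # inner declarations shadow outer ones
        for name in names:
            scope[name] = counter      # a distinct address per declaration
            counter += 1
        for name in uses:
            resolved.append((name, scope[name]))
        for sub in sub_blocks:
            visit(sub, scope)

    visit((params, [], [body]), {})
    return counter, resolved

# f(x) { let y ... uses x, y ... { let x ... uses x, y ... } }
inner = (["x"], ["x", "y"], [])
body = (["y"], ["x", "y"], [inner])
M, resolved = assign_slots(["x"], body)
print(M, resolved)
```

The two declarations of x end up in different slots, so the name conflict disappears once names are replaced by addresses.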

When a function value is applied (revisited):
• A memory chunk S of size M for the arguments and locals is allocated (only exception: real-time systems with small memory); essentially, the frames of parameters and locals are merged into an extended frame
• The values of the arguments are stored immediately
• Variables bound to cells: these are effectively allocated
• Other variable bindings: code for processing the defining expressions is generated in the right places (that they are used only in scope was assured at compile-time)
• Address allocation, relative addressing (for locals): as described above
• Access for resolution of parameters & locals is by one machine instruction

Example (C):
  int x;
  int f(double x) { int y, z; … }
[Figure: memory chunk for the global block (x, ref to fv) and memory chunk for f's value (x, y, z)]
Note: the actual names are not used at run-time

Parameters and locals are accessed efficiently. What about global variable access?
• The static (nesting) depth of a use u(x), sd(u(x)): the number of function boundaries between the use and its binding declaration d(x)
  (given that local blocks are merged into the function, the compiled static depth counts only function boundaries)
• sd(u(x)) is the number of frame hops needed to get to a binding for x
• This number is known to the compiler!

Cont'd:
• The compiler compiles each non-local use v to code for a loop that traverses sd(v) links of static parents, followed by relative access to the right position
Summary:
• Variable declarations are replaced by positions
• Uses (both locals and non-locals) are compiled to access code for these positions
→ Compilation generates efficient computation (~ rule alpha!)
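
The access code for a use can be sketched as below, assuming a hypothetical Frame object with a list of slots and a static_parent link. The pair (sd, offset) is what the compiler would embed in the generated code.

```python
class Frame:
    def __init__(self, slots, static_parent=None):
        self.slots = slots
        self.static_parent = static_parent

def load(frame, sd, offset):
    """Follow sd static-parent links, then do one relative access."""
    for _ in range(sd):
        frame = frame.static_parent
    return frame.slots[offset]

global_frame = Frame([100])
outer = Frame([1, 2], static_parent=global_frame)
inner = Frame([7], static_parent=outer)

local_value = load(inner, 0, 0)     # a local: no hops
outer_value = load(inner, 1, 1)     # one function boundary away
global_value = load(inner, 2, 0)    # two function boundaries away
print(local_value, outer_value, global_value)
```

No name lookup happens at run-time: the loop bound and the offset are both constants of the compiled code.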

Implementation of function values:
A function value is implemented by a pair of pointers
• To the compiled code (including code to allocate storage, to store the return value, …..)
• To a [(extended) frame, static parent] pair (its environment)
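
A rough model of such a pair, assuming hypothetical Frame and Closure classes and a Python function standing in for compiled code. Applying the closure allocates a new frame whose static parent is the closure's environment.

```python
class Frame:
    def __init__(self, slots, static_parent=None):
        self.slots = slots
        self.static_parent = static_parent

class Closure:
    """A function value: pointer to code plus pointer to its environment."""
    def __init__(self, code, env):
        self.code = code
        self.env = env

def apply(closure, args):
    frame = Frame(list(args), static_parent=closure.env)
    return closure.code(frame)

# Stand-in for the compiled body of (lambda (y) (+ x y)):
# y is local (depth 0, offset 0); x is one static hop away (offset 0).
def add_x_code(frame):
    return frame.static_parent.slots[0] + frame.slots[0]

defining_frame = Frame([10])                 # holds x = 10
add_x = Closure(add_x_code, defining_frame)
result = apply(add_x, [5])
print(result)
```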

Life and death on the stack
Two approaches to programming-language implementation:
• Stack-based: a run-time stack of activation records (one per function activation; corresponds to the conceptual activation stack) – common for imperative/OO languages
• Continuation-based: does not employ a stack – common for functional languages
We discuss the stack-based approach.

An activation record contains activation-relevant data:
• Saved registers
• Return address
• Dynamic pointer
• ….. (more such data)
• One of
  • [(extended) frame, static (parent) pointer] (automatic storage)
  • A reference to [frame, static pointer] (heap storage)
Heap: a storage area allocated for a program; contains
• Internal run-time data structures
• Program data
Q: What determines which of these two is selected?

We consider two questions:
• What determines whether a frame is stored in the activation record on the stack, or in the heap?
• Can cells be allocated for parameters/locals on the stack, or should they be allocated in the heap?
Pros & cons of stack allocation:
• (De-)allocation is very efficient (hardware support)
• Data stored in an activation record dies when the record is popped
Need to understand liveness (only for the 2 questions)

Example (a simple counter object):
(define counter
  (let ((count 0))
    (lambda (msg)
      (cond ((eqv? msg 'show) count)
            ((eqv? msg 'inc) (set! count (+ 1 count)))))))
• counter is bound to a function value with a reference to a frame that corresponded to a dead activation (de-allocated from the stack)
• This frame is live (reachable from a live activation) → the frame must reside in the heap
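
The same situation can be imitated with an explicit activation stack. This is only a sketch: the stack list, the Frame class, and the (code, frame) tuple are assumptions standing in for the run-time structures discussed here.

```python
class Frame:
    def __init__(self, slots):
        self.slots = slots

stack = []   # simulated stack of activations

def make_counter():
    stack.append("activation of the let block")
    frame = Frame([0])               # count at offset 0, allocated in the heap

    def code(env, msg):
        if msg == "show":
            return env.slots[0]
        if msg == "inc":
            env.slots[0] += 1
        return None

    closure = (code, frame)
    stack.pop()                      # the creating activation dies here
    return closure

code, env = make_counter()
code(env, "inc")
code(env, "inc")
shown = code(env, "show")
print(shown, len(stack))
```

The activation is gone (the stack is empty), yet the frame is still reachable through the function value, which is why it could not have lived in the popped activation record.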

Liveness definition for function values, frames, bindings (conservative definition)
Liveness = reachability
• A binding is reachable if it is stored in a reachable frame
• A frame is reachable if it is
  • Associated with a live activation
  • The frame of a reachable function value
  • Reachable by a static parent reference from a reachable frame
• A function value is reachable if it is the value in a reachable binding
(the above contains mutual recursion; what is the base case?)
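
One way to compute this definition is a worklist traversal starting from the frames of live activations. The Frame and FuncValue classes and the frame names below are hypothetical, chosen to mirror the counter example.

```python
class Frame:
    def __init__(self, name, static_parent=None):
        self.name = name
        self.static_parent = static_parent
        self.bindings = {}

class FuncValue:
    def __init__(self, env):
        self.env = env               # the frame this function value refers to

def live_frames(active_frames):
    """Names of all frames reachable from the frames of live activations."""
    live = set()
    work = list(active_frames)
    while work:
        frame = work.pop()
        if frame in live:
            continue
        live.add(frame)
        if frame.static_parent is not None:
            work.append(frame.static_parent)
        for value in frame.bindings.values():
            if isinstance(value, FuncValue):
                work.append(value.env)
    return {frame.name for frame in live}

global_frame = Frame("global")
let_frame = Frame("let", static_parent=global_frame)
let_frame.bindings["count"] = 0
global_frame.bindings["counter"] = FuncValue(let_frame)
orphan = Frame("orphan", static_parent=global_frame)   # activation dead, unreferenced

live = live_frames([global_frame])
print(sorted(live))
```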

Addition: a function value is live, but not bound, when
• It is created as an anonymous function
• It is the value of an expression (e.g., a function call)
In the 1st case:
• If immediately applied, then use and throw away: not stored, irrelevant to the current discussion
• If bound to an identifier, then covered by the definition
• If an expression value, that is the 2nd case
In the 2nd case, the interesting situations are (why?)
• It is the return value from a function call
• It is passed as a parameter to a function call
Of these two, only the first should really worry us. Why?

Restriction P (used in Pascal): a function call cannot return a function value
Pascal allows nesting of function expressions, but a locally generated function can be passed:
• to functions invoked in the activation (stack grows)
• but never out of the activation, down the stack
→ When an activation dies, so do the frame and the bindings in it (there is no live function value that references it)
The restriction must also apply to functions stored in cells or data structures: a call cannot return anything containing a function

Example:
Main {
  func D(func z, int x) { …. z(x), }
  func E(int x) {
    func F(func C) { }
    func G(int y) { …F(D),…..D(F,y),… }
    G(x+3)
  }
  E(5)
}
• Main starts → values d and e for D, E are created
• Calls e(5) → values f and g for F and G are created
• Starts g(8) → can call a static brother (f) / a static uncle (d) of g
• In either case, the dynamic activation stack grows
[Figure annotation, shown twice: Mutually recursive]

In Pascal, the frame with its bindings is stored in the activation record (when the activation dies, they certainly cease to be live)
Functions can be created locally, then passed as parameters
• A static pointer (reference to the static parent) is needed
• It always points to an activation record deeper in the stack
(there is an alternative implementation strategy: store static pointers for activations in a vector; we skip it)
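
A small simulation of the claim that static pointers always point deeper in the stack. It assumes one particular call sequence (Main, E, G, then D called with a locally created F, then D applying its parameter); the record layout and the push helper are hypothetical.

```python
stack = []   # activation records; index 0 is the bottom of the stack

def push(name, static_index):
    """Push a record whose static pointer is the index of its static parent."""
    stack.append({"name": name, "static": static_index})
    return len(stack) - 1

main = push("Main", None)
e = push("E", main)    # E is declared in Main
g = push("G", e)       # G is declared in E
d = push("D", main)    # G calls D, which is declared in Main
f = push("F", e)       # D applies its parameter F; F was created in E's activation

points_deeper = all(
    record["static"] is None or record["static"] < index
    for index, record in enumerate(stack)
)
print(points_deeper, stack[f]["static"] == e)
```

Even though F is invoked from D, its static pointer refers to E's record, which is still on the stack below it.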

Restriction C (used in C): no function nesting allowed – all functions are global
• All the bindings generated in the program outside functions are stored in one global frame (fixed; addresses for all global variables are computed at compile-time)
• A function value knows about the variables declared before it in the program
  • That it uses only these is checked by the compiler
• Bindings for parameters & locals in a function call are stored in an extended frame for the activation
• The static parent of each activation record is the global frame → a static pointer is not needed

Summary: In Pascal, C (and similar imperative languages):
Restrictions on functions guarantee that the frame and bindings created for an activation die with it
• Bindings are stored in the activation record (automatic)
• Fast (de-)allocation
Additionally:
• That uses are in scope is checked at compile-time

At one extreme is C: does not believe in function nesting
• Extremely simple environment structure
At the other extreme are functional languages: no restrictions on nesting, heavy use of higher-order functions
• Environment structure (frames) separated from the run-time stack, stored in the heap
• Rely on garbage collection
(In some implementations, even an activation stack is absent)
