These slides cover two methods for handling nested procedures: static (access) links and displays. They discuss how static chains are created, how non-local names are resolved, and how the two methods differ in overhead. The slides also cover parameter passing techniques, stack maintenance during procedure calls, and the three parts of code generation: instruction selection, instruction scheduling, and register allocation. Worked examples show how reordering instructions hides latencies.
Handling nested procedures • Method 1: static (access) links • Each frame holds a reference to the frame of the lexically enclosing procedure • Static chains of such links are created. • How do we use them to access non-locals? • The compiler knows the nesting level s of the scope that declares a variable • The compiler knows the nesting level t of the current scope • Follow t - s links (t ≥ s, because the declaring scope encloses the current one)
Handling nested procedures • Method 1: static (access) links • Setting the links: • if the callee is nested directly within the caller, set its static link to point to the caller's frame pointer (or stack pointer) • if the callee has the same nesting level as the caller, set its static link to point to wherever the caller's static link points • in general, for a caller at level t and a callee at level c ≤ t, follow t - c + 1 links starting from the caller's frame
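The two slides above can be sketched as a small simulation. This is only an illustrative model, not the book's implementation: the `Frame` class, the `call` and `lookup` helpers, and the procedure names `main`, `p`, `q`, `r` are all hypothetical, and nesting levels are assumed to start at 1 for the outermost procedure.

```python
class Frame:
    """One activation record (hypothetical model)."""
    def __init__(self, name, level, static_link=None):
        self.name = name
        self.level = level              # lexical nesting level of the procedure
        self.static_link = static_link  # frame of the lexically enclosing procedure
        self.locals = {}


def follow(frame, hops):
    """Walk `hops` static links starting from `frame`."""
    for _ in range(hops):
        frame = frame.static_link
    return frame


def call(caller, name, callee_level):
    """Create the callee's frame and set its static link."""
    if callee_level == caller.level + 1:
        link = caller  # callee nested directly within the caller
    else:
        # Same or shallower level: follow (caller level - callee level + 1) links.
        link = follow(caller, caller.level - callee_level + 1)
    return Frame(name, callee_level, link)


def lookup(frame, var_level, var):
    """Access a variable declared at level `var_level` from `frame`."""
    return follow(frame, frame.level - var_level).locals[var]


# main (level 1) declares x; p and q are siblings at level 2; r is nested in q.
main = Frame("main", 1)
main.locals["x"] = 10
p = call(main, "p", 2)
q = call(p, "q", 2)   # same level as caller: copies p's static link
r = call(q, "r", 3)   # nested directly in caller: link is q's frame
print(lookup(r, 1, "x"))  # follows 3 - 1 = 2 links: r -> q -> main
```

Note that the cost of a non-local access grows with the level difference, which is one of the criteria on the comparison slide.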
Handling nested procedures • Method 2 : Displays • A Display encodes the static link info in an array. • The ith element of the array points to the frame of the most recent procedure at scope level i • How? • When a new stack frame is created for a procedure at nesting level i, • save the current value of D[i] in the new stack frame (to be restored on exit) • set D[i] to the new stack frame
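The display bookkeeping described above can be sketched the same way. Again this is a hedged illustration: the global `display` list, its arbitrary size of 8, and the `enter`, `leave`, and `lookup` helpers are assumptions made for the example.

```python
class Frame:
    """One activation record (hypothetical model)."""
    def __init__(self, name, level):
        self.name = name
        self.level = level
        self.saved_display = None  # old D[level], restored on exit
        self.locals = {}


display = [None] * 8  # D[i]; the maximum depth of 8 is an arbitrary assumption


def enter(name, level):
    """Procedure entry at nesting level `level`."""
    frame = Frame(name, level)
    frame.saved_display = display[level]  # save the current value of D[level]
    display[level] = frame                # D[level] now points to the new frame
    return frame


def leave(frame):
    """Procedure exit: restore the saved display entry."""
    display[frame.level] = frame.saved_display


def lookup(var_level, var):
    """A non-local access is one indexed load, regardless of depth."""
    return display[var_level].locals[var]


main = enter("main", 1)
main.locals["x"] = 10
p = enter("p", 2)
q = enter("q", 2)      # sibling of p: overwrites D[2], saving p's entry
r = enter("r", 3)
print(lookup(1, "x"))  # reads D[1] directly, no chain to follow
leave(r)
leave(q)               # D[2] points to p's frame again
leave(p)
leave(main)
```

Compared with static links, every call and return pays for the save and restore of one display entry, while each non-local access costs the same no matter how many levels separate the scopes.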
Handling nested procedures • Displays vs. Static links • criteria • added overhead • space • nesting depth • frequency of non-local accesses
Parameter passing • By value • actual parameter is copied • By reference • address of actual parameter is stored • By value-result • call by value, AND • the values of the formal parameters are copied back into the actual parameters. • Example:
    int a;
    void test(int x) {
        x = 2;
        a = 0;
    }
    int main () {
        a = 1;
        test(a);
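To see how the three modes differ on the example above, here is a small simulation. The example's `main` is cut off in the slide, so this sketch assumes that the value of `a` is inspected right after `test(a)` returns. The `run` function and the `mem` dictionary that stands in for memory are hypothetical.

```python
def run(mode):
    """Simulate `a = 1; test(a);` under one parameter-passing mode."""
    mem = {"a": None}  # the global variable a

    def test(actual):
        if mode == "reference":
            # x is an alias for the actual parameter's location.
            mem[actual] = 2      # x = 2 writes through to a
            mem["a"] = 0         # a = 0
        else:
            x = mem[actual]      # copy the actual parameter in
            x = 2                # x = 2 changes only the local copy
            mem["a"] = 0         # a = 0
            if mode == "value-result":
                mem[actual] = x  # copy the formal back on return

    mem["a"] = 1
    test("a")
    return mem["a"]


for mode in ("value", "reference", "value-result"):
    print(mode, run(mode))
# value 0
# reference 0
# value-result 2
```

By value and by reference happen to agree here because `a = 0` runs last. Value-result differs because the copy-back of `x` happens after the body finishes.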
Stack maintenance • Calling sequence : • code executed by the caller before and after a call • code executed by the callee at the beginning • code executed by the callee at the end
Stack maintenance • A typical calling sequence : • Caller assembles arguments and transfers control • evaluate arguments • place arguments in stack frame and/or registers • save caller-saved registers • save return address • jump to callee's first instruction
Stack maintenance • A typical calling sequence : • Callee saves info on entry • allocate memory for stack frame, update stack pointer • save callee-saved registers • save old frame pointer • update frame pointer • Callee executes
Stack maintenance • A typical calling sequence : • Callee restores info on exit and returns control • place return value in appropriate location • restore callee-saved registers • restore frame pointer • pop the stack frame • jump to return address • Caller restores info • restore caller-saved registers
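The calling sequence above can be traced with a toy stack machine. This is a simplified sketch under stated assumptions: register saving and restoring is omitted, the stack is a Python list that grows upward, and the `Machine` class, the `"after_call"` return-address placeholder, and the `fact` callee are hypothetical.

```python
class Machine:
    """Toy stack machine (hypothetical); register save/restore is omitted."""
    def __init__(self):
        self.stack = []  # grows upward; the stack pointer is len(self.stack)
        self.fp = 0      # frame pointer: index where the current frame's locals start

    def call(self, callee, args):
        # Caller, before the call: place arguments, save the return address.
        for a in args:
            self.stack.append(("arg", a))
        self.stack.append(("ret_addr", "after_call"))
        # Callee, on entry: save the old frame pointer, then update it.
        self.stack.append(("old_fp", self.fp))
        self.fp = len(self.stack)
        # Callee executes.
        result = callee(self, *args)
        # Callee, on exit: pop locals, restore the frame pointer, return.
        del self.stack[self.fp:]
        _, self.fp = self.stack.pop()
        self.stack.pop()  # consume the return address
        # Caller, after the call: discard the arguments.
        for _ in args:
            self.stack.pop()
        return result


def fact(m, n):
    m.stack.append(("local", n))  # allocate one local in the callee's frame
    if n <= 1:
        return 1
    return n * m.call(fact, [n - 1])


m = Machine()
print(m.call(fact, [5]))  # 120
print(m.stack, m.fp)      # [] 0  (everything pushed was popped again)
```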
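Code generation • Our book's target machine (appendix A): • opcode source1, source2, destination
    add    r1, r2, r3     r3 ← r1 + r2
    addI   r1, c, r2      r2 ← r1 + c
    loadI  c, r2          r2 ← c
    load   r1, r2         r2 ← MEM(r1)
    loadAI r1, c, r2      r2 ← MEM(r1 + c)
    loadAO r1, r2, r3     r3 ← MEM(r1 + r2)
    i2i    r1, r2         r2 ← r1 (integer copy)
    cmp_LE r1, r2, r3     r3 ← true if r1 ≤ r2, else false
    cbr    r1, l1, l2     branch to l1 if r1 is true, else to l2
    jump   r1             PC ← r1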
Code generation • Let's start with some examples. • Generate code from a tree representing x = a+2 - (c+d-4) • Issues: • which children should go first? • what if we already had a-c in a register? • Does it make a difference if a and c are floating point as opposed to integer? • Generate code for a case statement • Generate code for w = w*2*x*y*z
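As a starting point for the first exercise, here is a naive tree-walk generator for x = a+2 - (c+d-4). It is a sketch under several assumptions: the tree is encoded as nested tuples, every variable lives in the stack frame at a symbolic offset `@name` from `$sp`, the left child is always evaluated first, each value gets a fresh virtual register, and `sub` is assumed to exist alongside the opcodes listed for the target machine. The `gen_assign` function is hypothetical.

```python
from itertools import count

OPCODES = {"+": "add", "-": "sub"}  # `sub` is assumed; it is not on the slide's list


def gen_assign(target, tree):
    """Emit code for `target = tree` using a left-to-right tree walk."""
    code = []
    fresh = count(1)

    def new_reg():
        return f"r{next(fresh)}"

    def walk(node):
        if isinstance(node, int):    # constant leaf
            reg = new_reg()
            code.append(f"loadI {node}, {reg}")
        elif isinstance(node, str):  # variable leaf, loaded from the frame
            reg = new_reg()
            code.append(f"loadAI $sp, @{node}, {reg}")
        else:                        # interior node: (op, left, right)
            op, left, right = node
            r_left = walk(left)
            r_right = walk(right)
            reg = new_reg()
            code.append(f"{OPCODES[op]} {r_left}, {r_right}, {reg}")
        return reg

    result = walk(tree)
    code.append(f"storeAI {result}, $sp, @{target}")
    return code


tree = ("-", ("+", "a", 2), ("-", ("+", "c", "d"), 4))
for line in gen_assign("x", tree):
    print(line)
```

This version uses nine virtual registers and never uses `addI` for a+2, which is exactly the kind of choice the issues on the slide are about.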
Code generation • Code generation = • instruction selection • instruction scheduling • register allocation
Instruction selection • IR to assembly • Why is it an issue? • Example: copy a value from r1 to r2 • Let me count the ways... • Criteria • How hard is it? • Use a cost model to choose. • How about register usage?
Instruction selection • How hard is it? • Can make locally optimal choices • Finding a globally optimal selection is NP-complete • Criteria • speed of generated code • size of generated code • power consumption • Considering registers • Assume enough registers are available, let the register allocator figure it out.
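A cost model for the "copy r1 to r2" example can be as simple as a table of candidate sequences. The sketch below is illustrative only: the candidate list is not exhaustive, `multI` and the scratch slot `@tmp` are assumptions, and the costs borrow the latencies given in the scheduling example (memory ops 3 cycles, multiply 2 cycles, everything else 1 cycle).

```python
LATENCY = {"memory": 3, "multiply": 2, "other": 1}

# Each candidate sequence maps to the kinds of operations it contains.
CANDIDATES = {
    "i2i r1, r2": ["other"],
    "addI r1, 0, r2": ["other"],
    "multI r1, 1, r2": ["multiply"],
    "storeAI r1, $sp, @tmp ; loadAI $sp, @tmp, r2": ["memory", "memory"],
}


def cost(kinds):
    """Speed-only cost: the sum of the operation latencies."""
    return sum(LATENCY[k] for k in kinds)


def select(candidates):
    """Pick the cheapest sequence; ties go to the first one listed."""
    return min(candidates, key=lambda seq: cost(candidates[seq]))


for seq, kinds in CANDIDATES.items():
    print(cost(kinds), seq)
print("selected:", select(CANDIDATES))  # selected: i2i r1, r2
```

A different criterion, such as code size or power, would only change the `cost` function.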
Instruction scheduling • Reorder instructions to hide latencies. • Example (the number in parentheses is the cycle in which the instruction issues):
    (1)  loadAI  $sp, @w, r1
    (4)  add     r1, r1, r1
    (5)  loadAI  $sp, @x, r2
    (8)  mult    r1, r2, r1
    (9)  loadAI  $sp, @y, r2
    (12) mult    r1, r2, r1
    (13) loadAI  $sp, @z, r2
    (16) mult    r1, r2, r1
    (18) storeAI r1, $sp, @w
• Latencies: memory ops: 3 cycles • multiply: 2 cycles • everything else: 1 cycle
Instruction scheduling • Reorder instructions to hide latencies. • Example: • Original order:
    (1)  loadAI  $sp, @w, r1
    (4)  add     r1, r1, r1
    (5)  loadAI  $sp, @x, r2
    (8)  mult    r1, r2, r1
    (9)  loadAI  $sp, @y, r2
    (12) mult    r1, r2, r1
    (13) loadAI  $sp, @z, r2
    (16) mult    r1, r2, r1
    (18) storeAI r1, $sp, @w
• Reordered (uses one extra register, r3):
    (1)  loadAI  $sp, @w, r1
    (2)  loadAI  $sp, @x, r2
    (3)  loadAI  $sp, @y, r3
    (4)  add     r1, r1, r1
    (5)  mult    r1, r2, r1
    (6)  loadAI  $sp, @z, r2
    (7)  mult    r1, r3, r1
    (9)  mult    r1, r2, r1
    (11) storeAI r1, $sp, @w
Instruction scheduling • Reorder instructions to hide latencies. • Example 2:
    (1)  loadAI  $sp, @x, r1
    (4)  mult    r1, r1, r1
    (6)  mult    r1, r1, r1
    (8)  mult    r1, r1, r1
    (10) storeAI r1, $sp, @x
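The issue cycles in the examples above can be reproduced with a few lines of code. This sketch assumes a single-issue, in-order machine in which an instruction issues as soon as the previous instruction has issued and all of its source registers are ready, and `$sp` is always ready. The `issue_cycles` function and the tuple encoding `(opcode, sources, destination)` are hypothetical.

```python
LATENCY = {"loadAI": 3, "storeAI": 3, "mult": 2}  # everything else: 1 cycle


def issue_cycles(program):
    """Return the cycle in which each instruction issues."""
    ready = {}  # register -> first cycle in which its value is available
    last = 0    # issue cycle of the previous instruction
    issued = []
    for opcode, sources, dest in program:
        start = max([last + 1] + [ready.get(r, 1) for r in sources])
        issued.append(start)
        if dest is not None:
            ready[dest] = start + LATENCY.get(opcode, 1)
        last = start
    return issued


original = [
    ("loadAI", ["$sp"], "r1"), ("add", ["r1", "r1"], "r1"),
    ("loadAI", ["$sp"], "r2"), ("mult", ["r1", "r2"], "r1"),
    ("loadAI", ["$sp"], "r2"), ("mult", ["r1", "r2"], "r1"),
    ("loadAI", ["$sp"], "r2"), ("mult", ["r1", "r2"], "r1"),
    ("storeAI", ["r1", "$sp"], None),
]
reordered = [
    ("loadAI", ["$sp"], "r1"), ("loadAI", ["$sp"], "r2"),
    ("loadAI", ["$sp"], "r3"), ("add", ["r1", "r1"], "r1"),
    ("mult", ["r1", "r2"], "r1"), ("loadAI", ["$sp"], "r2"),
    ("mult", ["r1", "r3"], "r1"), ("mult", ["r1", "r2"], "r1"),
    ("storeAI", ["r1", "$sp"], None),
]
example2 = [
    ("loadAI", ["$sp"], "r1"), ("mult", ["r1", "r1"], "r1"),
    ("mult", ["r1", "r1"], "r1"), ("mult", ["r1", "r1"], "r1"),
    ("storeAI", ["r1", "$sp"], None),
]

print(issue_cycles(original))   # [1, 4, 5, 8, 9, 12, 13, 16, 18]
print(issue_cycles(reordered))  # [1, 2, 3, 4, 5, 6, 7, 9, 11]
print(issue_cycles(example2))   # [1, 4, 6, 8, 10]
```

Example 2 shows the limit of reordering: every instruction depends on the one before it, so there is nothing independent to move into the stall cycles.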
Instruction scheduling • Reorder instructions to hide latencies. • We need to collect dependence info • Scheduling affects register lifetimes ==> different demand for registers • Should we do register allocation before or after? • How hard is it? • more than one instruction may be ready • too many variables may be live at the same time • NP-complete!
Register allocation • Consists of two parts: • register allocation • register assignment • Goal: minimize spills • How hard is it? • a single basic block with one size of data: polynomial time • otherwise, NP-complete • based on graph coloring.
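The graph-coloring idea can be sketched with a greedy heuristic. This is not a full allocator of the kind used in production compilers: it assumes the interference graph is already built, it colors nodes in order of decreasing degree, and it simply marks a value as spilled when no register is free. The `color` function and the example graph are hypothetical.

```python
def color(graph, k):
    """Greedily assign one of k registers to each node of an interference graph.

    graph maps each value to the set of values that are live at the same time.
    Returns (assignment, spilled).
    """
    assignment, spilled = {}, []
    # Most-constrained first; the name breaks ties so the order is deterministic.
    for node in sorted(graph, key=lambda n: (-len(graph[n]), n)):
        used = {assignment[m] for m in graph[node] if m in assignment}
        free = [c for c in range(k) if c not in used]
        if free:
            assignment[node] = free[0]
        else:
            spilled.append(node)  # no register left: keep this value in memory
    return assignment, spilled


# a, b, c interfere pairwise; d interferes only with c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

print(color(graph, 3))  # ({'c': 0, 'a': 1, 'b': 2, 'd': 1}, [])
print(color(graph, 2))  # ({'c': 0, 'a': 1, 'd': 1}, ['b'])
```

In the slide's terms, deciding which values get a register at all is allocation, and picking the specific color for each is assignment.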