
Compilation 0368-3133 (Semester A, 2013/14)

Compilation 0368-3133 (Semester A, 2013/14). Lecture 10: Register Allocation. Noam Rinetzky. Slides credit: Roman Manevich, Mooly Sagiv and Eran Yahav. Promo: (Re)thinking Software Design.



Presentation Transcript


  1. Compilation 0368-3133 (Semester A, 2013/14) Lecture 10: Register Allocation Noam Rinetzky Slides credit: Roman Manevich, Mooly Sagiv and Eran Yahav

  2. Promo: (Re)thinking Software Design What is the essence of software design? Researchers and practitioners have for many years quoted Fred Brooks's assertions that "conceptual integrity is the most important consideration in system design" and is "central to product quality". But what exactly is conceptual integrity? In this talk, I'll report on progress in a new research project that explores this question by attempting to develop a theory of conceptual design, and applying it experimentally in a series of redesigns of common applications (such as Git, Gmail and Dropbox) Daniel Jackson (MIT) This Wednesday 12:00 Gilman 223

  3. What is a Compiler? “A compiler is a computer program that transforms source code written in a programming language (source language) into another language (target language). The most common reason for wanting to transform source code is to create an executable program.” --Wikipedia

  4. Conceptual Structure of a Compiler • Source text (txt) → Compiler → Executable code (exe) • Frontend: Lexical Analysis, Syntax Analysis (Parsing), Semantic Analysis • Semantic Representation: Intermediate Representation (IR) • Backend: Code Generation

  5. From scanning to parsing • program text ((23 + 7) * x) → Lexical Analyzer → token stream → Parser → valid (Abstract Syntax Tree) or syntax error • Grammar: E → ... | Id, Id → ‘a’ | ... | ‘z’ • Abstract Syntax Tree nodes: Op(*), Op(+), Id(b), Num(23), Num(7)

  7. Context Analysis • Input: Abstract Syntax Tree (nodes Op(*), Op(+), Id(b), Num(23), Num(7)) and type rules • Example type rule: if E1 : int and E2 : int, then E1 + E2 : int • Output: Semantic Error, or Valid + Symbol Table

  8. Code Generation • Input: Valid Abstract Syntax Tree (nodes Op(*), Op(+), Id(b), Num(23), Num(7)) and Symbol Table • cgen, Frame Manager • Verification (possible runtime) Errors/Warnings • Intermediate Representation (IR) → Executable Code (input → output)

  9. Register Allocation [Pipeline figure: Source code (program) → Lexical Analysis → Syntax Analysis (Parsing) → AST, Symbol Table, etc. → Inter. Rep. (IR) → Code Generation → Target code (executable)] • The process of assigning variables to registers and managing data transfer in and out of registers • Using registers intelligently is a critical step in any compiler • A good register allocator can generate code orders of magnitude better than a bad register allocator

  10. Register Allocation: Goals [Pipeline figure: Source code (program) → Lexical Analysis → Syntax Analysis (Parsing) → AST, Symbol Table, etc. → Inter. Rep. (IR) → Code Generation → Target code (executable)] • Reduce number of temporaries (registers) • Machine has at most K registers • Some registers have special purpose • E.g., pass parameters • Reduce the number of move instructions • MOVE R1,R2 // R1 ← R2

  11. Registers • Most machines have a set of registers, dedicated memory locations that • can be accessed quickly, • can have computations performed on them, and • are used for special purposes (e.g., parameter passing) • Usages • Operands of instructions • Store temporary results • Can (should) be used as loop indexes due to frequent arithmetic operations • Used to manage administrative info • e.g., runtime stack

  12. Register allocation • In TAC, there are an unlimited number of variables • On a physical machine there are a small number of registers: • x86 has four general-purpose registers and a number of specialized registers • MIPS has twenty-four general-purpose registers and eight special-purpose registers

  13. Spilling • Even an optimal register allocator can require more registers than available • Need to generate code for every correct program • The compiler can save temporary results • Spill registers into temporaries • Load when needed • Many heuristics exist

  14. Simple approach • Straightforward solution: • Allocate each variable in activation record • At each instruction, bring values needed into registers, perform operation, then store result to memory • Example, x = y + z: mov 16(%ebp), %eax; mov 20(%ebp), %ebx; add %ebx, %eax; mov %eax, 24(%ebp) • Problem: program execution is very inefficient, moving data back and forth between memory and registers
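
The load, operate, store pattern of the simple approach can be sketched as a tiny code emitter. This is only an illustration: the function name `naive_codegen` and the `offsets` table are hypothetical, the frame offsets 16, 20 and 24 are the ones from the slide, and every variable is assumed to live in the activation record addressed through `%ebp`.

```python
def naive_codegen(dest, left, op, right, offsets):
    """Emit AT&T-syntax x86 for `dest = left op right`.

    Assumption: every variable lives in the activation record at
    offsets[name](%ebp); no value is kept in a register between statements.
    """
    mnemonic = {"+": "add", "-": "sub"}[op]
    return [
        f"mov {offsets[left]}(%ebp), %eax",   # load first operand
        f"mov {offsets[right]}(%ebp), %ebx",  # load second operand
        f"{mnemonic} %ebx, %eax",             # eax = eax op ebx
        f"mov %eax, {offsets[dest]}(%ebp)",   # store the result back to memory
    ]

code = naive_codegen("x", "y", "+", "z", {"y": 16, "z": 20, "x": 24})
print("\n".join(code))
```

Every statement costs two loads and one store regardless of what the neighbouring statements do, which is the inefficiency the slide points out.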

  15. Register Allocation • Machine-agnostic optimizations • Assume unbounded number of registers • Expression trees (tree-local) • Basic blocks (block-local) • Machine-dependent optimization • K registers • Some have special purposes • Control flow graphs (global register allocation)

  16. Register Allocation: IR [Pipeline figure: Source code (program) → Lexical Analysis → Syntax Analysis (Parsing) → AST, Symbol Table, etc. → Inter. Rep. (IR) → Code Generation → Target code (executable)]

  17. Register Allocation • Machine-agnostic optimizations • Assume unbounded number of registers • Expression trees • Basic blocks • Machine-dependent optimization • K registers • Some have special purposes • Control flow graphs (whole program)

  18. Sethi-Ullman translation • Algorithm by Ravi Sethi and Jeffrey D. Ullman to emit optimal TAC • Minimizes number of temporaries for a single expression

  19. Simple Spilling Method • Heavy tree – Needs more registers than available • A “heavy” tree contains a “heavy” subtree whose dependents are “light” • Simple spilling • Generate code for the light tree • Spill the content into memory and replace subtree by temporary • Generate code for the resultant tree

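  20. Example (optimized): b*b-4*a*c [Expression tree with register weights: ‘-’ 3; left ‘*’ (b*b) 2, with b 1 and b 1; right ‘*’ (4*(a*c)) 2, with 4 1 and ‘*’ (a*c) 2, with a 1 and c 1]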
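
The weights in this example can be reproduced with a short recursive labelling function. This is a sketch under two assumptions that are not spelled out on the slides: every leaf is loaded into a register and therefore weighs 1, and trees are represented as nested tuples `(op, left, right)` with strings as leaves. The name `weight` is hypothetical.

```python
def weight(node):
    """Registers needed to evaluate an expression tree without spilling.

    A node is either a leaf (a string) or a tuple (op, left, right).
    Assumption: every leaf is loaded into a register, so it weighs 1.
    """
    if isinstance(node, str):
        return 1
    _, left, right = node
    wl, wr = weight(left), weight(right)
    # Evaluate the heavier child first; its result then occupies one register
    # while the lighter child is evaluated. Equal weights need one extra.
    return max(wl, wr) if wl != wr else wl + 1

# b*b - 4*(a*c)
expr = ("-", ("*", "b", "b"), ("*", "4", ("*", "a", "c")))
print(weight(expr))  # 3
```

The result 3 agrees with the weight at the root of the tree above; with only two registers available the tree is "heavy" and needs spilling.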

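  21. Example (spilled): x := b*b-4*a*c • Spilled subtree: t7 := b * b [‘*’ 2, with b 1 and b 1] • Resultant tree: x := t7 - 4 * a * c [‘-’ 2, with t7 1 and ‘*’ (4*(a*c)) 2, with 4 1 and ‘*’ (a*c) 2, with a 1 and c 1]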
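
One possible reading of the simple spilling method, applied to this example with two registers, is sketched below. The helper names (`weight`, `spill_once`, `spill`) are hypothetical, leaves are assumed to weigh 1, and the choice of the left child on a tie is arbitrary. Temporaries are numbered from 7 only so that the output matches the `t7` of the slide.

```python
import itertools

def weight(node):
    """Registers needed for a tree; leaves (strings) are assumed to weigh 1."""
    if isinstance(node, str):
        return 1
    _, left, right = node
    wl, wr = weight(left), weight(right)
    return max(wl, wr) if wl != wr else wl + 1

def spill_once(node, k, fresh, spilled):
    """Find a heavy node whose children are light and spill one child."""
    op, left, right = node
    if weight(left) > k:
        return (op, spill_once(left, k, fresh, spilled), right)
    if weight(right) > k:
        return (op, left, spill_once(right, k, fresh, spilled))
    # Both children are light: compute one of them into a memory temporary.
    temp = next(fresh)
    if weight(left) >= weight(right):  # tie broken towards the left child
        spilled.append((temp, left))
        return (op, temp, right)
    spilled.append((temp, right))
    return (op, left, temp)

def spill(node, k, first_temp=7):
    """Rewrite `node` until it fits in k registers; return (tree, spills)."""
    assert k >= 2, "with leaf weight 1, a binary operation needs 2 registers"
    fresh = (f"t{i}" for i in itertools.count(first_temp))
    spilled = []
    while weight(node) > k:
        node = spill_once(node, k, fresh, spilled)
    return node, spilled

expr = ("-", ("*", "b", "b"), ("*", "4", ("*", "a", "c")))
tree, spilled = spill(expr, k=2)
print(spilled)  # [('t7', ('*', 'b', 'b'))]
print(tree)     # ('-', 't7', ('*', '4', ('*', 'a', 'c')))
```

Code is then generated for each spilled subtree first (here `t7 := b * b`) and for the resultant tree last.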

  22. Register Allocation • Machine-agnostic optimizations • Assume unbounded number of registers • Expression trees • Basic blocks • Machine-dependent optimization • K registers • Some have special purposes • Control flow graphs (whole program)

  23. Basic Blocks • A basic block is a sequence of instructions with • single entry (to first instruction), no jumps to the middle of the block • single exit (last instruction) • code executes as a sequence from first instruction to last instruction without any jumps • There is an edge from one basic block B1 to another block B2 when the last statement of B1 may jump to B2

  24. Control flow graph • A directed graph G=(V,E) • nodes V = basic blocks • edges E = control flow • (B1,B2) ∈ E when control from B1 flows to B2 • Leaders-based construction • Target of jump instructions • Instructions following jumps • Example: B1: prod := 0; i := 1; … B2: t1 := 4 * i; t2 := a[t1]; t3 := 4 * i; t4 := b[t3]; t5 := t2 * t4; t6 := prod + t5; prod := t6; t7 := i + 1; i := t7; if i <= 20 goto B2 (True: back to B2, False: …)
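
The leaders-based construction can be sketched as follows. The instruction encoding (a triple of optional label, text, and optional jump target) and the name `split_into_blocks` are assumptions made for this illustration; the first instruction is treated as a leader in addition to the two rules listed on the slide. Only the instructions visible on the slide are used, so the elided parts stay elided.

```python
def split_into_blocks(instrs):
    """Split a list of (label, text, jump_target) triples into basic blocks."""
    if not instrs:
        return []
    label_index = {label: i for i, (label, _, _) in enumerate(instrs) if label}
    leaders = {0}  # the first instruction always starts a block
    for i, (_, _, target) in enumerate(instrs):
        if target is not None:
            leaders.add(label_index[target])  # target of a jump
            if i + 1 < len(instrs):
                leaders.add(i + 1)            # instruction following a jump
    starts = sorted(leaders)
    ends = starts[1:] + [len(instrs)]
    return [instrs[s:e] for s, e in zip(starts, ends)]

program = [
    (None, "prod := 0", None),
    (None, "i := 1", None),
    ("B2", "t1 := 4 * i", None),
    (None, "t2 := a[t1]", None),
    (None, "t3 := 4 * i", None),
    (None, "t4 := b[t3]", None),
    (None, "t5 := t2 * t4", None),
    (None, "t6 := prod + t5", None),
    (None, "prod := t6", None),
    (None, "t7 := i + 1", None),
    (None, "i := t7", None),
    (None, "if i <= 20 goto B2", "B2"),
]
blocks = split_into_blocks(program)
print([len(block) for block in blocks])  # [2, 10]
```

The two resulting blocks correspond to B1 and B2; adding CFG edges would then be a matter of connecting each block to its jump target and, for conditional jumps, to the block that follows it.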

  25. Register Allocation for B.B. • Dependency graphs for basic blocks • Transformations on dependency graphs • From dependency graphs into code • Instruction selection • linearizations of dependency graphs • Register allocation • At the basic block level

  26. AST for a Basic Block { int n; n := a + 1; x := b + n * n + c; n := n + 1; y := d * n; }

  27. { int n; n := a + 1; x := b + n * n + c; n := n + 1; y := d * n; } Dependency graph

  28. { int n; n := a + 1; x := b + n * n + c; n := n + 1; y := d * n; } Simplified Data Dependency Graph

  29. Pseudo Register Target Code

  30. { int n; n := a + 1; x := b + n * n + c; n := n + 1; y := d * n; } Question: Why “y”?

  31. Question: Why “y”? … False True int n; n := a + 1; x := b + n * n + c; n := n + 1; y := d * n; z := y + x; y := 0;

  32. Question: Why “y”? … False True int n; n := a + 1; x := b + n * n + c; n := n + 1; y := d * n; y := 0; z := y + x;

  33. Question: Why “y”? … False True int n; n := a + 1; x := b + n * n + c; n := n + 1; y := d * n; y := 0; z := y + x;

  34. y, dead or alive? … … False True False True int n; n := a + 1; x := b + n * n + c; n := n + 1; y := d * n; int n; n := a + 1; x := b + n * n + c; n := n + 1; y := d * n; y := 0; z := y + x; z := y + x; y := 0;

  35. x, dead or alive? … … False True False True int n; n := a + 1; x := b + n * n + c; n := n + 1; y := d * n; int n; n := a + 1; x := b + n * n + c; n := n + 1; y := d * n; y := 0; z := y + x; z := y + x; y := 0;

  36. Register Allocation • Machine-agnostic optimizations • Assume unbounded number of registers • Expression trees • Basic blocks • Machine-dependent optimization • K registers • Some have special purposes • Control flow graphs (global register allocation)

  37. Register Allocation: Assembly [Pipeline figure: Source code (program) → Lexical Analysis → Syntax Analysis (Parsing) → AST, Symbol Table, etc. → Inter. Rep. (IR) → Code Generation → Target code (executable)]

  38. Register Allocation: Assembly [Pipeline figure: Source code (program) → Lexical Analysis → Syntax Analysis (Parsing) → AST, Symbol Table, etc. → Inter. Rep. (IR) → Code Generation → Target code (executable)] • Stages in detail: AST + Sym. Tab. → IR (TAC) generation → IR → IR Optimization → “Optimized” IR → Instruction selection → “Assembly” → Global register allocation → Assembly

  39. Register Allocation: Assembly [Pipeline figure: Source code (program) → Lexical Analysis → Syntax Analysis (Parsing) → AST, Symbol Table, etc. → Inter. Rep. (IR) → Code Generation → Target code (executable)] • Stages in detail: AST + Sym. Tab. → IR (TAC) generation → IR → IR Optimization → “Optimized” IR → Instruction selection → “Assembly” → Global register allocation → Assembly • Source: Andrew A. Appel, Modern Compiler Implementation in C

  40. “Global” Register Allocation • Input: • Sequence of machine instructions (“assembly”) • Unbounded number of temporary variables • aka symbolic registers • “machine description” • # of registers, restrictions • Output • Sequence of machine instructions using machine registers (assembly) • Some MOV instructions removed

  41. Variable Liveness • A statement x = y + z • defines x • uses y and z • A variable x is live at a program point if its value (at this point) is used at a later point • Example (showing state after each statement): y = 42 → x undef, y live, z undef; z = 73 → x undef, y live, z live; x = y + z → x is live, y dead, z dead; print(x) → x is dead, y dead, z dead
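
For straight-line code, the live sets after each statement can be computed with a single backward scan. The sketch below assumes each statement is given as a hypothetical triple (text, defined variables, used variables) and that nothing is live at the end of the sequence.

```python
def live_after(stmts, live_at_end=frozenset()):
    """Return, for each statement, the set of variables live right after it."""
    live = set(live_at_end)
    result = []
    for _, defined, used in reversed(stmts):
        result.append(set(live))
        # Going backwards: a definition kills the variable, a use revives it.
        live = (live - defined) | used
    result.reverse()
    return result

stmts = [
    ("y = 42",    {"y"}, set()),
    ("z = 73",    {"z"}, set()),
    ("x = y + z", {"x"}, {"y", "z"}),
    ("print(x)",  set(), {"x"}),
]
for (text, _, _), live in zip(stmts, live_after(stmts)):
    print(f"{text:10} live after: {sorted(live)}")
# y = 42     live after: ['y']
# z = 73     live after: ['y', 'z']
# x = y + z  live after: ['x']
# print(x)   live after: []
```

The scan does not distinguish "undef" from "dead"; both simply mean the variable is absent from the live set.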

  42. Find a register allocation • Registers: eax, ebx • Program: b = a + 2; c = b * b; b = c + 1; return b * c

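  43. Is this a valid allocation? • Registers: eax, ebx • Allocation: a → eax, b → ebx, c → eax • Program: b = a + 2; c = b * b; b = c + 1; return b * c • Allocated code: ebx = eax + 2; eax = ebx * ebx; ebx = eax + 1; return ebx * eax • Note: eax = ebx * ebx overwrites the previous value of ‘a’, also stored in eax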

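  44. Is this a valid allocation? • Registers: eax, ebx • Allocation: a → eax, b → ebx, c → eax • Program: b = a + 2; c = b * b; b = c + 1; return b * c • Allocated code: ebx = eax + 2; eax = ebx * ebx; ebx = eax + 1; return ebx * eax • Yes: the value of ‘a’ stored in eax is not needed anymore after the first statement, so eax can be reused for ‘c’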

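  45. Is this a valid allocation? • Registers: eax, ebx • Allocation: a → eax, b → ebx, c → eax • Program: b = a + 2; c = b * b; b = c + 1; return b * a • Allocated code: ebx = eax + 2; eax = ebx * ebx; ebx = eax + 1; return ebx * eax • The program now ends with return b * a: the value of ‘a’ stored in eax is still needed at the return, so reusing eax for ‘c’ is no longer valid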

  46. Main idea • For every node n in CFG, we have out[n] • Set of temporaries live out of n • Two variables interfere if they appear in the same out[n] of any node n • Cannot be allocated to the same register • Conversely, if two variables do not interfere with each other, they can be assigned the same register • We say they have disjoint live ranges • How to assign registers to variables?

  47. Interference graph • Nodes of the graph = variables • Edges connect variables that interfere with one another • Nodes will be assigned a color corresponding to the register assigned to the variable • Two adjacent nodes in the graph can’t have the same color
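
Read as a graph-coloring condition, an allocation is acceptable when no edge of the interference graph joins two variables held in the same register. The check below is a sketch: the name `is_valid_allocation` is hypothetical, and the edges a-b and a-c are the ones implied by the live sets {b, a} and {a, c} that appear on the construction slides for the program ending in `return b * a`.

```python
def is_valid_allocation(edges, assignment):
    """True when no two interfering variables share a register (color)."""
    return all(assignment[u] != assignment[v] for u, v in edges)

edges = [("a", "b"), ("a", "c")]

# a and c interfere but would both live in eax.
shared_eax = {"a": "eax", "b": "ebx", "c": "eax"}
print(is_valid_allocation(edges, shared_eax))  # False

# b and c do not interfere, so they may share ebx.
shared_ebx = {"a": "eax", "b": "ebx", "c": "ebx"}
print(is_valid_allocation(edges, shared_ebx))  # True
```

This only verifies a given coloring; finding one with at most K colors is the actual allocation problem.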

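  48. Interference graph construction • Program: b = a + 2; c = b * b; b = c + 1; return b * a

  49. Interference graph construction • Program, with the live set after a statement in braces: b = a + 2; c = b * b; b = c + 1 {b, a}; return b * a

  50. Interference graph construction • Program, with the live set after a statement in braces: b = a + 2; c = b * b {a, c}; b = c + 1 {b, a}; return b * a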
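
The construction on these slides can be sketched for straight-line code by combining a backward liveness scan with the rule that two variables interfere when they appear in the same live-out set. The statement encoding (defined set, used set) and the function names are assumptions made for this illustration.

```python
from itertools import combinations

def live_out_sets(stmts):
    """Live-out set of each (defined, used) statement, by a backward scan."""
    live, outs = set(), []
    for defined, used in reversed(stmts):
        outs.append(set(live))
        live = (live - defined) | used
    return outs[::-1]

def interference_edges(outs):
    """Connect every pair of variables that share some live-out set."""
    edges = set()
    for out in outs:
        edges.update(combinations(sorted(out), 2))
    return edges

stmts = [
    ({"b"}, {"a"}),       # b = a + 2
    ({"c"}, {"b"}),       # c = b * b
    ({"b"}, {"c"}),       # b = c + 1
    (set(), {"a", "b"}),  # return b * a
]
outs = live_out_sets(stmts)
print([sorted(out) for out in outs])      # [['a', 'b'], ['a', 'c'], ['a', 'b'], []]
print(sorted(interference_edges(outs)))   # [('a', 'b'), ('a', 'c')]
```

The sets after `c = b * b` and `b = c + 1` match the {a, c} and {b, a} annotations above. There is no edge between b and c, so those two can share a register while a needs its own.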
