1 / 76

CMPUT680 - Fall 2003

CMPUT680 - Fall 2003. Topic 7: Register Allocation and Instruction Scheduling José Nelson Amaral http://www.cs.ualberta.ca/~amaral/courses/680. Tiger book: chapter 10 and 11 Dragon book: chapter 10 Other papers as assigned in class or homeworks. Reading List. Register Allocation.

telma
Télécharger la présentation

CMPUT680 - Fall 2003

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CMPUT680 - Fall 2003 Topic 7: Register Allocation and Instruction Scheduling José Nelson Amaral http://www.cs.ualberta.ca/~amaral/courses/680 CMPUT 680 - Compiler Design and Optimization

  2. Tiger book: chapter 10 and 11 Dragon book: chapter 10 Other papers as assigned in class or homeworks Reading List CMPUT 680 - Compiler Design and Optimization

  3. Register Allocation • Motivation • Live ranges and interference graphs • Problem formulation • Solution methods CMPUT 680 - Compiler Design and Optimization

  4. Goals of Optimized Register Allocation • To select variables that should be assigned to registers. • Use the same register for multiple variables when it is legal, and profitable, to do so. CMPUT 680 - Compiler Design and Optimization

  5. Liveness Intuitively a variable v is live if it holds a value that may be needed in the future. In other words, v is live at a point pi if: (i) v has been defined in a statement that precedes pi in any path, and (ii) v may be used by a statement sj, and there is a path from pito sj. (iii) v is not killed between piand sj. CMPUT 680 - Compiler Design and Optimization

  6. s1 s2 a: s1 = ld(x) b: s2 = s1 + 4 c: s3 = s1  8 d: s4 = s1 - 4 e: s5 = s1/2 f: s6 = s2 * s3 g: s7 = s4 - s5 h: s8 = s6 * s7 Variables s1 ands2have a live range of four statements. Live Variables A variable v is live between the point piimmediately after its definition and the point pjimmediately after its last use. The interval [pi, pj] is the live range of the variable v. Which variables have the longest live range in the example? CMPUT 680 - Compiler Design and Optimization

  7. a: s1 = ld(x) b: s2 = s1 + 4 c: s3 = s1  8 d: s4 = s1 - 4 e: s5 = s1/2 f: s6 = s2 * s3 g: s7 = s4 - s5 h: s8 = s6 * s7 Register Allocation How can we find out what is the minimum number of registers required by this basic block to avoid spilling values to memory? We have to compute the live range of all variables and find the “fatest” statement. Which statements have the most variables live simultaneously? CMPUT 680 - Compiler Design and Optimization

  8. s1 s2 s3 a: s1 = ld(x) s4 b: s2 = s1 + 4 s5 c: s3 = s1  8 s6 d: s4 = s1 - 4 s7 e: s5 = s1/2 f: s6 = s2 * s3 g: s7 = s4 - s5 h: s8 = s6 * s7 Register Allocation At statement d variables s1,s2,s3,ands4are live, and during statement e variabless2,s3,s4, ands5are live. But we have to use some math: our choice is liveness analysis. CMPUT 680 - Compiler Design and Optimization

  9. s1 s2 s3 a: s1 = ld(x) s4 b: s2 = s1 + 4 s5 c: s3 = s1  8 s6 d: s4 = s1 - 4 s7 e: s5 = s1/2 f: s6 = s2 * s3 g: s7 = s4 - s5 h: s8 = s6 * s7 Live-in and Live-out live-in(r): set of variables that are live at the point immediately before statement r. live-out(r): set of variables that are live at the point immediately after statement r. CMPUT 680 - Compiler Design and Optimization

  10. s1 s2 s3 a: s1 = ld(x) s4 b: s2 = s1 + 4 s5 c: s3 = s1  8 s6 d: s4 = s1 - 4 s7 e: s5 = s1/2 f: s6 = s2 * s3 g: s7 = s4 - s5 h: s8 = s6 * s7 Live-in and Live-out: Program Example What are live-in(e) and live-out(e)? live-in(e) = {s1, s2, s3, s4} live-out(e) = {s2, s3, s4, s5} CMPUT 680 - Compiler Design and Optimization

  11. Live-in and Live-out in Control Flow Graphs The entry point of a basic block B is the point before its first statement. The exit point is the point after its last statement. live-in(B): set of variables that are live at the entry point of the basic block B. live-out(B): set of variables that are live at the exit point of the basic block B. CMPUT 680 - Compiler Design and Optimization

  12. B1 a := b + c d := d - b e := a + f •live-in(B2)={a,c,d,e} •live-in(B3)={a,c,d,f} •live-in(B4)={c,d,e,f} B2 B3 f := a - d b := d + f e := a - c •live-out(B2)={c,d,e,f} •live-out(B3)={b,c,d,e,f} •live-out(B4)={b,c,d,e,f} b := d + c B4 Live-in and Live-out of basic blocks • live-in(B1)={b,c,d,f} • live-out(B1)={a,c,d,e,f} b, d, e, f live Compute live-in and live-out for each basic block b, c, d, e, f live CMPUT 680 - Compiler Design and Optimization (Aho-Sethi-Ullman, pp. 544)

  13. Register-Interference Graph A register-interference graph is an undirected graph that summarizes live analysis at the variable level as follows: • A node is a variable/temporary that is a candidate for register allocation. • An edge connects nodes v1 and v2 if there is some statement in the program where variables v1and v2are live simultaneously. (Variables v1 and v2 are said to interfere, in this case). CMPUT 680 - Compiler Design and Optimization

  14. s1 s2 s3 a: s1 = ld(x) s4 b: s2 = s1 + 4 s5 c: s3 = s1  8 s6 d: s4 = s1 - 4 s7 e: s5 = s1/2 f: s6 = s2 * s3 g: s7 = s4 - s5 h: s8 = s6 * s7 Register Interference Graph: Program Example s1 s7 s2 s3 s6 s4 s5 CMPUT 680 - Compiler Design and Optimization

  15. Register Allocation by Graph Coloring Background: A graph is k-colorable if each node can be assigned one of k colors in such a way that no two adjacent nodes have the same color. Basic idea: A k-coloring of the interference graph can be directly mapped to a legal register allocation by mapping each color to a distinct register. The coloring property ensures that no two variables that interfere with each other are assigned the same register. CMPUT 680 - Compiler Design and Optimization

  16. The basic idea behind register allocation by graph coloring is to 1. Build the register interference graph, 2. Attempt to find a k-coloring for the interference graph. Register Allocation by Graph Coloring CMPUT 680 - Compiler Design and Optimization

  17. The problem of determining if an undirected graph is k-colorable is NP-hard for k 3. It is also hard to find approximate solutions to the graph coloring problem. Complexity of the Graph Coloring Problem CMPUT 680 - Compiler Design and Optimization

  18. Question: What to do if a register-interference graph is notk-colorable? Or if the compiler cannot efficiently find a k-coloring even if the graph is k-colorable? Register Allocation Answer: Repeatedly select less profitable variables for “spilling” (i.e. not to be assigned to registers) and remove them from the interference graph until the graph becomes k-colorable. CMPUT 680 - Compiler Design and Optimization

  19. Estimating Register Profitability CMPUT 680 - Compiler Design and Optimization

  20. graph G’ graph G Remove node x and all associate edges Heuristic Solution for Graph Coloring Let G be an undirected graph. Key observation: Let x be a node of G such that degree(x) < k. Then G is k-colorable if G’ is k-colorable. CMPUT 680 - Compiler Design and Optimization

  21. Select and Spill Build IG Simplify Reverse pass Forward pass A 2-Phase Register Allocation Algorithm CMPUT 680 - Compiler Design and Optimization

  22. Heuristic “Optimistic”Algorithm /* neighbor(v) contains a list of the neighbors of v. */ /* Build step */ BuildG,the register-interference graph; /* Forward pass */ Initialize an empty stack; repeat whileG has a node v such that |neighbors(v)| < kdo /* Simplify step */ Push (v, neighbors(v), no-spill) Delete v and its edges from G end while ifG is non-empty then /* Spill step */ Choose “least profitable” node v as a potential spill node; Push(v, neighbors(v), may-spill) Deletev and its edges from G end if untilG is an empty graph; CMPUT 680 - Compiler Design and Optimization

  23. Heuristic “Optimistic”Algorithm /* Reverse Pass */ whilethe stack is non-empty do Pop (v, neighbors(v), tag) N := set of nodes in neighbors(v); if(tag = no-spill) then /* Select step */ Select a register R for v such that R is not assigned to nodes inN; Insert v as a new node in G; Insert edges in G from v to each node in N; else /* tag = may-spill */ ifv can be assigned a register R such thatR is not assigned to nodes in Nthen /* Optimism paid off: need not spill */ Assign register R to v; Insertv as a new node in G; Insert edges in G from v to each node in N; else /* Need to spill v */ Mark v as a node that needs spill end if end if end while CMPUT 680 - Compiler Design and Optimization

  24. This register allocation algorithm, based on graph coloring, is both efficient (linear time) and effective (good assignment). It has been used in many industry-strength compilers to obtain significant improvements over simpler register allocation heuristics. Remarks CMPUT 680 - Compiler Design and Optimization

  25. Extensions • Coalescing • Live range splitting CMPUT 680 - Compiler Design and Optimization

  26. In the sequence of intermediate level instructions with a copy statement below, assume that registers are allocated to both variables x and y. Coalescing x := … . . . y := x . . . … := y There is an opportunity for further optimization by eliminating the copy statement if x and y are assigned the same register. The constraint that x and y receive the same register can be modeled by coalescing the nodes forx and y in the interference graph i.e., by treating them as the same variable. CMPUT 680 - Compiler Design and Optimization

  27. An Extension with Coalesce Build IG Simplify Coalesce Select and Spill CMPUT 680 - Compiler Design and Optimization

  28. Register Allocation with Coalescing 1. Build: build the register interference graph Gand categorize nodes as move-related or non-move-related. 2. Simplify:one at a time, remove non-move-related nodes of low (< k) degree fromG. 3. Coalesce:conservatively coalesce G: only coalesce nodes aand b if the resulting a-b node has less than k neighbors. 4. Freeze:If neither coalesce nor simplify works, freeze a move-related node of low degree, making it non-move-related and available for simplify. CMPUT 680 - Compiler Design and Optimization (Appel, pp. 240)

  29. Register Allocation with Coalescing 5. Spill:if there are no low-degree nodes, select a node for potential spilling. 6. Select:pop each element of the stack assigning colors. (re)build simplify coalesce freeze actual spill potential spill select CMPUT 680 - Compiler Design and Optimization (Appel, pp. 240)

  30. k j LIVE-IN: k j g g := mem[j+12] h h := k -1 f f := g + h e e := mem[j+8] m m := mem[j+16] b b := mem[f] c c := e + 8 d d := c k := m + 4 j := b LIVE-OUT: d k j Example:Step 1: Compute Live Ranges CMPUT 680 - Compiler Design and Optimization

  31. Example:Step 3: Simplify (K=4) stack f e j k b m d c h g CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  32. Example:Step 3: Simplify (K=4) stack f (h,no-spill) e j k b m d c h g CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  33. Example:Step 3: Simplify (K=4) stack f (g, no-spill) (h, no-spill) e j k b m d c g CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  34. Example:Step 3: Simplify (K=4) stack f (k, no-spill) (g, no-spill) (h, no-spill) e j k b m d c CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  35. Example:Step 3: Simplify (K=4) stack f (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill) e j b m d c CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  36. Example:Step 3: Simplify (K=4) stack (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill) e j b m d c CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  37. Example:Step 3: Simplify (K=4) stack (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill) j b m d c CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  38. Example:Step 3: Coalesce (K=4) stack (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill) j b d c Why we cannot simplify? Cannot simplify move-related nodes. CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  39. Example:Step 3: Coalesce (K=4) stack (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill) j b d c CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  40. Example:Step 3: Simplify (K=4) stack (c-d, no-spill) (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill) j b c-d CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  41. Example:Step 3: Coalesce (K=4) stack (c-d, no-spill) (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill) j b CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  42. Example:Step 3: Simplify (K=4) stack (b-j, no-spill) (c-d, no-spill) (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill) b-j CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  43. Example:Step 3: Select (K=4) stack f e (b-j, no-spill) (c-d, no-spill) (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill) R1 R2 j k b m R3 d R4 c h g CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  44. Example:Step 3: Select (K=4) stack f e (b-j, no-spill) (c-d, no-spill) (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill) R1 R2 j k b m R3 d R4 c h g CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  45. Example:Step 3: Select (K=4) stack f e (b-j, no-spill) (c-d, no-spill) (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill) R1 R2 j k b m R3 d R4 c h g CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  46. Example:Step 3: Select (K=4) stack f e (b-j, no-spill) (c-d, no-spill) (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill) R1 R2 j k b m R3 d R4 c h g CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  47. Example:Step 3: Select (K=4) stack f e (b-j, no-spill) (c-d, no-spill) (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill) R1 R2 j k b m R3 d R4 c h g CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  48. Example:Step 3: Select (K=4) stack f e (b-j, no-spill) (c-d, no-spill) (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill) R1 R2 j k b m R3 d R4 c h g CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  49. Example:Step 3: Select (K=4) stack f e (b-j, no-spill) (c-d, no-spill) (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill) R1 R2 j k b m R3 d R4 c h g CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

  50. Example:Step 3: Select (K=4) stack f e (b-j, no-spill) (c-d, no-spill) (m, no-spill) (e, no-spill) (f, no-spill) (k, no-spill) (g, no-spill) (h, no-spill) R1 R2 j k b m R3 d R4 c h g CMPUT 680 - Compiler Design and Optimization (Appel, pp. 237)

More Related