1 / 33

CSE P501 – Compiler Construction

CSE P501 – Compiler Construction. Overview of SSA IR Constructing SSA graphs SSA-based optimizations Converting back from SSA form Appel ch. 19 Cooper- Torczon sec. 9.3. Def-Use (DU) Chains. Common dataflow analysis problem: Find all sites where a variable is used ( =x )

Télécharger la présentation

CSE P501 – Compiler Construction

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CSE P501 – Compiler Construction Overview of SSA IR Constructing SSA graphs SSA-based optimizations Converting back from SSA form Appelch.19 Cooper-Torczonsec. 9.3 Jim Hogg - UW - CSE - P501

  2. Def-Use (DU) Chains • Common dataflow analysis problem: • Find all sites where a variable is used ( =x ) • Find all sites where that variable was last defined (x= ) • Traditional solution: • Def-Use (DU) chains – additional data structure on top of flowgraph • Link each def (assigns-to) of a variable to all uses • Use-Def (UD) chains • Link each use of a variable to its def Jim Hogg - UW - CSE - P501

  3. Def-Use (DU) Chains entry Two DU chains intersect z>1 x=1 z>2 x=2 y=x+1 z=x-3 x=4 z=x+7 exit Jim Hogg - UW - CSE - P501

  4. DU-Chain Drawbacks • Expensive • if a typical variable has D defs and U uses, total cost is O(D*U) • ie: O(n2) where n is number of statements in function • Better if cost varied as O(n) • Unrelated uses of same variable are mixed together. This complicates analysis. Eg: • Variable vis used over and over; but uses are disjoint • Register allocated might think it is live across all uses Jim Hogg - UW - CSE - P501

  5. SSA: Static Single Assignment • IR form where each variable has only one def in the function • Equivalent, in a source program, to never using same variable for different purposes • staticsingle definition. But that def can be in a loop that is executed dynamically many times • In a loop, it really is the same variable each time, so makes sense • Separates values from where they are stored in memory After: Manual, source SSA format Before intx = 0; while (x < a.length) a[x] = i+7; for(x = 42; x >= 0; --x) b[x] = a[x] * 3; intx = 0; while (x < a.length) a[x] = i+7; for(intx1 = 42; x1 >= 0; --x1) b[x1] = a[x1] * 3; Jim Hogg - UW - CSE - P501

  6. SSA in Basic Blocks We’ve seen this before when we looked at Value Numbering a = x + y b = a – 1 a = y + b b = x * 4 a = a + b a1 = x0 + y0 b1 = a1 – 1 a2 = y0 + b1 b2 = x0 * 4 a3 = a2 + b2 Original SSA • Bump the SSA number each time a variable is def'd • x0 is the value assigned before the current function - eg: input parameter Jim Hogg - UW - CSE - P501

  7. Merge Points • Main issue • How to handle merge points in the flowgraph? • Solution • introduce a Φ-function • a3 = Φ(a1, a2) • Meaning • a3 is assigned either a1 or a2 depending on which control path was used to reach the Φ-function Jim Hogg - UW - CSE - P501

  8. Example Original SSA b1= M[x0]a1= 0 b = M[x]a = 0 if b1 < 4 if b < 4 a2= b1 a = b c = a + b a3= Φ(a1, a2) c1= a3 + b1 Jim Hogg - UW - CSE - P501

  9. How Does Φknow what to Pick? • It doesn’t • When we translate the program to executable form, we add code to copy either value to a common location on each incoming edge • For analysis, all we may need to know is the connection of uses to defs – no need to “execute” anything After Before a1 = a = a1 a2 = a = a2 a1 = a2 = a3= Φ(a1, a2) a3= Φ(a1, a2) = a

  10. Example With Loop Original SSA a1= 0 • Notes: • a0, b0, c0 are initial values of a, b, c • b1is dead – can delete later • c is live on entry – either input parameter or uninitialized a = 0 a3= Φ(a1, a2) b1= Φ(b0, b2) c2= Φ(c0, c1) b2= a3 + 1c1= c2 + b2a2= b2 * 2if a2 < N b = a + 1c = c + ba = b * 2if a < N return c return c1 Jim Hogg - UW - CSE - P501

  11. What does SSA 'buy' us? entry entry • No need for DU or UD • Implicit in SSA • Compact representation z1>1 z>1 x1=1 z1>2 x=1 z>2 x=2 x2=2 y=x1+1 y=x+1 x3=φ(x1,x2) z2=x3-3 x4=4 z=x-3 x=4 z=x+7 z3=x4+7 exit exit • SSA form is "recent" (80s) • Prevalent in real compilers for {} languages

  12. Converting To SSA Form • Basic idea • First, add Φ-functions • Then rename all defs and uses of variables by adding subscripts Jim Hogg - UW - CSE - P501

  13. Inserting Φ-Functions • Add Φ-functions for everyvariable in the function, at every join point? • This is called "maximal SSA" • But • Wastes lots of memory • Slows compile time • Not needed in many cases Jim Hogg - UW - CSE - P501

  14. Path-convergence criterion • Insert a Φ-function for variable a at block B3 when: • There are blocks B1 and B2, both containing definitions of a, and B1  B2 • There are paths from B1 to B3 and from B2 to B3 • These paths have no common nodes other than B3 • B3 is not in both paths prior to the end (eg: loop), altho' it may appear in one of them a2 = a1 = B2 B1 a3 = Φ(a1, a2) B3

  15. Details • The start node of the flowgraph is considered to define every variable (a0, b0, etc) even if to “undefined” • Each Φ-function itself defines a variable, so keep adding Φ-functions until things converge Jim Hogg - UW - CSE - P501

  16. Dominators and SSA • One property of SSA is that defs dominate uses. More specifically: • If an = Φ(… ai…) is in block B, then the definition of ai dominates the ith predecessor of B • If a is used in a non-Φ statement in block B, then the definition of a dominates block B a1 = a2 a3 = Φ(a1, a2) B

  17. Dominance Frontier (1) • To get a practical algorithm for placing Φ-functions, we need to avoid looking at all combinations of nodes leading to a join-node • Instead, use the dominator tree in the flow graph Jim Hogg - UW - CSE - P501

  18. Dominance Frontier (2) Definitions • xstrictly dominatesy if x dominates y and x  y • x domy && x  y => x sdom y • The dominance frontierof x is the set of all nodes z such that x dominates a predecessor of z, but x does not dominate z • Dominance Frontier is the border between dominated and undominated nodes; it passes thru the undominated set • Leave x in all directions; stop when you hit undominated nodes x x dom {x, y} x sdom {y} z DF(x) y z Jim Hogg - UW - CSE - P501

  19. Example 1 2 5 9 3 10 11 7 6 12 4 8 13 Jim Hogg - UW - CSE - P501

  20. Node 5 dominates ... 5 dom {5,6,7,8} 5 sdom {6,7,8} 1 2 5 9 3 10 11 7 6 12 4 8 13 Jim Hogg - UW - CSE - P501

  21. Node 5 Dominance Frontier ... DF(5) 1 5 dom {5,6,7,8} 5 sdom {6,7,8} 2 x 5 9 3 10 11 7 6 12 4 8 y DF(x) = {z | y  pred(z), xdomy, x !domz, x !sdomz } 13 z DF(5) excludes 5 itself (backedge 85) because 5 !sdmon 5 Jim Hogg - UW - CSE - P501

  22. Dominance Frontier Criterion • If node xdef's variable a, then every node in DF(x) needs a Φ-function for a • Since the Φ-function itself is a def, this needs to be iterated until it reaches a fixed-point • Theorem: this algorithm places exactly the same set of Φ-functions as the path criterion given previously Jim Hogg - UW - CSE - P501

  23. Placing Φ-Functions: Details • Compute the DF for each node in the flowgraph • Insert just enough Φ-functions to satisfy the criterion. Use a worklist algorithm to avoid revisiting nodes • Walk the dominator tree and rename the different definitions of variable a to be a1, a2, a3, … Jim Hogg - UW - CSE - P501

  24. Efficient Dominator Tree Computation • Goal: SSA makes optimizing compilers faster since we can find defs/uses without expensive bit-vector algorithms • So, need to be able to compute SSA form quickly • Computation of SSA from dominator trees is efficient, but… Jim Hogg - UW - CSE - P501

  25. Lengauer-Tarjan Algorithm • Iterative, set-based algorithm for finding dominator trees is slow in worst case • Lengauer-Tarjan is near linear time (in number of nodes) • Uses depth-first spanning tree from start node of control flow graph • See books for details Jim Hogg - UW - CSE - P501

  26. SSA Optimizations • Given the SSA form, what can we do with it? • First, what do we know? - what info is kept in the SSA graph? Jim Hogg - UW - CSE - P501

  27. SSA Data Structures • Instruction • links to containing block, next and previous instructions, variables defined, variables used. • Instruction kinds: ordinary, Φ-function, fetch, store, branch • Variable • link to def (instruction) and use sites • Block • list of contained instructions, ordered list of predecessors, successor(s) Jim Hogg - UW - CSE - P501

  28. Dead-Code Elimination (DCE) Many optimizations become easy to recognize in SSA form. Eg: • A variable is live <=> list of uses is not empty • To remove dead code: while (there is some variable v with no uses) { if (instruction that def'sv has no side effects) { delete instruction } } • Need to remove this instruction from list of uses for its operands: may kill those variables too Jim Hogg - UW - CSE - P501

  29. Simple Constant Prop. • In v = c, if c is a constant, then: • Then update every use of variable v to use constant c instant • In v = Φ(c1, c2 … cn), if all ci’s have same constant value c, replace instruction with v = c • Incorporate copy prop, constant folding, and others in the same worklist algorithm Jim Hogg - UW - CSE - P501

  30. Simple Constant Propagation W = list of all instruction in SSA function while W is not empty { remove some instructions I from W if I is v = Φ(c, c, …, c), replace I with v = c if I is v = c { delete I from the function foreach instruction T that uses v { substitute c for v in T add T to W } } } Jim Hogg - UW - CSE - P501

  31. Converting Back from SSA • Unfortunately, real machines do not include a Φ instruction • So after analysis, optimization, and transformation, need to convert back to a “Φ-less” form for execution Jim Hogg - UW - CSE - P501

  32. Translating Φ-functions • The meaning of x = Φ(x1, x2 … xn) is: • set x = x1 if arriving on edge 1 • set x = x2 if arriving on edge 2 • etc • So, foreachi, insert x = xi at the end of predecessor block i • Rely on copy prop. and coalescing in register allocation to eliminate redundant moves Jim Hogg - UW - CSE - P501

  33. SSA Wrapup • More details in recent compiler books (but not the new Dragon book!) • Allows efficient implementation of many optimizations • Used in many new compiler (eg: LLVM) & retrofitted into many older ones (GCC) • Not a silver bullet – some optimizations still need non-SSA forms Jim Hogg - UW - CSE - P501

More Related