Administrative info
E N D
Presentation Transcript
Administrative info • Subscribe to the class mailing list • instructions are on the class web page, which is accessible from my home page, which is accessible by searching for Sorin Lerner on google • Project page is up • Not much info there yet. More will be available by Thursday • You will need to create groups for the project by Thursday. So start thinking about who you would like to work with.
Simple example foo(z) { x := 3 + 6; y := x – 5 return z * y }
Simple example foo(z) { x := 3 + 6; y := x – 5 return z * y }
Another example x := a + b; ... y := a + b;
Another example x := a + b; ... y := a + b;
Another example if (...) { x := a + b; } ... y := a + b;
Another example if (...) { x := a + b; } ... y := a + b;
Another example x := y ... z := z + x
Another example x := y ... z := z + x
Another example x := y ... z := z + y What if we run CSE now?
Another example x := y ... z := z + y What if we run CSE now?
Another example x := y**z ... x := ...
Another example • Often used as a clean-up pass x := y**z ... x := ... Copy prop DAE x := y z := z + x x := y z := z + y x := y z := z + y
Another example if (false) { ... }
Another example if (false) { ... }
Another example • In Java: a := new int [10]; for (index = 0; index < 10; index ++) { a[index] := 100; }
Another example • In “lowered” Java: a := new int [10]; for (index = 0; index < 10; index ++) { if (index < 0 || index >= a.length()) { throw OutOfBoundsException; } a[index] := 0; }
Another example • In “lowered” Java: a := new int [10]; for (index = 0; index < 10; index ++) { if (index < 0 || index >= a.length()) { throw OutOfBoundsException; } a[index] := 0; }
Another example p := &x; *p := 5 y := x + 1;
Another example p := &x; *p := 5 y := x + 1; x := 5; *p := 3 y := x + 1; ???
Another example for j := 1 to N for i := 1 to N a[i] := a[i] + b[j]
Another example for j := 1 to N for i := 1 to N a[i] := a[i] + b[j]
Another example area(h,w) { return h * w } h := ...; w := 4; a := area(h,w)
Another example area(h,w) { return h * w } h := ...; w := 4; a := area(h,w)
Optimization themes • Don’t compute if you don’t have to • unused assignment elimination • Compute at compile-time if possible • constant folding, loop unrolling, inlining • Compute it as few times as possible • CSE, PRE, PDE, loop invariant code motion • Compute it as cheaply as possible • strength reduction • Enable other optimizations • constant and copy prop, pointer analysis • Compute it with as little code space as possible • unreachable code elimination
Dataflow analysis: what is it? • A common framework for expressing algorithms that compute information about a program • Why is such a framework useful?
Dataflow analysis: what is it? • A common framework for expressing algorithms that compute information about a program • Why is such a framework useful? • Provides a common language, which makes it easier to: • communicate your analysis to others • compare analyses • adapt techniques from one analysis to another • reuse implementations (eg: dataflow analysis frameworks)
Control Flow Graphs • For now, we will use a Control Flow Graph representation of programs • each statement becomes a node • edges between nodes represent control flow • Later we will see other program representations • variations on the CFG (eg CFG with basic blocks) • other graph based representations
Example CFG x := ... y := ... x := ... y := ... y := ... p := ... if (...) { ... x ... x := ... ... y ... } else { ... x ... x := ... *p := ... } ... x ... ... y ... y := ... y := ... p := ... if (...) ... x ... ... x ... x := ... x := ... ... y ... *p := ... ... x ... ... x ... y := ...
An example DFA: reaching definitions • For each use of a variable, determine what assignments could have set the value being read from the variable • Information useful for: • performing constant and copy prop • detecting references to undefined variables • presenting “def/use chains” to the programmer • building other representations, like the DFG • Let’s try this out on an example
x := ... Visual sugar y := ... 1: x := ... 2: y := ... 3: y := ... 4: p := ... y := ... p := ... if (...) ... x ... 5: x := ... ... y ... ... x ... 6: x := ... 7: *p := ... ... x ... ... x ... x := ... x := ... ... y ... *p := ... ... x ... ... y ... 8: y := ... ... x ... ... x ... y := ...
1: x := ... 2: y := ... 3: y := ... 4: p := ... ... x ... 5: x := ... ... y ... ... x ... 6: x := ... 7: *p := ... ... x ... ... y ... 8: y := ...
1: x := ... 2: y := ... 3: y := ... 4: p := ... ... x ... 5: x := ... ... y ... ... x ... 6: x := ... 7: *p := ... ... x ... ... y ... 8: y := ...
Safety • Recall intended use of this info: • performing constant and copy prop • detecting references to undefined variables • presenting “def/use chains” to the programmer • building other representations, like the DFG • Safety: • can have more bindings than the “true” answer, but can’t miss any
Reaching definitions generalized • DFA framework is geared towards computing information at each program point (edge) in the CFG • So generalize the reaching definitions problem by stating what should be computed at each program point • For each program point in the CFG, compute the set of definitions (statements) that may reach that point • Notion of safety remains the same
Reaching definitions generalized • Computed information at a program point is a set of var ! stmt bindings • eg: { x ! s1, x ! s2, y ! s3 } • How do we get the previous info we wanted? • if a var x is used in a stmt whose incoming info is in, then: { s | (x!s) 2in } • This is a common pattern • generalize the problem to define what information should be computed at each program point • use the computed information at the program points to get the original info we wanted
1: x := ... 2: y := ... 3: y := ... 4: p := ... ... x ... 5: x := ... ... y ... ... x ... 6: x := ... 7: *p := ... ... x ... ... y ... 8: y := ...
1: x := ... 2: y := ... 3: y := ... 4: p := ... ... x ... 5: x := ... ... y ... ... x ... 6: x := ... 7: *p := ... ... x ... ... y ... 8: y := ...
Using constraints to formalize DFA • Now that we’ve gone through some examples, let’s try to precisely express the algorithms for computing dataflow information • We’ll model DFA as solving a system of constraints • Each node in the CFG will impose constraints relating information at predecessor and successor points • Solution to constraints is result of analysis
Constraints for reaching definitions in s: x := ... out in s: *p := ... out
Constraints for reaching definitions • Using may-point-to information: out = in [ { x ! s | x 2 may-point-to(p) } • Using must-point-to aswell: out = in – { x ! s’ | x 2 must-point-to(p) Æ s’ 2 stmts } [ { x ! s | x 2 may-point-to(p) } in out = in – { x ! s’ | s’ 2 stmts } [ { x ! s } s: x := ... out in s: *p := ... out
Constraints for reaching definitions in s: if (...) out[0] out[1] in[0] in[1] merge out
Constraints for reaching definitions in out [ 0] = in Æ out [ 0] = in s: if (...) out[0] out[1] more generally: 8i . out [ i ] = in in[0] in[1] out = in [ 0] [in [ 1] merge more generally: out = iin [ i ] out
Flow functions • The constraint for a statement kind s often have the form: out = Fs(in) • Fs is called a flow function • other names for it: dataflow function, transfer function • Given information in before statement s, Fs(in) returns information after statement s • Other formulations have the statement s as an explicit parameter to F: given a statement s and some information in, F(s,in) returns the outgoing information after statement s
Flow functions • Issue: what does one do when there are multiple input edges to a node? • the flow functions takes as input a tuple of values, one value for each incoming edge • Issue: what does one do when there are multiple outgoing edges to a node? • the flow function returns a tuple of values, one value for each outgoing edge • can also have one flow function per outgoing edge
Flow functions • Flow functions are a central component of a dataflow analysis • They state constraints on the information flowing into and out of a statement • This version of the flow functions is local • it applies to a particular statement kind • we’ll see global flow functions shortly...