Control Flow Analysis
Mooly Sagiv
http://www.math.tau.ac.il/~sagiv/courses/pa.html
Tel Aviv University, 640-6706
Sunday 18-21, Schreiber 8
Monday 10-12, Schreiber 317
Textbook: Chapter 3 (Simplified + OO)
Goals
• Understand the problem of Control Flow Analysis
  • In functional languages
  • In object-oriented languages
  • Function pointers
• Learn the constraint-based program analysis technique
  • General
  • Usage for Control Flow Analysis
  • Algorithms
  • Systems
• Similarities between problems and techniques
Outline
• A Motivating Example (OO)
• The Control Flow Analysis Problem
• A Formal Specification
• Set Constraints
• Solving Constraints
• Adding Dataflow Information
• Adding Context Information
• Back to the Motivating Example
• Conclusions
A Motivating Example

class Vehicle extends Object {
    int position = 10;
    void move(x1 : int) {
        position = position + x1;
    }
}

class Car extends Vehicle {
    int passengers;
    void await(v : Vehicle) {
        if (v.position < position)
            then v.move(position - v.position);
            else self.move(10);
    }
}

class Truck extends Vehicle {
    void move(x2 : int) {
        if (x2 < 55) position = position + x2;
    }
}

void main {
    Car c; Truck t; Vehicle v1;
    new c; new t;
    v1 := c;
    c.passengers := 2;
    c.move(60);
    v1.move(70);
    c.await(t);
}
The Control Flow Analysis (CFA) Problem
• Given a program in a functional programming language with higher-order functions (functions can serve as parameters and return values)
• Find out for each function invocation which functions may be applied
• Obvious in C without function pointers
• Difficult in C++, Java and ML
• The Dynamic Dispatch Problem
An ML Example

let f = fn x => x 1;
    g = fn y => y + 2;
    h = fn z => z + 3;
in (f g) + (f h)

At the call site x 1, the variable x may be bound to either g or h:

let f = fn x => (* {g, h} *) x 1;
    g = fn y => y + 2;
    h = fn z => z + 3;
in (f g) + (f h)
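The same situation can be reproduced in any language with first-class functions. The following is a minimal Python sketch of the ML example; the list `applied` is a hypothetical piece of instrumentation added here to record which functions actually reach the call site, and is not part of the original program.

```python
# Python transliteration of the ML example.
# `applied` is instrumentation invented for this sketch.
applied = []

def g(y):
    return y + 2

def h(z):
    return z + 3

def f(x):
    # The call site `x 1`: the callee is whatever the caller passed in.
    applied.append(x.__name__)
    return x(1)

result = f(g) + f(h)
print(result)                  # 7
print(sorted(set(applied)))    # ['g', 'h']
```

A control flow analysis has to compute a set such as {g, h} for this call site without running the program.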
The Language FUN
• Notations
  • $e \in \mathbf{Exp}$ // expressions (or labeled terms)
  • $t \in \mathbf{Term}$ // terms (or unlabeled terms)
  • $f, x \in \mathbf{Var}$ // variables
  • $c \in \mathbf{Const}$ // constants
  • $op \in \mathbf{Op}$ // binary operators
  • $l \in \mathbf{Lab}$ // labels
• Abstract Syntax
  • e ::= t^l
  • t ::= c | x
       | fn x => e                  // function definition
       | fun f x => e               // recursive function definition
       | e1 e2                      // function application
       | if e0 then e1 else e2
       | let x = e1 in e2
       | e1 op e2
A Simple Example
((fn x => x^1)^2 (fn y => y^3)^4)^5
An Example which Loops
(let g = (fun f x => (f^1 (fn y => y^2)^3)^4)^5 in (g^6 (fn z => z^7)^8)^9)^10
The 0-CFA Problem
• Compute for every program a pair $(\hat{C}, \hat{\rho})$ where:
  • $\hat{C}$ is the abstract cache associating abstract values with labeled program points
  • $\hat{\rho}$ is the abstract environment associating abstract values with variables
• Formally
  • $\hat{v} \in \widehat{\mathbf{Val}} = \mathcal{P}(\mathbf{Term})$ // abstract values
  • $\hat{\rho} \in \widehat{\mathbf{Env}} = \mathbf{Var} \to \widehat{\mathbf{Val}}$ // abstract environment
  • $\hat{C} \in \widehat{\mathbf{Cache}} = \mathbf{Lab} \to \widehat{\mathbf{Val}}$ // abstract cache
• For a function application $(t_1^{l_1}\ t_2^{l_2})^{l}$, $\hat{C}(l_1)$ determines the functions that can be applied
• These maps are finite for a given program
• No context is considered for parameters
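(let g = (fun f x => (f^1 (fn y => y^2)^3)^4)^5 in (g^6 (fn z => z^7)^8)^9)^10

Shorthand
  sf  = fun f x => (f^1 (fn y => y^2)^3)^4
  idy = fn y => y^2
  idz = fn z => z^7

$\hat{C}(1) = \{sf\}$
$\hat{C}(2) = \emptyset$
$\hat{C}(3) = \{idy\}$
$\hat{C}(4) = \emptyset$
$\hat{C}(5) = \{sf\}$
$\hat{C}(6) = \{sf\}$
$\hat{C}(7) = \emptyset$
$\hat{C}(8) = \{idz\}$
$\hat{C}(9) = \emptyset$
$\hat{C}(10) = \emptyset$
$\hat{\rho}(f) = \{sf\}$
$\hat{\rho}(g) = \{sf\}$
$\hat{\rho}(x) = \{idy, idz\}$
$\hat{\rho}(y) = \emptyset$
$\hat{\rho}(z) = \emptyset$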
Relationship to Dataflow Analysis
• Expressions are side-effect free
  • No entry/exit
• A single environment
• Represents information at different points via maps
• A single value for all occurrences of a variable
• Function applications act similarly to assignments
  • “Definition” - function abstraction is created
  • “Use” - function is applied
A Formal Specification of 0-CFA
• A Boolean function defines when a solution is acceptable
• $(\hat{C}, \hat{\rho}) \models e$ means that $(\hat{C}, \hat{\rho})$ is acceptable for the expression e
• Defined by structural induction on e
• Every function is analyzed once
• Every acceptable solution is sound (conservative)
• There are many acceptable solutions
• Generate a set of constraints
• Obtain the least acceptable solution by solving the constraints
Syntax Directed 0-CFA (Simple Expressions)
[const] $(\hat{C}, \hat{\rho}) \models c^{l}$ always
[var] $(\hat{C}, \hat{\rho}) \models x^{l}$ if $\hat{\rho}(x) \subseteq \hat{C}(l)$
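Syntax Directed 0-CFA (Function Abstraction)
[fn] $(\hat{C}, \hat{\rho}) \models (\mathtt{fn}\ x \Rightarrow e)^{l}$ if:
  $(\hat{C}, \hat{\rho}) \models e$
  $\{\mathtt{fn}\ x \Rightarrow e\} \subseteq \hat{C}(l)$
[fun] $(\hat{C}, \hat{\rho}) \models (\mathtt{fun}\ f\ x \Rightarrow e)^{l}$ if:
  $(\hat{C}, \hat{\rho}) \models e$
  $\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \hat{C}(l)$
  $\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \hat{\rho}(f)$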
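Syntax Directed 0-CFA (Function Application)
[app] $(\hat{C}, \hat{\rho}) \models (t_1^{l_1}\ t_2^{l_2})^{l}$ if:
  $(\hat{C}, \hat{\rho}) \models t_1^{l_1}$
  $(\hat{C}, \hat{\rho}) \models t_2^{l_2}$
  for all $(\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in \hat{C}(l_1)$: $\hat{C}(l_2) \subseteq \hat{\rho}(x)$ and $\hat{C}(l_0) \subseteq \hat{C}(l)$
  for all $(\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in \hat{C}(l_1)$: $\hat{C}(l_2) \subseteq \hat{\rho}(x)$ and $\hat{C}(l_0) \subseteq \hat{C}(l)$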
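Syntax Directed 0-CFA (Other Constructs)
[if] $(\hat{C}, \hat{\rho}) \models (\mathtt{if}\ t_0^{l_0}\ \mathtt{then}\ t_1^{l_1}\ \mathtt{else}\ t_2^{l_2})^{l}$ if:
  $(\hat{C}, \hat{\rho}) \models t_0^{l_0}$
  $(\hat{C}, \hat{\rho}) \models t_1^{l_1}$
  $(\hat{C}, \hat{\rho}) \models t_2^{l_2}$
  $\hat{C}(l_1) \subseteq \hat{C}(l)$
  $\hat{C}(l_2) \subseteq \hat{C}(l)$
[let] $(\hat{C}, \hat{\rho}) \models (\mathtt{let}\ x = t_1^{l_1}\ \mathtt{in}\ t_2^{l_2})^{l}$ if:
  $(\hat{C}, \hat{\rho}) \models t_1^{l_1}$
  $(\hat{C}, \hat{\rho}) \models t_2^{l_2}$
  $\hat{C}(l_1) \subseteq \hat{\rho}(x)$
  $\hat{C}(l_2) \subseteq \hat{C}(l)$
[op] $(\hat{C}, \hat{\rho}) \models (t_1^{l_1}\ op\ t_2^{l_2})^{l}$ if:
  $(\hat{C}, \hat{\rho}) \models t_1^{l_1}$
  $(\hat{C}, \hat{\rho}) \models t_2^{l_2}$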
Set Constraints
• A set of rules of the form:
  • $lhs \subseteq rhs$
  • $\{t\} \subseteq rhs' \Rightarrow lhs \subseteq rhs$ (conditional constraint)
• lhs, rhs, rhs' are
  • terms
  • $\hat{C}(l)$
  • $\hat{\rho}(x)$
• The least solution $(\hat{C}, \hat{\rho})$ can be found iteratively
  • Start with empty sets
  • Add terms when needed
• Efficient cubic graph-based solution
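The iterative scheme above can be organised around a worklist. The following Python sketch is one possible arrangement, written in the spirit of a graph-based solver; the tuple encoding of constraints (`"in"`, `"sub"`, `"cond"`) and the node names are assumptions made for this illustration, and no attempt is made to meet the cubic bound.

```python
from collections import defaultdict, deque

def solve(constraints):
    """Least solution of a list of set constraints.

    Encoding (invented for this sketch):
      ("in", t, p)            means {t} <= p
      ("sub", p1, p2)         means p1 <= p2
      ("cond", t, p, p1, p2)  means {t} <= p  =>  p1 <= p2
    """
    data = defaultdict(set)     # node -> terms found so far
    edges = defaultdict(list)   # node -> constraints to revisit when it grows
    work = deque()

    def add(node, terms):
        new = terms - data[node]
        if new:
            data[node] |= new
            work.append(node)

    for c in constraints:
        if c[0] == "in":
            add(c[2], {c[1]})
        elif c[0] == "sub":
            edges[c[1]].append(c)
        else:
            # Revisit when either the condition node or the source grows.
            edges[c[2]].append(c)
            edges[c[3]].append(c)

    while work:
        q = work.popleft()
        for c in edges[q]:
            if c[0] == "sub":
                add(c[2], data[c[1]])
            elif c[1] in data[c[2]]:
                add(c[4], data[c[3]])
    return data

constraints = [
    ("in", "a", "X"),
    ("sub", "X", "Y"),
    ("cond", "a", "Y", "Y", "Z"),   # fires, because a reaches Y
    ("cond", "b", "X", "X", "W"),   # never fires, because b never reaches X
]
solution = solve(constraints)
for node in "XYZW":
    print(node, sorted(solution[node]))
```

Sets only ever grow and every node is re-examined only when it has gained a term, so the loop terminates on any finite constraint set.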
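Syntax Directed Constraint Generation (Part I)
$\mathcal{C}_{*}[\![c^{l}]\!] = \emptyset$
$\mathcal{C}_{*}[\![x^{l}]\!] = \{\hat{\rho}(x) \subseteq \hat{C}(l)\}$
$\mathcal{C}_{*}[\![(\mathtt{fn}\ x \Rightarrow e)^{l}]\!] = \mathcal{C}_{*}[\![e]\!] \cup \{\{\mathtt{fn}\ x \Rightarrow e\} \subseteq \hat{C}(l)\}$
$\mathcal{C}_{*}[\![(\mathtt{fun}\ f\ x \Rightarrow e)^{l}]\!] = \mathcal{C}_{*}[\![e]\!] \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \hat{C}(l)\} \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \hat{\rho}(f)\}$
$\mathcal{C}_{*}[\![(t_1^{l_1}\ t_2^{l_2})^{l}]\!] = \mathcal{C}_{*}[\![t_1^{l_1}]\!] \cup \mathcal{C}_{*}[\![t_2^{l_2}]\!]$
  $\cup\ \{\{t\} \subseteq \hat{C}(l_1) \Rightarrow \hat{C}(l_2) \subseteq \hat{\rho}(x) \mid t = (\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}_{*}\}$
  $\cup\ \{\{t\} \subseteq \hat{C}(l_1) \Rightarrow \hat{C}(l_0) \subseteq \hat{C}(l) \mid t = (\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}_{*}\}$
  $\cup\ \{\{t\} \subseteq \hat{C}(l_1) \Rightarrow \hat{C}(l_2) \subseteq \hat{\rho}(x) \mid t = (\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}_{*}\}$
  $\cup\ \{\{t\} \subseteq \hat{C}(l_1) \Rightarrow \hat{C}(l_0) \subseteq \hat{C}(l) \mid t = (\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}_{*}\}$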
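Syntax Directed Constraint Generation (Part II)
$\mathcal{C}_{*}[\![(\mathtt{if}\ t_0^{l_0}\ \mathtt{then}\ t_1^{l_1}\ \mathtt{else}\ t_2^{l_2})^{l}]\!] = \mathcal{C}_{*}[\![t_0^{l_0}]\!] \cup \mathcal{C}_{*}[\![t_1^{l_1}]\!] \cup \mathcal{C}_{*}[\![t_2^{l_2}]\!] \cup \{\hat{C}(l_1) \subseteq \hat{C}(l)\} \cup \{\hat{C}(l_2) \subseteq \hat{C}(l)\}$
$\mathcal{C}_{*}[\![(\mathtt{let}\ x = t_1^{l_1}\ \mathtt{in}\ t_2^{l_2})^{l}]\!] = \mathcal{C}_{*}[\![t_1^{l_1}]\!] \cup \mathcal{C}_{*}[\![t_2^{l_2}]\!] \cup \{\hat{C}(l_1) \subseteq \hat{\rho}(x)\} \cup \{\hat{C}(l_2) \subseteq \hat{C}(l)\}$
$\mathcal{C}_{*}[\![(t_1^{l_1}\ op\ t_2^{l_2})^{l}]\!] = \mathcal{C}_{*}[\![t_1^{l_1}]\!] \cup \mathcal{C}_{*}[\![t_2^{l_2}]\!]$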
Iterative Solution to the Set Constraints for
((fn x => x^1)^2 (fn y => y^3)^4)^5
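The iteration for this expression can be replayed mechanically. The Python sketch below generates the constraints for the fragment of FUN that the example uses (constants, variables, fn and application only) and then applies them repeatedly from empty sets until nothing changes. The tuple encoding of terms and constraints and the helper names `gen`, `solve` and `subterms` are assumptions made for this illustration.

```python
def subterms(e):
    """Yield every labelled subexpression of e."""
    yield e
    if e[0] == "fn":
        yield from subterms(e[2])
    elif e[0] == "app":
        yield from subterms(e[1])
        yield from subterms(e[2])

def gen(e, fns):
    """Constraints for e; fns lists every fn term of the whole program."""
    kind, l = e[0], e[-1]
    if kind == "const":
        return []
    if kind == "var":
        return [("sub", ("r", e[1]), ("C", l))]          # r(x) <= C(l)
    if kind == "fn":
        return gen(e[2], fns) + [("in", e, ("C", l))]    # {fn x => e} <= C(l)
    e1, e2 = e[1], e[2]                                  # application
    cs = gen(e1, fns) + gen(e2, fns)
    for t in fns:
        x, body = t[1], t[2]
        cs.append(("cond", t, ("C", e1[-1]), ("C", e2[-1]), ("r", x)))
        cs.append(("cond", t, ("C", e1[-1]), ("C", body[-1]), ("C", l)))
    return cs

def solve(constraints):
    """Naive iteration: start with empty sets, add terms when needed."""
    sol, changed = {}, True
    while changed:
        changed = False
        for c in constraints:
            if c[0] == "in":
                new, target = {c[1]}, c[2]
            elif c[0] == "sub":
                new, target = sol.get(c[1], set()), c[2]
            else:
                fires = c[1] in sol.get(c[2], set())
                new, target = (sol.get(c[3], set()) if fires else set()), c[4]
            if not new <= sol.get(target, set()):
                sol[target] = sol.get(target, set()) | new
                changed = True
    return sol

# ((fn x => x^1)^2 (fn y => y^3)^4)^5
fx = ("fn", "x", ("var", "x", 1), 2)
fy = ("fn", "y", ("var", "y", 3), 4)
prog = ("app", fx, fy, 5)

fns = [t for t in subterms(prog) if t[0] == "fn"]
sol = solve(gen(prog, fns))
names = {fx: "fn x", fy: "fn y"}
keys = [("C", i) for i in range(1, 6)] + [("r", "x"), ("r", "y")]
for key in keys:
    print(key, sorted(names[t] for t in sol.get(key, set())))
```

Under these assumptions the run ends with C(1), C(4), C(5) and r(x) containing only fn y, C(2) containing only fn x, and C(3) and r(y) empty.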
Adding Data Flow Information
• Dataflow values can affect control flow analysis
• Example
  (let f = (fn x => (if (x^1 > 0^2)^3 then (fn y => y^4)^5 else (fn z => 5^6)^7)^8)^9 in ((f^10 3^11)^12 0^13)^14)^15
Adding Data Flow Information
• Add a finite set of “abstract” values per program: $\mathbf{Data}$
• Update $\widehat{\mathbf{Val}} = \mathcal{P}(\mathbf{Term} \cup \mathbf{Data})$
  • $\hat{\rho} \in \widehat{\mathbf{Env}} = \mathbf{Var} \to \widehat{\mathbf{Val}}$ // abstract environment
  • $\hat{C} \in \widehat{\mathbf{Cache}} = \mathbf{Lab} \to \widehat{\mathbf{Val}}$ // abstract cache
• Generate extra constraints for data
• Obtain a more precise solution
• A special case of a product domain (4.4)
• The combination of two analyses may be more precise than both
• For some programs it may even be more efficient
Adding Dataflow Information (Sign Analysis)
• Sign analysis
• Add a finite set of “abstract” values per program: $\mathbf{Data} = \{P, N, TT, FF\}$
• Update $\widehat{\mathbf{Val}} = \mathcal{P}(\mathbf{Term} \cup \mathbf{Data})$
• $d_c$ is the abstract value that represents a constant c
  • $d_{3} = \{P\}$
  • $d_{-7} = \{N\}$
  • $d_{true} = \{TT\}$
  • $d_{false} = \{FF\}$
• Every operator is conservatively interpreted
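A conservative interpretation of an operator returns every abstract result that some pair of concrete operands could produce. The Python sketch below illustrates this for `+` and `>`. It assumes that P means strictly positive and N strictly negative, and it adds a third sign `Z` for zero, which the slide does not list; the function names are likewise invented for this illustration.

```python
from itertools import product

# Abstract data values. "Z" (zero) is an assumption of this sketch;
# the slide lists only P, N, TT and FF.
P, Z, N, TT, FF = "P", "Z", "N", "TT", "FF"
SIGNS = {P, Z, N}

def abstract(c):
    """d_c: the abstract value of the constant c."""
    if c is True:
        return {TT}
    if c is False:
        return {FF}
    if c > 0:
        return {P}
    return {N} if c < 0 else {Z}

def plus_sign(a, b):
    """All possible signs of x + y when x has sign a and y has sign b."""
    if a == Z:
        return {b}
    if b == Z or a == b:
        return {a}
    return {P, Z, N}          # opposite signs: any result is possible

def greater_sign(a, b):
    """All possible truth values of x > y for signs a and b."""
    if (a, b) in {(P, Z), (P, N), (Z, N)}:
        return {TT}
    if (a, b) in {(Z, Z), (Z, P), (N, Z), (N, P)}:
        return {FF}
    return {TT, FF}           # (P, P) and (N, N): both outcomes are possible

def lift(sign_op):
    """Lift an operator on single signs to sets of abstract values."""
    def op(v1, v2):
        result = set()
        # Non-numeric members (function terms, booleans) are ignored.
        for a, b in product(v1 & SIGNS, v2 & SIGNS):
            result |= sign_op(a, b)
        return result
    return op

plus = lift(plus_sign)
greater = lift(greater_sign)

print(greater(abstract(3), abstract(0)))   # {'TT'}
print(sorted(plus(abstract(3), abstract(-7))))
```

With such operators, a test like x > 0 where x can only be positive yields only TT, which is what allows an analysis to disregard the else branch of a conditional.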
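Syntax Directed Constraint Generation (Part I)
$\mathcal{C}_{*}[\![c^{l}]\!] = \{d_c \subseteq \hat{C}(l)\}$
$\mathcal{C}_{*}[\![x^{l}]\!] = \{\hat{\rho}(x) \subseteq \hat{C}(l)\}$
$\mathcal{C}_{*}[\![(\mathtt{fn}\ x \Rightarrow e)^{l}]\!] = \mathcal{C}_{*}[\![e]\!] \cup \{\{\mathtt{fn}\ x \Rightarrow e\} \subseteq \hat{C}(l)\}$
$\mathcal{C}_{*}[\![(\mathtt{fun}\ f\ x \Rightarrow e)^{l}]\!] = \mathcal{C}_{*}[\![e]\!] \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \hat{C}(l)\} \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \hat{\rho}(f)\}$
$\mathcal{C}_{*}[\![(t_1^{l_1}\ t_2^{l_2})^{l}]\!] = \mathcal{C}_{*}[\![t_1^{l_1}]\!] \cup \mathcal{C}_{*}[\![t_2^{l_2}]\!]$
  $\cup\ \{\{t\} \subseteq \hat{C}(l_1) \Rightarrow \hat{C}(l_2) \subseteq \hat{\rho}(x) \mid t = (\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}_{*}\}$
  $\cup\ \{\{t\} \subseteq \hat{C}(l_1) \Rightarrow \hat{C}(l_0) \subseteq \hat{C}(l) \mid t = (\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}_{*}\}$
  $\cup\ \{\{t\} \subseteq \hat{C}(l_1) \Rightarrow \hat{C}(l_2) \subseteq \hat{\rho}(x) \mid t = (\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}_{*}\}$
  $\cup\ \{\{t\} \subseteq \hat{C}(l_1) \Rightarrow \hat{C}(l_0) \subseteq \hat{C}(l) \mid t = (\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}_{*}\}$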
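Syntax Directed Constraint Generation (Part II)
$\mathcal{C}_{*}[\![(\mathtt{if}\ t_0^{l_0}\ \mathtt{then}\ t_1^{l_1}\ \mathtt{else}\ t_2^{l_2})^{l}]\!] = \mathcal{C}_{*}[\![t_0^{l_0}]\!] \cup \mathcal{C}_{*}[\![t_1^{l_1}]\!] \cup \mathcal{C}_{*}[\![t_2^{l_2}]\!]$
  $\cup\ \{d_{true} \subseteq \hat{C}(l_0) \Rightarrow \hat{C}(l_1) \subseteq \hat{C}(l)\}$
  $\cup\ \{d_{false} \subseteq \hat{C}(l_0) \Rightarrow \hat{C}(l_2) \subseteq \hat{C}(l)\}$
$\mathcal{C}_{*}[\![(\mathtt{let}\ x = t_1^{l_1}\ \mathtt{in}\ t_2^{l_2})^{l}]\!] = \mathcal{C}_{*}[\![t_1^{l_1}]\!] \cup \mathcal{C}_{*}[\![t_2^{l_2}]\!] \cup \{\hat{C}(l_1) \subseteq \hat{\rho}(x)\} \cup \{\hat{C}(l_2) \subseteq \hat{C}(l)\}$
$\mathcal{C}_{*}[\![(t_1^{l_1}\ op\ t_2^{l_2})^{l}]\!] = \mathcal{C}_{*}[\![t_1^{l_1}]\!] \cup \mathcal{C}_{*}[\![t_2^{l_2}]\!] \cup \{\hat{C}(l_1)\ \widehat{op}\ \hat{C}(l_2) \subseteq \hat{C}(l)\}$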
Adding Context Information
• The analysis does not distinguish between different occurrences of a variable (monovariant analysis)
• Example
  (let f = (fn x => x^1)^2 in ((f^3 f^4)^5 (fn y => y^6)^7)^8)^9
• Source-to-source transformation can help (but may lead to code explosion)
• Example rewritten
  let f1 = fn x1 => x1 in let f2 = fn x2 => x2 in (f1 f2) (fn y => y)
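Simplified k-CFA
• Records the last k dynamic calls (for some fixed k)
• Similar to the call string approach
• Remember the context in which an expression is evaluated
• $\widehat{\mathbf{Val}}$ is now $\mathcal{P}(\mathbf{Term}) \times \mathbf{Contexts}$
• $\widehat{\mathbf{Env}} = \mathbf{Var} \times \mathbf{Contexts} \to \widehat{\mathbf{Val}}$
• $\hat{C} \in \widehat{\mathbf{Cache}} = \mathbf{Lab} \times \mathbf{Contexts} \to \widehat{\mathbf{Val}}$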
1-CFA
• (let f = (fn x => x^1)^2 in ((f^3 f^4)^5 (fn y => y^6)^7)^8)^9
• Contexts
  • [] - the empty context
  • [5] - the application at label 5
  • [8] - the application at label 8
• Polyvariant control flow
  $\hat{C}(1, [5]) = \hat{\rho}(x, [5]) = \hat{C}(2, []) = \hat{C}(3, []) = \hat{\rho}(f, []) = (\{\mathtt{fn}\ x \Rightarrow x^{1}\}, [])$
  $\hat{C}(1, [8]) = \hat{\rho}(x, [8]) = \hat{C}(7, []) = \hat{C}(8, []) = \hat{C}(9, []) = (\{\mathtt{fn}\ y \Rightarrow y^{6}\}, [])$
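The bookkeeping behind these contexts is small. The Python sketch below shows one way to keep only the last k call-site labels and to key the cache by (label, context) pairs. The helper `push`, the tuple representation of contexts and the string names for terms are assumptions made for this illustration, and the context component of abstract values is left out for brevity.

```python
from collections import defaultdict

def push(context, call_label, k):
    """Append a call-site label to a context, keeping only the last k labels."""
    if k == 0:
        return ()                       # 0-CFA: every context collapses to []
    return (context + (call_label,))[-k:]

K = 1
empty = ()
ctx5 = push(empty, 5, K)                # context [5]: entered from application 5
ctx8 = push(empty, 8, K)                # context [8]: entered from application 8

# The cache is indexed by (label, context) instead of by label alone.
cache = defaultdict(set)
cache[(1, ctx5)].add("fn x => x^1")     # body of f when called at label 5
cache[(1, ctx8)].add("fn y => y^6")     # body of f when called at label 8

print(cache[(1, ctx5)])                 # {'fn x => x^1'}
print(cache[(1, ctx8)])                 # {'fn y => y^6'}

# A monovariant analysis has a single entry for label 1 and merges both sets.
merged = cache[(1, ctx5)] | cache[(1, ctx8)]
print(sorted(merged))                   # ['fn x => x^1', 'fn y => y^6']
```

Larger k separates more call chains at the price of more (label, context) pairs to track.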
The Motivating Example

class Vehicle extends Object {
    int position = 10;
    void move(x1 : int) {
        position = position + x1;
    }
}

class Car extends Vehicle {
    int passengers;
    void await(v : Vehicle) {
        if (v.position < position)
            then v.move(position - v.position);
            else self.move(10);
    }
}

class Truck extends Vehicle {
    void move(x2 : int) {
        if (x2 < 55) position = position + x2;
    }
}

void main {
    Car c; Truck t; Vehicle v1;
    new c; new t;
    v1 := c;
    c.passengers := 2;
    c.move(60);
    v1.move(70);
    c.await(t);
}
Missing Material
• Efficient cubic solution to set constraints
  www.cs.berkeley.edu/Research/Aiken/bane.html
• Experimental results for OO
  www.cs.washington.edu/research/projects/cecil
• Operational semantics for FUN (3.2.1)
• Defining acceptability without structural induction
• More precise treatment of termination (3.2.2)
• Needs co-induction (greatest fixed point)
• Using general lattices as dataflow values instead of powersets (3.5.2)
• Lower bounds
• Decidability of JOP
• Polynomiality
Conclusions
• Set constraints are quite useful
  • A uniform syntax
  • Can even deal with pointers
• But the semantic foundation is still based on abstract interpretation
• Techniques used in functional and imperative (OO) programming are similar
• Control and data flow analysis are related