
# Spring 2014 Program Analysis and Verification Lecture 10: Abstract Interpretation II


##### Presentation Transcript

1. Spring 2014 Program Analysis and Verification Lecture 10: Abstract Interpretation II • Roman Manevich • Ben-Gurion University

2. Syllabus

3. Previously • Semantic domains • Preorders • Partial orders (posets) • Pointed posets • Ascending/descending chains • The height of a poset • Join and Meet operators • Complete lattices • Constructing new lattices from old • Abstract Interpretation package – domains

4. Abstract domain types

5. A taxonomy of semantic domain types • Complete lattice $(D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$: join/meet exist for every subset of $D$ • Lattice $(D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$: join/meet exist for every finite subset of $D$ (alternatively, binary join/meet); $\top$ is the meet of the empty set and $\bot$ is the join of the empty set • Join semilattice $(D, \sqsubseteq, \sqcup, \top)$ • Meet semilattice $(D, \sqsubseteq, \sqcap, \bot)$ • Complete partial order (CPO) $(D, \sqsubseteq, \bot)$: poset with LUB for all ascending chains • Partial order (poset) $(D, \sqsubseteq)$: reflexive, transitive, and anti-symmetric: $d \sqsubseteq d'$ and $d' \sqsubseteq d$ implies $d = d'$ • Preorder $(D, \sqsubseteq)$: reflexive: $d \sqsubseteq d$; transitive: $d \sqsubseteq d'$ and $d' \sqsubseteq d''$ implies $d \sqsubseteq d''$

6. Composing domains

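7. Cartesian product of complete lattices • For two complete lattices $L_1 = (D_1, \sqsubseteq_1, \sqcup_1, \sqcap_1, \bot_1, \top_1)$ and $L_2 = (D_2, \sqsubseteq_2, \sqcup_2, \sqcap_2, \bot_2, \top_2)$ • Define the poset $L_{cart} = (D_1 \times D_2, \sqsubseteq_{cart}, \sqcup_{cart}, \sqcap_{cart}, \bot_{cart}, \top_{cart})$ as follows: • $(x_1, x_2) \sqsubseteq_{cart} (y_1, y_2)$ iff $x_1 \sqsubseteq_1 y_1$ and $x_2 \sqsubseteq_2 y_2$ • $\sqcup_{cart} = ?$ $\sqcap_{cart} = ?$ $\bot_{cart} = ?$ $\top_{cart} = ?$ • Lemma: $L_{cart}$ is a complete lattice • Define the Cartesian constructor $L_{cart} = Cart(L_1, L_2)$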
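One standard way to fill in the operators left open on this slide is to define them componentwise. The sketch below illustrates that choice; the `Lattice` record, `cart`, `powerset_lattice`, and `BOOL` are hypothetical names introduced here and are not part of the course package. It only models binary join and meet, which is enough for finite lattices.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Lattice:
    """A lattice given by its order, binary join/meet, bottom and top."""
    leq: Callable[[Any, Any], bool]
    join: Callable[[Any, Any], Any]
    meet: Callable[[Any, Any], Any]
    bot: Any
    top: Any


def cart(l1: Lattice, l2: Lattice) -> Lattice:
    """Cartesian product: every operation is applied component by component."""
    return Lattice(
        leq=lambda x, y: l1.leq(x[0], y[0]) and l2.leq(x[1], y[1]),
        join=lambda x, y: (l1.join(x[0], y[0]), l2.join(x[1], y[1])),
        meet=lambda x, y: (l1.meet(x[0], y[0]), l2.meet(x[1], y[1])),
        bot=(l1.bot, l2.bot),
        top=(l1.top, l2.top),
    )


def powerset_lattice(universe: frozenset) -> Lattice:
    """Subsets of a finite universe ordered by inclusion."""
    return Lattice(
        leq=lambda a, b: a <= b,
        join=lambda a, b: a | b,
        meet=lambda a, b: a & b,
        bot=frozenset(),
        top=universe,
    )


# Two-point lattice False <= True.
BOOL = Lattice(
    leq=lambda a, b: (not a) or b,
    join=lambda a, b: a or b,
    meet=lambda a, b: a and b,
    bot=False,
    top=True,
)

product = cart(powerset_lattice(frozenset({1, 2})), BOOL)
joined = product.join((frozenset({1}), False), (frozenset({2}), True))
```

Because every operation delegates to the component lattices, the same `cart` function can be nested to build products of more than two lattices.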

8. Disjunctive completion • For a complete lattice $L = (D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$ • Define the powerset lattice $L_\vee = (2^D, \sqsubseteq_\vee, \sqcup_\vee, \sqcap_\vee, \bot_\vee, \top_\vee)$ • $\sqsubseteq_\vee = ?$ $\sqcup_\vee = ?$ $\sqcap_\vee = ?$ $\bot_\vee = ?$ $\top_\vee = ?$ • Lemma: $L_\vee$ is a complete lattice • $L_\vee$ contains all subsets of $D$, which can be thought of as disjunctions of the corresponding predicates • Define the disjunctive completion constructor $L_\vee = Disj(L)$

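9. Relational product of lattices • $L_1 = (D_1, \sqsubseteq_1, \sqcup_1, \sqcap_1, \bot_1, \top_1)$ and $L_2 = (D_2, \sqsubseteq_2, \sqcup_2, \sqcap_2, \bot_2, \top_2)$ • Define $L_{rel} = (2^{D_1 \times D_2}, \sqsubseteq_{rel}, \sqcup_{rel}, \sqcap_{rel}, \bot_{rel}, \top_{rel})$ as follows: • $L_{rel} = Disj(Cart(L_1, L_2))$ • Lemma: $L_{rel}$ is a complete lattice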

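10. Finite maps • For a complete lattice $L = (D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$ and finite set $V$ • Define the poset $L_{V \to L} = (V \to D, \sqsubseteq_{V \to L}, \sqcup_{V \to L}, \sqcap_{V \to L}, \bot_{V \to L}, \top_{V \to L})$ as follows: • $f_1 \sqsubseteq_{V \to L} f_2$ iff for all $v \in V$, $f_1(v) \sqsubseteq f_2(v)$ • $\sqcup_{V \to L} = ?$ $\sqcap_{V \to L} = ?$ $\bot_{V \to L} = ?$ $\top_{V \to L} = ?$ • Lemma: $L_{V \to L}$ is a complete lattice • Define the map constructor $L_{V \to L} = Map(V, L)$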
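As with the Cartesian product, a natural reading of the map lattice is pointwise: compare and combine the values stored under each key. A minimal sketch, assuming maps are Python dictionaries over the same key set and that the underlying order and join are passed in as functions (`map_bottom`, `map_leq`, and `map_join` are names made up for this illustration):

```python
def map_bottom(keys, bottom):
    """The least map: every key is bound to the bottom of the underlying lattice."""
    return {v: bottom for v in keys}


def map_leq(f1, f2, leq):
    """f1 is below f2 iff f1(v) is below f2(v) for every key v."""
    return all(leq(f1[v], f2[v]) for v in f1)


def map_join(f1, f2, join):
    """Pointwise join of two maps over the same keys."""
    return {v: join(f1[v], f2[v]) for v in f1}


# Example: three CFG nodes mapped to sets of values of x (powerset lattice).
subset = lambda a, b: a <= b
union = lambda a, b: a | b
nodes = ["entry", "loop", "exit"]

f1 = {"entry": frozenset({1}), "loop": frozenset(), "exit": frozenset()}
f2 = {"entry": frozenset({1, 2}), "loop": frozenset({0}), "exit": frozenset()}
joined = map_join(f1, f2, union)
```

Here `f1` lies below `f2`, so their join is `f2` itself.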

11. The collecting lattice • Lattice for a given control-flow node $v$: $L_v = (2^{State}, \subseteq, \cup, \cap, \emptyset, State)$ • Lattice for entire control-flow graph with nodes $V$: $L_{CFG} = Map(V, L_v)$ • We will use this lattice as a baseline for static analysis and define abstractions of its elements

12. Implementation

13. Software package: paver142 • Built on top of the Soot compiler framework for Java • Download from web-site • Includes all necessary Soot jar files

14. Example analyses • Soot-specific utilities • Infrastructure for implementing static analysis

15. Existing analyses

16. Implementing abstract domains

17. Variable equalities analysis

18. Today • Solving monotone systems • Fixed points • Vanilla static analysis algorithm • Chaotic iteration

19. Abstract interpretation via abstraction • Generalizes axiomatic verification • [Diagram: abstraction maps each set of states to an abstract representation of sets of states; statement S acts on sets of states through the collecting semantics and on abstract representations through the abstract semantics; the axiomatic counterpart is $\{P\}\ S\ \{Q\}$ with $sp(S, P)$]

20. Abstract interpretation via concretization • [Diagram: concretization maps each abstract representation of sets of states to a set of states; statement S acts on abstract representations through the abstract semantics and on sets of states through the collecting semantics; the axiomatic counterpart relates $\{P\}\ S\ \{Q\}$ to $models(P)$, $models(sp(S, P))$, and $models(Q)$]

21. Missing knowledge • Collecting semantics • Abstract semantics • Connection between collecting semantics and abstract semantics • Algorithm to compute abstract semantics

22. Review of collecting semantics

23. The collecting lattice (sets of states) • Lattice for a given control-flow node $v$: $L_v = (2^{State}, \subseteq, \cup, \cap, \emptyset, State)$ • Lattice for entire control-flow graph with nodes $V$: $L_{CFG} = Map(V, L_v)$ • We will use this lattice as a baseline for static analysis and define abstractions of its elements

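24. Collecting semantics as equation system • [Figure: CFG of the loop with nodes entry, if x > 0, x := x-1, and exit, with edges labeled by the variables R] • A vector of variables R[0, 1, 2, 3, 4] • $R[0] = \{x \in \mathbb{Z}\}$ // established input • $R[1] = R[0] \cup R[4]$ • $R[2] = R[1] \cap \{s \mid s(x) > 0\}$ (semantic function for assume x>0) • $R[3] = R[1] \cap \{s \mid s(x) \le 0\}$ • $R[4] = [\![x := x-1]\!]\ R[2]$ (semantic function for x:=x-1 lifted to sets of states) • A (recursive) system of equations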

25. General definition • [Figure: the loop CFG with nodes entry, if x > 0, x := x-1, and exit, with edges labeled by the variables R] • A vector of variables R[0, …, k], one per input/output of a node • R[0] is for entry • For node n with multiple predecessors add equation $R[n] = \bigcup \{R[k] \mid k \text{ is a predecessor of } n\}$ • For an atomic operation node S with input R[m] and output R[n] add equation $R[n] = [\![S]\!]\ R[m]$ • Transform if b then S1 else S2 to (assume b; S1) or (assume ¬b; S2)

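26. Static analysis • Given a system of equations for the collecting semantics: • $R[0] = \{x \in \mathbb{Z}\}$ // established input • $R[1] = R[0] \cup R[4]$ • $R[2] = [\![\text{assume } x>0]\!]\ R[1]$ • $R[3] = [\![\text{assume } x \le 0]\!]\ R[1]$ • $R[4] = [\![x := x-1]\!]\ R[2]$ • A static analysis solves a corresponding system of equations over an abstract domain: • $R^\#[0] = \{x \in \mathbb{Z}\}^\#$ • $R^\#[1] = R^\#[0] \sqcup R^\#[4]$ • $R^\#[2] = [\![\text{assume } x>0]\!]^\#\ R^\#[1]$ • $R^\#[3] = [\![\text{assume } x \le 0]\!]^\#\ R^\#[1]$ • $R^\#[4] = [\![x := x-1]\!]^\#\ R^\#[2]$ • Questions: • What is the relation between the solutions? Next lecture • How do you solve the second system? This lecture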

27. Solving equation systems

28. Equation systems in general • Running example: $R[0] = \{x \in \mathbb{Z}\}$ // established input; $R[1] = R[0] \cup R[4]$; $R[2] = R[1] \cap \{s \mid s(x) > 0\}$; $R[3] = R[1] \cap \{s \mid s(x) \le 0\}$; $R[4] = [\![x := x-1]\!]\ R[2]$ • Let L be a complete lattice $(D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$ • Let R be a vector of analysis variables $R[0, …, n] \in D \times … \times D$ • Let F be a vector of functions of the type $F[i] : R[0, …, n] \to R[0, …, n]$ • A system of equations $R[0] = f[0](R[0], …, R[n])$, …, $R[n] = f[n](R[0], …, R[n])$ • For $R[i] = f[i](R)$, usually f[i] reads only a small subset D[i] of R; we say that R[i] depends on D[i] • In vector notation R = F(R) • Questions: • Does a solution always exist? • If so, is it unique? • If so, is it computable?

29. Equation systems in general • Let L be a complete lattice $(D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$ • Let R be a vector of analysis variables $R[0, …, n] \in D \times … \times D$ • Let F be a vector of functions of the type $F[i] : R[0, …, n] \to R[0, …, n]$ • A system of equations $R[0] = f[0](R[0], …, R[n])$, …, $R[n] = f[n](R[0], …, R[n])$ • In vector notation R = F(R) • Questions: • Does a solution always exist? If it does, it is a fixed point of this equation • If so, is it unique? • If so, is it computable?

30. Monotone systems

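31. Monotone functions • Let $L_1 = (D_1, \sqsubseteq)$ and $L_2 = (D_2, \sqsubseteq)$ be two posets • A function $f : D_1 \to D_2$ is monotone if for every pair $x, y \in D_1$, $x \sqsubseteq y$ implies $f(x) \sqsubseteq f(y)$ • A special case: $L_1 = L_2 = (D, \sqsubseteq)$ and $f : D \to D$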
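On a finite poset the definition can be checked by brute force, which is a handy sanity test when writing transformers. The sketch below does this on the powerset of {0, 1, 2} ordered by inclusion; `powerset`, `is_monotone`, `add_zero`, and `complement` are illustrative names chosen here.

```python
from itertools import combinations

UNIVERSE = frozenset({0, 1, 2})


def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_monotone(f, elements, leq):
    """Check that x <= y implies f(x) <= f(y) for every pair of elements."""
    return all(leq(f(x), f(y))
               for x in elements for y in elements if leq(x, y))


def subset_leq(a, b):
    return a <= b


def add_zero(s):
    # Monotone: adding a fixed element preserves inclusion.
    return s | {0}


def complement(s):
    # Not monotone: a larger input yields a smaller output.
    return UNIVERSE - s


subsets = powerset(UNIVERSE)
add_zero_ok = is_monotone(add_zero, subsets, subset_leq)
complement_ok = is_monotone(complement, subsets, subset_leq)
```

The check is quadratic in the number of elements, so it is only meant for small test lattices.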

32. Monotone function • [Figure: f maps poset $L_1$ to poset $L_2$; elements $x \sqsubseteq y$ of $L_1$ are mapped to $f(x) \sqsubseteq f(y)$ in $L_2$]

33. Important cases of monotonicity • Join: $f(X, Y) = X \sqcup Y$ is monotone in each operand • Prove it! • Set lifting function: for a set X and any function g, $F(X) = \{ g(x) \mid x \in X \}$ is monotone w.r.t. $\subseteq$ • Prove it! • Notice that the collecting semantics function is defined in terms of • Join (set union) • Semantic function for atomic statements lifted to sets of states • Conclusion: collecting semantics function is monotone

34. Fixed points

35. Extensive/reductive functions • Let $L = (D, \sqsubseteq)$ be a poset • A function $f : D \to D$ is extensive if for every $x \in D$, we have that $x \sqsubseteq f(x)$ • A function $f : D \to D$ is reductive if for every $x \in D$, we have that $x \sqsupseteq f(x)$

36. Fixed points • [Figure, top to bottom: $\top$, Red(f), gfp, Fix(f), lfp, Ext(f), $f^n(\bot)$, $\bot$] • Does a solution always exist? Yes • If so, is it unique? No, but it has least/greatest solutions • If so, is it computable? Under some conditions… • $L = (D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$ • $f : D \to D$ monotone • $Fix(f) = \{ d \mid f(d) = d \}$ • $Red(f) = \{ d \mid f(d) \sqsubseteq d \}$ • $Ext(f) = \{ d \mid d \sqsubseteq f(d) \}$ • Theorem [Tarski 1955] • $lfp(f) = \sqcap Fix(f) = \sqcap Red(f) \in Fix(f)$ • $gfp(f) = \sqcup Fix(f) = \sqcup Ext(f) \in Fix(f)$
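The theorem can be observed directly on a small finite lattice by enumerating Fix(f), Red(f), and Ext(f). The sketch below uses the powerset of {0, 1, 2} and a monotone function `f` invented for this illustration; it computes lfp as the meet (intersection) of Red(f) and gfp as the join (union) of Ext(f).

```python
from functools import reduce
from itertools import combinations

UNIVERSE = frozenset({0, 1, 2})


def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def f(s):
    # Hypothetical monotone function: always produce 0, keep 2 if present.
    return frozenset({0}) | (s & frozenset({2}))


def meet_all(sets):
    """Meet of a family of subsets; the meet of the empty family is top."""
    return reduce(lambda a, b: a & b, sets, UNIVERSE)


def join_all(sets):
    """Join of a family of subsets; the join of the empty family is bottom."""
    return reduce(lambda a, b: a | b, sets, frozenset())


elements = powerset(UNIVERSE)
fix = [d for d in elements if f(d) == d]
red = [d for d in elements if f(d) <= d]
ext = [d for d in elements if d <= f(d)]

lfp = meet_all(red)
gfp = join_all(ext)
```

For this `f` the fixed points are {0} and {0, 2}, so the least and greatest fixed points differ, matching the slide's remark that solutions need not be unique.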

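37. Fixed point example • [Figure: the loop CFG (entry, if x>0, x := x-1, exit; program points 0 to 4) annotated side by side with d and F(d), where F(d) = d, so d is a fixed point; annotations include $x \in \mathbb{Z}$ at points 0 and 1 and $\{x > 0\}$] • Equations: $R[0] = \{x \in \mathbb{Z}\}$; $R[1] = R[0] \cup R[4]$; $R[2] = R[1] \cap \{s \mid s(x) > 0\}$; $R[3] = R[1] \cap \{s \mid s(x) \le 0\}$; $R[4] = [\![x := x-1]\!]\ R[2]$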

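38. Pre-fixed point example • [Figure: the loop CFG (entry, if x>0, x := x-1, exit; program points 0 to 4) annotated side by side with F(d) and a pre-fixed point d; annotations include $x \in \mathbb{Z}$ at points 0 and 1, $\{x > 0\}$, and $\{x < -5\}$] • Equations: $R[0] = \{x \in \mathbb{Z}\}$; $R[1] = R[0] \cup R[4]$; $R[2] = R[1] \cap \{s \mid s(x) > 0\}$; $R[3] = R[1] \cap \{s \mid s(x) \le 0\}$; $R[4] = [\![x := x-1]\!]\ R[2]$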

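39. Post-fixed point example • [Figure: the loop CFG (entry, if x>0, x := x-1, exit; program points 0 to 4) annotated side by side with F(d) and a post-fixed point d; annotations include $x \in \mathbb{Z}$ at points 0 and 1, $\{x > 0\}$, and $\{x < 9\}$] • Equations: $R[0] = \{x \in \mathbb{Z}\}$; $R[1] = R[0] \cup R[4]$; $R[2] = R[1] \cap \{s \mid s(x) > 0\}$; $R[3] = R[1] \cap \{s \mid s(x) \le 0\}$; $R[4] = [\![x := x-1]\!]\ R[2]$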

40. Recap • A system of equations of the form R = F(R) where R draws its elements from a complete lattice $L = (D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$ • Tarski's fixed point theorem ensures that there exists a least fixed point: $lfp(f) = \sqcap Fix(f)$ • However, it is not an algorithm since D is often infinite • Even when D is finite, enumerating all fixed points to take their meet is ineffective • We need a more constructive way of computing lfp(f)

41. Computing the least fixed point

42. Continuous functions • Let $L = (D, \sqsubseteq, \sqcup, \bot)$ be a complete partial order • Every ascending chain has a least upper bound • A function f is continuous if for every increasing chain $Y \subseteq D^*$, $f(\sqcup Y) = \sqcup \{ f(y) \mid y \in Y \}$ • Lemma: if f is continuous then f is monotone • Proof: assume $x \sqsubseteq y$. Therefore $x \sqcup y = y$. Applying continuity to the chain $x \sqsubseteq y$ gives $f(y) = f(x \sqcup y) = f(x) \sqcup f(y)$, which means $f(x) \sqsubseteq f(y)$

43. Kleene's fixed point theorem • Let $L = (D, \sqsubseteq, \sqcup, \bot)$ be a complete partial order and let $f : D \to D$ be a continuous function; then $lfp(f) = \bigsqcup_{n \in \mathbb{N}} f^n(\bot)$ • That is, take the ascending chain $\bot \sqsubseteq f(\bot) \sqsubseteq f(f(\bot)) \sqsubseteq … \sqsubseteq f^n(\bot) \sqsubseteq …$ and return the supremum • Why is this an ascending chain? • But how do you know if a function f is continuous?

44. Continuity and ACC condition • Let $L = (D, \sqsubseteq, \sqcup, \bot)$ be a complete partial order • Every ascending chain has a least upper bound • L satisfies the ascending chain condition (ACC) if every ascending chain eventually stabilizes: $d_0 \sqsubseteq d_1 \sqsubseteq … \sqsubseteq d_n = d_{n+1} = d_{n+2} = …$ • Lemma: Monotone functions on posets satisfying ACC are continuous • Proof: We need to show that $f(\sqcup Y) = \sqcup \{ f(y) \mid y \in Y \}$ • Every ascending chain Y eventually stabilizes, $d_0 \sqsubseteq d_1 \sqsubseteq … \sqsubseteq d_n = d_{n+1} = …$, hence $d_n$ is the least upper bound of $\{d_0, d_1, …, d_n\}$, thus $f(\sqcup Y) = f(d_n)$ • From monotonicity of f we get that $f(d_0) \sqsubseteq f(d_1) \sqsubseteq … \sqsubseteq f(d_n) = f(d_{n+1}) = …$ Hence $f(d_n)$ is the least upper bound of $\{f(d_0), f(d_1), …, f(d_n)\}$, thus $\sqcup \{ f(y) \mid y \in Y \} = f(d_n)$

45. Resulting algorithm • Mathematical definition: $lfp(f) = \bigsqcup_{n \in \mathbb{N}} f^n(\bot)$ • Algorithm: d := $\bot$; while f(d) ≠ d do d := f(d); return d • [Figure: the ascending chain $\bot$, $f(\bot)$, $f^2(\bot)$, …, $f^n(\bot)$ reaching lfp] • Kleene's fixed point theorem gives a constructive method for computing lfp(f) over a poset with ACC when f is monotone
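The loop on this slide translates almost literally into code. The sketch below assumes lattice elements support `!=` and that the lattice satisfies ACC, so the loop terminates; the graph `EDGES` and the function `step` are a made-up example (nodes reachable from node 0) rather than anything from the lecture.

```python
def kleene_lfp(f, bottom):
    """Iterate bottom, f(bottom), f(f(bottom)), ... until the value is stable."""
    d = bottom
    while f(d) != d:
        d = f(d)
    return d


# Hypothetical example: which nodes are reachable from node 0?
EDGES = {0: {1}, 1: {2}, 2: {0}, 3: {4}, 4: set()}


def step(reached):
    # Monotone in `reached`: a larger input set yields a larger output set.
    successors = {m for n in reached for m in EDGES[n]}
    return frozenset({0}) | successors


reachable = kleene_lfp(step, frozenset())
```

The sketch evaluates `f(d)` twice per round for fidelity to the slide; caching the result in a local variable would halve the work.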

46. Our very first generic static analysis algorithm

47. Vanilla algorithm • Problem definition: • Lattice of properties L of finite height (ACC) • For each statement define a monotone transformer • Preparation: • Parse program into AST • Convert AST into CFG • Generate system of equations from CFG • Analysis: • Initialize each analysis variable with $\bot$ • Update all analysis variables of each equation until reaching a fixed point (non-incremental: most variables don't change)

48. Chaotic iteration

49. Chaotic iteration • Input: • A cpo $L = (D, \sqsubseteq, \sqcup, \bot)$ satisfying ACC • $L^n = L \times L \times … \times L$ • A monotone function $f : D^n \to D^n$ • A system of equations $\{ X[i] = f[i](X) \mid 1 \le i \le n \}$ • Output: lfp(f) • A worklist-based algorithm: • for i := 1 to n do X[i] := $\bot$ • WL = {1, …, n} • while WL ≠ ∅ do • i := pop WL // choose index non-deterministically • N := F[i](X) • if N ≠ X[i] then X[i] := N and add all the indexes that directly depend on i to WL (X[j] depends on X[i] if F[j] contains X[i]) • return X
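A minimal sketch of the worklist algorithm, applied to the loop equations of slide 24. Two assumptions are made here to keep the lattice finite and the code runnable: the entry states are restricted to x in {0, 1, 2, 3} instead of all integers, and a state is represented by the value of x alone. Indexes are 0-based, and `chaotic_iteration`, `funcs`, and `dependents` are names chosen for this illustration.

```python
def chaotic_iteration(funcs, dependents, bottom):
    """Solve X[i] = funcs[i](X) by re-evaluating only equations whose inputs changed."""
    x = [bottom] * len(funcs)
    worklist = set(range(len(funcs)))
    while worklist:
        i = worklist.pop()               # arbitrary choice of index
        new = funcs[i](x)
        if new != x[i]:
            x[i] = new
            worklist |= dependents[i]    # every j whose equation reads X[i]
    return x


# Assumption: bounded entry states so that the powerset lattice is finite.
ENTRY = frozenset(range(4))

funcs = [
    lambda r: ENTRY,                                   # R[0]: established input
    lambda r: r[0] | r[4],                             # R[1] = R[0] U R[4]
    lambda r: frozenset(s for s in r[1] if s > 0),     # R[2]: assume x > 0
    lambda r: frozenset(s for s in r[1] if s <= 0),    # R[3]: assume x <= 0
    lambda r: frozenset(s - 1 for s in r[2]),          # R[4]: x := x - 1
]

# dependents[i] lists the equations that read R[i].
dependents = [{1}, {2, 3}, {4}, set(), {1}]

solution = chaotic_iteration(funcs, dependents, frozenset())
```

Since every function is monotone and the lattice is finite, the result does not depend on the order in which `set.pop` happens to pick indexes.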

50. Chaotic iteration for static analysis • Specialize chaotic iteration for programs • Create a CFG for program • Choose a cpo of properties for the static analysis to infer: $L = (D, \sqsubseteq, \sqcup, \bot)$ • Define variables R[0, …, n] for input/output of each CFG node such that $R[i] \in D$ • For each node v let $v_{out}$ be the variable at the output of that node: $v_{out} = F[v](\bigsqcup \{ u_{out} \mid (u, v) \text{ is a CFG edge} \})$ • Make sure each F[v] is monotone • Variable dependence determined by outgoing edges in CFG
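The specialization can be sketched on the lecture's loop program with a simple abstract domain. The domain used below, sets of signs of x ordered by inclusion, is an assumption made for this illustration and is not taken from the slides; the node names, `transfer`, `edges`, and `analyze` are likewise hypothetical. Each node's input is the join of its predecessors' outputs, and a change at a node re-schedules its CFG successors.

```python
NEG, ZERO, POS = "-", "0", "+"
TOP = frozenset({NEG, ZERO, POS})   # x may have any sign
BOT = frozenset()                   # unreachable


def decrement(a):
    """Abstract transformer for x := x - 1 on sets of signs."""
    out = set()
    if POS in a:
        out |= {ZERO, POS}          # a positive x becomes zero or stays positive
    if ZERO in a or NEG in a:
        out.add(NEG)                # zero or negative x becomes negative
    return frozenset(out)


# One monotone transformer per CFG node (assumed node names).
transfer = {
    "entry": lambda a: TOP,                  # established input: any integer
    "pos": lambda a: a & {POS},              # assume x > 0
    "nonpos": lambda a: a & {NEG, ZERO},     # assume x <= 0 (loop exit)
    "dec": decrement,                        # x := x - 1
}
edges = [("entry", "pos"), ("entry", "nonpos"),
         ("pos", "dec"), ("dec", "pos"), ("dec", "nonpos")]


def analyze(transfer, edges):
    preds = {v: [] for v in transfer}
    succs = {v: [] for v in transfer}
    for u, v in edges:
        preds[v].append(u)
        succs[u].append(v)
    out = {v: BOT for v in transfer}
    worklist = list(transfer)
    while worklist:
        v = worklist.pop()
        # Join (set union) of the outputs of all predecessors of v.
        joined = frozenset().union(*(out[u] for u in preds[v]))
        new = transfer[v](joined)
        if new != out[v]:
            out[v] = new
            worklist.extend(succs[v])        # dependence follows outgoing edges
    return out


result = analyze(transfer, edges)
```

In the computed solution the loop exit (`nonpos`) is annotated with {-, 0}: the sign domain cannot express that x is exactly 0 there, which the bounded collecting semantics would show.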