This lecture discusses the application of abstract interpretation in solving monotone systems of equations, using the example of a static analysis algorithm. Topics include fixed points, continuous functions, Kleene's fixed point theorem, and the vanilla algorithm.
Program Analysis and Verification, Spring 2014. Lecture 11: Abstract Interpretation III. Roman Manevich, Ben-Gurion University
Previously • Solving monotone systems • Fixed-points • Vanilla static analysis algorithm • Chaotic iteration
Static analysis • Given a system of equations for the collecting semantics: • R[0] = {x∈Z} // established input • R[1] = R[0] ∪ R[4] • R[2] = ⟦assume x>0⟧ R[1] • R[3] = ⟦assume x≤0⟧ R[1] • R[4] = ⟦x:=x-1⟧ R[2] • A static analysis solves a corresponding system of equations over an abstract domain: • R[0]# = {x∈Z}# • R[1]# = R[0]# ⊔ R[4]# • R[2]# = ⟦assume x>0⟧# R[1]# • R[3]# = ⟦assume x≤0⟧# R[1]# • R[4]# = ⟦x:=x-1⟧# R[2]# • Questions: • How do you solve the second system? Chaotic iteration • What is the relation between the solutions? This lecture
Monotone function • [Figure: f maps lattice L1 to lattice L2; whenever x ⊑ y in L1, f(x) ⊑ f(y) in L2]
Important cases of monotonicity • Join: f(X, Y) = X ⊔ Y is monotone in each operand • Prove it! • Set lifting function: for a set X and any function g, F(X) = { g(x) | x ∈ X } is monotone w.r.t. ⊆ • Prove it! • Notice that the collecting semantics function is defined in terms of • Join (set union) • Semantic function for atomic statements lifted to sets of states • Conclusion: collecting semantics function is monotone
Fixed points • Does a solution always exist? Yes • If so, is it unique? No, but it has least/greatest solutions • If so, is it computable? Under some conditions… • L = (D, ⊑, ⊔, ⊓, ⊥, ⊤) • f : D → D monotone • Fix(f) = { d | f(d) = d } • Red(f) = { d | f(d) ⊑ d } • Ext(f) = { d | d ⊑ f(d) } • Theorem [Tarski 1955] • lfp(f) = ⊓Fix(f) = ⊓Red(f) ∈ Fix(f) • gfp(f) = ⊔Fix(f) = ⊔Ext(f) ∈ Fix(f) • [Figure: the lattice from ⊥ to ⊤, with Ext(f) at the bottom, Red(f) at the top, Fix(f) between lfp and gfp, and the chain fⁿ(⊥) rising from ⊥]
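As a sanity check of Tarski's theorem, the following Python sketch enumerates Fix(f), Red(f) and Ext(f) by brute force on a tiny powerset lattice. The universe {0, 1, 2} and the function f are hypothetical choices made for illustration; they are not taken from the lecture.

```python
from itertools import combinations

UNIVERSE = (0, 1, 2)  # hypothetical universe; the lattice is (2^UNIVERSE, ⊆)
D = [frozenset(c) for r in range(len(UNIVERSE) + 1)
     for c in combinations(UNIVERSE, r)]

def f(xs):
    """A monotone function chosen for illustration: always emit 0,
    emit 1 once 0 is present, and keep 2 if it is present."""
    out = {0}
    if 0 in xs:
        out.add(1)
    if 2 in xs:
        out.add(2)
    return frozenset(out)

fix = [d for d in D if f(d) == d]
red = [d for d in D if f(d) <= d]   # f(d) ⊑ d
ext = [d for d in D if d <= f(d)]   # d ⊑ f(d)

lfp = frozenset.intersection(*red)  # meet of Red(f)
gfp = frozenset().union(*ext)       # join of Ext(f)
```

On this example the meet of Red(f) and the join of Ext(f) both land inside Fix(f), as the theorem predicts.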
Continuous functions • Let L = (D, ⊑, ⊔, ⊥) be a complete partial order • Every ascending chain has a least upper bound • A function f is continuous if for every increasing chain Y ⊆ D*, f(⊔Y) = ⊔{ f(y) | y ∈ Y } • Lemma: if f is continuous then f is monotone • Proof: assume x ⊑ y. Therefore x ⊔ y = y. Then f(y) = f(x ⊔ y) = f(x) ⊔ f(y), which means f(x) ⊑ f(y)
Kleene's fixed point theorem • Let L = (D, ⊑, ⊔, ⊥) be a complete partial order and f : D → D a continuous function; then lfp(f) = ⊔{ fⁿ(⊥) | n ∈ N } • That is, take the ascending chain ⊥ ⊑ f(⊥) ⊑ f(f(⊥)) ⊑ … ⊑ fⁿ(⊥) ⊑ … and return the supremum • Why is this an ascending chain? • But how do you know if a function f is continuous?
Continuity and ACC condition • Let L = (D, ⊑, ⊔, ⊥) be a complete partial order • Every ascending chain has a least upper bound • L satisfies the ascending chain condition (ACC) if every ascending chain eventually stabilizes: d0 ⊑ d1 ⊑ … ⊑ dn = dn+1 = dn+2 = … • Lemma: Monotone functions on posets satisfying ACC are continuous
Resulting algorithm • Mathematical definition: lfp(f) = ⊔{ fⁿ(⊥) | n ∈ N } • Algorithm:
    d := ⊥
    while f(d) ≠ d do
        d := f(d)
    return d
• Kleene's fixed point theorem gives a constructive method for computing lfp(f) over a poset with ACC when f is monotone • [Figure: the chain ⊥ ⊑ f(⊥) ⊑ f²(⊥) ⊑ … ⊑ fⁿ(⊥) climbing to lfp(f)]
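The iteration above can be sketched in Python as follows. The function step and the powerset lattice it works on are hypothetical examples (the reachable values of a counter that starts at 0 and is incremented while below 5); the sketch assumes f is monotone and the lattice satisfies ACC, otherwise the loop may not terminate.

```python
from typing import Callable, TypeVar

D = TypeVar("D")

def kleene_lfp(f: Callable[[D], D], bottom: D) -> D:
    """Iterate bottom, f(bottom), f(f(bottom)), ... until the value is stable."""
    d = bottom
    while f(d) != d:
        d = f(d)
    return d

def step(xs: frozenset) -> frozenset:
    # Hypothetical monotone function on (subsets of {0..5}, ⊆):
    # 0 is always reachable, and x+1 is reachable from any x < 5.
    return frozenset({0}) | frozenset(x + 1 for x in xs if x < 5)

reachable = kleene_lfp(step, frozenset())  # bottom is the empty set
```

The chain here is ∅, {0}, {0, 1}, and so on; it stabilizes after finitely many steps because the lattice is finite.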
Vanilla algorithm • Problem definition: • Lattice of properties L of finite height (ACC) • For each statement define a monotone transformer • Preparation: • Parse program into AST • Convert AST into CFG • Generate system of equations from CFG • Analysis: • Initialize each analysis variable with ⊥ • Update all analysis variables of each equation until reaching a fixed point • Note: the update step is non-incremental; most variables don't change
Chaotic iteration • Input: • A cpo L = (D, ⊑, ⊔, ⊥) satisfying ACC • Lⁿ = L × L × … × L • A monotone function f : Dⁿ → Dⁿ • A system of equations { X[i] = F[i](X) | 1 ≤ i ≤ n }, where F[i] is the i-th component of f • Output: lfp(f) • A worklist-based algorithm:
    for i := 1 to n do X[i] := ⊥
    WL := {1,…,n}
    while WL ≠ ∅ do
        i := pop WL // choose index non-deterministically
        N := F[i](X)
        if N ≠ X[i] then
            X[i] := N
            add all the indexes that directly depend on i to WL
            (X[j] depends on X[i] if F[j] contains X[i])
    return X
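A minimal Python sketch of the worklist algorithm, applied to the five-equation system R[0]..R[4] from the "Static analysis" slide. The abstract domain used here (sets of possible signs of x) and the names ALL, DEC, F and DEPS are assumptions made for this illustration; the lecture does not fix a particular domain at this point.

```python
ALL = frozenset("-0+")  # abstract value: a set of possible signs of x
DEC = {"+": "0+", "0": "-", "-": "-"}  # possible signs of x-1, per sign of x

def decrement(signs):
    """Abstract effect of x := x-1 on a set of signs."""
    return frozenset(s for sign in signs for s in DEC[sign])

# One right-hand side per equation, indexed like R[0]..R[4].
F = [
    lambda X: ALL,                      # R[0]: any integer input
    lambda X: X[0] | X[4],              # R[1] = R[0] joined with R[4]
    lambda X: X[1] & frozenset("+"),    # R[2] = assume x > 0
    lambda X: X[1] & frozenset("-0"),   # R[3] = assume x <= 0
    lambda X: decrement(X[2]),          # R[4] = x := x-1
]
# DEPS[i] lists the equations whose right-hand side mentions X[i].
DEPS = {0: {1}, 1: {2, 3}, 2: {4}, 3: set(), 4: {1}}

def chaotic_iteration(F, deps, bottom):
    X = [bottom] * len(F)
    worklist = set(range(len(F)))
    while worklist:
        i = worklist.pop()          # any choice of index is allowed
        new = F[i](X)
        if new != X[i]:
            X[i] = new
            worklist |= deps[i]     # re-evaluate equations that read X[i]
    return X

R = chaotic_iteration(F, DEPS, frozenset())
```

Because every right-hand side is monotone and the lattice is finite, the result does not depend on the order in which indices are popped.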
Required knowledge • Collecting semantics • Abstract semantics (over lattices) • Algorithm to compute abstract semantics (chaotic iteration) • Connection between collecting semantics and abstract semantics • Abstract transformers
Today • Galois connections • Abstract transformers • Global soundness
Recap • We defined a reference semantics: the collecting semantics • We defined an abstract semantics for a given lattice and abstract transformers • We defined an algorithm to compute the abstract least fixed point when transformers are monotone and the lattice obeys ACC • Questions: • Does the algorithm terminate? • What is the connection between the two least fixed points? • Transformer monotonicity is required for termination; what should we require for correctness?
Handling non-monotone transformers • Mathematical definition: lfp(f) = ⊔{ fⁿ(⊥) | n ∈ N } • Algorithm:
    d := ⊥
    while f(d) ≠ d do
        d := f(d)
    return d
• Kleene's fixed point theorem gives a constructive method for computing lfp(f) over a poset with ACC when f is monotone • Monotonicity ensures ⊥ ⊑ f(⊥) ⊑ … ⊑ fⁿ(⊥) ⊑ … is an ascending chain • What if f is not necessarily monotone? • How can we ensure termination?
Handling non-monotone transformers • Mathematical definition: lfp(f) = ⊔{ fⁿ(⊥) | n ∈ N } ⊑ ⊔{ f'ⁿ(⊥) | n ∈ N } • Define f'(d) = d ⊔ f(d) • Revised algorithm:
    d := ⊥
    while f'(d) ≠ d do
        d := f'(d)
    return d
• Now f' is extensive: d ⊑ d ⊔ f(d) = f'(d), and so ⊥ ⊑ f'(⊥) ⊑ … ⊑ f'ⁿ(⊥) ⊑ … is an ascending chain • Result is not necessarily the least fixed point; we get a (post)fixed point in finite time (ACC)
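The revised algorithm can be sketched in Python as below. The function f is a hypothetical non-monotone example on subsets of {0, 1}: iterating f alone from the empty set oscillates between {1} and {0} forever, while iterating f'(d) = d ⊔ f(d) stabilizes.

```python
def iterate_extensive(f, join, bottom):
    """Iterate f'(d) = join(d, f(d)) from bottom until the value is stable."""
    d = bottom
    while True:
        nxt = join(d, f(d))   # f'(d) = d ⊔ f(d)
        if nxt == d:
            return d
        d = nxt

def f(xs):
    # Hypothetical non-monotone function: f(∅) = {1} but f({1}) = {0}.
    return frozenset({0}) if 1 in xs else frozenset({1})

result = iterate_extensive(f, frozenset.union, frozenset())
```

The value returned satisfies f(result) ⊑ result, so it is a post-fixed point of f, although here it is not a fixed point of f itself.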
Galois Connection • Given two complete lattices C = (D_C, ⊑_C, ⊔_C, ⊓_C, ⊥_C, ⊤_C) (concrete domain) and A = (D_A, ⊑_A, ⊔_A, ⊓_A, ⊥_A, ⊤_A) (abstract domain) • A Galois Connection (GC) is a quadruple (C, α, γ, A) that relates C and A via the monotone functions • The abstraction function α : D_C → D_A • The concretization function γ : D_A → D_C • For every concrete element c ∈ D_C and abstract element a ∈ D_A: α(γ(a)) ⊑_A a and c ⊑_C γ(α(c)) • Alternatively α(c) ⊑_A a iff c ⊑_C γ(a)
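Galois Connection: c ⊑_C γ(α(c)) • α(c) is the most precise (least) element in A representing c • [Figure: c in C is mapped by α to α(c) in A, and γ maps α(c) back to γ(α(c)), which lies above c]
Galois Connection: α(γ(a)) ⊑_A a • γ(a) is what a represents in C (its meaning) • [Figure: a in A is mapped by γ to γ(a) in C, and α maps γ(a) back to α(γ(a)), which lies below a]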
Example: lattice of equalities • Concrete lattice: C = (2^State, ⊆, ∪, ∩, ∅, State) • Abstract lattice: EQ = { x=y | x, y ∈ Var }, A = (2^EQ, ⊇, ∩, ∪, EQ, ∅) • Treat elements of A as both formulas and sets of constraints • Useful for copy propagation, a compiler optimization • α(X) = ? • γ(Y) = ?
Example: lattice of equalities • Concrete lattice: C = (2^State, ⊆, ∪, ∩, ∅, State) • Abstract lattice: EQ = { x=y | x, y ∈ Var }, A = (2^EQ, ⊇, ∩, ∪, EQ, ∅) • Treat elements of A as both formulas and sets of constraints • Useful for copy propagation, a compiler optimization • β(σ) = α({σ}) = { x=y | σ(x) = σ(y) }, that is, σ ⊨ x=y • α(X) = ∩{ β(σ) | σ ∈ X } = ⊔_A{ β(σ) | σ ∈ X } • γ(Y) = { σ | σ ⊨ ∧Y } = models(∧Y)
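The abstraction and concretization for the equalities domain can be sketched in Python as follows. To make the concretization enumerable, the sketch assumes three variables and a small finite range of values; VARS, VALUES, STATES and the function name beta (the abstraction of a single state) are hypothetical names introduced here. An equality x=y is represented as the pair ("x", "y").

```python
from itertools import product

VARS = ("x", "y", "z")
VALUES = (0, 1, 2)  # assumed finite value range, so gamma can be enumerated
STATES = [dict(zip(VARS, vals)) for vals in product(VALUES, repeat=len(VARS))]
EQ = frozenset((a, b) for a in VARS for b in VARS)  # bottom of the abstract lattice

def beta(state):
    """Abstraction of one state: all equalities that hold in it."""
    return frozenset((a, b) for (a, b) in EQ if state[a] == state[b])

def alpha(states):
    """Abstraction of a set of states: equalities that hold in every state."""
    result = EQ  # abstraction of the empty set is bottom
    for s in states:
        result &= beta(s)
    return result

def gamma(eqs):
    """Concretization: all states (over VALUES) satisfying every equality."""
    return [s for s in STATES if all(s[a] == s[b] for (a, b) in eqs)]

c = [{"x": 1, "y": 1, "z": 1}]
a = frozenset({("x", "y"), ("y", "z")})
```

With these definitions one can check both Galois connection conditions on examples: every state of c appears in gamma(alpha(c)), and alpha(gamma(a)) contains a (recall that the abstract order is reverse inclusion, so a superset is lower in the lattice).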
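Galois Connection: c ⊑_C γ(α(c)) • Example: c = {[x↦5, y↦5, z↦5]} • α(c) = {x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y}, the most precise (least) element in A representing [x↦5, y↦5, z↦5] • γ(α(c)) = {…, [x↦6, y↦6, z↦6], [x↦5, y↦5, z↦5], [x↦4, y↦4, z↦4], …} • [Figure: c is mapped by α to α(c), which lies below {x=x, y=y, z=z} in A; γ maps α(c) back to γ(α(c)), which contains c]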
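Most precise abstract representation • α(c) = ⊓{ c' | c ⊑ γ(c') } • [Figure: c in C, the abstract elements c' in A whose concretization γ(c') lies above c, and α(c) as their meet]
Most precise abstract representation • α(c) = ⊓{ c' | c ⊑ γ(c') } • Example: c = {[x↦5, y↦5, z↦5]} • Abstract elements representing c include {x=y}, {x=y, z=y} and {x=y, y=z} • α(c) = {x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y} is the least such element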
Galois Connection: α(γ(a)) ⊑_A a • γ(a) is what a represents in C (its meaning) • Example: a = {x=y, y=z} • γ(a) = {…, [x↦6, y↦6, z↦6], [x↦5, y↦5, z↦5], [x↦4, y↦4, z↦4], …} • α(γ(a)) = {x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y} • α∘γ is called a semantic reduction
Partial reduction • The operator α∘γ is called a semantic reduction since α(γ(a)) means the same as a, but it is a reduced (more precise) version of a • An operator reduce : D_A → D_A is a partial reduction if • reduce(a) ⊑_A a and • γ(a) = γ(reduce(a))
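For the equalities domain, one plausible reduction operator closes a set of equalities under reflexivity, symmetry and transitivity. The Python sketch below is an illustration under that assumption; reduce_eq is a hypothetical name, and an equality x=y is represented as the pair ("x", "y"). The closure only adds equalities implied by the input, so the meaning is unchanged while the element becomes lower (a larger set) in the abstract order.

```python
def reduce_eq(eqs, variables):
    """Close a set of equalities under reflexivity, symmetry and transitivity."""
    closed = set(eqs) | {(v, v) for v in variables}   # reflexivity
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closed):
            # symmetry: b=a; transitivity: a=b and b=d give a=d
            candidates = {(b, a)} | {(a, d) for (c, d) in closed if c == b}
            if not candidates <= closed:
                closed |= candidates
                changed = True
    return frozenset(closed)

a = {("x", "y"), ("y", "z")}
reduced = reduce_eq(a, ("x", "y", "z"))
```

On this input the closure yields all nine equalities over x, y and z, which matches the value of α(γ(a)) shown for a = {x=y, y=z}.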
Galois Insertion • ∀a: α(γ(a)) = a • How can we obtain a Galois Insertion from a Galois Connection? • [Figure: all elements of A are reduced; for example {x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y} with concretization {…, [x↦6, y↦6, z↦6], [x↦5, y↦5, z↦5], [x↦4, y↦4, z↦4], …}]
Properties of a Galois Connection • The abstraction and concretization functions uniquely determine each other: • γ(a) = ⊔{ c | α(c) ⊑ a } • α(c) = ⊓{ a | c ⊑ γ(a) }
Abstracting (disjunctive) sets • It is usually convenient to first define the abstraction of single elements: β(s) = α({s}) • Then lift the abstraction to sets of elements: α(X) = ⊔_A{ β(s) | s ∈ X }
The case of symbolic domains • An important class of abstract domains are symbolic domains: domains of formulas • C = (2^State, ⊆, ∪, ∩, ∅, State) • A = (D_A, ⊑_A, ⊔_A, ⊓_A, ⊥_A, ⊤_A) • If D_A is a set of formulas then the abstraction of a state is defined as β(σ) = α({σ}) = ⊓_A{ φ | σ ⊨ φ }, the least formula from D_A that σ satisfies • The abstraction of a set of states is α(X) = ⊔_A{ β(σ) | σ ∈ X } • The concretization is γ(φ) = { σ | σ ⊨ φ } = models(φ)
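Inducing along the connections • Assume the complete lattices C = (D_C, ⊑_C, ⊔_C, ⊓_C, ⊥_C, ⊤_C), A = (D_A, ⊑_A, ⊔_A, ⊓_A, ⊥_A, ⊤_A), M = (D_M, ⊑_M, ⊔_M, ⊓_M, ⊥_M, ⊤_M) and Galois connections GC_C,A = (C, α_C,A, γ_A,C, A) and GC_A,M = (A, α_A,M, γ_M,A, M) • Lemma: both connections induce GC_C,M = (C, α_C,M, γ_M,C, M), defined by applying the two abstractions (respectively the two concretizations) in sequence: α_C,M(c) = α_A,M(α_C,A(c)) and γ_M,C(m) = γ_A,C(γ_M,A(m))
Inducing along the connections • [Figure: c in C is mapped by α_C,A to α_C,A(c) in A and then by α_A,M to a' = α_A,M(α_C,A(c)) in M; γ_M,A and γ_A,C map a' back to c' in C]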
Sound abstract transformer • Given two lattices C = (D_C, ⊑_C, ⊔_C, ⊓_C, ⊥_C, ⊤_C) and A = (D_A, ⊑_A, ⊔_A, ⊓_A, ⊥_A, ⊤_A) and GC_C,A = (C, α, γ, A), with • A concrete transformer f : D_C → D_C and an abstract transformer f# : D_A → D_A • We say that f# is a sound transformer (w.r.t. f) if • ∀c: f(c) = c' ⇒ f#(α(c)) ⊒ α(c') • For every a: α(f(γ(a))) ⊑_A f#(a)
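Transformer soundness condition 1 • ∀c: f(c) = c' ⇒ f#(α(c)) ⊒ α(c') • [Figure: applying f in C and then α lands below applying α and then f# in A]
Transformer soundness condition 2 • ∀a: f#(a) = a' ⇒ f(γ(a)) ⊑ γ(a') • [Figure: applying γ and then f in C lands below applying f# in A and then γ]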
Best (induced) transformer • f#(a) = α(f(γ(a))) • Problem: not directly computable • [Figure: a in A is concretized by γ, transformed by f in C, and abstracted back by α]
Best abstract transformer [CC'77] • Best in terms of precision • Most precise abstract transformer • May be too expensive to compute • Constructively defined as f# = α ∘ f ∘ γ • Induced by the GC • Not directly computable because first step is concretization • We often compromise for a "good enough" transformer • Useful tool: partial concretization
Transformer example • C = (2^State, ⊆, ∪, ∩, ∅, State) • EQ = { x=y | x, y ∈ Var }, A = (2^EQ, ⊇, ∩, ∪, EQ, ∅) • β(σ) = α({σ}) = { x=y | σ(x) = σ(y) }, that is, σ ⊨ x=y • α(S) = ∩{ β(σ) | σ ∈ S } = ⊔_A{ β(σ) | σ ∈ S } • γ(φ) = { σ | σ ⊨ φ } = models(φ) • Concrete: ⟦x:=y⟧ S = { σ[x ↦ σ(y)] | σ ∈ S } • Abstract: ⟦x:=y⟧# S = ?
Developing a transformer for EQ - 1 • Input has the form S = ∧{a=b} • sp(x:=expr, φ) = ∃v. x=expr[v/x] ∧ φ[v/x] • sp(x:=y, S) = ∃v. x=y[v/x] ∧ S[v/x] = … • Let's define helper notations: • Mod(x:=y, S) = ∧{ x=a, b=x ∈ S } • Subset of equalities containing x (will be modified) • Frame(x:=y, S) = S \ Mod(x:=y, S) • Subset of equalities not containing x (i.e., the frame)
Developing a transformer for EQ - 2 • sp(x:=y, S) = ∃v. x=y[v/x] ∧ ∧{a=b}[v/x] = … • Two cases • x is y: sp(x:=x, S) = S • x is different from y: sp(x:=y, S) = ∃v. x=y ∧ Mod(x:=y, S)[v/x] ∧ Frame(x:=y, S)[v/x] = x=y ∧ Frame(x:=y, S) ∧ ∃v. Mod(x:=y, S)[v/x] ⇒ x=y ∧ Frame(x:=y, S) • Vanilla transformer: ⟦x:=y⟧#1 S = x=y ∧ Frame(x:=y, S) • Example: ⟦x:=y⟧#1 ∧{x=p, q=x, m=n} = ∧{x=y, m=n} • Is this the most precise result?
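The vanilla transformer can be sketched in Python as below, representing an equality a=b as the pair ("a", "b") and a conjunction as a set of pairs. The name assign_vanilla is hypothetical.

```python
def assign_vanilla(x, y, eqs):
    """Vanilla abstract transformer for x := y over sets of equalities."""
    if x == y:
        return frozenset(eqs)             # x := x changes nothing
    # Frame: the equalities that do not mention x survive the assignment.
    frame = {(a, b) for (a, b) in eqs if x not in (a, b)}
    return frozenset(frame | {(x, y)})    # add the new fact x=y

result = assign_vanilla("x", "y", {("x", "p"), ("q", "x"), ("m", "n")})
```

Applied to {x=p, q=x, m=n}, this drops both equalities that mention x and returns {x=y, m=n}, as in the example on the slide.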
Developing a transformer for EQ - 3 • ⟦x:=y⟧#1 ∧{x=p, x=q, m=n} = ∧{x=y, m=n} ⊒ ∧{x=y, m=n, p=q} • Where does the information p=q come from? • sp(x:=y, S) = x=y ∧ Frame(x:=y, S) ∧ ∃v. Mod(x:=y, S)[v/x] • ∃v. Mod(x:=y, S)[v/x] holds possible equalities between different a's and b's; how can we account for that?
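To see where p=q comes from, one can compute α(f(γ(S))) by brute force on a small finite state space. The Python sketch below does this for x:=y; it assumes six variables ranging over the two values 0 and 1 (enough to tell apart any two variables that are not forced to be equal), and the names VARS, VALUES, assign_concrete and assign_best are hypothetical. It illustrates the definition and is not a practical transformer.

```python
from itertools import product

VARS = ("x", "y", "p", "q", "m", "n")
VALUES = (0, 1)  # assumed finite value range, so gamma can be enumerated
EQ = frozenset((a, b) for a in VARS for b in VARS)

def gamma(eqs):
    """All states over VALUES that satisfy every equality in eqs."""
    states = (dict(zip(VARS, vals)) for vals in product(VALUES, repeat=len(VARS)))
    return [s for s in states if all(s[a] == s[b] for (a, b) in eqs)]

def alpha(states):
    """Equalities that hold in every given state."""
    result = EQ
    for s in states:
        result &= frozenset((a, b) for (a, b) in EQ if s[a] == s[b])
    return result

def assign_concrete(x, y, states):
    """Concrete semantics of x := y, lifted to a collection of states."""
    return [{**s, x: s[y]} for s in states]

def assign_best(x, y, eqs):
    """alpha(f(gamma(eqs))) for f = x := y, computed by enumeration."""
    return alpha(assign_concrete(x, y, gamma(eqs)))

S = {("x", "p"), ("x", "q"), ("m", "n")}
best = assign_best("x", "y", S)
```

In the result, p=q survives because every state satisfying x=p and x=q already has p and q equal before the assignment, while x=p is lost since x now carries the value of y.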