
SymDiff: Leveraging Program Verification for Comparing Programs

SymDiff: Leveraging Program Verification for Comparing Programs. Shuvendu Lahiri, Research in Software Engineering (RiSE), Microsoft Research, Redmond. Jointly with Chris Hawblitzel (Microsoft Research, Redmond), Ming Kawaguchi (UCSD), Henrique Rebelo (UPFE). VSSE Workshop, 2012.


Presentation Transcript


  1. SymDiff: Leveraging Program Verification for Comparing Programs Shuvendu Lahiri Research in Software Engineering (RiSE), Microsoft Research, Redmond Jointly with Chris Hawblitzel (Microsoft Research, Redmond), Ming Kawaguchi (UCSD), Henrique Rebelo (UPFE) VSSE Workshop, 2012

  2. Motivation

  3. Ensuring compatibility • Programmers spend a large fraction of their time ensuring (read: praying for) compatibility after changes • How does the feature addition impact existing features? • Does the refactoring change any observable behavior? • Does my bug fix introduce a regression? Microsoft Confidential

  4. Compatibility: applications Bug fixes Refactoring New features f() { Print(foo); g(); } g() { ... Print(foo); } g() { ... Print(foo); Print(bar); } Library API changes Version Control Compilers

  5. Compatibility: Microsoft • Products • Windows APIs (Win32, ntdll) • Driver development kits • .NET frameworks, Base class library • Compilers (C#, JIT,…) • ….. • Windows updates • Security patches • Bug fixes Every developer/tester/auditor

  6. Problem • Use static analysis to • Improve the productivity of users trying to ensure compatibility across program changes • Potential benefits • Agility: fewer regressions, higher confidence in changes, smarter code review, ..

  7. Challenge • Equivalence checking is too strong a spec • Most changes modify behavior • Hard to formalize (separate expected changes from unexpected changes) • Refactoring → behaviors intact • Bug fix → non-buggy behaviors intact • Feature add → existing feature behaviors intact • API change → ?? • Data change → ?? • Config changes → ?? • …

  8. Challenge → Opportunity • Hard to formalize (separate expected changes from unexpected changes) • Refactoring → behaviors intact • Bug fix → non-buggy behaviors intact • Feature add → existing feature behaviors intact • … • Highlight “unexpected” changes

  9. Our approach • Provide a tool for performing semantic diff (diff over behaviors) • How does the feature addition impact existing features? • Does the refactoring change any observable behavior? • Does my bug fix introduce a regression? • Semantic Diff

  10. Our approach • Provide a tool for performing semantic diff (diff over behaviors) • How does the feature addition impact existing features? • Does the refactoring change any observable behavior? • Does my bug fix introduce a regression? • Semantic Diff

  11. What is SymDiff? A framework to • Leverage and extend program verification for providing relative correctness

  12. Overview • Demo • Semantic diff • Tool (in current form) • An application • Compiler compatibility • Making SymDiff extensible with contracts • Users can express “expected” changes • Mutual summaries and relative termination

  13. Demo Eval (bug1) Eval (func) StringCopy (bug fix) Recursive example

  14. SymDiff tool

  15. SymDiff • Apply and extend program verification techniques towards comparing programs • Current form: Checks input/output partial equivalence [CAV ’12 tool paper]

  16. SymDiff tool: language independent • [Diagram: programs P1 and P2 (C/.NET/x86/ARM) are each translated to Boogie (S1, S2); SymDiff (Boogie+Z3) reports P1 = P2 or P1 ≠ P2] • Works at Boogie intermediate language

  17. Boogie • Simple intermediate verification language • [Barnett et al. FMCO’05] • Commands • x := E //assign • havoc x //change x to an arbitrary value • assert E //if E holds, skip; otherwise, go wrong • assume E //if E holds, skip; otherwise, block • S ; T //execute S, then T • goto L1, L2, … Ln //non-deterministic jump to labels • call x := Foo(e1,e2,..) //procedure call

  18. Boogie (contd.) • Two types of expressions • Scalars (bool, int, ref, ..) • Arrays ([int]int, [ref]ref, …) • Array expressions are sugar for the SMT array theory • x[i] := y → x := upd(x, i, y) • y := x[i] → y := sel(x, i) • Procedure calls are sugar for modular specification • Given: procedure Foo(); requires pre; ensures post; modifies V; • call Foo(); → assert pre; havoc V; assume post;
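The array sugar on this slide can be illustrated with a small sketch. It is written in Python rather than Boogie, and the `sel` and `upd` functions below are illustrative stand-ins for the SMT array theory operators, modelled here with read-only dictionaries. The default value for unwritten indices is an assumption of the sketch, not something the slides specify.

```python
from types import MappingProxyType

def upd(arr, i, v):
    """Return a new array equal to arr except that index i maps to v."""
    new = dict(arr)
    new[i] = v
    return MappingProxyType(new)  # read-only view: arrays are values, not mutable cells

def sel(arr, i, default=0):
    """Read index i; unwritten indices read as `default` (an assumption of this sketch)."""
    return arr.get(i, default)

x0 = MappingProxyType({})
x1 = upd(x0, 3, 42)   # desugared form of: x[3] := 42
y = sel(x1, 3)        # desugared form of: y := x[3]
```

Because `upd` returns a fresh array, the old version `x0` stays available, which is what lets a verifier talk about the pre-state and post-state of a heap update in one formula.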

  19. Basic equivalence checking void swap1(ref int x, ref int y){ int z = x; x = y; y = z; } void swap2(ref int x, ref int y){ x = x + y; y = x - y; x = x - y; } Formula: z0 == x0 && x1 == y0 && y1 == z0 && swap1.x == x1 && swap1.y == y1 && x1' == x0 + y0 && y1' == x1' - y0 && x2' == x1' - y1' && swap2.x == x2' && swap2.y == y1' && ~ (swap1.x == swap2.x && swap1.y == swap2.y) Z3 theorem prover: UNSAT (equivalent) or SAT (counterexample)
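The query on this slide asks whether any input makes the two swaps disagree. As a rough illustration only, the sketch below replaces the Z3 call with exhaustive enumeration over 8-bit integers; the bit width, the `find_counterexample` helper, and the use of brute force instead of an SMT solver are all assumptions made to keep the example self-contained.

```python
from itertools import product

MASK = 0xFF  # assumption: 8-bit machine integers keep the search space small

def swap1(x, y):
    z = x
    x = y
    y = z
    return x, y

def swap2(x, y):
    x = (x + y) & MASK
    y = (x - y) & MASK
    x = (x - y) & MASK
    return x, y

def find_counterexample(f, g, domain):
    """Return the first input pair on which f and g disagree, or None if there is none."""
    for x, y in product(domain, repeat=2):
        if f(x, y) != g(x, y):
            return (x, y)
    return None

result = find_counterexample(swap1, swap2, range(MASK + 1))
print("equivalent" if result is None else f"counterexample: {result}")
```

A `None` result plays the role of UNSAT (no disagreeing input exists in the searched domain), while a returned pair plays the role of a SAT model, that is, a concrete counterexample.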

  20. Handling procedure calls • Modular checking • Assume “matched” callees are deterministic and have the same I/O behaviors • Modeled by uninterpreted functions [Necula ‘00, …, Godlin & Strichman ‘08, …] • Addition of postconditions for Foo, Foo’: procedure Foo(x) returns (ret); modifies g; free ensures g == UF_Foo_g(x, old(g)); free ensures ret == UF_Foo_ret(x, old(g)); procedure Foo’(x) returns (ret); modifies g; free ensures g == UF_Foo_g(x, old(g)); free ensures ret == UF_Foo_ret(x, old(g));
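One way to picture the uninterpreted-function modeling is as a function about which nothing is known except that equal inputs give equal outputs. The sketch below is an assumption-laden stand-in: `UninterpretedFunction` draws a random output per new input and memoizes it, and `caller1` and `caller2` are hypothetical callers that are not in the slides. Random sampling is only an approximation of what an SMT solver proves for all interpretations.

```python
import random

class UninterpretedFunction:
    """Deterministic but otherwise arbitrary: each new input gets a random output that is remembered."""
    def __init__(self, seed):
        self._rng = random.Random(seed)
        self._table = {}

    def __call__(self, *args):
        if args not in self._table:
            self._table[args] = self._rng.randrange(-1000, 1000)
        return self._table[args]

def make_foo(uf_g, uf_ret):
    """Stand-in for Foo (and Foo'): both are summarized by the same pair of functions."""
    def foo(x, g):
        return uf_g(x, g), uf_ret(x, g)  # (new value of g, return value)
    return foo

def caller1(x, g, foo):        # hypothetical old version of a caller
    g, r = foo(x, g)
    return g, r + 1

def caller2(x, g, foo_prime):  # hypothetical new version of the same caller
    g, r = foo_prime(x, g)
    return g, 1 + r

def callers_agree(trials=200):
    """Compare the callers under many sampled interpretations of the callee."""
    for seed in range(trials):
        uf_g = UninterpretedFunction(seed)
        uf_ret = UninterpretedFunction(seed + 10_000)
        foo, foo_prime = make_foo(uf_g, uf_ret), make_foo(uf_g, uf_ret)
        x, g = seed % 7, seed % 5
        if caller1(x, g, foo) != caller2(x, g, foo_prime):
            return False
    return True
```

Sharing `uf_g` and `uf_ret` between `foo` and `foo_prime` mirrors the slide's use of the same `UF_Foo_g` and `UF_Foo_ret` in both postconditions, which is what lets the callers be compared without looking inside the callee.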

  21. Modeling C/Java/C#/x86 → Boogie • Separation of concerns • Front end can be developed independently • Quite a few already exist • HAVOC/VCC for C, Spec#/BCT for .NET, ?? for Java, … • Heap usually modeled by arrays • x.f := y → Heap_f[x] := y • Challenges • Deterministic modeling of I/O, malloc, … • The entire heap is passed around

  22. Application: Compiler compatibility

  23. Compiler validation • [Diagram: one source program compiled to ARM, ARM+opt, X86 and X86+opt targets across compiler versions v1, v2, v3, v4]

  24. Compatibility: x86 vs. x86 example G01: push ESI mov ESI, EDX G02: and ESI, 255 push ESI mov EDX, 0x100000 call WriteInternalFlag2(int,bool) G03: pop ESI ret G01: mov EAX, EDX G02: and EAX, 255 push EAX mov EDX, 0x100000 call WriteInternalFlag2(int,bool) __epilog: ret 254 X86+opt v2 v3

  25. Large x86 vs. ARM example

  26. Beyond equivalence

  27. Beyond equivalence

  28. Contracts over two programs • Need an extensible contract mechanism for comparing two programs • Generalization of pre/post conditions • Why • Allow users to express relative correctness specifications (e.g. conditional equivalence) • Automated methods may not always suffice (even for equivalence checking) • Challenge • Should be able to leverage SMT-based program verifiers

  29. Mutual summaries • An extensible framework for interprocedural program comparison • Prior work (mostly automated): • Intraprocedural • Translation validation [Pnueli et al. ‘98, Necula ‘00, Zuck et al. ’05,…] • Coarse intraprocedural (only track equalities) • Regression verification [Strichman et al. ‘08]

  30. Mutual summaries • [MSR-TR-2011-112] • Mutual summaries (MS) • Relative termination (RT) • Dealing with loops and unstructured goto

  31. Example: Feature addition int f1(int x1){ a1 = A1[x1]; a2 = A2[x1]; if (Op[x1] == 0) return Val[x1]; else if (Op[x1] == 1) return f1(a1) + f1(a2); else if (Op[x1] == 2) return f1(a1) - f1(a2); else return 0; } int f2(int x2, bool isU){ a1 = A1[x2]; a2 = A2[x2]; if (Op[x2] == 0) return Val[x2]; else if (Op[x2] == 1){ if (isU) return uAdd(f2(a1, T), f2(a2, T)); else return f2(a1, F) + f2(a2, F); } else if (Op[x2] == 2){ if (isU) return uSub(f2(a1, T), f2(a2, T)); else return f2(a1, F) - f2(a2, F); } else return 0; } The programs are equivalent when isU == False

  32. Mutual summaries void F1(int x1){ if(x1 < 100){ g1 := g1 + x1; F1(x1 + 1); } } void F2(int x2){ if(x2 < 100){ g2 := g2 + 2*x2; F2(x2 + 1); } } • What is a mutual summary MS(F1, F2)? • A formula over two copies of • parameters, globals (g), returns and next state of globals (g’) MS(F1, F2): (x1 = x2 && g1 <= g2 && x1 >= 0) ==> g1’ <= g2’

  33. Mutual summaries void F1(int x1){ if(x1 < 100){ g1 := g1 + x1; F1(x1 + 1); } } void F2(int x2){ if(x2 < 100){ g2 := g2 + 2*x2; F2(x2 + 1); } } • What does a mutual summary MS(F1, F2) mean? • For any pre/post state pairs (s1,t1) of F1, and (s2,t2) of F2, (s1,t1,s2,t2) satisfy MS(F1,F2) MS(F1, F2): (x1 = x2 && g1 <= g2 && x1 >= 0) ==> g1’ <= g2’

  34. Example int f1(int x1){ a1 = A1[x1]; a2 = A2[x1]; if (Op[x1] == 0) return Val[x1]; else if (Op[x1] == 1) return f1(a1) + f1(a2); else if (Op[x1] == 2) return f1(a1) - f1(a2); else return 0; } int f2(int x2, bool isU){ a1 = A1[x2]; a2 = A2[x2]; if (Op[x2] == 0) return Val[x2]; else if (Op[x2] == 1){ if (isU) return uAdd(f2(a1, T), f2(a2, T)); else return f2(a1, F) + f2(a2, F); } else if (Op[x2] == 2){ if (isU) return uSub(f2(a1, T), f2(a2, T)); else return f2(a1, F) - f2(a2, F); } else return 0; } MS(f1, f2) = (x1 == x2 && !isU) ==> ret1 == ret2
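The feature-addition example can be exercised on a tiny expression tree. This is a sketch under stated assumptions: the contents of `Op`, `Val`, `A1` and `A2` are made up for illustration, and `u_add` and `u_sub` are guessed to be 32-bit unsigned arithmetic, since the slides do not define `uAdd` and `uSub`.

```python
# Hypothetical tree: node 0 = node 1 + node 2, node 1 = node 4 - node 3, nodes 2..4 are leaves.
Op  = {0: 1, 1: 2, 2: 0, 3: 0, 4: 0}
Val = {0: 0, 1: 0, 2: 7, 3: 10, 4: 4}
A1  = {0: 1, 1: 4, 2: 2, 3: 3, 4: 4}
A2  = {0: 2, 1: 3, 2: 2, 3: 3, 4: 4}

def u_add(a, b):
    return (a + b) & 0xFFFFFFFF  # assumed meaning of uAdd: unsigned 32-bit addition

def u_sub(a, b):
    return (a - b) & 0xFFFFFFFF  # assumed meaning of uSub: unsigned 32-bit subtraction

def f1(x1):
    a1, a2 = A1[x1], A2[x1]
    if Op[x1] == 0:
        return Val[x1]
    elif Op[x1] == 1:
        return f1(a1) + f1(a2)
    elif Op[x1] == 2:
        return f1(a1) - f1(a2)
    return 0

def f2(x2, is_u):
    a1, a2 = A1[x2], A2[x2]
    if Op[x2] == 0:
        return Val[x2]
    elif Op[x2] == 1:
        if is_u:
            return u_add(f2(a1, True), f2(a2, True))
        return f2(a1, False) + f2(a2, False)
    elif Op[x2] == 2:
        if is_u:
            return u_sub(f2(a1, True), f2(a2, True))
        return f2(a1, False) - f2(a2, False)
    return 0

def ms_holds(x1, x2, is_u):
    """MS(f1, f2): (x1 == x2 and not isU) ==> ret1 == ret2."""
    if x1 == x2 and not is_u:
        return f1(x1) == f2(x2, is_u)
    return True  # antecedent is false, so the implication holds trivially
```

On this tree the summary holds for every node when `is_u` is false, while node 1 (which computes 4 - 10) shows that the two versions can differ once the new feature is switched on, which is why the summary is conditional.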

  35. Checking mutual summaries • Given F1, F2, MS(F1, F2), define the following procedure: void CheckMS_F1_F2(int x1, int x2){ inline F1(x1); inline F2(x2); assert MS(F1,F2); }
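The shape of `CheckMS_F1_F2` can be mimicked by running both procedures and asserting the summary over the resulting pre/post states. The sketch below makes several assumptions: the globals `g1` and `g2` are threaded through as explicit arguments and return values, and the check is run on a sampled grid of inputs rather than proved for all inputs as a verifier would.

```python
def F1(x1, g1):
    """Return g1' (the final value of the global) after running F1 from x1 and g1."""
    if x1 < 100:
        g1 = g1 + x1
        g1 = F1(x1 + 1, g1)
    return g1

def F2(x2, g2):
    """Return g2' after running F2 from x2 and g2."""
    if x2 < 100:
        g2 = g2 + 2 * x2
        g2 = F2(x2 + 1, g2)
    return g2

def MS(x1, g1, g1p, x2, g2, g2p):
    """(x1 = x2 && g1 <= g2 && x1 >= 0) ==> g1' <= g2'."""
    return (not (x1 == x2 and g1 <= g2 and x1 >= 0)) or g1p <= g2p

def check_ms_F1_F2(x1, g1, x2, g2):
    g1p = F1(x1, g1)   # "inline F1(x1)"
    g2p = F2(x2, g2)   # "inline F2(x2)"
    return MS(x1, g1, g1p, x2, g2, g2p)  # "assert MS(F1, F2)"

violations = [
    (x1, g1, x2, g2)
    for x1 in range(-5, 105, 5)
    for x2 in range(-5, 105, 5)
    for g1 in (-10, 0, 10)
    for g2 in (-10, 0, 10)
    if not check_ms_F1_F2(x1, g1, x2, g2)
]
print("violations found:", len(violations))
```

The `x1 >= 0` conjunct matters: starting both procedures at -300 with equal globals makes `F1` end with the larger global, because the sum of the visited parameters is negative and `F2` adds twice as much of it.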

  36. Modular checking: Instrumentation 1. Add “summary relations” R_F1 and R_F2: void F1(int x1); ensures R_F1(x1, old(g1)/g1, g1/g1’); 2. Use the summary relations to assume mutual summaries at call sites: axiom (forall x1, g1, g1’, x2, g2, g2’:: {R_F1(x1, g1, g1’), R_F2(x2, g2, g2’)} (R_F1(x1, g1, g1’) && R_F2(x2, g2, g2’)) ==> MS_F1_F2(x1, g1, g1’, x2, g2, g2’) );

  37. Leveraging program verifiers • Mutual Summary checking • Encode using contracts (postconditions), axioms • Verification condition generation (Boogie) • Checking using SMT solver (Z3) • Next steps • Inferring the mutual summaries

  38. Relative termination • Specification relating the terminating behaviors of P2 wrt P1 • Not just for proving termination • Required for composing transformations • MS1(f,f’) && MS2(f’,f’’) ==> (MS1 ∘ MS2)(f,f’’) • E.g. P_Eq(f,f’) && P_Eq(f’,f’’) ==> P_Eq(f,f’’)

  39. Relative termination condition void F1(int x1){ if(x1 < 100){ g1 := g1 + x1; F1(x1 + 1); } } void F2(int x2){ if(x2 < 100){ g2 := g2 + 2*x2; F2(x2 + 1); } } • What is a relative termination condition RT(F1, F2)? • A formula over two copies of • parameters, globals (g) RT(F1, F2): (x1 <= x2)

  40. Relative termination condition void F1(int x1){ if(x1 < 100){ g1 := g1 + x1; F1(x1 + 1); } } void F2(int x2){ if(x2 < 100){ g2 := g2 + 2*x2; F2(x2 + 1); } } • What does a relative termination condition RT(F1, F2) mean? • For a pair of input states (s1,s2), if F1 terminates on s1, and (s1,s2) satisfies RT(F1,F2), then F2 terminates on s2 RT(F1, F2): (x1 <= x2)
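Termination itself cannot be observed by running code, so the sketch below uses a bounded "fuel" budget as a proxy: a run counts as terminating if it finishes within a given number of calls. That proxy, the `OutOfFuel` exception, and the sampled grid are all assumptions of this illustration and not part of the relative termination definition.

```python
class OutOfFuel(Exception):
    """Raised when a run needs more calls than the fuel budget allows."""

def F1(x1, g1, fuel):
    if fuel == 0:
        raise OutOfFuel
    if x1 < 100:
        return F1(x1 + 1, g1 + x1, fuel - 1)
    return g1

def F2(x2, g2, fuel):
    if fuel == 0:
        raise OutOfFuel
    if x2 < 100:
        return F2(x2 + 1, g2 + 2 * x2, fuel - 1)
    return g2

def terminates(proc, x, fuel):
    """Proxy for termination: does proc finish from x using at most `fuel` calls?"""
    try:
        proc(x, 0, fuel)
        return True
    except OutOfFuel:
        return False

def RT(x1, x2):
    return x1 <= x2

def rt_respected(x1, x2, fuel):
    """If F1 finishes within the budget and RT holds, F2 must finish within it too."""
    if terminates(F1, x1, fuel) and RT(x1, x2):
        return terminates(F2, x2, fuel)
    return True

all_ok = all(
    rt_respected(x1, x2, fuel)
    for x1 in range(0, 120, 7)
    for x2 in range(0, 120, 7)
    for fuel in (1, 10, 50, 101)
)
```

Dropping the RT premise breaks the implication under this proxy: with a budget of 5 calls, `F1` started at 99 finishes while `F2` started at 0 does not.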

  41. What about loops? Before: int Foo2() { i = 0; if (n > 0) { t = g; v = 3; do2: a[i] := v; i := i + 1; v := v + t; While2: //FLABEL if (i < n) goto do2; } return i; } After: int Foo2() { i = 0; if (n > 0) { t = g; v = 3; do2: a[i] := v; i := i + 1; v := v + t; return While2(i, t, v); } return i; } (int, int) While2(i2, t2, v2) { i2' := i2; v2' := v2; if (i2' < n) { a2[i2'] := v2'; i2' := i2' + 1; v2' := v2' + t2; return While2(i2', t2, v2'); } return (i2', v2'); }
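The loop-to-procedure transformation on this slide can be sanity-checked by running both forms side by side. In this sketch `n` and `g` are passed as parameters and the array `a` is a dictionary that is returned explicitly, which are choices made for the illustration and not taken from the slides.

```python
def foo_loop(n, g):
    """Original form: a do-while loop expressed with a label and goto on the slide."""
    a = {}
    i = 0
    if n > 0:
        t, v = g, 3
        while True:          # do { ... } while (i < n)
            a[i] = v
            i += 1
            v += t
            if not (i < n):
                break
    return i, a

def while2(i, t, v, n, a):
    """Extracted procedure: everything from the FLABEL onward, as tail recursion."""
    if i < n:
        a[i] = v
        return while2(i + 1, t, v + t, n, a)
    return i, v

def foo_rec(n, g):
    """Transformed form: the first iteration stays in the caller, the rest is a call."""
    a = {}
    i = 0
    if n > 0:
        t, v = g, 3
        a[i] = v
        i += 1
        v += t
        i, v = while2(i, t, v, n, a)
    return i, a
```

After this step both versions are loop-free at the point of comparison, so the procedure-level machinery (mutual summaries over `While2` and its counterpart) applies directly.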

  42. Unrolling optimizations void F2(int i2) { if (i2 < n) { a2[i2] = 1; F2(i2+1); return; } return; } void F3(int i3) { if (i3 + 1 < n) { a3[i3] := 1; a3[i3+1] := 1; F3(i3+2); return; } if (i3 < n) a3[i3] := 1; return; } Extra step • Inline F2 once inside F2 to “match up” with F3 MS(F2, F3) = (i2 == i3 && a2 == a3) ==> a2’ == a3’
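The unrolling example can be tried out directly. The sketch below passes `n` and the arrays as explicit arguments and checks the stated summary on a small grid of start indices and bounds; the grid, the array length, and the `ms_holds` helper are assumptions of the illustration.

```python
def F2(i2, n, a2):
    """Rolled version: one write per call."""
    if i2 < n:
        a2[i2] = 1
        F2(i2 + 1, n, a2)
        return
    return

def F3(i3, n, a3):
    """Unrolled version: two writes per call, plus a fix-up for an odd remainder."""
    if i3 + 1 < n:
        a3[i3] = 1
        a3[i3 + 1] = 1
        F3(i3 + 2, n, a3)
        return
    if i3 < n:
        a3[i3] = 1
    return

def ms_holds(i, n, a):
    """MS(F2, F3): (i2 == i3 && a2 == a3) ==> a2' == a3', checked for one shared input."""
    a2, a3 = list(a), list(a)   # equal initial arrays, as the antecedent requires
    F2(i, n, a2)
    F3(i, n, a3)
    return a2 == a3

SIZE = 12
results = [ms_holds(i, n, [0] * SIZE) for n in range(SIZE + 1) for i in range(SIZE + 1)]
```

Running the code only confirms the summary on concrete inputs. The slide's "extra step" of inlining `F2` once is what a modular proof needs so that one call of `F3` lines up with two calls of `F2`.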

  43. Using mutual summaries • Flow • Specify the FLABELS to remove loops and gotos into procedures • Write mutual summaries for pairs of resulting procedures • Specify the inlining limit (if needed)

  44. Express translation validation proofs of many compiler optimizations • Copy propagation • Constant propagation • Common sub-expression elimination • Partial redundancy elimination • Loop invariant code hoisting • Conditional speculation • Speculation • Software pipelining • Loop unswitching • Loop unrolling • Loop peeling • Loop splitting • Loop alignment • Loop interchange • Loop reversal • Loop skewing • Loop fusion • Loop distribution Order of updates differ in two versions • [Kundu, Tatlock, Lerner ‘09]

  45. A nice example that uses MS, RT next: ref → ref; data: ref → int; void D(ref x){ data[x] := U(data[x]); } Recursive: void A(ref x){ if(x != nil){ A(next[x]); D(x); } } Tail-recursive: void B(ref x){ if(x != nil){ D(x); B(next[x]); } } Do-while: void C(ref x){ ref i := x; if(i != nil){ Do: D(i); i := next[i]; if (i != nil) goto Do; } }
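The three traversals can be compared on concrete lists. In this sketch `next` and `data` are dictionaries, `U` is given an arbitrary concrete definition (the slides leave it uninterpreted), and the `run` helper builds only acyclic lists, so the comparison says nothing about cyclic inputs, where termination behavior is the interesting part.

```python
NIL = None

def U(v):
    return 3 * v + 1  # arbitrary stand-in; the slides treat U as uninterpreted

def D(x, data):
    data[x] = U(data[x])

def A(x, nxt, data):
    """Recursive: updates nodes from the tail back to the head."""
    if x is not NIL:
        A(nxt[x], nxt, data)
        D(x, data)

def B(x, nxt, data):
    """Tail-recursive: updates nodes from the head forward."""
    if x is not NIL:
        D(x, data)
        B(nxt[x], nxt, data)

def C(x, nxt, data):
    """Do-while: same order as B, written as a loop."""
    i = x
    if i is not NIL:
        while True:
            D(i, data)
            i = nxt[i]
            if i is NIL:
                break

def run(proc, values):
    """Build an acyclic list holding `values`, run proc from its head, return the final data."""
    nxt = {k: (k + 1 if k + 1 < len(values) else NIL) for k in range(len(values))}
    data = dict(enumerate(values))
    head = 0 if values else NIL
    proc(head, nxt, data)
    return data
```

`A` visits the nodes in the opposite order from `B` and `C`, yet on acyclic lists each node is updated exactly once, so the final `data` maps agree.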

  46. Overview • Demo • Semantic diff • Tool (in current form) • An application • Compiler compatibility • Making SymDiff extensible with contracts • Mutual summaries and relative termination • General contracts for comparing programs

  47. In summary • Checking compatibility (statically) is a huge opportunity • Both formalizing the problem • Tools/techniques to solve it • Likely to have impact on development cycle • Existing static analysis tools have failed to do so cost-effectively, in spite of all the progress • Combining with dynamic analysis • To generate test cases when possible, or help testing achieve higher differential coverage

  48. Resources • SymDiff website http://research.microsoft.com/symdiff/ • Binary release soon! • Contains C front end
