Back to the Future: Revisiting Precise Program Verification Using SMT Solvers

Shuvendu Lahiri, Shaz Qadeer
Microsoft Research, Redmond
Presented earlier at POPL’08

Constraints arising in program verification are mixed!
  { b.f = 5 } a.f = 5 { a.f + b.f = 10 }
is valid iff
  Select(f1, b) = 5 ∧ f2 = Store(f1, a, 5) ⇒ Select(f2, a) + Select(f2, b) = 10
is valid
• Theory of equality: f, =
• Theory of arithmetic: 5, 10, +
• Theory of arrays: Select, Store
• [Nelson & Oppen ’79]
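The verification condition on this slide can be sanity-checked by brute force on a small finite model. The sketch below is plain Python, not an SMT solver; the helpers `select` and `store` and the tiny domains `OBJECTS` and `VALUES` are illustrative assumptions. Exhaustive enumeration over three objects is evidence for validity, not a proof.

```python
from itertools import product

OBJECTS = range(3)   # assumed tiny universe of heap objects
VALUES = range(7)    # assumed tiny range of field values (includes 5)

def select(f, obj):
    """Select(f, obj): read the field map f at obj."""
    return f[obj]

def store(f, obj, val):
    """Store(f, obj, val): functional update that returns a new map."""
    g = dict(f)
    g[obj] = val
    return g

def vc_holds(f1, a, b):
    """Select(f1,b) = 5 and f2 = Store(f1,a,5) imply Select(f2,a) + Select(f2,b) = 10."""
    if select(f1, b) != 5:
        return True  # precondition is false, so the implication holds vacuously
    f2 = store(f1, a, 5)
    return select(f2, a) + select(f2, b) == 10

def check_all():
    """Try every field map f1 and every pair (a, b), including the aliased case a == b."""
    for vals in product(VALUES, repeat=len(OBJECTS)):
        f1 = dict(zip(OBJECTS, vals))
        for a, b in product(OBJECTS, repeat=2):
            if not vc_holds(f1, a, b):
                return False
    return True

print(check_all())
```

The aliased case a == b is the interesting one: the store overwrites b.f with 5, so the postcondition still holds.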

Satisfiability-Modulo-Theory (SMT) solvers
• Boolean satisfiability solving + theory reasoning
• Ground theories
  • Equality, arithmetic, Select/Store
• NP-complete
• Phenomenal progress in the past few years
  • Yices [Dutertre & de Moura ’06]
  • Z3 [de Moura & Bjørner ’07]
• Potential to prove properties of real-world programs (ESC/Java, Spec#)

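[Figure: a doubly linked list of LinkNode objects running from log_list.head to log_list.tail, linked by next and prev. Each node's data field points to a struct _logentry with fields channel_name (char *), file_name, and logtype. Example taken from muh, an Internet Relay Chat (IRC) bouncer.]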

    LinkNode *iter = log_list.head;
    while (iter != NULL) {
        struct _logentry *entry = iter->data;
        free (entry->channel_name);
        free (entry->file_name);
        free (entry);
        entry = NULL;
        iter = iter->next;
    }

Goal: ensure absence of double free.
Data structure invariant:
For every node x [universal quantification] in the list between log_list.head and null [reachability predicate]:
• x->data is a unique pointer, and
• x->data->channel_name is a unique pointer, and
• x->data->file_name is a unique pointer.

Challenges in program verification
• Concise and precise expression of non-aliasing and disjointness of the heap
• Properties of unbounded collections
  • Lists, Arrays, Trees
  • …..
• …..

Limitations of SMT solvers
• No support for precise reasoning with reachability predicate
  • Incompleteness in Floyd-Hoare proofs for straight-line code
• Brittle support for quantifiers
  • Complexity: NP-complete (ground) → undecidable
  • Leads to unpredictable behavior of verifiers
    • Proof times, proof success rate
  • Requires user ingenuity to craft axioms/invariants with quantifiers

HAVOC: Heap Aware Verifier fOr C programs
• Modular verifier for C programs
• Being developed at MSR Redmond
• Focus on low-level systems software (file systems, device drivers, ...)
• Linked lists, nested data structures, ...

Contribution
• Expressive and efficient logic for precise program verification with the heap
• A decision procedure for the logic within SMT solvers

Memory model
• Heap consists of a set of objects (obj)
• Each field “f” is a mutable map f: obj → obj

Reachability predicate: Btwn_f
[Figure: the doubly linked list with next, prev, and data fields; two nodes x and y are marked, and the sets Btwn_next(x, y) and Btwn_prev(y, x) are highlighted.]
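One way to read the reachability predicate is as a set-valued function over a field map. The sketch below assumes that Btwn_f(x, y) is the set of nodes on the f-path from x up to the first occurrence of y, inclusive, and is empty when y is not reachable from x. That reading, the node names, and the `"null"` sentinel (mapped to itself so that every field is total) are assumptions made for illustration.

```python
NULL = "null"  # assumed sentinel; each field maps it to itself

def btwn(f, x, y):
    """Nodes on the f-path from x up to the first occurrence of y, inclusive.

    Returns the empty set when following f from x never reaches y.
    """
    path = []
    seen = set()
    cur = x
    while cur not in seen:
        path.append(cur)
        if cur == y:
            return set(path)
        seen.add(cur)
        cur = f[cur]
    return set()  # the walk started repeating without meeting y

# A three-node doubly linked list: n1 <-> n2 <-> n3
next_ = {"n1": "n2", "n2": "n3", "n3": NULL, NULL: NULL}
prev_ = {"n3": "n2", "n2": "n1", "n1": NULL, NULL: NULL}

print(btwn(next_, "n1", "n3"))  # segment from n1 to n3 along next
print(btwn(prev_, "n3", "n1"))  # the same segment along prev
print(btwn(next_, "n3", "n1"))  # n1 is not reachable from n3 along next
```

Under this reading, a cyclic list with head hd is covered by Btwn_f(f(hd), hd), because the walk from f(hd) stops at the first time it returns to hd.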

Inverse of a function: f⁻¹
[Figure: the same doubly linked list; the data fields of two nodes x and y point to the same object w, so data⁻¹(w) = {x, y}.]

    LinkNode *iter = log_list.head;
    while (iter != NULL) {
        struct _logentry *entry = iter->data;
        free (entry->channel_name);
        free (entry->file_name);
        free (entry);
        entry = NULL;
        iter = iter->next;
    }

Data structure invariant:
For every node x in the list between log_list.head and null: x->data is a unique pointer, and ….
  ∀x ∈ Btwn_f(log_list.head, null) \ {null}. data⁻¹(data(x)) = {x}
  ….
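The first conjunct of the invariant can be evaluated on a concrete heap. The sketch below assumes f is the next field, reads Btwn_f(x, y) as the nodes on the path from x to the first occurrence of y, and uses hypothetical node and entry names. It checks the formula on one heap; it does not verify the loop.

```python
NULL = "null"  # assumed sentinel; next maps it to itself

def btwn(f, x, y):
    """Nodes on the f-path from x up to the first occurrence of y (empty if unreachable)."""
    path, seen, cur = [], set(), x
    while cur not in seen:
        path.append(cur)
        if cur == y:
            return set(path)
        seen.add(cur)
        cur = f[cur]
    return set()

def inverse(f, w):
    """f^-1(w): every object whose f field points to w."""
    return {x for x, target in f.items() if target == w}

def data_is_unique(next_, data, head):
    """Check: for all x in Btwn_next(head, null) minus {null}, data^-1(data(x)) = {x}."""
    nodes = btwn(next_, head, NULL) - {NULL}
    return all(inverse(data, data[x]) == {x} for x in nodes)

next_ = {"n1": "n2", "n2": "n3", "n3": NULL, NULL: NULL}

good_data = {"n1": "e1", "n2": "e2", "n3": "e3"}    # every entry owned by one node
shared_data = {"n1": "e1", "n2": "e2", "n3": "e1"}  # n1 and n3 share e1

print(data_is_unique(next_, good_data, "n1"))
print(data_is_unique(next_, shared_data, "n1"))
```

On the second heap the loop would call free on e1 twice, which is exactly the double free the invariant rules out.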

Expressive logic
• Express properties of collections
    ∀x ∈ Btwn_f(f(hd), hd). state(x) = LOCKED    // cyclic
• Arithmetic reasoning on data (e.g. sortedness)
    ∀x ∈ Btwn_f(hd, null) \ {null}. ∀y ∈ Btwn_f(x, null) \ {null}. d(x) ≤ d(y)
• Type/object invariants
    ∀x ∈ Type⁻¹(“__logentry”). logtype(x) > 0 ⇒ file_name(x) != null

Precise
Need annotations/abstractions only at procedure/loop boundaries
• Given the Floyd-Hoare triple X = {P} S {Q}
  • P and Q are expressed in our logic
  • S is a loop-free call-free program
• We can construct a formula Y in our logic
  • Y is linear in the size of X
  • X is valid iff Y is valid

Efficient
• Decision problem is NP-complete
  • Can’t expect any better with propositional logic!
  • Retains the complexity of current SMT logics
• Provide a decision procedure for the logic on top of the state-of-the-art Z3 SMT solver
  • Leverages powerful ground-theory reasoning (arithmetic, arrays, uninterpreted functions, …)

Rest of the talk
• Logic and decision procedure
• Implementation and experience in HAVOC
• Related work

Logic
Ground logic:
  t ∈ Term ::= c | x | t1 + t2 | t1 - t2 | f(t)
  G ∈ GFormula ::= t = t’ | t < t’ | t ∈ Btwn_f(t1, t2) | ¬G

  S ∈ Set ::= f⁻¹(t) | Btwn_f(t1, t2)
  F ∈ Formula ::= G | F1 ∧ F2 | F1 ∨ F2 | ∀x ∈ S. F

Ground decision procedure
• Provide a set of 10 rewrite rules for Btwn_f
  • Sound, complete and terminating
• E.g. Transitivity3

    t1 ∈ Btwn_f(t0, t2)    t ∈ Btwn_f(t0, t1)
    ------------------------------------------
    t ∈ Btwn_f(t0, t2),  t1 ∈ Btwn_f(t, t2)
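The soundness of a rule such as Transitivity3 can be checked exhaustively on small heaps. The sketch below assumes Btwn_f(x, y) is the set of nodes on the f-path from x to the first occurrence of y (empty if y is unreachable), and enumerates every function f over a domain of n nodes. It is a bounded sanity check of one rule under that assumed semantics, not the decision procedure from the talk.

```python
from itertools import product

def btwn(f, x, y):
    """Nodes on the f-path from x up to the first occurrence of y (empty if unreachable).

    f is a tuple where f[i] is the successor of node i.
    """
    path, seen, cur = [], set(), x
    while cur not in seen:
        path.append(cur)
        if cur == y:
            return set(path)
        seen.add(cur)
        cur = f[cur]
    return set()

def transitivity3_holds(f, t0, t1, t2, t):
    """If both premises hold, both conclusions must hold."""
    premises = t1 in btwn(f, t0, t2) and t in btwn(f, t0, t1)
    if not premises:
        return True
    return t in btwn(f, t0, t2) and t1 in btwn(f, t, t2)

def check_rule(n):
    """Check the rule for every function f on n nodes and every choice of t0, t1, t2, t."""
    nodes = range(n)
    for f in product(nodes, repeat=n):
        for t0, t1, t2, t in product(nodes, repeat=4):
            if not transitivity3_holds(f, t0, t1, t2, t):
                return False
    return True

print(check_rule(4))
```

Because f ranges over all functions, the enumeration covers acyclic lists, cycles, and lassos alike.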

t  Term ::= c | x | t1 + t2 | t1 - t2 | f(t) G  GFormula ::= t = t’| t < t’ | t  Btwnf(t1, t2) | G Logic Bounded quantification over interpreted sets S  Set ::= f-1(t) | Btwnf(t1, t2) F  Formula ::= G | F1 F2 |F1 F2 | x  S. F

Sort restriction
• The unsorted logic is undecidable!
• Unsorted logic → Sorted logic
  • Each term has a sort D, each function f has a sort D → E
  • There is a partial order on the sorts
• Sort-restriction on ∀x ∈ S. F
  • sort(x) should be less than sort(t[x]) for any term t[x] inside F

Sort restriction
• Sort-restriction on ∀x ∈ S. F
  • sort(x) should be less than sort(t[x]) for any term t[x] inside F
• Sorts are quite natural
  • Come from program types
  • Most interesting specifications can be expressed
  • See paper for exceptions

Lazy quantifier instantiation
• Instantiation rule

    t ∈ S    ∀x ∈ S. F
    -------------------
    F[t/x]

• Lazy instantiation
  • Instantiate only when a term t belongs to the set S
  • Substantially reduces the number of terms used to instantiate a quantified fact
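The instantiation rule can be pictured as a filter over known membership facts. The sketch below is a toy illustration: formulas are plain strings, sets are matched by name only, and the function `instantiate_lazily` and all example terms are hypothetical. A real solver would match terms modulo the equalities it has derived.

```python
def instantiate_lazily(quantified_facts, membership_facts):
    """Apply the rule: from t in S and (forall x in S. F), derive F[t/x].

    quantified_facts: list of (set_name, body), where body(t) builds F[t/x].
    membership_facts: list of (term, set_name) pairs already known to hold.
    Only terms known to belong to S produce an instance.
    """
    instances = []
    for term, set_name in membership_facts:
        for s, body in quantified_facts:
            if s == set_name:
                instances.append(body(term))
    return instances

# forall x in Btwn_next(hd, null). state(x) = LOCKED
facts = [("Btwn_next(hd, null)", lambda t: f"state({t}) = LOCKED")]

# Ground facts the solver has already established (assumed for the example).
known = [
    ("iter", "Btwn_next(hd, null)"),
    ("p", "Btwn_prev(tl, null)"),   # different set, so no instance is produced
]

print(instantiate_lazily(facts, known))
```

Eager instantiation would substitute every ground term for x; the lazy rule produces an instance only for iter here.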

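Implementation using triggers
E.g. Transitivity3

    t1 ∈ Btwn_f(t0, t2)    t ∈ Btwn_f(t0, t1)
    ------------------------------------------
    t ∈ Btwn_f(t0, t2),  t1 ∈ Btwn_f(t, t2)

is encoded as a quantified axiom with a trigger, where Btwn_f(a, b, c) stands for c ∈ Btwn_f(a, b):

    ∀t0, t1, t2, t :: {Btwn_f(t0, t2, t1), Btwn_f(t0, t1, t)}
        Btwn_f(t0, t2, t1) ∧ Btwn_f(t0, t1, t) ⇒ Btwn_f(t0, t2, t) ∧ Btwn_f(t, t2, t1)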

Preliminary results
• Implemented a prototype in HAVOC
  • Uses Boogie and Z3
• Compared with an earlier [TACAS’07] implementation
  • Unrestricted quantifiers, incomplete axiomatization of reachability, no f⁻¹
• Small to medium sized benchmarks

Experience
• Greatly improved the predictability of the verifier
  • Reduced runtimes (2X – 100X)
  • Eliminated the need for carefully crafted axioms and invariants
  • Can handle newer examples
• Next step: infer annotations
  • E.g. interpolants (builds on SMT solver)

Related work
• Ternary reachability predicate
  • Nelson ’83, Rakamaric et al. ’07
• Decidability with quantifiers (no reachability)
  • McPeak and Necula ’05
• Separation logic [Berdine et al. ’04]
• Second-order logic
  • MONA [Moeller et al. ’01], LRP [Yorsh et al. ’06]

Ongoing work
• HAVOC available for download
  • www.research.microsoft.com/projects/HAVOC
• Extend the logic to other interpreted set constructors
  • All elements in an array
  • All pointers of a given type for C

Questions
