200 likes | 318 Vues
This paper presents a novel approach to automated software verification, focusing on the challenges posed by dynamically modified heaps. It explores natural proofs for structured data, separation logic, and various data structures such as trees and linked lists. The authors introduce Dryad, a new separation logic dialect that facilitates the development of automatic verification conditions. The work highlights significant advancements in specifying complex data structures and automating reasoning, thereby enhancing tools like Boogie and VCC for successful deducing program correctness.
E N D
Natural proofs for structure, data, and separation xiaokangqiuPranavgarg Andrei Stefanescu P. Madhusudan University of Illinois at Urbana-Champaign PLDI 2013, SEATTLE, WA
Motivation Don’t know • Automatic Deductive Software Verification • Success stories: Hypervisor (MSR), ExpressOS (Illinois), etc. using tools like Boogie, VCC, etc. • Challenge: dynamically modified heap • Structure (e.g., “p points to the root of a tree”) • Data (e.g., “the integers stored in a list is sorted”) • Separation (e.g., “the procedure modifies only elements reachable using the next field”) int f(int x){ assume x>0; … inv x < y; … assert y>0; } VC-Generator Verification Condition SMT Solver Annotated Program yes no
Example: Memory Isolation … … PID: 12 PID: 30 PID: 19 Start Addr Start Addr Start Addr Verify invariants in ExpressOS[@UIUC, ASPLOS’13] • “Every two assigned memory chunks are disjoint“ • “The start addresses are sorted along the doubly-linked list” • “The memory allocator modifies only this doubly-linked list” • Key Challenge: Prev Prev Prev End Addr End Addr End Addr Next Next Next … … Structure, Data,and Separationare mixed UNDECIDABLE • Expressivelogics • to specify complex • data structures • Automatedreasoning • with these • sophisticated logics
heap verification Expressive keep expressiveness Expressive Logics: Separation logics, HOL, Matching logic, etc. Natural Proofs give up decidability sound but incomplete preserve automaticity Decidable Logics: Strand, LISBQ, CSL, etc. Automatic
Natural proofs: in a nutshell • Handle a logic that is very expressive (inevitably undecidable) • Retain automaticity at the same level as decidable logics • Identify a class of naturalproofs N such that • N includes natural proof tactics used in human proofs • Many correct programs can be proved using a proof in class N • The class N is effectively searchable (checking if there is a proof in N is efficiently decidable) All Proofs N
More generalNatural proofs • Natural Proofs for Trees [POPL’12] • only for trees (can’t say “x points to a tree and y points into the tree”) • only functional recursion (no while-loops, too weak for loop invariants) • only classical logic (no frame reasoning) • In this work: Natural Proofs for Separation Logic • A wide variety of structures, and their combination (trees, doubly-linked lists, cyclic lists, etc.) • Iterative and Recursive (pre, post, and loop-inv.) • Separation Logic (frame reasoning is natively supported)
contribution • Dryad: a dialect of separation logic that is amenable to natural proofs • No explicit quantification, but supports recursivedefinitions • Admits a translation to quantifier-free classic logic with recursion • Automatic VC-generation for Dryad-annotated programs • Develop natural proofs using decision procedures (powered by SMT solvers) Two proof tactics for handling recursive defs[Suter et al., 2010] : • Unfold across the footprint • Formula abstraction These two tactics form the class of natural proofs!
Binary search tree in dryad • (loop invariant for “bst-search”: • root points to a BST and curr points into the BST; • k is stored in the BST iffk is stored under curr) the union of the keys stored in the root and the left/right sub-trees root right subtree: a BST left subtree: a BST the root Recursive predicate Recursive function Base case: empty heaplet SeparatingConjunction Base case: empty set curr k
Why dryad? • The heaplet semantics is not supported by automatic theorem provers • Solution:translated to • Capture the scope (heaplet required) of a formula using set-variables? • Empty heap emp : • Singleton heap : • Separating Conj.t t’: • Recursive definition : • Separation Logic: The semantics of is arbitrary fix point of the recursive definition, and the scope is not syntactically determined reachl,r(x) ? Dryad: the scope is exactly the reachable locations from x using l and r (only difference between Dryad and classic Separation Logic)
Translating heaplets to sets • The heaplet required for a Dryad formula can be syntactically determined • the scope can be modeled as a set of locations, and the heaplet semantics can be expressed using free set variables Example: • translated to • (still quantifier-free)
natural proofs for dryad • We consider programs with modular annotations • A simple heap-manipulating language • (supports memory allocation/deallocation, destructive pointer/data updates, and function calls) • Pre/post-conditions and loop invariants written in Dryad • Footprint:dereferenced locations (x is in footprint if y := x.next) • Verify programs in 4 steps • 1. Translate Dryad to classic logic with recursion • 2. VC-Generation (compute strongest-post) • 3. Unfold recursive defsacross the footprint (still precise; explained later) • 4. Formula Abstraction • (recursive definitions uninterpreted, becomes sound but incomplete) Natural Proofs: Two proof tactics inspired by human proofs
1st proof tactic:Unfolding across the footprint • The verification condition ψVC involves recursive definitions that can be unfolded ad infinitum. • Where to stop? • In most cases: unfold only across the footprint (locations get dereferenced in the program). Footprint: {x, z}, then assert tree(x) = … tree(z) = … … y = x.l; z = x.r; z.l = y; … Non-footprint: {y}, then assert tree(y) /\ NoMod(y) ==> tree’(y)
1st proof tactic:Unfolding across the footprint • Formally, we capture this proof tactic using a set of conjuncts: • Theorem • ψVC is valid iffψ’VC is valid. Unfold only across the footprint Apply the frame rule for non-footprint locations
2nd proof tactic:formula abstraction • Natural Proofs in two steps: • Formula Abstraction • replace each recursive definition rec with uninterpreted • when a proof for is found, we call it a natural proof for (sound but incomplete) • is expressible in the decidable theories • (theory of sets/arrays, maps, uninterpretedfunctions and integers) • and is solvable using modern SMT solvers.
User-provided axioms • Proof tactics cannot effectively find the relationship between data structures • Meta-level axioms for structures (independent of program verified) • Relating simpler data structures and more sophisticated ones, e.g., • Relating partial data structures and complete ones, e.g., • Instantiated at all non-footprint locations • Still automatic? • Axioms are program-independent, defined once and for all • Can be encoded in the verifier
Natural proofs for dryad Program + Recursive defs + Annotations in Dryad Program + Recursive defs + annptations in Classical-logic VCGen Translate Verification Conditions no Formula in a dec. logic over sets: Does a natural proof exist? Encode NTwo proof tactics for recursive defs yes Don’t know SMT Solver
experimental evaluation • A prototype verifier with Z3 as the backend solver (more details at http://web.engr.illinois.edu/~qiu2/dryad/) Inductive data structures singly-linked list, sorted list, doubly-linked list, cyclic list, max-heap, BST, Treap, AVL tree, red-black tree, binomial heap, … Data structures from open source programs • Glib library: singly/doubly-linked list • OpenBSD library: queue • Linux kernel: VMA for virtual memory management ExpressOS: a secure mobile platform • Structures for memory management
We verified … 106 Dryad-annotated programs • insert, delete, search, reverse, rotate, … (recursive & iterative) • Quicksort, Schorr-Waite algorithm for trees, etc. • VMA: virtual memory operations • ExpressOS: operations memory isolation • full-functional, partial correctness (e.g., data structure invariants, expected set-of-keys, …) All the VCs were automaticallyproved by our tool! • 69 verified within 1 sec • 96 verified within 10 sec • 104 verified within 100 sec “RBT-delete_iter” spent 225 secs, “binomial-heap-merge_rec” spent 153 secs (To the best of our knowledge) • First terminating automatic mechanism that can prove such a wide variety of data-structure manipulating programs full-functionally correct
Related work • Natural proofs • [Suter et al., POPL’10]: Proof tactics for functional programs • [Madhusudan et al., POPL’12]: Imperative programs, only trees, no while-loops, no frame reasoning • Separation Logic • Verifast, Bedrock, etc.: proof tactics provided by the user per program • Implicit Dynamic Frames: also using sets for heaplet, not syntactically determined, not quantifier-free • HIP/Sleek: search for a class of proofs, but not in any decidable theory • Classical Logic • Dafny, VCC, etc.: require significant ghost annotations • Jahob: often relies on complex provers, e.g., Mona or Isabelle
conclusion • Dryad • an expressive, tractable dialect of separation logic • A translation to QF classical logic with recursion • Established a natural proof scheme • Algorithmically search for a subclass of proofs • Automatic, sound but incomplete • Formalized two proof tactics for inductive data structures • A verifier that handles Hoare-triples automatically using SMT solvers Future work • More proof tactics for natural proofs (non-inductive structures, e.g., graphs) • Natural Proofs for C (incorporate natural proofs into existing verifiers, e.g., VCC) • Natural Proofs for concurrent programs (Rely-Guarantee + Dryad) All Proofs Natural proofs