1 / 45

Modeling Data in Formal Verification Bits, Bit Vectors, or Words

Modeling Data in Formal Verification Bits, Bit Vectors, or Words. Randal E. Bryant Carnegie Mellon University. http://www.cs.cmu.edu/~bryant. Overview. Issue How should data be modeled in logical analysis? Verification, synthesis, test generation, security analysis, … Approaches

bona
Télécharger la présentation

Modeling Data in Formal Verification Bits, Bit Vectors, or Words

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Modeling Data in Formal VerificationBits, Bit Vectors, or Words Randal E. Bryant Carnegie Mellon University http://www.cs.cmu.edu/~bryant

  2. Overview • Issue • How should data be modeled in logical analysis? • Verification, synthesis, test generation, security analysis, … • Approaches • Bits: Every bit is represented individually • Words: View each word as arbitrary value • E.g., unbounded integers • Historic program verification work • Bit Vectors: Finite precision words • Captures true semantics of hardware and software

  3. Data Path Com. Log. 1 Com.Log. 2 Bit-Level Modeling Control Logic • Represent Every Bit of State Individually • Behavior expressed as Boolean next-state over current state • Historic method for most CAD, testing, and verification tools • E.g., model checkers, test generators

  4. Bit-Level Modeling in Practice • Strengths • Allows precise modeling of system • Well developed technology • BDDs & SAT for Boolean reasoning • Limitations • Every state bit introduces two Boolean variables • Current state & next state • Overly detailed modeling of system functions • Don’t want to capture full details of FPU • Making It Work • Use extensive abstraction to reduce bit count • Hard to abstract functionality

  5. x Word-Level Abstraction #1: Bits → Integers x0 • View Data as Symbolic Words • Arbitrary integers • No assumptions about size or encoding • Classic model for reasoning about software • Can store in memories & registers x1 x2 xn-1

  6. Data Path Data Path Com. Log. 1 Com. Log. 1 ? Com.Log. 2 Com. Log. 1 ? What do we do about logic functions? Abstracting Data Bits Control Logic

  7. ALU Word-Level Abstraction #2: Uninterpreted Functions • For any Block that Transforms or Evaluates Data: • Replace with generic, unspecified function • Only assumed property is functional consistency: a = x b = y f(a, b) = f(x, y) f

  8. F1 F2 Abstracting Functions Control Logic • For Any Block that Transforms Data: • Replace by uninterpreted function • Ignore detailed functionality • Conservative approximation of actual system Data Path Com. Log. 1 Com. Log. 1

  9. Word-Level Modeling: History • Historic • Used by theorem provers • More Recently • Burch & Dill, CAV ’94 • Verify that pipelined processor has same behavior as unpipelined reference model • Use word-level abstractions of data paths and memories • Use decision procedure to determine equivalence • Bryant, Lahiri, Seshia, CAV ’02 • UCLID verifier • Tool for describing & verifying systems at word level

  10. Pipeline Verification Example Pipelined Processor Reference Model

  11. Abstracted Pipeline Verification Pipelined Processor Reference Model

  12. Experience with Word-Level Modeling • Powerful Abstraction Tool • Allows focus on control of large-scale system • Can model systems with very large memories • Hard to Generate Abstract Model • Hand-generated: how to validate? • Automatic abstraction: limited success • Andraus & Sakallah, DAC 2004 • Realistic Features Break Abstraction • E.g., Set ALU function to A+0 to pass operand to output • Desire • Should be able to mix detailed bit-level representation with abstracted word-level representation

  13. Bit Vectors: Motivating Example #1 • Do these functions produce identical results? • Strategy • Represent and reason about bit-level program behavior • Specific to machine word size, integer representations, and operations int abs(int x) { int mask = x>>31; return (x ^ mask) + ~mask + 1; } int test_abs(int x) { return (x < 0) ? -x : x; }

  14. Motivating Example #3 bit[W] popSpec(bit[W] x) { int cnt = 0; for (int i=0; i<W; i++) { if (x[i]) cnt++; } return cnt; } bit[W] popSketch(bit[W] x) { loop (??) { x = (x&??) + ((x>>??)&??); } return x; } • Is there a way to expand the program sketch to make it match the spec? • Answer • W=16: • [Solar-Lezama, et al., ASPLOS ‘06] x = (x&0x5555) + ((x>>1)&0x5555); x = (x&0x3333) + ((x>>2)&0x3333); x = (x&0x0077) + ((x>>8)&0x0077); x = (x&0x000f) + ((x>>4)&0x000f);

  15. Motivating Example #4 Sequential Reference Model Pipelined Microprocessor • Is pipelined microprocessor identical to sequential reference model? • Strategy • Represent machine instructions, data, and state as bit vectors • Compatible with hardware description language representation • Verifier finds abstractions automatically

  16. Bit Vector Formulas • Fixed width data words • Arithmetic operations • Add/subtract/multiply/divide, … • Two’s complement, unsigned, … • Bit-wise logical operations • Bitwise and/or/xor, shift/extract, concatenate • Predicates • ==, <= • Task • Is formula satisfiable? • E.g., a > 0 && a*a < 0 50000 * 50000 = -1794967296 (on 32-bit machine)

  17. Decision Procedure Satisfying solution Formula Unsatisfiable (+ proof) Decision Procedures • Core technology for formal reasoning • Boolean SAT • Pure Boolean formula • SAT Modulo Theories (SMT) • Support additional logic fragments • Example theories • Linear arithmetic over reals or integers • Functions with equality • Bit vectors • Combinations of theories

  18. Recent Progress in SAT Solving

  19. BV Decision Procedures:Some History • B.C. (Before Chaff) • String operations (concatenate, field extraction) • Linear arithmetic with bounds checking • Modular arithmetic • Limitations • Cannot handle full range of bit-vector operations

  20. BV Decision Procedures:Using SAT • SAT-Based “Bit Blasting” • Generate Boolean circuit based on bit-level behavior of operations • Convert to Conjunctive Normal Form (CNF) and check with best available SAT checker • Handles arbitrary operations • Effective in Many Applications • CBMC [Clarke, Kroening, Lerda, TACAS ’04] • Microsoft Cogent + SLAM [Cook, Kroening, Sharygina, CAV ’05] • CVC-Lite [Dill, Barrett, Ganesh], Yices [deMoura, et al]

  21. Bit-Vector Challenge • Is there a better way than bit blasting? • Requirements • Provide same functionality as with bit blasting • Find abstractions based on word-level structure • Improve on performance of bit blasting • Observation • Must have bit blasting at core • Only approach that covers full functionality • Want to exploit special cases • Formula satisfied by small values • Simple algebraic properties imply unsatisfiability • Small unsatisfiable core • Solvable by modular arithmetic • …

  22. Some Recent Ideas • Iterative Approximation • UCLID: Bryant, Kroening, Ouaknine, Seshia, Strichman, Brady, TACAS ’07 • Use bit blasting as core technique • Apply to simplified versions of formula • Successive approximations until solve or show unsatisfiable • Using Modular Arithmetic • STP: Ganesh & Dill, CAV ’07 • Algebraic techniques to solve special case forms • Layered Approach • MathSat: Bruttomesso, Cimatti, Franzen, Griggio, Hanna, Nadel, Palti, Sebastiani, CAV ’07 • Use successively more detailed solvers

  23. + Overapproximation More solutions: If unsatisfiable, then so is    + Fewer solutions: Satisfying solution also satisfies  −   Underapproximation − Iterative Approach Background: Approximating Formula • Example Approximation Techniques • Underapproximating • Restrict word-level variables to smaller ranges of values • Overapproximating • Replace subformula with Boolean variable  Original Formula

  24. Starting Iterations • Initial Underapproximation • (Greatly) restrict ranges of word-level variables • Intuition: Satisfiable formula often has small-domain solution  1−

  25. 1+ UNSAT proof: generate overapproximation If SAT, then done First Half of Iteration • SAT Result for 1− • Satisfiable • Then have found solution for  • Unsatisfiable • Use UNSAT proof to generate overapproximation 1+ • (Described later)  1−

  26. If UNSAT, then done SAT: Use solution to generate refined underapproximation 2− Second Half of Iteration • SAT Result for 1+ • Unsatisfiable • Then have shown  unsatisfiable • Satisfiable • Solution indicates variable ranges that must be expanded • Generate refined underapproximation 1+  1−

  27. Iterative Behavior 2+ 1+ • Underapproximations • Successively more precise abstractions of  • Allow wider variable ranges • Overapproximations • No predictable relation • UNSAT proof not unique    k+  k−    2− 1−

  28. 2+ 1+    UNSAT k+  k−    2− 1− SAT Overall Effect • Soundness • Only terminate with solution on underapproximation • Only terminate as UNSAT on overapproximation • Completeness • Successive underapproximations approach  • Finite variable ranges guarantee termination • In worst case, get k−  

  29. 1+ UNSAT proof: generate overapproximation  1− 2− Generating Overapproximation • Given • Underapproximation 1− • Bit-blasted translation of 1− into Boolean formula • Proof that Boolean formula unsatisfiable • Generate • Overapproximation 1+ • If 1+ satisfiable, must lead to refined underapproximation • Generate 2− such that 1−  2−  

  30. x = y x + 2 z 1 a w & 0xFFFF = x x % 26 = v Bit-Vector Formula Structure • DAG representation to allow shared subformulas 

  31. x = y x + 2 z 1 w Range Constraints x y a z w & 0xFFFF = x x % 26 = v Æ Structure of Underapproximation • Linear complexity translation to CNF • Each word-level variable encoded as set of Boolean variables • Additional Boolean variables represent subformula values −

  32. w Range Constraints x y z Ç Æ : Æ Ç UNSAT Proof • Subset of clauses that is unsatisfiable • Clause variables define portion of DAG • Subgraph that cannot be satisfied with given range constraints x = y x + 2 z 1 a Æ w & 0xFFFF = x Ç x % 26 = v

  33. w Range Constraints x y z Ç Æ : Æ Ç Extracting Circuit from UNSAT Proof • Subgraph that cannot be satisfied with given range constraints • Even when replace rest of graph with unconstrained variables x = y x + 2 z 1 UNSAT a Æ b1 b2

  34. Ç Æ : Ç Generated Overapproximation • Remove range constraints on word-level variables • Creates overapproximation • Ignores correlations between values of subformulas x = y x + 2 z 1 a 1+ Æ b1 b2

  35. w Range Constraints x y z Ç Æ : Æ Ç Refinement Property • Claim • 1+ has no solutions that satisfy 1−’s range constraints • Because 1+ contains portion of 1− that was shown to be unsatisfiable under range constraints x = y x + 2 z 1 UNSAT a Æ 1+ b1 b2

  36. Ç Æ : Ç Refinement Property (Cont.) • Consequence • Solving 1+ will expand range of some variables • Leading to more exact underapproximation 2− x = y x + 2 z 1 a 1+ Æ b1 b2

  37. SAT: Use solution to generate refined underapproximation 2− Effect of Iteration • Each Complete Iteration • Expands ranges of some word-level variables • Creates refined underapproximation 1+ UNSAT proof: generate overapproximation  1−

  38. Approximation Methods • So Far • Range constraints • Underapproximate by constraining values of word-level variables • Subformula elimination • Overapproximate by assuming subformula value arbitrary • General Requirements • Systematic under- and over-approximations • Way to connect from one to another • Goal: Devise Additional Approximation Strategies

  39. x * y Function Approximation Example • Motivation • Multiplication (and division) are difficult cases for SAT • §: Prohibit Via Additional Range Constraints • Gives underapproximation • Restricts values of (possibly intermediate) terms • §: Abstract as f(x,y) • Overapproximate as uninterpreted function f • Value constrained only by functional consistency

  40. Challenges with Iterative Approximation • Formulating Overall Strategy • Which abstractions to apply, when and where • How quickly to relax constraints in iterations • Which variables to expand and by how much? • Too conservative: Each call to SAT solver incurs cost • Too lenient: Devolves to complete bit blasting. • Predicting SAT Solver Performance • Hard to predict time required by call to SAT solver • Will particular abstraction simplify or complicate SAT? • Combination Especially Difficult • Multiple iterations with unpredictable inner loop

  41. STP: Linear Equation Solving • Ganesh & Dill, CAV ’07 • Solve linear equations over integers mod 2w • Capture range of solutions with Boolean Variables • Example Problem • Variables: 3-bit unsigned integers x: [x2 x1 x0] y: [y2 y1 y0] z: [z2 z1 z0] • Linear equations: conjunction of linear constraints • General Form • A x + b = 0 mod 2w

  42. Summary: Modeling Levels • Bits • Limited ability to scale • Hard to apply functional abstractions • Words • Allows abstracting data while precisely representing control • Overlooks finite word-size effects • Bit Vectors • Realistic semantic model for hardware & software • Captures all details of actual operation • Detects errors related to overflow and other artifacts of finite representation • Can apply abstractions found at word-level

  43. Areas of Agreement • SAT-Based Framework Is Only Logical Choice • SAT solvers are good & getting better • Want to Automatically Exploit Abstractions • Function structure • Arithmetic properties • E.g., associativity, commutativity • Arithmetic reductions • E.g., LU decomposition • Base Level Should Be SAT • Only semantically complete approach

  44. Choices • Optimize for Special Formula Classes • E.g., STP optimized for conjunctions of constraints • Common in software verification & testing • Iterative Abstraction • Natural framework for attempting different abstractions • Having SAT solver in inner loop makes performance tuning difficult • Others?

  45. Observations • Bit-Vector Modeling Gaining in Popularity • Recognition of importance • Benchmarks and competitions • Just Now Improving on Bit Blasting + SAT • Lots More Work to be Done

More Related