Carnegie Mellon University

Decision Procedures Customized for Formal Verification. Randal E. Bryant. Carnegie Mellon University. http://www.cs.cmu.edu/~bryant. Contributions by former graduate students: Sanjit Seshia, Shuvendu Lahiri.


  1. Decision Procedures Customized for Formal Verification Randal E. Bryant Carnegie Mellon University http://www.cs.cmu.edu/~bryant Contributions by former graduate students: Sanjit Seshia, Shuvendu Lahiri

  2. Outline • Context • Infinite state models of hardware systems • Verification techniques • Needs • Requirements for decision procedures • Dealing with quantifiers • Our Solution • SAT-based procedure • “Eager” Boolean encoding

  3. Verification Example • Task • Verify that microprocessor correctly implements instruction set definition • Even though heavily pipelined • [Figure: Alpha 21264 microprocessor. Source: Microprocessor Report, Oct. 28, 1996]

  4. Existing Hardware Verification Methods • Simulators, equivalence checkers, model checkers, … • All Operate at Bit Level • View each register or memory bit as state variable • Behavior of each state variable defined by Boolean function • Strengths • Finite-state systems conceptually simple • BDDs & SAT procedures allow high degrees of automation • Limitations • State space can be very large • Only verify fixed instantiation of system • Specific memory sizes, number of processes, buffer lengths, …

  5. Verification Challenges • Sources of Complexity • Lots of internal state • Complex control logic • Opportunities • Most of the logic serves to store, select, and communicate data • [Figure: Alpha 21264 microprocessor. Source: Microprocessor Report, Oct. 28, 1996]

  6. Applying Data Abstraction to Hardware Verification • Idea • Abstract details of data encodings and operations • Keep control logic precise • Applications • Verify overall correctness of system • Assuming individual functional units correct • Advantages of Abstraction • Abstract infinite-state system easier to verify than detailed finite-state one • Parametric representation allows verification of many different system variants • Arbitrary number of processes, buffer lengths, etc.

  7. Word Abstraction • Data: Abstract details of form & functions • Control: Keep at bit level • Timing: Keep at cycle level • [Figure: control logic and data path containing combinational logic blocks Com. Log. 1 and Com. Log. 2]

  8. Data Abstraction #1: Bits → Terms • View Data as Symbolic Words • Arbitrary integers • No assumptions about size or encoding • Classic model for reasoning about software • Can store in memories & registers • [Figure: bits x0, x1, x2, …, xn-1 abstracted to a single term x]

  9. Abstracting Data Bits • What do we do about logic functions? • [Figure: control logic and data path before and after data abstraction; the combinational logic blocks Com. Log. 1 and Com. Log. 2 are marked "?"]

  10. Abstraction #2: Uninterpreted Functions • For any Block that Transforms or Evaluates Data: • Replace with generic, unspecified function • Only assumed property is functional consistency: a = x ∧ b = y ⇒ f(a, b) = f(x, y) • [Figure: ALU replaced by function f]

  11. Abstracting Functions • For Any Block that Transforms Data: • Replace by uninterpreted function • Ignore detailed functionality • Conservative approximation of actual system • [Figure: combinational logic blocks in the data path replaced by functions F1 and F2; control logic unchanged]

  12. Abstraction #3: Modeling Memories as Mutable Functions • Memory M Modeled as Function • M(a): Value at location a • Initially • Arbitrary state • Modeled by uninterpreted function m0 • [Figure: memory M read at address a; initial contents given by m0 applied to a]

  13. Effect of Memory Write Operation • Writing Transforms Memory • M′ = Write(M, wa, wd) • Reading from updated memory: • Address wa will get wd • Otherwise get what’s already in M • Express with Lambda Notation • M′ = λ a . ITE(a = wa, wd, M(a)) • [Figure: multiplexer selecting wd when a = wa and M(a) otherwise]
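The lambda formulation of a memory write can be sketched with Python closures. This is only an illustration: `write` and the placeholder `m0` (which returns a tagged tuple standing in for "whatever the initial memory holds") are hypothetical names, not part of UCLID.

```python
def m0(a):
    # Hypothetical stand-in for the uninterpreted initial memory:
    # returns a symbolic tag rather than a concrete value.
    return ("m0", a)

def write(mem, wa, wd):
    # M' = lambda a . ITE(a = wa, wd, M(a))
    return lambda a: wd if a == wa else mem(a)

m1 = write(m0, 4, 99)   # store 99 at address 4
m2 = write(m1, 7, 5)    # store 5 at address 7

print(m2(4))  # value written by the first write
print(m2(7))  # value written by the second write
print(m2(3))  # untouched address falls through to m0
```

Each write wraps the previous memory in one more ITE layer, which is why the memory is called a mutable function: no table is ever updated in place.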

  14. Systems with Buffers • [Figure: circular queue and unbounded buffer] • Modeling Method • Mutable function to describe buffer contents • Integers to represent head & tail pointers • Parameterize buffer capacity with symbolic value Max

  15. Some History of Term-Level Modeling • Historically • Standard model used for program verification • Unbounded integer data types • Widely used with theorem-proving approaches to hardware verification • E.g., Hunt ’85 • Automated Approaches to Hardware Verification • Burch & Dill, ’95 • Tool for verifying pipelined microprocessors • Implemented by form of symbolic simulation • Continued application to pipelined processor verification

  16. UCLID • Seshia, Lahiri, Bryant, CAV ’02 • Term-Level Verification System • Language for describing systems • Inspired by CMU SMV • Symbolic simulator • Generates integer expressions describing system state after sequence of steps • Decision procedure • Determines validity of formulas • Support for multiple verification techniques • Available by Download: http://www.cs.cmu.edu/~uclid

  17. Required Logic • Scalar Data Types • Formulas (F): Boolean expressions • Control signals • Terms (T): Integer expressions • Data values • Functional Data Types • Functions (Fun): Integer → Integer • Immutable: Functional units • Mutable: Memories • Predicates (P): Integer → Boolean • Immutable: Data-dependent control • Mutable: Bit-level memories

  18. CLU Logic • Counter Arithmetic, Lambda Expressions, and Uninterpreted Functions • Terms (T): Integer expressions • ITE(F, T1, T2): If-then-else • Fun(T1, …, Tk): Function application • succ(T): Increment • pred(T): Decrement • succ and pred are included to support pointer operations • Formulas (F): Boolean expressions • ¬F, F1 ∧ F2, F1 ∨ F2: Boolean connectives • T1 = T2: Equation • T1 < T2: Inequality • P(T1, …, Tk): Predicate application

  19. CLU Logic (Cont.) • Functions (Fun): Integer → Integer • f: Uninterpreted function symbol • λ x1, …, xk . T: Function definition • Predicates (P): Integer → Boolean • p: Uninterpreted predicate symbol • λ x1, …, xk . F: Predicate definition
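To make the grammar concrete, here is a minimal sketch of CLU terms and formulas as nested Python tuples, with an evaluator under one chosen interpretation. The tuple tags (`"var"`, `"app"`, `"papp"`, and so on) and the concrete function assigned to `f` are assumptions made for illustration. A real decision procedure must reason about all interpretations, not evaluate under one.

```python
def eval_term(t, env, funs):
    """Evaluate a CLU term to an integer under one interpretation."""
    op = t[0]
    if op == "var":
        return env[t[1]]
    if op == "succ":
        return eval_term(t[1], env, funs) + 1
    if op == "pred":
        return eval_term(t[1], env, funs) - 1
    if op == "ite":
        branch = t[2] if eval_formula(t[1], env, funs) else t[3]
        return eval_term(branch, env, funs)
    if op == "app":  # function application Fun(T1, ..., Tk)
        return funs[t[1]](*[eval_term(a, env, funs) for a in t[2]])
    raise ValueError(f"unknown term: {op}")

def eval_formula(f, env, funs):
    """Evaluate a CLU formula to a Boolean under one interpretation."""
    op = f[0]
    if op == "not":
        return not eval_formula(f[1], env, funs)
    if op == "and":
        return eval_formula(f[1], env, funs) and eval_formula(f[2], env, funs)
    if op == "or":
        return eval_formula(f[1], env, funs) or eval_formula(f[2], env, funs)
    if op == "eq":
        return eval_term(f[1], env, funs) == eval_term(f[2], env, funs)
    if op == "lt":
        return eval_term(f[1], env, funs) < eval_term(f[2], env, funs)
    if op == "papp":  # predicate application P(T1, ..., Tk)
        return funs[f[1]](*[eval_term(a, env, funs) for a in f[2]])
    raise ValueError(f"unknown formula: {op}")

# ITE(x < y, succ(x), f(y))
term = ("ite", ("lt", ("var", "x"), ("var", "y")),
        ("succ", ("var", "x")),
        ("app", "f", [("var", "y")]))
funs = {"f": lambda v: v * 10}  # one arbitrary interpretation of f

print(eval_term(term, {"x": 1, "y": 2}, funs))
print(eval_term(term, {"x": 3, "y": 2}, funs))
```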

  20. Outline • Context • Infinite state models of hardware systems • Verification techniques • Needs • Requirements for decision procedures • Dealing with quantifiers • Our Solution • SAT-based procedure • “Eager” Boolean encoding

  21. Verifying Safety Properties • State Machine Model • State encoded as Booleans, integers, and functions • Next-state function δ expresses how state is updated on each step • Prove: System will never reach bad state • [Figure: present state and arbitrary inputs feed δ to produce the next state; reachable states grow from the reset states and must not intersect the bad states]

  22. Bounded Model Checking • Repeatedly Perform Image Computations • Set of all states reachable by one more state transition • Underapproximation of Reachable State Set • But, typically catch most bugs with 8–10 steps • [Figure: nested sets Reset States, R1, R2, …, Rn inside Reachable, separate from Bad States]

  23. Implementing BMC • Construct verification condition formula for step n by symbolically simulating system for n cycles • Check with decision procedure • Do as many cycles as tractable • [Figure: reset state S passed through n copies of δ with inputs X1, X2, …, Xn; is the conjunction with Bad satisfiable?]
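The idea of unrolling the system for up to n cycles and asking whether a bad state can be hit can be sketched on a toy system. This sketch replaces symbolic simulation and the decision procedure with brute-force enumeration of input sequences, so it only illustrates the question being asked. The function `bmc` and the toy counter are hypothetical.

```python
from itertools import product

def bmc(step, reset, bad, inputs, n):
    """Search for an input sequence of length <= n that drives the
    system from the reset state into a bad state.
    Returns the witness sequence, or None if no bad state is hit."""
    for k in range(n + 1):
        for seq in product(inputs, repeat=k):
            s = reset
            for x in seq:
                s = step(s, x)
            if bad(s):
                return seq
    return None

# Toy system: a counter that adds the input bit each cycle.
step = lambda s, x: s + x
bad = lambda s: s == 3

print(bmc(step, 0, bad, [0, 1], 2))  # bound too small to reach the bad state
print(bmc(step, 0, bad, [0, 1], 3))  # three increments reach it
```

A result of `None` says only that no bug exists within the bound, which matches the slide's point that BMC underapproximates the reachable set.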

  24. True Model Checking • Reach Fixed-Point • Rn = Rn+1 = Reachable • Impractical for Term-Level Models • Many systems never reach fixed point • Can keep adding elements to buffer • Convergence test undecidable (Bryant, Lahiri, Seshia, CHARME ’03) • [Figure: nested sets Reset States, R1, R2, …, Rn, separate from Bad States]

  25. Inductive Invariant Checking • Key Properties of System that Make it Operate Correctly • Formulate as formula I • Prove Inductive • Holds initially: I(s0) • Preserved by all state changes: I(s) ⇒ I(δ(i, s)) • [Figure: region I contains the reset states and reachable states and excludes the bad states]

  26. Inductive Invariants • Formulas I1, …, In • Ij(s0) holds for any initial state s0, for 1 ≤ j ≤ n • I1(s) ∧ I2(s) ∧ … ∧ In(s) ⇒ Ij(s′) for any current state s and successor state s′, for 1 ≤ j ≤ n • Overall Correctness • Follows by induction on time • Restricted form of invariants • ∀x1 ∀x2 … ∀xk Φ(x1…xk) • Φ(x1…xk) is a CLU formula without quantifiers • x1…xk are integer variables free in Φ(x1…xk) • Express properties that hold for all buffer indices, register IDs, etc.
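The two proof obligations (the invariant holds initially, and every step preserves it) can be sketched on a toy bounded buffer. The checker below enumerates a finite set of candidate states instead of calling a decision procedure, and the names `check_inductive`, `MAX`, and `step` are hypothetical.

```python
MAX = 4  # hypothetical buffer capacity

def step(count, op):
    """Next-state function for a buffer occupancy counter."""
    if op == "push" and count < MAX:
        return count + 1
    if op == "pop" and count > 0:
        return count - 1
    return count

def check_inductive(inv, init_states, candidate_states, inputs):
    """Check I(s0) for every initial state, and I(s) => I(step(s, i))
    for every candidate state s (reachable or not) and input i."""
    if not all(inv(s0) for s0 in init_states):
        return False
    return all(inv(step(s, i))
               for s in candidate_states if inv(s)
               for i in inputs)

states = range(-2, MAX + 3)      # includes states outside the invariant
ops = ["push", "pop", "nop"]

in_bounds = lambda s: 0 <= s <= MAX
not_three = lambda s: s != 3     # true initially, but not preserved

print(check_inductive(in_bounds, [0], states, ops))
print(check_inductive(not_three, [0], states, ops))
```

The second property fails because the step case quantifies over every state satisfying the invariant, including 2, from which a push reaches 3.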

  27. Proving Invariants • Proving invariants inductive requires quantifiers • ⊨ [∀x1 ∀x2 … ∀xk Φ(x1…xk)] ⇒ [∀y1 ∀y2 … ∀ym Ψ(y1…ym)] • Prove unsatisfiability of formula ∀x1 ∀x2 … ∀xk Φ(x1…xk) ∧ ¬Ψ(y1…ym) • Undecidable Problem • In logic with uninterpreted functions and equality

  28. Invariant Checking: Out-of-Order Processor Designs • Generating invariants requires considerable human effort • Impractical for realistic designs

  29. Constructing Invariants from Predicates • Predicates • rob.head ≤ reg.tag(r) • reg.valid(r) • reg.tag(r) = t • rob.dest(t) = r • Invariant • ∀r,t. reg.valid(r) ∧ reg.tag(r) = t ⇒ (rob.head ≤ reg.tag(r) < rob.tail ∧ rob.dest(t) = r) • Result: Correctness

  30. Automatic Predicate Abstraction • Graf & Saïdi, CAV ’97 • Idea • Given set of predicates P1(s), …, Pk(s) • Boolean formulas describing properties of system state • View as abstraction mapping: States → {0,1}^k • Defines abstract FSM over state set {0,1}^k • Form of abstract interpretation • Do reachability analysis similar to symbolic model checking • Early Implementations Inefficient • Guess at possible next abstract states • Test with call to decision procedure

  31. P.A. as Invariant Generator • Reach Fixed-Point on Abstract System • Termination guaranteed, since finite state • Equivalent to Computing Invariant for Concrete System • Strongest possible invariant that can be expressed by formula over these predicates • [Figure: the concrete system's reset states are abstracted (A) to the abstract system; abstract reachability R1, R2, …, Rn reaches a fixed point, which is concretized (C) into invariant I for the concrete system]
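A fixed-point computation over abstract states can be sketched on a toy system as follows. This is a rough illustration only: it finds abstract successors by sampling concrete states from a finite window, whereas the approach described in these slides uses a decision procedure. The names `abstract_reach`, `preds`, and `window` are hypothetical.

```python
def abstract_reach(preds, init_states, step, window):
    """Compute the set of reachable abstract states (tuples of
    predicate values) by iterating to a fixed point."""
    alpha = lambda s: tuple(p(s) for p in preds)   # abstraction mapping
    reached = {alpha(s) for s in init_states}
    frontier = set(reached)
    while frontier:
        new = set()
        for s in window:                 # sampled concrete states
            if alpha(s) in frontier:
                b = alpha(step(s))       # abstract successor
                if b not in reached:
                    new.add(b)
        reached |= new
        frontier = new
    return reached

# Toy concrete system: x starts at 0 and increases by 2 each step.
preds = [lambda x: x >= 0, lambda x: x % 2 == 0]
reach = abstract_reach(preds, [0], lambda x: x + 2, range(-10, 11))
print(reach)
```

The loop terminates because there are at most 2^k abstract states for k predicates. The single abstract state it reaches here reads back as the invariant "x ≥ 0 and x is even".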

  32. Symbolic Formulation of Predicate Abstraction • Lahiri, Bryant, Cook, CAV ’03 • Basic Operation • Compute set of legal abstract next states ψ′(B′) given current abstract states ψ(B) • B, B′: Abstract current and next-state state variables • ψ, ψ′: Boolean formulas • Create formula of form φ(S, B′) • Possible combinations of current concrete state S and next abstract state B′ • Formulate as Quantifier Elimination Problem • Generate formula of form ψ′(B′) ⇔ ∃S φ(S, B′) • S: Integer variables • For an interpretation of B′, formula ψ′ is true iff φ(S, B′) is satisfiable

  33. Outline • Context • Infinite state models of hardware systems • Verification techniques • Needs • Requirements for decision procedures • Dealing with quantifiers • Our Solution • SAT-based procedure • “Eager” Boolean encoding

  34. Decision Procedure Needs • Bounded Model Checking • Satisfiability of quantifier-free CLU formula • Handled by decision procedure • Invariant Checking • Satisfiability of quantified CLU formula • Undecidable • Predicate Abstraction • Eliminate quantifiers from CLU formula • Role of Decision Procedure • Apply in sound, but incomplete way

  35. UCLID Decision Procedure Operation • Series of transformations leading to propositional formula • Except for lambda expansion, each has polynomial complexity • Pipeline: CLU Formula → Lambda Expansion → λ-free Formula → Function & Predicate Elimination → Term Formula → Finite Instantiation → Boolean Formula → Boolean Satisfiability
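The function elimination step can be illustrated with one well-known technique: replace each application of f by a fresh variable, wrapped in nested ITEs that enforce functional consistency. Whether this matches the exact method behind this slide is an assumption, and the tuple representation and the names `eliminate_function` and `vf1`, `vf2`, … are hypothetical.

```python
def eliminate_function(arg_lists):
    """Given the argument tuples of successive applications of one
    function f, return one replacement term per application.
    The k-th term reuses an earlier fresh variable whenever its
    arguments equal those of an earlier application."""
    terms = []
    for k, args in enumerate(arg_lists):
        term = ("var", f"vf{k + 1}")            # fresh variable for f(args)
        for j in range(k - 1, -1, -1):          # earlier applications
            cond = ("and",) + tuple(("eq", a, b)
                                    for a, b in zip(args, arg_lists[j]))
            term = ("ite", cond, ("var", f"vf{j + 1}"), term)
        terms.append(term)
    return terms

# Applications f(a), f(b), f(c) in order of appearance.
for t in eliminate_function([("a",), ("b",), ("c",)]):
    print(t)
```

For three applications the last term reads ITE(c = a, vf1, ITE(c = b, vf2, vf3)), so equal arguments are forced to give equal results without any remaining reference to f.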

  36. SAT-based Decision Procedures • Eager Encoding: Input Formula → Satisfiability-preserving Boolean Encoder → Boolean Formula → SAT Solver → satisfiable / unsatisfiable • Lazy Encoding: Input Formula → Approximate Boolean Encoder → Boolean Formula → SAT Solver → unsatisfiable, or a satisfying assignment → First-order Conjunctions SAT Checker → satisfiable, or unsatisfiable with an additional clause added to the Boolean formula

  37. Eager Encoding Characteristics • Flow: Input Formula → Satisfiability-preserving Boolean Encoder → Boolean Formula → SAT Solver → satisfiable / unsatisfiable • Must encode all information about domain properties into Boolean formula • Some properties can give exponential blowup • Lets SAT solver do all of the work • Good Approach for Some Domains • Modern SAT solvers have remarkable capacity • Good at extracting relevant portions out of very large formulas • Learns about formula properties as search proceeds

  38. Encoding Methods • Difference Logic Formula → Small Domain Encoding (SD) or Per-Constraint Encoding (PC) → Boolean Formula → SAT Solver → satisfiable / unsatisfiable

  39. Small Domain Encoding (SD) [Bryant, Lahiri, Seshia, CAV’02] • x ≥ y ∧ y ≥ z ∧ z ≥ x+1 • Encoded as 0x1x0 ≥ 0y1y0 ∧ 0y1y0 ≥ 0z1z0 ∧ 0z1z0 ≥ 0x1x0 + 1 • Observation: To check satisfiability, need to consider all possible relative orderings of finitely many expressions • Can use Boolean encoding of finite range of values • 4 values in this case, so 2-bit encoding • [Figure: x, x+1, y, z placed along an axis of increasing values]
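The small-domain argument can be sketched by brute force: if a difference logic formula is satisfiable at all, it is satisfiable with every variable drawn from a small finite range, so checking that range suffices. The sketch assumes the slide's example reads x ≥ y ∧ y ≥ z ∧ z ≥ x+1 (the relation symbols are a reconstruction) and enumerates the 4-value range directly rather than building a 2-bit Boolean encoding for a SAT solver. The name `sd_satisfiable` is hypothetical.

```python
from itertools import product

def sd_satisfiable(constraints, variables, domain_size):
    """Try every assignment of values 0..domain_size-1 to the variables.
    Returns a satisfying assignment, or None if there is none."""
    for values in product(range(domain_size), repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(c(env) for c in constraints):
            return env
    return None

example = [
    lambda e: e["x"] >= e["y"],
    lambda e: e["y"] >= e["z"],
    lambda e: e["z"] >= e["x"] + 1,
]
print(sd_satisfiable(example, ["x", "y", "z"], 4))   # cyclic constraints

variant = example[:2] + [lambda e: e["x"] >= e["z"] + 1]
print(sd_satisfiable(variant, ["x", "y", "z"], 4))   # consistent ordering
```

In an eager encoding each variable would instead become two Boolean variables (x1, x0) and the comparisons would become bit-level circuits handed to the SAT solver.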

  40. Per-Constraint Encoding (PC) [Strichman, Seshia, Bryant, CAV’02] • x ≥ y ∧ y ≥ z ∧ z ≥ x+1 • Boolean variables: e1 for x ≥ y, e2 for y ≥ z, e3 for z ≥ x+1 • New Difference Predicate: e4 for x ≥ z • Transitivity Constraints: e1 ∧ e2 ⇒ e4, e4 ∧ e3 ⇒ false • Overall Boolean Encoding: e1 ∧ e2 ∧ e3 ∧ (transitivity constraints)
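The per-constraint idea can be sketched by giving each difference predicate its own Boolean variable and adding transitivity constraints. The sketch assumes the example reads x ≥ y (e1), y ≥ z (e2), z ≥ x+1 (e3), with derived predicate x ≥ z (e4). These relation symbols are a reconstruction, and exhaustive enumeration stands in for a SAT solver.

```python
from itertools import product

def pc_models(with_transitivity):
    """Return all assignments (e1, e2, e3, e4) satisfying the encoding."""
    models = []
    for e1, e2, e3, e4 in product([False, True], repeat=4):
        formula = e1 and e2 and e3            # the original conjunction
        trans = ((not (e1 and e2)) or e4) \
            and not (e4 and e3)               # e1&e2 => e4, e4&e3 => false
        if formula and (trans or not with_transitivity):
            models.append((e1, e2, e3, e4))
    return models

print(pc_models(with_transitivity=False))  # spurious Boolean models remain
print(pc_models(with_transitivity=True))   # transitivity rules them out
```

Without the transitivity constraints the Boolean skeleton is satisfiable even though the arithmetic formula is not, which is why the encoding must add them, and why their number can blow up.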

  41. Size of Boolean Encoding: SD better than PC • Let N be size of original difference logic formula • Size of a directed acyclic graph representation • SD encoding size is worst-case O(N²) • PC encoding size is worst-case O(2^N) • Can generate O(2^N) transitivity constraints • Example, N = 6813: PC encoding size > 1,000,000; SD encoding size 54,465

  42. Impact on SAT problem: SD vs PC • Experimentally compared zChaff performance on SD and PC encodings of several unsatisfiable formulas • Sample result: PC better than SD for zChaff

  43. How to Choose Encoding • Hybrid Strategy • Partition variables into classes • Which ones are compared to each other • For each class, choose encoding method • PC except SD when PC blows up • How to Determine Whether PC Will Work • Try to predict based on formula characteristics • Number of constraints, density, … • Selection procedure trained by machine learning

  44. Some Lessons We’ve Learned About Decision Procedures • Preserve Boolean Structure • Other approaches require collapsing to conjunctions of predicates (or extracting them dynamically) • Exploit Problem Characteristics • Sparseness • Polarity structure • Let SAT Solver Do the Work • Eager encoding: provide sufficient set of constraints to prove / disprove formula • They are good at digesting large volume of information

  45. Invariant Checking Revisited • Prove Unsatisfiability of Formula ∀x1 ∀x2 … ∀xk Φ(x1…xk) ∧ ¬Ψ(y1…ym) • General Form: ∀X Φ(X) ∧ ¬Ψ(Y) • Quantifier Instantiation • Generate expressions E1(Y), …, En(Y) • Using terms that appear in Q • Expand as Φ(E1(Y)) ∧ … ∧ Φ(En(Y)) ∧ ¬Ψ(Y) • If unsatisfiable, then so is quantified formula • Sound, but incomplete • Trade-off • Be clever about instantiation, or • Instantiate many terms and rely on decision procedure capacity

  46. Predicate Abstraction Revisited • Formulate as Quantifier Elimination Problem • Generate formula of form ψ′(B′) ⇔ ∃S φ(S, B′) • S: Integer variables • Use Eager SAT Encoding of φ • Get formula of form ∃A P(A, B′) • A: Boolean variables • Satisfying solutions for P w.r.t. B′ same as those for φ • Core problem of symbolic model checking

  47. Quantifier Elimination for P.A. • Formula of form ∃A P(A, B′) • A: Boolean variables • Typically: 200+ variables for A, ~20 for B′ • BDD-Based • Use partitioning techniques developed for symbolic model checking • Typically too many total Boolean variables • SAT Enumeration • Find satisfying solution α(A), β(B′) to P • Enumerate solution β(B′) • Reformulate P as P ∧ ¬β(B′) • Performance: about 1000 solutions / second
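The SAT enumeration loop (find a solution, record its projection onto the B variables, block that projection, repeat) can be sketched as follows. A brute-force search stands in for the SAT solver, and the names `find_solution`, `enumerate_projection`, and the example formula `p` are hypothetical.

```python
from itertools import product

def find_solution(p, n_a, n_b, blocked):
    """Stand-in for a SAT call on P plus blocking clauses:
    return some (a, b) with p(a, b) true and b not blocked, else None."""
    for b in product([False, True], repeat=n_b):
        if b in blocked:
            continue
        for a in product([False, True], repeat=n_a):
            if p(a, b):
                return a, b
    return None

def enumerate_projection(p, n_a, n_b):
    """Compute the set of b for which some a satisfies p(a, b)."""
    blocked = set()
    while True:
        sol = find_solution(p, n_a, n_b, blocked)
        if sol is None:
            return blocked           # no more solutions: projection complete
        blocked.add(sol[1])          # enumerate b, then block it

# Example: b0 tracks a0, b1 tracks (a0 and a1), and a0 or a1 must hold.
p = lambda a, b: (a[0] or a[1]) and b[0] == a[0] and b[1] == (a[0] and a[1])
print(sorted(enumerate_projection(p, 2, 2)))
```

Each iteration removes one assignment to B from consideration, so the number of solver calls equals the number of abstract states found plus one. This makes the method sensitive to the number of predicates.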

  48. Why Verification Tasks Feasible • CLU Logic Fairly Simple • Equality, uninterpreted functions, difference constraints • Small model property • “Deep” Reasoning Not Required • Formulas large and messy, but straightforward • Verifying systems that are designed to have constrained behaviors • Only checking effect of a few cycles of system operation

  49. Decision Procedures Revisited • SAT-Based Approaches Effective • Good performance as decision procedures • Key to implementing predicate abstraction • Quantifier elimination • Eager Encoding Gives Good Performance • Avoids many iterations of theory-specific checkers • Extends to linear integer arithmetic • Seshia & Bryant, LICS ’04 • Quantifier-free Presburger • Small domain encoding exploiting sparseness

  50. Areas of Research • Bit-Vector Decision Procedures • True model for hardware & low-level software • Bit-field extraction • Bit-wise Boolean operations • Overflow effects • Automatically apply abstractions • Abstract to symbolic terms whenever possible • Boolean Quantifier Elimination • SAT enumeration still not good enough • Limits predicate abstraction to ~25 predicates • Core problem for symbolic model checking
