SAT-Based Decision Procedures for Linear Arithmetic and Uninterpreted Functions

Randal E. Bryant
Carnegie Mellon University
http://www.cs.cmu.edu/~bryant

Decision Procedures in Formal Verification
[Figure: RTL / source code + specification → abstraction → formal model + specification → decision procedure for decidable fragment of first-order logic → OK or verification error]
• Applications: out-of-order, pipelined microprocessors; cache coherence protocols; device drivers; compiler validation; …

SAT-based Decision Procedures
[Figure, eager encoding: input formula → satisfiability-preserving Boolean encoder → Boolean formula → SAT solver → satisfiable / unsatisfiable]
[Figure, lazy encoding: input formula → approximate Boolean encoder → Boolean formula → SAT solver → unsatisfiable, or a satisfying assignment that goes to the first-order conjunctions SAT checker, which reports satisfiable or returns an additional clause to the Boolean formula]

Lazy Encoding Characteristics
[Figure: first-order conjunctions SAT checker built from a theory combiner over uninterpreted functions, linear arithmetic, bit vectors, …, theory N]
• Can be extended to handle wide variety of theories
• Clean & modular design
• Does not scale well
  • Number of calls to conjunction checker typically exponential in formula size
  • Each call independent: nothing learned in one call can be exploited by another
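A minimal sketch of the lazy loop, restricted to pure equality logic. The names `consistent` and `lazy_solve` are hypothetical, a brute-force search stands in for a real SAT solver, and a union-find check stands in for the first-order conjunctions checker. Blocking one complete assignment is the weakest possible "additional clause"; it is used here only to keep the sketch short.

```python
from itertools import product

def consistent(equal_pairs, distinct_pairs):
    """Conjunction checker for equality logic: merge the asserted equalities
    with union-find, then check that no disequality joins one class."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in equal_pairs:
        parent[find(a)] = find(b)
    return all(find(a) != find(b) for a, b in distinct_pairs)

def lazy_solve(atoms, skeleton):
    """atoms: list of (x, y) pairs, each standing for the equation x = y.
    skeleton: Boolean structure of the formula, as a function of one truth
    value per atom. Returns (satisfiable, number of checker calls)."""
    blocked = set()  # assignments already refuted by the conjunction checker
    calls = 0
    while True:
        # Stand-in for the SAT solver: first assignment that satisfies the
        # Boolean skeleton and is not blocked.
        model = next((v for v in product([False, True], repeat=len(atoms))
                      if skeleton(v) and v not in blocked), None)
        if model is None:
            return False, calls
        calls += 1
        eqs = [atoms[i] for i, val in enumerate(model) if val]
        neqs = [atoms[i] for i, val in enumerate(model) if not val]
        if consistent(eqs, neqs):
            return True, calls
        blocked.add(model)  # the "additional clause"

atoms = [("x", "y"), ("y", "z"), ("x", "z")]
unsat_result = lazy_solve(atoms, lambda v: v[0] and v[1] and not v[2])
sat_result = lazy_solve(atoms, lambda v: (v[0] or v[1]) and not v[2])
```

Because each refuted assignment is blocked one at a time, the number of checker calls can grow with the number of Boolean assignments, which is the scaling problem the slide points out.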

[Figure: input formula → satisfiability-preserving Boolean encoder → Boolean formula → SAT solver → satisfiable / unsatisfiable]
Eager Encoding Characteristics
• Must encode all information about domain properties into Boolean formula
• Some properties can give exponential blowup
• Lets SAT solver do all of the work
Good Approach for Some Domains
• Modern SAT solvers have remarkable capacity
• Good at extracting relevant portions out of very large formulas
• Learn about formula properties as search proceeds
• Focus of this talk

Data and Function Abstraction
• Common Operations
  • If-then-else: ITE(p, x, y) yields x when p is true (1) and y when p is false (0)
  • Test for equality: x = y
• Bit-vectors to (unbounded) Integers
  • Bits x0, x1, x2, …, xn-1 ⇒ integer x
• Functional units to Uninterpreted Functions
  • ALU ⇒ f
  • a = x ∧ b = y ⇒ f(a,b) = f(x,y)

Abstract Modeling of Microprocessor
[Figure: three-stage pipeline (IF/ID, ID/EX, EX/WB) with PC, instruction memory, register file, ALU, and control logic; data-transforming blocks are replaced by functions F1, F2, F3]
• For any Block that Transforms or Evaluates Data:
  • Replace with generic, unspecified function
  • Also view instruction memory as function

EUF: Equality with Uninterpreted Functions
• Decidable fragment of first-order logic
• Formulas (F): Boolean Expressions
  • ¬F, F1 ∧ F2, F1 ∨ F2: Boolean connectives
  • T1 = T2: Equation
  • P(T1, …, Tk): Predicate application
• Terms (T): Integer Expressions
  • ITE(F, T1, T2): If-then-else
  • Fun(T1, …, Tk): Function application
• Functions (Fun): Integer → Integer
  • f: Uninterpreted function symbol
  • Read, Write: Memory operations
• Predicates (P): Integer → Boolean
  • p: Uninterpreted predicate symbol

EUF Decision Problem
[Figure: circuit representation of an example EUF formula over x0 and d0]
• Circuit Representation of Formula
• Truth Values
  • Dashed lines
  • Model control
  • Logical connectives
  • Equations
• Integer Values
  • Solid lines
  • Model data
  • Uninterpreted functions
  • If-Then-Else operation
• Task
  • Determine whether formula F is universally valid
    • True for all interpretations of variables and function symbols
  • Often expressed as (un)satisfiability problem
    • Prove that formula ¬F is not satisfiable

Finite Model Property for EUF
[Figure: the example circuit with its distinct terms labeled x0, d0, f(x0), f(d0)]
• Observation
  • Any formula has limited number of distinct expressions
  • Only property that matters is whether or not different terms are equal

Boolean Encoding of Integer Values
• For Each Expression
  • Either equal to or distinct from each preceding expression
• Boolean Encoding
  • Use Boolean values to encode integers over small range
  • EUF formula can be translated into propositional logic
    • Logic circuit with multiplexors, comparators, logic gates
    • Tautology iff original formula valid
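To illustrate why a small range of integers is enough, the sketch below checks validity of two tiny EUF formulas by plain enumeration instead of a Boolean encoding. The helper `valid` and the two example formulas are hypothetical. The domain size is set to 4 because each formula has four distinct terms (x, y, f(x), f(y)), so no interpretation needs more than four distinct values.

```python
from itertools import product

def valid(formula, num_vars, domain_size):
    """Return True if formula holds for every interpretation of its integer
    variables and of the unary function symbol f over range(domain_size)."""
    dom = range(domain_size)
    for table in product(dom, repeat=domain_size):  # every f: dom -> dom
        f = lambda v, table=table: table[v]
        for args in product(dom, repeat=num_vars):
            if not formula(f, *args):
                return False  # found a falsifying interpretation
    return True

# x = y  =>  f(x) = f(y): functional consistency, valid.
congruence = lambda f, x, y: (x != y) or (f(x) == f(y))
# f(x) = f(y)  =>  x = y: would require f to be injective, not valid.
injective = lambda f, x, y: (f(x) != f(y)) or (x == y)

congruence_is_valid = valid(congruence, num_vars=2, domain_size=4)
injective_is_valid = valid(injective, num_vars=2, domain_size=4)
```

An eager encoder would represent each value with a few Boolean variables and hand the resulting circuit to a SAT solver, rather than enumerating as done here.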

Some History of EUF Decision Procedures
• Ackermann, 1954
  • Quantifier-free decision problem can be decided based on finite instantiations
• Burch & Dill, CAV ’94
  • Automatic decision procedure
    • Davis-Putnam enumeration
    • Congruence closure to enforce functional consistency
• Boolean approaches
  • Goel, et al., CAV ’98
    • Attempted with BDDs, but didn’t get good results
  • Bryant, German, Velev, CAV ’99
    • Could verify microprocessor using BDDs
  • Velev & Bryant, DAC 2001
    • Demonstrated power of modern SAT procedures

Exploiting Positive Equality
• Bryant, German, Velev, CAV ’99
  • First successful use of Boolean methods for EUF
• Positive Equality
  • Equations that appear in unnegated form
• Exploiting
  • Can greatly reduce number of cases required to show validity
  • Only need to consider maximally diverse interpretations
  • Reduce number of Boolean variables in bit-level encoding

Diverse Interpretations: Illustration
• Task
  • Verify someone’s obscure code for 4×4 array transpose

    void trans(int a[4][4]) {
      int t;
      for (t = 4; t < 15; t++)
        if (~t&2 || t&8 && ~t&1) {
          int r = t&0x3;
          int c = t>>2;
          /* Only operations on array elements */
          int val = a[r][c];
          a[r][c] = a[c][r];
          a[c][r] = val;
        }
    }

• Observation
  • Array elements altered only by copying one to another
  • Just need to make sure right set of copies performed

Verifying Array Code
[Figure: array a passed through trans4 to produce a’]
• Test for trans4
• Single Test Adequate
  • Unique value for each possible source element
  • “Maximally Diverse”
  • If a’[r][c] = a[c][r], then must have copied proper value

Characteristics of Array Verification
• Correctness Condition
  a’[0][0] = a[0][0] ∧ a’[0][1] = a[1][0] ∧ a’[0][2] = a[2][0] ∧ … ∧ a’[3][2] = a[2][3] ∧ a’[3][3] = a[3][3]
• Properties
  • All equations are in positive form
  • Worst case test is one that tends to make things unequal
    • Maximally diverse interpretation: use as many different values as possible
  • All maximally diverse interpretations isomorphic
    • Only need to try one to prove all handled correctly
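The single-test argument can be tried directly. The sketch below is a Python port of the C routine `trans` (the port and the names `original` and `ok` are assumptions made for illustration). It fills the array with 16 distinct values, which is one maximally diverse interpretation, and checks the correctness condition a’[r][c] = a[c][r] for every element.

```python
def trans(a):
    """Python port of the obscure C transpose; modifies a in place."""
    for t in range(4, 15):
        # Same grouping as C: (~t & 2) || ((t & 8) && (~t & 1))
        if ~t & 2 or (t & 8 and ~t & 1):
            r, c = t & 0x3, t >> 2
            a[r][c], a[c][r] = a[c][r], a[r][c]  # only copies, no arithmetic

# Maximally diverse test: every source element gets a unique value.
a = [[4 * r + c for c in range(4)] for r in range(4)]
original = [row[:] for row in a]
trans(a)

# Correctness condition: conjunction of 16 positive equations.
ok = all(a[r][c] == original[c][r] for r in range(4) for c in range(4))
```

Since the routine only copies elements, a wrong copy would be exposed by the distinct values, so this one test covers every possible input array.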

Equations in Processor Verification
[Figure: three-stage pipeline (IF/ID, ID/EX, EX/WB) with PC, instruction memory, register file, ALU, and control logic]

  Data Types            Equations
  Register Ids          Control stalling & forwarding
  Instruction Address   Only top-level verification condition
  Program Data          Only top-level verification condition

Exploiting Equation Structure
• Positive Equations
  • In top-level verification condition
  • Can use maximally diverse interpretation
• Negative Equations
  • Pipeline control logic
    • Between register IDs
    • Operation depends on whether or not two IDs are equal
  • Must use general encoding
    • Encode with Boolean variables
    • All possibilities of IDs that match and/or don’t match

Application of Positive Equality
[Figure: the example circuit evaluated under a single diverse interpretation, with distinct values (5 through 8 in the figure) for the terms x0, d0, f(x0), f(d0)]
• Observation
  • All equations are positive in this formula
  • Can consider single, diverse interpretation for terms

Function Elimination: Ackermann’s Method
[Figure: applications f(x1) and f(x2) replaced by variables vf1 and vf2, with the constraint x1 = x2 ⇒ vf1 = vf2 added to formula F]
• Replace All Function Applications by Integer Variables
  • Introduce new domain variable
  • Enforce functional consistency by global constraints
  • Unclear how to restrict evaluation to diverse interpretations
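A small sketch of the constraint generation in Ackermann's method for one unary function symbol. The function name `ackermann` and the string output format are hypothetical; the fresh variable names follow the vf1, vf2 labels in the figure.

```python
from itertools import combinations

def ackermann(args):
    """args: the argument of each application of f, in order of appearance.
    Returns the fresh variables replacing the applications and the global
    functional-consistency constraints, one per pair of applications."""
    fresh = [f"vf{i + 1}" for i in range(len(args))]
    constraints = [
        f"({args[i]} = {args[j]}) => ({fresh[i]} = {fresh[j]})"
        for i, j in combinations(range(len(args)), 2)
    ]
    return fresh, constraints

fresh_vars, consistency = ackermann(["x1", "x2", "x3"])
# consistency[0] is "(x1 = x2) => (vf1 = vf2)"
```

The number of constraints grows quadratically with the number of applications, and every equation between fresh variables now appears on the right of an implication, which is one way to see why restricting to diverse interpretations is unclear here.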

Function Elimination: ITE Method
[Figure: f(x1) replaced by vf1; f(x2) by ITE(x2 = x1, vf1, vf2); f(x3) by ITE(x3 = x1, vf1, ITE(x3 = x2, vf2, vf3))]
• General Technique
  • Introduce new domain variable
  • Nested ITE structure maintains functional consistency
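The nested ITE structure can be generated mechanically. The sketch below does so for a unary function symbol; the function name `ite_eliminate` and the string representation are hypothetical.

```python
def ite_eliminate(args):
    """args: the argument of each application of f, in order of appearance.
    Returns one replacement expression per application. Application i reuses
    the variable of the earliest previous application with an equal argument,
    and otherwise takes its own fresh variable."""
    results = []
    for i, arg in enumerate(args):
        expr = f"vf{i + 1}"  # fresh variable, used when no earlier match
        # Wrap from the most recent earlier application outward, so the
        # comparison with the first application ends up outermost.
        for j in range(i - 1, -1, -1):
            expr = f"ITE({arg} = {args[j]}, vf{j + 1}, {expr})"
        results.append(expr)
    return results

replacements = ite_eliminate(["x1", "x2", "x3"])
```

For three applications this reproduces the structure in the figure, with no separate global constraints needed.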

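Generating Diverse Encoding
[Figure: the ITE structure with fixed values in place of variables: f(x1) replaced by 5; f(x2) by ITE(x2 = x1, 5, 6); f(x3) by ITE(x3 = x1, 5, ITE(x3 = x2, 6, 7))]
• Replacing Application
  • Use fixed values rather than variables
  • Application results equal iff arguments equal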
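A sketch of what the fixed-value ITE chain computes for concrete argument values. The function name `diverse_results` and the starting constant 5 (taken from the figure) are illustrative assumptions.

```python
def diverse_results(arg_values, base=5):
    """Evaluate the ITE chain with constants base, base+1, ... in place of
    fresh variables. Application i returns the constant of the earliest
    application whose argument equals its own."""
    out = []
    for value in arg_values:
        first = arg_values.index(value)  # earliest application with this argument
        out.append(base + first)
    return out

distinct_args = diverse_results([10, 20, 30])  # all arguments differ
repeated_arg = diverse_results([10, 20, 10])   # third argument equals the first
```

Two results are equal exactly when the corresponding arguments are equal, so the interpretation of f is as diverse as functional consistency allows.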

Benefits of Positive Equality
Velev & Bryant, JSC ’02
• Microprocessor Benchmarks
  • 1xDLX: Single issue, RISC processor
  • 2xDLX-EX-BP: Dual issue processor with exception handling & branch prediction
  • 9VLIW-BP: 9-way VLIW processor with branch prediction
• Measurements
  • Using BerkMin SAT solver

Revisiting Encoding Techniques
• Example: x = y ∧ y = z ∧ z ≠ x. Satisfiable?
• Small Domain (SD)
  • Use bit-level encodings of bounded integers
  • Implicitly encode properties of equality logic
  • Example: x1x0 = y1y0 ∧ y1y0 = z1z0 ∧ z1z0 ≠ x1x0
• Per-Constraint Encoding (EIJ)
  • Introduce explicit Boolean variable for each equation
  • Additional transitivity constraints to express properties of equality logic
  • Example: exy ∧ eyz ∧ ¬exz
  • Transitivity constraints: exy ∧ eyz ⇒ exz, exy ∧ exz ⇒ eyz, eyz ∧ exz ⇒ exy

Per-Constraint Encoding
• Introduced by Goel et al., CAV ’98
• Exploiting sparse structure by Bryant & Velev, CAV 2000
• Procedure
  • Initial formula F
    • Want to prove valid
    • Prove that ¬F is not satisfiable
  • Replace each equation x = y by Boolean variable exy
    • Gives formula Fsat
  • Generate formula expressing transitivity constraints
    • Gives formula Ftrans
  • Use SAT solver to show that Fsat ∧ Ftrans not satisfiable
• Motivation
  • Provides SAT solver with more direct representation of underlying problem
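A worked instance of the procedure for the formula x = y ∧ y = z ∧ z ≠ x. The function names `f_sat` and `f_trans` are hypothetical, and exhaustive enumeration over the three relational variables stands in for the SAT solver.

```python
from itertools import product

def f_sat(exy, eyz, exz):
    """Formula with each equation replaced by its Boolean variable."""
    return exy and eyz and not exz

def f_trans(exy, eyz, exz):
    """Transitivity constraints for the triangle x, y, z."""
    return ((not (exy and eyz) or exz) and
            (not (exy and exz) or eyz) and
            (not (eyz and exz) or exy))

assignments = list(product([False, True], repeat=3))

# Without transitivity the Boolean abstraction is satisfiable ...
sat_without_trans = any(f_sat(*a) for a in assignments)
# ... but conjoined with the constraints it is not.
sat_with_trans = any(f_sat(*a) and f_trans(*a) for a in assignments)
```

The only assignment satisfying `f_sat` sets exy and eyz true and exz false, which violates the first transitivity constraint.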

Graph Interpretation of Transitivity
[Figure: cycle of equation edges in which all edges but one are equalities]
• Transitivity Violation
  • Cycle in graph
  • Exactly one edge has ei,j = false

Exploiting Chords
[Figure: cycle with a chord]
• Chord
  • Edge connecting two non-adjacent vertices in cycle
• Property
  • Sufficient to enforce transitivity constraints for all chord-free cycles
  • If transitivity holds for all chord-free cycles, then holds for arbitrary cycles

Enumerating Chord-Free Cycles
• Strategy
  • Enumerate chord-free cycles in graph
  • Each cycle of length k yields k transitivity constraints
• Problem
  • Potentially exponential number of chord-free cycles
[Figure: graph built from k repeated sections, labeled 1, 2, …, k, with 2^k + k chord-free cycles]

Adding Chords
• Strategy
  • Add edges to graph to reduce number of chord-free cycles
[Figure: the k-section graph before adding chords (2^k + k chord-free cycles) and after (2k + 1 chord-free cycles)]
• Trade-Off
  • Reduces formula size
  • Increases number of relational variables

Chordal Graph
• Definition
  • Every cycle of length > 3 has a chord
• Goal
  • Add minimum number of edges to make graph chordal
• Relation to Sparse Gaussian Elimination
  • Choose pivot ordering that minimizes fill-in
  • NP-hard
  • Simple heuristics effective
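One simple heuristic of the kind alluded to is minimum-degree elimination, sketched below. This is an assumption about which heuristic to show, not necessarily the one used in the talk, and the name `make_chordal` is hypothetical. Eliminating a vertex connects all of its remaining neighbors; the added edges are the fill-in, and every triangle seen along the way needs three transitivity clauses.

```python
from itertools import combinations

def make_chordal(vertices, edges):
    """Greedy minimum-degree elimination. Returns (fill_edges, triangles):
    the edges added to make the graph chordal, and all triangles of the
    resulting chordal graph."""
    work = {v: set() for v in vertices}
    for a, b in edges:
        work[a].add(b)
        work[b].add(a)
    fill, triangles = set(), set()
    while work:
        # Pick the remaining vertex of minimum degree (ties broken by name).
        v = min(work, key=lambda u: (len(work[u]), u))
        nbrs = work.pop(v)
        for a, b in combinations(sorted(nbrs), 2):
            if b not in work[a]:  # neighbors not yet adjacent: add a chord
                work[a].add(b)
                work[b].add(a)
                fill.add((a, b))
            triangles.add(tuple(sorted((v, a, b))))
        for n in nbrs:
            work[n].discard(v)
    return fill, triangles

# A 4-cycle has no chord; one added edge splits it into two triangles.
fill_edges, tris = make_chordal([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)])
num_clauses = 3 * len(tris)
```

Each fill edge becomes a new relational variable, which is the trade-off between fewer cycles and more variables.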

1xDLX-C Equation Structure
• Vertices
  • For each vi
  • 13 different register identifiers
• Edges
  • For each equation
  • Control stalling and forwarding logic
  • 27 relational variables
    • Out of 78 possible

Adding Chordal Edges to 1xDLX-C

              Relational variables   Cycles   Clauses
  Original    27                     286      858
  Augmented   33                     40       120

2xDLX-CCt Equation Structure
• Equations
  • Between 25 different register identifiers
  • 143 relational variables
    • Out of 300 possible

Adding Chordal Edges to 2xDLX-CCt

              Relational variables   Cycles   Clauses
  Original    143                    2,136    8,364
  Augmented   193                    858      2,574

Choosing Encoding Method
• Comparison
  • Formula length n with m integer variables & function applications
  • Worst-case complexity
• Per-Constraint Encoding Works Well in Practice
  • Generates slightly larger formulas than small domain
  • Better performance by SAT solver

Encoding Comparison
Velev & Bryant, JSC ’02
• Benchmarks
  • Superscalar, out-of-order datapath
  • 2–6 instructions issued in parallel
• Measurements
  • Using BerkMin SAT solver

Extensions
• Difference logic
  • Predicates of form x ≤ y + C
  • Original logic of UCLID
  • Use integer variables to represent pointers into buffers
    • C = 1
• Linear constraints
  • Predicates of form a1x1 + a2x2 + … + anxn ≤ b
  • Used in applying UCLID to software verification and software security problems

Difference Logic
• Predicates of form x ≤ y + C
  • C generally a small integer
• Encoding Methods
  • Small domain
    • Range bound n · max |C|
  • Per constraint encoding
    • Variables of form e(x,y,C), one per predicate x ≤ y + C
    • Can have exponential blowup in number of variables
• Choosing Encoding Method
  • Per constraint better, as long as it doesn’t blow up
  • Predicting blowup
    • Successfully used classifier trained by machine learning (Seshia, Lahiri & Bryant, DAC ’03)
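A sketch of the small-domain idea for difference logic, under two simplifying assumptions: the input is a plain conjunction of predicates x ≤ y + C rather than an arbitrary Boolean combination, and the bounded range is searched by enumeration instead of being bit-encoded for a SAT solver. The name `small_domain_sat` is hypothetical; the range 0 … n · max |C| follows the bound on the slide.

```python
from itertools import product

def small_domain_sat(variables, constraints):
    """constraints: list of (x, y, c), each meaning x <= y + c, all conjoined.
    Searches only values in 0 .. n * max|c| and returns a satisfying
    assignment as a dict, or None if the bounded range contains none."""
    n = len(variables)
    bound = n * max((abs(c) for _, _, c in constraints), default=0)
    for values in product(range(bound + 1), repeat=n):
        env = dict(zip(variables, values))
        if all(env[x] <= env[y] + c for x, y, c in constraints):
            return env
    return None

# x < y < z is satisfiable; adding z < x closes a negative cycle.
chain = [("x", "y", -1), ("y", "z", -1)]
model = small_domain_sat(["x", "y", "z"], chain)
cycle_model = small_domain_sat(["x", "y", "z"], chain + [("z", "x", -1)])
```

For a conjunction, the bounded search is complete because a satisfiable set of difference constraints has a solution whose values span at most (n − 1) · max |C|.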

Linear Constraints
• Predicates of form a1x1 + a2x2 + … + anxn ≤ b
• Common Case
  • All but k predicates are difference predicates
    • ai = +1, aj = –1, rest = 0
  • Rest are sparse
    • At most w coefficients nonzero
    • Coefficient values small

Linear Constraints
• Small Domain Encoding (Seshia & Bryant, LICS ’04)
  • Find value D such that only need to consider solutions with 0 ≤ xi < D, for all i
  • Bounds on D: (n+2) · n · (bmax+1) · (w · amax)^k
  • Encode as SAT problem with log(D) bits / integer variable
  • Practical for real applications
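A short arithmetic sketch of what the bound (n+2) · n · (bmax+1) · (w · amax)^k means for encoding size. The function names are hypothetical, and the parameter readings are assumptions: n is taken as the number of integer variables, bmax as the largest constant magnitude, amax as the largest coefficient magnitude, with w and k as in the common case (at most w nonzero coefficients in each of the k non-difference predicates). The sample numbers are made up for illustration.

```python
def solution_bound(n, b_max, a_max, w, k):
    """Bound D from the slide: (n+2) * n * (b_max+1) * (w*a_max)**k."""
    return (n + 2) * n * (b_max + 1) * (w * a_max) ** k

def bits_per_variable(d):
    """Bits needed to represent every value in 0 .. d-1."""
    return max(1, (d - 1).bit_length())

# Hypothetical problem: 10 variables, 2 sparse non-difference predicates.
d = solution_bound(n=10, b_max=7, a_max=2, w=3, k=2)
bits = bits_per_variable(d)
```

Because the bit count is logarithmic in D, the bound grows by only a few bits per additional non-difference predicate when w and amax are small.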

Some Lessons We’ve Learned
• Preserve Boolean Structure
  • Other approaches require collapsing to conjunctions of predicates
• Exploit Problem Characteristics
  • Sparseness
    • Tighten bounds and/or reduce number of constraints
  • Polarity structure
    • Positive equality
• Let SAT Solver Do the Work
  • Eager encoding: provide sufficient set of constraints to prove / disprove formula
  • SAT solvers are good at digesting large volumes of information
