Efficient Learning Framework for Multi-Valued SAT

A General Nogood-Learning Framework for Pseudo-Boolean Multi-Valued SAT*
Siddhartha Jain, Brown University
Ashish Sabharwal, IBM Watson
Meinolf Sellmann, IBM Watson
* to appear at AAAI-2011

SAT and CSP/CP Solvers [complete search]
• CP solvers:
  • work at a high level: multi-valued variables, linear inequalities, element constraints, global constraints (knapsack, alldiff, …)
  • have access to problem structure and specialized inference algorithms!
• SAT:
  • problem crushed down to one fixed, simple input format: CNF (Boolean variables, clauses)
  • very simple inference at each node: unit propagation
  • yet, translating CSPs to SAT can be extremely powerful! E.g., SUGAR (winner of CSP Competitions 2008 & 2009)
How do SAT solvers even come close to competing with CP? A key contributor: efficient and powerful “nogood learning”

SAT Solvers as Search Engines
• Systematic SAT solvers have become really efficient at searching fast and learning from “mistakes”
• E.g., on an IBM model checking instance from SAT Race 2006, with ~170k variables and 725k clauses, solvers such as MiniSat and RSat roughly:
  • make 2000-5000 decisions/second
  • deduce 600-1000 conflicts/second
  • learn 600-1000 clauses/second (#clauses grows rapidly)
  • restart every 1-2 seconds (aggressive restarts)

An Interesting Line of Work: SAT-X Hybrids
• Goal: try to bring more structure to SAT and/or bring SAT-style techniques to CSP solvers
• Examples:
  • Pseudo-Boolean solvers: inequalities over binary variables
  • SAT Modulo Theories (SMT): “attach” a theory T to a SAT solver
  • Lazy clause generation: record clausal reasons for domain changes in a CP solver
  • Multi-valued SAT: incorporate multi-valued semantics into SAT [Jain-O’Mahony-Sellmann, CP-2010]
    • preserve the benefits of SAT solvers in a more general context
    • strengthen unit propagation: unit implication variables (UIV)
    • strengthen nogood-learning!
    • starting point for this work

Conflict Analysis: Example
• Consider a CSP with 5 variables: X1, X2, X4 ∈ {1, 2, 3}; X3 ∈ {1, …, 7}; X5 ∈ {1, 2}
• Pure SAT encoding: variables x11, x12, x13, x21, x22, x23, x31, …, x37, x51, x52
• What happens when we set X1 = 1, X2 = 2, X3 = 1?
[Figure: implication graph with nodes x11 = true, x22 = true, x31 = true, x51 = true and clause label C5]
No more propagation, no conflict… really?
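
The pure SAT encoding on this slide can be sketched as follows. This is a minimal illustration of the standard direct encoding (one Boolean per variable-value pair, plus at-least-one and at-most-one clauses). The function name `direct_encoding` and the integer numbering of the Booleans are assumptions made for illustration, not details taken from the talk.

```python
from itertools import combinations

def direct_encoding(domains):
    """Direct encoding of multi-valued variables into CNF.

    domains: dict mapping a variable name to a list of its values.
    Returns (index, clauses): index maps (var, value) to a positive int,
    clauses is a list of lists of signed ints (DIMACS-style literals).
    """
    index = {}
    for var, values in domains.items():
        for val in values:
            index[(var, val)] = len(index) + 1

    clauses = []
    for var, values in domains.items():
        lits = [index[(var, val)] for val in values]
        clauses.append(lits)                  # variable takes at least one value
        for a, b in combinations(lits, 2):
            clauses.append([-a, -b])          # variable takes at most one value
    return index, clauses

# The five CSP variables from the slide.
domains = {
    "X1": [1, 2, 3],
    "X2": [1, 2, 3],
    "X3": list(range(1, 8)),
    "X4": [1, 2, 3],
    "X5": [1, 2],
}
index, clauses = direct_encoding(domains)
print(len(index), "Boolean variables,", len(clauses), "clauses")
```

Plain unit propagation on such an encoding only fires when a single Boolean literal remains in a clause, which is why the slide sees no further propagation here.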

Conflict Analysis: Incorporating UIVs
• Unit implication variable (UIV): a multi-valued clause is “unit” as soon as all unassigned “literals” in it regard the same MV variable
• SAT encoding, stronger propagation using UIVs:
[Figure: implication graph with nodes x11 = true, x22 = true, x31 = true, x51 = true, x41 ≠ true, x32 ≠ true, x42 = true, x43 = true, and conflict; clause labels C1 to C5]
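
The UIV condition can be checked with a few lines of code. The sketch below assumes clauses are lists of (variable, value) literals meaning "variable = value" and that the solver state is a dict of remaining domains; both representations, the function name, and the example clause are hypothetical (the actual clauses C1 to C5 are not given in the transcript).

```python
def unit_implication_variable(clause, domains):
    """Return (var, values) if the clause is unit in the UIV sense.

    clause: list of (var, value) pairs meaning "var = value".
    domains: dict var -> set of values still possible for var.
    A literal is falsified if its value has left the domain, satisfied if
    the domain is exactly {value}, and unassigned otherwise.
    Returns None if the clause is satisfied, conflicting, or not unit.
    """
    open_literals = []
    for var, val in clause:
        dom = domains[var]
        if dom == {val}:
            return None                      # clause already satisfied
        if val in dom:
            open_literals.append((var, val))
    variables = {var for var, _ in open_literals}
    if len(variables) != 1:
        return None                          # conflict, or several variables open
    var = variables.pop()
    return var, {val for _, val in open_literals}

# Hypothetical clause (X1 = 2 || X4 = 2 || X4 = 3) after deciding X1 = 1.
clause = [("X1", 2), ("X4", 2), ("X4", 3)]
after_decision = {"X1": {1}, "X4": {1, 2, 3}}
before_decision = {"X1": {1, 2, 3}, "X4": {1, 2, 3}}
result_after = unit_implication_variable(clause, after_decision)
result_before = unit_implication_variable(clause, before_decision)
print(result_after, result_before)
```

In the pure Boolean view this clause still has two unassigned literals after X1 = 1, so ordinary unit propagation would stay silent; the UIV view restricts X4 to {2, 3}, i.e. x41 ≠ true.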

What Shall We Learn?
• Not a good idea to set x41 ≠ true and x31 = true, so learn the nogood (x41 = true || x31 = false)
• Problem? When we backtrack and set, say, X3 to anything in {2, 3, 4, 5}, we end up performing exactly the same analysis again and again!
• Solution: represent conflict graph nodes as variable inequations only
[Figure: implication graph with nodes x11 = true, x22 = true, x31 = true, x51 = true, x41 ≠ true, x32 ≠ true, x42 = true, x43 = true, and conflict; clause labels C1 to C5]

CMVSAT-1
• Use UIV rather than UIP
• Use variable inequations as the core representation: X1 = 1, X2 = 2, X3 = 1 represented as X1 ≠ 2, X1 ≠ 3, …
  • finer granularity reasoning! X3 doesn’t necessarily need to be 1 for a conflict, it just cannot be 6 or 7
• Stronger learned multi-valued clause: (X4 = 1 || X3 = 6 || X3 = 7)
• Upon backtracking, we immediately infer X3 ≠ 1, 2, 3, 4, 5!
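
The effect of the learned clause upon backtracking can be traced with a small domain-pruning sketch. It assumes that X4 ≠ 1 still holds after backtracking (it was derived at an earlier decision level in the example) and uses a hypothetical (variable, value) literal representation; it is not CMVSAT-1's actual data structure.

```python
def propagate_mv_clause(clause, domains):
    """Prune domains using a multi-valued clause of (var, value) literals.

    If every literal that is still possible concerns one variable, that
    variable's domain is intersected with the values named in the clause.
    Returns the set of removed (var, value) pairs, i.e. new inequations.
    """
    possible = [(var, val) for var, val in clause if val in domains[var]]
    variables = {var for var, _ in possible}
    if len(variables) != 1:
        return set()                         # conflict, or clause not unit yet
    var = next(iter(variables))
    keep = {val for _, val in possible}
    removed = {(var, val) for val in domains[var] - keep}
    domains[var] &= keep                     # in-place domain reduction
    return removed

# State after backtracking over the decision on X3: X4 != 1 is still known.
domains = {"X3": set(range(1, 8)), "X4": {2, 3}}
learned = [("X4", 1), ("X3", 6), ("X3", 7)]
removed = propagate_mv_clause(learned, domains)
print(sorted(removed), domains["X3"])
```

A Boolean nogood such as (x41 = true || x31 = false) would only remove the value 1 from X3; the multi-valued clause removes 1 through 5 in one step.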

This Work: Generalize This Framework
• Core representation in implication graph
  • CMVSAT-1: variable inequations (X4 ≠ 1)
  • in general: primitive constraints of the solver
  • example: linear inequalities (Y2 ≤ 5, Y3 ≥ 1)
• Constraints
  • CMVSAT-1: multi-valued clauses (X4 = 1 || X3 = 6 || X3 = 7)
  • in general: secondary constraints supported by the solver
  • example: (X1 = true || Y3 ≤ 4 || X3 = Michigan)
• Propagation of a secondary constraint Cs: entailment of a new primitive constraint from Cs and known primitives

This Work: Generalize This Framework
• Learned nogoods
  • CMVSAT-1: multi-valued clauses (X4 = 1 || X3 = 6 || X3 = 7)
  • in general: disjunctions of negations of primitives (with certain desirable properties)
  • example: (X1 = true || Y3 ≤ 4 || X3 = Michigan)
• Desirable properties of nogoods?
  • The “unit” part of the learned nogood that is implied upon backtracking must be representable as a set of primitives!
  • E.g., if Y3 is meant to become unit upon backtracking, then (X1 = true || Y3 ≤ 4 || Y3 ≥ 6 || X3 = Michigan) is NOT desirable
    • cannot represent Y3 ≤ 4 || Y3 ≥ 6 as a conjunction of primitives
    • so upon backtracking, cannot propagate the learned nogood
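
The representability requirement can be made concrete for an integer range variable whose primitives are bound constraints. The sketch below assumes a domain [1, 10] for Y3 and a hypothetical ("<=", c) / (">=", c) literal format; it only checks whether the unit part collapses to a single interval.

```python
def as_bound_primitives(literals, lo, hi):
    """Try to express a disjunction of bound literals on one integer
    variable with domain [lo, hi] as a conjunction of bound primitives.

    literals: list of ("<=", c) or (">=", c).
    Returns (new_lo, new_hi) if the allowed values form one interval.
    Returns None if they leave a hole (not representable by bounds alone)
    or if no value is allowed at all.
    """
    allowed = set()
    for op, c in literals:
        if op == "<=":
            allowed.update(range(lo, min(c, hi) + 1))
        elif op == ">=":
            allowed.update(range(max(c, lo), hi + 1))
        else:
            raise ValueError(f"unknown operator {op!r}")
    if not allowed:
        return None
    new_lo, new_hi = min(allowed), max(allowed)
    if len(allowed) != new_hi - new_lo + 1:
        return None                          # hole in the allowed set
    return new_lo, new_hi

# Unit part "Y3 <= 4" is fine; "Y3 <= 4 || Y3 >= 6" leaves a hole at 5.
good = as_bound_primitives([("<=", 4)], 1, 10)
bad = as_bound_primitives([("<=", 4), (">=", 6)], 1, 10)
print(good, bad)
```

With only bound primitives available, the second disjunction cannot be posted after backtracking, which is the reason the slide calls that nogood undesirable.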

Sufficient Conditions [details in AAAI-2011 paper]
• System distinguishes between primitive and secondary constraints
• Secondary constraint propagators:
  • entail new primitive constraints
  • efficiently provide a set of primitive constraints sufficient for the entailment
• Can efficiently detect conflicting sets of primitives
  • and represent the disjunction of their negations (the “nogood”) as a secondary constraint
• Certain sets of negated primitives (e.g., those arising from propagation upon backtracking) succinctly representable as a set of primitives
• Branching executed as the addition of one or more* primitives
Under these conditions, we can efficiently learn strong nogoods!

Abstract Implication Graph (under sufficient conditions)
• Cp: primitive constraints branched upon or entailed at various decision levels
• Learned nogood: disjunction of negations of primitives in shaded nodes

General Framework: Example
• X1 ∈ {true, false}; X3 ∈ {NY, TX, FL, CA}; X5 ∈ {r, g, b}; X2, X4, X6, X7 ∈ {1, …, 100}
• Branch on X1 ≠ true, X2 ≤ 50
• Learn: (X3 = FL || X4 ≥ 31)
Notes:
• The part of the nogood that is unit upon backtracking need not always regard the same variable! (e.g., when X ≤ Y is primitive)
• Neither is regarding the same variable sufficient for being a valid nogood! (e.g., X ≤ 4 || X ≥ 11 wouldn’t be representable)
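
A learned nogood is the disjunction of the negations of the primitives on a cut of the implication graph. The sketch below assumes integer-valued bounds and a hypothetical (variable, operator, constant) tuple format. The cut {X3 ≠ FL, X4 ≤ 30} is not shown in the transcript; it is inferred here by negating the learned nogood on the slide.

```python
# Negation table for equations and inequations.
NEGATE = {"=": "!=", "!=": "="}

def negate(primitive):
    """Negate a primitive (var, op, const); bounds assume integer domains."""
    var, op, c = primitive
    if op == "<=":
        return (var, ">=", c + 1)            # not (x <= c)  is  x >= c + 1
    if op == ">=":
        return (var, "<=", c - 1)            # not (x >= c)  is  x <= c - 1
    return (var, NEGATE[op], c)

def nogood_from_cut(cut):
    """Disjunction (as a list) of the negated primitives on the cut."""
    return [negate(p) for p in cut]

# Assumed conflicting set of primitives for the slide's example.
cut = [("X3", "!=", "FL"), ("X4", "<=", 30)]
nogood = nogood_from_cut(cut)
print(nogood)
```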

Empirical Evaluation
• Ideas implemented in CMVSAT-2
• Currently supported:
  • usual domain variables, and range variables
  • linear inequalities, e.g. (X1 + 5 X2 – 13 X3 ≤ 6)
  • disjunctions of equations and range constraints, e.g. (X1 ∈ [5…60] || X2 ∈ [37…74] || X5 = b || X10 ≤ 15)
• Comparison against:
  • SAT solver: MiniSat [Een-Sorensson 2004, version 2.2.0]
  • CSP solver: Mistral [Hebrard 2008]
  • MIP solver: SCIP [Achterberg 2004, version 2.0.1]
• Encodings generated using Numberjack [Hebrard et al 2010, version 0.1.10-11-24]

Empirical Evaluation
• Benchmark domains (100 instances of each)
  • QWH-C: weighted quasi-group with holes / Latin square completion
    • random cost c_ik ∈ {1, …, 10} assigned to cell (i,k)
    • cost constraint: sum_ik (c_ik X_ik) ≤ (sum of all costs) / 2
    • size 25 x 25, 40% filled entries, all satisfiable
  • MSP-3: market split problem
    • notoriously hard for constraint solvers
    • partition 20 weighted items into 3 equal sets, 10% satisfiable
  • NQUEENS-W: weighted n-queens, 30x30
    • average weight of occupied cells ≥ 70% of weight_max
    • size 30 x 30, random weights ∈ {1, …, 10}

Empirical Evaluation: Results
[Table: results for CMVSAT-2 compared with the SAT solver, CSP solver, and MIP solver]
• CMVSAT-2 shows good performance across a variety of domains
  • solved all 300 instances in < 4 sec on average
• MiniSat not suitable for domains like QWH-C
  • encodings (even “compact” ones like in Sugar) too large to generate or solve
  • 20x slower on MSP-3
• Mistral explores a much larger search space than needed
  • lack of nogood learning becomes a bottleneck: e.g., 3M nodes for QWH-C (36% solved), compared to 231 nodes for CMVSAT-2
• SCIP takes 100x longer on QWH-C, 10x longer on NQUEENS-W

Summary
• A generalized framework for SAT-style nogood-learning
  • extends CMVSAT-1
  • low-overhead process, retains efficiency of SAT solvers
  • sufficient conditions:
    • primitive constraints
    • secondary constraints, propagation as entailment of primitives
    • valid cutsets in conflict graphs: facilitate propagation upon backtracking
    • other efficiency / representation criteria
• CMVSAT-2: robust performance across a variety of problem domains
  • compared to a SAT solver, a CSP solver, a MIP solver
  • open: more extensive evaluation and comparison against, e.g., lazy clause generation, pseudo-Boolean solvers, SMT solvers
  • nonetheless, a promising and fruitful direction!
