Outline

Exploiting Structure and Randomization in Combinatorial Search. Carla P. Gomes, gomes@cs.cornell.edu, www.cs.cornell.edu/gomes. Intelligent Information Systems Institute, Department of Computer Science, Cornell University

Outline • A Structured Benchmark Domain • Randomization • Conclusions

Quasigroups or Latin Squares: An Abstraction for Real World Applications • Given an N x N matrix and N colors, a quasigroup of order N is a colored matrix such that: - all cells are colored; - each color occurs exactly once in each row; - each color occurs exactly once in each column. [Figure: a quasigroup or Latin square of order 4]
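The three conditions above can be checked mechanically. The following is a minimal sketch, assuming colors are represented as the integers 0..N-1 and the matrix as a list of rows; the function name `is_latin_square` is illustrative and not from the slides.

```python
def is_latin_square(square):
    """Return True if `square` (a list of rows) is a quasigroup / Latin square.

    Assumption: the N colors are encoded as the integers 0..N-1.
    """
    n = len(square)
    colors = set(range(n))
    # Every row must have N cells and contain each color exactly once.
    if not all(len(row) == n and set(row) == colors for row in square):
        return False
    # Every column must contain each color exactly once.
    return all({square[r][c] for r in range(n)} == colors for c in range(n))


# A cyclic square of order 4: cell (r, c) gets color (r + c) mod 4.
cyclic = [[(r + c) % 4 for c in range(4)] for r in range(4)]
print(is_latin_square(cyclic))
```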

Quasigroup Completion Problem (QCP) Given a partial assignment of colors (10 colors in this case), can the partial quasigroup (latin square) be completed so we obtain a full quasigroup? Example: 32% preassignment (Gomes & Selman 97)
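As a rough illustration of the completion question, here is a plain chronological backtracking sketch. It is not the solver used in the talk; it assumes colors are the integers 0..N-1, that `None` marks an uncolored cell, and that the preassigned cells do not already conflict.

```python
def complete(square):
    """Try to complete a partial Latin square in place.

    `None` marks an uncolored cell. Returns True and leaves the completion
    in `square` if one exists; otherwise returns False and restores `square`.
    """
    n = len(square)
    for r in range(n):
        for c in range(n):
            if square[r][c] is None:
                # Colors already used in this row or this column are excluded.
                used = set(square[r]) | {square[i][c] for i in range(n)}
                for color in range(n):
                    if color not in used:
                        square[r][c] = color
                        if complete(square):
                            return True
                square[r][c] = None  # undo and backtrack
                return False
    return True  # no uncolored cell left


partial = [[0, None, None],
           [None, 0, None],
           [None, None, None]]
print(complete(partial), partial)
```

The fixed row-by-row cell order keeps the sketch short; it is exactly the kind of deterministic choice that the later slides on randomization replace with random tie-breaking.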

Quasigroup Completion Problem: A Framework for Studying Search • NP-Complete. • Has a structure not found in random instances, such as random K-SAT. • Leads to interesting search problems when structure is perturbed (more about it later). • Good abstraction for several real world problems: scheduling and timetabling, routing in fiber optics, coding, etc. (Anderson 85, Colbourn 83, 84, Denes & Keedwell 94, Fujita et al. 93, Gent et al. 99, Gomes & Selman 97, Gomes et al. 98, Meseguer & Walsh 98, Stergiou and Walsh 99, Shaw et al. 98, Stickel 99, Walsh 99)

Fiber Optic Networks Nodes connect point to point fiber optic links

Fiber Optic Networks • Nodes connect point to point fiber optic links. • Each fiber optic link supports a large number of wavelengths. • Nodes are capable of photonic switching (dynamic wavelength routing), which involves the setting of the wavelengths.

Routing in Fiber Optic Networks • [Figure: a routing node with input ports 1-4 and output ports 1-4, showing preassigned channels] • How can we achieve conflict-free routing in each node of the network? • Dynamic wavelength routing is an NP-hard problem.

QCP Example Use: Routers in Fiber Optic Networks • Dynamic wavelength routing in Fiber Optic Networks can be directly mapped into the Quasigroup Completion Problem. • Each channel cannot be repeated in the same input port (row constraints); • each channel cannot be repeated in the same output port (column constraints). • [Figure: conflict-free Latin router, with input ports 1-4 as rows and output ports 1-4 as columns] (Barry and Humblet 93, Cheung et al. 90, Green 92, Kumar et al. 99)

Traditional View of Hard Problems - Worst Case View • “They’re NP-Complete—there’s no way to do anything but try heuristic approaches and hope for the best.”

New Concepts in Computation • Not all NP-Hard problems are the same! • We now have means for discriminating easy from hard instances: phase transition concepts.

NP-completeness is a worst-case notion – what about average complexity? Structural differences between instances of the same NP-complete problem (QCP)

Are all the Quasigroup Instances (of same size) Equally Difficult? • Time performance: 1820, 165, 150. • What is the fundamental difference between instances?

Are all the Quasigroup Instances Equally Difficult? Time performance: 150 Fraction of preassignment: 35% 1820 165 50% 40%

Complexity of Quasigroup Completion • [Figure: median runtime (log scale) vs. fraction of pre-assignment. The plot marks an underconstrained area (around 20%), a critically constrained area (around 42%), and an overconstrained area (around 50%).]

Complexity Graph: Phase Transition • [Figure: fraction of unsolvable cases vs. fraction of pre-assignment] • Phase transition from an almost-all-solvable area to an almost-all-unsolvable area.

These results for the QCP, a structured domain, nicely complement previous results on phase transition and computational complexity for random instances such as SAT, Graph Coloring, etc. • (Broder et al. 93, Clearwater and Hogg 96, Cheeseman et al. 91, Cook and Mitchell 98, Crawford and Auton 93, Crawford and Baker 94, Dubois 90, Frank et al. 98, Frost and Dechter 94, Gent and Walsh 95, Hogg et al. 96, Mitchell et al. 92, Kirkpatrick and Selman 94, Monasson et al. 99, Motwani et al. 94, Pemberton and Zhang 96, Prosser 96, Schrag and Crawford 96, Selman and Kirkpatrick 97, Smith and Grant 94, Smith and Dyer 96, Zhang and Korf 96, and more)

QCP: Different Representations / Encodings

Cubic representation of QCP • [Figure: a cube whose three axes are rows, columns, and colors]

QCP as a MIP • Variables - • Constraints - Row/color line Column/color line Row/column line
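The formulas for the variables and constraints are not present in the text of this slide. A standard 0-1 assignment formulation consistent with the cubic representation and with the three kinds of "line" named above is sketched below. The notation (x_{ijk}, with i a row, j a column, k a color, and P the set of preassigned cells) is an assumption, not taken from the slide.

```latex
% x_{ijk} = 1 iff cell (i, j) is given color k   (assumed notation)
\begin{align*}
  & x_{ijk} \in \{0, 1\}, && i, j, k \in \{1, \dots, n\} \\
  & \sum_{k=1}^{n} x_{ijk} = 1 && \forall\, i, j \quad \text{(row/column line: each cell has one color)} \\
  & \sum_{j=1}^{n} x_{ijk} = 1 && \forall\, i, k \quad \text{(row/color line: color $k$ once in row $i$)} \\
  & \sum_{i=1}^{n} x_{ijk} = 1 && \forall\, j, k \quad \text{(column/color line: color $k$ once in column $j$)} \\
  & x_{ijk} = 1 && \forall\, (i, j, k) \in P \quad \text{(preassigned cells)}
\end{align*}
```

Under this formulation there are n^3 binary variables and 3n^2 equality constraints, one per line of the cube.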

QCP as a CSP • Variables - [ vs. for MIP] • Constraints - row, column [ vs. for MIP]

Exploiting Structure for Domain Reduction • A very successful strategy for domain reduction in CSP is to exploit the structure of groups of constraints and treat them as global constraints. • Example using Network Flow Algorithms: • All-different constraints (Caseau and Laburthe 94, Focacci, Lodi, & Milano 99, Nuijten & Aarts 95, Ottosson & Thorsteinsson 00, Refalo 99, Regin 94)

Exploiting Structure in QCP: ALLDIFF as Global Constraint • Matching on a bipartite graph. • Two solutions: [figure] • All-different constraint: we can update the domains of the column variables. • Analogously, we can update the domains of the other variables. (Berge 70, Regin 94, Shaw and Walsh 98)
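The idea of pruning domains through bipartite matching can be sketched as follows. This is a deliberately naive version: it keeps a value for a variable only if some matching that covers all variables uses that assignment, and it re-runs a simple augmenting-path matching for every (variable, value) pair. Regin's algorithm reaches the same result far more efficiently; the function names here are illustrative only.

```python
def has_full_matching(domains):
    """Augmenting-path bipartite matching: can every variable get a distinct value?"""
    match = {}  # value -> variable currently assigned to it

    def assign(var, seen):
        for val in domains[var]:
            if val in seen:
                continue
            seen.add(val)
            # Take a free value, or try to re-route the variable holding it.
            if val not in match or assign(match[val], seen):
                match[val] = var
                return True
        return False

    return all(assign(var, set()) for var in domains)


def filter_alldiff(domains):
    """Keep value v in the domain of x only if some full matching has x = v."""
    return {
        var: {val for val in dom if has_full_matching({**domains, var: {val}})}
        for var, dom in domains.items()
    }


# Three cells of one row: x1 and x2 can only take colors 1 or 2,
# so those two colors are unavailable to x3.
row = {"x1": {1, 2}, "x2": {1, 2}, "x3": {1, 2, 3}}
print(filter_alldiff(row))
```

Plain arc consistency on pairwise "not equal" constraints would remove nothing in this example, which is the gap between the two approaches that the next slide quantifies.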

Exploiting Structure: Arc Consistency vs. All Diff • Arc Consistency: solves up to order 20. Size search space • AllDiff: solves up to order 33. Size search space

Quasigroup as Satisfiability • Two different encodings for SAT: • 2D encoding (or minimal encoding); • 3D encoding (or full encoding);

2D Encoding or Minimal Encoding • Variables: • Each variable represents a color assigned to a cell. • Clauses: • Some color must be assigned to each cell (clause of length n); • No color is repeated in the same row (sets of negative binary clauses); • No color is repeated in the same column (sets of negative binary clauses);
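The clause sets listed above can be generated as in the sketch below. The DIMACS-style variable numbering in `var`, the function names, and the use of unit clauses for preassigned cells are assumptions made for illustration.

```python
from itertools import combinations


def var(r, c, k, n):
    """Hypothetical numbering: positive integer for 'cell (r, c) has color k'."""
    return r * n * n + c * n + k + 1


def encode_2d(n, preassigned=None):
    """Return the 2D (minimal) encoding as a list of clauses (lists of literals)."""
    clauses = []
    # Some color must be assigned to each cell (one clause of length n per cell).
    for r in range(n):
        for c in range(n):
            clauses.append([var(r, c, k, n) for k in range(n)])
    for k in range(n):
        # No color is repeated in the same row.
        for r in range(n):
            for c1, c2 in combinations(range(n), 2):
                clauses.append([-var(r, c1, k, n), -var(r, c2, k, n)])
        # No color is repeated in the same column.
        for c in range(n):
            for r1, r2 in combinations(range(n), 2):
                clauses.append([-var(r1, c, k, n), -var(r2, c, k, n)])
    # Assumption: preassigned cells {(r, c): k} are added as unit clauses.
    for (r, c), k in (preassigned or {}).items():
        clauses.append([var(r, c, k, n)])
    return clauses


print(len(encode_2d(3)))
```

For order n this gives n^2 positive clauses and n^2(n-1) negative binary clauses each for rows and columns.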

3D Encoding or Full Encoding • This encoding is based on the cubic representation of the quasigroup: each line of the cube contains exactly one true variable; • Variables: • Same as 2D encoding. • Clauses: • Same as the 2D encoding plus: • Each color must appear at least once in each row; • Each color must appear at least once in each column; • No two colors are assigned to the same cell;
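A sketch of the three additional clause sets that the 3D encoding adds on top of the 2D encoding. As before, the variable numbering in `var` and the function name are illustrative assumptions.

```python
from itertools import combinations


def var(r, c, k, n):
    """Hypothetical numbering: positive integer for 'cell (r, c) has color k'."""
    return r * n * n + c * n + k + 1


def extra_3d_clauses(n):
    """Clauses the 3D (full) encoding adds to the 2D (minimal) encoding."""
    clauses = []
    for k in range(n):
        # Each color must appear at least once in each row.
        for r in range(n):
            clauses.append([var(r, c, k, n) for c in range(n)])
        # Each color must appear at least once in each column.
        for c in range(n):
            clauses.append([var(r, c, k, n) for r in range(n)])
    # No two colors are assigned to the same cell.
    for r in range(n):
        for c in range(n):
            for k1, k2 in combinations(range(n), 2):
                clauses.append([-var(r, c, k1, n), -var(r, c, k2, n)])
    return clauses


print(len(extra_3d_clauses(3)))
```

These clauses are logically implied by the 2D encoding together with a counting argument, so they do not change the set of solutions; they make the "exactly one true variable per line of the cube" structure explicit to the solver.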

Capturing Structure - Performance of SAT Solvers • State-of-the-art SAT solvers, both complete (backtrack) and local search, are very competitive with specialized CSP algorithms when using the 3D encoding. • In contrast, complete SAT solvers (SATZ or SATO) perform very poorly on 2D encodings; • local search solvers (Walksat), however, perform well on 2D encodings.

SATZ on 2D encoding (Order 20-28) • [Figure: one curve per order, from order 20 to order 28; axis mark at 1,000,000] • SATZ and SATO can only solve up to order 28 when using the 2D encoding; • When using the 3D encoding, problems of the same size take only 0 or 1 backtracks, and much higher orders can be solved.

Walksat on 2D and 3D encoding (Order 30-33) • [Figure: curves for the 3D encoding, order 33, and the 2D encoding, order 33; axis mark at 1,000,000] • Walksat shows an unusual pattern: the 2D encodings are somewhat easier than the 3D encoding at the peak and harder in the underconstrained region.

Quasigroup - Satisfiability • Encoding the quasigroup using only Boolean variables in clausal form using the 3D encoding is very competitive. • Very fast solvers: SATZ, GRASP, SATO, WALKSAT.

Structural features of instances provide insights into their hardness, namely: • Backbone • Inherent Structure and Balance

Backbone • Backbone is the shared structure of all the solutions to a given instance. • This instance has 4 solutions: [figure] • Total number of backbone variables: 2
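For small instances the backbone can be computed directly from its definition by enumerating every solution and keeping what they all share. The sketch below does this for a partial Latin square, assuming colors 0..N-1 and `None` for uncolored cells, and reporting the backbone as the initially uncolored cells that take the same color in every completion. This brute-force enumeration is only for illustration and does not scale to the orders discussed in the talk.

```python
def completions(square):
    """Yield every completion of a partial Latin square (None = uncolored cell)."""
    n = len(square)
    empty = [(r, c) for r in range(n) for c in range(n) if square[r][c] is None]
    if not empty:
        yield [row[:] for row in square]
        return
    r, c = empty[0]
    used = set(square[r]) | {square[i][c] for i in range(n)}
    for color in range(n):
        if color not in used:
            square[r][c] = color
            yield from completions(square)
            square[r][c] = None  # undo before trying the next color


def backbone(square):
    """Map each initially uncolored cell that is fixed in all solutions to its color."""
    free = [(r, c) for r, row in enumerate(square)
            for c, v in enumerate(row) if v is None]
    sols = list(completions(square))
    if not sols:
        return {}
    return {(r, c): sols[0][r][c] for (r, c) in free
            if len({s[r][c] for s in sols}) == 1}


instance = [[0, None, None],
            [None, None, None],
            [None, None, None]]
print(len(list(completions(instance))), backbone(instance))
```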

Phase Transition in the Backbone • We have observed a transition in the backbone from a phase where the size of the backbone is around 0% to a phase with backbone of size close to 100%. • The phase transition in the backbone is sudden and it coincides with the hardest problem instances. (Achlioptas, Gomes, Kautz, Selman 00, Monasson et al. 99)

Sudden phase transition in Backbone • New Phase Transition in Backbone, QCP (satisfiable instances only) • [Figure: % of backbone and computational cost vs. fraction of preassigned cells]

Inherent Structure and Balance

Quasigroup Patterns and Problem Hardness • Rectangular Pattern and Aligned Pattern: tractable. • Balanced Pattern: very hard. (Kautz, Ruan, Achlioptas, Gomes, Selman 2001)

SATZ • [Figure: curves for Balanced QCP, Rectangular QCP, QCP, QWH, and Aligned QCP]

Walksat • [Figure: curves for balanced filtered QCP, balanced QWH, QCP, QWH, aligned, and rectangular] • We observe the same ordering in hardness when using Walksat, SATZ, and SATO: balancing makes instances harder.

Phase Transitions, Backbone, Balance • Summary • The understanding of the structural properties of problem instances based on notions such as phase transitions, backbone, and balance provides new insights into the practical complexity of many computational tasks. • Active research area with fruitful interactions between computer science, physics (approaches from statistical mechanics), and mathematics (combinatorics / random structures).

Outline • A Structured Benchmark Domain • Randomization • Conclusions

Randomized Backtrack Search Procedures

Background • Stochastic strategies have been very successful in the area of local search. • Simulated annealing • Genetic algorithms • Tabu Search • GSAT and variants. • Limitation: the inherently incomplete nature of local search methods.

Background • We want to explore the addition of a stochastic element to a systematic search procedure without losing completeness.

Randomization • We introduce stochasticity in a backtrack search method, e.g., by randomly breaking ties in variable and/or value selection. • Compare with standard lexicographic tie-breaking.

Randomization • At each choice point break ties (variable selection and/or value selection) randomly or: • “Heuristic equivalence” parameter (H) - at every choice point consider as “equally” good H% top choices; randomly select a choice from equally good choices.
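The tie-breaking rule with the heuristic equivalence parameter can be sketched as below. It assumes that "H% top choices" means all choices whose heuristic score lies within H percent of the best score, with higher scores being better; the function name, the `score` callback, and the example scores are hypothetical.

```python
import random


def pick_randomized(candidates, score, h_percent=0.0, rng=random):
    """Pick uniformly at random among the candidates considered equally good.

    Assumption: a candidate is 'equally good' if its score is within
    h_percent of the best score (higher is better). With h_percent = 0
    this reduces to random tie-breaking among the exact best choices.
    """
    best = max(score(c) for c in candidates)
    threshold = best - abs(best) * h_percent / 100.0
    top = [c for c in candidates if score(c) >= threshold]
    return rng.choice(top)


# Hypothetical heuristic scores for three branching variables.
scores = {"x": 10, "y": 9, "z": 5}
rng = random.Random(0)
print(pick_randomized(list(scores), scores.get, h_percent=10, rng=rng))
```

Because only the order in which choices are explored changes, a backtrack search that uses this rule at every choice point still visits the whole search space if needed, so completeness is preserved.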

Randomized Strategies

Quasigroup Demo

Distributions of Randomized Backtrack Search • Key Properties: • I Erratic behavior of mean • II Distributions have “heavy tails”.
