A tutorial on Kernelization

Hans L. Bodlaender
IPEC 2011 - Kernelization

In this talk
• Kernelization: what, why, how?
• Connection with fixed parameter tractability: problems without kernels
• Problems without polynomial kernels
• Conclusions
• Thanks to many people for inspiration, collaboration and contribution!

What to do with hard problems?
• Many combinatorial problems are hard, e.g., NP-complete
• They arise in many contexts
• Approaches to deal with them:
  • Approximations
  • Special cases
  • Exact algorithms
  • ILP techniques, branch and bound, SAT solvers, …

Useful approach: preprocessing
Before running a slow exact algorithm, preprocess / simplify the data: transform to a (hopefully smaller) equivalent instance.
[Diagram: Input I → Preprocess → Equivalent smaller input I’ → Solve → Output for I’ → Undo preprocessing → Output for I]

On preprocessing
• Relatively fast step
• Attempt to obtain an equivalent instance:
  • Answer does not change
  • Size may decrease
• Slow algorithm (like an ILP solver) hopefully uses less time on the reduced instance

Kernelization: central question
• What can we prove about the size of a reduced instance, assuming polynomial time preprocessing?

What we cannot expect
Proposition: If P ≠ NP, then an NP-hard problem Q has no polynomial time preprocessing algorithm A that always reduces an input to a smaller equivalent input.
Proof: If A exists, then P = NP: repeat the algorithm until we have an instance of size O(1) and solve it. Each round shrinks the instance, so at most |x| rounds are needed.
• So … instead we investigate the reduced instance size as a function of a parameter of the inputs.

Parameterized problem
• Subset of Σ* × ℕ:
  • (“real input”, “parameter”)
• Decision problem
• Part of the input is an integer, called the parameter
• We express the quality of preprocessing as a function of the parameter

Classic first example: Vertex Cover
Vertex Cover:
Input: Graph G = (V, E), integer k
Question: Is there a vertex cover of size at most k in G, i.e., a set W ⊆ V such that for all {v, w} ∈ E, v ∈ W or w ∈ W?
Parameter: k

Simplification rules
Input: Graph G and integer k
• Rule 1: remove a vertex with no neighbors.
• Rule 2: if a vertex v has k+1 or more neighbors, then
  • {v must be in each vertex cover of size ≤ k}
  • Remove v and all incident edges;
  • Set k = k - 1.
• Rule 3: if k < 0, then say no.

Counting rule
• Rule 4: if earlier rules do not apply, and we have more than k^2 edges, then say no.
  • Each vertex has degree at most k, so it can cover at most k edges. So, if G has more than k^2 edges, there is no vertex cover of size at most k.
• Algorithm: execute rules while possible.
• Output of algorithm:
  • Sometimes no (there is certainly no solution).
  • Sometimes an equivalent instance with at most k^2 edges (and hence O(k^2) vertices).
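Rules 1 to 4 can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the function name, the edge-list input format and the return convention are assumptions made for illustration. Because the graph is stored as a set of edges, isolated vertices are never represented, so Rule 1 is applied implicitly.

```python
def kernelize_vertex_cover(edges, k):
    """Apply Rules 1-4 to a Vertex Cover instance (hypothetical helper).

    edges: iterable of 2-element pairs; k: integer parameter.
    Returns ("no", None) or ("instance", (reduced_edge_set, reduced_k)).
    """
    # Storing only edges drops isolated vertices, which is Rule 1.
    edges = {frozenset(e) for e in edges if len(set(e)) == 2}
    while True:
        if k < 0:
            return ("no", None)  # Rule 3
        degree = {}
        for e in edges:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        # Rule 2: a vertex with at least k+1 neighbors must be in the cover.
        high = next((v for v, d in degree.items() if d >= k + 1), None)
        if high is None:
            break
        edges = {e for e in edges if high not in e}
        k -= 1
    # Rule 4: every remaining vertex covers at most k edges.
    if len(edges) > k * k:
        return ("no", None)
    return ("instance", (edges, k))
```

For a star with five leaves and k = 1, the center is taken by Rule 2 and the result is the empty graph with k = 0; for a triangle with k = 1 the sketch answers no.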

“Kernel” for Vertex Cover
Theorem: There is a polynomial time algorithm that, given an input (G, k) of Vertex Cover, either decides this input or gives an equivalent instance with O(k^2) vertices and edges.
• Instead of deciding, we can also transform to a trivial no-instance (e.g., a graph with one edge and k = 0).
• We say: Vertex Cover has a kernel with O(k^2) vertices and edges.

Kernel (definition)
• A kernelization algorithm or kernel for a parameterized problem Q is an algorithm A that maps inputs for Q to inputs for Q, such that
  • A uses polynomial time;
  • For all (x, k): (x, k) ∈ Q if and only if A(x, k) ∈ Q;
  • There is a function f such that for all (x’, k’) = A(x, k):
    • k’ ≤ f(k);
    • |x’| ≤ f(k).
Size and parameter of the new instance are bounded by a function of the old parameter.

Research questions
• For parameterized problems Q:
  • Does Q have a kernel?
  • If so, how small (function f) can this kernel be?
    • Linear kernels?
    • Polynomial kernels?
    • Any kernels?

Motivation for kernels
• Analysis of preprocessing.
• Kernels give new preprocessing steps.
• First step for FPT algorithms.

Compare
• Approximation algorithm = upper bound and lower bound heuristic + a proof of its quality.
• Kernel = preprocessing heuristic + a proof of its quality.

Overview of problem behavior
• O(1) size kernels: problems in P. Ex.: Eulerian
• NP-completeness (variable parameter)
• Polynomial kernels. Shown with algorithm. Ex.: Vertex Cover
• Compositionality, ppt-transformations, cross-composition
• Kernels, but not polynomial sized. Shown (usually) with FPT-algorithm. Ex.: Long Path
• W[1]-hardness
• XP: No kernel, polynomial if parameter is bounded. Ex.: Independent Set
• NP-completeness (fixed parameter)
• Bad. Example: Graph Coloring is NP-complete for 3 colors

How do we make kernelization algorithms?
• General method:
  • Invent safe rules.
    • Safe rules change an instance to an equivalent instance.
    • Rules should modify instances to equivalent instances that are smaller or give more structural insight.
  • Have a counting rule or a counting argument.

Designing the algorithm
Repeat until we have a (small enough) kernel:
• Invent safe rules.
• Analyse instances: if no safe rule applies, is the instance size bounded? If not, why not? Can we find a rule that avoids such situations?

Example: Weighted marbles
Instance: sequence of marbles and an integer k. Each marble has a positive integer cost and a color.
Question: can we remove marbles of total cost at most k such that for each color, all marbles with that color are consecutive?
Parameter: k.
[Figure: marble sequence with costs 3 4 1 3 2 2 6 and, after removal, 4 1 3 2 6; colors not preserved in the transcript.]
Solution of cost 5 = 3 + 2

Rule 1
• If we have two consecutive marbles of the same color, replace them by one marble with the sum of the weights.
[Figure: two adjacent same-colored marbles with weights 5 and 7 are replaced by one marble with weight 12.]
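Rule 1 amounts to a single left-to-right pass. The sketch below assumes a marble is represented as a `(color, weight)` pair; that representation and the function name are illustrative choices, not part of the talk.

```python
def merge_adjacent(marbles):
    """Rule 1: merge runs of same-colored neighbors, summing their weights.

    marbles: list of (color, weight) pairs (assumed representation).
    """
    merged = []
    for color, weight in marbles:
        if merged and merged[-1][0] == color:
            # Same color as the previous marble: fold the weight into it.
            merged[-1] = (color, merged[-1][1] + weight)
        else:
            merged.append((color, weight))
    return merged
```

For example, `merge_adjacent([("red", 5), ("red", 7), ("blue", 1)])` yields one red marble of weight 12 followed by the blue marble.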

What we have now:
• Two successive marbles have a different color.
• But we can have many color changes, even in a solution of cost 1.
[Figure: marble sequence with costs 3 4 1 3 2 2 6 1; colors not preserved in the transcript.]

Good colors
• A color is good if there is only one marble with this color.
[Figure: marble sequence with costs 3 4 1 3 2 2 6 1; colors not preserved in the transcript.]

Rule 2
• Suppose two successive marbles both have a good color. Give the second the color of the first.
• Rule 2 does not make the instance smaller, but it makes it simpler: fewer colors! I.e., it increases our structural insight!
[Figure: the sequence 3 4 1 3 2 2 6 1 before and after applying Rule 2 three times; colors not preserved in the transcript.]
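A possible sketch of Rule 2, again assuming marbles are `(color, weight)` pairs. As a shortcut that is an assumption of this sketch, it recolors a whole run of good-colored marbles in one pass; this has the same effect as alternating Rule 2 with Rule 1 on that run.

```python
from collections import Counter


def recolor_good_runs(marbles):
    """Rule 2: in each run of good-colored marbles, copy the first color.

    marbles: list of (color, weight) pairs (assumed representation).
    A color is good if exactly one marble in the input has it.
    """
    counts = Counter(color for color, _ in marbles)
    result = []
    previous_was_good = False
    for color, weight in marbles:
        is_good = counts[color] == 1
        if is_good and previous_was_good:
            # Take over the (possibly already recolored) predecessor's color.
            result.append((result[-1][0], weight))
        else:
            result.append((color, weight))
        previous_was_good = is_good
    return result
```

The output has the same length as the input but fewer colors; a following Rule 1 pass then merges each recolored run into a single marble.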

Algorithm
• While Rule 1 or Rule 2 is possible: apply the rule.
• Afterwards:
  • No 2 successive marbles of the same color.
  • No 2 successive marbles with a good color.
  • The number of marbles is at most twice (+1) the number of marbles with a bad color.
• Can we bound the number of bad colored marbles?

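Rule 3: counting rule
• If there are at least 4k+1 bad colored marbles, say no (or transform to an O(1) size no-instance).
• Safeness: By deleting one marble, the number of bad colored marbles can decrease by at most 4 (assuming Rule 1): the deleted marble itself, the one remaining marble of its color turning good, and its two neighbors merging into a single good marble. For example, deleting the first A in B A B A clears all four bad marbles.
• Applying Rules 1, 2, 3 while possible gives an instance with O(k) marbles.
• Is this a kernel for the problem?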

Rule 4
• If a marble has weight > k+1, give it weight k+1.
• Safeness: such a marble is never removed.
• Kernelization algorithm:
  • While Rules 1 to 4 are possible, apply them.
  • Polynomial time. Gives an equivalent instance with O(k log k) bits and O(k) marbles.
• Theorem: The Weighted Marbles problem has a kernel of size O(k log k).
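The last two rules can be sketched as one final pass over an instance that is assumed to be already reduced under Rules 1 and 2. The function name, the `(color, weight)` representation and the parameter `max_cleared_per_deletion` are assumptions of this sketch. Its default of 4 is a deliberately conservative counting threshold: one deletion can clear up to four bad marbles (in B A B A, removing the first A leaves B B A, which Rule 1 merges to B A).

```python
from collections import Counter


def finish_marble_kernel(marbles, k, max_cleared_per_deletion=4):
    """Counting rule plus weight capping (hypothetical helper).

    marbles: list of (color, weight) pairs, assumed reduced under Rules 1 and 2.
    k: non-negative integer budget.
    Returns None for a no-instance, else the marbles with capped weights.
    """
    counts = Counter(color for color, _ in marbles)
    bad = sum(1 for color, _ in marbles if counts[color] > 1)
    # Each deletion costs at least 1, so at most k deletions are possible.
    if bad > max_cleared_per_deletion * k:
        return None
    # Rule 4: a marble heavier than k+1 is never removed, so cap its weight.
    return [(color, min(weight, k + 1)) for color, weight in marbles]
```

After this pass every weight is at most k+1, so each marble needs O(log k) bits for its weight.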

Many recent results
• Kernelization algorithms usually have the form:
  • Rules. Often with nontrivial correctness proofs.
  • Counting argument. Often nontrivial combinatorics.
• General techniques: meta-algorithms, crown reductions, protrusions, …
• Sometimes, no (small) kernel (seems to) exist: can we show this?

Connection with Fixed Parameter Tractability
• A parameterized problem P is Fixed Parameter Tractable (∈ FPT) if there is an algorithm solving P that runs on inputs (x, k) in time f(k) · |x|^c for a constant c and some (computable) function f.

Three variants of FPT
• Non-uniform:
  • For a constant c: for every k, there is an algorithm that runs in O(n^c) time.
• Uniform:
  • For a constant c and a function f: there exists an algorithm that runs in f(k) · n^c time.
• Strongly uniform:
  • For a constant c and a computable function f: there exists an algorithm that runs in f(k) · n^c time.

Relation between variants
• Uniform is a proper subset of non-uniform.
  • Example 1: {(x, k) | k ∈ X} for some undecidable set of integers X is in non-uniform but not in uniform FPT.
  • Example 2: if w is a graph parameter that does not increase by taking minors, then Robertson-Seymour theory tells us that {(G, k) | w(G) ≤ k} is in non-uniform FPT.
• Strongly uniform is a proper subset of uniform.
  • Proof by Downey and Fellows.

A useful theorem with a curious proof
Theorem (Folklore): A decidable parameterized problem P belongs to (uniform) FPT if and only if it has a kernel.
Proof (kernel ⇒ FPT): If P has a kernel, then we have an FPT-algorithm:
• Given input (x, k),
• Apply kernelization and obtain (x’, k’).
• Now, use any algorithm to solve (x’, k’).
• Answer is the same.
• Running time poly(|x|) + g(f(k)).
• FPT ⇒ kernel: …
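The kernel-then-solve direction can be illustrated for Vertex Cover. In this sketch, `solve_by_kernel`, `brute_force_vertex_cover` and the calling convention of the `kernelize` argument (return `None` for a no-instance, otherwise a reduced `(edges, k)` pair) are all hypothetical names and conventions chosen for illustration. The brute-force step plays the role of "any algorithm": on a kernel its running time depends only on k.

```python
from itertools import combinations


def brute_force_vertex_cover(edges, k):
    """Try every vertex subset of size at most k (exponential time)."""
    edges = [tuple(e) for e in edges]
    vertices = sorted({v for e in edges for v in e})
    for size in range(0, min(k, len(vertices)) + 1):
        for candidate in combinations(vertices, size):
            chosen = set(candidate)
            if all(u in chosen or v in chosen for u, v in edges):
                return True
    return False


def solve_by_kernel(edges, k, kernelize, solve):
    """Kernelize first, then run any exact algorithm on the small instance.

    kernelize(edges, k) is assumed to return None for "no" or an
    equivalent reduced instance (edges', k').
    """
    reduced = kernelize(edges, k)
    if reduced is None:
        return False
    small_edges, small_k = reduced
    return solve(small_edges, small_k)
```

Any kernelization routine with the assumed return convention can be passed as `kernelize`; the answer is unchanged because the reduced instance is equivalent to the original.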

A useful theorem with a curious proof (II)
Theorem (Folklore): A decidable parameterized problem P belongs to (uniform) FPT if and only if it has a kernel.
Proof continued (FPT ⇒ kernel): If P has an algorithm A that uses f(k) · n^c time:
• Suppose we have input (x, k) with |x| = n.
• Run A for n^(c+1) steps.
• If A halts, we have the answer (transform to an O(1) size yes- or no-instance).
• If A does not halt, just output the original instance (x, k): we have n^(c+1) ≤ f(k) · n^c, so n ≤ f(k).
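The size bound in the last case can be written out as a short derivation. It only uses that A has not halted after n^(c+1) steps although its running time is at most f(k) · n^c, and it assumes n ≥ 1 so that dividing by n^c is allowed.

```latex
\[
  n^{c+1} \;\le\; f(k)\, n^{c}
  \quad\Longrightarrow\quad
  n \;\le\; f(k)
  \quad\Longrightarrow\quad
  |x| \;=\; n \;\le\; f(k).
\]
```

So the untouched instance (x, k) already has size bounded by a function of k, and its parameter is unchanged.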

Variants
Theorem (Folklore): A decidable parameterized problem P belongs to strongly uniform FPT if and only if it has a kernel of size bounded by a computable function.
• Same proof.
• Problems in non-uniform FPT do not need to have a kernel.
• Practical consideration on variants: it does not matter if you use uniform or strongly uniform, as long as you don’t make mistakes…

Implications of the theorem
• Positive:
  • Technique to obtain FPT-algorithms:
    • Make a small kernel.
    • Run an algorithm on the resulting small instance.
• Negative:
  • If we have evidence that there exists no FPT-algorithm, we also have evidence that there exists no kernel.

Negative results
Downey-Fellows introduce complexity classes of parameterized problems that are unlikely to have FPT algorithms, e.g. W[1]. Hardness is shown with a “parameterized variant of many-one reductions”.
Theorem: If W[1] = FPT, then the Exponential Time Hypothesis is not valid.
Corollary: A parameterized problem that is W[1]-hard has no kernel, unless the ETH does not hold.

Many W[1]-hard problems
• Many problems are W[1]-hard, e.g.: Clique, Independent Set, Dominating Set, …
• Canonical W[1]-complete problem:
  • Input: Boolean formula F in conjunctive normal form.
  • Question: Can we satisfy F by setting at most k variables to true?
  • Parameter: k.
• No kernels for these, unless W[1] = FPT and hence the Exponential Time Hypothesis fails.

Problems with large kernels
• For many problems in FPT, we do not know small kernels.
• Consider: Long Path
  • Given: Graph G = (V, E), integer k.
  • Question: Does G have a simple path of length at least k?
  • Parameter: k.
• Long Path is in FPT, but all known kernels have size exponential in k…

Does Long Path have a kernel of polynomial size? Maybe not…
• Suppose we have a polynomial kernel, say of size k^c bits.
[Figure: an instance with parameter k is mapped to an instance with parameter k’ and size bounded by k^c.]

Long Path continued
• Now, suppose we have a series of inputs to Long Path, say all with the same parameter: (G1, k), (G2, k), …, (Gr, k).
[Figure: r separate graphs, each with parameter k.]

Take the disjoint union
• G1 ∪ G2 ∪ … ∪ Gr has a simple path of length k if and only if there exists a graph Gi that has a path of length k.
[Figure: the r graphs, each with parameter k, placed side by side as one graph.]

And now, apply the kernel to the union
[Figure: the disjoint union of the r graphs is mapped to a single instance with parameter k’ and size bounded by k^c.]

What happened?
• We have many (say r = k^(2c)) instances of Long Path, and transform them into one instance of size < k^c.
• Intuition: this cannot be possible without solving some of the instances, as we have fewer bits left than we had instances to start with…
• Theory (next) formalizes this idea.
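The counting behind this intuition is a one-line calculation. It is only an illustration of the informal argument, under the assumptions k ≥ 2 and c ≥ 1, and not the formal lower-bound proof.

```latex
\[
  \frac{\text{bits in the kernel}}{\text{number of instances}}
  \;\le\; \frac{k^{c}}{k^{2c}}
  \;=\; k^{-c}
  \;<\; 1
  \qquad (k \ge 2,\ c \ge 1).
\]
```

On average the kernel keeps less than one bit per input instance.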

(Or-)Compositionality
• A parameterized problem Q is or-compositional if there is an algorithm that
  • Receives as input a series of inputs to Q, all with the same parameter: (I1, k), …, (Ir, k);
  • Uses polynomial time;
  • Outputs one input (I’, k’) to Q;
  • k’ bounded by a polynomial in k;
  • (I’, k’) ∈ Q if and only if there exists at least one j with (Ij, k) ∈ Q.

Or-composition
[Figure: t instances (x1, k), (x2, k), …, (xt, k) of Q, each of size at most n, are composed in poly(t·n + k) time into one instance (x*, k*) of Q with k* bounded by poly(k).]

Compositionality gives lower bounds for kernels
Theorem (B, Downey, Fellows, Hermelin + Fortnow, Santhanam, 2008): Let P be a parameterized problem that is
• Or-compositional, and
• whose “unparameterized form” is NP-complete.
Then P has no polynomial kernel unless NP ⊆ coNP/poly.
• The variant for and-compositionality is still an open problem…

Application to Long Path
• Input: t instances of Long Path.
• Take the disjoint union, output as (G’, k).
• G’ has a path of length k if and only if some Gi has a path of length k.
• Output parameter trivially bounded in poly(k).
[Figure: t graphs, each with parameter k, combined into one disjoint union with parameter k.]
Long Path does not admit a polynomial kernel unless NP ⊆ coNP/poly.
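The disjoint-union composition is short enough to sketch directly. The edge-list representation, both function names, and the choice to measure path length in edges are assumptions of this sketch; `has_simple_path` is an exponential-time brute-force checker included only to test the equivalence on small inputs.

```python
def compose_long_path(instances):
    """Or-composition for Long Path: disjoint union of the input graphs.

    instances: list of (edge_list, k) pairs that all share the same k.
    Vertices are tagged with their instance index to keep copies disjoint.
    """
    parameters = {k for _, k in instances}
    if len(parameters) != 1:
        raise ValueError("all instances must share the same parameter")
    (k,) = parameters
    union = []
    for index, (edges, _) in enumerate(instances):
        union.extend(((index, u), (index, v)) for u, v in edges)
    return union, k


def has_simple_path(edges, k):
    """Brute force: is there a simple path with at least k edges?"""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    def extend(vertex, visited, length):
        if length >= k:
            return True
        return any(extend(w, visited | {w}, length + 1)
                   for w in adjacency[vertex] if w not in visited)

    return any(extend(v, {v}, 0) for v in adjacency)
```

The composed instance keeps the parameter k unchanged, and it is a yes-instance exactly when at least one of the inputs is.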

Additional techniques (1)
• Polynomial parameter transformations (several authors): transform an argument that problem X does not have a polynomial kernel into an argument that problem Y does not have a polynomial kernel.
• Chen et al. (2009): no kernels of size k^c · n^(1-ε) (unless NP ⊆ coNP/poly).
• Cross-compositions (B, Jansen, Kratsch, 2010): composition of instances of problem X into instances of problem Y.
  • Composition of 2n instances suffices.

Additional techniques (2)
• Dell and van Melkebeek (2010): extend the technique to precise lower bounds, e.g.: Ω(k^2) bits for a kernel for Vertex Cover (unless NP ⊆ coNP/poly).
• New results by Dell and Marx, 2011.
• Weak composition (Hermelin and Wu, 2011): polynomial lower bounds for several problems; super quasi polynomial lower bounds.
