This paper discusses methods for efficiently minimizing submodular functions, which is solvable in polynomial time, and contrasts this with maximization: maximizing convex functions, and likewise submodular functions, is NP-hard. It emphasizes approximation guarantees for maximizing submodular functions, with practical applications such as set cover and sensor placement in buildings. Various examples illustrate the submodularity property, particularly feature selection in probabilistic models. The paper combines theoretical insights with practical algorithms, providing bounds and guarantees for a range of optimization scenarios.
Maximizing submodular functions
• Minimizing convex functions: polynomial-time solvable!
• Minimizing submodular functions: polynomial-time solvable!
• Maximizing convex functions: NP-hard!
• Maximizing submodular functions: NP-hard! But we can get approximation guarantees (see the greedy sketch below).
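The classic guarantee here is for the greedy algorithm of Nemhauser, Wolsey, and Fisher: for a monotone submodular z under a cardinality constraint |A| ≤ k, greedy achieves a (1 − 1/e) fraction of the optimum. A minimal sketch in Python; the names greedy_max, z, V, and k are illustrative, not from the slides:

```python
def greedy_max(z, V, k):
    """Greedily pick k elements of V, each round adding the element
    with the largest marginal gain z(A | {e}) - z(A)."""
    A = set()
    for _ in range(k):
        gains = {e: z(A | {e}) - z(A) for e in V - A}
        if not gains:
            break
        A.add(max(gains, key=gains.get))
    return A
```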
Example: Set cover
Want to cover a floorplan with discs: place sensors in a building, choosing among possible locations V. For A ⊆ V, z(A) = "area covered by sensors placed at A" (each sensor predicts values of positions within some radius).
Formally: W is a finite set together with a collection of n subsets S_i ⊆ W. For A ⊆ V = {1, …, n}, define z(A) = |⋃_{i∈A} S_i|.
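As an illustration, here is the coverage function z(A) = |⋃_{i∈A} S_i| on a made-up collection of subsets, driven by the greedy_max sketch above:

```python
# Toy instance of the set-cover objective: V = {1, ..., 4} indexes
# subsets of the ground set W, and z(A) = |union of S_i for i in A|.
subsets = {
    1: {"a", "b", "c"},
    2: {"b", "d"},
    3: {"c", "d", "e"},
    4: {"e", "f"},
}

def z(A):
    covered = set()
    for i in A:
        covered |= subsets[i]
    return len(covered)

print(greedy_max(z, set(subsets), k=2))  # {1, 3}: covers a, b, c, d, e
```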
Set cover is submodular
For A = {S1, S2} ⊆ B = {S1, S2, S3, S4} and any additional set S′:
z(A ∪ {S′}) − z(A) ≥ z(B ∪ {S′}) − z(B)
(Figure: the same disc S′ covers more new area when added to A than when added to B.)
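The diminishing-returns inequality on this slide can be checked exhaustively on the toy coverage function above (a brute-force sketch, feasible only because the instance is tiny):

```python
from itertools import combinations

V = set(subsets)
for r in range(len(V) + 1):
    for B in map(set, combinations(V, r)):          # every B subset of V
        for s in V - B:                             # every new element s
            for q in range(r + 1):
                for A in map(set, combinations(B, q)):  # every A subset of B
                    assert z(A | {s}) - z(A) >= z(B | {s}) - z(B)
print("diminishing returns holds for the toy coverage function")
```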
Example: Feature selection
• Given random variables Y, X1, …, Xn
• Want to predict Y from a subset X_A = (X_{i1}, …, X_{ik})
Want the k most informative features:
A* = argmax IG(X_A; Y) s.t. |A| ≤ k, where IG(X_A; Y) = H(Y) − H(Y | X_A)
(uncertainty before knowing X_A minus uncertainty after knowing X_A)
(Figure: Naïve Bayes model with class Y "Sick" and features X1 "Fever", X2 "Rash", X3 "Male".)
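The objective can be estimated from data. Below is a sketch that computes IG(X_A; Y) = H(Y) − H(Y | X_A) from empirical frequencies; data is assumed to be a list of (y, x) samples with x a tuple of feature values, and all names are illustrative:

```python
import math
from collections import Counter

def entropy(counts):
    """Entropy in bits of the distribution given by a Counter of outcomes."""
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values() if c)

def info_gain(data, A):
    """IG(X_A; Y) = H(Y) - H(Y | X_A), estimated from samples (y, x)."""
    h_y = entropy(Counter(y for y, x in data))
    groups = {}  # counts of y for each observed value of x_A
    for y, x in data:
        key = tuple(x[i] for i in sorted(A))
        groups.setdefault(key, Counter())[y] += 1
    n = len(data)
    h_y_given_xa = sum(sum(g.values()) / n * entropy(g) for g in groups.values())
    return h_y - h_y_given_xa
```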
Example: Submodularity of info-gain
Y1, …, Ym, X1, …, Xn discrete random variables; z(A) = IG(Y; X_A) = H(Y) − H(Y | X_A)
• z(A) is always monotonic
• However, it is NOT always submodular
Theorem [Krause & Guestrin UAI ’05]: If the Xi are all conditionally independent given Y, then z(A) is submodular!
Hence, the greedy algorithm works! In fact, no polynomial-time algorithm can do better than the (1 − 1/e) approximation!
(Figure: graphical model with Y1, Y2, Y3 and X1, …, X4.)
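Putting the pieces together: under the theorem's conditional-independence assumption, running greedy on the info-gain objective is near-optimal. A usage sketch with made-up samples, reusing greedy_max and info_gain from the sketches above:

```python
data = [  # (y, (x1, x2, x3)) samples, e.g. (sick, (fever, rash, male))
    (1, (1, 1, 0)), (1, (1, 0, 1)), (1, (0, 1, 0)), (1, (1, 1, 1)),
    (0, (0, 0, 0)), (0, (0, 0, 1)), (0, (1, 0, 0)), (0, (0, 1, 1)),
]

best = greedy_max(lambda A: info_gain(data, A), V={0, 1, 2}, k=2)
print(best)  # indices of the two greedily selected features
```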
Building a Sensing Chair [Mutlu, Krause, Forlizzi, Guestrin, Hodgins UIST ’07]
• People sit a lot
• Activity recognition in assistive technologies
• Seating pressure as a user interface
The prototype chair is equipped with 1 sensor per cm² and costs $16,000, achieving 82% accuracy on 10 postures! [Tan et al.]
Can we get similar accuracy with fewer, cheaper sensors?
(Figure: recognized postures such as lean forward, lean left, slouch.)
How to place sensors on a chair?
• Model the sensor readings at possible locations V as random variables
• Predict posture Y using a probabilistic model P(Y, V)
• Pick sensor locations A* ⊆ V to minimize the entropy of the posture given the readings: A* = argmin H(Y | X_A) s.t. |A| ≤ k
Placed the sensors and ran a user study: similar accuracy at <1% of the cost!
Bounds on the optimal solution [Krause et al., J Wat Res Mgt ’08]
Submodularity gives data-dependent bounds on the performance of any algorithm.
(Figure: water networks data; x-axis: number of sensors placed, 0–20; y-axis: sensing quality z(A), higher is better; curves: offline (Nemhauser) bound, data-dependent bound, and greedy solution.)
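The data-dependent bound in the plot follows from submodularity: for monotone submodular z and any candidate set A, OPT_k ≤ z(A) + (sum of the k largest marginal gains of elements outside A). A sketch, compared with the offline Nemhauser-style bound z(A_greedy)/(1 − 1/e); the names reuse the toy coverage example above:

```python
import math

def data_dependent_bound(z, V, A, k):
    """Upper bound on the best value achievable with k elements:
    z(A) plus the k largest marginal gains of elements outside A."""
    gains = sorted((z(A | {e}) - z(A) for e in V - A), reverse=True)
    return z(A) + sum(gains[:k])

A = greedy_max(z, set(subsets), k=2)
print(z(A), "<= OPT <=", data_dependent_bound(z, set(subsets), A, k=2))
print("offline (Nemhauser) bound on OPT:", z(A) / (1 - 1 / math.e))
```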
Summary (1)
• Minimization of submodular functions
• Submodularity and convexity
• Submodular polyhedron
• Symmetric submodular functions
Summary (2)
• Pseudo-Boolean functions
• Representation (polynomial, posiform, tableau, graph cut)
• Reduction to quadratic polynomial
• Necessary and sufficient conditions for submodularity
• Minimization of quadratic and cubic submodular functions via graph cuts (a minimal sketch follows below)
• Lower bound via roof duality
• LP via posiform representation
• LP via linear relaxation
• Max flow via symmetric graph construction
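To make the graph-cut bullet concrete, here is a minimal sketch for the simplest quadratic case, a pairwise Potts energy (submodular because the couplings are nonnegative): unary costs become terminal edges, couplings become symmetric edges, and a minimum s-t cut gives the minimizer. It uses networkx; the general submodular quadratic construction adds reparametrized edge weights, and all names here are illustrative:

```python
import networkx as nx

def min_energy_via_cut(theta, w):
    """Minimize sum_i theta[i][x_i] + sum_{i,j} w[i,j] * [x_i != x_j]
    over x in {0,1}^n by a minimum s-t cut."""
    G = nx.DiGraph()
    for i, (cost0, cost1) in theta.items():
        G.add_edge("s", i, capacity=cost1)  # cut exactly when x_i = 1
        G.add_edge(i, "t", capacity=cost0)  # cut exactly when x_i = 0
    for (i, j), wij in w.items():           # wij >= 0 ensures submodularity
        G.add_edge(i, j, capacity=wij)
        G.add_edge(j, i, capacity=wij)
    cut_value, (_, sink_side) = nx.minimum_cut(G, "s", "t")
    return cut_value, {i: int(i in sink_side) for i in theta}

theta = {1: (0.0, 2.0), 2: (1.0, 1.0), 3: (2.0, 0.0)}  # (cost of 0, cost of 1)
w = {(1, 2): 0.5, (2, 3): 0.5}
print(min_energy_via_cut(theta, w))  # minimum energy and an optimal labeling
```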
Further reading
• Combinatorial algorithms for submodular (and bisubmodular) function minimization
• More algorithms/bounds for maximizing submodular functions
• Linear and semidefinite relaxations
• Matroids, greedoids, intersections of matroids, polymatroids, and more
• Generalized roof duality