This study explores compressed sensing for multi-label prediction, presenting efficient algorithms with robust guarantees and applications to large image and text datasets. The approach combines a learning reduction, label compression, and sparse label reconstruction for improved prediction accuracy. Experimental results demonstrate the effectiveness of the method across different datasets and compression functions.
Multi-Label Prediction via Compressed Sensing. By Daniel Hsu, Sham M. Kakade, John Langford, Tong Zhang (NIPS 2009). Presented by: Lingbo Li, ECE, Duke University, 01-22-2010. *Some notes are directly copied from the original paper.
Outline • Introduction • Preliminaries • Learning Reduction • Compression and Reconstruction • Empirical Results • Conclusion
Introduction • Large database of images; • Goal: predict who or what is in a given image. • Samples: images x with corresponding label vectors y ∈ {0,1}^d, where d is the total number of entities in the whole database. • One-against-all algorithm: learn a binary predictor for each label (class). Computation is expensive when d is large (e.g., the image data below has 22k unique labels). • Assume the output vector y is sparse.
Introduction
[Figure: a d-dimensional label vector that is mostly 0, with only a few 1 entries.]
Compressed sensing: any sparse vector y can, with high probability, be compressed to a number of measurements logarithmic in its dimension while still allowing perfect reconstruction of y. Main idea: “Learn to predict compressed label vectors, and then use a sparse reconstruction algorithm to recover uncompressed labels from these predictions.”
Preliminaries • X: input space; • Y ⊆ {0,1}^d: output (label) space, where d is the number of labels. • Training data: {(x_i, y_i) : i = 1, …, n} ⊆ X × Y. • Goal: learn a predictor F with low mean-squared error E_x ||F(x) − E[y|x]||². Assume • d is very large; • the expected value E[y|x] is sparse, with only a few non-zero entries.
Learning reduction • Linear compression function A: R^d → R^m, where m ≪ d. • Goal: learn a predictor H of the compressed labels. Original problem: from samples (x, y), predict the label y with the predictor F, to minimize E_x ||F(x) − E[y|x]||². Reduced problem: from compressed samples (x, Ay), predict the compressed label Ay with the predictor H, to minimize E_x ||H(x) − A·E[y|x]||².
Reduction: training and prediction
Reconstruction algorithm R: if H(x) is close to A·E[y|x], then R(H(x)) should be close to E[y|x].
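A minimal end-to-end sketch of this reduction, assuming scikit-learn and NumPy; the toy data, the sizes, and the choice of ridge regression for H are illustrative choices, not the authors' code:

```python
# Reduction sketch: compress d-dimensional labels with a random linear map A,
# fit one regressor per compressed coordinate, reconstruct with OMP at test time.
import numpy as np
from sklearn.linear_model import Ridge, OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
n, p, d, k = 500, 20, 100, 3         # samples, features, labels, sparsity (toy sizes)
m = 30                               # number of compressed measurements, m << d

X = rng.normal(size=(n, p))
W = np.zeros((d, p))
W[:k] = rng.normal(size=(k, p))
Y = (X @ W.T > 1.0).astype(float)    # synthetic sparse label matrix

A = rng.normal(size=(m, d)) / np.sqrt(m)   # compression function A: R^d -> R^m

H = Ridge(alpha=1.0).fit(X, Y @ A.T)       # learn H to predict the compressed labels Ay

def predict_labels(x, k=k):
    """R(H(x)): recover a k-sparse label vector from the predicted compressed label."""
    h = H.predict(x.reshape(1, -1)).ravel()
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k).fit(A, h)
    return omp.coef_
```

Here `predict_labels` plays the role of R ∘ H: the regressor predicts the compressed label, and OMP reconstructs an (approximately) k-sparse label vector from it.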
Compression Functions Examples of valid compression functions: random matrices with i.i.d. Gaussian entries, random sign (Bernoulli ±1) matrices, and randomly selected rows of Fourier or Hadamard matrices; see the sketch below.
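A brief sketch of two of these standard choices, assuming NumPy; the 1/√m scaling is the usual normalization:

```python
# Two standard compression matrices; both satisfy a restricted-isometry-style
# property with high probability when m is on the order of k log d.
import numpy as np

def gaussian_compression(m, d, rng=None):
    """A with i.i.d. N(0, 1/m) entries."""
    if rng is None:
        rng = np.random.default_rng(0)
    return rng.normal(scale=1.0 / np.sqrt(m), size=(m, d))

def rademacher_compression(m, d, rng=None):
    """A with i.i.d. +/- 1/sqrt(m) entries."""
    if rng is None:
        rng = np.random.default_rng(0)
    return rng.choice([-1.0, 1.0], size=(m, d)) / np.sqrt(m)
```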
Reconstruction Algorithms Examples of valid reconstruction algorithms: iterative and greedy algorithms • Orthogonal Matching Pursuit (OMP) • Forward-Backward Greedy (FoBa) • Compressive Sampling Matching Pursuit (CoSaMP)
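A compact from-scratch version of OMP, the first greedy algorithm listed above; this is a sketch for intuition, not an optimized implementation:

```python
# Orthogonal Matching Pursuit: greedily recover a k-sparse y with A @ y ~= h.
import numpy as np

def omp(A, h, k):
    m, d = A.shape
    support, residual = [], h.copy()
    for _ in range(k):
        # pick the column most correlated with the current residual
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        # least-squares fit on the current support, then update the residual
        coef, *_ = np.linalg.lstsq(A[:, support], h, rcond=None)
        residual = h - A[:, support] @ coef
    y = np.zeros(d)
    y[support] = coef
    return y
```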
General Robustness Guarantees
The sparsity error of y at level k is defined as ||y − y_(1:k)||², where y_(1:k) is the best k-sparse approximation of y (the vector keeping only the k largest-magnitude entries of y). What if the reduction creates a problem harder to solve than the original one? The guarantees show it does not: the error of the reconstructed predictions is bounded in terms of the regression error on the compressed problem plus a term depending on the sparsity error.
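A small helper making the definition concrete, assuming NumPy (the function name is mine):

```python
# The best k-sparse approximation keeps the k largest-magnitude entries of y;
# the sparsity error is the squared norm of everything it discards.
import numpy as np

def sparsity_error(y, k):
    y = np.asarray(y, dtype=float)
    y_k = np.zeros_like(y)
    top = np.argsort(np.abs(y))[-k:]      # indices of the k largest |y_i|
    y_k[top] = y[top]                     # best k-sparse approximation of y
    return float(np.sum((y - y_k) ** 2))
```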
Linear Prediction • If there is a perfect linear predictor of y, i.e., E[y|x] = Wx for some matrix W, then there is a perfect linear predictor of Ay, namely AW: E[Ay|x] = A·E[y|x] = (AW)x.
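A quick numerical sanity check of this claim (NumPy, toy sizes): if y = Wx exactly, then AW predicts Ay exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
p, d, m = 5, 50, 10
W = rng.normal(size=(d, p))          # perfect linear predictor of y
A = rng.normal(size=(m, d))          # compression function
x = rng.normal(size=p)

y = W @ x                            # noiseless label
assert np.allclose(A @ y, (A @ W) @ x)   # AW predicts Ay exactly
```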
Experimental Results • Experiment 1: Image data (collected by the ESP Game). 65k images, 22k unique labels; keep the 1k most frequent labels (the least frequent occurs 39 times, the most frequent about 12k times); 4 labels per image on average; half of the data for training and half for testing. • Experiment 2: Text data (collected from http://delicious.com/). 16k labeled web pages, 983 unique labels; the least frequent label occurs 21 times, the most frequent about 6500 times; 19 labels per page on average; half of the data for training and half for testing. • Compression function A: select m random rows of the Hadamard matrix (a sketch follows below). • Test the greedy and iterative reconstruction algorithms: OMP, FoBa, CoSaMP, and Lasso. • Use correlation decoding (CD) as a baseline method for comparison.
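A sketch of this compression function, assuming SciPy; padding the label dimension up to a power of 2 is my assumption, since SciPy's Hadamard construction requires it:

```python
# Select m random rows of a Hadamard matrix as the compression function A.
import numpy as np
from scipy.linalg import hadamard

def hadamard_rows(m, d, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    d_pad = 1 << (d - 1).bit_length()            # next power of 2 >= d
    H = hadamard(d_pad).astype(float) / np.sqrt(d_pad)
    rows = rng.choice(d_pad, size=m, replace=False)
    return H[rows, :d]                           # m x d compression matrix
```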
Experimental Results
Measure the precision of the predicted labels.
[Plots: precision results; top two panels: image data; bottom panel: text data.]
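The plots report precision of the top-ranked predicted labels; below is one plausible form of that metric (the exact variant used in the evaluation is an assumption here):

```python
# Precision at k: the fraction of the k highest-scoring predicted labels
# that appear in the true label set.
import numpy as np

def precision_at_k(y_score, y_true, k):
    top = np.argsort(y_score)[-k:]       # indices of the k highest scores
    return float(np.sum(y_true[top] > 0)) / k
```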
Conclusion • Application of compressed sensing to the multi-label prediction problem with output sparsity; • Efficient reduction, with the number of predictions logarithmic in the number of original labels; • Robustness guarantees carrying over from the compressed problem to the original one, and vice versa in the linear prediction setting.