
Non Negative Matrix Factorization







  1. Non Negative Matrix Factorization Hamdi Jenzri

  2. Outline • Introduction • Non-Negative Matrix Factorization (NMF) • Cost functions • Algorithms • Multiplicative update algorithm • Gradient descent algorithm • Alternating least squares algorithm • NMF vs. SVD • Initialization Issue • Experiments • Image Dataset • Landmine Dataset • Conclusion & Potential Future Work

  3. Introduction • In many data-processing tasks, negative numbers are physically meaningless • Pixel values in an image • Vector representation of words in a text document… • Classical tools cannot guarantee that non-negativity is preserved • Principal Component Analysis • Singular Value Decomposition • Vector Quantization… • Non-negative Matrix Factorization

  4. Outline • Introduction • Non-Negative Matrix Factorization (NMF) • Cost functions • Algorithms • Multiplicative update algorithm • Gradient descent algorithm • Alternating least squares algorithm • NMF vs. SVD • Initialization Issue • Experiments • Image Dataset • Landmine Dataset • Conclusion & Potential Future Work

  5. Non-Negative Matrix Factorization • Given a non-negative matrix V, find non-negative matrix factors W and H such that: V ≈ W H • V is an n×m matrix whose columns are n-dimensional data vectors, where m is the number of vectors in the data set • W is an n×r non-negative matrix • H is an r×m non-negative matrix • Usually, r is chosen to be smaller than n or m, so that W and H are smaller than the original matrix V

  6. Non-Negative Matrix Factorization • Significance of this approximation: • It can be rewritten column by column as v ≈ W h, where v and h are the corresponding columns of V and H • Each data vector v is approximated by a linear combination of the columns of W, weighted by the components of h • Therefore, W can be regarded as containing a basis that is optimized for the linear approximation of the data in V • Since relatively few basis vectors are used to represent many data vectors, good approximation can only be achieved if the basis vectors discover structure that is latent in the data
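The column-by-column reading of V ≈ W H can be sketched in NumPy (an illustrative toy example with assumed sizes n = 4, m = 6, r = 2, not data from the slides):

```python
import numpy as np

# Hypothetical sizes: m = 6 data vectors of dimension n = 4, inner rank r = 2.
rng = np.random.default_rng(0)
n, m, r = 4, 6, 2
W = rng.random((n, r))   # basis matrix (n x r), non-negative
H = rng.random((r, m))   # coefficient matrix (r x m), non-negative
V = W @ H                # an exactly factorizable V, for illustration only

# Column j of V is a linear combination of the columns of W,
# weighted by the entries of column j of H: v_j = W @ h_j.
j = 3
v_j = W @ H[:, j]
assert np.allclose(v_j, V[:, j])
```

This is just the matrix-product identity restated per column; for real data V is given and W, H are found by the algorithms below.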

  7. Outline • Introduction • Non-Negative Matrix Factorization (NMF) • Cost functions • Algorithms • Multiplicative update algorithm • Gradient descent algorithm • Alternating least squares algorithm • NMF vs. SVD • Initialization Issue • Experiments • Image Dataset • Landmine Dataset • Conclusion & Potential Future Work

  8. Cost functions • To find an approximate factorization V ≈ W H, we first need to define cost functions that quantify the quality of the approximation • Such cost functions can be constructed using some measure of distance between two non-negative matrices A and B • Square of the Euclidean distance between A and B: ||A − B||² = ∑ij (Aij − Bij)² • Divergence of A from B: D(A||B) = ∑ij (Aij log(Aij/Bij) − Aij + Bij). It reduces to the Kullback-Leibler divergence, or relative entropy, when ∑ij Aij = ∑ij Bij = 1
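The two cost functions can be computed directly as written; a minimal NumPy sketch (the eps guard against log(0) is an added safeguard, not part of the slide's formula):

```python
import numpy as np

def euclidean_cost(A, B):
    """Squared Euclidean (Frobenius) distance: ||A - B||^2 = sum_ij (A_ij - B_ij)^2."""
    return float(np.sum((A - B) ** 2))

def divergence(A, B, eps=1e-12):
    """D(A||B) = sum_ij (A_ij log(A_ij/B_ij) - A_ij + B_ij).
    eps avoids log(0) for zero entries; assumed safeguard, not in the slides."""
    A = A + eps
    B = B + eps
    return float(np.sum(A * np.log(A / B) - A + B))
```

Both measures are zero exactly when A = B, and both are non-negative, which is what makes them usable as approximation costs.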

  9. Cost functions • The formulation of the NMF problem as an optimization problem can be stated as: • Minimize f (W, H) = ||V – WH||² with respect to W and H, subject to the constraints W, H ≥ 0 • Minimize f (W, H) = D (V || WH) with respect to W and H, subject to the constraints W, H ≥ 0 • These functions are convex in W only or in H only; they are not convex in both variables together

  10. Outline • Introduction • Non-Negative Matrix Factorization (NMF) • Cost functions • Algorithms • Multiplicative update algorithm • Gradient descent algorithm • Alternating least squares algorithm • NMF vs. SVD • Initialization Issue • Experiments • Image Dataset • Landmine Dataset • Conclusion & Potential Future Work

  11. Multiplicative update algorithm • Lee and Seung, 2001 • Update rules for the Euclidean cost: H ← H .* (WᵀV) ./ (WᵀW H), W ← W .* (V Hᵀ) ./ (W H Hᵀ), where .* and ./ denote element-wise operations • Convergence to a stationary point that may or may not be a local minimum
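The Lee-Seung multiplicative updates for the Euclidean cost can be sketched in NumPy (the deck's snippets are MATLAB-style; this is an illustrative Python translation, with a small eps in the denominators as an assumed safeguard against division by zero):

```python
import numpy as np

def nmf_multiplicative(V, r, n_iter=200, eps=1e-9, seed=0):
    """Lee-Seung multiplicative updates for min ||V - W H||^2, W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r))   # random non-negative initialization
    H = rng.random((r, m))
    for _ in range(n_iter):
        # Element-wise multiplicative updates: factors stay non-negative
        # because every term in the ratio is non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

Because the updates only rescale entries by non-negative ratios, no projection step is needed, but an entry that reaches exactly zero stays at zero.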

  12. Gradient descent algorithm • W and H are updated along the negative gradient of the cost, each with its own step size parameter • A projection step is commonly used after each update rule to set negative elements to zero • Chu et al., 2004; Lee and Seung, 2001
W = rand(n, r); % initialize W
H = rand(r, m); % initialize H
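A minimal projected-gradient sketch for the Euclidean cost, again as an illustrative NumPy translation of the slide's MATLAB-style initialization (the fixed step sizes are assumed values, not from the slides, and would need tuning in practice):

```python
import numpy as np

def nmf_projected_gradient(V, r, n_iter=500, step_w=1e-3, step_h=1e-3, seed=0):
    """Gradient descent on ||V - W H||^2 with a projection step that
    clips negative entries to zero after each update."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r))   # W = rand(n, r)
    H = rng.random((r, m))   # H = rand(r, m)
    for _ in range(n_iter):
        R = W @ H - V                                  # residual
        W = np.maximum(W - step_w * (R @ H.T), 0.0)    # grad step + projection
        R = W @ H - V
        H = np.maximum(H - step_h * (W.T @ R), 0.0)
    return W, H
```

The projection keeps the iterates feasible (non-negative), but a poorly chosen fixed step size can stall or diverge, which is one reason the multiplicative and ALS variants are popular.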

  13. Alternating least squares algorithm • It aids sparsity • More flexible: able to escape a poor path • Paatero and Tapper, 1994
W = rand(n, r); % initialize W
% iterate: solve W * H ≈ V for H in the least squares sense, set negative entries of H to 0;
% then solve H' * W' ≈ V' for W, set negative entries of W to 0
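The basic ALS loop (solve each unconstrained least-squares subproblem, then clip negatives) can be sketched in NumPy; this is an illustrative version of the standard scheme, not code from the deck:

```python
import numpy as np

def nmf_als(V, r, n_iter=20, seed=0):
    """Alternating least squares with projection: fix W, solve for H;
    fix H, solve for W; clip negative entries to zero after each solve."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r))                                # W = rand(n, r)
    H = np.zeros((r, m))
    for _ in range(n_iter):
        H, *_ = np.linalg.lstsq(W, V, rcond=None)         # solve W H ≈ V
        H = np.maximum(H, 0.0)                            # project: H >= 0
        Wt, *_ = np.linalg.lstsq(H.T, V.T, rcond=None)    # solve H' W' ≈ V'
        W = np.maximum(Wt.T, 0.0)                         # project: W >= 0
    return W, H
```

Unlike the multiplicative rules, a zeroed entry can become positive again in a later solve, which is the "able to escape a poor path" flexibility the slide refers to; the clipping also tends to produce exact zeros, which aids sparsity.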

  14. Convergence • There is no guarantee of convergence to a local minimum • No uniqueness: if (W, H) is a minimum, then (WD, D⁻¹H) is too, for any invertible matrix D such that WD and D⁻¹H remain non-negative (e.g., a positive diagonal matrix) • Still, NMF is quite appealing for data mining applications since, in practice, even local minima can provide desirable properties such as data compression and feature extraction
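The non-uniqueness is easy to verify numerically: for a positive diagonal D, the rescaled pair (W D, D⁻¹ H) is still non-negative and gives the exact same product, hence the same cost (the matrices below are hypothetical examples, not slide data):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.random((4, 2))
H = rng.random((2, 5))

D = np.diag([2.0, 0.5])        # positive diagonal: invertible, non-negative inverse
W2 = W @ D                     # rescaled factors, still non-negative
H2 = np.linalg.inv(D) @ H

# Same product, therefore the same value of ||V - W H||^2 for any V.
assert np.allclose(W @ H, W2 @ H2)
```

This is why NMF factors are usually only compared up to scaling (and permutation) of the r components.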

  15. Outline • Introduction • Non-Negative Matrix Factorization (NMF) • Cost functions • Algorithms • Multiplicative update algorithm • Gradient descent algorithm • Alternating least squares algorithm • NMF vs. SVD • Initialization Issue • Experiments • Image Dataset • Landmine Dataset • Conclusion & Potential Future Work

  16. NMF vs. SVD

  17. Outline • Introduction • Non-Negative Matrix Factorization (NMF) • Cost functions • Algorithms • Multiplicative update algorithm • Gradient descent algorithm • Alternating least squares algorithm • NMF vs. SVD • Initialization Issue • Experiments • Image Dataset • Landmine Dataset • Conclusion & Potential Future Work

  18. Initialization Issue • NMF algorithms are iterative • Initialization of W and/or H • A good initialization can improve • Speed • Accuracy • Convergence • Some initializations: • Random initialization • Centroid initialization (clustering) • SVD-centroid initialization • Random Vcol • Random C initialization (densest columns)
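As one concrete example from the list, the Random Vcol scheme of Langville et al. builds each column of W as the average of a few randomly chosen columns of V; a hedged NumPy sketch (the choice p = 5 is an assumed default, not specified in the slides):

```python
import numpy as np

def random_vcol_init(V, r, p=5, seed=0):
    """Random Vcol initialization: each column of W is the average of
    p randomly chosen columns of V (so W inherits V's non-negativity
    and roughly matches the scale of the data)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = np.empty((n, r))
    for k in range(r):
        cols = rng.choice(m, size=min(p, m), replace=False)
        W[:, k] = V[:, cols].mean(axis=1)
    return W
```

Compared with pure random initialization, the starting W already lies near the data, which tends to reduce the number of iterations the NMF algorithms need.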

  19. Outline • Introduction • Non-Negative Matrix Factorization (NMF) • Cost functions • Algorithms • Multiplicative update algorithm • Gradient descent algorithm • Alternating least squares algorithm • NMF vs. SVD • Initialization Issue • Experiments • Image Dataset • Landmine Dataset • Conclusion & Potential Future Work

  20. Image Dataset: ||V – WH||F = 156.7879 (factor H shown as a figure)

  21. Different initialization: ||V – WH||F = 25.6828 (factor H shown as a figure)

  22. ||V – WH||F = 101.8359 (factor H shown as a figure)

  23. Landmine Dataset • Data set used: BAE-LMED

  24. Results: varying r for Multiplicative update algorithm, random initialization

  25. Results: Varying the initialization for the Multiplicative update algorithm, r = 9

  26. Results: Comparing algorithms for the best found r = 9, random initialization

  27. Results: Comparing best combination to Basic EHD performance

  28. Columns of H: Mines vs. FA (shown as figures)

  29. Different Datasets

  30. Different Datasets

  31. Outline • Introduction • Non-Negative Matrix Factorization (NMF) • Cost functions • Algorithms • Multiplicative update algorithm • Gradient descent algorithm • Alternating least squares algorithm • NMF vs. SVD • Initialization Issue • Experiments • Image Dataset • Landmine Dataset • Conclusion & Potential Future Work

  32. Conclusion & Potential Future Work • NMF presents a way to represent the data in a different basis • Despite its convergence and initialization issues, it is quite appealing in many data mining tasks • Other formulations do exist for the NMF problem • Constrained NMF • Incremental NMF • Bayesian NMF • Future work will include • Trying other landmine datasets • Bayesian NMF

  33. References • Michael W. Berry et al., "Algorithms and Applications for Approximate Nonnegative Matrix Factorization", June 2006 • Daniel D. Lee and H. Sebastian Seung, "Algorithms for Non-negative Matrix Factorization", Advances in Neural Information Processing Systems, 2001 • Chih-Jen Lin, "Projected Gradient Methods for Non-negative Matrix Factorization", Neural Computation, June 2007 • Amy N. Langville et al., "Initializations for Nonnegative Matrix Factorization", KDD 2006
