
Coupled Matrix Factorizations using Optimization

Daniel M. Dunlavy, Tamara G. Kolda, Evrim Acar. Sandia National Laboratories. SIAM Conference on Computational Science and Engineering, March 4, 2009.


Presentation Transcript


  1. Coupled Matrix Factorizations using Optimization
     Daniel M. Dunlavy, Tamara G. Kolda, Evrim Acar
     Sandia National Laboratories
     SIAM Conference on Computational Science and Engineering, March 4, 2009
     Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.
     SAND2009-2389C

  2. Motivating Problems
     • Data with multiple types of two-way relationships
       • Bibliometric analysis
         • author-document, term-document, author-venue, etc.
         • Can we predict potential co-authors?
       • Movie ratings
         • movie-actor, user-movie, actor-award
         • Can we predict useful movie ratings for other users?
     • Consistent dimensionality reduction
     • Improved interpretation through non-negativity constraints

  3. Some Related Work
     • Simultaneous factor analysis
       • Gramian matrices [Levin, 1966]
       • Test score covariance matrices over time [Millsap, et al., 1988]
     • Simultaneous diagonalization
       • Population differentiation in biology [Thorpe, 1988]
       • Blind source separation [Ziehe et al., 2004]
     • Generalized SVD
       • Damped or constrained least squares [Van Loan, 1976]
       • Microarray data analysis [Alter, et al., 2003]
       • Multimicrophone speech filtering [Doclo and Moonen, 2002]
     • Simultaneous Non-negative Matrix Factorization
       • Gene clustering in microarray data [Badea, 2007; 2008]
     • Tensor decompositions
       • Data mining, chemometrics, neuroscience [Kolda, Acar, Bro, Park, Zhang, Berry, Chen, Martin, CSE09]
     • Slide callouts (in extraction order): matrices of same size; matrices of same size; only 2 matrices; slow; at least one common dimension

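  4. Coupled Non-negative Matrix Factorization (CNMF)
     • Given: (formula not preserved in the transcript; the slide labels the data matrices "document-term" and "document-author")
     • Solve: (formula not preserved in the transcript)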
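The formulas on this slide did not survive extraction. As a hedged illustration only, one common way to write a coupled non-negative factorization of two matrices that share the document dimension is shown below. All symbols are assumptions, not taken from the slides: $X \in \mathbb{R}^{m \times n}$ (document-term), $Y \in \mathbb{R}^{m \times p}$ (document-author), a shared factor $W \in \mathbb{R}^{m \times r}$, and matrix-specific factors $H_x \in \mathbb{R}^{r \times n}$, $H_y \in \mathbb{R}^{r \times p}$.

```latex
\min_{W,\,H_x,\,H_y}\;
  \tfrac{1}{2}\,\lVert X - W H_x \rVert_F^2
  + \tfrac{1}{2}\,\lVert Y - W H_y \rVert_F^2
\quad \text{subject to} \quad
  W \ge 0,\; H_x \ge 0,\; H_y \ge 0 .
```

Under this assumed form, the coupling comes entirely from the shared factor $W$, and the objective equals $\tfrac{1}{2}\lVert [X \; Y] - W [H_x \; H_y] \rVert_F^2$.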

  5. Method: CNMF-ALS
     • CNMF-ALS: Alternating Least Squares [extends Berry, et al., 2006]
     • linear least squares + simple projection to constraint boundary
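The update formulas for CNMF-ALS are not in the transcript, so the following is only a minimal sketch of the idea the slide names: solve an unconstrained linear least squares problem for one factor, then project negative entries to zero. It assumes a formulation in which X and Y share a row factor W (X ≈ W Hx, Y ≈ W Hy); the function name `cnmf_als` and all parameter names are hypothetical.

```python
import numpy as np

def cnmf_als(X, Y, r, n_iter=100, seed=0):
    """Hypothetical CNMF-ALS sketch: X ~ W @ Hx and Y ~ W @ Hy, factors >= 0."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    # Assumed coupling: both matrices share the row factor W, so stack columns.
    Z = np.hstack([X, Y])
    W = rng.random((Z.shape[0], r))
    H = rng.random((r, Z.shape[1]))
    for _ in range(n_iter):
        # Least squares for H with W fixed, then project onto H >= 0.
        H = np.linalg.lstsq(W, Z, rcond=None)[0]
        H = np.maximum(H, 0.0)
        # Least squares for W with H fixed (transposed system), then project.
        W = np.linalg.lstsq(H.T, Z.T, rcond=None)[0].T
        W = np.maximum(W, 0.0)
    return W, H[:, :n], H[:, n:]

# Small synthetic example with an exact non-negative rank-3 structure.
rng = np.random.default_rng(1)
W_true = rng.random((20, 3))
X = W_true @ rng.random((3, 15))
Y = W_true @ rng.random((3, 10))
W, Hx, Hy = cnmf_als(X, Y, r=3, n_iter=50)
```

The projection step is what keeps each iteration cheap, but it gives no guarantee that the objective decreases from one iteration to the next.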

  6. Method: CNMF-MULT
     • CNMF-MULT: Multiplicative Updates [Badea, 2007; Badea, 2008; extends Lee and Seung, 2001]
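The multiplicative update formulas are also missing from the transcript. As a rough sketch, the Lee and Seung Frobenius-norm updates can be applied to the column-stacked matrix [X Y], which is equivalent to a coupled problem with a shared row factor W and equal weights on both terms. That formulation, the function name `cnmf_mult`, and the small constant `eps` guarding against division by zero are assumptions, not the authors' exact method.

```python
import numpy as np

def cnmf_mult(X, Y, r, n_iter=200, eps=1e-12, seed=0):
    """Hypothetical CNMF-MULT sketch using Lee-Seung style updates on [X Y]."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    Z = np.hstack([X, Y])                      # assumed coupling via shared W
    W = rng.random((Z.shape[0], r)) + eps      # strictly positive start
    H = rng.random((r, Z.shape[1])) + eps
    history = []
    for _ in range(n_iter):
        # Elementwise multiplicative updates keep the factors non-negative.
        H *= (W.T @ Z) / (W.T @ W @ H + eps)
        W *= (Z @ H.T) / (W @ (H @ H.T) + eps)
        history.append(0.5 * np.linalg.norm(Z - W @ H) ** 2)
    return W, H[:, :n], H[:, n:], history

# Small synthetic example with an exact non-negative rank-3 structure.
rng = np.random.default_rng(1)
W_true = rng.random((20, 3))
X = W_true @ rng.random((3, 15))
Y = W_true @ rng.random((3, 10))
W, Hx, Hy, history = cnmf_mult(X, Y, r=3)
```

No projection is needed here because a non-negative factor multiplied by a non-negative ratio stays non-negative.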

  7. Method: CNMF-OPT
     • CNMF-OPT: Projective Nonlinear CG, Moré-Thuente line search [extends Acar, Kolda, and Dunlavy, 2009 and Lin, 2007]
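The slide gives only the name of the optimizer, so the sketch below does not reproduce it. It shows the ingredient any gradient-based approach needs, an objective with gradients for all factors at once, and then uses a plainly simpler stand-in: projected gradient descent with a backtracking (Armijo-type) line search in place of nonlinear CG with a Moré-Thuente line search. The objective 0.5‖X − W Hx‖² + 0.5‖Y − W Hy‖² with a shared factor W, and the names `cnmf_objective` and `cnmf_projected_gradient`, are assumptions.

```python
import numpy as np

def cnmf_objective(X, Y, W, Hx, Hy):
    """Assumed objective 0.5||X - W Hx||_F^2 + 0.5||Y - W Hy||_F^2 and gradients."""
    Rx = W @ Hx - X
    Ry = W @ Hy - Y
    f = 0.5 * (np.sum(Rx ** 2) + np.sum(Ry ** 2))
    grads = (Rx @ Hx.T + Ry @ Hy.T, W.T @ Rx, W.T @ Ry)  # d/dW, d/dHx, d/dHy
    return f, grads

def cnmf_projected_gradient(X, Y, r, n_iter=200, seed=0):
    """Stand-in optimizer: projected gradient with backtracking, not projected CG."""
    rng = np.random.default_rng(seed)
    params = [rng.random((X.shape[0], r)),
              rng.random((r, X.shape[1])),
              rng.random((r, Y.shape[1]))]
    f, grads = cnmf_objective(X, Y, *params)
    history = [f]
    for _ in range(n_iter):
        step = 1.0
        for _backtrack in range(60):
            # Gradient step on all factors, projected onto the non-negative orthant.
            trial = [np.maximum(p - step * g, 0.0) for p, g in zip(params, grads)]
            f_trial, g_trial = cnmf_objective(X, Y, *trial)
            # Sufficient-decrease test along the projected direction.
            slope = sum(np.sum(g * (t - p)) for g, t, p in zip(grads, trial, params))
            if f_trial <= f + 1e-4 * slope:
                params, f, grads = trial, f_trial, g_trial
                break
            step *= 0.5
        history.append(f)
    return params, history

# Small synthetic example with an exact non-negative rank-3 structure.
rng = np.random.default_rng(1)
W_true = rng.random((20, 3))
X = W_true @ rng.random((3, 15))
Y = W_true @ rng.random((3, 10))
(W, Hx, Hy), history = cnmf_projected_gradient(X, Y, r=3)
```

Unlike the alternating schemes, this updates W, Hx, and Hy simultaneously, which is the property that distinguishes an all-at-once optimization approach.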

  8. Matlab Experiments
     • Noise: (formula not preserved in the transcript)

  9. Results: No noise, r = r*

  10. Results: No noise, r = r*

  11. Results: No noise, r = r*

  12. Results: No noise, r = r* + 1

  13. Results: No noise, r = r* + 1

  14. Results: Noisy data, r = r* + 1

  15. Future Work
     • Extending other promising methods to CNMF
       • Block principal pivoting based NMF [Park, et al. 2008]
       • Projected gradient NMF [Lin, 2007]
       • Projected Newton NMF [Kim, et al., 2008]
     • CNMF-OPT extensions
       • Sparse data, regularization [Acar, Kolda, and Dunlavy, 2009]
       • Sparsity constraints [Park, et al. 2008]
     • Numerical experiments
       • Scale to larger data sets
       • Comparisons on real data sets [Park, et al. 2008]
     • Alternate models / problem formulations
       • Coupling matrix and tensor decompositions (CNMF/CNTF)

  16. Conclusions
     • Coupled matrix factorizations
       • Method for computing factorizations consistent along common dimensions in data
     • Results
       • CNMF-OPT
         • Fast and accurate
         • Overfactors well and handles noise well
       • CNMF-ALS
         • Fast, but not accurate
         • Overfactoring is a big challenge
       • CNMF-MULT
         • Accurate, but may be too slow (similar to NMF results)
     • Future Work
       • Identified several promising paths forward

  17. Thank You
     Coupled Matrix Factorizations using Optimization
     Danny Dunlavy
     dmdunla@sandia.gov
     http://www.cs.sandia.gov/~dmdunla
