Coupled Matrix Factorizations using Optimization
Daniel M. Dunlavy, Tamara G. Kolda, Evrim Acar
Sandia National Laboratories
SIAM Conference on Computational Science and Engineering, March 4, 2009
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000. SAND2009-2389C
Motivating Problems
• Data with multiple types of two-way relationships
  • Bibliometric analysis
    • author-document, term-document, author-venue, etc.
    • Can we predict potential co-authors?
  • Movie ratings
    • movie-actor, user-movie, actor-award
    • Can we predict useful movie ratings for other users?
• Consistent dimensionality reduction
• Improved interpretation through non-negativity constraints
Some Related Work
• Simultaneous factor analysis
  • Gramian matrices [Levin, 1966]
  • Test score covariance matrices over time [Millsap, et al., 1988]
• Simultaneous diagonalization
  • Population differentiation in biology [Thorpe, 1988]
  • Blind source separation [Ziehe et al., 2004]
• Generalized SVD
  • Damped or constrained least squares [Van Loan, 1976]
  • Microarray data analysis [Alter, et al., 2003]
  • Multimicrophone speech filtering [Doclo and Moonen, 2002]
• Simultaneous Non-negative Matrix Factorization
  • Gene clustering in microarray data [Badea, 2007; 2008]
• Tensor decompositions
  • Data mining, chemometrics, neuroscience [Kolda, Acar, Bro, Park, Zhang, Berry, Chen, Martin, CSE09]
Slide callouts, in extraction order: matrices of same size; matrices of same size; only 2 matrices; slow; at least one common dimension
Coupled Non-negative Matrix Factorization (CNMF)
• Given [formula not recovered from the slide]
• Solve [formula not recovered from the slide]
Slide labels: document-term, document-author
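The formulas on this slide did not survive extraction. As a rough guide only, one common way to write a coupled non-negative factorization of two matrices that share a dimension is given below. All symbols here are assumptions and not the authors' notation: X (documents by terms) and Y (documents by authors) share the document dimension, W is the shared factor, H_x and H_y are the matrix-specific factors, and k is the chosen number of components.

```latex
% Assumed notation: X in R^{m x n} (document-term), Y in R^{m x p} (document-author),
% shared factor W in R^{m x k}, matrix-specific factors H_x in R^{k x n}, H_y in R^{k x p}.
\min_{W,\,H_x,\,H_y}\;
  \tfrac{1}{2}\,\lVert X - W H_x \rVert_F^2
  + \tfrac{1}{2}\,\lVert Y - W H_y \rVert_F^2
\quad \text{subject to} \quad W \ge 0,\; H_x \ge 0,\; H_y \ge 0 .
```

Under this assumed form, the coupling comes entirely from W appearing in both terms; equal weighting of the two terms is also an assumption.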
Method: CNMF-ALS
• CNMF-ALS: Alternating Least Squares [Extends Berry, et al., 2006]
• Linear least squares + simple projection to constraint boundary
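The two ingredients named on this slide (a linear least-squares solve followed by a simple projection onto the non-negativity constraint) can be sketched as follows. This is a minimal illustration, not the authors' Matlab code: it assumes the coupled model X ≈ W·Hx and Y ≈ W·Hy with a shared factor W and equal weighting of the two matrices, and the function name `cnmf_als` and its parameters are hypothetical.

```python
import numpy as np

def cnmf_als(X, Y, k, iters=100, seed=0):
    """Sketch of alternating least squares for X ~ W @ Hx, Y ~ W @ Hy (assumed model)."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    # Stacking side by side couples X and Y through their shared row dimension.
    Z = np.hstack([X, Y])
    W = rng.random((Z.shape[0], k))
    H = np.zeros((k, Z.shape[1]))
    for _ in range(iters):
        # Fix W, solve the unconstrained least-squares problem W @ H = Z for H,
        # then project onto the constraint by zeroing negative entries.
        H = np.linalg.lstsq(W, Z, rcond=None)[0]
        H = np.maximum(H, 0.0)
        # Fix H, solve H.T @ W.T = Z.T for W, then project the same way.
        W = np.linalg.lstsq(H.T, Z.T, rcond=None)[0].T
        W = np.maximum(W, 0.0)
    return W, H[:, :n], H[:, n:]
```

Because the projection is applied after each unconstrained solve, the objective is not guaranteed to decrease at every step, which is one reason such a scheme can be fast per iteration yet less accurate.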
Method: CNMF-MULT
• CNMF-MULT: Multiplicative Updates [Badea, 2007; Badea, 2008; extends Lee and Seung, 2001]
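The update formulas for this method are not in the recovered text. As a hedged sketch, the Lee and Seung style multiplicative updates for the Frobenius-norm objective can be applied to a coupled model X ≈ W·Hx, Y ≈ W·Hy by stacking the two matrices side by side. The equal weighting of X and Y, the small constant `eps`, and the name `cnmf_mult` are assumptions for illustration and may differ from the variant presented in the talk.

```python
import numpy as np

def cnmf_mult(X, Y, k, iters=200, eps=1e-9, seed=0):
    """Sketch of multiplicative updates for X ~ W @ Hx, Y ~ W @ Hy (assumed model).

    X and Y must be entrywise non-negative.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    # ||X - W Hx||^2 + ||Y - W Hy||^2 equals ||[X Y] - W [Hx Hy]||^2,
    # so the coupled problem reduces to one NMF of the stacked matrix.
    Z = np.hstack([X, Y])
    W = rng.random((Z.shape[0], k)) + eps
    H = rng.random((k, Z.shape[1])) + eps
    history = [np.linalg.norm(Z - W @ H) ** 2]
    for _ in range(iters):
        # Elementwise ratios keep every entry non-negative without a projection.
        H *= (W.T @ Z) / (W.T @ W @ H + eps)
        W *= (Z @ H.T) / (W @ (H @ H.T) + eps)
        history.append(np.linalg.norm(Z - W @ H) ** 2)
    return W, H[:, :n], H[:, n:], history
```

The returned `history` holds the squared residual after each sweep, which makes it easy to see how many iterations the updates need to converge.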
Method: CNMF-OPT
• CNMF-OPT: Projective Nonlinear CG, Moré-Thuente line search [Extends Acar, Kolda, and Dunlavy, 2009 and Lin, 2007]
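The slide names a projected nonlinear conjugate gradient method with a Moré-Thuente line search, which is too long to reproduce here. The sketch below swaps in a simpler relative, plain projected gradient descent with a backtracking (Armijo-type) line search, only to illustrate the general idea of updating all factors at once using the gradient and then projecting onto the non-negative orthant. It is not the CNMF-OPT algorithm. The model X ≈ W·Hx, Y ≈ W·Hy, the equal weighting, and the names `objective`, `cnmf_projected_gradient`, and `sigma` are assumptions.

```python
import numpy as np

def objective(W, H, Z):
    # 0.5 * ||W H - Z||_F^2 for the stacked matrix Z = [X Y].
    return 0.5 * np.linalg.norm(W @ H - Z) ** 2

def cnmf_projected_gradient(X, Y, k, iters=200, sigma=1e-4, seed=0):
    """Simplified stand-in (NOT projected nonlinear CG): projected gradient descent."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    Z = np.hstack([X, Y])
    W = rng.random((Z.shape[0], k))
    H = rng.random((k, Z.shape[1]))
    f = objective(W, H, Z)
    history = [f]
    for _ in range(iters):
        R = W @ H - Z
        grad_W, grad_H = R @ H.T, W.T @ R  # gradients with respect to W and H
        step = 1.0
        for _trial in range(30):  # backtracking: halve the step until accepted
            W_new = np.maximum(W - step * grad_W, 0.0)
            H_new = np.maximum(H - step * grad_H, 0.0)
            f_new = objective(W_new, H_new, Z)
            # Sufficient-decrease test along the projection arc.
            slope = np.sum(grad_W * (W_new - W)) + np.sum(grad_H * (H_new - H))
            if f_new - f <= sigma * slope:
                W, H, f = W_new, H_new, f_new
                break
            step *= 0.5
        history.append(f)
    return W, H[:, :n], H[:, n:], history
```

Unlike the alternating schemes, both factors move in the same step here, and the line search ensures the recorded objective never increases.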
Matlab Experiments
• Noise: [definition not recovered from the slide]
Future Work
• Extending other promising methods to CNMF
  • Block principal pivoting based NMF [Park, et al. 2008]
  • Projected gradient NMF [Lin, 2007]
  • Projected Newton NMF [Kim, et al., 2008]
• CNMF-OPT extensions
  • Sparse data, regularization [Acar, Kolda, and Dunlavy, 2009]
  • Sparsity constraints [Park, et al. 2008]
• Numerical experiments
  • Scale to larger data sets
  • Comparisons on real data sets [Park, et al. 2008]
• Alternate models / problem formulations
  • Coupling matrix and tensor decompositions (CNMF/CNTF)
Conclusions
• Coupled matrix factorizations
  • Method for computing factorizations consistent along common dimensions in data
• Results
  • CNMF-OPT
    • Fast and accurate
    • Overfactors well and handles noise well
  • CNMF-ALS
    • Fast, but not accurate
    • Overfactoring is a big challenge
  • CNMF-MULT
    • Accurate, but may be too slow (similar to NMF results)
• Future Work
  • Identified several promising paths forward
Thank You
Danny Dunlavy
dmdunla@sandia.gov
http://www.cs.sandia.gov/~dmdunla