1 / 13

Paper: Indexing by Latent Semantic Analysis

Paper: Indexing by Latent Semantic Analysis. for course cs630 presented by: Haiyan Qiao. Problem Introduction. Traditional term-matching method doesn’t work well in information retrieval We want to capture the concepts instead of words. Concepts are reflected in the words. However,

garvey
Télécharger la présentation

Paper: Indexing by Latent Semantic Analysis

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Paper: Indexing by Latent Semantic Analysis for course cs630 presented by: Haiyan Qiao

  2. Problem Introduction • Traditional term-matching method doesn’t work well in information retrieval • We want to capture the concepts instead of words. Concepts are reflected in the words. However, • One term may have multiple meaning • Different terms may have the same meaning.

  3. LSI (Latent Semantic Analysis) • LSI approach tries to overcome the deficiencies of term-matching retrieval by treating the unreliability of observed term-document association data as a statistical problem. • The goal is to find effective models to represent the relationship between terms and documents. Hence a set of terms, which is by itself incomplete and unreliable, will be replaced by some set of entities which are more reliable indicants.

  4. SVD (Singular Value Decomposition) • How to learn the concepts from data? • SVD is applied to derive the latent semantic structure model. • What is SVD? http://kwon3d.com/theory/jkinem/svd.html http://mathworld.wolfram.com/SingularValueDecomposition.html http://www.cs.ut.ee/~toomas_l/linalg/lin2/node13.html#SECTION00013200000000000000

  5. SVD cont’ • SVD of the term-by-document matrix X: • If the singular values of S0 are ordered by size, we only keep the first k largest values and get a reduced model: • doesn’t exactly match X and it gets closer as more and more singular values are kept • This is what we want. We don’t want perfect fit since we think some of 0’s in X should be 1 and vice versa. • It reflects the major associative patterns in the data, and ignores the smaller, less important influence and noise.

  6. Fundamental Comparison Quantities from the SVD Model • Comparing Two Terms: the dot product between two row vectors of reflects the extent to which two terms have a similar pattern of occurrence across the set of document. • Comparing Two Documents: dot product between two column vectors of • Comparing a Term and a Document

  7. Example -Technical Memo • Query: human-computer interaction • Dataset: c1 Human machine interface for Lab ABC computer application c2 A survey of user opinion of computersystemresponsetime c3 The EPS user interface management system c4 System and humansystem engineering testing of EPS c5 Relations of user-perceived responsetime to error measurement m1 The generation of random, binary, unordered trees m2 The intersection graph of paths in trees m3 Graphminors IV: Widths of trees and well-quasi-ordering m4 Graphminors: A survey

  8. Example cont’ % 12-term by 9-document matrix >> X=[ 1 0 0 1 0 0 0 0 0; 1 0 1 0 0 0 0 0 0; 1 1 0 0 0 0 0 0 0; 0 1 1 0 1 0 0 0 0; 0 1 1 2 0 0 0 0 0 0 1 0 0 1 0 0 0 0; 0 1 0 0 1 0 0 0 0; 0 0 1 1 0 0 0 0 0; 0 1 0 0 0 0 0 0 1; 0 0 0 0 0 1 1 1 0; 0 0 0 0 0 0 1 1 1; 0 0 0 0 0 0 0 1 1;];

  9. Example cont’ % X=T0*S0*D0', T0 and D0 have orthonormal columns and So is diagonal % T0 is the matrix of eigenvectors of the square symmetric matrix XX' % D0 is the matrix of eigenvectors of X’X % S0 is the matrix of eigenvalues in both cases >> [T0, S0] = eig(X*X'); >> T0 T0 =   0.1561 -0.2700 0.1250 -0.4067 -0.0605 -0.5227 -0.3410 -0.1063 -0.4148 0.2890 -0.1132 0.2214 0.1516 0.4921 -0.1586 -0.1089 -0.0099 0.0704 0.4959 0.2818 -0.5522 0.1350 -0.0721 0.1976 -0.3077 -0.2221 0.0336 0.4924 0.0623 0.3022 -0.2550 -0.1068 -0.5950 -0.1644 0.0432 0.2405 0.3123 -0.5400 0.2500 0.0123 -0.0004 -0.0029 0.3848 0.3317 0.0991 -0.3378 0.0571 0.4036 0.3077 0.2221 -0.0336 0.2707 0.0343 0.1658 -0.2065 -0.1590 0.3335 0.3611 -0.1673 0.6445 -0.2602 0.5134 0.5307 -0.0539 -0.0161 -0.2829 -0.1697 0.0803 0.0738 -0.4260 0.1072 0.2650 -0.0521 0.0266 -0.7807 -0.0539 -0.0161 -0.2829 -0.1697 0.0803 0.0738 -0.4260 0.1072 0.2650 -0.7716 -0.1742 -0.0578 -0.1653 -0.0190 -0.0330 0.2722 0.1148 0.1881 0.3303 -0.1413 0.3008 0.0000 0.0000 0.0000 -0.5794 -0.0363 0.4669 0.0809 -0.5372 -0.0324 -0.1776 0.2736 0.2059 0.0000 0.0000 0.0000 -0.2254 0.2546 0.2883 -0.3921 0.5942 0.0248 0.2311 0.4902 0.0127 -0.0000 -0.0000 -0.0000 0.2320 -0.6811 -0.1596 0.1149 -0.0683 0.0007 0.2231 0.6228 0.0361 0.0000 -0.0000 0.0000 0.1825 0.6784 -0.3395 0.2773 -0.3005 -0.0087 0.1411 0.4505 0.0318

  10. Example cont’ >> [D0, S0] = eig(X'*X); >> D0 D0 =    0.0637 0.0144 -0.1773 0.0766 -0.0457 -0.9498 0.1103 -0.0559 0.1974 -0.2428 -0.0493 0.4330 0.2565 0.2063 -0.0286 -0.4973 0.1656 0.6060 -0.0241 -0.0088 0.2369 -0.7244 -0.3783 0.0416 0.2076 -0.1273 0.4629 0.0842 0.0195 -0.2648 0.3689 0.2056 0.2677 0.5699 -0.2318 0.5421 0.2624 0.0583 -0.6723 -0.0348 -0.3272 0.1500 -0.5054 0.1068 0.2795 0.6198 -0.4545 0.3408 0.3002 -0.3948 0.0151 0.0982 0.1928 0.0038 -0.0180 0.7615 0.1522 0.2122 -0.3495 0.0155 0.1930 0.4379 0.0146 -0.5199 -0.4496 -0.2491 -0.0001 -0.1498 0.0102 0.2529 0.6151 0.0241 0.4535 0.0696 -0.0380 -0.3622 0.6020 -0.0246 0.0793 0.5299 0.0820

  11. Example cont’ >> S0=eig(X'*X) >> S0=S0.^0.5 S0 =   0.3637 0.5601 0.8459 1.3064 1.5048 1.6445 2.3539 2.5417 3.3409 % We only keep the largest two singular values % and the corresponding columns from the T and D

  12. Example cont’ >> T=[0.2214 -0.1132; 0.1976 -0.0721; 0.2405 0.0432; 0.4036 0.0571; 0.6445 -0.1673; 0.2650 0.1072; 0.2650 0.1072; 0.3008 -0.1413; 0.2059 0.2736; 0.0127 0.4902; 0.0361 0.6228; 0.0318 0.4505;]; >> S = [ 3.3409 0; 0 2.5417 ]; >> D’ =[0.1974 0.6060 0.4629 0.5421 0.2795 0.0038 0.0146 0.0241 0.0820; -0.0559 0.1656 -0.1273 -0.2318 0.1068 0.1928 0.4379 0.6151 0.5299;] >> T*S*D’ 0.1621 0.4006 0.3790 0.4677 0.1760 -0.0527 0.1406 0.3697 0.3289 0.4004 0.1649 -0.0328 0.1525 0.5051 0.3580 0.4101 0.2363 0.0242 0.2581 0.8412 0.6057 0.6973 0.3924 0.0331 0.4488 1.2344 1.0509 1.2658 0.5564 -0.0738 0.1595 0.5816 0.3751 0.4168 0.2766 0.0559 0.1595 0.5816 0.3751 0.4168 0.2766 0.0559 0.2185 0.5495 0.5109 0.6280 0.2425 -0.0654 0.0969 0.5320 0.2299 0.2117 0.2665 0.1367 -0.0613 0.2320 -0.1390 -0.2658 0.1449 0.2404 -0.0647 0.3352 -0.1457 -0.3016 0.2028 0.3057 -0.0430 0.2540 -0.0966 -0.2078 0.1520 0.2212

  13. Summary • What is the common and difference between PCA and SVD? • Both are related to standard eigenvalue-eigenvector, to remove noise or correlation and get the most important info. • PCA is on covariance matrix and SVD works on original matrix.

More Related