
A Vector Space Model for Automatic Indexing

Slides covering two papers: "A Vector Space Model for Automatic Indexing" by G. Salton, A. Wong and C. S. Yang, and "Enhanced Vector Space Models for Content-based Recommender Systems" by Cataldo Musto. Presenter: Sawood Alam <salam@cs.odu.edu>.


Presentation Transcript


  1. A Vector Space Model for Automatic Indexing (G. Salton, A. Wong and C. S. Yang) • Enhanced Vector Space Models for Content-based Recommender Systems (Cataldo Musto) • Presenter: Sawood Alam <salam@cs.odu.edu>

  2. A Vector Space Model for Automatic Indexing • G. Salton, A. Wong and C. S. Yang • Cornell University

  3. Introduction • In document retrieval, the best indexing space is one in which each entity lies as far from the others as possible • The density of the object space thus becomes a measure of the quality of the indexing system • Retrieval performance correlates inversely with space density
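
To make the notion of space density concrete, here is a minimal sketch that measures density as the average pairwise cosine similarity between document vectors. The function names (`cosine`, `space_density`) and the toy collections are illustrative assumptions, not taken from the slides, and average pairwise similarity is only one plausible density measure.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def space_density(docs):
    """Average pairwise similarity; higher values mean a more crowded space."""
    pairs = list(combinations(docs, 2))
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical toy collections: one well separated, one tightly packed.
spread_out = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
crowded = [[1, 1, 0], [1, 1, 0.1], [1, 0.9, 0]]

print(space_density(spread_out))
print(space_density(crowded))
```

Under the claim on this slide, the spread-out collection (lower density) would be the easier one to retrieve from.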

  4. Document Space • Di = (di1, di2, di3, …, dit), where dij is the weight of the j-th term in document Di
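
As an illustration of the document vector on this slide, the sketch below builds one weight vector per document over a shared vocabulary. Using raw term counts as weights and whitespace tokenization are assumptions made for brevity; the helper names and sample texts are hypothetical.

```python
from collections import Counter


def build_vocabulary(documents):
    """Sorted list of distinct terms; position j corresponds to term j."""
    return sorted({term for doc in documents for term in doc.split()})


def document_vector(doc, vocabulary):
    """Vector of term weights for one document (raw counts, by assumption)."""
    counts = Counter(doc.split())
    return [counts[term] for term in vocabulary]


# Hypothetical toy collection.
docs = [
    "vector space model",
    "space density and indexing",
    "vector indexing model model",
]
vocab = build_vocabulary(docs)
vectors = [document_vector(d, vocab) for d in docs]

for text, vec in zip(docs, vectors):
    print(vec, text)
```

Each document becomes a point in a space with one dimension per vocabulary term, which is what allows distances and similarities between documents to be computed.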

  5. Document Space (cont.)

  6. Document Space (cont.)

  7. Indexing Performance vs. Space Density

  8. Cluster Density vs. Indexing Performance

  9. Discrimination Value Model

  10. Discrimination Value Model (cont.)

  11. Discrimination Value Model Summary
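
The transcript keeps only the slide titles for the discrimination value model, so the following is a hedged sketch of the general idea as usually described for this model: a term's discrimination value is the change in space density when the term is removed from every document. Measuring density as the average similarity to the centroid, zeroing a column to "remove" a term, and all names and data here are assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(docs):
    """Component-wise mean of all document vectors."""
    return [sum(column) / len(docs) for column in zip(*docs)]


def density(docs):
    """Average similarity between each document and the centroid."""
    center = centroid(docs)
    return sum(cosine(center, d) for d in docs) / len(docs)


def discrimination_value(docs, k):
    """Density without term k minus density with it.

    Positive: removing the term crowds the space, so the term helps
    separate documents. Negative: the term makes documents look alike.
    """
    without_k = [[0 if j == k else w for j, w in enumerate(d)] for d in docs]
    return density(without_k) - density(docs)


# Hypothetical collection: term 0 occurs everywhere, terms 1-3 in one document each.
docs = [[1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 1]]
dv = [discrimination_value(docs, k) for k in range(4)]
print(dv)
```

In this toy collection the term shared by every document receives a negative value, while the terms specific to a single document receive positive values.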

  12. Average Recall vs. Precision

  13. Summary Recall vs. Precision

  14. Enhanced Vector Space Models for Content-based Recommender Systems • Cataldo Musto • Dept. of Computer Science, University of Bari, Italy • cataldomusto@di.uniba.it

  15. Introduction • The use of Vector Space Models (VSM) in Information Retrieval is an established practice • This work investigates the impact of vector space models on Information Filtering • Application: recommender systems

  16. Problems of VSM • High dimensionality • This is becoming more serious as emerging social apps and micro-blogging generate large amounts of web content and new vocabulary • Inability to manage document semantics • The order in which terms occur in the document is not captured

  17. Components • Context vector for each term • Values in {-1, 0, 1} • Vector Space representation of a term (t) • Vector Space representation of a document (d) • Vector Space representation of a user profile (pu)
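
The components listed above can be sketched roughly as follows, assuming a Random Indexing style construction: each term gets a sparse random context vector with values in {-1, 0, 1}, a term vector sums the context vectors of co-occurring terms, a document vector sums its term vectors, and a user profile sums the vectors of liked documents. The co-occurrence window (the whole document), the dimensions, and every name below are assumptions rather than details confirmed by the slides.

```python
import random


def context_vector(dim, nonzero, rng):
    """Sparse ternary vector: `nonzero` random positions set to -1 or +1."""
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1, 1))
    return vec


def add(u, v):
    """Component-wise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]


def build_term_vectors(documents, dim=50, nonzero=4, seed=0):
    """Return (context vectors, term vectors) for tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({t for doc in documents for t in doc})
    context = {t: context_vector(dim, nonzero, rng) for t in vocab}
    terms = {t: [0] * dim for t in vocab}
    for doc in documents:
        for t in doc:
            for other in doc:
                if other != t:  # accumulate contexts of co-occurring terms
                    terms[t] = add(terms[t], context[other])
    return context, terms


def document_vector(doc, terms, dim):
    """Document representation d: sum of the vectors of its terms."""
    vec = [0] * dim
    for t in doc:
        if t in terms:
            vec = add(vec, terms[t])
    return vec


def user_profile(liked_doc_vectors, dim):
    """User profile pu: sum of the vectors of documents the user liked."""
    profile = [0] * dim
    for vec in liked_doc_vectors:
        profile = add(profile, vec)
    return profile


# Hypothetical tokenized item descriptions.
documents = [
    ["space", "opera", "robots"],
    ["robots", "future", "city"],
    ["romance", "city", "paris"],
]
context, terms = build_term_vectors(documents)
doc_vecs = [document_vector(d, terms, 50) for d in documents]
profile = user_profile([doc_vecs[0], doc_vecs[1]], 50)
print(profile[:10])
```

Because the dimension is fixed up front, the representation stays compact as the vocabulary grows, which is the usual motivation for this kind of construction. A weighted variant could scale each liked document's vector by the user's rating before summing.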

  18. Indexing Technique • Random Indexing-based model • Weighted Random Indexing-based model • Semantic Vector-based model • Weighted Semantic Vector-based model

  19. Experimental Evaluation

  20. Conclusions • The first prototype, with a naive weighting scheme, is comparable to other content-based filtering techniques such as a Bayesian classifier • More complex weighting schemes are expected to perform better • User profiles based on Linked Data, rather than keyword-based profiles, may be studied
