
Project 2 Latent Dirichlet Allocation

2014/4/29, Beom-Jin Lee. Covers data selection, methodology, and evaluation criteria.





Presentation Transcript


  1. Project 2: Latent Dirichlet Allocation
     2014/4/29, Beom-Jin Lee

  2. Data Selection
     • Enron Email Dataset: http://www.cs.cmu.edu/~enron/
     • NIPS 1-17 data: http://ai.stanford.edu/~gal/data.html, http://www.cs.nyu.edu/~roweis/data.html
     • Datahub (Wikipedia Data, Wikinews, etc.): http://datahub.io/en/dataset
     • Reuters Corpora (RCV1, RCV2, TRC2): http://trec.nist.gov/data/reuters/reuters.html
     • Newsgroup data: http://www.infochimps.com/datasets/20-newsgroups-dataset-de-duped-version
     • Company Datasets: http://endb-consolidated.aihit.com/datasets.htm
     • Twitter Data: http://snap.stanford.edu/data/twitter7.html

  3. Methodology
     • Original paper: David M. Blei, Andrew Y. Ng, Michael I. Jordan, "Latent Dirichlet Allocation", Journal of Machine Learning Research 3, 993-1022, 2003
     • Toolbox: http://psiexp.ss.uci.edu/research/programs_data/toolbox.htm
     • Help: http://www.4four.us/article/2010/11/latent-dirichlet-allocation-simply
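
To make the methodology concrete, here is a minimal sketch of fitting LDA to a toy corpus. It uses collapsed Gibbs sampling, a common alternative to the variational inference described in the original paper, so it should not be read as the paper's algorithm or as the toolbox's code. The function names (`lda_gibbs`, `top_words`), the hyperparameter values, and the toy documents are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents.

    alpha and beta are symmetric Dirichlet hyperparameters (assumed values).
    Returns the vocabulary plus document-topic and topic-word count tables.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word


def top_words(vocab, topic_word, n=3):
    """Return the n highest-count words for each topic."""
    result = []
    for counts in topic_word:
        ranked = sorted(range(len(vocab)), key=lambda v: counts[v], reverse=True)
        result.append([vocab[v] for v in ranked[:n]])
    return result


# Toy corpus (hypothetical); a real run would tokenize one of the datasets above.
docs = [
    "email meeting schedule email energy".split(),
    "energy trading market price energy".split(),
    "neural network learning model network".split(),
    "learning model inference neural model".split(),
]
vocab, doc_topic, topic_word = lda_gibbs(docs, num_topics=2)
for k, words in enumerate(top_words(vocab, topic_word)):
    print(f"topic {k}: {words}")
```

The count tables are kept as plain lists so the sketch runs without third-party packages; for corpora the size of Enron or Wikipedia, an optimized implementation would be needed instead.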

  4. Evaluation Criteria
     • Baseline: data selection, data inspection, methodology report, results from using LDA
     • Plus points:
       • Big data processing methods (Wikipedia, Wall Street Journal, etc.)
       • Comparison with different kinds of models
       • Improvements to LDA
