
Galago for Information Retrieval


arnaud

Presentation Transcript


  1. Galago for Information Retrieval

  2. Galago: a Java-based information retrieval toolkit.
     Home page: http://www.galagosearch.org/index.html
     Download both the binary and the source at: http://code.google.com/p/galagosearch/downloads/list

  3. Tasks overview (diagram): Document, Parse, Index; Query, Parse; Evaluation

  4. Tasks - Index
     Example document:
       <Html> <Head> Hello </Head> <Body> I love information retrieval so much! </Body> </Html>
     Index with words:
       Doc-001: hello; i; love; information……
     Index with stemmed words:
       Doc-001: hello; i; love; informate……
     Index with extents:
       Doc-001: html; head; body……
     Command to build the index (index folder first, then the corpus):
       galago.bat build D:\test_collection\Wiki\index D:\test_collection\Wiki\wiki-small.corpus
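To make the difference between a word index and an extent index concrete, here is a minimal sketch in Python. It is not Galago's implementation or its on-disk format; the names `ToyIndexer` and `build_toy_index` are hypothetical, and positions are counted in tokens purely for illustration. Stemming is left out.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re


class ToyIndexer(HTMLParser):
    """Collects tokens and tag extents for one document (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.words = []      # tokens in document order
        self.open_tags = []  # stack of (tag, token position where it opened)
        self.extents = []    # (tag, begin, end) measured in token positions

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        self.open_tags.append((tag, len(self.words)))

    def handle_endtag(self, tag):
        # Close the most recently opened tag with this name, if any.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i][0] == tag:
                name, begin = self.open_tags.pop(i)
                self.extents.append((name, begin, len(self.words)))
                break

    def handle_data(self, data):
        # Lowercase and split on anything that is not a letter or digit.
        self.words.extend(re.findall(r"[a-z0-9]+", data.lower()))


def build_toy_index(docs):
    """Return (postings, extents) for a dict of doc_id -> HTML text."""
    postings = defaultdict(list)  # word -> [(doc_id, position)]
    extents = defaultdict(list)   # tag  -> [(doc_id, begin, end)]
    for doc_id, text in docs.items():
        parser = ToyIndexer()
        parser.feed(text)
        parser.close()
        for pos, word in enumerate(parser.words):
            postings[word].append((doc_id, pos))
        for tag, begin, end in parser.extents:
            extents[tag].append((doc_id, begin, end))
    return dict(postings), dict(extents)


docs = {
    "Doc-001": "<Html> <Head> Hello </Head> <Body> I love information "
               "retrieval so much! </Body> </Html>"
}
postings, extents = build_toy_index(docs)
print(postings["information"])
print(extents["body"])
```

The word index answers "which documents contain this term, and where", while the extent index records which token ranges fall inside a given tag, which is what makes field-restricted matching possible.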

  5. How the Galago index works
     Index folder:
       documentLengths
       documentNames
       parts (folder):
         postings
         stemmedPostings
         extents

  6. How the Galago index works
     Inspect the index parts (postings, stemmedPostings, extents) with:
       galago.bat dump-keys
       galago.bat dump-index

  7. Tasks - Search
     Text folder: D:\test_collection\Wiki\wiki-small.corpus
     Index folder: D:\test_collection\Wiki\index
     Command:
       galago.bat search D:\test_collection\Wiki\index D:\test_collection\Wiki\wiki-small.corpus
     Then open http://localhost:XXXX in a browser.

  8. IR experiment: information retrieval tasks + evaluation (TREC style)
     Query:
       <parameters>
         <query>
           <number>1</number>
           <text> test query </text>
         </query>
       </parameters>
     Retrieved and ranked results:
       1 Q0 Zend_Framework_a5ef 1 -14.29629326 galago
       1 Q0 KULA-LP_fb81 2 -15.90760040 galago
       1 Q0 KKJK_7f20 3 -15.92886543 galago
       1 Q0 WNUZ_d533 4 -15.93414688 galago
       1 Q0 KZEN_8c0c 5 -15.94256783 galago
     Eval result:
       num_ret      1  100
       num_rel      1  5
       num_rel_ret  1  3
       map          1  0.2667
       ndcg         1  0.4622
       ndcg15       1  0.4622
       R-prec       1  0.4000
     Judgments:
       1 Q0 KULA-LP_fb81 1
       1 Q0 WNUZ_d533 1
       1 Q0 KBON_0027 1
       1 Q0 Nicky_Wroe_5d39 1
       1 Q0 Chemult_(Amtrak_station)_ac76 1

  9. IR experiment: batch search sends multiple queries to Galago.
     Command:
       galago.bat batch-search --index=D:\test_collection\Wiki\index --count=100 D:\test_collection\Wiki\test.query
     Query file:
       <parameters>
         <query>
           <number>1</number>
           <text> test query </text>
         </query>
       </parameters>
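A query file with many queries is tedious to write by hand. The sketch below generates the `<parameters>` / `<query>` / `<number>` / `<text>` structure shown on the slide using Python's standard library. The function name `build_query_xml` and the second query are made up for illustration, and the assumption that Galago accepts several `<query>` elements in one file follows from the slide's "multiple queries" wording rather than from checked documentation.

```python
import xml.etree.ElementTree as ET


def build_query_xml(queries):
    """Serialize (number, text) pairs into the query-file layout from the slide."""
    root = ET.Element("parameters")
    for number, text in queries:
        query = ET.SubElement(root, "query")
        ET.SubElement(query, "number").text = str(number)
        ET.SubElement(query, "text").text = text
    return ET.tostring(root, encoding="unicode")


# Query 1 is from the slide; query 2 is a hypothetical extra query.
xml_text = build_query_xml([(1, "test query"), (2, "information retrieval")])
print(xml_text)
```

Writing `xml_text` to a file such as `test.query` would give a file in the layout the batch-search command above takes as its last argument.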

  10. IR experiment: save your results in a file, e.g. wiki.query.eval
      Retrieved and ranked results:
        1 Q0 Zend_Framework_a5ef 1 -14.29629326 galago
        1 Q0 KULA-LP_fb81 2 -15.90760040 galago
        1 Q0 KKJK_7f20 3 -15.92886543 galago
        1 Q0 WNUZ_d533 4 -15.93414688 galago
        1 Q0 KZEN_8c0c 5 -15.94256783 galago
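Each result line has six whitespace-separated fields: query number, the literal `Q0`, document name, rank, score, and a run tag. A small parser, sketched here with the hypothetical names `RunEntry` and `parse_run`, makes the saved file easy to inspect from a script. The field meanings follow the usual TREC run layout, which the slide's "TREC style" label suggests but does not spell out.

```python
from typing import NamedTuple


class RunEntry(NamedTuple):
    query: str
    doc: str
    rank: int
    score: float
    tag: str


def parse_run(lines):
    """Parse TREC-style result lines; lines without six fields are skipped."""
    entries = []
    for line in lines:
        parts = line.split()
        if len(parts) != 6:
            continue
        query, _q0, doc, rank, score, tag = parts
        entries.append(RunEntry(query, doc, int(rank), float(score), tag))
    return entries


sample = """\
1 Q0 Zend_Framework_a5ef 1 -14.29629326 galago
1 Q0 KULA-LP_fb81 2 -15.90760040 galago
1 Q0 KKJK_7f20 3 -15.92886543 galago
1 Q0 WNUZ_d533 4 -15.93414688 galago
1 Q0 KZEN_8c0c 5 -15.94256783 galago
"""
entries = parse_run(sample.splitlines())
for entry in entries:
    print(entry.rank, entry.doc, entry.score)
```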

  11. IR experiment: you need a judgment file, e.g. wiki.query.judgments
      Judgments:
        1 Q0 KULA-LP_fb81 1
        1 Q0 WNUZ_d533 1
        1 Q0 KBON_0027 1
        1 Q0 Nicky_Wroe_5d39 1
        1 Q0 Chemult_(Amtrak_station)_ac76 1
      You can then evaluate your retrieval and ranking performance:
        galago.bat eval D:\test_collection\Wiki\wiki.query.eval D:\test_collection\Wiki\wiki.query.judgments
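To see what the evaluation measures mean, the sketch below computes average precision and R-precision by hand for query 1, using the textbook definitions. It is not Galago's eval code, and the function names are hypothetical. Only the five result lines shown on the slides are used, whereas the slide's figures were computed over all 100 retrieved documents, so average precision here will not match the reported map value.

```python
def average_precision(ranked_docs, relevant):
    """Sum of precision at each relevant hit, divided by the number of relevant docs."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def r_precision(ranked_docs, relevant):
    """Precision within the top R results, where R is the number of relevant docs."""
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for doc in ranked_docs[:r] if doc in relevant) / r


# Top five results and the judged-relevant documents, copied from the slides.
ranked = ["Zend_Framework_a5ef", "KULA-LP_fb81", "KKJK_7f20", "WNUZ_d533", "KZEN_8c0c"]
relevant = {
    "KULA-LP_fb81",
    "WNUZ_d533",
    "KBON_0027",
    "Nicky_Wroe_5d39",
    "Chemult_(Amtrak_station)_ac76",
}

ap = average_precision(ranked, relevant)
rp = r_precision(ranked, relevant)
print(f"AP over the top 5 shown: {ap:.4f}")
print(f"R-prec: {rp:.4f}")
```

Relevant documents appear at ranks 2 and 4, so the truncated list gives an average precision of (1/2 + 2/4) / 5 = 0.2 and an R-precision of 2/5 = 0.4.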

  12. You can improve on this by modifying org.galagosearch.core.tools.App.java. And …… for advanced use, read and update the source code!
