
Information Retrieval Project



Presentation Transcript


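  1. Information Retrieval Project
     • Team 9
     • 黃國瑜, first-year CS graduate student, ID 90522035
     • 何聰鑫, first-year CS graduate student, ID 90522045
     • 丁智凱, first-year CS graduate student, ID 90522077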

  2. System architecture
     • CPU: Pentium III, 1 GHz
     • RAM: 256 MB
     • OS: Windows 2000
     • Programming language: PHP
     • Database: MySQL

  3. Indexing method (1/3)
     • Indexing
     • Converting all letters to lower case
     • Eliminating stopwords, using a hash table of 317 words
     • Removing punctuation marks
     • Removing words shorter than 3 characters
     • Removing <tag> markup
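
The preprocessing steps on this slide can be sketched roughly as follows. This is a minimal illustration in Python rather than the project's PHP, and it is not the team's code: the function name `index_terms`, the regular expression for tags, and the tiny stopword set are assumptions (the slide gives only the size of the real list, 317 words). A Python `set` stands in for the hash table.

```python
import re
import string

# Illustrative stopword set only; the project's 317-word list is not given.
STOPWORDS = {"the", "and", "for", "with", "that", "this", "are", "was"}

# Assumed definition of a tag: anything between "<" and ">".
TAG_RE = re.compile(r"<[^>]+>")

# Translation table that turns every ASCII punctuation mark into a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def index_terms(text, min_length=3):
    """Return the index terms of `text` after the slide's cleanup steps."""
    text = TAG_RE.sub(" ", text)        # remove <tag> markup first
    text = text.lower()                 # lower-case all letters
    text = text.translate(PUNCT_TABLE)  # remove punctuation marks
    return [
        word
        for word in text.split()
        if len(word) >= min_length and word not in STOPWORDS
    ]


print(index_terms("<p>The Quick, brown fox is on TV!</p>"))
```

Tags are stripped before punctuation so that the angle brackets are still present when the tag pattern runs.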

  4. Indexing method (2/3)
     • Database tables
     • IndexMap (Index, TermID, DocID, Line, Pattern)
     • DocMap (DocID, FileName, DocTitle)
     • TermMap (TermID, Term)
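
A hedged sketch of how these three tables might be declared and filled is shown below. It uses Python's built-in `sqlite3` module in place of MySQL so that it runs without a server. Only the table and column names come from the slide; the column types, the keys, the helper functions, and the reading of `Line` as a line number and `Pattern` as an occurrence count on that line are all assumptions.

```python
import sqlite3

# Table and column names follow the slide; types and constraints are guesses.
SCHEMA = """
CREATE TABLE TermMap (
    TermID INTEGER PRIMARY KEY,
    Term   TEXT UNIQUE NOT NULL
);
CREATE TABLE DocMap (
    DocID    INTEGER PRIMARY KEY,
    FileName TEXT NOT NULL,
    DocTitle TEXT
);
CREATE TABLE IndexMap (
    "Index" INTEGER PRIMARY KEY,  -- quoted because INDEX is an SQL keyword
    TermID  INTEGER NOT NULL REFERENCES TermMap(TermID),
    DocID   INTEGER NOT NULL REFERENCES DocMap(DocID),
    Line    INTEGER,              -- assumed: line number of the occurrence
    Pattern INTEGER               -- assumed: occurrences on that line
);
"""


def open_index():
    """Create an empty in-memory index database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_document(conn, file_name, title):
    """Register a document and return its DocID."""
    cur = conn.execute(
        "INSERT INTO DocMap (FileName, DocTitle) VALUES (?, ?)",
        (file_name, title),
    )
    return cur.lastrowid


def term_id(conn, term):
    """Return the TermID for `term`, inserting the term if it is new."""
    conn.execute("INSERT OR IGNORE INTO TermMap (Term) VALUES (?)", (term,))
    row = conn.execute(
        "SELECT TermID FROM TermMap WHERE Term = ?", (term,)
    ).fetchone()
    return row[0]


def add_posting(conn, term, doc_id, line, pattern):
    """Record one (term, document, line) entry in IndexMap."""
    conn.execute(
        "INSERT INTO IndexMap (TermID, DocID, Line, Pattern) VALUES (?, ?, ?, ?)",
        (term_id(conn, term), doc_id, line, pattern),
    )
```

Storing each term string once in TermMap and referring to it by TermID keeps the IndexMap rows small, which is the usual reason for splitting an inverted index this way.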

  5. Indexing method (3/3)
     • Indexing speed
     • 130 sec/MB
     • Total: 125 sec × 490 MB ≈ 17 hr
     • E.g.
     • File name: FB496255
     • File size: 997438
     • Total terms: 8523
     • Start: 1004540338.9145 sec
     • End: 1004540464.1279 sec
     • Total: 125.2134180069 sec
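
The arithmetic behind these figures can be checked with a few lines of Python. The sketch assumes the file size is given in bytes, which the slide does not state; under that assumption the per-megabyte rate comes out near 125 seconds with 10^6-byte megabytes and near 130 seconds with 2^20-byte megabytes. The variable names are illustrative.

```python
# Figures copied from the slide.
start = 1004540338.9145
end = 1004540464.1279
file_size = 997438          # assumed to be bytes

elapsed = end - start       # indexing time for the example file, in seconds

# Seconds per megabyte under the two common definitions of a megabyte.
sec_per_mb_decimal = elapsed / (file_size / 10**6)
sec_per_mb_binary = elapsed / (file_size / 2**20)

# Projected time for the whole collection, as computed on the slide.
total_hours = 125 * 490 / 3600

print(f"elapsed: {elapsed:.4f} s")
print(f"rate: {sec_per_mb_decimal:.1f} s per 10^6 bytes, "
      f"{sec_per_mb_binary:.1f} s per 2^20 bytes")
print(f"projected total: {total_hours:.2f} hr")
```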

  6. Query (1/3)
     • Interface
     • Query
     • Insert new data
     • View existing data
     • Help
     • Mail

  7. Query (2/3)
     • Query
     • Features
     • Multiple-keyword query
     • Title query
     • Speed
     • Match String: 6448, Search Time: 2.3293360471725 sec
     • Match String: 239, Search Time: 0.72075593471527 sec
     • (Search time depends on network speed and on the number of results)
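
One way a multiple-keyword query with a timing readout could work is sketched below. The slides do not say how keywords are combined or exactly what "Match String" counts, so this sketch assumes AND semantics (a document must contain every keyword) and counts matching line entries. The in-memory `index` layout and the `search` function are hypothetical stand-ins for the MySQL tables.

```python
import time


def search(index, keywords):
    """Find documents containing every keyword (AND semantics, assumed).

    `index` maps term -> {doc_id: [line numbers]}.
    Returns (sorted doc IDs, number of matching line entries, seconds).
    """
    started = time.perf_counter()
    terms = [k.lower() for k in keywords]

    # One set of document IDs per keyword; unknown terms give an empty set.
    postings = [set(index.get(term, {})) for term in terms]
    docs = set.intersection(*postings) if postings else set()

    # Count line entries for the query terms inside the matching documents.
    matches = sum(len(index[term][doc]) for term in terms for doc in docs)

    elapsed = time.perf_counter() - started
    return sorted(docs), matches, elapsed


index = {
    "information": {1: [2, 5], 2: [1]},
    "retrieval": {1: [5], 3: [4]},
}
docs, matches, seconds = search(index, ["Information", "retrieval"])
print(docs, matches, f"{seconds:.6f} sec")
```

Timing only the lookup, as here, excludes network transfer, which the slide names as one of the factors in the reported search times.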

  8. Query (3/3)
     • Output
     • Performance
     • Match String
     • Search Time
     • Query result
     • File name
     • Document title
     • Lines (5 lines shown)
     • Number of patterns (marked by highlighting)
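
The result display described here, showing up to five lines with highlighted matches and a pattern count, might be sketched as follows. The `<b>` markers, the case-insensitive matching, and both function names are assumptions; the slide says only that patterns are highlighted and that 5 lines are shown.

```python
import re


def highlight(line, keywords, open_mark="<b>", close_mark="</b>"):
    """Wrap each keyword occurrence in markers; return (text, count)."""
    if not keywords:
        return line, 0
    pattern = re.compile(
        "|".join(re.escape(k) for k in keywords), re.IGNORECASE
    )
    return pattern.subn(
        lambda m: f"{open_mark}{m.group(0)}{close_mark}", line
    )


def render_result(lines, keywords, max_lines=5):
    """Return (highlighted lines to show, total number of patterns)."""
    shown = []
    total = 0
    for line in lines:
        marked, count = highlight(line, keywords)
        if count:
            total += count
            if len(shown) < max_lines:
                shown.append(marked)
    return shown, total


text, count = highlight("Information retrieval is fun", ["retrieval"])
print(text, count)
```

The pattern count covers every matching line, while the displayed list stops at `max_lines`, mirroring the separate "Lines" and "Number of patterns" fields on the slide.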
