
Focused Crawler

Focused Crawler. Ben Markines, Mira Stoilova, Fulya Erdinc. Introduction. Based on the paper presented in the first week of class, Accelerated Focused Crawling through Online Relevance Feedback by Chakrabarti, presented by Mark Meiss.


Presentation Transcript


  1. Focused Crawler. Ben Markines, Mira Stoilova, Fulya Erdinc

  2. Introduction
     • Based on the paper presented in the first week of class
       • Accelerated Focused Crawling through Online Relevance Feedback by Chakrabarti, presented by Mark Meiss
     • Implemented a focused crawler and a focused crawler with an apprentice
     • The apprentice analyzes the words around a link
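To make the apprentice's input concrete, here is a minimal sketch of collecting the words around each link on a page. The slides do not say how the surrounding window was defined, so the flat token window, the `window` parameter, and the names `LinkContextParser` and `link_contexts` are assumptions made for illustration only.

```python
import re
from html.parser import HTMLParser


class LinkContextParser(HTMLParser):
    """Collects page tokens and remembers where each <a href> occurs."""

    def __init__(self):
        super().__init__()
        self.tokens = []  # lowercased word tokens in document order
        self.links = []   # (href, index of the first token after the tag)

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append((href, len(self.tokens)))

    def handle_data(self, data):
        self.tokens.extend(re.findall(r"[a-z0-9]+", data.lower()))


def link_contexts(html, window=3):
    """Map each href to the tokens within `window` positions of the link.

    Assumption: a flat token window that includes the start of the anchor
    text. If the same href appears twice, the last occurrence wins.
    """
    parser = LinkContextParser()
    parser.feed(html)
    parser.close()
    contexts = {}
    for href, pos in parser.links:
        start = max(0, pos - window)
        contexts[href] = parser.tokens[start:pos + window]
    return contexts


page = (
    "<p>Read more about focused web crawling in "
    '<a href="/crawl">this survey paper</a> from our reading list.</p>'
)
contexts = link_contexts(page, window=3)
```

The token lists produced this way could serve as the features on which an apprentice classifier is trained to guess whether following a link is worthwhile.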

  3. Crawler Implementation
     • Feature extraction
       • Using document frequency and mutual information
     • Baseline crawl using a classifier
       • Naïve Bayesian
       • Cosine Similarity
       • Support Vector Machine
     • Crawl with trained apprentice
       • Again using the same types of classifiers
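As a rough illustration of the feature extraction and cosine-similarity steps listed on this slide, the sketch below filters terms by document frequency, ranks the survivors by mutual information with a binary relevance label, and scores sparse term vectors with cosine similarity. The thresholds `min_df` and `top_k`, the toy documents, and all function names are assumptions; the slides do not give the authors' actual settings or code.

```python
import math
from collections import Counter


def document_frequency(docs):
    """Number of documents each term occurs in; docs are lists of tokens."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return df


def mutual_information(docs, labels, term):
    """Mutual information (bits) between term presence and a binary label."""
    n = len(docs)
    counts = Counter(
        (term in tokens, bool(label)) for tokens, label in zip(docs, labels)
    )
    mi = 0.0
    for t in (True, False):
        for c in (True, False):
            joint = counts[(t, c)] / n
            p_t = (counts[(t, True)] + counts[(t, False)]) / n
            p_c = (counts[(True, c)] + counts[(False, c)]) / n
            if joint > 0:
                mi += joint * math.log2(joint / (p_t * p_c))
    return mi


def select_features(docs, labels, min_df=2, top_k=2):
    """Keep terms with document frequency >= min_df, then take the top_k by MI."""
    df = document_frequency(docs)
    candidates = [term for term, count in df.items() if count >= min_df]
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(
        candidates, key=lambda term: (-mutual_information(docs, labels, term), term)
    )
    return ranked[:top_k]


def cosine(a, b):
    """Cosine similarity between two sparse term -> weight dictionaries."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy corpus (hypothetical): two on-topic and two off-topic documents.
docs = [
    ["focused", "crawler", "web"],
    ["crawler", "classifier", "web"],
    ["recipe", "pasta", "web"],
    ["pasta", "sauce", "web"],
]
labels = [1, 1, 0, 0]
features = select_features(docs, labels)
```

In this toy corpus "web" appears in every document and so carries no information about relevance, while "crawler" and "pasta" each separate the two classes perfectly. Mutual information is symmetric, so a term that signals the off-topic class ranks as highly as one that signals the on-topic class.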

  4. Baseline Precision/Recall Target Pages

  5. Baseline Precision/Recall DMOZ Description

  6. Apprentice Precision/Recall Target Pages

  7. Apprentice Precision/Recall DMOZ Description
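The plots for these precision/recall slides did not survive in the transcript. As a reminder of what such curves measure, here is a minimal sketch that tracks precision and recall as a crawl proceeds, given the order in which pages were fetched and a set of known relevant pages. How the authors actually judged relevance (against target pages or DMOZ descriptions) is not recoverable from the slides, so the function name and inputs are hypothetical.

```python
def precision_recall_curve(crawl_order, targets):
    """Return (pages fetched, precision, recall) after each new page.

    crawl_order: URLs in the order the crawler fetched them.
    targets: URLs considered relevant (an assumed ground-truth set).
    """
    targets = set(targets)
    seen = set()
    hits = 0
    curve = []
    for url in crawl_order:
        if url in seen:
            continue  # count each page only once
        seen.add(url)
        if url in targets:
            hits += 1
        precision = hits / len(seen)
        recall = hits / len(targets) if targets else 0.0
        curve.append((len(seen), precision, recall))
    return curve


curve = precision_recall_curve(["a", "x", "b", "y"], {"a", "b", "c"})
```

Comparing the curve of a baseline crawl with that of an apprentice-guided crawl over the same number of fetched pages is one way to read the four plots named above.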
