This presentation reviews a paper that models the web crawling process using Zipf’s law and compares the results to existing web graph models. It covers the relationship between web crawls and web graphs, the sociological hypotheses commonly used to model web graphs, and algorithms such as PageRank and HITS, emphasizing the ways a web crawl differs from the underlying web graph. A central criticism is that the evaluation criteria employed are the same properties commonly observed in graphs that follow Zipf’s law. The material connects to social networks, the small-world phenomenon, authorities/hubs (A/H), and PageRank, and to the study of crawlers.
Problem Addressed • Attempts to show that a Web Crawl is a random and biased image of the Web Graph, and therefore does not preserve the Web Graph’s properties • Understanding the hyperlink structure has enabled the design of algorithms such as PageRank and HITS (a PageRank sketch follows below)
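For context on the kind of algorithm the hyperlink structure enables, here is a minimal power-iteration sketch of PageRank. The four-page graph, damping factor, and tolerance below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def pagerank(adj, damping=0.85, tol=1e-9, max_iter=100):
    """Power iteration for PageRank.

    adj: dict mapping each page to the list of pages it links to.
    """
    pages = sorted(adj)
    idx = {p: i for i, p in enumerate(pages)}
    n = len(pages)
    # Column-stochastic link matrix: M[j, i] = 1/outdeg(i) if i links to j.
    M = np.zeros((n, n))
    for src, targets in adj.items():
        if targets:
            for dst in targets:
                M[idx[dst], idx[src]] = 1.0 / len(targets)
        else:
            # Dangling page: the random surfer teleports uniformly.
            M[:, idx[src]] = 1.0 / n
    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        r_next = damping * M @ r + (1 - damping) / n
        if np.abs(r_next - r).sum() < tol:
            break
        r = r_next
    return dict(zip(pages, r))

# Hypothetical 4-page graph, for illustration only.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
print(pagerank(links))
```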
Approach • Authors generally use sociological hypotheses to model web graphs and then compare the resulting graphs with existing crawls • This paper instead attempts to model the web crawling process using only Zipf’s law, yet arrives at results comparable to existing web graph models (a minimal simulation sketch follows)
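A minimal sketch of the flavor of Zipf-only modeling the paper describes: draw page in-degrees from a Zipf distribution and wire edges at random. The graph size and exponent are illustrative assumptions, not the paper’s parameters:

```python
import numpy as np

rng = np.random.default_rng(42)

# Draw in-degrees for n pages from a Zipf distribution (exponent a > 1).
n, a = 10_000, 2.1
in_degrees = rng.zipf(a, size=n)

# Wire a random multigraph consistent with those in-degrees: each
# incoming edge of page v originates from a uniformly chosen source.
edges = [(rng.integers(n), v) for v in range(n) for _ in range(in_degrees[v])]
print(f"{len(edges)} edges; max in-degree {in_degrees.max()}")
```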
Criticism • Uses Zipf’s law to model the crawling process, but also uses Zipf’s law in the evaluation, making the argument circular • Most of the evaluation criteria used are properties commonly observed in any graph following Zipf’s law • Concludes that Web Crawls are different from Web Graphs without actually proving it
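To see why the circularity matters: a degree sequence sampled from a Zipf distribution will, by construction, trace a straight line on a log-log rank-size plot, so any Zipf-based evaluation criterion cannot falsify a Zipf-based model. A hypothetical check (not from the paper) illustrating that the test passes automatically:

```python
import numpy as np

rng = np.random.default_rng(0)
a = 2.1
degrees = rng.zipf(a, size=50_000)

# Rank-size view: sort degrees in descending order and regress
# log(degree) on log(rank). For Zipf samples with exponent a, a
# slope near -1/(a - 1) (about -0.9 here) is expected by construction.
ranked = np.sort(degrees)[::-1].astype(float)
ranks = np.arange(1, len(ranked) + 1)
slope, intercept = np.polyfit(np.log(ranks), np.log(ranked), 1)
print(f"log-log slope: {slope:.2f}")
```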
Relation to Course • Closely related to social networks: discusses Zipf’s law, the small-world phenomenon, authorities/hubs (A/H), and PageRank • Also related to the study of crawlers