Extrapolation Methods for Accelerating PageRank Computations

Extrapolation Methods for Accelerating PageRank Computations. Doğu Gül, Boğaziçi University, 1/12/2003. Introduction. A fast method is proposed for computing PageRank, a hyperlink-based estimate of the “importance” of Web pages. The Web link graph is represented by a “Markov matrix”.


Presentation Transcript


  1. Extrapolation Methods for Accelerating PageRank Computations Doğu Gül Boğaziçi University 1/12/2003

  2. Introduction • A fast method is proposed for computing PageRank, a hyperlink-based estimate of the “importance” of Web pages. • The Web link graph is represented by a “Markov matrix”. • The PageRank algorithm uses the “Power Method” to compute the principal eigenvector of the Markov matrix. • Empirically, extrapolation methods are shown to speed up PageRank computation by 25-300%.

  3. Definitions • A link from a page u to a page v can be viewed as evidence that v is an “important” page. • The importance that page v receives from a page u linking to it is proportional to the importance of u and inversely proportional to the number of pages u points to. • The PageRank of a page i is defined as the probability that, at some particular time step, a random surfer is at page i.

  4. Adopting Markov Matrix • The problem can be defined as a random walk on a directed Web graph G. • Assume there exists an edge from u to v. • deg(u) is the outdegree of page u in G. • Consider a random surfer visiting page u at time k; in the next time step, the surfer chooses a node vi from among u’s out-neighbors uniformly at random. • The transition matrix describing the transition from i to j is given by P, with Pij = 1 / deg(i) if there is an edge from i to j and Pij = 0 otherwise.
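The definition of P above can be sketched in a few lines of Python. This is a minimal illustration, assuming the graph is given as a dictionary of out-links and stored densely; the names `transition_matrix` and `links`, and the 4-page graph itself, are hypothetical and not from the slides.

```python
def transition_matrix(out_links, n):
    """Build P as a dense n x n list of lists.

    P[i][j] = 1 / deg(i) if page i links to page j, and 0 otherwise.
    Pages with no out-links are left as all-zero rows.
    """
    P = [[0.0] * n for _ in range(n)]
    for i, targets in out_links.items():
        targets = set(targets)  # count each distinct out-neighbor once
        for j in targets:
            P[i][j] = 1.0 / len(targets)
    return P


# Hypothetical 4-page graph; page 3 has outdegree 0.
links = {0: [1, 2], 1: [2], 2: [0], 3: []}
P = transition_matrix(links, 4)
```

Row 3 of this P is all zeros, which is why P alone does not describe a valid random walk for this graph.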

  5. Conversion to a Valid Transition Matrix • For P to be a valid transition matrix, it must have no rows consisting of all zeros. • A new transition matrix P’ is introduced that has no all-zero rows. • Let d be the n-dimensional column vector identifying the nodes with outdegree 0: d_i = 1 if deg(i) = 0, and d_i = 0 otherwise.

  6. Conversion to a Valid Transition Matrix (cont.) • Then P’ is constructed as follows: • P’’ is constructed as follows:
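In the PageRank literature the two constructions are commonly written as below. The symbols D, E and the all-ones vector are assumptions introduced here for illustration; v is taken to be a probability vector over the n pages (for example the uniform vector with v_i = 1/n), and c is the damping factor that appears in the results slides.

```latex
\begin{aligned}
D   &= d\,v^{T},          & P'  &= P + D, \\
E   &= \mathbf{1}\,v^{T}, & P'' &= c\,P' + (1 - c)\,E, \qquad 0 < c < 1 .
\end{aligned}
```

Under these assumptions, P’ replaces every all-zero row of P with v, and P’’ mixes P’ with a jump to a page drawn from v with probability 1 - c, so every row of P’’ sums to 1.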

  7. Power Method • The matrix A = (P’’)T is used in the formulation of the “Power Method”: x(k) = A x(k-1) • x(0) can be written as a linear combination of the eigenvectors u1, ..., um of A: x(0) = u1 + α2u2 + ..... + αmum
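The expansion of x(0) explains why the iteration converges. As a short worked derivation, assuming A is diagonalizable with eigenvalues ordered so that 1 = λ1 > |λ2| ≥ ... ≥ |λm| (the λ symbols are introduced here and do not appear on the slide):

```latex
x^{(k)} = A^{k} x^{(0)}
        = u_1 + \alpha_2 \lambda_2^{k} u_2 + \dots + \alpha_m \lambda_m^{k} u_m
        \;\longrightarrow\; u_1
        \quad \text{as } k \to \infty ,
```

so the error shrinks roughly like |λ2|^k, and a second eigenvalue close to 1 in magnitude means slow convergence.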

  8. Power Method Algorithm • The power method algorithm:
     PowerMethod() {
       x(0) = v
       k = 1
       repeat
         x(k) = A x(k-1)
         a = |x(k) – x(k-1)|
         k = k + 1
       until a < ε
     }
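The pseudocode above can be sketched in runnable Python as follows. This is a minimal illustration, assuming a small dense column-stochastic matrix stored as a list of rows and an L1 norm for the residual; the helper `mat_vec`, the `max_iter` guard, and the 3x3 example matrix are assumptions added here.

```python
def mat_vec(A, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def power_method(A, v, eps=1e-10, max_iter=10000):
    """Iterate x(k) = A x(k-1) from x(0) = v until the L1 change is below eps.

    Returns the final iterate and the number of iterations performed.
    """
    x = list(v)
    k = 0
    while k < max_iter:
        x_next = mat_vec(A, x)
        delta = sum(abs(a - b) for a, b in zip(x_next, x))  # L1 residual
        x = x_next
        k += 1
        if delta < eps:
            break
    return x, k


# Hypothetical 3-page example; each column of A sums to 1.
A = [[0.1, 0.4, 0.5],
     [0.6, 0.2, 0.3],
     [0.3, 0.4, 0.2]]
ranks, iterations = power_method(A, [1 / 3, 1 / 3, 1 / 3])
```

Because the columns of A sum to 1, each iterate keeps summing to 1, so no renormalization step is needed in this sketch.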

  9. Aitken Extrapolation • It is assumed that x(k-2) can be expressed as a linear combination of the first two eigenvectors of A. • x(k-2) = u1 + α2u2 • x(k-1) = A x(k-2) • x(k) = A x(k-1)
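Solving the three relations above for u1 gives the classical component-wise Aitken delta-squared formula. The sketch below is a minimal illustration of that formula under the two-eigenvector assumption; the function name, the fallback for a near-zero denominator, and the 2x2 example matrix are assumptions added here.

```python
def mat_vec(A, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def aitken_extrapolate(x_km2, x_km1, x_k):
    """Estimate u1 from three successive power-method iterates.

    Component-wise: u1_i = x(k-2)_i - g_i / h_i, where
    g_i = (x(k-1)_i - x(k-2)_i)^2 and h_i = x(k)_i - 2 x(k-1)_i + x(k-2)_i.
    """
    estimate = []
    for a, b, c in zip(x_km2, x_km1, x_k):
        g = (b - a) ** 2
        h = c - 2.0 * b + a
        # Assumption: keep the latest iterate where the denominator vanishes.
        estimate.append(a - g / h if abs(h) > 1e-15 else c)
    return estimate


# Hypothetical 2-page chain, for which the two-eigenvector assumption is exact.
A = [[0.9, 0.2],
     [0.1, 0.8]]
x0 = [0.5, 0.5]
x1 = mat_vec(A, x0)
x2 = mat_vec(A, x1)
u1 = aitken_extrapolate(x0, x1, x2)
```

For a real Web matrix the assumption holds only approximately, so the extrapolated vector is used as a better starting point for further power iterations rather than as the final answer.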

  10. Aitken Extrapolation Results • Comparison of convergence rate of unaccelerated Power Method and Aitken Extrapolation for c = 0.99. • Extrapolation was applied at the 10th iteration.

  11. Quadratic Extrapolation • It is assumed that the Markov matrix A has only three eigenvectors and that x(k-3) can be expressed as a linear combination of these three eigenvectors. • x(k-3) = u1 + α2u2 + α3u3 • x(k-2) = A x(k-3) • x(k-1) = A x(k-2) • x(k) = A x(k-1)
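One way to turn these four relations into an estimate of u1 is sketched below. Under the three-eigenvector assumption the iterates satisfy a cubic relation γ0 x(k-3) + γ1 x(k-2) + γ2 x(k-1) + γ3 x(k) = 0 whose coefficients sum to zero; fixing γ3 = 1, the remaining coefficients follow from a small least-squares problem, and dividing out the factor for eigenvalue 1 leaves a combination of iterates that points along u1. The function names, the use of 2x2 normal equations in place of a QR factorization, the fallback for a degenerate system, and the final normalization are assumptions made for this illustration.

```python
def mat_vec(A, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def quadratic_extrapolate(x_km3, x_km2, x_km1, x_k):
    """Estimate u1 from four successive power-method iterates."""
    # Differences relative to the oldest iterate.
    y2 = [a - z for a, z in zip(x_km2, x_km3)]
    y1 = [a - z for a, z in zip(x_km1, x_km3)]
    y0 = [a - z for a, z in zip(x_k, x_km3)]
    # Least squares for g1*y2 + g2*y1 + y0 = 0 (gamma3 fixed to 1),
    # solved through the 2x2 normal equations with Cramer's rule.
    a11, a12, a22 = dot(y2, y2), dot(y2, y1), dot(y1, y1)
    b1, b2 = -dot(y2, y0), -dot(y1, y0)
    det = a11 * a22 - a12 * a12
    if abs(det) <= 1e-14 * a11 * a22:
        return list(x_k)  # assumption: skip extrapolation if degenerate
    g1 = (b1 * a22 - a12 * b2) / det
    g2 = (a11 * b2 - a12 * b1) / det
    g3 = 1.0
    # Coefficients of the quotient polynomial after removing the root 1.
    beta0, beta1, beta2 = g1 + g2 + g3, g2 + g3, g3
    u = [beta0 * a + beta1 * b + beta2 * c
         for a, b, c in zip(x_km2, x_km1, x_k)]
    s = sum(u)
    return [ui / s for ui in u]  # rescale so the entries sum to 1


# Hypothetical 3-page chain, for which the assumption is exact.
A = [[0.1, 0.4, 0.5],
     [0.6, 0.2, 0.3],
     [0.3, 0.4, 0.2]]
x0 = [1.0, 0.0, 0.0]
x1 = mat_vec(A, x0)
x2 = mat_vec(A, x1)
x3 = mat_vec(A, x2)
u1 = quadratic_extrapolate(x0, x1, x2, x3)
```

The normal equations keep the sketch short but square the conditioning of the least-squares problem, so a QR-based solve would be the more careful choice on large, ill-conditioned data.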

  12. Quadratic Extrapolation Results • Comparison of convergence rates for Power Method and Quadratic Extrapolation on LARGEWEB for c = 0.90.

  13. Quadratic Extrapolation Results • Comparison of times taken by Power Method and Quadratic Extrapolation on LARGEWEB for c = {0.90, 0.95, 0.99} • The residual tolerance is set to 0.001 for c = {0.90, 0.95} and 0.01 for c = 0.99.

  14. Comparison of Convergence Rates for Three Methods • Comparison of convergence rates for Power Method, Aitken Extrapolation and Quadratic Extrapolation for c = 0.99.

  15. Conclusion • Although PageRank is an offline computation, it has become increasingly desirable to speed up this computation. • The extrapolation step need only be applied periodically, not at every step. • Quadratic and Aitken extrapolation are simple techniques that require little additional infrastructure to integrate into the standard Power Method.
