Link Building

Presentation Transcript

  1. Link Building. Martin Olsen, Department of Computer Science, Aarhus University

  2. Outline • Motivation and Introduction • Contribution • Link Building • Communities in Networks • Hedonic Games • Simple Games

  3. What is Search Engine Optimization (SEO)? • "... in 2012, companies will spend almost $9 billion on search engine optimization …" (The New York Times, January 2009) • Objective of SEO: a link to your page appears on page 1 of the search results

  4. The WWW as a Graph [Figure: the web drawn as a graph.]

  5. PageRank. Random Surfer Perspective. The random surfer zaps with probability 0.15. [Figure: a graph with nodes 1 to 10 and 1000 random surfers, 100 on each node.]

  6. PageRank. Random Surfer Perspective. The random surfer zaps with probability 0.15. [Figure: distribution of the 1000 random surfers after one tick. Two of the node counts are worked out as 143 = 85 + 85/2 + 15 and 355 = 4 · 85 + 15; the other counts shown are 270, 100, 58 and five nodes with 15.]

  7. PageRank. Random Surfer Perspective. The random surfer zaps with probability 0.15. [Figure: stationary distribution of the 1000 random surfers after 50 ticks. Node 1: 281, node 2: 280, node 3: 66, node 4: 254, node 5: 15, node 6: 43, nodes 7 to 10: 15 each.]

  8. PageRank. Random Surfer Perspective. The random surfer zaps with probability 0.15. [Figure: the stationary distribution as probabilities. Node 1: 0.281, node 2: 0.280, node 3: 0.066, node 4: 0.254, node 5: 0.015, node 6: 0.043, nodes 7 to 10: 0.015 each.]

  9. PageRank. Random Surfer Perspective. The random surfer zaps with probability 0.15. [Figure: the stationary distribution as probabilities. Node 1: 0.281, node 2: 0.280, node 3: 0.066, node 4: 0.254, node 5: 0.015, node 6: 0.043, nodes 7 to 10: 0.015 each.] • PageRank ranking: 1, 2, 4, 3, 6 • PageRank is an important ingredient of the ranking mechanism • Relevance counts as well!
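
The random-surfer ticks on the slides above can be sketched in a few lines of Python. This is only an illustration: the four-node graph `links` and the function names `tick` and `pagerank` are hypothetical and not taken from the talk, and the handling of pages without out-links is an assumption.

```python
def tick(links, surfers, zap=0.15):
    """Advance the surfer counts by one tick of the random-surfer model."""
    nodes = list(links)
    n = len(nodes)
    total = sum(surfers.values())
    # Zapping surfers land on a node chosen uniformly at random.
    nxt = {v: zap * total / n for v in nodes}
    for v in nodes:
        # Assumption: a surfer on a page without out-links jumps uniformly.
        outs = links[v] or nodes
        share = (1 - zap) * surfers[v] / len(outs)
        for w in outs:
            nxt[w] += share
    return nxt


def pagerank(links, ticks=50, zap=0.15):
    """Approximate the stationary distribution by repeating ticks."""
    surfers = {v: 1.0 / len(links) for v in links}
    for _ in range(ticks):
        surfers = tick(links, surfers, zap)
    return surfers


# Hypothetical example graph (not the 10-node graph from the slides).
links = {1: [2], 2: [1, 3], 3: [1], 4: [3]}
start = {v: 100.0 for v in links}

# Node 1 is linked from node 3 (one out-link) and node 2 (two out-links),
# so it receives 85 + 85/2 + 15 surfers, the same pattern as on slide 6.
after_one = tick(links, start)

ranks = pagerank(links)
ranking = sorted(links, key=ranks.get, reverse=True)
```

Here `after_one` holds the surfer counts after one tick and `ranking` lists the nodes by decreasing PageRank.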

  10. Link Building is an Important Aspect of SEO

  11. Contribution/Link Building • The Computational Complexity of Link Building (COCOON '08), Olsen • Maximizing PageRank with New Backlinks (submitted), Olsen • MILP for Link Building (in preparation), Olsen, Viglas

  12. The Link Building Problem. Formal Definition • LINK BUILDING • Instance: G(V, E), t ∈ V, k ∈ Z⁺ • Solution: S ⊆ V \ {t} with |S| ≤ k maximizing π_t after adding S × {t} to E
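
To make the definition concrete, here is a minimal brute-force sketch that tries every set of at most k new backlinks to t and keeps the one giving t the largest PageRank. It is exponential in k and only meant for tiny graphs; the function names, the example graph and the uniform treatment of pages without out-links are all assumptions made for illustration, not part of the talk.

```python
from itertools import combinations


def pagerank(links, zap=0.15, ticks=100):
    """Power iteration for the random-surfer model."""
    nodes = list(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(ticks):
        nxt = {v: zap / n for v in nodes}
        for v in nodes:
            # Assumption: pages without out-links spread rank uniformly.
            outs = links[v] or nodes
            share = (1 - zap) * rank[v] / len(outs)
            for w in outs:
                nxt[w] += share
        rank = nxt
    return rank


def best_backlinks(links, t, k):
    """Try every S with at most k nodes and add the links S x {t}."""
    others = [v for v in links if v != t]
    best_set, best_rank = (), pagerank(links)[t]
    for size in range(1, k + 1):
        for S in combinations(others, size):
            trial = {v: set(outs) for v, outs in links.items()}
            for s in S:
                trial[s].add(t)
            r = pagerank(trial)[t]
            if r > best_rank + 1e-12:
                best_set, best_rank = S, r
    return set(best_set), best_rank


# Hypothetical five-node instance with target t = 2 and k = 1.
links = {1: {2}, 2: {3}, 3: {1}, 4: {1}, 5: {4}}
baseline = pagerank(links)[2]
chosen, rank_after = best_backlinks(links, t=2, k=1)
```

The exhaustive search is the point of the sketch: the hardness results later in the talk suggest that substantially better exact algorithms are unlikely in general.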

  13. Link Building is not Trivial [Figure: an eight-node example graph shown three times, each copy annotated with PageRank values.]

  14. PageRank Topology Theorem *): stated in terms of the expected number of visits to p for a random surfer starting at u prior to the first zapping event. [Figure: nodes i, j and 1, illustrating the increase in PageRank.]

  15. k-REGULAR INDEPENDENT SET ≤_FPT LINK BUILDING • Does the graph contain an independent set of size k? • Can we turn this question into a Link Building problem? [Figure: nodes i and j.]

  16. k-REGULAR INDEPENDENT SET ≤_FPT LINK BUILDING [Figure: construction with nodes i, j, x, y and 1; the optimal solution is marked OPT.] Basic idea: make z_ij relatively big

  17. k-REGULAR INDEPENDENT SET ≤_FPT LINK BUILDING • LINK BUILDING is W[1]-hard *): LINK BUILDING solvable in time f(k) · n^c ⇒ k-REGULAR INDEPENDENT SET solvable in time f(k) · n^c ⇒ W[1] = FPT • Another result: FPTAS for LINK BUILDING ⇒ NP = P [Figure: construction with nodes i, j, x, y and 1; the optimal solution is marked OPT.] Basic idea: make z_ij relatively big

  18. Upper Bound: k = 1 fixed [Figure: the eight-node example graph shown twice with PageRank values, with and without the dashed link.] The dashed link can be found in time corresponding to O(1) PageRank computations with a randomized scheme *).

  19. Upper Bound: Mixed Integer Linear Programming Approach *) • Each node i has a price for a link from i • Compute the cheapest set of new incoming links that would make node 5 rank highest [Figure: the eight-node example graph annotated with PageRank values; the highest value shown is 0.200.]
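
The optimization problem on this slide can be illustrated without a MILP solver by plain exhaustive search over all sets of purchasable links. The sketch below is not the MILP formulation from the talk; it only shows what "cheapest set of new incoming links that makes a node rank highest" means. The graph, the `prices` table and all function names are hypothetical.

```python
from itertools import combinations


def pagerank(links, zap=0.15, ticks=100):
    """Power iteration for the random-surfer model."""
    nodes = list(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(ticks):
        nxt = {v: zap / n for v in nodes}
        for v in nodes:
            # Assumption: pages without out-links spread rank uniformly.
            outs = links[v] or nodes
            share = (1 - zap) * rank[v] / len(outs)
            for w in outs:
                nxt[w] += share
        rank = nxt
    return rank


def ranks_highest(rank, t):
    """True if t has strictly larger PageRank than every other node."""
    return all(rank[t] > r for v, r in rank.items() if v != t)


def with_links(links, sources, t):
    """Copy of the graph with a new link s -> t for every s in sources."""
    trial = {v: set(outs) for v, outs in links.items()}
    for s in sources:
        trial[s].add(t)
    return trial


def cheapest_backlinks(links, t, prices):
    """Exhaustive search; returns (set of sources, cost) or None."""
    sellers = [v for v in prices if v != t and t not in links[v]]
    best = None
    for size in range(len(sellers) + 1):
        for S in combinations(sellers, size):
            if ranks_highest(pagerank(with_links(links, S, t)), t):
                cost = sum(prices[s] for s in S)
                if best is None or cost < best[1]:
                    best = (set(S), cost)
    return best


# Hypothetical instance: node 5 wants to rank highest.
links = {1: {2}, 2: {1}, 3: {1}, 4: {2}, 5: {3, 4}}
prices = {1: 5.0, 2: 4.0, 3: 1.0, 4: 1.0}
result = cheapest_backlinks(links, 5, prices)
```

The search visits 2^n subsets, which is why a MILP formulation is attractive for anything beyond toy instances.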

  20. A Quiz: Which of the two situations would be optimal for Martin?

  21. Contribution/Communities in Networks • Communities in Large Networks: Identification and Ranking (WAW '06), Olsen

  22. Communities in Networks. Dolphins in Doubtful Sound [Newman, Girvan '04] [Figure: the dolphin network.]

  23. What is a Community? • Informally: a community C is a set of nodes with relatively many links between them • Assumption/Observation: a CS site has relatively many CS links! • Formal definition based on the assumption *): ∀v ∉ C, ∀u ∈ C: w_vC ≤ w_uC

  24. A Greedy Approach for Detecting Members of a Community *) Repeat until C is a community: • Find v ∉ C with maximum attention to C • C ← C ∪ {v} • Update attentions. Use two priority queues holding the elements in C and V \ C. [Figure: 1) old C, 2) new C.]
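
The greedy loop above can be sketched as follows. Two things are assumptions made for this illustration: the attention of v to C is taken to be the fraction of v's out-links that point into C, and C counts as a community when no outsider pays more attention to C than some member does. For brevity the sketch recomputes attentions in every round instead of maintaining the two priority queues mentioned on the slide; all names and the example graph are hypothetical.

```python
def attention(links, v, C):
    """Assumed definition: fraction of v's out-links pointing into C."""
    outs = links[v]
    if not outs:
        return 0.0
    return sum(1 for w in outs if w in C) / len(outs)


def is_community(links, C):
    """No outsider pays more attention to C than any member of C."""
    outside = [v for v in links if v not in C]
    if not outside:
        return True
    lowest_member = min(attention(links, u, C) for u in C)
    highest_outsider = max(attention(links, v, C) for v in outside)
    return lowest_member >= highest_outsider


def greedy_community(links, representatives):
    """Grow C from the representatives until it is a community."""
    C = set(representatives)
    while not is_community(links, C):
        outside = [v for v in links if v not in C]
        C.add(max(outside, key=lambda v: attention(links, v, C)))
    return C


# Hypothetical graph: a tight triangle a, b, c and two loosely attached nodes.
links = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "x": {"y", "c"},
    "y": {"x"},
}
community = greedy_community(links, {"a"})
```

Starting from the single representative `a`, the loop adds `b` and `c` and then stops, because every member sends all of its links into the set while the best outsider sends only half.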

  25. An Experiment. A Danish CS Community • Crawl of the dk-domain with 180,468 sites in total • Representatives = 4 CS sites • CS community with 556 sites • Minimum attention among members: 15.8% • Maximum attention among non-members: 15.4% Ranking: • www.daimi.au.dk (CS U Aarhus) • www.diku.dk (CS U Copenhagen) • www.itu.dk (ITU Copenhagen) • www.cs.auc.dk (CS U Aalborg) • www.brics.dk (CS PhD School) • www.imm.dtu.dk (Informatics/Mathematical Modeling DTU Copenhagen) … • www.imada.sdu.dk (CS/Mathematics U Southern Denmark)

  26. Other Results • Computing nontrivial communities by the definition given is NP-hard • A simple model for the evolution of communities is presented. These communities are probably obeying the definition for large n if the out-degree of the nodes is (log n).

  27. Contribution/Hedonic Games • Nash Stability in Additively Separable Hedonic Games Is NP-Hard (CiE '07), Olsen • Extended version: Nash Stability in Additively Separable Hedonic Games and Community Structures (Theory of Computing Systems '09), Olsen

  28. An Additively Separable Hedonic Game Two buffaloes b1 and b2 that hate each other. They are only thirsty if they have a parasite on their back in which case they have to drink 9 l/h. Two gigantic parasites p1 and p2. They only want to sit on b1 and b2 respectively. Five waterholes w1, …,w5 with capacities 1, 2, 3, 4 and 8 l/h respectively.

  29. An Additively Separable Hedonic Game. One Nash equilibrium for the game: [figure]. PARTITION ≤ NE in ASHG; NE in ASHG is NP-complete (NPC) *)
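
The link to PARTITION can be seen in the numbers of the example game: the waterhole capacities 1, 2, 3, 4 and 8 l/h sum to 18 l/h, and each buffalo carrying a parasite needs 9 l/h. Assuming an equilibrium gives each buffalo a group of waterholes supplying exactly 9 l/h (this reading of the example is an assumption), finding it amounts to splitting the capacities into two groups of equal sum. The brute-force sketch below uses a hypothetical function name and is only meant for such tiny instances.

```python
from itertools import combinations


def partition(capacities):
    """Split the list into two groups of equal sum, or return None."""
    total = sum(capacities)
    if total % 2:
        return None
    indices = range(len(capacities))
    for size in range(len(capacities) + 1):
        for group in combinations(indices, size):
            if sum(capacities[i] for i in group) == total // 2:
                rest = [i for i in indices if i not in group]
                return ([capacities[i] for i in group],
                        [capacities[i] for i in rest])
    return None


# Waterhole capacities in l/h from the example game.
waterholes = [1, 2, 3, 4, 8]
split = partition(waterholes)
```

For the capacities of the example, both groups in `split` sum to 9 l/h.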

  30. Community Structures in Networks • Put a 1 on each connection between two dolphins. The community structure is an NE! • NE ⇒ community structure? • NEs are NP-hard to compute even with symmetric and positive payoffs *)

  31. Contribution/Simple Games • On the Complexity of Problems on Simple Games (submitted), Freixas, Molinero, Olsen, Serna

  32. Open Problems/Future Work • In the thesis we show LINK BUILDING ∈ APX. Is there a PTAS for LINK BUILDING? • Surgical Link Building: • Isolate the community C • Model all pages in V \ C as one page • Use MILP • Use information on the distribution of PageRank • Does the stuff presented really work? • Thank You!

  33. Link Building. A Real World Example Dear X We are trying to get more links to our website to help improve its rating on the search engines. We were wondering if you could put a link to our site … on your webpage or blog. If you have a website or a Blog and put a link to our page on it then to say thank you for each month it is up, I will give you … Source: An e-mail to a colleague X

  34. Link Building is not Trivial. 2nd Example [Figure: node 1 with green and blue nodes.] Assumption: obtaining a link from one green node is slightly better for node 1 than obtaining a link from one blue node. Now node 1 can pick three incoming links for free. What should node 1 choose?

  35. No FPTAS for LINK BUILDING if NP ≠ P *) [Figure: construction with nodes i, j, x, y and 1; the optimal solution is marked OPT.]

  36. Power Law

  37. Fixed Parameter Tractability: FPT and W[1] • FPT (solvable in time f(k) · n^c): k-VERTEX COVER • Complete for W[1]: k-INDEPENDENT SET, k-REGULAR INDEPENDENT SET • LINK BUILDING is W[1]-hard *)

  39. Upper Bound: Mixed Integer Linear Programming Approach *) [Figure: the eight-node example graph shown twice with PageRank values, before and after the modification.] The dashed links show the cheapest modification that will bring node 5 to the top of the ranking, computed using a MILP approach. Alternatively, we could go for the maximum improvement in the ranking for a given budget.