Dive into the field of Theoretical Computer Science and explore the mathematical tools used to understand complexities of the Internet. Discover strategies, game theory, and mechanisms vital in solving algorithmic problems online.
Algorithmic Problems in the Internet Christos H. Papadimitriou www.cs.berkeley.edu/~christos
Goals of TCS (1950-2000): Develop a productive mathematical understanding of the capabilities and limitations of the von Neumann computer and its software (the dominant and most novel computational artifacts of that time); Mathematical tools: combinatorics, logic What should the goals of TCS be today? (and what math tools will be handy?) Iowa State, April 2003
The Internet • huge, growing, open, emergent, mysterious • built, operated and used by a multitude of diverse economic interests • as information repository: open, huge, available, unstructured, critical • foundational understanding urgently needed
Today… • Games and mechanism design • Getting lost in the web • The Internet’s heavy tail
Games, games… [Figure: payoff matrix whose rows and columns are the two players’ strategies; each cell lists the payoffs, e.g. 3,-2] (NB: also, many players)
e.g. matching pennies, prisoner’s dilemma, chicken [Figures: payoff matrices for the three games]
Nash equilibrium • Definition: double best response (problem: may not exist) • randomized Nash equilibrium Theorem [Nash 1951]: Always exists. • Problem: there are usually many . . .
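The "double best response" definition can be checked by brute force on small games. The following is a minimal sketch, not taken from the slides: the payoff numbers are assumed textbook values for matching pennies, prisoner's dilemma and chicken, and `pure_nash_equilibria` is a hypothetical helper name. It illustrates both problems named above: matching pennies has no pure equilibrium, while chicken has more than one.

```python
from itertools import product

def pure_nash_equilibria(payoffs):
    """Return all (row, col) cells where each strategy is a best response to the other.

    payoffs[r][c] is a pair (row player's payoff, column player's payoff).
    """
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        # Row player cannot gain by switching rows while the column stays fixed.
        row_best = all(payoffs[r][c][0] >= payoffs[r2][c][0] for r2 in range(n_rows))
        # Column player cannot gain by switching columns while the row stays fixed.
        col_best = all(payoffs[r][c][1] >= payoffs[r][c2][1] for c2 in range(n_cols))
        if row_best and col_best:
            equilibria.append((r, c))
    return equilibria

# Assumed textbook payoffs (illustrative, not from the slides).
matching_pennies = [[(1, -1), (-1, 1)],
                    [(-1, 1), (1, -1)]]
# Strategy 0 = cooperate, 1 = defect.
prisoners_dilemma = [[(-1, -1), (-3, 0)],
                     [(0, -3), (-2, -2)]]
# Strategy 0 = swerve, 1 = drive straight.
chicken = [[(0, 0), (-1, 1)],
           [(1, -1), (-10, -10)]]

for name, game in [("matching pennies", matching_pennies),
                   ("prisoner's dilemma", prisoners_dilemma),
                   ("chicken", chicken)]:
    print(name, pure_nash_equilibria(game))
```

Under these assumed payoffs, matching pennies only has a randomized equilibrium (each player mixes 50/50), which the brute-force check over pure strategies cannot find.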
The price of anarchy = (cost of worst Nash equilibrium) / (“socially optimum” cost) [Koutsoupias and P, 1998] • in network routing = 2 [Roughgarden and Tardos, 2000, Roughgarden 2002]
mechanism design (or inverse game theory) • agents have utilities, but these utilities are known only to them • game designer prefers certain outcomes depending on players’ utilities • designed game (mechanism) has designer’s goals as dominant strategies
e.g., Vickrey auction • sealed-highest-bid auction encourages gaming and speculation • Vickrey auction: Highest bidder wins, pays second-highest bid Theorem: Vickrey auction is a truthful mechanism. (Theorem: It maximizes social benefit and auctioneer expected revenue.)
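The second-price rule and the truthfulness claim can be illustrated with a small sketch. The bidder names, values and the helper names `vickrey_auction` and `utility` are assumptions made for the example. The loop checks that, against fixed competing bids, no alternative bid gives bidder A more utility than bidding the true value.

```python
def vickrey_auction(bids):
    """bids: dict bidder -> bid. Highest bidder wins and pays the second-highest bid."""
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(bids, key=bids.get, reverse=True)
    winner, runner_up = ranked[0], ranked[1]
    return winner, bids[runner_up]

def utility(bidder, value, bids):
    """Winner's utility is value minus price paid; losers get zero."""
    winner, price = vickrey_auction(bids)
    return value - price if winner == bidder else 0

# Hypothetical example: A values the item at 10, the others bid 7 and 4.
true_value = 10
others = {"B": 7, "C": 4}
truthful = utility("A", true_value, {"A": true_value, **others})

# No misreport in this range beats truthful bidding.
best_deviation = max(utility("A", true_value, {"A": b, **others}) for b in range(0, 21))
print(truthful, best_deviation)
```

The price A pays never depends on A's own bid, only on whether A wins, which is the intuition behind truthfulness.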
Vickrey shortest paths [Figure: network from s to t with declared edge costs 3, 6, 5, 4, 6, 10, 3, 11] pay each edge e on the chosen path its declared cost c(e), plus a bonus equal to $\mathrm{dist}(s,t)|_{c(e)=\infty} - \mathrm{dist}(s,t)$
Problem: [Figure: network between s and t with five edges of cost 1 and one edge of cost 10]
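The overpayment problem can be made concrete with a short sketch of Vickrey payments on shortest paths. The topology is an assumption reconstructed from the edge labels: a path of five unit-cost edges from s to t, plus one direct edge of cost 10. The function names and node names are hypothetical. Each edge on the shortest path is paid its declared cost plus the increase in dist(s,t) when that edge is removed.

```python
import heapq

def shortest_path(edges, s, t, banned=None):
    """Dijkstra on an undirected graph. edges: dict edge_id -> (u, v, cost).
    Returns (distance, edge ids on one shortest path), ignoring edge `banned`."""
    adj = {}
    for eid, (u, v, cost) in edges.items():
        if eid == banned:
            continue
        adj.setdefault(u, []).append((v, cost, eid))
        adj.setdefault(v, []).append((u, cost, eid))
    dist, prev = {s: 0}, {}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost, eid in adj.get(u, []):
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                prev[v] = (u, eid)
                heapq.heappush(heap, (d + cost, v))
    if t not in dist:
        return float("inf"), []
    path, node = [], t
    while node != s:
        node, eid = prev[node]
        path.append(eid)
    return dist[t], path[::-1]

def vickrey_payments(edges, s, t):
    """Pay each edge on the shortest path c(e) + dist(s,t | e removed) - dist(s,t)."""
    base, path = shortest_path(edges, s, t)
    payments = {}
    for eid in path:
        without_e, _ = shortest_path(edges, s, t, banned=eid)
        payments[eid] = edges[eid][2] + (without_e - base)
    return payments

# Assumed topology: five unit-cost edges in series, plus a direct edge of cost 10.
edges = {
    "e1": ("s", "a", 1), "e2": ("a", "b", 1), "e3": ("b", "c", 1),
    "e4": ("c", "d", 1), "e5": ("d", "t", 1), "direct": ("s", "t", 10),
}
payments = vickrey_payments(edges, "s", "t")
print(payments, sum(payments.values()))
```

In this assumed instance the true cost of the path is 5, yet every edge on it is individually pivotal and collects a bonus of 5, so the total payment is six times the cost of the path.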
But… • …in the Internet Vickrey overcharge would be only about 30% on the average [FPSS 2002] • Could this be the manifestation of rational behavior at network creation? • [FPSS 2002]: Vickrey charges • Depend on origin and destination • Can be computed on top of BGP
But… (cont) • [with Mihail and Saberi, 2003] • They are small in expectation in random graphs. • (Also: Why traffic grows moderately as the Internet grows…)
The web as a graph (cf. [Google 98], [Kleinberg 98]) • how do you sample the web? [Bar-Yossef, Berg, Chien, Fakcharoenphol, Weitz, VLDB 2000] • e.g.: 42% of web documents are in HTML. How do you find that? • What is a “random” web document?
Idea: random walk [Figure: graph with documents as nodes and hyperlinks as edges] Problems: 1. asymmetric 2. uneven degree 3. 2nd eigenvalue? $\lambda = 0.99999$
The web walker: results • mixing time is ~ $\log N/(1-\lambda)$ • WW mixing time: 3,000,000 • actual WW mixing time: 100 • .com 49%, .jp 9%, .edu 7%, .cn 0.8%
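The 3,000,000 figure is consistent with plugging a second eigenvalue of 0.99999 into the log N/(1-λ) estimate. The sketch below redoes that arithmetic; the web size N = 10^9 and the base-2 logarithm are both assumptions, since the slides state neither, and `mixing_time_bound` is a hypothetical helper name.

```python
import math

def mixing_time_bound(n_docs, second_eigenvalue):
    """Estimate ~ log N / (1 - lambda). The base-2 logarithm is an assumption."""
    return math.log2(n_docs) / (1.0 - second_eigenvalue)

# N = 10^9 documents is an assumed web size, not a figure from the slides.
bound = mixing_time_bound(1e9, 0.99999)
print(round(bound))
```

Under these assumptions the estimate comes out at about 3 million steps, which is what makes the observed mixing time of roughly 100 steps striking.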
Q: Is the web a random graph? • Many $K_{3,3}$’s (“communities”) • Indegrees/outdegrees obey “power laws” • Model [Kumar et al, FOCS 2000]: copying
Also the Internet • [Faloutsos³ 1999] the degrees of the Internet are power law distributed • Both autonomous systems graph and router graph • Eigenvalues: ditto!??! • Model?
The world according to Zipf • Power laws, Zipf’s law, heavy tails,… • i-th largest is ~ $i^{-a}$ (cities, words: a = 1, “Zipf’s Law”) • Equivalently: prob[greater than x] ~ $x^{-b}$ • (compare with law of large numbers) • “the signature of human activity”
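The equivalence between the rank form and the tail form can be checked numerically: if the i-th largest item has size i^(-a), then the number of items larger than x is about x^(-1/a), so the tail exponent is b = 1/a. The sketch below uses an assumed exponent a = 2 and an idealized, noise-free size list; `tail_count` is a hypothetical helper name.

```python
import math

def tail_count(sizes, x):
    """Number of items strictly greater than x."""
    return sum(1 for s in sizes if s > x)

a = 2.0          # assumed rank exponent for the illustration
n = 100_000
sizes = [i ** (-a) for i in range(1, n + 1)]   # i-th largest item has size i^(-a)

# Estimate the tail exponent b from the slope between two thresholds on a log-log scale.
x1, x2 = 1e-4, 1e-6
c1, c2 = tail_count(sizes, x1), tail_count(sizes, x2)
b_est = -math.log(c2 / c1) / math.log(x2 / x1)
print(b_est, 1 / a)
```

With a = 1 (cities, words) the same relation gives b = 1.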
Models • Size-independent growth (“the rich get richer,” or random walk in log paper) • Growing number of growing cities • In the web: copying links [Kumar et al, 2000] • Carlson and Doyle 1999: Highly optimized tolerance (HOT)
Our model [with Fabrikant and Koutsoupias, 2002]: arriving node i links to the earlier node j attaining $\min_{j<i}\,[\alpha \cdot d_{ij} + \mathrm{hop}_j]$
Theorem: • if $\alpha < \text{const}$, then graph is a star: degree = n − 1 • if $\alpha > \sqrt{n}$, then there is exponential concentration of degrees: prob(degree > x) < exp(−ax) • otherwise, if $\text{const} < \alpha < \sqrt{n}$, heavy tail: prob(degree > x) > $x^{-b}$
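The growth rule behind the theorem can be simulated in a few lines. This is a sketch under stated assumptions: nodes arrive at uniformly random points of the unit square, d_ij is Euclidean distance, hop_j is the number of hops from j to the first node, and alpha is the tradeoff weight. The function name `fkp_tree`, the seed and the parameter values are hypothetical choices for illustration.

```python
import math
import random

def fkp_tree(n, alpha, seed=0):
    """Grow a tree: node i attaches to the earlier node j minimizing alpha*d_ij + hop_j.
    Returns the list of node degrees."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    hops = [0]            # hop distance of each node to node 0
    degree = [0] * n
    for i in range(1, n):
        def cost(j):
            return alpha * math.dist(points[i], points[j]) + hops[j]
        parent = min(range(i), key=cost)
        hops.append(hops[parent] + 1)
        degree[parent] += 1
        degree[i] += 1
    return degree

n = 200
star = fkp_tree(n, alpha=0.1)       # small alpha: hop count dominates
local = fkp_tree(n, alpha=1000.0)   # large alpha: distance dominates
print(max(star), max(local))
```

With a small alpha every arriving node prefers node 0, because any other choice costs at least one extra hop while saving less than alpha times the square's diameter in distance, so the tree is a star. With a very large alpha each node simply links to a nearby earlier node and no large hub forms. The heavy-tailed regime lies between these two extremes.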
Heuristically optimized tradeoffs • Also: file sizes (trade-off between communication costs and file overhead) • Power law distributions seem to come from tradeoffs between conflicting objectives (a signature of human activity?) • cf HOT, [Mandelbrot 1954] • Other examples? • General theorem?
PS: eigenvalues Model: Edge [i,j] has prob. ~ $d_i d_j$ Theorem [with Mihail, 2002]: If the $d_i$’s obey a power law, then the $n^b$ largest eigenvalues are almost surely very close to $\sqrt{d_1}, \sqrt{d_2}, \sqrt{d_3}, \ldots$ (NB: The eigenvalue exponent observed in Faloutsos³ is about ½ of the degree exponent) Corollary: Spectral methods are of dubious value in the presence of large features