Page 1: Algorithmic Problems            in the Internet

Algorithmic Problems in the Internet

Christos H. Papadimitriou
www.cs.berkeley.edu/~christos

Page 2: Algorithmic Problems            in the Internet

Iowa State, April 2003 2

Goals of TCS (1950-2000): Develop a productive mathematical understanding of the capabilities and limitations of the von Neumann computer and its software (the dominant and most novel computational artifacts of that time).

Mathematical tools: combinatorics, logic

What should the goals of TCS be today? (And what math tools will be handy?)

Page 4: Algorithmic Problems            in the Internet

The Internet

• huge, growing, open, emergent, mysterious
• built, operated and used by a multitude of diverse economic interests
• as information repository: open, huge, available, unstructured, critical
• foundational understanding urgently needed

Page 5: Algorithmic Problems            in the Internet

Today…

• Games and mechanism design
• Getting lost in the web
• The Internet’s heavy tail

Page 6: Algorithmic Problems            in the Internet

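Games, games…

[Figure: a payoff matrix. The strategies of one player label the rows and those of the other label the columns; each cell lists the payoffs, e.g. 3,-2.]

(NB: also, many players)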

Page 7: Algorithmic Problems            in the Internet

e.g.

matching pennies:
     1,-1   -1,1
    -1,1     1,-1

prisoner’s dilemma:
     3,3     0,4
     4,0     1,1

chicken:
     0,0     0,1
     1,0    -1,-1

Page 8: Algorithmic Problems            in the Internet

Nash equilibrium

• Definition: double best response (problem: may not exist)

• randomized Nash equilibrium
  Theorem [Nash 1952]: Always exists.

• Problem: there are usually many...
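The "double best response" definition can be checked mechanically on the three example games. The following is a minimal sketch, assuming each game is stored as a nested list of (row payoff, column payoff) pairs; the function name `pure_nash_equilibria` is illustrative and does not come from the slides.

```python
def pure_nash_equilibria(game):
    """Return all (row, col) cells where each strategy is a best response to the other."""
    rows, cols = len(game), len(game[0])
    equilibria = []
    for r in range(rows):
        for c in range(cols):
            row_payoff, col_payoff = game[r][c]
            # Row player cannot gain by switching rows while the column stays fixed.
            row_ok = all(game[r2][c][0] <= row_payoff for r2 in range(rows))
            # Column player cannot gain by switching columns while the row stays fixed.
            col_ok = all(game[r][c2][1] <= col_payoff for c2 in range(cols))
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria


matching_pennies = [[(1, -1), (-1, 1)], [(-1, 1), (1, -1)]]
prisoners_dilemma = [[(3, 3), (0, 4)], [(4, 0), (1, 1)]]
chicken = [[(0, 0), (0, 1)], [(1, 0), (-1, -1)]]

print(pure_nash_equilibria(matching_pennies))   # []
print(pure_nash_equilibria(prisoners_dilemma))  # [(1, 1)]
print(pure_nash_equilibria(chicken))            # [(0, 1), (1, 0)]
```

Matching pennies has no pure equilibrium, which is the existence problem that randomized strategies address; chicken has two, which illustrates that equilibria need not be unique.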

Page 9: Algorithmic Problems            in the Internet

The price of anarchy

$$\text{price of anarchy} = \frac{\text{cost of worst Nash equilibrium}}{\text{``socially optimum'' cost}}$$ [Koutsoupias and P, 1998]

in network routing: = 2 [Roughgarden and Tardos, 2000; Roughgarden 2002]
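As an illustration of the ratio, consider a routing instance that is not in the slides: a two-link network (often called Pigou's example) in which one unit of traffic chooses between a link whose latency equals its load x and a link with constant latency 1. The sketch below compares the Nash flow with the optimum found by a grid search; the instance, the grid resolution, and the function names are all assumptions made for this example.

```python
def total_cost(x):
    """Total latency when a fraction x uses the load-dependent link (latency x)
    and the remaining 1 - x uses the constant-latency link (latency 1)."""
    return x * x + (1.0 - x) * 1.0


# Nash flow: the load-dependent link never costs more than 1, so no user
# gains by moving to the constant link and everyone ends up on it.
nash_flow = 1.0

# Socially optimal split, found by brute force over a grid of 1001 points.
optimal_flow = min((i / 1000 for i in range(1001)), key=total_cost)

ratio = total_cost(nash_flow) / total_cost(optimal_flow)
print(optimal_flow, ratio)
```

On this instance the optimum splits the traffic evenly and the ratio comes out at 4/3. A single instance only shows how the ratio is computed; it says nothing about the worst case over all networks.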

Page 10: Algorithmic Problems            in the Internet

mechanism design (or inverse game theory)

• agents have utilities – but these utilities are known only to them

• game designer prefers certain outcomes depending on players’ utilities

• designed game (mechanism) has designer’s goals as dominant strategies

Page 11: Algorithmic Problems            in the Internet

e.g., Vickrey auction

• sealed-highest-bid auction encourages gaming and speculation

• Vickrey auction: Highest bidder wins, pays second-highest bid

Theorem: Vickrey auction is a truthful mechanism.

(Theorem: It maximizes social benefit and auctioneer expected revenue.)
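The truthfulness claim can be spot-checked by brute force. The sketch below is an illustration under stated assumptions: bids are plain numbers, ties go to the earliest bidder, and the names `vickrey_auction`, `utility`, and `is_truthful_on_grid` are hypothetical helpers rather than anything defined in the slides.

```python
def vickrey_auction(bids):
    """Sealed-bid second-price auction: return (winner index, price paid)."""
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    # sorted() is stable, so ties are broken in favor of the earlier bidder.
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    return order[0], bids[order[1]]


def utility(value, my_bid, other_bids):
    """Utility of bidder 0, whose true value is `value`, when bidding `my_bid`."""
    winner, price = vickrey_auction([my_bid] + list(other_bids))
    return value - price if winner == 0 else 0


def is_truthful_on_grid(value, other_bids, grid):
    """True if no bid in `grid` does strictly better than bidding the true value."""
    honest = utility(value, value, other_bids)
    return all(utility(value, bid, other_bids) <= honest for bid in grid)


print(vickrey_auction([10, 7, 3]))  # (0, 7)
print(all(is_truthful_on_grid(v, [4, 6], range(11)) for v in range(11)))  # True
```

A grid check is not a proof, but it shows the mechanism at work: the winner's payment does not depend on the winner's own bid, so shading or inflating the bid can only change whether one wins, never the price.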

Page 12: Algorithmic Problems            in the Internet

Vickrey shortest paths

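[Figure: a graph with source s and destination t; each edge is labeled with its declared cost.]

Pay each edge e on the shortest path its declared cost c(e), plus a bonus equal to

$$\mathrm{dist}(s,t)\big|_{c(e)=\infty} - \mathrm{dist}(s,t)$$

that is, the length of the shortest s-t path that avoids e, minus the length of the shortest s-t path.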
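The payment rule can be sketched with two shortest-path computations per edge: one on the full graph and one with the edge removed (equivalent to setting its cost to infinity). This is a minimal sketch under assumptions: the graph is undirected, the three-edge example graph is hypothetical, and the shortest path is passed in explicitly rather than recovered from Dijkstra.

```python
import heapq
import math


def dist(edges, s, t, removed=None):
    """Dijkstra distance from s to t on an undirected graph {(u, v): cost}.
    The edge `removed`, if given, is treated as absent."""
    adj = {}
    for (u, v), cost in edges.items():
        if (u, v) == removed:
            continue
        adj.setdefault(u, []).append((v, cost))
        adj.setdefault(v, []).append((u, cost))
    best = {s: 0}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == t:
            return d
        if d > best.get(u, math.inf):
            continue  # stale heap entry
        for v, cost in adj.get(u, []):
            nd = d + cost
            if nd < best.get(v, math.inf):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf


def vickrey_payments(edges, path, s, t):
    """Payment to each edge on `path`: declared cost plus the bonus
    (distance without the edge) - (distance with it)."""
    base = dist(edges, s, t)
    return {e: edges[e] + dist(edges, s, t, removed=e) - base for e in path}


# Hypothetical example: a two-hop cheap path and one expensive direct edge.
edges = {("s", "a"): 1, ("a", "t"): 1, ("s", "t"): 10}
payments = vickrey_payments(edges, [("s", "a"), ("a", "t")], "s", "t")
print(payments)  # {('s', 'a'): 9, ('a', 't'): 9}
```

In this toy graph the path costs 2 but the two edges are paid 18 in total, because each one is individually indispensable for avoiding the expensive edge.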

Page 13: Algorithmic Problems            in the Internet

Problem:

[Figure: a graph from s to t with several edges of cost 1 and one edge of cost 10.]

Page 14: Algorithmic Problems            in the Internet

But…

• …in the Internet Vickrey overcharge would be only about 30% on the average [FPSS 2002]

• Could this be the manifestation of rational behavior at network creation?

• [FPSS 2002]: Vickrey charges
  – Depend on origin and destination
  – Can be computed on top of BGP

Page 15: Algorithmic Problems            in the Internet

But… (cont)

• [FPSS 2002]: Vickrey charges
  – Depend on origin and destination
  – Can be computed on top of BGP

• [with Mihail and Saberi, 2003]
  – They are small in expectation in random graphs.
  – (Also: Why traffic grows moderately as the Internet grows…)

Page 16: Algorithmic Problems            in the Internet

The web as a graph

cf: [Google 98], [Kleinberg 98]

• how do you sample the web? [Bar-Yossef, Berg, Chien, Fakcharoenphol, Weitz, VLDB 2000]

• e.g.: 42% of web documents are in html. How do you find that?

• What is a “random” web document?

Page 17: Algorithmic Problems            in the Internet

[Figure: the web graph; nodes are documents, edges are hyperlinks.]

Idea: random walk

Problems:

1. asymmetric
2. uneven degree
3. 2nd eigenvalue? $\lambda_2 = 0.99999$
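The "uneven degree" problem can be seen on a toy graph: a plain random walk on an undirected graph visits nodes in proportion to their degree, so it does not sample uniformly. One standard remedy, sketched below as an assumption rather than as the method of the cited paper, is to pad every node with self-loops until all nodes have the same degree. The four-node graph and the helper names are hypothetical.

```python
def walk_distribution(neighbors, steps, start):
    """Distribution of a random walk after `steps` steps, by repeated
    application of the transition rule (uniform choice among neighbor slots)."""
    p = {v: 0.0 for v in neighbors}
    p[start] = 1.0
    for _ in range(steps):
        q = {v: 0.0 for v in neighbors}
        for u, mass in p.items():
            for v in neighbors[u]:
                q[v] += mass / len(neighbors[u])
        p = q
    return p


def pad_with_self_loops(neighbors):
    """Give every node the same number of slots by adding self-loops.
    One extra slot beyond the maximum degree keeps the walk aperiodic."""
    target = max(len(ns) for ns in neighbors.values()) + 1
    return {u: list(ns) + [u] * (target - len(ns)) for u, ns in neighbors.items()}


# Hypothetical undirected graph: a triangle a-b-c with a pendant node d on c.
graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}

plain = walk_distribution(graph, 200, "a")
regular = walk_distribution(pad_with_self_loops(graph), 200, "a")
print(plain)    # proportional to degree: a, b near 0.25; c near 0.375; d near 0.125
print(regular)  # near 0.25 for every node
```

The padded walk is uniform in the limit, but the self-loops slow it down, which is why the second eigenvalue (and hence the mixing time) becomes the central question.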

Page 18: Algorithmic Problems            in the Internet

The web walker: results

• mixing time is ~ $\log N/(1-\lambda_2)$, where $\lambda_2$ is the 2nd eigenvalue
• WW mixing time: 3,000,000
• actual WW mixing time: 100

• .com 49%, .jp 9%, .edu 7%, .cn 0.8%
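The 3,000,000 figure is consistent with the log N/(1 - eigenvalue) estimate and the eigenvalue 0.99999 quoted earlier. The arithmetic below is a sketch under two assumptions that the slides do not state: a graph of about 10^9 documents and a base-2 logarithm.

```python
import math


def mixing_time_estimate(n_nodes, second_eigenvalue, log_base=2):
    """Rough mixing-time estimate: log(N) divided by the spectral gap."""
    return math.log(n_nodes, log_base) / (1.0 - second_eigenvalue)


# Assumed web size of 10**9 documents; the eigenvalue is the one on the slide.
estimate = mixing_time_estimate(10**9, 0.99999)
print(round(estimate))  # on the order of 3,000,000 steps
```

With a natural logarithm the estimate is closer to 2,000,000; either way the bound is several orders of magnitude above the observed value of 100.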

Page 19: Algorithmic Problems            in the Internet

Q: Is the web a random graph?

• Many $K_{3,3}$’s (“communities”)
• Indegrees/outdegrees obey “power laws”

• Model [Kumar et al, FOCS 2000]: copying

Page 20: Algorithmic Problems            in the Internet

Also the Internet

• [Faloutsos³ 1999] the degrees of the Internet are power law distributed

• Both autonomous systems graph and router graph

• Eigenvalues: ditto!??!
• Model?

Page 21: Algorithmic Problems            in the Internet

The world according to Zipf

• Power laws, Zipf’s law, heavy tails,…
• i-th largest is ~ $i^{-a}$ (cities, words: a = 1, “Zipf’s Law”)
• Equivalently: prob[greater than x] ~ $x^{-b}$
• (compare with law of large numbers)
• “the signature of human activity”
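The equivalence between the rank form and the tail form can be checked numerically. If the i-th largest item has size i^(-a), then the items larger than x are those with rank below x^(-1/a), so the tail exponent should come out as b = 1/a. The sketch below estimates b from two points on a log-log plot; the sample size, the two thresholds, and the function names are assumptions chosen for the example.

```python
import math


def rank_sizes(n, a):
    """Sizes following the rank law: the i-th largest item has size i**(-a)."""
    return [i ** (-a) for i in range(1, n + 1)]


def tail_fraction(sizes, x):
    """Empirical prob[size > x]."""
    return sum(1 for s in sizes if s > x) / len(sizes)


def tail_exponent(sizes, x_small, x_large):
    """Estimate b in prob[size > x] ~ x**(-b) as the negated log-log slope."""
    rise = math.log(tail_fraction(sizes, x_large)) - math.log(tail_fraction(sizes, x_small))
    run = math.log(x_large) - math.log(x_small)
    return -rise / run


b_zipf = tail_exponent(rank_sizes(100_000, 1.0), 0.0015, 0.015)
b_steep = tail_exponent(rank_sizes(100_000, 2.0), 1.5e-6, 1.5e-4)
print(b_zipf, b_steep)  # close to 1 and to 0.5 respectively
```

For Zipf's law proper (a = 1) the tail exponent is also 1, which is why the two statements are often used interchangeably.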

Page 22: Algorithmic Problems            in the Internet

Models

• Size-independent growth (“the rich get richer,” or random walk in log paper)
• Growing number of growing cities
• In the web: copying links [Kumar et al, 2000]
• Carlson and Doyle 1999: Highly optimized tolerance (HOT)

Page 23: Algorithmic Problems            in the Internet

Our model [with Fabrikant and Koutsoupias, 2002]:

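Arriving node $i$ links to the earlier node $j$ that attains

$$\min_{j < i} \left[ \alpha \, d_{ij} + \mathrm{hop}_j \right]$$

where $d_{ij}$ is the distance between $i$ and $j$, and $\mathrm{hop}_j$ is the number of hops from $j$ to the center.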

Page 24: Algorithmic Problems            in the Internet

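Theorem ($\alpha$ is the weight on the distance term $d_{ij}$ in the model):

• if $\alpha <$ const, then graph is a star: degree $= n - 1$
• if $\alpha > \sqrt{n}$, then there is exponential concentration of degrees: $\mathrm{prob}(\text{degree} > x) < \exp(-ax)$
• otherwise, if const $< \alpha < \sqrt{n}$, heavy tail: $\mathrm{prob}(\text{degree} > x) > x^{-b}$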
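The star regime of the theorem is easy to reproduce in simulation. The sketch below grows a tree by attaching each arriving node to the earlier node that minimizes a weighted sum of distance and hop count. Several details are assumptions of this sketch: points are drawn uniformly from the unit square, node 0 is the center, distances are Euclidean, and `alpha` is the name used here for the weight on distance.

```python
import math
import random


def grow_tree(n, alpha, seed=0):
    """Grow a tree on n random points; return the list of node degrees."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    hops = [0]          # hop count from each node to node 0
    degree = [0] * n
    for i in range(1, n):
        # Attach to the earlier node with the best distance/centrality tradeoff.
        j = min(
            range(i),
            key=lambda k: alpha * math.dist(points[i], points[k]) + hops[k],
        )
        hops.append(hops[j] + 1)
        degree[i] += 1
        degree[j] += 1
    return degree


star = grow_tree(50, alpha=0.5)
print(max(star))  # 49: every node attaches to node 0

tree = grow_tree(200, alpha=4.0)
print(sorted(tree, reverse=True)[:5])  # the few largest degrees
```

With `alpha` = 0.5 the weighted distance to node 0 is always below 1 (the unit square has diameter √2), while any other node costs at least one hop, so the star is forced. Intermediate values of `alpha` are where a few hubs with large degree emerge alongside many leaves.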

Page 25: Algorithmic Problems            in the Internet

Heuristically optimized tradeoffs

• Also: file sizes (trade-off between communication costs and file overhead)

• Power law distributions seem to come from tradeoffs between conflicting objectives (a signature of human activity?)

• cf HOT, [Mandelbrot 1954]
• Other examples?
• General theorem?

Page 26: Algorithmic Problems            in the Internet

PS: eigenvalues

Model: Edge [i,j] has prob. ~ $d_i d_j$

Theorem [with Mihail, 2002]: If the $d_i$’s obey a power law, then the $n^b$ largest eigenvalues are almost surely very close to $\sqrt{d_1}, \sqrt{d_2}, \sqrt{d_3}, \ldots$

(NB: The eigenvalue exponent observed in Faloutsos³ is about ½ of the degree exponent)

Corollary: Spectral methods are of dubious value in the presence of large features
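One way to build intuition for why the eigenvalue exponent is about half the degree exponent is the simplest "large feature" of all, a single high-degree node with its neighbors. A star with d leaves has largest adjacency eigenvalue √d. The sketch below checks this with power iteration on the squared adjacency matrix (squaring avoids the sign oscillation of bipartite graphs); it illustrates the intuition only and is not the proof from the cited work, and the helper names are hypothetical.

```python
import math


def star_adjacency(d):
    """Adjacency matrix of a star: node 0 is joined to d leaves."""
    n = d + 1
    adj = [[0] * n for _ in range(n)]
    for leaf in range(1, n):
        adj[0][leaf] = adj[leaf][0] = 1
    return adj


def matvec(matrix, x):
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


def largest_eigenvalue(adj, iterations=100):
    """Largest absolute eigenvalue of a symmetric matrix, via power
    iteration on its square followed by a square root."""
    x = [1.0] * len(adj)
    for _ in range(iterations):
        y = matvec(adj, matvec(adj, x))
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    y = matvec(adj, matvec(adj, x))
    # Rayleigh quotient of the squared matrix; x has unit norm here.
    return math.sqrt(sum(a * b for a, b in zip(x, y)))


for d in (4, 9, 100):
    print(d, largest_eigenvalue(star_adjacency(d)))  # close to 2, 3, 10
```

If the top of the spectrum merely echoes the largest degrees in this way, it carries little information beyond the degree sequence itself, which is the concern behind the corollary.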