Price of Total Anarchy


June 2008

Slides by Israel Shalom

Based on “Regret Minimization and the Price of Total Anarchy” by Avrim Blum, MohammadTaghi Hajiaghayi, Katrina Ligett and Aaron Roth

Agenda
- Preliminaries
  - Game Theory Basics
  - Regret Minimization
- Hotelling games
- Valid games
- Atomic congestion games
- Algorithmic efficiency

Games in Strategic Form
- The game has $k$ players
- Each player $i$ has his available pure strategies $A_i$
- $A = A_1 \times A_2 \times \dots \times A_k$ marks the strategy profiles
- Individual utility (payoff) functions $\alpha_i : A \to \mathbb{R}$

Games in Strategic Form – cont’d
Examples:

Rock, Paper, Scissors (entries: row player's payoff, column player's payoff)

              Rock      Paper     Scissors
  Rock        0, 0      -1, 1     1, -1
  Paper       1, -1     0, 0      -1, 1
  Scissors    -1, 1     1, -1     0, 0

Prisoner’s Dilemma (entries: row player's cost, column player's cost)

              Deny      Confess
  Deny        1, 1      5, 0
  Confess     0, 5      3, 3

Mixed Strategies
- Users can play “mixed strategies” as well: a probability distribution over $A_i$; we mark the set of these as $S_i$
- $S = S_1 \times S_2 \times \dots \times S_k$ marks the mixed strategy profiles
- The payoffs are now defined as the expected value of $\alpha_i$ over the randomness of the players
- Sometimes marked by $\bar{\alpha}_i$

Best Response and Nash Equilibria
- Lowercase letters will usually denote elements: $a_i \in A_i$, $a \in A$, $s_i \in S_i$, …
- We denote by $a_{-i}$ the selected strategies of the players other than $i$: $a_{-i} = (a_1, \dots, a_{i-1}, a_{i+1}, \dots, a_k)$, and $(a'_i, a_{-i}) = (a_1, \dots, a_{i-1}, a'_i, a_{i+1}, \dots, a_k)$
- A strategy $a_i \in A_i$ is a best response to $a_{-i}$ if for all $a'_i \in A_i$: $\alpha_i(a_i, a_{-i}) \ge \alpha_i(a'_i, a_{-i})$
- A strategy profile $a \in A$ is a Nash Equilibrium if for all $i$, $a_i$ is a best response to $a_{-i}$.
- A pure equilibrium may or may not exist, but every finite game has at least one mixed Nash Equilibrium.

Nash Equilibria
Examples:
- Rock, Paper, Scissors: mixed equilibrium ([1/3, 1/3, 1/3], [1/3, 1/3, 1/3])
- Prisoner’s Dilemma: pure equilibrium (Confess, Confess)
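The two equilibria above can be checked mechanically. The following is a minimal sketch, not part of the original slides: the function name `pure_nash_equilibria` and the dictionary encoding of the payoff tables are assumptions made for illustration. It enumerates pure profiles and keeps those where neither player gains by deviating alone, treating the Prisoner's Dilemma entries as costs.

```python
from itertools import product

def pure_nash_equilibria(payoffs, maximize=True):
    """Return the pure profiles of a 2-player game where no player gains by deviating alone.

    payoffs maps (row_action, col_action) -> (row_value, col_value).
    With maximize=False the values are treated as costs.
    """
    rows = sorted({a for a, _ in payoffs})
    cols = sorted({b for _, b in payoffs})
    sign = 1 if maximize else -1  # costs are handled as negated utilities
    found = []
    for a, b in product(rows, cols):
        u1, u2 = payoffs[(a, b)]
        row_ok = all(sign * u1 >= sign * payoffs[(x, b)][0] for x in rows)
        col_ok = all(sign * u2 >= sign * payoffs[(a, y)][1] for y in cols)
        if row_ok and col_ok:
            found.append((a, b))
    return found

# Prisoner's Dilemma table from the slides (entries are costs).
PD = {("Deny", "Deny"): (1, 1), ("Deny", "Confess"): (5, 0),
      ("Confess", "Deny"): (0, 5), ("Confess", "Confess"): (3, 3)}

# Rock, Paper, Scissors table from the slides (zero-sum payoffs).
MOVES = ["Rock", "Paper", "Scissors"]
ROW_PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
RPS = {(MOVES[r], MOVES[c]): (ROW_PAYOFF[r][c], -ROW_PAYOFF[r][c])
       for r in range(3) for c in range(3)}

pd_equilibria = pure_nash_equilibria(PD, maximize=False)
rps_equilibria = pure_nash_equilibria(RPS)

# Against a uniform opponent every pure strategy earns 0 in expectation
# (each row and each column sums to 0), so no deviation from uniform helps.
uniform_is_equilibrium = (all(sum(row) == 0 for row in ROW_PAYOFF)
                          and all(sum(col) == 0 for col in zip(*ROW_PAYOFF)))
```

Under these assumptions the only pure equilibrium of the Prisoner's Dilemma is (Confess, Confess), Rock, Paper, Scissors has no pure equilibrium, and the uniform mix passes the no-deviation check.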


Social Optimum
- Sometimes, we’ll define a social utility (welfare) function $\gamma : A \to \mathbb{R}$, similar to payoffs
- Choices that would make sense:
  - $\gamma(a) = \sum_{i=1}^{k} \alpha_i(a)$
  - $\gamma(a) = \min_i \alpha_i(a)$
- For mixed strategies, we’ll look for the expected value (analogous to payoff in mixed strategies)
- The socially optimum strategy profile (and OPT) are given by: $OPT = \max_{s \in S} \gamma(s)$

We are assuming a maximizing game throughout; the minimization case is analogous.

Price of Anarchy
- Let $N \subseteq S$ mark all the Nash Equilibria in the game
- The price of anarchy is defined as the ratio of the optimum to the worst NE:
  $POA = \max_{s \in N} \frac{OPT}{\gamma(s)}$

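Price of Anarchy
Prisoner’s Dilemma

              Deny      Confess
  Deny        1, 1      5, 0
  Confess     0, 5      3, 3

- OPT = 2, at (Deny, Deny)
- The only Nash Equilibrium is (Confess, Confess), with social cost 6
- Notice that the fraction is flipped (minimization game):
  $POA = \max_{s \in N} \frac{\gamma(s)}{OPT} = \frac{6}{2} = 3$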
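The arithmetic on this slide can be reproduced in a few lines. This is a small sketch under stated assumptions: the social cost is taken to be the sum of the two players' costs, and the equilibrium set is hard-coded from the slide rather than computed. The names `PD_COST` and `social_cost` are illustrative.

```python
from fractions import Fraction

# Prisoner's Dilemma costs from the slide: (row cost, column cost).
PD_COST = {("Deny", "Deny"): (1, 1), ("Deny", "Confess"): (5, 0),
           ("Confess", "Deny"): (0, 5), ("Confess", "Confess"): (3, 3)}

def social_cost(profile):
    """Sum social cost (assumed here): total cost over both players."""
    return sum(PD_COST[profile])

nash_profiles = [("Confess", "Confess")]  # taken from the slide, not recomputed

opt = min(social_cost(p) for p in PD_COST)  # best achievable social cost
worst_nash = max(social_cost(p) for p in nash_profiles)

# Minimization game: the ratio is worst equilibrium cost over optimum cost.
poa = Fraction(worst_nash, opt)
```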

Regret Minimization
- Let $a^1, a^2, \dots, a^T$ mark the strategy profiles in $T$ steps
- We define the regret of player $i$ in a maximization game:
  $\max_{a_i \in A_i} \frac{1}{T} \sum_{t=1}^{T} \alpha_i(a_i, a^t_{-i}) - \frac{1}{T} \sum_{t=1}^{T} \alpha_i(a^t)$
- Intuitively, this is “how much more player $i$ could have gained on average had he played a single strategy throughout the game”
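The regret definition above translates directly into code. The sketch below is illustrative only: `average_regret` and `rps_utility` are hypothetical helper names, and the play sequence (player 0 stubbornly playing Rock against Paper) is a made-up example using the Rock, Paper, Scissors payoffs from the earlier slide.

```python
def average_regret(utility, actions, plays, i):
    """Average regret of player i over a sequence of pure profiles.

    utility(i, profile) returns player i's payoff; plays is a list of tuples.
    """
    T = len(plays)
    realized = sum(utility(i, a) for a in plays) / T

    def fixed_payoff(a_i):
        # Payoff had player i played a_i in every step, others unchanged.
        return sum(utility(i, a[:i] + (a_i,) + a[i + 1:]) for a in plays) / T

    return max(fixed_payoff(a_i) for a_i in actions) - realized

BEATS = {"Rock": "Scissors", "Paper": "Rock", "Scissors": "Paper"}

def rps_utility(i, profile):
    """Two-player Rock, Paper, Scissors payoff for player i (0 or 1)."""
    me, other = profile[i], profile[1 - i]
    if me == other:
        return 0
    return 1 if BEATS[me] == other else -1

# Player 0 plays Rock four times against Paper: average payoff -1,
# while the fixed strategy Scissors would have earned +1 each time.
plays = [("Rock", "Paper")] * 4
regret = average_regret(rps_utility, list(BEATS), plays, i=0)
```

Here the best fixed strategy in hindsight is Scissors, so the average regret is 1 - (-1) = 2.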

Regret Minimization
- When a player $i$ uses a regret-minimizing algorithm, for any sequence $a^1, a^2, \dots, a^T$, we have the property
  $\frac{1}{T} E\left[\sum_{t=1}^{T} \alpha_i(a^t)\right] \ge \max_{a_i \in A_i} \frac{1}{T} \sum_{t=1}^{T} \alpha_i(a_i, a^t_{-i}) - R(T)$
- Where:
  - $R(T)$ vanishes as $T \to \infty$
  - $T_\epsilon$ marks the number of steps before $R(T) \le \epsilon$
  - The expectation is over the algorithm’s randomness
- In other words, the expected value of regret vanishes
- Notice that this is for maximizing games

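Regret Minimization
- This implies that for any sequence $s^1, s^2, \dots, s^T$, if player $i$ is regret-minimizing, then:
  $\frac{1}{T} E\left[\sum_{t=1}^{T} \alpha_i(s^t)\right] \ge \max_{a_i \in A_i} \frac{1}{T} \sum_{t=1}^{T} \alpha_i(a_i, s^t_{-i}) - R(T)$
- The price of total anarchy is defined as:
  $POTA = \max \frac{OPT}{\frac{1}{T} \sum_{t=1}^{T} \gamma(s^t)}$
- Where the max is taken over sequences $s^1, s^2, \dots, s^T$ that are play profiles with the regret-minimization property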

Regret and NE
- Notice that when playing a Nash Equilibrium, all players have zero regret: if there were a better “constant” response, a player could improve by moving to it
- Therefore, the price of total anarchy in any game is an upper bound on the price of anarchy
- (Figure: the set of Nash Equilibria drawn inside the larger set of regret-minimizing strategies.)

Advantages of Regret Minimization
- Computational
  - Nash Equilibria are hard (PPAD-hard) to calculate, even for small action spaces
  - There are efficient regret minimization algorithms for a polynomial number of actions
- Motivational
  - No particular reason for players to converge to a NE
  - There might be multiple equilibria, and agents may individually prefer different ones
  - Byzantine players’ actions are not taken into account in NE
  - Regret minimization considers only local information, which is much more practical

Agenda
- Preliminaries
- Hotelling games
  - Definition
  - POA/POTA
  - Generalization
- Valid games
- Atomic congestion games
- Algorithmic efficiency

Hotelling - Game Definition
- Souvenir stand owners in Paris:
  - There are $n$ tourists every day; they buy from whichever stand they find first
  - Each stand owner wishes to maximize his own sales
  - We want “fairness”: the social welfare function is the minimum, over the stand owners, of the sales made
- Formally:
  - We have an $n$-vertex graph $G = (V, E)$
  - Each seller locates himself at a vertex: $A_i = V$
  - Each day, a tourist in each vertex goes to the closest seller
  - If there is a “tie” between the sellers, they split the gains
  - Minimum utility: $\gamma(s^t) = \min_i \alpha_i(s^t)$
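The formal definition can be sketched as a payoff computation. This is an illustrative sketch, not the slides' own code: the adjacency-dictionary graph encoding and the names `bfs_distances` and `hotelling_payoffs` are assumptions, and tourists who cannot reach any seller are assumed to buy nothing.

```python
from collections import deque
from fractions import Fraction

def bfs_distances(graph, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def hotelling_payoffs(graph, positions):
    """Payoff per seller: each vertex's tourist is split among the closest sellers."""
    dists = [bfs_distances(graph, p) for p in positions]
    payoffs = [Fraction(0)] * len(positions)
    for v in graph:
        reachable = [d[v] for d in dists if v in d]
        if not reachable:
            continue  # assumption: an unreachable tourist buys nothing
        nearest = min(reachable)
        winners = [i for i, d in enumerate(dists) if d.get(v) == nearest]
        for i in winners:
            payoffs[i] += Fraction(1, len(winners))
    return payoffs

# A path on 5 vertices: 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

same_spot = hotelling_payoffs(path, [2, 2])    # both sellers on the middle vertex
off_center = hotelling_payoffs(path, [2, 0])   # second seller moves to an end
fair_welfare = min(off_center)                 # the "fairness" social utility
```

On a connected graph the payoffs always sum to the number of vertices, which is the constant-sum property the later slides rely on.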

Hotelling - Optimum Solution
- Notice that the sum of payoffs is always exactly $n$
- Therefore, the social optimum is achieved when all players have equal payoffs
- This can happen if all players play on the same vertex
- Therefore $OPT = n/k$

Hotelling – POA
- Theorem 3.1: The price of anarchy in the Hotelling game is $(2k-2)/k$
- Proof
  - We are to show that in any Nash Equilibrium $S$ all players gain at least $n/(2k-2)$; since $OPT = n/k$, this bounds the ratio by $(2k-2)/k$
  - Assume the contrary, that player $i$ gains less than that in $S$
  - Consider player $i$ “leaving” the game. The total payoff is still $n$, so the average payoff for the remaining players is now $n/(k-1)$
  - There must be at least one player $h$ gaining at least the average, playing the vertex $v_h$
  - Player $i$ can assure $n/(2k-2)$ by moving to $v_h$ and splitting with $h$
  - Contradiction to Nash Equilibrium

Theorem 3.1 – cont’d
- We are left with showing tightness
- Consider a game with $k-1$ stars
- $k-1$ players play at the centers of their own stars, and player $k$ plays uniformly over all the star centers
- This is a NE
- The randomizing player earns $n/(2k-2)$
- (Figure: $k-1$ stars, labeled 1, 2, 3, …, $k-1$.)

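Hotelling – POTA
- Let $o_i^t$ be the strategy of playing an arbitrary (uniformly chosen) strategy from the strategies in $s_{-i}^t$
- Define $\Delta_i^{t \to u} = \alpha_i(o_i^t, s_{-i}^u) - n/(2k-2)$
- Notice that $\Delta_i^{t \to t} \ge 0$, since when player $i$ is removed, the rest have average payoff of $n/(k-1)$
- Lemma 3.4: For all $i$, for all $1 \le t, u \le T$: $\Delta_i^{t \to u} + \Delta_i^{u \to t} \ge 0$. (Trivial for $t = u$)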

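Lemma 3.4 - Proof
- Consider an imaginary $(2k-2)$-player game
- Each player other than $i$ is replicated twice: once as a time-$t$ player and once as a time-$u$ player, with strategies $s_j^t$ and $s_j^u$
- The average payoff is $n/(2k-2)$
- If player $i$ replaces a random time-$t$ player, his expected payoff is that player's expected payoff
- If we further remove the other time-$t$ players, we only improve

The Imaginary Game, n=10, k=4
(Figure: the time-$t$ players and the time-$u$ players placed on the same graph.)

$E[\text{replacing a time-}t\text{ player in imaginary}] \le E[\text{replacing a time-}t\text{ player and removing the other time-}t\text{ players}] = \alpha_i(o_i^t, s_{-i}^u) = \frac{n}{2k-2} + \Delta_i^{t \to u}$

Lemma 3.4 – cont’d
- The same argument holds for replacing a time-$u$ player, which gives at most $\frac{n}{2k-2} + \Delta_i^{u \to t}$
- Putting both together:
  $\frac{n}{2k-2} = E[\text{replacing a player in imaginary}]$
  $= \frac{1}{2} E[\text{replacing a time-}t\text{ player in imaginary}] + \frac{1}{2} E[\text{replacing a time-}u\text{ player in imaginary}]$
  $\le \frac{1}{2}\left(\frac{n}{2k-2} + \Delta_i^{t \to u}\right) + \frac{1}{2}\left(\frac{n}{2k-2} + \Delta_i^{u \to t}\right)$
  $= \frac{n}{2k-2} + \frac{1}{2}\left(\Delta_i^{t \to u} + \Delta_i^{u \to t}\right)$
- Hence $\Delta_i^{t \to u} + \Delta_i^{u \to t} \ge 0$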

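Hotelling – POTA
- Theorem 3.2: Each regret-minimizing player has at least $n/(2k-2)$ payoff
- Proof
  - Provided a sequence of $T$ plays, select a random time $u$
  - The total expected payoff if we played $o_i^u$ (a random strategy from $s_{-i}^u$) throughout is:
    $G_u = \sum_{t=1}^{T} \alpha_i(o_i^u, s_{-i}^t) = \sum_{t=1}^{T} \left(\frac{n}{2k-2} + \Delta_i^{u \to t}\right)$
  - Averaging over different $u$, we reach:
    $\frac{1}{T} \sum_{u=1}^{T} G_u = \frac{1}{T} \sum_{u=1}^{T} \sum_{t=1}^{T} \left(\frac{n}{2k-2} + \Delta_i^{u \to t}\right) = \frac{Tn}{2k-2} + \frac{1}{T} \sum_{u=1}^{T} \sum_{t=1}^{T} \Delta_i^{u \to t}$

Hotelling – POTA
- We reached
  $\frac{1}{T} \sum_{u=1}^{T} G_u = \frac{Tn}{2k-2} + \frac{1}{T} \sum_{u=1}^{T} \sum_{t=1}^{T} \Delta_i^{u \to t}$
- The second term is non-negative due to Lemma 3.4 (pair each $(u, t)$ with $(t, u)$), so
  $\frac{1}{T} \sum_{u=1}^{T} G_u \ge \frac{Tn}{2k-2}$
- There is a value of $u$ that achieves at least the average: $G_u \ge Tn/(2k-2)$
- For that $u$, if player $i$ mixes between the strategies in $s_{-i}^u$, he’ll achieve an average payoff of at least $n/(2k-2)$
- A regret-minimizing player achieves this expected payoff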

Hotelling – POTA
- Corollary: The price of total anarchy in the Hotelling game is $(2k-2)/k$, matching the price of anarchy
- Notice that in the proof we haven’t made any assumptions about how the other players behave, so it holds even in the presence of Byzantine players making arbitrary (or adversarial) decisions!

Generalized Hotelling Game
- Notice that in the proof we have used only three features of the Hotelling game:
  - Constant sum: the sum of utilities is constant
  - Symmetric: the “names” of the stand owners don’t matter
  - Monotone: any player can “leave” the game and the sum does not change
- We call such games with the “fairness” social utility generalized Hotelling games.
- Theorem 3.6: In any $k$-player generalized Hotelling game, the price of total anarchy among regret-minimizing players is $(2k-2)/k$, even in the presence of arbitrarily many Byzantine players.

Non-Convergence
- Consider the game with:
  - Players $\{0, \dots, k-1\}$
  - $k-1$ $n$-vertex stars, with centers at $v_0, \dots, v_{k-2}$, and an isolated vertex $v_{k-1}$
- Consider the play $a_i^t = v_{(i+t) \bmod k}$
  - Each player’s average payoff is $(n(k-1)+1)/k$
  - No single vertex has expected payoff more than $n(k+1)/2k$
  - No regrets
- However, this is not Nash!
  - Players at the isolated vertex will deviate!
- (Figure: stars labeled 1, 2, 3, …, $k-1$ and an isolated vertex $k$.)
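The two payoff values in this example can be checked numerically. The sketch below is an illustration under explicit assumptions: components are disconnected, a seller alone at a star center collects the whole star ($n$), two sellers at the same vertex split its component evenly, and tourists in a component with no seller buy nothing. The function name `rotation_example` is hypothetical.

```python
from fractions import Fraction

def rotation_example(n, k):
    """Average payoff of the rotating schedule vs. the best fixed vertex for player 0."""
    # Vertices 0..k-2 are star centers (component worth n); vertex k-1 is isolated (worth 1).
    value = [n] * (k - 1) + [1]
    T = k  # one full rotation covers every case

    # Scheduled play: player 0 sits alone at vertex t mod k at time t.
    scheduled = Fraction(sum(value[t % k] for t in range(T)), T)

    best_fixed = Fraction(0)
    for v in range(k):
        total = Fraction(0)
        for t in range(T):
            # The other players j = 1..k-1 sit at (j + t) mod k.
            shared = any((j + t) % k == v for j in range(1, k))
            total += Fraction(value[v], 2) if shared else value[v]
        best_fixed = max(best_fixed, total / T)
    return scheduled, best_fixed

scheduled, best_fixed = rotation_example(n=10, k=4)
```

For $n = 10$, $k = 4$ the schedule earns $31/4$ on average while the best fixed vertex earns only $25/4$, so the play has no regret even though the player on the isolated vertex would gain by a one-shot move to a star center ($n/2 > 1$).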

Break?

Agenda
- Preliminaries
- Hotelling games
- Valid games
  - Definition
  - Market sharing game
  - POA/POTA
  - Byzantine players
- Atomic congestion games
- Algorithmic efficiency

Valid Games – Definitions
- Consider a $k$-player maximization game
  - For each player, there is a groundset of actions $V_i$
  - Player $i$ plays from some feasible set $A_i \subseteq 2^{V_i}$
- Definitions
  - Let $V = V_1 \cup \dots \cup V_k$
  - The discrete derivative of $f$ at $X \subseteq V$ in the direction $D \subseteq V - X$ is $f'_D(X) = f(X \cup D) - f(X)$
  - The function $f : 2^V \to \mathbb{R}$ is said to be submodular if for $A \subseteq B$: $f'_i(A) \ge f'_i(B)$ for all $i \in V - B$
  - This should remind us of “concavity”: decreasing marginal utility

Submodularity
- Adding something to a smaller set makes a bigger difference
- (Figure: nested sets $A \subseteq B \subseteq V$ containing items such as car, house, villa, high-def and jacuzzi.)
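The submodularity condition can be tested by brute force on a small ground set. This is a sketch for intuition only: `is_submodular` is a hypothetical helper that checks the discrete-derivative inequality for every pair $A \subseteq B$, and the coverage function and region data are made-up examples.

```python
from itertools import combinations

def is_submodular(f, ground):
    """Check f'_x(A) >= f'_x(B) for all A subset of B and all x outside B."""
    ground = frozenset(ground)
    subsets = [frozenset(c) for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    for A in subsets:
        for B in subsets:
            if not A <= B:
                continue
            for x in ground - B:
                gain_small = f(A | {x}) - f(A)  # derivative at the smaller set
                gain_large = f(B | {x}) - f(B)  # derivative at the larger set
                if gain_small < gain_large:
                    return False
    return True

# Hypothetical coverage function: each item covers some regions.
REGIONS = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4, 5}}

def coverage(items):
    """Number of regions covered by the chosen items (a classic submodular function)."""
    return len(set().union(*(REGIONS[x] for x in items)))

def squared_size(items):
    """|X|^2 has increasing marginal gains, so it is not submodular."""
    return len(items) ** 2

coverage_ok = is_submodular(coverage, REGIONS)
squared_ok = is_submodular(squared_size, REGIONS)
```

The check is exponential in the size of the ground set, so it is only meant for toy instances.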

Valid Games – Definitions
- We will notate $s_{<i}$ as the strategies of players with index smaller than $i$. We will also use both this and $s_{-i}$ as complete strategies (as in applying $\gamma$ over them), meaning that the remaining players play the empty set
- Definition 4.2: A game with private utility functions $\alpha_i$ and social utility function $\gamma$, where $\alpha_i, \gamma : 2^V \to \mathbb{R}$, is valid if:
  - $\gamma$ is submodular
  - For all $i$, $s$: $\alpha_i(s) \ge \gamma'_{s_i}(s_{-i})$ (private fairness)
  - For all $s$: $\sum_{i=1}^{k} \alpha_i(s) \le \gamma(s)$ (social fairness)

Valid Games – Example
- Market sharing game (Goemans et al., 2005)
  - Players are ISPs
  - Markets are towns
- Each market has a price and a value
- Each player can “enter” a market he has an edge towards, subject to a budget constraint
- A player’s payoff per market is the value divided by the number of entrants
- Sum social utility
  - Or: the sum of values at entered markets
- (Figure: bipartite graph of players and markets, with numeric labels 5, 3, 9, 2.)
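The payoff rule of the market sharing game is easy to spell out. The following sketch uses made-up market values and choices (they are not the numbers from the slide's figure), ignores prices and budgets, and the name `market_sharing_payoffs` is hypothetical. It illustrates that the players' payoffs add up to the sum of values at entered markets.

```python
from fractions import Fraction

def market_sharing_payoffs(values, choices):
    """Return (payoffs, social utility) for a market sharing profile.

    values maps market -> value; choices[i] is the set of markets player i enters.
    Budget constraints are not modeled in this sketch.
    """
    entrants = {}
    for markets in choices:
        for m in markets:
            entrants[m] = entrants.get(m, 0) + 1
    # Each market's value is split evenly among the players who entered it.
    payoffs = [sum(Fraction(values[m], entrants[m]) for m in markets)
               for markets in choices]
    social = sum(values[m] for m in entrants)  # values at entered markets
    return payoffs, social

# Hypothetical instance: three markets, three players.
values = {"a": 6, "b": 4, "c": 3}
choices = [{"a", "b"}, {"a"}, {"c"}]
payoffs, social = market_sharing_payoffs(values, choices)
```

In this instance market "a" is shared by two players, so each receives 3 from it, and the total payoff equals the social utility of 13.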

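Valid Games – Price of Anarchy
- Vetta, 2002: In a valid game, if $s$ is a NE strategy, and $\sigma = \{\sigma_1, \sigma_2, \dots, \sigma_k\}$ is the optimal strategy ($\gamma(\sigma) = OPT$), then:
  $OPT \le 2\gamma(s) - \sum_{i : \sigma_i = s_i} \gamma'_{s_i}(s_{-i}) - \sum_{i : \sigma_i \ne s_i} \gamma'_{s_i}(\sigma \cup s_{<i})$
- Corollary: if $\gamma$ is non-decreasing, then we have $POA \le 2$ (the derivatives are always non-negative)
- Theorem 4.3, Corollary 4.2 (no proofs): POTA matches POA in valid games (up to the vanishing regret term)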

Valid Games – Byzantine Players
- Theorem 4.5: In a valid game with non-decreasing social welfare, if $k$ players minimize regret with $s^1, \dots, s^T$ while the Byzantine players play strategies $B^1, \dots, B^T$, the average social welfare is:
  $\frac{1}{T} \sum_{t=1}^{T} \gamma(s^t \cup B^t) \ge \frac{OPT}{2}$
- Proof. Assume the contrary:
  $\sum_{t=1}^{T} \gamma(s^t \cup B^t) < \frac{T \cdot OPT}{2}$

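Theorem 4.5 – cont’d
With $\sigma = \{\sigma_1, \dots, \sigma_k\}$ the optimal play, for every $t$:
  $OPT = \gamma(\sigma) \le \gamma(\sigma \cup s^t \cup B^t)$   (non-decreasing)
  $= \gamma(s^t \cup B^t) + \sum_{i=1}^{k} \gamma'_{\sigma_i}(s^t \cup B^t \cup \sigma_{<i})$   (gradually inserting)
  $\le \gamma(s^t \cup B^t) + \sum_{i=1}^{k} \gamma'_{\sigma_i}(s^t_{-i} \cup B^t)$   (submodularity)
  $\le \gamma(s^t \cup B^t) + \sum_{i=1}^{k} \alpha_i(\sigma_i, s^t_{-i} \cup B^t)$   (private fairness)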

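Gradual Insertion
- (Figure: a set $A$ containing car and house, and a larger set $B$ that also contains villa and jacuzzi.)
- $f(B) = f(A) + f'_{\text{villa}}(A) + f'_{\text{jacuzzi}}(A \cup \{\text{villa}\})$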

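Theorem 4.5 – cont’d
  $T \cdot OPT \le \sum_{t=1}^{T} \gamma(s^t \cup B^t) + \sum_{t=1}^{T} \sum_{i=1}^{k} \alpha_i(\sigma_i, s^t_{-i} \cup B^t)$   (summing over $t$)
  $\sum_{t=1}^{T} \sum_{i=1}^{k} \alpha_i(\sigma_i, s^t_{-i} \cup B^t) > \sum_{t=1}^{T} \gamma(s^t \cup B^t)$   (assumption: the first term is less than half)
  $\ge \sum_{i=1}^{k} \sum_{t=1}^{T} \alpha_i(s^t \cup B^t)$   (social fairness)
  $\sum_{i=1}^{k} \sum_{t=1}^{T} \alpha_i(\sigma_i, s^t_{-i} \cup B^t) > \sum_{i=1}^{k} \sum_{t=1}^{T} \alpha_i(s^t \cup B^t)$   (rearranging sum)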

Theorem 4.5 – cont’d
- At least one player must match that, so for him we have
  $\sum_{t=1}^{T} \alpha_i(\sigma_i, s^t_{-i} \cup B^t) > \sum_{t=1}^{T} \alpha_i(s^t \cup B^t)$
- Contradictory to regret minimization!
- Note that the comparison is to the old OPT (without the Byzantine players)
  - But it’s fair: Byzantine players may be acting even against their own interest, so we can’t say anything about them

Agenda
- Preliminaries
- Hotelling games
- Valid games
- Atomic congestion games
  - Definition
  - Sum social utility – POTA
  - Makespan utility – Lower bounds
- Algorithmic efficiency

Congestion Games
- A congestion game is a minimization game, with $k$ players
- For each player, there is a set of facilities $V_i$
- Player $i$ plays from some feasible set $A_i \subseteq 2^{V_i}$
- In weighted games, player $i$ has a weight $w_i$
  - For unweighted games, we assume $w_i = 1$
- The load on facility $e$ is defined as $l_e = \sum_{i : e \in a_i} w_i$
- Each facility $e$ has an associated latency function $f_e$
- Player $i$ playing $a_i$ experiences cost $\alpha_i(a) = \sum_{e \in a_i} f_e(l_e)$
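The load and cost definitions above can be sketched as follows. This is an illustrative sketch, not the slides' own code: the helper names `facility_loads`, `player_costs` and `linear`, the facility names and the latency coefficients are all made up, and the latencies are taken to be linear in the load.

```python
def facility_loads(profile, weights=None):
    """Load on each used facility: total weight of the players using it."""
    weights = weights or [1] * len(profile)  # unweighted game by default
    load = {}
    for facilities, w in zip(profile, weights):
        for e in facilities:
            load[e] = load.get(e, 0) + w
    return load

def player_costs(profile, latency, weights=None):
    """Cost of each player: sum of latencies of the facilities he uses."""
    load = facility_loads(profile, weights)
    return [sum(latency[e](load[e]) for e in facilities) for facilities in profile]

def linear(c, b):
    """Linear latency l -> c*l + b (coefficients are hypothetical)."""
    return lambda l: c * l + b

latency = {"x": linear(1, 0), "y": linear(2, 1)}
profile = [{"x"}, {"x", "y"}, {"y"}]  # three unweighted players

load = facility_loads(profile)
costs = player_costs(profile, latency)
sum_social_cost = sum(costs)

# In the unweighted case the sum social cost also equals sum over e of l_e * f_e(l_e).
per_facility_total = sum(load[e] * latency[e](load[e]) for e in load)
```

The last identity is the rearrangement that proofs about the sum social cost in unweighted games typically start from.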

Atomic Congestion Games
- We’ll consider a specific kind of congestion game
  - Unweighted
  - Linear latencies: $f_e(l_e) = c_e l_e + b_e$
- We will use sum social utility: $\gamma(a) = \sum_{i=1}^{k} \alpha_i(a)$
- Previously known results:
  - POA for pure strategies is 2.5 (Awerbuch et al., 2005)
  - POA for mixed strategies is also 2.5 (Christodoulou and Koutsoupias, 2007)
- Theorem 5.1: POTA in this setting is 2.5
  - This implies the previously known results, since every Nash Equilibrium has zero regret!

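Theorem 5.1 – Proof
- Let $\sigma = \{\sigma_1, \sigma_2, \dots, \sigma_k\}$ be the optimal play, with loads $l_e^*$
- Since we have no regret, for all $i$:
  $\sum_{t=1}^{T} \sum_{e \in s_i^t} (c_e l_e^t + b_e) = \sum_{t=1}^{T} \alpha_i(s^t) \le \sum_{t=1}^{T} \alpha_i(\sigma_i, s_{-i}^t) \le \sum_{t=1}^{T} \sum_{e \in \sigma_i} (c_e (l_e^t + 1) + b_e)$
- Summing over the players, and rearranging the sum:
  $\sum_{t=1}^{T} \sum_{e \in E} \sum_{i : e \in s_i^t} (c_e l_e^t + b_e) \le \sum_{t=1}^{T} \sum_{e \in E} \sum_{i : e \in \sigma_i} (c_e (l_e^t + 1) + b_e)$
- Or more simply:
  $\sum_{t=1}^{T} \sum_{e \in E} (c_e l_e^t + b_e) l_e^t \le \sum_{t=1}^{T} \sum_{e \in E} (c_e (l_e^t + 1) + b_e) l_e^* = \sum_{t=1}^{T} \sum_{e \in E} (c_e l_e^t l_e^* + c_e l_e^* + b_e l_e^*)$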

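Theorem 5.1 – cont’d
- The geometric mean is at most the arithmetic mean, so: $ij = \sqrt{i^2 j^2} \le (i^2 + j^2)/2$
- Recall our equation
  (1) $\sum_{t=1}^{T} \sum_{e \in E} (c_e l_e^t + b_e) l_e^t \le \sum_{t=1}^{T} \sum_{e \in E} (c_e l_e^t l_e^* + c_e l_e^* + b_e l_e^*)$
- Applying the inequality to $l_e^t l_e^*$:
  (2) $\sum_{t=1}^{T} \sum_{e \in E} (c_e (l_e^t)^2 + b_e l_e^t) \le \sum_{t=1}^{T} \sum_{e \in E} \left(\frac{1}{2} c_e (l_e^t)^2 + \frac{1}{2} c_e (l_e^*)^2 + c_e l_e^* + b_e l_e^*\right)$
- Moving $\frac{1}{2} c_e (l_e^t)^2$ to the left-hand side:
  (3) $\frac{1}{2} \sum_{t=1}^{T} \sum_{e \in E} (c_e (l_e^t)^2 + b_e l_e^t) \le \sum_{t=1}^{T} \sum_{e \in E} \left(\frac{1}{2} c_e (l_e^*)^2 + c_e l_e^* + b_e l_e^*\right)$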

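Theorem 5.1 – cont’d
- Multiplying both sides by two:
  $\sum_{t=1}^{T} \sum_{e \in E} (c_e (l_e^t)^2 + b_e l_e^t) \le \sum_{t=1}^{T} \sum_{e \in E} (c_e (l_e^*)^2 + 2 c_e l_e^* + 2 b_e l_e^*)$
- Further relaxing the inequality (the loads are integers, so $l_e^* \le (l_e^*)^2$):
  $\sum_{t=1}^{T} \sum_{e \in E} (c_e (l_e^t)^2 + b_e l_e^t) \le 3 \sum_{t=1}^{T} \sum_{e \in E} (c_e (l_e^*)^2 + b_e l_e^*)$
- That is:
  $\frac{1}{T} \sum_{t=1}^{T} \gamma(s^t) \le 3 \cdot OPT$
- We’re done! (This simplified argument gives the constant 3; the tight constant 2.5 of Theorem 5.1 needs a sharper inequality in place of the arithmetic-geometric mean step.)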

Parallel Link Congestion Game
- Consider $n$ identical links and $k$ weighted players
- Each player selects which link to use (a single link)
- Each player pays the sum of the weights on the link

Parallel Link Congestion Games – cont’d
- Claim: In the parallel link congestion game with the maximum expected job latency as social cost function, POTA is 2
- Proof.
  - Rescale the weights, so that OPT = 1
  - The total weight is at most $n$, and each weight is at most 1
  - The total latency over $T$ plays is at most $Tn$, so at least one link $e^*$ has total latency at most $T$, i.e. average latency $l(e^*) \le 1$
  - A regret-minimizing player will be competitive with moving to $e^*$
  - We expect at most $l(e^*) + w_i \le 2$

Parallel Links Congestion Game – cont’d
- The unweighted case with the sum social utility is called the “load balancing game”
- It’s a specific case of the atomic congestion games discussed before, thus we will have POA and POTA of 2.5
- If $k \gg n$ (a likely case) and the server speeds are relatively bounded, we can say even more
- Theorem 5.6 (no proof): In this setting, POTA is $1 + o(1)$
- Corollary 5.7: In this setting, POA is $1 + o(1)$, even for mixed strategies

Parallel Link Congestion Games – cont’d
- Usually, we consider the makespan social utility function: the load on the most loaded link
- Why doesn’t our argument from before hold?
  - Because $E[\max\{X\}]$ can exceed $\max\{E[X]\}$
- The POA for 2-link games is 3/2 (Koutsoupias and Papadimitriou, 1999)
- The POA for $n$-link games is $\Theta(\log n / \log\log n)$ (Koutsoupias, Mavronicolas, Spirakis, 1999)
- Theorem 5.4 (no proof): POTA for this game with two links is 3/2.
- Theorem 5.5: POTA for this game with $n$ links is $\Omega(\sqrt{n})$.

Theorem 5.5 – Proof Sketch
- Consider $n$ links, $n$ players, unit weights. OPT = 1
- Resembles what we did in Hotelling (for non-convergence):
  - Split the players into $2\sqrt{n}$ groups
  - At time $t$, group $t \bmod 2\sqrt{n}$ plays at link 1, while the rest play in different nodes and get an average latency close to 1
  - This minimizes regret: for any fixed link, the player will need to share the link most of the time (latency ~2)
  - Still, at each time, link 1 has a whole group: maximum latency of $\Omega(\sqrt{n})$
- Notice that this holds even for unweighted players!

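Theorem 5.5 – Proof Sketch
- (Figure: at each time step a single spot has load $\sqrt{n}/2$, $n - \sqrt{n}/2$ spots have load 1, and $\sqrt{n}/2 - 1$ spots have load 0.)
- Scheduled play: $\alpha_i(s) = \frac{1}{2\sqrt{n}} \cdot \frac{\sqrt{n}}{2} + \frac{2\sqrt{n} - 1}{2\sqrt{n}} \cdot 1 \le \frac{5}{4}$
- Any fixed link $v$: the link is shared most of the time, so $\alpha_i(v, s_{-i})$ is close to 2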

Agenda
- Preliminaries
- Hotelling games
- Valid games
- Atomic congestion games
- Algorithmic efficiency

Algorithmic Efficiency
- Weighted Majority Algorithm (Littlestone, Warmuth)
  - Initialize $w_i^1 = 1$ for all $i$
  - Update $w_i^t = w_i^{t-1} (1 - \eta)^{l_i^{t-1}}$ at time $t$, where $\eta$ is a small tradeoff parameter (0.01) and $l_i^{t-1}$ is the loss of strategy $i$ at time $t-1$
  - Expects $O(1/\sqrt{T})$ average regret over time $T$ (for a suitably tuned $\eta$)
  - Explained in “Algorithmic Game Theory”, chapter 4
- Polynomial in the number of strategies (Hotelling, Congestion games)
- Not as good in Valid games (the number of strategies is exponential in the size of the groundset)

We’re assuming a minimizing game with [0,1] loss and n strategies.
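The update rule above can be sketched in a few lines. This is a minimal illustration rather than a reference implementation: it assumes the randomized variant in which a strategy is sampled with probability proportional to its weight, that the full loss vector is observed after each step, and the name `randomized_weighted_majority`, the `seed` parameter and the toy loss sequence are all hypothetical.

```python
import random

def randomized_weighted_majority(loss_rows, eta=0.01, seed=0):
    """Play one strategy per step, sampled in proportion to the current weights.

    loss_rows[t][i] is the loss in [0, 1] of strategy i at step t.
    Returns the total loss incurred and the final weights.
    """
    rng = random.Random(seed)
    n = len(loss_rows[0])
    weights = [1.0] * n  # w_i^1 = 1 for all i
    total_loss = 0.0
    for losses in loss_rows:
        choice = rng.choices(range(n), weights=weights)[0]
        total_loss += losses[choice]
        # Multiplicative update: w_i <- w_i * (1 - eta)^(loss of strategy i).
        weights = [w * (1 - eta) ** l for w, l in zip(weights, losses)]
    return total_loss, weights

# Toy run with two strategies; eta = 0.5 keeps the arithmetic easy to follow.
toy_losses = [[1, 0], [1, 0], [0, 1]]
total_loss, final_weights = randomized_weighted_majority(toy_losses, eta=0.5)
```

Strategy 0 is penalized twice and strategy 1 once, so the final weights are 0.25 and 0.5. Each step costs time linear in the number of strategies, which is why the slide calls the approach polynomial for Hotelling and congestion games but impractical when the strategy set is exponential.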

Questions?
