Price of Total Anarchy. June 2008. Slides by Israel Shalom Based on “Regret Minimization and the Price of Total Anarchy” By Avrim Blum, MohammadTaghi Hajiaghayi, Katrina Ligett and Aaron Roth. Agenda. Preliminaries Game Theory Basics Regret Minimization Hotelling games Valid games - PowerPoint PPT Presentation
Price of Total Anarchy
June 2008
Slides by Israel Shalom
Based on “Regret Minimization and the Price of Total Anarchy” by Avrim Blum, MohammadTaghi Hajiaghayi, Katrina Ligett and Aaron Roth
1
Agenda
Preliminaries
  Game Theory Basics
  Regret Minimization
Hotelling games
Valid games
Atomic congestion games
Algorithmic efficiency
2
Games in Strategic Form
The game has $k$ players
Each player $i$ has his available pure strategies $A_i$
$A = A_1 \times A_2 \times \cdots \times A_k$ marks the strategy profiles
Individual utility (payoff) functions $\alpha_i : A \to \mathbb{R}$
3
Games in Strategic Form – cont’d
Examples:
Rock, Paper, Scissors

           Rock     Paper    Scissors
Rock       0, 0     -1, 1    1, -1
Paper      1, -1    0, 0     -1, 1
Scissors   -1, 1    1, -1    0, 0

Prisoner’s Dilemma

          Deny    Confess
Deny      1, 1    5, 0
Confess   0, 5    3, 3
4
Mixed Strategies
Users can play “mixed strategies” as well: a probability distribution over $A_i$; we mark the set of these as $S_i$
$S = S_1 \times S_2 \times \cdots \times S_k$ marks the mixed strategy profiles
The payoffs are now defined as the expected value of $\alpha_i$ over the randomness of the players
Sometimes marked by $\bar{\alpha}_i$
5
Best Response and Nash Equilibria
Lowercase letters will usually denote elements: $a_i \in A_i$, $a \in A$, $s_i \in S_i$, …
We denote by $a_{-i}$ the selected strategies of the players other than i: $a_{-i} = (a_1, \dots, a_{i-1}, a_{i+1}, \dots, a_k)$
A strategy $a_i \in A_i$ is a best response to $a_{-i}$ if for all $a'_i \in A_i$:
  $\alpha_i(a_i, a_{-i}) \ge \alpha_i(a'_i, a_{-i})$, where $(a'_i, a_{-i}) = (a_1, \dots, a_{i-1}, a'_i, a_{i+1}, \dots, a_k)$
A strategy profile $a \in A$ is a Nash Equilibrium if for all i, $a_i$ is a best response to $a_{-i}$.
Pure equilibria might not exist, but in every finite game there is at least one mixed Nash Equilibrium.
6
Nash Equilibria
Examples:
Rock, Paper, Scissors
  Mixed equilibrium: ([1/3, 1/3, 1/3], [1/3, 1/3, 1/3])

           Rock     Paper    Scissors
Rock       0, 0     -1, 1    1, -1
Paper      1, -1    0, 0     -1, 1
Scissors   -1, 1    1, -1    0, 0

Prisoner’s Dilemma
  Pure equilibrium: (Confess, Confess)

          Deny    Confess
Deny      1, 1    5, 0
Confess   0, 5    3, 3
7
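The two examples can be checked by brute force. The sketch below is only an illustration: the function name `pure_nash_equilibria` and the dictionary encoding of the tables are assumptions made here, and the Prisoner's Dilemma entries are treated as costs (negated so that higher is better).

```python
from itertools import product

def pure_nash_equilibria(payoffs):
    """Return all pure profiles where no player gains by deviating alone.

    payoffs maps a profile (tuple of action indices) to a tuple of
    utilities, one per player; higher is better.
    """
    n_players = len(next(iter(payoffs)))
    n_actions = [max(p[i] for p in payoffs) + 1 for i in range(n_players)]
    equilibria = []
    for profile in product(*(range(m) for m in n_actions)):
        stable = True
        for i in range(n_players):
            for dev in range(n_actions[i]):
                alt = profile[:i] + (dev,) + profile[i + 1:]
                if payoffs[alt][i] > payoffs[profile][i]:
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria

# Prisoner's Dilemma: the table lists costs, so negate them to maximize.
# Action 0 = Deny, 1 = Confess.
pd_costs = {(0, 0): (1, 1), (0, 1): (5, 0), (1, 0): (0, 5), (1, 1): (3, 3)}
pd = {a: tuple(-c for c in costs) for a, costs in pd_costs.items()}

# Rock, Paper, Scissors: 0 = Rock, 1 = Paper, 2 = Scissors (zero-sum).
rps_row = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
rps = {(r, c): (rps_row[r][c], -rps_row[r][c])
       for r in range(3) for c in range(3)}

pd_equilibria = pure_nash_equilibria(pd)    # only (Confess, Confess)
rps_equilibria = pure_nash_equilibria(rps)  # no pure equilibrium
```

The empty result for Rock, Paper, Scissors is why that example needs a mixed equilibrium.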
Social Optimum
Sometimes, we’ll define a social utility (welfare) function, similar to payoffs: $\gamma : A \to \mathbb{R}$
Choices that would make sense:
  $\gamma(a) = \sum_{i=1}^{k} \alpha_i(a)$
  $\gamma(a) = \min_i(\alpha_i(a))$
For mixed strategies, we’ll look at the expected value (analogous to payoff in mixed strategies)
The socially optimum strategy profile attains $OPT = \max_{s \in S}(\gamma(s))$
We are assuming a maximizing game throughout; the minimization case is analogous
8
Price of Anarchy
Let $N \subseteq S$ mark all the Nash Equilibria in the game
The price of anarchy is defined as the ratio between the optimum and the worst NE:
  $POA = \max_{s \in N} \frac{OPT}{\gamma(s)}$
9
Price of Anarchy
Prisoner’s Dilemma

          Deny    Confess
Deny      1, 1    5, 0
Confess   0, 5    3, 3

OPT = 2 (Deny, Deny)
Social cost at the NE (Confess, Confess) = 6
Notice that the fraction is flipped (minimization game):
  $POA = \max_{s \in N} \frac{\gamma(s)}{OPT} = 3$
10
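As a small worked check of the numbers on this slide, the sketch below recomputes the price of anarchy for the Prisoner's Dilemma under the sum social cost. The variable names are chosen here for illustration, and the set of equilibria is taken from the earlier slide rather than computed.

```python
from fractions import Fraction

# Costs (player 1, player 2) for each pure profile; lower is better.
costs = {
    ("Deny", "Deny"): (1, 1),
    ("Deny", "Confess"): (5, 0),
    ("Confess", "Deny"): (0, 5),
    ("Confess", "Confess"): (3, 3),
}

# Sum social cost of every profile.
social_cost = {profile: sum(c) for profile, c in costs.items()}

opt = min(social_cost.values())            # best achievable social cost
nash = [("Confess", "Confess")]            # the pure equilibrium from the slides
worst_nash = max(social_cost[s] for s in nash)

# Minimization game: worst equilibrium cost divided by the optimum.
poa = Fraction(worst_nash, opt)
```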
Regret Minimization
Let $a^1, a^2, \dots, a^T$ mark the strategy profiles in T steps
We define the regret of player i in a maximization game:
  $\max_{a_i \in A_i} \frac{1}{T} \sum_{t=1}^{T} \alpha_i(a_i, a_{-i}^t) - \frac{1}{T} \sum_{t=1}^{T} \alpha_i(a^t)$
Intuitively, this is “how much more i could have gained on average had he played a single strategy throughout the game”
11
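The regret of a recorded play sequence can be computed directly from this definition. The sketch below assumes a two-player setting where `payoff(mine, theirs)` gives the player's utility; the helper name `average_regret` and the Rock, Paper, Scissors sequences are illustrative choices made here.

```python
def average_regret(payoff, my_plays, other_plays, my_actions):
    """Best fixed action's average payoff minus the realized average payoff."""
    T = len(my_plays)
    realized = sum(payoff(a, o) for a, o in zip(my_plays, other_plays)) / T
    best_fixed = max(
        sum(payoff(a, o) for o in other_plays) / T for a in my_actions
    )
    return best_fixed - realized

# Row player's payoffs in Rock, Paper, Scissors.
RPS = {
    "R": {"R": 0, "P": -1, "S": 1},
    "P": {"R": 1, "P": 0, "S": -1},
    "S": {"R": -1, "P": 1, "S": 0},
}

def rps_payoff(mine, theirs):
    return RPS[mine][theirs]

opponent = ["R", "P", "S", "R", "P", "S"]

# Always playing Rock breaks even, and so would any other fixed action.
regret_rock = average_regret(rps_payoff, ["R"] * 6, opponent, "RPS")

# Losing every round: any fixed action would have done better by 1 per step.
losing = ["S", "R", "P", "S", "R", "P"]
regret_losing = average_regret(rps_payoff, losing, opponent, "RPS")
```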
Regret Minimization
When a player i uses a regret-minimizing algorithm, for any sequence $a^1, a^2, \dots, a^T$ we have the property
  $\frac{1}{T} E\left[\sum_{t=1}^{T} \alpha_i(a^t)\right] \ge \max_{a_i \in A_i} \frac{1}{T} \sum_{t=1}^{T} \alpha_i(a_i, a_{-i}^t) - R(T)$
Where:
  $R(T)$ vanishes as $T \to \infty$
  $T_\varepsilon$ marks the number of steps before $R(T) \le \varepsilon$
  The expectation is over the algorithm’s randomness
In other words, the expected value of regret vanishes
Notice that this is for maximizing games
12
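Regret Minimization
This implies that for any sequence $s^1, s^2, \dots, s^T$ of mixed profiles (payoffs taken in expectation), if player i is regret-minimizing, then:
  $\frac{1}{T} \sum_{t=1}^{T} \alpha_i(s^t) \ge \max_{a_i \in A_i} \frac{1}{T} \sum_{t=1}^{T} \alpha_i(a_i, s_{-i}^t) - R(T)$
The price of total anarchy is defined as:
  $\max \frac{OPT}{\frac{1}{T} \sum_{t=1}^{T} \gamma(s^t)}$
Where max is taken over $s^1, s^2, \dots, s^T$ that are play profiles with the regret-minimization property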
13
Regret and NE
Notice that when playing a Nash Equilibrium, all players will have zero regret
  If there were a better “constant” response, a player could improve by moving to it
Therefore, the price of total anarchy in any game is an upper bound on the price of anarchy
[Figure: the set of NE drawn inside the larger set of regret-minimizing strategies]
14
Advantages of Regret Minimization
Computational
  Nash Equilibria are hard (PPAD-hard) to calculate, even for small action spaces
  There are efficient regret minimization algorithms for a polynomial number of actions
Motivational
  No particular reason for players to converge to NE
  There might be multiple equilibria, and agents may individually prefer different ones
  Byzantine players’ actions are not taken into account in NE
  Regret minimization considers only local information, which is much more practical
15
Agenda
Preliminaries
Hotelling games
  Definition
  POA/POTA
  Generalization
Valid games
Atomic congestion games
Algorithmic efficiency
16
Hotelling - Game Definition
Souvenir stand owners in Paris:
  There are $n$ tourists every day; they buy from whichever stand they find first
  Each stand owner wishes to maximize his own sales
  We want “fairness”: the social welfare function is the minimum of the total sales made by any stand
Formally:
  We have an n-vertex graph $G = (V, E)$. Each seller locates himself at a vertex: $A_i = V$
  Each day, a tourist in each vertex goes to the closest seller
  If there is a “tie” between the sellers, they split the gains
  Minimum utility: $\gamma(s^t) = \min_i(\alpha_i(s^t))$
17
Hotelling - Optimum Solution
Notice that the sum of payoffs is always exactly n
Therefore, the social optimum is achieved when all players have equal payoffs
This can happen if all players play on the same vertex
Therefore $OPT = \frac{n}{k}$
18
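A minimal sketch of the Hotelling payoffs, under assumptions made here: the graph is given as an adjacency dictionary, distances are hop counts found by breadth-first search, and tourists who cannot reach any seller count as tied between all of them. The names `bfs_distances` and `hotelling_payoffs` are hypothetical.

```python
from collections import deque
from fractions import Fraction

def bfs_distances(adj, source):
    """Hop distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def hotelling_payoffs(adj, positions):
    """Each vertex holds one tourist who buys from the closest seller; ties split."""
    k = len(positions)
    dist = {v: bfs_distances(adj, v) for v in set(positions)}
    payoffs = [Fraction(0)] * k
    for tourist in adj:
        d = [dist[p].get(tourist, float("inf")) for p in positions]
        closest = min(d)
        winners = [i for i in range(k) if d[i] == closest]
        for i in winners:
            payoffs[i] += Fraction(1, len(winners))
    return payoffs

# Path graph 0-1-2-3-4-5, so n = 6.
path = {v: [w for w in (v - 1, v + 1) if 0 <= w < 6] for v in range(6)}

spread = hotelling_payoffs(path, [1, 4])     # two sellers, three tourists each
same = hotelling_payoffs(path, [2, 2, 2])    # three sellers on one vertex: n/k each
lopsided = hotelling_payoffs(path, [0, 3])   # sum is still n, but the minimum drops
```

In every case the payoffs sum to n, which is why the minimum is largest when all payoffs are equal.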
Hotelling – POA
Theorem 3.1
  The price of anarchy in the Hotelling game is (2k – 2)/k
Proof
  We are to show that in any NE all players gain at least n/(2k – 2)
  Assume the contrary, that player i gains less than that in S
  Consider player i “leaving” the game. The total payoff is still n, so the average payoff for the remaining players is now n/(k-1)
  There must be at least one player h gaining at least the average, playing the vertex $v_h$
  Player i can assure n/(2k – 2) by moving to $v_h$
  Contradiction to Nash equilibrium
19
Theorem 3.1 – cont’d
We are left with showing tightness
Consider a game with k-1 stars
k-1 players play at the centers of their own stars, and player k plays uniformly over all the star centers
This is NE
The randomizing player earns n/(2k - 2)
[Figure: stars labeled 1, 2, 3, …, k-1]
20
Hotelling – POTA
Let $o_i^t$ be the strategy of playing an arbitrary (uniformly random) strategy from the strategies in $s_{-i}^t$.
Define $\Delta_i^{t \to u} = \alpha_i(o_i^t, s_{-i}^u) - n/(2k-2)$
Notice that $\Delta_i^{t \to t} \ge 0$, since when player i is removed, the rest have average payoff of $n/(k-1)$
Lemma 3.4
  For all i, for all $1 \le t, u \le T$: $\Delta_i^{t \to u} + \Delta_i^{u \to t} \ge 0$. (Trivial for t = u)
21
Lemma 3.4 - Proof
Consider a $(2k-2)$-player game
  Each player $j$ other than i is replicated twice: once as a time-t player and once as a time-u player, with strategies $s_j^t$ and $s_j^u$.
Average payoff is $n/(2k-2)$
If player i replaces a time-t player, that’s his expected payoff
If we further remove the other time-t players, we only improve
22
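The Imaginary Game, n=10, k=4
[Figure: a 10-vertex graph showing the time-t players and the time-u players]
E[replacing a time-t player in the imaginary game] $\le$ E[replacing a time-t player and removing the other time-t players] $= \alpha_i(o_i^t, s_{-i}^u) = n/(2k-2) + \Delta_i^{t \to u}$
23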
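Lemma 3.4 – cont’d
The same argument holds for replacing a time-u player:
  $\frac{n}{2k-2} = E[\text{replacing a player in imaginary}]$
  $= \frac{1}{2} E[\text{replacing a time-}t\text{ player in imaginary}] + \frac{1}{2} E[\text{replacing a time-}u\text{ player in imaginary}]$
  $\le \frac{1}{2}\left(\frac{n}{2k-2} + \Delta_i^{t \to u}\right) + \frac{1}{2}\left(\frac{n}{2k-2} + \Delta_i^{u \to t}\right)$
  $= \frac{n}{2k-2} + \frac{1}{2}\left(\Delta_i^{t \to u} + \Delta_i^{u \to t}\right)$
Hence $\Delta_i^{t \to u} + \Delta_i^{u \to t} \ge 0$
24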
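Hotelling – POTA
Theorem 3.2
  Each regret minimizing player has at least n/(2k-2) payoff
Proof
  Provided a sequence of T plays, select a random time u
  The expected payoff if we played $o_i^u$ (a random strategy from $s_{-i}^u$) throughout is:
    $G_u = \sum_{t=1}^{T} \alpha_i(o_i^u, s_{-i}^t) = \sum_{t=1}^{T} \left(\frac{n}{2k-2} + \Delta_i^{u \to t}\right)$
  Averaging over different u, we reach:
    $\frac{1}{T} \sum_{u=1}^{T} G_u = \frac{1}{T} \sum_{u=1}^{T} \sum_{t=1}^{T} \left(\frac{n}{2k-2} + \Delta_i^{u \to t}\right) = \frac{Tn}{2k-2} + \frac{1}{T} \sum_{u=1}^{T} \sum_{t=1}^{T} \Delta_i^{u \to t}$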
25
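Hotelling – POTA
We reached
  $\frac{1}{T} \sum_{u=1}^{T} G_u = \frac{1}{T} \sum_{u=1}^{T} \sum_{t=1}^{T} \left(\frac{n}{2k-2} + \Delta_i^{u \to t}\right) = \frac{Tn}{2k-2} + \frac{1}{T} \sum_{u=1}^{T} \sum_{t=1}^{T} \Delta_i^{u \to t}$
The second term is non-negative due to Lemma 3.4 (pair each (u, t) with (t, u)), so
  $\frac{1}{T} \sum_{u=1}^{T} G_u \ge \frac{Tn}{2k-2}$
There is a value for u that achieves the average
For that u, $G_u \ge Tn/(2k-2)$: if player i mixes between the strategies in $s_{-i}^u$, he’ll achieve $n/(2k-2)$ per step
A regret minimizing player achieves this expected payoff
26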
Hotelling – POTA
Corollary: The price of total anarchy in the Hotelling game is (2k-2)/k, matching the price of anarchy
Notice that in the proof we haven’t made any assumptions about how other players behave, so the proof holds even in the presence of Byzantine players making arbitrary (or adversarial) decisions!
27
Generalized Hotelling Game
Notice that in the proof we have used only three features of the Hotelling game:
  Constant sum: the sum of utilities is constant
  Symmetric: the “names” of the stand owners don’t matter
  Monotone: any player can “leave” the game and the sum does not change
We call such games with the “fairness” social utility generalized Hotelling games.
Theorem 3.6:
  In any k-player generalized Hotelling game, the price of total anarchy among regret minimizing players is (2k-2)/k even in the presence of arbitrarily many Byzantine players.
28
Non-Convergence
Consider the game with:
  Players {0, …, k-1}
  k-1 n-vertex stars, with centers at $v_0, \dots, v_{k-2}$, and an isolated vertex $v_{k-1}$
Consider $a_i^t = v_{(t+i) \bmod k}$
  Each player’s payoff: $(n(k-1)+1)/k$
  No single vertex has expected payoff more than $n(k+1)/2k$
  No regrets
However, this is not Nash!
  Players at the isolated vertex will deviate!
[Figure: stars labeled 1, 2, 3, …, k-1 and an isolated vertex k]
29
Break?
30
Agenda
Preliminaries
Hotelling games
Valid games
  Definition
  Market sharing game
  POA/POTA
  Byzantine players
Atomic congestion games
Algorithmic efficiency
31
Valid Games – Definitions
Consider a k-player maximization game
  For each player, there is a groundset of actions $V_i$
  Player i plays from some feasible set $A_i \subseteq 2^{V_i}$
Definitions
  Let $V = (V_1 \cup \cdots \cup V_k)$
  The discrete derivative of $f$ at $X \subseteq V$ in the direction $D \subseteq V \setminus X$ is $f'_D(X) = f(X \cup D) - f(X)$
  The function $f : 2^V \to \mathbb{R}$ is said to be submodular if for $A \subseteq B$: $f'_i(A) \ge f'_i(B)$ for all $i \in V \setminus B$
  This should remind us of “concavity”: decreasing marginal utility
32
Submodularity
Adding something to a smaller set makes a bigger difference
[Figure: nested sets A inside B inside V, with items car, house, villa, high-def, jacuzzi]
33
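Submodularity can be tested exhaustively on a small groundset. The sketch below is an illustration only: `is_submodular`, the coverage function, and the item names are choices made here, not part of the slides.

```python
from itertools import combinations

def subsets(ground):
    """Yield every subset of ground as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_submodular(f, ground):
    """Check f'_i(A) >= f'_i(B) for all A subset of B and all i outside B."""
    ground = frozenset(ground)
    for b in subsets(ground):
        for a in subsets(b):
            for i in ground - b:
                gain_small = f(a | {i}) - f(a)
                gain_large = f(b | {i}) - f(b)
                if gain_small < gain_large:
                    return False
    return True

# Coverage function: number of distinct elements covered by the chosen items.
cover = {"x": {1, 2}, "y": {2, 3}, "z": {3, 4}}

def coverage(chosen):
    return len(set().union(*(cover[e] for e in chosen)))

def squared_size(chosen):
    # Marginal gains grow with the set, so this is not submodular.
    return len(chosen) ** 2

coverage_ok = is_submodular(coverage, cover)
squared_ok = is_submodular(squared_size, cover)
```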
Valid Games – Definitions
We will notate $s^{<i}$ as the strategies of players with index smaller than i. We will also use both this and $s_{-i}$ as complete strategies (as in applying $\gamma$ over them), meaning that the remaining players play the empty set
Definition 4.2: A game with private utility functions $\alpha_i : 2^V \to \mathbb{R}$ and social utility function $\gamma : 2^V \to \mathbb{R}$ is valid if:
  $\gamma$ is submodular
  For all i, s: $\alpha_i(s) \ge \gamma'_{s_i}(s_{-i})$ (private fairness)
  For all s: $\sum_{i=1}^{k} \alpha_i(s) \le \gamma(s)$ (social fairness)
34
Valid Games – Example
Market sharing game (Goemans et al., 2005)
  Players are ISPs
  Markets are towns
    Each market has a price and a value
  Each player can “enter” the markets he has an edge towards, subject to a budget constraint
  Player’s payoff per market is the value divided by the number of entrants
  Sum social utility
    Or: the sum of values at entered markets
[Figure: bipartite graph of players and markets, with market values 5, 3, 9, 2]
35
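Valid Games – Price of Anarchy
Vetta, 2002:
  In a valid game, if $s$ is a NE strategy, and $\sigma = \{\sigma_1, \dots, \sigma_k\}$ is the optimal strategy ($\gamma(\sigma) = OPT$), then OPT is bounded in terms of $2\gamma(s)$ and sums of discrete derivatives $\gamma'_{s_i}(\cdot)$ [the exact inequality is garbled in the extracted text]
Corollary: if $\gamma$ is non-decreasing, then we have POA $\le 2$ (the derivatives are always non-negative)
Theorem 4.3, Corollary 4.2 (no proofs)
  POTA matches POA in valid games (up to an additive term in $k$ and the regret bound)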
36
Valid Games – Byzantine Players
Theorem 4.5
  In a valid game with nondecreasing social welfare, if k players minimize regret with $s^1, \dots, s^T$ while the Byzantine players play strategies $B^1, \dots, B^T$, the average social welfare is:
    $\frac{1}{T} \sum_{t=1}^{T} \gamma(s^t \cup B^t) \ge \frac{OPT}{2}$
Proof.
  Assume the contrary,
    $\sum_{t=1}^{T} \gamma(s^t \cup B^t) < T \cdot \frac{OPT}{2}$
37
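Theorem 4.5 – cont’d
With $\sigma$ the optimal strategy and $\sigma^{<i}$ the optimal strategies of players with index smaller than i:
  $OPT = \gamma(\sigma) \le \gamma(\sigma \cup s^t \cup B^t)$ (non-decreasing)
  $= \gamma(s^t \cup B^t) + \sum_{i=1}^{k} \gamma'_{\sigma_i}(s^t \cup B^t \cup \sigma^{<i})$ (gradually inserting)
  $\le \gamma(s^t \cup B^t) + \sum_{i=1}^{k} \gamma'_{\sigma_i}(s_{-i}^t \cup B^t)$ (submodularity)
  $\le \gamma(s^t \cup B^t) + \sum_{i=1}^{k} \alpha_i(\sigma_i, s_{-i}^t \cup B^t)$ (private fairness)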
38
Gradual Insertion
[Figure: set A inside set B, with items car, house, villa, jacuzzi]
$f(B) = f(A) + f'_{\text{villa}}(A) + f'_{\text{jacuzzi}}(A \cup \{\text{villa}\})$
39
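Theorem 4.5 – cont’d
  $T \cdot OPT \le \sum_{t=1}^{T} \gamma(s^t \cup B^t) + \sum_{t=1}^{T} \sum_{i=1}^{k} \alpha_i(\sigma_i, s_{-i}^t \cup B^t)$ (summing over t)
  $\sum_{t=1}^{T} \sum_{i=1}^{k} \alpha_i(\sigma_i, s_{-i}^t \cup B^t) > \sum_{t=1}^{T} \gamma(s^t \cup B^t)$ (assumption: the first term is less than half)
  $\ge \sum_{i=1}^{k} \sum_{t=1}^{T} \alpha_i(s^t \cup B^t)$ (social fairness)
  $\sum_{i=1}^{k} \sum_{t=1}^{T} \alpha_i(\sigma_i, s_{-i}^t \cup B^t) > \sum_{i=1}^{k} \sum_{t=1}^{T} \alpha_i(s^t \cup B^t)$ (rearranging sum)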
40
Theorem 4.5 – cont’d
At least one player must match that, so for him we have
  $\sum_{t=1}^{T} \alpha_i(\sigma_i, s_{-i}^t \cup B^t) > \sum_{t=1}^{T} \alpha_i(s^t \cup B^t)$
Contradictory to regret minimization!
Note that it’s compared to the old OPT (without the Byzantine players)
  But it’s fair: Byzantine players may be acting even against their own interest, so we can’t say anything about them
41
Agenda
Preliminaries
Hotelling games
Valid games
Atomic congestion games
  Definition
  Sum social utility – POTA
  Makespan utility – Lower bounds
Algorithmic efficiency
42
Congestion Games
A congestion game is a minimization game, with k players
  For each player, there is a set of facilities $V_i$
  Player i plays from some feasible set $A_i \subseteq 2^{V_i}$
  In weighted games, player i has a weight $w_i$
    For unweighted games, we assume $w_i = 1$
  The load on facility e is defined as $l_e = \sum_{i : e \in a_i} w_i$
  Each facility e has an associated latency function $f_e$
  Player i playing $a_i$ experiences cost $\alpha_i(a) = \sum_{e \in a_i} f_e(l_e)$
43
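The load and cost definitions translate directly into code. In the sketch below the helper names, the facility names, and the particular linear latencies are assumptions chosen for illustration.

```python
from collections import Counter

def facility_loads(profile, weights=None):
    """Load on each facility: total weight of the players using it."""
    weights = weights or [1] * len(profile)
    load = Counter()
    for chosen, w in zip(profile, weights):
        for e in chosen:
            load[e] += w
    return load

def player_costs(profile, latency, weights=None):
    """Cost of each player: sum of latencies over the facilities he uses."""
    load = facility_loads(profile, weights)
    return [sum(latency[e](load[e]) for e in chosen) for chosen in profile]

# Two facilities with linear latencies of the form c * load + b.
latency = {
    "top": lambda l: l,             # c = 1, b = 0
    "bottom": lambda l: 2 * l + 1,  # c = 2, b = 1
}

# Three unweighted players; each plays a set of facilities.
profile = [{"top"}, {"top"}, {"bottom"}]

costs = player_costs(profile, latency)
social_cost = sum(costs)  # sum social utility
```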
Atomic Congestion Games
We’ll consider a specific kind of congestion game
  Unweighted
  Linear latencies: $f_e(l_e) = c_e l_e + b_e$
We will use sum social utility: $\gamma(a) = \sum_{i=1}^{k} \alpha_i(a)$
Previously known results:
  POA for pure strategies is 2.5 (Awerbuch et al., 2005)
  POA for mixed strategies is also 2.5 (Christodoulou and Koutsoupias, 2007)
Theorem 5.1: POTA in this setting is 2.5
  This implies the previously known results, since the price of total anarchy upper-bounds the price of anarchy!
44
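Theorem 5.1 – Proof
Let $\sigma = \{\sigma_1, \dots, \sigma_k\}$ be the optimal play, with loads $l_e^*$
Since we have no regret, for all i:
  $\sum_{t=1}^{T} \sum_{e \in s_i^t} (c_e l_e^t + b_e) = \sum_{t=1}^{T} \alpha_i(s^t) \le \sum_{t=1}^{T} \alpha_i(\sigma_i, s_{-i}^t) \le \sum_{t=1}^{T} \sum_{e \in \sigma_i} (c_e (l_e^t + 1) + b_e)$
Summing over all players, and rearranging the sum:
  $\sum_{t=1}^{T} \sum_{e \in E} \sum_{i : e \in s_i^t} (c_e l_e^t + b_e) \le \sum_{t=1}^{T} \sum_{e \in E} \sum_{i : e \in \sigma_i} (c_e (l_e^t + 1) + b_e)$
Or more simply:
  $\sum_{t=1}^{T} \sum_{e \in E} l_e^t (c_e l_e^t + b_e) \le \sum_{t=1}^{T} \sum_{e \in E} l_e^* (c_e (l_e^t + 1) + b_e) = \sum_{t=1}^{T} \sum_{e \in E} (c_e l_e^* l_e^t + c_e l_e^* + b_e l_e^*)$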
45
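Theorem 5.1 – cont’d
Geometric mean is smaller than arithmetic mean, so:
  $ij = \sqrt{i^2 j^2} \le (i^2 + j^2)/2$
Recall our equation
  (1) $\sum_{t=1}^{T} \sum_{e \in E} l_e^t (c_e l_e^t + b_e) \le \sum_{t=1}^{T} \sum_{e \in E} (c_e l_e^* l_e^t + c_e l_e^* + b_e l_e^*)$
  (2) $\sum_{t=1}^{T} \sum_{e \in E} (c_e (l_e^t)^2 + b_e l_e^t) \le \sum_{t=1}^{T} \sum_{e \in E} \left(\frac{1}{2} c_e (l_e^t)^2 + \frac{1}{2} c_e (l_e^*)^2 + c_e l_e^* + b_e l_e^*\right)$
  (3) $\frac{1}{2} \sum_{t=1}^{T} \sum_{e \in E} (c_e (l_e^t)^2 + b_e l_e^t) \le \sum_{t=1}^{T} \sum_{e \in E} \left(\frac{1}{2} c_e (l_e^*)^2 + c_e l_e^* + b_e l_e^*\right)$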
46
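Theorem 5.1 – cont’d
Multiplying both sides by two:
  $\sum_{t=1}^{T} \sum_{e \in E} (c_e (l_e^t)^2 + b_e l_e^t) \le \sum_{t=1}^{T} \sum_{e \in E} (c_e (l_e^*)^2 + 2 c_e l_e^* + 2 b_e l_e^*)$
Further relaxing the inequality (loads are integers, so $l_e^* \le (l_e^*)^2$):
  $\sum_{t=1}^{T} \sum_{e \in E} (c_e (l_e^t)^2 + b_e l_e^t) \le 3 \sum_{t=1}^{T} \sum_{e \in E} (c_e (l_e^*)^2 + b_e l_e^*)$
We’re done!
  $\sum_{t=1}^{T} \gamma(s^t) \le 3 \cdot T \cdot OPT$
Note that this simplified chain gives a factor of 3; the factor 2.5 stated in Theorem 5.1 needs a sharper per-facility inequality than the arithmetic-geometric mean step.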
47
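The derivation on these slides ends with a factor of 3, while Theorem 5.1 states 2.5. One standard way to close the gap, assumed here rather than taken from the slides, is the integer inequality $y(x+1) \le \frac{5}{3} y^2 + \frac{1}{3} x^2$ used in the analysis of linear congestion games: applied with $x = l_e^t$ and $y = l_e^*$ it yields $\frac{2}{3} \sum c_e (l_e^t)^2 \le \frac{5}{3} \sum c_e (l_e^*)^2$, which gives the ratio 5/2. The sketch below only brute-force checks that inequality on a finite range; the bound of 200 is arbitrary.

```python
from fractions import Fraction

def holds(x, y):
    """Check y*(x+1) <= (5/3)*y^2 + (1/3)*x^2 with exact arithmetic."""
    return y * (x + 1) <= Fraction(5, 3) * y * y + Fraction(1, 3) * x * x

# Exhaustive check over a finite range of non-negative integer loads.
all_hold = all(holds(x, y) for x in range(200) for y in range(200))

# Pairs where the inequality is tight, which is why 5/3 and 1/3 cannot be improved.
tight = {
    (x, y)
    for x in range(5)
    for y in range(5)
    if y * (x + 1) == Fraction(5, 3) * y * y + Fraction(1, 3) * x * x
}
```

The inequality fails for non-integer values (for example x = 1.5, y = 1), so integrality of the loads is essential.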
Parallel Link Congestion Game
Consider n identical links and k weighted players
Each player selects which link to use (single link)
Each player pays the sum of the weights on the link
48
Parallel Link Congestion Games – cont’d
Claim: In the parallel link congestion game with social cost function as the maximum expected job latency, POTA is 2
Proof.
  Rescale the weights, so that OPT = 1
  Total weight is at most n, and each weight is at most 1
  Total latency in T plays is at most Tn, so there is at least one link e* with latency at most T in total, i.e. average latency l(e*) ≤ 1
  A regret minimizing player will be competitive with moving to e*
  We expect at most $l(e^*) + w_i \le 2$
49
Parallel Links Congestion Game – cont’d
The unweighted case with the sum social utility is called the “load balancing game”
It’s a special case of the atomic congestion game discussed before, thus we will have POA and POTA of at most 2.5
If k >> n (a likely case) and the server speeds are relatively bounded, we can say even more
Theorem 5.6 (no proof):
  In this setting, POTA is 1 + o(1)
Corollary 5.7:
  In this setting, POA is 1 + o(1), even for mixed strategies
50
Parallel Link Congestion Games – cont’d
Usually, we consider the makespan social utility function: the load on the most loaded link
Why doesn’t our argument from before hold?
  Because in general $E[\max\{X\}] \ge \max\{E[X]\}$, and the inequality can be strict
The POA for 2-link games is 3/2 (Koutsoupias and Papadimitriou, 1999)
The POA for n-link games is $\Theta(\log n / \log\log n)$ (Koutsoupias, Mavronicolas, Spirakis, 1999)
Theorem 5.4 (no proof)
  POTA for this game with two links is 3/2.
Theorem 5.5: POTA for this game with n links is $\Omega(\sqrt{n})$.
51
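The gap between the expected maximum and the maximum expectation already shows up with two unit-weight players who each pick one of two links uniformly at random. This tiny instance is an illustration chosen here, not a construction taken from the slides.

```python
from fractions import Fraction
from itertools import product

links = (0, 1)

# Every joint choice of the two players is equally likely.
outcomes = list(product(links, repeat=2))
prob = Fraction(1, len(outcomes))

def load(outcome, link):
    """Number of unit-weight players on the given link."""
    return sum(1 for choice in outcome if choice == link)

# Expected makespan: average of the most loaded link's load.
expected_max = sum(prob * max(load(o, l) for l in links) for o in outcomes)

# Largest expected load of any single link.
max_expected = max(sum(prob * load(o, l) for o in outcomes) for l in links)
```

Here the optimal makespan is 1 (one player per link), the expected makespan is 3/2, and every link's expected load is only 1.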
Theorem 5.5 – Proof Sketch
Consider n links, n players, unit weights. OPT = 1
Resembles what we did in Hotelling (for non-convergence):
  Split the players into $2\sqrt{n}$ groups
  At time t, group $t \bmod 2\sqrt{n}$ plays at link 1, while the rest play on different links: average latency close to 1
  This minimizes regret: for any fixed link, the player would need to share the link most of the time (latency ~2)
  Still, at each time, link 1 has a whole group: maximum latency of $\Omega(\sqrt{n})$
Notice that this holds even for unweighted players!
52
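Theorem 5.5 – Proof Sketch
[Figure: link loads at one time step. A single spot has load $\sqrt{n}/2$; $n - \sqrt{n}/2$ spots have load 1; $\sqrt{n}/2 - 1$ spots have load 0]
$\alpha_i(s) = \frac{1}{2\sqrt{n}} \cdot \frac{\sqrt{n}}{2} + \frac{2\sqrt{n} - 1}{2\sqrt{n}} \le \frac{5}{4}$
$\alpha_i(v_i) \ge \frac{(2\sqrt{n} - 1) \cdot 2}{2\sqrt{n}} = 2 - \frac{1}{\sqrt{n}}$
53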
Agenda
Preliminaries
Hotelling games
Valid games
Atomic congestion games
Algorithmic efficiency
54
Algorithmic Efficiency
Weighted Majority Algorithm (Littlestone, Warmuth)
  We’re assuming a minimizing game with [0,1] loss and n strategies.
  Initialize $w_i^1 = 1$ for all i
  Update $w_i^t = w_i^{t-1} (1 - \eta)^{l_i^{t-1}}$ at time t, where $\eta$ is a small tradeoff parameter (0.01) and $l_i^{t-1}$ is the loss at time t-1
  Expects $O(1/\sqrt{T})$ regret over time T
  Explained in “Algorithmic Game Theory”, chapter 4
Polynomial in the number of strategies (Hotelling, Congestion games)
Not as good in Valid games (the number of strategies is exponential in the size of the groundset)
55
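A minimal sketch of the weighted majority idea, under assumptions made here: weights start at 1, each weight is multiplied by $(1 - \eta)^{\text{loss}}$ after every round, and the player picks strategy i with probability proportional to its weight. To keep the example deterministic it tracks the expected loss instead of sampling; the function name and the toy loss sequence are hypothetical.

```python
def weighted_majority_expected_loss(losses, eta=0.01):
    """Run the multiplicative update and return (expected total loss, final weights).

    losses[t][i] is the loss in [0, 1] of strategy i at round t.
    """
    n = len(losses[0])
    weights = [1.0] * n
    expected_total = 0.0
    for round_losses in losses:
        total_weight = sum(weights)
        # Expected loss when playing i with probability weights[i] / total_weight.
        expected_total += sum(
            w / total_weight * l for w, l in zip(weights, round_losses)
        )
        # Penalize each strategy according to the loss it just suffered.
        weights = [w * (1 - eta) ** l for w, l in zip(weights, round_losses)]
    return expected_total, weights

# Two strategies: the first always loses, the second never does.
toy_losses = [[1, 0], [1, 0], [1, 0]]
expected_total, final_weights = weighted_majority_expected_loss(toy_losses, eta=0.5)

# Loss of the best single strategy in hindsight, and the resulting average regret.
best_fixed = min(sum(r[i] for r in toy_losses) for i in range(2))
average_regret = (expected_total - best_fixed) / len(toy_losses)
```

The large value eta=0.5 is used only to make the numbers easy to follow; each round costs time linear in the number of strategies, which is why the approach suits games with polynomially many strategies.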
Questions?
56