Daniel Ariosa
Ecole Polytechnique Fédérale de Lausanne (EPFL), Institut de Physique de la Matière Complexe
CH-1015 Lausanne, Switzerland
and
Hugo Fort
Instituto de Física, Facultad de Ciencias, Universidad de la República
Montevideo, Uruguay
Statistical Mechanics Applied to Social Sciences
Introduction
Estimating Utilities: Magnetic Systems and Games People Play
... and Adaptive Self-Interested Agents
Extended Estimator Approach for 2x2 Games and its Mapping to the Ising Hamiltonian
Outline
Self-organization into cooperative equilibrium states
The extended estimator formulation
Mapping the iterated game into an Ising model
Classifying Markovian Strategies
Stability of cooperation
- Asynchronous random dynamics
- Synchronous fully connected system
Thermodynamics of Ising mappable strategies
Discussion
Self-organization into cooperative equilibrium states
How do populations of self-interested agents manage to cooperate in order to
satisfy their goals globally or collectively?
A few examples:
- electrons in a superconductor
- local magnetic moments in a ferromagnet
- molecules that cooperate to form cells, cells that cooperate to form
living creatures that in turn cooperate to form societies ...
Different approaches (paradigms and extremal principles):
- Biology: Darwin’s evolution → fitness maximization
- Economics: « Homo economicus » → profit maximization
- Physics: statistical thermodynamics → free energy minimization
Game theory and the Prisoner’s dilemma
- The Prisoner’s dilemma models the social behavior of "selfish" individuals
- 2x2 game in normal form:
i) 2 players, each confronting 2 choices : to cooperate (C) or to defect (D)
ii) each player makes his choice without knowing what the other will do
iii) there is a 2×2 matrix specifying the payoffs of each player for the 4 possible outcomes:
$$M = \begin{pmatrix} R & S \\ T & P \end{pmatrix}$$
with the conditions
$$T > R > P > S \quad\text{and}\quad 2R > S + T$$
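The two conditions on the payoff matrix can be checked mechanically. The sketch below is only an illustration: the function name and the numerical payoffs (R = 3, S = 0, T = 5, P = 1) are assumptions chosen for the example, not values taken from the slides.

```python
def is_prisoners_dilemma(R, S, T, P):
    """Return True when the payoffs satisfy T > R > P > S and 2R > S + T."""
    return T > R > P > S and 2 * R > S + T


# Illustrative payoffs (an assumption of this example).
print(is_prisoners_dilemma(R=3, S=0, T=5, P=1))
# Same ordering but 2R < S + T, so the second condition fails.
print(is_prisoners_dilemma(R=3, S=0, T=7, P=1))
```

The second condition rules out the case where alternating C/D exploitation would pay more, on average, than mutual cooperation.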
Elementary Markovian strategies and estimates (aspiration levels) for iterated games
[Figure: payoff axis ordered S < P < R < T, each payoff labelled by its outcome (own move ; other's move): S = (C ; D), P = (D ; D), R = (C ; C), T = (D ; C).]
- The player updates its behavior (C or D) according to the outcome of the preceding run.
- A simple updating rule (= strategy) consists in comparing the obtained utilities ($u$) with a
given estimate ($\epsilon$) for the expected income: the player changes its behavior when $u < \epsilon$ and keeps it otherwise.
- Example: The “PAVLOV” strategy
Behavior: “win-stay, lose-shift”
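Estimate: $P < \epsilon < R$
Character: $(p_R, p_T, p_S, p_P) = (1, 0, 0, 1)$, where $p_X$ is the probability of cooperating after receiving the payoff $X$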
- More examples:
“Retaliator” strategy:
- Behavior: retaliates when the other player defects
- Estimate: $S < \epsilon < P$
- Character: $(p_R, p_T, p_S, p_P) = (1, 0, 0, 0)$
“Tit-for-Tat” strategy:
- Behavior: cooperates on the first move and then reproduces what the other player did
in the preceding move.
- Estimate (conditional): $S < S^*$ ; $P > P^*$ ; $R > R^*$ ; $T < T^*$
- Character: $(p_R, p_T, p_S, p_P) = (1, 1, 0, 0)$
[Figure: payoff axis S < P < R < T with outcome labels (C ; D), (D ; D), (C ; C), (D ; C), showing the conditional estimates S*, P*, R*, T* and the Retaliator estimate.]
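The characters quoted for these strategies follow from the rule "flip when the payoff is below the estimate". The sketch below derives them; the payoff values, the estimate values and all names (`PAYOFF`, `character`, `fixed`) are assumptions made for the example, chosen only to satisfy the inequalities on the slides.

```python
# Illustrative payoffs satisfying T > R > P > S and 2R > S + T (an assumption).
PAYOFF = {"R": 3.0, "T": 5.0, "S": 0.0, "P": 1.0}
# Own move that produced each payoff: R and S are earned while cooperating.
OWN_MOVE = {"R": "C", "T": "D", "S": "C", "P": "D"}


def character(estimate):
    """Return (p_R, p_T, p_S, p_P) for a deterministic estimate-based strategy.

    `estimate` maps each outcome to the estimate it is compared with;
    the player flips its move when payoff < estimate.
    """
    probs = []
    for outcome in ("R", "T", "S", "P"):
        flips = PAYOFF[outcome] < estimate[outcome]
        cooperated = OWN_MOVE[outcome] == "C"
        # Next move is C if the player cooperated and stays, or defected and flips.
        probs.append(int(cooperated != flips))
    return tuple(probs)


def fixed(eps):
    """Single (unconditional) estimate: the same value for every outcome."""
    return {outcome: eps for outcome in PAYOFF}


pavlov = character(fixed(2.0))        # P < eps < R
retaliator = character(fixed(0.5))    # S < eps < P
# Conditional estimate with S < S*, P > P*, R > R*, T < T*.
tit_for_tat = character({"R": 2.5, "T": 5.5, "S": 0.5, "P": 0.5})
print(pavlov, retaliator, tit_for_tat)
```

Only Tit-for-Tat needs an estimate that depends on the outcome; Pavlov and Retaliator are reproduced by a single number.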
The extended estimator formulation
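- Behavior (state) → “SPIN”:
$$S_c = \begin{pmatrix} c \\ 1 - c \end{pmatrix}, \qquad c = \begin{cases} 1 & \text{for C} \\ 0 & \text{for D} \end{cases}$$
- Estimate → Estimator payoff matrix (EPM):
$$M^* = \begin{pmatrix} R^* & S^* \\ T^* & P^* \end{pmatrix}$$
- Updating rule: player (i) FLIPS $\iff u_i < \epsilon_i$, with utilities and estimate
$$u_i = S_i^T M S_j, \qquad \epsilon_i = S_i^T M^* S_j$$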
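A minimal sketch of this updating rule, assuming illustrative numbers: the payoff matrix `M` and the estimator payoff matrix `M_STAR` below are hypothetical values that satisfy the Tit-for-Tat inequalities (S < S*, P > P*, R > R*, T < T*), and the helper names are not from the slides.

```python
def state(c):
    """State vector S_c = (c, 1 - c), with c = 1 for C and c = 0 for D."""
    return (c, 1 - c)


def bilinear(s_i, matrix, s_j):
    """Compute S_i^T M S_j for a 2x2 matrix given as nested tuples."""
    return sum(s_i[a] * matrix[a][b] * s_j[b] for a in range(2) for b in range(2))


def next_move(c_i, c_j, M, M_star):
    """Player i flips when its utility is below its estimate, else keeps its move."""
    s_i, s_j = state(c_i), state(c_j)
    utility = bilinear(s_i, M, s_j)
    estimate = bilinear(s_i, M_star, s_j)
    return 1 - c_i if utility < estimate else c_i


M = ((3.0, 0.0), (5.0, 1.0))        # ((R, S), (T, P)), illustrative
M_STAR = ((2.5, 0.5), (5.5, 0.5))   # ((R*, S*), (T*, P*)), Tit-for-Tat-like

for c_i in (1, 0):
    for c_j in (1, 0):
        print(c_i, c_j, "->", next_move(c_i, c_j, M, M_STAR))
```

With this choice of M*, the next move of player i always equals the last move of player j, which is the Tit-for-Tat behavior.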
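Mapping the iterated game into an Ising model
• Story line: two-valued variable → Ising spin; updating rule for c → Metropolis algorithm (T = 0)
$$\sigma_i = \begin{cases} +\tfrac12 & \text{if agent } i \text{ is in the C state} \\ -\tfrac12 & \text{if agent } i \text{ is in the D state} \end{cases} \qquad \text{with } \sigma_i = c_i - \tfrac12$$
Flipping condition (utilities $u_i$ below the estimate $\epsilon_i$):
$$u_i - \epsilon_i \equiv \Delta_i(\sigma_i ; \sigma_j) < 0 \iff E(-\sigma_i) - E(\sigma_i) < 0$$
The Ising Hamiltonian:
$$H = -J \sum_{\langle i,j \rangle} \sigma_i \sigma_j - h \sum_i \sigma_i$$
Energy density associated with the flip:
$$E(-\sigma_i) - E(\sigma_i) \equiv \Delta E_i = 2\,(J z\, \bar\sigma_{nn} + h)\,\sigma_i$$
where $z$ is the number of nearest neighbors (nn) and $\bar\sigma_{nn}$ their average spin.
The link:
$$\Delta_i(\sigma_i ; \bar\sigma_{nn}) \equiv \Delta E_i$$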
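Explicit form of the utilities:
- when playing against a single player (j):
$$u_i = S_i^T M S_j = (R - S - T + P)\,c_i c_j + (S - P)\,c_i + (T - P)\,c_j + P$$
- when playing against z nearest neighboring (nn) players:
$$u_i = S_i^T M \sum_{nn} S_{nn} = S_i^T M\, z\, \bar S_{nn}$$
- In terms of the Ising variables ($\sigma = c - \tfrac12$):
$$u_i = (R - S - T + P)\,\sigma_i \sigma_j + \tfrac12 (R + S - T - P)\,\sigma_i + \tfrac12 (R - S + T - P)\,\sigma_j + \tfrac14 (R + S + T + P)$$
$$u_i = z\,(R - S - T + P)\,\sigma_i \bar\sigma_{nn} + \tfrac{z}{2} (R + S - T - P)\,\sigma_i + \tfrac{z}{2} (R - S + T - P)\,\bar\sigma_{nn} + \tfrac{z}{4} (R + S + T + P)$$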
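The change of variables can be checked by brute force over the four outcomes. This is only a sanity-check sketch; the payoff values and function names are assumptions of the example.

```python
import math
from itertools import product

R, S, T, P = 3.0, 0.0, 5.0, 1.0   # illustrative payoffs (an assumption)


def utility_c(c_i, c_j):
    """Utility of player i in the c = 0/1 representation."""
    return (R - S - T + P) * c_i * c_j + (S - P) * c_i + (T - P) * c_j + P


def utility_sigma(s_i, s_j):
    """Same utility in the Ising variables sigma = c - 1/2."""
    return ((R - S - T + P) * s_i * s_j
            + 0.5 * (R + S - T - P) * s_i
            + 0.5 * (R - S + T - P) * s_j
            + 0.25 * (R + S + T + P))


EXPECTED = {(1, 1): R, (1, 0): S, (0, 1): T, (0, 0): P}
for c_i, c_j in product((1, 0), repeat=2):
    u_c = utility_c(c_i, c_j)
    u_s = utility_sigma(c_i - 0.5, c_j - 0.5)
    print((c_i, c_j), u_c, u_s, math.isclose(u_c, u_s))
```

Both forms return R, S, T, P for the outcomes (C ; C), (C ; D), (D ; C), (D ; D) respectively.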
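The mapping:
$$\Delta E_i = 2 J z\, \sigma_i \bar\sigma_{nn} + 2 h\, \sigma_i$$
and
$$u_i = z\,(R - S - T + P)\,\sigma_i \bar\sigma_{nn} + \tfrac{z}{2} (R + S - T - P)\,\sigma_i + \tfrac{z}{2} (R - S + T - P)\,\bar\sigma_{nn} + \tfrac{z}{4} (R + S + T + P)$$
The estimate $\epsilon_i = S_i^T M^* z\, \bar S_{nn}$ has the same form with starred payoffs, so the link $\Delta_i(\sigma_i ; \bar\sigma_{nn}) \equiv u_i - \epsilon_i = \Delta E_i$ requires the term in $\bar\sigma_{nn}$ alone and the constant term to vanish:
$$R^* + T^* = R + T \quad\text{and}\quad S^* + P^* = S + P$$
Writing the two remaining parameters as
$$\delta_1 \equiv S^* - S = P - P^*, \qquad \delta_2 \equiv R^* - R = T - T^*$$
the Ising couplings are
$$J = \delta_1 - \delta_2 \quad\text{and}\quad h = -\frac{z}{2}\,(\delta_1 + \delta_2)$$
[Figure: payoff axis S < P < R < T with outcome labels (C ; D), (D ; D), (C ; C), (D ; C), marking the offsets $\delta_1$ and $\delta_2$.]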
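The link between the game and the Ising model can be tested numerically. The sketch below assumes the identification used here, namely offsets d1 = S* - S = P - P* and d2 = R* - R = T - T*, with J = d1 - d2 and h = -(z/2)(d1 + d2). It checks that the utility minus the estimate equals the flip energy 2(Jz s_nn + h) s_i for every configuration of an agent and its z neighbors. Payoff values, offsets and function names are illustrative assumptions.

```python
from itertools import product


def utilities(c_i, neighbours, r, s, t, p):
    """Sum of S_i^T M S_j over the neighbours, in the c = 0/1 representation."""
    return sum((r - s - t + p) * c_i * c_j + (s - p) * c_i + (t - p) * c_j + p
               for c_j in neighbours)


def mapping_holds(d1, d2, z, payoffs=(3.0, 0.0, 5.0, 1.0)):
    """Check u_i - eps_i == 2 (J z sigma_nn + h) sigma_i for all 2^(z+1) configurations."""
    R, S, T, P = payoffs
    starred = (R + d2, S + d1, T - d2, P - d1)   # (R*, S*, T*, P*)
    J, h = d1 - d2, -0.5 * z * (d1 + d2)
    for c_i, *neighbours in product((0, 1), repeat=z + 1):
        delta = (utilities(c_i, neighbours, R, S, T, P)
                 - utilities(c_i, neighbours, *starred))
        sigma_i = c_i - 0.5
        sigma_nn = sum(neighbours) / z - 0.5
        delta_E = 2 * (J * z * sigma_nn + h) * sigma_i
        if abs(delta - delta_E) > 1e-9:
            return False
    return True


print(mapping_holds(d1=0.5, d2=-0.5, z=4))   # Tit-for-Tat-like offsets: J > 0, h = 0
```

With d1 > 0 and d2 < 0 (the Tit-for-Tat inequalities) the coupling J is positive, i.e. ferromagnetic, which matches the imitation behavior of that strategy.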
Classifying Markovian strategies
Stability of cooperation
• Average cooperation:
Fraction c of agents in the C-state
a) Steady state cooperation (asymptotic): c*
b) Ground state cooperation (equilibrium at T=0): $c_{eq} = \tfrac12 + \langle \sigma_i \rangle_0$
Synchronous Dynamics:
All agents simultaneously update their states in one round.
Asynchronous Dynamics: The update is carried out by the subset of agents who just played.
Asynchronous Random Dynamics (ARD):
A pair of players is randomly chosen for each round.
• Stability in Asynchronous Random Dynamics (ARD):
$$\Delta c \propto -c\,\big[\,c\,(1 - p_R) + (1 - c)(1 - p_S)\,\big] + (1 - c)\,\big[\,c\,p_T + (1 - c)\,p_P\,\big] = 0$$
Example for the PAVLOV (1 0 0 1) strategy:
$$-c\,(1 - c) + (1 - c)(1 - c) = 0 \;\Rightarrow\; c = \tfrac12 \ \text{or}\ c = 1$$
Only c* = 1/2 is a stable solution (as soon as all but one of the agents cooperate, the system is rapidly driven away from c = 1).
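The ARD balance equation and the stability of its roots can be explored numerically. The sketch below assumes the character ordering (p_R, p_T, p_S, p_P) and tests stability through the sign of the drift on either side of a fixed point; the function names and the step size are choices made for this example.

```python
def drift(c, character):
    """Net rate of change of the cooperation fraction c in ARD.

    `character` is (p_R, p_T, p_S, p_P): probabilities of cooperating after
    receiving R, T, S, P respectively.
    """
    p_R, p_T, p_S, p_P = character
    loss = c * (c * (1 - p_R) + (1 - c) * (1 - p_S))   # C agents switching to D
    gain = (1 - c) * (c * p_T + (1 - c) * p_P)         # D agents switching to C
    return gain - loss


def is_stable(c_star, character, step=1e-4):
    """A fixed point is stable when the drift pushes c back from both sides."""
    pushed_up_from_below = drift(c_star - step, character) > 0
    at_upper_edge = c_star + step > 1
    pushed_down_from_above = at_upper_edge or drift(c_star + step, character) < 0
    return pushed_up_from_below and pushed_down_from_above


PAVLOV = (1, 0, 0, 1)
for c_star in (0.5, 1.0):
    print(c_star, drift(c_star, PAVLOV), is_stable(c_star, PAVLOV))
```

For PAVLOV the drift reduces to (1 - c)(1 - 2c): just below c = 1 it is negative, so a single defector is enough to pull the system away from full cooperation and towards c* = 1/2.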
• Stability in the Synchronous Fully Connected System (SFC):
A) The FES case
The equilibrium state strongly depends upon the initial configuration.
Obtained utilities as a function of c:
$$u_i = S_i^T M \sum_{nn} S_{nn} = S_i^T M\, z\, \bar S_{nn} \;\to\; (N - 1) \times \begin{cases} c\,R + (1 - c)\,S & \text{for } S_i = C \\ c\,T + (1 - c)\,P & \text{for } S_i = D \end{cases}$$
• A stable configuration is reached when all players get a payoff
greater than the estimate or, in other words, when $\epsilon$ is lower than the cooperator’s utilities (at a given c, cooperators always earn less than defectors, since R < T and S < P).
• Marginal stability is reached also when all players defect and $\epsilon$ is
lower than the defector’s utilities.
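A minimal sketch of the synchronous fully connected dynamics for a fixed estimate. It makes assumptions that are not stated on the slides: the estimate `eps` is compared with the average payoff per game, i.e. the total utilities divided by N - 1, and the payoff values, population size and function name are illustrative.

```python
def sfc_trajectory(c0, eps, payoffs=(3.0, 0.0, 5.0, 1.0), n_agents=100, rounds=6):
    """Cooperation fraction over successive synchronous rounds, fixed estimate eps.

    Every agent plays all the others, then all agents whose average payoff
    per game is below eps flip simultaneously.
    """
    R, S, T, P = payoffs
    n_coop = round(c0 * n_agents)
    history = [n_coop / n_agents]
    others = n_agents - 1
    for _ in range(rounds):
        n_def = n_agents - n_coop
        # A cooperator meets n_coop - 1 cooperators; a defector meets n_coop of them.
        coop_flip = n_coop > 0 and ((n_coop - 1) * R + n_def * S) / others < eps
        def_flip = n_def > 0 and (n_coop * T + (n_def - 1) * P) / others < eps
        n_coop = (0 if coop_flip else n_coop) + (n_def if def_flip else 0)
        history.append(n_coop / n_agents)
    return history


# PAVLOV-like estimate (P < eps < R) from two different initial conditions.
print(sfc_trajectory(c0=0.3, eps=2.0))
print(sfc_trajectory(c0=0.8, eps=2.0))
```

With these illustrative numbers, the run started at c0 = 0.3 passes through full defection before settling at c = 1, while the run started at c0 = 0.8 is frozen from the first round, which illustrates the dependence on the initial configuration.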
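Phase diagram for the FES case
[Figure: phase diagram for the FES case, with the estimate located on the payoff scale S < P < R < T and the initial cooperation c0 running from 0 to 1. Strategy labels: FR, RET, PAV, AMB, ALT. Region labels: FROZEN, OSCILLATING (0 <-> 1), c* = 1, c* = 0.]
Phase diagram for the steady state cooperation in IMS
[Figure: phase diagram for the steady state cooperation in IMS. Strategy labels: CON, AD, AC, TFT. Region labels: c* = 1, c* = 0, c* = c0, c* = 1 - c0, OSCILLATING (c0 <-> 1 - c0 and 0 <-> 1).]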
Thermodynamics of Ising-mappable strategies
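Fully connected system (z = N-1):
$$H = -J \sum_{\langle i,j \rangle} \sigma_i \sigma_j - h \sum_i \sigma_i = -\frac{J}{2} \Big( N^2 \bar\sigma^2 - N \sigma_i^2 \Big) - h N \bar\sigma = -N \Big[ \frac{J}{2}\, N \bar\sigma^2 + h\, \bar\sigma - \frac{J}{8} \Big]$$
with $\bar\sigma = \frac{1}{N} \sum_i \sigma_i$ and $\sigma_i^2 = \tfrac14$. At T = 0 the free energy reduces to the internal energy U:
$$F = U = -N \Big[ \frac{J}{2}\, N \bar\sigma^2 + h\, \bar\sigma - \frac{J}{8} \Big]$$
Free energy functional in terms of mapping parameters (offsets $\delta_1 \equiv S^* - S = P - P^*$ and $\delta_2 \equiv R^* - R = T - T^*$, so that $J = \delta_1 - \delta_2$ and $h = -\tfrac{z}{2}(\delta_1 + \delta_2)$), for N ≫ 1:
$$F \simeq -\frac{N^2}{2} \Big[ (\delta_1 - \delta_2)\, \bar\sigma^2 - (\delta_1 + \delta_2)\, \bar\sigma \Big]$$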
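The ground state cooperation can be found by scanning the allowed magnetizations and keeping the one with the lowest free energy. The sketch below assumes the T = 0 fully connected form F = -N[(J/2) N s^2 + h s - J/8] with J = d1 - d2 and h = -((N - 1)/2)(d1 + d2), where d1 = S* - S = P - P* and d2 = R* - R = T - T*. Function names and the numerical offsets are illustrative.

```python
def free_energy(sigma, d1, d2, n=1000):
    """T = 0 free energy of the fully connected system (z = n - 1)."""
    J = d1 - d2
    h = -0.5 * (n - 1) * (d1 + d2)
    return -n * (0.5 * J * n * sigma ** 2 + h * sigma - J / 8)


def ground_state_cooperation(d1, d2, n=1000):
    """Cooperation fraction c = 1/2 + sigma minimizing F over the n + 1 allowed values."""
    best_k = min(range(n + 1), key=lambda k: free_energy(k / n - 0.5, d1, d2, n))
    return best_k / n


print(ground_state_cooperation(1.0, 0.5))     # d1 > 0, d2 > 0 (AD-like)
print(ground_state_cooperation(-1.0, -0.5))   # d1 < 0, d2 < 0 (AC-like)
print(ground_state_cooperation(-0.2, 0.6))    # d1 < 0 < d2 (CON-like), interior minimum
```

For J > 0 the minimum sits at one of the boundaries (c = 0 or c = 1, selected by the sign of h), and for h = 0 the two boundaries are degenerate. For J < 0 the minimum can be interior, and in the large-N limit it approaches d1 / (d1 - d2).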
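Mean field approximation:
$$H_{MF} = -J \sum_i \sigma_i\, z \langle \sigma_j \rangle - h \sum_i \sigma_i = -(zJ\bar\sigma + h) \sum_i \sigma_i \equiv -\tilde h \sum_i \sigma_i$$
Partition function:
$$Z_{MF} = \mathrm{Tr}\, \exp(-\beta H_{MF}) = \Big[ 2 \cosh \frac{\beta \tilde h}{2} \Big]^N$$
Average magnetization:
$$\bar\sigma = -\frac{1}{N} \frac{dF_{MF}}{dh} = \frac{1}{N \beta} \frac{d \ln Z_{MF}}{dh} = \frac12 \tanh \Big( \beta\, \frac{zJ\bar\sigma + h}{2} \Big)$$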
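The self-consistency condition for the average magnetization, s = (1/2) tanh(beta (z J s + h) / 2), has no closed-form solution in general. A simple fixed-point iteration is one way to solve it. The sketch below is an assumption-laden illustration: the function name, starting value, tolerance and parameter values are all choices made for the example.

```python
import math


def mean_field_magnetization(J, h, z, beta, sigma0=0.1, tol=1e-12, max_iter=10000):
    """Iterate sigma -> 0.5 * tanh(beta * (z*J*sigma + h) / 2) until it stops changing."""
    sigma = sigma0
    for _ in range(max_iter):
        new_sigma = 0.5 * math.tanh(0.5 * beta * (z * J * sigma + h))
        if abs(new_sigma - sigma) < tol:
            return new_sigma
        sigma = new_sigma
    return sigma


# Zero field, ferromagnetic coupling: the slope of the map at sigma = 0 is beta*z*J/4.
hot = mean_field_magnetization(J=1.0, h=0.0, z=4, beta=0.1)    # slope 0.1: disordered
cold = mean_field_magnetization(J=1.0, h=0.0, z=4, beta=10.0)  # slope 10: ordered
print(hot, cold)
print("equilibrium cooperation:", 0.5 + hot, 0.5 + cold)
```

The equilibrium cooperation follows as c = 1/2 + s. At zero field a non-zero solution appears only when beta z J / 4 exceeds 1, and the branch reached depends on the sign of the starting value `sigma0`.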
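[Figure: phase diagram for the ground state cooperation in IMS, in the plane of the two mapping parameters $\delta_1$ and $\delta_2$. Strategy labels: CON, AD, AC, TFT. Region labels: $c_{eq} = 0$, $c_{eq} = 1$ and $c_{eq} = \delta_1 / (\delta_1 - \delta_2)$.]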
Summary
• All the relevant elementary Markovian strategies for the Prisoner’s dilemma have been formulated in terms of an extended (conditional) estimate.
• A subset of these strategies (FR ; RET ; PAV ; AMB ; ALT) admits a fixed (single) estimate.
• Another subset (AD ; TFT ; CON ; AC) has been mapped onto an Ising Hamiltonian with 2 parameters.
• The remaining (7) strategies can also be formulated in terms of estimators involving more than two parameters (3, 4).
• A straightforward application of the thermodynamic approach of IMS consists of finding the ground state cooperation of complex systems of interacting agents, which is clearly different from the steady state of the iterated game. Finite temperature leads to “generous” strategies allowing more efficient equilibrium states in the “fitness” landscape.
Future work: The mapping on Hamiltonian systems can be exploited in many ways:
- Replacing the two-valued state [C or D] by a continuous variable (richer systems such as the XY model)
- Evolution (Darwinian) and learning models:
Instead of cooperation, consider the strategy (e.g. the FES $\epsilon^*$) as the site variable (order parameter?). Within this context, thermal fluctuations represent spontaneous “mutations” and the Boltzmann energy factor will operate the selection.
- Heterogeneous estimates varying from site to site (e.g. spin glass model)