Daniel Ariosa Ecole Polytechnique Fédérale de Lausanne (EPFL) Institut de Physique de la Matière Complexe CH-1015 Lausanne, Switzerland and Hugo Fort Instituto

Daniel Ariosa

Ecole Polytechnique Fédérale de Lausanne (EPFL)

Institut de Physique de la Matière Complexe

CH-1015 Lausanne, Switzerland

and

Hugo Fort

Instituto de Física, Facultad de Ciencias

Universidad de la República

Montevideo, Uruguay

Statistical Mechanics Applied to Social Sciences

________________________________________________________________________

Introduction

Estimating Utilities: Magnetic Systems and Games People Play

... and Adaptive Self-Interested Agents

Extended Estimator Approach for 2x2 Games and its Mapping to the Ising Hamiltonian

Outline

Self-organization into cooperative equilibrium states

The extended estimator formulation

Mapping the iterated game into an Ising model

Classifying Markovian Strategies

Stability of cooperation

- Asynchronous random dynamics

- Synchronous fully connected system

Thermodynamics of Ising mappable strategies

Discussion

Self-organization into cooperative equilibrium states

How do populations of self-interested agents cooperate, or organize themselves so that their goal is satisfied globally or collectively?

A few examples:

- electrons in a superconductor

- local magnetic moments in a ferromagnet

- molecules that cooperate to form cells, cells that cooperate to form

living creatures that in turn cooperate to form societies ...

Different approaches (paradigms and extremal principles):

- Biology: Darwin’s evolution → fitness maximization

- Economics: « Homo economicus » → profit maximization

- Physics: statistical thermodynamics → free energy minimization

Game theory and the Prisoner’s dilemma

- The Prisoner’s dilemma models the social behavior of "selfish" individuals

- 2x2 game in normal form:

i) 2 players, each confronting 2 choices : to cooperate (C) or to defect (D)

ii) each player makes his choice without knowing what the other will do

iii) there is a 2×2 matrix specifying the payoffs of each player for the 4 possible outcomes:

$$M = \begin{pmatrix} R & S \\ T & P \end{pmatrix}$$

with the conditions

$$T > R > P > S \quad \text{and} \quad 2R > S + T$$
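As a small illustration of the normal form above, the following Python sketch stores the four payoffs and checks the two Prisoner's dilemma conditions. The function names and the numerical payoffs are hypothetical, chosen only for this example; the row/column convention (rows: own move, columns: the other player's move) is an assumption consistent with the matrix M.

```python
def is_prisoners_dilemma(R, S, T, P):
    """True when the payoffs satisfy T > R > P > S and 2R > S + T."""
    return T > R > P > S and 2 * R > S + T


def payoff(my_move, other_move, R, S, T, P):
    """Payoff of one player; moves are the strings "C" or "D"."""
    M = [[R, S],   # row: I cooperate; columns: the other plays C, D
         [T, P]]   # row: I defect;    columns: the other plays C, D
    index = {"C": 0, "D": 1}
    return M[index[my_move]][index[other_move]]


# Hypothetical payoff values, chosen only for illustration.
values = dict(R=3, S=0, T=5, P=1)
print(is_prisoners_dilemma(**values))
print(payoff("D", "C", **values))
```

With these values the defector facing a cooperator collects T, the largest payoff, which is what makes defection tempting in a single round.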

Elementary Markovian strategies and estimates (aspiration levels) for iterated games

[Figure: payoff scale in increasing order, with the outcome (own move ; other's move) that produces each payoff: S (C ; D), P (D ; D), R (C ; C), T (D ; C)]

- The player updates its behavior (C or D) according to the outcome of the preceding run.

- A simple updating rule (= strategy) consists in comparing the obtained utilities ($\Delta$) with a given estimate ($\Delta^*$) for the expected income.

- Example: The “PAVLOV” strategy

Behavior: “win-stay, lose-shift”

Estimate: $P < \Delta^* < R$

Character: $(p_R, p_T, p_S, p_P) = (1, 0, 0, 1)$
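The link between a fixed estimate and the character can be sketched in a few lines of Python. The sketch assumes that each entry of the character is the probability of cooperating after receiving the corresponding payoff, and that the player keeps its move when the payoff exceeds the estimate and shifts otherwise. The function name and the numerical payoffs are hypothetical.

```python
def character(estimate, R, S, T, P):
    """Return (pR, pT, pS, pP) for a player with a single fixed estimate.

    Assumption: the player stays when payoff > estimate, shifts otherwise.
    R and S are obtained after playing C; T and P after playing D.
    """
    def stays(payoff):
        return payoff > estimate

    p_R = 1 if stays(R) else 0   # played C: cooperates again only if it stays
    p_S = 1 if stays(S) else 0
    p_T = 0 if stays(T) else 1   # played D: cooperates only if it shifts
    p_P = 0 if stays(P) else 1
    return (p_R, p_T, p_S, p_P)


# Hypothetical payoffs with T > R > P > S; an estimate between P and R.
print(character(estimate=2, R=3, S=0, T=5, P=1))
```

For an estimate between P and R the result is the "win-stay, lose-shift" character (1, 0, 0, 1) quoted for PAVLOV.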

- More examples:

“Retaliator” strategy:

- Behavior: retaliates when the other player defects

- Estimate: $S < \Delta^* < P$

- Character: $(p_R, p_T, p_S, p_P) = (1, 0, 0, 0)$

“Tit-for-Tat” strategy:

- Behavior: cooperate on the first move and then reproduce what the other player did in the preceding move.

- Estimate (conditional): $S < S^*$ ; $P > P^*$ ; $R > R^*$ ; $T < T^*$

- Character: $(p_R, p_T, p_S, p_P) = (1, 1, 0, 0)$

[Figure: payoff scale S (C ; D), P (D ; D), R (C ; C), T (D ; C), with the conditional estimates S*, P*, R*, T* and the Retaliator estimate marked]

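The extended estimator formulation

- Behavior (state): “SPIN”:

$$S_c = \begin{pmatrix} c \\ 1 - c \end{pmatrix}, \qquad c = \begin{cases} 1 & \text{for C} \\ 0 & \text{for D} \end{cases}$$

- Estimate → Estimator payoff matrix (EPM):

$$M^* = \begin{pmatrix} R^* & S^* \\ T^* & P^* \end{pmatrix}$$

- Updating rule: player (i) FLIPS if

$$\Delta_i < \Delta_i^*, \qquad \Delta_i = S_i^T M S_j, \qquad \Delta_i^* = S_i^T M^* S_j$$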
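The updating rule can be tried out with a short Python sketch. It assumes the state vector (c, 1 - c) with c = 1 for C and 0 for D, computes the bilinear form S_i^T M S_j for the payoff matrix and for the estimator payoff matrix, and flips the player when the obtained utility is below the estimate. The function names and all numerical values are hypothetical; the starred matrix is only one possible choice satisfying the Tit-for-Tat conditions S < S*, P > P*, R > R*, T < T*.

```python
def utility(c_i, c_j, matrix):
    """S_i^T M S_j for pure states, with S_c = (c, 1 - c)."""
    s_i = (c_i, 1 - c_i)
    s_j = (c_j, 1 - c_j)
    return sum(s_i[a] * matrix[a][b] * s_j[b]
               for a in range(2) for b in range(2))


def next_state(c_i, c_j, M, M_star):
    """Player i flips when its utility is below the conditional estimate."""
    if utility(c_i, c_j, M) < utility(c_i, c_j, M_star):
        return 1 - c_i
    return c_i


# Hypothetical payoffs (R, S / T, P) and a hypothetical Tit-for-Tat EPM.
M = [[3, 0], [5, 1]]
M_star_tft = [[2.5, 0.5], [5.5, 0.5]]

for c_i in (1, 0):
    for c_j in (1, 0):
        print(c_i, c_j, "->", next_state(c_i, c_j, M, M_star_tft))
```

With this estimator matrix the next state of player i always equals the previous state of player j, which is the Tit-for-Tat behavior.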

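Mapping the iterated game into an Ising model

• Story line:

two-valued variable → Ising spin

updating rule for c → Metropolis algorithm (T = 0)

$$\sigma_i = \begin{cases} +\tfrac{1}{2} & \text{if agent } i \text{ is in the C state} \\ -\tfrac{1}{2} & \text{if agent } i \text{ is in the D state} \end{cases} \qquad \text{with } \sigma_i = c_i - \tfrac{1}{2}$$

Flipping condition:

$$\Delta_i < \Delta_i^*(\sigma_i ; \sigma_j) \iff E(-\sigma_i) < E(\sigma_i)$$

The Ising Hamiltonian:

$$H = J \sum_{\langle i,j \rangle} \sigma_i \sigma_j + h \sum_i \sigma_i$$

Energy density associated with the flip:

$$E(-\sigma_i) - E(\sigma_i) = \Delta E_i = -2\,\big(J z \langle\sigma\rangle_{nn} + h\big)\,\sigma_i$$

The link:

$$\Delta_i - \Delta_i^*\big(\sigma_i ; \langle\sigma\rangle_{nn}\big) \equiv \Delta E_i$$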

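Explicit form of the utilities:

- when playing against a single player (j):

$$\Delta_i = S_i^T M S_j = (R - S - T + P)\,c_i c_j + (S - P)\,c_i + (T - P)\,c_j + P$$

- when playing against z nearest neighboring (nn) players:

$$\Delta_i = \sum_{nn} S_i^T M S_{nn} = S_i^T M\, z\,\langle S_{nn} \rangle_{nn}$$

- In terms of the Ising variables:

$$\Delta_i = (R - S - T + P)\,\sigma_i \sigma_j + \tfrac{1}{2}(R + S - T - P)\,\sigma_i + \tfrac{1}{2}(R - S + T - P)\,\sigma_j + \tfrac{1}{4}(R + S + T + P)$$

and

$$\Delta_i = z\,(R - S - T + P)\,\sigma_i \langle\sigma\rangle_{nn} + \tfrac{z}{2}(R + S - T - P)\,\sigma_i + \tfrac{z}{2}(R - S + T - P)\,\langle\sigma\rangle_{nn} + \tfrac{z}{4}(R + S + T + P)$$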

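The mapping:

$$\Delta E_i = -2 J z\,\sigma_i \langle\sigma\rangle_{nn} - 2 h\,\sigma_i$$

$$\Delta_i = z\,(R - S - T + P)\,\sigma_i \langle\sigma\rangle_{nn} + \tfrac{z}{2}(R + S - T - P)\,\sigma_i + \tfrac{z}{2}(R - S + T - P)\,\langle\sigma\rangle_{nn} + \tfrac{z}{4}(R + S + T + P)$$

$$\Delta_i - \Delta_i^*\big(\sigma_i ; \langle\sigma\rangle_{nn}\big) \equiv \Delta E_i \;\Longrightarrow\; R^* + T^* = R + T \quad \text{and} \quad S^* + P^* = S + P$$

$$\delta_1 \equiv S^* - S = P - P^*, \qquad \delta_2 \equiv R^* - R = T - T^*$$

$$J = \delta_2 - \delta_1 \quad \text{and} \quad h = \frac{z}{2}\,(\delta_1 + \delta_2)$$

[Figure: payoff scale S (C ; D), P (D ; D), R (C ; C), T (D ; C), with the shifts $\delta_1$ and $\delta_2$ marked]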
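The mapping can be checked numerically. The Python sketch below rests on several stated assumptions: Ising spins of value +1/2 (C) and -1/2 (D); the sign convention H = J Σ σ_i σ_j + h Σ σ_i, so that flipping spin i changes the energy by -2 (J z s_nn + h) σ_i, where s_nn is the average spin of the z neighbors; the shifts d1 = S* - S = P - P* and d2 = R* - R = T - T*; and the couplings J = d2 - d1 and h = z (d1 + d2) / 2. Under these assumptions the utility minus the estimate should coincide with the energy change of the flip. Function names and numerical payoffs are hypothetical.

```python
def utility(sigma_i, sigma_nn, z, R, S, T, P):
    """Utility against z neighbors, written in the Ising variables."""
    return z * ((R - S - T + P) * sigma_i * sigma_nn
                + 0.5 * (R + S - T - P) * sigma_i
                + 0.5 * (R - S + T - P) * sigma_nn
                + 0.25 * (R + S + T + P))


def flip_energy(sigma_i, sigma_nn, z, J, h):
    """E(-sigma_i) - E(sigma_i), assuming H = J*sum(s_i s_j) + h*sum(s_i)."""
    return -2.0 * (J * z * sigma_nn + h) * sigma_i


def max_mismatch(d1, d2, z=4, R=3.0, S=0.0, T=5.0, P=1.0):
    """Largest |(utility - estimate) - flip energy| over all local states."""
    starred = dict(R=R + d2, S=S + d1, T=T - d2, P=P - d1)
    J, h = d2 - d1, 0.5 * z * (d1 + d2)
    worst = 0.0
    for sigma_i in (0.5, -0.5):
        for k in range(z + 1):          # k cooperating neighbors out of z
            sigma_nn = k / z - 0.5
            surplus = (utility(sigma_i, sigma_nn, z, R, S, T, P)
                       - utility(sigma_i, sigma_nn, z, **starred))
            worst = max(worst,
                        abs(surplus - flip_energy(sigma_i, sigma_nn, z, J, h)))
    return worst


# Hypothetical Tit-for-Tat-like shifts: d1 > 0, d2 < 0.
print(max_mismatch(0.5, -0.5))
```

A mismatch at the level of rounding error for several choices of d1 and d2 indicates that, with these conventions, the flipping rule of the game and the zero-temperature Metropolis rule select the same moves.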

Classifying Markovian strategies

Stability of cooperation

• Average cooperation:

Fraction c of agents in the C-state

a) Steady state cooperation (asymptotic): c*

b) Ground state cooperation (equilibrium at T=0): $c_{eq} = \tfrac{1}{2} + \langle \sigma_i \rangle_0$

Synchronous Dynamics:

All agents simultaneously update their states in one round.

Asynchronous Dynamics:

The update is carried out by the subset of agents who just played.

Asynchronous Random Dynamics (ARD):

A pair of players is randomly chosen for each round.

• Stability in Asynchronous Random Dynamics (ARD):

$$\dot{c} \propto -c\,\big[\,c\,(1 - p_R) + (1 - c)(1 - p_S)\,\big] + (1 - c)\,\big[\,c\,p_T + (1 - c)\,p_P\,\big] = 0$$

Example for the PAVLOV (1 0 0 1) strategy:

$$-c\,(1 - c) + (1 - c)(1 - c) = 0 \;\Longrightarrow\; c = \tfrac{1}{2} \ \text{ or } \ c = 1$$

Only c* = 1/2 is a stable solution (when all agents but one cooperate, the system is rapidly driven away from c = 1).
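The stability statement for PAVLOV can be illustrated by iterating the ARD balance numerically. The Python sketch below assumes the drift of the cooperation fraction is -c [c (1 - pR) + (1 - c)(1 - pS)] + (1 - c) [c pT + (1 - c) pP], with the character ordered as (pR, pT, pS, pP); the step size, the number of steps and the function names are hypothetical choices for the example.

```python
def ard_drift(c, p_R, p_T, p_S, p_P):
    """Net change of the cooperation fraction per round (up to a constant)."""
    loss = c * (c * (1 - p_R) + (1 - c) * (1 - p_S))   # cooperators leaving C
    gain = (1 - c) * (c * p_T + (1 - c) * p_P)         # defectors moving to C
    return gain - loss


def iterate(c0, character, rate=0.1, steps=2000):
    """Follow the drift with small explicit steps, starting from c0."""
    c = c0
    for _ in range(steps):
        c += rate * ard_drift(c, *character)
    return c


PAVLOV = (1, 0, 0, 1)
for c0 in (0.1, 0.9, 0.99):
    print(c0, "->", round(iterate(c0, PAVLOV), 6))
```

Both c = 1/2 and c = 1 make the drift vanish, but trajectories started slightly below c = 1 move away from it and settle at c = 1/2.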

• Stability in the Synchronous Fully Connected System (SFC):

A) The FES case

The equilibrium state strongly depends upon the initial configuration.

Obtained utilities as a function of c:

$$\Delta_i = \sum_{nn} S_i^T M S_{nn} = S_i^T M\, z\,\langle S_{nn} \rangle_{nn} = (N - 1) \times \begin{cases} c\,R + (1 - c)\,S & \text{for } S_i = C \\ c\,T + (1 - c)\,P & \text{for } S_i = D \end{cases}$$

• A stable configuration is reached when all players get a payoff greater than the estimate or, in other words, when the estimate $\Delta^*$ is lower than the cooperator’s utilities.

• Marginal stability is reached also when all players defect and the estimate $\Delta^*$ is lower than the defector’s utilities.
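A minimal Python sketch of one synchronous round in the fully connected system is given here. It makes simplifying assumptions that are not stated in the slides: utilities are taken per game, that is cR + (1 - c)S for cooperators and cT + (1 - c)P for defectors with the common factor N - 1 divided out, the fixed estimate is expressed on the same per-game scale, and every agent whose utility is below the estimate flips at once. Function names and numerical payoffs are hypothetical.

```python
def sfc_step(c, estimate, R, S, T, P):
    """One synchronous round: every agent below the estimate flips."""
    u_coop = c * R + (1 - c) * S       # per-game utility of a cooperator
    u_defect = c * T + (1 - c) * P     # per-game utility of a defector
    staying_cooperators = 0.0 if u_coop < estimate else c
    converted_defectors = (1 - c) if u_defect < estimate else 0.0
    return staying_cooperators + converted_defectors


def trajectory(c0, estimate, rounds=6, R=3, S=0, T=5, P=1):
    """Cooperation fraction over a few rounds (hypothetical payoffs)."""
    history = [c0]
    for _ in range(rounds):
        history.append(sfc_step(history[-1], estimate, R, S, T, P))
    return history


print(trajectory(0.25, estimate=-1))   # estimate below S
print(trajectory(0.25, estimate=6))    # estimate above T
```

In this sketch an estimate below S leaves the initial configuration frozen, while an estimate above T makes every agent flip each round, so the cooperation fraction alternates between c0 and 1 - c0.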

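Phase diagram for the FES case

[Figure: axes are the estimate (scale S, P, R, T) and the initial cooperation c0 (from 0 to 1). Strategy regions: FR, RET, PAV, AMB, ALT. Regime labels: FROZEN; c* = c0; c* = 0; c* = 1; c* = 1 - c0; OSCILLATING (0 <-> 1); OSCILLATING (c0 <-> 1 - c0)]

Phase diagram for the steady state cooperation in IMS

[Figure: ($\delta_1$, $\delta_2$) plane with quadrants CON, AD, AC, TFT. Regime labels: c* = 0; c* = 1; c0; OSCILLATING (0 <-> 1)]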

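Thermodynamics of Ising-mappable strategies

Fully connected system (z = N - 1):

$$H = J \sum_{\langle i,j \rangle} \sigma_i \sigma_j + h \sum_i \sigma_i = \frac{J}{2}\Big[N^2 \langle\sigma\rangle^2 - N \langle\sigma_i^2\rangle\Big] + h N \langle\sigma\rangle = N\Big[\frac{J}{2} N \langle\sigma\rangle^2 + h \langle\sigma\rangle - \frac{J}{8}\Big] \equiv U$$

Free energy functional in terms of mapping parameters:

$$F \simeq N\Big[\frac{J}{2} N \langle\sigma\rangle^2 + h \langle\sigma\rangle - \frac{J}{8}\Big] \simeq \frac{N^2}{2}\Big[(\delta_2 - \delta_1)\,\langle\sigma\rangle^2 + (\delta_1 + \delta_2)\,\langle\sigma\rangle\Big]$$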
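The ground state cooperation can be estimated by minimizing the free energy functional over the allowed magnetization. The Python sketch below assumes the large-N form F proportional to (d2 - d1) s^2 + (d1 + d2) s, with s the average spin restricted to the interval from -1/2 to 1/2, where d1 and d2 stand for the two mapping parameters, and the relation c_eq = 1/2 + s. It uses a plain grid search; the grid size, the tolerance and the function names are hypothetical choices.

```python
def free_energy_density(sigma, d1, d2):
    """F / N^2 for large N, assuming F ~ (N^2/2)[(d2-d1)s^2 + (d1+d2)s]."""
    return 0.5 * ((d2 - d1) * sigma ** 2 + (d1 + d2) * sigma)


def ground_state_cooperation(d1, d2, grid=1001):
    """All cooperation levels c = 1/2 + sigma that minimize F on a grid."""
    sigmas = [-0.5 + k / (grid - 1) for k in range(grid)]
    energies = [free_energy_density(s, d1, d2) for s in sigmas]
    lowest = min(energies)
    return [0.5 + s for s, f in zip(sigmas, energies) if f - lowest < 1e-12]


# Hypothetical parameter choices, one per quadrant of the (d1, d2) plane.
for name, d1, d2 in [("AD", 1, 1), ("AC", -1, -1), ("CON", -1, 1), ("TFT", 1, -1)]:
    print(name, ground_state_cooperation(d1, d2))
```

With these illustrative values the minimum sits at full defection for AD, full cooperation for AC and half cooperation for CON, while the TFT case is degenerate between c = 0 and c = 1.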

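Mean field approximation:

$$H_{MF} = J \sum_i \sigma_i\, z\,\langle\sigma_j\rangle_{ij} + h \sum_i \sigma_i = \sum_i \underbrace{\big(z J \langle\sigma\rangle + h\big)}_{\tilde{h}}\,\sigma_i$$

Partition function:

$$Z_{MF} = \mathrm{Tr}\,\exp(-\beta H_{MF}) = \Big[2\cosh\Big(\frac{\beta \tilde{h}}{2}\Big)\Big]^N$$

Average magnetization:

$$\langle\sigma\rangle = \frac{1}{N}\frac{dF_{MF}}{dh} = -\frac{1}{N\beta}\frac{d \ln Z_{MF}}{dh} = -\frac{1}{2}\tanh\Big(\beta\,\frac{z J \langle\sigma\rangle + h}{2}\Big)$$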
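The self-consistent magnetization can be obtained numerically by a damped fixed-point iteration. The Python sketch below assumes spins of value plus or minus 1/2, the sign convention in which the mean-field energy of spin i is (zJ s + h) times its value (s being the average spin), and therefore the self-consistency condition s = -(1/2) tanh(beta (zJ s + h) / 2). The damping factor, the iteration count and the function name are hypothetical choices.

```python
import math


def mean_field_sigma(J, h, z, beta, damping=0.5, iterations=500):
    """Solve sigma = -0.5*tanh(beta*(z*J*sigma + h)/2) by damped iteration."""
    sigma = 0.0
    for _ in range(iterations):
        target = -0.5 * math.tanh(0.5 * beta * (z * J * sigma + h))
        sigma = (1 - damping) * sigma + damping * target
    return sigma


# Hypothetical AD-like parameters (J = 0, h > 0) at two inverse temperatures.
for beta in (0.5, 5.0):
    sigma = mean_field_sigma(J=0.0, h=2.0, z=4, beta=beta)
    print(beta, "cooperation:", 0.5 + sigma)
```

In this example the cooperation level 1/2 + s stays above zero at finite temperature and approaches the all-defect ground state as beta grows.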

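Phase diagram for the ground state cooperation in IMS

[Figure: ($\delta_1$, $\delta_2$) plane with quadrants CON, AD, AC, TFT. Regime labels: $c_{eq} = 0$; $c_{eq} = 1$; $c_{eq} = \delta_1 / (\delta_1 - \delta_2)$]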

Summary

• All the relevant elementary Markovian strategies for the Prisoner’s dilemma have been formulated in terms of an extended (conditional) estimate.

• A subset of these strategies (FR ; RET ; PAV ; AMB ; ALT) admits a fixed (single) estimate.

• Another subset (AD ; TFT ; CON ; AC) has been mapped on an Ising Hamiltonian with 2 parameters.

• The remaining (7) strategies can also be formulated in terms of estimators involving more than two parameters (3, 4).

• A straightforward application of the thermodynamic approach of IMS consists of finding the ground state cooperation of complex systems of interacting agents, which is clearly different from the steady state of the iterated game. Finite temperature leads to “generous” strategies allowing more efficient equilibrium states in the “fitness” landscape.

Future work:

The mapping on Hamiltonian systems can be exploited in many ways:

- Replacing the two-valued state [C or D] by a continuous variable (richer systems such as the XY model)

- Evolution (Darwinian) and learning models:

Instead of cooperation, consider the strategy (e.g. the FES $\Delta^*$) as the site variable (order parameter?). Within this context, thermal fluctuations represent spontaneous “mutations” and the Boltzmann energy factor will operate the selection.

- Heterogeneous estimates varying from site to site (e.g. spin glass model)
