
Journal of Theoretical Biology 258 (2009) 208–218

Contents lists available at ScienceDirect

Journal of Theoretical Biology


journal homepage: www.elsevier.com/locate/yjtbi

Statistical fluctuations in population bargaining in the ultimatum game: Static and evolutionary aspects

Roberto da Silva *, Gustavo A. Kellermann, Luis C. Lamb

Institute of Informatics, Federal University of Rio Grande do Sul, 91501-970 Porto Alegre RS, Brazil

Article info

Article history: Received 17 June 2008; received in revised form 14 December 2008; accepted 16 January 2009; available online 31 January 2009.

Keywords: Mathematical modelling of biological and economic systems; Statistical fluctuations in dynamical systems; Iterated game theory

0022-5193/$ - see front matter © 2009 Elsevier Ltd. All rights reserved.

doi:10.1016/j.jtbi.2009.01.017

* Corresponding author. Tel.: +55 5132113341.

E-mail addresses: [email protected] (R. da Silva), [email protected] (G.A. Kellermann), [email protected] (L.C. Lamb).

Abstract

We explore the emergent behavior in heterogeneous populations where players negotiate via an ultimatum game: two players are offered a gift, one of them (the proposer) suggests how to divide the offer while the other player (the responder) can either accept or reject the deal. Rejection is detrimental to both players as it results in no earnings. In this context, our contribution is twofold: (i) we consider a population where the distribution of used strategies is constant over time and properties of the random payoff received by the players (average and higher moments) are reported from simple exact methods and corroborated by computer simulations; (ii) the evolution of a population is analyzed via Monte Carlo simulations where agents may change independently the proposing and accepting parameters of their strategy depending on received payoffs. Our results show that evolution leads to a stationary state in which wealth (accumulated payoff) is fairly distributed. As time evolves, an increase in average payoff and a simultaneous variance decrease is observed when we use a dynamics based on a probabilistic version of the saying: “One should not comply with small earnings, but one’s greed must be limited.”

© 2009 Elsevier Ltd. All rights reserved.

1. Introduction

Von Neumann and Morgenstern’s (1953) game theory plays an important role in explaining social features, economic aspects of stock markets and the interaction between living creatures in biological sciences. Their broad definition of a game considers interactions among players under a set of rules representing their actions and the set of possible rewards obtained by each player. Thus, a large number of applications can be modeled by this widely studied theory. It is important to notice that classical game theory is a mathematical framework, and from this (classical) point of view all features are studied under the hypothesis of rational agency. However, this hypothesis does not always hold in real situations. Several works have criticized this assumption (see e.g. Hargreaves Heap and Varoufakis, 2004), indicating that one has to consider other aspects that would make the theory more adaptive or flexible.

This theory, widely accepted by theoretical economists, including contributions by Nash (1950), has also been widely studied by biologists following the pioneering work of Maynard Smith and Price (1973) and Maynard Smith (1982). To put it simply, different strategies can be tested and the results extrapolated to real situations in societies (if one has in mind existing relations among species and among individuals of the same species).

In this evolutionary branch of game theory (Szabo and Fath, 2007) one can ask questions related to equilibrium and survival of strategies, assuming not only one game play, but several plays. Such a theory explains the average dynamic behavior of games under different social contexts: in economics, public good games (da Silva et al., 2006) and minority games (Araujo and Lamb, 2004, 2007; Challet and Zhang, 1997); in biology, the prisoner's and other dilemmas (Szabo and Fath, 2007). It can also model different networks and their influence on emerging collective behaviors.

In our application scenario, the ultimatum game, two players must divide a quantity (a sum of money). One of the players proposes a division (the proposer) and the other can either accept or reject it. If the second player (the responder) accepts it, the values are distributed according to the division established by the proposer. In the case of rejection no earning is distributed to the players.

In human economic experiments, one usually considers a division as close as possible to a fair 50–50 division and rejects division values smaller than 30% (see e.g. Guth et al., 1982; Henrich et al., 2006). However, this would contradict results of classical game theory, which states that participants play under Nash equilibrium (Nash, 1950). Here, Nash equilibrium supposes a rational player who prefers receiving any amount, small or large, to receiving nothing at all. This way, the rational action of the proposer is to gain the largest possible value (leaving the responder with the smallest value). Other experimental results with human subjects (Alvard, 2000) have shown that people do not universally play fairly, but have a preference for fairness in adaptive ways. In a way, our ancestors would be the key to understanding the ultimatum game dynamics in living populations.

Several alternatives have been studied aiming at explaining the differences between actual human behavior and the behavioral models of classical game theory. These alternatives include the presence of factors such as altruism, punishment or reciprocity (Fehr and Fishbacher, 2003), learning process (Abbink et al., 2001; Gale et al., 1995) and strategic evolution (adapting) (Page and Nowak, 2000; Page et al., 2000; Sanchez and Cuesta, 2005).

The main objective of this paper is to offer improvements as well as new information (supported by quantitative numerical methods) about the emergent collective behavior of heterogeneous players negotiating values via an ultimatum game. Our contribution is twofold:

1. We consider a population where the distribution of used strategies is constant over time and properties of the random payoff received by the players (average and higher moments) are reported from simple exact numerical methods and corroborated by extensive computer simulations.

2. We analyze the evolution of a population via Monte Carlo simulations where agents may independently change the proposing and accepting parameters of their strategies depending on received payoffs.

As regards our first contribution, we consider a strategy defined by both accepting and proposing probability distributions. In order to do so, we have analyzed the average payoff accumulated by players under different strategies, performing computational experiments and simulations. An analytical method derived in da Silva and Kellermann (2007) has been used to estimate this value (payoff), and the obtained results have been compared to the ones produced in computational experiments. The variance has also been considered and general formulas have then been derived as a function of the number of players and the fractions of each strategy.

The emerging behavior arising from this heterogeneous population can bring a novel understanding about dominant strategies, evolution of properties in human societies and even in other biological populations, including models for disputes (bargaining) for water, food, resources or territory. It may be the case that, if an individual disagrees about a given division in extreme situations, serious damage can be caused to a population. In the worst case, the life of individuals could be at risk.

With respect to our second contribution, we explore a simple evolutionary and probabilistic dynamics based on the payoff obtained in each simulation step. In this case we consider that a player has a strategy represented by a pair $(w_a, w_p)$, denoting, respectively, the accepting and proposing parameters (cutoffs). It is important to mention that in this model these parameters are independently updated and only $w_a$ of the responder and $w_p$ of the proposer are updated. We have divided our evolutionary approach into two distinct probabilistic dynamics (policies) as follows:

1. Policy I: “One should not comply with small earnings, but one should limit one’s own greed.” The increment on the accepting or proposing cutoffs occurs with probability $1 - (\text{payoff}/\text{maximum payoff})$.

2. Policy II: “One’s earnings will increase one’s own greed.” The increment on the accepting or proposing cutoffs occurs with probability $\text{payoff}/\text{maximum payoff}$.

Our simulations show that policy I leads to a stationary state with a higher average payoff and a smaller variance of payoff. The inverse occurs in policy II, which leads to a smaller average payoff and a larger variance. We consider populations where $(w_a, w_p)$ are uniformly distributed around (maximum payoff)/2 with standard deviation $\sigma$. The influence of $\sigma$ on the temporal evolution of statistical parameters was also analyzed.

Works related to ours include Binmore et al. (1986), where the authors investigate how to use Nash bargaining in two-person economic scenarios. In particular, Binmore et al. (1986) investigate incentive mechanisms for reaching agreements (such incentives include players' impatience and “the fear of negotiation breakdown”).

The paper is organized as follows. In Section 2 we describe the static version of the model, i.e., how to reason in a game with many participants playing in accordance with an ultimatum game when the strategies are static (they do not change over time). We define the strategies and values to be simulated in our heterogeneous ultimatum game population. In Section 2.3 we describe details of the simulations and experiments that were carried out. Here it is important to mention that the algorithm proposed in this section will also be used in the evolutionary simulations. In Section 2.4 we describe exact numerical results (which are possible only in the static version of the model) following the research initiated in da Silva and Kellermann (2007). These results explore the temporal evolution of population payoff as a function of a parameter that determines the stubbornness level of a fraction of the players. These stubborn players aim at always winning the same payoff $w_c$ as proposers, while as responders they only accept values of at least $w_c$. In addition, we analyze the effects on payoff of players endowed with higher probabilities of being proposer/responder. In Section 2.5 we present 3D plots illustrating the payoff distribution as a function of different couplings of the parameters of the sub-populations in mixed binary populations. From these we can visually check the situations in which maximal payoff is obtained. In Section 3 we describe several evolutionary simulations. Time-evolving plots of the average cutoffs $w_a(t)$ and $w_p(t)$ are shown, as well as the temporal evolution of average payoff and its variance. We describe the wealth (accumulated payoff) distribution at two specific time steps, and we analyze the Gini coefficient $g$ of wealth as a function of time. It always decays over time, and for some values of $\sigma$ it decays as a power law, $g(t) \sim t^{-\theta}$, which corroborates a very fair wealth distribution for sufficiently long time scales in both studied policies (I and II). Policy I leads to better alternatives for a population practicing ultimatum bargaining because players have a better yield: the population presents a higher average payoff combined with a smaller variance of this payoff. Section 4 concludes and points out directions for further research.

2. Static version of the bargaining ultimatum game for n players

2.1. General definitions

In the ultimatum game, two players must divide a quantity. One of them proposes a division (the proposer) and the other can either accept or reject the proposal. If the second player (the responder) accepts it, the values are distributed according to the division established by the proposer. In the case of rejection no earning is distributed to the players.

Our model considers a heterogeneous (multi-strategic) population of size $N$, in which individuals play a given number of iterations (or infinite plays for analytical calculations) dividing a quantity $w$. A player has a strategy $s_k$ of proportion $\Phi_k$ in the population, defined by two probability distributions: one of them to propose (a division) and the other to accept a division, with $k = 1, \ldots, p$. In a natural analogy with biological populations, the concept of species could be seen as a group of players with the same strategy (with or without the same parameters). Distinct groups would then represent different species in an ecosystem and thus a payoff analysis for the different species under different situations could be quantitatively obtained under appropriate rules.

Table 1
Players' strategies.

| Strategy | $p(i)$ | $a(i)$ | Player parameters |
|---|---|---|---|
| Uniform | $\frac{1}{w+1}$ | $\frac{1}{2}$ | None |
| Greedy | $\frac{2(i+1)}{(w+1)(w+2)}$ | $\frac{i+1}{w+1}$ | None |
| Altruist | $\frac{2(w-i+1)}{(w+1)(w+2)}$ | $\frac{w-i+1}{w+1}$ | None |
| Rational | $\delta_{i,w-1}$ | $1-\delta_{i,0}$ | None |
| Stubborn (fixed) | $\delta_{i,w_c}$ | $1$ if $w_c \le i \le w$; $0$ if $0 \le i < w_c$ | Cutoff: $w_c$ |
| Fair | $\delta_{i,w_c}$ | $1$ if $w-w_c \le i \le w$; $0$ if $0 \le i < w-w_c$ | Cutoff: $w_c$ |

In our model simulations, for each iteration a pair of players denoted by $(s_{k_1}, s_{k_2})$ is randomly chosen and the proposer is selected according to $\rho_{s_{k_1},s_{k_2}}$: the conditional probability that a player with strategy $s_{k_1}$ is the proposer when her/his opponent has strategy $s_{k_2}$. Similarly, the probability that the player $s_{k_2}$ is the proposer when $s_{k_1}$ is the responder is given by $\rho_{s_{k_2},s_{k_1}} = 1 - \rho_{s_{k_1},s_{k_2}}$. This characteristic mimics a dominance relation of a strategy (analogously, a species) in relation to others existing in the population.
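The role assignment described above can be sketched in a few lines of Python. This is only an illustration: the function name `choose_roles` and the nested-list layout of `rho` are assumptions made here, not part of the original model description.

```python
import random

def choose_roles(k1, k2, rho, rng=random):
    """Return (proposer, responder) as strategy indices.

    rho[k1][k2] is the probability that a player with strategy k1 proposes
    when the opponent has strategy k2; the model requires
    rho[k2][k1] == 1 - rho[k1][k2].
    """
    if rng.random() < rho[k1][k2]:
        return k1, k2
    return k2, k1

# Hypothetical example: strategy 0 always proposes against strategy 1.
rho = [[0.5, 1.0],
       [0.0, 0.5]]
print(choose_roles(0, 1, rho))
```

With $\rho = 1/2$ for every pair, both players are equally likely to propose, which is the setting used in most of the experiments reported below.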

2.2. On strategies

Human behavior is expected to be affected by environmental and internal processes that are not easily reproduced. Such processes are usually regarded as noise in the simulation of physical processes. Behavioral experiments (as reported in e.g. Henrich et al., 2006) have shown that players do not always stick to the same division strategy. In order to model these behaviors, one can make use of a probability distribution to represent the strategy of a player; this may be the first step towards modelling such a complex behavior. In addition, this constitutes an alternative approach with respect to other important contributions (see e.g. Page and Nowak, 2000; Page et al., 2000; Sanchez and Cuesta, 2005; Napel, 2003).

Let $s_k \in S = \{s_1, s_2, \ldots, s_p\}$ be the strategy of a given player, represented by the discrete probability distributions $p_{s_k}(i)$ and $a_{s_k}(i)$, with $i = 0, 1, \ldots, w$ and $k = 1, \ldots, p$. Let us denote by $w_{s_1}$ the quantity required by the proposer and let $w_{s_2} = w - w_{s_1}$ be the value corresponding to the responder in a division. We have $p_{s_1}(i)$ as the probability of the division $(w_{s_1}, w_{s_2}) = (i, w - i)$ being proposed, provided that the proposer has strategy $s_1$, while $a_{s_2}(w - i)$ is the probability that the responder accepts it when her strategy is $s_2$. In this work, the following strategies are considered:

• Uniform strategy: The proposed value is uniformly distributed and the responder accepts any amount with probability 1/2 (playing with a fair coin).

• Greedy strategy: Higher values are more probable to be proposed and accepted.

• Altruist strategy: Lower values are more probable to be proposed and accepted.

• Rational strategy: The Nash equilibrium player: as a proposer (s)he wants the highest possible value $(w - 1)$; as a responder any positive value is accepted.

• Stubborn or fixed strategy: The same value $w_c$ (cutoff) is always proposed. As a responder only quantities equal to or greater than $w_c$ are accepted.

• Fair strategy (Page et al., 2000; Sanchez and Cuesta, 2005): Player proposes a division $(w_c, w - w_c)$, and accepts any value equal to or greater than $w - w_c$.

Table 1 lists the proposition and acceptance functions for each type of strategy, defined for the discrete interval $[0, w]$.
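To make the definitions of Table 1 concrete, the following Python sketch builds the proposition and acceptance distributions for each strategy. The function name `make_strategy` and the use of exact fractions are choices made for this illustration only; they are not taken from the original work.

```python
from fractions import Fraction

def make_strategy(name, w, wc=None):
    """Return (p, a) for a strategy of Table 1.

    p[i] is the probability of proposing to keep i (offering w - i) and
    a[i] is the probability of accepting when receiving i, for i = 0..w.
    wc is the cutoff, used only by the stubborn and fair strategies.
    """
    n = w + 1
    if name == "uniform":
        p = [Fraction(1, n)] * n
        a = [Fraction(1, 2)] * n
    elif name == "greedy":
        p = [Fraction(2 * (i + 1), n * (w + 2)) for i in range(n)]
        a = [Fraction(i + 1, n) for i in range(n)]
    elif name == "altruist":
        p = [Fraction(2 * (w - i + 1), n * (w + 2)) for i in range(n)]
        a = [Fraction(w - i + 1, n) for i in range(n)]
    elif name == "rational":
        p = [Fraction(int(i == w - 1)) for i in range(n)]  # Kronecker delta
        a = [Fraction(int(i != 0)) for i in range(n)]
    elif name == "stubborn":
        p = [Fraction(int(i == wc)) for i in range(n)]
        a = [Fraction(int(i >= wc)) for i in range(n)]
    elif name == "fair":
        p = [Fraction(int(i == wc)) for i in range(n)]
        a = [Fraction(int(i >= w - wc)) for i in range(n)]
    else:
        raise ValueError(f"unknown strategy: {name}")
    return p, a

# Every proposition distribution must sum to one.
for name in ("uniform", "greedy", "altruist", "rational", "stubborn", "fair"):
    p, a = make_strategy(name, w=100, wc=30)
    print(name, sum(p))
```

Exact fractions are used so that the normalization of each $p(i)$ can be checked without rounding error.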

2.3. Computational simulations

In order to explore several different combinations of parameters we have run a number of computational simulations. It is important to notice that when the number of confrontations is very large all players are chosen approximately the same number of times, and properties such as the average received payoff converge to their expected values. However, we are interested in values measured over few iterations, as done in Sanchez and Cuesta (2005).

Two different algorithms were used to run our simulations. First, we repeated the game considering a given number of encounters (or meetings). An encounter corresponds to a randomly chosen pair of players. In short periods there is no guarantee that the players will have the same number of encounters, and fluctuations are really important. The other approach considers a concept of turn to overcome this problem (see Algorithm 1). A turn is composed of iterations where, for each iteration, a pair of players is randomly chosen. However, in each turn every player necessarily participates exactly once (in the sense of performing a matching in a graph).

Algorithm 1. Simulation with turns.

    for each parameter combination do
        create all players in population
        for each turn do
            create a list of players not chosen yet
            repeat
                randomly choose two players in the list
                choose which of them is proposer or responder
                let them play according to their strategies
                remove players from the list
            until the list is empty
        end for
        calculate mean payoff and variance
    end for
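A possible Python rendering of the turn loop of Algorithm 1 is sketched below for a single parameter combination. It is an illustration under stated assumptions: players are represented as small dictionaries of callables, only the fair strategy is shown, the population size is assumed to be even, and shuffling the list and taking consecutive pairs is used as an equivalent way of repeatedly drawing two players that have not played yet. None of the names come from the original work.

```python
import random
from statistics import mean, pvariance

W = 100  # quantity to be divided

def fair_player(wc):
    """Fair strategy of Table 1: keep wc as proposer, accept offers >= W - wc."""
    return {"propose": lambda rng: wc,
            "accept": lambda offer, rng: offer >= W - wc}

def simulate_turns(players, n_turns, rho=0.5, seed=0):
    """Return (mean, variance) of the per-player mean payoff over n_turns."""
    rng = random.Random(seed)
    history = [[] for _ in players]            # payoffs received by each player
    for _ in range(n_turns):
        pool = list(range(len(players)))       # players not chosen yet
        rng.shuffle(pool)                      # random matching of the turn
        while len(pool) >= 2:
            a, b = pool.pop(), pool.pop()      # chosen pair leaves the list
            prop, resp = (a, b) if rng.random() < rho else (b, a)
            keep = players[prop]["propose"](rng)
            offer = W - keep
            if players[resp]["accept"](offer, rng):
                history[prop].append(keep)
                history[resp].append(offer)
            else:                              # rejection: nobody earns
                history[prop].append(0)
                history[resp].append(0)
    means = [mean(h) for h in history]
    return mean(means), pvariance(means)

population = [fair_player(30) for _ in range(200)]
print(simulate_turns(population, n_turns=10))
```

Because every player takes part exactly once per turn, each entry of `history` has the same length, which is the property that distinguishes turns from plain encounters.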

Numerically, we have $T = 2E/N$, where $T$ is the number of turns and $E$ denotes the number of encounters. We emphasize that both algorithms are equivalent when properties are computed after a large number of encounters or turns, and in this situation the algorithms have reproduced the analytical results (Section 2.4). However, for a small number of encounters they compute different results. To check the difference, we performed initial computer simulations comparing simulations with encounters and with turns in a fair population for different cutoffs.

Fig. 1 depicts the accumulated average payoff distribution in 10 turns for a population with 200 fair players with the same cutoff value. Numerically, it is equivalent to 1000 encounters, according to the equation $T = 2E/N$. More precisely, for each set of 10 turns we compute the mean received payoff of every player in this period, and the frequency of these values is accumulated in the histogram. The dispersion reaches a maximum when the cutoff is 0, when a wider histogram is observed. By increasing the cutoff, received values get more concentrated around the mean value and, with a cutoff of 50, all players always receive the same mean of 50 in each turn. However, we can observe a pattern (symmetry) in the obtained histogram: a bell-shaped distribution between two perceptible peaks at $w_c$ and $w - w_c$, except in the degenerate case $w_c = 50$.

[Figure 1: four histograms, for cutoff = 0, 30, 50 and 60; horizontal axis: mean payoff (0 to 100); vertical axis: frequency.]

Fig. 1. Histogram of the mean payoff obtained by players over 10 turns each in a population composed only of fair players with the same cutoff.

These results illustrate a different behavior from the one in Fig. 2, where the same number of encounters (1000) is used, but the players have a different number of participations.
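The difference between the two schemes can be illustrated by counting how often each player participates when pairs are drawn as plain encounters. The short Python sketch below does this for the numbers quoted above (200 players, 1000 encounters); the function name and the seed are arbitrary choices made for the illustration.

```python
import random
from collections import Counter

def participation_counts(n_players, n_encounters, seed=0):
    """Number of encounters each player takes part in under random pairing."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_encounters):
        a, b = rng.sample(range(n_players), 2)   # two distinct players
        counts[a] += 1
        counts[b] += 1
    return [counts[i] for i in range(n_players)]

N, E = 200, 1000
counts = participation_counts(N, E)
print("turns equivalent to E encounters:", 2 * E / N)
print("min/max participations:", min(counts), max(counts))
```

With turns, every player would participate exactly $T = 2E/N = 10$ times; with encounters only the average number of participations equals 10, and individual counts fluctuate around it.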

2.4. Exact results

Firstly, in the static version of the model, we will show exact results for the payoff average as well as its variance. Let us consider $Y_{s_c} \in \{0, 1, 2, \ldots, w\}$, a random variable denoting the payoff obtained by a player with strategy $s_c \in S = \{s_1, s_2, \ldots, s_p\}$ in an encounter. The $m$-th moment of this payoff in a mixed population is given by (see da Silva and Kellermann, 2007)

$$E[Y^m_{s_c}] = \sum_{i=0}^{w} \sum_{k=1}^{p} i^m \, p_{s_c}(i)\, a_{s_k}(w-i)\, \frac{\Phi_{s_k} N - \delta_{k,c}}{N-1}\, \rho_{s_c,s_k} + \sum_{i=0}^{w} \sum_{k=1}^{p} i^m \, a_{s_c}(i)\, p_{s_k}(w-i)\, \frac{\Phi_{s_k} N - \delta_{k,c}}{N-1}\, (1 - \rho_{s_c,s_k}). \tag{1}$$

After considering the first $(m = 1)$ and second $(m = 2)$ moments, we can calculate the dispersion of the “money” obtained by a player with a particular strategy $s_c$, i.e., $\mathrm{var}[Y_{s_c}] = E[Y^2_{s_c}] - E[Y_{s_c}]^2$.

Our first test considers a particular case, fixing $p = 3$, i.e., we have an ensemble with three types of players: $k = 1$ (stubborn, fixed payoff), $k = 2$ (uniform) and $k = 3$ (greedy, probabilistic); see Table 1.

Let us now calculate the expected payoff of a stubborn player in an ensemble of $N$ players. For the sake of simplicity, here we will consider $\rho_{1,2} = \rho_{2,3} = \rho_{1,3} = 1/2$.
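As an illustration of how Eq. (1) can be evaluated, the following Python sketch computes the first two moments for a stubborn player in a population of stubborn, uniform and greedy players. It assumes $\rho = 1/2$ for every pair, as in the text; the helper names and the cutoff value 40 are hypothetical choices for this example.

```python
def stubborn(w, wc):
    """Table 1, stubborn: always keep wc as proposer, accept offers >= wc."""
    p = [1.0 if i == wc else 0.0 for i in range(w + 1)]
    a = [1.0 if i >= wc else 0.0 for i in range(w + 1)]
    return p, a

def uniform(w):
    return [1.0 / (w + 1)] * (w + 1), [0.5] * (w + 1)

def greedy(w):
    norm = (w + 1) * (w + 2)
    p = [2.0 * (i + 1) / norm for i in range(w + 1)]
    a = [(i + 1) / (w + 1) for i in range(w + 1)]
    return p, a

def moment(m, c, strategies, phi, N, w, rho=0.5):
    """m-th payoff moment of strategy c according to Eq. (1).

    strategies[k] = (p_k, a_k); phi[k] is the fraction of strategy k;
    rho is the probability that the focal player is the proposer.
    """
    p_c, a_c = strategies[c]
    total = 0.0
    for k, (p_k, a_k) in enumerate(strategies):
        # probability that the opponent has strategy k (focal player excluded)
        weight = (phi[k] * N - (1 if k == c else 0)) / (N - 1)
        for i in range(w + 1):
            as_proposer = p_c[i] * a_k[w - i]    # c keeps i, k accepts w - i
            as_responder = a_c[i] * p_k[w - i]   # k keeps w - i, c accepts i
            total += i ** m * weight * (rho * as_proposer
                                        + (1 - rho) * as_responder)
    return total

w, N = 100, 200
strategies = [stubborn(w, 40), uniform(w), greedy(w)]
phi = [0.5, 0.25, 0.25]
mean_1 = moment(1, 0, strategies, phi, N, w)
var_1 = moment(2, 0, strategies, phi, N, w) - mean_1 ** 2
print("E[Y_1] =", mean_1, " std =", var_1 ** 0.5)
```

Sweeping the cutoff passed to `stubborn` over $0, \ldots, w$ gives exact curves of the kind that the simulations are compared with.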

The left plot in Fig. 3 depicts the received average payoff of a stubborn player, computed by the simulations. For a large number of turns a convergence to the exact result given by Eq. (1) is verified. In the right plot, a similar analysis for the standard deviation is depicted. For these results a population of 200 agents composed of 50% stubborn, 25% uniform and 25% greedy players has been used.

The jump at $w_c = 50$ corresponds to a transition between two distinct “phases”. Below $w_c = 50$, stubborn players are less conservative, providing a higher volume of negotiation. Above $w_c = 50$, the players are more conservative and although more money can be obtained when one plays as proposer, in each negotiation the volume is lower and the average payoff computed over turns decreases.

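We have also performed exact numerical computations for $\rho_{s_c,s_k} \neq 1/2$. In this case, we have considered a mixed population of 50% stubborn and 50% fair players. Fixing the cutoff of stubborn players, we have then analyzed the payoff average of fair players as a function of the fair cutoff, considering different $\rho$-values (0, 0.2, 0.4, 0.6, 0.8 and 1.0). In this case, $\rho_{1,2} = \rho$ denotes the probability of a fair player being the proposer when her opponent (responder) is a stubborn player; similarly, $\rho_{2,1} = 1 - \rho$ is the probability of a stubborn player being the proposer when her opponent (responder) is a fair player. For “intra-species” confrontations (interactions), naturally $\rho_{1,1} = \rho_{2,2} = 1/2$. So, after numerical computations, we can conclude:

$$E[Y_F] = \begin{cases}
\dfrac{w}{4} + \dfrac{\rho}{2}\, w_c^{(f)} & \text{if } w_c^{(f)} < w_c^{(s)} \wedge w_c^{(f)} \le w - w_c^{(s)}, \\[2ex]
\left(\dfrac{3}{4} - \dfrac{\rho}{2}\right) w + \dfrac{(\rho - 1)}{2}\, w_c^{(s)} + \dfrac{\rho}{2}\, w_c^{(f)} & \text{if } w_c^{(s)} \le w_c^{(f)} \le w - w_c^{(s)}, \\[2ex]
\dfrac{w}{4} & \text{if } w - w_c^{(s)} < w_c^{(f)} < w_c^{(s)}, \\[2ex]
\left(\dfrac{3}{4} - \dfrac{\rho}{2}\right) w + \dfrac{(\rho - 1)}{2}\, w_c^{(s)} & \text{if } w_c^{(f)} \ge w_c^{(s)} \wedge w_c^{(f)} > w - w_c^{(s)},
\end{cases}$$

where $w_c^{(f)}$ is the cutoff of fair players and $w_c^{(s)}$ that of stubborn players. Each case follows from which of the two mixed deals is accepted under the rules of Table 1: the stubborn responder accepts the fair proposal only if $w_c^{(f)} \le w - w_c^{(s)}$, and the fair responder accepts the stubborn proposal only if $w_c^{(f)} \ge w_c^{(s)}$. Transitions between low and high payoff can be observed at two specific fair cutoff values: $w_c$ and $w - w_c$. These transitions are distinct for different values of $\rho$.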
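The piecewise expression for the fair players can be cross-checked against a direct enumeration of the possible encounters. The Python sketch below does so in the large-population limit, where a fair player meets a fair or a stubborn opponent with probability 1/2 each; this limit, the acceptance rules taken from Table 1 and all function names are assumptions of the illustration rather than the authors' own code.

```python
def fair_payoff_enumerated(wf, ws, rho, w=100):
    """Expected payoff of a fair player (cutoff wf) in a 50/50 population
    with stubborn players (cutoff ws), large-population limit."""
    # fair vs fair, each role with probability 1/2: the deal is always accepted
    vs_fair = 0.5 * wf + 0.5 * (w - wf)
    # fair proposes: the stubborn responder accepts w - wf only if it is >= ws
    as_proposer = wf if (w - wf) >= ws else 0
    # stubborn proposes: the fair responder accepts w - ws only if >= w - wf
    as_responder = (w - ws) if (w - ws) >= (w - wf) else 0
    vs_stubborn = rho * as_proposer + (1 - rho) * as_responder
    return 0.5 * vs_fair + 0.5 * vs_stubborn

def fair_payoff_closed_form(wf, ws, rho, w=100):
    """Piecewise closed form, cases checked in the order they are listed."""
    if wf < ws and wf <= w - ws:       # only the fair proposal is accepted
        return w / 4 + rho / 2 * wf
    if ws <= wf <= w - ws:             # both mixed deals are accepted
        return (0.75 - rho / 2) * w + (rho - 1) / 2 * ws + rho / 2 * wf
    if w - ws < wf < ws:               # no mixed deal is accepted
        return w / 4
    # remaining region: only the stubborn proposal is accepted
    return (0.75 - rho / 2) * w + (rho - 1) / 2 * ws

for rho in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    print(rho, [fair_payoff_closed_form(wf, 30, rho) for wf in (10, 50, 90)])
```

For a fixed stubborn cutoff, the values change abruptly when the fair cutoff crosses $w_c^{(s)}$ or $w - w_c^{(s)}$, which are the two transition points mentioned in the text.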

2.5. Comparing fair and stubborn players

In this subsection, we report experiments with mixed populations composed only of fair and/or stubborn players in different situations. We apply the algorithms introduced in the previous section. We then compute the payoff, over a large number of encounters, obtained by a group of players with a given strategy in a population composed of two distinct groups. It is important to note that when two players belong to the same group, they have the same cutoff in any confrontation considered in our analysis. We analyze three distinct cases: (i) a population with two groups of stubborn players, (ii) a population with two groups of fair players and, finally, (iii) a mixed population, i.e., stubborn versus fair players.

[Figure 2: four histograms, for cutoff = 0, 30, 50 and 60; horizontal axis: mean payoff (0 to 100); vertical axis: frequency.]

Fig. 2. Histogram of the mean payoff obtained by players over each 1000 encounters, in a population composed only of fair players with the same cutoff.

Fig. 3. Payoff average of a stubborn player $E[Y_1]$ (left plot) and its standard deviation $\sqrt{\mathrm{var}[Y_1]}$ (right plot) as a function of the cutoff attributed to stubborn players. Here, the population is composed of 100 stubborn (fixed), 50 uniform and 50 greedy players.

Fig. 4 depicts the average payoff in these analyzed situations. We start our analysis by simulating a population composed only of stubborn players divided into two groups (here denoted by stubborn-1 and stubborn-2). The first plot in Fig. 4 (plot a) shows that the stubborn-1 group payoff is greater than that of stubborn-2 when stubborn-2 players participate with a cutoff below 50. However, the average over the whole population (see plot b in Fig. 4) corresponds to the same payoff, equal to 50, when both groups' cutoffs are below 50. This region corresponds to the best opportunity to obtain higher payoffs in the game. However, a player will only have her payoff enlarged at another's expense, because they are disputing the same value with an average equal to 50.

For each stubborn-1 cutoff choice between 1 and 50, the best cutoff response for a stubborn-2 player is 50. If the stubborn-1 player chooses 0, the stubborn-2 player could fix her cutoff to 100 to gain 50 on average; but it means no earnings for the stubborn-1 player. Therefore, the best strategy for both groups is to choose a cutoff of exactly 50.

Next, we report experiments where we have considered a population composed only of fair players, separated into two groups. Plot c of Fig. 4 depicts the received average payoff for one of these fair groups. Along the diagonal (the line in which the cutoffs of the groups are identical) we can observe a maximum payoff situation for both groups, since they are receiving the same payoff. In this situation, any consensual value would be established by the population. However, these points do not have the same variance and, again, 50 is the best choice value for all players since it corresponds to the lowest variability condition.

[Figure 4: six 3D surface plots (a to f) over pairs of group cutoffs, each ranging from 0 to 100. Vertical axes as labelled in the plots: (a) mean payoff of fixed-1 population; (b) mean payoff of all population; (c) mean payoff of reciprocal-1 population; (d) mean payoff of fixed players; (e) mean payoff of reciprocal players; (f) mean payoff of all players. Horizontal axes: fixed-1 and fixed-2 cutoffs in (a) and (b), reciprocal-1 and reciprocal-2 cutoffs in (c), reciprocal and fixed cutoffs in (d) to (f).]

Fig. 4. Average payoff in many different mixed binary populations composed only of stubborn and fair players. Experiments have considered stubborn versus stubborn, fair versus fair and stubborn versus fair players.

So far, we have shown strong evidence that a 50–50 division prevails in populations with only stubborn or fair strategies. The next step is then to simulate situations with a mixed heterogeneous population of stubborn and fair players in groups of the same cutoff value.

Even though it is advantageous for stubborn players to play 50–50 against other stubborn players, when playing against fair players it is interesting to try something different to win the confrontation. However, this is a blind game and a player does not know if her opponent is stubborn, and the same strategy must be used against all opponents. Plot d in Fig. 4 depicts the mean payoff received by stubborn players. There is a line in which the stubborn players receive the maximum payoff, i.e., 50, when the groups have the same cutoff below 50. On the other hand, in plot e, we can observe that higher payoffs can be achieved by fair players when the stubborn players have a cutoff below 50.

From plot d, we can observe deep losses for stubborn players with a high cutoff when their confrontation is against fair players. In the region where the stubborn player has a payoff greater than 50, there is an incentive for the stubborn player to have the same cutoff as a fair player. In the same region, the best response from the fair player is a cutoff equal to 50. The 50–50 division is again an equilibrium point. When the entire population is used to compute the payoff average (Fig. 4, plot f), a region where players receive 50 as the maximum average payoff can be observed, and one player can get a higher value only at the cost of payoff losses for other players. Thus, an equilibrium point in which all players receive the same value is again the 50–50 division.

3. Dynamic version of the ultimatum game: evolutionary simulations

In this section we report experiments in which the cutoff is a variable that depends on the player's earnings. Here, each player, at time $t$, will be represented differently from the static approach used until now: in a general form, by a pair of parameters (cutoffs) $s(t) = (w_p(t), w_a(t))$, where $w_p(t)$ and $w_a(t)$ represent, respectively, the value to be proposed when the player is the proposer, and the minimum to be accepted when this player is the responder in a negotiation with another player.

We begin with a population with uniformly distributed cutoffs:

$$\begin{aligned}
w_p(t) &= w_{\min} + \xi^{(p)}(t)\,(w_{\max} - w_{\min}), \\
w_a(t) &= w_{\min} + \xi^{(a)}(t)\,(w_{\max} - w_{\min}),
\end{aligned} \tag{2}$$

where $\xi^{(p)}$ and $\xi^{(a)}$ are uniformly distributed random variables in $[0,1]$, drawn for each player. We then simulated the temporal evolution as described by Algorithm 1, with simple evolutionary rules for updating $w_a$ and $w_p$ independently for each player. In a given confrontation between two players at turn $t$, the proposer $s_p(t) = (w_p^{(p)}(t), w_a^{(p)}(t))$ and the responder $s_a(t) = (w_p^{(a)}(t), w_a^{(a)}(t))$ are established with probability $\rho = 1/2$. The deal happens if $w - w_p^{(p)} > w_a^{(a)}$, and the payoffs of the proposer and of the responder are, respectively, $g^{(p)}(t) = w_p^{(p)}$ and $g^{(a)}(t) = w - w_p^{(p)}$; otherwise $g^{(p)}(t) = g^{(a)}(t) = 0$. We considered the constraints $\max\{w_a(t)\} = \max\{w_p(t)\} = w$ and $\min\{w_a(t)\} = \min\{w_p(t)\} = 0$, and for our simulations we let $w = 100$.

In our approach, we consider that any individual who has not obtained a positive payoff in a confrontation must change her strategy by decreasing her parameter (of accepting or proposing) considering whether or not she has been successful. Such decrease must be proportionally performed according to the amount that the player has not earned. Note that if this player is a responder just her accepting parameter is updated, and if she is a proposer just her proposing parameter is updated, i.e.,

$$w_p^{(p)}(t) = w_p^{(p)}(t-1) - w_p^{(p)}(t)/w,$$

$$w_a^{(p)}(t) = w_a^{(p)}(t-1),$$

$$w_a^{(a)}(t) = w_a^{(a)}(t-1) - [w - w_p^{(p)}(t-1)]/w,$$

$$w_p^{(a)}(t) = w_p^{(a)}(t-1).$$

Here we normalized by the maximum amount to be gained ($w$). In biology, this works as a defense mechanism of the species. After losing considerable quantities of attractive or tasteful food one has to lower one's criteria and to accept smaller amounts of food or worse tasting food to guarantee one's survival.
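The update applied after a failed deal can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and the sketch assumes that both decrements are evaluated with the values held before the update (time $t-1$), which is one reading of the equations above.

```python
def update_after_failed_deal(proposer_keep, responder_min_accept, w=100.0):
    """Lower the active parameters of both players after a rejected proposal.

    Only the proposer's proposing parameter and the responder's accepting
    parameter change; each decrement is the unearned amount normalized by w.
    Assumption: decrements use the pre-update values.
    """
    new_proposer_keep = proposer_keep - proposer_keep / w
    new_responder_min_accept = responder_min_accept - (w - proposer_keep) / w
    return new_proposer_keep, new_responder_min_accept
```

With $w = 100$, a proposer who tried to keep 80 moves to 79.2, and the responder who refused the remaining 20 lowers her cutoff by 0.2.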

On the other hand, when the deal is made two situations are possible: the obtained earning can reduce or increase the greed of the players. So we consider two distinct dynamics: The first one (policy I) suggests that the probability of accepting or proposing increases in proportion to the payoff earned, but it gets smaller as the earned payoff gets much larger ("One should not comply with small earnings, but one should limit one's own greed"). The second policy suggests an increase in the accepting or proposing parameters with a probability that is larger as the payoff gets larger ("One's earnings will increase one's own greed"). Here, only the accepting parameter of the responder and proposing parameter of the proposer should be updated, as follows:

1. Policy I

$$\Pr\left(w_p^{(p)}(t) \to w_p^{(p)}(t) + w_p^{(p)}(t)/w\right) = 1 - w_p^{(p)}(t)/w,$$

$$\Pr\left(w_a^{(a)}(t) \to w_a^{(a)}(t) + (w - w_p^{(p)}(t))/w\right) = 1 - (w - w_p^{(p)}(t))/w = w_p^{(p)}(t)/w.$$

2. Policy II

$$\Pr\left(w_p^{(p)}(t) \to w_p^{(p)}(t) + w_p^{(p)}(t)/w\right) = w_p^{(p)}(t)/w,$$

$$\Pr\left(w_a^{(a)}(t) \to w_a^{(a)}(t) + (w - w_p^{(p)}(t))/w\right) = (w - w_p^{(p)}(t))/w.$$

For both policies we have considered $\Pr(w_p^{(p)}(t) \to w_p^{(p)}(t)) = 1 - \Pr(w_p^{(p)}(t) \to w_p^{(p)}(t) + w_p^{(p)}(t)/w)$, $\Pr(w_a^{(a)}(t) \to w_a^{(a)}(t)) = 1 - \Pr(w_a^{(a)}(t) \to w_a^{(a)}(t) + (w - w_p^{(p)}(t))/w)$ and $\Pr(w_p^{(a)}(t) \to w_p^{(a)}(t)) = \Pr(w_a^{(p)}(t) \to w_a^{(p)}(t)) = 1$.
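The two policies differ only in the probability with which a successful player raises her active parameter. A minimal Python sketch of this step is given below; the function names and the `rng` argument are hypothetical conveniences, and the code is an illustration under these assumptions rather than the implementation used for the reported simulations.

```python
import random


def increase_probabilities(proposer_keep, policy, w=100.0):
    """Return (p_proposer, p_responder), the probabilities of raising each cutoff."""
    proposer_gain = proposer_keep / w         # normalized payoff of the proposer
    responder_gain = (w - proposer_keep) / w  # normalized payoff of the responder
    if policy == "I":   # larger earnings make an increase less likely
        return 1.0 - proposer_gain, 1.0 - responder_gain
    if policy == "II":  # larger earnings make an increase more likely
        return proposer_gain, responder_gain
    raise ValueError("policy must be 'I' or 'II'")


def update_after_deal(proposer_keep, responder_min_accept, policy, w=100.0, rng=random):
    """Apply the stochastic update after an accepted proposal.

    rng is any object with a random() method returning a float in [0, 1).
    """
    p_proposer, p_responder = increase_probabilities(proposer_keep, policy, w)
    new_keep = proposer_keep
    new_min_accept = responder_min_accept
    if rng.random() < p_proposer:
        new_keep = proposer_keep + proposer_keep / w
    if rng.random() < p_responder:
        new_min_accept = responder_min_accept + (w - proposer_keep) / w
    return new_keep, new_min_accept
```

The increments are the same under both policies; only the probabilities are swapped between "one minus the normalized payoff" (policy I) and "the normalized payoff" (policy II).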

3.1. Results

Here, we analyze the evolution of averages:

$$\langle w_a(t)\rangle = \frac{1}{n}\sum_{i=1}^{n} w_{a,i}(t),$$

$$\langle w_p(t)\rangle = \frac{1}{n}\sum_{i=1}^{n} w_{p,i}(t),$$

where $w_{a,i}$ and $w_{p,i}$ are, respectively, the accepting and proposing parameters of player $i = 1, \ldots, n$ in the population. Here we set $\langle w_a(0)\rangle = \langle w_p(0)\rangle = w/2$ and $\langle w_a(0)^2\rangle - \langle w_a(0)\rangle^2 = \langle w_p(0)^2\rangle - \langle w_p(0)\rangle^2 = \sigma^2$. In order to do so, it is only necessary to calculate $w_{\max} = (w + \sqrt{12}\,\sigma)/2$ and $w_{\min} = (w - \sqrt{12}\,\sigma)/2$ in Eq. (2). In Fig. 5, we describe the behavior of averages for two distinct values of $\sigma$ (20 and 40). The two policies are described as dynamics I in plot (a) and dynamics II in plot (b).

The players are drawn from a uniformly distributed population. For our experiments we set $w = 100$ so $\langle w_a\rangle = \langle w_p\rangle = 50$. We can observe a deep difference between the two policies. In dynamics (policy) I, the difference between the accepting and proposing cutoffs as $t \to \infty$ depends on the initial variance, but $\langle w_a\rangle \to 50$ always, while $\langle w_p\rangle$ gets smaller as $\sigma$ gets larger (other plots, for $\sigma = 10$ and 30, were constructed leading to this conclusion). Dynamics (policy) II shows the proposing and accepting cutoffs converging to 0 and 100, respectively, as $t \to \infty$. But which dynamics yields the highest average payoff considering a fair distribution of wealth among players?

We can observe a noticeable advantage of dynamics I in relation to dynamics II. Using dynamics I, a higher payoff and a smaller variance are simultaneously obtained. Let us consider now only the interesting policy I (Fig. 6). A payoff histogram in the 1st and 2000th turn is shown in Fig. 7.

Fig. 5. Temporal evolution of the accepting and proposing average cutoffs of a uniformly distributed population. We analyze two different values of variance for the initial population, $\sigma = 20$ and 40. Plot (a) represents dynamics (policy) I and plot (b) dynamics (policy) II.

Fig. 6. Average payoff and its variance.

At the beginning we find a situation where many players have no earnings (peak at zero) and a flat distribution where the players' payoff is distributed in the interval [15,80]. As we reach the 2000th Monte Carlo (MC) step, the payoff histogram also does not follow a fair distribution; a peak at zero appears but is less accentuated than at time zero, and there is a distribution concentrated at a payoff of 50. However, it is more important in a society if this payoff is fairly distributed along time among the population. Thus, we define player $i$'s wealth as the accumulated payoff:

$$\mathrm{Wealth}_i(t) = \sum_{t'=1}^{t} \mathrm{payoff}_i(t').$$

Fig. 7. Payoff distribution at times $t = 1$ and 2000 turns.

Fig. 8. Temporal evolution of the Gini coefficient for each dynamics.

Fig. 9. Yield evolution in time for different dynamics.

The average wealth $\langle \mathrm{Wealth}\rangle(t) = (1/n)\sum_{i=1}^{n} \mathrm{Wealth}_i(t)$ seems to grow linearly as a function of time for both policies (inset plots in Fig. 8). But would this growth of average wealth mean an egalitarian wealth distribution among the population? An interesting measure, the Gini coefficient (Gini, 1921), measures the equality level of a society. In order to measure the homogeneity in wealth distribution we have measured the Gini coefficient at each time step. Supposing that $\mathrm{Wealth}_i(t)$ represents the wealth values of players at time $t$, such that $\mathrm{Wealth}_i(t) < \mathrm{Wealth}_{i+1}(t)$, with $i = 1, \ldots, n-1$, an estimate of the Gini coefficient is given by

$$g(t) = 1 - \left[\frac{1}{n} + \frac{2}{n\sum_{j=1}^{n}\mathrm{Wealth}_j(t)} \sum_{i=1}^{n-1}\sum_{j=1}^{i}\mathrm{Wealth}_j(t)\right]$$

where the cumulative wealth fraction $\varphi_i(t) = \sum_{j=1}^{i}\mathrm{Wealth}_j(t) / \sum_{j=1}^{n}\mathrm{Wealth}_j(t)$ satisfies $\varphi_0(t) = 0$ and $\varphi_n(t) = 1$, and $0 \le g \le 1$. The larger $g$ gets, the more disparate is the wealth distribution of the population. We can observe in the left and right plots of Fig. 8 a decrease in the Gini coefficient as a function of time and in some cases a robust power law behavior, showing a clear equality. However, the first policy leads to a higher average payoff and to a smaller variance. This can be understood by analyzing a value related to the yield of the dynamics (policy):

$$\mathrm{yield}(t) = \frac{2}{n\,t\,w}\sum_{i=1}^{n}\mathrm{Wealth}_i(t). \tag{3}$$

This concept is very simple: consider a game where two players are offered a quantity $w$ to divide between them. The optimal yield is the case when a player earns half of this quantity (which means the other player earns the other half); in $t$ iterations of the game the optimal accumulated payoff of a player is $tw/2$, and so Formula (3) is obtained. In Fig. 9, the left plot shows the yield for policy I while the right plot depicts the same result for policy II. We can observe a better result for policy I, in which the yield grows along time, stabilizing at high values ($\approx 0.7$). Policy II shows a bad situation where the yield increases, but after some time it decreases, eventually converging to the initial value ($\approx 0.5$).
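Both measures discussed above can be computed directly from the list of accumulated payoffs. The Python sketch below is an illustration with hypothetical function names; the Gini estimate is written with the sign convention under which perfectly equal wealth gives 0, which is an assumption about the intended form of the estimate, and the yield follows Eq. (3).

```python
def gini(wealth):
    """Gini estimate from sorted wealth values (0 for perfectly equal wealth)."""
    values = sorted(wealth)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0  # convention chosen here for an empty or penniless population
    cumulative = 0.0  # running sum of the i poorest players' wealth
    double_sum = 0.0  # sum over i = 1..n-1 of those running sums
    for value in values[:-1]:
        cumulative += value
        double_sum += cumulative
    return 1.0 - 1.0 / n - 2.0 * double_sum / (n * total)


def yield_ratio(wealth, t, w=100.0):
    """Eq. (3): total wealth relative to n players each earning w/2 per turn."""
    n = len(wealth)
    return 2.0 * sum(wealth) / (n * t * w)
```

With this estimate, a population in which a single player holds everything gives $1 - 1/n$ rather than exactly 1, and a population where every player earns $w/2$ in each turn gives a yield of 1.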

The main question considered here is how different dynamics can lead the population to optimal situations. In these situations, individuals are endowed with a large number of strategies (uniformly distributed), but they use the same dynamics (here considered as a characteristic of the species). We investigated what levels of greed a population needs in order to obtain best results, and if this greed must be limited to allow species co-existence or survivability. Our results lead to the conclusion that it is not interesting to change to a control factor in higher payoff regimes; but they also suggest that for individuals in smaller payoff regimes changes must be more probable for them to get to optimal situations.

4. Conclusions and further work

In this paper we have studied the global emerging behavior of a heterogeneous population of ultimatum game players. Our contribution is twofold:

1. We have considered a population where the distribution of used strategies is constant over time, and properties of the random payoff received by the players (average and higher moments) are reported from simple numerical exact methods and corroborated by several computer simulations.

2. The evolution of a population has been simulated via Monte Carlo simulations; in such simulation agents may change independently the proposing and accepting parameters of their strategy depending on received payoffs.

For the first results (static approach) we have designed and performed several computational simulations considering the concept of turn in the game (in a turn each participant plays necessarily once, which is equivalent to performing a graph matching). We have shown that the simulation results correspond to the analytical results we have derived. Then, we have analyzed the payoff in different confrontations where players were endowed with strategies based on cutoffs (stubborn and fair players) using 3D plots. Our static simulation results (not considering evolution in the strategies) have shown that there is strong evidence that the 50–50 division for cutoff is predominant in populations with only stubborn or fair strategies. This result corroborates experimental evidence from studies carried out with human subjects, which have shown that real players have a bias, offering a division as close as possible to 50–50 and rejecting values smaller than 30% (see e.g. Guth et al., 1982; Henrich et al., 2006). These results differ from the expected Nash equilibrium. We have also investigated the influence on the average payoff of different probabilities of acting as proposer or responder ($r \ne 1/2$).
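The notion of a turn as a matching, in which each participant plays exactly once, can be sketched in Python as below. The function name, the handling of an odd player left unpaired, and the way the role probability `r` is applied within each pair are assumptions made for this illustration only.

```python
import random


def pair_players(n_players, rng, r=0.5):
    """Return a list of (proposer, responder) index pairs for one turn.

    Every player appears in at most one pair; with an odd population the
    last player in the shuffled order sits out (an assumption of this sketch).
    Within a pair, the first-drawn player proposes with probability r.
    """
    order = list(range(n_players))
    rng.shuffle(order)  # random perfect matching on the shuffled order
    pairs = []
    for k in range(0, n_players - 1, 2):
        first, second = order[k], order[k + 1]
        if rng.random() < r:
            pairs.append((first, second))
        else:
            pairs.append((second, first))
    return pairs
```

A call such as `pair_players(10, random.Random(1))` returns five disjoint pairs covering all ten players.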

In the evolutionary approach we have shown that policy I, in which the players enlarge their accepting or proposing cutoff with higher probability when small payoffs are obtained and with smaller probability when large payoffs are obtained, is better than policy II, in which the parameters are enlarged with probability proportional to the obtained payoff. Although both dynamics show good wealth distribution when $t \to \infty$, the first policy leads to states with higher average payoff, smaller variance and yield characteristics that are more interesting for the overall population.

Finally, we believe that these results can bring important considerations to the design of simulations in the context of evolutionary game theory, in particular in the simulation of relevant social features and interactions when modelling large population dynamics.

Acknowledgments

We would like to thank the anonymous referees for their helpful suggestions. da Silva and Lamb are partly supported by the Brazilian Research Council CNPq under Grants 308371/2006-2, 311343/2006-6, 490440/2007, 480258/2008-2 and 577473/2008-5.

References

Abbink, K., Bolton, G.E., Sadrieh, A., Tang, F.-F., 2001. Adaptive learning versus punishment in ultimatum bargaining. Games and Economic Behavior 37 (1), 1–25.

Alvard, M., 2000. The ultimatum game, fairness, and cooperation among big game hunters. In: Henrich, J. (Ed.), Foundations of Human Sociality, 2000.

Araujo, R.M., Lamb, L.C., 2004. Towards understanding the role of learning models in the dynamics of the minority game. In: Proceedings of 16th IEEE International Conference on Tools with Artificial Intelligence ICTAI 2004. IEEE Computer Society Press, Los Alamitos, CA, pp. 727–731.

Araujo, R.M., Lamb, L.C., 2007. An information-theoretic analysis of memory bounds in a distributed resource allocation mechanism. In: Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI-07). AAAI Press, Menlo Park, CA, pp. 212–217.

Binmore, K., Rubinstein, A., Wolinsky, A., 1986. The Nash bargaining solution in economic modelling. The RAND Journal of Economics 17 (2), 176–188.

Challet, D., Zhang, Y.C., 1997. Emergence of cooperation and organization in an evolutionary game. Physica A 246, 407–418.

da Silva, R., Bazzan, A.L.C., Baraviera, A.T., Dahmen, S.R., 2006. Emerging collective behavior and local properties of financial dynamics in a public investment game. Physica A 371, 610–626.

da Silva, R., Kellermann, G.A., 2007. Analysing the payoff of heterogenous population in the ultimatum game. Brazilian Journal of Physics 37 (4), 1206–1211.

Fehr, E., Fishbacher, U., 2003. The nature of human altruism. Nature 425 (6960), 785–791.

Gale, J., Binmore, K.B., Samuelson, L., 1995. Learning to be imperfect: the ultimatum game. Games and Economic Behavior 8, 56–90.

Gini, C., 1921. Measurement of inequality and incomes. The Economic Journal 34, 124–126.

Guth, W., Schmittberger, R., Schwarze, B., 1982. An experimental analysis of ultimatum bargaining. Journal of Economic Behavior & Organization 3 (4), 367–388.

Hargreaves Heap, S.P., Varoufakis, Y., 2004. Game Theory: A Critical Text, second ed. Routledge, New York.

Henrich, J., McElreath, R., Barr, A., Ensminger, J., Barrett, C., Bolyanatz, A., Camilo Cardenas, J., Gurven, M., Gwako, E., Henrich, N., Lesorogol, C., Marlowe, F., Tracer, D., Ziker, J., June 2006. Costly punishment across human societies. Science 312 (5781), 1767–1770.

Napel, S., 2003. Aspiration adaptation in the ultimatum game. Games and Economic Behavior 43, 86–106.

Nash, J., 1950. Equilibrium points in n-person games. Proceeding of the National Academy of Science of the United States of America 36, 48–49.

Von Neumann, J., Morgenstern, O., 1953. Theory of Games and Economic Behavior, second ed. Princeton University Press, Princeton.

Page, K.M., Nowak, M.A., March 2000. A generalized adaptive dynamics framework can describe the evolutionary ultimatum game. Journal of Theoretical Biology 209 (2), 173–179.

Page, K.M., Nowak, M.A., Sigmund, K., 2000. The spatial ultimatum game. Proceedings of Royal Society B: Biological Sciences 267 (1458), 2177–2182.

Sanchez, A., Cuesta, J.A., July 2005. Altruism may arise from individual selection. Journal of Theoretical Biology 235 (2), 233–240.

Maynard Smith, J., 1982. Evolution and the Theory of Games. Cambridge University Press, Cambridge.

Maynard Smith, J., Price, G.R., 1973. The logic of animal conflict. Nature 246, 15–18.

Szabo, G., Fath, G., 2007. Evolutionary games on graphs. Physics Report 446, 97–216.