Is the Repeated Prisoner’s Dilemma a Good Model of Reciprocal Altruism?

Robert Boyd
Department of Anthropology, University of California, Los Angeles

Axelrod and Hamilton (1981) used the repeated prisoner’s dilemma game as a basis for their widely cited analysis of the evolution of reciprocal altruism. Recently, it has been argued that the repeated prisoner’s dilemma is not a good model for this task. Some critics have argued that the single period prisoner’s dilemma represents mutualistic rather than altruistic social interactions. Others have argued that reciprocal altruism requires that the opportunities for altruism occur sequentially, first one individual and then after some delay the other. Here I begin by arguing that the single period prisoner’s dilemma game is consistent with the definition of altruism that is widely accepted in evolutionary biology. Then I present two modified versions of the repeated prisoner’s dilemma, one in which behavior is sequential, and a second in which behavior occurs in continuous time. Each of these models shares the essential qualitative properties with the version used by Axelrod and Hamilton.

KEY WORDS: Reciprocal altruism; Reciprocity; Cooperation; Prisoner’s dilemma.

Received February 17, 1987; revised June 29, 1987.

Address reprint requests to: R. Boyd, Ph.D., Department of Anthropology, University of California, Los Angeles, 405 Hilgard Avenue, Los Angeles, CA 90024-1553.

Ethology and Sociobiology 9: 211-222 (1988). © Elsevier Science Publishing Co., Inc., 1988. 52 Vanderbilt Ave., New York, New York 10017. 0162-3095/88/$03.50

Recently, Axelrod (1984) and others (Axelrod and Hamilton 1981; Brown et al. 1982; Peck and Feldman 1985) have based an extensive mathematical analysis of the evolution of reciprocal altruism upon the repeated prisoner’s dilemma game. Axelrod and Hamilton (1981) argue that regardless of the simplicity of this model, the conclusions drawn from it can be used to understand the evolution of reciprocity in a wide range of biological contexts. In October 1986 a conference was held at UCLA. Its objective was to summarize the work done on reciprocal altruism in the 15 years since the publication of Trivers’ (1971) seminal paper. During the discussions at this conference it became clear that many of the participants disagreed with Axelrod and Hamilton’s contention that the repeated prisoner’s dilemma game was a good model for reciprocal altruism. Their dissatisfaction arose primarily from two issues. First, there was concern that cooperative behavior in the single period prisoner’s dilemma game is better labeled mutualism than altruism. Second, several participants argued that reciprocal altruism requires the action of one individual, followed at some delay by the potentially reciprocal behavior of the second individual, and that the simultaneous behavior assumed in the repeated prisoner’s dilemma is not faithful to this concept. [For a more extensive account of the conference discussion see Packer (1986).]

In this article, I argue that the repeated prisoner’s dilemma is a useful model. I begin by arguing that the one period prisoner’s dilemma is a relatively general model of potentially altruistic social interactions involving pairs of individuals, and that interpreting the “cooperate” move as an altruistic act is consistent with the definition of altruistic behavior widely accepted in evolutionary biology. Then I stress that the particular temporal sequence of behaviors assumed in the usual version of the repeated prisoner’s dilemma is not an essential feature of the model. To do so, I present two versions of the repeated prisoner’s dilemma in which the temporal sequence of behavior is quite different. Nonetheless, both models have the same general evolutionary properties as the model analyzed by Axelrod and Hamilton (1981) because that model captures two of the essential evolutionary features of ongoing social interactions: the possibility of contingent behavior and the existence of a time delay between the time an individual first defects and the time that he or she is punished for defecting. None of the arguments in this paper is new; they occur in various forms in the papers of Axelrod and Hamilton (1981), Brown and colleagues (1982), Maynard Smith (1982a, 1982b), Hirshleifer (1982), and Nunney (1985). However, I hope that by bringing these arguments together in a single place I can convince the reader that the repeated prisoner’s dilemma is a useful model for analyzing the evolution of reciprocal altruism.

THE PRISONER’S DILEMMA: ALTRUISM OR MUTUALISM?

Altruistic behaviors are commonly distinguished from mutualistic behaviors on the basis of the fitness effects of the behavior on the actor who performs the behavior and the recipient (or recipients) affected by the behavior. Behaviors that have a positive effect on the fitness of both are said to be mutualistic, whereas behaviors that have a negative effect on the fitness of the actor and a positive effect on the recipient are said to be altruistic. This distinction is often summarized (e.g., Wilson 1975, Wrangham 1982) as in Table 1. Here, fitness effects on the donor refer to the difference between the fitness of individuals who perform the behavior in question and the fitness of individuals who perform some alternative behavior that serves as a baseline. Similarly, the fitness effects on the recipient refer to the difference in fitness resulting from the donor performing a given behavior using the same baseline.


Table 1. Distinction Between Altruistic and Mutualistic Behavior

                Effect on Fitness of
Behavior        Actor     Recipient
Altruistic      -         +
Mutualistic     +         +

When individuals interact socially, the effect of a particular individual’s behavior on itself and on other individuals will often depend on the other individual’s behavior. This fact means that the magnitude of the difference in fitness between a behavior and its alternative behavior used as a baseline will depend on the social context in which both behaviors occur. When determining whether a particular behavior is altruistic, it is important to calculate the difference in fitness between the behavior and its alternative, holding the behavior of other individuals constant. The failure to make the comparison in this way can lead to much confusion about what kinds of behaviors are properly called altruistic.

To see this, consider the following hypothetical example. Assume that there is a species in which pairs of unrelated individuals form one-time coalitions in order to defend a resource of some kind. Each individual can behave in one of two ways. Cagey individuals hang back slightly hoping to avoid the most serious risks, whereas avid individuals fight so as to maximize the chance that the defense will be successful. Thus, there are four types of interacting pairs: two avid individuals, one avid and one cagey, and so on. Assume that the fitness payoffs for each type of interaction are as indicated in Table 2. The left entry in each cell of the table gives the effect of the interaction on the fitness of individual A, whereas the right entry gives the effect on individual B.

Table 2. Payoff Matrix for Hypothetical Example of Coalition Behavior

                        Individual B
                        Avid      Cagey
Individual A   Avid     8, 8      1, 10
               Cagey    10, 1     4, 4

Assume that avid fighters are observed to be overwhelmingly common in a particular population. Is this an example of altruism, or can it be explained as mutualism? It is tempting to compare the fitness of an avid fighter paired with another avid fighter and the fitness of a cagey fighter paired with another cagey fighter. Because two avid fighters are both better off than two cagey ones, avid fighting seems mutualistic. However, this view is incorrect. To see why, consider the fate of an avid individual who sustains a mutation that causes it to switch to cagey behavior. Because avid fighters are very common, the mutant individual is very likely to be paired with an avid fighter and have a fitness increase of 10 units. The effect of the mutation is to increase the fitness of the mutant by two units, and thus cagey behavior should increase in the population. If the notions of individual costs and benefits are to have evolutionary meaning, they must be reflected in evolutionary outcomes. We compare the fitness of two alternative social behaviors holding the behavior of other individuals constant, because it is this fitness difference that will govern whether a trait can persist when individuals interact at random. [The idea that evolutionary costs and benefits should be calculated by considering the effect of a mutation on fitness is drawn from Nunney (1985) and is defended at greater length in that article.]
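The comparison described above, in which the partner's behavior is held constant, can be sketched in a few lines of Python. This is an illustration only, not part of the original analysis: the dictionary `PAYOFF` and the two helper functions are hypothetical names, and the numbers are the Table 2 payoffs.

```python
# Table 2 payoffs: PAYOFF[(own, partner)] is the fitness effect on the
# individual who plays `own` against a partner who plays `partner`.
PAYOFF = {
    ("avid", "avid"): 8,
    ("avid", "cagey"): 1,
    ("cagey", "avid"): 10,
    ("cagey", "cagey"): 4,
}


def effect_on_actor(behavior, baseline, partner):
    """Actor's fitness change from `behavior` instead of `baseline`,
    holding the partner's behavior constant."""
    return PAYOFF[(behavior, partner)] - PAYOFF[(baseline, partner)]


def effect_on_recipient(behavior, baseline, partner):
    """Partner's fitness change when the actor performs `behavior`
    instead of `baseline`, holding the partner's own behavior constant."""
    return PAYOFF[(partner, behavior)] - PAYOFF[(partner, baseline)]


for partner in ("avid", "cagey"):
    actor = effect_on_actor("avid", "cagey", partner)
    recipient = effect_on_recipient("avid", "cagey", partner)
    print(f"partner {partner}: actor {actor:+d}, recipient {recipient:+d}")
```

With either kind of partner, avid fighting lowers the actor's fitness and raises the recipient's, which is the (-, +) pattern that Table 1 labels altruistic. The effect of -2 against an avid partner is the two-unit gain of the cagey mutant seen from the other side.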

The matrix of fitness effects shown in Table 2 is an example of the one-period prisoner’s dilemma. More generally, this game is defined by the matrix of payoffs shown in Table 3. The pattern of payoffs that defines a one-period prisoner’s dilemma can be derived from two principles. (1) Cooperative behavior must be altruistic in the sense defined above. This condition implies that T > R and P > S. (2) The positive effect of cooperation on the fitness of the recipient must exceed its negative effects on the fitness of the donor. This condition implies that R > (T + S)/2 > P. Thus, in the simplest case in which interacting individuals are identical and there are only two alternative behaviors, the prisoner’s dilemma is a quite general representation of potentially altruistic social interactions. Moreover, such simple ESS calculations often provide useful insight about the equilibrium behavior of more complex models that assume diploid or polygenic inheritance (e.g., Thomas 1985; Boyd and Richerson 1980).

Table 3. Payoff Matrix for Prisoner’s Dilemma

                           Individual B
                           Cooperate   Defect
Individual A   Cooperate   R, R        S, T
               Defect      T, S        P, P

Note: T > R > P > S, R > (T + S)/2.

In order for avid fighting to be mutualistic, the matrix of fitness effects would have to be like that shown in Table 4. This matrix is not a prisoner’s dilemma; it is another game, sometimes called “the stag hunt” because it is thought to formalize Rousseau’s parable of the stag hunt (Oye, 1986). In this case, switching from avid to cagey fighting actually lowers the fitness of an individual when it is paired with an avid individual. For example, it might be that by switching to cagey behavior, an individual significantly increases the chance that the pair will lose control of the resource, and at the same time only reduces its risk of injury a small amount. [This definition of mutualism is taken from Maynard Smith (1982b); Wilson (1980) would label avid behavior in this setting as weak altruism.] Notice also that avid fighting is individually advantageous only when common. When avid behavior is rare, it is altruistic by the definition given above.

Table 4. Payoff Matrix in Which Avid Behavior is Mutualistic

                        Individual B
                        Avid      Cagey
Individual A   Avid     8, 8      -1, 6
               Cagey    6, -1     4, 4
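The contrast between the two matrices can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the prisoner's dilemma test uses the inequalities in the note to Table 3, and random pairing with a fraction p of avid partners is an assumption made here to illustrate "advantageous only when common".

```python
def is_prisoners_dilemma(T, R, P, S):
    """Inequalities from the note to Table 3."""
    return T > R > P > S and R > (T + S) / 2


def avid_advantage(T, R, P, S, p):
    """Expected fitness of an avid individual minus that of a cagey one
    when a fraction p of randomly met partners is avid."""
    avid = p * R + (1 - p) * S
    cagey = p * T + (1 - p) * P
    return avid - cagey


def avid_break_even(T, R, P, S):
    """Frequency of avid partners at which avid and cagey do equally well.
    Only meaningful when R > T and P > S, as in Table 4."""
    return (P - S) / ((R - T) + (P - S))


# Avid plays the role of "cooperate", cagey the role of "defect".
TABLE_2 = dict(T=10, R=8, P=4, S=1)
TABLE_4 = dict(T=6, R=8, P=4, S=-1)

print(is_prisoners_dilemma(**TABLE_2), is_prisoners_dilemma(**TABLE_4))
print(avid_break_even(**TABLE_4))
for p in (0.1, 0.9):
    print(p, avid_advantage(**TABLE_2, p=p), avid_advantage(**TABLE_4, p=p))
```

Under Table 2 the avid advantage is negative at every frequency, so cagey behavior always gains. Under Table 4 the sign changes at a break-even frequency of 5/7: avid fighting pays when avid partners are common and is costly to the actor when they are rare.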

It is important to distinguish between mutualism and altruism because quite different selective processes must be invoked to account for the two classes of behaviors. Once common, mutualistic behavior will remain common under the influence of natural selection, even if individuals interact at random. In contrast, altruistic behavior can remain common only if some process causes altruistic morphs to interact assortatively. Unless it is likely that altruists will disproportionately receive the benefits resulting from the behavior of other altruists, the nonaltruistic morphs will increase in frequency. Kin selection allows the evolution of altruistic behaviors because association with relatives is one process that leads to assortative social interaction. It will be argued below that reciprocal altruism, which requires that there be ongoing interactions and contingent behavioral strategies, is another process that allows altruists to interact nonrandomly with other altruists (also see Brown et al. 1982).

THE REPEATED PRISONER’S DILEMMA: IS SIMULTANEOUS BEHAVIOR CRUCIAL?

A variety of authors have modeled ongoing altruistic interactions using the repeated version of the prisoner’s dilemma game. This game is usually presented as follows. Pairs of individuals are drawn from a population and interact some number of times. During each interaction, individuals choose between the two behaviors, C or D, simultaneously; that is, without knowledge of the other individual’s choice. The effects of a single interaction on the fitness of each individual are given by the prisoner’s dilemma matrix. Theorists then try to determine whether reciprocal behavior strategies can evolve in this context, and if so, what kinds of reciprocal strategies are best.

Several participants in the UCLA conference argued that the repeated prisoner’s dilemma was not a good model for the evolution of reciprocal altruism because it is based on the assumption that individuals choose behavior simultaneously during each interaction. Instead, they argued, reciprocal altruism applies only to situations in which behavior is potentially reciprocal: first one individual acts, either altruistically or not, and then after some delay, the second individual acts, thus having the opportunity to reciprocate altruism and punish nonaltruism. Without such a delay, cheaters have no opportunity to benefit from their cheating, and thus the behavior labeled “cooperate” is better described as mutualistic.

In this section, I argue that the exact temporal patterning does not determine whether cooperation is altruistic, or whether it can evolve. Rather, cooperation is altruistic as long as there is a time lag that makes cheating potentially beneficial. Ongoing social interactions allow the evolution of altruistic cooperation because they allow contingent behavior. The repeated prisoner’s dilemma is a useful model for investigating the evolution of reciprocal altruism because it captures these two properties in an analytically tractable framework.

If individuals have the opportunity to interact over a period of time, contingent behavior becomes possible. By contingent behavior, I mean that one individual’s behavior at some time depends on the previous behavior of another individual. For example, individual A might behave altruistically toward individual B at the present time, only if B was altruistic at earlier times. I will call strategies of this kind contingent altruism. Contingent altruism is important because it can cause individuals who behave altruistically to be disproportionately likely to benefit from the altruistic behavior of others, and thus permits altruistic behavior to evolve. Assume that some individuals in a population are contingently altruistic, whereas others are always nonaltruistic. Then after some time, the only individuals who will receive the benefits of altruistic behavior will be contingent altruists who are paired with other contingent altruists. Under the right conditions, this nonrandom pattern of interaction can allow altruistic behavior to persist.

The second essential feature of ongoing altruistic interactions is the existence of a delay between the time that one individual ceases to be altruistic and the time that the other individual can retaliate. Without such a time delay, contingent altruists could retaliate instantly to nonaltruistic behavior by behaving nonaltruistically themselves, and cheaters could not benefit from their nonaltruistic behavior. With a time delay, individuals can increase their fitness in the short run by behaving nonaltruistically. This then raises the problem of whether the long-run benefit realized by pairs of contingent altruists can compensate them for forgoing the short-run benefits of nonaltruistic behavior. It is not sequential behavior that is essential. Rather, it is the existence of a period of time during which individuals can gather the benefits of nonaltruistic behavior. One of the reasons that the repeated prisoner’s dilemma analyzed by Axelrod and Hamilton is a useful model is that it incorporates such a lag.

To support my assertion that the possibility of contingent behavior and the existence of a time delay are essential features of the evolution of reciprocal altruism, and that other aspects of the temporal patterning of social interaction are not, I present two simple extensions of the repeated prisoner’s dilemma model analyzed by Axelrod and Hamilton (1981) and others. In the first, behavior is not simultaneous. In the second, behavior is simultaneous and performed in continuous time. I argue that each of these models and the original repeated prisoner’s dilemma all have similar evolutionary properties, and that this similarity results from the combination of contingent behavior in a potentially altruistic setting and the existence of a time delay.

A Model of Reciprocal Altruism With Nonsimultaneous Behavior

Let us modify the assumptions of our example of coalition behavior to allow for ongoing interactions and nonsimultaneous behavior. Consider a population in which pairs of unrelated individuals form lasting relationships. The probability that the relationship lasts for t or more time periods is w^t. In any given time period, one or the other of the two individuals will have the opportunity to behave altruistically by aiding his or her partner in defending a resource. Call this individual the actor and the other individual the recipient. Behaving altruistically reduces the fitness of the actor by an amount c and increases the fitness of the recipient by an amount b. During the first period, each member of a given pair of interacting individuals is equally likely to be the recipient. During subsequent periods, there is a probability g that each individual has the same role (i.e., actor or recipient) as during the previous period and a probability 1 - g that the two individuals will switch roles. Thus, if g = 0, individuals alternate roles, actor, recipient, actor, and so on. If g = 1, an individual retains his or her initial role throughout the interaction. Intermediate values of g yield intermediate length “runs” of each role. (A version of this model was analyzed by Hirshleifer 1982.)

Next, suppose that there are two behavioral strategies present in the population. In defining these strategies, it will be convenient to define a turn as an unbroken sequence of periods in which each player has the same role. The first strategy is the following generalization of tit-for-tat: If one is the actor during the initial turn, behave altruistically; during subsequent turns behave altruistically as long as the other individual behaved altruistically throughout the previous turn. We label this strategy “tit-for-tat” (TFT). In addition, assume that the second strategy is unconditional defection (ALLD), which never behaves altruistically. In a population in which TFT is common, most interactions will be altruistic, whereas there will be little altruism in a population in which unconditional defection is common.
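The model and the two strategies just described can be sketched as a small Monte Carlo simulation. This is a hedged illustration rather than the analysis used in the paper. The function names and parameter values are hypothetical, and the sketch assumes that every relationship has at least one period and then continues to the next period with probability w.

```python
import random


def simulate_pair(strat_a, strat_b, b, c, w, g, rng):
    """Total fitness effect on individual A over one relationship.

    strat_a and strat_b are "TFT" or "ALLD". Assumes at least one period,
    then continuation with probability w and role retention with probability g.
    """
    strategies = (strat_a, strat_b)
    actor = rng.randrange(2)      # 0 means A is the actor, 1 means B
    first_turn = True
    prev_turn_altruistic = True   # only consulted after the first turn
    this_turn_altruistic = True   # has the current actor helped all turn?
    payoff_a = 0.0
    while True:
        if strategies[actor] == "ALLD":
            helps = False
        else:
            # TFT helps in the initial turn, and afterwards only if the
            # partner helped throughout the partner's previous turn.
            helps = first_turn or prev_turn_altruistic
        if helps:
            payoff_a += -c if actor == 0 else b
        else:
            this_turn_altruistic = False
        if rng.random() >= w:     # the relationship ends
            return payoff_a
        if rng.random() >= g:     # roles switch, so a new turn begins
            actor = 1 - actor
            first_turn = False
            prev_turn_altruistic = this_turn_altruistic
            this_turn_altruistic = True


def mean_payoff(strat_a, strat_b, b, c, w, g, n_pairs=20000, seed=0):
    """Average fitness effect on A over n_pairs simulated relationships."""
    rng = random.Random(seed)
    total = sum(simulate_pair(strat_a, strat_b, b, c, w, g, rng)
                for _ in range(n_pairs))
    return total / n_pairs


for a, b_strat in [("TFT", "TFT"), ("TFT", "ALLD"),
                   ("ALLD", "TFT"), ("ALLD", "ALLD")]:
    print(a, b_strat, round(mean_payoff(a, b_strat, b=3, c=1, w=0.9, g=0.5), 2))
```

The estimates can be compared with closed-form expected payoffs for the same parameter values. The choice of b = 3, c = 1, w = 0.9, and g = 0.5 is arbitrary.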

I now demonstrate that the conditions under which each of the two behavioral strategies are ESSs closely parallel the conditions derived by Axelrod and others using the usual version of the repeated prisoner’s dilemma in which individuals are assumed to act simultaneously. Table 5 gives the expected incremental effect on the fitness of individual A given the strategy used by individual B. Notice that unconditional defection is always an ESS, since V(ALLD|ALLD) = 0 > V(TFT|ALLD) = -c/(2(1 - gw)) for any values of g and w. Next, notice that TFT is an ESS as long as w, which is a measure of the average number of interactions between individuals, is greater than a threshold value, w*, given below:

$$w^{*} = \frac{c}{b(1 - g) + cg}. \tag{1}$$

Table 5. Expected Fitness Matrix for Repeated Prisoner’s Dilemma Game with Nonsimultaneous Behavior

                                  Individual B’s Strategy
                                  TFT                   ALLD
Individual A’s Strategy   TFT     (b - c)/(2(1 - w))    -c/(2(1 - gw))
                          ALLD    b/(2(1 - gw))         0
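The threshold in Eq. 1 can be checked against the expected fitness matrix with a short sketch. The function names are hypothetical, and the ALLD-versus-TFT entry is taken to be b/(2(1 - gw)), which is the value implied by the payoff T stated later in the text. TFT is treated as an ESS against ALLD when V(TFT|TFT) > V(ALLD|TFT).

```python
def expected_fitness(b, c, w, g):
    """Expected fitness effects on A, keyed by (A's strategy, B's strategy)."""
    return {
        ("TFT", "TFT"): (b - c) / (2 * (1 - w)),
        ("TFT", "ALLD"): -c / (2 * (1 - g * w)),
        ("ALLD", "TFT"): b / (2 * (1 - g * w)),
        ("ALLD", "ALLD"): 0.0,
    }


def threshold_w(b, c, g):
    """Eq. 1: the value of w above which TFT resists invasion by ALLD."""
    return c / (b * (1 - g) + c * g)


def tft_is_ess(b, c, w, g):
    """True when TFT does better against TFT than ALLD does against TFT."""
    v = expected_fitness(b, c, w, g)
    return v[("TFT", "TFT")] > v[("ALLD", "TFT")]


b, c = 3.0, 1.0
for g in (0.0, 0.5, 0.9):
    w_star = threshold_w(b, c, g)
    above = min(w_star + 0.05, 0.999)
    below = w_star - 0.05
    print(g, round(w_star, 3), tft_is_ess(b, c, above, g), tft_is_ess(b, c, below, g))
```

For each value of g the ESS test flips from false to true as w crosses w*, and w* rises toward 1 as g approaches 1, where no admissible w satisfies the condition.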

First, let us consider these results in the special case in which g = 0, so that individuals alternate being actor and recipient. With this assumption, the matrix of expected payoffs and, therefore, all the evolutionary properties of the model are exactly the same as in the ordinary repeated prisoner’s dilemma in which T = b/2, R = (b - c)/2, P = 0, and S = -c/2. There is no difference between alternating behavior and simultaneous behavior if each cooperative act produces a benefit to the recipient, b, and a cost to the donor, c, and costs and benefits interact additively. Now, consider what happens as g becomes larger. Unconditional defection is still always an ESS, but (from Eq. 1) the threshold value of w necessary for TFT to be an ESS increases as g increases. When g = 1, so that an individual retains his or her initial role throughout an interaction, no value of w will allow TFT to increase in frequency.

To understand this result, first recall that in the usual repeated prisoner’s dilemma, the threshold value of w necessary for TFT to increase, ŵ, is given by (e.g., Axelrod 1981):

$$\hat{w} = \frac{T - R}{T - P}.$$

Because R > P, increasing T, the fitness of a nonaltruist when paired with an altruist, increases ŵ. In order that TFT be an ESS, the long-run benefit achieved by two altruists must exceed the short-run benefits of cheating. Thus, ŵ increases as T is increased because in the ordinary repeated prisoner’s dilemma, T is the benefit to cheating. Next, notice that the fitness effects given in Table 5 are exactly the same as in the ordinary repeated prisoner’s dilemma in which T = b/(2(1 - gw)), R = (b - c)/2, P = 0, and S = -c/(2(1 - gw)). Thus, increasing g in the current model has all the same effects as increasing T and decreasing S in the usual model, including increasing the threshold value of w necessary for TFT to be an ESS.
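The correspondence can be made explicit with a short derivation. It is a sketch under the substitutions stated above, T = b/(2(1 - gw)), R = (b - c)/2, and P = 0, and it assumes b > 0 and b(1 - g) + cg > 0. Because T here depends on w, the condition w > ŵ is implicit in w and has to be solved for w:

```latex
\begin{align*}
w > \hat{w} = \frac{T - R}{T - P}
  &= 1 - \frac{R}{T} = 1 - \frac{(b - c)(1 - gw)}{b} \\
\iff\quad bw &> b - (b - c)(1 - gw) = c + (b - c)\,gw \\
\iff\quad w\,\bigl[\,b(1 - g) + cg\,\bigr] &> c \\
\iff\quad w &> \frac{c}{b(1 - g) + cg} = w^{*}.
\end{align*}
```

The last line is the threshold of Eq. 1, so the nonsimultaneous model and the ordinary repeated prisoner's dilemma give the same condition once T is allowed to grow with g.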

This correspondence makes sense. Increasing g increases the benefits to cheating. In this model, the average length of a turn increases as g increases. When g is near zero, the chance that an individual will be the donor for more than a few periods in a row is very small. In contrast, when g is near one, individuals may be recipients for many periods before the situation arises in which they are donors. Thus, g is a measure of the amount of benefit that a defecting individual can achieve before the defection is detected and the other individual ends his or her altruism. As g becomes larger, the average fitness increase associated with nonaltruistic behavior becomes larger.

This model is interesting because it makes the role of a time delay during which cheaters benefit in the repeated prisoner’s dilemma more explicit than it is in the usual formulation. By increasing g, one increases the time during which cheaters benefit, which in turn reduces the opportunities for the evolution of altruism. Remember, however, that a delay of one interaction is present in the ordinary repeated prisoner’s dilemma, and that one can achieve the same effect in the ordinary repeated prisoner’s dilemma by increasing T.

A Model of Reciprocal Altruism in Continuous Time

Let us now analyze a model that is far removed from the intuitive notion that reciprocal altruism involves behavior by one individual followed by behavior by the second individual. As in our first example of coalition behavior, assume that pairs of unrelated individuals join together in one-time coalitions to defend a resource and that they represent a random sample of the population. However, now assume that individuals interact continuously for a length of time t, where t is an exponentially distributed random variable with the density:

$$f(t) = (1 - w)\,e^{-(1 - w)t}.$$

This is the continuous time analog of the usual assumption that there is a constant probability w that the interaction continues from one period to the next; on the average, interactions will last 1/(1 - w) time units. Because we are now considering the pattern of behavior within a particular bout of resource defense, the expected length of interaction might be very short, perhaps only minutes. At any instant in time during a bout, an individual can be cagey or avid. The rate at which the fitness of individual A is affected by the social interaction given the behavior of individual B is given by the prisoner’s dilemma matrix shown in Table 3, where T > R, P > S, and R > (T + S)/2 > P. This assumption means, for example, that if both individuals defend the resource avidly for a period of time, x, each individual experiences a change in fitness xR. Similarly, if one individual fights avidly and the other does not for a period of time x, the avid fighter’s fitness is changed by an amount xS, whereas the cagey individual’s fitness changes by an amount xT.

We assume that there are two strategies possible, a generalization of TFT, and unconditional defection. A TFT individual defends avidly as long as the other individual is behaving the same way, otherwise he or she becomes cagey. Unconditional defectors are always cagey. We further assume that there is a time lag τ between the time an individual first changes its behavior (i.e., from avid to cagey or vice versa) and the time that the other individual detects this change, where τ is an exponentially distributed random variable with the density

$$f(\tau) = d\,e^{-d\tau}.$$

This means that on average there is a delay of 1/d time units between the time at which an individual first begins to fight cagily and the time at which a TFT individual can respond by behaving in the same way. For example, this lag might be due to the fact that each individual must attend to its own behavior during a fight, and can only occasionally monitor the behavior of its coalition partner.

Once again, the conditions under which each of the two behavioral strategies are ESSs closely parallel the conditions derived by Axelrod and others using the discrete time repeated prisoner’s dilemma. Table 6 gives the expected incremental effect on the fitness of individual A given the strategy used by individual B for the continuous time interaction.

Table 6. Expected Fitness Matrix for Continuous Time Prisoner’s Dilemma

                                  Individual B’s Strategy
                                  TFT                                  ALLD
Individual A’s Strategy   TFT     R/(1 - w)                            (S - P)/[d + (1 - w)] + P/(1 - w)
                          ALLD    (T - P)/[d + (1 - w)] + P/(1 - w)    P/(1 - w)

Notice that unconditional defection is always an ESS, since V(TFT|ALLD) - V(ALLD|ALLD) = (S - P)/[d + (1 - w)] < 0, because P > S. It is also true in this model that the expected length of the interaction, 1/(1 - w), must exceed a threshold in order that TFT be an ESS. This threshold is

$$\frac{1}{1 - w} > \frac{1}{d}\left\{\frac{T - R}{R - P}\right\}.$$

First, consider the special case in which the expected delay is one time unit. Then the term in braces on the right side of this expression is the expected number of interactions necessary for TFT to be an ESS in the discrete time model analyzed by Axelrod less one. The difference is a consequence of the fact that, in the continuous time case, the game sometimes ends before defectors are detected; thus, although the average time until detection is 1/d, defectors do not reap the benefits of defection for 1/d time units on the average. Note that this difference becomes small as the expected length of the interaction becomes large when compared to the expected time delay. Second, notice that, as in the previous case, increasing the average length of the delay increases the threshold value of the expected number of interactions (and therefore the threshold value of w). This means that as the average interval between the time at which the defector first defects and the time at which the defection is detected shrinks, the expected length of time of the interaction necessary for contingent altruism to be favored also shrinks. In the limit in which there is no delay (1/d = 0), interactions between pairs of individuals need not persist at all.
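The continuous time condition can be illustrated numerically. The sketch below is a hedged reconstruction, not the paper's own calculation. It assumes the four expected payoffs R/(1 - w), (S - P)/[d + (1 - w)] + P/(1 - w), (T - P)/[d + (1 - w)] + P/(1 - w), and P/(1 - w), and it derives the threshold length (T - R)/(d(R - P)) from V(TFT|TFT) > V(ALLD|TFT). The function names are hypothetical, and the Table 2 numbers are reused as payoff rates.

```python
def continuous_fitness(T, R, P, S, w, d):
    """Expected fitness effects on A, keyed by (A's strategy, B's strategy).

    The interaction ends at rate 1 - w; a change of behavior is detected
    at rate d. Both waiting times are assumed to be exponential.
    """
    end_rate = 1 - w
    return {
        ("TFT", "TFT"): R / end_rate,
        ("TFT", "ALLD"): (S - P) / (d + end_rate) + P / end_rate,
        ("ALLD", "TFT"): (T - P) / (d + end_rate) + P / end_rate,
        ("ALLD", "ALLD"): P / end_rate,
    }


def threshold_length(T, R, P, d):
    """Expected interaction length above which TFT resists ALLD."""
    return (T - R) / (d * (R - P))


def tft_is_ess(T, R, P, S, w, d):
    v = continuous_fitness(T, R, P, S, w, d)
    return v[("TFT", "TFT")] > v[("ALLD", "TFT")]


T, R, P, S = 10.0, 8.0, 4.0, 1.0
for d in (1.0, 0.25):
    print("mean delay", 1 / d, "threshold length", threshold_length(T, R, P, d))
for w in (0.0, 0.6):
    print("expected length", 1 / (1 - w), tft_is_ess(T, R, P, S, w, d=0.25))
```

With a mean delay of one time unit the threshold length is 0.5, which is the discrete time requirement (T - P)/(R - P) = 1.5 less one. Lengthening the mean delay to four time units raises the threshold to 2, so an interaction with expected length 1 no longer supports TFT while one with expected length 2.5 does.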

This model is interesting for two reasons. First, it suggests that reciprocal altruism may be involved in the evolution of interactions that occur on very short time scales. Traditionally, people have thought about reciprocal altruism in terms of interactions that last over a significant portion of an individual’s lifespan. However, the continuous time model shows that it is not the absolute length of the interaction that is crucial; rather it is the length of the interaction relative to the time delay between the time that an individual defects and the time at which that defection is detected. Thus, reciprocal altruism may explain the evolution of many kinds of ephemeral coalitions, such as those involved in predator mobbing. Second, it puts the role of the time delay in a different perspective. When there is no delay, cheaters can never exploit a contingent altruistic strategy such as TFT because they cannot benefit from cheating. As soon as they stop behaving altruistically, so does the individual with whom they are interacting. Hence, simultaneous behavior alone is not sufficient to cause an ongoing interaction to be mutualistic; rather, both simultaneous behavior and the absence of any time delay during which cheaters can prosper are required.

CONCLUSION

In this article, I have argued that the repeated prisoner’s dilemma model is a useful model because it captures the features of ongoing social interactions that allow altruism to evolve among unrelated individuals. There is, however, a great deal more to understand about ongoing social interactions than how such interactions allow altruism among unrelated individuals. The prisoner’s dilemma is only one of a great number of “mixed motive” games that describe different kinds of social interactions in which the interests of the participants are neither perfectly coincident nor purely opposed. For example, the hawk-dove game represents many situations in which individuals engage in contests over resources; the “battle of the sexes” game can represent situations in which different individuals have different interests, but also a strong interest in behaving similarly (see Hirshleifer 1982 for many interesting examples). Unfortunately, there has been very little effort expended in evolutionary analyses of repeated versions of mixed-motive games other than the prisoner’s dilemma. Ongoing social life in animal groups involves not merely potentially altruistic interactions, but ongoing interactions that are described by other mixed-motive games. Because the interactions are ongoing, it is likely that contingent behavioral strategies will arise, and some may be favored by natural selection. To understand the behavior that results, we need to investigate the repeated versions of a wider variety of mixed-motive games.

I am grateful to Eric Fischer, Robert Seyfarth, Joan Silk, John Wiley, and Gerald Wilkinson for useful comments on earlier drafts of this paper, and to Peter Richerson for many long and useful discussions of these ideas.

REFERENCES

Axelrod, R. The Evolution of Cooperation. New York: Basic Books, 1984.

Axelrod, R., and Hamilton, W.D. The evolution of cooperation. Science 211: 1390-1396, 1981.

Boyd, R., and Richerson, P.J. Effect of phenotypic variation on kin selection. Proceedings of the National Academy of Sciences USA 77: 7506-7509, 1980.

Brown, J.S., Sanderson, M.J., and Michod, R.E. Evolution of social behavior by reciprocation. Journal of Theoretical Biology 99: 319-339, 1982.

Hirshleifer, J. Evolutionary models in economics and law: Cooperation vs. conflict strategies. Research in Law and Economics 4: 1-60, 1982.

Maynard Smith, J. Evolution and the Theory of Games. London: Cambridge University Press, 1982a.

Maynard Smith, J. The evolution of social behavior: A classification of models. In Current Problems in Sociobiology, King's College Sociobiology Group (Eds.). Cambridge: Cambridge University Press, 1982b.

Nunney, L. Group selection, altruism, and structured deme models. American Naturalist 126: 212-230, 1985.

Oye, K.A. Explaining cooperation under anarchy: Hypotheses and strategies. In Cooperation Under Anarchy, K.A. Oye (Ed.). Princeton: Princeton University Press, 1986.

Packer, C. What ever happened to reciprocal altruism? Trends in Ecology and Evolution 1: 142-143, 1986.

Peck, J., and Feldman, M.W. The evolution of helping behavior in large, randomly mixed populations. American Naturalist 127: 209-221, 1985.

Thomas, B. Genetical ESS models. I. Concepts and basic model. Theoretical Population Biology 28: 18-32, 1985.

Trivers, R. The evolution of reciprocal altruism. Quarterly Review of Biology 46: 35-57, 1971.

Wilson, D.S. The Natural Selection of Populations and Communities. Menlo Park, CA: Benjamin/Cummings, 1980.

Wilson, E.O. Sociobiology: The New Synthesis. Cambridge: Belknap/Harvard University Press, 1975.

Wrangham, R. Mutualism, kinship, and social evolution. In Current Problems in Sociobiology, King's College Sociobiology Group (Eds.). Cambridge: Cambridge University Press, 1982.