43
Journal of Economic Literature Vol. XLIII (March 2005), pp. 65–107 Economic Theory and Experimental Economics LARRY SAMUELSON 65 1. Introduction G ame theory had its beginnings in eco- nomics as a separate topic of analysis, practiced by a cadre of specialists. It has since become commonplace. Every econo- mist is acquainted with the basic ideas, often without notice, and there is free movement between the use of game theory and other techniques. This incorporation as a standard economic tool has helped shape the nature of game theory itself—the mix of questions has changed and more attention has been devoted to how game theoretic models are to be interpreted as capturing economic interactions. Mathematical economics and economet- rics have each similarly progressed from being a topic pursued by a band of specialists to becoming a sufficiently familiar tool as to be used without comment. In the process, each has been shaped by issues arising in economic applications. Samuelson: University of Wisconsin. I thank Jim Andreoni, Jakob K. Goeree, Roger Gordon, John McMillan, Georg Nöldeke, and three referees for helpful comments. I thank the Economic Science Association for the invitation to give a talk at the 2003 Pittsburgh meetings that developed into this paper. I thank the National Science Foundation (SES-0241506) and Russell Sage Foundation (82-02-04) for financial support. Experimental economics is currently making its transition from topic to tool. 1 Once viewed skeptically by many econo- mists, experiments have become common- place. Once again, this transition has involved changes both in the way econo- mists view experimental methods and in the experimental methods themselves. This paper explores one aspect of this integration of experimental economics into economics. How can we usefully combine work in economic theory and experimental economics? What do economic theory and experimental economics have to contribute to one another, and how can we shape their interaction to enhance these contributions? There is already plenty of work that insightfully integrates theory and 1 For example, the Journal of Economic Literature’s “Mathematical and Quantitative Methods” classification section includes a “Design of Experiments” subsection, and a Nobel prize has been given for experimental work. At the same time, training in experimental methods has not yet joined basic econometrics or game theory as a stan- dard part of the first-year graduate curriculum. Alvin E. Roth (1993) provides a history of early work in experimen- tal economics. Roth (1995) continues this history and pro- vides a more detailed discussion of recent experimental work. Roth (1991) proceeds further with some thoughts on the future of experiments in economics.

Economic Theory and Experimental Economicswhiteheadjc/eco4810/stuff/mr05_samuel2.pdfMathematical economics and economet- ... Foundation (82-02-04) ... Samuelson: Economic Theory and

Embed Size (px)

Citation preview

Journal of Economic LiteratureVol. XLIII (March 2005), pp. 65–107

∗ Samuelson: UAndreoni, JakobMcMillan, Georg comments. I thanthe invitation to githat developed iScience FoundatFoundation (82-0

Economic Theory and Experimental Economics

LARRY SAMUELSON∗

1 For example, the Journal of Economic Literature’s

1. Introduction

Game theory had its beginnings in eco-nomics as a separate topic of analysis,

practiced by a cadre of specialists. It hassince become commonplace. Every econo-mist is acquainted with the basic ideas, oftenwithout notice, and there is free movementbetween the use of game theory and othertechniques. This incorporation as a standardeconomic tool has helped shape the natureof game theory itself—the mix of questionshas changed and more attention has beendevoted to how game theoretic models areto be interpreted as capturing economicinteractions.

Mathematical economics and economet-rics have each similarly progressed frombeing a topic pursued by a band of specialiststo becoming a sufficiently familiar tool as tobe used without comment. In the process,each has been shaped by issues arising ineconomic applications.

65

niversity of Wisconsin. I thank Jim K. Goeree, Roger Gordon, JohnNöldeke, and three referees for helpfulk the Economic Science Association forve a talk at the 2003 Pittsburgh meetingsnto this paper. I thank the Nationalion (SES-0241506) and Russell Sage2-04) for financial support.

Experimental economics is currentlymaking its transition from topic to tool.1

Once viewed skeptically by many econo-mists, experiments have become common-place. Once again, this transition hasinvolved changes both in the way econo-mists view experimental methods and in theexperimental methods themselves.

This paper explores one aspect of thisintegration of experimental economics intoeconomics. How can we usefully combinework in economic theory and experimentaleconomics? What do economic theory andexperimental economics have to contributeto one another, and how can we shape theirinteraction to enhance these contributions?

There is already plenty of work that insightfully integrates theory and

“Mathematical and Quantitative Methods” classificationsection includes a “Design of Experiments” subsection,and a Nobel prize has been given for experimental work.At the same time, training in experimental methods hasnot yet joined basic econometrics or game theory as a stan-dard part of the first-year graduate curriculum. Alvin E.Roth (1993) provides a history of early work in experimen-tal economics. Roth (1995) continues this history and pro-vides a more detailed discussion of recent experimentalwork. Roth (1991) proceeds further with some thoughts onthe future of experiments in economics.

66 Journal of Economic Literature, Vol. XLIII (March 2005)

3 For example, Lisa A. Cameron (1999), ElizabethHoffman, Kevin A. McCabe and Vernon L. Smith (1996),and Robert L. Slonim and Roth (1998) (larger payoffs);Joseph Henrich (2000), Henrich, Robert Boyd, SamuelBowles, Colin F. Camerer, Ernst Fehr, Herbert Gintis, andRichard McElreath (2001), and Roth, Vesna Prasnikar,Masahiro Okuno-Fujiwara, and Shmuel Zamir (1991) (dif-ferent countries and cultures); Gary E. Bolton and RamiZwick (1995), Hoffman, McCabe, and Smith (1996), and

mr05_Article 2 3/28/05 3:25 PM Page 66

experiments.2 However, the methods for put-ting the two together are still developing. Thegoal here is to examine the issues involved inthis development. Much can be gained bycombining economic theory and experiments,but doing so calls for thinking carefully aboutthe way we do theory as well as experiments.

2. An Example

It is helpful to begin with an example inwhich experimental results and economic the-ory have constructively mingled. This exam-ple illustrates the ideas that will be developedmore generally in section 3, illustrated in section 4, and then extended in section 5.

In 1965, Reinhard Selten (1965) intro-duced the concept of a subgame-perfectequilibrium. Subgame perfection is nowtaken for granted, in the sense that a paperwhose conclusion hinged upon an equilibri-um that was not subgame perfect wouldhave a great deal of explaining to do.

Some years later, Werner Güth, RolfSchmittberger, and Bernd Schwarze (1982)performed a simple experiment, examiningwhat has come to be known as the ultimatumgame. Player 1 makes a proposal for how asum of money is to be split between players 1and 2. Player 2 then either accepts, imple-menting the proposal, or rejects, in which casethe interaction ends with zero payoffs for each.This is the type of game—perfect information,two players, only one move per player—inwhich subgame perfection is often viewed asbeing obviously compelling. In any subgame-perfect equilibrium of the ultimatum game,player 1 makes and player 2 accepts a propos-al that gives player 2 at most one penny (or oneof whatever is the smallest monetary unitavailable). In contrast, Güth, Schmittberger,and Schwarze obtained results that have beenechoed by an ever-growing list of subsequent

2 Vincent P. Crawford (1997) and Roth (1988) explorethe interaction between economic theory and experi-ments, each arguing (as does this paper) that there aregood reasons for thinking about experiments when doingeconomic theory as well as thinking about theory whendoing experiments.

studies. The modal proposal is typically to splitthe sum of money evenly. If player 1 asks fortwo-thirds or more of the surplus, he stands agood chance of being rejected.

We thus have a marked contrast betweentheory and experiment. A common initialreaction was to dismiss the laboratory envi-ronment as uninteresting. Why should we beinterested in how experimental subjects playan artificial game for token amounts ofmoney? Borrowing a term from experimen-tal psychology, this is a question of externalvalidity: is the experimental environmentsufficiently close to the situation of interestto be informative? In this case, for example,is the laboratory environment close enoughto the situations envisaged by contract theo-rists when they assume that subgame-per-fect equilibria appear in the ultimatumgames embedded in their models?

One way of gaining some perspective onsuch questions is to turn them around. Howspecial is the laboratory environment generat-ing the experimental results? Can we link theresults to aspects of the experimental envi-ronment that appear to be especially artificial,or do they appear to be robust? In the case ofthe ultimatum game, a long string of experi-ments has investigated the effects of playingfor larger amounts of money, playing in dif-ferent countries and cultures, playing withdiffering degrees of anonymity, playing withdifferent amounts of experience, playinggames of different length, and playing withdifferent types of opponents.3 Some of these

Hoffman, McCabe, Shachet, and Smith (1994) (anonymi-ty); David Cooper, Nick Feltovich, Roth, and Zwick (2002)(experience); Robert Forsythe, Joel L. Horowitz, N. E.Savin, and Martin Sefton (1995) and Glenn W. Harrisonand McCabe (1992) (length); and Sally Blount (1995),Harrison and McCabe (1996), and Eyal Winter and Zamir(1997) (opponents).

Samuelson: Economic Theory and Experimental Economics 67

mr05_Article 2 3/28/05 3:25 PM Page 67

variations matter, and there is much to belearned about which matter more than oth-ers. However, no combination of conditionshas been found that produces the subgame-perfect equilibrium outcome sufficientlyreliably as to allow us to dismiss the remain-ing experimental results. The mounting evi-dence suggests that the ultimatum game hassomething to tell us about behavior. One canoften find reasons to dismiss any singleexperiment, but cannot ignore such a largeand varied body of work.

Attention then turns to the theory. Whatimplications for economic theory do theexperimental results have? Perhaps none. Weknow that any theory is a deliberate approxi-mation, and hence that there must be somecircumstances under which it fails. Could itbe that the theory is meant for settings notcaptured by the experiments, and that thetheory is still useful in the applications forwhich it is intended? In this spirit, KenBinmore, Avner Shaked, and John Sutton(1985) argue that the ultimatum game fea-tures an atypically asymmetric division of bar-gaining power, making subgame perfectionunrealistically demanding, and that modelsbuilt around subgame perfection might be abetter match for two-stage bargaining gamesthat feature a less extreme (though still asym-metric) distribution of bargaining power.Their experiments produced outcomes muchcloser to the subgame-perfect equilibrium intwo-stage bargaining games. Are we then toassume that the subgame-perfect equilibriumis a useful model of behavior in bargainingmodels, as long as we stay away from modelswith equilibria that are too asymmetric? Andif so, what does “too asymmetric” mean?

Once again, we can seek insight in theensuing body of experimental work.Subsequent experiments have examined bar-gaining games with varying degrees of asym-metry in bargaining power (see Douglas D.Davis and Charles A. Holt [1993] for a sum-mary). Departures from equilibrium areoften much less pronounced than in the ulti-matum game, but the data still does not

invariably reflect equilibrium play. However,we do not expect theories to make exact pre-dictions. How close is close enough? Whenare experimental results within the margin ofapproximation that is inevitably built into atheory, and when do they indicate that thetheory is on the wrong track? There is typi-cally no obvious standard for answering thesequestions. One can then imagine Davis andHolt’s summary figure (Figure 5.6) beingregarded as evidence for both the successand the failure of conventional bargainingmodels, depending upon one’s point of view.

A return to the theory is again helpful, thistime with an eye toward finding within thetheory some guide for evaluating the experi-mental results. Glen W. Harrison (1989,1992) (see also Robert Drago and John S.Heywood [1989]) suggests one approach. Acornerstone of the relevant theory is that peo-ple maximize their expected payoffs. In lightof this, a natural measure for evaluating thetheory is the payoff losses subjects incur as aresult of not behaving as predicted. The larg-er are these losses, the stronger is the evi-dence that the theory has missed something.Harrison argues that in the case of auctions,seemingly large departures from equilibriumbehavior often translate into very small payofflosses, suggesting that the contrast betweentheory and behavior is not nearly as large as itfirst appears. Drew Fudenberg and David K.Levine (1997) turn a similar eye toward avariety of other games, including the ultima-tum game. They find that behavior in manyof these games is consistent with subjects’holding beliefs about others’ behavior that isconsistent with their experience and againstwhich they suffer relatively small payoff loss-es. This again suggests that the theory maycapture important elements of behavior,despite seemingly unencouraging experi-mental results. At the same time, however,payoff losses in the ultimatum game are rela-tively large compared to many other experi-ments. Rejecting an offer often involves asignificant sacrifice, regardless of what onebelieves about how others are playing. It is

68 Journal of Economic Literature, Vol. XLIII (March 2005)

mr05_Article 2 3/28/05 3:25 PM Page 68

then harder to argue here that one canrationalize nonequilibrium behavior simplyby arguing that the players are nonethelessachieving approximately equilibrium payoffs.

Binmore, John Gale, and LarrySamuelson (1995) and Alvin E. Roth andIdo Erev (1995) suggest an alternativeapproach to assessing how close observedbehavior is to the predictions of the theory.Why should we expect equilibrium behaviorin the first place? The traditional answer ineconomics is not that equilibria spring tolife as a result of sheer calculation or exter-nal organization, but rather that behavior ispushed toward equilibrium by an adjust-ment or learning process that continuallyputs pressure on players to alter nonequilib-rium behavior. Adopting this view, howstrong are the incentives for players toadjust nonequilibrium behavior in simplebargaining games?4 The stronger are theseincentives, the stronger is the experimentalevidence that the theory has missed some-thing. In the case of bargaining games, itturns out that these incentives can be quiteweak. Even small amounts of noise orimperfection can cause the learning processto get stuck, for long or even indefinite peri-ods of time, far away from a neighborhoodof the subgame-perfect equilibrium. Wethus again have a suggestion that theobserved behavior may not be too far fromequilibrium, by at least one measure of “toofar.” Motivated by similar considerations, aliterature on learning and its relationship to experimental behavior has developed.5

However, two difficulties now arise. First,

4 In spirit, this is close to asking how far realized pay-offs fall short of the payoffs that could be obtained by play-ing a best response. The difference is that the incentivesfor adjusting one’s behavior are now taken to be not thepayoffs promised by perfect optimization, but rather theincentives to pursue the potentially imperfect learningprocess that shapes behavior.

5 See Camerer (2003) and Drew Fudenberg and DavidK. Levine (1998) for examples. Raymond Battalio,Samuelson, and John Van Huyck (2001) report an experi-ment linking the speed of learning and the incentives foradjusting one’s strategy, providing some hint that learningcan be important.

what can the subjects, especially respon-ders, possibly have to learn in a game sosimple as the ultimatum game? Without aclear answer to this question, learning mod-els are difficult to interpret. We return tothis question in section 5. Second, learningtheories have proven to be cumbersometools with which to examine strategic inter-actions. A successful theory trades off itsexplanatory power with its ease of use. It hasnot been easy to formulate learning modelsthat rival equilibrium theories in terms ofreadily yielding sharp predictions.6

Perhaps one should view the connectionbetween theory and experiment differently.Instead of asking whether the theory getsthe behavior right, and then wrangling overhow the distance between experimentaland theoretical outcomes is to be measuredand interpreted, let us ask whether the the-ory captures the important considerationsshaping the behavior. This directs attentionaway from the point predictions of the the-ory and toward its comparative statics. Forexample, experimental behavior that con-sistently responds to changes in discountrates as predicted by the theory of bargain-ing might lead us to believe that the theoryhas identified an important role for impa-tience in shaping behavior, even if the the-ory is not complete enough to captureevery aspect of behavior. This emphasis oncomparative statics pushes experimentalanalysis closer to methods familiar in otherareas of economics.

The results in this respect for bargainingtheory are mixed. For example, Binmore,Peter Morgan, Shaked, and Sutton (1991)and Binmore, Shaked, and Sutton (1989)report experiments in which behaviorresponds to the difference between a volun-tarily exercised and involuntarily exercisedoutside option in a direction consistent withtheoretical predictions. However, Jack Ochs

6 Ed Hopkins (2002) provides an indication of why itcan be difficult to identify the learning process behindexperimental behavior.

Samuelson: Economic Theory and Experimental Economics 69

mr05_Article 2 3/28/05 3:25 PM Page 69

and Roth (1989) report an experiment inwhich behavior does not respond to the dis-count factor and the length of the game con-sistently with the predictions of subgameperfection. This latter finding is all the moredisconcerting because the role of impatienceis viewed as one of the key insights of non-cooperative bargaining models.7 Theseresults suggest that rates of impatience maybe less central, and the prospect of a break-down in negotiations more important, thancaptured by the original models.

Taken together, the body of experimentalevidence suggests that our simplest theoriesof bargaining leave some aspects of behav-ior unexplained. This is interesting, but ismost useful if the experiments also suggesthow we might construct a more encompass-ing account of behavior. This brings us tothe question, again borrowed from experi-mental psychology, of internal validity. Howdo we assess whether our interpretation ofan experimental result captures the rele-vant aspects of the experimental situationand the resulting behavior, and hencepoints the way to a better understanding ofthe behavior and to better theoretical mod-els of that behavior? For example, Güth,Schmittberger, and Schwarze (1982) (seealso Güth and Reinhard Tietz [1982]) inter-pret their results as indicating that subjects’behavior is shaped primarily by considera-tions of fairness. If this is the case, then wemay be on the road to a new “fairness theo-ry” of behavior. We might work with famil-iar bargaining models, but with quitedifferent views of how people behave inthese models. Notice that this assessmentdiffers markedly from the hints with whichthe previous paragraph concluded, underwhich we would retain the basic view of

7 The contrast between these two results becomessharper in the context of section 4, which suggests that oneshould be especially disappointed when a theory fails toexhibit behavior integral to its original structure, such asthe appropriate sensitivity to discount rates, but especiallypleased when the theory successfully extends to originallynovel questions, such as the effect of outside options.

self-interested behavior but revise thestructure of the model.

An appeal to fairness has an intuitive ringto it. It is hard to believe that fairness doesnot play a role in our lives, or that extreme-ly asymmetric allocations would not strikeone as unfair. It also seems quite naturalthat these considerations would carry overinto behavior in bargaining experiments.Here, however, we return to a theme thatappeared in connection with learning mod-els and that runs throughout this essay. Therelevant question for evaluating a theory isnot so much whether it is “correct,” butwhether it can be readily and usefullyapplied to a sufficiently broad range of set-tings. The difficulty with appeals to fairnessis that they too often have an “I know itwhen I see it” quality that makes them par-ticularly cumbersome to use. VesnaPrasnikar and Roth (1992) develop thisidea, reporting experimental results show-ing that, under some circumstances, experi-mental subjects do settle on extremelyasymmetric allocations (see also JamesAndreoni, Paul M. Brown, and LiseVesterlund [2002] and Harrison and JackHirshleifer [1989]). This appears to suggestthat we have been too hasty in concludingthat concerns for fairness routinely pushpeople away from asymmetric allocations.However, the extreme allocations inPrasnikar and Roth’s best-shot game Paretodominate the less asymmetric allocations. Inresponse, it is tempting to refine the notionof fairness, viewing it as inducing an antipa-thy to asymmetric allocations, but an antipa-thy that is tempered when asymmetricallocations have efficiency properties thatsymmetric allocations lack. Adding the best-shot experimental results to our portfoliomay thus suggest that fairness is importantafter all, but is a more subtle notion thansimply a concern for equality or symmetry.8

8 Prasnikar and Roth (1992) investigate these possibili-ties by examining a market game in which asymmetricequilibrium outcomes appear that do not Pareto dominatethe symmetric outcomes.

70 Journal of Economic Literature, Vol. XLIII (March 2005)

9 Among others, Ken Binmore, John McCarthy,Giovanni Ponti, Samuelson, and Avner Shaked (2002) andArmin Falk, Fehr, and Urs Fischbacher (2003) reportexperimental results indicating that preferences mustdepend upon more than simply payoffs, even the payoffsof all players.

mr05_Article 2 3/28/05 3:25 PM Page 70

There is much that is appealing about thisargument, but it illustrates the difficulties ofworking with such an elusive concept asfairness. The more subjective or context-dependent is the idea of fairness, the lessuseful it becomes as a component of a the-ory, regardless of how important it is inshaping behavior.

Making progress in interpreting seeming-ly anomalous experimental results thusrequires making the idea of fairness, orwhatever else it is that one imagines affect-ing players’ behavior, sufficiently precise. Afirst question is theoretical: can we do sowith conventional theoretical techniques, orare we dealing with something quite differ-ent? Are we dealing with a world to whichthe underlying structure of economic mod-els applies—people maximize, they bal-ance competing objectives, they respondto variations in the constraints on howthese objectives can be traded against oneanother—even if they are concerned withsomething other than simply how muchmoney they make? Or are such models ofbehavior simply on the wrong track?Andreoni and John Miller (2002) providesome insight into this question throughexperiments in which a dictator faces a vari-ety of exchange rates between the payoffsthat the dictator can keep or allocate to arecipient (while Andreoni, Marco Castillo,and Ragan Petrie [2003] do much the samefor the ultimatum game). As is often the casein such games, their results are not consis-tent with the proposition that all subjectscare only about how much money theyreceive. However, their results are consis-tent with the claim that most subjects havestable preferences satisfying revealed-pref-erence axioms. Whatever motivates the sub-jects, whether money or fairness orsomething else, it is something that we canmodel with the familiar optimization tools ofeconomics, without abandoning rationalbehavior as a unifying principle.

The next task is again theoretical: fittingsome more encompassing model of individual

behavior into standard models of bargain-ing. Gary E. Bolton and Axel Ockenfels(2000) and Ernst Fehr and Klaus M.Schmidt (1999) (see also Bolton [1991])offer models of preferences that capture aconcern for fairness. Each is centeredaround a utility function that involves one’sown payoff and the payoff of one’s oppo-nent, and that exhibits some aversion topayoff inequality.

One tempting reaction to these models isthat nothing so simple could possibly cap-ture the complexity of human behavior.Pursuing this view, it is not hard to find evi-dence that some factors are missing fromthese models.9 However, such criticismsmiss the point. It is again important to recallthat one purpose of any theory is to judi-ciously choose considerations to neglect.The ability to find some circumstances inwhich the theory does not work perfectly isthen not by itself cause to reject the theory.While they may still be incomplete, themodels offered by Bolton and Ockenfels(2000) and Fehr and Schmidt (1999) havethe key virtue that their predictions are clearand they can easily be extended to encom-pass novel situations. This allows us to con-firm that these models predict behaviormatching that of standard models in a widevariety of circumstances in which the latterappear to be applicable, to confirm that theycapture the apparent fairness considerationsthat operate in the bargaining models thatmotivated their construction, and to investi-gate the extent to which they apply to newapplications. This is just what we need tomake progress, and is what economic theorymust do if we are to effectively combine theory and experiments.

A variety of alternative and more elaboratemodels have appeared, many enriching the

Samuelson: Economic Theory and Experimental Economics 71

mr05_Article 2 3/28/05 3:25 PM Page 71

theory by incorporating elements in prefer-ences beyond simply the final allocation ofpayoffs.10 There is considerable work to bedone in assessing and synthesizing thesemodels, work that will require a continualinterplay between economic theory andexperiments. How is this interplay to pro-ceed? It will be helpful to develop some ofthe ideas raised in the course of this examplemore generally.

3. A Theoretical Perspective

This section opens a more general discus-sion with a theoretical perspective, in theform of a model of economic theory andeconomic experiments. The idea is to pro-vide a precise way of talking about what atheory is, what an experiment is, and howthe two might be related.

3.1 A Model

The environment. The model beginswith the assumption that there is an objectiveenvironment or “real world” to be studied,represented by a function

where X and S are finite sets and X∞ and

F : ,X S� �→

10 Bolton (1991) offers an early model in which prefer-ences depend upon others’ payoffs, while Matthew Rabin(1993), building on the theory of psychological games(John Geanakoplos, David Pearce, and Ennio Stacchetti[1989]), is an early example of how one might make theidea of fairness theoretically operational. Other examplesinclude Gary Charness and Rabin (2002), MartinDufwenberg and Georg Kirchsteiger (2004), Fehr andSimon Gächter (2000), Levine (1998), and Uzi Segal andJoel Sobel (2003). Andreoni and Samuelson (2003) reportexperimental results that explore, in a somewhat differentsetting, some of the key features of these models.

11 This discussion thus avoids questions concerning theexistence of an objective reality have been raised fromwidely differing perspectives. Physicists argue that quan-tum phenomena are not determined independently ofattempts to measure them, while some social scientistsargue that nothing objective exists independently of theobserver, who constructs reality as she observes it. Suchconcerns can be relevant for economics. For example,could experimental procedures designed to elicit valua-tions affect those valuations? We return to this set of issuesin section 5.1.

S∞ are their infinite-dimensional prod-ucts.11 We think of the function F as takingin information, given by an element of theset X∞, that defines a situation of interest.This information might define an exten-sive-form game, or a set of lotteries fromwhich one is to choose, or a market or aneconomy. With each such situation, thefunction F associates an output from theset S∞.12 Depending on the situation, thisoutput might be an equilibrium of thegame, or a selected lottery, or a marketprice or a competitive equilibrium.

We can view each of the dimensions ofX∞ and S∞ as corresponding to a propertyor characteristic that a situation or an outputmight have, with the sets X and S providingthe language in which one describes suchproperties. The details of the sets X and Sare not particularly important. What doesmatter is that there is a potentially endlesslist of relevant properties, sufficiently manythat neither theoretical nor experimentalwork could ever hope to describe everyaspect of reality. We ensure this in themodel by assuming that there are infinitelymany such properties, so that the sets ofinputs and outputs are the infinite productsX∞ and S∞.13

We think of the environment as generat-ing situations which are then transformedinto outcomes by the function F. We let �denote a probability distribution describingthe process that generates situations. Wethink of a theory or an experiment as beinga tool for understanding the function F. In

12 Again, the model skirts a philosophical issue, con-cerning whether the world is deterministic or random. Weadopt the technical convenience that outcomes are deter-ministic (conditional on being able to identify the situationcompletely), though in practice we can identify only finiteapproximations of situations, with outcomes that thenappear to be random (conditional on this information). Arandom-world view requires only additional notation inorder to accommodate an extra layer of probability distri-butions in the model.

13 The sets X∞ and S∞ can be viewed as a short-handfor sets that are finite but prohibitively large. Daniel C.Dennett’s Library of Mendel (1995) provides the settingfor an intriguing discussion of large finite sets.

72 Journal of Economic Literature, Vol. XLIII (March 2005)

mr05_Article 2 3/28/05 3:25 PM Page 72

the absence of any constraints, of course,one would simply work with F itself.Unfortunately, the function F is too com-plicated to work with directly. The idea isthen to combine theory and experimentalwork to produce tools that are simpleenough to be used, while capturing enoughof F to be useful.

Theory. Like the function F thatdescribes the environment, a theory takes ininformation concerning a situation and pro-vides information concerning the correspon-ding outcome. However, instead of taking inall of the information contained in an ele-ment of the set X∞, we model the theory asmaking use only of the dimensions 1,…,N,for some finite N. Similarly, instead of speci-fying every detail of the output, the theoryprovides information only about the dimen-sions 1,…, M, for some finite M. Let XN bethe set of N-tuples corresponding to the firstN dimensions of the set X∞, and let SM simi-larly be M-tuples whose elements corre-spond to the first M dimensions of S∞. Atheory is then a function

for some N and M, where ∆SM is the set ofprobability measures over SM and ∆∆SM isthe set of probability measures over ∆SM.

The restriction to finite N and M capturesthe fact that a theory does not make use ofall of the information defining a situation,nor does it specify every detail of the out-put.14 Instead, one of the challenges incrafting a theory is to choose its inputs andoutputs, i.e., to choose N and M, so as toinclude relevant information and neglectrelatively unimportant details. For example,a theory about labor force participation

f X SN M: ,→ ��

14 While it is intuitive that a theory cannot make use ofall the information in the environment, there is in princi-ple no reason why it should be restricted to the first Ndimensions of X∞. Why not a theory that makes use of theinformation in dimensions 1, 3, and 14 and ignores therest? There is no loss in assuming that, whatever theorywe have, the dimensions of X∞ are arranged so that thetheory makes use of an initial string of them.

rates may use information on wage rates,marital status, age and educational attain-ment, but may neglect information concern-ing foreign exchange rates. The theory’soutput may provide information about howmuch time an individual devotes to leisure,but may say nothing about which activitiesconsume this time.

As a first approximation, we might think ofthe theory as choosing an output from SM.However, given that the theory’s input leavessome details of the situation unspecified, it ismore natural to view the theory as produc-ing a probability distribution over the out-puts in SM (i.e., an element of ∆SM). We theninterpret the random output as reflectingthe uncontrolled realization of those aspectsof the situation that are not captured by XN.For example, a theory of labor force partici-pation may provide an expected participa-tion rate, as a function of an individual’s ageand education, but would view actual partic-ipation as being randomly distributedaround this expected value, reflecting other,unobserved characteristics.15

It is useful to go one step further andallow the theory to produce an element of∆∆SM, the space of distributions over distri-butions. We may have more informationconcerning the likely values and implica-tions of some of the unmodeled features ofa situation than of others. We may thenhave a distribution over the realizations ofthe features about which we have relativelygood information, each in turn inducing adistribution over outcomes. For example,an analyst asked to predict the outcome ofthe next presidential election might beginwith the question of whether the economywill then be healthy or in recession. Theanalyst’s theory may involve a distributionover which of these is likely to be the caseand, conditional on either, a distribution

15 The idea that the theory produces a distribution overoutcomes is perhaps most familiarly exploited by weatherforecasters, who regularly announce probabilities of rain,but also appears routinely in economics. We return to thisidea in section 4.1.2.

Samuelson: Economic Theory and Experimental Economics 73

17 An experiment may yield many observations, but wecan arrange the notation to represent the entire experi-mental outcome as a single observation. This modelignores an issue raised by Roth (1994): the tendency toconcentrate on “successes” when reporting experimentalresults can cause useful information to be neglected. Areport of a successful experiment, whether it involves aseeming confirmation or contradiction of a theory, may beless informative than a report that also details the processleading to that experiment. The latter may include investi-gations of alternative games, alternative experimental pro-cedures, alternative presentations of the experiment to thesubjects, and so on. As Roth notes, the line between hav-ing also run alternative (possibly unsuccessful) experi-ments and having run pilot or diagnostic trials is oftenambiguous, so that even the best of intentions do notensure the optimal provision of information. The discus-sion here assumes that this problem has been solved, sothat we have precisely the information we would wantfrom an experiment, and then asks how we combine thatinformation with economic theory.

18

mr05_Article 2 3/28/05 3:25 PM Page 73

over likely outcomes of the election.Similarly, an economist asked to analyzethe market for skilled labor might beginwith a distribution over likely macroeco-nomic conditions, each of which in turninduces a distribution over conditions inthe relevant labor market. In a model oflabor force participation, one’s educationand age may induce a distribution over par-ticipation decisions that itself depends ran-domly on labor market conditions. Who isto say whether the probability that theweather forecaster attaches to rain is notitself chosen randomly?

Experiments. An experiment similarlyassociates an output with an input. Theexperiment again begins with an elementof XN (for some finite N), which we denoteby xN and refer to as the experimentaldesign. This design fixes those features ofthe environment captured in the N dimen-sions of XN. For example, the design xN mayspecify how much money the subjects earnin various circumstances.

The actual input to the experiment is a sit-uation, i.e. an element of X∞ (denoted byx∞), that matches xN on the first N dimen-sions.16 The idea here is that the experimen-tal design fixes those details of theenvironment described by xN, while leavingothers uncontrolled. For example, thedesign may leave uncontrolled the wealthlevels of the subjects.

Let FM denote the function comprisingthe first M dimensions of F. Given anexperimental design xN and a correspondinginput x∞, the experiment consists of anobservation of the form:

FM(x∞)

for some M. Hence, an experiment consistsof a partial description (FM(x∞)) of the out-put of the function F that describes theenvironment, evaluated at one of a collec-tion of possible inputs (x∞ ∈ X∞) that share

16 Hence, the input is drawn from the set {x∞ ∈ X∞ : x∞(n) � xN(n), n � 1,…,N}.

the features (xN) determined by the experimental design.17

It may seem counterintuitive to character-ize the experiment as yielding realizations ofF, since experiments are often viewed as(and criticized for) being artificial ratherthan “real.” However, a more precise formu-lation of this criticism is that the experimen-tal situation involves a value of xN that is notprecisely the one in which we are mostinterested.18 But given this value, the outputis given by FM(x∞) for some x∞ that matchesxN on the relevant dimensions.

There are many situations x∞ consistentwith an experimental design xN, as must bethe case when we are unable to specifyevery detail of the experimental situation.One hopes the experimental design deter-mines most of the important aspects of thesituation, but cannot control all of thedimensions of the experimental situationx∞. In effect, the experiment is a model of asituation, just as is a theory. The output of anexperiment is similarly a model, given byFM(x∞) rather than F(x∞). We can hope toidentify the salient points of the experimen-tal outcome, but again cannot identifyeverything.

For example, the experiment may involve universityundergraduates choosing between small-stakes lotterieswhile we may be interested in risk attitudes among largetraders in financial markets.

74 Journal of Economic Literature, Vol. XLIII (March 2005)

mr05_Article 2 3/28/05 3:25 PM Page 74

The Goal. The goal of both theoreticaland experimental work is to understand theworld, or in the context of our model, tounderstand the function F. How can eco-nomic theory, often seemingly quiteremoved from the world, be combined withexperiments in pursuit of this goal?

It is not easy to make this goal more pre-cise. How do we know when we haveachieved some understanding of F? Wemight judge our understanding by the abili-ty to make predictions that match the out-puts generated by F. For example, Erev,Roth, Robert L. Slonim, and Greg Barron(2002) pose the following question. Supposewe have both a theory and some experimen-tal evidence bearing on a question of eco-nomic behavior, perhaps making differentpredictions and each potentially subject toerror. How do we combine the two to reacha more precise, joint prediction?

The perspective of this paper is different.Instead of asking how we use existing theo-retical and experimental results to make pre-dictions, our focus will be on how we canexploit experimental results in the develop-ment of more useful theory, and vice versa.We thus shift the emphasis from using exist-ing theory and experiments in making pre-dictions to using them in making new theoryand experiments.

To be meaningful, of course, this processmust be organized by the ultimate goal ofunderstanding the world. At this point, weconfront new difficulties. While we mighthope that a theory’s predictions will be close,we again cannot expect them to be exact.Then how are we to judge whether a newtheory is an improvement? This would bestraightforward if there were only benefitsand no costs to enhancing the predictivepower of a theory, but this is not the case.We return to this issue in section 3.2.

More importantly, the ability to make pre-dictions is only part of what is involved inusing economic theory to understand theworld. Robert J. Aumann (1985) and ArielRubinstein (1998), for example, argue that

while the ability to predict behavior may bea good test of our understanding of theworld, the ultimate goal is the understandingitself. Economic theory can then be helpfulin making precise our intuition and estab-lishing relationships between our ideas, evenwithout adding to our predictive abilities.This is a popular view, but one that makes itall the harder to identify the criteria bywhich theory is to be evaluated.

3.2 Why Experiments?

How do experiments help us assess anddesign economic theory? It is useful to startby considering the limitations of economictheory, organized around four ideas:• Economic theory may be inaccurate:

given an input xN ∈ XN, the theory f(xN)may produce distributions over outputsthat do not match the distributioninduced by the environment. Hence,given the information on which it condi-tions and the results it predicts, the the-ory provides a result that we wouldchange if we knew the true model.

• Economic theory may be imprecise: thetheory may produce a random output ofsufficient noisiness to be unhelpful. Ifpossible, we might then seek more preci-sion by increasing N to encompass moreinformation than that captured by XN,bringing more of the relevant variation inthe situation within the purview of themodel.

• Economic theory may be uninformative:important information may be missingfrom the output of the theory. We maythen need to expand the range of thetheory (increase M).

• Economic theory can be too complicated:if the vectors N and M are large, then theinformational demands of the theorymay be so burdensome as to make thetheory useless.

In practice, we must expect these cate-gories to blur together, with any particulartheory exhibiting some degree of eachshortcoming.

Samuelson: Economic Theory and Experimental Economics 75

22 See, for example, Dan Ariely, George Loewenstein,and Drazen Prelec (2003), Daniel Kahneman and AmosTversky (2000), Loewenstein and Prelec (1992), RichardH. Thaler (1992, 1994), and Tversky and Kahneman(1982).

23 Neglecting this last point makes it all too easy to fallinto a state of tension in which the primary value of exper-iments is seen as debunking theory, and theory is viewed ashaving to defend itself from the challenge of experiments.Against this backdrop, it is noteworthy that economicexperiments owe much of their prominence to theirdemonstration that economic theory can be surprisinglyrobust. For example, Vernon Smith’s work on actions (see

mr05_Article 2 3/28/05 3:25 PM Page 75

How can economic experiments helpaddress the shortcomings of economic the-ory? First, experiments can fill the gapwhen the theory is either too uninformativeor too complicated to be useful.19 Forexample, the role of economists in design-ing and running Federal CommunicationsCommission spectrum auctions in theUnited States, and subsequently throughoutthe world, has been offered as evidence forthe usefulness of economic theory. Beforerunning the auctions, however, the FCCcommissioned experiments (spearheadedby Charles R. Plott) to explore their proper-ties (cf. Paul Milgrom 2004). These experi-ments played an important role in verifyingthe internal consistency of the auction pro-cedure and in making the case that the auc-tion could work. Experiments weresimilarly important in designing the Britishspectrum auctions (Binmore and PaulKlemperer 2002). There are many otherexamples, from designing multiunit auc-tions (Jeffrey S. Banks, John O. Ledyard,and David P. Porter 1989) to designing pro-cedures to allocate access to railroad tracks(Paul J. Brewer and Plott 1996), payloadpriority on the space shuttle (Ledyard,Porter and Randii Wessen 2002), and air-port take-off and landing slots (Stephen J.Rassenti, Vernon L. Smith, and Robert L.Bulfin 1982).

Second, and of more relevance for our dis-cussion, much of the work in experimentaleconomics has centered around identifyinginaccuracies and imprecisions in economictheory.20 For example, the standard eco-nomic model of individual behavior is thatpeople maximize expected utility. However,ample experimental evidence suggests thatpeople do not always maximize expectedutility, and do not count upon others to doso.21 More generally, there are long-standing

19 This falls into Roth’s (1987) category of “Whisperingin the Ears of Princes.”

20 This falls into Roth’s (1987) category of “Speaking toTheorists.”

21 See Camerer (1995) and Roth (1995) for surveys.

research programs in economics and psy-chology that serve as a conscience for eco-nomic theory, arguing that much of ourtheory does not provide a good match forbehavior.22

The difficulty here is that theories areintended to be inaccurate and imprecise. Aswe have noted, a theory is a deliberateapproximation of a world too complicated tobe analyzed in complete detail. It is then nosurprise to find that the theory does notalways match behavior. Experimental con-firmation of this fact is potentially helpful,but only if it also points the way toward animproved theory.23 The constructive role forexperiments that challenge economic theo-ries is thus not to simply argue that existingtheories do not work, but to point the way toimprovements.24 Perhaps paradoxically, it iswhen playing this role that experimentspose the greatest challenge to economictheorists. It is relatively easy to dismiss anexperimental contention that a theory issometimes off the mark, but much harder toignore an indication of how it might beimproved.

A new difficulty now appears. Whenassessing potentially improved theories, wemust trade off competing features thatleave us with only a partial order over alter-natives. Theories are better if they aremore accurate and precise, but also if they

Theodore Bergstrom [2003] for a survey and Smith [1991,2000] for collections of papers) showed that elementarysupply-and-demand models, the bread-and-butter tool ofmuch of economics, were surprisingly descriptive.

24 Binmore (1999) advocates such a “consolidating”view of the interaction between theory and experiments.

76 Journal of Economic Literature, Vol. XLIII (March 2005)

mr05_Article 2 3/28/05 3:25 PM Page 76

are more parsimonious (i.e., have smallerN, for fixed M, or in some cases that havesmaller N and M). A theory that makes bet-ter predictions at the cost of more com-plexity is not necessarily more desirable.Nor is the goal necessarily a single “cor-rect” theory. Instead, we can expect to workwith a portfolio of theories that address dif-ferent issues and that lie at different pointsalong a frontier that trades off power andcomplication.

The idea that a more complicated theorymay not be better is obscured by economictheory itself, which implicitly assumes thatreasoning and inference is costless and auto-matic.25 In practice, however, it is a familiaridea that theories are costly to use, andhence that a more accurate or more precisetheory is not always superior.26 This point isoften illustrated in introductory economicsclasses by asking students to think of a roadmap as a metaphor for an economic theory,and then to note that a map on a scale of 1:1would be more precise than is commonlyfound, but its very detail would render ituseless.

One of the obstacles to the integration ofeconomic theory and experiments is thusthat we have no clear idea of what makes atheory good. For example, we have ampleevidence of shortcomings of expected utili-ty theory, as well as an ample collection ofalternative models. However, while it iseasy to find papers in theory journals work-ing on the tools that might serve as alterna-tives to expected utility theory, it is much

25 Standard models of reasoning and knowledge beginwith a set of states and a partition over these states repre-senting the structure of the available information (e.g.,Ronald Fagin, Joseph Y. Halpern, Yoram Moses, andMoshe Y. Vardi 1995). These models have the implicationthat one automatically knows every implication of anyinformation received. Hence, knowledge of the rules ofchess ensures that one knows an optimal strategy, whileknowledge of the basic axioms of mathematics makes all ofthe theorems of mathematics instantly available. It is nosurprise that such models do not encourage one to thinkabout the costs of complicated reasoning.

26 Barton L. Lipman (1999) examines a formal model ofthe cost of using a theory.

harder to find papers that use these tools.Why? The informal explanation typically isthat for most applications, expected utilitytheory’s lack of realism is a reasonable priceto pay for its simplicity. This assessmentconvinces some, while striking others as tooeasy an excuse.

David W. Harless and Colin F. Camerer(1994) provide a foundation for examiningthis issue, introducing the notion of an effi-ciency frontier for generalizations ofexpected utility theory, balancing predictivepower and simplicity. They find that ordi-nary expected utility theory lies on this fron-tier, as do several more sophisticatedtheories. This at least provides some reas-surance that expected utility theory is notdominated on every dimension. Dependingupon our requirements, we might reason-ably choose to work at various points on thisfrontier, including work with expected utili-ty. But what is the criterion by which pointsalong this frontier are evaluated, other thanconventional wisdom and accepted prac-tice? How much of conventional wisdomand accepted practice reflects inertia, his-torical accident, a lack of familiarity withnew theories, fashion, and similar factors? Ifwe are to insist that the goal of a theory isnot to be right but to be useful, then one ofthe great difficulties with economic theoryis that we have little consensus on whatmakes a theory useful, other than that it iscustomarily used.

This presents a challenge in two respects.Theorists need to be more explicit, both intheir theory and in their reactions to experi-ments, as to how they assess the trade-offsbetween various limitations. Experimentalists,when interpreting results as supporting anelaboration of existing theory, must addressnot only the potentially increased precisionand accuracy of the theory but also theincreased complication.

Is there anything special about experi-ments in this discussion? In one sense, no.The ideas apply to the use of data in gener-al, regardless of whether an experiment lies

Samuelson: Economic Theory and Experimental Economics 77

29 In one view of economic theory, there would be noproblem of external validity. Elon Kohlberg and JeanFrançois Mertens (1986, p. 1005) state: “We adhere tothe classical point of view that the game under consider-ation fully describes the real situation—that any(pre)commitment possibilities, any repetitive aspect, anyprobabilities of error, or any possibility of jointly observ-ing some random event, have already been modeled inthe game tree.” Pushing this view as far as it will go, thetheory then identifies the situation exactly. If the theoryis simple enough that all of its aspects can be captured inthe lab, then we literally have the situation of interest andnot simply an approximation, leaving no room for ques-

mr05_Article 2 3/28/05 3:25 PM Page 77

at their source.27 However, the great attrac-tion and relevance of experiments is theability they provide to control inputs. If weare interested in assessing the output f(xN)produced by the theory f in response toinput xN, it may be easier to create (orapproximate) input xN in the laboratory than“in the field.”28 The value of this controlbecomes all the more apparent upon realiz-ing that we typically can neither ensure thatour inputs include all of the factors wewould like to have, nor that they exclude allof the ones we would like to not have. At thesame time, this advantage brings with it anew challenge. How do we know when theexperimental setting has done its job, givingus observations from situations consistentwith the desired input xN and not somethingelse? We return to this issue in section 5.1.

3.3 Why Theory

What does economic theory have to con-tribute? Paralleling the preceding discus-sion, it is helpful to begin with thelimitations of experimental work:• Experiments may be inaccurate: the

experimental procedure is itself a situa-tion. This procedure has presumablybeen designed to control the key fea-tures of the situation, but we cannotexpect to have controlled everything.How do we know that the design bringsthe experimental situation sufficientlyclose to the real-world situations in

27 The line between experimental and field data isbecoming increasingly blurred, as economists turn to“field experiments” designed to capture the best of bothsettings. See Harrison and John A. List (2004) for an intro-duction to field experiments and the methodological issuesthey raise.

28 If we cannot observe situations consistent with inputxN in the field, why do we care about xN? The answer is thatsome values of xN may provide especially revealing condi-tions under which to evaluate the theory. For example,theories about bargaining may be more readily evaluatedwhen complicating interpersonal factors are stripped awayby examining anonymous bargaining. This in turn my bepossible only in the laboratory. For similar reasons, scien-tists may endeavor to free their experimental environ-ments of impurities, even though such an environment isnot observed in nature.

which we are interested to be informa-tive about the latter? Experimental psy-chology refers to this as the question ofexternal validity.29

• Experiments may be imprecise: our inter-pretation of an experiment may incor-rectly identify the links between thesituation and the results. The unrecog-nized links may make the resulting infer-ences too noisy to be useful. This is aquestion of internal validity.30

• Experiments may be uninformative. Itmay not be possible to bring the experi-mental design xN close enough to thesituation to provide useful information.

• Experiments may be informative only atprohibitive cost. Though one of the obvi-ous advantages of experiments is theability to address otherwise intractableproblems in a manageable way, theremay be cases where this is not feasible.

Again, these are neither sharply definednor mutually exclusive categories, and we

tions of external validity. If not, then the laboratory inves-tigation is irrelevant to the theory. Under this view, anexperimental result at odds with the theory tells us onlythat the experimental design has not captured the condi-tions under which the theory applies. This classicalapproach is best viewed as a philosophical exploration ofthe idea of rationality. It contrasts with a positiveapproach, under which economic theory is viewed as atool for modeling and understanding behavior, a tool thatis more useful the broader is its applicability. In this case,experimental results at odds with the theory help identifycircumstances under which the theory is not applicable.

30 For example, do the choices of experimental subjectsreveal the values they place on the consequences of thosechoices or some other aspect of the process by whichchoices are made or values identified? See Harrison,Ronald M. Harstad, and Elisabet Rutström (2002) for adiscussion of value elicitation.

78 Journal of Economic Literature, Vol. XLIII (March 2005)

32 Difficulties in distinguishing between theories thatare consistent with observations and theories that “explain”these observations are not special to economics. Similarconsiderations arise in the view that one can falsify, butcannot “prove,” a scientific theory.

33 One response to concerns over internal and externalvalidity is to subject the relevant experimental protocol toscrutiny. For example, Thaler (1988) wonders whetherBinmore, Shaked, and John Sutton (1985) might haveinfluenced the behavior of their experimental subjects bystressing in their experimental instructions that subjectsshould maximize their monetary payoffs. As in otherexperimental sciences, however, a useful response topotential inaccuracy or imprecision in economic experi-ments is to rely on replication. The more readily an exper-imental result can be replicated, the less likely is it tohinge upon uncontrolled or unrecognized features of a sit-uation. The evaluation of a new experimental situationthen lies in the ability of its “control” treatment to repli-

mr05_Article 2 3/28/05 3:25 PM Page 78

can expect experiments to exhibit elementsof each.

How can economic theory help? First,economic theory can fill the gap when exper-iments are not sufficiently informative (at areasonable cost) to be useful. Oil companiesmaintain teams of geologists who supple-ment sampling data with theoretical modelsdesigned to predict the likelihood of findingoil beneath a tract of land or ocean bed. Whybother, when a single experimental observa-tion would suffice to provide the result? Thedifficulty is that the experiment in questionconsists of drilling a well, which can be suffi-ciently expensive as to be undertaken onlyafter a favorable theoretical assessment.Similarly, our primary means of assessingnuclear weapons is theoretical, in the formof computer simulations.31 The relevantexperiments are too costly.

Second, and again of more relevance forthe current discussion, economic theory canbe useful in assessing the external and inter-nal validity of experiments. Insight into linksbetween experimental outcomes and uncon-trolled aspects of the experimental situation(and hence external validity), or insight intothe link between the experimental environ-ment and the observed behavior (internalvalidity), can be provided by theoreticalmodels of the behavior. For example, exper-imental outcomes in continuous double auc-tion markets (e.g., Plott and Smith 1978;Smith 1962, 1964, 1965, 1976, 1982) havebeen surprisingly efficient, given the appar-ent thinness of the markets. How do thetraders overcome the frictions of a thin mar-ket to achieve nearly efficient outcomes?Under what circumstances can we expectsimilar behavior in actual markets and whenshould we be less sanguine about efficiency?Addressing the latter question has becomeeasier as theoretical models have tackled the

31 Developing such a theory is itself quite costly, somuch so that its provision to other countries is treason-ous. But in this case, the relevant experiments involvenuclear detonations whose direct and political costs areeven larger.

former, showing that the continuous flow ofoffers, coupled with traders’ budget con-straints, generates a mechanical but power-ful push in the direction of efficientoutcomes (Brewer, Maria Huang, BradNelson, and Plott 2002; Dhananjay K. Godeand Shyam Sunder 1993, 1993, 1997;Sunder 2004). Alternatively, inconsistentbehavior in laboratory decision problems isoften interpreted as reflecting preferencesthat violate the expected-utility axioms. Howdo we know when we have uncovered some-thing about preferences and when we shouldseek some other explanation in the experi-mental design? We have more confidence inthe links between behavior and preferenceswhen we have models of the latter.

Once again, a difficulty arises. A modelconsistent with the observed behavior doesnot always identify the principles behind thebehavior. Instead, experience has shown thateconomists can build a variety of modelsconsistent with virtually any behavior.32 Howdo we know when we have hit upon a cleverbut irrelevant model and when our modelcaptures something important?33

Revisiting a theme, one of the obstacles tothe integration of economic theory and

cate previous results. For two examples among many,Binmore, Shaked, and Sutton begin their experiment byreplicating the results of Werner Güth, RolfSchmittberger, and Bernd Schwarze (1982), and CharlesR. Plott and Zeiler (2003) begin their investigation of theendowment effect by replicating previous findings.

Samuelson: Economic Theory and Experimental Economics 79

35 Formally, π*(sM) is proportional (being rescaled toensure a total probability of one) to ρ({x∞ ∈ X∞ : x∞(n) �xN(n), n � 1,…, N and FM(x∞) � sM}).

36 Recall that a theory is an element of ��SM, being adistribution from which a distribution over sM is randomlydrawn. Notice that the outcomes of the experiment andthe theory are drawn from different spaces. This is famil-

mr05_Article 2 3/28/05 3:25 PM Page 79

experiments is thus that we have no clearidea of when we have a good match betweentheory and behavior. This difficulty againposes a challenge in two respects. For theo-rists, there is much to be done in terms ofidentifying behavior that would enhanceone’s confidence that the theory in questionhas captured the relevant principles, or thatwould force one to question such a conclu-sion. A good start would be to consistentlyexplain what behavior a theory cannotexplain.34 For experimentalists, it can beimportant to argue not only that a modelcaptures the outcomes of the experiment,but that it captures the appropriate linksbetween the experimental situation and theoutcome. Again, a good start would be toconsistently explain what outcomes wouldlead to the opposite conclusion.

4. Combining Theory and Experiments

4.1 Using Experiments to Learn AboutTheory

4.1.1 Testing Theory: Accuracy

How can we use experiments to evaluateeconomic theory? Suppose we fix an experi-mental design xN and a set of possible out-puts SM, identifying the features of the inputand output that are considered salient in theexperiment. The resulting experiment pro-duces an output sM. Does this indicate thatwe should be more confident of economictheories that place relatively large probabili-ty on the outcome sM, or on similar out-comes, when faced with the input xN? Someuseful insight into this question is given bythe following argument, adapted fromAlvaro Sandroni (2002), that is typical of thecalibration literature.

Given the design xN, the experiment’s out-put sM is randomly determined by the envi-ronment. In particular, a situation x∞ is

34 Among many such examples, Timothy N. Cason andDaniel Friedman (1996) and John H. Kagel, Harstad, andDan Levin (1987) begin their analysis with theoreticalmodels, focusing on aspects of behavior the models cannotaccommodate.

randomly drawn from the set of situationswhose first N dimensions match xN, i.e., fromthe set of situations that match the experi-mental design in those features controlled bythe design. This situation is then convertedinto an output according to the function Fdescribing the environment, and we observethe first M dimensions of this output, givingthe output sM. We let π∗ denote the resultingprobability distribution over the set of possi-ble experimental outputs SM, and refer to π∗

as the true distribution.35

Similarly, given the input xN, a theory f canbe viewed as producing an output π ∈�SM,i.e., a probability distribution over the set ofpossible outcomes . This output is itself ran-domly chosen according to a probability dis-tribution over �SM that is determined by thetheory.36 We let f ∗ denote this distribution.

The task now is to describe the implica-tions of the experiment for the theory. Wethink of running the experiment, producinga randomly-drawn output sM (from the dis-tribution π∗ induced by the experiment), andchoosing a randomly-drawn distribution π(from the distribution f ∗ induced by the the-ory). We insert these realizations into anevaluation rule T(sM,π). The evaluation ruleproduces the output T(sM,π)�1 if we acceptthe theory given realizations π and sM andT(sM,π)�0 if we reject the theory given real-izations π and sM. Clearly, of course, a singleexperiment does not suffice to evaluate atheory. The labels “accept” and “reject”might accordingly be more precisely (butalso more cumbersomely) phrased as“regard this experiment as evidence in favor

iar. For example, the experiment produces the outcomerain or no rain, according to a distribution that dependsupon such factors as the location and the season. The the-ory is allowed to announce a probability of rain, whichmay itself be drawn from a distribution that depends uponsimilar factors.

80 Journal of Economic Literature, Vol. XLIII (March 2005)

mr05_Article 2 3/28/05 3:25 PM Page 80

of the theory” and “regard this experiment asevidence questioning the theory.”

How do we design a useful evaluation ruleT? One desirable criterion is that if one wereto offer the true distribution π∗ as the real-ized output of one’s theory, then our evalua-tion of the experimental evidence should beunlikely to reject it. Because the experimen-tal outcome is random, we cannot expect thedistribution π∗ to always prompt an accept-ance. For example, the evidence will some-times reject the theory that a fair coin yieldsheads on half of its flips, simply because weencounter an unusual and unlikely sequenceof outcomes. However, we can reasonablyask that such rejections be rare. We makethis idea precise by saying that an evaluationrule accepts the truth with probability atleast 1�� if, for any true distribution π∗,

.

Hence, with probability at least 1��, thetrue distribution π∗ generates an experimen-tal outcome sM that would not prompt us toreject the truth, if we were asked to evaluatethe truth as a possible theory. Notice that wewill typically not know the true distributionπ∗ when designing an evaluation rule, andhence our requirement is that the evaluationrule be unlikely to reject the truth (given thedistribution of experimental outcomes gen-erated by the truth), regardless of what thetruth happens to be.37

At the other end of the spectrum, an eval-uation rule is not particularly helpful inassessing a theory if there are no experimen-tal outcomes that would cause the theory tobe rejected (even though this would be oneway to accept the truth with high probabili-ty). For example, an experimental test of thetheory that a coin is fair is not helpful if it

� � �∗ ∗( ) ={ }( ) ≥ −s T sM M: , 1 1

37 For example, we can design an evaluation rule thatcan observe one hundred flips of a coin and simultaneous-ly be quite likely to conclude that the coin is biasedtowards heads when it is, and quite likely to conclude thatthe coin is biased toward tails when it is, because these twobiases (if true) generate quite different distributions overexperimental outcomes.

always accepts the theory, but could be use-ful if it instead rejects the theory if theobserved proportion of heads (or tails) is toolarge. To capture this distinction, we say thatan evaluation rule is blindly passed by theoryf with probability 1�� if, for every sM ∈ SM,

.

Hence, no matter what observation sM theexperiment produces, with probability atleast 1�� the theory f (via its induced distri-bution f∗) produces a distribution π over pos-sible experimental outcomes that causes thetheory to be accepted (given the observationsM). The phrase “blindly passed” here refersto the fact that the theory f is accepted bythe evaluation rule with probability at least1�� regardless of the experimental outcomeor, equivalently, to the fact that f embodiesno understanding of the true process gener-ating experimental outcomes. As a result, atheory may blindly pass an evaluation rulewith high probability, but without providingany insight into the principles governing theoutcome in this situation.

The main result (proven in section 7) isnow:38

Proposition 1 Any evaluation rule thataccepts the truth with probability 1�� canbe blindly passed with probability 1��.

At first glance, it seems obvious that anevaluation rule that accepts the truth can bepassed—one need only propose the truth asone’s theory. However, Proposition 1 makesa quite different assertion. If an evaluationrule accepts the truth sufficiently often(i.e., with probability 1��), then one canfind a theory that requires no knowledge ofthe truth and has the property that, no mat-ter what the outcome of the experimentand no matter what the actual process gen-erating the experimental outcomes, the the-ory is accepted with probability 1��. Thefollowing illustrates:

f T sM∗ ( ) ={ }( ) ≥ −� � �: , 1 1

38 This is a special case of Proposition 1 in AlvaroSandroni (2002).

Samuelson: Economic Theory and Experimental Economics 81

mr05_Article 2 3/28/05 3:25 PM Page 81

Example. Suppose that there are only twopossible outcomes of an experiment, headand tail. The environment induces a trueprobability distribution over these two out-comes, which we denote as π∗∈[0,1], whereπ∗ is the probability of the experimental out-come head. As the notation suggests, we canthink of the experiment as a single flip of a(possibly biased) coin, with π∗ being the trueprobability of a head. The theory generates a(possibly randomly determined) candidateprobability π, which we must then combinewith the experimental outcome to evaluatethe theory. A possible evaluation rule is:

.

Hence, the theory is accepted if the experi-mental realization is tail and the realizationπ of the theory attaches probability less than

to head (the first line), and is accepted ifthe experimental realization is head and therealization of the theory attaches probabili-ty at least to head (the second line). Thisparticular evaluation rule accepts the truthwith probability at least .39 Such a mini-mum acceptance probability does not soundvery impressive. By altering the evaluationrule, we could manage to boost this proba-bility to , but could not go further in thiscase.40 Now suppose the theory f draws πuniformly from the set [0,1]. Hence, consis-tent with the model of Section 3.1, theprobability π with which the theory predicts

12

13

13

13

T s

if s tail

if sM

M

M,�

( ) == <=

11

13and

hhead andotherwise

� ≥

13

0

39 If the true distribution is π∗� , the evaluation ruleaccepts π∗ if the experiment generates outcome tail,which happens with probability 1�π∗(� ). If the true dis-tribution is π∗� , the evaluation rule accepts π∗ if theexperiment generates outcome head, which happens withprobability π∗(� ).

40 It is to be expected that an experiment with only twooutcomes provides rather crude information—how muchinformation can one expect to extract about the probabili-ty of heads, from a coin of unknown bias, from a singleflip? Higher minimum acceptance probabilities requirericher outcome spaces. Whether we are better off in thiscase with a rule that accepts the truth with probability depends upon the relative costs of mistakenly accepting orrejecting the various values of π.

12

13

13

13

13

the outcome head is itself drawn randomlyaccording to a distribution f ∗ over ∆SM.Given the uniform distribution we clearlywork without any information as to what thetruth might be. Then

.

,

and hence the evaluation rule is blindlypassed with probability .

To see the intuition behind Proposition 1,think of playing a zero-sum game against amalevolent and possibly omniscient oppo-nent, “Nature,” where Nature chooses thetrue theory π∗ generating the experimentaloutcomes and you choose a theory f , withNature attempting to maximize the probabil-ity of an outcome that rejects your theory(here we see Nature’s malevolence) and youtrying to minimize this probability. Suppose(counterfactually) that you had the luxury ofobserving Nature’s choice before makingyour own. Then you could always simplyname Nature’s choice as your theory, and therequirement that the test accept the truthwith probability 1�� ensures that your suc-cess probability would be at least 1��.Alternatively, the worst that could happen isthat Nature gets to observe your proposedtheory before choosing the truth (here wesee Nature’s potential omniscience) and thenchooses the truth to minimize your successrate.41 The minmax theorem then gives us aresult that is familiar in the context of zero-sum games, namely that you can do as well inthe second circumstance as in the first, andhence can succeed with probability at least1�� in the second circumstance. But youroptimal performance in the actual game, inwhich neither side gets to observe the other’smove, must be somewhere between thesebest and worst cases, ensuring that the testcan be blindly passed with probability 1��.

13

f T head f∗ ∗( ) ={ }( ) = ≥{ }( ) =� � �: , 1 13

23

f T tail f∗ ∗( ) ={ }( ) = <{ }( ) =� � �: , 1 13

13

41 Here, it is clear that one is not simply predicting wellby offering the truth as a prediction, since the prediction ischosen first and then a worst-case specification of the truthis chosen.

82 Journal of Economic Literature, Vol. XLIII (March 2005)

mr05_Article 2 3/28/05 3:25 PM Page 82

The implication of this result is that theability of an economic theory to match exper-imental data does not necessarily provideevidence in support of the theory. Instead,given any specification of questions that atheory could be asked, and any specificationof how the answers to these questions are tobe compared to the experimental evidence,one can devise a theory based on no under-standing of the situation or the underlyingprinciples that allows one to be as successfulas knowing those principles precisely.

This result is not simply a restatement ofthe common view that it is somehow moreinstructive if one first commits to a theoryand then compares it to data (rather thanfirst observing the data and then constructinga theoretical rationalization). More impor-tantly, this result is not simply a restatementof the observation that it is important for the-ories constructed in response to experimen-tal observations to make “out of sample”predictions, i.e., predictions that could beassessed only with the collection of new data.

Instead, the ability to blindly construct atheory f that fares as well as the truthdepends upon knowing the evaluation rule Tby which the theory is to be assessed. Aslong as we identify a fixed set of potentialtests to which a theory is to be subjected,whether in or out of sample, we can blindlyconstruct a theory that fares as well as thetruth in these tests, regardless of whether wehave seen the outcomes of the tests andregardless of what these outcomes might be.Interpreting experimental evidence as sup-porting a theory, or offering a theory as aninterpretation of experimental evidence,thus acquires some bite only if the theory isclear and complete enough that it can beextended to answer new questions and con-front new tests that did not play a role in theconstruction of the theory.42 Is the theory

42 Eddie Dekel and Yossi Feinberg (2004) propose atest for whether one’s theory matches the environmentalfunction F that hinges upon asking one to design (ratherthan react to) an evaluation rule T.

clear enough that others could design newtests, and is one willing to risk the theory insuch tests? If not, then it is not clear thatprogress has been made.

For example, Bolton and Ockenfels(2000) and Fehr and Schmidt (1999) offermodels motivated by behavior in bargain-ing experiments, with each model consist-ing of an explicit specification of how utilitydepends upon (one’s own and one’s rival’s)payoffs (cf. section 2). In doing so, theauthors are offering models that (like allothers) cannot hope to capture every detailof human motivation, and hence are boundto fail some tests. However, these modelsexhibit the essential characteristic of beingsufficiently precise and powerful that newtests can be devised. The authors are takingsome risk in presenting their theories soexplicitly, but in return they ensure thattheir models can be meaningfully investi-gated experimentally. If their models donot provide useful alternatives to thehypothesis that players maximize theirexpected monetary payoffs, they will bestepping stones to such alternatives. Eitherway, their models allow progress that wouldbe impossible without the ability to venturebeyond the experimental designs thatprompted them.

4.1.2 The Margin of Error: Precision

We have modeled a theory as producinga probability distribution over probabilitydistributions over outcomes.43 In mostcases, an economic theory provides nothingof the sort, with deterministic outcomesbeing the rule. How do we put these twotogether?

Think first about how economists typi-cally do empirical work. The underlyingintuition and theoretical structure comefrom a model free of anything random. Butbefore confronting this model with the

43 This ability to mix is important, as without it one can-not be assured of blindly passing evaluation rules thataccept the truth.

Samuelson: Economic Theory and Experimental Economics 83

46 At the same time, the experimental results still pres-ent a challenge for the theory. We no longer have evidencethat the model is inaccurate, but we have evidence that itis not sufficiently precise to be a useful description ofbehavior. In response, we could restrict our attention topayoffs (effectively, shortening the list M of outputs of thetheory) or refine the theory in hopes of more precisely cap-turing behavior.

47 Binmore and Samuelson (1999) study learning mod-

mr05_Article 2 3/28/05 3:25 PM Page 83

data generated by a noisy world, an errorterm is added. The characteristics of thiserror can be important, providing the foun-dations for the inferences to be drawn fromthe results.

Assumptions about errors play a similarlyimportant role in interpreting experimentalresults. One argues not that the data and thetheory are a perfect match, but rather thatthe errors required to reconcile the datawith the model are not too large.

What does “not too large” mean? Auctionshave received significant attention fromexperimentalists, with results that oftenappear to be at odds with theoretical predic-tions.44 One interpretation of the observedbehavior is to assume that subjects invari-ably intend to identify and take their optimalactions, but that some sort of “tremble”translates this optimal action into a randomchoice.45 The evidence convinces mostobservers that by this standard, there isoften a large gap between theoretical resultsand experimental behavior: the tremblesrequired to reconcile the two are too large,and hence much of auction theory appearsinsufficiently accurate to be a usefuldescription of behavior.

Alternatively, one might interpret theobserved behavior by assuming that subjectsare only �-optimizers, being content withidentifying and playing an action that iswithin some � of a best response. Section 2touched on Harrison’s (1989, 1992) argu-ment that, by this standard, very little erroris required to reconcile the theory with thedata. It turns out that one’s actions have rel-atively little effect on expected payoffs inmany auctions (as long as actions are not toofar from equilibrium), and hence that one

44 For surveys, see Douglas D. Davis and Charles A.Holt (1993), Kagel (1995), and Kagel and Levin (2002).

45 Such trembles may initially appear difficult to moti-vate, but similar ideas have played an important role in theequilibrium refinements literature. More importantly, thepossibility that typing or other errors might lead to mistak-en bids was a serious concern in the design of the FCCspectrum auctions (Paul Milgrom 2004).

can come close to maximizing one’s expect-ed payoff with actions that seem far awayfrom the equilibrium. If we are to viewerrors in this way, then we must be carefulin concluding either that optimization is apoor description of individual behavior orthat the outcome is not (approximately) inequilibrium.46

Yet another interpretation of the observedbehavior assumes that subjects choose theiractions not by optimizing but through aprocess of trial-and-error learning.47 Here,errors are measured in terms of the strengthof the incentives embedded in the learningprocess.

The implication in each case is that theinterpretation of experimental resultsrequires not only a theory, but also someidea of what types of errors are most likelyinvolved when the theory does not work per-fectly. Richard D. McKelvey and Thomas R.Palfrey’s (1995) quantal response equilibri-um is perhaps the best developed and mostgeneral such model, built around agents whomaximize utility functions perturbed by ran-dom terms. Notice that the errors here arebuilt into the model of individual behaviorfrom the beginning rather than being addedat the end.48 These errors can be interpret-ed as capturing unmodeled but (one hopes)small effects on preferences. Quantalresponse equilibria have been used to good

els whose results depend importantly on the nature of(possibly very small) errors.

48 It is a familiar result that incorporating uncertaintyinto the construction of a model can yield results that dif-fer from simply appending error terms to a deterministicmodel. For example, incorporating an error term into play-ers’ choices in a game and then solving for a (perfect) equi-librium (Reinhard Selten 1975) can give results quitedifferent than first solving for a (Nash) equilibrium andthen adding an error term.

84 Journal of Economic Literature, Vol. XLIII (March 2005)

mr05_Article 2 3/28/05 3:25 PM Page 84

effect in analyzing a variety of experimentalresults.49

Once again, however, new challengesappear. Philip Haile, Ali Hortacsu, andGrigory Kosenok (2004) show that quantalresponse equilibrium is a sufficiently flexiblenotion that, by appropriately specifying theerror terms, one can obtain equilibria con-sistent with any behavior that one mightpossibly observe. The unmodeled errors arethus important. Without further assump-tions concerning their distribution, too muchis left out of the model for its predictions tobe usefully precise.50

There are then two possibilities for har-nessing the potential power of quantalresponse equilibria. First, quantal responsemodels can provide comparative staticimplications even without distributionalassumptions.51 Alternatively, Haile,Hortacsu, and Kosenok’s (2004) resultdepends upon having sufficient freedom inspecifying the errors in the individual utili-ties underlying the quantal response model.We may often have either intuition or exper-imental evidence about what forces are cap-tured by the errors. We may then augmentthe underlying model with hypotheses aboutthe distribution of errors sufficiently power-ful to produce precise results. In effect, weare enhancing precision by expanding theset of inputs XN to capture more informa-tion. Jacob K. Goeree, Holt, and Palfrey(2004) note that applications of quantalresponse equilibria typically work with mod-els that are monotonic, in the sense that

49 See, for example, Simon P. Anderson, Jacob K.Goeree, and Charles A. Holt (1998, 1998, 2001), Georeeand Holt (2001), Goeree, Holt and Thomas R. Palfrey(2002), and Richard D. McKelvey and Palfrey (1992,1995, 1998).

50 Though the technical details are different, this resultis similar in spirit to John O. Ledyard’s (1986) observationthat any behavior is consistent with the notion of Bayesianequilibrium.

51 A concentration on comparative statics requiresthat the distributions of the error distributions do notvary (or vary sufficiently regularly) as the parameters ofthe problem vary, a requirement lying behind many aneconometric inquiry.

increasing the expected payoff of an alterna-tive increases the probability that it is cho-sen.52 Goeree, Holt, and Palfrey providesufficient conditions for quantal responseequilibria to be monotonic and show thatmonotonic quantal response equilibria canhave substantive empirical content.53 Theinformativeness of experiments based onquantal response models is thus enhancedby a better theoretical understanding ofsuch models.

Focusing attention on the specification oferrors has the advantage of leading naturallyto a provision for heterogeneity in players’behavior. Perhaps one of the most robustfindings to emerge from experimental eco-nomics is that such heterogeneity is wide-spread and substantial. Despite this,heterogeneity has often not played a promi-nent role in many theoretical models.Instead, theoretical explanations often havethe flavor of seeking “the” model of individ-ual behavior that will account for the experi-mental behavior. This appears to be aholdover from the original presumption thatmonetary payoffs, common to all subjects,suitably captured preferences, an assump-tion that encourages a view of players ashomogeneous.54 Error terms provide a natu-ral vehicle for capturing heterogeneity.

The implication is that there is much to begained by making our treatment of errors inindividual decision-making more explicit,and hence much to be gained in the inter-pretation of experimental results by beingmore careful with our theory. However, thisis a task made all the more daunting by the

52 For example, a logit choice model with independent,identically distributed extreme-value errors satisfiesmonotonicity.

53 Other examples of work focussing on the structure oferrors include Mahmoud A. El-Gamal and David M.Grether (1995), David W. Harless and Camerer (1995),Harrison (1990), and Daniel Houser, Michael Keane, andMcCabe (1995). Similarly, Ledyard’s (1986) analysis ofBayesian equilibrium suggests that we augment the model,perhaps with assumptions about players’ beliefs.

54 For work on subject heterogeneity, see Andreoni,Marco Castillo, and Ragan Petrie (2003), Andreoni andJohn Miller (2002), and the examples cited in note 53.

Samuelson: Economic Theory and Experimental Economics 85

mr05_Article 2 3/28/05 3:25 PM Page 85

observation that the considerations relegat-ed to error terms are often there because weknow little about them. Once again, theoristsare sent back to the drawing board in searchof theories precise enough to be useful.

4.2 Using Theory to Learn aboutExperiments

4.2.1 External Validity

Having found an experimental regularity,how do we assess whether the experimentaldesign from which it emerges is a goodmatch for the intended application (thequestion of eternal validity) and whether wehave linked the resulting behavior to theappropriate characteristics of the design?The obvious observation is that more exper-iments are always helpful, and one of thegreat advantages of the experimentalmethod is the ability to collect more data.But economic theory has a role to play inconjunction with these experiments.

For example, the standard assumptionwhen modeling intertemporal choice is thatpeople maximize the sum of exponentially-discounted expected utilities. Expected util-ity theory derives much of its appeal fromthe fact that it rests upon a collection ofaxioms that can be interpreted as prescribingconsistent behavior (Leonard J. Savage1972). Extending this argument to intertem-poral behavior, consistency is similarlyensured by exponential discounting.

The difficulty is that the experimental evi-dence has not been particularly supportiveof exponential discounting.55 The consensusleans toward a model in which discountingdeparts from exponential in the direction ofbeing biased toward the present, so that dis-count rates decline as one evaluates moredistant payoffs. Hyperbolic discounting isthe most prominent example.

The case for hyperbolic discounting (orother forms involving a bias toward the

55 See Shane Frederick, Loewenstein, and TedO’Donoghue (2002) for a survey and Maribeth Coller,Harrison, and Rutström (2003) for an alternative view.

present) is often bolstered with results from(nonhuman) animal as well as human exper-iments. The use of hyperbolic discountingin interpreting results from animal experi-ments is routine (e.g., James E. Mazur 1984,1986, 1987). The question to be consideredhere is one of external validity: how relevantis the animal evidence for human behavior?

Our approach to this question is not todebate how similar are animals and humans,but rather how similar are the typical dis-counting problems faced by animals andhumans. In turn, the approach to this latterquestion is to examine theoretical models ofthese discounting problems.

Discounting in animals is commonlyexamined in the context of foraging behavior(e.g., Alasdair I. Houston and John M.McNamara (1999), Alex Kacelnik (1997),Michael Bulmer (1997)). It is helpful tobegin with a highly simplified, deterministicmodel. Suppose an animal faces the prob-lem of maximizing total food consumptionover an interval of length T. A functionc:IR+→IR+ identifies the quantity of con-sumption c(t) that can be secured upon theinvestment of foraging time t. The animal isto make a succession of foraging-time/con-sumption pairs of the form (t,c(t)), whereeach choice (t,c(t)) allows consumption c(t)but precludes another choice until time thas passed.

The animal’s task is to choose an optimalpair (tr,c(tr)) for any length � of time remain-ing in the foraging interval. Let V(�) be thevalue of the optimal continuation consump-tion plan, given the length � of time remain-ing.56 If � is sufficiently large, then theoptimal consumption plan will be nearly sta-tionary, featuring a choice of some fixed,optimal t∗ at each opportunity. This allowsthe approximation

.Vc t

t� �( ) = ( )∗

56 The function V is implicitly defined by t� ∈ arg maxt{c(t)V(��t)}.

86 Journal of Economic Literature, Vol. XLIII (March 2005)

57 Peter D. Sozou (1998) and Partha Dasgupta and EricMaskin (2003) explore evolutionary motivations for hyper-bolic discounting that do not depend upon foraging as thestandard decision problem.

58 This possibility provides one illustration of how elu-sive external validity can be. There is often no single orobvious external situation to which the model is to beapplied. The question may then not be whether there aresituations outside the laboratory that correspond to theexperiment, but rather whether the corresponding situa-tions are the “right” ones. We return to this point at theend of this section.

mr05_Article 2 3/28/05 3:25 PM Page 86

But then the optimal consumption plan t∗

maximizes

(1)

Hence, optimal foraging behavior induces apreference for consumption c(t) at time tover c(t�) at t� if

.

Consumption at time t is thus optimally dis-counted by 1/t, i.e., is discounted by thehyperbolic function 1/t. It then seems unsur-prising that experiments with animals aresuggestive of hyperbolic discounting.

How relevant is this evidence for humans?Hyperbolic discounting arises out of a modelin which delayed consumption imposes anopportunity cost, in the sense that other con-sumption opportunities are precluded whilewaiting for the current realization. The vari-able t measures the time spent foraging, dur-ing which consumption is precluded. Thereis nothing like this in the intertemporal deci-sion problems typically associated withhyperbolic discounting in humans, where tmeasures a delay during which other optionsare not closed. For example, when facing thecanonical hyperbolic-discounting story ofchoosing between one sum of money nowand another in a week, and then betweenthe same sums in fifty-two and fifty-threeweeks, there is no presumption that inter-vening consumption possibilities are sacri-ficed. We thus have reason to doubt thathyperbolic discounting in animals has suffi-cient external validity to be of relevance forhuman behavior.

This observation is only the first step ofthe story. There may still be good reasons forhumans to engage in hyperbolic discounting.One possibility is that human intertemporalpreferences were formed during a time inwhich people typically faced decision prob-lems similar to the foraging choices thoughtto be typical of animal decisions, and thatpeople now simply apply the resulting(hyperbolically discounted) preferences to

c tt

c tt

( ) > ( )''

c tt( )

current decisions without noting the differ-ent context.57 In effect, the opportunitycosts of the time sacrificed while waiting forconsumption may have been important inthe ordinary lives of our ancestors, even ifwe do not commonly encounter it in ourlives, potentially restoring the relevance ofthe animal experiments.58

A second difficulty now arises. Suppose weexpand our simple foraging model to accom-modate uncertainty. Let {X(1),…, X(n)} be acollection of independent, positive-valuedrandom variables. We interpret each of theseas representing a foraging strategy, with eachforaging strategy characterized by a randomlength of time until it yields a consumptionopportunity. To keep the example transpar-ent, we simplify our previous model byassuming that each consumption opportunityfeatures one unit of food. The animal choosesa foraging strategy, waits until its payoff is real-ized, chooses another strategy (perhaps thesame one), and so on, until a fixed foragingperiod of length T has been exhausted.

This model gives what is commonly knownas a renewal process. The intuition is thatonce a unit of food has been received, theprocess has been “renewed,” in the sense thatthe set of possible choices and outcomes hasreverted (literally in the case of an infinitehorizon and approximately in the case of asufficiently long finite horizon) to its originalconfiguration. For sufficiently long horizons,the optimal strategy will again be approxi-mately stationary. Consider a stationary strat-egy, in which the same random variable X(i)is chosen at each opportunity. Let µi be themean time before food is realized under X(i).

Samuelson: Economic Theory and Experimental Economics 87

59 One of the puzzles facing biologists is that observedbehavior appears to match the objective given by (3) moreclosely than the simple theoretical prediction that (2) bemaximized (Melissa Bateson and Alex Kacelnik (1996),Kacelnik (1997), Kacelnik and Fausto Brito e Abreu(1998)).

60 Again, see Frederick, Loewenstein, andO’Donoghue (2002). Here, as always, there are still ques-tions of internal validity. Is the observed behavior a prod-uct of hyperbolic discounting, or something else? Halevy(2004) and Ariel Rubinstein (2003) explore alternatives.

mr05_Article 2 3/28/05 3:25 PM Page 87

Let N(T) be the number of renewals (i.e.,number of units of food) secured by time T.Then the elementary renewal theorem(Sheldon Ross (1996, Proposition 3.3.1))indicates that, as T gets large,

.

As a result, the stationary strategy thatchooses the random variable with the small-est mean time to renewal (µi) will be approx-imately optimal (among the set of allstrategies, not just stationary ones), in thesense that it maximizes the number ofrenewals N(T) and hence consumption, forlarge values of T. This strategy chooses therandom variable X(i) that maximizes

(2) ,

where t is the renewal time and E{t}�µi is itsexpected value. In contrast, applications ofhyperbolic discounting in economics typi-cally assume that people maximize theexpected value of hyperbolically-discountedutilities. In our simplified case, recalling thateach random delay is terminated by theappearance of one unit of food, this calls formaximizing(3) ,The objectives given by (2) and (3) can espe-cially differ if the menu of foraging strategiesincludes alternatives with high mean renew-al times but that attach some probability tovery short waiting times. Such strategies mayfare very well under (3), while being lessattractive under (2).

We thus find that an appeal to our evolu-tionary background may or may not allow usto interpret animal evidence as bracing abelief in human hyperbolic discounting, butthat in the process we also provide evidenceagainst commonly-used models of (hyper-bolically discounted) expected utility maxi-mization. There appears to be no obviousway to interpret animal experiments as sup-porting both hyperbolic discounting andexpected utility maximization.

E t1 /{ }

1E t{ }

N TT i

( ) → 1

Two qualifications are relevant. First,there are things about animal behavior thatwe do not understand.59 More importantly,the point here is not to defend exponentialdiscounting. Instead, it would be quite a sur-prise if discounting were precisely exponen-tial. There is also evidence of hyperbolicdiscounting from human experiments, whichthe current discussion does not call intoquestion.60 The point is that extendingresults from animal experiments to conclu-sions about human behavior raises questionsof external validity that can be examinedthrough the lens of economic theory. In con-nection with hyperbolic discounting, theaccompanying theory is not immediatelysupportive of a link.

Assessments of external validity can befurther complicated by the fact that theappropriate external environment for com-parison is often not obvious. Consider one ofthe simplest experimental settings, the dicta-tor game. Experiments find that dictatorstypically do not seize all of the money,despite the lack of any obvious reason for notdoing so (Davis and Holt 1993; RobertForsythe, Joel L. Horowitz, N. E. Savin, andMartin Sefton 1995). What should we makeof this result? Each of us is constantlyinvolved in a version of the dictator game, inthat we constantly have opportunities to giveaway the money in our wallets, or anythingelse that we own. Typically, however, wehold on to what is ours. One might then viewthe experimental evidence as beingswamped by a mass of practical experiencewith the dictator game, in which people forthe most part tend to keep what they have.

88 Journal of Economic Literature, Vol. XLIII (March 2005)

mr05_Article 2 3/28/05 3:25 PM Page 88

Then what do the experiments have totell us? One message is clear: people do notalways keep everything. This is a usefulpoint of departure. Outside the laboratory,people also sometimes relinquish whatthey own, giving gifts and making contribu-tions to charity. A variety of explanationshave been offered for why this seeminglyaltruistic behavior is consistent with ration-al, selfish behavior.61 While often persua-sive, and consistent with some aspects ofbehavior in dictator experiments,62 itseems a stretch to suggest that such expla-nations can cover every bit of generosity.One can then view dictator experiments asan attempt to strip away the confoundingfactors and isolate a situation in whichrational, selfish behavior has a clear predic-tion, allowing us to conclude that peopleare not always relentlessly selfish.

This is instructive, but only the mostextreme would claim that selfish preferencesare a complete description in every circum-stance. Do the dictator experiments haveanything to contribute beyond challengingsuch extremists? Here we return squarely tothe question of what is the appropriate con-text in which to evaluate the external validi-ty of dictator experiments. Does theexperimental allocation represent the con-tinual decisions we implicitly make aboutwhether to keep our wealth or give it away?If so, then the findings provide a seriouschallenge to the preferences commonly usedin economic models. Does the experimentcapture those rarer circumstances underwhich people make anonymous contribu-tions to charity? If so, then the findings arecommonplace. A useful point of departure in

61 For example, people are said to give gifts in anticipa-tion of reciprocation, to contribute to charity in order togain esteem, to tip in order to advertise their generosity tofellow diners, and so on.

62 For example, the sensitivity of amounts retained bydictators to the degree of anonymity in the experiment(Hoffman, McCabe, Keith Shachat, and Smith 1994;Hoffman, McCabe, and Smith 1996) could be interpretedas indicating that one purpose of a seemingly altruistic actis to demonstrate one’s behavior to others.

addressing this issue is again theoretical,aimed at identifying and modeling the fea-tures that distinguish the first set of circum-stances from the second, and theninterpreting these circumstances in terms ofexperimental designs and findings. Onceagain, the general point is that examining therelevant theory can help assess the interpre-tation and external validity of experimentalresults.

4.2.2 Internal Validity

Experiments in economics typically fea-ture monetary payoffs. Can we assume thatthese monetary payoffs represent utilities?Section 2 touched on one reason why theanswer might be no, namely that subjectsmight care about more than simply theamount of money they make. However, sup-pose that this is not the case. If subjects arerisk averse, then monetary payoffs still donot provide a good representation of utility.

One of the early insights of experimentaleconomics was that we can effectively elimi-nate risk aversion, as long as subjects areexpected-utility maximizers. Suppose onehas in mind an experiment that would makemonetary payments ranging from 0 to 100.Then replace each payoff x∈[0,100] with alottery that offers a prize of 100 with proba-bility x/100 and a prize of zero otherwise.Expected payoffs are unchanged. However,for any expected utility maximizer, regard-less of risk attitudes, the expected utility of alottery that pays 100 with probability p (and0 otherwise) is

This expression is linear in p, meaning thatthe agent is risk neutral in the currency ofprobabilities. On the strength of this con-venience, lottery payoffs have often beenused in experimental economics.63

pU p U U U U p100 1 0 0 100 0( ) ( ) ( ) ( ) ( ) ( )[ ]+ − = + − .

63 See Cedric A. B. Smith (1961) for an early theoreti-cal discussion of lottery payoffs, and Roth and Michael W.K. Malouf (1979), Roth and J. Keith Murnighan (1982),and Roth and Françoise Schoumaker (1983) for earlyexperimental applications.

Samuelson: Economic Theory and Experimental Economics 89

mr05_Article 2 3/28/05 3:25 PM Page 89

Against this background, Matthew Rabin(2000) (see also Rabin and Richard H. Thaler2001) presents an argument that we illustratewith the following example. Suppose Alicewould rather take $95.00 with certainty thanface a lottery that pays nothing with proba-bility and $200 with probability . Supposefurther that Alice would make this choice nomatter what her wealth. Then either thestandard model of utility maximization doesnot apply, or Alice is absurdly risk averse.

To see the reasoning behind this argu-ment, assume that Alice has a differentiableutility function U(w) over her level of wealthw, with (at least weakly) decreasing margin-al utility. Alice’s choice implies that the utili-ty of an extra 95 dollars is more than half theutility of an extra 200 dollars. This impliesthat 200U�(w200)�95U�(w), where w isAlice’s current wealth and U�(w) is thelargest marginal utility found in the interval[w, w200] and U�(w200) is the smallestmarginal utility in that interval.64

Simplifying, we have, for any wealth w

(4) .

Now letting w0 be Alice’s initial wealth leveland stringing such inequalities together, itfollows that, for any w, no matter how large,Alice’s utility U(w) satisfies

.= ( ) + ′( )U w U w0 04000

= ( ) + ′( ) −

U w U w0 0200 1

1920

/

′( ) +…

U w

2

01920

≤ ( ) + ′( ) + ′( ) +

U w U w U w0 0 0200 19

20

200 6000+ ′ +( ) +U w

w U w+( ) + ′ +0 0200 200 4400( )U w U w U w U( ) ≤ ( ) + ′( ) + ′0 0200 200

′ +( ) ≤ ′( )U w U w2001920

12

12

12

64 Hence, 95U′(w) is an upper bound on the utility ofan extra 95 dollars, and 200U′(w200) a lower bound onthe utility of an extra 200 dollars. Rabin (2000) containsadditional examples and shows that the argument extendsbeyond the particular formulation presented here.

where the first inequality breaks [0,∞] intointervals of length 200 and assumes thatthe maximum possible marginal utilityholds throughout each interval, the secondrepeatedly uses (4), and the remainder is astraightforward calculation.

Now consider a loss of 3000. A similarargument shows that the utility U(w0�3000)must satisfy

.

Comparing these two results, we have thatfor any X �0,

.

Hence, there is no positive amount of moneyX, no matter how large, that would induceAlice, no matter how wealthy, to accept afifty/fifty lottery of losing 3000 and winningX. Risk aversion over relatively small stakesthus implies absurd risk aversion over largerstakes.

Risk aversion over small stakes seemsquite reasonable and is consistent with labo-ratory evidence (e.g., Holt and Susan K.Laury 2002). How do we reconcile this withthe seeming absurdity of the implied behav-ior over larger stakes? Taking it for grantedthat people are not so risk averse over largestakes, Rabin (2000) and Rabin and Thaler(2001) suggest that the expected-utilitymodel should be abandoned.

This conclusion poses a puzzle for experi-mental practice. The use of lottery payoffsappears to be either unnecessary (because

4000 0 0U w U w U w( ) − ′( ) < ( ))

12

3000120 0U w U w X−( ) + +( ) ≤

≤ ( ) − ′( )U w U w0 04400

+

2019

114

0′( )U w′( ) +U w0

≤ ( ) − ′( ) +U w U w0 02002019

U w0200 2800− ′ −( )U w0200 200′ −( ) −

U w U w U w0 0 03000 200−( ) ≤ ( ) − ′( ) −

90 Journal of Economic Literature, Vol. XLIII (March 2005)

65 There is then no inconsistency in believing thatexperimental subjects are expected utility maximizerswhile using lotteries to control for risk aversion over smallstakes. The evidence on whether lottery payoffs success-fully control for risk aversion is not entirely encouraging(e.g., Joyce E. Berg, John W. Dickhaut, and Thomas A.Rietz 2003; James C. Cox and Ronald L. Oaxaca 1995;Selten, Abdolkarim Sadrieh, and Klaus Abbink 1999; andJames M. Walker, Smith, and Cox 1990). These findingspresent yet another challenge to the presumption thatexperimental subjects maximize expected utility.

mr05_Article 2 3/28/05 3:25 PM Page 90

subjects are risk neutral over the relativelymodest sums paid in experiments) or neces-sarily ineffective (because subjects are riskaverse over small sums, and hence cannot beexpected-utility maximizers). The argumentis even more challenging for economic theo-ry, where expected-utility maximization isfirmly entrenched.

In response, our attention turns to ques-tions of internal validity. Is the observedbehavior appropriately interpreted asreflecting departures from expected-utilitymaximization? Addressing this questionrequires a more careful look at the theory.Let X be a set of consequences, � a set ofstates, and L a set of acts, where an act is afunction associating a consequence witheach state. Savage (1972) shows that if anagent has preferences over the set of acts Lsatisfying certain axioms, then the agentchooses as if she has a probability distribu-tion p over � and a utility function U over X,and maximizes expected utility.

This theory makes no comment as towhat is contained in the set X over whichutilities are defined (cf. James C. Cox andVjollca Sadiraj 2002). The argument thatAlice’s risk aversion over small stakesimplies implausible behavior over largestakes implicitly assumes that utility is afunction of (only) Alice’s final wealth—theamount of money she has after the outcomeof the lottery has been realized. Hence,Alice must view winning a million-dollarlottery when initially penniless as equiva-lent to losing $9,000,000 of an initial$10,000,000. This is the most common waythat expected utility appears in theoreticalmodels, but nothing in expected utility the-ory precludes defining utility over pairs ofthe form (w, y), where w is an initial wealthlevel and y is a gain or loss by which thiswealth level is adjusted. In this case, Alicemay view the two final $1,000,000 out-comes described above quite differently.And once this is the case, there need nolonger be any conflict between being anexpected utility maximizer, being risk

averse over small stakes, and still behavingplausibly over larger states.65

This argument can be taken a step further.Savage (1972, pp. 15–16, 82–91) views expect-ed utility theory as applicable only to “small-worlds” problems, in which the sets of states,consequences and acts are simple enoughthat one can identify and explore every impli-cation of each act. Savage notes that it is“utterly ridiculous” to encompass all of ourdecision-making within a single small-worldsmodel (1972, p. 16). Instead, his view (1972,pp. 82–91) is that decision makers break theworld they face into small chunks that aresimple enough to be approximated with asmall-worlds view. We can expect behavior inthese subproblems to be described by expect-ed utility theory, but the theory tells us noth-ing about relationships between behavioracross problems.

Duncan Luce and Howard Raiffa (1957,pp. 299–300) continue this argument, notingthat, “one’s choices for a series of problems—no matter how simple—usually are not con-sistent.” They suggest that if one discovers aninconsistency, one should modify one’s deci-sions, with “this jockeying—making snapjudgments, checking on their consistency,modifying them, again checking on consisten-cy, etc.”, ultimately leading to consistentexpected-utility maximizing behavior. We canthus expect consistent behavior only acrosssets of choices (or worlds) that are sufficient-ly small that we can expect the requiredadjustment to have been made.

Returning to our original setting, the set ofall lotteries may be too large a world toencompass within a single expected-utility

Samuelson: Economic Theory and Experimental Economics 91

mr05_Article 2 3/28/05 3:25 PM Page 91

formulation. If we define utility in terms offinal wealth levels, Alice’s expected-utilitymaximization over small stakes may then notbe consistent with her behavior over largestakes. But she may nonetheless be maximiz-ing expected utility, though with a utilityfunction in which wealth or some other vari-able indexes different small worlds problems,each of which is treated via a utility functionover (some subset of) final wealth levels.

This discussion is not to be read as adefense of expected utility theory. There isevery reason to believe that so stark a theorycannot always be a good approximation. Thisdiscussion is instead meant to provide a wordof caution in assessing the internal validity ofexperimental results. Risk aversion oversmall gambles, one of the seemingly mostpowerful challenges to the theory, may infact be consistent with expected utility.

More importantly, this argument does notdiminish the strength of the small-stakes-risk-aversion challenge to economic theory.The evidence remains that we can saveexpected utility maximization as a useful the-ory only if something other than wealthenters utility functions. As Rubinstein (2001)notes, this opens the door to all manner ofinconsistencies in decision making.Expected utility can be defended only byrecognizing that economic theorists have agreat deal of work to do.

Other illustrations of the importance oftheory in assessing internal validity are easi-ly found. Game-theoretic models featuringmixed Nash equilibria have been questionedon the grounds that individual play does notexhibit the identical, independent random-ization required by the theory (e.g., JamesN. Brown and Robert W. Rosenthal 1990).But if the mixed equilibrium reflects either apopulation polymorphism (as suggested byJohn F. Nash 2002) or the result of an adap-tive process, we would expect such inde-pendence to fail (e.g., Binmore, JoeSwierzbinski, and Chris Proulx 2001).

Alternatively, section 2 sketched the dis-cussion of behavior in bargaining games up

to the appearance of models in which pref-erences depend upon the vector of all pay-offs, one’s own as well as the payoffs ofothers. Subsequent experiments have sug-gested that more is involved. Attitudestowards payoffs appear to depend not onlyon the payoffs themselves, but also the con-text in which these payoffs were generated.A player is more likely to prefer a largeropponent payoff if the opponent’s play hasbeen appropriate (kind, or fair, or generous,or expected) and more likely to prefer asmaller opponent payoff if the opponent’splay has been inappropriate. The experi-mental evidence has provided evidence forpositive reciprocity (the desire to rewardthose who have behaved appropriately)(Fehr and Simon Gächter 2000; Fehr,Gächter, and Georg Kirchsteiger 1997;Kevin A. McCabe, Rassenti, and Smith1998) and negative reciprocity (the desireto punish those who have behaved inappro-priately) (Fehr and Gächter 2002).However, we can expect subjects’ choices toreflect a mixture of concerns for one’s ownpayoff, inequality aversion, altruism, trust,and positive and negative reciprocity (cf.Cox 2004). How do we separate theseforces, i.e., how do we assess the internalvalidity of the experiments? Once again, auseful point of departure is a model of pref-erences encompassing these forces andpointing to experiments that will distin-guish them. Interpreting the experiments isagain likely to rest upon careful theoreticalmodeling.

5. The Search for Theory

Where do we look for theoretical devel-opments that will help integrate economictheory and experimental economics?

To approach this question, think of anexperiment as being composed of threepieces. The game form (recognizing that the“game” may include only a single-player)specifies the rules of play, including thenumber and characteristics of the players,

92 Journal of Economic Literature, Vol. XLIII (March 2005)

67 See, for example, Dieter Balkenborg (1994); JordiBrandts and Holt (1992, 1993, 1995); and Cooper, SusanGarvin, and Kagel (1997, 1997).

68 Weibull (2004) stresses the possibility that an experi-ment’s monetary payoffs may not capture subjects’ prefer-

mr05_Article 2 3/28/05 3:25 PM Page 92

the choices available to the players, theirtiming and sequence, the information avail-able to the players, the resulting conse-quences, and so on. To this, one adds aspecification of how the outcomes are trans-lated into utilities. It would typically be con-venient if the monetary payoffs given by thegame form could also be taken to representplayers’ utilities, but this need not be thecase. Let us refer to a game form and itsassociated utilities as a game protocol or sim-ply protocol, and let us think of the “default”protocol as equating monetary payoffs andutilities.66 The third piece of the triad is atheory describing the behavior one wouldexpect, given the game protocol.

In some cases, the game protocol leaveslittle to the discretion of the theory. If theprotocol combines the dictator game withthe assumption that monetary payoffs areequivalent to utilities, then a theory based onrational behavior leaves no room for maneu-ver: dictators must keep all of the money.Similarly, if the protocol pairs the ultimatumgame with the assumption that monetarypayoffs are utilities, then sequential rational-ity uniquely determines the implications ofthe theory. In other cases, the protocolleaves much to the discretion of the theory.Work on equilibrium refinements grew outof the fact that even if one restricts attentionto relatively simple games and assumes thatthe payoffs are indeed utilities, sequentialrationality in general puts relatively fewrestrictions on behavior.

Now consider how one might react if aneconomic theory and experimental resultsare consistently at odds. One possibility isthat the theory should be refined, orextended, or altered, or abandoned. Forexample, the equilibrium refinements liter-ature culminated in models of equilibriumselection centered around notions of for-ward induction. The experimental evidence

66 Jörgen W. Weibull (2004) introduces the concept of agame protocol, though drawing a somewhat different linebetween the game form and game protocol.

has not been particularly supportive of for-ward induction,67 suggesting that theoriesbased on forward induction could well bereconsidered.

In other cases, there is little to be gainedby looking for alternative theories whilemaintaining the game protocol. In the bar-gaining games in section 2, for example,there appears to be no way to account for theobserved behavior while clinging to a modelbased on rationality and the default protocol.The result, as we have seen, has been a flur-ry of work developing alternative models ofpreferences.68

We return to the modeling of prefer-ences in section 5.2. First, however, section5.1 considers another possible responsewhen examining a protocol. There may begood reasons to question whether the gameform perceived by the subjects matchesthat embedded in the experimental gameprotocol.

5.1 Perceived Protocols

How could subjects help but perceive theproper game form? The potential behaviorin an experiment is typically tightly con-trolled, including quite precise rules for whogets to make what choices at what times. Asnoted in section 3.2, a great advantage ofexperiments is the ability to control thesedetails. In assessing the effects of these con-trols, however, we return to the idea thatpeople, including experimental subjects, usemodels to make decisions.

Just as economists are forced to rely onmodels in their analysis, so can we expectpeople to rely on models when making theirdecisions. Given the many choices people

ences and discusses the resulting difficulties involved indrawing inferences from experiments. Roth (1991) notesthat, given the difficulty in controlling every aspect of subjects’ preferences and expectations, it is hard to knowprecisely what game is involved in an experimental study.

Samuelson: Economic Theory and Experimental Economics 93

71 Psychologists frequently run experiments based on thepremise (often with the help of some deception) that the

mr05_Article 2 3/28/05 3:25 PM Page 93

have to make in their everyday lives, mostwithout the time and resources that econo-mists devote to a problem, we cannot expectpeople to make use of all of the informationin their environment.69 Instead, mostaspects of most decisions are ignoredbecause they are not important enough tobother with. In essence, people use models,stripping away unimportant considerationsto focus on more important ones.

Similarly, we should expect experimentalsubjects to respond to the novelty of anexperimental setting by modeling its key fea-tures. This need to rely on models when ana-lyzing the real world ensures thatresearchers and experimental subjects bothintroduce a subjective element into theirperceptions.70 Using the notation developedabove, an experiment is designed to fix anexperimental design xN. The experimentitself, however, is a situation x∞ with theproperty that x∞(n)�xN(n) for n�1,…, N.The choice of the aspects of the situation tobring within the experimental design, cap-tured by N, represents the experimenter’smodel of the situation. Suppose that anexperimental subject, confronted with thesituation x∞, similarly constructs a model.This model is itself a choice of finitely manydimensions of the infinitely-dimensioned x∞

to take into consideration. Is there any rea-son to expect the subject’s model to coincidewith the experimenter’s, i.e., to expect thesubject to hit upon the same choice of salientinformation as did the experimenter?

We may often be able to expect the subjectto come close. The experiment is typicallydesigned so as to focus attention on xN.However, it would be surprising if the twomodels matched exactly. We thus run the riskthat subjects may ignore aspects of the situa-tion that the experimenter deems critical or

69 How long would it take to get through the grocerystore if every detail of every purchase were analyzed?

70 Section 3.1 touched on the question of whether thereis an objective reality (cf. note 11). The point here is that,regardless of whether there is, models of this reality aresubjective.

that the subjects may introduce aspects thatthe experimenter deems irrelevant.71

An illustration is provided by DouglasDyer and John H. Kagel (1996). Theirresearch is motivated by the observation thatexperimental subjects frequently bid tooaggressively in common-value auctions.Even subjects who are experienced, profes-sional bidders in auctions for constructioncontracts fall prey to the winner’s curse inlaboratory experiments.

Dyer and Kagel note that the auctions inwhich the professionals routinely bid containsome potentially important features that didnot appear in the laboratory experiments.For example, the real-world auctions typical-ly allowed bidders to withdraw winning bids,without cost, when these bids contain mis-takes that are formally characterized as“arithmetic errors” but in practice areallowed to cover virtually any request towithdraw a bid (on the principal that onedoes not want a contractor who does notwant the job). As a result, the winner canwithdraw a bid that is revealed (by compari-son with other bids) to be too optimistic,providing some protection against the win-ner’s curse. It appears as if the bidders havedeveloped rules of behavior that are effec-tive in the context with which they are famil-iar, though perhaps without completelyidentifying the key features of the environ-ment that make these rules work well. Inbringing the resulting rules into the experi-ment, the subjects are reacting to a per-ceived protocol that appears to be familiar,but with results that appear to be anomalouswhen held to the standard of the protocolchosen by the experimenter.72

experimenter and subject will perceive different protocols.72 This raises the question of when we can expect les-

sons learned in one context to transfer to other contexts.Such transfer will presumably be more effective thegreater is the extent to which people learn not only whichbehavior works well, but also the reasons why the behaviorworks well. Cooper and Kagel (2003) provide an introduc-tion to work on generalizing learning across contexts.

94 Journal of Economic Literature, Vol. XLIII (March 2005)

mr05_Article 2 3/28/05 3:25 PM Page 94

The general principle is that, just as sub-jects in an experiment may face effectivepayoffs that differ from those of the gameprotocol, so might they effectively play a dif-ferent game. Our interpretation of experi-mental results can then depend importantlyon how we imagine subjects perceive thegame.73

A hypothetical illustration will be helpful.Suppose that biologists were interested in atheory that female birds preferred maleswith long tails, and that they did so rational-ly because long tails were a signal of othercharacteristics that make a mate particularlydesirable.74 To test this theory, an experi-ment is designed in which some males haveplastic feathers glued to their tails.75

Suppose that females indeed flock to themales with now strikingly long tails. How dowe interpret the results? A biologist is likelyto claim that the experimental results pro-vide support for the theory. However, onecan well imagine an economist claiming justthe opposite, that the theory has beendemonstrated to be nonsense. After all, thetheory is founded on the presumption ofrational behavior. This seems obviouslyinconsistent with exhibiting a preference formales with plastic tails, since the latter can-not be associated with the characteristicsthat make males desirable mates.

73 Uri Gneezy and Aldo Rustichini (2000, 2000) reportexperiments showing that behavior may be counterintu-itively nonmonotonic in the scale of monetary payments,with (for example) the incidence of late pickups at a day-care center increasing as the cost increases from zero to asmall amount, but then being deterred by higher costs.Their interpretation is that the increase from zero to a pos-itive cost causes agents to think about the interaction dif-ferently. In our terms, attention has been focused ondifferent aspects of the situation, triggering the use of adifferent analytical model.

74 There is a rich body of work in biology on plant andanimal signaling. See Alan Grafen (1990a, 1990b) andRufus A. Johnstone and Grafen (1992) for theoreticalmodels; H. C. J. Godfray and Johnstone (2000) andJohnstone (1998) for surveys; and Johnstone (1995) for anexamination of the evidence.

75 See Malte Andersson (1982), J. Hoglund, M.Eriksson, and L. E. Lindell (1990), and Anders Møller(1988) for examples of similar experiments.

These differing conclusions are groundedin different assumptions about how the sub-jects perceive the experiment. The biologistassumes that the subject will not perceivethe difference between a real tail and a plas-tic one or, in our terms, that the subject’smodel of the experiment does not accommo-date plastic tails. The typical assumption ineconomic contexts is that, provided theexperiment is sufficiently transparent andeffectively presented, the subjects’ model ofthe experiment matches the experimentaldesign.

The example of plastic tails may seem a bitremoved from human experiments. Supposeinstead that the hypothesis in question is thathuman males are attracted to females with“hourglass” figures.76 An experimenter teststhis by showing males a variety of porno-graphic pictures, checking which femalesprompt the most enthusiastic reaction. Manymales are responsive to pornography, andmany will be especially responsive tofemales with the appropriate figure. A biolo-gist or psychologist is again likely to interpretthe experimental results as support for thetheory. Once more, however, one can imag-ine economists interpreting the results asanother blow to the contention that people(or at least males) behave rationally. Howcan they be rational if they react the sameway to fictitious females as to real ones?

An alternative interpretation is that peo-ple behave rationally, but that they use mod-els that do not incorporate a distinction thatthe experimenter takes for granted. Just asbirds may have a model of the world thatmakes no provision for plastic tails, so maypeople have models that do not distinguishperfectly between real and fictitious females(though obviously also not treating the twoidentically). Why might people persist inusing such models? We turn to this questionin section 5.2. Before doing so, one more

76 Again, there may be a biological basis for such tastes.See, for example, Bobbi S. Low, R. D. Alexander, and K.M. Noonan (1987) and Matt Ridley (1993).

Samuelson: Economic Theory and Experimental Economics 95

mr05_Article 2 3/28/05 3:25 PM Page 95

example is useful, pushing the setting yetcloser to traditional economic experiments.

Return to the point of departure for sec-tion 2, the ultimatum game. A key compo-nent of the game form is that the proposerand responder will have no subsequent inter-actions. Experimenters have gone to greatlengths to ensure that subjects understandthis, including most notably ensuring that thesubjects interact anonymously. But can wepreclude the possibility that subjects modelthe situation as if there is some possibility offuture interaction? If not, then the observedbehavior might be consistent with rationality,without requiring any modification in howwe model preferences.77

Fehr and Joseph Henrich (2003) shedsome light on this “phantom future” expla-nation, pointing to experimental studiescomparing behavior with and without theprospect of future interaction. For example,experimental behavior in one-shot andrepeated games is markedly different (e.g.,Fehr and Gächter 2000 and Gächter andArmin Falk 2002). As Fehr and Henrichnote, this provides evidence that the inabili-ty to correctly account for the future is not aplausible explanation for the observedbehavior. These results help fill in one pieceof the puzzle, but leave some more to beexplored. The experiments strongly suggestthat people do not treat every situation as ifit has the same prospects for subsequentinteraction. It then remains to ask whethersubjects may still model each situation as ifthere is some prospect of future interaction,perhaps on the strength of some reasoningto the effect that one can never absolutelypreclude any possibility, while still recogniz-ing that games with an explicit future are dif-ferent than games without. The idea behindthis middle ground is that people may notperfectly model one-shot interactions, while

77 Returning to the question raised in section 2, what dosubjects have to learn in the ultimatum game? Perhapsthat its futureless nature distinguishes it from other, morefamiliar situations, and accordingly calls for differentbehavior (and that enough others have also learned this).

still recognizing and acting on the fact thatthe likelihood of future interaction is quitedifferent in different situations (just as malesmay recognize that they are not dealing withreal females when consuming pornography,and yet have a reaction to the latter shapedby their reaction to real females).

At this point, this possibility is a hypothe-sis awaiting further exploration and experi-mentation. Before expecting too much suchexperimentation, however, we must ask forsome theoretical guidance on how peoplemodel the situations they face. What behav-ior could we observe that would bolster ourbelief in such an explanation, or that wouldcall it into question? In essence, we need atheory of how people use theories in shapingtheir behavior. This is a relatively new butimportant direction for economic theory.78

The implication is that models of subjects’perceptions of experimental game formsshould take their place alongside models ofpreferences in explaining behavior. We seeevidence for an important role in the waysubjects perceive experimental protocols inthe importance of framing effects.79 Why doseemingly innocuous differences in thedescription of a protocol make such a differ-ence? Presumably because they prompt sub-jects to use different models in analyzing theexperiment.

This perspective suggests that some cau-tion is called for when working with espe-cially complicated experiments, not simplybecause it may tax the abilities of the sub-jects (as stressed by Binmore 1999), but alsobecause it may expand the range of modelsthat subjects apply to the experiment. At thesame time, Harrison and John A. List (2004)caution that the context-free framing ofmany experiments, designed to eliminatepotentially confounding factors, may insteadsimply invite subjects to impose their own

78 Samuelson (2001) provides one example, in whichthe use of models makes an explicit appearance. Also seePhilippe Jehiel (2004).

79 See Robyn M. Dawes (1988) for an early discussionof framing effects.

96 Journal of Economic Literature, Vol. XLIII (March 2005)

82 This distinguishes this exercise from the bulk of whathas come to be known as “evolutionary game theory” or“evolutionary economics.” These latter bodies of (quitediverse) work share the guiding principle that instead ofoptimizing, people reach decisions and markets reach out-

mr05_Article 2 3/28/05 3:25 PM Page 96

context.80 Despite an experimenter’s bestefforts to ensure that subjects understandwhat they are dealing with, including carefulpresentations, questions, and preliminaryquizzes, it is not clear when we can be confi-dent that the subjects’ models match theexperimenter’s.81

In many cases, it will be difficult to distin-guish whether unexpected behavior is root-ed in subjects’ payoffs or their perception ofthe game form. A tendency to analyze theultimatum game with a model that implicitlybuilds in a future may give results that lookas if an agent has a preference for fairness oran antipathy for asymmetric solutions. Thismay be more than a coincidence. As arguedin Samuelson (2004), a likely response byNature to limitations on our reasoning abili-ty, the same sort of limitations that promptour use of models, may be to compensate bybuilding arguments into our preferencesthat we would not expect to find whenagents are perfectly rational. Hence, the twoarguments are likely to be complementaryrather than contradictory. Then how do wechoose between them, or what use is therein considering both? These questions againsuggest a quest for richer theoretical models.

5.2 Evolutionary Foundations

One difficulty in modeling preferences isthat once we move beyond a narrow concep-tion of self-interest, there appear to be fewrestrictions on the features we can attributeto preferences, and hence the behavior wecan explain (cf. Andrew Postlewaite 1998).

80 Camerer and Keith Weigelt (1988) conduct an earlyexperimental analysis of reputation models. Their resultsexhibit many features of reputation equilibria, but theirsubjects also appear to have a “homemade prior” about theinformation structure that is at odds with a strict interpre-tation of the experimental environment. It appears as if thesubjects have provided a context for the experiment.

81 In the context of the hourglass-figure experiment dis-cussed above, one can imagine an experimenter adding avariety of additional controls to ensure that the subjectsunderstand that pornographic females are not real, per-haps stressing this in the instructions and quizzing the sub-jects on the difference. But this is news to no one, and isunlikely to eliminate the effect.

Where do we find the discipline to ensurethat our models are meaningful? This dan-ger seems all the more real once we openthe door to the possibility that subjects mayform their own models of the experimentalsituation. Where do we look for a theory ofpeoples’ models of the world?

This section suggests an evolutionaryapproach to both questions. The idea is toview evolution as the biological process bywhich humans came to their modern form.82

This modern form includes a host of physicalcharacteristics—our size, our relative lack ofhair, our ability to walk upright— and behav-ioral characteristics—our diet—that wereadily attribute to the forces of evolution.We can also expect our preferences and ourdecision-making to have been the productsof evolution.83 The result is a “reverse engi-neering” approach to studying decision mak-ing. Can we plausibly make a case that agiven specification of preferences, or rulesfor how situations are modeled and translat-ed into decisions, might have evolved as partof a solution to an evolutionary design prob-lem? The more easily one finds such evolu-tionary foundations, the more seriouslyshould we be inclined to take the model inquestion.84

Evolutionary research abounds in mal-adaption stories.85 One first identifies a

comes through an adaptive process involving varyingdegrees of learning, experimentation, and trial-and-error.“Evolution” is a metaphor for this adaptive process.

83 This view is familiar in evolutionary psychology, (e.g.,Leda Cosmides and John Tooby 1992), and has ampleprecedent in economics (e.g., Arthur J. Robson 1992,1996, 1996, 2001a, 2001b).

84 Just as economists are adept at building models, evo-lutionary psychologists have been criticized for seeminglybeing able to rationalize any behavior with an evolutionarymodel. Steven Jay Gould and Richard C. Lewontin (1979)find these models sufficiently unpersuasive as to bedeemed “just-so stories.” If the evolutionary approach is tobe successful, it must do more than provide such stories.

85 Terry Burnham and Jay Phelan (2000) contains awealth of examples.

Samuelson: Economic Theory and Experimental Economics 97

88 In connection with the first feature, it is intriguingthat the study of bargaining behavior in fifteen small-scalesocieties by Henrich, Boyd, Bowles, Camerer, Fehr,

mr05_Article 2 3/28/05 3:25 PM Page 97

behavior that is likely to have been an opti-mal response to the environment in whichwe evolved. One then notes that our mod-ern environment is quite different, causingthe behavior to now be quite surprising, ifnot counterproductive.86 In contrast, ourconcern here is with behavior that evolu-tion has designed as an optimal response toour environment, recognizing that this is anenvironment in which we must rely onpreferences and on models in making ourdecisions.

There is a growing body of work on howour evolutionary background may haveshaped our preferences. Perhaps best devel-oped is the link between reproduction andrisk taking, and hence the implications forattitudes toward risk (Arthur J. Robson1992, 1996). Much of this material is nicelycovered in Robson (2001a).

More recently, experimental evidence hasmounted that people will incur costs notonly to bestow benefits on others, but also topenalize others, with the preference forreward or punishment hinging upon percep-tions of whether the recipient has actedappropriately or inimically. What might bethe evolutionary origins of such “prosocial”behavior? Henrich (2004) offers an explana-tion based on cultural group selection.87 Anadvantage of this model is that it providesthe type of discipline required for furtherinvestigation. For example, the model sug-gests that we should expect multiple cultur-al equilibria, and hence considerablevariation across cultures in the tendency to

86 For a simple example, it is likely that during most ofour evolutionary history, food was both in chronically shortsupply and could be stored only in the form of body fat. Asa result, it appears likely that an evolutionarily successfulstrategy was to eat as much as possible whenever possible.It is then no surprise that members of modern, wealthysocieties find it difficult to avoid health-threateningovereating.

87 The model of cultural group selection avoids many ofthe difficulties that have made biologists skeptical of groupselection arguments. Elliott Sober and David Sloan Wilson(1998) argue that group selection lies behind preferencesfor altruism.

bestow benefits and costs on others. Second,a propensity to imitate or conform to thebehavior of others plays an important role inthe model, suggesting we look for a linkbetween such behavior and prosocial behav-ior. Third, evolution is viewed as facinginformation constraints, so that we must inturn view preferences as tools for maximiz-ing fitness while economizing on informa-tion. These features may be consistent witha variety of other models, and so they cannotbe the end of the quest, but they provide auseful point of departure.88

Work on how evolution has shaped theway people model their environment is in aneven earlier stage. Three illustrations will beuseful.

First, the Wason selection test (1966) isnow a standard example in evolutionarypsychology. Experimental subjects are sur-prisingly prone to errors in evaluatingabstract conditional statements.89 But ifasked to evaluate conditional statementsposed in terms of monitoring compliancewith a standard of behavior, success ismuch higher.90 The suggested interpreta-tion is that our reasoning about conditionalstatements evolved in a setting in whichmonitoring behavior was particularlyimportant.91 This interpretation bolsters anargument of Fehr and Henrich (2003), that

Gintis, and McElreath (2001) finds significant behavioralvariation.

89 For example, if given a collection of cards with num-bers on one side and letters on the other, along with thehypothesis that any card with a 3 on one side has a B on theother, and then shown four cards bearing 3, 7, A and B,only a minority correctly identify which cards must beturned over to check the hypothesis.

90 For example, told that a coke-drinker, beer-drinker,17-year-old and 25-year-old are seated at a table, virtuallyevery subject knows which ones to check for compliancewith a 21-year-old drinking law.

91 See Cosmides and Tooby (1992). The ability to rec-ognize faces (Steven Pinker 1997) is similarly interpreted asan evolutionary response to the importance of monitoringothers’ behavior.

98 Journal of Economic Literature, Vol. XLIII (March 2005)

94 For example, they find that, if subjects are first asked

mr05_Article 2 3/28/05 3:25 PM Page 98

laboratory behavior exhibiting reciprocityshould not be interpreted as an evolution-ary maladaption.92 At the same time, how-ever, it raises the possibility that subjects’perceived protocols may not correctly capture incentives if their presentation issufficiently unfamiliar.

Second, a variety of evidence suggests thatpeople are not very good in dealing withprobabilities. However, there is also evi-dence that people fare much better whenprobabilities are presented in terms of fre-quencies.93 This may indicate that we spentmuch of our history with a frequentist viewof the world. This is consistent with the pos-sibility that people are approximatelyexpected utility maximizers, while perform-ing quite poorly in laboratory experiments, ifthe latter present probabilistic informationunfamiliarly.

Third, Plott (1996) presents the “discov-ered preference hypothesis,” suggestingthat rather than coming to a decision prob-lem with fixed and well defined prefer-ences, people respond by combiningcontextual information and experience withan internal search process to discover theirpreferences. Similar ideas appear in DanAriely, George Loewenstein, and DrazenPrelec’s (2003) notion of “constructed” pref-erences, which they illustrate with a number

92 The maladaption account of such behavior would bethat we evolved in an environment in which repeated orkinship interactions, and hence the optimality of reciproc-ity, were sufficiently pervasive that there was no point inchecking whether such behavior is warranted. However, itis not clear that our evolutionary past would haveequipped us with a basic propensity to monitor for anddetect cheating in an environment in which it was unim-portant to distinguish situations meriting reciprocity fromthose that do not.

93 See Cosmides and Tooby (1996), Gerd Gigerenzer(1991, 1996, 1998), and Tversky and Kahneman (1983).For example, when told that 2 percent of the populationhas a disease and that a test produces no false negativesbut 5 percent false positives, many subjects will struggle toascertain the implications of a positive report. However,they fare better when told that out of every thousand peo-ple, all twenty who have the disease turn up positive but sodo fifty others.

of experiments.94 These results initiallyseem to strike at the core of economic theo-ry, calling into question the idea of stablepreferences. Notice, however, that theprocess by which preferences are discov-ered or constructed sounds much like theprocess Luce and Raiffa (1957) describe asthe foundation for expected utility maxi-mization (cf. section 4.2.2). Rather than sug-gesting that we abandon expected utilitytheory, the experimental results againremind us that the theory may not be asstraightforward as one would like. We canexpect consistent behavior in settingsamenable to small-worlds modeling, butmust expect anomalies to appear in othersituations. In terms of experiments, theimplication is again that behavior may bequite sensitive to seemingly irrelevantdetails of the experimental environment.

What is the common theme of these threeexamples? Echoing the ideas that openedsection 5.1, it is that evolution has equippedus with a variety of models and rules andshortcuts for dealing with a dauntingly com-plex world.95 The possibility the people donot perfectly model futureless protocols may

whether they would be willing to purchase a product at aprice equal to the last two digits of their social securitynumber, and are then asked their valuation of the product,there is significant correlation between their social securi-ty numbers and reported valuations. The suggested inter-pretation is that the subjects subconsciously use thenumbers involved in the first purchase decision as clues tothe appropriate valuation in the second. This reliance oncontextual information may work well in many applica-tions, but leads to apparently absurd behavior in theexperiment.

95 Evolutionary psychologists find evidence for con-straints on evolution’s ability to simply enhance our rea-soning powers and dispense with these devices in therelatively large amount of energy required to maintain thehuman brain (Katharine Milton 1988), the high risk ofmaternal death in childbirth posed by infants’ large heads(W. Leutenegger 1982), and the similarly-caused lengthyperiod of human postnatal development (Paul H. Harvey,R. D. Martin, and T. H. Clutton-Brock 1986). Andy Clark(1993) discusses the potential advantages of using contex-tual clues and specialized rules to conserve on generalizedreasoning resources.

Samuelson: Economic Theory and Experimental Economics 99

96 For example, Jack L. Knetsch, Fang-Fang Tang, andThaler (2003), who also provide references to earlier work,comment that “The endowment effect and loss aversionhave been among the most robust findings of the psychol-ogy of decision making. People commonly value lossesmuch more than commensurate gains . . . ” (2001, p. 257).Kahneman and Tversky (1979) stress that people viewgains and losses quite differently.

mr05_Article 2 3/28/05 3:25 PM Page 99

then arise not because evolution has erro-neously designed us for a different environ-ment, but because evolution has effectivelydesigned us for an environment in whichsuch shortcuts are valuable.

The first two examples are concerned withtechniques for processing information. Thethird reflects a spillover into preferences,bringing us back to the fact that informationconstraints also feature in models of prefer-ence evolution. Samuelson and JeroenSwinkels (2001) consider the relationshipbetween these, examining the implicationsfor preferences of an evolutionary processthat must cope with scarce reasoningresources. The conclusion is that we canexpect a variety of seemingly nonstandardfeatures to be built into our preferences inresponse to imperfections and limitations inour information processing and reasoning.

What are the implications for economics?We can often expect people to act consis-tently and rationally, given their prefer-ences. However, the preferences involved inthe resulting optimizing behavior mayinvolve all sorts of features that at first blushdo not appear consistent with either the pur-suit of individual self-interest or a narrowconcept of consistency. Finally, we canexpect context to be important. In thissense, evolution and the axiomaticapproaches of Savage (1972) and Luce andRaiffa (1957) are on the same page. Bothsuggest that we can expect consistent behav-ior within sufficiently constrained contexts,though with preferences that may appear togo beyond a narrow conception of self-inter-est, but that the context will be importantand that seeming anomalies may readilyarise across contexts.

Smith (2003) argues that we can usefullyview human behavior in terms of two typesof rationality, “constructivist” and “ecologi-cal” rationality. Constructivist rationalityresembles the rational choice models oftraditional economic theory, though againallowing the possibility that preferencesmight reflect more than a narrow self-

interest, while ecological rationality is con-cerned with “the possible intelligenceembodied in the rules, norms, and institu-tions of our cultural and biological her-itage....” (2003, p. 470). The argument hereties these concepts together with the visionof an evolutionary process that struggleswith the constraints imposed by scarce rea-soning resources. The quest to maximizefitness generates a motivation for behaviorto reflect constructionist rationality, whilethe quest to relax constraints leads to theready incorporation of ecological forces.

What are the implications for combiningeconomic theory and experiments? A firstone is that we must be careful in assessingboth experimental findings and economictheory. For example, numerous experi-ments have found that subjects are willingto pay less to receive an object than they arewilling to accept to relinquish the object.Some have interpreted this as reflecting acommon feature of preferences, being anillustration of the more general principlethat people value losses more heavily thangains.96 However, as Plott and KathrynZeiler (2003) argue, there have also beenmany cases in which such discrepancies donot appear, and one can identify experi-mental settings in which the effect reliablydoes or does not appear. This suggests thatwe should stop short of proclaiming theendowment effect a universal feature ofpreferences, and focuses attention on theinternal validity of the experiments. Whatlinks do we draw between differences inexperimental settings and the forces thatshape valuations? At the same time, it indi-cates that something is missing from ourtheoretical repertoire, which currently

100 Journal of Economic Literature, Vol. XLIII (March 2005)

97 For example, the theory of utility maximization occu-pies a prized place at the center of economics. However,the theory has very little predictive content. Given thefreedom to define preferences, virtually any behavior canbe reconciled with expected-utility maximization. Evenapparent violations of the axioms of revealed preferencecan often be apologized away by noting that the data con-sist of choices made at different times and thus while thedecision-maker is in different states, and hence possiblydescribed by different preferences. Predictions thenrequire some augmenting or extension of the revealed-preference axioms, as in Cox (1997).

mr05_Article 2 3/28/05 3:25 PM Page 100

provides little insight into such forces. Afirst step in addressing this issue wouldthen be the construction of theoreticalmodels, especially models shedding insightinto how and when it might have been evo-lutionarily valuable to condition valuationson ownership.

6. Conclusion

Economic theory and economic experi-ments can be combined to the benefit ofboth. By itself, this is a fairly uninformative“more is better” conclusion. There must begains from considering experiments or theo-ry more carefully when doing theory orexperimentation. But what steps can we taketo make it more likely that potential gainsare realized?

The danger with the concerns raised inthis essay is that they might be used to apol-ogize away any potential interactionbetween theory and experiments. It isunlikely that we will usefully combine theo-ry and experiments if we too freely respondto contrasts between the two with such state-ments as: “The results appear to be at oddswith the theory, but we have no obvious wayto measure how far away they are, and by mypreferred measure they are pretty close.” “ … but I suspect the subjects really per-ceived a different experimental protocolunder which their behavior is consistentwith the theory.” “ … but the theory is anapproximation that cannot be expected toapply everywhere, and the discovery of thisexception tells us nothing about the theoryin other applications.” How do we avoidworking at such cross-purposes?

A good beginning would be for exercisesin economic theory to routinely identifybehavior that would be consistent with thetheory, and especially behavior that woulddistinguish the theory from contendingexplanations. Section 3.1 noted that predict-ing behavior is not the only goal of econom-ic theory, and so we cannot expect alltheoretical exercises to be in a position to

point to such behavior.97 We must also allowthe possibility that making connections tobehavior is a goal that the theory will oftennot yet be sufficiently advanced to address.But at some point some connections must bemade between theory and behavior if eco-nomic theory is not to fade into either phi-losophy or mathematics, and work thataspires to make this connection should beexplicit about the implied behavior. In thecourse of doing so, it would be helpful tohave some idea not only of the expectedbehavior itself, but also of how much noisewe might expect to surround this behavior.

Perhaps more importantly, it would beuseful for theory to identify behavior forwhich the theory cannot account, in thesense that the observations would force thetheorist to reconsider. This would ensurethat the theory is not performing well by“theorizing to the test,” as in section 4.1.1.The behavior relegated to this categorymight further be grouped in two categories.One, recognizing that theories can be usefulwithout applying universally, would identifysituations that are not a good match for thetheory and in which contrary behavior wouldnot shake one’s confidence in the usefulnessof the theory. The second would consist ofbehavior that would force reconsideration ofthe theory. The strength of a theory will oftenbe reflected in the content of this latter cate-gory, and we might move toward an explicitexamination of what makes a theory useful.

Similarly, it would be helpful to have theexperimental design indicate which out-comes would be regarded as a failure as well

Samuelson: Economic Theory and Experimental Economics 101

mr05_Article 2 3/28/05 3:25 PM Page 101

as which would be considered a success.This question appears to be trivial in manycases, with success and failure riding on thestatistical significance of an estimatedparameter. However, one of the advantagesof experimental work is the ability to controlthe environment and design the tests. Thisallows us to direct attention away from issuesof statistical significance and toward issues ofeconomic importance. The strength of theexperiment will often be reflected in thecontent of this “failure” category.

Finally, again returning to section 4.1.1, itis important that both theoretical modelsand interpretations of experimental resultsbe precise enough to apply beyond theexperimental situation from which theyemerge. This allows links to be made thatmultiply the power of single studies.

7. Appendix: Proof of Proposition 1

Proof. Given the design xN, the experi-ment induces a true probability distributionπ∗∈�SM over elements sM of the set SM,while the theory f induces a probability dis-tribution f ∗∈��SM over elements π of theset �SM. Given f ∗ and π∗, let

.

H(f ∗, π∗)is thus the probability the theory isaccepted. Let f π* be a measure over �SM thatputs probability one on the true distributionπ∗. Then, from the assumption that Taccepts the truth with probability 1��, wehave, for any π∗,

(5) .

We need to show there exists an such that, for all

(6) ,

where πsM is a probability distribution overSM that puts unitary probability on outcome

H fsM

ˆ ,� �( ) ≥ −1

s SM M∈f̂ SM∈��

H f�

� �∗∗( ) ≥ −, 1

T s df d sS S

M M

M M

∗ ∗( ) ( ) ( )∫ ∫ ,� � ��

H f ∗ ∗( ) =,�

sM. Using the definition of a maximum forthe first inequality and (5) for the second,we have

(7)

.

From Ky Fan’s (1953, Theorem 1) first minmaxtheorem, we have:

(8)

.

Using (8) to replace the first term in (7) anddeleting the middle term then gives

(9) ,

and hence, there is an f̂ such that, for all,

which, from (6), is the desired result.

REFERENCES

Anderson, Simon P., Jacob K. Goeree, and Charles A.Holt. 1998. “Rent Seeking with Bounded Rationality:An Analysis of the All-Pay Auction.” Journal ofPolitical Economy, 106(4): 828–53.

Anderson, Simon P., Jacob K. Goeree, and Charles A.Holt. 1998. “A Theoretical Analysis of Altruism andDecision Error in Public Goods Games.” Journal ofPublic Economics, 70(2): 297–323.

Anderson, Simon P., Jacob K. Goeree, and Charles A.Holt. 2001. “Minimum-Effort Coordination Games:Stochastic Potential and Logit Equilibrium.” Gamesand Economic Behavior, 34(2): 177–99.

Andersson, Malte. 1982. “Female Choice Selects forExtreme Tail Length in a Widowbird.” Nature,299(5886): 818–20.

Andreoni, James, Paul M. Brown, and Lise Vesterlund.2002. “What Makes an Allocation Fair? SomeExperimental Evidence.” Games and EconomicBehavior, 40(1): 1–24.

Andreoni, James, Marco Castillo, and Ragan Petrie.2003. “What Do Bargainers’ Preferences Look Like?Experiments with a Convex Ultimatum Game.”American Economic Review, 93(3): 672–85.

Andreoni, James and John Miller. 2002. “GivingAccording to GARP: An Experimental Test of theConsistency of Preferences for Altruism.”Econometrica, 70(2): 737–53.

Andreoni, James and Larry Samuelson. 2003. “Building

H fsM

ˆ ,� �( ) ≥ −1

s SM M∈

f S SM MH f

∗ ∗∈ ∈

∗( ) ≥ −�� �

max min�

� �, 1

�∗ ∗∈ ∈�� �f S SMmax

MMH fmin ∗ ∗( ),�

��

∗ ∗∈ ∈

∗ ∗( ) =� ��S f SM M

H fmin max ,

� �∗ ∗∈�SM

H fmin ,,� �∗( ) ≥ −1

��

∗ ∗∈ ∈( ) ≥

� ��S f SM MHmin max ƒ ,∗ ∗

102 Journal of Economic Literature, Vol. XLIII (March 2005)

mr05_Article 2 3/28/05 3:25 PM Page 102

Rational Cooperation,” SSRI Working Paper 2003-4.Ariely, Dan, George Loewenstein, and Drazen Prelec.

2003. “‘Coherent Arbitrariness’: Stable DemandCurves without Stable Preferences.” The QuarterlyJournal of Economics, 118(1): 73–105.

Aumann, Robert J. 1985. “What Is Game TheoryTrying to Accomplish?” in Frontiers of Economics.Kenneth J. Arrow and Seppo Honkapohja, eds.Oxford: Blackwell, 28–76.

Balkenborg, Dieter. 1994. “An Experiment on Forwardversus Backward Induction,” SFB Discussion PaperB-268, University of Bonn.

Banks, Jeffrey S., John O. Ledyard, and David P.Porter. 1989. “Allocating Uncertain andUnresponsive Resources: An ExperimentalApproach.” Rand Journal of Economics, 20(1): 1–25.

Bateson, Melissa and Alex Kacelnik. 1996. “RateCurrencies and the Foraging Starling: The Fallacy ofthe Averages Revisited.” Behavioral Ecology, 7(3):341–52.

Battalio, Raymond, Larry Samuelson, and John VanHuyck. 2001. “Optimization Incentives andCoordination Failure in Laboratory Stag HuntGames.” Econometrica, 69(3): 749–64.

Berg, Joyce E., John W. Dickhaut, and Thomas A. Rietz.2003. “Diminishing Preference Reversals by InducingRisk Preferences: Evidence for Noisy Maximization.”Journal of Risk and Uncertainty, 27(2): 139–70.

Bergstrom, Theodore C. 2003. “Vernon Smith’sInsomnia and the Dawn of Economics AsExperimental Science.” Scandinavian Journal ofEconomics, 105(2): 181–205.

Binmore, Ken. 1999. “Why Experiment inEconomics?” Economic Journal, 109(453): F16–24.

Binmore, Ken and Paul Klemperer. 2002. “The BiggestAuction Ever: The Sale of the British 3G TelecomLicenses.” Economic Journal, 112(478): C74–96.

Binmore, Ken, John McCarthy, Giovanni Ponti, LarrySamuelson, and Avner Shaked. 2002. “A BackwardInduction Experiment.” Journal of EconomicTheory, 104(1): 48–88.

Binmore, Ken, Peter Morgan, Avner Shaked, and JohnSutton. 1991. “Do People Exploit Their BargainingPower? An Experimental Study.” Games andEconomic Behavior, 3(3): 295–322.

Binmore, Ken and Larry Samuelson. 1999.“Evolutionary Drift and Equilibrium Selection.” TheReview of Economic Studies, 66(2): 363–93.

Binmore, Ken, Avner Shaked, and John Sutton. 1985.“Testing Noncooperative Bargaining Theory: APreliminary Study.” American Economic Review,75(5): 1178–80.

Binmore, Ken, Avner Shaked, and John Sutton. 1989.“An Outside Option Experiment.” The QuarterlyJournal of Economics, 104(4): 753–70.

Binmore, Ken, Joe Swierzbinski, and Chris Proulx.2001. “Does Minimax Work? An ExperimentalStudy.” Economic Journal, 111(473): 445–64.

Binmore, Kenneth G., John Gale, and Larry Samuelson.1995. “Learning to Be Imperfect: The UltimatumGame.” Games and Economic Behavior, 8(1): 56–90.

Blount, Sally. 1995. “When Social Outcomes Aren’tFair: The Effect of Causal Attributions on

Preferences.” Organizational Behavior and HumanDecision Processes, 63(2): 131–44.

Bolton, Gary E. 1991. “A Comparative Model ofBargaining: Theory and Evidence.” AmericanEconomic Review, 81(5): 1096–136.

Bolton, Gary E. and Axel Ockenfels. 2000. “ERC: ATheory of Equity, Reciprocity, and Competition.”American Economic Review, 90(1): 166–93.

Bolton, Gary E. and Rami Zwick. 1995. “Anonymityversus Punishment in Ultimatum Bargaining.”Games and Economic Behavior, 10(1): 95–121.

Brandts, Jordi and Charles A. Holt. 1992. “AnExperimental Test of Equilibrium Dominance inSignaling Games.” American Economic Review,82(5): 1350–65.

Brandts, Jordi and Charles A. Holt. 1993. “AdjustmentPatterns and Equilibrium Selection in ExperimentalSignaling Games.” International Journal of GameTheory, 22(3): 279–302.

Brandts, Jordi and Charles A. Holt. 1995. “Limitationsof Dominance and Forward Induction: ExperimentalEvidence.” Economics Letters, 49(4): 391–95.

Brewer, Paul J., Maria Huang, Brad Nelson, andCharles R. Plott. 2002. “On the BehavioralFoundations of the Law of Supply and Demand:Human Convergence and Robot Randomness.”Experimental Economics, 5(3): 179–208.

Brewer, Paul J. and Charles R. Plott. 1996. “A BinaryConflict Ascending Price (BICAP) Mechanism forthe Decentralized Allocation of the Right to UseRailroad Tracks.” International Journal of IndustrialOrganization, 14(6): 857–86.

Brown, James N. and Robert W. Rosenthal. 1990.“Testing the Minimax Hypothesis: A Re-examinationof O’Neill’s Game Experiment.” Econometrica,58(5): 1065–81.

Bulmer, Michael. 1997. Theoretical EvolutionaryEcology. Sunderland, MA: Sinauer Associates, Inc.

Burnham, Terry and Jay Phelan. 2000. Mean Genes:From Sex to Money to Food: Taming Our PrimalInstincts. Cambridge: Perseus Publishing.

Camerer, Colin. 1995. “Individual Decision Making,”in Handbook of Experimental Economics. John H.Kagel and Alvin E. Roth, eds. Princeton: PrincetonUniversity Press: 587–703.

Camerer, Colin. 2003. Behavioral Game Theory:Experiments in Strategic Interaction. Princeton:Princeton University Press and Russell SageFoundation.

Camerer, Colin and Keith Weigelt. 1988.“Experimental Tests of a Sequential EquilibriumReputation Model.” Econometrica, 56(1): 1–36.

Cameron, Lisa A. 1999. “Raising the Stakes in theUltimatum Game: Experimental Evidence fromIndonesia.” Economic Inquiry, 37(1): 47–59.

Cason, Timothy N. and Daniel Friedman. 1996. “PriceFormation in Double Auction Markets.” Journal ofEconomic Dynamics and Control, 20(8): 1307–37.

Charness, Gary and Matthew Rabin. 2002.“Understanding Social Preferences with SimpleTests.” The Quarterly Journal of Economics, 117(3):817–69.

Clark, Andy. 1993. Microcognition: Philosophy,

Samuelson: Economic Theory and Experimental Economics 103

mr05_Article 2 3/28/05 3:25 PM Page 103

Cognitive Science and Parallel DistributedProcessing. Cambridge: MIT Press.

Coller, Maribeth, Glenn W. Harrison, and E. ElisabetRutström. 2003. “Are Discount Rates Constant?Reconciling Theory and Observation.” Mimeo.University of South Carolina and University ofCentral Florida.

Cooper, David J., Nick Feltovich, Alvin E. Roth, andRami Zwick. 2003. “Relative versus Absolute Speedof Adjustment in Strategic Environments:Responder Behavior in Ultimatum Games.”Experimental Economics, 6(2): 181–207.

Cooper, David J., Susan Garvin, and John H. Kagel.1997. “Adaptive Learning vs. EquilibriumRefinements in an Entry Limit Pricing Game.”Economic Journal, 107(442): 553–75.

Cooper, David J., Susan Garvin, and John H. Kagel.1997. “Signalling and Adaptive Learning in an EntryLimit Pricing Game.” Rand Journal of Economics,28(4): 662–83.

Cooper, David J. and John H. Kagel. 2003. “LessonsLearned: Generalizing Learning across Games.”American Economic Review, 93(2): 202–07.

Cosmides, Leda and John Tooby. 1992. “ThePsychological Foundations of Culture,” in TheAdapted Mind: Evolutionary Psychology and theGeneration of Culture. Jerome H. Barkow, LedaCosmides, and John Tooby, eds. Oxford: OxfordUniversity Press, 19–136.

Cosmides, Leda and John Tooby. 1996. “Are HumansGood Intuitive Statisticians After All? RethinkingSome Conclusions from the Literature on JudgementUnder Uncertainty.” Cognition, 58(1): 1–73.

Cosmides, Leda and John Tooby. 1996. “CognitiveAdaptations for Social Exchange,” in The AdaptedMind: Evolutionary Psychology and the Generation ofCulture. Jerome H. Barkow, Leda Cosmides, andJohn Tooby, eds. Oxford: Oxford University Press:163–228.

Cox, James C. 1997. “On Testing the UtilityHypothesis.” Economic Journal, 107(443): 1054–78.

Cox, James C. 2004. “How to Identify Trust andReciprocity.” Games and Economic Behavior, 46(2):260–81.

Cox, James C. and Ronald L. Oaxaca. 1995. “InducingRisk-Neutral Preferences: Further Analysis of theData.” Journal of Risk and Uncertainty, 11(1): 65–79.

Cox, James C. and Vjollca Sadiraj. 2002. “Risk Aversionand Expected Utility Theory: Coherence for Small-and Large-Stakes Gambles.” Mimeo. University ofArizona and University of Amsterdam.

Crawford, Vincent P. 1997. “Theory and Experiment inthe Analysis of Strategic Interaction,” in Advances inEconomics and Econometrics: Theory and Applica-tions, Volume 1. Davis M. Kreps and Kenneth F.Wallis, eds. Cambridge: Cambridge University Press,206–42.

Dasgupta, Partha and Eric Maskin. 2003. “Uncertainty,Waiting Costs, and Hyperbolic Discounting.”Mimeo. Institute for Advanced Study, Princeton.

Davis, Douglas D. and Charles A. Holt. 1993.Experimental Economics. Princeton: PrincetonUniversity Press.

Dawes, Robyn M. 1988. Rational Choice in anUncertain World. New York: Harcourt BraceJovanovich.

Dekel, Eddie and Yossi Feinberg. 2004. “A TrueExpert Knows Which Questions Should be Asked,”Mimeo. Institute for Advanced Study, Princeton.

Dennett, Daniel C. 1995. Darwin’s Dangerous Idea.New York: Simon and Schuster.

Drago, Robert and John S. Heywood. 1989.“Tournaments, Piece Rates, and the Shape of thePayoff Function.” Journal of Political Economy,97(4): 992–98.

Dufwenberg, Martin and Georg Kirchsteiger. 2004. “ATheory of Sequential Reciprocity.” Games andEconomic Behavior, 47(2): 268–98.

Dyer, Douglas and John H. Kagel. 1996. “Bidding inCommon Value Auctions: How the CommercialConstruction Industry Corrects for the Winner’sCurse.” Management Science, 42(10): 1463–75.

El-Gamal, Mahmoud A. and David M. Grether. 1995.“Are People Bayesian? Uncovering BehavioralStrategies.” Journal of the American StatisticalAssociation, 90(432): 1137–45.

Erev, Ido, Alvin E. Roth, Robert L. Slonim, and GregBarron. 2002. “Combining a Theoretical PredictionWith Experimental Evidence.” Mimeo. HarvardUniversity.

Fagin, Ronald, Joseph Y. Halpern, Yoram Moses, andMoshe Y. Vardi. 1995. Reasoning About Knowledge.Cambridge: MIT Press.

Falk, Armin, Ernst Fehr, and Urs Fischbacher. 2003.“On the Nature of Fair Behavior.” EconomicInquiry, 41(1): 20–26.

Fan, Ky. 1953. “Minimax Theorems.” Proceedings ofthe National Academy of Sciences of the UnitedStates of America, 39: 42–47.

Fehr, Ernst and Simon Gächter. 2000. “Cooperationand Punishment in Public Goods Experiments.”American Economic Review, 90(4): 980–94.

Fehr, Ernst and Simon Gächter. 2000. “Fairness andRetaliation: The Economics of Reciprocity.” Journalof Economic Perspectives, 14(3): 159–81.

Fehr, Ernst and Simon Gächter. 2002. “AltruisticPunishment in Humans.” Nature, 415(6868): 137–40.

Fehr, Ernst, Simon Gächter, and Georg Kirchsteiger.1997. “Reciprocity as a Contract EnforcementDevice: Experimental Evidence.” Econometrica,65(4): 833–60.

Fehr, Ernst and Joseph Henrich. 2003. “Is StrongReciprocity a Maladaptation? On the EvolutionaryFoundations of Human Altruism,” in The Geneticand Cultural Evolution of Cooperation. PeterHammerstein, ed. Cambridge: MIT Press: 55–82.

Fehr, Ernst and Klaus M. Schmidt. 1999. “A Theory ofFairness, Competition, and Cooperation.” TheQuarterly Journal of Economics, 114(3): 817–68.

Forsythe, Robert, Joel L. Horowitz, N. E. Savin, andMartin Sefton. 1994. “Fairness in Simple BargainingExperiments.” Games and Economic Behavior, 6(3):347–69.

Frederick, Shane, George Loewenstein, and TedO’Donoghue. 2002. “Time Discounting and TimePreference: A Critical Review.” Journal of Economic

104 Journal of Economic Literature, Vol. XLIII (March 2005)

mr05_Article 2 3/28/05 3:25 PM Page 104

Literature, 40(2): 351–401.Fudenberg, Drew and David K. Levine. 1997.

“Measuring Players’ Losses in Experimental Games.”The Quarterly Journal of Economics, 112(2): 507–36.

Fudenberg, Drew and David K. Levine. 1998. TheTheory of Learning in Games. Cambridge: MITPress.

Gächter, Simon and Armin Falk. 2002. “Reputation andReciprocity: Consequences for the Labour Relation.”Scandinavian Journal of Economics, 104(1): 1–26.

Geanakoplos, John, David Pearce, and EnnioStacchetti. 1989. “Psychological Games andSequential Rationality.” Games and EconomicBehavior, 1(1): 60–79.

Gigerenzer, Gerd. 1991. “How to Make CognitiveIllusions Disappear: Beyond Heuristics and Biases,”in European Review of Social Psychology, Volume 2.Wolfgang Stroebe and Miles Hewstone, eds. NewYork: Wiley.

Gigerenzer, Gerd. 1996. “The Psychology of GoodJudgment: Frequency Formats and SimpleAlgorithms.” Journal of Medical Decision Making,16(3): 273–80.

Gigerenzer, Gerd. 1998. “Ecological Intelligence: AnAdaptation for Frequencies,” in The Evolution ofMind. D. Cummings and C. Allen, eds. Oxford:Oxford University Press: 9–29.

Gneezy, Uri and Aldo Rustichini. 2000. “A Fine is aPrice.” Journal of Legal Studies, 29(1): 1–18.

Gneezy, Uri and Aldo Rustichini. 2000. “Pay Enough orDon’t Pay at All.” The Quarterly Journal ofEconomics, 115(3): 791–810.

Gode, Dhananjay K. and Shyam Sunder. 1993.“Allocative Efficiency of Markets with Zero-Intelligence Traders: Market as a Partial Substitutefor Individual Rationality.” Journal of PoliticalEconomy, 101(1): 119–37.

Gode, Dhananjay K. and Shyam Sunder. 1993. “LowerBounds for Efficiency of Surplus Extraction inDouble Auctions,” in The Double Auction Market:Institutions, Theories and Evidence, Proceedings Vol.15. Daniel Friedman and John Rust, eds. Santa Fe:Santa Fe Institute in the Science of Complexity.

Gode, Dhananjay K. and Shyam Sunder. 1997. “WhatMakes Markets Allocationally Efficient?” TheQuarterly Journal of Economics, 112(2): 603–30.

Godfray, H. C. J. and R. A. Johnstone. 2000. “Beggingand Bleating: The Evolution of Parent-OffspringSignalling.” Philosophical Transactions of the RoyalSociety of London B, 355(1403): 1581–91.

Goeree, Jacob K. and Charles A. Holt. 2001. “TenLittle Treasures of Game Theory and Ten IntuitiveContradictions.” American Economic Review, 91(5):1402–22.

Goeree, Jacob K., Charles A. Holt, and Thomas R.Palfrey. 2002. “Quantal Response Equilibrium andOverbidding in Private-Value Auctions.” Journal ofEconomic Theory, 104(1): 247–72.

Goeree, Jacob K., Charles A. Holt, and Thomas R.Palfrey. 2004. “Regular Quantal ResponseEquilibrium,” Mimeo. California Institute ofTechnology and University of Virginia.

Gould, Steven Jay and Richard C. Lewontin. 1979.

“The Spandrels of San Marco and the PanglossianParadigm: A Critique of the AdaptionistProgramme.” Proceedings of the Royal Society ofLondon, Series B, 205(1161): 581–98.

Grafen, Alan. 1990a. “Biological Signals as Handicaps.”Journal of Theoretical Biology, 144(4): 517–46.

Grafen, Alan. 1990b. “Sexual SelectionUnhandicapped by the Fisher Process.” Journal ofTheoretical Biology, 144(4): 473–516.

Güth, Werner, Rolf Schmittberger, and BerndSchwarze. 1982. “An Experimental Analysis ofUltimatum Bargaining.” Journal of EconomicBehavior and Organization, 3(4): 367–88.

Güth, Werner and Reinhard Tietz. 1990. “UltimatumBargaining Behavior: A Survey and Comparison ofExperimental Results.” Journal of EconomicPsychology, 11(3): 417–49.

Haile, Philip, Ali Hortacsu, and Grigory Kosenok.2004. “On the Empirical Content of QuantalResponse Equilibrium,” Mimeo. Yale University.

Halevy, Yoram. 2004. “Diminishing Impatience:Disentangling Time Preference from UncertainLifetime,” Mimeo. University of British Columbia.

Harless, David W. and Colin F. Camerer. 1994. “ThePredictive Utility of Generalized Expected UtilityTheories.” Econometrica, 62(6): 1251–89.

Harless, David W. and Colin F. Camerer. 1995. “AnError Rate Analysis of Experimental Data TestingNash Refinements.” European Economic Review,39(3–4): 649–60.

Harrison, Glenn W. 1989. “Theory and Misbehavior ofFirst-Price Auctions.” American Economic Review,79(4): 749–62.

Harrison, Glenn W. 1990. “Risk Attitudes in First-PriceAuction Experiments: A Bayesian Analysis.” Reviewof Economics and Statistics, 72(3): 541–46.

Harrison, Glenn W. 1992. “Theory and Misbehavior ofFirst-Price Auctions: Reply.” American EconomicReview, 82(5): 1426–43.

Harrison, Glenn W., Ronald M. Harstad, and E.Elisabet Rutström. 2002. “Experimental Methodsand Elicitation of Values,” University of SouthCarolina Working Paper B-95-11.

Harrison, Glenn W. and Jack Hirshleifer. 1989. “AnExperimental Evaluation of Weakest Link/Best ShotModels of Public Goods.” Journal of PoliticalEconomy, 97(1): 201–25.

Harrison, Glenn W. and John A. List. 2004. “FieldExperiments,” Mimeo. University of Central Floridaand University of Maryland.

Harrison, Glenn W. and Kevin A. McCabe. 1992.“Testing Non-Cooperative Bargaining Theory inExperiments,” in Research in ExperimentalEconomics, Volume 5. R. Mark Isaac, ed. London:JAI Press, 137–69.

Harrison, Glenn W. and Kevin A. McCabe. 1996.“Expectations and Fairness in a Simple BargainingExperiment.” International Journal of Game Theory,25(3): 303–27.

Harvey, Paul H., R. D. Martin, and T. H. Clutton-Brock. 1986. “Life Histories in ComparativePerspective,” in Primate Societies. Barbara B. Smuts,Dorothy L. Cheeney, Robert M. Seyfarth, Richard

Samuelson: Economic Theory and Experimental Economics 105

mr05_Article 2 3/28/05 3:25 PM Page 105

W. Wrangham, and Thomas T. Struhsaker, eds.Chicago: University of Chicago Press, 181–96.

Henrich, Joseph. 2000. “Does Culture Matter inEconomic Behavior? Ultimatum Game Bargainingamong the Machiguenga of the Peruvian Amazon.”American Economic Review, 90(4): 973–79.

Henrich, Joseph. 2004. “Cultural Group Selection,Coevolutionary Processes and Large-ScaleCooperation.” Journal of Economic Behavior andOrganization, 53(1): 3–35.

Henrich, Joseph, Robert Boyd, Samuel Bowles, ColinCamerer, Ernst Fehr, Herbert Gintis, and RichardMcElreath. 2001. “In Search of Homo Economicus:Behavioral Experiments in 15 Small-Scale Societies.”American Economic Review, 91(2): 73–78.

Hoffman, Elizabeth, Kevin A. McCabe, and Vernon L.Smith. 1996. “On Expectations and the MonetaryStakes in Ultimatum Games.” International Journalof Game Theory, 25(3): 289–301.

Hoffman, Elizabeth, Kevin McCabe, Keith Shachat,and Vernon L. Smith. 1994. “Preferences, PropertyRights, and Anonymity in Bargaining Games.”Games and Economic Behavior, 7(3): 346–80.

Hoffman, Elizabeth, Kevin McCabe, and Vernon L.Smith. 1996. “Social Distance and Other-RegardingBehavior in Dictator Games.” American EconomicReview, 86(3): 653–60.

Hoglund, J., M. Eriksson, and Lindell. L. E. 1990.“Females of the Lek-Breeding Great Snipe,Gallinago media, Prefer Males with White Tails.”Animal Behavior, 40: 23–32.

Holt, Charles A. and Susan K. Laury. 2002. “RiskAversion and Incentive Effects.” AmericanEconomic Review, 92(5): 1644–55.

Hopkins, Ed. 2002. “Two Competing Models of HowPeople Learn in Games.” Econometrica, 70(6):2141–66.

Houser, Daniel, Michael Keane, and Kevin McCabe.2004. “Behavior in a Dynamic Decision Problem: AnAnalysis of Experimental Evidence Using a BayesianType Classification Algorithm.” Econometrica, 72(3):781–822.

Houston, Alasdair I. and John M. McNamara. 1999.Models of Adaptive Behavior: An Approach Based onState. Cambridge: Cambridge University Press.

Jehiel, Philippe. 2004. “Analogy-Based ExpectationEquilibrium.” Journal of Economic Theory, In Press.

Johnstone, Rufus A. 1995. “Sexual Selection, HonestAdvertisement and the Handicap Principle:Reviewing The Evidence.” Biological Reviews,70(1): 1–65.

Johnstone, Rufus A. 1998. “Game Theory andCommunication,” in Game Theory and AnimalBehavior. Lee Alan Dugatkin and Hudson KernReeve, eds. Oxford: Oxford University Press,94–117.

Johnstone, Rufus A. and Alan Grafen. 1992. “Error-Prone Signalling.” Proceedings of the Royal Societyof London, Series B, 248: 229–33.

Kacelnik, Alex. 1997. “Normative and DescriptiveModels of Decision Making: Time Discounting andRisk Sensitivity,” in Characterizing HumanPsychological Adaptations. Gregory R. Bock and

Gail Cardew, eds. New York: Wiley, 51–70.Kacelnik, Alex and Fausto Brito e Abreu. 1998. “Risky

Choice and Weber’s Law.” Journal of TheoreticalBiology, 194(2): 289–98.

Kagel, John H. 1995. “Auctions: A Survey ofExperimental Research,” in The Handbook ofExperimental Economics. John H. Kagel and AlvinE. Roth, eds. Princeton: Princeton University Press,501–85.

Kagel, John H., Ronald M. Harstad, and Dan Levin.1987. “Information Impact and Allocation Rules inAuctions with Affiliated Private Values: A LaboratoryStudy.” Econometrica, 55(6): 1275–304.

Kagel, John H. and Dan Levin. 2002. Common ValueAuctions and the Winner’s Curse. Princeton andOxford: Princeton University Press.

Kahneman, Daniel and Amos Tversky, eds. 2000.Choices, Values, and Frames. Cambridge: CambridgeUniversity Press; New York: Russell Sage Foundation.

Kahneman, Daniel and Amos Tversky. 1979. “ProspectTheory: An Analysis of Decision under Risk.”Econometrica, 47(2): 263–91.

Knetsch, Jack L., Fang-Fang Tang, and Richard H.Thaler. 2001. “The Endowment Effect and RepeatedMarket Trials: Is the Vickrey Auction DemandRevealing?” Experimental Economics, 4(3): 257–69.

Kohlberg, Elon and Jean-Francois Mertens. 1986. “Onthe Strategic Stability of Equilibria.” Econometrica,54(5): 1003–37.

Ledyard, John O. 1986. “The Scope of the Hypothesisof Bayesian Equilibrium.” Journal of EconomicTheory, 39(1): 59–82.

Ledyard, John O., David Porter, and Randii Wessen.2000. “A Market-Based Mechanism for AllocatingSpace Shuttle Secondary Payload Priority.”Experimental Economics, 2(3): 173–95.

Leutenegger, W. 1982. “Encephalization andObstetrics in Primates with Particular Reference toHuman Evolution,” in Primate Brain Evolution:Methods and Concepts. Este Armstrong and DeanFalk, eds. New York: Plenum: 85–95.

Levine, David K. 1998. “Modeling Altruism andSpitefulness in Experiments.” Review of EconomicDynamics, 1(3): 593–622.

Lipman, Barton L. 1999. “Decision Theory withoutLogical Omniscience: Toward an AxiomaticFramework for Bounded Rationality.” The Review ofEconomic Studies, 66(2): 339–61.

Loewenstein, George and Drazen Prelec. 1992.“Anomalies in Intertemporal Choice: Evidence andan Interpretation.” The Quarterly Journal ofEconomics, 107(2): 573–97.

Low, Bobbi S., R. D. Alexander, and K. M. Noonan.1987. “Human Hips, Breasts, and Buttocks: Is FatDeceptive?” Ethology and Sociobiology, 8(4): 249–57.

Luce, Duncan and Howard Raiffa. 1957. Games andDecisions. New York: Wiley.

Mazur, James E. 1984. “Tests of an Equivalence Rulefor Fixed and Variable Reinforcer Delays.” Journal ofExperimental Psychology: Animal BehaviorProcesses, 10(4): 426–36.

Mazur, James E. 1986. “Fixed and Variable Ratios andDelays: Further Tests of an Equivalence Rule.”

106 Journal of Economic Literature, Vol. XLIII (March 2005)

mr05_Article 2 3/28/05 3:25 PM Page 106

Journal of Experimental Psychology: AnimalBehavior Processes, 12(2): 116–24.

Mazur, James E. 1987. “An Adjusting Procedure forStudying Delayed Reinforcement,” in QuantitativeAnalyses of Behaviour: The Effect of Delay and ofIntervening Events on Reinforcement Value. MichaelL. Commons, James E. Mazur, John A. Nevin, andHoward Rachlin, eds. Hillsdale, NJ: LawrenceErlbaum, 55–73.

McCabe, Kevin A., Stephen J. Rassenti, and Vernon L.Smith. 1998. “Reciprocity, Trust, and Payoff Privacyin Extensive Form Bargaining.” Games andEconomic Behavior, 24(1–2): 10–24.

McKelvey, Richard D. and Thomas R. Palfrey. 1992.“An Experimental Study of the Centipede Game.”Econometrica, 60(4): 803–36.

McKelvey, Richard D. and Thomas R. Palfrey. 1995.“Quantal Response Equilibria for Normal FormGames.” Games and Economic Behavior, 10(1): 6–38.

McKelvey, Richard D. and Thomas R. Palfrey. 1998.“Quantal Response Equilibria for Extensive FormGames.” Experimental Economics, 1(1): 9–41.

Milgrom, Paul. 2004. Putting Auction Theory to Work.Cambridge: Cambridge University Press.

Milton, Katharine. 1988. “Foraging Behavior and theEvolution of Primate Cognition,” in MachiavellianIntelligence: Social Expertise and the Evolution ofIntellect in Monkeys, Apes and Humans. Richard W.Byrne and Andrew Whiten, eds. Oxford: ClarendonPress: 285–305.

Møller, Anders. 1988. “Female Choice Selects for MaleSexual Tail Ornaments in the MonogamousSwallows.” Nature, 332(6165): 640–42.

Nash, John F. 2002. “Noncooperative Games,” in TheEssential John Nash. Harold W. Kuhn and SylviaNasar, eds. Princeton: Princeton University Press:85–98.

Ochs, Jack and Alvin E. Roth. 1989. “An ExperimentalStudy of Sequential Bargaining.” AmericanEconomic Review, 79(3): 355–84.

Pinker, Steven. 1997. How the Mind Works. New York:W. W. Norton.

Plott, Charles R. 1996. “Rational Individual Behaviorin Markets and Social Choice Processes: TheDiscovered Preference Hypothesis,” in The RationalFoundations of Economic Behavior. Kenneth J.Arrow, Enrico Colombatto, Mark Perlman andChristian Schmidt, eds. New York: St. Martin’s Press,225–50.

Plott, Charles R. and Vernon L. Smith. 1978. “AnExperimental Examination of Two ExchangeInstitutions.” The Review of Economic Studies, 45(1):133–53.

Plott, Charles R. and Kathryn Zeiler. 2003 “TheWillingness to Pay/Willingness to Accept Gap, the‘Endowment Effect,’ Subject Misconceptions andExperimental Procedures for Eliciting Valuations,”Mimeo. California Institute of Technology.

Postlewaite, Andrew. 1998. “The Social Basis ofInterdependent Preferences.” European EconomicReview, 42(3–5): 779–800.

Prasnikar, Vesna and Alvin E. Roth. 1992.“Considerations of Fairness and Strategy:

Experimental Data from Sequential Games.” TheQuarterly Journal of Economics, 107(3): 865–88.

Rabin, Matthew. 1993. “Incorporating Fairness intoGame Theory and Economics.” American EconomicReview, 83(5): 1281–302.

Rabin, Matthew. 2000. “Risk Aversion and Expected-Utility Theory: A Calibration Theorem.”Econometrica, 68(5): 1281–92.

Rabin, Matthew and Richard H. Thaler. 2001.“Anomalies: Risk Aversion.” Journal of EconomicPerspectives, 15(1): 219–32.

Rassenti, Stephen J., Vernon L. Smith, and Robert L.Bulfin. 1982. “A Combinatorial Auction Mechanismfor Airport Time Slot Allocation.” Bell Journal ofEconomics, 13(2): 402–17.

Ridley, Matt. 1993. The Red Queen: Sex and theEvolution of Human Nature. New York: PenguinBooks.

Robson, Arthur J. 1992. “Status, the Distribution ofWealth, Private and Social Attitudes to Risk.”Econometrica, 60(4): 837–57.

Robson, Arthur J. 1996a. “A Biological Basis forExpected and Non-expected Utility.” Journal ofEconomic Theory, 68(2): 397–424.

Robson, Arthur J. 1996b. “The Evolution of Attitudesto Risk: Lottery Tickets and Relative Wealth.” Gamesand Economic Behavior, 14(2): 190–207.

Robson, Arthur J. 2001. “The Biological Basis ofEconomic Behavior.” Journal of EconomicLiterature, 39(1): 11–33.

Robson, Arthur J. 2001. “Why Would Nature GiveIndividuals Utility Functions?” Journal of PoliticalEconomy, 109(4): 900–14.

Ross, Sheldon. 1996. Stochastic Processes. New York:Wiley.

Roth, Alvin E. 1987. “Laboratory Experimentation inEconomics,” in Advances in Economic Theory: FifthWorld Congress of the Econometric Society. TrumanBewley, ed. Cambridge: Cambridge UniversityPress, 269–99.

Roth, Alvin E. 1988. “Laboratory Experimentation inEconomics: A Methodological Overview.” EconomicJournal, 98(393): 974–1031.

Roth, Alvin E. 1991. “Game Theory as a Part ofEmpirical Economics.” Economic Journal, 101(404):107–14.

Roth, Alvin E. 1993. “The Early History ofExperimental Economics.” Journal of the History ofEconomic Thought, 15(2): 184–209.

Roth, Alvin E. 1994. “Let’s Keep the Con out ofExperimental Econ.: A Methodological Note.”Empirical Economics, 19(2): 279–89.

Roth, Alvin E. 1995. “Bargaining Experiments,” inHandbook of Experimental Economics. John H.Kagel and Alvin E. Roth, eds. Princeton: PrincetonUniversity Press, 253–348.

Roth, Alvin E. and Ido Erev. 1995. “Learning inExtensive-Form Games: Experimental Data andSimple Dynamic Models in the Intermediate Term.”Games and Economic Behavior, 8(1): 164–212.

Roth, Alvin E. and Michael W. K. Malouf. 1979. “Game-Theoretic Models and the Role of Information inBargaining.” Psychological Review, 86(6): 574–94.

Samuelson: Economic Theory and Experimental Economics 107

mr05_Article 2 3/28/05 3:25 PM Page 107

Roth, Alvin E. and J. Keith Murnighan. 1982. “TheRole of Information in Bargaining: An ExperimentalStudy.” Econometrica, 50(5): 1123–42.

Roth, Alvin E., Vesna Prasnikar, Masahiro Okuno-Fujiwara, and Shmuel Zamir. 1991. “Bargaining andMarket Behavior in Jerusalem, Ljubljana,Pittsburgh, and Tokyo: An Experimental Study.”American Economic Review, 81(5): 1068–95.

Roth, Alvin E. and Francoise Schoumaker. 1983.“Expectations and Reputations in Bargaining: AnExperimental Study.” American Economic Review,73(3): 362–72.

Rubinstein, Ariel. 1998. Modeling BoundedRationality. Cambridge: MIT Press.

Rubinstein, Ariel. 2001. “Comments on the Risk andTime Preferences in Economics,” Mimeo. Tel AvivUniversity.

Rubinstein, Ariel. 2003. “‘Economics and Psychology’?The Case of Hyperbolic Discounting.” InternationalEconomic Review, 44(4): 1207–16.

Samuelson, Larry. 2001. “Analogies, Adaptation, andAnomalies.” Journal of Economic Theory, 97(2):320–66.

Samuelson, Larry. 2004. “Information-Based RelativeConsumption Effects.” Econometrica, 72(1): 93–118.

Samuelson, Larry and Jeroen Swinkels. 2001.“Information and the Evolution of the UtilityFunction,” SSRI Working Paper 2001-06, Universityof Wisconsin.

Sandroni, Alvaro. 2003. “The Reproducible Propertiesof Correct Forecasts.” International Journal of GameTheory, 32(1): 151–59.

Savage, Leonard J. 1972. The Foundations of Statistics.New York: Dover Publications.

Segal, Uzi and Joel Sobel. 2003. “Tit for Tat:Foundations of Preferences for Reciprocity inStrategic Settings,” Mimeo. Johns HopkinsUniversity and University of California, San Diego.

Selten, Reinhard. 1965. “SpieltheoretischeBehandlung eines Oligopolmodells mitNachfrageträgheit.” Zeitschrift fur die GesamteStaatswissenschaft, 121: 301–24 and 667–89.

Selten, Reinhard. 1975. “Reexamination of thePerfectness Concept for Equilibrium Points inExtensive-Form Games.” International Journal ofGame Theory, 4(1–2): 25–55.

Selten, Reinhard, Abdolkarim Sadrieh, and KlausAbbink. 1999. “Money Does Not Induce RiskNeutral Behavior, but Binary Lotteries Do EvenWorse.” Theory and Decision, 46(3): 211–49.

Slonim, Robert and Alvin E. Roth. 1998. “Learning inHigh Stakes Ultimatum Games: An Experiment inthe Slovak Republic.” Econometrica, 66(3): 569–96.

Smith, Cedric A. B. 1961. “Consistency in StatisticalInference and Decision.” Journal of The RoyalStatistical Society, Series B, 23(1): 1–37.

Smith, Vernon L. 1962. “An Experimental Study ofCompetitive Market Behavior.” Journal of PoliticalEconomy, 70(2): 111–37.

Smith, Vernon L. 1964. “Effect of Market Organizationon Competitive Equilibrium.” Quarterly Journal ofEconomics, 78(2): 181–201.

Smith, Vernon L. 1965. “Experimental Auction

Markets and the Walrasian Hypothesis.” Journal ofPolitical Economy, 73(4): 387–93.

Smith, Vernon L. 1976. “Bidding and AuctioningInstitutions: Experimental Results,” in Bidding andAuctioning for Procurement and Allocation. Y.Amihud ed. New York: New York University Press,43–64.

Smith, Vernon L. 1982. “Microeconomic Systems as anExperimental Science.” American Economic Review,72(5): 923–55.

Smith, Vernon L. 1991. Papers in ExperimentalEconomics. Cambridge: Cambridge University Press.

Smith, Vernon L. 2000. Bargaining and MarketBehavior: Essays in Experimental Economics.Cambridge: Cambridge University Press.

Smith, Vernon L. 2003. “Constructivist and EcologicalRationality in Economics.” American EconomicReview, 93(3): 465–508.

Sober, Elliott and David Sloan Wilson. 1998. UntoOthers: The Evolution and Psychology of UnselfishBehavior. Cambridge: Harvard University Press.

Sozou, Peter D. 1998. “On Hyperbolic Discounting andUncertain Hazard Rates.” Proceedings of the RoyalSociety of London, Series B, 265(1409): 2015–20.

Sunder, Shyam. 2004. “Markets as Artifacts: AggregateEfficiency from Zero-Intelligence Traders,” inModels of a Man: Essays in Honor of Herbert A.Simon. Mie Augier and James G. March, eds.Cambridge: MIT Press, 501–19.

Thaler, Richard H. 1988. “Anomalies: The UltimatumGame.” Journal of Economic Perspectives, 2(4):195–206.

Thaler, Richard H. 1992. The Winner’s Curse.Princeton: Princeton University Press.

Thaler, Richard H. 1994. Quasi-Rational Economics.New York: Russell Sage Foundation.

Tversky, Amos and Daniel Kahneman. 1982.“Judgment under Uncertainity: Heuristics andBiases,” in Judgment under Uncertainity: Heuristicsand Biases. Daniel Kahneman, Paul Slovic, andAmos Tversky, eds. Cambridge: CambridgeUniversity Press, 3–22.

Tversky, Amos and Daniel Kahneman. 1983.“Extensional Versus Intuitive Reasoning: TheConjunction Fallacy in Probability Judgment.”Psychological Review, 90(4): 293–315.

Walker, James M., Vernon L. Smith, and James C. Cox.1990. “Inducing Risk-Neutral Preferences: AnExamination in a Controlled Market Environment.”Journal of Risk and Uncertainty, 3(1): 5–24.

Wason, Peter. 1966. “Reasoning,” in New Horizons in Psychology. B. M. Foss, ed. Harmondsworth:Penguin Books.

Weibull, Jörgen W. 2004. “Testing Game Theory,” inAdvances in Understanding Strategic Behaviour:Game Theory, Experiments and BoundedRationality: Essays in Honour of Werner Güth.Steffen Huck, ed. Basingstoke: Palgrave.

Winter, Eyal and Shmuel Zamir. 1997. “An Experimentwith Ultimatum Bargaining in a ChangingEnvironment,” Discussion Paper 159, The HebrewUniversity, Center for Rationality and InteractiveDecision Theory.