

Counterfactual Desirability

Richard Bradley and H. Orri Stefánsson

Department of Philosophy, Logic and Scientific Method

London School of Economics

Houghton Street

London WC2A 2AE

March 21, 2014

Abstract

The desirability of what actually occurs is often influenced by what could have been. Preferences based on such value dependencies between actual and counterfactual outcomes generate a class of problems for orthodox decision theory, the best known perhaps being the so-called Allais Paradox. In this paper we solve these problems by extending Richard Jeffrey's decision theory to counterfactual prospects, using a multidimensional possible-world semantics for conditionals, and showing that preferences that are sensitive to counterfactual considerations can still be desirability maximising. We end the paper by investigating the conditions necessary and sufficient for a desirability function to be an expected utility. It turns out that these additional conditions imply highly implausible epistemic principles.

The desirability of what actually occurs is often influenced by what could have been. Suppose you have been offered two jobs, one very exciting but with a substantial risk of unemployment, the other less exciting but more secure. If you choose the more risky option, and as a result become unemployed, you might find that the fact that you could have chosen the risk-free alternative makes being unemployed even worse. For in addition to experiencing the normal pains of being out of a job, you might be filled with regret for not having chosen the risk-free alternative. On other occasions something different from regret explains the dependence of our assessments of what is the case on what could have been. Suppose that a patient has died because a hospital gave the single kidney that it had available to another patient. Suppose also that the two patients were in equal need of the kidney, had equal rights to treatment, etc. Now if we were to learn that a fair lottery was used to determine which patient was to receive the kidney, then most of us would find that this makes the situation less undesirable than had the kidney simply been given to one of them. For that at least means that the patient who died for lack of a kidney had had a chance to acquire it. In other words, had some random event turned out differently than it actually did, the dead patient would have lived.

This desirabilistic dependency between what is and what could have been creates well-known problems for the traditional theory of rational choice under risk and uncertainty, as formulated by John von Neumann and Oskar Morgenstern [von Neumann and Morgenstern, 1944] and Leonard Savage [Savage, 1954]. The first example is just a simplified version of Maurice Allais' infamous paradox [Allais, 1953], [Allais, 1979], whereas the latter is an instance of a decision theoretic problem identified decades ago by Peter Diamond [Diamond, 1967]. In this paper we use a framework based on a combination of Richard Jeffrey's decision theory [Jeffrey, 1983] and a multidimensional possible-world semantics for conditionals [Bradley, 2012] to explore the above dependency.

Section 1 explains the two paradoxes and why they cast doubt on a rationality postulate known as separability. Unlike expected utility theory, Richard Jeffrey's decision theory does not encode this separability property. But as we explain in section 2, the lack of counterfactual prospects in Jeffrey's theory nevertheless means that it cannot easily represent the preferences revealed in Allais' and Diamond's examples. To overcome this problem, in section 3 we introduce counterfactuals into Jeffrey's theory and then, in section 4, show how this makes it possible to represent Allais' and Diamond's preferences as maximising the value of a Jeffrey desirability function. In section 5 we show that, contrary to what decision theorists and philosophers have typically assumed, a second assumption of ethical actualism, quite different from the aforementioned separability property, is involved in the clash between Allais' and Diamond's preferences and expected utility theory. We conclude by showing that it is both necessary and sufficient for expected utility maximisation that agents' desires satisfy both ethical actualism and separability, and that these principles impose unreasonable constraints on agents' beliefs.

1 Two Paradoxes of Rational Choice

The Allais Paradox has generated a great deal of discussion amongst philosophers, behavioural economists and psychologists. The paradox is generated by offering people a pair of choices between different lotteries, each of which consists in tickets being randomly drawn. First people are offered a choice between a lottery that is certain to result in the decision maker receiving a particular prize, say £2400, and a lottery that could result in the decision maker receiving nothing, but could also result in the decision maker receiving either as much as or more than £2400. The situation can be represented as a choice between the lotteries L1 and L2 below, where for instance L1 results in the decision maker receiving a prize of £2500 if one of tickets number 2 to 34 is drawn:

Ticket   1       2-34    35-100
L1       £0      £2500   £2400
L2       £2400   £2400   £2400

Having made a choice between L1 and L2, people are asked to make a second one, this time between lotteries L3 and L4:

Ticket   1       2-34    35-100
L3       £0      £2500   £0
L4       £2400   £2400   £0


Repeated (formal and informal) experiments have confirmed that people by and large choose and strictly prefer L2 over L1 and L3 over L4. (See [Kahneman and Tversky, 1979] for discussion of an early experiment of the Allais Paradox.) One common way to rationalise this preference, which we will refer to as ‘Allais' preference’, is that when choosing between L1 and L2, the possibility of ending up with nothing when you could have received £2400 for sure outweighs the possible extra gain of choosing the riskier alternative, since receiving nothing when you could have gotten £2400 for sure is bound to cause considerable regret (see e.g. [Loomes and Sugden, 1982] and [Broome, 1991]). When it comes to choosing between L3 and L4, however, the desire to avoid regret does not play as strong a role, since decision makers reason that if they choose L3 and end up with nothing then they would, in all likelihood, have received nothing even if they had chosen the less risky option L4.

Intuitively rational as it seems, Allais' preference is inconsistent with the most common formal theories of rational choice. (Assuming, that is, that the probability of each ticket is the same in the two choice situations. That is, the probability of a ticket being drawn from e.g. tickets 2-34 is the same in both choice situations.) Let us continue, throughout this section, to think of the alternatives between which people have preferences as lotteries, with the understanding that some lotteries result in the same consequence (or ‘prize’) in all states of the world. According to the standard theory of rational choice, expected utility theory (EU theory), all rational preferences can be represented as maximising the expectation of utility, where the expected utility of a lottery L is given by:

$$EU(L) = \sum_{S_i \in S} u(L(S_i)) \cdot \Pr(S_i)$$

where $S$ is a partition of the possible states, $L(S_i)$ is the consequence of $L$ if $S_i$ happens to be the actual state of the world, $u$ a utility measure on consequences and $\Pr$ a probability measure on states.

In the usual manner let $\succsim$ represent the agent's ‘... is at least as preferred as ...’ relation between alternatives and $\succ$ and $\sim$ the corresponding strict preference and indifference relations between them. Then EU theory states that for any rational agent:

$$L \succ L' \iff EU(L) > EU(L') \tag{1}$$

(When the above holds for someone's preferences, we say that the EU function represents their preferences.)

The problem that the Allais Paradox poses to standard decision theory is that there is no way to represent Allais' preferences in terms of the maximisation of the value of a function with the EU form. To see this, let us assume that in both choice situations the decision maker considers the probability of each ticket being drawn to be 1/100. Then if Allais' evaluation of the alternatives is in accordance with the EU equation, Allais' preferences imply (after multiplying through by 100) that both:

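$$u(\pounds 0) + 33\,u(\pounds 2500) + 66\,u(\pounds 2400) < 100\,u(\pounds 2400) \tag{2}$$

and:

$$u(\pounds 2400) + 33\,u(\pounds 2400) < u(\pounds 0) + 33\,u(\pounds 2500)$$

But the latter implies that:

$$u(\pounds 2400) + 33\,u(\pounds 2400) + 66\,u(\pounds 2400) = 100\,u(\pounds 2400) < u(\pounds 0) + 33\,u(\pounds 2500) + 66\,u(\pounds 2400)$$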

in contradiction with inequality (2). Hence, there is no EU function that simultaneously satisfies $EU(L_1) < EU(L_2)$ and $EU(L_4) < EU(L_3)$. In other words, there is no way to represent a person who (strictly) prefers L2 over L1 and L3 over L4 as maximising utility as measured by an EU function. Since all rational preference should, according to EU theory, be representable as maximising expected utility, this suggests that either Allais' preference is irrational or EU theory is incorrect. (Hence the ‘paradox’: many people want to say both that Allais' preference is rational and that EU theory is the correct theory of practical rationality.)

Another way to see that Allais' preference cannot be represented as maximising the value of an EU function is to notice that the preference violates a separability condition on preferences which is required for it to be possible to represent them by an EU function. (This separability condition is perhaps best known in the form of Savage's Sure Thing Principle.) The condition requires that when comparing two alternatives whose consequences depend on what state is actual, rational agents only consider the states of the world where the two alternatives differ. More formally:

If

      S1   S2
Li    x    z
Lj    y    z

then $L_i \succ L_j$ iff $x \succ y$.
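The clash between Allais' preference and the EU form can also be checked numerically. The following is a minimal sketch, not part of the original argument: the utility numbers are hypothetical, and the names `LOTTERIES`, `expected_utility` and `eu_gaps` are introduced here purely for illustration. It computes the gap EU(L1) - EU(L2) and the gap EU(L3) - EU(L4) exactly and shows that they coincide for each assignment tried, because the shared column for tickets 35-100 cancels out of both comparisons.

```python
from fractions import Fraction

# Each lottery maps a block of tickets (first, last) to a prize in pounds,
# following the two tables above.
LOTTERIES = {
    "L1": {(1, 1): 0, (2, 34): 2500, (35, 100): 2400},
    "L2": {(1, 1): 2400, (2, 34): 2400, (35, 100): 2400},
    "L3": {(1, 1): 0, (2, 34): 2500, (35, 100): 0},
    "L4": {(1, 1): 2400, (2, 34): 2400, (35, 100): 0},
}

def expected_utility(lottery, u):
    """EU(L): sum of u(prize) * Pr(block), each ticket having probability 1/100."""
    return sum(Fraction(last - first + 1, 100) * u[prize]
               for (first, last), prize in lottery.items())

def eu_gaps(u):
    """Return (EU(L1) - EU(L2), EU(L3) - EU(L4)) for a utility assignment u."""
    eu = {name: expected_utility(lot, u) for name, lot in LOTTERIES.items()}
    return eu["L1"] - eu["L2"], eu["L3"] - eu["L4"]

# Hypothetical utility assignments (illustrative numbers, not from the paper).
for raw in ({0: 0, 2400: 10, 2500: 11},
            {0: -50, 2400: 3, 2500: 4},
            {0: 0, 2400: 1, 2500: 100}):
    u = {prize: Fraction(value) for prize, value in raw.items()}
    gap12, gap34 = eu_gaps(u)
    # Equal gaps mean EU(L1) < EU(L2) and EU(L3) > EU(L4) cannot hold together.
    print(raw, gap12, gap34, gap12 == gap34)
```

Exact rational arithmetic is used so that the equality of the two gaps is not blurred by floating-point rounding.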

In the choice problem under discussion, this means that you only need to consider the tickets that give different outcomes depending on which alternative is chosen. Hence, you can ignore the last column, i.e. tickets 35-100, both when choosing between L1 and L2 and when choosing between L3 and L4, since these tickets give the same outcome no matter which alternative is chosen. When we ignore this column, however, alternative L1 becomes identical to L3 and L2 to L4. Hence, by simultaneously preferring L2 over L1 and L3 over L4, the decision maker seems to have revealed an inconsistency in her preferences.

The second example discussed in the introduction generates a paradox similar to Allais' if we assume that there is nothing irrational about strictly preferring a lottery that gives the patients an equal chance of receiving the kidney to giving the kidney to either patient without any such lottery being used. If we call the patients Ann and Bob, and let ANN represent the outcome where Ann receives the kidney and BOB the outcome where Bob receives the kidney, then to represent the aforementioned attitude, which we will refer to as ‘Diamond's preference’, as maximising the value of an EU function, it has to be possible to simultaneously satisfy:

$$u(\mathrm{ANN}) < 0.5\,u(\mathrm{ANN}) + 0.5\,u(\mathrm{BOB})$$

$$u(\mathrm{BOB}) < 0.5\,u(\mathrm{ANN}) + 0.5\,u(\mathrm{BOB})$$

But that is of course impossible: an average of the values $u(\mathrm{ANN})$ and $u(\mathrm{BOB})$ can never be greater than both values $u(\mathrm{ANN})$ and $u(\mathrm{BOB})$.
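The impossibility can be illustrated with a brute-force search. This is only a sketch over a hypothetical grid of utility values (the function name `diamond_satisfiable` and the grid are assumptions made for illustration); it does not replace the one-line argument above, but it shows concretely that no pair of values on the grid satisfies both inequalities.

```python
def diamond_satisfiable(u_ann, u_bob, p=0.5):
    """True if the fair lottery's expected utility strictly exceeds both sure outcomes."""
    lottery = p * u_ann + (1 - p) * u_bob
    return lottery > u_ann and lottery > u_bob

# Hypothetical grid of candidate utilities from -10 to 10 in steps of 0.5.
grid = [x / 2 for x in range(-20, 21)]
witnesses = [(a, b) for a in grid for b in grid if diamond_satisfiable(a, b)]
print(witnesses)  # no pair on this grid satisfies both inequalities
```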


Again, we can see the tension between Diamond's preference and standard theories of rational choice by noticing that it violates separability. An implication of separability is that, given the prospects displayed below, where E represents the outcome of some random event (e.g. a coin toss), $L \succ L_A$ iff $L_B \succ L_A$ and $L \succ L_B$ iff $L_A \succ L_B$. Hence, Diamond's preference, which requires both $L \succ L_A$ and $L \succ L_B$, in conjunction with separability implies a contradiction: $L_B \succ L_A$ and $L_A \succ L_B$.

      E     ¬E
L     ANN   BOB
LA    ANN   ANN
LB    BOB   BOB

The fact that both Allais' and Diamond's preferences involve a violation of separability and that their preferences seem intuitively rational (or at least not irrational) casts doubt on separability as a rationality postulate. Moreover, both the desire to avoid regret, as manifested in Allais' preference, and the concern for giving each patient a ‘fair chance’, which seems to be what underlies Diamond's preference, have something to do with counterfactuals. Regret, at least in the situation under discussion, is a bad feeling associated with knowing that one could have acted differently and that if one had, things would have been better. And to say that even if Bob did not receive a kidney he nevertheless had a chance seems to mean that there is a meaningful sense in which things could have turned out differently (for instance, a coin could have come up differently) and if they had, Bob would have received the kidney. So both Allais and Diamond violate the formal separability requirement of standard decision theories since they judge that the value of what actually occurs at least partly depends on what could have been, i.e. on counterfactual possibilities.

Perhaps for the reason discussed above, some economists and philosophers have thought that separability as a requirement on preference is implied by an evaluative assumption we call ethical actualism. Informally put, ethical actualism is the assumption that only the actual world matters, so that the desirability of combinations of what actually occurs and what could have occurred only depends on the desirability of what actually occurs. In a well-known defence of separability, Nobel Laureate Paul Samuelson argues that it would be irrational to violate ethical actualism, and since he thinks that ethical actualism implies separability, he takes this argument to show that it would be irrational to violate separability. The separability postulate Samuelson was defending, which is implied by what we above called separability, states that if some outcome $(A)_1$ is at least as good as $(B)_1$ and $(A)_2$ is at least as good as $(B)_2$, then an alternative that results in either $(A)_1$ or $(A)_2$ depending on whether a fair coin comes up heads or tails is at least as good as an alternative that results in $(B)_1$ or $(B)_2$ depending on how the coin lands. Here is Samuelson's informal justification of the axiom:

[E]ither heads or tails must come up: if one comes up, the other cannot; so there is no reason why the choice between $(A)_1$ and $(B)_1$ should be ‘contaminated’ by the choice between $(A)_2$ and $(B)_2$. ([Samuelson, 1952]: 672-673)

In other words, the reason an evaluation or ordering of alternatives should satisfy separability is that there should be no desirabilistic dependencies between mutually incompatible outcomes; that is, our preferences should satisfy separability since our evaluation of outcomes should satisfy ethical actualism.

Some philosophers and decision theorists have cited Samuelson's remark favourably. John Broome, who takes it to at least provide a ‘prima facie presumption in favour of [separability]’, rhetorically asks: ‘How can something that never happens possibly affect the value of something that does happen?’ ([Broome, 1991]: 96). But however closely related ethical actualism and separability might seem to be, the former does not (by itself) imply the latter. In fact the two are based on different, though consistent, intuitions. The former expresses the idea that only what actually happens matters, while the latter expresses the idea that the desirability of what would be the case if one set of conditions held true is independent of what would be the case if some other set of conditions did. To see that these are different requirements consider the set of prospects displayed in the matrix below.

      E     ¬E
L1    ANN   BOB
LA    ANN   ANN
LB    BOB   BOB
L2    BOB   ANN

Now, as we have seen, separability requires that $L_1 \succ L_A$ iff $L_B \succ L_2$. On the other hand, ethical actualism requires that, conditional on E being true, $L_1 \sim L_A$ and $L_B \sim L_2$. Clearly it is possible for one of these to hold without the other. So even if Samuelson and Broome are right about the intuitive appeal of ethical actualism, this does not in any way establish that separability is rationally required.

In section 5 we will show that, in fact, both conditions need to be satisfied for it to be possible to represent an agent's preferences as maximising expected utility. So expected utility theory is a good deal more demanding than is typically acknowledged by its proponents. But first we show that there are forms of maximisation that are far less demanding.

2 Jeffrey Desirability

Not all decision theories assume separability. In particular, the version of decision theory developed by Richard Jeffrey [Jeffrey, 1983] makes do with much weaker conditions on preference, which nonetheless entail that the desirability of any proposition is a weighted average of the desirabilities of the different ways in which the proposition can be true, where the weight on each is given by the conditional probability, given the truth of the proposition, of it coming true in that way. The question that we now want to explore is whether we can represent Allais' and Diamond's preferences as maximising Jeffrey desirability.1

1 The possibility of representing Allais' preference as maximising desirability would probably not have impressed Jeffrey himself, who was satisfied with Savage's view that Allais' preference reveals some sort of ‘error’ of judgement ([Savage, 1954]: 102-103; [Jeffrey, 1982]: 722).


Jeffrey defines both his desirability measure, Des, and his probability measure, Prob, on a Boolean algebra of propositions (a set of propositions closed under negation, conjunction and disjunction) from which the impossible proposition has been removed. If we take a proposition to be a set of possible worlds, we can state his theory more formally as follows. Let $W$ be the universal set of possible worlds and $\Omega$ the set of subsets of $W$ (i.e. the power set of $W$). Then desirability and probability measures are defined over $\Omega$, elements of which (the propositions) we denote by non-italic uppercase letters (A, B, C, etc.). We can thus think of each way in which proposition A can be true as a world that is compatible with the truth of A. Assuming for simplicity that there are countably many mutually exclusive worlds compatible with A,2 then the Jeffrey-desirability of a proposition is given by:

$$\mathrm{Des}(A) = \sum_{w_i \in W} \mathrm{Des}(w_i) \cdot \mathrm{Prob}(w_i \mid A)$$
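As a concrete reading of this equation, the sketch below computes the Jeffrey desirability of a proposition in a small finite model. Everything numerical here is hypothetical (the five worlds, their probabilities and desirabilities), and the helper names `prob`, `des` and `des_via_partition` are introduced only for illustration. Because worlds outside A have zero probability conditional on A, the sum is taken over the worlds in A. The second helper recomputes the same value from a coarser division of A into disjoint cells, which should agree with the world-by-world sum.

```python
from fractions import Fraction

# Hypothetical five-world model: probabilities and desirabilities of worlds.
PROB = {"w1": Fraction(1, 10), "w2": Fraction(2, 10), "w3": Fraction(3, 10),
        "w4": Fraction(1, 10), "w5": Fraction(3, 10)}
DES = {"w1": Fraction(4), "w2": Fraction(-1), "w3": Fraction(2),
       "w4": Fraction(0), "w5": Fraction(5)}

def prob(prop):
    """Probability of a proposition, i.e. of a set of worlds."""
    return sum(PROB[w] for w in prop)

def des(prop):
    """Des(A): sum over worlds w in A of Des(w) * Prob(w | A)."""
    return sum(DES[w] * PROB[w] / prob(prop) for w in prop)

def des_via_partition(prop, cells):
    """Des(A) from disjoint cells B_i covering A: sum of Prob(B_i | A) * Des(B_i)."""
    assert set().union(*cells) == set(prop)         # the cells cover A
    assert sum(len(c) for c in cells) == len(prop)  # and do not overlap
    return sum(prob(c) / prob(prop) * des(c) for c in cells)

A = {"w1", "w2", "w3"}
print(des(A))
print(des_via_partition(A, [{"w1"}, {"w2", "w3"}]))
print(des_via_partition(A, [{"w1", "w2"}, {"w3"}]))
```

The three printed values coincide in this model, whichever way A is carved up.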

Why should we accept this as a measure of how desirable the truth of a proposition is? One way to see the appropriateness of this measure is to think of desirability as news value; that is, a proposition A is desirable to an agent to the extent that it would be valuable for her to learn that A is true. Intuitively, it seems that the desirability of finding out the truth of A depends on how desirable are the different ways in which A could be true and the probability that it comes true in one of these ways rather than another.

A more formal justification for Jeffrey's equation as a way of calculating desirability is that it is partition invariant. That is, if a proposition A can be expressed as the disjoint disjunction of both $\{B_1, B_2, B_3, \ldots\}$ and $\{C_1, C_2, C_3, \ldots\}$, then $\sum_i \mathrm{Prob}(B_i \mid A) \cdot \mathrm{Des}(B_i) = \sum_i \mathrm{Prob}(C_i \mid A) \cdot \mathrm{Des}(C_i)$ (see e.g. [Joyce, 1999], Theorem 4.1). The same is not true of the expected utility equation: the same alternative will get assigned different utilities depending on how we partition the state and outcome spaces. In fact, unlike Jeffrey's desirability measure, the expected utility measure can only be used when the state and outcome spaces have been partitioned finely enough to account for everything the agent cares about. In other words, our partition needs to be such that given each alternative and state, there is no uncertainty as to the utility value of the outcome associated with that alternative-state pair. If we do not have such fine partitions, as ordinary decision makers rarely (if ever) do, then different partitions will lead to alternatives being assigned different values. Hence, when using the expected utility equation, different partitions may recommend different courses of action.

In Jeffrey's theory acts are just propositions that can be made true at will and so the desirabilities of acts will depend on the conditional probabilities of their consequences, given the performance of the acts. As a result, separability can fail. For instance, consider two acts A and B with consequences contingent on states S1 and S2, as displayed below:

     S1   S2
A    x    z
B    y    z
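For the two acts tabulated above, the dependence on act-conditional probabilities can be made concrete with a small numerical sketch. All numbers are hypothetical and chosen only for illustration: z is more desirable than both x and y, x is less desirable than y, and act A makes state S2 much more likely than act B does. The names `DES`, `OUTCOME`, `PROB_GIVEN_ACT` and `act_desirability` are likewise assumptions of this example.

```python
from fractions import Fraction

# Hypothetical desirabilities of the three consequences.
DES = {"x": Fraction(1), "y": Fraction(2), "z": Fraction(10)}

# Consequence of each act in each state, as in the table above.
OUTCOME = {"A": {"S1": "x", "S2": "z"}, "B": {"S1": "y", "S2": "z"}}

# Hypothetical probabilities of the states conditional on each act.
PROB_GIVEN_ACT = {
    "A": {"S1": Fraction(1, 10), "S2": Fraction(9, 10)},
    "B": {"S1": Fraction(9, 10), "S2": Fraction(1, 10)},
}

def act_desirability(act):
    """Des(act): sum over states of Des(consequence) * Prob(state | act)."""
    return sum(DES[OUTCOME[act][state]] * p
               for state, p in PROB_GIVEN_ACT[act].items())

print(act_desirability("A"), act_desirability("B"))
```

With these numbers act A comes out more desirable than act B even though x is less desirable than y, because A shifts probability towards the state in which z results.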

2 If we want to assume infinitely many (non-atomic) worlds we take the integral instead of the sum.


Separability requires that $A \succ B$ iff $x \succ y$. But if z is considered a more desirable outcome than both x and y, and A makes S2 more likely than does B, then A might be assigned a higher Jeffrey desirability than B even when x is not preferred to y.

Unfortunately, this does not completely settle our question. For although Jeffrey's theory does not imply separability, it is, as usually applied, also inconsistent with the Allais and Diamond preferences. Let us focus on the Diamond paradox to see the problem. $L_B$ now represents the set of worlds where Bob gets the kidney no matter what, $L_{\neg B}$ the set of worlds where Ann gets the kidney no matter what, and $L$ the set of worlds where the toss of a fair coin decides who gets the kidney. Then for Diamond's preference to be compatible with Jeffrey's theory, it would seem that there has to be a function Des such that:

$$\mathrm{Des}(\mathrm{ANN}) < \mathrm{Des}(\mathrm{ANN}) \cdot \mathrm{Prob}(\mathrm{ANN} \mid L) + \mathrm{Des}(\mathrm{BOB}) \cdot \mathrm{Prob}(\mathrm{BOB} \mid L)$$

$$\mathrm{Des}(\mathrm{BOB}) < \mathrm{Des}(\mathrm{ANN}) \cdot \mathrm{Prob}(\mathrm{ANN} \mid L) + \mathrm{Des}(\mathrm{BOB}) \cdot \mathrm{Prob}(\mathrm{BOB} \mid L)$$

But again, a probability mixture of the desirabilities of ANN and BOB can of course never exceed the desirability of both ANN and BOB.

What this shows is that there is more at play than just the failure of separability in the explanation of Allais' and Diamond's preferences. For the standard representation of the two problems, and our application of Jeffrey's theory to them, implicitly builds in the aforementioned assumption of ethical actualism. Without this assumption (but assuming that the desirability of Ann or Bob getting the kidney is independent of the random event E), Jeffrey's theory just says that:

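$$\mathrm{Des}(L) = \mathrm{Des}(\mathrm{ANN} \wedge L) \cdot \mathrm{Prob}(\mathrm{ANN} \mid L) + \mathrm{Des}(\mathrm{BOB} \wedge L) \cdot \mathrm{Prob}(\mathrm{BOB} \mid L)$$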

and nothing requires that $\mathrm{Des}(\mathrm{ANN} \wedge L) = \mathrm{Des}(\mathrm{ANN})$ or $\mathrm{Des}(\mathrm{BOB} \wedge L) = \mathrm{Des}(\mathrm{BOB})$.

It seems then that the way to accommodate the Allais and Diamond preferences within Jeffrey's framework is just to specify the consequences of actions sufficiently broadly so as to make it intelligible that, for instance, Ann getting the kidney in a fair lottery is a different consequence from her getting it as a part of a process that made it certain she would receive it. More generally the notion of consequence should be broadened to take account of what could have happened as well as what did happen. Just such a response to Allais' paradox has been suggested by, for instance, John Broome [Broome, 1991], who argues that if regret matters to an agent then it should be encoded in the outcomes of the lotteries,3 and by Paul Weirich [?], who argues that the correct way to account for the risk attitudes displayed in the Allais paradox is to allow that the risk involved in exercising an option counts as one of its consequences.

3 Broome makes his suggestion for resolving the problem within Savage's framework but, as he notes, this leads to other problems: most notably to a tension with what he calls the rectangular field assumption. As Jeffrey's theory makes no such assumption, the solution looks more promising in his framework.

Solutions of this kind will be unsatisfactory however if they involve introducing new primitive consequences in the representation of the decision problem, without explaining their relationship to the available actions. In particular, they must explain what it is about the form of the lottery L that makes $\mathrm{Des}(\mathrm{ANN} \wedge L) > \mathrm{Des}(\mathrm{ANN})$. Moreover, to avoid trivialising decision theory by making it allow that any possible choice is rational, we should require that exercises of this kind, where new propositions (or consequences) are created to make seemingly problematic preferences compatible with decision theory, adhere to some independently plausible principles.

In the context of Jeffrey's framework, avoiding these objections requires a specification of the propositional structure of lotteries and the attitudes that they support. We do so by widening the domain of Jeffrey's theory to include counterfactual propositions. And we show that the properties that generate Allais' and Diamond's paradoxes, respectively regret and fairness, then emerge as a relationship between factual and counterfactual conditional propositions. This is not an ad hoc solution to the problems under consideration, we think, since decision theory should independently of these problems allow for the value dependencies one often finds between actual and counterfactual outcomes.

Let us first briefly mention why introducing indicative conditionals to Jeffrey's theory (as e.g. done in [Bradley, 1998] and [Bradley, 2007]) will not solve the problem of representing Allais' and Diamond's preferences. An indicative conditional is generally considered to be what Jonathan Bennett calls zero intolerant, ‘meaning that such a conditional is useless to someone who is really sure that its antecedent is false’ ([Bennett, 2003]: 45). In other words, if ‘$\mapsto$’ represents the indicative conditional connective, then $A \mapsto B$ is informative for someone who thinks that A might be true (where ‘might’ is understood epistemically, not merely logically or metaphysically). But $A \mapsto B$ provides no information about a world where one is certain that A is false. (Hence, it is ‘useless’ to someone who is certain that A is false.4) It is therefore plausible to assume, as Bradley does, that $\mathrm{Des}(\neg A \wedge (A \mapsto B)) = \mathrm{Des}(\neg A)$, since if A is believed to be false $A \mapsto B$ makes no desirabilistic difference. Thus the conditionals that generate the paradoxes discussed in section 1 cannot be indicative conditionals, since the problems they generate consist exactly in the fact that they have desirabilistic impact when their antecedents are believed to be false.

What we need to do therefore is introduce counterfactual conditionals into Jeffrey's theory. Jeffrey himself recognised the need to do so and tried to solve the problem of providing an account of counterfactuals, but by his own account did not succeed.

(If I had, you would have heard of it. There's a counterfactual for you.) In fact, the problem hasn't been solved to this day. I expect it's unsolvable. ([Jeffrey, 1991]: 161)

Jeffrey was unduly pessimistic. Since he made this remark there has been considerable progress in the understanding of counterfactuals, progress that we now build on.

4 The fact that a conditional is zero-tolerant does not necessarily mean that its antecedent is false. Hence, some want to call such conditionals subjunctive conditionals. That name is however not necessarily any better, since zero-tolerant conditionals are not always expressed in the subjunctive mood. Hence, we will stick with the term ‘counterfactual’.


3 Counterfactuals

Our problem is to find a way of representing counterfactual propositions (counterfactuals for short) in a way which will enable us to exploit the resources of Jeffrey's decision theory. To do so we extend standard possible world modelling of propositions in a natural way by introducing the notion of a possible counteractual world under a supposition. A possible world is a way things might be or might have been. A possible counteractual world under the supposition that some A is true, on the other hand, is just a way things might be, or might have been, were A true.

If world $w_A$ could be the case under the supposition that A, then we will say that $w_A$ is a possible counteractual A-world. If A is false, $w_A$ will be said to be strictly counterfactual. (Any counteractual A-world is strictly counterfactual relative to any possible world in which A is false, for instance. But counteractual worlds are not always strictly counterfactual: if A is true then $w_A$ may not only be a possible way things are under the supposition that A, but the way things actually are.)

Our basic thesis is: possible counteractual worlds make counterfactual claims true in the same way that possible actual worlds make factual claims true. For instance, if $w_A$ is a counteractual A-world at which it is true that B, then $w_A$ makes it true that if A were the case then B would be. Thus the counteractual world in which Obama is born in Kenya and goes to school in Nairobi makes it true that had Obama been born in Kenya he would have gone to school in Nairobi, while the counteractual world in which he is born in Kenya but goes to school in Mombasa makes it false.

To illustrate this thesis, consider a simple model based on the set $W = \{w_1, w_2, w_3, w_4, w_5\}$ of just five possible worlds and the corresponding set of its subsets, including the events $A = \{w_1, w_2, w_3\}$, $\bar{A} = \{w_4, w_5\}$, $B = \{w_1, w_2, w_4\}$ and $C = \{w_1, w_3, w_5\}$, which are respectively the sets of worlds at which it is true that A, $\neg$A, B and C (throughout, we use $\bar{A}$ to denote $W - A$). Relative to the set of possible worlds $W$, a supposition induces a set of possible counteractual worlds. The supposition that A, for instance, induces the set of counteractual A-worlds, $W_A = \{w_1, w_2, w_3\}$, and the corresponding set of sets of counteractual worlds, $\Omega_A$, containing conditional events $B_A = \{w_i \in W_A : w_i \in B\} = \{w_1, w_2\}$, $C_A = \{w_1, w_3\}$ and so on. The supposition that A is false induces a different set of counteractual worlds, namely $W_{\bar{A}} = \{w_4, w_5\}$, and a corresponding set of conditional events $\Omega_{\bar{A}}$. The supposition that B induces yet another. And so on. Note that we have adopted the convention of denoting sets of worlds with non-italicised letters, with A denoting the set of worlds at which it is true that A and $B_A$ denoting the set of A-worlds at which it is true that B.

For simplicity, we restrict attention to a single supposition for the moment, namely the supposition that A. The set of elementary possibilities is then given by a subset z of the cross-product of $W$ and $W_A$, which can be presented in tabular form as follows:


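                    Supposed A-worlds
Worlds    $w_1$                        $w_2$                        $w_3$
$w_1$     $\langle w_1, w_1 \rangle$   $\langle w_1, w_2 \rangle$   $\langle w_1, w_3 \rangle$
$w_2$     $\langle w_2, w_1 \rangle$   $\langle w_2, w_2 \rangle$   $\langle w_2, w_3 \rangle$
$w_3$     $\langle w_3, w_1 \rangle$   $\langle w_3, w_2 \rangle$   $\langle w_3, w_3 \rangle$
$w_4$     $\langle w_4, w_1 \rangle$   $\langle w_4, w_2 \rangle$   $\langle w_4, w_3 \rangle$
$w_5$     $\langle w_5, w_1 \rangle$   $\langle w_5, w_2 \rangle$   $\langle w_5, w_3 \rangle$

Table 1: Possibility Space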
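As a minimal sketch (ours, for illustration only), the possibility space of Table 1 can be built in a few lines of Python. The function names `factual` and `conditional` are hypothetical labels for the "rows" and "columns" readings of the table; nothing here goes beyond the five-world model in the text.

```python
from itertools import product

# Worlds and events from the five-world model in the text.
W = ["w1", "w2", "w3", "w4", "w5"]
A = {"w1", "w2", "w3"}
B = {"w1", "w2", "w4"}
C = {"w1", "w3", "w5"}

# Counteractual A-worlds: the worlds at which A is true.
W_A = [w for w in W if w in A]

# Elementary possibilities <w_i, w_j>: w_i is actual, w_j is the counteractual A-world.
space = set(product(W, W_A))

def factual(X):
    """Rows of Table 1: pairs whose actual world lies in X."""
    return {(w, wa) for (w, wa) in space if w in X}

def conditional(Y):
    """Columns of Table 1: pairs whose counteractual A-world lies in Y."""
    return {(w, wa) for (w, wa) in space if wa in Y}

# "If A then B" picks out the w1 and w2 columns.
if_A_then_B = conditional(B)

# Conjunction is intersection: "A, and if A then C".
A_and_if_A_then_C = factual(A) & conditional(C)
```

Representing propositions as plain sets of pairs means that conjunction, disjunction and negation come for free as set intersection, union and complement relative to `space`.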

Each ordered pair $\omega_{ij} = \langle w_i, w_j \rangle$ appearing in the cells of the table represents an elementary possibility: that $w_i$ is the actual world and that $w_j$ is the counteractual $A$-world. Sets of such possibilities will serve for us as propositions. Factual propositions are given by rows of the table. The proposition that $A$, for instance, is given by the first, second and third rows of the table, while that of $B$ is given by the first, second and fourth. Conditional propositions, on the other hand, are given by columns of the table. The proposition that if $A$ then $B$, for instance, is given by the first and second columns of the table, while the proposition that if $A$ then $C$ is given by the first and third columns. Conjunctions, disjunctions and negations of propositions (conditional or otherwise) are given by their intersections, unions and complements.

Table 1 implicitly assumes that every element of $W \times W_A$ is a possible combination of facts and counterfacts, but this assumption is easy to dispense with. To generate a space $\mathcal{Z}$ of elementary possibilities we make use of a selection function on worlds which determines which counteractual worlds are "accessible" from them. Formally, a selection function $f$ is a mapping from $W \times \Omega$ to $\Omega$ satisfying, for all $w \in W$ and all $\mathrm{A} \subseteq W$:

1. $f(w, \mathrm{A}) \subseteq \mathrm{A}$

2. $f(w, \mathrm{A}) = \emptyset \Leftrightarrow \mathrm{A} = \emptyset$

3. If $w \in \mathrm{A}$ then $w \in f(w, \mathrm{A})$

The first condition simply states that counteractual worlds under the supposition that $A$ must be worlds at which it is true that $A$, and the second that the set of counteractual worlds is empty only if the supposition is contradictory. The third condition requires that any world at which it is true that $A$ must be a possible counteractual $A$-world. This condition is termed Weak Centering, in contrast to its stronger "cousin" that is typically assumed in the semantics of counterfactuals, namely:

Centering: If $w \in \mathrm{A}$ then $f(w, \mathrm{A}) = \{w\}$
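The difference between Weak Centering and Centering can be seen by generating the elementary possibilities from two candidate selection functions. The sketch below is illustrative only: `f_weak` and `f_centered` are hypothetical selection functions chosen for simplicity, and `satisfies_conditions` simply checks conditions 1 to 3 at every world of the five-world model.

```python
W = ["w1", "w2", "w3", "w4", "w5"]
A = frozenset({"w1", "w2", "w3"})

def f_weak(w, S):
    """Hypothetical selection function: every S-world is accessible from every world."""
    return set(S)

def f_centered(w, S):
    """Hypothetical selection function obeying Centering: an S-world selects only itself."""
    return {w} if w in S else set(S)

def satisfies_conditions(f, S):
    """Check conditions 1-3 on f for the supposition S at every world in W."""
    for w in W:
        selected = f(w, S)
        if not selected <= S:                      # 1. only S-worlds are selected
            return False
        if (len(selected) == 0) != (len(S) == 0):  # 2. empty iff S is empty
            return False
        if w in S and w not in selected:           # 3. Weak Centering
            return False
    return True

def possibilities(f, S):
    """Elementary possibilities <w, w_S> permitted by the selection function f."""
    return {(w, ws) for w in W for ws in f(w, S)}
```

Under `f_weak` all fifteen cells of Table 1 are possible, whereas under `f_centered` the rows for the three $A$-worlds collapse to their diagonal entries, leaving nine elementary possibilities.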

Centering expresses a particular conception of the relation between factual and counterfactual possibility, according to which what is actually true determines what might have been true under any supposition consistent with the actual truth. This is surely right for epistemic possibility: at any world $w$ at which it is true that $A$ it is not epistemically possible that any world other than $w$ be the case on the supposition that $A$. And epistemic possibility would seem to be what is at issue when we reason evidentially using indicative conditionals. On the other hand it is much more controversial whether Centering governs causal possibility and hence whether it is appropriate to counterfactual reasoning. Both Lewis and Stalnaker assume that it is, perhaps because they take counterfactual and evidential reasoning to coincide when what is being supposed is in fact true. But in the absence of a deterministic relationship between two events it does not seem obviously right to regard the fact of their co-occurrence as implying that the occurrence of one causally necessitated the other. So it is not clear that the assumption is appropriate for counterfactuals. In any case, we do not need to settle the issue here and will for the sake of generality not assume it.

We now have all the ingredients in place to state our account of counterfactual possibility. Formally, a Suppositional Algebra is a structure $\langle W, \Omega, S, f, \mathcal{Z}, \Gamma \rangle$ with $W$ a set of possible worlds, $\Omega$ a Boolean algebra of subsets of $W$, $S = \{S_i\} \subseteq \Omega$ a set of $n$ suppositions, $f$ a selection function from $W \times S$ to $\Omega$, $\mathcal{Z}$ the set of elementary possibilities and $\Gamma = \wp(\mathcal{Z})$ the set of all propositions, with:

$$\mathcal{Z} := \{\omega = \langle w, w_1, \ldots, w_n \rangle : w \in W,\ w_i \in f(w, S_i)\}$$

For any $S_i \in S$, let $\Omega_i$ be the power set of $S_i$. Then given $X \in \Omega$ and $Y_i \in \Omega_i$, let $\langle X, Y_1, \ldots, Y_n \rangle$ be the element of $\Gamma$ that is the proposition that $X$ is the case, that $Y_1$ is or would be, on the supposition that $S_1$, ..., and that $Y_n$ is or would be, on the supposition that $S_n$. Each such ordered tuple is thus a coarse-grained but complex proposition concerning both what is and what could be. When there is no risk of ambiguity we drop "empty" notation and write $X$ for $\langle X, S_1, \ldots, S_n \rangle$, the proposition that $X$ is the case; $Y_i$ for $\langle W, S_1, \ldots, Y_i, \ldots, S_n \rangle$, the proposition that if $S_i$ is or were the case then $Y_i$ is or would be; $\langle X, Y_i \rangle$ for $\langle X, S_1, \ldots, Y_i, \ldots, S_n \rangle$; and so on. It follows that $\langle X, Y_i \rangle = X \cap Y_i$, $\langle Y_1, \ldots, Y_n \rangle = \bigcap_i Y_i$ and so on.

Of particular interest to our discussion are propositions of the form $\langle Y_1, \ldots, Y_n \rangle$, which specify what will or would be the case under each supposition. For it is propositions of this kind that serve in expected utility theory as representations of the actions over which agents have preferences. Consider, for instance, the case described by Diamond which was previously represented in tabular form by:

         $E$     $\neg E$
$L$      ANN     BOB
$L_A$    ANN     ANN
$L_B$    BOB     BOB

In our framework, $ANN$, $BOB$ and $E$, as well as any Boolean compound of them, would make up the set of factual propositions, with $E$ and $\neg E$ serving as the suppositions of interest. The full set of elementary possibilities would then be given by the cross product of the set of factual propositions and $\{E, \neg E\}$, and would contain conditional propositions such as $ANN_E$ (the proposition that Ann would get the kidney if $E$ were the case) and Boolean compounds of them. In particular, lottery $L$ would be identified by the complex proposition $\langle ANN_E, BOB_{\bar{E}} \rangle$, a proposition that is a conjunction of the conditional propositions $ANN_E$ and $BOB_{\bar{E}}$, i.e. $L = ANN_E \cap BOB_{\bar{E}}$. Similarly for degenerate lotteries: $L_A = \langle ANN_E, ANN_{\bar{E}} \rangle$ and $L_B = \langle BOB_E, BOB_{\bar{E}} \rangle$. Our task now is to say what attitudes one can rationally take to such propositions.

3.1 Probability

An agent can be uncertain both about what is actually the case and about what is or would be the case if some condition is or were true. One might be pretty sure that the match is to be played tomorrow, for instance, but quite unsure as to whether it would be played were it to rain. The first kind of uncertainty will be represented here by a probability mass function $p_0$ on the set of possible worlds $W$ measuring the probability that any world is the actual one.$^{5}$ The second kind of uncertainty, her uncertainty about what would be the case if some supposition $S_i$ were true, will be represented by a probability mass function $p_i$ on the set of possible counteractual worlds $S_i$. Finally her combined uncertainty will be given by a joint probability mass function, $p$, on the ordered tuples of worlds that constitute the elementary possibilities, measuring the joint probabilities of actuality and counteractuality under the various suppositions. For example, $p(\langle w, w_1, \ldots, w_n \rangle)$ is the probability that $w$ is the actual world, that $w_1$ is/would be the counteractual world on the supposition that $S_1$, ..., and that $w_n$ is/would be the counteractual world on the supposition that $S_n$.

The mass function $p$ induces a corresponding probability measure $\mathrm{Prob}$ on the set $\Gamma$ of all propositions by means of the following definition. For all $\alpha \in \Gamma$ (where $\alpha$ could be either factual or conditional):

$$\mathrm{Prob}(\alpha) := \sum_{\omega \in \alpha} p(\omega)$$
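The definition can be sketched computationally. The example below is illustrative only: the uniform mass function is an assumption made for simplicity, and the space is the fifteen-cell possibility space of the five-world model with the single supposition that $A$.

```python
from fractions import Fraction
from itertools import product

W = ["w1", "w2", "w3", "w4", "w5"]
W_A = ["w1", "w2", "w3"]
space = list(product(W, W_A))

# Assumed for illustration only: a uniform mass function over the 15 pairs.
p = {omega: Fraction(1, len(space)) for omega in space}

def prob(alpha):
    """Prob(alpha): the sum of p(omega) over the elementary possibilities in alpha."""
    return sum(p[omega] for omega in alpha)

B = {"w1", "w2", "w4"}
fact_B = {(w, wa) for (w, wa) in space if w in B}         # "B is the case"
if_A_then_B = {(w, wa) for (w, wa) in space if wa in B}   # "if A then B"
```

Because factual and conditional propositions are both just sets of elementary possibilities, the same function `prob` assigns probabilities to each, and to Boolean compounds such as `fact_B & if_A_then_B`.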

Similar definitions would allow derivation of a probability measure $\mathrm{Prob}_0$ on the set $\Omega$ of sets of worlds and probability measures $\mathrm{Prob}_1, \ldots, \mathrm{Prob}_n$ on the corresponding sets of sets of possible counteractual worlds $\Omega_1, \ldots, \Omega_n$. Each of these sets constitutes a Boolean algebra of sets of worlds and it is straightforward to establish that the corresponding measure on them satisfies the axioms of probability.

Within our multidimensional possible world model, $\mathrm{Prob}_0$ serves as a measure of the agent's degrees of belief in the facts, while each $\mathrm{Prob}_i$ measures her degrees of belief in the counterfacts under the supposition that $S_i$. Finally $\mathrm{Prob}$ encodes the agent's state of belief regarding both the facts and the counterfacts, with $\mathrm{Prob}(\langle X, Y_1, \ldots, Y_n \rangle)$ measuring the joint probability that $X$ is the case and that $Y_i$ is or would be the case if $S_i$. Now if $\mathrm{Prob}$ is to serve as a measure of joint probability in the manner suggested it must agree with its marginals, $\mathrm{Prob}_0, \ldots, \mathrm{Prob}_n$, in its probability assignments to the coarse-grained propositions falling within their domain. More precisely, the marginals must agree with $\mathrm{Prob}$ in accordance with the following condition:

$^{5}$ A probability mass function $p$ on a set $W = \{w_i\}$ is a real-valued function on $W$ satisfying: $\sum_{w_i \in W} p(w_i) = 1$


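Marginalisation: For all $X \in \Omega$ and $Y_i \in \Omega_i$:

$$\mathrm{Prob}_0(X) = \sum_{w \in X} p(\langle w, w_1, \ldots, w_n \rangle)$$

$$\mathrm{Prob}_i(Y_i) = \sum_{w_i \in Y_i} p(\langle w, \ldots, w_i, \ldots, w_n \rangle)$$

where each sum ranges over all elementary possibilities whose indicated coordinate lies in the given set.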
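Marginalisation can be illustrated with a small joint mass function. The numbers below are assumed purely for illustration; `marginal` is a hypothetical helper that sums the joint mass over all tuples sharing a given coordinate.

```python
from collections import defaultdict
from fractions import Fraction

# Assumed joint mass function on pairs <actual world, counteractual A-world>.
p = {
    ("w1", "w1"): Fraction(1, 4),
    ("w2", "w2"): Fraction(1, 4),
    ("w4", "w1"): Fraction(1, 8),
    ("w4", "w2"): Fraction(1, 8),
    ("w5", "w2"): Fraction(1, 4),
}

def marginal(joint, coordinate):
    """Sum the joint mass over all tuples that share the given coordinate."""
    m = defaultdict(Fraction)
    for omega, mass in joint.items():
        m[omega[coordinate]] += mass
    return dict(m)

p0 = marginal(p, 0)   # mass function over actual worlds
p1 = marginal(p, 1)   # mass function over counteractual A-worlds

def prob0(X):
    """Marginal probability of the factual proposition X."""
    return sum(p0.get(w, 0) for w in X)

def prob1(Y):
    """Marginal probability of the counterfact Y under the supposition that A."""
    return sum(p1.get(w, 0) for w in Y)
```

Here the marginals are derived from the joint function, so agreement holds by construction; the condition has bite when $p_0$ and $p_i$ are specified independently of $p$.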

Marginalisation is a basic condition of coherence on any joint probability measure of an agent's degrees of belief in both the facts and the counterfacts and we will assume throughout that it holds.

Some much stronger conditions on probabilities, which require various kinds of probabilistic independence between facts and counterfacts, will also play a role in our discussion since they are implied by expected utility theory but not by the framework we develop. To state them formally, note that the space of propositions $\Gamma$ is the cross-product of the different domains $\Omega_1, \ldots, \Omega_n$. To say that any pair of these domains, $\Omega_i$ and $\Omega_j$, is stochastically independent is to say that any pair of worlds $w_i \in \Omega_i$ and $w_j \in \Omega_j$ are probabilistically independent and hence that:

$$\mathrm{Prob}(\langle w_i, w_j \rangle) = \mathrm{Prob}(w_i) \times \mathrm{Prob}(w_j) = p_i(w_i) \times p_j(w_j)$$

When stochastic independence holds between any such pair of domains the probabilities of the propositions in each domain are independent of those in the other. We can now state two salient independence conditions corresponding to the stochastic independence of the facts from the strict counterfacts and the stochastic independence of the counterfacts under mutually exclusive suppositions.

Fact-Counterfact Independence: If $X \cap S_i = \emptyset$, then:

$$\mathrm{Prob}(\langle X, Y_i \rangle) = \mathrm{Prob}(X) \cdot \mathrm{Prob}(Y_i)$$

Counterfact Independence: If $S_i \cap S_j = \emptyset$, then:

$$\mathrm{Prob}(\langle Y_i, Y_j \rangle) = \mathrm{Prob}(Y_i) \cdot \mathrm{Prob}(Y_j)$$

Both conditions are very demanding and it is not difficult to think of counter-examples. Suppose that I know that a prize is contained in one and only one of two boxes. Then if I pick one of them and discover that there is no prize in it, I can be sure that if I had picked the other box then I would have got the prize. So what is the case, namely that the prize is not in the box I picked, completely determines what would have been the case had I picked the other one, in violation of Fact-Counterfact Independence.

Similarly, suppose I am about to pick one of the boxes but before opening them am told that were I to open the other box I would win the prize. I can infer immediately that if I open the box I intended then I will not win the prize. So the counterfacts under the supposition that I open one box are not independent of those under the supposition that I open the other, in violation of Counterfact Independence.

It seems clear that counterfactual reasoning does not typically satisfy these two conditions. And rationality surely does not require that they be satisfied, since that would mean that some very useful strategies we use to gather information about our world would be deemed rationally impermissible. Nonetheless, as we shall see, they are both implied by expected utility theory (but not by Jeffrey's theory). We take it that a good theory of practical rationality should, if possible, avoid such implausible epistemic implications. Hence, this result casts doubt on the claim that expected utility theory is our best theory of practical rationality.

As the above examples suggest, Fact-Counterfact Independence and Counterfact Independence are closely related. But they are not equivalent. In fact, in the presence of Centering, Fact-Counterfact Independence implies Counterfact Independence, but the latter only implies the former in the presence of a further condition, namely:

Supposition Independence: $\mathrm{Prob}(\langle S_i, Y_i \rangle) = \mathrm{Prob}(S_i) \cdot \mathrm{Prob}(Y_i)$

Supposition Independence says that the probability that $Y_i$ is or would be the case on the supposition that $S_i$ is independent of whether $S_i$ is true or not. It is much more compelling than the other two independence conditions and is arguably the characteristic property of evidential supposition. In this context, however, its main significance lies in the following claim, which we prove in the appendix as Theorem 10.

Probability Equivalence Theorem: Assume Centering. Then Fact-Counterfact Independence is equivalent to the conjunction of Supposition Independence and Counterfact Independence.
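Returning to the two-box example, the failure of Fact-Counterfact Independence can be checked numerically. The encoding below is a hypothetical one of our own choosing: a world is a pair (box picked, box holding the prize), I actually pick box 1, the supposition is that I pick box 2, and the two prize locations are assumed equally likely.

```python
from fractions import Fraction

# Hypothetical encoding: a world is (box picked, box holding the prize).
actual = [(1, 1), (1, 2)]      # I pick box 1
supposed = [(2, 1), (2, 2)]    # counteractual worlds under "I pick box 2"

# The prize stays put whichever box I pick, so only pairs that agree on its
# location receive positive mass (assumed equally likely).
p = {
    (w, ws): (Fraction(1, 2) if w[1] == ws[1] else Fraction(0))
    for w in actual
    for ws in supposed
}

def prob(alpha):
    """Probability of a set of elementary possibilities."""
    return sum(p[omega] for omega in alpha)

# Fact X: the box I actually picked is empty.
X = {omega for omega in p if omega[0][1] != omega[0][0]}
# Counterfact Y: had I picked the other box, I would have won the prize.
Y = {omega for omega in p if omega[1][1] == omega[1][0]}

joint = prob(X & Y)
product_of_marginals = prob(X) * prob(Y)
```

With these assumed numbers the joint probability is 1/2 while the product of the marginals is 1/4, so the fact and the strict counterfact are not independent.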

3.2 Desirability and Counterfactual Value

Beliefs about counterfactual possibilities play an important role in our reasoning about what we should do, for they are the means by which we consider the consequences of our actions. So too do our evaluative attitudes to counterfactual possibilities: for instance, through the regret we anticipate if we forego opportunities that would have led to desirable outcomes. And just as our uncertainty about what is the case can be different from our uncertainty about what would be the case if some or another condition were true, so too can our assessment of how desirable something is differ from our assessment of how desirable its truth is on the supposition of some condition or other. Even if one prefers to be served a cold beer rather than a hot chocolate tonight, the preference could be reversed under the supposition that the evening will be very cold.

To reflect this we now introduce measures of value on both the facts and the counterfacts in the same way that we introduced probability measures on both. The desirability of possible actual worlds will be represented here by a utility function $u_0$ on $W$, while the desirability of possible counteractual worlds under the supposition that $S_i$ will be represented by a utility function $u_i$ on $W_i$.$^{6}$ Finally the joint desirability of worlds will be measured by a utility function, $u$, on tuples of worlds. For example, $u(\langle w, w_1, \ldots, w_n \rangle)$ will measure the desirability that $w$ is the actual world, that $w_1$ is/would be the counteractual world on the supposition that $S_1$, ..., and that $w_n$ is/would be the counteractual world on the supposition that $S_n$. For convenience we assume that the measures are all zero-normalised in the sense that:

$$\sum_{w \in W} u_0(w) \cdot p_0(w) = \sum_{w_i \in W_i} u_i(w_i) \cdot p_i(w_i) = \sum_{\omega \in \mathcal{Z}} u(\omega) \cdot p(\omega) = 0$$

$^{6}$ A utility function is nothing more than a numerical index of some underlying order, canonically the agent's preference order over worlds.

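From the function $u$ we can determine a corresponding desirability function $\mathrm{Des}$ on all propositions by defining the desirability of any $\alpha \in \Gamma$ to be the conditional expectation of utility given $\alpha$. Formally:

$$\mathrm{Des}(\alpha) := \frac{\sum_{\omega \in \alpha} u(\omega) \cdot p(\omega)}{\mathrm{Prob}(\alpha)} \qquad (3)$$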
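Definition (3) is straightforward to compute. The sketch below uses assumed toy values on four unnamed elementary possibilities (they are not taken from the paper) and raises an error for probability-zero propositions, since the ratio in (3) is then undefined.

```python
from fractions import Fraction

# Assumed toy values on four elementary possibilities (illustration only).
p = {"o1": Fraction(1, 4), "o2": Fraction(1, 4), "o3": Fraction(1, 4), "o4": Fraction(1, 4)}
u = {"o1": 2, "o2": 0, "o3": -1, "o4": -1}   # sum of u * p is 0, so u is zero-normalised

def prob(alpha):
    """Prob(alpha): total mass of the elementary possibilities in alpha."""
    return sum(p[o] for o in alpha)

def des(alpha):
    """Des(alpha): the conditional expectation of utility given alpha."""
    if prob(alpha) == 0:
        raise ValueError("Des is undefined for probability-zero propositions")
    return sum(u[o] * p[o] for o in alpha) / prob(alpha)
```

For instance, `des({"o1", "o2"})` averages the utilities 2 and 0 with equal weight, and `des` of the whole space is 0 because of zero-normalisation.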

Similar definitions would determine desirability functions $\mathrm{Des}_0$ and $\mathrm{Des}_i$ respectively on $\Omega$ and the $\Omega_i$.

The function $\mathrm{Des}$ serves to encode within our model the agent's degrees of desire for the truth of both facts and counterfacts by measuring, for any element $\langle X, Y_1, \ldots, Y_n \rangle$, the desirability that $X$ is the case and that $Y_i$ is/would be the case if $S_i$ is/were. So informally we can speak of $\mathrm{Des}$ as a measure of joint desirability value. But caution is required since $\mathrm{Des}$ is not a measure in the strict mathematical sense and the relationship between it and the "marginals" $\mathrm{Des}_i$ and $\mathrm{Des}_0$ that is imposed by the multidimensional possible worlds structure is less strict than in the case of the probability measures.$^{7}$ It is natural, however, to treat $u$ and $\mathrm{Des}$ as extensions of $u_0$ and $\mathrm{Des}_0$ to the broader set of propositions. Then without ambiguity we can write $u(w)$ for $u_0(w)$ and, correspondingly, $\mathrm{Des}(X)$ for $\mathrm{Des}_0(X)$. On the other hand, the co-scaling of desirabilities and suppositional desirabilities requires caution and we will make no assumptions here about how this should be done.

Now what properties of desirability are entailed by these basic definitions and conditions? The most important one is that $\mathrm{Des}$ is a normalised desirability function in the formal sense of satisfying the following two characteristic axioms:

V1 (Normality): $\mathrm{Des}(\top) = 0$

V2 (Desirability): If $\alpha \cap \beta = \emptyset$, then:

$$\mathrm{Des}(\alpha \cup \beta) = \frac{\mathrm{Des}(\alpha) \cdot \mathrm{Prob}(\alpha) + \mathrm{Des}(\beta) \cdot \mathrm{Prob}(\beta)}{\mathrm{Prob}(\alpha \cup \beta)}$$

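$^{7}$ The zero-normalisation of the desirability functions ensures that $\mathrm{Des}_i(W_i) = \mathrm{Des}(\langle W, W_i \rangle) = \mathrm{Des}_0(W) = 0$, but since the unit of the desirability functions is not specified (in contrast to probability measures), the structure only requires that $\mathrm{Des}_i$ and $\mathrm{Des}_0$ be linear transforms of $\mathrm{Des}$. See Theorem 5 in the appendix for a proof of this claim.

To see this let $\alpha$ and $\beta$ be two disjoint propositions. Then in virtue of the zero-normalisation of the utility $u$ on worlds:

$$\mathrm{Des}(\langle W, W_1, \ldots, W_n \rangle) = \sum_{\omega \in \mathcal{Z}} u(\omega) \cdot p(\omega) = 0$$

in accordance with V1. And since $\alpha \cap \beta = \emptyset$:

$$\begin{aligned}
\mathrm{Des}(\alpha \cup \beta) &= \frac{\sum_{\omega \in \alpha \cup \beta} u(\omega) \cdot p(\omega)}{\mathrm{Prob}(\alpha \cup \beta)} \\
&= \frac{\sum_{\omega \in \alpha} u(\omega) \cdot p(\omega)}{\mathrm{Prob}(\alpha \cup \beta)} + \frac{\sum_{\omega \in \beta} u(\omega) \cdot p(\omega)}{\mathrm{Prob}(\alpha \cup \beta)} \\
&= \frac{\mathrm{Des}(\alpha) \cdot \mathrm{Prob}(\alpha) + \mathrm{Des}(\beta) \cdot \mathrm{Prob}(\beta)}{\mathrm{Prob}(\alpha \cup \beta)}
\end{aligned}$$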

in accordance with V2, the axiom of desirability.

This concludes our demonstration of the possibility of extending Jeffrey's decision theory to counterfactuals. The final issue that we need to address is the conditions under which an agent's preferences can be represented by the kind of value function constructed here. In other words, what conditions must they satisfy if they are to be representable in terms of desirability maximisation? The answer is contained in the representation theorem for Jeffrey's decision theory proved by Ethan Bolker [Bolker, 1966]. Bolker imposes two main conditions on preferences in addition to the standard requirement that they be continuous, complete and transitive. To state them in a form appropriate to our discussion, let $\succsim$ be a complete, transitive and continuous relation on a Boolean algebra of propositions (construed as sets of tuples of worlds) and let $\sim$ and $\succ$ be the corresponding indifference and strict preference relations on propositions. Then Bolker postulates:

Averaging: If $\alpha \cap \beta = \emptyset$, then $\alpha \succsim (\alpha \cup \beta) \succsim \beta \Leftrightarrow \alpha \succsim \beta$

Impartiality: Suppose $\alpha \sim \beta$ and that for some $\gamma \nsim \alpha, \beta$ such that $\alpha \cap \gamma = \emptyset$ and $\beta \cap \gamma = \emptyset$, it is the case that $\alpha \cup \gamma \sim \beta \cup \gamma$. Then for all such $\gamma$, $\alpha \cup \gamma \sim \beta \cup \gamma$.
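That a desirability function defined as a conditional expectation of utility satisfies V2, and hence ranks a disjunction of disjoint propositions between its disjuncts as Averaging requires, can be checked on a toy case. The values below are assumed for illustration only.

```python
from fractions import Fraction

# Assumed toy values; u is zero-normalised with respect to p.
p = {"o1": Fraction(1, 4), "o2": Fraction(1, 4), "o3": Fraction(1, 4), "o4": Fraction(1, 4)}
u = {"o1": 3, "o2": 1, "o3": -2, "o4": -2}

def prob(alpha):
    """Total probability mass of the proposition alpha."""
    return sum(p[o] for o in alpha)

def des(alpha):
    """Conditional expectation of utility given alpha."""
    return sum(u[o] * p[o] for o in alpha) / prob(alpha)

alpha, beta = {"o1"}, {"o2"}          # two disjoint propositions
union = alpha | beta

# Right-hand side of V2 for the disjoint pair alpha, beta.
v2_rhs = (des(alpha) * prob(alpha) + des(beta) * prob(beta)) / prob(union)

# Averaging: the disjunction is ranked between its disjuncts.
averaging_holds = des(alpha) >= des(union) >= des(beta)
```

In this case `des(alpha)` is 3, `des(beta)` is 1 and the disjunction receives the probability-weighted average 2.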

The axiom of Averaging is the main rationality constraint on preference required for desirability maximisation and was implicitly assumed in our construction of a value function on counterfactual propositions. The essential idea that motivates it is that no proposition can be better (worse) than its best (worst) realisation. The proposition that $\alpha \cup \beta$ is consistent with it being the case that $\alpha$ and with it being the case that $\beta$, but not both if $\alpha$ and $\beta$ are mutually exclusive. Suppose $\alpha$ is preferred to $\beta$. Then at worst it being the case that $\alpha \cup \beta$ means that $\beta$ and, at best, that $\alpha$. So the desirability one attaches to $\alpha \cup \beta$ should lie between that of $\alpha$ and $\beta$.

Impartiality, on the other hand, is a rationality constraint on the relation between preference and belief. It says that we can test for the equiprobability of any two co-ranked propositions $\alpha$ and $\beta$ by taking a third proposition $\gamma$ that is inconsistent with both and checking to see whether $\alpha \cup \gamma$ and $\beta \cup \gamma$ are ranked together. For suppose that the probability of $\alpha$ was in fact greater than that of $\beta$. Then it would be less likely that $\gamma$ given that $\alpha \cup \gamma$ than it would be that $\gamma$ given that $\beta \cup \gamma$. And so $\alpha \cup \gamma$ would be either a less or a more attractive proposition than $\beta \cup \gamma$ depending on whether $\gamma \succ \alpha, \beta$ or $\alpha, \beta \succ \gamma$. But if it is established that the probabilities of $\alpha$ and $\beta$ are the same then it should be the case for all $\gamma$ inconsistent with both $\alpha$ and $\beta$ that $\alpha \cup \gamma \sim \beta \cup \gamma$.


Let us say that a pair of desirability and probability functions, $\mathrm{Des}$ and $\mathrm{Prob}$, jointly represent a preference relation $\succsim$ just in case for all $\alpha$ and $\beta$ in the domain of $\succsim$:

$$\alpha \succsim \beta \Leftrightarrow \mathrm{Des}(\alpha) \geq \mathrm{Des}(\beta)$$

In this case we say that the pair $(\mathrm{Prob}, \mathrm{Des})$ constitutes a Jeffrey representation of the preference relation $\succsim$. What Bolker proved was that, given some technical conditions on the set of propositions (specifically that they constitute a complete, atomless Boolean algebra) and on the preference relation $\succsim$ (specifically that it be a continuous weak order), satisfaction of the axioms of Averaging and Impartiality is necessary and sufficient for the preference relation to be desirability maximising. More formally:

Theorem 1 [Bolker, 1966] Let $\langle \Gamma, \models, \bot \rangle$ be a complete, atomless Boolean algebra of propositions with zero element $\bot$. Let $\succsim$ be a complete, transitive and continuous relation on $\Gamma - \{\bot\}$. Then there exists a pair of desirability and probability functions, $\mathrm{Des}$ and $\mathrm{Prob}$, respectively on $\Gamma - \{\bot\}$ and $\Gamma$, that are a Jeffrey representation of $\succsim$ iff $\succsim$ satisfies Averaging and Impartiality.

4 Counterfactual-Dependent Preferences

Recall that Allais' and Diamond's preferences cannot be represented as maximising the value of an EU function because the EU equation implies that the value of an outcome in state $S_i$ is desirabilistically independent of any outcome in state $S_j$ that is incompatible with $S_i$, which in turn implies that the value of what actually occurs never depends on what merely could have been. But for people with Allais' preferences, the desirability of receiving nothing is not independent of whether or not one could have chosen a risk-free alternative. Similarly, for people with preferences like Diamond's, the desirability of either patient not receiving the kidney is not independent of what would have occurred had some random event turned out differently. So both Allais' and Diamond's preferences, on this interpretation, are dependent on the truth of counterfactuals. Moreover, the part that causes the violation of expected utility theory can in both cases be formalised as a relationship between a proposition and a set of worlds that are strictly counteractual.

To make the above claim more precise let us look at Diamond's preference first and suppose that Diamond wants to use a coin toss to decide who receives the kidney. Let $\mathrm{A}$ be the set of worlds where the coin comes heads up and $\bar{\mathrm{A}}$ the set of worlds where the coin comes tails up. Let $\mathrm{B}$ be the set of worlds where Bob receives the kidney and $\bar{\mathrm{B}}$ the set of worlds where Ann receives the kidney. We have thus made two simplifying assumptions already. Firstly, it might seem more natural to let $\mathrm{A}$ ($\bar{\mathrm{A}}$) be the set of worlds where the coin comes heads (tails) up if tossed. But nothing is lost, we believe, by this simplification. Secondly, we have limited our attention to situations where either Ann or Bob receives the kidney. But what is distinctive about Diamond's preference is what it has to say about situations where a number of individuals have an equal claim on an indivisible good that some but not all of them get. (Any kind of welfarism, for instance, condemns a situation where none of the needing patients receives the kidney.) Hence, since we want to focus on the core of this preference, it seems justifiable to limit our attention to situations where one of Ann and Bob receives the kidney.

The part of Diamond's preference that leads to violation of expected utility theory can then be formulated thus:

$$\langle \mathrm{A} \cap \mathrm{B}, \bar{\mathrm{B}}_{\bar{A}} \rangle \succ \langle \mathrm{A} \cap \mathrm{B}, \mathrm{B}_{\bar{A}} \rangle \qquad (4)$$

In other words, Diamond prefers the proposition that the coin comes heads up and Bob receives the kidney but Ann would have gotten it had tails come up, to the proposition that the coin comes heads up and Bob receives the kidney and would also have gotten it had the coin come tails up.

Let us then turn to Allais' preferences and let $\mathrm{A}$ represent the set of worlds where Allais chooses the risky option (which will be $L_1$ or $L_3$ depending on the choice situation) and $\mathrm{B}$ the set of worlds where Allais is not guaranteed to win anything. Unlike when representing Diamond's preference, we need a third (basic) set of worlds to represent Allais' preferences, since the worlds where Allais is not guaranteed to win anything are not necessarily the same as the worlds where Allais wins nothing. But it is relative to a situation where Allais has won nothing that the fact that he could have chosen a risk-free alternative makes a difference. Let $\mathrm{D}$ denote the set of worlds where Allais wins nothing. Then the preference that causes Allais to violate expected utility theory can be represented thus:

$$\langle \mathrm{A} \cap \mathrm{B} \cap \mathrm{D}, \mathrm{B}_{\bar{A}} \rangle \succ \langle \mathrm{A} \cap \mathrm{B} \cap \mathrm{D}, \bar{\mathrm{B}}_{\bar{A}} \rangle \qquad (5)$$

In other words, according to Allais, winning nothing after having taken a risky choice is made worse when it is true that had he chosen differently he would definitely have won something.

4.1 Preference Actualism and Desirability Maximisation

We have seen that both Diamond's and Allais' preferences exhibit a non-trivial sensitivity to counterfactual states of affairs that is manifested in the violation of a condition that we will call Preference Actualism: the requirement that preferences for propositions be independent of the strict counterfacts. Formally:

Preference Actualism: For all sets of worlds $\mathrm{A}$, $\mathrm{B}$, $\mathrm{C}$ such that $\mathrm{C} \cap \mathrm{A} = \emptyset$:

$$\langle \mathrm{C}, \mathrm{B}_A \rangle \sim \langle \mathrm{C}, \bar{\mathrm{B}}_A \rangle$$

Preference Actualism is of course just a version of the doctrine of ethical actualism that was informally introduced earlier. As we mentioned then, and will explain more precisely in section 5, the separability of preferences is not sufficient for their satisfying Preference Actualism. An agent may regard the desirability of the counterfacts to be independent of the facts without thinking that the counterfacts do not matter. In the Diamond example such an agent might have preferences $L_B \succ L \succ L_A$, in accordance with separability, but contrary to Preference Actualism not be indifferent between $L$ and $L_A$, conditional on $E$ being the case, perhaps because they value the two relevant strict counterfacts (that Bob or Ann would have got it if $E$ had not been the case) differently but positively.

In the appendix, we prove (as Theorem 19) that preferences that violate Preference Actualism cannot be represented as maximising expected utility. Since a preference might violate Preference Actualism without violating separability, this result does not simply follow from the fact that separability is a necessary condition for expected utility maximisation. However, given certain assumptions that are either implicitly or explicitly part of standard formulations of expected utility theory such as that of Savage [Savage, 1954] and which do seem to be satisfied in Allais' and Diamond's examples (in particular, Centering and an assumption about the probabilistic independence of counterfacts under disjoint suppositions), Preference Actualism does imply separability. Hence, given the background of Savage's framework, Allais' and Diamond's violation of Preference Actualism can be seen as explaining why they violate separability.

While expected utility maximisation requires adherence to ethical actualism, it is perfectly possible for preferences to satisfy Bolker's axioms but violate Preference Actualism. To show this we work again with our simple model based on the set $W = \{w_1, w_2, w_3, w_4, w_5\}$ of five possible worlds and the corresponding set of its subsets, including the events $\mathrm{A} = \{w_1, w_2, w_3\}$, $\bar{\mathrm{A}} = \{w_4, w_5\}$, $\mathrm{B} = \{w_1, w_2, w_4\}$ and $\bar{\mathrm{B}} = \{w_3, w_5\}$. For present purposes we only need to focus on one supposition, namely the supposition that $A$ is false. Then the set of elementary possibilities is given by $\mathcal{W} = \{w_1, w_2, w_3, w_4, w_5\} \times \{w_4, w_5\}$ and, in particular, $\langle \mathrm{A} \cap \mathrm{B}, \bar{\mathrm{B}}_{\bar{A}} \rangle = \{\langle w_1, w_5 \rangle, \langle w_2, w_5 \rangle\}$ and $\langle \mathrm{A} \cap \mathrm{B}, \mathrm{B}_{\bar{A}} \rangle = \{\langle w_1, w_4 \rangle, \langle w_2, w_4 \rangle\}$.

To induce the preferences required, we define a pair of probability and utility mass functions, $p$ and $u$, on this set of world pairs, by setting $p(\langle w_4, w_5 \rangle) = p(\langle w_5, w_4 \rangle) = 0$ and assigning the values displayed in Table 2 to the remaining possibilities.

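World Pairs                   Probability   Utility
$\langle w_1, w_4 \rangle$    0.125         -1
$\langle w_1, w_5 \rangle$    0.125          1
$\langle w_2, w_4 \rangle$    0.125         -1
$\langle w_2, w_5 \rangle$    0.125          1
$\langle w_3, w_4 \rangle$    0.125         -1
$\langle w_3, w_5 \rangle$    0.125          1
$\langle w_4, w_4 \rangle$    0.125          0
$\langle w_5, w_5 \rangle$    0.125          0

Table 2: Probability-Utility Values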
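The desirabilities induced by Table 2 can be computed directly. The sketch below is illustrative: the helper names are ours, the two probability-zero pairs are simply omitted, and desirability is computed as the conditional expectation of utility.

```python
from fractions import Fraction

# Values from Table 2; the pairs <w4, w5> and <w5, w4> have probability 0 and are omitted.
eighth = Fraction(1, 8)
table = {
    ("w1", "w4"): (eighth, -1), ("w1", "w5"): (eighth, 1),
    ("w2", "w4"): (eighth, -1), ("w2", "w5"): (eighth, 1),
    ("w3", "w4"): (eighth, -1), ("w3", "w5"): (eighth, 1),
    ("w4", "w4"): (eighth, 0),  ("w5", "w5"): (eighth, 0),
}

A = {"w1", "w2", "w3"}
B = {"w1", "w2", "w4"}
not_B = {"w3", "w5"}

def proposition(fact, counterfact):
    """World pairs whose actual world is in `fact` and whose counteractual world is in `counterfact`."""
    return {(w, wc) for (w, wc) in table if w in fact and wc in counterfact}

def prob(alpha):
    """Total probability of the world pairs in alpha."""
    return sum(table[o][0] for o in alpha)

def des(alpha):
    """Conditional expectation of utility given alpha."""
    return sum(table[o][0] * table[o][1] for o in alpha) / prob(alpha)

with_not_B_counterfact = proposition(A & B, not_B)   # {<w1, w5>, <w2, w5>}
with_B_counterfact = proposition(A & B, B)           # {<w1, w4>, <w2, w4>}
```

The two propositions agree on the facts (both say that $\mathrm{A} \cap \mathrm{B}$ is the case) and differ only in the strict counterfact, yet they receive desirabilities 1 and -1 respectively.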

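Let $\mathrm{Prob}$ and $\mathrm{Des}$ be a pair of probability and desirability functions on $\wp(\mathcal{W})$ constructed from $p$ and $u$ in the manner previously outlined by application of the standard axioms of probability and desirability. It is easy to see that the preferences induced by $\mathrm{Des}$ will violate Preference Actualism. In particular they will be such that:

$$\langle \mathrm{A} \cap \mathrm{B}, \bar{\mathrm{B}}_{\bar{A}} \rangle \succ \langle \mathrm{A} \cap \mathrm{B}, \mathrm{B}_{\bar{A}} \rangle \qquad (6)$$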

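$$\langle \mathrm{A} \cap \bar{\mathrm{B}}, \mathrm{B}_{\bar{A}} \rangle \prec \langle \mathrm{A} \cap \bar{\mathrm{B}}, \bar{\mathrm{B}}_{\bar{A}} \rangle \qquad (7)$$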

But by construction they satisfy the standard preference axioms of Jeffrey's decision theory. So it follows that preferences violating Preference Actualism, although not representable as utility maximising, may nonetheless be desirability maximising.

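4.2 Modelling Allais' and Diamond's preferences

Strictly speaking, equation 4 does not quite represent Diamond's preference in full. Recall that Diamond's preference consists in preferring a lottery (say a coin toss) that results in either Bob or Ann receiving a kidney (alternative $L$) to giving the kidney to Ann without using a fair lottery (alternative $L_A$) and also to giving the kidney to Bob without using a fair lottery (alternative $L_B$). This is how Diamond might evaluate the "constant" alternatives:

$$\mathrm{Des}(L_A) = \mathrm{Des}(\langle \mathrm{A} \cap \bar{\mathrm{B}}, \bar{\mathrm{B}}_{\bar{A}} \rangle)$$

$$\mathrm{Des}(L_B) = \mathrm{Des}(\langle \mathrm{A} \cap \mathrm{B}, \mathrm{B}_{\bar{A}} \rangle)$$

But since the lottery can turn out in more than one way, Diamond must, if he is to satisfy Jeffrey's equation, evaluate its desirability as a weighted sum of the ways in which it might turn out, for instance:

$$\mathrm{Des}(L) = 0.5\,\mathrm{Des}(\langle \mathrm{A} \cap \mathrm{B}, \bar{\mathrm{B}}_{\bar{A}} \rangle) + 0.5\,\mathrm{Des}(\langle \mathrm{A} \cap \bar{\mathrm{B}}, \mathrm{B}_{\bar{A}} \rangle)$$

assuming that he believes that a fair coin is (properly) tossed to decide who receives the kidney.

There is thus a Jeffrey-desirability function representing Diamond's preference as long as there is a function $\mathrm{Des}$ that simultaneously satisfies:

$$\mathrm{Des}(\langle \mathrm{A} \cap \mathrm{B}, \mathrm{B}_{\bar{A}} \rangle) < 0.5\,\mathrm{Des}(\langle \mathrm{A} \cap \mathrm{B}, \bar{\mathrm{B}}_{\bar{A}} \rangle) + 0.5\,\mathrm{Des}(\langle \mathrm{A} \cap \bar{\mathrm{B}}, \mathrm{B}_{\bar{A}} \rangle)$$

$$\mathrm{Des}(\langle \mathrm{A} \cap \bar{\mathrm{B}}, \bar{\mathrm{B}}_{\bar{A}} \rangle) < 0.5\,\mathrm{Des}(\langle \mathrm{A} \cap \mathrm{B}, \bar{\mathrm{B}}_{\bar{A}} \rangle) + 0.5\,\mathrm{Des}(\langle \mathrm{A} \cap \bar{\mathrm{B}}, \mathrm{B}_{\bar{A}} \rangle)$$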

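Since what motivates Diamond's preference is his concern for fairness, he is (let us suppose) indifferent between Bob and Ann actually receiving the kidney. Moreover, the value generated by having used the lottery, or the disvalue generated by not having used the lottery, is according to Diamond independent of whether Ann or Bob actually receives the kidney. Hence, for Diamond:

$$0.5\,\mathrm{Des}(\langle \mathrm{A} \cap \mathrm{B}, \bar{\mathrm{B}}_{\bar{A}} \rangle) + 0.5\,\mathrm{Des}(\langle \mathrm{A} \cap \bar{\mathrm{B}}, \mathrm{B}_{\bar{A}} \rangle) = \mathrm{Des}(\langle \mathrm{A} \cap \mathrm{B}, \bar{\mathrm{B}}_{\bar{A}} \rangle) = \mathrm{Des}(\langle \mathrm{A} \cap \bar{\mathrm{B}}, \mathrm{B}_{\bar{A}} \rangle)$$

$$\mathrm{Des}(\langle \mathrm{A} \cap \mathrm{B}, \mathrm{B}_{\bar{A}} \rangle) = \mathrm{Des}(\langle \mathrm{A} \cap \bar{\mathrm{B}}, \bar{\mathrm{B}}_{\bar{A}} \rangle)$$

Therefore, to be able to represent Diamond's preference as maximising Jeffrey-desirability, all that is required is that there is a Jeffrey-desirability function such that:

$$\mathrm{Des}(\langle \mathrm{A} \cap \mathrm{B}, \mathrm{B}_{\bar{A}} \rangle) < \mathrm{Des}(\langle \mathrm{A} \cap \mathrm{B}, \bar{\mathrm{B}}_{\bar{A}} \rangle)$$

and in the last section we saw that such functions exist.

The same can be said for Allais' preference, namely that it is only partly captured by equation 5. But again, it is not hard to show that in Allais' case all that needs to be established is that there is a desirability function such that $\mathrm{Des}(\langle \mathrm{A} \cap \mathrm{B} \cap \mathrm{D}, \bar{\mathrm{B}}_{\bar{A}} \rangle) < \mathrm{Des}(\langle \mathrm{A} \cap \mathrm{B} \cap \mathrm{D}, \mathrm{B}_{\bar{A}} \rangle)$. And this is precisely what we did in the last section.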

5 Ethical Actualism and Separability

We have argued that there are rational patterns of preference that are desirability maximising but not expected utility maximising. In this last section we turn to the question of what additional assumptions are needed for an agent's preferences to be representable, not just by a desirability function, but by a desirability function that takes the form of an expected utility. We do so with the aim of showing just how strong the additional constraints imposed on preferences by these assumptions are and hence to add plausibility to our claim that rationality does not require expected utility maximisation.

Let us begin by defining more carefully what it means for a desirability function to be an expected utility. Recall that acts are modelled in our framework by propositions of the form $\langle Y_1, \ldots, Y_n \rangle$, where each $Y_i$ is the consequence of choosing the action in question in the event that $S_i$. An expected utility representation of a preference relation is characterised by a particular form that the desirability of such propositions takes, namely that their desirabilities are probability-weighted averages of the desirabilities of the $Y_i$. More exactly:

Expected Utility: A desirability function $\mathrm{Des}$ defined on a suppositional algebra of propositions is an expected utility on this algebra iff:

$$\mathrm{Des}(\langle Y_1, \ldots, Y_n \rangle) = \sum_{i=1}^{n} \mathrm{Des}(Y_i \mid S_i) \cdot \mathrm{Prob}(S_i)$$

It should be noted that this definition of an expected utility is somewhat more general than the usual one in that it allows that the desirabilities of consequences be dependent on the state of the world in which they are realised. In the event that state-independence holds, $Des(Y_i \mid S_i) = Des(Y_i)$. Then if we let act $f$ be the proposition $\langle Y_1, \ldots, Y_n \rangle$ and $f(S_i) = Y_i$, we obtain the familiar Savage formulation of expected utility theory:

\[ Des(f) = \sum_{i=1}^{n} Des(f(S_i)) \cdot Prob(S_i) \]
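The two forms of the definition can be illustrated numerically. The following is a minimal sketch, assuming a finite set of states; the state names, probabilities and desirabilities are hypothetical and are not taken from the paper.

```python
from fractions import Fraction as F

def expected_utility(cond_des, prob):
    """Return sum_i Des(Y_i | S_i) * Prob(S_i) over a finite set of states."""
    assert sum(prob.values()) == 1, "state probabilities must sum to one"
    return sum(cond_des[s] * prob[s] for s in prob)

# Hypothetical two-state example.
prob = {"S1": F(1, 4), "S2": F(3, 4)}

# State-dependent form: Des(Y_i | S_i) is given directly for each state.
cond_des = {"S1": F(8), "S2": F(-2)}
eu_state_dependent = expected_utility(cond_des, prob)

# Savage form: consequences carry state-independent desirabilities Des(Y_i),
# and an act f maps each state to a consequence.
des = {"win": F(10), "lose": F(0)}
f = {"S1": "win", "S2": "lose"}
eu_savage = expected_utility({s: des[f[s]] for s in prob}, prob)
```

Exact fractions are used so that the weighted averages can be compared without rounding error.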

We now show that a desirability function is an EU representation of a preference relation just in case it satisfies both a separability condition and a condition of ethical actualism. Let us discuss each of these conditions in turn in order to make this claim more precise.

5.1 Additive Separability and Desirabilistic Independence

We have noted at various points that expected utility theory implies that the agent's preferences are separable, or that they are representable by an additively separable utility function. Our first task is to make precise what this requirement amounts to in the framework in which we are working. Intuitively, two sets of propositions are separable from the point of view of some agent if their preferences for the members of one of the sets are independent of the truth or falsity of the members of the other set. If we consider not the preferences but the desirabilities that represent them, this translates into the requirement that the desirability of any member of one set is independent of the truth of any proposition in the other.

In this context, the sets of propositions that are relevant are the sets of counterfactuals under disjoint suppositions. And the form of separability that is required by expected utility theory can be rendered as the claim that the desirability that any $Y_i$ would be the case if $S_i$ were true is independent of what would be the case if any supposition inconsistent with $S_i$ were true. More formally, given a set of disjoint suppositions $\{S_i\}$ and a desirability function $Des$, it must be the case that:

\[ Des\Big(Y_{i^*} \,\Big|\, \bigcap_{i \neq i^*} Y_i\Big) = Des(Y_{i^*}) \]

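Then it follows from the definition of conditional desirability$^8$ that:

\[
\begin{aligned}
Des(Y_1, \ldots, Y_n) &= Des(Y_1 \mid Y_2, \ldots, Y_n) + Des(Y_2, \ldots, Y_n) \\
&= Des(Y_1) + Des(Y_2 \mid Y_3, \ldots, Y_n) + Des(Y_3, \ldots, Y_n) \\
&= Des(Y_1) + Des(Y_2) + \ldots \\
&= \sum_{i=1}^{n} Des(Y_i)
\end{aligned}
\]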

When a numerical representation of preference takes this form then it is said to be additive or additively separable. So we can conclude that a desirability measure is additively separable over the $S_i$ iff the counterfacts under any supposition are desirabilistically independent of those under any other supposition disjoint from it.

In the light of this we can state as follows the separability condition required for expected utility:

Counterfact Separability: If $\{S_i\}_{i=1}^{n}$ is a set of $n$ disjoint suppositions, then:

\[ Des(Y_1, \ldots, Y_n) = \sum_{i=1}^{n} Des(Y_i) \]

Just how strong a condition this is can be brought out by noting that if a desirability function is additively separable then the corresponding probability function is multiplicative, i.e.

\[ Prob(Y_1, \ldots, Y_n) = \prod_{i=1}^{n} Prob(Y_i) \]

This fact is proven in the appendix as Theorem 9. Intuitively this is to be explained by the fact that the counterfacts cannot be desirabilistically independent unless knowing that one of the counterfacts holds is irrelevant to how likely the other counterfacts are to be true (which, as we have seen, is not generally the case). In other words, desirabilistic independence requires probabilistic independence.
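To see what the multiplicative condition rules out, the following minimal sketch checks it for two hypothetical joint distributions over what would happen under two disjoint suppositions (say, dining at one restaurant or at another). The distributions and function names are illustrative assumptions, not part of the paper's formal framework.

```python
from fractions import Fraction as F
from itertools import product

def marginals(joint):
    """Marginal distributions of the first and second counterfact."""
    m1, m2 = {}, {}
    for (y1, y2), p in joint.items():
        m1[y1] = m1.get(y1, 0) + p
        m2[y2] = m2.get(y2, 0) + p
    return m1, m2

def is_multiplicative(joint):
    """True iff Prob(y1, y2) = Prob(y1) * Prob(y2) for every pair."""
    m1, m2 = marginals(joint)
    return all(joint.get((y1, y2), 0) == m1[y1] * m2[y2]
               for y1, y2 in product(m1, m2))

# Counterfacts that carry no information about each other.
independent = {(a, b): F(1, 4) for a, b in product(["good", "poor"], repeat=2)}

# Counterfacts that are perfectly correlated, e.g. because both restaurants
# share an owner: learning one settles the other.
correlated = {("good", "good"): F(1, 2), ("poor", "poor"): F(1, 2)}

independent_ok = is_multiplicative(independent)
correlated_ok = is_multiplicative(correlated)
```

In the correlated case the product of the marginals assigns probability 1/4 to a pair that the joint distribution rules out, so the multiplicative condition fails.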

5.2 World Actualism

An additive desirability function is not yet an expected utility. An expected utility is an additive desirability that satisfies a version of a principle common to many decision theories and that we have termed ethical actualism. The basic intuition behind this principle is that only the actual world matters, so that the desirability of combinations of facts and counterfacts should depend only on the desirability of the facts. In this section we consider several formulations of this principle and clarify its relationship to separability.

8 See the appendix for a statement of the definition of conditional desirability.

One way of expressing ethical actualism more formally is as follows:

World Actualism: $u(\langle w, w_1, \ldots, w_n \rangle) = u(w)$

World Actualism says that the desirability that $w$ is the actual world and that the $w_i$ worlds would be the case if the $A_i$ were, depends only on the desirability of $w$. In other words, once it has been established what world is the actual one, then it should be a matter of indifference what the counter-actual worlds are. The plausibility of World Actualism rests on the possibility of giving a complete description of everything that matters. If we were able to do so, then any way in which the counterfacts mattered to us in the actual world could be registered in the description we give of that world. It is not that the counterfacts themselves must be written into the descriptions of worlds (this would lead to contradiction when the counterfacts specified in the description of a world differed from those in counter-actual worlds), but that any way in which these counterfacts bear on our evaluation of the facts must be specified. For instance, suppose that the desirability of dining at home is sensitive to how good a meal one would have had, had one dined out at the local restaurant, because the fact that one would have had a better meal at the restaurant causes one to regret eating at home and the fact that one would have had a worse meal makes one appreciate the home cooked meal all the more. Then these facts (the regret or the appreciation one experiences in the light of the counterfacts) must be built into the description of the actual world if World Actualism is to obtain.

The problem with the condition of World Actualism is therefore that it is partition-dependent: it might hold for one specification of the possible worlds, but not for a model in which they are specified more coarsely. So we should not think of it as a condition that applies to every model of counterfactual possibility, but rather as a methodological principle: one which requires contingencies to be sufficiently finely individuated for World Actualism to hold within the model. This principle is one that many decision theorists seem to endorse. For instance, Broome [Broome, 1991] recommends just such a strategy of fine individuation as a way of avoiding the putative counterexamples to the separability of rational preference of Allais and Diamond. In a nutshell his claim is that if there is some property of the outcomes of a decision that makes it rational to value an outcome differently depending on whether it has the property or not, then the outcomes should be individuated in accordance with the property.

Contrary to what appears to be a common view, however, imposing World Actualism on a model by appropriate individuation of prospects does not suffice to ensure the additive separability of desirabilities. For, as we have already seen, additive separability requires that counterfacts under mutually exclusive suppositions be probabilistically independent. But World Actualism alone does not imply anything about the probabilistic relations between the counterfacts. So the question of whether rationality requires expected utility maximisation is not settled by the question of whether World Actualism is a reasonable condition.


5.3 Prospect Actualism

A much stronger and partition-independent version of ethical actualism (the quantitative analogue of the condition that we termed Preference Actualism) takes us much closer to what is required. Let $S$ be a set of suppositions and suppose that $X \cap S_i = \bot$, for every supposition $S_i \in S$. Then consider:

Prospect Actualism: $Des(\langle X, Y_1, \ldots, Y_n \rangle) = Des(X)$

Prospect Actualism says that the desirability that $X$ is the case and that the $Y_i$ would be on the contrary-to-fact supposition that $S_i$, depends only on the desirability that $X$. Or to put it slightly differently, once it is given that $X$ then it does not matter what is or would be the case under any supposition inconsistent with the truth of $X$.

Although Prospect Actualism expresses a similar idea to World Actualism, the relationship between them is quite complicated. Given Centering, Prospect Actualism implies World Actualism, but the converse is not true. In fact, Prospect Actualism only follows from World Actualism in conjunction with the assumption that the facts are stochastically independent of the strict counterfacts, a condition we previously formalised as Fact-Counterfact Independence. (This claim is proven in the appendix as Theorem 14.)

Prospect Actualism substantially constrains how we may value outcomes. Suppose for instance you have to choose between two restaurants. You go to restaurant A and are served a very poor meal. An acquaintance goes to the other restaurant and reports that they were served a very good meal. Are things worse overall than they would have been if it had been the case that you would have been served a poor meal at the other restaurant as well? The issue is not whether your judgement concerning the meal at restaurant A can depend on what the meal at restaurant B would have been like (surely it should not) but whether the prospect of having a poor meal at restaurant A when you would have had a good one at restaurant B is a worse one than that of having the poor meal at restaurant A when you would also have had a poor one at restaurant B.

In this case the issue boils down to whether the badness associated with the difference between what is the case and what might have been if some other course of action had been pursued is built into the description of the actual state of affairs: for instance, the regret you might feel about not having gone to the other restaurant. In other cases, it depends on the information contained in the description of the counterfactual circumstances. Suppose, for instance, that the acquaintance in our example reports that standards of food hygiene were very poor at the other restaurant. You know they have the same owner, so you infer that standards will also be poor at the restaurant you chose. This affects your view about the desirability of your choice. In other words, the desirability of the prospect of going to restaurant A is not independent of the supposition that had you gone to restaurant B you would have found food hygiene standards to be very poor. So Prospect Actualism will be violated whenever there are either probabilistic or desirabilistic dependencies between the facts and the strict counterfacts.

In this context the main significance of Prospect Actualism is that, jointly with the condition that the facts are probabilistically independent of the counterfacts, it is sufficient for expected utility maximisation. More formally, as we prove in the appendix as Theorem 23:

First Sufficiency Theorem: Assume Centering. If $Des$ is a desirability representation of a preference relation $\succsim$ that satisfies Fact-Counterfact Independence and Prospect Actualism, then $Des$ is an expected utility representation of $\succsim$.
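The content of the theorem can be checked by hand in a small finite model. The sketch below is an illustration under stated assumptions, not the paper's construction: the paper works with atomless algebras, whereas here there are only four hypothetical worlds, two suppositions that partition them, and elementary possibilities given by triples (actual world, world under $S_1$, world under $S_2$). Centering is imposed directly, utilities depend on the actual world only, and the counterfacts are drawn independently of which supposition is actual, which builds in Fact-Counterfact Independence. All numbers are invented for the example.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical toy model: S_1 = {a, b} and S_2 = {c, d} partition the worlds.
S = {1: {"a", "b"}, 2: {"c", "d"}}
r = {1: F(1, 3), 2: F(2, 3)}                                   # Prob(S_i)
q = {"a": F(1, 4), "b": F(3, 4), "c": F(1, 2), "d": F(1, 2)}   # chances within S_i
u = {"a": F(8), "b": F(0), "c": F(1), "d": F(-3)}              # utility of actual world

# Triples (w0, w1, w2): actual world, world under S_1, world under S_2.
# Centering forces w_i = w0 whenever w0 lies in S_i.
p = {}
for w1, w2 in product(sorted(S[1]), sorted(S[2])):
    p[(w1, w1, w2)] = r[1] * q[w1] * q[w2]   # S_1 is actual
    p[(w2, w1, w2)] = r[2] * q[w1] * q[w2]   # S_2 is actual

def prob(event):
    return sum(pt for t, pt in p.items() if event(t))

def des(event):
    # World Actualism: a triple is valued by its actual world w0 alone.
    return sum(u[t[0]] * pt for t, pt in p.items() if event(t)) / prob(event)

Y = {1: {"a"}, 2: {"c"}}   # consequences Y_i, each a subset of S_i

# Desirability of the act <Y_1, Y_2>.
lhs = des(lambda t: t[1] in Y[1] and t[2] in Y[2])

# sum_i Des(Y_i | S_i) * Prob(S_i), where Des(Y_i | S_i) is computed from the
# facts as Des(S_i & Y_i) - Des(S_i).
rhs = sum(
    (des(lambda t, i=i: t[0] in Y[i]) - des(lambda t, i=i: t[0] in S[i]))
    * prob(lambda t, i=i: t[0] in S[i])
    for i in (1, 2)
)
```

In this model both sides come to 10/3, and the utilities are normalised so that the tautology has desirability zero. A single example of course proves nothing; it only shows how the two hypotheses of the theorem combine to yield the probability-weighted form.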

5.4 Expected Utility, Separability and Ethical Actualism

We are now in a position to make precise our earlier claim that separability and ethical actualism are independent, necessary conditions for expected utility maximisation. Let's take each aspect in turn. First, as we prove in the appendix as Theorems 19 and 21, strong forms of both separability and ethical actualism are required for expected utility maximisation. More exactly:

Necessity Theorem: Assume Centering. If $Des$ is an expected utility representation of the preference relation $\succsim$, then $Des$ satisfies Counterfact Separability, Prospect Actualism, Fact-Counterfact Independence and Counterfact Independence.

The Necessity Theorem is surprisingly strong and forcefully demonstrates just how much more demanding the requirement that agents maximise expected utility is than the requirement that they maximise desirability. We consider it highly implausible that failure to satisfy all four conditions entails irrationality on the part of an agent. Hence it is doubtful that rationality requires us to maximise expected utility.

Second, as we noted earlier on, ethical actualism and separability are based on different, though consistent, intuitions. The former expresses the idea that only what actually happens matters, while the latter expresses the idea that the desirabilities of orthogonal counterfacts are independent of each other's truth. It is not difficult to see that the counterfacts can be separable even if ethical actualism is false. For instance, consider again the set of prospects displayed below and suppose that you thought that the counterfacts do matter. Specifically, suppose that were $E$ not the case then you would prefer that BOB rather than that ANN, in violation of ethical actualism. So you prefer $L_1$ to $L_A$ (in virtue of the former dominating the latter) even when you know that $E$. Nonetheless you regard the outcomes under $E$ and $\neg E$ as separable because your preference that BOB were it the case that $\neg E$ is not affected by whether it is the case that BOB or ANN if $E$. Hence $L_B \succ L_2$.

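\[
\begin{array}{l|cc}
 & E & \neg E \\
\hline
L_1 & \text{ANN} & \text{BOB} \\
L_A & \text{ANN} & \text{ANN} \\
L_B & \text{BOB} & \text{BOB} \\
L_2 & \text{BOB} & \text{ANN}
\end{array}
\]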
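One way to picture these preferences is with an additive valuation that scores each column separately. The following is a minimal sketch with hypothetical numbers (the paper assigns none); it shows preferences that are separable across $E$ and $\neg E$ while still being sensitive to what would have happened had $E$ not been the case.

```python
# Hypothetical additive valuation: one score for the outcome under E and
# one for the outcome that would obtain were E not the case.
VALUE_IF_E = {"ANN": 2, "BOB": 5}
VALUE_IF_NOT_E = {"ANN": 0, "BOB": 1}   # BOB preferred were E not the case

# Prospects from the table: (outcome if E, outcome if not-E).
PROSPECTS = {
    "L1": ("ANN", "BOB"),
    "LA": ("ANN", "ANN"),
    "LB": ("BOB", "BOB"),
    "L2": ("BOB", "ANN"),
}

def desirability(name):
    """Additively separable score of a prospect."""
    under_e, under_not_e = PROSPECTS[name]
    return VALUE_IF_E[under_e] + VALUE_IF_NOT_E[under_not_e]

# The not-E column matters (contrary to ethical actualism) ...
prefers_l1_to_la = desirability("L1") > desirability("LA")
prefers_lb_to_l2 = desirability("LB") > desirability("L2")

# ... but its contribution does not depend on the E column (separability).
gain_given_ann = desirability("L1") - desirability("LA")
gain_given_bob = desirability("LB") - desirability("L2")
```

The gain from having BOB rather than ANN under $\neg E$ is the same whichever outcome obtains under $E$, which is what separability requires here.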

The ANN/BOB example shows that satisfaction of ethical actualism is not necessary for separability. On the other hand, it might seem that ethical actualism should be sufficient for separability since if the counterfacts don't matter, then trivially they will be desirabilistically independent of one another (they won't matter whatever orthogonal counterfacts hold). But this intuition is false. Even if ethical actualism is true, the counterfacts can matter because they can be informative about what the facts are. If for instance I don't know which box contains the prize, then I will care about whether or not it is true that if I were to open one of them I would find the prize, since learning this counterfact enables me to infer where the prize is.

What this example brings out is the possibility that the counterfacts matter because of probabilistic dependencies between facts and counterfacts. So one might hypothesise that when the counterfacts are probabilistically independent of the facts, then ethical actualism should imply separability. It turns out that this is true. More precisely, provided that Centering holds, Prospect Actualism and Fact-Counterfact Independence jointly imply Counterfact Separability (we prove this in the appendix as Theorem 16).

We have already observed that separability is not sufficient for ethical actualism. But Prospect Actualism, the strong form of ethical actualism required by expected utility theory, is a consequence of separability together with the following, weaker form of ethical actualism:

Restricted Actualism: $Des(\bar{S}_i, Y_i) = Des(\bar{S}_i)$

Restricted Actualism says that it does not matter that $Y_i$ is the case under the supposition that $S_i$, given that $S_i$ is false. Or to put it slightly differently, given that $S_i$ is not the case, it is a matter of indifference what would be the case if it were. Restricted Actualism, like Prospect Actualism, is a partition-independent condition on evaluative attitudes, but it is quite a bit weaker than the latter. While Prospect Actualism clearly implies Restricted Actualism, the latter only implies Prospect Actualism when the counterfacts are probabilistically and desirabilistically independent of each other. More formally, as we prove in the appendix as Theorem 17, given Centering, Counterfact Separability and Restricted Actualism imply Prospect Actualism.

In virtue of the First Sufficiency Theorem, we can now infer a second set of sufficient conditions for a desirability function to be an expected utility, by drawing on the Probabilistic Equivalence Theorem and the fact that Counterfact Separability implies Counterfact Independence. For then it follows, as we prove in the appendix as Theorem 22, that:

Second Sufficiency Theorem: Assume Centering. If $Des$ is a Jeffrey representation of preference relation $\succsim$ that satisfies Counterfact Separability, Supposition Independence and Restricted Actualism, then $Des$ is an expected utility representation of $\succsim$.

This second set of sufficient conditions is perhaps the more illuminating of the two, since the dual dependence of expected utility theory on separability and ethical actualism is more transparent, as is the need for a distinct independence condition relating suppositions to beliefs about counterfacts under these suppositions. On the other hand, it somewhat obscures how demanding the probabilistic independence conditions are on expected utility maximisation. So to finish, let us bring our various results together into a single statement relating expected utility theory and the two pairs of conditions on desirability and probability that have been discussed.


EU Equivalence Theorem: Let $(Des, Prob)$ be a Jeffrey representation of a preference relation. Given Centering, the following are equivalent:

1. $Des$ is an expected utility

2. $Des$ satisfies Prospect Actualism and $Prob$ satisfies Fact-Counterfact Independence

3. $Des$ satisfies both Counterfact Separability and Restricted Actualism and $Prob$ satisfies Supposition Independence

5.5 Conclusion

We have seen that by extending Richard Jeffrey's decision theory to counterfactual propositions it becomes possible to represent two preference patterns that have for decades discomforted decision theorists. We have also seen that when we add the conditions necessary for an expected utility representation, we can no longer represent these intuitively rational preferences. Furthermore, the added postulates imply restrictions on the agent's beliefs and desires that have little plausibility as rationality constraints. This, we think, seriously undermines EU theory's claim to be the correct theory of practical rationality.

6 Appendix: Definitions and Proofs

6.1 Jeffrey Representations

In this first section we present some useful results relating to Jeffrey representations of preferences on Boolean algebras. Let $\langle \Omega, \models, \top, \bot \rangle$ be a complete, atomless Boolean algebra of propositions with upper bound $\top$ and lower bound $\bot$ and let $\succsim$ be a preference relation on $\Omega$. A pair of functions $(Des, Prob)$ is a Jeffrey representation of $\succsim$ just in case $Prob$ is a probability function on $\Omega$ and $Des$ a desirability function on $\Omega' = \Omega - \{\bot\}$ such that for all $\alpha, \beta \in \Omega'$, $Des(\alpha) \geq Des(\beta) \Leftrightarrow \alpha \succsim \beta$. Recall that a desirability function on $\Omega'$ is a real-valued function such that for all $\alpha, \beta \in \Omega'$:

V1 (Normality): $Des(\top) = 0$

V2 (Desirability): If $\alpha \cap \beta = \bot$, then:

\[ Des(\alpha \cup \beta) = \frac{Des(\alpha) \cdot Prob(\alpha) + Des(\beta) \cdot Prob(\beta)}{Prob(\alpha \cup \beta)} \]

Recall also the definitions of conditional probability and desirability.

Conditional Probability: If $Prob(\beta) \neq 0$:

\[ Prob(\alpha \mid \beta) := \frac{Prob(\alpha \cap \beta)}{Prob(\beta)} \]

Conditional Desirability: If $Prob(\alpha \cap \beta) \neq 0$:

\[ Des(\alpha \mid \beta) := Des(\alpha \cap \beta) - Des(\beta) \]
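These definitions can be made concrete in a finite setting. The sketch below is an illustration only: it assumes a three-world model with made-up probabilities and utilities (the paper's algebra is atomless, so no such finite model appears in it), represents propositions as sets of worlds, and takes desirability to be probability-weighted average utility.

```python
from fractions import Fraction as F

# Hypothetical finite model: world -> (probability, utility).
# Utilities are chosen so that the tautology has desirability zero (V1).
WORLDS = {"w1": (F(1, 2), F(4)), "w2": (F(1, 4), F(-2)), "w3": (F(1, 4), F(-6))}
TOP = frozenset(WORLDS)

def prob(a):
    return sum(WORLDS[w][0] for w in a)

def des(a):
    """Probability-weighted average utility of the worlds in a (prob(a) > 0)."""
    return sum(WORLDS[w][0] * WORLDS[w][1] for w in a) / prob(a)

def cond_prob(a, b):
    return prob(a & b) / prob(b)

def cond_des(a, b):
    return des(a & b) - des(b)

alpha, beta = frozenset({"w1"}), frozenset({"w2"})   # disjoint propositions

# V2: the desirability of a disjoint union is the weighted average of its parts.
v2_lhs = des(alpha | beta)
v2_rhs = (des(alpha) * prob(alpha) + des(beta) * prob(beta)) / prob(alpha | beta)
```

With these numbers `des(TOP)` is 0, both sides of V2 equal 2, and `cond_des(alpha, alpha | beta)` is 2, the amount by which learning `alpha` improves on merely knowing `alpha | beta`.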


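Lemma 2 Let $(Des, Prob)$ be a Jeffrey representation of $\succsim$. Then:

1. $Des(\alpha) \cdot Prob(\alpha) = -Des(\neg\alpha) \cdot Prob(\neg\alpha)$

2. $\dfrac{Prob(\alpha)}{Prob(\neg\alpha)} = -\dfrac{Des(\neg\alpha)}{Des(\alpha)}$

3. If $Des(\alpha \mid \beta) = Des(\alpha)$ and $Des(\neg\alpha \mid \beta) = Des(\neg\alpha)$, then $Prob(\alpha \mid \beta) = Prob(\alpha)$ and $Prob(\neg\alpha \mid \beta) = Prob(\neg\alpha)$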

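Proof. Given that $\alpha \cup \neg\alpha = \top$, it follows by the axioms of averaging and normality that:

\[ Des(\top) = Des(\alpha) \cdot Prob(\alpha) + Des(\neg\alpha) \cdot Prob(\neg\alpha) = 0 \]

Hence $Des(\alpha) \cdot Prob(\alpha) = -Des(\neg\alpha) \cdot Prob(\neg\alpha)$. But this is the case:

\[
\begin{aligned}
&\Leftrightarrow \frac{Des(\alpha) \cdot Prob(\alpha)}{Prob(\neg\alpha)} = -Des(\neg\alpha) \\
&\Leftrightarrow \frac{Prob(\alpha)}{Prob(\neg\alpha)} = -\frac{Des(\neg\alpha)}{Des(\alpha)}
\end{aligned}
\]

Assume that $Des(\alpha \mid \beta) = Des(\alpha)$ and $Des(\neg\alpha \mid \beta) = Des(\neg\alpha)$. Then by application of the above and from the fact that $Des(\cdot \mid \beta)$ is a desirability function:

\[ \frac{Prob(\alpha \mid \beta)}{Prob(\neg\alpha \mid \beta)} = -\frac{Des(\neg\alpha \mid \beta)}{Des(\alpha \mid \beta)} = -\frac{Des(\neg\alpha)}{Des(\alpha)} = \frac{Prob(\alpha)}{Prob(\neg\alpha)} \]

Hence $Prob(\alpha \mid \beta) = Prob(\alpha)$ and $Prob(\neg\alpha \mid \beta) = Prob(\neg\alpha)$, since the probabilities of $\alpha$ and $\neg\alpha$ sum to one under both $Prob$ and $Prob(\cdot \mid \beta)$.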
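The first two parts of the lemma are easy to confirm numerically. The following minimal sketch again assumes a hypothetical three-world model with desirability computed as probability-weighted average utility; the model and names are illustrative and not drawn from the paper. The last line rearranges part 1 to recover the probability of a proposition from the desirabilities of it and its negation.

```python
from fractions import Fraction as F

# Hypothetical finite model: world -> (probability, utility), normalised so
# that the tautology has desirability zero.
WORLDS = {"w1": (F(1, 2), F(4)), "w2": (F(1, 4), F(-2)), "w3": (F(1, 4), F(-6))}
TOP = frozenset(WORLDS)

def prob(a):
    return sum(WORLDS[w][0] for w in a)

def des(a):
    return sum(WORLDS[w][0] * WORLDS[w][1] for w in a) / prob(a)

alpha = frozenset({"w1", "w2"})
not_alpha = TOP - alpha

# Part 1: Des(a) * Prob(a) = -Des(not a) * Prob(not a)
part1_holds = des(alpha) * prob(alpha) == -des(not_alpha) * prob(not_alpha)

# Part 2: Prob(a) / Prob(not a) = -Des(not a) / Des(a)
part2_holds = prob(alpha) / prob(not_alpha) == -des(not_alpha) / des(alpha)

# Rearranging part 1 with Prob(not a) = 1 - Prob(a):
recovered_prob = des(not_alpha) / (des(not_alpha) - des(alpha))
```

The rearranged identity is the form in which the lemma is put to work later, where probabilities are read off from ratios of desirabilities.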

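Lemma 3 Let $(Prob, Des)$ and $(Prob^*, Des^*)$ be two pairs of probability and desirability functions. Then:

\[ Prob(\alpha) = Prob^*(\alpha) \Leftrightarrow \frac{Des(\alpha)}{Des(\neg\alpha)} = \frac{Des^*(\alpha)}{Des^*(\neg\alpha)} \]

Proof. By the axiom of desirability, $Des(\alpha) \cdot Prob(\alpha) + Des(\neg\alpha) \cdot Prob(\neg\alpha) = 0 = Des^*(\alpha) \cdot Prob^*(\alpha) + Des^*(\neg\alpha) \cdot Prob^*(\neg\alpha)$

\[
\begin{aligned}
&\Leftrightarrow \frac{Des(\alpha)}{Des(\neg\alpha)} \cdot Prob(\alpha) + 1 - Prob(\alpha) = \frac{Des^*(\alpha)}{Des^*(\neg\alpha)} \cdot Prob^*(\alpha) + 1 - Prob^*(\alpha) \\
&\Leftrightarrow \Big(\frac{Des(\alpha)}{Des(\neg\alpha)} - 1\Big) \cdot Prob(\alpha) = \Big(\frac{Des^*(\alpha)}{Des^*(\neg\alpha)} - 1\Big) \cdot Prob^*(\alpha)
\end{aligned}
\]

Hence:

\[
\begin{aligned}
Prob(\alpha) = Prob^*(\alpha) &\Leftrightarrow \frac{Des(\alpha)}{Des(\neg\alpha)} - 1 = \frac{Des^*(\alpha)}{Des^*(\neg\alpha)} - 1 \\
&\Leftrightarrow \frac{Des(\alpha)}{Des(\neg\alpha)} = \frac{Des^*(\alpha)}{Des^*(\neg\alpha)}
\end{aligned}
\]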


6.2 Suppositional Algebras

Hereafter our results pertain to Suppositional Algebras of propositions, wherethe latter are construed as sets of n-tuples of worlds. Let S = hW;;S; f;z;�ibe a suppositional algebra with W a set of possible worlds, a Boolean algebraof subsets of W, S = fSig � a set of n suppositions, f a selection functionfrom W�S to , z the set of elementary possibilities and � = }(z) the set ofall propositions. If f satis�es Centering then we say that S is a centered sup-positional algebra. Throughout we assume that the Marginalisation conditionholds. To avoid confusion we denote the proposition that Si by (Si) and the(tautologous) proposition that if Si were the case then Si would be, by hW;Sii.

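Lemma 4 Assume that $S$ is centered. Let $X \subseteq S_i$. Then $(X, Y_1, \ldots, Y_n) = (X \cap Y_i, \bigcap_{j \neq i} Y_j)$.

Proof. $(X, Y_1, \ldots, Y_n) = \{\langle w_0, w_1, \ldots, w_n \rangle : w_0 \in X \text{ and } w_j \in Y_j\}$. Since $X \subseteq S_i$, it follows from Centering that $\langle w_0, w_1, \ldots, w_n \rangle \in (X, Y_1, \ldots, Y_n)$ only if $w_i = w_0$. So:

\[
\begin{aligned}
(X, Y_1, \ldots, Y_n) &= \{\langle w_0, w_1, \ldots, w_n \rangle : w_0 \in X \cap Y_i \text{ and for all } j,\ w_j \in Y_j\} \\
&= (X \cap Y_i, Y_1, \ldots, S_i, \ldots, Y_n) \\
&= (X \cap Y_i, \textstyle\bigcap_{j \neq i} Y_j)
\end{aligned}
\]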

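Theorem 5 There exist constants $j$ and $k_i > 0$ such that, for all $X \in \Omega$ and $Y_i \in \Omega_i$:

\[ Des(X) = j \cdot Des_0(X) \]

\[ Des(W, Y_i) = k_i \cdot Des_i(Y_i) \]

Proof. If $X$ is $W$, then $Des(X) = Des_0(X) = 0$ by the axiom of normality. So for any $j$, $Des(X) = j \cdot Des_0(X)$. So assume that $X \subset W$ and let $Y$ be any proposition such that $X \cap Y = \bot$. By Marginalisation $Prob(\cdot) = Prob_0(\cdot)$ and so by Lemma 3:

\[ \frac{Des(X \cup Y)}{Des(\bar{X} \cap \bar{Y})} = \frac{Des_0(X \cup Y)}{Des_0(\bar{X} \cap \bar{Y})} \]

Let $j = \dfrac{Des(\bar{X} \cap \bar{Y})}{Des_0(\bar{X} \cap \bar{Y})}$. Then $Des(\bar{X} \cap \bar{Y}) = j \cdot Des_0(\bar{X} \cap \bar{Y})$ and $Des(X \cup Y) = j \cdot Des_0(X \cup Y)$. Hence $Des(X \cup Y) \cdot Prob(X \cup Y) = j \cdot Des_0(X \cup Y) \cdot Prob(X \cup Y)$. It then follows by the axiom of desirability that:

\[ Des(X) \cdot Prob(X) + Des(Y) \cdot Prob(Y) = j \cdot Des_0(X) \cdot Prob(X) + j \cdot Des_0(Y) \cdot Prob(Y) \]

But this can only be the case non-accidentally if $Des(X) = j \cdot Des_0(X)$. The proof that $Des(W, Y_i) = k_i \cdot Des_i(Y_i)$ is identical.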

Corollary 6 $Des(W, S_i) = 0$

Proof. By Theorem 5, $Des(W, S_i) = k_i \cdot Des_i(S_i)$. But by the axiom of normality, $Des_i(S_i) = 0$. Hence $Des(W, S_i) = 0$.


6.3 Probability Theorems

In this section we prove a number of results concerning the relation between three different conditions of probabilistic independence. Throughout let $S = \{S_i\}$ be a set of disjoint suppositions and $Y_i \subseteq S_i$. Then consider:

Supposition Independence: $Prob(S_i, Y_i) = Prob(S_i) \cdot Prob(Y_i)$

Fact-Counterfact Independence: If $X \cap S_j = \bot$, then:

\[ Prob(X, Y_j) = Prob(X) \cdot Prob(Y_j) \]

Counterfact Independence: If $S_i \cap S_j = \bot$, then:

\[ Prob(X_i, Y_j) = Prob(X_i) \cdot Prob(Y_j) \]

Theorem 7 Fact-Counterfact Independence implies Supposition Independence.

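Proof. Suppose that $X_i \subseteq S_i$. Then by Fact-Counterfact Independence, since $\bar{S}_i \cap S_i = \bot$, it follows that:

\[ Prob(\bar{S}_i, X_i) = Prob(\bar{S}_i) \cdot Prob(X_i) \]

But then, since $Prob(X_i) = Prob(S_i, X_i) + Prob(\bar{S}_i, X_i)$, it follows that $Prob(S_i, X_i) = Prob(S_i) \cdot Prob(X_i)$.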

Theorem 8 Let $X_i = X \cap S_i$ and assume Centering. Then Supposition Independence implies that $Prob(X_i) = Prob(X \mid S_i)$.

Proof. Assume Centering. Then

\[ Prob(X_i \mid S_i) = \frac{Prob(S_i, X_i)}{Prob(S_i)} = \frac{Prob(S_i \cap X)}{Prob(S_i)} = Prob(X \mid S_i) \]

But Supposition Independence implies that $Prob(X_i \mid S_i) = Prob(X_i)$. Hence $Prob(X_i) = Prob(X \mid S_i)$.

Theorem 9 Assume Counterfact Independence. Let $S_1, \ldots, S_n \in S$. Then:

\[ Prob(Y_1, \ldots, Y_n) = \prod_{i=1}^{n} Prob(Y_i) \]

Proof. We prove the claim by induction on the number $n$ of suppositions in $S$. Counterfact Independence implies that the claim is true for $n = 2$, i.e. that $Prob(Y_1, Y_2) = Prob(Y_1) \cdot Prob(Y_2)$. Assume true for $n = k$. Now:

\[
\begin{aligned}
Prob(Y_1, \ldots, Y_{k+1}) &= Prob(Y_1, \ldots, Y_k \mid Y_{k+1}) \cdot Prob(Y_{k+1}) \\
&= Prob(Y_{k+1}) \cdot \prod_{i=1}^{k} Prob(Y_i \mid Y_{k+1})
\end{aligned}
\]

in virtue of the induction hypothesis for $n = k$ and the fact that $Prob(\cdot \mid Y_{k+1})$ is a probability on the space of propositions. But by Counterfact Independence, $Prob(Y_i \mid Y_{k+1}) = Prob(Y_i)$. So $Prob(Y_1, \ldots, Y_{k+1}) = \prod_{i=1}^{k+1} Prob(Y_i)$.

Theorem 10 Assume Centering. Then Counterfact Independence and Supposition Independence are jointly equivalent to Fact-Counterfact Independence.


Proof. Assume Centering, Counterfact Independence and Supposition Independence. Suppose that $S_j \neq S_i$, $X \subseteq S_i$ and $Y \subseteq S_j$. Then by Centering and then Counterfact Independence:

\[
\begin{aligned}
Prob(X, Y_j) &= Prob(S_i \cap X, Y_j) \\
&= Prob(S_i, X_i, Y_j) \\
&= Prob(X_i, Y_j \mid S_i) \cdot Prob(S_i) \\
&= Prob(X_i \mid S_i) \cdot Prob(Y_j \mid S_i) \cdot Prob(S_i)
\end{aligned}
\]

But by Supposition Independence:

\[ Prob(Y_j \mid S_i) = Prob(Y_j \mid S_j) = Prob(Y_j) \]

Hence:

\[
\begin{aligned}
Prob(X, Y_j) &= Prob(X_i \mid S_i) \cdot Prob(Y_j) \cdot Prob(S_i) \\
&= \frac{Prob(S_i, X_i)}{Prob(S_i)} \cdot Prob(Y_j) \cdot Prob(S_i) \\
&= Prob(S_i \cap X) \cdot Prob(Y_j)
\end{aligned}
\]

in virtue of Centering. So $Prob(X, Y_j) = Prob(S_i \cap X) \cdot Prob(Y_j) = Prob(X) \cdot Prob(Y_j)$, in accordance with Fact-Counterfact Independence.

Now assume Fact-Counterfact Independence. Supposition Independence follows by Theorem 7. Now:

\[ Prob(S_i \cup S_j, X_i, Y_j) = Prob(S_i, X_i, Y_j) + Prob(S_j, X_i, Y_j) \]

But by Lemma 4, Centering implies that:

\[
\begin{aligned}
Prob(S_i, X_i, Y_j) &= Prob(S_i \cap X, Y_j) \\
Prob(S_j, X_i, Y_j) &= Prob(S_j \cap Y, X_i)
\end{aligned}
\]

And by Fact-Counterfact Independence:

\[
\begin{aligned}
Prob(S_i \cap X, Y_j) &= Prob(S_i \cap X) \cdot Prob(Y_j) \\
Prob(S_j \cap Y, X_i) &= Prob(S_j \cap Y) \cdot Prob(X_i) \\
Prob(S_i \cup S_j, X_i, Y_j) &= Prob(S_i \cup S_j) \cdot Prob(X_i, Y_j)
\end{aligned}
\]

So:

\[ Prob(X_i, Y_j) = \frac{Prob(S_i \cap X) \cdot Prob(Y_j) + Prob(S_j \cap Y) \cdot Prob(X_i)}{Prob(S_i \cup S_j)} \]

But by Theorem 8, it follows from Supposition Independence that:

\[
\begin{aligned}
Prob(Y_j) &= Prob(Y \mid S_j) \\
Prob(X_i) &= Prob(X \mid S_i)
\end{aligned}
\]

So:

\[
\begin{aligned}
Prob(X_i, Y_j) &= \frac{Prob(X \mid S_i) \cdot Prob(Y \mid S_j) \cdot Prob(S_i) + Prob(Y \mid S_j) \cdot Prob(X \mid S_i) \cdot Prob(S_j)}{Prob(S_i \cup S_j)} \\
&= Prob(Y \mid S_j) \cdot Prob(X \mid S_i) \\
&= Prob(X_i) \cdot Prob(Y_j)
\end{aligned}
\]

in accordance with Counterfact Independence. We conclude that Counterfact Independence and Supposition Independence are jointly equivalent to Fact-Counterfact Independence, given Centering.

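Corollary 11 Let $X \cap Y_i = \bot$. Assume Centering. Then Fact-Counterfact Independence implies that:

\[ Prob(X, Y_1, \ldots, Y_n) = Prob(X) \cdot \prod_{i=1}^{n} Prob(Y_i) \]

Proof. By the definition of conditional probability and Fact-Counterfact Independence:

\[
\begin{aligned}
Prob(X, Y_1, \ldots, Y_n) &= Prob(X, Y_1 \mid Y_2, \ldots, Y_n) \cdot Prob(Y_2, \ldots, Y_n) \\
&= Prob(X \mid Y_2, \ldots, Y_n) \cdot Prob(Y_1 \mid Y_2, \ldots, Y_n) \cdot Prob(Y_2, \ldots, Y_n) \\
&= Prob(X, Y_2, \ldots, Y_n) \cdot Prob(Y_1)
\end{aligned}
\]

by Theorem 9. Hence, by repeating the argument:

\[
\begin{aligned}
Prob(X, Y_1, \ldots, Y_n) &= Prob(X, Y_2 \mid Y_3, \ldots, Y_n) \cdot Prob(Y_3, \ldots, Y_n) \cdot Prob(Y_1) \\
&= Prob(X \mid Y_3, \ldots, Y_n) \cdot Prob(Y_2 \mid Y_3, \ldots, Y_n) \cdot Prob(Y_3, \ldots, Y_n) \cdot Prob(Y_1) \\
&= Prob(X, Y_3, \ldots, Y_n) \cdot Prob(Y_1) \cdot Prob(Y_2) \\
&\;\;\vdots \\
&= Prob(X) \cdot \prod_{i=1}^{n} Prob(Y_i)
\end{aligned}
\]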

6.4 Desirability-Probability Results

In this section we prove a number of results concerning the relation between three different conditions on desirabilities and the probabilistic independence conditions studied in the last section. As before, throughout let $S = \{S_i\}$ be a set of disjoint suppositions and $Y_i \subseteq S_i$. Then consider:

Restricted Actualism: $Des(\bar{S}_i, Y_i) = Des(\bar{S}_i)$

Prospect Actualism: If $X \cap S_i = \bot$, then:

\[ Des(\langle X, Y_i \rangle) = Des(X) \]

Counterfact Separability: If $\bigcap S_i = \bot$, then:

\[ Des(\langle Y_1, \ldots, Y_n \rangle) = \sum_{i=1}^{n} Des(Y_i) \]

Theorem 12 Counterfact Separability implies Counterfact Independence.

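Proof. Let $\{S_i\}_{i=1}^{n}$ be a set of $n$ disjoint suppositions. Then by Counterfact Separability and the fact that $(\langle Y_i, Y_j \rangle) = (\langle W, S_1, \ldots, Y_i, Y_j, \ldots, S_n \rangle)$:

\[ Des(\langle Y_i, Y_j \rangle) = Des(Y_i) + Des(Y_j) + \sum_{k \neq i,j} Des(W, S_k) \]

\[ Des(\langle Y_i, \bar{Y}_j \rangle) = Des(Y_i) + Des(\bar{Y}_j) + \sum_{k \neq i,j} Des(W, S_k) \]

But by Corollary 6, $Des(W, S_k) = 0$. So $Des(\langle Y_i, Y_j \rangle) = Des(Y_i) + Des(Y_j)$ and $Des(\langle Y_i, \bar{Y}_j \rangle) = Des(Y_i) + Des(\bar{Y}_j)$. But by the axiom of desirability:

\[
\begin{aligned}
Des(Y_i) &= Des(\langle Y_i, Y_j \rangle) \cdot Prob(Y_j \mid Y_i) + Des(\langle Y_i, \bar{Y}_j \rangle) \cdot Prob(\bar{Y}_j \mid Y_i) \\
&= [Des(Y_i) + Des(Y_j)] \cdot Prob(Y_j \mid Y_i) + [Des(Y_i) + Des(\bar{Y}_j)] \cdot Prob(\bar{Y}_j \mid Y_i) \\
&= Des(Y_i) + Des(Y_j) \cdot Prob(Y_j \mid Y_i) + Des(\bar{Y}_j) \cdot Prob(\bar{Y}_j \mid Y_i)
\end{aligned}
\]

But this can hold only if:

\[ Des(Y_j) \cdot Prob(Y_j \mid Y_i) + Des(\bar{Y}_j) \cdot Prob(\bar{Y}_j \mid Y_i) = 0 = Des(Y_j) \cdot Prob(Y_j) + Des(\bar{Y}_j) \cdot Prob(\bar{Y}_j) \]

by Lemma 2. Hence only if $Prob(Y_j \mid Y_i) = Prob(Y_j)$, in accordance with Counterfact Independence.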

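Theorem 13 Assume Centering. Then Restricted Actualism and Supposition Independence imply that $Des(Y_i) = [Des(S_i \cap Y) - Des(S_i)] \cdot Prob(S_i)$.

Proof. By the axiom of desirability:

\[
\begin{aligned}
Des(Y_i) &= Des(\langle S_i, Y_i \rangle) \cdot Prob(S_i \mid Y_i) + Des(\langle \bar{S}_i, Y_i \rangle) \cdot Prob(\bar{S}_i \mid Y_i) \\
&= Des(S_i \cap Y) \cdot Prob(S_i \mid Y_i) + Des(\bar{S}_i) \cdot Prob(\bar{S}_i \mid Y_i)
\end{aligned}
\]

in virtue of Centering and Restricted Actualism. And by Supposition Independence $Prob(S_i \mid Y_i) = Prob(S_i)$ and $Prob(\bar{S}_i \mid Y_i) = Prob(\bar{S}_i)$. Hence

\[
\begin{aligned}
Des(Y_i) &= Des(S_i \cap Y) \cdot Prob(S_i) + Des(\bar{S}_i) \cdot Prob(\bar{S}_i) \\
&= Des(S_i \cap Y) \cdot Prob(S_i) - Des(S_i) \cdot Prob(S_i)
\end{aligned}
\]

by Lemma 2. Hence $Des(Y_i) = [Des(S_i \cap Y) - Des(S_i)] \cdot Prob(S_i)$.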
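The identity in Theorem 13 can be checked in a small finite model. The sketch below is an illustration under stated assumptions rather than the paper's construction: it uses four hypothetical worlds, two suppositions that partition them, and triples (actual world, world under $S_1$, world under $S_2$) as elementary possibilities, with Centering imposed directly and utilities depending on the actual world only. All numbers are invented.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical toy model: S_1 = {a, b} and S_2 = {c, d} partition the worlds.
S = {1: {"a", "b"}, 2: {"c", "d"}}
r = {1: F(1, 3), 2: F(2, 3)}                                   # Prob(S_i)
q = {"a": F(1, 4), "b": F(3, 4), "c": F(1, 2), "d": F(1, 2)}   # chances within S_i
u = {"a": F(8), "b": F(0), "c": F(1), "d": F(-3)}              # utility of actual world

# Triples (w0, w1, w2) with Centering: w0 in S_i implies w_i = w0.
p = {}
for w1, w2 in product(sorted(S[1]), sorted(S[2])):
    p[(w1, w1, w2)] = r[1] * q[w1] * q[w2]
    p[(w2, w1, w2)] = r[2] * q[w1] * q[w2]

def prob(event):
    return sum(pt for t, pt in p.items() if event(t))

def des(event):
    return sum(u[t[0]] * pt for t, pt in p.items() if event(t)) / prob(event)

in_s1 = lambda t: t[0] in S[1]            # the fact that S_1
not_s1 = lambda t: t[0] not in S[1]       # the fact that not-S_1
y1 = lambda t: t[1] == "a"                # counterfact: were S_1 the case, a would be
fact_a = lambda t: t[0] == "a"            # the fact S_1 & Y

# Supposition Independence: Prob(S_1, Y_1) = Prob(S_1) * Prob(Y_1)
supposition_independent = prob(lambda t: in_s1(t) and y1(t)) == prob(in_s1) * prob(y1)

# Restricted Actualism: Des(not-S_1, Y_1) = Des(not-S_1)
restricted_actualism = des(lambda t: not_s1(t) and y1(t)) == des(not_s1)

# Theorem 13: Des(Y_1) = [Des(S_1 & Y) - Des(S_1)] * Prob(S_1)
lhs = des(y1)
rhs = (des(fact_a) - des(in_s1)) * prob(in_s1)
```

Here both hypotheses hold by construction and the two sides of the identity both equal 2. The example is only a sanity check on the formula, not a substitute for the proof.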

Theorem 14 Assume Centering. Then World Actualism and Fact-Counterfact Independence imply Prospect Actualism.

Proof. Let $S = \{S_i\}_{i=1}^{n}$ be a set of $n$ disjoint suppositions and suppose that $X \subseteq Y_{i^*}$ for some $S_{i^*} \in S$. By Centering $(X, Y_1, \ldots, Y_n) = (X, \bigcap_{i \neq i^*} Y_i)$ and by construction:

\[
\begin{aligned}
Des(X, Y_1, \ldots, Y_n) &= \frac{\sum u(w, w_1, \ldots, w_n) \cdot p(w, w_1, \ldots, w_n)}{Prob(X, Y_1, \ldots, Y_n)} \\
&= \frac{\sum u(w) \cdot p(w, w_1, \ldots, w_n)}{Prob(X, Y_1, \ldots, Y_n)}
\end{aligned}
\]

by World Actualism. But by Centering and Fact-Counterfact Independence $p(w, w_1, \ldots, w_n) = p(w) \cdot p(\bigcap_{i \neq i^*} w_i)$ and $Prob(X, Y_1, \ldots, Y_n) = Prob(X, \bigcap_{i \neq i^*} Y_i) = Prob(X) \cdot Prob(\bigcap_{i \neq i^*} Y_i)$. Hence

\[ Des(X, Y_1, \ldots, Y_n) = \frac{\sum u(w) \cdot p(w)}{Prob(X)} = Des(X) \]

Theorem 15 Suppose that $X \cap (\bigcup S_i) = \bot$. Then Prospect Actualism implies that $Des(X, Y_1, \ldots, Y_n) = Des(X)$.

Proof. By repeated applications of the definition of conditional desirability and Prospect Actualism:

\[
\begin{aligned}
Des(X, Y_1, \ldots, Y_n) &= Des(X, Y_1 \mid Y_2, \ldots, Y_n) + Des(Y_2, \ldots, Y_n) \\
&= Des(X \mid Y_2, \ldots, Y_n) + Des(Y_2, \ldots, Y_n) \\
&= Des(X, Y_2, \ldots, Y_n) \\
&= Des(X, Y_2 \mid Y_3, \ldots, Y_n) + Des(Y_3, \ldots, Y_n) \\
&\;\;\vdots \\
&= Des(X, Y_n) \\
&= Des(X)
\end{aligned}
\]

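Theorem 16 Assume Centering. Then Fact-Counterfact Independence and Prospect Actualism imply Counterfact Separability.

Proof. By the axiom of desirability, V2, and then Lemma 4, given Centering:

\[
\begin{aligned}
Des(\langle Y_1, \ldots, Y_n \rangle) &= \sum_{i=1}^{n} Des(S_i, Y_1, \ldots, Y_n) \cdot Prob(S_i \mid \langle Y_1, \ldots, Y_n \rangle) \\
&= \sum_{i=1}^{n} Des(S_i \cap Y_i, \textstyle\bigcap_{j \neq i} Y_j) \cdot Prob(S_i \mid \langle Y_1, \ldots, Y_n \rangle) \\
&= \sum_{i=1}^{n} Des(S_i \cap Y_i) \cdot Prob(S_i \mid \langle Y_1, \ldots, Y_n \rangle)
\end{aligned}
\]

in virtue of Prospect Actualism. Now by Corollary 11, given Centering, Fact-Counterfact Independence implies that:

\[ Prob(S_i \mid \langle Y_1, \ldots, Y_n \rangle) = Prob(S_i) \]

It follows that:

\[
\begin{aligned}
Des(\langle Y_1, \ldots, Y_n \rangle) &= \sum_{i=1}^{n} Des(S_i \cap Y_i) \cdot Prob(S_i) \\
&= \sum_{i=1}^{n} Des(S_i \cap Y_i) \cdot Prob(S_i) - \sum_{i=1}^{n} Des(S_i) \cdot Prob(S_i) \\
&= \sum_{i=1}^{n} [Des(S_i \cap Y_i) - Des(S_i)] \cdot Prob(S_i)
\end{aligned}
\]

in virtue of the fact that by V1 and V2, $\sum_{i=1}^{n} Des(S_i) \cdot Prob(S_i) = 0$. In particular:

\[
\begin{aligned}
Des(Y_i) &= Des(\langle S_1, \ldots, Y_i, \ldots, S_n \rangle) \\
&= [Des(S_i \cap Y_i) - Des(S_i)] \cdot Prob(S_i) + \sum_{j \neq i} [Des(S_j) - Des(S_j)] \cdot Prob(S_j) \\
&= [Des(S_i \cap Y_i) - Des(S_i)] \cdot Prob(S_i)
\end{aligned}
\]

Hence $Des(\langle Y_1, \ldots, Y_n \rangle) = \sum_{i=1}^{n} Des(Y_i)$.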


Theorem 17 Given Centering, Counterfact Separability and Restricted Actualism imply Prospect Actualism.

Proof. Let $X_A = X \cap A$. Then by Lemma 4, given Centering, $Des(\langle A, X_A, Y_{\bar{A}} \rangle) = Des(\langle A \cap X, Y_{\bar{A}} \rangle)$. But by the definition of conditional desirability and Counterfact Separability:

\[
\begin{aligned}
Des(\langle A, X_A, Y_{\bar{A}} \rangle) &= Des(\langle X_A, Y_{\bar{A}} \rangle \mid A) + Des(A) \\
&= Des(X_A \mid A) + Des(Y_{\bar{A}} \mid A) + Des(A) \\
&= Des(A, X_A) + Des(A, Y_{\bar{A}}) - Des(A) \\
&= Des(A \cap X) + Des(A) - Des(A) \\
&= Des(A \cap X)
\end{aligned}
\]

in virtue of Restricted Actualism and Centering. Hence $Des(\langle A \cap X, Y_{\bar{A}} \rangle) = Des(A \cap X)$ in accordance with Prospect Actualism.

6.5 Characterisation Results for Expected Utility

Throughout we assume that $(Prob, Des)$ is a Jeffrey representation of preferences defined on a centered suppositional algebra of propositions. Let $S = \{S_i\}$ be a set of disjoint suppositions and $Y_i \subseteq S_i$.

6.5.1 Necessity Results

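Theorem 18 Let $Des$ be an expected utility. Then $Prob$ satisfies Supposition Independence.

Proof. Let $X_i = S_i \cap X$. By the axiom of desirability:

\[
\begin{aligned}
Prob(X_i) &= \frac{Des(\bar{X}_i)}{Des(\bar{X}_i) - Des(X_i)} \\
&= \frac{Des(S_i \cap \bar{X}) \cdot Prob(S_i) + Des(\bar{S}_i) \cdot Prob(\bar{S}_i)}{Des(S_i \cap \bar{X}) \cdot Prob(S_i) - Des(S_i \cap X) \cdot Prob(S_i)}
\end{aligned}
\]

in virtue of the fact that $Des$ is an expected utility. But then by the axiom of desirability:

\[
\begin{aligned}
Prob(X_i) &= \frac{Des(S_i \cap \bar{X}) \cdot Prob(S_i) - Des(S_i) \cdot Prob(S_i)}{Des(S_i \cap \bar{X}) \cdot Prob(S_i) - Des(S_i \cap X) \cdot Prob(S_i)} \\
&= \frac{Des(S_i \cap \bar{X}) \cdot Prob(S_i) - Des(S_i \cap X) \cdot Prob(S_i \cap X) - Des(S_i \cap \bar{X}) \cdot Prob(S_i \cap \bar{X})}{Prob(S_i) \cdot [Des(S_i \cap \bar{X}) - Des(S_i \cap X)]} \\
&= \frac{Des(S_i \cap \bar{X}) \cdot Prob(S_i \cap X) - Des(S_i \cap X) \cdot Prob(S_i \cap X)}{Prob(S_i) \cdot [Des(S_i \cap \bar{X}) - Des(S_i \cap X)]} \\
&= \frac{Prob(S_i \cap X) \cdot [Des(S_i \cap \bar{X}) - Des(S_i \cap X)]}{Prob(S_i) \cdot [Des(S_i \cap \bar{X}) - Des(S_i \cap X)]} \\
&= \frac{Prob(S_i, X_i)}{Prob(S_i)}
\end{aligned}
\]

Hence $Prob(S_i, X_i) = Prob(X_i) \cdot Prob(S_i)$ in accordance with Supposition Independence.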


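Theorem 19 Let $Des$ be an expected utility. Then:

1. $Des(Y_i) = [Des(S_i \cap Y_i) - Des(S_i)] \cdot Prob(S_i)$

2. $Des(\langle Y_1, \ldots, Y_n \rangle) = \sum_{i=1}^{n} Des(Y_i)$

Proof. By definition if $Des$ is an expected utility, then:

\[ Des(\langle Y_1, \ldots, Y_n \rangle) = \sum_{i=1}^{n} Des(Y_i \mid S_i) \cdot Prob(S_i) \]

So in particular, since $Y_i = \langle S_1, \ldots, Y_i, \ldots, S_n \rangle = \langle Y_i, \bigcap_{j \neq i} S_j \rangle$, it follows that:

\[
\begin{aligned}
Des(Y_i) &= Des(Y_i \mid S_i) \cdot Prob(S_i) + \sum_{j \neq i} Des(S_j \mid S_j) \cdot Prob(S_j) \\
&= Des(Y_i \mid S_i) \cdot Prob(S_i)
\end{aligned}
\]

since $Des(S_j \mid S_j) = 0$. But by the definition of conditional desirability:

\[ Des(Y_i \mid S_i) = Des(S_i \cap Y_i) - Des(S_i) \]

So $Des(Y_i) = [Des(S_i \cap Y_i) - Des(S_i)] \cdot Prob(S_i)$. But then:

\[ \sum_{i=1}^{n} Des(Y_i) = \sum_{i=1}^{n} Des(Y_i \mid S_i) \cdot Prob(S_i) = Des(\langle Y_1, \ldots, Y_n \rangle) \]

Hence:

\[ Des(\langle Y_1, \ldots, Y_n \rangle) = \sum_{i=1}^{n} Des(Y_i) \]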

Corollary 20 Let $Des$ be an expected utility. Then $Prob$ satisfies Counterfact Independence and Fact-Counterfact Independence.

Proof. By Theorem 19 $Des$ satisfies Counterfact Separability and by Theorem 12, Counterfact Separability implies Counterfact Independence. Similarly, by Theorem 18, $Prob$ satisfies Supposition Independence and by Theorem 10, Counterfact Independence and Supposition Independence are jointly equivalent to Fact-Counterfact Independence.

Theorem 21 Let Des be an expected utility. Then Des satisfies Prospect Actualism.

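Proof. By Theorem 18, Prob satisfies Supposition Independence. So $\mathrm{Prob}(S_i \mid Y_i) = \mathrm{Prob}(S_i)$ and by the axiom of desirability:

$$
\begin{aligned}
\mathrm{Des}(Y_i) &= \mathrm{Des}(S_i, Y_i)\cdot\mathrm{Prob}(S_i \mid Y_i) + \mathrm{Des}(\neg S_i, Y_i)\cdot\mathrm{Prob}(\neg S_i \mid Y_i) \\
&= \mathrm{Des}(S_i \cap Y_i)\cdot\mathrm{Prob}(S_i) + \mathrm{Des}(\neg S_i, Y_i)\cdot\mathrm{Prob}(\neg S_i)
\end{aligned}
$$

by Lemma 4, given Centering. But by Theorem 19, $\mathrm{Des}(Y_i) = (\mathrm{Des}(S_i \cap Y_i) - \mathrm{Des}(S_i))\cdot\mathrm{Prob}(S_i)$. Hence by Lemma 2, $\mathrm{Des}(Y_i) = \mathrm{Des}(S_i \cap Y_i)\cdot\mathrm{Prob}(S_i) + \mathrm{Des}(\neg S_i)\cdot\mathrm{Prob}(\neg S_i)$. So, in accordance with Restricted Actualism:

$$
\mathrm{Des}(\neg S_i, Y_i) = \mathrm{Des}(\neg S_i)
$$

But then it follows from Theorems 21, 19 and 17 that Des satisfies Prospect Actualism.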


6.5.2 Sufficiency Results

Theorem 22 Assume that Des satisfies Counterfact Separability and Restricted Actualism and that Prob satisfies Supposition Independence. Then Des is an expected utility.

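Proof. Let $Y_i = Y \cap S_i$. By Counterfact Separability:

$$
\mathrm{Des}(\langle Y_1, \ldots, Y_n \rangle) = \sum_{i=1}^{n} \mathrm{Des}(Y_i)
$$

But by Theorem 13, Restricted Actualism and Supposition Independence imply that $\mathrm{Des}(Y_i) = [\mathrm{Des}(S_i \cap Y_i) - \mathrm{Des}(S_i)]\cdot\mathrm{Prob}(S_i)$. Hence:

$$
\begin{aligned}
\mathrm{Des}(\langle Y_1, \ldots, Y_n \rangle) &= \sum_{i=1}^{n} [\mathrm{Des}(S_i \cap Y_i) - \mathrm{Des}(S_i)]\cdot\mathrm{Prob}(S_i) \\
&= \sum_{i=1}^{n} \mathrm{Des}(S_i \cap Y_i)\cdot\mathrm{Prob}(S_i) - \sum_{i=1}^{n} \mathrm{Des}(S_i)\cdot\mathrm{Prob}(S_i) \\
&= \sum_{i=1}^{n} \mathrm{Des}(S_i \cap Y_i)\cdot\mathrm{Prob}(S_i)
\end{aligned}
$$

in virtue of the fact that by V1 and V2, $\sum_{i=1}^{n} \mathrm{Des}(S_i)\cdot\mathrm{Prob}(S_i) = 0$. But by the definition of conditional desirability:

$$
\mathrm{Des}(Y_i \mid S_i) = \mathrm{Des}(S_i \cap Y_i) - \mathrm{Des}(S_i)
$$

So:

$$
\mathrm{Des}(\langle Y_1, \ldots, Y_n \rangle) = \sum_{i=1}^{n} \mathrm{Des}(Y \mid S_i)\cdot\mathrm{Prob}(S_i)
$$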
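The last steps of this proof turn on a piece of arithmetic: when the suppositions partition the space and the tautology has desirability zero, the weighted sum of the $\mathrm{Des}(S_i)$ vanishes, so the sum of the conditional desirabilities equals the sum of the $\mathrm{Des}(S_i \cap Y_i)$. The following is a minimal sketch of that arithmetic only, assuming a hypothetical four-world model with a two-cell partition. All names and numbers are illustrative assumptions, and the sketch does not represent the counterfactual semantics.

```python
from fractions import Fraction as F

# Hypothetical model (an assumption): world -> (probability, utility),
# chosen so that the probability-weighted mean utility is zero.
worlds = {
    "w1": (F(1, 4), F(3)),
    "w2": (F(1, 4), F(1)),
    "w3": (F(1, 4), F(-1)),
    "w4": (F(1, 4), F(-3)),
}


def prob(prop):
    """Probability of a proposition, given as a set of worlds."""
    return sum(worlds[w][0] for w in prop)


def des(prop):
    """Jeffrey desirability: probability-weighted average utility over prop."""
    return sum(worlds[w][0] * worlds[w][1] for w in prop) / prob(prop)


S = [{"w1", "w2"}, {"w3", "w4"}]  # suppositions S_1, S_2 (a partition)
Y = {"w1", "w3"}                  # a proposition Y, so that Y_i = Y & S_i

# Weighted sum of Des(S_i): vanishes because Des of the tautology is zero.
baseline = sum(des(s) * prob(s) for s in S)

# Sum of Des(S_i & Y_i) * Prob(S_i).
unconditional = sum(des(s & Y) * prob(s) for s in S)

# Sum of Des(Y_i | S_i) * Prob(S_i), with Des(Y_i | S_i) = Des(S_i & Y_i) - Des(S_i).
conditional = sum((des(s & Y) - des(s)) * prob(s) for s in S)

print(baseline, unconditional, conditional)
```

In this model the two sums agree because the baseline term is zero. With a set of suppositions that did not exhaust the space, the baseline need not vanish and the two sums could differ.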

Theorem 23 Assume that Des satisfies Prospect Actualism and that Prob satisfies Fact-Counterfact Independence. Then Des is an expected utility.

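Proof. Let $Y_i = Y \cap S_i$. By the axiom of desirability and Lemma 14:

$$
\begin{aligned}
\mathrm{Des}(\langle Y_1, \ldots, Y_n \rangle) &= \sum_{i=1}^{n} \mathrm{Des}(S_i, Y_1, \ldots, Y_n)\cdot\mathrm{Prob}(S_i \mid \langle Y_1, \ldots, Y_n \rangle) \\
&= \sum_{i=1}^{n} \mathrm{Des}\Big(S_i \cap Y, \bigcap_{j \neq i}(Y_j)\Big)\cdot\frac{\mathrm{Prob}\big(S_i \cap Y_i, \bigcap_{j \neq i}(Y_j)\big)}{\mathrm{Prob}\big(Y_i, \bigcap_{j \neq i}(Y_j)\big)}
\end{aligned}
$$

Now by Theorem 10, Fact-Counterfact Independence implies Counterfact Independence which implies, by Theorem 9, that $\mathrm{Prob}(Y_i \mid \bigcap_{j \neq i}(Y_j)) = \mathrm{Prob}(Y_i)$. Similarly, by Corollary 11, Fact-Counterfact Independence implies that $\mathrm{Prob}(S_i \cap Y_i \mid \bigcap_{j \neq i}(Y_j)) = \mathrm{Prob}(S_i \cap Y_i)$. Hence:

$$
\frac{\mathrm{Prob}\big(S_i \cap Y_i \mid \bigcap_{j \neq i}(Y_j)\big)}{\mathrm{Prob}\big(Y_i \mid \bigcap_{j \neq i}(Y_j)\big)} = \frac{\mathrm{Prob}(S_i \cap Y_i)}{\mathrm{Prob}(Y_i)} = \mathrm{Prob}(S_i \mid Y_i)
$$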


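Similarly, by Theorem 15, Prospect Actualism implies that $\mathrm{Des}(S_i \cap Y, \bigcap_{j \neq i}(Y_j)) = \mathrm{Des}(S_i \cap Y)$. So:

$$
\mathrm{Des}(\langle Y_1, \ldots, Y_n \rangle) = \sum_{i=1}^{n} \mathrm{Des}(S_i \cap Y)\cdot\mathrm{Prob}(S_i \mid Y_i)
$$

But by Theorem 7, Fact-Counterfact Independence implies that $\mathrm{Prob}(S_i \mid Y_i) = \mathrm{Prob}(S_i)$. Hence:

$$
\mathrm{Des}(\langle Y_1, \ldots, Y_n \rangle) = \sum_{i=1}^{n} \mathrm{Des}(S_i \cap Y)\cdot\mathrm{Prob}(S_i)
$$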
