

BEN ROGERS

THE PROBABILITIES OF THEORIES AS FREQUENCIES*

From the beginning of his career, Reichenbach studied the role that probability played both in modern physical theory and in epistemology. 1 He was, with Richard von Mises, one of the foremost proponents of the frequency theory of probability and axiomatized a very general form of it. He further claimed that all reasonable uses of the concept of probability, both in physical theory and as an evaluative concept in epistemology, are to be explicated in terms of the frequency theory.

Central to Reichenbach's epistemological position is the very strong claim that all inductive inference is reducible, in principle, to inference by simple enumeration. 2 Briefly, one can infer probability statements about classes of events by means of the rule of simple enumeration; and, once such probability statements are available, other probability statements can be inferred from these by means of the probability calculus. Since these later inferences are analytic, all inductive inferences are reducible to the application of the enumeration rule which supplies the probabilities in the first place. In particular, he accepts the idea that the relation between a theory or hypothesis and the evidence for or against it can be construed in terms of probabilities. 3 Via such probabilities of hypotheses, construed as frequencies, all inductive inferences about scientific theories then in principle can be reduced to inference by simple enumeration.

Most recent work concerning the probability of theories has approached the problem either by attempting to develop the concept as a logical relationship as in the work of Carnap or via a personalist Bayesian interpretation as in the work of de Finetti and Savage. Except for the extensions of his work made by Salmon, Reichenbach's attempt to interpret the probabilities of hypotheses in terms of the frequency theory has been little discussed in the literature, Nagel's comments in Principles and in his review of the English edition of Theory of Probability constituting almost the sole discussion. 4 On the other hand, Reichenbach's frequency interpretation and his justification of induction have been widely discussed. Given the dearth of published commentary, one can only speculate as to why his attempt has received so little attention. Perhaps the general difficulties with the frequency interpretation have made it seem otiose to show the complexities and defects in his attempt to handle the probability of hypotheses in this interpretation, or it could have been the intrinsic appeal of the logical interpretation.

Synthese 34 (1977) 167-183. All Rights Reserved. Copyright © 1977 by D. Reidel Publishing Company, Dordrecht-Holland.

However, I conjecture that one reason for the lack of discussion is that there is the following very evident prima facie objection to the interpretation as exposited by both Reichenbach and Salmon. Given the form of their exposition, there seems to be no way to assign probabilities to statistical hypotheses except in extremely limited circumstances. This objection has long kept me from giving this interpretation other than the most casual study and has been mentioned by almost everyone I have discussed the interpretation with. After explaining how Reichenbach intends the interpretation to be carried through, I will exhibit the objection in detail and then offer a partial resolution of the difficulty. I hope that the removal of this prima facie objection and the subsequent clarification of the nature of the difficulties which still face the interpretation will lead to a more definitive reappraisal of Reichenbach's work on this important topic in inductive inference.

Reichenbach analyzes inductive inference in terms of two fundamental items: a rule of induction by simple enumeration which licenses the assertion of probability statements, and the probability calculus which establishes deductive relationships between probability statements, some of which must be taken as already established. The rule of inference allows one to infer limit statements of the relative frequency of properties of things taken in ordered sequences on the basis of frequencies in finite ordered sequences. The connection between the rule of inference and the probability calculus is established by showing that the identification of the limit so inferred with probability results in an admissible interpretation of that calculus.

His central thesis is that given a suitable choice of statements taken as true, 5 the enumeration rule will license inferences to probabilities of other statements; and hence, actual inductive inference can be understood in terms of this interpretation. The fundamental difficulty in understanding his interpretation is in getting clear on the manner in which statements of different levels of generality are related to each other by the interpretation so that true theories, if there are any among those under consideration, will get assigned a probability near one in the long run.

As is usual in studies of epistemic probability, the interpretation is chosen so that Bayes' Theorem may be used to calculate the increase or decrease in the probability of a hypothesis as new evidence is considered. In a simple form the theorem states:

$$P(A \cap C, B) = \frac{P(A, B) \times P(A \cap B, C)}{P(A, B) \times P(A \cap B, C) + P(A, \bar{B}) \times P(A \cap \bar{B}, C)}$$

For the purposes of the present interpretation, let A represent theories of a certain kind, B represent true theories, and C represent evidence of a certain kind. 6 Then $P(A \cap C, B)$ represents the probability of truth, given theories of kind A and evidence of kind C; that is, it represents the posterior probability of a theory of kind A on evidence C. Similarly, $P(A, B)$ represents the probability of the truth of A prior to the evidence provided by C, or the prior probability of A. An analogous interpretation is given to $P(A, \bar{B})$ and $P(A \cap \bar{B}, C)$, where $\bar{B}$ represents the class of false theories.
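As a purely arithmetical illustration of the theorem in this notation, the following Python sketch computes the posterior from the three probabilities on the right side. The function name and the numerical values are hypothetical and are not drawn from Reichenbach or Salmon; the sketch assumes only that $P(A, \bar{B}) = 1 - P(A, B)$.

```python
def posterior_probability(prior_true, lik_true, lik_false):
    """Return P(A∩C, B) by Bayes' Theorem in Reichenbach's notation.

    prior_true: P(A, B), frequency of true theories among theories of kind A
    lik_true:   P(A∩B, C), frequency of evidence of kind C among true A-theories
    lik_false:  P(A∩B̄, C), frequency of evidence of kind C among false A-theories
    """
    prior_false = 1.0 - prior_true  # P(A, B̄) = 1 - P(A, B)
    numerator = prior_true * lik_true
    denominator = numerator + prior_false * lik_false
    return numerator / denominator


# Hypothetical values, chosen only for illustration.
example = posterior_probability(prior_true=0.1, lik_true=0.9, lik_false=0.2)
print(round(example, 4))
```

With these illustrative inputs the posterior comes to one third: evidence of kind C raises the probability of truth from 0.1, but only as far as the relative frequencies among true and false theories allow.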

A particular theory T is not assigned a probability on Reichenbach's account because it is an individual theory and individual instances of a class (e.g., events) cannot have a probability on the relative frequency account; probability is a property of classes. However, weight is assigned to individual instances by assigning the individual instance to any appropriate reference class, 7 and these weights form the basis for practical action or evaluation. For example, weights can function as betting quotients in games of chance where bets are placed on forthcoming individual events. So, the weight of an individual theory T on evidence E is posited by assigning T to an appropriate class of theories A and an appropriate class C of evidence statements like E, and it is such posits about weight which are used for the evidential comparison of individual theories.

In order to use Bayes' Theorem to calculate the posterior probability of theories of kind A, it is necessary to have the probabilities on the right side of the equation. Following Reichenbach, Salmon suggests that these probabilities may be obtained from a certain probability lattice. Each horizontal row in the lattice represents a particular theory, $T_i$.

To incorporate a theory into the lattice as a row we consider the separate observations which serve as tests of that theory. Some of these observations will be positively confirmatory, others will tend to disconfirm the theory. For a confirmatory instance we will put 'E' and for a disconfirmatory instance '$\bar{E}$'. This will yield a sequence such as this:

E, E E E E E E E E E E E E E E [ . . . ] . . .

... A is the class of horizontal rows in the lattice and derivatively the class of theories which contribute rows. ... B is the class of true theories, represented by rows in which the limit of the relative frequency of E is very close to 1. C is the class made up of rows all of which have similar initial sections. 8

The probabilities $P(A, B)$, $P(A \cap B, C)$, and $P(A \cap \bar{B}, C)$ are inferred directly from the lattice by applying the enumeration rule to the appropriate classes. For example, $P(A, B)$ is obtained by examining, in the vertical direction in the lattice, those theories $T_i$ which are of a kind A and then obtaining for this reference class the relative frequency of true theories, B, which are represented by horizontal rows in which the limit of the relative frequency of E's is near 1. Since $P(A, \bar{B})$ equals $1 - P(A, B)$, on this interpretation we have all the probabilities required to calculate $P(A \cap C, B)$ by means of Bayes' Theorem.
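The bookkeeping just described can be made concrete with a toy lattice. The Python sketch below is only an illustration under stated assumptions: rows are finite lists of 1 (for E) and 0 (for $\bar{E}$), a row counts as "true" when its observed frequency of E's reaches a hypothetical threshold, and the class C is taken, for definiteness, to be the rows whose first few entries are all E. None of these choices comes from Reichenbach or Salmon.

```python
def row_limit(row):
    """Enumeration rule along one row: posit the observed frequency of E as the limit."""
    return sum(row) / len(row)


def lattice_probabilities(rows, initial_length, truth_threshold):
    """Read P(A, B), P(A∩B, C), P(A∩B̄, C) off a finite lattice by counting rows.

    Assumes the lattice contains at least one 'true' and one 'false' row.
    """
    is_true = [row_limit(row) >= truth_threshold for row in rows]   # membership in B
    in_c = [all(row[:initial_length]) for row in rows]              # membership in C
    n_true = sum(is_true)
    n_false = len(rows) - n_true
    p_a_b = n_true / len(rows)
    p_ab_c = sum(t and c for t, c in zip(is_true, in_c)) / n_true
    p_abbar_c = sum((not t) and c for t, c in zip(is_true, in_c)) / n_false
    return p_a_b, p_ab_c, p_abbar_c


# A hypothetical four-row lattice; 1 stands for E and 0 for Ē.
rows = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 0, 1, 0, 0, 0],
]
p_a_b, p_ab_c, p_abbar_c = lattice_probabilities(rows, initial_length=3, truth_threshold=0.85)

# Bayes' Theorem, with P(A, B̄) = 1 - P(A, B).
posterior = (p_a_b * p_ab_c) / (p_a_b * p_ab_c + (1 - p_a_b) * p_abbar_c)

# Direct count in the vertical direction: true rows among the rows in C.
rows_in_c = [r for r in rows if all(r[:3])]
direct = sum(row_limit(r) >= 0.85 for r in rows_in_c) / len(rows_in_c)
print(posterior, direct)
```

In a finite lattice the value obtained through Bayes' Theorem coincides with the direct count of true rows among those in C, which is what one expects when every probability involved is a relative frequency over the same rows.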

Reichenbach distinguishes between the use of the induction by enumeration in contexts of primitive knowledge, where no probability values are known, and in situations of advanced knowledge where some probability values are already known. 9 The probabilities $P(A, B)$, $P(A \cap B, C)$, and $P(A \cap \bar{B}, C)$ in the interpretation above are clearly cases of the application of the enumeration rule in the context of advanced knowledge because they depend on prior inferences concerning the limit of the E's in each horizontal row. These latter limits are inferred from the initial sequences of E's in each row by the enumeration rule. The inference about the limit of E's in each individual lattice row appears to be an inductive inference in a state of primitive knowledge because no probability knowledge appears to be presupposed in the inference. These row probabilities form the basis for 'the concatenation of inductions' via either the probability calculus (principally the use of Bayes' Theorem) or enumerative inductions in advanced knowledge (as in the inference in the vertical direction on the lattice above to obtain $P(A, B)$). It is these concatenated inductions, and not enumerative induction in primitive knowledge, which characterize most of our actual inductive inference.

For Reichenbach, all inductive inference is based on and is reducible to enumerative induction from observations. But as has just been described, he argues that most practical inductive inference occurs in a context of advanced knowledge, where the probabilities of some statements are known; and such inference is best construed as applications of Bayes' Theorem which yield the posterior probability of the truth of the statements under consideration on the available evidence. But ultimately the probabilities used in inference in advanced knowledge are based on the lattice row probabilities which are associated with individual theories, hypotheses, or other statements about physical events. Hence, it is important to see in detail what is involved in the characterization of these row probabilities and to see with what epistemic characteristics they are endowed. To accomplish this we must examine in more detail the role played by the enumerative rule in assigning probabilities to theories.

First, let us consider induction by simple enumeration as characterized by Reichenbach in his pragmatic justification of induction. He considers a sequence of events of some specified kind, A, which constitute the reference class for the inference. With each event A is associated another event, and it is to be determined whether the associated event has an attribute of interest, B. For example, the A's might be tosses of a given die and B might be the property of the ace being turned up on the toss. Let $F^n(A, B)$ be the relative frequency of B's among the first n members of A. The rule of induction by simple enumeration allows the inference of a probability on the basis of an observed frequency.

The rule of induction by simple enumeration: Given $F^n(A, B) = m/n$, one may infer that

$$\lim_{n \to \infty} F^n(A, B) = m/n \pm d.$$

That is, given the frequency interpretation of probability, one has inferred that $P(A, B) = m/n \pm d$, since $P(A, B)$ is identified with the limit (as $n \to \infty$) of $F^n(A, B)$.
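The repeated application of the rule along a growing sequence can be sketched as follows. This Python fragment is an illustration only: the simulated die, the seed, and the function name are assumptions introduced here, and the simulation presupposes what the rule itself cannot, namely that the limit exists and equals 1/6.

```python
import random


def enumeration_posits(sequence):
    """Yield the posit m/n made after each of the first n members of the sequence.

    Each item is truthy when the associated event has the attribute B.
    """
    m = 0
    for n, has_b in enumerate(sequence, start=1):
        m += bool(has_b)
        yield m / n


# Hypothetical data: tosses of a simulated fair die, B = the ace turns up.
rng = random.Random(0)
tosses = [rng.randint(1, 6) == 1 for _ in range(60000)]
posits = list(enumeration_posits(tosses))

# Successive posits are revised as n grows; print a few of them.
for n in (10, 100, 1000, 60000):
    print(n, round(posits[n - 1], 4))
```

Each printed value is the posit licensed at that stage; the rule corrects earlier posits simply by being applied again to the longer initial section.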

Reichenbach claims that we are justified in using the rule of induction by simple enumeration because continued application of the rule will lead to the assertion of the actual limit in a finite time if the limit exists. 10 The probability which is asserted on the basis of the rule establishes a relation between a class of events of kind A and a class of events of kind B. Our practical interest is, however, with the prediction and explanation of individual events of these kinds; and, strictly speaking, the probabilities so defined do not apply to the individual events but to the classes only. But he argues that if the reference class, A, is chosen appropriately, then the probability, $P(A, B)$, can be transferred to individual A's as a weight or posit which may serve as a betting quotient for betting on the occurrence of individual B's.

In the fundamental interpretation of probability given by Reichenbach, the things which are related are events classified by their physical properties, and the probability assertions which are licensed by the rule of enumerative induction are assertions about the long run relative frequencies of these events. Thus probability statements are claims about general physical facts; they are claims which, if true, are physical laws. 11 By analogy, it would seem that if one infers from events which are confirming instances of a theory to the probability of the theory by means of enumerative induction then the probability of the theory must also be construed as a physical fact. But to construe the evidential relationship between a theory and the evidence for or against it, which is what the probability of a theory presumably expresses, as a physical fact must seem, for many philosophers, simply a category mistake. Surely, they might say, the evidential relationship is a logical relationship, not a relationship expressible as a physical fact or law. The prejudice against Reichenbach's interpretation is reinforced, I suspect, by the prima facie objection to his interpretation which I alluded to in the opening paragraphs of this essay. It is to the development of the details leading to this prima facie objection that we now turn.

As explained above, the interpretation of the probabilities of theories is given in terms of a probability lattice. Each row in the lattice represents a different theory. A particular theory, T, is represented in its row by E's and $\bar{E}$'s which stand respectively for confirmatory instances of T and disconfirmatory instances of T. 12

The most basic question here is, 'What is a confirmatory instance of a theory?' In characterizing what counts as a confirming instance of a theory, one must start with an account of the kinds of things about which assertions are made in the theory. For the sake of simplicity, I shall assume that the theory T contains predicates which classify individual physical events into three kinds: D, F, and G. Also, let I represent a set of initial conditions which, relative to T, will yield the set of events E, which are confirming instances of T. Let us further suppose that T entails assertions of probabilistic hypotheses, H, of the form $P(D, F) = r$, $0 \le r \le 1$, which state relations between classes of events of the kinds D and F.

If the theory T entails universal statements H of the form $(x)(Dx \supset Fx)$, then the occurrence of an individual event D would constitute an instance of an initial condition I and if F occurred, then a confirming instance E of the theory is established; and if F fails to occur, a disconfirming instance $\bar{E}$ of the theory is established. From the resulting E's and $\bar{E}$'s a lattice row is constructed which is associated with H and thereby with T. The limit of this sequence of E's and $\bar{E}$'s is inferred by the enumeration rule. The property of truth is ascribed to H if the limit of $F^n(I, E)$ is near 1.

When T contains statements of other than strictly universal form, it is not so clear what is to count as a confirmation instance of the theory. Consider the case where T entails statements H which ascribe low probability to individual events of a certain kind, e.g., $P(D, F) = 1/6$. On Reichenbach's account such a statement may be asserted on the basis of the enumeration rule from the observed frequency, $F^n(D, F)$, of the individual events of the kinds D and F. But it is not at all clear how one can come to assert that the probability of the statement $P(D, F) = 1/6$ is near 1, because it is not at all clear on Reichenbach's theory what is to count as a confirmatory instance of a statement like this. The difficulty is that one cannot simply count the individual occurrences of F's as confirming it because if $P(D, F) = 1/6$ is true one would expect $F^n(D, F)$ to be near 1/6, and thus it would not approach a limit near 1, which is expected of confirmatory instances when the statement tested is true. But if the individual events F are not confirming instances of the statement $P(D, F) = 1/6$, what are?

Reichenbach, in expositing his interpretation of the probability of theories, considered only universal statements and probability statements which attributed very high probabilities to individual events. 13 For such statements there seems to be little question as to what counts as confirming instances of them. But surely if one is to have an adequate account of the probability of hypotheses and theories, one must provide for the confirmation and disconfirmation of statistical statements. That Reichenbach did not seem to do so constitutes a powerful prima facie objection to the attempt to construe the probability of hypotheses in terms of the relative frequencies of the events the hypotheses are about. For example, if it is claimed that the probability of aces on the toss of a die is 1/6, then the occurrence of an ace, in and of itself, does not constitute evidence for the claim. This prima facie objection reinforces the intuition that the relative frequency interpretation of the probability of hypotheses rests on a category mistake, that the relation between a hypothesis and the evidence relevant to it is a logical relation and not an empirical one.

I will now consider the possibility of characterizing confirmation instances in keeping with Reichenbach's interpretation. To do so I shall take orthodox statistical practice as a guide, but certainly one to be followed warily. The idea is to construct a confirmation class, E, such that if certain initial conditions, I, are fulfilled then for a particular observation O, the probability $P(I, O \in E)$ is near 1 if the probabilistic hypothesis H: $P(D, F) = r$ is true. In general, one chooses a class of D's which constitute the set of initial conditions, I. For example, I might be sets of n consecutive D's. Then one calculates the confirmation class E such that $O \in E$ iff the observed $F^n(D, F)$ is in the range $r_1 \le F^n(D, F) \le r_2$ and $P_H(I, O \in E)$ is near 1. 14

If one can calculate a class E with these properties and if when $O \in E$ one enters E in the lattice row which represents the theory T and enters $\bar{E}$ when $O \notin E$, then that row has the property that if H is true the limit of the relative frequency of E's in the row will be near 1. Thus by following the guide of orthodox statistical theory we can characterize the concept of the confirmation instance of a probabilistic hypothesis which meets Reichenbach's condition that the limit of confirming instances is near 1 if the hypothesis is true.
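The construction can be sketched numerically. The Python fragment below is an illustration under assumptions that go beyond the text: the F's are taken to be independently distributed (so that counts in a block of n D's are binomial), the initial condition I is a block of 60 consecutive D's, "near 1" is read as at least 0.95, and the confirmation class is grown symmetrically around the expected count. All names and numbers are hypothetical.

```python
import random
from math import exp, lgamma, log


def binom_pmf(n, k, r):
    """Probability of exactly k F's among n D's when P(D, F) = r, with 0 < r < 1."""
    log_p = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
             + k * log(r) + (n - k) * log(1 - r))
    return exp(log_p)


def central_confirmation_class(n, r, level):
    """Grow a range of counts around n*r until its probability under H reaches level."""
    lo = hi = round(n * r)
    mass = binom_pmf(n, lo, r)
    while mass < level:
        if lo > 0:
            lo -= 1
            mass += binom_pmf(n, lo, r)
        if hi < n:
            hi += 1
            mass += binom_pmf(n, hi, r)
    return lo, hi, mass


n, r = 60, 1 / 6                      # H: P(D, F) = 1/6, blocks of 60 D's
lo, hi, mass = central_confirmation_class(n, r, level=0.95)

# Build a lattice row for H from simulated blocks in which H is in fact true.
rng = random.Random(1)
row = []
for _ in range(2000):
    count = sum(rng.random() < r for _ in range(n))
    row.append(lo <= count <= hi)     # True stands for E, False for Ē
row_frequency = sum(row) / len(row)
print(lo, hi, round(mass, 3), round(row_frequency, 3))
```

When H is true, the relative frequency of E's in the simulated row stays close to the probability that the class was built to have, which is the property the lattice row requires.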

What is needed in order to select an I and its correlated E which has the desired property that $P(I, O \in E)$ is near 1? First, one must have H stated. Secondly, the structure of the sequence of ordered pairs (D, F) must be characterized. Proceeding with our earlier example where H is $P(D, F) = 1/6$, we need to choose a set of n D's which will be the initial condition for the confirmation class E, so that $P(I, E)$ can be established. If the F's are distributed randomly in the sequence of (D, F)'s, then we might choose $n = n_1$. But suppose we knew that F's tend to form runs whose average length is longer than would be expected if the sequence is random. Then we would choose $n = n_2$ so that $n_1 \ll n_2$, because we would need a much longer sequence of D's in order to get a fair sample of $F^n(D, F)$ than if the sequence of F's is random. Thus, in general, a necessary condition for the calculation of an E which meets the criterion that $P(I, O \in E)$ is near 1 is that the structure of the sequence of F's be characterized, as well as H stated.
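The dependence of the required n on the structure of the sequence can be illustrated by simulation. In the Python sketch below, the "runs" sequence is modeled as a two-state Markov chain whose long-run frequency of F is still 1/6; this particular model of runs, the persistence value 0.9, and the block length are assumptions made only for illustration and are not taken from Reichenbach.

```python
import random
from statistics import pstdev


def random_block_frequency(n, r, rng):
    """Observed F^n(D, F) in a block of n D's when the F's occur independently."""
    return sum(rng.random() < r for _ in range(n)) / n


def runs_block_frequency(n, r, stay, rng):
    """Observed F^n(D, F) when F's tend to form runs.

    Two-state Markov chain: an F is followed by an F with probability `stay`;
    the entry probability is chosen so that the long-run frequency of F is r.
    """
    enter = r * (1 - stay) / (1 - r)
    state = rng.random() < r
    count = 0
    for _ in range(n):
        count += state
        state = rng.random() < (stay if state else enter)
    return count / n


rng = random.Random(2)
n, r = 120, 1 / 6
random_freqs = [random_block_frequency(n, r, rng) for _ in range(500)]
runs_freqs = [runs_block_frequency(n, r, 0.9, rng) for _ in range(500)]

# Same limit, very different scatter of block frequencies around it.
print(round(pstdev(random_freqs), 3), round(pstdev(runs_freqs), 3))
```

Because block frequencies scatter much more widely when the F's cluster, a range $r_1 \le F^n(D, F) \le r_2$ that has probability near 1 for a random sequence will not have it for the clustered one unless n is made considerably larger.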

Once it is recognized that one cannot define a confirmation instance for the probabilistic hypothesis H unless one can also characterize the structure of the sequence of ordered pairs of events (D, F), which H is about, one can see that there is a certain difficulty in carrying through Reichenbach's schema for defining the probability of hypotheses in terms of the relative frequency of their confirmation instances. The difficulty concerns the epistemic status of the characterization of the structure of the sequence of (D, F)'s.

One option is to consider the class of E's, not as confirmation instances of the probabilistic hypothesis H alone, but as confirmation instances of the conjunction of H and S, where S is a hypothesis about the structure of the sequence of (D, F)'s. If this option is exercised, then one cannot assign a probability to any arbitrary empirical hypothesis via Reichenbach's schema, but only to certain joint hypotheses, like H and S, which both attribute a probability to events of a certain kind in a sequence and which also concern the probabilities of events in the subsequences of the principal sequence of (D, F)'s. I shall call such an interpretation a joint interpretation because the confirmation instance E confirms H and S taken conjointly but is not taken to confirm H (or S) taken alone. The row probability of the conjoint hypothesis H and S is defined in terms of the confirming instances; and since the characterization of a confirming instance does not depend on knowing some other empirical generalization, the row probability is inferred in the context of primitive knowledge. In attaining a joint interpretation one has paid the price of being unable to assign a row probability by enumeration over confirming instances to any arbitrary probabilistic hypothesis, but only to those which are conjoined with a hypothesis about the structure of the sequence of events referred to by the first hypothesis.

There are serious objections to the joint interpretation as defined above. First, Reichenbach's overall program seems to require that there be no significant restriction on what kind of hypothesis can be assigned probabilities. Especially, straightforward probabilistic hypotheses surely must not be excluded, for after all these are exactly the kind of statements which are assertable on the basis of the enumeration rule and, as such, have a fundamental place in his treatment of induction. A second consideration against the joint interpretation is that it goes against scientific practice to the extent that, by whatever poorly understood means, we do seem to judge the probability of individual statements within a theoretical context, the typical Duhemian argument notwithstanding. Third, as I shall argue subsequent to my discussion of the second option below, there remains an essential arbitrariness to the preceding definition of confirmation instance which is intuitively unacceptable and to which the joint interpretation is particularly subject.

In contrast to the joint interpretation characterized above, let us say that an interpretation is an individual interpretation when a confirmation instance E confirms an individual hypothesis H. In the preceding discussion, I showed that in the individual interpretation it is in general impossible to define a confirmation instance E for H in the context of primitive knowledge, because the definition depends on the characterization by S of the structure of the sequence of (D, F)'s. For the joint interpretation, there remains open the option of defining a confirmation instance so that the row probability of the conjoint hypothesis H and S is inferred in the context of primitive knowledge. A second option must be taken in the case of the individual interpretation, which is to assert S by means of induction by enumeration. If this is done, then the confirmation class E for a probabilistic hypothesis H is calculated on the basis of the structure S, which itself has been asserted by an application of the rule of enumerative induction. In this way, Reichenbach's claim that the probability of hypotheses is based solely on induction by enumeration and inference via the probability calculus would be fulfilled. But if this second procedure is followed, in the resulting individual interpretation the characterization of a confirming instance of H depends on the structure as characterized by S, is not defined therefore purely with respect to H, and is justified only if the assertion about the character of S is true. In the individual interpretation, the enumeration rule must be used to make assertions over two entirely different domains in the process of ascertaining the row probability of H. First, it is used over the domain of individual events like D and F, which domain forms the inductive basis for the assertion of the structure statement S. Then it is used a second time over the domain of confirmation instances to obtain the row probability for H. Thus, the row probability of H, on the individual interpretation, is a probability asserted in the context of advanced knowledge.

Similarly, on the individual interpretation, if one wants to calculate the probability of a structural hypothesis like S, then it will be necessary, in order to calculate a confirmation class for S, to assert some limit statements about the events which determine the primary sequence of (D, F)'s, i.e., statements like H, on the basis of induction by enumeration.

Reichenbach himself analyzed carefully the nature of orthodox statistical theory. In so doing he clearly recognized that most of the inferences licensed by this theory are made in the context of advanced knowledge. In fact, he described in detail how one might proceed to determine the nature of the sequence structure by repeated use of the enumeration rule. 15 What is missing in his writing, and which I am attempting to supply, is the extension of these ideas to the problem of the probability of hypotheses, so that it is treated in its full generality. The aim is to overcome a prima facie inadequacy of the interpretation and to lay bare the more fundamental difficulties of this position.

With respect to his interpretation of the probability of hypotheses, the conclusion is that the lattice row probability of an arbitrary individual hypothesis can be asserted only on the basis of a prior inference by induction to another generalization. This certainly goes against the impression Reichenbach sometimes gives that the row probability of any hypothesis can be given in the context of primitive knowledge; 16 that is, that all of the uses of the enumeration rule in determining a row probability of H are of the same kind, namely over confirming instances alone. Does this situation undercut the self-corrective nature of repeated uses of the enumerative rule? The corrective nature seems to be retained only if the characterization of a confirmation instance relative to a particular H may change as one moves out the lattice row, which is somewhat awkward and certainly not something Reichenbach tells us about his method. More will be said about this problem toward the end of the paper.

In the preceding pages, I have been exploring the difficulties which the Reichenbach analysis is heir to because the confirmation instance of a hypothesis can only be defined if the structure of the primary sequence is characterized. There is another, and I think, more serious difficulty facing his account on the score of characterizing this concept, which has to do with a certain arbitrariness in the choice of the confirmation class E. Earlier the confirmation class E was characterized as a class defined on a class I of D's such that the probability of a particular observation O having an observed $F^n(D, F)$ falling in the interval $r_1 \le F^n(D, F) \le r_2$ is near 1. But the class E in general is not unique. There are typically a large number of classes E (the set is infinite for a continuous probability distribution) so that $P_H(I, O \in E)$ is near 1. These alternative classes are constructed by taking different ranges into which $F^n(D, F)$ must fall, all the ranges being subject to the condition that $P_H(I, O \in E)$ is near 1. Some of these classes are highly counterintuitive as confirmation classes. For example, suppose $P(D, F) = 0.2$ and the sequence of (D, F)'s is random in F. If one takes an observation of 1000 D's then the probability of getting exactly 200 F's is small; but still 200 F's is a possible observation, no other single observation has a higher expectation, and most other single values have a much lower expectation. Yet one can construct a confirmation class E so that $P(I, O \in E)$ is near 1 and 200 F's is not in the confirmation class at all.
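The example can be checked by direct calculation. The Python sketch below assumes that "random in F" means the count of F's in 1000 D's is binomially distributed with parameter 0.2; the particular range of 160 to 240 F's is a hypothetical choice of $r_1$ and $r_2$ made here for illustration.

```python
from math import exp, lgamma, log


def binom_pmf(n, k, r):
    """Probability of exactly k F's among n D's when P(D, F) = r, with 0 < r < 1."""
    log_p = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
             + k * log(r) + (n - k) * log(1 - r))
    return exp(log_p)


n, r = 1000, 0.2
pmf = [binom_pmf(n, k, r) for k in range(n + 1)]

# The single most probable observation under H.
most_probable = max(range(n + 1), key=lambda k: pmf[k])

# A conventional range of counts, and the same range with 200 removed.
central = set(range(160, 241))
gapped = central - {200}
p_central = sum(pmf[k] for k in central)
p_gapped = sum(pmf[k] for k in gapped)

print(most_probable, round(pmf[200], 4), round(p_central, 4), round(p_gapped, 4))
```

On these assumptions exactly 200 F's is the most probable single observation, yet it carries only a few percent of the total probability, so the class that omits it still has a probability above 0.95 under H.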

When faced with this difficulty, one seeks some criterion which provides a means of reducing the number of classes which will count as confirmation classes and which are also justifiable or in some manner non-arbitrary. Orthodox statistical theory at this point suggests choosing a confirmation class which not only makes $P_H(I, O \in E)$ near 1 but which also maximizes the probability of showing that H is false if H is false. But in order to do this one must consider the class of hypotheses H' which may be true if H is false. If H' contains all the other hypotheses which are logically possible, then the class is too diverse to allow one to maximize the probability that is required. 17 However, in many important cases if H' is properly characterized, a confirmation class can actually be constructed so that $P_H(I, O \in E)$ is near 1 and so that E maximizes the probability of H being discovered as false when one of the members of H' is true. The confirmation class chosen will have the property of maximizing the probability of discovering that H is false if it is false only if the true hypothesis is in the class $H \cup H'$. But one can make the inference that the true hypothesis is in $H \cup H'$, when $H \cup H'$ is not logically exhaustive, only on the basis of positive empirical evidence. Hence, from Reichenbach's point of view, the choice of a confirmation class using the criteria applied in orthodox statistical testing can only take place in the context of advanced knowledge. 18
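
The dependence of the orthodox criterion on the rival class H' can be made concrete. The sketch below again assumes binomially distributed counts in 1000 independent trials and compares two hypothetical confirmation classes for H: P(D, F) = 0.2. The two classes, the rival values 0.25 and 0.15, and the function names are all assumptions introduced for illustration.

```python
import math

def binom_pmf(k, n, p):
    """Probability of exactly k F's in n independent trials with P(F) = p."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def prob_in(E, n, p):
    """Probability that the observed count falls in the class E."""
    return sum(binom_pmf(k, n, p) for k in E)

def power(E, n, p_rival):
    """Probability that the observation falls outside E when the rival is true."""
    return 1.0 - prob_in(E, n, p_rival)

n, p_H = 1000, 0.2
E_two_sided = range(175, 226)   # counts 175..225, centered on 200
E_one_sided = range(0, 222)     # counts 0..221, open toward low counts

for name, E in (("two-sided", E_two_sided), ("one-sided", E_one_sided)):
    print(name,
          "P_H(O in E) =", prob_in(E, n, p_H),
          "power vs 0.25 =", power(E, n, 0.25),
          "power vs 0.15 =", power(E, n, 0.15))
```

If H' contains only the rival value 0.25, the one-sided class is the more powerful of the two. Once 0.15 is admitted into H', that same class almost never exposes H as false. Which class counts as best is therefore settled only after H' has been fixed.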

It is impossible both to follow the orthodox hint in characterizing confirmation instances and to have a joint interpretation where the row probabilities are inferred in the context of primitive knowledge. As shown previously, in order to have such a joint interpretation one must define a confirmation instance E for the joint hypothesis S and H. But the considerations just adduced show that there is a large class of E's which have the property that the sequence of E's so defined has a limit near 1 if S and H is true. In order to narrow the class of E's by following the orthodox hint, one would have to consider a set of hypotheses S and ($H \cup H'$). If the latter set contains all the hypotheses logically possible relative to S, then it will be impossible to define an E with the desired properties; and if this set is a proper subset of all the hypotheses logically possible relative to S, then the true hypothesis must be in this set for the E so defined to have the characteristics claimed for it. In the latter case, the claim that the true hypothesis is in the restricted set is an empirical claim, and on Reichenbach's general empiricist position, can only be asserted on the basis of positive empirical evidence, which contradicts the supposition that the interpretation allows the inference to the row probabilities of H in the context of primitive knowledge. Hence, if we follow the hint given by orthodox statistical theory as to the proper way to characterize a confirmation instance, such an interpretation is impossible. I know of no other way of restricting the class of E's which is at all in harmony with Reichenbach's general empirical orientation. Hence, I conclude that if we are to make sense of the probability of hypotheses on Reichenbach's account the interpretation must be an individual interpretation or a joint interpretation where the row probabilities are inferred in the context of advanced knowledge. 19

Now consider the individual interpretation. Suppose that the structure of the sequence is inferred by induction by enumeration. Does the overall process (induction plus testing for confirmation instances) retain the characteristic of giving a high probability to the true hypothesis if there is one?

Before we attempt to answer that question let us look a little further into the dynamics of the individual interpretation. Suppose that we wish to test a theory T. A statistical hypothesis H is derivable from T which concerns a set-up of a particular kind. We run a sequence of n individual trials on a set-up of this kind. On the basis of those trials we use induction by enumeration and infer that the trials are independent. Then we define a confirmation class E for H which will be about frequencies in selected subsets of the primary sequence. Then using the same primary sequence of n individual trials, we set up the sequence of E's and $\bar{E}$'s from which the row probability for H is inferred. Now suppose we continue the individual trials until we have n + k individual trials. We must continue making inductions by enumeration on the structure of the extended primary sequence, otherwise the self-corrective structure of the inductive inference won't come into play. Nevertheless, Reichenbach's interpretation of the probability of hypotheses can be carried through in this manner. The interpretation has the required characteristic that hypotheses which are true, if any true ones are among those considered, will attain a probability near one in the long run. Thus, Reichenbach's claim that in principle inductive inference about theories can be reduced, via the probabilities of hypotheses, to inference by inductive enumeration is correct. Hence, his justification of induction carries through to these inductive inferences in advanced knowledge. Even though his interpretation can be carried through, his general program for justifying the enumerative rule is not without its difficulties, some of which cause particular problems in the present context.
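
The procedure just described can be sketched as a small simulation. The sketch makes several assumptions that are not in the text: the "selected subsets" are taken to be consecutive blocks of 100 trials, a block counts as an E when its relative frequency lies in a fixed interval around the value H asserts, and the primary sequence is simulated with a true value of 0.2. The inductive step that infers independence is omitted here, and all names are hypothetical.

```python
import random

def confirmation_row(primary, block_len, r1, r2):
    """Mark each consecutive block True (an E) when its relative frequency
    of F lies in [r1, r2], and False (a non-E) otherwise."""
    row = []
    for start in range(0, len(primary) - block_len + 1, block_len):
        block = primary[start:start + block_len]
        row.append(r1 <= sum(block) / block_len <= r2)
    return row

def enumeration_posit(row):
    """Rule of simple enumeration: posit the observed relative frequency of E."""
    return sum(row) / len(row)

rng = random.Random(0)  # fixed seed so that the sketch is repeatable
primary = [1 if rng.random() < 0.2 else 0 for _ in range(20_000)]

# Re-apply the rule as the primary sequence is extended from n to n + k trials.
for n_trials in (5_000, 10_000, 20_000):
    prefix = primary[:n_trials]
    row_H = confirmation_row(prefix, 100, 0.1, 0.3)      # H: P(D, F) = 0.2
    row_rival = confirmation_row(prefix, 100, 0.4, 0.6)  # rival: P(D, F) = 0.5
    print(n_trials, enumeration_posit(row_H), enumeration_posit(row_rival))
```

On simulated data of this kind the row frequency for the true hypothesis stays near 1 as the sequence is extended, while the row frequency for the rival stays near 0.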

We shall now review some of the general difficulties with the position. Reichenbach's justification of induction applies not only to the rule of simple enumeration but to any rule which will converge to the limit, if such a limit exists. Call the class of rules which have this convergence property the class of asymptotic rules. Salmon has shown that the class of such rules is clearly too broad a class to constitute a proper basis of inductive inference and has made substantial progress in eliminating subsets of these rules which generate pathological inferences. 20 Nonetheless, the task of justifying the rule of simple enumeration is not complete in that it cannot as yet be accorded status as uniquely justified.
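
A brief sketch may help to show why the class of asymptotic rules is so broad. The rival rule below, (m + a)/(n + b) for fixed constants a and b, is a hypothetical example chosen for illustration and is not taken from Reichenbach or Salmon. It disagrees with the rule of simple enumeration on every finite sample, yet converges to the same limit whenever the relative frequency has one.

```python
from fractions import Fraction

def straight_rule(m, n):
    """Rule of simple enumeration: posit the observed frequency m/n."""
    return Fraction(m, n)

def shifted_rule(m, n, a, b):
    """Hypothetical rival rule (m + a)/(n + b); a and b are fixed constants."""
    return Fraction(m + a, n + b)

# Samples whose observed frequency is exactly 1/5 at every length.
for n in (10, 100, 1000, 10000):
    m = n // 5
    print(n,
          float(straight_rule(m, n)),
          float(shifted_rule(m, n, 1, 2)),
          float(shifted_rule(m, n, 9, 10)))
```

With ten observations the three posits differ widely; with ten thousand they nearly coincide. Convergence alone therefore cannot single out one rule among them.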

Let us now look at a related problem with respect to the topic of this essay. We noticed that, on the basis of continuing applications of the enumerative rule to the primary sequence, differing conclusions as to the structure of the sequence might be inferred at different stages of inquiry. Thus the use of the probability lattice to calculate the probabilities for use in Bayes' Theorem must in reality be seen as the use of a sequence of such lattices. At any particular stage in inquiry, one uses the last lattice in the sequence. But in terms of anything like practical inference, one must not move from lattice to lattice too quickly; the situation requires a modicum of stability. The rule states that

$$\lim_{n \to \infty} F^n = m/n \pm d.$$

The stability of the system of inferences is related to the role of d in applications of the rule. If you do not have a d, every finite primary sequence which is not uniform will have at least one subsequence in which the inferred probability will differ from the probability inferred for the primary sequence. So without d as part of the rule, one can never conclude that the sequence is random, unless the sequence is uniform. Also, typically the structure attributed to the sequence will change with each new bit of evidence. So, to get any practical stability one must have a d. But the choice of a particular value for d affects the rate at which truly different sequence structures are discriminated. One intuition about the selection of d is that as the primary sequence gets longer d ought to decrease in size and so enable one to discriminate among more and more similar structures. But the particular function chosen to relate the length of the primary sequence to the value of d will tend to favor inferences to one characterization over other possible ones.
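
The role of d can be illustrated with a short sketch. The ten-element primary sequence, the choice of subsequence, and the two schedules relating d to the length of the sequence are hypothetical and serve only to show how the verdict depends on d.

```python
from fractions import Fraction

def relative_frequency(seq):
    """Observed relative frequency m/n of F's in the sequence."""
    return Fraction(sum(seq), len(seq))

def posit_interval(seq, d):
    """The posited interval m/n - d to m/n + d for the limit."""
    f = relative_frequency(seq)
    return (f - d, f + d)

def same_probability(primary, sub, d):
    """True when the subsequence's frequency lies within d of the primary's."""
    return abs(relative_frequency(sub) - relative_frequency(primary)) <= d

primary = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0]  # observed frequency 1/2
every_second = primary[::2]                # observed frequency 3/5

n = len(primary)
d_slow = 1 / n ** 0.5        # hypothetical schedule: d shrinks like 1/sqrt(n)
d_fast = Fraction(1, 2 * n)  # hypothetical schedule: d shrinks like 1/(2n)

print("d = 0:   ", same_probability(primary, every_second, 0))
print("d_slow:  ", same_probability(primary, every_second, d_slow))
print("d_fast:  ", same_probability(primary, every_second, d_fast))
```

With d = 0 the subsequence already counts as having a different probability, so the sequence cannot be judged random. The two schedules, applied to the same ten observations, give opposite verdicts.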

Salmon, in discussing the role of d in Reichenbach's statement of the rule, suggests that the choice of d is a pragmatic question and that it should be chosen in relation to the practical inference at hand. 21 When the rule is being used where practical inferences are being made this suggestion is a reasonable one. But where the question of the choice of d is raised in the theoretical context of establishing the probabilities of a whole set of hypotheses, the choice takes on a more systematic character. In effect the choice of a particular value for d partially determines a metric in a whole system of inductive inferences, much in the way that the choice of a c-function determines an inductive logic in early Carnapian systems. 22 Unless there is some means of determining a value for d, the interpretation is subject to a very large degree of arbitrariness. This and the related arbitrariness involved in choosing a rule from the class of asymptotic rules remain among the outstanding problems of Reichenbach's inductive logic.

Wichita State University

NOTES

* Part of the research for this paper was done during study supported by the National Science Foundation. I benefited from comments by James A. Fulton, Deborah H. Soles, and James W. Nickel.

1 Wesley C. Salmon, 'The Philosophy of Hans Reichenbach', Synthese 34 (1977), 5-88.

2 Hans Reichenbach, The Theory of Probability (Berkeley and Los Angeles: University of California Press, 1949), pp. 432-433. Also see his Experience and Prediction (Chicago: The University of Chicago Press, 1938), p. 304.

3 Theory of Probability, pp. 434-442. See especially pp. 438-441.

4 Salmon's work on the probabilities of hypotheses is summarized in his The Foundations of Scientific Inference (Pittsburgh: The University of Pittsburgh Press, 1967), Section VIII. Nagel's comments are in: Ernest Nagel, 'Principles of the Theory of Probability', in Otto Neurath, Rudolf Carnap, and Charles Morris (eds.), Foundations of the Unity of Science, Vol. 1, Chicago, 1955 (originally published separately in 1939), pp. 404-408; and 'Review of Reichenbach, The Theory of Probability', The Journal of Philosophy 47, 551-555.

5 In Experience and Prediction, Reichenbach proceeded from the general empiricist position that no statement could be known as true; and so only probability weights, and not truth, are attributed to individual observation statements. In Theory of Probability, he took the admittedly simplified position of treating such statements as true. I follow the latter course in this paper.

6 This follows Salmon's exposition in Foundations of Scientific Inference, p. 117.

7 See Reichenbach, Theory of Probability, Section 72, and Salmon, Foundations of Scientific Inference, pp. 90-96. One of Nagel's criticisms of Reichenbach was that there are not enough different theories to constitute a proper reference class. Modern analytic techniques, however, allow scientists to formulate and compare many different but similar theories. For example, see Clifford M. Will, 'Gravitation Theory', Scientific American (1974), 24-33, where a large class of alternatives in Einstein's general theory of relativity is examined.

8 Wesley C. Salmon, 'The Frequency Interpretation and Antecedent Probabilities', Philosophical Studies 4 (1953), 44-48. [Guest Editor's note: I now consider this article hopelessly naive.]

9 Theory of Probability, p. 364.

10 Ibid., Section 91; Experience and Prediction, Section 39.

11 The exact characterization of the nature of a physical law is a complex matter in Reichenbach's later work. See his Nomological Statements and Admissible Operations (Amsterdam: North-Holland, 1954). The probability statements I am considering would probably be considered derivative laws in the language of this book.

12 'Confirmatory instance' is the term used by Salmon and not by Reichenbach; its use seems to be consistent with Reichenbach's intention.

13 Theory of Probability, pp. 434-439. Salmon seems to make the same assumption in 'The Frequency Interpretation'.

14 '$P_H(I, O \in E)$' means '$P(I, O \in E)$ calculated on the basis of H'.

15 Theory of Probability, pp. 463-465.

16 Ibid., p. 439. "We now establish the degree of probability for each row [a first level probability] by means of a posit based on the inductive rule. Assuming the posits to be true, we count in the vertical direction and thus construct the probability of the second level holding for the statement that the probability of a row is = p." Because of the possibility of very long, but finite runs, no hypothesis but the hypothesis that the sequence is uniform can really be treated so that the row probability is inferred in the context of primitive knowledge.

17 For a discussion of this point and of the general theory of orthodox statistical inference, see my 'Material Conditions on Tests of Statistical Hypotheses', in Roger C. Buck and Robert S. Cohen (eds.), PSA 1970 (Dordrecht-Holland: Reidel, 1971), pp. 403-412. For a more complete discussion, see Ronald N. Giere, 'The Epistemological Roots of Scientific Knowledge', in Grover Maxwell and R. M. Anderson, Jr. (eds.), Induction, Probability, and Confirmation: Minnesota Studies in the Philosophy of Science, Vol. V (Minneapolis: University of Minnesota Press, 1975), pp. 212-261. See especially pp. 238-240.

18 There might be theoretical reasons to expect the true hypothesis to belong to a restricted set of hypotheses, or one might proceed by inferring a set $H \cup H'$ by using the enumeration rule with a fairly large d, thus in effect giving a set of hypotheses. See the discussion about the role of d at the end of this paper.

19 From here on when I talk about the individual interpretation, I shall assume that to give a proper individual interpretation one must be in the context of advanced knowledge. Although one can carry through a joint interpretation in the context of advanced knowledge, as described here, no further discussion of this possibility is given because there are no epistemic issues raised by this approach that do not arise for the individual interpretation.

20 The Foundations of Scientific Inference, Section VI, and the references cited therein.

21 Ibid., p. 138, n. 111.

22 Rudolf Carnap, Logical Foundations of Probability (Chicago: University of Chicago Press, 2nd ed., 1962), Section 110.