A Linear Programming Framework for Logics of Uncertainty*

K. A. Andersen
Matematisk Institut, Århus Universitet
DK-8000 Århus C, Denmark

J. N. Hooker
Graduate School of Industrial Administration
Carnegie Mellon University, Pittsburgh, PA 15213 USA

May 1992
Revised September 1992

Abstract

Several logics for reasoning under uncertainty distribute "probability mass" over sets in some sense. These include probabilistic logic, Dempster-Shafer theory, other logics based on belief functions, and second-order probabilistic logic. We show that these logics are instances of a certain type of linear programming model, typically with exponentially many variables. We also show how a single linear programming package can implement these logics computationally if one "plugs in" a different column generation subroutine for each logic.

1 Introduction

Several logics for reasoning under uncertainty are variations on a theme. Numbers, perhaps probabilities, are assigned to propositions to indicate degrees of confidence. The object is to determine the degree of confidence one can have in a conclusion inferred from the propositions. Dependencies among the propositions require that some of the "probability mass" assigned to one proposition be distributed to others. Solution of this distribution problem yields a range of confidence levels for the conclusion.

The oldest uncertainty logic of this kind is Boole's probabilistic logic [2, 3], which was revived a few years ago by Hailperin [12, 13], rediscovered by Nilsson [18], and recently discussed by a number of others [5, 8, 9, 10, 11, 15, 16, 17, 19]. But Dempster-Shafer theory has a similar structure [20], as does a logic based on Shafer's belief functions suggested by Dubois, Prade and others [7, 16, 17]. A number of other logics can be devised along similar lines.

* The second author is partially supported by ONR grant N00014-92-J-1028 and AFOSR grant 91-0287.

It seems to be widely recognized that several uncertainty logics can be viewed as probability mass distribution problems in some sense. Here we not only make this sense precise but show the following. (a) Inference in all these logics can be formulated as a linear programming problem of a certain type, typically with exponentially many variables in the worst case. (b) At least in the logics discussed here, the exponential number of variables can be dealt with computationally by using column generation schemes, a well-known device for such situations. The practicality of this approach has already been demonstrated for probabilistic logic [15]. This suggests that a single linear programming code can implement several uncertainty logics, if one plugs in a different column generation subroutine for each logic.

We will show how several logics fit into this framework and will describe the column generation subproblem in each case. In probabilistic logic, column generation is a pseudo-boolean optimization problem, as is already well known. In Dempster-Shafer theory it is an integer programming problem. We will introduce a second-order probabilistic logic in which it is a mixed integer/linear programming problem. In the logic of belief functions mentioned above, there is no exponential explosion of columns, and a column generation technique is likely to be unnecessary.

For some applications one may wish to add nonlinear constraints, although we do not pursue this possibility here. For instance, Dempster's combination rule, which is a key ingredient of Dempster-Shafer theory, combines a renormalization device with an independence assumption. The rule can be used perfectly well, and in many cases more appropriately, without the independence assumption, and we do so here.

But the independence assumption can be imposed by adding nonlinear constraints to the otherwise linear model. Probabilistic logic can also be augmented with independence assumptions, such as those depicted by a Bayesian network, by adding nonlinear constraints. In [1] we show when and how this can be done without an exponential growth in the number of nonlinear constraints. It is unclear at this point whether column generation techniques may be successfully extended to nonlinear problems.

We begin below with a statement of the general linear programming model. After discussing briefly how a column generation approach is implemented computationally, we show how several logics can be placed in this framework. These include probabilistic logic, a version of probabilistic logic with unreliable sources of information, Dempster-Shafer theory, second-order probabilistic logic (which allows for unreliable sources in a different way), and a simple logic of belief functions.

2 The General Model

We are given propositions $F_1, \ldots, F_h$ and some information about the level of confidence we may have in them. The confidence level for $F_i$ is indicated by its "mass," which is a number in the interval $[0,1]$. The interpretation of mass varies from one logic to another; in probabilistic logic, for instance, it is probability mass in the classical sense. Since the precise mass of $F_i$ may be unknown, we will suppose that an interval $[L_i, U_i]$ is given, within which the mass lies. If nothing is known about the mass of $F_i$, we set $L_i = 0$, $U_i = 1$.

Let $S_i$ be the set of possible outcomes that make proposition $F_i$ true. In probabilistic logic, $S_i$ is the set of "possible worlds" in which $F_i$ is true. In Dempster-Shafer theory, it is a subset of the "frame of discernment." We let $\mu(S_i)$ denote the mass of $F_i$, which we also refer to as the mass of $S_i$. The sets $S_1, \ldots, S_h$ need not all be distinct.

We are interested in knowing how much confidence we can place in a proposition $F_t$ whose mass is not given. Its mass is constrained by the fact that $S_t$ intersects some of the sets for which masses are given.

A fundamental problem, therefore, is to find out how the mass assigned a set $S_i$ can be distributed over its intersections with other sets.

The precise formulation of this distribution problem varies from logic to logic. In a given logic, certain intersections $\bigcap_{i \in J} S_i$ are of interest, where the index sets $J \subseteq \{1, \ldots, h\}$ belong to a family $I$. It is convenient to let $S_J$ denote $\bigcap_{i \in J} S_i$. Different index sets $J$ may give rise to the same intersection $S_J$.

We associate mass $q_J$ with each index set $J \in I$. Thus we wish to distribute the mass assigned $S_i$ over variables $q_J$ associated with $S_i$'s intersections with other sets:
\[
  \mu(S_i) = \sum_{J \in I(i)} q_J . \tag{1}
\]
Thus $S_i = \bigcup_{J \in I(i)} S_J = \bigcup_{J \in I(i)} \bigcap_{j \in J} S_j$. The family $I(i)$ of index sets depends on the type of logic.

Since $\mu(S_i)$ must lie in $[L_i, U_i]$, we have the constraints
\[
  L_i \le \mu(S_i) \le U_i , \quad i = 1, \ldots, m , \tag{2}
\]
\[
  q_J \ge 0 , \quad \text{all } J \in I , \tag{3}
\]
where $\mu(S_i)$ is given by (1).

We can place bounds on the mass of $S_t$ by solving the optimization problem
\[
  \min/\max \ \frac{\mu^*(S_t)}{1 - \mu^*(\emptyset)} \quad \text{s.t. (2), (3).} \tag{4}
\]
The notation $\mu^*(S_t)$ is used to indicate that mass may be distributed differently in the objective function than in the constraints. Thus we have
\[
  \mu^*(S_t) = \sum_{J \in I^*(t)} q_J , \tag{5}
\]
where the index set $I^*(t)$ depends on the logic.

The objective function is normalized by dividing by the mass that is not assigned to the empty set. Among the logics we discuss, this plays a role only in Dempster-Shafer theory. In the other logics, no mass is assigned to the empty set, so that $\mu^*(\emptyset) = 0$ and (4) is a linear programming problem. When $\mu^*(\emptyset) \ne 0$, (4) becomes a fractional programming problem that is readily transformed to a linear programming problem using well-known methods [4].
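To make the model concrete, the following is a minimal sketch (our own illustration, not from the paper) of the master problem (4) in the case $\mu^*(\emptyset) = 0$, solved with scipy.optimize.linprog. The instance has two propositions with all four columns enumerated up front; the data `A`, `L`, `U` and the target column are assumptions of the example.

```python
# A minimal sketch of the master problem (4) when mu*(empty) = 0, using
# scipy.optimize.linprog.  Illustrative instance: mu(S1) in [0.6, 0.8],
# mu(S2) in [0.5, 0.7], columns q_J enumerated up front as the four
# cells of the partition generated by S1 and S2, target F_t = F1 & F2.
import numpy as np
from scipy.optimize import linprog

# Columns (cells): S1&S2, S1&~S2, ~S1&S2, ~S1&~S2
A = np.array([
    [1, 1, 0, 0],   # membership of each cell in S1
    [1, 0, 1, 0],   # membership of each cell in S2
])
L = np.array([0.6, 0.5])
U = np.array([0.8, 0.7])

# Constraints (2): L <= A q <= U, plus total mass one (the tautology).
A_ub = np.vstack([A, -A])          # A q <= U  and  -A q <= -L
b_ub = np.concatenate([U, -L])
A_eq = np.ones((1, 4))
b_eq = np.array([1.0])

c = np.array([1, 0, 0, 0])         # objective row: cells contained in S_t

lo = linprog(c,  A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)
hi = linprog(-c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)
print(lo.fun, -hi.fun)             # bounds on mu*(S_t): 0.1 and 0.7
```

The computed bounds [0.1, 0.7] are exactly the classical Fréchet bounds $\max(0, 0.6 + 0.5 - 1)$ and $\min(0.8, 0.7)$ on $\Pr(F_1 \wedge F_2)$.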

The sets $I(i)$ and $I^*(t)$ give instructions for generating the columns of the coefficient matrix in (4). Each column corresponds to a set $J \in I$. Assuming $\mu^*(\emptyset) = 0$, it has the form $(y_0, y_1, \ldots, y_m)$, where $y_i$ ($i \ge 1$) is the coefficient of $q_J$ in constraint $i$ and $y_0$ its coefficient in the objective function. The column is given by
\[
  y_i = \begin{cases} 1 & \text{if } J \in I(i) \\ 0 & \text{otherwise,} \end{cases} \quad i = 1, \ldots, m ; \qquad
  y_0 = \begin{cases} 1 & \text{if } J \in I^*(t) \\ 0 & \text{otherwise.} \end{cases} \tag{6}
\]
A similar column definition can be given for the linearized version of (4) when $\mu^*(\emptyset) \ne 0$.

Logics that use conditional probabilities require a model in which the masses in (4) are replaced with "conditional masses." A conditional mass $\mu(S \mid T)$ is defined to be equal to $\mu(S \cap T)/\mu(T)$. Intervals $[L_i, U_i]$ are given for conditional masses $\mu(S_i \mid S_{c(i)})$, where $S_{c(i)}$ is one of the sets $S_1, \ldots, S_h$. Thus the constraints (2) become
\[
  L_i \le \frac{\mu(S_i \cap S_{c(i)})}{\mu(S_{c(i)})} \le U_i .
\]
These constraints can be written in linear form,
\[
  0 \le \mu(S_i \cap S_{c(i)}) - L_i \, \mu(S_{c(i)}) , \qquad
  \mu(S_i \cap S_{c(i)}) - U_i \, \mu(S_{c(i)}) \le 0 . \tag{7}
\]
Assuming $\mu^*(\emptyset) = 0$, the linear programming problem (4) becomes a fractional programming problem,
\[
  \min/\max \ \frac{\mu^*(S_t \cap S_{c(t)})}{\mu^*(S_{c(t)})} \quad \text{s.t. (7), (3).} \tag{8}
\]
This is again convertible to a linear programming problem.

When the objective function of (8) is an unconditional mass $\mu^*(S_t)$, column $J$ of (8) has the form $(y_0, y_1, z_1, \ldots, y_m, z_m)$, where
\[
  y_i = \begin{cases} 1 - L_i & \text{if } J \in I(i) \cap I(c(i)) \\ -L_i & \text{if } J \in I(c(i)) \setminus I(i) \\ 0 & \text{otherwise,} \end{cases} \quad i = 1, \ldots, m ;
\]
\[
  z_i = \begin{cases} 1 - U_i & \text{if } J \in I(i) \cap I(c(i)) \\ -U_i & \text{if } J \in I(c(i)) \setminus I(i) \\ 0 & \text{otherwise,} \end{cases} \quad i = 1, \ldots, m ;
\]
\[
  y_0 = \begin{cases} 1 & \text{if } J \in I^*(t) \\ 0 & \text{otherwise.} \end{cases} \tag{9}
\]
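Since the paper cites the "well-known methods" of [4] without writing them out, here is a brief sketch of the Charnes-Cooper substitution applied to (8), assuming $\mu^*(S_{c(t)}) > 0$; the scaled variables $p_J$ and the scalar $r$ are notation introduced here, not the paper's.

```latex
% Sketch of the Charnes-Cooper substitution [4] applied to (8), assuming
% mu*(S_{c(t)}) > 0.  Set r = 1 / mu*(S_{c(t)}) and p_J = r q_J; then
% (8) becomes the linear program
\min/\max \ \sum_{J \in I^*(t) \cap I^*(c(t))} p_J
\qquad \text{s.t.} \qquad
\sum_{J \in I^*(c(t))} p_J = 1 , \qquad p_J \ge 0 ,
% together with the constraints (7), which are homogeneous and therefore
% keep the same form in the variables p_J.
```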

When the objective function is conditional, a similar column definition can be given for the linearized form of (8).

Table 1 shows how the four uncertainty logics considered here fit into this pattern.

3 The Column Generation Subproblem

We have the model
\[
  \begin{aligned}
  \min/\max \ \ & \mu^*(S_t) = \sum_{J \in I^*(t)} q_J && \text{(10)} \\
  \text{s.t.} \ \ & \mu(S_i) = \sum_{J \in I(i)} q_J \le U_i , \quad i = 1, \ldots, m , \\
  & \mu(S_i) = \sum_{J \in I(i)} q_J \ge L_i , \quad i = 1, \ldots, m , \\
  & q_J \ge 0 , \quad J \in I .
  \end{aligned}
\]
In general, depending on the logic, there can be an exponential number of columns in the above program. Therefore it might be a good idea to use a column generation procedure. For an introduction to column generation procedures, especially Dantzig-Wolfe, see [6].

Assume that we have generated only some of the columns:
\[
  \begin{aligned}
  \min/\max \ \ & \mu^*(S_t) = \sum_{J \in \bar{I}^*(t)} q_J && \text{(11)} \\
  \text{s.t.} \ \ & \sum_{J \in \bar{I}(i)} q_J \le U_i , \quad
    \sum_{J \in \bar{I}(i)} q_J \ge L_i , \quad i = 1, \ldots, m , \\
  & q_J \ge 0 , \quad J \in \bar{I} .
  \end{aligned}
\]
Here the index sets $\bar{I}^*(t)$, $\bar{I}(i)$ and $\bar{I}$ denote the columns generated so far. What we need now is to determine whether the columns generated so far are sufficient for solving the program and, if not, how a new (improving) column may be generated. This is usually done by constructing a subproblem, as sketched in the code below. By maximizing or minimizing a certain objective function over some set, it is possible to decide whether an improving column exists. If one exists, it is added to the program, which is then resolved. If no improving column exists, the optimal solution to the program has been found.
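The loop below is a schematic sketch of this procedure, i.e. of the paper's proposed architecture of a single LP code with a plug-in pricing subroutine, written for the maximization case. The function name, the `pricing` oracle interface, and the use of scipy.optimize.linprog are our assumptions; `initial_columns` must make the restricted program feasible (the paper notes one can start from slack and surplus variables alone).

```python
# Schematic column generation loop for maximizing (10) via restricted
# masters (11); a sketch of the "plug-in subroutine" idea, not the
# authors' code.  The oracle pricing(pi, gamma) solves the
# logic-specific subproblem (12) and returns a new column
# (y0, y1, ..., ym) with (pi - gamma).y - y0 < 0, or None if no
# improving column exists.
import numpy as np
from scipy.optimize import linprog

def column_generation_max(L, U, pricing, initial_columns):
    m = len(L)
    cols = list(initial_columns)            # each column: (y0, y1, ..., ym)
    while True:
        c = np.array([col[0] for col in cols], float)
        A = np.array([col[1:] for col in cols], float).T   # m x ncols
        A_ub = np.vstack([A, -A])           # A q <= U  and  -A q <= -L
        b_ub = np.concatenate([np.asarray(U, float), -np.asarray(L, float)])
        res = linprog(-c, A_ub=A_ub, b_ub=b_ub, method="highs")  # max c.q
        duals = res.ineqlin.marginals       # sensitivities w.r.t. b_ub
        pi, gamma = -duals[:m], -duals[m:]  # recover pi, gamma >= 0
        col = pricing(pi, gamma)            # subproblem (12)
        if col is None:                     # no improving column: optimal
            return -res.fun, cols
        cols.append(col)
```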

Table 1: Linear programming structure of four uncertainty logics.

Interpretation of set $S_i$:
- Probabilistic logic: set of possible worlds in which proposition $F_i$ is true.
- Dempster-Shafer theory: subset of the frame of discernment associated with some evidence source $k$.
- 2nd-order probabilistic logic: half-space in probability space defined by $\Pr(F_i) \le \alpha$.
- Belief functions: subset of the frame of discernment.

Constraints, where $\mu(S_i) = \sum_{J \in I(i)} q_J$:
- Probabilistic logic: bounds on prior probabilities:$^1$ $L \le \mu(S_i) \le U$.
- Dempster-Shafer theory: value of basic probability function: $\mu(S_i) = m_k(S_i)$.
- 2nd-order probabilistic logic: bounds on 2nd-order probability that $\Pr(F_i) \le \alpha$: $L \le \mu(S_i) \le U$.
- Belief functions: bounds on value of belief function $\mathrm{Bel}(S_i)$: $L \le \mu(S_i) \le U$.

Objective function, where $\mu^*(S_i) = \sum_{J \in I^*(i)} q_J$:
- Probabilistic logic: posterior probability $\mu^*(S_t \cap S_{c(t)}) / \mu^*(S_{c(t)})$.
- Dempster-Shafer theory: normalized probability $\mu^*(S_t) / (1 - \mu^*(\emptyset))$.
- 2nd-order probabilistic logic: 2nd-order probability that $\Pr(F_t) \le \alpha$: $\mu^*(S_t)$.
- Belief functions: value of belief function $\mathrm{Bel}(S_t)$: $\mu^*(S_t)$.

Intersections $S_J = \bigcap_{i \in J} S_i$:
- Probabilistic logic: all minimal$^2$ nonempty intersections of sets $S_i$.
- Dempster-Shafer theory: all intersections of one $S_i$ associated with each evidence source $k$.
- 2nd-order probabilistic logic: all minimal nonempty intersections of halfspaces $S_i$.
- Belief functions: the sets $S_i$.

$I(i)$ contains all $J$ for which:
- Probabilistic logic: $S_J \subseteq S_i$.
- Dempster-Shafer theory: $i \in J$.
- 2nd-order probabilistic logic: $S_J \subseteq S_i$.
- Belief functions: $S_J \subseteq S_i$.

$I^*(i)$ contains all $J$ for which:
- Probabilistic logic: $S_J \subseteq S_i$.
- Dempster-Shafer theory: $S_J \subseteq S_i$.
- 2nd-order probabilistic logic: $S_J \subseteq S_i$.
- Belief functions: $S_J \subseteq S_i$.

Practical generation of columns $q_J$:
- Probabilistic logic: pseudo-boolean optimization.
- Dempster-Shafer theory: integer programming.
- 2nd-order probabilistic logic: mixed integer programming.
- Belief functions: none required.

Notes:
1. Can also place bounds on conditional probabilities $L_i \le \Pr(F_i \mid F_{c(i)}) \le U_i$ by using the constraints $L_i \, \mu(S_{c(i)}) \le \mu(S_i \cap S_{c(i)})$ and $U_i \, \mu(S_{c(i)}) \ge \mu(S_i \cap S_{c(i)})$.
2. A minimal intersection $S_J$ is one containing no other nonempty intersection of $S_i$'s.

The procedure can be started with any set of known columns, possibly none, in which case the program contains only slack and surplus variables in the constraints.

To describe the column generation procedure, suppose we are maximizing the programs (10) and (11). The procedure is similar when minimizing. Let $\pi_i$, $i = 1, \ldots, m$, denote the dual variables of the constraints $\sum_{J \in I(i)} q_J \le U_i$, and let $\gamma_i$, $i = 1, \ldots, m$, denote the dual variables of the constraints $\sum_{J \in I(i)} q_J \ge L_i$. Then $\pi_i \ge 0$ and $\gamma_i \ge 0$, $i = 1, \ldots, m$. Suppose we construct a set, say $P$, such that the extreme points of $P$ are exactly the possible columns $q_J$, $J \in I$. Then the subproblem becomes
\[
  \min \ (\pi - \gamma) y - y_t \quad \text{s.t.} \quad y = (y_1, \ldots, y_m) \in P , \tag{12}
\]
where $\pi = (\pi_1, \ldots, \pi_m)$ and $\gamma = (\gamma_1, \ldots, \gamma_m)$.

If the optimal value of this problem is strictly less than 0, then an improving column $q_J$ has been found. The index $J$ is added to the index sets $\bar{I}^*(t)$, $\bar{I}(i)$ and $\bar{I}$, meaning that column $q_J$ is added to the program (11), which is then resolved. If, on the contrary, the optimal value is at least 0, then an optimal solution to (10) has been determined. We note that the problem is, in a sense, to state the set $P$ in a reasonable way. For the logics described in this paper, the sets $P$ are described in sections 4-7.

4 Probabilistic Logic

In probabilistic logic, conditional probabilities $\Pr(F_i \mid F_{c(i)})$ are constrained to lie in intervals $[L_i, U_i]$, where the $F_i$'s are formulas of propositional logic. (An unconditioned probability $\Pr(F_i)$ can be given by letting $F_{c(i)}$ be a tautologous proposition.) The object is to compute bounds on a probability $\Pr(F_t \mid F_{c(t)})$ that are consistent with the given probabilistic information.

The formulas $F_i$ contain atomic propositions $x_1, \ldots, x_n$. A possible world is an assignment $v : \{x_1, \ldots, x_n\} \to \{0, 1\}$ of truth values to the atomic propositions. $F_i$ is true in a possible world $v$ when the assignment $v$ makes it true, which we indicate by writing $v(F_i) = 1$.

Let $S_i$ be the set of possible worlds in which $F_i$ is true. Thus $\Pr(F_i)$ is the probability that the actual world lies in $S_i$. We therefore interpret $\mu(S_i)$ to be the probability $\Pr(F_i)$ and $\mu(S_i \mid S_{c(i)})$ to be the conditional probability $\Pr(F_i \mid F_{c(i)})$.

The given intervals $[L_i, U_i]$ impose the constraints (7). Since the probabilities of all possible worlds must sum to one, we use one of the constraints (7) to assign a mass of one to a tautologous proposition.

By the law of total probability, $\mu(S_i)$ is the sum of the probabilities of the possible worlds in $S_i$. The probability mass $\mu(S_i)$ must therefore be distributed over these worlds. If we let $q_v$ denote the probability of world $v$, we have
\[
  \mu(S_i) = \mu^*(S_i) = \sum_{v \in S_i} q_v . \tag{13}
\]
The inference problem is to place bounds on a conditional probability $\Pr(F_t \mid F_{c(t)})$. Thus we solve (8), where $\mu$ and $\mu^*$ are given by (13).

We now show that probabilistic logic fits the general model of the previous section. Whenever an interval $[L_i, U_i]$ is given for $\Pr(F_i)$, the interval $[1 - U_i, 1 - L_i]$ is implicitly given for $\Pr(\neg F_i)$. We therefore assume without harm that $\neg F_i$ belongs to the list $F_1, \ldots, F_h$ whenever $F_i$ does; that is, the complement $\bar{S}_i$ of $S_i$ belongs to the list $S_1, \ldots, S_h$ whenever $S_i$ does. Since the probability mass attributed to a set is spread over the possible worlds in the set, empty sets cannot receive probability mass.

The intersections $S_J$ are all minimal nonempty intersections of $S_1, \ldots, S_h$. A minimal intersection is one that properly contains no nonempty intersection $S_{J'}$ for any $J' \subseteq \{1, \ldots, h\}$. Thus $J \in I$ if and only if $S_J$ is a minimal nonempty intersection.

Because $\bar{S}_i$ is among $S_1, \ldots, S_h$ whenever $S_i$ is, the distinct intersections $S_J$ for $J \in I$ partition the set of all possible worlds. Note that $S_i$ may contain $S_J$ even when $i \notin J$. To distribute the probabilities $\Pr(S_i)$ over the $q_J$'s as in (1), we let
\[
  I(i) = \{ J \in I \mid S_J \subseteq S_i \} . \tag{14}
\]
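As a small illustration of (14) (our example, not the paper's): suppose the list consists of $S_1$, $\bar S_1$, $S_2$, $\bar S_2$.

```latex
% Illustration of (14) for the list S_1, \bar S_1, S_2, \bar S_2
% (our example): the minimal intersections, when nonempty, are the cells
S_{J_1} = S_1 \cap S_2 , \quad
S_{J_2} = S_1 \cap \bar S_2 , \quad
S_{J_3} = \bar S_1 \cap S_2 , \quad
S_{J_4} = \bar S_1 \cap \bar S_2 ,
% which partition the possible worlds, and, e.g.,
% I(1) = \{J_1, J_2\}, so that mu(S_1) = q_{J_1} + q_{J_2}.
```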

We must now show that distributing probability over the variables $q_J$ as in (1) results in the same problem (8) as distributing it over possible-world probabilities $q_v$ as in (13). We can do this by showing that (8) has the same columns in either case. Note first that for any $J \in I$, $q_J$ occurs in a constraint or objective function of (8) with a given coefficient if and only if all variables $q_{J'}$ with $S_{J'} = S_J$ occur with that coefficient. These variables can therefore be collapsed into one, say $q_J$, which represents the set $S_J$ in the partition of possible worlds. But the column corresponding to $q_J$ is identical to that for $q_v$, where $v$ is any possible world in $S_J$. Thus the possible worlds $v$ generate the same columns as the sets $J$.

The column description (14) is not computationally practical. Determining whether $J \in I$ involves solving a satisfiability problem to check whether $S_J$ is empty. Thus we generate columns corresponding to possible worlds $v$ rather than sets $J$. In the nonconditional problem (4) this yields columns $(y_t, y_1, \ldots, y_m)$ with $y_i = v(F_i)$. If (for instance) $F_i$ is a logical clause $\bigvee_{j \in P} x_j \vee \bigvee_{j \in N} \neg x_j$, then $y_i$ can be written as a pseudo-boolean function $y_i = 1 - \prod_{j \in P} (1 - x_j) \prod_{j \in N} x_j$. Thus the column generation subproblem, which is to minimize (12), becomes a pseudo-boolean optimization problem; a brute-force illustration appears below. The situation is similar for the conditional problem (8).

Whether generated by $J$'s or possible worlds, some of the columns of (4) are identical. But two identical columns will never appear in the basis of the solution. Thus when several possible worlds generate the same column, only one of the worlds will in practice absorb all the probability attributed to the set $S_J$ containing those worlds.
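The following small illustration (ours, not the paper's) generates columns from possible worlds as just described: $y_i = v(F_i)$, $y_t = v(F_t)$. Exhausting the worlds doubles as a toy pricing oracle for (12); a practical implementation would instead solve the pseudo-boolean optimization problem.

```python
# Columns from possible worlds: y_i = v(F_i), y_t = v(F_t).  Brute force
# over worlds serves as a toy pricing oracle for (12); the formulas are
# an illustrative instance, not from the paper.
from itertools import product

# Atoms x1, x2; F1 = x1, F2 = (not x1) or x2, Ft = x2, each written in
# the pseudo-boolean form y = 1 - prod_{j in P}(1 - x_j) prod_{j in N} x_j.
F = [lambda x: x[0],
     lambda x: 1 - x[0] * (1 - x[1])]
Ft = lambda x: x[1]

def pricing(pi, gamma):
    """Return the world minimizing (pi - gamma).y - y_t, as in (12)."""
    best = min(product((0, 1), repeat=2),
               key=lambda x: sum((p - g) * f(x)
                                 for p, g, f in zip(pi, gamma, F)) - Ft(x))
    return best, [Ft(best)] + [f(best) for f in F]

print(pricing([0.5, 0.0], [0.0, 0.2]))   # -> ((0, 1), [1, 0, 1])
```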

5 Probabilistic Logic with Unreliable Sources

In probabilistic logic, the probabilities $\Pr(F_i)$ of logical formulas $F_i$ are delivered by a single evidence source. However, it might be useful to extend the model so that more than one evidence source can supply probabilities (or estimates of them) for the logical formulas in question. This can be done in a straightforward manner.

Suppose there are $K$ evidence sources, denoted $ES_k$, $k = 1, \ldots, K$. If $K = 1$, the ordinary model for probabilistic logic is obtained. If $K \ge 2$, we instead get conditional probabilities $\Pr(F_i \mid ES_k)$ for all $i$ and all $k$. The interpretation of these probabilities is: the probability of $F_i$ given that evidence source $k$ is reliable. For each of the logical formulas $F_i$, there are $K$ probabilities, namely the ones obtained from the $K$ evidence sources.

Let $S_i$ be the set of possible worlds in which $F_i$ is true, and let $R_k$ be the set of possible worlds in which evidence source $k$ is reliable (with certainty). Notice that here is a slight difference from probabilistic logic: the model not only has possible worlds in which some logical formulas are true but also possible worlds in which evidence sources are reliable. Suppose that evidence source $k$ informs us that the probability of $F_i$ is in the interval $[L_i^k, U_i^k]$. This gives rise to the set of linear constraints
\[
  L_i^k \le \Pr(F_i \mid ES_k) \le U_i^k , \quad i = 1, \ldots, m , \ k = 1, \ldots, K .
\]
These constraints can be rewritten as
\[
  0 \le \Pr(S_i \cap R_k) - L_i^k \Pr(R_k) , \tag{15}
\]
\[
  \Pr(S_i \cap R_k) - U_i^k \Pr(R_k) \le 0 . \tag{16}
\]
It is of course also possible for each evidence source to specify conditional probabilities. Suppose that evidence source $k$ informs us that the conditional probability of $F_i$ given $F_{c(i)}$ belongs to some interval $[L_{ic(i)}^k, U_{ic(i)}^k]$. The following set of constraints is then obtained:
\[
  L_{ic(i)}^k \le \Pr(F_i \mid F_{c(i)}, ES_k) \le U_{ic(i)}^k , \quad i = 1, \ldots, m , \ k = 1, \ldots, K .
\]
These constraints can be rewritten as follows:
\[
  0 \le \Pr(S_i \cap S_{c(i)} \cap R_k) - L_{ic(i)}^k \Pr(S_{c(i)} \cap R_k) , \tag{17}
\]
\[
  \Pr(S_i \cap S_{c(i)} \cap R_k) - U_{ic(i)}^k \Pr(S_{c(i)} \cap R_k) \le 0 . \tag{18}
\]
In ordinary probabilistic logic ($K = 1$), it is implicitly assumed that the evidence source is reliable. This of course need not be the case. Therefore, in addition to the above-mentioned conditional probabilities, it is possible to specify probability intervals $[L^k, U^k]$, $k = 1, \ldots, K$, indicating the degree to which the different evidence sources are reliable. This gives rise to the set of constraints
\[
  L^k \le \Pr(ES_k) \le U^k , \quad k = 1, \ldots, K . \tag{19}
\]

If one believes some of the evidence sources with certainty, the corresponding probability intervals should overlap, since otherwise the model is inconsistent. We cannot with certainty believe, for instance, that a probability is in the interval [0.2, 0.3] and at the same time with certainty believe that it is in the interval [0.5, 0.6].

As in ordinary probabilistic logic, we have
\[
  \Pr(S_i \cap R_k) = \sum_{v \in S_i \cap R_k} q_v ,
\]
where $q_v$ denotes the probability of world $v$.

The inference problem is to place bounds on a conditional probability $\Pr(F_t \mid F_{c(t)})$. As in ordinary probabilistic logic, we solve a fractional linear program similar to
\[
  \min/\max \ \frac{\Pr(S_t \cap S_{c(t)})}{\Pr(S_{c(t)})} \quad \text{s.t. (15), (16), (17), (18), (19).}
\]
The sets $S_1, \ldots, S_h$ are those defined above: the sets of possible worlds in which the given formulas $F_i \wedge ES_k$, $F_i \wedge F_{c(i)} \wedge ES_k$, and $ES_k$, for $i = 1, \ldots, m$ and $k = 1, \ldots, K$, respectively, are true. The probabilities of these formulas can be expressed in terms of possible worlds as explained in section 4.

We see that the only difference from probabilistic logic is that we now have possible worlds in which some formulas are true and possible worlds in which an evidence source is reliable (with certainty). Instead of just having probabilities of formulas, we have probabilities conditioned on evidence sources. Furthermore, it is possible to state the probability of the reliability of some evidence source. If any of the probabilities are unknown, they are simply left unspecified.

So everything in this section has been formulated in the same way as in the section on probabilistic logic. In particular, the model falls into the general framework, and the column generation procedure is as in probabilistic logic. A small instantiation of the constraints appears below.
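As a small instantiation (the numbers are ours, not the paper's): suppose source 1 reports $\Pr(F_1 \mid ES_1) \in [0.8, 0.9]$ and is believed reliable with probability at least 0.6. Constraints (15), (16) and (19) then read:

```latex
% Illustrative instantiation of (15), (16) and (19); the numbers are
% ours, not the paper's.
0.8 \, \Pr(R_1) \ \le\ \Pr(S_1 \cap R_1) \ \le\ 0.9 \, \Pr(R_1) ,
\qquad
0.6 \ \le\ \Pr(R_1) \ \le\ 1 ,
% with Pr(S_1 cap R_1) and Pr(R_1) expanded over the world
% probabilities q_v as above.
```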

6 Dempster-Shafer Theory

In Dempster-Shafer theory there are several evidence sources, indexed by $k = 1, \ldots, K$. Each evidence source $k$ distributes a probability mass of one over a family $\{S_i \mid i \in H_k\}$ of distinct sets of possible outcomes. The index sets $H_k$ are disjoint, but a set $S_i$ in one family may be equal to a set $S_{i'}$ in another family. The union of the $S_i$'s is the frame of discernment, denoted $\Theta$.

For each $i \in H_k$, we interpret the mass $\mu(S_i)$ to be the value $m_k(S_i)$ of the basic probability function $m_k$, which indicates the strength of $k$'s evidence that the actual outcome lies in $S_i$. Thus we have constraints (2) with $L_i = U_i = m_{k(i)}(S_i)$, where $i \in H_{k(i)}$. Although $\sum_{i \in H_k} m_k(S_i) = 1$ for each evidence source $k$, some of its mass can be assigned to a set representing the universe $\Theta$, which contains all possible situations. This mass represents evidence that supports no particular proposition.

When there are $K$ evidence sources, the intersections $S_J$ are associated with the cells of a $K$-dimensional cube. Each cell corresponds to an index set $J$ that contains, for each $k \in \{1, \ldots, K\}$, an index $i \in H_k$ representing the "coordinate" of the cell along dimension $k$. Since $S_J$ may be the same for different $J$'s, the same $S_J$ may be associated with several cells. The mass $\mu(S_i)$ for $i \in H_k$ is distributed over the masses $q_J$ of all cells $J$ whose $k$-th coordinate is $i$; i.e., all cells $J$ with $i \in J$. Thus we have (1), with $I(i) = \{J \in I \mid i \in J\}$.

Classical Dempster-Shafer theory uses the particular distribution dictated by Dempster's combination rule,
\[
  q_J = \prod_{i \in J} m_{k(i)}(S_i) , \tag{20}
\]
which assumes that the evidence sources are in some sense independent. But we will allow any distribution observing (1), since independence assumptions may be unjustified. The classical theory can be obtained by adding the constraints (20) to the model.

This distribution scheme has the curious result that an empty set $S_J$ may receive probability mass. But it also implies that the constraints (2) and (3) always have a feasible solution. This can be seen by noting that the probabilities dictated by Dempster's combination rule satisfy them.

Whenever $S_J \subseteq S_t$, evidence that $S_J$ contains the actual outcome adds to the evidence that $S_t$ does. So we interpret $\mu^*(S_t)$ as the sum of the masses of all $S_J \subseteq S_t$. Thus we have (5) with $I^*(t) = \{J \mid S_J \subseteq S_t\}$.
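The following sketch computes the classical distribution (20) and the normalized belief (21, below) on a hypothetical two-source instance; the frame, the focal sets, and the numbers are illustrative, not from the paper.

```python
# A small sketch of Dempster's combination rule (20) on a hypothetical
# two-source instance.  Sets are frozensets of outcomes in the frame
# Theta = {a, b, c}; all numbers are illustrative.
from itertools import product

Theta = frozenset("abc")
m1 = {frozenset("ab"): 0.7, Theta: 0.3}          # source 1
m2 = {frozenset("bc"): 0.6, frozenset("a"): 0.4} # source 2

# Each cell J of the 2-dimensional "cube" picks one focal set per source.
q = {}
for (A, wA), (B, wB) in product(m1.items(), m2.items()):
    q[(A, B)] = wA * wB        # rule (20): product of the sources' masses
    # in general the cell's intersection A & B may be empty

mass_empty = sum(w for (A, B), w in q.items() if not (A & B))
St = frozenset("b")
belief = sum(w for (A, B), w in q.items() if (A & B) and (A & B) <= St)
print(belief / (1 - mass_empty))   # normalized as in (21); here 0.42
```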

Since empty SJ 's receive mass, this mass is ignoredand the rest renormalized so that it sums to one. Thusthe inference problem is to obtain bounds onBel(St) = ��(St)1� ��(;) ; (21)where \Bel" is Shafer's notation. We therefore obtainthe fractional programming problem (4).Assuming for simplicity that all intersections SJare nonempty so that ��(;) = 0, the columns(y0; y1; : : : ; ym) of (4) satisfyXi2Hk yi = 1; k = 1; : : : ;K: (22)The objective function coe�cient y0 is 1 if and only ifTyi=1 Si � St, which is to say Vyi=1 Fi � Ft. If theFi's are formulas of propositional logic, additional con-straints and 0-1 variables representing atomic propo-sitions can be used to de�ne y0 in terms of y1; : : : ; ym,using well-known methods [14]. The column gener-ation subproblem becomes an integer programmingproblem: minimize (12) subject to (22) and the ad-ditional constraints. When ��(;) 6= 0, the linearizedversion of (4) can be similarly treated.7 Second-Order ProbabilisticLogicSecond-order probabilistic logic assigns probability dis-tributions to Pr(FjjFc(j)), rather than specifyingPr(FjjFc(j)) directly. These second-order distribu-tions are approximated by specifying the probabil-ity that Pr(FjjFc(j)) lies in each of several intervals[0; �i]. Probabilistic information is therefore given inthe form,Li � Pr(Pr(Fji ^ Fc(ji))Pr(Fc(ji)) � �i) � Ui; i = 1; : : : ;m:(23)The propositions Fj again belong to propositional logicand contain atomic propositions x1; : : : ; xn. In gen-eral, probabilities from a less reliable source will havea more dispersed second order distribution.The �rst-order probability space consists of all prob-ability distributions (p1; : : : ; p2n) over the possible worldsv. Each constraint in (23) places limits on the prob-ability mass assigned to a region Si of this space.

7 Second-Order Probabilistic Logic

Second-order probabilistic logic assigns probability distributions to the probabilities $\Pr(F_j \mid F_{c(j)})$, rather than specifying $\Pr(F_j \mid F_{c(j)})$ directly. These second-order distributions are approximated by specifying the probability that $\Pr(F_j \mid F_{c(j)})$ lies in each of several intervals $[0, \alpha_i]$. Probabilistic information is therefore given in the form
\[
  L_i \le \Pr\!\left( \frac{\Pr(F_{j_i} \wedge F_{c(j_i)})}{\Pr(F_{c(j_i)})} \le \alpha_i \right) \le U_i , \quad i = 1, \ldots, m . \tag{23}
\]
The propositions $F_j$ again belong to propositional logic and contain atomic propositions $x_1, \ldots, x_n$. In general, probabilities from a less reliable source will have a more dispersed second-order distribution.

The first-order probability space consists of all probability distributions $(p_1, \ldots, p_{2^n})$ over the possible worlds $v$. Each constraint in (23) places limits on the probability mass assigned to a region $S_i$ of this space. Specifically, $S_i$ is the half-space in which $\Pr(F_{j_i} \wedge F_{c(j_i)}) \le \alpha_i \Pr(F_{c(j_i)})$, or
\[
  \sum_{\substack{v(F_{j_i}) = 1 \\ v(F_{c(j_i)}) = 1}} p_v \ \le\ \alpha_i \sum_{v(F_{c(j_i)}) = 1} p_v . \tag{24}
\]
We interpret $\mu(S_i)$ as the second-order probability that $\Pr(F_{j_i} \mid F_{c(j_i)}) \le \alpha_i$. We can again suppose that $\bar{S}_i$ belongs to the list $S_1, \ldots, S_h$ whenever $S_i$ does. Thus the set of all minimal nonempty intersections $S_J$ of halfspaces partitions probability space into polyhedral regions. The probability mass $\Pr(S_i)$ of a halfspace is the sum of the masses $q_J$ of the polyhedra $S_J$ in it. Thus we have (1) with $I(i) = \{J \mid S_J \subseteq S_i\}$.

The inference problem is to find bounds on $\mu^*(S_t) = \mu(S_t)$ by solving (4), which is a linear programming problem because $\mu^*(\emptyset) = 0$.

It remains to find a computationally practical way to implement the column generation scheme (6) when $I(i) = \{J \mid S_J \subseteq S_i\}$. This can be done via mixed integer programming. From (24), each $S_i$ consists of the vectors $p$ in probability space satisfying the $i$-th constraint of
\[
  -y_i \ <\ \sum_{\substack{v(F_{j_i}) = 1 \\ v(F_{c(j_i)}) = 1}} p_v - \alpha_i \sum_{v(F_{c(j_i)}) = 1} p_v \ \le\ 1 - y_i , \quad i = 1, \ldots, m , \tag{25}
\]
when $y_i = 1$, and its complement consists of those satisfying this same constraint when $y_i = 0$. Thus the sets $S_J$ for $J \in I$ are precisely the nonempty solution sets of (25) over all 0-1 vectors $y = (y_1, \ldots, y_m)$. The column generation subproblem is therefore to minimize (12) subject to (25) and
\[
  \sum_v p_v = 1 , \qquad p_v \ge 0 , \ \text{all } v ,
\]
where $y_0 = y_t$.
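The following is a sketch (ours, under stated assumptions) of this mixed integer pricing subproblem using scipy.optimize.milp. The strict inequality in (25) is approximated with a small epsilon, a standard modeling device; the matrices `C` and `D` and the padded dual vector `d` are data conventions we introduce here.

```python
# Sketch of the mixed integer pricing subproblem: minimize (12) subject
# to (25), sum_v p_v = 1 and p_v >= 0.  The strict inequality in (25) is
# approximated with a small epsilon; data layout and names are ours.
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

def price_second_order(C, D, alpha, d, t, eps=1e-6):
    """C[i, v] = 1 if world v satisfies F_{j_i} and F_{c(j_i)};
    D[i, v] = 1 if world v satisfies F_{c(j_i)}; row t defines S_t;
    d = pi - gamma, padded with d[t] = 0.  Variables: p_1..p_N, y_1..y_M."""
    M, N = C.shape
    c = np.concatenate([np.zeros(N), d])
    c[N + t] -= 1.0                            # the -y_t term in (12)
    G = np.hstack([C - alpha[:, None] * D, np.eye(M)])
    cons = [
        # (25):  -y_i < G_i p <= 1 - y_i   <=>   eps <= G_i p + y_i <= 1
        LinearConstraint(G, eps, 1.0),
        # probability simplex: sum_v p_v = 1
        LinearConstraint(np.concatenate([np.ones(N), np.zeros(M)]), 1, 1),
    ]
    res = milp(c, constraints=cons,
               integrality=np.concatenate([np.zeros(N), np.ones(M)]),
               bounds=Bounds(0, np.concatenate([np.ones(N), np.ones(M)])))
    return res                       # improving column iff res.fun < 0
```

When the returned optimal value is negative, the 0-1 part $y$ of the solution identifies the cell $S_J$ and hence the new column, as in (6).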

8 Belief Functions

Dempster-Shafer theory derives the belief $\mathrm{Bel}(S_t)$ in $S_t$ by combining evidence from several sources. Each source $k$ provides support for several sets $S_i$, measured by a basic probability function $m_k$.

A variation on this approach is to suppose that the evidence for the sets $S_i$ is combined in advance, outside the mechanism of the theory. This provides an estimate $\mathrm{Bel}(S_i)$ of the belief in each $S_i$. The goal is to find what values of $\mathrm{Bel}(S_t)$ are consistent with these estimates.

Thus we interpret $\mu(S_i) = \mu^*(S_i)$ to be $\mathrm{Bel}(S_i)$. Again, any evidence for a subset of $S_t$ is evidence for $S_t$. We therefore postulate an underlying basic probability function $m(S_i)$ that measures the evidence in favor of the particular set $S_i$, with $\mathrm{Bel}(S_i) = \sum_{S_j \subseteq S_i} m(S_j)$. This means that we distribute the mass $\mu(S_i) = \mathrm{Bel}(S_i)$ over the variables $q_J = q_{\{j\}} = m(S_j)$ for all subsets $S_j$ of $S_i$. So the "intersections" $S_J$ are simply the sets $S_j$. The distribution formula is (1) with $I(i) = \{ \{j\} \mid S_j \subseteq S_i \}$.

If the precise value of $\mathrm{Bel}(S_i)$ is given for all $S_i$, the underlying basic probability function is determined by the inclusion-exclusion formula,
\[
  m(S_i) = \sum_{S_j \subseteq S_i} (-1)^{|S_i \setminus S_j|} \, \mathrm{Bel}(S_j) .
\]
But if $\mathrm{Bel}(S_i)$ is only partially specified, several basic probability functions are possible, and $\mathrm{Bel}(S_t)$ may be restricted to a range but not precisely determined.

If $L_i, U_i$ are the bounds placed on $\mathrm{Bel}(S_i) = \mu(S_i)$, we have the constraints (2). To obtain bounds on $\mathrm{Bel}(S_t) = \mu^*(S_t)$ we solve (4), which is linear because $\mu^*(\emptyset) = 0$. Since this problem contains only one column for each $S_i$, column generation should not in general be necessary. Also, it should normally be easy to determine whether $S_j \subseteq S_i$ (i.e., whether $F_j$ implies $F_i$), since the given propositions $F_i$ should normally be simple.
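As a small sketch (our example, not the paper's) of recovering $m$ from a fully specified belief function via the inclusion-exclusion formula, over a two-element frame:

```python
# Recovering the basic probability function m from a fully specified
# belief function via the inclusion-exclusion (Mobius) formula; the
# frame and numbers are illustrative, not from the paper.
from itertools import combinations

def subsets(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)]

Theta = frozenset("ab")
# Bel given on every subset of the frame (Bel(empty set) = 0).
Bel = {frozenset(): 0.0, frozenset("a"): 0.3,
       frozenset("b"): 0.2, Theta: 1.0}

m = {S: sum((-1) ** len(S - T) * Bel[T] for T in subsets(S))
     for S in subsets(Theta)}
print(m)   # m({a}) = 0.3, m({b}) = 0.2, m({a,b}) = 0.5, m({}) = 0.0
```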

9 Conclusion

In this paper we have demonstrated how several logics for reasoning under uncertainty fit into the same framework. Each admits a linear programming model of essentially the same structure, except that the different logics are implemented with different column generation procedures. The column generation procedures include pseudo-boolean optimization, integer programming and mixed integer programming. The logics that we have been able to show fit into this general framework include probabilistic logic, probabilistic logic with unreliable sources of information, Dempster-Shafer theory, second-order probabilistic logic and a simple logic of belief functions.

An important question is how well the column generation scheme will work in practice. It has been demonstrated to work very well in the case of probabilistic logic [15], and therefore also for the special version of probabilistic logic with unreliable sources of information. Further computational experience is needed to determine how well it will work in Dempster-Shafer theory and second-order probabilistic logic.

References

[1] Andersen, K. A., and J. N. Hooker, Bayesian logic, to appear in Decision Support Systems.

[2] Boole, G., An Investigation of the Laws of Thought, on which are Founded the Mathematical Theories of Logic and Probabilities, Dover Publications (New York, 1951). Original work published 1854.

[3] Boole, G., Studies in Logic and Probability, ed. by R. Rhees, Watts and Co. (London) and Open Court Publishing Company (La Salle, Illinois, 1952).

[4] Charnes, A., and W. W. Cooper, Programming with linear fractionals, Naval Research Logistics Quarterly 9 (1962) 181-186.

[5] Chen, S. S., Some extensions of probabilistic logic, in J. F. Lemmer and L. N. Kanal, eds., Uncertainty in Artificial Intelligence 2, North-Holland (1988).

[6] Dirickx, Y. M. I., and L. P. Jennergren, Systems Analysis by Multilevel Methods: With Applications to Economics and Management, Wiley (Chichester, 1979).

[7] Dubois, D., and H. Prade, The principle of minimum specificity as a basis for evidential reasoning, Uncertainty in Knowledge-Based Systems, Lecture Notes in Computer Science 286 (1986) 75-84.

[8] Dubois, D., and H. Prade, A tentative comparison of numerical approximate reasoning methodologies, International Journal of Man-Machine Studies 27 (1987) 149-183.

[9] Georgakopoulos, G., D. Kavvadias and C. H. Papadimitriou, Probabilistic satisfiability, Journal of Complexity 4 (1988) 1-11.

[10] Grosof, B. N., An inequality paradigm for probabilistic reasoning, in J. F. Lemmer and L. N. Kanal, eds., Uncertainty in Artificial Intelligence 1, North-Holland (1986).

[11] Grosof, B. N., Non-monotonicity in probabilistic knowledge, in J. F. Lemmer and L. N. Kanal, eds., Uncertainty in Artificial Intelligence 2, North-Holland (1986).

[12] Hailperin, T., Boole's Logic and Probability, Studies in Logic and the Foundations of Mathematics v. 85, North-Holland (1976).

[13] Hailperin, T., Probability logic, Notre Dame Journal of Formal Logic 25 (1984) 198-212.

[14] Hooker, J. N., A quantitative approach to logical inference, Decision Support Systems 4 (1988) 45-69.

[15] Jaumard, B., P. Hansen and M. P. Aragão, Column generation methods for probabilistic logic, ORSA Journal on Computing 3 (1991).

[16] McLeish, M., Probabilistic logic: some comments and possible use for nonmonotonic reasoning, in J. F. Lemmer and L. N. Kanal, eds., Uncertainty in Artificial Intelligence 2, North-Holland (1986).

[17] McLeish, M., Nilsson's probabilistic entailment extended to Dempster-Shafer theory, in Uncertainty in Artificial Intelligence 3 (1989) 23-34.

[18] Nilsson, N. J., Probabilistic logic, Artificial Intelligence 28 (1986) 71-87.

[19] Paaß, G., Probabilistic logic, in P. Smets et al., eds., Non-Standard Logics for Automated Reasoning, Academic Press (New York, 1988) 213-251.

[20] Shafer, G., A Mathematical Theory of Evidence, Princeton University Press (1976).