DECISION THEORY
DEFINITION 1.1:

DECISION THEORY (DT) is a set of concepts, principles, tools and techniques that aid the decision maker in dealing with complex decision problems under uncertainty.
COMPONENTS OF A DT PROBLEM:

1. THE DECISION MAKER

2. ALTERNATIVE COURSES OF ACTION
This is the controllable aspect of the problem.

3. STATES OF NATURE OR EVENTS
These are the scenarios or states of the environment not under the control of the decision maker. The events defined should be mutually exclusive and collectively exhaustive.

4. CONSEQUENCES
The consequences that must be assessed by the decision maker are measures of the net benefit, payoff, cost or revenue received by the decision maker. There is a consequence (or vector of consequences) associated with each action-event pair. The consequences are summarized in a decision matrix.
DECISION THEORY - EDGAR L. DE CASTRO
CLASSIFICATIONS OF DT PROBLEMS:

1. Single Stage Decision Problems
A decision is made only once.

2. Multiple Stage/Sequential Decision Problems
Decisions are made one after another.

3. Discrete DT Problems
The alternative courses of action and states of nature are finite.

4. Continuous DT Problems
The alternative courses of action and states of nature are infinite.

DT problems can also be classified as those with or without experimentation. Experimentation is performed to obtain additional information that will aid the decision maker.
I. DISCRETE DECISION THEORY PROBLEMS
DECISION TREES
A discrete DT problem can be represented pictorially using a tree diagram or decision tree. It chronologically depicts the sequence of actions and events as they unfold.

A square node (□) precedes the set of possible actions that can be taken by the decision maker. A round node (○) precedes the set of events or states of nature that could be encountered after a decision is made. The nodes are connected by branches.
EXAMPLE: (decision-tree figure in the original)
DECISIONS UNDER RISK AND UNCERTAINTY
Consider a DT problem with m alternative courses of action and a maximum of n events or states of nature for each alternative course of action.

Define: Ai = alternative course of action i; i = 1, 2, ..., m
φj = state of nature j; j = 1, 2, ..., n

The decision matrix of payoffs is given by:

          φ1           φ2          ...    φn
A1    v(A1, φ1)   v(A1, φ2)   ...   v(A1, φn)
A2    v(A2, φ1)   v(A2, φ2)   ...   v(A2, φn)
 .         .            .       .        .
Am    v(Am, φ1)   v(Am, φ2)   ...   v(Am, φn)
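This tabular structure maps directly onto a nested list. The following minimal sketch (illustrative numbers, not from the text) stores a 3 x 3 profit matrix that the later criteria can be applied to:

```python
# Hypothetical profit matrix: V[i][j] = v(Ai, phi_j).
# Rows are actions A1..A3, columns are states phi_1..phi_3.
V = [
    [80, 60, 20],  # A1
    [70, 65, 40],  # A2
    [50, 50, 50],  # A3
]
m = len(V)      # number of alternative actions
n = len(V[0])   # number of states of nature
```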
A. LAPLACE CRITERION
This criterion is based on what is known as the principle of insufficient reason. Here the probabilities associated with the occurrence of the event φj are unknown. We do not have sufficient reason to conclude that the probabilities are different. Hence we assume that all events are equally likely, i.e.

P(φ = φj) = 1/n
Then, the optimal decision rule is to select the action Ai corresponding to:

max_{Ai} [ (1/n) * Σ_{j=1}^{n} v(Ai, φj) ]

(for payoffs expressed as profits; replace max by min if v represents costs)
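The rule above can be sketched in a few lines, reusing the hypothetical 3 x 3 profit matrix introduced earlier (illustrative numbers only):

```python
# Laplace criterion: treat all n states as equally likely and pick the
# action with the best average payoff (profits, so we maximize).
V = [[80, 60, 20], [70, 65, 40], [50, 50, 50]]  # hypothetical profits

def laplace(payoffs):
    n = len(payoffs[0])
    averages = [sum(row) / n for row in payoffs]
    best = max(range(len(payoffs)), key=lambda i: averages[i])
    return best, averages[best]

best_action, best_avg = laplace(V)  # here A2 wins with average 175/3
```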
B. MINIMAX (MAXIMIN) CRITERION
This is the most conservative criterion since it is based on making the best out of the worst possible conditions. For each possible decision alternative, we select the worst condition and then select the alternative corresponding to the best of the worst conditions.
The MINIMAX strategy (used when v represents costs) is given by:

min_{Ai} [ max_{φj} { v(Ai, φj) } ]

The MAXIMIN strategy (used when v represents profits) is given by:

max_{Ai} [ min_{φj} { v(Ai, φj) } ]
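Both strategies reduce to a worst-case scan per row. A minimal sketch on the same hypothetical profit matrix (and a made-up cost matrix for the minimax dual):

```python
# Maximin (profits): take each action's worst-case payoff, then choose
# the action whose worst case is best.
V = [[80, 60, 20], [70, 65, 40], [50, 50, 50]]  # hypothetical profits

def maximin(payoffs):
    worst = [min(row) for row in payoffs]
    best = max(range(len(payoffs)), key=lambda i: worst[i])
    return best, worst[best]

def minimax(costs):
    # Dual rule for cost matrices: best (smallest) of the worst costs.
    worst = [max(row) for row in costs]
    best = min(range(len(costs)), key=lambda i: worst[i])
    return best, worst[best]
```

On this matrix maximin picks A3, whose guaranteed payoff of 50 beats the worst cases of A1 and A2.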
C. SAVAGE MINIMAX REGRET CRITERION
The MINIMAX rule is an extremely conservative type of decision rule. The Savage MINIMAX regret criterion assumes that a new loss matrix is constructed in which v(Ai, φj) is replaced by r(Ai, φj), which is defined by:

r(Ai, φj) = max_{Ak} { v(Ak, φj) } - v(Ai, φj),   if v is profit

r(Ai, φj) = v(Ai, φj) - min_{Ak} { v(Ak, φj) },   if v is loss

Once the loss matrix is constructed using the above formula, we can now apply the MINIMAX criterion defined in B.
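A minimal sketch of the two-step procedure for a profit matrix (same hypothetical numbers as before): build the regret matrix column by column, then apply minimax to the regrets.

```python
# Savage minimax regret (profit matrix): regret r(Ai, phi_j) is the
# column best minus the realized payoff; then minimax over regrets.
V = [[80, 60, 20], [70, 65, 40], [50, 50, 50]]  # hypothetical profits

def regret_matrix(payoffs):
    cols = range(len(payoffs[0]))
    col_best = [max(row[j] for row in payoffs) for j in cols]
    return [[col_best[j] - row[j] for j in cols] for row in payoffs]

def savage(payoffs):
    R = regret_matrix(payoffs)
    worst_regret = [max(row) for row in R]
    best = min(range(len(R)), key=lambda i: worst_regret[i])
    return best, worst_regret[best]
```

Here A2 is chosen: its maximum regret is 10, versus 30 for A1 and A3.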
D. HURWICZ CRITERION
This criterion represents a range of attitudes from the most optimistic to the most pessimistic.

Under the most optimistic conditions, one would choose the action yielding:

max_{Ai} [ max_{φj} { v(Ai, φj) } ]

Under the most pessimistic conditions, the chosen action corresponds to:

max_{Ai} [ min_{φj} { v(Ai, φj) } ]
The Hurwicz criterion strikes a balance between extreme pessimism and extreme optimism by weighing the above conditions by respective weights α and (1 - α), where 0 < α < 1. That is, the action selected is that which yields:

max_{Ai} [ α * max_{φj} v(Ai, φj) + (1 - α) * min_{φj} v(Ai, φj) ]

[Note: the above formulas represent the case where payoffs are expressed as profits.]

If α = 1, the decision rule is referred to as the MAXIMAX RULE, and if α = 0, the decision rule becomes the MAXIMIN RULE. For the case where the payoffs represent costs, the decision rule is given by:

min_{Ai} [ α * min_{φj} v(Ai, φj) + (1 - α) * max_{φj} v(Ai, φj) ]
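A sketch of the profit-matrix version, again on the hypothetical matrix used throughout; note how the endpoints α = 1 and α = 0 recover maximax and maximin:

```python
# Hurwicz criterion (profits): weight each action's best case by alpha
# and its worst case by (1 - alpha), then maximize over actions.
V = [[80, 60, 20], [70, 65, 40], [50, 50, 50]]  # hypothetical profits

def hurwicz(payoffs, alpha):
    score = [alpha * max(row) + (1 - alpha) * min(row) for row in payoffs]
    best = max(range(len(payoffs)), key=lambda i: score[i])
    return best, score[best]
```

With α = 0.6 the moderately optimistic decision maker picks A2; α = 1 picks A1 (maximax) and α = 0 picks A3 (maximin).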
E. BAYES' RULE
Here, we assume that the probabilities associated with each state of nature are known. Let

P{φ = φj} = pj

The action which minimizes (maximizes) the cost (profit) is selected. This is given by:

max_{Ai} [ Σ_{j=1}^{n} pj * v(Ai, φj) ]   (for profits; min for costs)

The backward induction approach is used. With the aid of a decision tree, expected values are computed each time a round node is encountered, and the above decision rule is utilized each time a square node is encountered, i.e., a decision is made each time a square node is encountered.
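For the single-stage case (no tree needed), the rule is one expected-value computation per action. A sketch with hypothetical probabilities attached to the running example:

```python
# Bayes' rule with known state probabilities: pick the action that
# maximizes expected profit (illustrative numbers only).
V = [[80, 60, 20], [70, 65, 40], [50, 50, 50]]
p = [0.5, 0.3, 0.2]  # P{phi = phi_j}, assumed known

def bayes_action(payoffs, probs):
    expected = [sum(pj * v for pj, v in zip(probs, row)) for row in payoffs]
    best = max(range(len(payoffs)), key=lambda i: expected[i])
    return best, expected[best]

best_action, ep = bayes_action(V, p)  # A2, expected profit 62.5
```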
F. EXPECTED VALUE-VARIANCE CRITERION
This is an extension of the expected value criterion. Here we simultaneously maximize profit and minimize the variance of the profit.

If z represents profit as a random variable with variance var(z), then the criterion is given by:

maximize E(z) - K * var(z)

where K is any specified constant. If z represents cost:

minimize E(z) + K * var(z)
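Treating each row of the payoff matrix as the distribution of z under the known state probabilities, the criterion can be sketched as follows (same hypothetical numbers; K is an arbitrary risk-aversion constant):

```python
# Expected value-variance criterion for profits:
# maximize E(z) - K * var(z), where z is the row payoff distribution.
V = [[80, 60, 20], [70, 65, 40], [50, 50, 50]]
p = [0.5, 0.3, 0.2]  # hypothetical known state probabilities

def ev_var_action(payoffs, probs, K):
    scores = []
    for row in payoffs:
        mean = sum(pj * v for pj, v in zip(probs, row))
        var = sum(pj * (v - mean) ** 2 for pj, v in zip(probs, row))
        scores.append(mean - K * var)
    best = max(range(len(payoffs)), key=lambda i: scores[i])
    return best, scores[best]
```

With K = 0.01 the mildly risk-averse decision maker still prefers A2, whose variance (131.25) is much smaller than A1's (516).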
G. DECISION MAKING WITH EXPERIMENTATION
In some situations, it may be viable to secure additional information to revise the original estimates of the probability of occurrence of the states of nature.

DEFINITION 1.2:

Preposterior Analysis considers the question of deciding whether or not it would be worthwhile to get additional information or to perform further experimentation.

DEFINITION 1.3:

Posterior Analysis deals with the optimal choice and evaluation of an action subsequent to all experimentation and testing, using the experimental results.
DEFINITION 1.4:

Prior probabilities are the initial probabilities assumed without the benefit of experimentation. Posterior probabilities refer to the revised probability values obtained after experimentation.

Let: pj = prior probability estimate of event φj

P{Zk | φj} = conditional probability of experimental outcome Zk given state φj

P{φj | Zk} = posterior probability of event φj given outcome Zk

The experimental results are assumed to be given by Zk, k = 1, 2, ..., l. The conditional probability can be considered a measure of the reliability of the experiment. The idea is to calculate the posterior probabilities by combining the prior probabilities and the conditional probabilities of experimental outcome Zk. The posterior probabilities are given by:

P{φj | Zk} = P{Zk | φj} * P{φj} / Σ_{i=1}^{n} P{Zk | φi} * P{φi}
Once the posterior probabilities are calculated, the original problem can be viewed as a multiple stage/sequential DT problem. The first stage involves the decision of whether or not to perform additional experimentation. Once this is decided, the outcomes of the experiment are considered together with the original set of decision alternatives and events.
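The updating formula above can be sketched for a single observed outcome Zk; the priors and reliability figures below are hypothetical:

```python
# Bayes updating: posterior_j = P{Zk|phi_j} * p_j / sum_i P{Zk|phi_i} * p_i
prior = [0.5, 0.3, 0.2]        # hypothetical prior probabilities p_j
likelihood = [0.8, 0.4, 0.1]   # hypothetical P{Zk | phi_j} for outcome Zk

def posterior(prior, likelihood):
    joint = [l * p for l, p in zip(likelihood, prior)]
    total = sum(joint)          # marginal probability of the outcome Zk
    return [j / total for j in joint]

post = posterior(prior, likelihood)  # shifts mass toward phi_1
```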
DEFINITION 1.5:

A perfect information source would provide, with 100% reliability, which of the states of nature would occur.

Define: EPPI = expected profit from a perfect information source

EVPI = expected value of the perfect information source

EP = Bayes' expected profit without experimentation
Then:

EVPI = EPPI - EP

where:

EPPI = Σ_{j=1}^{n} pj * max_{Ai} { v(Ai, φj) }

EVPI is easily seen as a measure of the maximum amount a decision maker should be willing to pay for additional information.
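On the running hypothetical example, EPPI takes the best payoff in each column weighted by the state probabilities, and EP is the best unconditional expected profit:

```python
# EVPI = EPPI - EP, using the hypothetical matrix and probabilities.
V = [[80, 60, 20], [70, 65, 40], [50, 50, 50]]
p = [0.5, 0.3, 0.2]

def evpi(payoffs, probs):
    # EPPI: with perfect information, pick the column-best payoff per state.
    eppi = sum(pj * max(row[j] for row in payoffs)
               for j, pj in enumerate(probs))
    # EP: best expected profit achievable without any extra information.
    ep = max(sum(pj * v for pj, v in zip(probs, row)) for row in payoffs)
    return eppi - ep
```

Here EPPI = 69.5 and EP = 62.5, so no information source is worth more than 7.0 to this decision maker.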
Define: EVSI = expected value of sample information

ENGS = expected net gain from sampling

CAI = cost of getting additional information
Then:

ENGS = EVSI - CAI
The information source would be viable if ENGS > 0.
II. CONTINUOUS DECISION THEORY

As previously mentioned, continuous decision theory problems refer to those where the number of alternatives and/or states of nature can be considered infinite. The optimization model in this case is given by:

max_A f(A) = ∫ v(A, φ) * hφ(φ) dφ   (integral taken over the range of φ)

where:

hφ(φ) = prior distribution function of the states of nature
In the above model, it is assumed that no additional information is available and the expectation is evaluated with respect to the prior distribution of the states of nature. If additional information is available, we update the prior distribution of the states of nature by determining its posterior distribution, which is nothing but the conditional distribution of the states of nature given the experimental outcome. Hence, the optimization converts to:
max_A f(A) = ∫ v(A, φ) * h_{φ|Z=z}(φ) dφ

where:

h_{φ|Z=z}(φ) = conditional distribution of the state of nature given the experimental outcome

h_{Z|φ}(z) = conditional distribution of the experimental outcome given the state of nature

hZ(z) = marginal distribution function of the experimental outcomes
These are related by Bayes' theorem:

h_{φ|Z=z}(φ) = h_{Z|φ}(z) * hφ(φ) / hZ(z),   where hZ(z) = ∫ h_{Z|φ}(z) * hφ(φ) dφ
LEIBNIZ' RULE
LEIBNIZ' Rule is applied to find the derivative of a function which contains integrals. Consider a function in one variable A:

f(A) = ∫_{a(A)}^{b(A)} g(A, φ) dφ

Then:

df/dA = ∫_a^b (∂g/∂A) dφ + g(A, b) * db/dA - g(A, a) * da/dA
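As a numerical sanity check (the choice of g is illustrative, not from the text), take g(A, φ) = A*φ with limits a(A) = 0 and b(A) = A. Then f(A) = A³/2, so df/dA = 3A²/2, which the rule's right-hand side reproduces:

```python
# Leibniz' rule check for g(A, phi) = A*phi, a(A) = 0, b(A) = A:
# df/dA = int_0^A dg/dA dphi + g(A, A)*db/dA - g(A, 0)*da/dA
#       = int_0^A phi dphi   + A*A*1        - 0
def leibniz_rhs(A, steps=100_000):
    h = A / steps
    # midpoint-rule approximation of int_0^A phi dphi (= A**2 / 2)
    integral = sum((i + 0.5) * h * h for i in range(steps))
    return integral + (A * A) * 1.0 - 0.0

A = 2.0
analytic = 3 * A ** 2 / 2  # df/dA obtained by differentiating A**3 / 2
```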