Utility-weighted sampling in decisions from experience
Falk Lieder, Tom Griffiths, Ming Hsu
UC Berkeley
Extreme potential outcomes influence people as if they were far more likely than they really are.
38% of Americans say they are less likely to travel overseas because of 9/11.
Expected Utility Theory
Take the action that maximizes the expected value of the utility of outcome O.
Intractable!
Violated!
EU can be Approximated by Sampling
simulated outcomes → EU estimates → decision
finite time → finitely many simulated outcomes
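As a minimal sketch of the sampling idea, the expected utility of an action can be approximated by averaging the utility of a finite number of simulated outcomes. The function name, the two-outcome gamble with equal probabilities, and the identity utility function are illustrative assumptions, not taken from the slides.

```python
import random


def estimate_expected_utility(outcomes, probabilities, utility, n_samples, rng):
    """Average the utility of n_samples outcomes simulated from p."""
    samples = rng.choices(outcomes, weights=probabilities, k=n_samples)
    return sum(utility(o) for o in samples) / n_samples


# Hypothetical gamble: +40 or 0, each assumed to occur with probability 0.5.
rng = random.Random(0)
outcomes = [0, 40]
probabilities = [0.5, 0.5]
estimate = estimate_expected_utility(
    outcomes, probabilities, utility=float, n_samples=10_000, rng=rng
)
```

With many samples the estimate approaches the true expected utility (20 under these assumptions); with only a handful of samples it can be far off.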
Representative sampling is dangerous
bias vs. variance
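One way to see the danger is to compute how often a small representative sample misses a rare but extreme outcome altogether. The sketch below uses made-up numbers (a "fun" outcome with utility +1 and probability 0.999, a "death" outcome with utility -1000 and probability 0.001); only the outcome labels come from the slides, and the helper name is hypothetical.

```python
def prob_rare_outcome_missed(p_rare, n_samples):
    """Probability that n_samples independent draws from p never hit the rare outcome."""
    return (1.0 - p_rare) ** n_samples


# Hypothetical numbers, chosen only for illustration.
p_fun, u_fun = 0.999, 1.0
p_death, u_death = 0.001, -1000.0

true_eu = p_fun * u_fun + p_death * u_death  # slightly negative
p_missed = prob_rare_outcome_missed(p_death, n_samples=5)

# Whenever the rare outcome is missed, every simulated outcome is "fun",
# so the sample average equals u_fun and the action looks attractive.
estimate_when_missed = u_fun
```

Under these assumptions almost every five-sample estimate is positive even though the true expected utility is negative, which is the high-variance failure mode the slide points to.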
Utility estimation by importance sampling
Which distribution should the brain sample from?
simulated outcomes → EU estimates → decision
Figure: outcome distributions p and q over outcomes ranging from "fun" to "death"
o1, o2, o3 ~ q
w1 = p(o1)/q(o1)
w3 = p(o3)/q(o3) …
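The weighting scheme above can be sketched as a standard importance sampling estimator: outcomes are simulated from q, and each is weighted by p(o)/q(o) to correct for the mismatch with p. The function name, the uniform choice of q, and the numeric probabilities and utilities are assumptions made for illustration.

```python
import random


def importance_sampling_eu(p, q, utilities, n_samples, rng):
    """Estimate E_p[u(O)] from outcomes simulated from q, weighted by p(o)/q(o)."""
    indices = rng.choices(range(len(p)), weights=q, k=n_samples)
    weighted = [(p[i] / q[i]) * utilities[i] for i in indices]
    return sum(weighted) / n_samples


# Hypothetical example: outcome 0 is "fun", outcome 1 is "death".
p = [0.999, 0.001]          # assumed true outcome probabilities
q = [0.5, 0.5]              # assumed sampling distribution
utilities = [1.0, -1000.0]  # assumed utilities

rng = random.Random(0)
is_estimate = importance_sampling_eu(p, q, utilities, n_samples=10_000, rng=rng)
true_eu = sum(pi * ui for pi, ui in zip(p, utilities))
```

Because the rare outcome is simulated often and then down-weighted, the estimate stays close to the true expected utility instead of swinging between "all fun" and "catastrophe".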
Answer: Utility-Weighted Sampling (UWS)
simulation frequency ∝ probability × extremity
Lieder, Hsu, Griffiths (2014)
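A minimal sketch of this proportionality, assuming that "extremity" is measured as the absolute utility |u(o)|. That choice, the self-normalized estimator, and all names and numbers below are illustrative assumptions; the cited work may define extremity relative to a reference point.

```python
import random


def uws_distribution(p, utilities):
    """Sampling distribution with q(o) proportional to p(o) * |u(o)| (assumed form)."""
    scores = [pi * abs(ui) for pi, ui in zip(p, utilities)]
    total = sum(scores)  # assumes at least one outcome has nonzero utility
    return [s / total for s in scores]


def uws_estimate(p, utilities, n_samples, rng):
    """Self-normalized importance sampling estimate of EU using the UWS distribution."""
    q = uws_distribution(p, utilities)
    indices = rng.choices(range(len(p)), weights=q, k=n_samples)
    weights = [p[i] / q[i] for i in indices]
    return sum(w * utilities[i] for w, i in zip(weights, indices)) / sum(weights)


# Hypothetical example: a common mild outcome and a rare extreme one.
p = [0.999, 0.001]
utilities = [1.0, -1000.0]
q = uws_distribution(p, utilities)

rng = random.Random(0)
estimate = uws_estimate(p, utilities, n_samples=10_000, rng=rng)
```

In this example the extreme outcome has probability 0.001 under p but is simulated about half the time under q, which is the sense in which extreme outcomes are overrepresented.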
Decisions from Experience (Ludvig, et al., 2014)
Decisions from Experience (Ludvig, et al., 2008)
Inconsistent Risk Preferences Emerge from Learning
+40/0 vs. +20
-40/0 vs. -20
+45/5 vs. +25
-45/-5 vs. -25
UWS Can Emerge from Reward-Modulated Associative Learning
Actions: a_t
Outcomes: -40 -20 +20 +40 (observed outcome: o_t)
Weights: w_t
Weight update for o_t: learning rate, prediction error
For all o ≠ o_t: forgetting rate
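The slide's equations did not survive extraction, so the following is only a hypothetical sketch of a rule with the same ingredients: the weight of the observed outcome grows with the learning rate times the absolute prediction error, and the weights of all other outcomes decay at the forgetting rate. The exact update, the running value estimate, the `value_rate` parameter, the use of the payoff as its own utility, and the restriction to a single action are all assumptions rather than the authors' rule.

```python
def learn_weights(outcome_sequence, learning_rate=0.1, forgetting_rate=0.05,
                  value_rate=0.1):
    """Hypothetical reward-modulated associative learning for a single action."""
    weights = {o: 0.0 for o in set(outcome_sequence)}
    value = 0.0  # running estimate of the action's expected utility (assumed)
    for o_t in outcome_sequence:
        prediction_error = o_t - value  # utility assumed equal to the payoff
        for o in weights:
            if o == o_t:
                # Observed outcome: strengthen in proportion to |PE|.
                weights[o] += learning_rate * abs(prediction_error)
            else:
                # For all o != o_t: decay at the forgetting rate.
                weights[o] *= 1.0 - forgetting_rate
        value += value_rate * prediction_error
    return weights


# Hypothetical experience: the four outcomes occur equally often.
sequence = [-40, -20, 20, 40] * 50
weights = learn_weights(sequence)
total = sum(weights.values())
sampling_probability = {o: w / total for o, w in weights.items()}
```

Although all four outcomes are experienced equally often in this sequence, the ±40 outcomes end up with larger weights than the ±20 outcomes because they produce larger prediction errors.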
Learning Rule Converges to Utility-Weighted Sampling
Utility-weighted learning converges to [equation not preserved] with [definition not preserved].
With activation function [not preserved], the network learns to perform utility-weighted sampling.
Efficient coding (Summerfield & Tsetsos, 2015)
Model fitting
Maximum-likelihood estimation of the model parameters from block-by-block choice frequencies in Experiments 1-4 by Ludvig et al. (2014).
A single set of parameters fits all experiments.
UWS captures that people learn to overweight extreme outcomes
+40/0 vs. +20
-40/0 vs. -20
+45/5 vs. +25
-45/-5 vs. -25
Utility-Weighted Sampling Captures Memory Biases (Madan et al. 2014)
Which outcome comes to mind first?
Utility-Weighted Sampling Captures Frequency Estimation Bias (Madan et al. 2014)
How often did this door lead to each outcome?
+40: ___ %
+20: ___ %
0: ___ %
Biased Beliefs Predict Risk Seeking
Judged Freq. of High Gain (%): r_UWS = +0.23; r_people = +0.16, p < 0.05
Judged Freq. of Large Loss (%): r_UWS = -0.44; r_people = -0.48, p < 0.05
Conclusions
1. Utility-weighted sampling provides a unifying explanation for biases in memory, judgment, and decision making.
2. Utility-weighted sampling can emerge from reward-modulated associative learning.
3. People overweight extreme events because it is rational to focus on the most important eventualities.
4. Some cognitive biases may serve or reflect the rational allocation of finite cognitive resources.
Thank you!
This work was supported by grant number N00014-13-1-0341 from the Office of Naval Research to Thomas L. Griffiths, grant number RO1 MH098023 from the National Institutes of Health to Ming Hsu, and a Berkeley Graduate Fellowship to Falk Lieder. We thank Elliot Ludvig, Quentin Huys, and Mike Pacer for their suggestions and feedback.
Poster T28