
3. Simulation

Environment:

Bounded Agents

Results:

1.  Background

Myopic decision-making: Cognitive limitations prevent bounded agents from fully considering all possible long-term consequences of their actions.

Optimism bias: People systematically overestimate the probability of good outcomes [1] and underestimate how long it will take to achieve them [2]. Yet, optimists often perform better than realists [1].

Question: Is it resource-rational to be optimistic?

Hypothesis: The optimism bias supports rational decision-making by compensating for the limitation that people can look only a small number of steps ahead.

Does the Optimism Bias Support Rational Action?

Falk Lieder ∙ Sidharth Goel ∙ Ronald Kwan ∙ Thomas L. Griffiths

University of California at Berkeley, CA, USA. † Correspondence: [email protected]

References:

[1] T. Sharot, “The optimism bias,” Current Biology, vol. 21, no. 23, pp. R941–R945, 2011.

[2] R. Buehler, D. Griffin, and M. Ross, “Exploring the ‘planning fallacy’: Why people underestimate their task completion times,” Journal of Personality and Social Psychology, vol. 67, no. 3, p. 366, 1994.

[3] R. Neumann, A. N. Rafferty, and T. L. Griffiths, “A bounded rationality account of wishful thinking,” in Proceedings of the 36th Annual Conference of the Cognitive Science Society, 2014.

[4] R. S. Sutton, “Integrated architectures for learning, planning, and reacting based on approximating dynamic programming,” in Proceedings of the Seventh International Conference on Machine Learning, pp. 216–224, 1990.

[5] P. Auer, “Using confidence bounds for exploitation-exploration trade-offs,” J. Mach. Learn. Res., vol. 3, pp. 397–422, 2003.

[6] I. Szita and A. Lorincz, “The many faces of optimism: a unifying approach,” in Proceedings of the 25th International Conference on Machine Learning, pp. 1048–1055, ACM, 2008.

[7] P. Sunehag and M. Hutter, “Rationality, optimism and guarantees in general reinforcement learning,” J. Mach. Learn. Res., vol. 16, pp. 1345–1390, 2015.

[8] A. Stankevicius, Q. J. M. Huys, A. Kalra, and P. Series, “Optimism as a prior belief about the probability of future reward,” PLoS Comput. Biol., vol. 10, no. 5, e1003605, 2014.

Acknowledgment: This work was supported by ONR MURI N00014-13-1-0341.

2.  Model

Environment: MDP lifetime:

Bounded Rational Agent

Internal Model: an MDP M in which transition probabilities are distorted depending on the value of the transition (cf. [3]):

Optimism: α > 0 ∙ Realism: α = 0 ∙ Pessimism: α < 0
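A minimal sketch of such a value-dependent distortion, assuming each true transition probability is reweighted by sigmoid(V(s') − V(s)) raised to the power α and then renormalized. The function name `distort_transitions`, the two toy states, and the numbers in `V` and `T_row` are hypothetical and chosen only for illustration; they are not taken from the poster.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def distort_transitions(T_row, V, s, alpha):
    """Distort the next-state distribution of one (s, a) pair.

    T_row : dict, next state s' -> true probability T(s' | s, a)
    V     : dict, state -> value (assumed to play the role of the optimal value)
    s     : current state
    alpha : > 0 optimism, 0 realism, < 0 pessimism
    """
    # Reweight each transition by how much better s' is than s.
    weights = {
        s2: p * sigmoid(V[s2] - V[s]) ** alpha
        for s2, p in T_row.items()
    }
    # Renormalize so the distorted probabilities sum to 1.
    total = sum(weights.values())
    return {s2: w / total for s2, w in weights.items()}


# Hypothetical toy numbers: from "stuck", an action succeeds half of the time.
V = {"stuck": 0.0, "progress": 2.0}
T_row = {"stuck": 0.5, "progress": 0.5}

realist = distort_transitions(T_row, V, "stuck", alpha=0.0)
optimist = distort_transitions(T_row, V, "stuck", alpha=1.0)
pessimist = distort_transitions(T_row, V, "stuck", alpha=-1.0)
```

In this toy case the realist keeps the true probabilities, the optimist assigns more than 0.5 to reaching "progress", and the pessimist assigns less than 0.5.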

Decision Mechanism:

Limited Computational resources:

Planning Horizon Expected Life Time

3. Experiment: Does optimism help people decide better?

•  Product Manager Paradigm:

1.  Simulation Phase

2.  Decision Phase

•  1 episode of an MDP structurally equivalent to simulations

•  Bonus of up to $1 proportional to financial gain in the game

•  Participants: 337 adults recruited on Amazon Mechanical Turk

•  Independent variables:

1.  Horizon: 5, 12, 24, or 72 steps (months)

2.  Progress rate in simulations:

a)  Pessimism (50% of true rate)

b)  Realism (true rate)

c)  Optimism: 80%

•  Dependent variables: #investments, financial gain

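$$E = (S, A, T, \gamma, r)$$

$$M = (S, A, \hat{T}, \gamma, r)$$

$$\hat{T}_\alpha(s' \mid s, a) \propto T(s' \mid s, a) \cdot \operatorname{sigmoid}\big(V_E^*(s') - V_E^*(s)\big)^{\alpha}, \qquad \alpha > 0,\ \alpha = 0,\ \text{or}\ \alpha < 0$$

$$\pi_h(s) = \arg\max_a Q_h(s, a)$$

$$Q_h(s, a) = \mathbb{E}_{\hat{T}}\left[ r(s_t, a, S_{t+1}) + \max_\pi \sum_{i=t+1}^{t+h+1} r(S_i, \pi(S_i), S_{i+1}) \right]$$

where $s_t = s$ is the current state and $h$ is the planning horizon.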
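The limited-horizon decision mechanism can be illustrated with a small sketch. It assumes a standard finite-horizon backward recursion over the agent's internal model, which is one reading of the decision rule rather than the authors' implementation. The names `lookahead_q` and `greedy_action`, the three-state chain, and all rewards and probabilities are hypothetical. Here `depth` counts the total number of reward terms considered; how it maps onto h depends on the indexing convention.

```python
def lookahead_q(s, a, depth, actions, T_hat, reward):
    """Expected return of taking `a` in `s`, then acting greedily
    for `depth - 1` further steps under the internal model T_hat."""
    total = 0.0
    for s2, p in T_hat[(s, a)].items():
        future = 0.0
        if depth > 1:
            # Best achievable return from s2 with the remaining steps.
            future = max(
                lookahead_q(s2, a2, depth - 1, actions, T_hat, reward)
                for a2 in actions
            )
        total += p * (reward(s, a, s2) + future)
    return total


def greedy_action(s, depth, actions, T_hat, reward):
    """Action with the highest limited-horizon value in state `s`."""
    return max(
        actions,
        key=lambda a: lookahead_q(s, a, depth, actions, T_hat, reward),
    )


# Hypothetical chain: states 0 and 1 are unfinished, state 2 is the goal.
ACTIONS = ("wait", "invest")
T_hat = {
    (0, "wait"): {0: 1.0}, (0, "invest"): {1: 0.5, 0: 0.5},
    (1, "wait"): {1: 1.0}, (1, "invest"): {2: 0.5, 1: 0.5},
    (2, "wait"): {2: 1.0}, (2, "invest"): {2: 1.0},
}


def reward(s, a, s2):
    """Investing costs 1 before the goal; first reaching the goal pays 10."""
    cost = -1.0 if (a == "invest" and s != 2) else 0.0
    payoff = 10.0 if (s != 2 and s2 == 2) else 0.0
    return cost + payoff


myopic_choice = greedy_action(0, 1, ACTIONS, T_hat, reward)
deeper_choice = greedy_action(0, 2, ACTIONS, T_hat, reward)
```

With these toy numbers, an agent that looks only one step ahead sees just the cost of investing, while an agent that looks two steps ahead can anticipate the payoff at the goal.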