Curious Characters in Multiuser Games: A Study in Motivated Reinforcement Learning for Creative Behavior Policies
Mary Lou Maher, University of Sydney
AAAI AI and Fun Workshop, July 2010
Based on Merrick, K. and Maher, M.L. (2009) Motivated Reinforcement
Learning: Curious Characters for Multiuser Games, Springer.
Slide 2
Outline
Curiosity and Fun
Motivation
Motivated Reinforcement Learning
An Agent Model of a Curious Character
Evaluation of Behavior Policies
Slide 3
Can AI model Fun? Claim: An agent motivated by curiosity to
learn patterns is a model of fun.
Slide 4
Games try to achieve flow: a function of the player's skill and performance.
J. Chen, Flow in games (and everything else), Communications of the ACM 50(4):31-34, 2007
Slide 5
Why Motivated Reinforcement Learning?
More efficient learning: complement external reward with internal reward.
External reward not known at design time: design tasks; real-world scenarios (robotics); virtual-world scenarios (NPCs in computer games).
More autonomy in determining learning tasks: robotics; NPCs in computer games.
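The idea of complementing external reward with internal reward can be sketched as a weighted blend, where the internal term rewards novelty even when no external reward has been designed. The function name and weight values below are illustrative assumptions, not the book's formulation:

```python
def combined_reward(external, novelty, w_ext=1.0, w_int=0.5):
    """Blend a designer-supplied external reward with an internal,
    curiosity-driven reward derived from the novelty of what the
    agent just experienced."""
    return w_ext * external + w_int * novelty

# Even with no external reward defined (e.g. at design time),
# the agent still receives a learning signal from novelty alone.
r = combined_reward(external=0.0, novelty=0.8)
```

In an environment with no reward function specified, the internal term is the only learning signal, which is what lets the agent choose its own learning tasks.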
Slide 6
Models of Motivation
Cognitive: interest, competency, challenge.
Biological: stasis (homeostatic) variables such as energy and blood pressure.
Social: conformity, peer pressure.
Slide 7
MRL Agent Model
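The agent model can be sketched as a sense-motivate-learn loop: observe, compute the event (the change in observations), convert the event to an internal reward via a motivation function, and update a value function. The class below is an illustrative sketch using tabular Q-learning with internal reward only; all names and parameters are assumptions, not the book's API:

```python
class MotivatedAgent:
    """Minimal motivated-RL loop: sense -> event -> internal reward
    -> tabular Q-learning update -> act. Illustrative sketch only."""

    def __init__(self, actions, motivation, alpha=0.1, gamma=0.9):
        self.actions = actions
        self.motivation = motivation  # maps an event to an internal reward
        self.alpha, self.gamma = alpha, gamma
        self.q = {}                   # (state, action) -> value
        self.prev = None              # (state, action, observation)

    def act(self, obs):
        state = tuple(obs)
        if self.prev is not None:
            p_state, p_action, p_obs = self.prev
            # the event is the change between successive observations
            event = tuple(o - p for o, p in zip(obs, p_obs))
            r = self.motivation(event)  # internal reward, no external term
            best = max(self.q.get((state, a), 0.0) for a in self.actions)
            old = self.q.get((p_state, p_action), 0.0)
            self.q[(p_state, p_action)] = old + self.alpha * (
                r + self.gamma * best - old)
        # greedy choice; a real agent would also explore
        action = max(self.actions, key=lambda a: self.q.get((state, a), 0.0))
        self.prev = (state, action, obs)
        return action
```

A motivation function rewarding large observation changes, for example `lambda e: float(sum(abs(x) for x in e))`, drives the agent toward states where something happens.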
Slide 8
Motivation as Interesting Events
An event is a change in observations between successive time steps:
E(t) = O(t) - O(t-1) = ((o_1(t) - o_1(t-1)), (o_2(t) - o_2(t-1)), ..., (o_L(t) - o_L(t-1)))
D.E. Berlyne, Exploration and Curiosity, Science 153:24-33, 1966
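The element-wise difference above is a one-liner; a zero component means that sensation did not change. A minimal sketch (variable names assumed):

```python
def event(obs_t, obs_prev):
    """An event is the change between successive observation vectors:
    E(t) = O(t) - O(t-1), computed element-wise."""
    return tuple(o - p for o, p in zip(obs_t, obs_prev))

event((3, 5, 1), (3, 4, 1))  # only the second sensation changed
```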
Slide 9
Sensed States: Context-Free Grammar (CFG)
CFG_S = (V_S, T_S, P_S, S_S) where:
V_S is a set of variables or syntactic categories,
T_S is a finite set of terminals such that V_S and T_S are disjoint (V_S ∩ T_S = {}),
P_S is a set of productions V -> v where V is a variable and v is a string of terminals and variables,
S_S is the start symbol.
Thus, the general form of a sensed state is a string of sensations derived from the start symbol S_S.
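Such a grammar can be represented as a mapping from each variable to its alternative productions, with sensed states produced by derivation from the start symbol. The rules below are hypothetical placeholders (the deck's actual productions are not shown here), chosen only to illustrate the mechanism:

```python
# Toy grammar: a sensed state is one or more sensations, and each
# sensation is one of a few terminal categories (rules are hypothetical).
GRAMMAR = {
    "S": [["Sensation", "S"], ["Sensation"]],
    "Sensation": [["location"], ["object"], ["health"]],
}

def derive(symbol, choose):
    """Left-most derivation: expand variables until only terminals
    remain. `choose` picks one production from the alternatives."""
    if symbol not in GRAMMAR:  # terminal symbol
        return [symbol]
    out = []
    for s in choose(GRAMMAR[symbol]):
        out.extend(derive(s, choose))
    return out

# Always taking the last alternative yields the shortest sensed
# state: a single sensation.
derive("S", lambda alts: alts[-1])
```

A randomized `choose` would generate longer sensed states; the grammar lets one agent design handle worlds whose state vectors differ in length and content.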
Slide 10
MRL for Non-Player Characters
Slide 11
Habituated Self-Organizing Map
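A habituated self-organizing map can serve as the novelty detector behind the interest-based reward: a small SOM finds the best-matching unit for each stimulus, and that unit's habituation value decays each time it fires, so repeated stimuli become boring. The sketch below is a simplified illustration (parameters, the fixed initialization, and the omission of neighborhood updates and habituation recovery are all simplifying assumptions):

```python
import math

class HabituatedSOM:
    """Sketch of a habituated SOM: the winning unit's habituation
    value decays on each activation, so familiar stimuli yield
    low novelty."""

    def __init__(self, units, dim, lr=0.3, tau=0.5):
        # evenly spread prototype vectors (simplified initialization)
        self.weights = [[(u + 1) / (units + 1)] * dim for u in range(units)]
        self.habituation = [1.0] * units  # 1.0 = fully novel
        self.lr, self.tau = lr, tau

    def present(self, x):
        # best-matching unit by Euclidean distance
        def dist(w):
            return math.sqrt(sum((wi - xi) ** 2 for wi, xi in zip(w, x)))
        bmu = min(range(len(self.weights)),
                  key=lambda u: dist(self.weights[u]))
        novelty = self.habituation[bmu]
        # move the winner toward the stimulus, then habituate it
        self.weights[bmu] = [wi + self.lr * (xi - wi)
                             for wi, xi in zip(self.weights[bmu], x)]
        self.habituation[bmu] *= (1 - self.tau)
        return novelty

som = HabituatedSOM(units=3, dim=2)
first = som.present([0.9, 0.9])   # novel stimulus
second = som.present([0.9, 0.9])  # same stimulus, now habituated
```

A full habituated SOM also lets habituation recover while a stimulus is absent, which makes long-unseen events interesting again.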
Slide 12
Behavioral Variety
Behavioral variety measures the number of events for which a near-optimal policy is learned. We characterize the level of optimality of a policy learned to achieve the event E(t) in terms of its structural stability.
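Given a stability score per learned event policy, the metric reduces to a threshold count. The data shape and the threshold value below are illustrative assumptions:

```python
def behavioral_variety(policies, stability_threshold=0.9):
    """Count the events for which a near-optimal policy has been
    learned, taking 'near optimal' to mean structural stability at
    or above a threshold (threshold chosen for illustration)."""
    return sum(1 for stability in policies.values()
               if stability >= stability_threshold)

# stability score per learned event policy (hypothetical values)
behavioral_variety({"open_door": 0.95, "light_fire": 0.4, "mine_ore": 0.92})
```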
Slide 13
Behavioral Complexity
The complexity of a policy can be measured by averaging the number of actions required to repeat the event E(t) at any time when the current behavior is stable.
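Measured over logged action counts, this is a simple average over the stable-behavior repetitions of each event. The data shape is an illustrative assumption:

```python
def behavioral_complexity(actions_to_repeat):
    """Average the number of actions needed to repeat each learned
    event, over repetitions observed while behavior was stable."""
    counts = [n for per_event in actions_to_repeat.values()
              for n in per_event]
    return sum(counts) / len(counts)

# action counts per event repetition (hypothetical values)
behavioral_complexity({"open_door": [2, 2], "mine_ore": [6, 4]})
```

Together with behavioral variety, this distinguishes a character that learns many simple behaviors from one that learns a few complex ones.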
Slide 14
Research Directions
Scalability and dynamics: alternative RL representations such as decision trees and neural-network function approximation.
Motivation functions: competence, optimal challenge, social models.
Slide 15
Relevance to AI and Fun
Is it more fun to play with curious NPCs?
Can a curious agent play a game to test how fun the game is?