Human-level control through deep reinforcement learning
Naiyan Wang
Part 1: Q Learning
Q Learning
State, Action, Reward
Q Learning
[Equation: Q-learning update rule, with labels: New State, Old State, Reward, Learning Rate, Discount Factor]
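The labels on this slide (old state, new state, reward, learning rate, discount factor) match the terms of the standard tabular Q-learning update, Q(s, a) ← Q(s, a) + α [r + γ max_a' Q(s', a') − Q(s, a)]. The slide's own equation did not survive, so the following is only a minimal sketch of that textbook rule. The function name `q_update`, the table shape, and the toy numbers are assumptions made for illustration.

```python
import numpy as np


def q_update(q_table, state, action, reward, next_state,
             learning_rate=0.1, discount_factor=0.99):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Best value reachable from the new state under the current estimates.
    best_next = np.max(q_table[next_state])
    # Target: immediate reward plus the discounted value of the new state.
    td_target = reward + discount_factor * best_next
    # Move the old estimate a fraction (the learning rate) toward the target.
    q_table[state, action] += learning_rate * (td_target - q_table[state, action])
    return q_table[state, action]


# Hypothetical toy problem: 3 states, 2 actions, values initialised to zero.
q = np.zeros((3, 2))
q[1] = [0.0, 2.0]  # pretend state 1 already has a learned estimate
new_value = q_update(q, state=0, action=1, reward=1.0, next_state=1,
                     learning_rate=0.5, discount_factor=0.9)
```

With a learning rate of 0.5 the old estimate moves halfway toward the target 1.0 + 0.9 × 2.0, which shows how the learning rate and discount factor enter the update separately.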
Part 2: Deep Q Learning
Traditional Cooking
End to End Cooking
End to End Learning
Formulation
Target Variable
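The formulation on this slide is not recoverable from the text, so the following is a hedged sketch that assumes it follows the DQN paper: the target variable for a transition is y = r + γ max_a' Q(s', a'; θ⁻), computed from a separate target network, with y = r at terminal states. The helper names `dqn_targets` and `squared_td_loss`, the array shapes, and the numbers are hypothetical. The loss here is a plain mean squared error, not the paper's clipped error term.

```python
import numpy as np


def dqn_targets(rewards, next_q_values, dones, discount_factor=0.99):
    """Compute y = r + gamma * max_a' Q_target(s', a'), or y = r if terminal.

    rewards:        shape (batch,)
    next_q_values:  shape (batch, n_actions), assumed to come from the target network
    dones:          shape (batch,), 1.0 where the episode ended, else 0.0
    """
    max_next_q = next_q_values.max(axis=1)
    # Terminal transitions get no bootstrapped future value.
    return rewards + discount_factor * max_next_q * (1.0 - dones)


def squared_td_loss(q_values, actions, targets):
    """Mean squared error between the targets and Q(s, a) for the actions taken."""
    chosen = q_values[np.arange(len(actions)), actions]
    return np.mean((targets - chosen) ** 2)


# Hypothetical batch of two transitions; the second one is terminal.
rewards = np.array([1.0, 0.0])
next_q = np.array([[0.5, 2.0], [1.0, 3.0]])
dones = np.array([0.0, 1.0])
targets = dqn_targets(rewards, next_q, dones, discount_factor=0.9)

q_values = np.array([[2.0, 0.0], [0.0, 1.0]])
actions = np.array([0, 1])
loss = squared_td_loss(q_values, actions, targets)
```

Treating the target as a fixed regression label is what lets the network be trained with an ordinary supervised loss, even though the label itself depends on a (lagged) copy of the network.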
Results Analysis
DQN is good at …
DQN is bad at …
Part 3: Discussion
Q: What is the key contributing factor?
A: Almost unlimited training data
Q: How to account for long-term dependencies?
A: Long short-term memory (LSTM) may be the solution
Thank You