Human level control through deep reinforcement learning

Naiyan Wang


Part 1: Q Learning

Q Learning

State, Action, Reward

Q Learning

Terms labelled in the update rule: New State, Old State, Reward, Learning Rate, Discount Factor
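
The labelled terms can be tied together with a minimal sketch of the standard tabular Q-learning update. The slide's own equation did not survive extraction, so this is the textbook rule rather than a copy of the slide. The function name q_update, the dictionary-based table, and the toy states and actions are all hypothetical.

```python
from collections import defaultdict


def q_update(q, old_state, action, reward, new_state, actions,
             learning_rate=0.1, discount_factor=0.9):
    """Apply one tabular Q-learning update; return the new Q(old_state, action)."""
    # Best value reachable from the new state under the current estimates.
    best_next = max(q[(new_state, a)] for a in actions)
    # Target: immediate reward plus discounted estimate of future value.
    target = reward + discount_factor * best_next
    # Move the old estimate a fraction (the learning rate) toward the target.
    q[(old_state, action)] += learning_rate * (target - q[(old_state, action)])
    return q[(old_state, action)]


# Toy example (hypothetical states and actions); unseen entries default to 0.0.
q = defaultdict(float)
actions = ["left", "right"]
q[("s1", "right")] = 2.0
value = q_update(q, "s0", "right", reward=1.0, new_state="s1",
                 actions=actions, learning_rate=0.5, discount_factor=0.5)
print(value)  # 1.0
```

With these toy numbers the target is 1.0 + 0.5 * 2.0 = 2.0, and the old estimate of 0.0 moves halfway to it.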

Part 2: Deep Q Learning

Traditional Cooking

End to End Cooking

End to End Learning

Formulation

Target Variable
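
The formula on this slide did not survive extraction. As a rough sketch, assuming "Target Variable" refers to the usual Q-learning regression target used in deep Q-learning (reward plus discounted maximum Q-value at the new state), the target and the squared error being minimised could look like this. The function names and the example numbers are hypothetical, and the list next_q_values stands in for the output of a network evaluated at the new state.

```python
def td_target(reward, next_q_values, discount_factor=0.99, terminal=False):
    """Return the regression target y for a single transition."""
    if terminal:
        # No future value once the episode has ended.
        return reward
    # Reward plus the discounted best Q-value estimate at the new state.
    return reward + discount_factor * max(next_q_values)


def squared_td_error(predicted_q, target):
    """Squared difference between the target and the current prediction."""
    return (target - predicted_q) ** 2


# Stand-in Q-values for three actions at the new state (assumed numbers).
y = td_target(1.0, [0.0, 2.0, 1.0], discount_factor=0.5)
loss = squared_td_error(1.5, y)
print(y, loss)  # 2.0 0.25
```

In a full system the prediction and the Q-values at the new state would come from a neural network, and the loss would be averaged over a batch of transitions.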

Results Analysis

DQN is good at …

DQN is bad at …

Part 3: Discussion

Discussion

Q: What is the key contributing factor?
A: Almost unlimited training data

Q: How to account for long-term dependency?
A: Long short-term memory may be the solution

Thank You
