Robot control with Deep Reinforcement Learning Hadi Beik-Mohammadi INTELLIGENT ROBOTICS SEMINAR TALK, DECEMBER 2017




• Inverse and Forward Kinematics

• How to Learn a Behavior

• Methods
  • Inverse Recurrent Model
  • Deep Deterministic Policy Gradient

• Conclusion


[Figure: a two-joint arm (Joint 0, Joint 1, End Effector) reaching for a Target, drawn in Cartesian space (axes X and Y, in cm), with the same targets plotted in joint space (axes Joint 0 and Joint 1, each from 0 to 360).]


Forward and Inverse Kinematics

• Forward kinematics: joint space (θ1, θ2, θ3, ..., θn) → Cartesian space (X, Y, Z, θX, θY, θZ)

• Inverse kinematics: Cartesian space (X, Y, Z, θX, θY, θZ) → joint space (θ1, θ2, θ3, ..., θn)
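As a rough illustration of the two mappings, here is a minimal sketch for a planar arm with two joints (Joint 0 and Joint 1). The link lengths `l0` and `l1`, the function names, and the choice of one of the two possible elbow solutions are assumptions made for this example, not details from the talk.

```python
import math


def forward_kinematics(theta0, theta1, l0=1.0, l1=1.0):
    """Joint space -> Cartesian space: end-effector (x, y) for joint angles in radians."""
    x = l0 * math.cos(theta0) + l1 * math.cos(theta0 + theta1)
    y = l0 * math.sin(theta0) + l1 * math.sin(theta0 + theta1)
    return x, y


def inverse_kinematics(x, y, l0=1.0, l1=1.0):
    """Cartesian space -> joint space: one of the two closed-form solutions."""
    # Law of cosines gives the cosine of the second joint angle.
    cos_t1 = (x * x + y * y - l0 * l0 - l1 * l1) / (2.0 * l0 * l1)
    if abs(cos_t1) > 1.0 + 1e-9:
        raise ValueError("target is out of reach")
    cos_t1 = max(-1.0, min(1.0, cos_t1))  # guard against rounding error
    theta1 = math.acos(cos_t1)            # the mirrored solution uses -theta1
    theta0 = math.atan2(y, x) - math.atan2(
        l1 * math.sin(theta1), l0 + l1 * math.cos(theta1)
    )
    return theta0, theta1


if __name__ == "__main__":
    angles = inverse_kinematics(0.5, 1.2)
    print(angles, forward_kinematics(*angles))  # the second tuple should be close to (0.5, 1.2)
```

The forward direction is a single formula, while the inverse direction already has two solutions and an unreachable region for two joints; with many joints there is in general no unique answer, which is what motivates the learned approaches in the rest of the talk.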


How to build agents that learn behaviors in a dynamic world?

"The brain evolved, not to think or feel, but to control movement."
Daniel Wolpert (TED talk)

Learning a behavior:

Learning to map sequences of observations to actions for a particular goal [3]


What supervision does an agent need to learn purposeful behaviors in dynamic environments?

• Rewards: sparse feedback from the environment on whether the desired behavior is achieved

• Demonstrations

• Specifications/Attributes of good behavior


Methods

Inverse Recurrent Model (IRM) [1]

• Controls snake-like many-joint robot arms

• BPTT on recurrent forward models

• Recurrent neural networks (LSTM)

• Offline

Deep Deterministic Policy Gradient (DDPG) [2]

• Deep reinforcement learning method

• Actor-critic network

• Continuous action domain

• Model-free

• Online


Inverse Recurrent Model (IRM) [1]
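The idea of inferring joint angles by pushing a target error back through a forward model can be sketched in a few lines. This is only a loose illustration and not the IRM method of [1]: the trained LSTM forward model is swapped for an analytic planar chain, and the exact gradient of that chain stands in for BPTT. The link length, learning rate, iteration count, and all function names are assumptions.

```python
import numpy as np


def chain_positions(angles, link_length=1.0):
    """Stand-in forward model: joint positions of a planar chain with its base at the origin."""
    headings = np.cumsum(angles)  # absolute orientation of each link
    steps = link_length * np.stack([np.cos(headings), np.sin(headings)], axis=1)
    return np.vstack([np.zeros(2), np.cumsum(steps, axis=0)])  # shape (n + 1, 2)


def infer_angles(target, initial_angles, lr=0.01, iterations=5000):
    """Gradient descent on the joint angles to reduce the end-effector error."""
    angles = np.array(initial_angles, dtype=float)
    target = np.asarray(target, dtype=float)
    for _ in range(iterations):
        points = chain_positions(angles)
        error = points[-1] - target
        # Rotating joint i moves the end effector perpendicular to the
        # lever that runs from joint i to the end effector.
        levers = points[-1] - points[:-1]                           # shape (n, 2)
        jac_t = np.stack([-levers[:, 1], levers[:, 0]], axis=1)     # row i = d(end) / d(angle_i)
        angles -= lr * (jac_t @ error)  # gradient of 0.5 * ||error||^2
    return angles


if __name__ == "__main__":
    solution = infer_angles(target=[2.0, 2.0], initial_angles=[0.3] * 5)
    print(chain_positions(solution)[-1])  # end-effector position after optimisation
```

Because the optimisation runs over the inputs of a fixed forward model, no inverse mapping is ever learned explicitly; the arm configuration is found per target.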


Deep Deterministic Policy Gradient (DDPG)

[Diagram: actor-critic loop. Labels: Actor Network (Policy Function), Critic Network (State-Value Function), Environment, State, Action, TD.]
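Two building blocks of the DDPG update described in [2] can be sketched without any deep learning library: the TD target that the critic regresses toward, and the soft update of the target networks. In [2] the critic estimates the action value Q(s, a). The function names, the `done` mask for episode ends, and the default values of `gamma` and `tau` are assumptions for illustration; the networks themselves are left out.

```python
import numpy as np


def td_target(reward, next_q, done, gamma=0.99):
    """Critic target y = r + gamma * Q'(s', mu'(s')), with no bootstrap after a terminal step."""
    return reward + gamma * (1.0 - done) * next_q


def soft_update(target_params, online_params, tau=0.001):
    """Move each target-network parameter a small step toward its online counterpart."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


if __name__ == "__main__":
    rewards = np.array([1.0, 0.0])
    next_q = np.array([2.0, 3.0])   # would come from the target critic and target actor
    done = np.array([0.0, 1.0])     # second transition ends the episode
    print(td_target(rewards, next_q, done))
    print(soft_update([np.zeros(2)], [np.ones(2)], tau=0.1))
```

The actor is then updated to increase the critic's estimate for the actions it proposes, and the slowly moving target parameters keep the TD target from chasing itself.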


Deep Deterministic Policy Gradient (DDPG)

https://youtu.be/tJBIqkC1wWM


Rewarding

[Figure: two-joint arm (Joint 0, Joint 1, End Effector) with Dist(t), the distance from the end effector to the target at time t.]

Reward 1 = Gaussian(Dist(t))

(Gaussian curve: https://www.sfu.ca/sonic-studio/handbook/Graphics/Gaussian.gif)

Reward 2 = Dist(t-1) - Dist(t)
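The two reward signals could be written as follows. This is a minimal sketch: the width `sigma`, the unnormalised Gaussian centred on zero distance, and the function names are assumptions, since the slide only gives the general form.

```python
import math


def gaussian_reward(dist, sigma=1.0):
    """Reward 1: highest (1.0) at zero distance, falling off smoothly with distance."""
    return math.exp(-(dist ** 2) / (2.0 * sigma ** 2))


def progress_reward(prev_dist, dist):
    """Reward 2: positive when the end effector moved closer since the last step."""
    return prev_dist - dist


if __name__ == "__main__":
    print(gaussian_reward(0.0), gaussian_reward(2.0))
    print(progress_reward(5.0, 3.0), progress_reward(3.0, 5.0))
```

Reward 1 depends only on the current distance, so it is nearly flat far from the target, whereas Reward 2 gives a signal at any distance as long as the arm moves.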


2-DOF manipulator: actor and critic maps during learning


Pros:

• Operates over continuous action spaces

• Can learn policies end-to-end

• Model-free

Cons:

• No proof that learning converges

• No guarantee on the quality of the results

• Requires a large number of training episodes to find solutions


References

• [1] Sebastian Otte, Adrian Zwiener, and Martin V. Butz, "Inherently Constraint-Aware Control of Many-Joint Robot Arms with Inverse Recurrent Models"

• [2] Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra, "Continuous Control with Deep Reinforcement Learning"

• [3] "Deep Reinforcement Learning and Control", CMU 10703, Spring 2017
