An Application of Reinforcement Learning to Aerobatic Helicopter Flight
Greg McChesney
Texas Tech University
Greg.mcchesney@ttu.edu
Apr 08, 2009
CS5331: Autonomous Mobile Robots
Overview
Creating a robot that can fly autonomously
Software developed at Stanford's AI lab
The paper is slightly outdated: many new maneuvers have been created since
Learning Approach
Apprenticeship
  Collect data from a human pilot trying the maneuver (multiple times)
  Learn a model from the data
  Find a controller that works in simulation based on the model
  Test on the helicopter (pray it doesn’t crash)
Helicopter State
Position
Velocity
Angular velocity
Controlled with 4 dimensions
  Cyclic pitch
  Tail rotor
Take gravity out when learning the model
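To make the "take gravity out" step concrete, here is a minimal one-dimensional sketch. It assumes a vertical axis where gravity contributes a known constant `G` to the measured acceleration, and it fits only the remainder as a linear function of velocity and control input by least squares. The function name, the linear form, and the sample numbers are illustrative assumptions, not the model from the paper.

```python
G = 9.81  # m/s^2, known gravity contribution along the (hypothetical) down-positive axis


def fit_residual_model(velocities, inputs, measured_accels):
    """Fit (accel - G) ~= a * velocity + c * input by ordinary least squares."""
    residuals = [acc - G for acc in measured_accels]  # gravity removed before fitting

    # Entries of the 2x2 normal equations.
    svv = sum(v * v for v in velocities)
    svu = sum(v * u for v, u in zip(velocities, inputs))
    suu = sum(u * u for u in inputs)
    svy = sum(v * y for v, y in zip(velocities, residuals))
    suy = sum(u * y for u, y in zip(inputs, residuals))

    det = svv * suu - svu * svu
    if det == 0:
        raise ValueError("data does not excite velocity and input independently")

    # Cramer's rule for the two coefficients.
    a = (svy * suu - svu * suy) / det
    c = (svv * suy - svu * svy) / det
    return a, c


# Synthetic, noise-free "flight log" generated from made-up coefficients.
true_a, true_c = -0.5, -20.0
vels = [0.0, 1.0, -1.0, 2.0]
ctrls = [0.1, -0.2, 0.3, 0.0]
accs = [G + true_a * v + true_c * u for v, u in zip(vels, ctrls)]

a_hat, c_hat = fit_residual_model(vels, ctrls, accs)
print(a_hat, c_hat)
```

Because gravity is known physics, removing it first leaves the regression to explain only the part of the motion that actually depends on the helicopter and its inputs.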
Controller Design
Use a Markov decision process (MDP)
Sextuple (S, A, T, H, s(0), R)
  S: set of states
  A: set of actions (inputs)
  T: dynamics model, a set of probability distributions for the next state
  H: horizon, the number of time steps of interest
  s(0): initial state
  R: reward function
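The sextuple maps directly onto a small data structure. The sketch below is a toy, assuming integer states and actions and representing T as a sampler for the next state; the class name, the gust probability, and the reward are all made up for illustration and have nothing to do with the real helicopter's state space.

```python
import random
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class FiniteHorizonMDP:
    states: Sequence[int]                                 # S
    actions: Sequence[int]                                # A
    transition: Callable[[int, int, random.Random], int]  # T, as a next-state sampler
    horizon: int                                          # H
    initial_state: int                                    # s(0)
    reward: Callable[[int, int], float]                   # R


def rollout(mdp, policy, seed=0):
    """Run `policy` for H steps from s(0) and return the summed reward."""
    rng = random.Random(seed)
    state, total = mdp.initial_state, 0.0
    for t in range(mdp.horizon):
        action = policy(state, t)
        total += mdp.reward(state, action)
        state = mdp.transition(state, action, rng)
    return total


# Toy example (hypothetical): altitude error in {-2..2}, actions shift it by -1, 0 or +1.
def gusty_step(state, action, rng):
    moved = state + action if rng.random() < 0.9 else state  # 10% of moves are lost to a gust
    return max(-2, min(2, moved))


def quadratic_reward(state, action):
    return -(state ** 2) - 0.1 * abs(action)


def toward_zero(state, t):
    return -1 if state > 0 else (1 if state < 0 else 0)


toy = FiniteHorizonMDP(range(-2, 3), (-1, 0, 1), gusty_step, 10, 2, quadratic_reward)
print(rollout(toy, toward_zero))
```

The controller design problem is then to choose the policy that maximizes the expected value of what `rollout` returns.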
Differential Dynamic Programming (DDP)
Compute the linear approximation of the dynamics around the current trajectory
Compute the optimal solution to the resulting linear quadratic regulator (LQR)
Must take into account the error state
Cost for change in input: needed in real testing
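The LQR step inside each DDP iteration can be illustrated with a scalar system. This is a sketch under strong simplifying assumptions: one state, one input, dynamics already linear (x' = a*x + b*u), and a quadratic cost q*x^2 + r*u^2. Real DDP repeats this solve after re-linearizing around each new trajectory; the function names and numbers here are hypothetical.

```python
def lqr_gains(a, b, q, r, q_final, horizon):
    """Backward Riccati recursion for x' = a*x + b*u with stage cost q*x^2 + r*u^2."""
    p = q_final                 # cost-to-go coefficient at the final time step
    gains = [0.0] * horizon
    for t in reversed(range(horizon)):
        k = (b * p * a) / (r + b * b * p)  # optimal feedback gain at step t
        p = q + a * p * (a - b * k)        # cost-to-go coefficient one step earlier
        gains[t] = k
    return gains


def simulate(a, b, gains, x0):
    """Apply the time-varying feedback u = -k_t * x and return the state trajectory."""
    xs = [x0]
    for k in gains:
        u = -k * xs[-1]
        xs.append(a * xs[-1] + b * u)
    return xs


gains = lqr_gains(a=1.0, b=0.5, q=1.0, r=0.1, q_final=1.0, horizon=20)
trajectory = simulate(1.0, 0.5, gains, x0=1.0)
print(trajectory[-1])
```

In this picture, x plays the role of the error state (deviation from the target trajectory). A cost on the change in input could be handled by adding the previous input to the state, which this scalar sketch leaves out.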
DDP-Continued
2 phases
  DDP to find the open-loop input sequence
  Use DDP again, refining the inputs as a deviation from the nominal open-loop input sequence
Integral control: takes into account wind and errors in the model
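Why integral control helps against wind can be seen in a toy loop. The sketch assumes a one-dimensional position error pushed by a constant disturbance, with hand-picked gains `kp` and `ki`; it is a plain proportional-integral loop, not the DDP-based controller described on the slide.

```python
def final_error(kp, ki, wind, steps=200):
    """Simulate x' = x + u + wind with u = -kp*x - ki*z, where z accumulates the error."""
    x = 0.0  # position error
    z = 0.0  # running sum of past errors (the integral term)
    for _ in range(steps):
        u = -kp * x - ki * z
        z += x
        x = x + u + wind
    return x


# Proportional feedback alone settles at a constant offset of wind / kp.
print(final_error(kp=0.5, ki=0.0, wind=0.2))

# Adding the integral term drives the steady-state error to zero.
print(final_error(kp=0.5, ki=0.1, wind=0.2))
```

The accumulated error keeps growing until the control input exactly cancels the constant push, which is the same mechanism that absorbs a constant bias in the learned model.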
Rewards
24 features
Used inverse reinforcement learning
Rewards from inverse reinforcement learning usually did not produce the correct result
Took the inverse reinforcement learning results and manually tuned them to get good results
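One way to picture a feature-based reward is as a negative weighted sum of penalty features, where the weights are the quantities that get learned and then hand-tuned. The sketch below assumes squared-error, squared-input, and squared-input-change features; that choice, the function name, and the numbers are illustrative assumptions, and only four features are used rather than 24.

```python
def reward(error_state, inputs, prev_inputs, weights):
    """Negative weighted sum of squared penalty features (larger penalties, lower reward)."""
    features = (
        [e * e for e in error_state]                            # deviation from the target state
        + [u * u for u in inputs]                               # control effort
        + [(u - p) ** 2 for u, p in zip(inputs, prev_inputs)]   # change in input
    )
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature")
    return -sum(w * f for w, f in zip(weights, features))


# Two error terms, one input, one input-change term: four weights in total.
learned = [1.0, 1.0, 2.0, 4.0]
print(reward([1.0, 0.0], [0.5], [0.0], learned))

# Manual tuning amounts to editing individual weights and re-running the controller.
tuned = [1.0, 1.0, 2.0, 8.0]
print(reward([1.0, 0.0], [0.5], [0.0], tuned))
```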
Helicopter
Xcell Tempest
54” long
19” high
13 lbs
Two-stroke engine
Orientation sensors
GPS: doesn’t work during flips
Questions
Motivations/Who pays for it?
  I can see applications in the defense sector
  DARPA
Could more maneuvers be done just by changing some parameters?
  Probably not, because the filter is learned based on a model, so you would need to create a new model
More Questions
What's the relationship between reinforcement learning and MDP?
  Not sure
Could a helicopter like this operate in the West Texas wind storms?