
An Application of Reinforcement Learning to Aerobatic Helicopter

Greg McChesney
Texas Tech University

[email protected]

Apr 08, 2009
CS5331: Autonomous Mobile Robots

Overview

Creating a robot that can fly autonomously

Software developed at Stanford as part of their AI lab

This paper is slightly outdated as many new maneuvers have been created.



Learning Approach

Apprenticeship
- Collect data from a human flying the maneuver (multiple times)
- Learn a model from the data
- Find a controller that works in simulation based on the model
- Test on the helicopter (pray it doesn’t crash)
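
The "learn a model from the data" step can be illustrated with a toy least-squares fit. This is only a sketch under strong assumptions: a one-dimensional linear model v_next = a*v + b*u, noise-free data, and a simulated stand-in for the pilot's inputs. The names simulate_demo and fit_linear_model are hypothetical and are not taken from the paper.

```python
import random


def simulate_demo(a_true, b_true, steps, seed=0):
    """Generate a hypothetical 'pilot demonstration' as (v, u, v_next) triples."""
    rng = random.Random(seed)
    v, data = 0.0, []
    for _ in range(steps):
        u = rng.uniform(-1.0, 1.0)        # stand-in for the pilot's input
        v_next = a_true * v + b_true * u  # noise-free toy dynamics (assumption)
        data.append((v, u, v_next))
        v = v_next
    return data


def fit_linear_model(data):
    """Least-squares fit of v_next ~ a*v + b*u via the 2x2 normal equations."""
    svv = sum(v * v for v, _, _ in data)
    suu = sum(u * u for _, u, _ in data)
    svu = sum(v * u for v, u, _ in data)
    svy = sum(v * y for v, _, y in data)
    suy = sum(u * y for _, u, y in data)
    det = svv * suu - svu * svu
    if abs(det) < 1e-12:
        raise ValueError("data does not excite both v and u; cannot fit")
    a = (svy * suu - suy * svu) / det
    b = (suy * svv - svy * svu) / det
    return a, b


# Fit the model from 200 simulated demonstration steps.
data = simulate_demo(a_true=0.9, b_true=0.5, steps=200)
a_hat, b_hat = fit_linear_model(data)
```

With real flight logs the data would be noisy and the state multi-dimensional, so the fit would only approximate the true dynamics.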



Helicopter State

- Position
- Velocity
- Angular velocity
- Controlled with 4 dimensions
  - Cyclic pitch
  - Tail rotor
- Take gravity out when calculating the model



Controller Design

Use a Markov decision process (MDP)
Sextuple (S, A, T, H, s(0), R)
- S: set of states
- A: set of actions (inputs)
- T: dynamics model, a set of probability distributions for the next state
- H: horizon, or number of time steps of interest
- s(0): initial state
- R: reward function
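
The sextuple above maps fairly directly onto a small data structure. The sketch below is illustrative only: the class name FiniteHorizonMDP, the rollout helper, and the one-dimensional toy dynamics in make_toy_mdp are all assumptions made for this example, not the helicopter model from the paper. S and A are left implicit as tuples of floats.

```python
import random
from dataclasses import dataclass
from typing import Callable, Tuple

State = Tuple[float, ...]   # an element of S
Action = Tuple[float, ...]  # an element of A


@dataclass
class FiniteHorizonMDP:
    """Holds T, H, s(0) and R; S and A are implied by the State/Action types."""
    transition: Callable[[State, Action, random.Random], State]  # T: samples next state
    horizon: int                                                 # H
    initial_state: State                                         # s(0)
    reward: Callable[[State, Action], float]                     # R


def rollout(mdp, policy, seed=0):
    """Run policy(state, t) for H steps and return the summed reward."""
    rng = random.Random(seed)
    s, total = mdp.initial_state, 0.0
    for t in range(mdp.horizon):
        a = policy(s, t)
        total += mdp.reward(s, a)
        s = mdp.transition(s, a, rng)
    return total


def make_toy_mdp(noise_std, horizon=3, dt=0.1):
    """Hypothetical 1-D point mass: state (pos, vel), action (accel,)."""
    def transition(s, a, rng):
        pos, vel = s
        return (pos + dt * vel, vel + dt * a[0] + rng.gauss(0.0, noise_std))

    def reward(s, a):
        # Penalize distance from the origin and large inputs.
        return -(s[0] ** 2) - 0.1 * a[0] ** 2

    return FiniteHorizonMDP(transition, horizon, (1.0, 0.0), reward)


# Total reward of a do-nothing policy on the noise-free toy problem.
total = rollout(make_toy_mdp(noise_std=0.0), lambda s, t: (0.0,))
```

The controller design problem is then to find the policy that maximizes the expected value of this summed reward.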



Differential Dynamic Programming (DDP)

- Compute a linear approximation of the dynamics
- Compute the optimal solution to the linear quadratic regulator
- Must take into account the error state
- Cost for change in input (needed in real testing)
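
The first two steps above (linearize, then solve a linear quadratic regulator) can be sketched for a scalar system. This is a single linearize-and-solve pass, not the full iterative DDP procedure, and it leaves out the error state and the input-change cost. The dynamics function toy_dynamics and the cost weights q and r are hypothetical.

```python
import math


def linearize(f, x0, u0, eps=1e-5):
    """Central-difference linearization of x_next = f(x, u) around (x0, u0)."""
    a = (f(x0 + eps, u0) - f(x0 - eps, u0)) / (2 * eps)
    b = (f(x0, u0 + eps) - f(x0, u0 - eps)) / (2 * eps)
    return a, b


def lqr_gains(a, b, q, r, horizon):
    """Backward Riccati recursion for the cost sum(q*x^2 + r*u^2).

    Returns gains K_0..K_{H-1} for the feedback law u_t = -K_t * x_t.
    """
    p = q  # cost-to-go weight at the final step
    gains = []
    for _ in range(horizon):
        k = (b * p * a) / (r + b * p * b)
        p = q + a * p * a - a * p * b * k
        gains.append(k)
    gains.reverse()  # computed backward in time, returned forward in time
    return gains


def toy_dynamics(x, u):
    """Hypothetical nonlinear scalar dynamics used only for illustration."""
    return x + 0.1 * math.sin(x) + 0.5 * u


a_lin, b_lin = linearize(toy_dynamics, 0.0, 0.0)
gains = lqr_gains(a_lin, b_lin, q=1.0, r=1.0, horizon=50)
```

In DDP this pass would be repeated: simulate the new controller, re-linearize around the resulting trajectory, and solve again until the trajectory stops changing.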



DDP-Continued

2 phases
- Use DDP to find an open-loop input sequence
- Use DDP again, refining the inputs as a deviation from the nominal open-loop input sequence

Integral control: takes into account wind and errors in the model
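
Why an integral term helps against wind can be seen on a toy problem. The sketch below uses a plain proportional-integral feedback law on a scalar system with a constant disturbance; it is a simpler stand-in for illustration, not the DDP-based formulation from the paper. The gains kp and ki, the wind value, and the dynamics are all assumptions.

```python
def simulate(kp, ki, wind=1.0, dt=0.1, steps=500):
    """Return the final position error of x_next = x + dt * (u + wind).

    The control law is u = -kp * x - ki * z, where z accumulates the error.
    """
    x, z = 1.0, 0.0
    for _ in range(steps):
        u = -kp * x - ki * z
        x_next = x + dt * (u + wind)
        z += dt * x  # integrate the error seen at this step
        x = x_next
    return x


# Proportional-only feedback settles with a steady offset caused by the wind.
error_p = simulate(kp=2.0, ki=0.0)
# Adding the integral term lets the controller cancel the constant disturbance.
error_pi = simulate(kp=2.0, ki=1.0)
```

In this toy setup the proportional-only controller is left with a constant error of wind / kp, while the integral state grows until its contribution offsets the wind exactly.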



Rewards

- 24 features
- Used inverse reinforcement learning
- Rewards from inverse reinforcement learning usually did not produce the correct result
- Took the inverse reinforcement learning results and manually tuned them to get good results
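
As a rough picture of what tuning the rewards means, assume the reward is a negative weighted sum of penalty features, so that tuning amounts to editing the weights. The sketch uses three made-up features rather than the 24 mentioned above, and the weight values are invented for illustration.

```python
def reward(features, weights):
    """Negative weighted sum of penalty features (assumed reward form)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return -sum(w * f for w, f in zip(weights, features))


# Hypothetical features: squared position error, squared velocity error,
# squared change in input.
features = [0.5, 0.25, 1.0]

# Hypothetical weights as they might come out of inverse reinforcement learning.
irl_weights = [1.0, 1.0, 0.0]

# The same weights after a manual adjustment that penalizes input changes.
tuned_weights = [1.0, 1.0, 0.2]

r_irl = reward(features, irl_weights)
r_tuned = reward(features, tuned_weights)
```

Raising one weight makes the controller care more about that feature, which is the kind of adjustment manual tuning would make after inspecting the flight behavior.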



Helicopter

- Xcell Tempest
- 54” long, 19” high
- 13 lbs
- Two-stroke engine
- Orientation sensors
- GPS (doesn’t work during flips)



Flip



Roll



Tail-In Funnel



Nose-In Funnel



Questions

Motivations / who pays for it?
- I can see applications in the defense sector
- DARPA

Could more maneuvers be done just by changing some parameters?
- Probably not, because the filter is learned based on a model, so you would need to create a new model



More Questions

What's the relationship between reinforcement learning and MDP?
- Not sure

Could a helicopter like this operate in the West Texas wind storms?



Fun Stuff

Videos:
- http://heli.stanford.edu/
- http://www.youtube.com/watch?v=VCdxqn0fcnE

Helicopter:
- http://www.miniatureaircraftusa.com/helicopterkits/1025_Spectra_G/1025_kit_main.asp

