Learning Parameterized Maneuvers for Autonomous Helicopter Flight Jie Tang, Arjun Singh, Nimbus Goehausen, Pieter Abbeel UC Berkeley


Overview

[Diagram boxes: Dynamics Model, Target Trajectory, Optimal Control, Controller]

Problem

• Robotics tasks involve complex trajectories
  – Stall turn

• Challenging, nonlinear dynamics

Overview

[Diagram boxes: Demonstrations, Target Trajectory, Dynamics Model, Optimal Control, Controller]

Learning Target Trajectory From Demonstration

[Figure axis: Height]

Problem: Demonstrations are suboptimal
  – Use multiple demonstrations
  – Current state of the art in helicopter aerobatics (Coates, Abbeel, and Ng, ICML 2008)
  – Our work: learn parameterized maneuver classes

Problem: Demonstrations will be different from desired target trajectory

Example Data

Learning Trajectory

• HMM-like generative model
  – Dynamics model used as HMM transition model
  – Synthetic observations enforce parameterization
  – Demos are observations of hidden trajectory

• Problem: how do we align observations to hidden trajectory?

[Figure: Demo 1, Demo 2, Hidden; label: Height 50m]
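One way to write down the HMM-like model described on this slide is sketched below. The symbols (z, y, ρ, τ, f, g, and the noise terms) are notation assumed here for illustration and do not appear on the slides.

```latex
\begin{aligned}
z_{t+1} &= f(z_t) + \omega_t, & \omega_t &\sim \mathcal{N}(0, \Sigma_z) \\
y^{k}_{j} &= z_{\tau^{k}_{j}} + \nu^{k}_{j}, & \nu^{k}_{j} &\sim \mathcal{N}(0, \Sigma_k) \\
\rho_t &= g(z_t) + \eta_t, & \eta_t &\sim \mathcal{N}(0, \Sigma_\rho)
\end{aligned}
```

Under this reading, the first line is the dynamics model acting as the transition model for the hidden trajectory z, the second line says that sample j of demo k observes the hidden state at an unknown time index τ, and the third line is a synthetic observation (for example, a height of 50m) that pins the hidden trajectory to the requested parameter. The unknown indices τ are the alignment problem raised in the last bullet.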

Learning Trajectory

• Dynamic Time Warping
• Extended Kalman filter / smoother
• Repeat

[Figure: Demo 1, Demo 2, Hidden; label: Height 50m]
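The alignment step named above can be illustrated with textbook dynamic time warping on a single scalar signal such as height. This is a minimal sketch, not the authors' implementation: the function name `dtw_align`, the squared-error cost, and the use of one-dimensional signals are all assumptions made for illustration.

```python
import numpy as np

def dtw_align(demo, reference):
    """Classic DTW between two 1-D sequences (illustrative sketch).

    Returns (total_cost, path), where path is a list of (i, j) pairs
    matching demo[i] to reference[j].
    """
    n, m = len(demo), len(reference)
    # cost[i, j] = best cost of aligning demo[:i] with reference[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (demo[i - 1] - reference[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j - 1],  # advance both
                                 cost[i - 1, j],      # advance demo only
                                 cost[i, j - 1])      # advance reference only
    # Backtrack from the end to recover the alignment path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        moves = [(cost[i - 1, j - 1], i - 1, j - 1),
                 (cost[i - 1, j], i - 1, j),
                 (cost[i, j - 1], i, j - 1)]
        _, i, j = min(moves, key=lambda move: move[0])
    path.reverse()
    return cost[n, m], path
```

In the loop on the slide, the reference would be the current estimate of the hidden trajectory: each demo is re-aligned to it, the hidden trajectory is re-estimated by the filter / smoother, and the two steps repeat.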

Smoothed Dynamic Time Warping

• Potential outcome of dynamic time warping:

• More desirable outcome:

• Introduce smoothing penalty
  – Extra dimension in dynamic program
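The idea of a smoothing penalty that needs an extra dimension in the dynamic program can be sketched as follows. The slides do not give the exact penalty, so this example assumes one plausible form: each demo sample advances the reference index by a step from a small set, and changing the step size between consecutive samples costs `lam` times the change. Because the penalty depends on the previous step, the previous step becomes part of the DP state. The names `smoothed_dtw`, `steps`, and `lam` are hypothetical.

```python
import numpy as np

def smoothed_dtw(demo, reference, steps=(0, 1, 2), lam=1.0):
    """DTW with a penalty on changes in warping rate (illustrative sketch).

    Returns (total_cost, tau), where tau[j] is the reference index
    matched to demo[j]. The alignment is forced to start at index 0
    and end at the last reference index.
    """
    demo = np.asarray(demo, dtype=float)
    reference = np.asarray(reference, dtype=float)
    step_arr = np.asarray(steps)
    n, m, k = len(demo), len(reference), len(steps)
    # cost[j, t, s]: demo[j] matched to reference[t], arriving with steps[s].
    cost = np.full((n, m, k), np.inf)
    back = np.zeros((n, m, k), dtype=int)  # index of the previous step
    cost[0, 0, :] = (demo[0] - reference[0]) ** 2
    for j in range(1, n):
        for t in range(m):
            d = (demo[j] - reference[t]) ** 2
            for s, step in enumerate(steps):
                t_prev = t - step
                if t_prev < 0:
                    continue
                # Smoothing penalty: cost of changing the step size.
                cand = cost[j - 1, t_prev, :] + lam * np.abs(step_arr - step)
                s_prev = int(np.argmin(cand))
                cost[j, t, s] = d + cand[s_prev]
                back[j, t, s] = s_prev
    s = int(np.argmin(cost[n - 1, m - 1, :]))
    total = cost[n - 1, m - 1, s]
    if not np.isfinite(total):
        raise ValueError("no monotone alignment exists for these step sizes")
    # Backtrack through (time index, step) states.
    tau, t = [m - 1], m - 1
    for j in range(n - 1, 0, -1):
        s_prev = int(back[j, t, s])
        t -= steps[s]
        s = s_prev
        tau.append(t)
    tau.reverse()
    return total, tau
```

With `lam=0` this reduces to an unsmoothed alignment that may stall and then jump; a positive `lam` favors alignments that advance at a steady rate.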

Weighting Demonstrations

• Some demonstrations should contribute more to target trajectory than others
  – Difficult to tune these observation covariances
• Learn optimal observation covariances using EM

[Figure axis: Target Height]
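The effect of learning per-demonstration observation covariances can be illustrated with a heavily simplified stand-in for EM. This sketch assumes the demos are already time-aligned scalar signals and replaces the Kalman smoother with an inverse-variance weighted average, so it omits the dynamics model and the posterior-covariance term a full EM update would include. The function name and the variance floor are assumptions made for illustration.

```python
import numpy as np

def weight_demonstrations(aligned_demos, n_iters=10, var_floor=1e-3):
    """Alternate between estimating the hidden trajectory and the
    per-demo observation variances (simplified, illustrative sketch).

    aligned_demos: array-like of shape (K, T), K time-aligned demos.
    Returns (hidden, variances) with shapes (T,) and (K,).
    """
    demos = np.asarray(aligned_demos, dtype=float)
    variances = np.ones(demos.shape[0])
    for _ in range(n_iters):
        # "E-step" stand-in: inverse-variance weighted mean of the demos.
        weights = 1.0 / variances
        hidden = (weights[:, None] * demos).sum(axis=0) / weights.sum()
        # "M-step": each demo's variance is its mean squared residual.
        residuals = demos - hidden
        # The floor keeps one demo's variance from collapsing to zero.
        variances = np.maximum((residuals ** 2).mean(axis=1), var_floor)
    return hidden, variances
```

Demos that sit far from the consensus end up with large variances and therefore small weights, which is the behavior the slide describes: some demonstrations contribute more to the target trajectory than others, without hand-tuning.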

Learned Trajectory

[Figure axis: Target Height]

Overview

[Diagram boxes: Demonstrations, Frequency Sweeps and Step Responses, Target Trajectory, Dynamics Model, Optimal Control, Controller]

Learning dynamics

• Standard helicopter dynamics model estimated from data
  – Has relatively large errors in aggressive flight regimes
• After learning target trajectory, we obtain aligned demonstrations
  – Errors in model are consistent for executions of the same maneuver class
• Many hidden variables are not modeled explicitly
  – Airflow, rotor speed, actuator latency
• Learn corrections to dynamics model along each target trajectory

[Figure label: 2G error]
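Because the model errors are described as consistent across executions of the same maneuver, one simple form a correction could take is a bias term indexed by time along the aligned trajectory. The sketch below assumes exactly that: it averages, at each time step, the gap between observed accelerations and the standard model's predictions over all aligned demos. The slides do not specify the form of the correction, so the function names and the per-time-step bias are assumptions.

```python
import numpy as np

def learn_corrections(observed_accels, predicted_accels):
    """Per-time-step bias of the standard model (illustrative sketch).

    Both inputs have shape (K, T, D): K aligned demos, T time steps,
    D acceleration dimensions. Returns an array of shape (T, D).
    """
    observed = np.asarray(observed_accels, dtype=float)
    predicted = np.asarray(predicted_accels, dtype=float)
    # Average the model error over demos at each aligned time step.
    return (observed - predicted).mean(axis=0)

def corrected_prediction(predicted_accel, corrections, t):
    """Add the learned bias for time step t to the model's prediction."""
    return np.asarray(predicted_accel, dtype=float) + corrections[t]
```

A time-indexed bias can absorb the effect of unmodeled variables such as airflow or rotor speed as long as they evolve similarly each time the maneuver is flown, which is why the alignment from the previous step matters.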

Overview

[Diagram boxes: Demonstrations; Frequency Sweeps and Step Responses; Target Trajectory; Dynamics Model (Standard Dynamics Model + Trajectory-Specific Corrections); Optimal Control (Receding Horizon Differential Dynamic Programming); Controller]

Experimental Setup

Onboard IMU @ 333 Hz: 3-axis magnetometer, accelerometer, gyroscope (“Orientation”)

Offboard cameras 1280x960 @ 20 Hz (“Position”)

Extended Kalman filter

RHDDP controller

Controls @ 20 Hz

Results: Stall Turn

Max speed: 57 mph

Results: Loops

Results: Tic-Tocs

Typical Flight Performance: Stall Turn

Quantitative Evaluation

• Flight conditions: wind up to 15 mph
• Similar accuracy is maintained for queries very different from our demonstrations
  – e.g., can learn 60m stall turns from 40m, 80m demonstrations
• Four or five demonstrations sufficient to cover a wide range of stall turns, loops, and tic-tocs
  – e.g., four stall turns at 20m, 40m, 60m, 80m sufficient to generate any stall turn between 20m and 80m

Conclusions

• Presented an algorithm for learning parameterized target trajectories and accurate dynamics models from demonstrations

• With few demonstrations, can generate a wide variety of novel trajectories

• Validated on a variety of parameterized aerobatic helicopter maneuvers

Thank you
