Learning and Autonomous Control

14 January 2011

Challenge the future

DelftUniversity ofTechnology

Learning and Autonomous ControlJens Kober

Cognitive RoboticsDelft University of TechnologyThe Netherlands

j.kober@tudelft.nl www.jenskober.de

Cognitive Robotics Department

• Robot Dynamics (dynamic motion control, motor control)

• Intelligent Vehicles (perception and modeling, dynamics,

human factors)

• Human-Robot Interaction (physical human-robot interaction)

• Learning and Autonomous Control (intelligent control,

cognition)

Robot Dynamics

Intelligent

Vehicles

Human-Robot Interaction

Robert Babuska

professor

learning and adaptive control

Jens Kober

assistant professor

robotics, learning motor skills

Javier Alonso Mora

assistant professor

robotics, motion planning

Learning and Autonomous Control

youtube/ABCUPM

Growing Interest in Robotics

Challenges

Uncertainty due to environment

Robustness

ComplianceLearning from mistakes

Safety

Autonomy

Power efficiency

Situation awareness

Interaction with humans

Future Robots

• Need to operate under unforeseen circumstances

• Interact with humans in an intelligent way

• Learn from mistakes, improve over time (performance, energy)

• Impossible to preprogram all tasks or behaviors

• Economic point of view: short design time and low costs

Learning and adaptation is essential!

Main Research Topics

• Adaptive and learning control systems

• Reinforcement learning, optimal control

• Nonlinear adaptive control

• Deep learning for robotics

• Data-effective learning algorithms

• Sensor fusion

• Supervisory and distributed control design

• Motion planning, coordination

• Multi-agent systems

Emphasis on novel, generic, widely applicable techniques

Realistic / real-world case studies and applications

Autonomous Multi-Robots Lab

Autonomous navigationPlanning in dynamic environments

AI + ControlConstrained optimization tools, combined with machine learning, formal methods and consensus

Multi-robot systems&

Diverse application fieldsTransportation, Manipulation, Inspection, Ground & aerial vehicles

Cooperative mobile manipulation

GOALGOAL

Safe motionin dynamic

environments

Robot dynamics

Environment obstacles

Formation definition

Constrained optimization

Multiple robotsOptimize formation parameters

within convex region in position-time free space

orSingle robot

Non-linearmodel predictive control

J.Alonso-Mora, et al. “Multi-robot formation control and object transport in dynamic environments via constrained optimization”, IJRR, 2017

Online planning for aerial vehicles

Safe motion

Robot dynamics

Environment obstacles

Objective

Video/Cinematography

Formation control

T. Naegeli, et al., “Trajectory Optimization for Multi-view Aerial Cinematography”, ACM Siggraph, 2017; Videography, RA-L 2017J. Alonso-Mora et al., “Distributed formation control in dynamic environments”, ICRA 2017

Motion planning with interaction

W. Schwarting et al., “Safe Nonlinear Trajectory Generation for Parallel Autonomy With a Dynamic Vehicle Model”, T-ITS 2017L. Ferranti et al., “Coordination of Multiple Vessels Via Distributed Nonlinear Model Predictive Control”, ECC 2018B. Zhou et al., ” Joint Multi-Policy Behavior Estimation and Receding-Horizon Trajectory Planning for Automated Urban Driving”, ICRA 2018

Information from other participants

Environment models

Safe motion among robots &

humans

Vehicle model

CommunicationDistributed

asynchronous

EstimationProbabilistic models &

chance constraints

Intelligent Control

Systems and control

model-based design

Artificial intelligence, machine learning

learn a task from experience, without being explicitly programmed for that task

Intelligent control

Potential of Machine Learning

• adapt to changes in environment

• find new, better solutions

• teach by demonstration or imitation

in addition:

• modeling is tedious, time consuming, expensive

Initial Optimism

• Herbert A. Simon

Artificial Intelligence Pioneer:

“Machines will be capable in 20 years of doing any work a man can do” (1965)

• MIT: Cog Project (1993-2003)

Learning like human children - through constant

interaction with humans and the environment

nobelprize.org

MIT Press

Spectrum of Learning Techniques

Unsupervised learning

(without ‘teacher’)

Reinforcement learning

(learn directly to control)

Supervised learning

(with ‘teacher’)

Construct a Robot Model from Data

actuate motors and observe the response

Two ways to model the system:

1. Physical modeling

2. System identification

Maarten Vaandrager

Imitation Learning

Manschitz, Kober, Gienger, & Peters, RAS 2015

Supervised learning

Unsupervised Learning - Clustering

• Discover automatically groups and structures in data

Applications:

• Robot perception, vision

• Data-driven construction of dynamic models

Construction of Nonlinear Models

Nonlinear and uncertain behavior

Experimental data

Supervised learning

Reinforcement Learning

Adapt the control strategy so that the sum of rewards over time is maximal.

Performanceevaluation

reward

Adaptive

controller System

action

Inspiration - animal learning (reward desired behavior)

Reinforcement Learning for Control

• Approximation in high-dimensional continuous spaces

• Computationally effective methods

• Constrained learning and convergence

Reward-Weighted Imitation

• Maximize reward = optimize strategy

• One possibility: reward-weighted imitation

strategy

success (high reward) failure (low reward)reward

imitate all examples

new strategy

imitate only good examples

Kober & Peters, NIPS 2008

Trial 1Reward 0,1

Trial 2Reward 0,3

Trial 3Reward 0,8 Trial 4

Reward 1,0

Reward-weighted Imitation

Ball-in-a-Cup

Celemin, Maeda, Ruiz-del Solar, Peters, & Kober, under review

Learning a Difference Model

Rastogi, MSc 2017

RL-based Compensation

Pane, MSc 2015

Learning Abilities

•minimal •high

•today* •future?

•research

openclipart.org

•*or “brilliant idiots”

Courses

• (Partially) taught by LAC

• SC42035 Integration Project S&C

• SC42090 Robot Motion Planning and Control

• SC42050 Knowledge Based Control Systems

• IN4320 Machine Learning

• ME41025 Robotics Practicals

• Other recommended courses

• IN4085 Pattern Recognition

• CS4180 Deep Learning

• IN4010 Artificial Intelligence Techniques

• SC42100 Networked and Distributed Control Systems

• ME41105 Intelligent Vehicles

Questions?

latd.com

Learning and Autonomous Control

Documents

Teaching Strategies for Autonomous Learning

Autonomous Blimp Control using Deep Reinforcement Learning

Optimal and Learning Control for Autonomous Robots · Optimal and Learning Control for Autonomous Robots has been taught in the Robotics, Systems and Controls Masters at ETH Zu¨rich

Modeling and Nonlinear Adaptive Control for Autonomous ... · PDF fileModeling and Nonlinear Adaptive Control for Autonomous Vehicle Overtaking ... Nonlinear Adaptive Control for Autonomous

AUTONOMOUS CONTROL SYSTEMS LABORATORY

Experiments in learning distributed control for a hexapod ... · Robotics and Autonomous Systems 54 (2006) 864–872 Experiments in learning distributed control for a hexapod robot

Perception and Navigation for Autonomous Rotorcraft...control, state estimation, map learning, and mission planning. Autonomous navigation requires stabilization of the vehicle. To

Autonomous Learning Ana Clara

Autonomous Helicopter Control Using Reinforcement Learning Policy.pdf

Fine-grained acceleration control for autonomous ... · Fine-grained acceleration control for autonomous intersection management using deep reinforcement learning Hamid Mirzaei Dept

Deep Reinforcement Learning for Simulated Autonomous ...cs231n.stanford.edu/reports/2016/pdfs/112_Report.pdf · Deep Reinforcement Learning for Simulated Autonomous Vehicle Control

Autonomous Learning Based on Call

Model-Reference Reinforcement Learning Control of ... · Model-Reference Reinforcement Learning Control of Autonomous Surface Vehicles with Uncertainties Qingrui Zhang 1; 2, Wei Pan

Autonomous learning through Internet resources

Autonomous Rate Control

Autonomous Learning Laboratory – Department of Computer Science Perspectives on Computational Reinforcement Learning Andrew G. Barto Autonomous Learning

Research on Learning Motivation and Autonomous L2 Learning

ALMOST GLOBAL FEEDBACK CONTROL OF AUTONOMOUS UNDERWATER ...s_Thesis.pdf · ALMOST GLOBAL FEEDBACK CONTROL OF AUTONOMOUS ... ALMOST GLOBAL FEEDBACK CONTROL OF AUTONOMOUS UNDERWATER

Tips for autonomous learning

Autonomous Planetary Robots: Fuzzy Control and … Control... · Autonomous Planetary Robots: Fuzzy Control and Decision ... autonomous navigation functionality can be decomposed