Myopic Unfalsified Control: A Gradient Approach to Adaptation

Michael G. Safonov, University of Southern California

AFOSR Dynamics and Control Workshop, Pasadena, CA, 12-14 August 2002

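Background: Unfalsified Control

ONLY 3 elements: candidate controller hypotheses, goals, data

[Diagram: candidate controller hypotheses, goals, and data enter a COMPUTER SIEVE; FALSIFIED controllers are rejected and the Unfalsified K passes through.]

M. G. Safonov. In Control Using Logic-Based Switching, Springer-Verlag, 1996.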

Motivation for Unfalsification:

An Achilles heel of modern system theory has been the habit of ‘proof by assumption’.

Theorists typically give insufficient attention to the possibility of future observations which may be at odds with assumptions.

Two ways to learn & adapt: Observation vs. Introspection

Galileo: open-eyed, “data-driven”

• Reality is what we observe.

• ‘Curve-Fitting’: MODELS approximate Observed Data

vs.

Plato: introspective, “assumption driven”

• Reality is an ideal, observable only through noisy sensors.

• ‘Probabilistic Estimation’: Data approximates Unobserved TRUTH

Observation vs. Introspection in adaptive control

“Unfalsified: Data Driven”

• goals are modest:
– no guaranteed predictions of future stability
– just consistency of goals, decisions and data

• no troublesome assumptions, parsimonious formulation:
– DATA
– GOALS
– DECISIONS

• Fast, reliable, exact

“Traditional: Assumption Driven”

• goals are ambitious:
– guaranteed future stability
– cost & estimation convergence

• many troublesome assumptions:
– “the ‘true’ plant is in model set”
– “noise I.I.D.”
– “bounds on parameters, probabilities”
– …, “linear time-invariant, minimum-phase plant, order < N”

• Slow, quasi-static, approximate

Remarkably, some leading “assumption driven” control theorists have held that observed DATA inconsistent with ASSUMPTIONS should be ignored (cf. M. Gevers et al., "Model Validation in Closed-Loop", ACC, San Diego, 1999).

Unfalsified Adaptive Control

[Diagram: LEARNING FEEDBACK LOOPS. The given candidate controllers K, the GOALS, and the DATA (u, y) enter a COMPUTER SIEVE. FALSIFIED controllers are rejected; the Unfalsified Controllers K pass to the CONTROLLER SELECTOR, which makes the DECISIONS.]

M. G. Safonov. In Control Using Logic-Based Switching, Springer-Verlag, 1996.

Unfalsification

A candidate controller K is FALSIFIED when the data (y,u) proves the existence of a ‘fictitious’ reference signal that would violate the performance specs if K were in the loop.

• Controllers may remain UNFALSIFIED until the data proves otherwise.

• Or, fading memory may be used to reinstate previously falsified controllers.

Key points:

• A controller need not be in the loop to be falsified.

• Even a single data point can falsify many controllers.

[Diagram: the candidate set divided into Unfalsified Controllers and Falsified Controllers. Feedback loop: the error e = r − y drives the gain K, which produces u, the input to the Unknown Plant with output y.]

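Trivial Example

• Plant data: at time t=0, (u,y)=(1,1)

• Candidate K’s: u=Ke, with K a real gain

• Goal: |e(t)/r(t)| < 0.1 for all r(t)

Relations: e = u/K = 1/K and r = y + e = y + u/K = 1 + 1/K, so e/r = 1/(1+K)

=> K is unfalsified if |1/(1+K)| < 0.1

=> unfalsified K’s: K>9 or K<-11

[Block diagram: r and the fed-back y form e = r − y; e drives the gain K, which produces u; u drives the Unknown Plant, which produces y.]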
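The falsification test in this example can be checked numerically. The following is a minimal sketch, assuming the single data point (u, y) = (1, 1) and the goal |e/r| < 0.1 from the slide; the function name `is_unfalsified` and the list of trial gains are illustrative and not part of the original talk.

```python
def is_unfalsified(K, u=1.0, y=1.0, bound=0.1):
    """True if gain K is consistent with the goal |e/r| < bound on data (u, y)."""
    if K == 0:
        return False          # u = K*e cannot explain a nonzero u
    e = u / K                 # fictitious error implied by u = K*e
    r = y + e                 # fictitious reference implied by e = r - y
    if r == 0:
        return False          # |e/r| is unbounded, so the goal is violated
    return abs(e / r) < bound


# Illustrative trial gains, chosen away from the boundary values 9 and -11.
trial_gains = [-20, -12, -5, 0.5, 5, 10, 50]
unfalsified = [K for K in trial_gains if is_unfalsified(K)]
print(unfalsified)  # [-20, -12, 10, 50]
```

Only the gains with K > 9 or K < -11 survive, which agrees with the closed-form result above. Note that none of these gains had to be placed in the loop to be tested.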

Simulation Example: ACC Benchmark

[Figure, two panels. Evolution of unfalsified set: time-history of unfalsified candidate control gains. Time responses: time-history of plant response and optimal unfalsified control gain. Plotted signals: control input u(t), proportional gain kP(t), integral gain kI(t), derivative gain kD(t), plant output y(t).]

Jun & Safonov, CCA/CACSD ‘99

Tsao & Safonov, IEEE Trans, AC-42, 1997.

Simulation Example: Missile

• Learns control gains

• Adapts quickly to compensate for damage & failures

• Superior performance

[Figure: commanded response, actual response, and specified target response bound]

Unfalsified adaptive missile autopilot:

• discovers stabilizing control gains as it flies, nearly instantaneously

• maintains precise sure-footed control

Brugarolas, Fromion & Safonov, ACC ’98

Simulation Example: Unfalsified ‘PID Universal Controller’

Example: Adaptive PID

Goal: $\|w_1 * (r - y)\|_t^2 + \|w_2 * u\|_t^2 \le \|r\|_t^2$, where $\|f\|_t^2 = \int_0^t f^2 \, dt$

• Unfalsified adaptive control loop stabilizes in real-time

• Unstable plant

• 30 candidate PID controllers:
– KI = [2, 50, 100]
– KD = [.5, .6]
– KP = [5, 10, 25, 80, 110]

[Figure: Command R, Control U, and Output Y versus time, 0 to 20]

Jun & Safonov, CCA/CACSD ‘99
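The candidate set for this example is the grid of all combinations of the listed gains. A small sketch of that enumeration, assuming each candidate is simply a (KP, KI, KD) triple; the dictionary layout is an illustrative choice, not taken from the original demo:

```python
from itertools import product

KI = [2, 50, 100]
KD = [0.5, 0.6]
KP = [5, 10, 25, 80, 110]

# Each (KP, KI, KD) combination is treated as one candidate PID controller.
candidates = [
    {"KP": kp, "KI": ki, "KD": kd}
    for kp, ki, kd in product(KP, KI, KD)
]
print(len(candidates))  # 30
```

The grid has 5 × 3 × 2 = 30 entries, matching the count on the slide.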

Other People’s Successes with Unfalsified Control

• Emmanuel Collins et al. (Weigh Belt Feeder adaptive PID tuning, CDC99)

• Kosut (Semiconductor Mfg. Process run-to-run tuning, CDC98)

• Woodley, How & Kosut (ECP Torsional disk control, adaptive tuning, ACC99)

• Razavi & Kurfess, Int. J. ACSP, Aug. 2001

How Unfalsified Works

ftp://routh.usc.edu/pub/safonov/PID_Demo_for_MATLAB6.zip

Inside the Unfalsification Algorithm Block

Data In, Controllers Out

Bank of Filters (one for each candidate K)
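A bank of this kind can be sketched for the simplest case of scalar gain candidates. This is an illustration under stated assumptions, not the contents of the actual algorithm block: it reuses the controller u = K e with e = r − y and a per-sample goal |e/r| < 0.1, assumes nonzero gains, and the names `GainFilter` and `run_bank` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GainFilter:
    """One filter of the bank: tracks whether gain K is still unfalsified."""
    K: float                      # candidate gain, assumed nonzero
    falsified: bool = False

    def update(self, u, y, bound=0.1):
        e = u / self.K            # fictitious error implied by u = K*e
        r = y + e                 # fictitious reference implied by e = r - y
        # The candidate is falsified once the data contradicts the goal.
        if r == 0 or abs(e) >= bound * abs(r):
            self.falsified = True


def run_bank(gains, data, bound=0.1):
    """Data in, controllers out: return the gains not falsified by the data."""
    bank = [GainFilter(K) for K in gains]
    for u, y in data:
        for f in bank:
            if not f.falsified:
                f.update(u, y, bound)
    return [f.K for f in bank if not f.falsified]


# Two illustrative measurements (u, y); the second is made up for this sketch.
survivors = run_bank([-20, -12, 5, 10, 50, 100], [(1.0, 1.0), (2.0, 0.5)])
print(survivors)  # [50, 100]
```

Each filter sees every measurement, so a candidate can be rejected without ever being switched into the loop, and the work grows with the number of candidates.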

Why doesn’t everybody use unfalsified control?

• Unfalsified Adaptive Control is better:
– fast
– reliable
– exact

• Full and precise use of all available evidence
– meets Russell’s criterion: “Intimate association between observation and hypothesis”

• But, it’s not yet widely used. Why?

Unfalsified Adaptive Control: Some Invalid Concerns

• Strange noise-free, model-free formulation

• You can’t prove convergence or stability (unless you assert the usual adaptive control assumptions)

• Claims seem too good to be true:
– Fast, reliable
– Handles noise without noise models
– Works for non-minimum phase plants
– No need for plant degree bounds
– No need for plant models

Unfalsified Adaptive Control: A Valid Concern

• Unfalsified Adaptive Control can be computationally intensive:
– Computationally intensive when there are many candidate controllers
– Requires global minimization of unfalsified cost levels
– May require ‘gridding’ (like multi-model adaptive)

Solutions: How to address concerns over unfalsified?

1. Direct comparison with traditional methods?

2. Make unfalsified more like traditional methods?
– make it slow, quasi-static (break its legs):
• differential equation update, or
• run-to-run controller update
– make it myopic, near-sighted (break its spectacles):
• do local optimization only, not global
• use steepest descent, gradient-based approaches to limit computational complexity

Solution #1: Direct Comparison

Traditional & Alternative | Unfalsified

• Traditional assumptions:
– LTI minimum phase plant | No
– Known upper bound for plant order | No
– Known relative degree of the plant | No
– Known sign of the high frequency gain | No

• Alternative ‘universal controllers’ (Martensson, 1985; Fu et al., 1986):
– Require less prior information | YES
– Search a set of controllers and switch to the one satisfying a given performance | YES
– Assume the plant to be LTI | No

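EXAMPLE: Direct Comparison, Narendra MRAC vs. Unfalsified MRAC

Cost function $J(\theta,t)$: Unfalsified if $J(\theta,t)$ < threshold = 200

$J(\theta,t) = \alpha\, e(\theta,t)^2 + \beta \int_0^t \exp\{-\sigma (t-\tau)\}\, e(\theta,\tau)^2 \, d\tau < 200$; weights $\alpha = \beta = 1$

where $e(\theta,t) = \tilde{r}_{fict}(\theta,t) * W_m(t) - y_p(t)$

$\sigma = .001$ is the ‘fading-memory forgetting-factor (sigma-modification)’

$\tilde{r}_{fict}(\theta,t)$ is the ‘fictitious reference signal’

Candidate controllers $K(\theta)$, where $\theta_1$ = [-2;-1;0;2;4]; $\theta_2$ = [2;5.5;6;6.5;7;8] and $\theta_3$ = [-10;-6;-5;-4;-3;2]

[Block diagram: the reference r drives both the Controller K(θ) and the REFERENCE MODEL Wm(s); the controller output up drives the Unknown Plant, whose output yp is fed back to the controller; the Error e(t) is formed from the model output ym and the plant output yp.]

A. Paul and M. Safonov, 2002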
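The fading-memory cost on this slide can be evaluated recursively on sampled data. The following is a rough sketch, assuming a fixed sample time `dt` and a rectangle-rule approximation of the integral; the fictitious error samples are taken as given, and the function names are illustrative rather than from the cited work.

```python
import math


def fading_memory_cost(errors, dt, alpha=1.0, beta=1.0, sigma=0.001):
    """Cost at each sample: alpha*e^2 + beta * (exponentially weighted integral of e^2)."""
    decay = math.exp(-sigma * dt)     # forgetting applied over one sample period
    integral = 0.0
    costs = []
    for e in errors:
        # Rectangle-rule update of the integral of exp(-sigma*(t - tau)) * e(tau)^2.
        integral = decay * integral + e * e * dt
        costs.append(alpha * e * e + beta * integral)
    return costs


def first_falsified_index(costs, threshold=200.0):
    """Index of the first sample whose cost reaches the threshold, else None."""
    for k, cost in enumerate(costs):
        if cost >= threshold:
            return k
    return None


# A constant fictitious error of 10 (made-up data) falsifies the candidate
# once the accumulated integral pushes the cost past 200.
costs = fading_memory_cost([10.0] * 20, dt=0.1)
print(first_falsified_index(costs))  # 10
```

Because the integral decays, a candidate whose fictitious error later becomes small sees its cost fall again, which is how fading memory allows a previously falsified controller to be reinstated.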

Example: Simulink Simulation of Unfalsified Control

Narendra MRAC vs. Unfalsified Adaptive

A. Paul and M. Safonov, 2002

Simulation Results: Narendra MRAC vs. UNFALSIFIED

Simulation parameters:

• Reference signal: 5 cos(t) + 10 cos(5t)

• Reference model: Wm = 1/(s+1)

• Unknown plant, used in simulation: Wp = (s+1)/(s^2 - 3s + 2)

[Figure: Narendra MRAC (Narendra & Annaswamy, 1988) vs. Unfalsified MRAC. Panels: convergence of controller parameters; error signal; switching sequence of the unfalsification process.]

A. Paul and M. Safonov, 2002

Direct Comparison: MRAC vs. Unfalsified

Tsao & Safonov, CCA/CACSD ’99

T. C. Tsao and M. G. Safonov. Int. J. Adaptive Control and Signal Processing, 15:319–334, 2001.

Solution #2: Simplified Gradient-Based ‘Myopic’ Unfalsified Controller (IFAC 2002)

Jun and Safonov, IFAC 2002

Properties of ‘Myopic’ Unfalsified Adaptive Control

• Similar in most ways to traditional adaptive
– Simple to implement, low computational load
– Uses only local (myopic) analysis of data
– Slow, quasi-static adaptation
– Same as ‘MIT Rule’ if MRAC cost and no ∫ dt

• But even myopic unfalsified is better
– Always fewer assumptions, often slightly faster
– Data-driven scientific logic of unfalsification is inherently more reliable

Jun and Safonov, IFAC 2002
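The local, gradient-based idea can be illustrated on a toy problem. This is a sketch under strong assumptions and is not the algorithm of Jun and Safonov: the plant is a static gain y = b u, the controller is u = θ r, the reference model is y_m = r, and the cost is the instantaneous squared fictitious error with no integral term. All function names and numerical values are hypothetical.

```python
def fictitious_error(theta, u, y):
    """Error that candidate gain theta would have produced on the data (u, y).

    With controller u = theta * r and reference model y_m = r, the fictitious
    reference is u / theta, so the fictitious error is u / theta - y.
    """
    return u / theta - y


def myopic_step(theta, u, y, gamma):
    """One steepest-descent step on J(theta) = e(theta)^2 using current data only."""
    e = fictitious_error(theta, u, y)
    de_dtheta = -u / theta**2
    return theta - gamma * 2.0 * e * de_dtheta


def simulate(b=2.0, theta=1.0, gamma=0.05, steps=100, r=1.0):
    """Closed loop with a toy static plant y = b*u; b is not used by the update."""
    for _ in range(steps):
        u = theta * r          # controller currently in the loop
        y = b * u              # plant response (unknown to the adaptation law)
        theta = myopic_step(theta, u, y, gamma)
    return theta


theta_final = simulate()
print(round(theta_final, 6))  # 0.5
```

The update has the form θ ← θ − γ e ∂e/∂θ (up to a constant factor), which is the shape of the MIT rule. The gain settles at θ = 1/b, where the plant output matches the reference model, using only the measured (u, y) and no plant model.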

Myopic Unfalsified Simulation (Example 1)

Jun and Safonov, IFAC 2002

Myopic Simulation Result

Jun and Safonov, IFAC 2002

Myopic Simulation (Example 2)

Jun and Safonov, IFAC 2002

Simulation Result (Example 2)

Jun and Safonov, IFAC 2002

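Direct comparison with Hjalmarsson’s gradient-based IFT:

• Plant data: at time t = 0, K(0) = 8; T = 10 s; sample time = 0.1

• Unknown plant: 1/(s(s+2)) with white noise

• Candidate controllers: K > 0

• Cost function:
– IFT (run-to-run): $J(k,t) = \frac{1}{2} E[(r(t) - y(t))^2]$, $0 \le t \le T$
– Unfalsified (finite-memory): [Equation: $J(k,t)$ is an exponentially weighted integral of the squared fictitious error $(\tilde{r}(\tau) - y(\tau))^2$, defined piecewise around t = 10 s with lower limits 0 and t − T]

[Block diagram: r and the fed-back y form e; e drives the gain K, which produces u; u drives the Unknown Plant, which produces y.]

Wang and Safonov, 2002 vs. Hjalmarsson et al., IEEE CDC, 1994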

IFT vs. Unfalsified: Simulation Setup

[Figure: simulation setups, one panel for IFT and one for Unfalsified]

Wang and Safonov, 2002

IFT vs. Unfalsified: Comparison

[Figure: error signal squared for Experiments #1, #2, and #3, shown for IFT and for Unfalsified]

Wang and Safonov, 2002

Conclusions

• Introspective models are not adequate for analyzing adaptation

• To explain learning and adaptation, we need the open-eyed, data-driven scientific logic of unfalsification

• Unfalsified control is better than traditional assumption-driven adaptive control, even when crippled and myopic

Selected References: http://routh.usc.edu