System Identification and Optimization Methods Based on Derivatives: Chapters 5 & 6 from Jang



Neuro-Fuzzy and Soft Computing: Fuzzy Sets

Neuro-Fuzzy and Soft Computing

Soft computing:

• Model space: neural networks, fuzzy inference systems, adaptive networks
• Approach space: derivative-free optimization, derivative-based optimization

Outline

• System Identification
• Least Square Optimization
• Optimization Based on Differentiation
  • First order differentiation: steepest descent
  • Second order differentiation

System Identification: Introduction

Goal:

• Determine a mathematical model for an unknown system (the target system) by observing its input-output data pairs

System Identification: Introduction

Purpose:

• To predict a system's behavior, as in time series prediction and weather forecasting
• To explain the interactions and relationships between the inputs and outputs of a system
• To design a controller based on a model of the system, as in aircraft or ship control
• To simulate the system under control once the model is known

System Identification: Introduction

System identification involves two main steps:

• Structure identification

• Parameter identification

System Identification: Introduction

• Structure identification

Apply a priori knowledge about the target system to determine a class of models within which the search for the most suitable model is conducted. This class of models is denoted by a function y = f(u, θ), where:

• y is the model output
• u is the input vector
• θ is the parameter vector

The choice of f depends on the problem at hand, on the designer's experience, and on the laws of nature governing the target system.

System Identification: Introduction

• Parameter identification

The structure of the model is known; however, we need to apply optimization techniques to determine the parameter vector $\theta = \hat{\theta}$ such that the resulting model $\hat{y} = f(u, \hat{\theta})$ describes the system appropriately:

$$|y_i - \hat{y}_i| \to 0, \quad \text{with } y_i \text{ assigned to } u_i$$


System Identification: Introduction

The data set composed of m desired input-output pairs $(u_i; y_i)$, $i = 1, \dots, m$, is called the training data.

System identification needs to perform structure and parameter identification repeatedly until a satisfactory model is found. It proceeds as follows:

1) Specify and parameterize a class of mathematical models representing the system to be identified

2) Perform parameter identification to choose the parameters that best fit the training data set

3) Conduct a validation test to see whether the identified model responds correctly to an unseen data set

4) Terminate the procedure once the results of the validation test are satisfactory. Otherwise, select another class of models and repeat steps 2 to 4
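The four steps can be sketched as a loop over candidate model classes. This is a minimal illustration under stated assumptions, not a procedure taken from the slides: the model classes are polynomials of increasing degree, parameter identification is done with a least-squares fit through the normal equations, and the synthetic data, the threshold `tol`, and the helpers `solve`, `fit`, `predict`, `mse`, and `target` are all hypothetical.

```python
def solve(M, b):
    """Solve the square system M x = b by Gauss-Jordan elimination."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for c in range(n):
        # Partial pivoting: bring the largest entry of column c to the diagonal.
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [x - factor * z for x, z in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit(us, ys, degree):
    """Step 2: least-squares parameters of a polynomial of the given degree."""
    n = degree + 1
    A = [[u ** j for j in range(n)] for u in us]
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    Aty = [sum(row[i] * y for row, y in zip(A, ys)) for i in range(n)]
    return solve(AtA, Aty)


def predict(theta, u):
    return sum(t * u ** j for j, t in enumerate(theta))


def mse(theta, us, ys):
    return sum((y - predict(theta, u)) ** 2 for u, y in zip(us, ys)) / len(us)


def target(u):
    """Hypothetical target system, unknown to the identification loop."""
    return 1.0 + 2.0 * u - 0.5 * u ** 2


# Synthetic training data with a small deterministic disturbance.
train_u = [0.5 * i for i in range(9)]
train_y = [target(u) + 0.01 * (-1) ** i for i, u in enumerate(train_u)]
# Validation data: inputs that were not used for fitting.
valid_u = [0.25 + 0.5 * i for i in range(8)]
valid_y = [target(u) for u in valid_u]

tol = 0.01  # assumed threshold for a "satisfactory" validation error
chosen_degree = None
for degree in range(5):                      # step 1: pick a model class
    theta = fit(train_u, train_y, degree)    # step 2: parameter identification
    error = mse(theta, valid_u, valid_y)     # step 3: validation test
    if error < tol:                          # step 4: stop, or try another class
        chosen_degree = degree
        break

print(chosen_degree, theta)
```

The loop stops at the first model class whose validation error falls below the threshold, so simpler classes are preferred when they are adequate.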

Least-Square Estimators

General form:

$$y = \theta_1 f_1(u) + \theta_2 f_2(u) + \dots + \theta_n f_n(u) \qquad (*)$$

where:

• $u = (u_1, \dots, u_p)^T$ is the model input vector
• $f_1, \dots, f_n$ are known functions of u
• $\theta_1, \dots, \theta_n$ are unknown parameters to be estimated

Least-Square Estimators

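The task of fitting data using a linear model is referred to as linear regression.

We collect a training data set $\{(u_i; y_i),\ i = 1, \dots, m\}$.

Equation (*) becomes:

$$\begin{cases}
f_1(u_1)\theta_1 + f_2(u_1)\theta_2 + \dots + f_n(u_1)\theta_n = y_1 \\
f_1(u_2)\theta_1 + f_2(u_2)\theta_2 + \dots + f_n(u_2)\theta_n = y_2 \\
\qquad \vdots \\
f_1(u_m)\theta_1 + f_2(u_m)\theta_2 + \dots + f_n(u_m)\theta_n = y_m
\end{cases}$$

which is equivalent to: $A\theta = y$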

Least-Square Estimators

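where A is an m × n matrix:

$$A = \begin{bmatrix} f_1(u_1) & \cdots & f_n(u_1) \\ \vdots & & \vdots \\ f_1(u_m) & \cdots & f_n(u_m) \end{bmatrix}$$

θ is an n × 1 unknown parameter vector:

$$\theta = \begin{bmatrix} \theta_1 \\ \vdots \\ \theta_n \end{bmatrix}$$

and y is an m × 1 output vector:

$$y = \begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix} \quad \text{and} \quad a_i^T = [f_1(u_i), \dots, f_n(u_i)]$$

so that $a_i^T$ is the i-th row of A.

If A is square (m = n) and nonsingular, then $A\theta = y \Leftrightarrow \theta = A^{-1}y$ (solution)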
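As a small illustration, the matrix A and its rows $a_i^T$ can be assembled directly from the known functions. The basis functions, input values, and parameter values below are hypothetical and chosen only for the example; inputs are scalars (p = 1) to keep it short, and `design_matrix` and `model_output` are illustrative helper names.

```python
import math


def design_matrix(basis, inputs):
    """Build the m x n matrix A whose i-th row is [f_1(u_i), ..., f_n(u_i)]."""
    return [[f(u) for f in basis] for u in inputs]


def model_output(A, theta):
    """Evaluate y = A theta, giving one output per training input."""
    return [sum(a_ij * t_j for a_ij, t_j in zip(row, theta)) for row in A]


# Hypothetical known functions f_1, f_2, f_3 of a scalar input u.
basis = [lambda u: 1.0, lambda u: u, lambda u: math.sin(u)]
inputs = [0.0, 0.5, 1.0, 1.5]   # m = 4 training inputs
theta = [2.0, -1.0, 0.5]        # n = 3 parameters (assumed values)

A = design_matrix(basis, inputs)
y = model_output(A, theta)
print(len(A), len(A[0]))        # number of rows m, number of columns n
```

Because the model is linear in θ, nonlinear basis functions such as the sine term only change the entries of A, not the form of the problem.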

Least-Square Estimators

We have m outputs and n fitting parameters to find (that is, m equations and n unknowns).

Usually m is greater than n. Because the model is only an approximation of the target system and the observed data might be corrupted by noise, an exact solution is not always possible.

To account for this, an error vector e is added:

A θ + e = y

Least-Square Estimators

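The goal now consists of finding $\theta = \hat{\theta}$ that reduces the errors between $y_i$ and $\hat{y}_i = a_i^T \theta$:

$$\min_{\theta} \sum_{i=1}^{m} \left| y_i - a_i^T \theta \right|$$

or

$$\min_{\theta} \sum_{i=1}^{m} \left( y_i - a_i^T \theta \right)^2$$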

Least-Square Estimators

If e = y − Aθ, then:

$$E(\theta) = \sum_{i=1}^{m} (y_i - a_i^T \theta)^2 = e^T e = (y - A\theta)^T (y - A\theta)$$

We need to compute:

$$\min_{\theta}\ (y - A\theta)^T (y - A\theta)$$
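For reference, the condition that the minimizer must satisfy can be obtained by expanding E(θ) and setting its gradient to zero. This is the standard derivation, sketched here with the same symbols as above:

```latex
\begin{aligned}
E(\theta) &= (y - A\theta)^T (y - A\theta)
           = y^T y - 2\,\theta^T A^T y + \theta^T A^T A\,\theta \\
\nabla E(\theta) &= -2 A^T y + 2 A^T A\,\theta \\
\nabla E(\hat{\theta}) = 0 \;&\Longrightarrow\; A^T A\,\hat{\theta} = A^T y
\end{aligned}
```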

Least-Square Estimators

Theorem [least-squares estimator]

The squared error is minimized when $\theta = \hat{\theta}$ (called the least-squares estimator, LSE) satisfies the normal equation $A^T A \hat{\theta} = A^T y$. If $A^T A$ is nonsingular, $\hat{\theta}$ is unique and is given by

$$\hat{\theta} = (A^T A)^{-1} A^T y$$
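The theorem can be checked numerically on a small over-determined system. The sketch below is only an illustration: the data are hypothetical, `solve` and `least_squares` are helper names introduced here, and the normal equation is solved by plain Gaussian elimination, which assumes $A^T A$ is nonsingular.

```python
def solve(M, b):
    """Solve the square system M x = b by Gauss-Jordan elimination."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for c in range(n):
        # Partial pivoting: bring the largest entry of column c to the diagonal.
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [x - factor * z for x, z in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def least_squares(A, y):
    """Return theta_hat solving the normal equation (A^T A) theta = A^T y."""
    n = len(A[0])
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    Aty = [sum(row[i] * yi for row, yi in zip(A, y)) for i in range(n)]
    return solve(AtA, Aty)


# Hypothetical over-determined system: m = 5 equations, n = 2 unknowns.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
y = [1.1, 2.9, 5.2, 6.8, 9.1]

theta_hat = least_squares(A, y)

# Error vector e = y - A theta_hat at the estimate.
e = [yi - sum(a * t for a, t in zip(row, theta_hat)) for row, yi in zip(A, y)]
# The normal equation is equivalent to A^T e = 0.
At_e = [sum(row[j] * ei for row, ei in zip(A, e)) for j in range(len(theta_hat))]

print(theta_hat, At_e)
```

The entries of `At_e` vanish up to rounding, which is the normal equation rewritten as $A^T(y - A\hat{\theta}) = 0$.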

Least-Square Estimators: Example

Example: the relationship between the spring length L and the applied force f is modeled as $L = k_1 f + k_0$ (linear model).

• Goal: find $\theta = (k_0, k_1)^T$ that best fits the data, so that for a given force $f_0$ we can determine the corresponding spring length $L_0$

- Solution 1: provide 2 pairs $(L_0, f_0)$ and $(L_1, f_1)$ and solve a linear system of 2 equations in the 2 unknowns $k_1$ and $k_0$. However, because the data are noisy, this solution is not reliable.

- Solution 2: use a larger training set $(L_i, f_i)$ to estimate $k_0$ and $k_1$

Least-Square Estimators: Example

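Since y = e + Aθ, we can write:

$$\underbrace{\begin{bmatrix} 1 & 1.1 \\ 1 & 1.9 \\ 1 & 3.2 \\ 1 & 4.4 \\ 1 & 5.9 \\ 1 & 7.4 \\ 1 & 9.2 \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} k_0 \\ k_1 \end{bmatrix}}_{\theta} + \underbrace{\begin{bmatrix} e_1 \\ e_2 \\ e_3 \\ e_4 \\ e_5 \\ e_6 \\ e_7 \end{bmatrix}}_{e} = \underbrace{\begin{bmatrix} 1.5 \\ 2.1 \\ 2.5 \\ 3.3 \\ 4.1 \\ 4.6 \\ 5.0 \end{bmatrix}}_{y}$$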

Least-Square Estimators: Example

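Therefore, the LSE of $[k_0, k_1]^T$, which minimizes

$$e^T e = \sum_{i=1}^{7} e_i^2$$

is equal to:

$$\begin{bmatrix} \hat{k}_0 \\ \hat{k}_1 \end{bmatrix} = (A^T A)^{-1} A^T y = \begin{bmatrix} 1.20 \\ 0.44 \end{bmatrix}$$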
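The estimate can be reproduced with a few lines of arithmetic. The sketch below assumes the seven force and length values are the ones appearing in the matrix equation of this example (forces in the second column of A, lengths in y); the variable names are illustrative. For two parameters, $(A^T A)^{-1}$ has a closed form, so no general solver is needed.

```python
# Training data of the spring example: applied forces f_i and lengths L_i.
force = [1.1, 1.9, 3.2, 4.4, 5.9, 7.4, 9.2]
length = [1.5, 2.1, 2.5, 3.3, 4.1, 4.6, 5.0]

m = len(force)

# Entries of A^T A and A^T y for rows a_i^T = [1, f_i].
s_f = sum(force)
s_ff = sum(f * f for f in force)
s_L = sum(length)
s_fL = sum(f * L for f, L in zip(force, length))

# Closed-form inverse of the 2 x 2 matrix A^T A = [[m, s_f], [s_f, s_ff]].
det = m * s_ff - s_f ** 2
k0 = (s_ff * s_L - s_f * s_fL) / det
k1 = (m * s_fL - s_f * s_L) / det

print(round(k0, 2), round(k1, 2))
```

Rounded to two decimals, the result agrees with the values 1.20 and 0.44 stated in the example.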

Least-Square Estimators: Example

We rely on this estimate because it is based on more data.

If we are not satisfied with the LSE, we can increase the model's degrees of freedom:

$$L = k_0 + k_1 f + k_2 f^2 + \dots + k_n f^n$$

(least-squares polynomial)
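A least-squares polynomial only changes the columns of A: row i becomes $[1, f_i, f_i^2, \dots, f_i^n]$. The following sketch fits degrees 1 to 3 to the spring data and reports the training error of each fit. It is an illustration under assumptions: the helpers `solve`, `fit_polynomial`, and `squared_error` are hypothetical names, and solving the normal equation directly is adequate for these low degrees but becomes ill-conditioned for high ones.

```python
def solve(M, b):
    """Solve the square system M x = b by Gauss-Jordan elimination."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for c in range(n):
        # Partial pivoting: bring the largest entry of column c to the diagonal.
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [x - factor * z for x, z in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_polynomial(fs, Ls, degree):
    """Least-squares coefficients [k_0, ..., k_degree] via the normal equation."""
    n = degree + 1
    A = [[f ** j for j in range(n)] for f in fs]
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    Aty = [sum(row[i] * L for row, L in zip(A, Ls)) for i in range(n)]
    return solve(AtA, Aty)


def squared_error(k, fs, Ls):
    """Sum of squared errors e^T e on the given data."""
    return sum((L - sum(kj * f ** j for j, kj in enumerate(k))) ** 2
               for f, L in zip(fs, Ls))


force = [1.1, 1.9, 3.2, 4.4, 5.9, 7.4, 9.2]
length = [1.5, 2.1, 2.5, 3.3, 4.1, 4.6, 5.0]

sse = {}
for degree in (1, 2, 3):
    k = fit_polynomial(force, length, degree)
    sse[degree] = squared_error(k, force, length)
    print(degree, sse[degree])
```

Since each polynomial class contains the previous one, the training error cannot increase with the degree. That says nothing about how the model behaves away from the training data.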

Least-Square Estimators: Example

Higher-order models fit the data better, but they do not always reflect the inner law that governs the system.

For example, with a higher-order fit the predicted length decreases as f increases toward 10 N.


Derivative-Based Optimization

Based on first derivatives:

• Steepest descent
• Conjugate gradient method
• Gauss-Newton method
• Levenberg-Marquardt method
• And many others

Based on second derivatives:

• Newton method
• And many others


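Steepest Descent

We have an n-dimensional space:

$$\theta = [\theta_1, \theta_2, \dots, \theta_n]^T$$

We want to find $\theta = \theta^*$ such that

$$E(\theta^*) = \min_{\theta} E(\theta)$$

Investigate the search space via iteration:

$$\theta_{k+1} = \theta_k + \eta_k d_k \qquad (k = 1, 2, 3, \dots)$$

such that

$$E(\theta_{k+1}) = E(\theta_k + \eta_k d_k) < E(\theta_k)$$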

Steepest Descent

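If the direction d is determined from the gradient of $E(\theta)$, i.e.,

$$g(\theta) \overset{\text{def}}{=} \nabla E(\theta) = \left[ \frac{\partial E(\theta)}{\partial \theta_1}, \frac{\partial E(\theta)}{\partial \theta_2}, \dots, \frac{\partial E(\theta)}{\partial \theta_n} \right]^T$$

such that $\theta_{k+1} = \theta_k - \eta g$,

then we talk about the method of steepest descent.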
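As an illustration, the update $\theta_{k+1} = \theta_k - \eta g$ can be applied to the squared error of the spring example, for which $g = -2A^T(y - A\theta)$. The fixed step size `eta`, the starting point, and the iteration count are assumptions chosen so that the iteration converges on this particular data set; they are not prescribed by the slides.

```python
# Spring data from the least-squares example: forces f_i and lengths L_i.
force = [1.1, 1.9, 3.2, 4.4, 5.9, 7.4, 9.2]
length = [1.5, 2.1, 2.5, 3.3, 4.1, 4.6, 5.0]


def squared_error(theta):
    """E(theta) = sum of (L_i - k0 - k1 f_i)^2."""
    k0, k1 = theta
    return sum((L - (k0 + k1 * f)) ** 2 for f, L in zip(force, length))


def gradient(theta):
    """g = -2 A^T (y - A theta) for rows a_i^T = [1, f_i]."""
    k0, k1 = theta
    residuals = [L - (k0 + k1 * f) for f, L in zip(force, length)]
    return [-2.0 * sum(residuals),
            -2.0 * sum(r * f for r, f in zip(residuals, force))]


eta = 0.002          # assumed fixed step size, small enough for this E
theta = [0.0, 0.0]   # assumed starting point
initial_error = squared_error(theta)

for k in range(5000):
    g = gradient(theta)
    # Move against the gradient: theta_{k+1} = theta_k - eta * g.
    theta = [t - eta * g_j for t, g_j in zip(theta, g)]

final_error = squared_error(theta)
print(theta, final_error)
```

With these settings the iterates approach the least-squares estimate of the example. A step size that is too large for the curvature of E makes the same loop diverge, which is why η has to be chosen with care or adapted at each iteration.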