7-1
Lecture Note #7 (Chap. 11)
CBE 702, Korea University
Prof. Dae Ryook Yang
System Modeling and Identification
7-2
Chap.11 Real-time Identification
• Real-time identification
  – Supervision and tracking of time-varying parameters for
    • Adaptive control, filtering, prediction
    • Signal processing
    • Detection, diagnosis, artificial neural networks, etc.
  – Identification methods based on a fixed set of measurements are not suitable
  – Only a few data points need to be stored
  – Drawbacks
    • Requires a priori knowledge of the model structure
    • Iterative solutions based on larger data sets may be difficult to organize
7-3
• Recursive estimation of a constant
  – Consider the following noisy observation of a constant parameter:

$$y_k = \varphi\theta + v_k,\qquad E\{v_k\}=0,\quad E\{v_iv_j\}=\sigma^2\delta_{ij}\qquad(\varphi=1,\ \forall k)$$

  – The least-squares estimate is found as the sample average:

$$\hat\theta_k = \frac{1}{k}\sum_{i=1}^{k} y_i$$

  – Recursive form, using $\hat\theta_{k-1} = \frac{1}{k-1}\sum_{i=1}^{k-1} y_i$:

$$\hat\theta_k = \hat\theta_{k-1} + \frac{1}{k}\big(y_k - \hat\theta_{k-1}\big)$$

  – Variance estimate of the least-squares estimate:

$$p_k = \sigma^2\Big(\sum_{i=1}^{k}\varphi_i^T\varphi_i\Big)^{-1} \approx E\{(\hat\theta-\theta)(\hat\theta-\theta)^T\},\qquad
\frac{1}{p_k} = \frac{1}{p_{k-1}} + \frac{1}{\sigma^2}\ \Rightarrow\ p_k = \frac{p_{k-1}\sigma^2}{p_{k-1}+\sigma^2}$$

  – Note that $p_k\to 0$ as $k\to\infty$.
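The two recursions above can be checked numerically. A minimal sketch, with assumed values $\theta = 3$ and $\sigma = 0.5$ (not from the text):

```python
import numpy as np

# Recursive estimation of a constant observed through noise:
# y_k = theta + v_k, Var{v_k} = sigma^2 (assumed example values).
rng = np.random.default_rng(0)
theta_true, sigma = 3.0, 0.5
y = theta_true + sigma * rng.standard_normal(2000)

theta_hat = 0.0
p = 1e6 * sigma**2                      # large initial variance (vague prior)
for k, yk in enumerate(y, start=1):
    theta_hat += (yk - theta_hat) / k   # theta_k = theta_{k-1} + (1/k)(y_k - theta_{k-1})
    p = p * sigma**2 / (p + sigma**2)   # 1/p_k = 1/p_{k-1} + 1/sigma^2

batch = y.mean()                        # offline least-squares estimate
```

The recursion reproduces the sample average exactly, and $p$ approaches $\sigma^2/k$, tending to zero as $k$ grows.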
7-4
• Derivation of recursive least-squares identification
  – Consider as usual the regressor φ_i and the observation y_i, with

$$\Phi_k = \begin{bmatrix}\varphi_1 & \cdots & \varphi_k\end{bmatrix}^T,\qquad Y_k = \begin{bmatrix}y_1 & \cdots & y_k\end{bmatrix}^T$$

  – The least-squares criterion based on k samples is

$$V_k(\theta) = \tfrac12\,(Y_k-\Phi_k\theta)^T(Y_k-\Phi_k\theta) = \tfrac12\,\varepsilon_k^T(\theta)\,\varepsilon_k(\theta)$$

  – The ordinary least-squares estimate:

$$\hat\theta_k = (\Phi_k^T\Phi_k)^{-1}\Phi_k^T Y_k = \Big(\sum_{i=1}^{k}\varphi_i\varphi_i^T\Big)^{-1}\sum_{i=1}^{k}\varphi_i y_i$$

  – Introduce the matrix

$$P_k = \Big(\sum_{i=1}^{k}\varphi_i\varphi_i^T\Big)^{-1} = (\Phi_k^T\Phi_k)^{-1},\qquad P_k^{-1} = P_{k-1}^{-1} + \varphi_k\varphi_k^T$$

so that

$$\hat\theta_k = P_k\sum_{i=1}^{k}\varphi_i y_i = P_k\big(P_{k-1}^{-1}\hat\theta_{k-1} + \varphi_k y_k\big) = \hat\theta_{k-1} + P_k\varphi_k\big(y_k - \varphi_k^T\hat\theta_{k-1}\big)$$

  – Alternative form (avoiding inversion of matrices):

$$P_k = \big(P_{k-1}^{-1} + \varphi_k\varphi_k^T\big)^{-1} = P_{k-1} - P_{k-1}\varphi_k\big(I + \varphi_k^T P_{k-1}\varphi_k\big)^{-1}\varphi_k^T P_{k-1}$$

cf) Matrix inversion lemma: $(A+BC)^{-1} = A^{-1} - A^{-1}B(I + CA^{-1}B)^{-1}CA^{-1}$
  – Initialization: $P_0^{-1} = 0$, approximated in practice by $P_0 = \alpha I$ with $\alpha$ large.
7-5
Recursive Least-Squares (RLS) Identification

• The recursive least-squares (RLS) identification algorithm:

$$\begin{aligned}
\hat\theta_k &= \hat\theta_{k-1} + P_k\varphi_k\varepsilon_k\\
\varepsilon_k &= y_k - \varphi_k^T\hat\theta_{k-1}\\
P_k &= P_{k-1} - \frac{P_{k-1}\varphi_k\varphi_k^T P_{k-1}}{1+\varphi_k^T P_{k-1}\varphi_k},\qquad P_0\ \text{given}
\end{aligned}$$

where $\hat\theta_k$ is the parameter estimate, $\varepsilon_k$ the prediction error, and $P_k$ the parameter covariance estimate (except for the factor $\sigma^2$).

• Some properties of RLS estimation
  – Parameter accuracy and convergence: with $\tilde\theta_k = \hat\theta_k - \theta$, introduce

$$Q_k(\tilde\theta_k) = \tfrac12\,(\hat\theta_k-\theta)^T P_k^{-1}(\hat\theta_k-\theta) = \tfrac12\,\tilde\theta_k^T P_k^{-1}\tilde\theta_k$$

Using $\tilde\theta_k = \tilde\theta_{k-1} + P_k\varphi_k\varepsilon_k$, $P_k^{-1} - P_{k-1}^{-1} = \varphi_k\varphi_k^T$, and the $P_k$ update above:

$$\begin{aligned}
2\big(Q_k(\tilde\theta_k) - Q_{k-1}(\tilde\theta_{k-1})\big) &= \tilde\theta_k^T P_k^{-1}\tilde\theta_k - \tilde\theta_{k-1}^T P_{k-1}^{-1}\tilde\theta_{k-1}\\
&= \big(\tilde\theta_{k-1}^T\varphi_k\big)^2 + 2\,\tilde\theta_{k-1}^T\varphi_k\varepsilon_k + \varepsilon_k^2\,\varphi_k^T P_k\varphi_k\\
&= \big(\tilde\theta_{k-1}^T\varphi_k\big)^2 + 2\,\tilde\theta_{k-1}^T\varphi_k\varepsilon_k + \varepsilon_k^2 - \frac{\varepsilon_k^2}{1+\varphi_k^T P_{k-1}\varphi_k}
\end{aligned}$$
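The three-line algorithm above translates directly to code. A minimal sketch on simulated data; the two-parameter FIR system, the noise level, and $P_0 = 10^4 I$ are assumed for illustration:

```python
import numpy as np

# RLS identification of y_k = phi_k^T theta + v_k for an assumed
# two-parameter FIR example; P_0 = alpha*I with alpha large.
rng = np.random.default_rng(1)
theta_true = np.array([1.5, -0.7])
u = rng.standard_normal(501)

theta_hat = np.zeros(2)
P = 1e4 * np.eye(2)
for k in range(1, 501):
    phi = np.array([u[k], u[k - 1]])            # regressor phi_k
    y = phi @ theta_true + 0.1 * rng.standard_normal()
    eps = y - phi @ theta_hat                   # prediction error
    Pphi = P @ phi
    P = P - np.outer(Pphi, Pphi) / (1.0 + phi @ Pphi)   # P_k update
    theta_hat = theta_hat + P @ phi * eps       # uses the updated P_k
```

Note that the parameter update uses the already-updated $P_k$, matching the algorithm's ordering.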
7-6
– Under the linear model assumption, $y_k = \varphi_k^T\theta + v_k$, so that $\varepsilon_k = -\tilde\theta_{k-1}^T\varphi_k + v_k$ and

$$Q_k - Q_{k-1} = \frac12\,v_k^2 - \frac12\,\frac{\varepsilon_k^2}{1+\varphi_k^T P_{k-1}\varphi_k}$$

  • If $v_k = 0$ for all k, Q decreases in each recursion step.
  • If Q tends to zero, it implies that $\|\tilde\theta_k\|^2$ tends to zero, as the sequence of weighting matrices $\{P_k^{-1}\}$ is an increasing sequence of positive definite matrices with $P_k^{-1}\ge P_{k-1}^{-1}$ for all k>0.

• Theorem 11.1
  – The errors of the estimated parameters and the prediction error for least-squares estimation have a bound determined by the noise magnitude, according to

$$V_k(\hat\theta_k) + Q_k(\tilde\theta_k) = \tfrac12\,\varepsilon_k^T(\hat\theta_k)\varepsilon_k(\hat\theta_k) + \tfrac12\,\tilde\theta_k^T P_k^{-1}\tilde\theta_k = \tfrac12\,v^Tv$$

  – It implies

$$0 \le \tilde\theta_k^T\Phi_k^T\Phi_k\tilde\theta_k = v^Tv - \varepsilon_k^T(\hat\theta_k)\varepsilon_k(\hat\theta_k) \le v^Tv$$

  – Parameter convergence can be obtained for a stationary stochastic process {v_k} if $\Phi_k^T\Phi_k > c\,k\,I_{p\times p}$ (c is a constant).
  – Thus, poor convergence is obtained in cases of large disturbances and a rank-deficient matrix $\Phi_k^T\Phi_k$.
7-7
• Properties of the P_k matrix
  – P_k is a positive definite, symmetric matrix (P_k = P_k^T > 0).
  – P_k → 0 as k → ∞.
  – The matrix P_k is asymptotically proportional to the parameter estimate covariance, provided that a correct model structure has been used.
  – It is therefore often called the "covariance matrix."

• Comparison between RLS and offline LS identification
  – If the initial values P_0 and θ̂_0 are chosen to be compatible with the results of the ordinary least-squares method, the result obtained from RLS is the same as that of offline least-squares identification.
  – Thus, calculate the initial values for RLS from the ordinary LS method using some block of initial data.
7-8
• Modification for time-varying parameters
  – RLS gives equal weighting to old data and new data.
  – If the parameters are time-varying, pay less attention to old data.
  – Forgetting factor (λ):

$$J_k(\theta) = \frac12\sum_{i=1}^{k}\lambda^{k-i}\big(y_i - \varphi_i^T\theta\big)^2\qquad(0<\lambda\le 1)$$

  – Modified RLS:

$$\begin{aligned}
\hat\theta_k &= \hat\theta_{k-1} + P_k\varphi_k\varepsilon_k\\
\varepsilon_k &= y_k - \varphi_k^T\hat\theta_{k-1}\\
P_k &= \frac{1}{\lambda}\left(P_{k-1} - \frac{P_{k-1}\varphi_k\varphi_k^T P_{k-1}}{\lambda+\varphi_k^T P_{k-1}\varphi_k}\right),\qquad P_0\ \text{given}
\end{aligned}$$

  – Disadvantages
    • The noise sensitivity becomes more prominent as λ decreases.
    • The P_k matrix may grow as k increases if the input is such that the magnitude of P_{k-1}φ_k is small ("P-matrix explosion" or covariance matrix explosion).
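A sketch of the modified update tracking a parameter step; the scalar gain model, λ = 0.95, and the noise level are assumed for illustration:

```python
import numpy as np

# RLS with forgetting factor tracking y_k = theta_k*u_k + v_k,
# where the true parameter steps from 1 to 2 halfway through.
rng = np.random.default_rng(2)
lam, N = 0.95, 400
theta_hat, P = 0.0, 100.0
estimates = []
for k in range(N):
    theta_k = 1.0 if k < N // 2 else 2.0         # true parameter steps 1 -> 2
    u = rng.standard_normal()
    y = theta_k * u + 0.05 * rng.standard_normal()
    eps = y - u * theta_hat
    P = (P - P * u * P * u / (lam + u * P * u)) / lam   # scalar P update
    theta_hat += P * u * eps
    estimates.append(theta_hat)
```

With λ = 0.95 the effective memory is roughly 1/(1−λ) = 20 samples, so the estimate settles near the new value within a few dozen steps after the jump.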
7-9
• Choice of forgetting factor
  – Trade-off between the required ability to track a time-varying parameter (i.e., a small value of λ) and the noise sensitivity allowed.
  – A value of λ close to 1: less sensitive to disturbances, but slow tracking of rapid variations in the parameters.
  – Default choice: 0.97 ≤ λ ≤ 0.995.
  – Rough estimate of the number of data points in memory (time constant): 1/(1−λ).

• Example 11.3: Choice of forgetting factor
  – Data generated by $y_k = \varphi\theta + v_k$, $E\{v_k\}=0$, $E\{v_iv_j\}=\sigma^2\delta_{ij}$ ($\varphi=1,\ \forall k$), estimated with the modified RLS above.
  (Figure: estimates for λ = 0.99, 0.98, 0.95 as the parameter steps from θ = 1 to θ = 2; smaller λ gives faster tracking.)
7-10
• Delta model
  – The disadvantage of the z-transform formulation is that its parameters do not converge to the continuous-time (Laplace-transform) parameters from which they were derived as the sampling period decreases.
  – Very small sampling periods yield very small numbers in the transfer function numerator.
  – The poles of the transfer function approach the unstable domain as the sampling period decreases.
  – These disadvantages can be avoided by introducing a more suitable discrete model.
  – δ-operator: $\delta \equiv (z-1)/h$

$$x_{k+1} = \Phi x_k + \Gamma u_k,\quad y_k = Cx_k\qquad\Rightarrow\qquad \delta x_k = \frac1h(\Phi - I)x_k + \frac1h\Gamma u_k = \Phi' x_k + \Gamma' u_k,\quad y_k = Cx_k$$

  – This formulation makes the state-space realization and the corresponding system identification less error-prone, due to the favorable numerical scaling of the Φ′ and Γ′ matrices as compared to the ordinary z-transform-based algebra.
7-11
• Kalman filter interpretation
  – Assume that the time-varying system parameter θ may be described by the state-space equation

$$\begin{aligned}
\theta_{k+1} &= \theta_k + v_k, & E\{v_k\}&=0,\quad E\{v_iv_j^T\} = R_1\delta_{ij}\\
y_k &= \varphi_k^T\theta_k + e_k, & E\{e_k\}&=0,\quad E\{e_ie_j^T\} = R_2\delta_{ij}
\end{aligned}$$

  – Kalman filter for estimation of θ_k:

$$\begin{aligned}
\hat\theta_k &= \hat\theta_{k-1} + K_k\varepsilon_k\\
K_k &= \frac{P_{k-1}\varphi_k}{R_2+\varphi_k^T P_{k-1}\varphi_k}\\
\varepsilon_k &= y_k - \varphi_k^T\hat\theta_{k-1}\\
P_k &= P_{k-1} - \frac{P_{k-1}\varphi_k\varphi_k^T P_{k-1}}{R_2+\varphi_k^T P_{k-1}\varphi_k} + R_1
\end{aligned}$$

compared with RLS:

$$\begin{aligned}
\hat\theta_k &= \hat\theta_{k-1} + P_k\varphi_k\varepsilon_k\\
\varepsilon_k &= y_k - \varphi_k^T\hat\theta_{k-1}\\
P_k &= \frac{1}{\lambda}\left(P_{k-1} - \frac{P_{k-1}\varphi_k\varphi_k^T P_{k-1}}{\lambda+\varphi_k^T P_{k-1}\varphi_k}\right)\qquad\text{(RLS)}
\end{aligned}$$

  – Differences from RLS
    • The dynamics of P_k change from an exponential to a linear growth rate for φ_k = 0, due to R_1.
    • P_k of the Kalman filter does not approach zero as k → ∞ for a nonzero sequence {φ_k}.
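A scalar sketch of this Kalman-filter form; the drift model and the values of R1 and R2 are assumed for illustration. Note how adding R1 each step keeps P bounded away from zero:

```python
import numpy as np

# Kalman filter tracking a random-walk parameter theta_{k+1} = theta_k + v_k
# observed through y_k = phi_k*theta_k + e_k (assumed noise levels).
rng = np.random.default_rng(3)
R1, R2 = 1e-4, 0.01
theta_true, theta_hat, P = 1.0, 0.0, 1.0
for k in range(500):
    theta_true += np.sqrt(R1) * rng.standard_normal()   # parameter drifts
    phi = rng.standard_normal()
    y = phi * theta_true + np.sqrt(R2) * rng.standard_normal()
    eps = y - phi * theta_hat
    denom = R2 + phi * P * phi
    theta_hat += (P * phi / denom) * eps                # K_k * eps
    P = P - (P * phi) ** 2 / denom + R1                 # P_k does not -> 0
```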
7-12
• Other forms of the RLS algorithm
  – Basic version:

$$\begin{aligned}
\hat\theta_k &= \hat\theta_{k-1} + K_k\big[y_k - \varphi_k^T\hat\theta_{k-1}\big]\\
K_k &= P_k\varphi_k = \frac{P_{k-1}\varphi_k}{\lambda_k + \varphi_k^T P_{k-1}\varphi_k}\\
P_k &= \frac{1}{\lambda_k}\left(P_{k-1} - \frac{P_{k-1}\varphi_k\varphi_k^T P_{k-1}}{\lambda_k + \varphi_k^T P_{k-1}\varphi_k}\right)
\end{aligned}$$

  – Normalized gain version, with $P_k = \gamma_k R_k^{-1}$:

$$\begin{aligned}
\hat\theta_k &= \hat\theta_{k-1} + \gamma_k R_k^{-1}\varphi_k\big[y_k - \varphi_k^T\hat\theta_{k-1}\big]\\
R_k &= R_{k-1} + \gamma_k\big(\varphi_k\varphi_k^T - R_{k-1}\big)\\
\gamma_k &= \frac{\gamma_{k-1}}{\lambda_k + \gamma_{k-1}}
\end{aligned}$$

  – Multivariable case:

$$\hat\theta_k = \arg\min_{\theta}\ \frac12\sum_{i=1}^{k}\Big(\prod_{j=i+1}^{k}\lambda_j\Big)\,\big[y_i - \varphi_i^T\theta\big]^T\Lambda^{-1}\big[y_i - \varphi_i^T\theta\big]$$

$$\begin{aligned}
\hat\theta_k &= \hat\theta_{k-1} + K_k\big[y_k - \varphi_k^T\hat\theta_{k-1}\big]\\
K_k &= P_{k-1}\varphi_k\big(\lambda_k\Lambda_{k-1} + \varphi_k^T P_{k-1}\varphi_k\big)^{-1} && \text{(prediction gain update)}\\
P_k &= \frac{1}{\lambda_k}\Big(P_{k-1} - P_{k-1}\varphi_k\big(\lambda_k\Lambda_{k-1} + \varphi_k^T P_{k-1}\varphi_k\big)^{-1}\varphi_k^T P_{k-1}\Big) && \text{(parameter error covariance update)}\\
\Lambda_k &= \Lambda_{k-1} + \gamma_k\big(\varepsilon_k\varepsilon_k^T - \Lambda_{k-1}\big) && \text{(output error covariance update)}
\end{aligned}$$

where Λ_k is the output error covariance; if $\lambda_j = \lambda$ for all j, then $\prod_{j=i+1}^{k}\lambda_j = \lambda^{k-i}$.
7-13
Recursive Instrumental Variable (RIV) Method

• The ordinary IV solution

$$\hat\theta_k = (Z_k^T\Phi_k)^{-1}Z_k^TY_k = \Big(\sum_{i=1}^{k} z_i\varphi_i^T\Big)^{-1}\sum_{i=1}^{k} z_iy_i$$

• RIV

$$\begin{aligned}
\hat\theta_k &= \hat\theta_{k-1} + K_k\varepsilon_k\\
K_k &= \frac{P_{k-1}z_k}{1+\varphi_k^T P_{k-1}z_k}\\
\varepsilon_k &= y_k - \varphi_k^T\hat\theta_{k-1}\\
P_k &= P_{k-1} - \frac{P_{k-1}z_k\varphi_k^T P_{k-1}}{1+\varphi_k^T P_{k-1}z_k}
\end{aligned}$$

  – Standard choice of instrumental variable:

$$z_k^T = \big(-x_{k-1}\ \cdots\ -x_{k-n_A}\ \ u_{k-1}\ \cdots\ u_{k-n_B}\big)$$

    • The variable x_k may be, for instance, the estimated output.
  – RIV has some stability problems associated with the choice of IV and the updating of the P_k matrix.
7-14
Recursive Prediction Error Methods (RPEM)

• RPEM
  – Consider a weighted quadratic prediction error criterion

$$J_k(\theta) = \frac{\gamma_k}{2}\sum_{i=1}^{k}\lambda^{k-i}\big(y_i - \varphi_i^T\theta\big)^2,\qquad \gamma_k\sum_{i=1}^{k}\lambda^{k-i} = 1$$

  – The gradient is

$$J'_k(\theta) = -\gamma_k\sum_{i=1}^{k}\lambda^{k-i}\psi_i\varepsilon_i(\theta),\qquad \psi_i \equiv -\frac{\partial\varepsilon_i}{\partial\theta}$$

which satisfies the recursion

$$J'_k(\theta) = \frac{\gamma_k\lambda}{\gamma_{k-1}}\,J'_{k-1}(\theta) - \gamma_k\psi_k\varepsilon_k(\theta)$$

Evaluated at $\hat\theta_{k-1}$, with $J'_{k-1}(\hat\theta_{k-1}) = 0$ ($\hat\theta_{k-1}$ optimal):

$$J'_k(\hat\theta_{k-1}) = -\gamma_k\psi_k(\hat\theta_{k-1})\,\varepsilon_k(\hat\theta_{k-1})$$

  – General RPEM search algorithm:

$$\begin{aligned}
\hat\theta_k &= \hat\theta_{k-1} - R_k^{-1}J'_k(\hat\theta_{k-1}) = \hat\theta_{k-1} + \gamma_kR_k^{-1}\psi_k\varepsilon_k\\
R_k &= R_{k-1} + \gamma_k\big(\psi_k\psi_k^T - R_{k-1}\big)
\end{aligned}$$
7-15
• Stochastic gradient methods
  – A family of RPEM.
  – Also called stochastic approximation or least mean squares (LMS).
  – Uses the steepest-descent method to update the parameters; for a linear model,

$$\psi_k \equiv -\frac{\partial\varepsilon_k}{\partial\theta} = -\frac{\partial(y_k - \varphi_k^T\theta)}{\partial\theta} = \varphi_k$$

  – The algorithm (time-varying, regressor-dependent gain version):

$$\begin{aligned}
\hat\theta_k &= \hat\theta_{k-1} + \gamma_k\varphi_k\varepsilon_k\\
\varepsilon_k &= y_k - \varphi_k^T\hat\theta_{k-1}\\
\gamma_k &= Q/r_k\qquad(Q = Q^T > 0)\\
r_k &= r_{k-1} + \varphi_k^TQ\varphi_k
\end{aligned}$$

    • Rapid computation, as there is no P_k matrix to evaluate.
    • Good detection of time-varying parameters.
    • Slow convergence and noise sensitivity.
  – Modification for time-varying parameters: keep the factor r_k at a lower magnitude,

$$r_k \equiv \lambda r_{k-1} + \varphi_k^TQ\varphi_k,\qquad 0\le\lambda\le 1$$
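A sketch of the gain-normalized version with Q = I (system and noise level assumed); there is no P matrix to propagate, only the scalar r_k:

```python
import numpy as np

# Stochastic gradient (LMS-type) identification with Q = I:
# theta_k = theta_{k-1} + (phi_k/r_k)*eps_k, r_k = r_{k-1} + phi_k^T phi_k.
rng = np.random.default_rng(4)
theta_true = np.array([0.8, -0.4])
theta_hat = np.zeros(2)
r = 1.0
for k in range(5000):
    phi = rng.standard_normal(2)
    y = phi @ theta_true + 0.05 * rng.standard_normal()
    eps = y - phi @ theta_hat
    r += phi @ phi                       # r_k = r_{k-1} + phi^T Q phi
    theta_hat += phi * eps / r           # no covariance matrix needed
```

Compared with the RLS sketch earlier, far more samples are needed for comparable accuracy, which illustrates the slow-convergence remark above.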
7-16
• RPEM for the multivariable case

$$\begin{aligned}
\hat\theta_k &= \hat\theta_{k-1} + \gamma_kR_k^{-1}\psi_k\Lambda_{k-1}^{-1}\varepsilon_k\\
R_k &= R_{k-1} + \gamma_k\big(\psi_k\Lambda_{k-1}^{-1}\psi_k^T - R_{k-1}\big)\\
\Lambda_k &= \Lambda_{k-1} + \gamma_k\big(\varepsilon_k\varepsilon_k^T - \Lambda_{k-1}\big)
\end{aligned}$$

• Projection of the parameters into the parameter domain D_M

$$\hat\theta'_k = \hat\theta_{k-1} + K_k\big[y_k - \varphi_k^T\hat\theta_{k-1}\big],\qquad
\hat\theta_k = \begin{cases}\hat\theta'_k & \text{if } \hat\theta'_k \in D_M\\ \hat\theta_{k-1} & \text{if } \hat\theta'_k \notin D_M\end{cases}$$
7-17
Recursive Pseudolinear Regression (RPLR)

• Recursive pseudolinear regression (RPLR)
  – Also called recursive ML estimation or the extended LS method.
  – The regression model:

$$y_k = \varphi_k^T\theta + v_k,\qquad \theta^T = \big(a_1\ \cdots\ a_{n_A}\ \ b_1\ \cdots\ b_{n_B}\ \ c_1\ \cdots\ c_{n_C}\big)$$

  – The recursive algorithm:

$$\begin{aligned}
\hat\theta_k &= \hat\theta_{k-1} + K_k\varepsilon_k\\
K_k &= \frac{P_{k-1}\varphi_k}{1+\varphi_k^T P_{k-1}\varphi_k}\\
\varepsilon_k &= y_k - \varphi_k^T\hat\theta_{k-1}\\
P_k &= P_{k-1} - \frac{P_{k-1}\varphi_k\varphi_k^T P_{k-1}}{1+\varphi_k^T P_{k-1}\varphi_k}
\end{aligned}$$

  – The regression vector (with ε as an estimate of v_k):

$$\varphi_k^T = \big(-y_{k-1}\ \cdots\ -y_{k-n_A}\ \ u_{k-1}\ \cdots\ u_{k-n_B}\ \ \varepsilon_{k-1}\ \cdots\ \varepsilon_{k-n_C}\big)$$

  – The algorithm may be modified to iterate for the best possible ε.
7-18
Application to Models

• RPEM for the state-space innovation model

$$x_{k+1}(\theta) = F(\theta)x_k(\theta) + G(\theta)u_k + K(\theta)v_k,\qquad y_k = H(\theta)x_k(\theta)$$

  – Innovation: $v_k = y_k - \hat y_k(\theta)$
  – Predictor and algorithm:

$$\begin{aligned}
\varepsilon_k &= y_k - \hat y_k\\
\Lambda_k &= \Lambda_{k-1} + \gamma_k\big(\varepsilon_k\varepsilon_k^T - \Lambda_{k-1}\big)\\
R_k &= R_{k-1} + \gamma_k\big(\psi_k\Lambda_k^{-1}\psi_k^T - R_{k-1}\big)\\
\hat\theta_{k+1} &= \hat\theta_k + \gamma_kR_k^{-1}\psi_k\Lambda_k^{-1}\varepsilon_k\\
\hat x_{k+1} &= F_k\hat x_k + G_ku_k + K_k\varepsilon_k\\
\hat y_{k+1} &= H_k\hat x_{k+1}\\
W_{k+1} &= (F_k - K_kH_k)W_k + M_k - K_kD_k\\
\psi_{k+1}^T &= H_kW_{k+1} + D_k
\end{aligned}$$

where

$$F_k = F(\hat\theta_k),\quad G_k = G(\hat\theta_k),\quad H_k = H(\hat\theta_k),\quad K_k = K(\hat\theta_k)$$

$$\psi_k^T(\theta) = \frac{d\hat y_k(\theta)}{d\theta},\qquad W_k(\theta) = \frac{d\hat x_k(\theta)}{d\theta}$$

$$D_k = \frac{\partial}{\partial\theta}\big(H(\theta)\hat x_k\big),\qquad
M_k = \frac{\partial}{\partial\theta}\big(F(\theta)\hat x_k + G(\theta)u_k + K(\theta)\varepsilon_k\big)$$
7-19
• RPEM for general input-output models
  – System:

$$A(q^{-1})y_k = \frac{B(q^{-1})}{F(q^{-1})}u_k + \frac{C(q^{-1})}{D(q^{-1})}e_k$$

with

$$\begin{aligned}
A(q^{-1}) &= 1 + a_1q^{-1} + \cdots + a_{n_a}q^{-n_a}\\
B(q^{-1}) &= b_1q^{-1} + \cdots + b_{n_b}q^{-n_b}\\
F(q^{-1}) &= 1 + f_1q^{-1} + \cdots + f_{n_f}q^{-n_f}\\
C(q^{-1}) &= 1 + c_1q^{-1} + \cdots + c_{n_c}q^{-n_c}\\
D(q^{-1}) &= 1 + d_1q^{-1} + \cdots + d_{n_d}q^{-n_d}
\end{aligned}$$

  – Predictor:

$$\hat y_k(\theta) = \left[1 - \frac{D(q^{-1})A(q^{-1})}{C(q^{-1})}\right]y_k + \frac{D(q^{-1})B(q^{-1})}{C(q^{-1})F(q^{-1})}u_k$$

  – Error definitions:

$$\varepsilon_k(\theta) = y_k - \hat y_k(\theta) = \frac{D(q^{-1})}{C(q^{-1})}\left(A(q^{-1})y_k - \frac{B(q^{-1})}{F(q^{-1})}u_k\right) = \frac{D(q^{-1})}{C(q^{-1})}\,v_k(\theta)$$

$$w_k(\theta) = \frac{B(q^{-1})}{F(q^{-1})}u_k,\qquad v_k(\theta) = A(q^{-1})y_k - w_k(\theta)$$

  – Parameter vector:

$$\theta^T = \big(a_1\ \cdots\ a_{n_a}\ \ b_1\ \cdots\ b_{n_b}\ \ f_1\ \cdots\ f_{n_f}\ \ c_1\ \cdots\ c_{n_c}\ \ d_1\ \cdots\ d_{n_d}\big)$$

  – Regressor:

$$\varphi_k^T(\theta) = \big(-y_{k-1}\ \cdots\ -y_{k-n_a}\ \ u_{k-1}\ \cdots\ u_{k-n_b}\ \ {-w_{k-1}}\ \cdots\ {-w_{k-n_f}}\ \ \varepsilon_{k-1}\ \cdots\ \varepsilon_{k-n_c}\ \ {-v_{k-1}}\ \cdots\ {-v_{k-n_d}}\big)$$
7-20
– Error calculations:

$$\begin{aligned}
w_k(\theta) &= b_1u_{k-1} + \cdots + b_{n_b}u_{k-n_b} - f_1w_{k-1}(\theta) - \cdots - f_{n_f}w_{k-n_f}(\theta)\\
v_k(\theta) &= y_k + a_1y_{k-1} + \cdots + a_{n_a}y_{k-n_a} - w_k(\theta)\\
\varepsilon_k(\theta) &= v_k(\theta) + d_1v_{k-1}(\theta) + \cdots + d_{n_d}v_{k-n_d}(\theta) - c_1\varepsilon_{k-1}(\theta) - \cdots - c_{n_c}\varepsilon_{k-n_c}(\theta)
\end{aligned}$$

– Expression for the prediction error: combining the relations above gives $\varepsilon_k(\theta) = y_k - \hat y_k(\theta) = y_k - \varphi_k^T(\theta)\theta$, and the predictor satisfies

$$C(q^{-1})F(q^{-1})\hat y_k(\theta) = F(q^{-1})\big[C(q^{-1}) - D(q^{-1})A(q^{-1})\big]y_k + D(q^{-1})B(q^{-1})u_k$$

– Gradient expressions, with $\psi_k(\theta) = \partial\hat y_k(\theta)/\partial\theta = -\partial\varepsilon_k(\theta)/\partial\theta$:

$$\frac{\partial\varepsilon_k(\theta)}{\partial a_i} = \frac{D(q^{-1})}{C(q^{-1})}\,y_{k-i},\qquad
\frac{\partial\varepsilon_k(\theta)}{\partial b_i} = -\frac{D(q^{-1})}{C(q^{-1})F(q^{-1})}\,u_{k-i},\qquad
\frac{\partial\varepsilon_k(\theta)}{\partial f_i} = \frac{D(q^{-1})}{C(q^{-1})F(q^{-1})}\,w_{k-i}(\theta)$$

$$\frac{\partial\varepsilon_k(\theta)}{\partial c_i} = -\frac{1}{C(q^{-1})}\,\varepsilon_{k-i}(\theta),\qquad
\frac{\partial\varepsilon_k(\theta)}{\partial d_i} = \frac{1}{C(q^{-1})}\,v_{k-i}(\theta)$$
7-21
– Algorithm:

$$\begin{aligned}
\varepsilon_k &= y_k - \hat y_k\\
R_k &= R_{k-1} + \gamma_k\big(\psi_k\psi_k^T - R_{k-1}\big)\\
\hat\theta_k &= \hat\theta_{k-1} + \gamma_kR_k^{-1}\psi_k\varepsilon_k\\
w_k &= b_1u_{k-1} + \cdots + b_{n_b}u_{k-n_b} - f_1w_{k-1} - \cdots - f_{n_f}w_{k-n_f}\\
v_k &= y_k + a_1y_{k-1} + \cdots + a_{n_a}y_{k-n_a} - w_k\\
\varphi_k^T &= \big(-y_{k-1}\ \cdots\ -y_{k-n_a}\ \ u_{k-1}\ \cdots\ u_{k-n_b}\ \ {-w_{k-1}}\ \cdots\ {-w_{k-n_f}}\ \ \varepsilon_{k-1}\ \cdots\ \varepsilon_{k-n_c}\ \ {-v_{k-1}}\ \cdots\ {-v_{k-n_d}}\big)\\
\varepsilon_k &= v_k + d_1v_{k-1} + \cdots + d_{n_d}v_{k-n_d} - c_1\varepsilon_{k-1} - \cdots - c_{n_c}\varepsilon_{k-n_c}\\
\hat y_k &= \varphi_k^T\hat\theta_{k-1}
\end{aligned}$$

with the filtered signals used in the gradient vector:

$$\begin{aligned}
\tilde y_k &= y_k + d_1y_{k-1} + \cdots + d_{n_d}y_{k-n_d} - c_1\tilde y_{k-1} - \cdots - c_{n_c}\tilde y_{k-n_c}\\
\tilde u_k &= u_k + d_1u_{k-1} + \cdots + d_{n_d}u_{k-n_d} - g_1\tilde u_{k-1} - \cdots - g_{n_g}\tilde u_{k-n_g}\\
\tilde w_k &= w_k + d_1w_{k-1} + \cdots + d_{n_d}w_{k-n_d} - g_1\tilde w_{k-1} - \cdots - g_{n_g}\tilde w_{k-n_g}\\
\tilde\varepsilon_k &= \varepsilon_k - c_1\tilde\varepsilon_{k-1} - \cdots - c_{n_c}\tilde\varepsilon_{k-n_c}\\
\tilde v_k &= v_k - c_1\tilde v_{k-1} - \cdots - c_{n_c}\tilde v_{k-n_c}\\
\psi_k^T &= \big(-\tilde y_{k-1}\ \cdots\ -\tilde y_{k-n_a}\ \ \tilde u_{k-1}\ \cdots\ \tilde u_{k-n_b}\ \ {-\tilde w_{k-1}}\ \cdots\ {-\tilde w_{k-n_f}}\ \ \tilde\varepsilon_{k-1}\ \cdots\ \tilde\varepsilon_{k-n_c}\ \ {-\tilde v_{k-1}}\ \cdots\ {-\tilde v_{k-n_d}}\big)
\end{aligned}$$

where the $g_i$ are the coefficients of $C(q^{-1})F(q^{-1})$.
7-22
Extended Kalman Filter
• Kalman filter for a nonlinear state-space model
  – System:

$$\begin{aligned}
x_{k+1} &= F(x_k,\theta) + G(\theta)u_k + w'_k, & E\{w'_i{w'_j}^T\} &= R_1\delta_{ij}\\
y_k &= H(x_k,\theta) + e_k, & E\{e_ie_j^T\} &= R_2\delta_{ij},\quad E\{w'_ie_j^T\} = R_{12}\delta_{ij}
\end{aligned}$$

  – With the extended state vector $X_k = \begin{bmatrix}x_k\\ \theta_k\end{bmatrix}$:

$$X_{k+1} = \bar F(X_k) + \bar G(X_k)u_k + \bar w_k,\qquad y_k = \bar H(X_k) + e_k$$

where

$$\bar F(X) = \begin{bmatrix}F(x,\theta)\\ \theta\end{bmatrix},\qquad
\bar G(X) = \begin{bmatrix}G(\theta)\\ 0\end{bmatrix},\qquad
\bar w_k = \begin{bmatrix}w'_k\\ 0\end{bmatrix},\qquad
\bar H(X) = H(x,\theta)$$

  – Linearization:

$$F_k = \left.\frac{d\bar F(X,u)}{dX}\right|_{X=\hat X_k},\qquad
H_k = \left.\frac{d\bar H(X,u)}{dX}\right|_{X=\hat X_k}$$
7-23
• Algorithm
  – Given $\hat x_0,\ \hat\theta_0,\ P_0$ (start from k = 0):

$$\begin{aligned}
K_k &= F_kP_kH_k^T\big[R_2 + H_kP_kH_k^T\big]^{-1}\\
\hat X_{k+1} &= \bar F(\hat X_k) + \bar Gu_k + K_k\big[y_k - \bar H(\hat X_k)\big]\\
P_{k+1} &= F_kP_kF_k^T + R_1 - K_k\big[R_2 + H_kP_kH_k^T\big]K_k^T
\end{aligned}$$

  – To avoid the calculation of large matrices, partition the matrices.
  – Use the latest available measurements to update the parameter estimates (separate measurement and time updates):

$$\begin{aligned}
K_k &= P_kH_k^T\big[R_2 + H_kP_kH_k^T\big]^{-1}\\
\hat X_{k|k} &= \hat X_{k|k-1} + K_k\big[y_k - \bar H(\hat X_{k|k-1})\big]\\
\hat X_{k+1|k} &= \bar F(\hat X_{k|k}) + \bar Gu_k\\
P_{k+1} &= F_kP_kF_k^T + R_1 - K_k\big[R_2 + H_kP_kH_k^T\big]K_k^T
\end{aligned}$$
7-24
Subspace Methods for Estimating State-Space Models

– Estimation of the system matrices A, B, C, and D offline.
– Assuming a minimal realization:

$$\begin{aligned}
x_{k+1} &= Ax_k + Bu_k + w_k, & x&\in\mathbb{R}^n,\ u\in\mathbb{R}^m\\
y_k &= Cx_k + Du_k + v_k, & y&\in\mathbb{R}^p
\end{aligned}$$

$$y_k = C(qI-A)^{-1}Bu_k + Du_k + v_k\quad\text{or}\quad
y_k = C(qI-A)^{-1}x_0\delta_k + C(qI-A)^{-1}Bu_k + Du_k + v_k$$

• If estimates of A and C are known, estimates of B and D can be obtained using the linear least-squares method.
  • The estimates of B and D will converge to the true values if A and C are exactly known, or are at least consistent.
• If the (extended) observability matrix $O_r$ is known, then A and C can be estimated (r > n):

$$O_r = \begin{bmatrix}C\\ CA\\ \vdots\\ CA^{r-1}\end{bmatrix}$$
7-25
– For a linear transformation $\bar x_k = Tx_k$: $\bar O_r = O_rT^{-1}$.
– For known system order (n* = n):

$$G = O_r\ (pr\times n)\quad\Rightarrow\quad C = O_r(1\!:\!p,\,1\!:\!n),\qquad O_r(p+1\!:\!pr,\,1\!:\!n) = O_r(1\!:\!p(r-1),\,1\!:\!n)\,A$$

– For unknown system order (n* > n):

$$G\ (pr\times n^*)\quad\Rightarrow\quad G = USV^T\ \ \text{(SVD)}$$

  • Partition the matrices according to the singular values and neglect the portion corresponding to the smaller singular values.
  • The estimate of the observability matrix can be

$$G = USV^T\ \Rightarrow\ \hat G = U_1S_1V_1^T = O_rT\quad\Rightarrow\quad \hat O_r = U_1S_1\ \text{ or }\ \hat O_r = U_1,\ \text{etc.}\quad(\hat O_r = U_1R,\ R\ \text{invertible})$$

  • Obtain estimates of A and C from the estimated observability matrix.
– Noisy estimate of the extended observability matrix:

$$G = USV^T = U_1S_1V_1^T + \text{(other terms)} = O_rT + E_N$$

  • If $O_r$ explains the system well, $E_N$ stems from noise.
  • If $E_N$ is small, the estimate is consistent.
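The shift-invariance step can be sketched numerically: build $O_r$ for an assumed example system, truncate its SVD, and recover C and the eigenvalues of A up to a similarity transformation T.

```python
import numpy as np

# Recover C and A (up to similarity) from an extended observability matrix
# via SVD truncation and shift invariance; A, C are an assumed example.
A = np.array([[0.9, 0.2],
              [0.0, 0.7]])
C = np.array([[1.0, 0.0]])
p, n, r = 1, 2, 5
Or = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(r)])

U, S, Vt = np.linalg.svd(Or, full_matrices=False)
Or_hat = U[:, :n] * S[:n]                 # O_r T^{-1}, with the R = S_1 choice
C_hat = Or_hat[:p, :]                     # first block row gives C T^{-1}
# shift invariance: O_r(p+1:pr, :) = O_r(1:p(r-1), :) A
A_hat, *_ = np.linalg.lstsq(Or_hat[:-p, :], Or_hat[p:, :], rcond=None)
eigs = np.sort(np.linalg.eigvals(A_hat).real)
```

Since similarity transformations preserve eigenvalues, the poles of the estimated A match those of the true A even though the state basis differs.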
7-26
• Using weighting matrices in the SVD
  – For flexibility, pretreatment before the SVD can be applied:

$$G_1 = W_1GW_2 = USV^T \approx U_1S_1V_1^T$$

  – Then the estimate of the extended observability matrix becomes

$$\hat O_r = W_1^{-1}U_1R'$$

  – When noise is present, W₁ has an important influence on the space spanned by U₁, and hence on the quality of the estimates of A and C.

• Estimating the extended observability matrix
  – The basic expression:

$$\begin{aligned}
y_{k+i} &= Cx_{k+i} + Du_{k+i} + v_{k+i}\\
&= CAx_{k+i-1} + CBu_{k+i-1} + Cw_{k+i-1} + Du_{k+i} + v_{k+i}\\
&= CA^ix_k + CA^{i-1}Bu_k + \cdots + CBu_{k+i-1} + Du_{k+i} + CA^{i-1}w_k + \cdots + Cw_{k+i-1} + v_{k+i}
\end{aligned}$$
7-27
– Define the stacked vectors

$$Y_k^r = \begin{bmatrix}y_k\\ y_{k+1}\\ \vdots\\ y_{k+r-1}\end{bmatrix},\qquad
U_k^r = \begin{bmatrix}u_k\\ u_{k+1}\\ \vdots\\ u_{k+r-1}\end{bmatrix}$$

– Then $Y_k^r = O_rx_k + S_rU_k^r + V_k$, with the block-Toeplitz matrix

$$S_r = \begin{bmatrix}D & 0 & \cdots & 0\\ CB & D & \cdots & 0\\ \vdots & & \ddots & \\ CA^{r-2}B & CA^{r-3}B & \cdots & D\end{bmatrix}$$

– Introduce

$$\mathcal{Y} = [\,Y_1^r\ \cdots\ Y_N^r\,],\quad
\mathcal{X} = [\,x_1\ \cdots\ x_N\,],\quad
\mathcal{U} = [\,U_1^r\ \cdots\ U_N^r\,],\quad
\mathcal{V} = [\,V_1\ \cdots\ V_N\,]$$

– Then $\mathcal{Y} = O_r\mathcal{X} + S_r\mathcal{U} + \mathcal{V}$.
– To remove the U-term, use the projection orthogonal to $\mathcal{U}$:

$$\Pi_{\mathcal{U}^T}^{\perp} = I - \mathcal{U}^T(\mathcal{U}\mathcal{U}^T)^{-1}\mathcal{U},\qquad
\mathcal{Y}\Pi_{\mathcal{U}^T}^{\perp} = O_r\mathcal{X}\Pi_{\mathcal{U}^T}^{\perp} + \mathcal{V}\Pi_{\mathcal{U}^T}^{\perp}
\quad\big(\text{since }\mathcal{U}\Pi_{\mathcal{U}^T}^{\perp} = \mathcal{U} - \mathcal{U}\mathcal{U}^T(\mathcal{U}\mathcal{U}^T)^{-1}\mathcal{U} = 0\big)$$

– Choose a matrix Φ/N so that the effect of the noise vanishes:

$$G = \frac1N\,\mathcal{Y}\Pi_{\mathcal{U}^T}^{\perp}\Phi^T
= O_r\,\frac1N\,\mathcal{X}\Pi_{\mathcal{U}^T}^{\perp}\Phi^T + \frac1N\,\mathcal{V}\Pi_{\mathcal{U}^T}^{\perp}\Phi^T$$

$$\lim_{N\to\infty}\frac1N\,\mathcal{V}\Pi_{\mathcal{U}^T}^{\perp}\Phi^T = 0,\qquad
\lim_{N\to\infty}\frac1N\,\mathcal{X}\Pi_{\mathcal{U}^T}^{\perp}\Phi^T = T\ \ (T\ \text{has full rank}\ n)$$
7-28
• Finding good instruments
  – Let $\Phi = [\,\varphi_1^s\ \cdots\ \varphi_N^s\,]$. Then

$$\frac1N\,\mathcal{V}\Pi_{\mathcal{U}^T}^{\perp}\Phi^T
= \frac1N\sum_{k=1}^{N}V_k(\varphi_k^s)^T
- \Big(\frac1N\sum_{k=1}^{N}V_k(U_k^r)^T\Big)\Big(\frac1N\sum_{k=1}^{N}U_k^r(U_k^r)^T\Big)^{-1}\Big(\frac1N\sum_{k=1}^{N}U_k^r(\varphi_k^s)^T\Big)$$

  – From the law of large numbers,

$$\lim_{N\to\infty}\frac1N\,\mathcal{V}\Pi_{\mathcal{U}^T}^{\perp}\Phi^T
= E\{V_k(\varphi_k^s)^T\} - E\{V_k(U_k^r)^T\}\,R_u^{-1}\,E\{U_k^r(\varphi_k^s)^T\},\qquad
R_u = E\{U_k^r(U_k^r)^T\}$$

which is zero if V and U are independent and the instruments are uncorrelated with V_k.
  – Thus, choose the $\varphi_k^s$ so that they are uncorrelated with $V_k$. A typical choice is past inputs and outputs:

$$\varphi_k^s = \big(y_{k-1}\ \cdots\ y_{k-s}\ \ u_{k-1}\ \cdots\ u_{k-s}\big)^T$$
7-29
• Finding the states and estimating the noise statistics
  – r-step-ahead predictor:

$$Y_k^r = \Theta\varphi_k^s + \Gamma U_k^r + E_k\quad\Rightarrow\quad \mathcal{Y} = \Theta\Phi + \Gamma\,\mathcal{U} + \mathcal{E}$$

  – Least-squares estimate of the parameters:

$$[\hat\Theta\ \ \hat\Gamma] = \mathcal{Y}\begin{bmatrix}\Phi\\ \mathcal{U}\end{bmatrix}^T
\left(\begin{bmatrix}\Phi\\ \mathcal{U}\end{bmatrix}\begin{bmatrix}\Phi\\ \mathcal{U}\end{bmatrix}^T\right)^{-1}
\quad\Rightarrow\quad
\hat\Theta = \mathcal{Y}\Pi_{\mathcal{U}^T}^{\perp}\Phi^T\big(\Phi\Pi_{\mathcal{U}^T}^{\perp}\Phi^T\big)^{-1}$$

  – Predicted output:

$$\hat{\mathcal{Y}} = \hat\Theta\Phi = \mathcal{Y}\Pi_{\mathcal{U}^T}^{\perp}\Phi^T\big(\Phi\Pi_{\mathcal{U}^T}^{\perp}\Phi^T\big)^{-1}\Phi$$

  – SVD, deleting the small singular values:

$$\hat{\mathcal{Y}} \approx U_1S_1V_1^T = \hat O_r\hat{\mathcal{X}},\qquad
\hat O_r = U_1R,\quad \hat{\mathcal{X}} = [\hat x_1\ \cdots\ \hat x_N] = R^{-1}S_1V_1^T$$

  – Alternatively,

$$\hat{\mathcal{X}} = R^{-1}U_1^T\hat{\mathcal{Y}} = R^{-1}S_1V_1^T\qquad(U_1^TU_1 = I)$$

  – The noise characteristics can then be calculated from

$$\hat w_k = \hat x_{k+1} - \hat A\hat x_k - \hat Bu_k,\qquad \hat v_k = y_k - \hat C\hat x_k - \hat Du_k$$
7-30
• Subspace identification algorithm
  1. From the input-output data, form

$$G = \frac1N\,\mathcal{Y}\Pi_{\mathcal{U}^T}^{\perp}\Phi^T$$

  2. Select weighting matrices W₁ and W₂ and perform the SVD

$$G_1 = W_1GW_2 = USV^T \approx U_1S_1V_1^T$$

    • MOESP: $W_1 = I,\quad W_2 = \big(\tfrac1N\Phi\Pi_{\mathcal{U}^T}^{\perp}\Phi^T\big)^{-1}\Phi\Pi_{\mathcal{U}^T}^{\perp}$
    • N4SID: $W_1 = I,\quad W_2 = \big(\tfrac1N\Phi\Pi_{\mathcal{U}^T}^{\perp}\Phi^T\big)^{-1}\Phi$
    • IVM: $W_1 = \big(\tfrac1N\mathcal{Y}\Pi_{\mathcal{U}^T}^{\perp}\mathcal{Y}^T\big)^{-1/2},\quad W_2 = \big(\tfrac1N\Phi\Phi^T\big)^{-1/2}$
    • CVA: $W_1 = \big(\tfrac1N\mathcal{Y}\Pi_{\mathcal{U}^T}^{\perp}\mathcal{Y}^T\big)^{-1/2},\quad W_2 = \big(\tfrac1N\Phi\Pi_{\mathcal{U}^T}^{\perp}\Phi^T\big)^{-1/2}$
  3. Select a full-rank matrix R and define $\hat O_r = W_1^{-1}U_1R$. Then solve for C and A from

$$C = \hat O_r(1\!:\!p,\,1\!:\!n),\qquad \hat O_r(p+1\!:\!pr,\,1\!:\!n) = \hat O_r(1\!:\!p(r-1),\,1\!:\!n)\,A$$

(Typical choices are R = I, R = S₁, or R = S₁^{1/2}.)
  4. Estimate B, D, and x₀ from the linear regression problem

$$\min_{B,D,x_0}\ \frac1N\sum_{k=1}^{N}\big\|y_k - C(qI-\hat A)^{-1}Bu_k - Du_k - C(qI-\hat A)^{-1}x_0\delta_k\big\|^2$$

  5. If a noise model is sought, calculate $\hat{\mathcal{X}}$ and estimate the noise contributions.
7-31
Nonlinear System Identification

• General nonlinear systems

$$\dot x(t) = f(x(t)) + g(x(t),u(t)) + v(t),\qquad y(t) = h(x(t),u(t))$$

• Discrete-time nonlinear models

$$x_{k+1} = f(x_k,u_k) + v_k,\qquad y_k = h(x_k,u_k) + w_k$$

  – Hammerstein models (static input nonlinearity followed by linear dynamics):

$$y_k = \frac{B(z^{-1})}{A(z^{-1})}\,F(u_k)$$

  – Wiener models (linear dynamics followed by a static output nonlinearity):

$$y_k = F\!\left(\frac{B(z^{-1})}{A(z^{-1})}\,u_k\right)$$
7-32
Wiener Models
– Nonlinear aspects are approximated by Laguerre and Hermite series expansions.
– Laguerre operators (continuous and discrete forms):

$$L_i(s) = \frac{1}{1+\tau s}\left(\frac{1-\tau s}{1+\tau s}\right)^i,\qquad
L_i(z^{-1}) = \frac{\sqrt{1-a^2}}{1-az^{-1}}\left(\frac{z^{-1}-a}{1-az^{-1}}\right)^i$$

– Hermite polynomials:

$$H_i(x) = (-1)^i\,e^{x^2}\frac{d^i}{dx^i}e^{-x^2}$$

– The dynamics are approximated by Laguerre filters and the static nonlinearity by the Hermite polynomials:

$$\hat y_k = \sum_{i_1=0}^{\infty}\sum_{i_2=0}^{\infty}\cdots\sum_{i_n=0}^{\infty}
c_{i_1i_2\cdots i_n}\,H_{i_1}\!\big(x_k^{(1)}\big)H_{i_2}\!\big(x_k^{(2)}\big)\cdots H_{i_n}\!\big(x_k^{(n)}\big),
\qquad x_k^{(i)} = L_i(z^{-1})\,u_k$$

$$c_{i_1i_2\cdots i_n} = \frac1N\sum_{k=1}^{N} y_k\,H_{i_1}\!\big(x_k^{(1)}\big)H_{i_2}\!\big(x_k^{(2)}\big)\cdots H_{i_n}\!\big(x_k^{(n)}\big)$$
7-33
Volterra-Wiener Models
• Volterra series expansion

$$y_k = h^0 + \sum_{i_1=0}^{L-1} h^1_{i_1}u_{k-i_1}
+ \sum_{i_1=0}^{L-1}\sum_{i_2=0}^{L-1} h^2_{i_1i_2}u_{k-i_1}u_{k-i_2}
+ \sum_{i_1=0}^{L-1}\sum_{i_2=0}^{L-1}\sum_{i_3=0}^{L-1} h^3_{i_1i_2i_3}u_{k-i_1}u_{k-i_2}u_{k-i_3} + \cdots$$

(L plays the same role as the model horizon in MPC.)

• Volterra kernel
  – $h^n$ is an n-dimensional weighting function (multidimensional impulse response coefficients).

• Limitations of Volterra-Wiener models
  – Difficult to extend to systems with feedback.
  – Difficult to relate the estimated Volterra kernels to a priori information.
7-34
Power Series Expansions

• Example: time-domain identification of a nonlinear system
  – System: $x_{k+1} = -a_1x_k - b_{11}x_ku_k + b_{01}u_k$
  – Unknown parameters: $\theta = (a_1\ \ b_{11}\ \ b_{01})^T$
  – Collection of data:

$$\begin{bmatrix}x_1\\ \vdots\\ x_N\end{bmatrix}
= \begin{bmatrix}-x_0 & -x_0u_0 & u_0\\ \vdots & \vdots & \vdots\\ -x_{N-1} & -x_{N-1}u_{N-1} & u_{N-1}\end{bmatrix}
\begin{bmatrix}a_1\\ b_{11}\\ b_{01}\end{bmatrix}$$

  – Least-squares solution: $\hat\theta = (\Phi^T\Phi)^{-1}\Phi^TY$
  – Properties of this approach
    • Standard statistical validation tests are applicable without extensive modifications.
    • A frequency-domain approach is also possible.
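A sketch of this example: simulate the bilinear system (with assumed true parameter values), stack the regressor rows, and solve by ordinary least squares.

```python
import numpy as np

# Identify a1, b11, b01 in x_{k+1} = -a1*x_k - b11*x_k*u_k + b01*u_k
# from simulated, noise-free data (assumed true values below).
rng = np.random.default_rng(5)
a1, b11, b01 = 0.5, 0.2, 1.0
N = 300
u = rng.standard_normal(N)
x = np.zeros(N + 1)
for k in range(N):
    x[k + 1] = -a1 * x[k] - b11 * x[k] * u[k] + b01 * u[k]

Phi = np.column_stack([-x[:N], -x[:N] * u, u])   # rows (-x_k, -x_k*u_k, u_k)
Y = x[1:]
theta_hat, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
```

Because the model is linear in the parameters despite being nonlinear in the state, the noise-free fit recovers the coefficients exactly.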
7-35
• Continuous-time version
  – System: $\dot x = -a_1x - b_{11}xu + b_{01}u$
  – Unknown parameters: $\theta = (a_1\ \ b_{11}\ \ b_{01})^T$
  – Integrating the ODE over each sampling interval:

$$\int_{t_k}^{t_{k+1}}\dot x\,dt = x(t_{k+1}) - x(t_k)
= -a_1\int_{t_k}^{t_{k+1}}x\,dt - b_{11}\int_{t_k}^{t_{k+1}}xu\,dt + b_{01}\int_{t_k}^{t_{k+1}}u\,dt$$

  – Collection of data:

$$\begin{bmatrix}x(t_1)-x(t_0)\\ \vdots\\ x(t_N)-x(t_{N-1})\end{bmatrix}
= \begin{bmatrix}-\int_{t_0}^{t_1}x\,dt & -\int_{t_0}^{t_1}xu\,dt & \int_{t_0}^{t_1}u\,dt\\
\vdots & \vdots & \vdots\\
-\int_{t_{N-1}}^{t_N}x\,dt & -\int_{t_{N-1}}^{t_N}xu\,dt & \int_{t_{N-1}}^{t_N}u\,dt\end{bmatrix}
\begin{bmatrix}a_1\\ b_{11}\\ b_{01}\end{bmatrix}$$

  – Least-squares solution: $\hat\theta = (\Phi^T\Phi)^{-1}\Phi^TY$
  – Power-series expansion of a general nonlinear system:

$$\dot x(t) = -\sum_{i=1}^{k} a_i\,x^i(t) - \sum_{i}\sum_{j} b_{ij}\,x^i(t)u^j(t) + v(t)$$

    • Similarly, the parameters can be obtained from the least-squares method.