© Springer International Publishing AG 2017. C.K. Chui and G. Chen, Kalman Filtering, DOI 10.1007/978-3-319-47612-4

Answers and Hints to Exercises

Chapter 1

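1.1. Since most of the properties can be verified directly by using the definition of the trace, we only consider $\operatorname{tr} AB = \operatorname{tr} BA$. Indeed,

\[
\operatorname{tr} AB = \sum_{i=1}^{n}\Big(\sum_{j=1}^{m} a_{ij} b_{ji}\Big) = \sum_{j=1}^{m}\Big(\sum_{i=1}^{n} b_{ji} a_{ij}\Big) = \operatorname{tr} BA .
\]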
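As a quick numerical illustration of the identity in 1.1, the following sketch multiplies a 2 × 3 matrix by a 3 × 2 matrix in both orders and compares the traces. The matrices and the helper names `matmul` and `trace` are arbitrary choices made for this example, not part of the text.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

# Illustrative matrices (assumed for this example): A is 2 x 3, B is 3 x 2.
A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8], [9, 10], [11, 12]]

tr_AB = trace(matmul(A, B))   # trace of a 2 x 2 product
tr_BA = trace(matmul(B, A))   # trace of a 3 x 3 product
print(tr_AB, tr_BA)
```

The two products have different sizes, yet the traces agree, which is exactly the interchange of the two finite sums above.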

1.2.

\[
(\operatorname{tr} A)^2 = \Big(\sum_{i=1}^{n} a_{ii}\Big)^2 \le n\sum_{i=1}^{n} a_{ii}^2 \le n\,(\operatorname{tr} AA^{\top}) .
\]

1.3.

\[
A = \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}, \qquad B = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} .
\]

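1.4. There exist unitary matrices $P$ and $Q$ such that

\[
A = P\begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix}P^{\top}, \qquad
B = Q\begin{bmatrix} \mu_1 & & \\ & \ddots & \\ & & \mu_n \end{bmatrix}Q^{\top},
\]

and

\[
\sum_{k=1}^{n}\lambda_k^2 \ge \sum_{k=1}^{n}\mu_k^2 .
\]

Let $P = [p_{ij}]_{n\times n}$ and $Q = [q_{ij}]_{n\times n}$. Then

\[
\begin{aligned}
& p_{11}^2 + p_{21}^2 + \cdots + p_{n1}^2 = 1, \quad p_{12}^2 + p_{22}^2 + \cdots + p_{n2}^2 = 1, \quad \cdots, \quad p_{1n}^2 + p_{2n}^2 + \cdots + p_{nn}^2 = 1, \\
& q_{11}^2 + q_{21}^2 + \cdots + q_{n1}^2 = 1, \quad q_{12}^2 + q_{22}^2 + \cdots + q_{n2}^2 = 1, \quad \cdots, \quad q_{1n}^2 + q_{2n}^2 + \cdots + q_{nn}^2 = 1,
\end{aligned}
\]

and

\[
\begin{aligned}
\operatorname{tr} AA^{\top}
&= \operatorname{tr}\left\{ P\begin{bmatrix} \lambda_1^2 & & \\ & \ddots & \\ & & \lambda_n^2 \end{bmatrix}P^{\top} \right\} \\
&= \operatorname{tr}\begin{bmatrix}
p_{11}^2\lambda_1^2 + p_{12}^2\lambda_2^2 + \cdots + p_{1n}^2\lambda_n^2 & & * \\
& p_{21}^2\lambda_1^2 + p_{22}^2\lambda_2^2 + \cdots + p_{2n}^2\lambda_n^2 & \\
* & & p_{n1}^2\lambda_1^2 + p_{n2}^2\lambda_2^2 + \cdots + p_{nn}^2\lambda_n^2
\end{bmatrix} \\
&= (p_{11}^2 + p_{21}^2 + \cdots + p_{n1}^2)\lambda_1^2 + \cdots + (p_{1n}^2 + p_{2n}^2 + \cdots + p_{nn}^2)\lambda_n^2 \\
&= \lambda_1^2 + \lambda_2^2 + \cdots + \lambda_n^2 .
\end{aligned}
\]

Similarly, $\operatorname{tr} BB^{\top} = \mu_1^2 + \mu_2^2 + \cdots + \mu_n^2$. Hence, $\operatorname{tr} AA^{\top} \ge \operatorname{tr} BB^{\top}$.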

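1.5. Denote

\[
I = \int_{-\infty}^{\infty} e^{-y^2}\,dy .
\]

Then, using polar coordinates, we have

\[
\begin{aligned}
I^2 &= \Big(\int_{-\infty}^{\infty} e^{-y^2}\,dy\Big)\Big(\int_{-\infty}^{\infty} e^{-x^2}\,dx\Big) \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2+y^2)}\,dx\,dy \\
&= \int_{0}^{2\pi}\int_{0}^{\infty} e^{-r^2} r\,dr\,d\theta = \pi .
\end{aligned}
\]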

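1.6. Denote

\[
I(x) = \int_{-\infty}^{\infty} e^{-xy^2}\,dy .
\]

Then, by Exercise 1.5,

\[
I(x) = \frac{1}{\sqrt{x}}\int_{-\infty}^{\infty} e^{-(\sqrt{x}\,y)^2}\,d(\sqrt{x}\,y) = \sqrt{\pi/x} .
\]

Hence,

\[
\int_{-\infty}^{\infty} y^2 e^{-y^2}\,dy = -\frac{d}{dx} I(x)\Big|_{x=1} = -\frac{d}{dx}\big(\sqrt{\pi/x}\big)\Big|_{x=1} = \frac{1}{2}\sqrt{\pi} .
\]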
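The closed forms in 1.5 and 1.6 can be sanity-checked numerically. The sketch below uses a composite midpoint rule on the truncated interval $[-10, 10]$; the truncation, the number of subintervals, and the helper name `integrate` are assumptions made only for this illustration.

```python
import math

def integrate(f, a, b, n=20_000):
    """Composite midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# The integrands decay so quickly that truncating to [-10, 10] loses nothing visible.
I0 = integrate(lambda y: math.exp(-y * y), -10.0, 10.0)          # Exercise 1.5: sqrt(pi)
I2 = integrate(lambda y: y * y * math.exp(-y * y), -10.0, 10.0)  # Exercise 1.6: sqrt(pi)/2

print(I0, math.sqrt(math.pi))
print(I2, math.sqrt(math.pi) / 2)
```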

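1.7. (a) Let $P$ be a unitary matrix so that

\[
R = P^{\top}\operatorname{diag}[\lambda_1, \cdots, \lambda_n]P ,
\]

and define

\[
y = \frac{1}{\sqrt{2}}\operatorname{diag}[\sqrt{\lambda_1}, \cdots, \sqrt{\lambda_n}]\,P(x - \mu) .
\]

Then

\[
\begin{aligned}
E(X) &= \int_{-\infty}^{\infty} x f(x)\,dx \\
&= \int_{-\infty}^{\infty} \big(\mu + \sqrt{2}\,P^{-1}\operatorname{diag}[1/\sqrt{\lambda_1}, \cdots, 1/\sqrt{\lambda_n}]\,y\big) f(x)\,dx \\
&= \mu\int_{-\infty}^{\infty} f(x)\,dx + \text{Const.}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}
\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} e^{-y_1^2}\cdots e^{-y_n^2}\,dy_1\cdots dy_n \\
&= \mu\cdot 1 + 0 = \mu .
\end{aligned}
\]

(b) Using the same substitution, we have

\[
\begin{aligned}
\operatorname{Var}(X) &= \int_{-\infty}^{\infty} (x - \mu)(x - \mu)^{\top} f(x)\,dx \\
&= \int_{-\infty}^{\infty} 2R^{1/2} y y^{\top} R^{1/2} f(x)\,dx \\
&= \frac{2}{\pi^{n/2}}\,R^{1/2}\left\{ \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}
\begin{bmatrix} y_1^2 & \cdots & y_1 y_n \\ \vdots & & \vdots \\ y_n y_1 & \cdots & y_n^2 \end{bmatrix}
e^{-y_1^2}\cdots e^{-y_n^2}\,dy_1\cdots dy_n \right\} R^{1/2} \\
&= R^{1/2}\, I\, R^{1/2} = R .
\end{aligned}
\]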


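1.8. All the properties can be easily verified from the definitions.

1.9. We have already proved that if $X_1$ and $X_2$ are independent then $\operatorname{Cov}(X_1, X_2) = 0$. Suppose now that $R_{12} = \operatorname{Cov}(X_1, X_2) = 0$. Then $R_{21} = \operatorname{Cov}(X_2, X_1) = 0$ so that

\[
\begin{aligned}
f(X_1, X_2) &= \frac{1}{(2\pi)^{n/2}(\det R_{11}\det R_{22})^{1/2}}
\cdot e^{-\frac{1}{2}(X_1 - \mu_1)^{\top}R_{11}^{-1}(X_1 - \mu_1)}\, e^{-\frac{1}{2}(X_2 - \mu_2)^{\top}R_{22}^{-1}(X_2 - \mu_2)} \\
&= f_1(X_1)\cdot f_2(X_2) .
\end{aligned}
\]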

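Hence, $X_1$ and $X_2$ are independent.

1.10. Equation (1.35) can be verified by a direct computation. First, the following formula may be easily obtained:

\[
\begin{bmatrix} I & -R_{xy}R_{yy}^{-1} \\ 0 & I \end{bmatrix}
\begin{bmatrix} R_{xx} & R_{xy} \\ R_{yx} & R_{yy} \end{bmatrix}
\begin{bmatrix} I & 0 \\ -R_{yy}^{-1}R_{xy}^{\top} & I \end{bmatrix}
= \begin{bmatrix} R_{xx} - R_{xy}R_{yy}^{-1}R_{yx} & 0 \\ 0 & R_{yy} \end{bmatrix} .
\]

This yields, by taking determinants,

\[
\det\begin{bmatrix} R_{xx} & R_{xy} \\ R_{yx} & R_{yy} \end{bmatrix}
= \det\big[R_{xx} - R_{xy}R_{yy}^{-1}R_{yx}\big]\cdot\det R_{yy}
\]

and

\[
\begin{aligned}
&\left(\begin{bmatrix} x \\ y \end{bmatrix} - \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix}\right)^{\top}
\begin{bmatrix} R_{xx} & R_{xy} \\ R_{yx} & R_{yy} \end{bmatrix}^{-1}
\left(\begin{bmatrix} x \\ y \end{bmatrix} - \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix}\right) \\
&\qquad = (x - \tilde{\mu})^{\top}\big[R_{xx} - R_{xy}R_{yy}^{-1}R_{yx}\big]^{-1}(x - \tilde{\mu}) + (y - \mu_y)^{\top}R_{yy}^{-1}(y - \mu_y) ,
\end{aligned}
\]

where

\[
\tilde{\mu} = \mu_x + R_{xy}R_{yy}^{-1}(y - \mu_y) .
\]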

The remaining computational steps are straightforward.

1.11. Let $p_k = C_k^{\top}W_k z_k$ and $\sigma^2 = E\big[p_k^{\top}(C_k^{\top}W_kC_k)^{-1}p_k\big]$. Then it can be easily verified that

\[
F(y_k) = y_k^{\top}(C_k^{\top}W_kC_k)y_k - p_k^{\top}y_k - y_k^{\top}p_k + \sigma^2 .
\]

From

\[
\frac{dF(y_k)}{dy_k} = 2(C_k^{\top}W_kC_k)y_k - 2p_k = 0 ,
\]

and the assumption that the matrix $(C_k^{\top}W_kC_k)$ is nonsingular, we have

\[
\hat{y}_k = (C_k^{\top}W_kC_k)^{-1}p_k = (C_k^{\top}W_kC_k)^{-1}C_k^{\top}W_k z_k .
\]

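1.12.

\[
\begin{aligned}
E\hat{x}_k &= (C_k^{\top}R_k^{-1}C_k)^{-1}C_k^{\top}R_k^{-1}E(v_k - D_ku_k) \\
&= (C_k^{\top}R_k^{-1}C_k)^{-1}C_k^{\top}R_k^{-1}E(C_kx_k + \eta_k) \\
&= Ex_k .
\end{aligned}
\]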

Chapter 2

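2.1.

\[
\begin{aligned}
W_{k,k-1}^{-1} &= \operatorname{Var}(\bar{\varepsilon}_{k,k-1}) = E(\bar{\varepsilon}_{k,k-1}\bar{\varepsilon}_{k,k-1}^{\top}) \\
&= E(\bar{v}_{k-1} - H_{k,k-1}x_k)(\bar{v}_{k-1} - H_{k,k-1}x_k)^{\top} \\
&= \begin{bmatrix} R_0 & & \\ & \ddots & \\ & & R_{k-1} \end{bmatrix}
+ \operatorname{Var}\begin{bmatrix} C_0\sum_{i=1}^{k}\Phi_{0i}\Gamma_{i-1}\xi_{i-1} \\ \vdots \\ C_{k-1}\Phi_{k-1,k}\Gamma_{k-1}\xi_{k-1} \end{bmatrix} .
\end{aligned}
\]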

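2.2. For any nonzero vector $x$, we have $x^{\top}Ax > 0$ and $x^{\top}Bx \ge 0$ so that

\[
x^{\top}(A + B)x = x^{\top}Ax + x^{\top}Bx > 0 .
\]

Hence, $A + B$ is positive definite.

2.3.

\[
\begin{aligned}
W_{k,k-1}^{-1}
&= E(\bar{\varepsilon}_{k,k-1}\bar{\varepsilon}_{k,k-1}^{\top}) \\
&= E(\bar{\varepsilon}_{k-1,k-1} - H_{k,k-1}\Gamma_{k-1}\xi_{k-1})(\bar{\varepsilon}_{k-1,k-1} - H_{k,k-1}\Gamma_{k-1}\xi_{k-1})^{\top} \\
&= E(\bar{\varepsilon}_{k-1,k-1}\bar{\varepsilon}_{k-1,k-1}^{\top}) + H_{k,k-1}\Gamma_{k-1}E(\xi_{k-1}\xi_{k-1}^{\top})\Gamma_{k-1}^{\top}H_{k,k-1}^{\top} \\
&= W_{k-1,k-1}^{-1} + H_{k-1,k-1}\Phi_{k-1,k}\Gamma_{k-1}Q_{k-1}\Gamma_{k-1}^{\top}\Phi_{k-1,k}^{\top}H_{k-1,k-1}^{\top} .
\end{aligned}
\]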

2.4. Apply Lemma 1.2 to $A_{11} = W_{k-1,k-1}^{-1}$, $A_{22} = Q_{k-1}^{-1}$ and

\[
A_{12} = A_{21}^{\top} = H_{k-1,k-1}\Phi_{k-1,k}\Gamma_{k-1} .
\]


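2.5. Using Exercise 2.4 (or (2.9)), we have

\[
\begin{aligned}
H_{k,k-1}^{\top}W_{k,k-1}
&= \Phi_{k-1,k}^{\top}H_{k-1,k-1}^{\top}W_{k-1,k-1} \\
&\quad - \Phi_{k-1,k}^{\top}H_{k-1,k-1}^{\top}W_{k-1,k-1}H_{k-1,k-1}\Phi_{k-1,k}\Gamma_{k-1} \\
&\qquad \cdot\,(Q_{k-1}^{-1} + \Gamma_{k-1}^{\top}\Phi_{k-1,k}^{\top}H_{k-1,k-1}^{\top}W_{k-1,k-1}H_{k-1,k-1}\Phi_{k-1,k}\Gamma_{k-1})^{-1} \\
&\qquad \cdot\,\Gamma_{k-1}^{\top}\Phi_{k-1,k}^{\top}H_{k-1,k-1}^{\top}W_{k-1,k-1} \\
&= \Phi_{k-1,k}^{\top}\{I - H_{k-1,k-1}^{\top}W_{k-1,k-1}H_{k-1,k-1}\Phi_{k-1,k}\Gamma_{k-1} \\
&\qquad \cdot\,(Q_{k-1}^{-1} + \Gamma_{k-1}^{\top}\Phi_{k-1,k}^{\top}H_{k-1,k-1}^{\top}W_{k-1,k-1}H_{k-1,k-1}\Phi_{k-1,k}\Gamma_{k-1})^{-1} \\
&\qquad \cdot\,\Gamma_{k-1}^{\top}\Phi_{k-1,k}^{\top}\}H_{k-1,k-1}^{\top}W_{k-1,k-1} .
\end{aligned}
\]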

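2.6. Using Exercise 2.5 (or (2.10)) and the identity $H_{k,k-1} = H_{k-1,k-1}\Phi_{k-1,k}$, we have

\[
\begin{aligned}
&(H_{k,k-1}^{\top}W_{k,k-1}H_{k,k-1})\Phi_{k,k-1}\cdot(H_{k-1,k-1}^{\top}W_{k-1,k-1}H_{k-1,k-1})^{-1}H_{k-1,k-1}^{\top}W_{k-1,k-1} \\
&\quad = \Phi_{k-1,k}^{\top}\{I - H_{k-1,k-1}^{\top}W_{k-1,k-1}H_{k-1,k-1}\Phi_{k-1,k}\Gamma_{k-1} \\
&\qquad \cdot\,(Q_{k-1}^{-1} + \Gamma_{k-1}^{\top}\Phi_{k-1,k}^{\top}H_{k-1,k-1}^{\top}W_{k-1,k-1}H_{k-1,k-1}\Phi_{k-1,k}\Gamma_{k-1})^{-1} \\
&\qquad \cdot\,\Gamma_{k-1}^{\top}\Phi_{k-1,k}^{\top}\}H_{k-1,k-1}^{\top}W_{k-1,k-1} \\
&\quad = H_{k,k-1}^{\top}W_{k,k-1} .
\end{aligned}
\]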

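2.7.

\[
\begin{aligned}
&P_{k,k-1}C_k^{\top}(C_kP_{k,k-1}C_k^{\top} + R_k)^{-1} \\
&= P_{k,k-1}C_k^{\top}\big(R_k^{-1} - R_k^{-1}C_k(P_{k,k-1}^{-1} + C_k^{\top}R_k^{-1}C_k)^{-1}C_k^{\top}R_k^{-1}\big) \\
&= \big(P_{k,k-1} - P_{k,k-1}C_k^{\top}R_k^{-1}C_k(P_{k,k-1}^{-1} + C_k^{\top}R_k^{-1}C_k)^{-1}\big)C_k^{\top}R_k^{-1} \\
&= \big(P_{k,k-1} - P_{k,k-1}C_k^{\top}(C_kP_{k,k-1}C_k^{\top} + R_k)^{-1} \\
&\qquad \cdot\,(C_kP_{k,k-1}C_k^{\top} + R_k)R_k^{-1}C_k(P_{k,k-1}^{-1} + C_k^{\top}R_k^{-1}C_k)^{-1}\big)C_k^{\top}R_k^{-1} \\
&= \big(P_{k,k-1} - P_{k,k-1}C_k^{\top}(C_kP_{k,k-1}C_k^{\top} + R_k)^{-1} \\
&\qquad \cdot\,(C_kP_{k,k-1}C_k^{\top}R_k^{-1}C_k + C_k)(P_{k,k-1}^{-1} + C_k^{\top}R_k^{-1}C_k)^{-1}\big)C_k^{\top}R_k^{-1} \\
&= \big(P_{k,k-1} - P_{k,k-1}C_k^{\top}(C_kP_{k,k-1}C_k^{\top} + R_k)^{-1}C_kP_{k,k-1} \\
&\qquad \cdot\,(C_k^{\top}R_k^{-1}C_k + P_{k,k-1}^{-1})(P_{k,k-1}^{-1} + C_k^{\top}R_k^{-1}C_k)^{-1}\big)C_k^{\top}R_k^{-1} \\
&= \big(P_{k,k-1} - P_{k,k-1}C_k^{\top}(C_kP_{k,k-1}C_k^{\top} + R_k)^{-1}C_kP_{k,k-1}\big)C_k^{\top}R_k^{-1} \\
&= P_{k,k}C_k^{\top}R_k^{-1} \\
&= G_k .
\end{aligned}
\]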
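The chain of equalities in 2.7 says that the two expressions for the Kalman gain coincide. A minimal numerical sketch, assuming a two-dimensional state with a scalar measurement (so the inverse reduces to a division) and arbitrary illustrative values for $P_{k,k-1}$, $C_k$ and $R_k$:

```python
# Illustrative data (assumed): symmetric positive definite P_{k,k-1},
# row vector C_k, and scalar measurement-noise variance r = R_k.
P = [[2.0, 0.5], [0.5, 1.0]]
C = [1.0, 0.3]
r = 0.4

# P C^T and the scalar innovation variance C P C^T + r.
PCt = [sum(P[i][j] * C[j] for j in range(2)) for i in range(2)]
s = sum(C[i] * PCt[i] for i in range(2)) + r

# Left-hand side: G = P C^T (C P C^T + R)^{-1}.
G = [x / s for x in PCt]

# P_{k,k} = (I - G C) P = P - G (C P); here C P = (P C^T)^T because P is symmetric.
Pkk = [[P[i][j] - G[i] * PCt[j] for j in range(2)] for i in range(2)]

# Right-hand side: P_{k,k} C^T R^{-1}.
rhs = [sum(Pkk[i][j] * C[j] for j in range(2)) / r for i in range(2)]

print(G, rhs)
```

Both vectors agree up to rounding, as the derivation predicts.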

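2.8.

\[
\begin{aligned}
P_{k,k-1}
&= (H_{k,k-1}^{\top}W_{k,k-1}H_{k,k-1})^{-1} \\
&= \big(\Phi_{k-1,k}^{\top}(H_{k-1,k-1}^{\top}W_{k-1,k-1}H_{k-1,k-1} \\
&\qquad - H_{k-1,k-1}^{\top}W_{k-1,k-1}H_{k-1,k-1}\Phi_{k-1,k}\Gamma_{k-1} \\
&\qquad \cdot\,(Q_{k-1}^{-1} + \Gamma_{k-1}^{\top}\Phi_{k-1,k}^{\top}H_{k-1,k-1}^{\top}W_{k-1,k-1}H_{k-1,k-1}\Phi_{k-1,k}\Gamma_{k-1})^{-1} \\
&\qquad \cdot\,\Gamma_{k-1}^{\top}\Phi_{k-1,k}^{\top}H_{k-1,k-1}^{\top}W_{k-1,k-1}H_{k-1,k-1})\Phi_{k-1,k}\big)^{-1} \\
&= \big(\Phi_{k-1,k}^{\top}P_{k-1,k-1}^{-1}\Phi_{k-1,k} - \Phi_{k-1,k}^{\top}P_{k-1,k-1}^{-1}\Phi_{k-1,k}\Gamma_{k-1} \\
&\qquad \cdot\,(Q_{k-1}^{-1} + \Gamma_{k-1}^{\top}\Phi_{k-1,k}^{\top}P_{k-1,k-1}^{-1}\Phi_{k-1,k}\Gamma_{k-1})^{-1} \\
&\qquad \cdot\,\Gamma_{k-1}^{\top}\Phi_{k-1,k}^{\top}P_{k-1,k-1}^{-1}\Phi_{k-1,k}\big)^{-1} \\
&= (\Phi_{k-1,k}^{\top}P_{k-1,k-1}^{-1}\Phi_{k-1,k})^{-1} + \Gamma_{k-1}Q_{k-1}\Gamma_{k-1}^{\top} \\
&= A_{k-1}P_{k-1,k-1}A_{k-1}^{\top} + \Gamma_{k-1}Q_{k-1}\Gamma_{k-1}^{\top} .
\end{aligned}
\]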

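2.9.

\[
\begin{aligned}
&E(x_k - \hat{x}_{k|k-1})(x_k - \hat{x}_{k|k-1})^{\top} \\
&= E\big(x_k - (H_{k,k-1}^{\top}W_{k,k-1}H_{k,k-1})^{-1}H_{k,k-1}^{\top}W_{k,k-1}\bar{v}_{k-1}\big) \\
&\qquad \cdot\,\big(x_k - (H_{k,k-1}^{\top}W_{k,k-1}H_{k,k-1})^{-1}H_{k,k-1}^{\top}W_{k,k-1}\bar{v}_{k-1}\big)^{\top} \\
&= E\big(x_k - (H_{k,k-1}^{\top}W_{k,k-1}H_{k,k-1})^{-1}H_{k,k-1}^{\top}W_{k,k-1}(H_{k,k-1}x_k + \bar{\varepsilon}_{k,k-1})\big) \\
&\qquad \cdot\,\big(x_k - (H_{k,k-1}^{\top}W_{k,k-1}H_{k,k-1})^{-1}H_{k,k-1}^{\top}W_{k,k-1}(H_{k,k-1}x_k + \bar{\varepsilon}_{k,k-1})\big)^{\top} \\
&= (H_{k,k-1}^{\top}W_{k,k-1}H_{k,k-1})^{-1}H_{k,k-1}^{\top}W_{k,k-1}E(\bar{\varepsilon}_{k,k-1}\bar{\varepsilon}_{k,k-1}^{\top})W_{k,k-1} \\
&\qquad \cdot\,H_{k,k-1}(H_{k,k-1}^{\top}W_{k,k-1}H_{k,k-1})^{-1} \\
&= (H_{k,k-1}^{\top}W_{k,k-1}H_{k,k-1})^{-1} \\
&= P_{k,k-1} .
\end{aligned}
\]

The derivation of the second identity is similar.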


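2.10. Since

\[
\begin{aligned}
\sigma^2 = \operatorname{Var}(x_k) &= E(ax_{k-1} + \xi_{k-1})^2 \\
&= a^2\operatorname{Var}(x_{k-1}) + 2aE(x_{k-1}\xi_{k-1}) + E(\xi_{k-1}^2) \\
&= a^2\sigma^2 + \mu^2 ,
\end{aligned}
\]

we have

\[
\sigma^2 = \mu^2/(1 - a^2) .
\]

For $j = 1$, we have

\[
\begin{aligned}
E(x_kx_{k+1}) &= E(x_k(ax_k + \xi_k)) \\
&= a\operatorname{Var}(x_k) + E(x_k\xi_k) \\
&= a\sigma^2 .
\end{aligned}
\]

For $j = 2$, we have

\[
\begin{aligned}
E(x_kx_{k+2}) &= E(x_k(ax_{k+1} + \xi_{k+1})) \\
&= aE(x_kx_{k+1}) + E(x_k\xi_{k+1}) \\
&= aE(x_kx_{k+1}) \\
&= a^2\sigma^2 ,
\end{aligned}
\]

etc. If $j$ is negative, then a similar result can be obtained. By induction, we may conclude that $E(x_kx_{k+j}) = a^{|j|}\sigma^2$ for all integers $j$.

2.11. Using the Kalman filtering equations (2.17), we have

\[
\begin{aligned}
P_{0,0} &= \operatorname{Var}(x_0) = \mu^2 , \\
P_{k,k-1} &= P_{k-1,k-1} , \\
G_k &= P_{k,k-1}(P_{k,k-1} + R_k)^{-1} = \frac{P_{k-1,k-1}}{P_{k-1,k-1} + \sigma^2} ,
\end{aligned}
\]

and

\[
P_{k,k} = (1 - G_k)P_{k,k-1} = \frac{\sigma^2P_{k-1,k-1}}{\sigma^2 + P_{k-1,k-1}} .
\]

Observe that

\[
\begin{aligned}
P_{1,1} &= \frac{\sigma^2\mu^2}{\mu^2 + \sigma^2} , \\
P_{2,2} &= \frac{\sigma^2P_{1,1}}{P_{1,1} + \sigma^2} = \frac{\sigma^2\mu^2}{2\mu^2 + \sigma^2} , \\
&\;\;\vdots \\
P_{k,k} &= \frac{\sigma^2\mu^2}{k\mu^2 + \sigma^2} .
\end{aligned}
\]

Hence,

\[
G_k = \frac{P_{k-1,k-1}}{P_{k-1,k-1} + \sigma^2} = \frac{\mu^2}{k\mu^2 + \sigma^2}
\]

so that

\[
\begin{aligned}
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + G_k(v_k - \hat{x}_{k|k-1}) \\
&= \hat{x}_{k-1|k-1} + \frac{\mu^2}{\sigma^2 + k\mu^2}(v_k - \hat{x}_{k-1|k-1})
\end{aligned}
\]

with $\hat{x}_{0|0} = E(x_0) = 0$. It follows that

\[
\hat{x}_{k|k} \approx \hat{x}_{k-1|k-1}
\]

for large values of $k$.

2.12.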

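\[
\begin{aligned}
Q_N &= \frac{1}{N}\sum_{k=1}^{N}(v_kv_k^{\top}) \\
&= \frac{1}{N}(v_Nv_N^{\top}) + \frac{1}{N}\sum_{k=1}^{N-1}(v_kv_k^{\top}) \\
&= \frac{1}{N}(v_Nv_N^{\top}) + \frac{N-1}{N}Q_{N-1} \\
&= Q_{N-1} + \frac{1}{N}\big[(v_Nv_N^{\top}) - Q_{N-1}\big]
\end{aligned}
\]

with the initial estimation $Q_1 = v_1v_1^{\top}$.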
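The recursion in 2.12 updates the running average of the outer products one sample at a time, without storing past data. The sketch below compares it against the batch average; the sample vectors and the function names are made up for illustration.

```python
def outer(v):
    """Outer product v v^T of a vector stored as a list."""
    return [[a * b for b in v] for a in v]

def recursive_estimate(samples):
    """Q_N = Q_{N-1} + (v_N v_N^T - Q_{N-1}) / N, starting from Q_1 = v_1 v_1^T."""
    n = len(samples[0])
    Q = outer(samples[0])
    for N, v in enumerate(samples[1:], start=2):
        vv = outer(v)
        Q = [[Q[i][j] + (vv[i][j] - Q[i][j]) / N for j in range(n)] for i in range(n)]
    return Q

def batch_estimate(samples):
    """Direct average (1/N) * sum of v_k v_k^T."""
    n, N = len(samples[0]), len(samples)
    return [[sum(v[i] * v[j] for v in samples) / N for j in range(n)] for i in range(n)]

# Hypothetical data vectors v_1, ..., v_4.
samples = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0], [-2.0, 1.5]]
Q_rec = recursive_estimate(samples)
Q_bat = batch_estimate(samples)
print(Q_rec)
print(Q_bat)
```

The recursive form needs only the previous estimate and the newest sample, which is what makes it suitable for real-time use.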

2.13. Use superimposition.


2.14. Set $x_k = [(x_k^1)^{\top} \cdots (x_k^N)^{\top}]^{\top}$ for each $k$, $k = 0, 1, \cdots$, with $x_j = 0$ (and $u_j = 0$) for $j < 0$, and define

\[
\begin{aligned}
x_k^1 &= B_1x_{k-1}^1 + x_{k-1}^2 + (A_1 + B_1A_0)u_{k-1} , \\
&\;\;\vdots \\
x_k^M &= B_Mx_{k-1}^1 + x_{k-1}^{M+1} + (A_M + B_MA_0)u_{k-1} , \\
x_k^{M+1} &= B_{M+1}x_{k-1}^1 + x_{k-1}^{M+2} + B_{M+1}A_0u_{k-1} , \\
&\;\;\vdots \\
x_k^{N-1} &= B_{N-1}x_{k-1}^1 + x_{k-1}^N + B_{N-1}A_0u_{k-1} , \\
x_k^N &= B_Nx_{k-1}^1 + B_NA_0u_{k-1} .
\end{aligned}
\]

Then, substituting these equations into

\[
v_k = Cx_k + Du_k = x_k^1 + A_0u_k
\]

yields the required result. Since $x_j = 0$ and $u_j = 0$ for $j < 0$, it is also clear that $x_0 = 0$.

Chapter 3

3.1. Let $A = BB^{\top}$ where $B = [b_{ij}] \ne 0$. Then $\operatorname{tr} A = \operatorname{tr} BB^{\top} = \sum_{i,j} b_{ij}^2 > 0$.

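3.2. By Assumption 2.1, $\eta_\ell$ is independent of $x_0, \xi_0, \cdots, \xi_{j-1}, \eta_0, \cdots, \eta_{j-1}$, since $\ell \ge j$. On the other hand,

\[
\begin{aligned}
e_j &= C_j(x_j - \hat{y}_{j-1}) \\
&= C_j\Big(A_{j-1}x_{j-1} + \Gamma_{j-1}\xi_{j-1} - \sum_{i=0}^{j-1}\hat{P}_{j-1,i}(C_ix_i + \eta_i)\Big) \\
&= \cdots \\
&= B_0x_0 + \sum_{i=0}^{j-1}B_{1i}\xi_i + \sum_{i=0}^{j-1}B_{2i}\eta_i
\end{aligned}
\]

for some constant matrices $B_0$, $B_{1i}$ and $B_{2i}$. Hence, $\langle \eta_\ell, e_j\rangle = O_{q\times q}$ for all $\ell \ge j$.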

3.3. Combining (3.8) and (3.4), we have

\[
e_j = \|z_j\|_q^{-1}z_j = \|z_j\|_q^{-1}v_j - \sum_{i=0}^{j-1}\big(\|z_j\|_q^{-1}C_j\hat{P}_{j-1,i}\big)v_i ;
\]

that is, $e_j$ can be expressed in terms of $v_0, v_1, \cdots, v_j$. Conversely, we have

\[
\begin{aligned}
v_0 &= z_0 = \|z_0\|_q e_0 , \\
v_1 &= z_1 + C_1\hat{y}_0 = z_1 + C_1\hat{P}_{0,0}v_0 \\
&= \|z_1\|_q e_1 + C_1\hat{P}_{0,0}\|z_0\|_q e_0 , \\
&\;\;\vdots
\end{aligned}
\]

that is, $v_j$ can also be expressed in terms of $e_0, e_1, \cdots, e_j$. Hence, we have

\[
Y(e_0, \cdots, e_k) = Y(v_0, \cdots, v_k) .
\]

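3.4. By Exercise 3.3, we have

\[
v_i = \sum_{\ell=0}^{i}L_\ell e_\ell
\]

for some $q \times q$ constant matrices $L_\ell$, $\ell = 0, 1, \cdots, i$, so that

\[
\langle v_i, z_k\rangle = \sum_{\ell=0}^{i}L_\ell\langle e_\ell, e_k\rangle\|z_k\|_q^{\top} = O_{q\times q} ,
\]

$i = 0, 1, \cdots, k-1$. Hence, for $j = 0, 1, \cdots, k-1$,

\[
\langle \hat{y}_j, z_k\rangle = \Big\langle \sum_{i=0}^{j}\hat{P}_{j,i}v_i, z_k\Big\rangle = \sum_{i=0}^{j}\hat{P}_{j,i}\langle v_i, z_k\rangle = O_{n\times q} .
\]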

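3.5. Since

\[
\begin{aligned}
x_k &= A_{k-1}x_{k-1} + \Gamma_{k-1}\xi_{k-1} \\
&= A_{k-1}(A_{k-2}x_{k-2} + \Gamma_{k-2}\xi_{k-2}) + \Gamma_{k-1}\xi_{k-1} \\
&= \cdots \\
&= B_0x_0 + \sum_{i=0}^{k-1}B_{1i}\xi_i
\end{aligned}
\]

for some constant matrices $B_0$ and $B_{1i}$, and $\xi_k$ is independent of $x_0$ and $\xi_i$ $(0 \le i \le k-1)$, we have $\langle x_k, \xi_k\rangle = 0$. The rest can be shown in a similar manner.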


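3.6. Use superimposition.

3.7. Using the formula obtained in Exercise 3.6, we have

\[
\begin{cases}
\hat{d}_{k|k} = \hat{d}_{k-1|k-1} + hw_{k-1} + G_k(v_k - \Delta d_k - \hat{d}_{k-1|k-1} - hw_{k-1}) \\
\hat{d}_{0|0} = E(d_0) ,
\end{cases}
\]

where $G_k$ is obtained by using the standard algorithm (3.25) with $A_k = C_k = \Gamma_k = 1$.

3.8. Let

\[
x_k = \begin{bmatrix} x_k^1 \\ x_k^2 \\ x_k^3 \end{bmatrix}, \quad
x_k^1 = \begin{bmatrix} \Sigma_k \\ \dot{\Sigma}_k \\ \ddot{\Sigma}_k \end{bmatrix}, \quad
x_k^2 = \begin{bmatrix} \Delta A_k \\ \Delta\dot{A}_k \\ \Delta\ddot{A}_k \end{bmatrix}, \quad
x_k^3 = \begin{bmatrix} \Delta E_k \\ \Delta\dot{E}_k \\ \Delta\ddot{E}_k \end{bmatrix},
\]

\[
\xi_k = \begin{bmatrix} \xi_k^1 \\ \xi_k^2 \\ \xi_k^3 \end{bmatrix}, \quad
\eta_k = \begin{bmatrix} \eta_k^1 \\ \eta_k^2 \\ \eta_k^3 \end{bmatrix}, \quad
v_k = \begin{bmatrix} v_k^1 \\ v_k^2 \\ v_k^3 \end{bmatrix},
\]

\[
A = \begin{bmatrix} 1 & h & h^2/2 \\ 0 & 1 & h \\ 0 & 0 & 1 \end{bmatrix}, \quad \text{and} \quad C = [\,1 \;\; 0 \;\; 0\,] .
\]

Then the system described in Exercise 3.8 can be decomposed into three subsystems:

\[
\begin{cases}
x_{k+1}^i = Ax_k^i + \Gamma_k^i\xi_k^i \\
v_k^i = Cx_k^i + \eta_k^i ,
\end{cases}
\]

$i = 1, 2, 3$, where for each $k$, $x_k$ and $\xi_k$ are 3-vectors, $v_k$ and $\eta_k$ are scalars, $Q_k$ a $3 \times 3$ non-negative definite symmetric matrix, and $R_k > 0$ a scalar.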

Chapter 4

4.1. Using (4.6), we have

\[
\begin{aligned}
L(Ax + By, v)
&= E(Ax + By) + \langle Ax + By, v\rangle[\operatorname{Var}(v)]^{-1}(v - E(v)) \\
&= A\{E(x) + \langle x, v\rangle[\operatorname{Var}(v)]^{-1}(v - E(v))\} \\
&\quad + B\{E(y) + \langle y, v\rangle[\operatorname{Var}(v)]^{-1}(v - E(v))\} \\
&= AL(x, v) + BL(y, v) .
\end{aligned}
\]

4.2. Using (4.6) and the fact that $E(a) = a$ so that

\[
\langle a, v\rangle = E(a - E(a))(v - E(v)) = 0 ,
\]

we have

\[
L(a, v) = E(a) + \langle a, v\rangle[\operatorname{Var}(v)]^{-1}(v - E(v)) = a .
\]

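4.3. By definition, for a real-valued function $f$ and a matrix $A = [a_{ij}]$, $df/dA = [\partial f/\partial a_{ji}]$. Hence,

\[
\begin{aligned}
0 &= \frac{\partial}{\partial H}\big(\operatorname{tr}\|x - y\|_n^2\big) \\
&= \frac{\partial}{\partial H}E\big((x - E(x)) - H(v - E(v))\big)^{\top}\big((x - E(x)) - H(v - E(v))\big) \\
&= E\frac{\partial}{\partial H}\big((x - E(x)) - H(v - E(v))\big)^{\top}\big((x - E(x)) - H(v - E(v))\big) \\
&= E\big\{-2\big((x - E(x)) - H(v - E(v))\big)(v - E(v))^{\top}\big\} \\
&= 2\big(HE(v - E(v))(v - E(v))^{\top} - E(x - E(x))(v - E(v))^{\top}\big) \\
&= 2\big(H\|v\|_q^2 - \langle x, v\rangle\big) .
\end{aligned}
\]

This gives

\[
H^* = \langle x, v\rangle\big[\|v\|_q^2\big]^{-1}
\]

so that

\[
x^* = E(x) - \langle x, v\rangle\big[\|v\|_q^2\big]^{-1}(E(v) - v) .
\]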

4.4. Since $v^{k-2}$ is a linear combination (with constant matrix coefficients) of

\[
x_0, \xi_0, \cdots, \xi_{k-3}, \eta_0, \cdots, \eta_{k-2}
\]

which are all uncorrelated with $\xi_{k-1}$ and $\eta_{k-1}$, we have

\[
\langle \xi_{k-1}, v^{k-2}\rangle = 0 \quad \text{and} \quad \langle \eta_{k-1}, v^{k-2}\rangle = 0 .
\]

Similarly, we can verify the other formulas (where (4.6) may be used).

4.5. The first identity follows from the Kalman gain equation (cf. Theorem 4.1(c) or (4.19)), namely:

\[
G_k(C_kP_{k,k-1}C_k^{\top} + R_k) = P_{k,k-1}C_k^{\top} ,
\]

so that

\[
G_kR_k = P_{k,k-1}C_k^{\top} - G_kC_kP_{k,k-1}C_k^{\top} = (I - G_kC_k)P_{k,k-1}C_k^{\top} .
\]

To prove the second equality, we apply (4.18) and (4.17) to obtain

\[
\begin{aligned}
&\langle x_{k-1} - \hat{x}_{k-1|k-1},\; \Gamma_{k-1}\xi_{k-1} - K_{k-1}\eta_{k-1}\rangle \\
&= \big\langle x_{k-1} - \hat{x}_{k-1|k-2} - \langle x_{k-1}^{\#}, v_{k-1}^{\#}\rangle\big[\|v_{k-1}^{\#}\|^2\big]^{-1}v_{k-1}^{\#},\; \Gamma_{k-1}\xi_{k-1} - K_{k-1}\eta_{k-1}\big\rangle \\
&= \big\langle x_{k-1}^{\#} - \langle x_{k-1}^{\#}, v_{k-1}^{\#}\rangle\big[\|v_{k-1}^{\#}\|^2\big]^{-1}(C_{k-1}x_{k-1}^{\#} + \eta_{k-1}),\; \Gamma_{k-1}\xi_{k-1} - K_{k-1}\eta_{k-1}\big\rangle \\
&= -\langle x_{k-1}^{\#}, v_{k-1}^{\#}\rangle\big[\|v_{k-1}^{\#}\|^2\big]^{-1}(S_{k-1}^{\top}\Gamma_{k-1}^{\top} - R_{k-1}K_{k-1}^{\top}) \\
&= O_{n\times n} ,
\end{aligned}
\]

in which since $K_{k-1} = \Gamma_{k-1}S_{k-1}R_{k-1}^{-1}$, we have

\[
S_{k-1}^{\top}\Gamma_{k-1}^{\top} - R_{k-1}K_{k-1}^{\top} = O_{n\times n} .
\]

4.6. Follow the same procedure in the derivation of Theorem 4.1 with the term $v_k$ replaced by $v_k - D_ku_k$, and with

\[
\hat{x}_{k|k-1} = L(A_{k-1}x_{k-1} + B_{k-1}u_{k-1} + \Gamma_{k-1}\xi_{k-1}, v^{k-1})
\]

instead of

\[
\hat{x}_{k|k-1} = L(x_k, v^{k-1}) = L(A_{k-1}x_{k-1} + \Gamma_{k-1}\xi_{k-1}, v^{k-1}) .
\]

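4.7. Let

\[
\begin{aligned}
w_k &= -a_1v_{k-1} + b_1u_{k-1} + c_1e_{k-1} + w_{k-1} , \\
w_{k-1} &= -a_2v_{k-2} + b_2u_{k-2} + w_{k-2} , \\
w_{k-2} &= -a_3v_{k-3} ,
\end{aligned}
\]

and define $x_k = [\,w_k \;\; w_{k-1} \;\; w_{k-2}\,]^{\top}$. Then,

\[
\begin{cases}
x_{k+1} = Ax_k + Bu_k + \Gamma e_k \\
v_k = Cx_k + Du_k + \Delta e_k ,
\end{cases}
\]

where

\[
A = \begin{bmatrix} -a_1 & 1 & 0 \\ -a_2 & 0 & 1 \\ -a_3 & 0 & 0 \end{bmatrix}, \quad
B = \begin{bmatrix} b_1 - a_1b_0 \\ b_2 - a_2b_0 \\ -a_3b_0 \end{bmatrix}, \quad
\Gamma = \begin{bmatrix} c_1 - a_1c_0 \\ -a_2c_0 \\ -a_3c_0 \end{bmatrix},
\]

$C = [\,1 \;\; 0 \;\; 0\,]$, $D = [b_0]$ and $\Delta = [c_0]$.

4.8. Let

\[
\begin{aligned}
w_k &= -a_1v_{k-1} + b_1u_{k-1} + c_1e_{k-1} + w_{k-1} , \\
w_{k-1} &= -a_2v_{k-2} + b_2u_{k-2} + c_2e_{k-2} + w_{k-2} , \\
&\;\;\vdots \\
w_{k-n+1} &= -a_nv_{k-n} + b_nu_{k-n} + c_ne_{k-n} ,
\end{aligned}
\]

where $b_j = 0$ for $j > m$ and $c_j = 0$ for $j > \ell$, and define

\[
x_k = [\,w_k \;\; w_{k-1} \;\; \cdots \;\; w_{k-n+1}\,]^{\top} .
\]

Then

\[
\begin{cases}
x_{k+1} = Ax_k + Bu_k + \Gamma e_k \\
v_k = Cx_k + Du_k + \Delta e_k ,
\end{cases}
\]

where

\[
A = \begin{bmatrix}
-a_1 & 1 & 0 & \cdots & 0 \\
-a_2 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \ddots & \vdots \\
-a_{n-1} & 0 & 0 & \cdots & 1 \\
-a_n & 0 & 0 & \cdots & 0
\end{bmatrix},
\]

\[
B = \begin{bmatrix} b_1 - a_1b_0 \\ \vdots \\ b_m - a_mb_0 \\ -a_{m+1}b_0 \\ \vdots \\ -a_nb_0 \end{bmatrix}, \quad
\Gamma = \begin{bmatrix} c_1 - a_1c_0 \\ \vdots \\ c_\ell - a_\ell c_0 \\ -a_{\ell+1}c_0 \\ \vdots \\ -a_nc_0 \end{bmatrix},
\]

$C = [\,1 \;\; 0 \;\; \cdots \;\; 0\,]$, $D = [b_0]$, and $\Delta = [c_0]$.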
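The realization in 4.7 and 4.8 can be checked by simulation. The sketch below assumes the underlying input-output model has the form $v_k + a_1v_{k-1} + \cdots + a_nv_{k-n} = b_0u_k + \cdots + b_nu_{k-n} + c_0e_k + \cdots + c_ne_{k-n}$ with zero history (this form is inferred from the definitions of $w_k$ above, and the coefficient values and input sequences are invented for illustration). It runs both the difference equation and the state-space model and compares the outputs.

```python
import math

def simulate_difference(a, b, c, u, e):
    """Direct recursion; a = [a1..an], b = [b0..bn], c = [c0..cn], zero history."""
    n = len(a)
    v = []
    for k in range(len(u)):
        acc = 0.0
        for i in range(1, n + 1):
            if k - i >= 0:
                acc -= a[i - 1] * v[k - i]
        for i in range(n + 1):
            if k - i >= 0:
                acc += b[i] * u[k - i] + c[i] * e[k - i]
        v.append(acc)
    return v

def simulate_state_space(a, b, c, u, e):
    """x_{k+1} = A x_k + B u_k + Gamma e_k, v_k = x_k[0] + b0 u_k + c0 e_k, x_0 = 0."""
    n = len(a)
    B = [b[i] - a[i - 1] * b[0] for i in range(1, n + 1)]
    Gam = [c[i] - a[i - 1] * c[0] for i in range(1, n + 1)]
    x = [0.0] * n
    v = []
    for k in range(len(u)):
        v.append(x[0] + b[0] * u[k] + c[0] * e[k])
        # Row i of A x is -a_i * x_1 + x_{i+1} (no shifted term in the last row).
        x = [-a[i] * x[0] + (x[i + 1] if i + 1 < n else 0.0)
             + B[i] * u[k] + Gam[i] * e[k] for i in range(n)]
    return v

# Hypothetical coefficients matching the shape of Exercise 4.7 (n = 3, m = 2, l = 1).
a = [0.5, -0.2, 0.1]
b = [1.0, 0.4, 0.3, 0.0]
c = [1.0, 0.3, 0.0, 0.0]
u = [math.sin(0.3 * k) for k in range(50)]
e = [0.1 * math.cos(1.7 * k) for k in range(50)]

v_direct = simulate_difference(a, b, c, u, e)
v_state = simulate_state_space(a, b, c, u, e)
max_gap = max(abs(p - q) for p, q in zip(v_direct, v_state))
print(max_gap)
```

The two output sequences agree to rounding error, which is consistent with the entries of $B$ and $\Gamma$ listed above.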


Chapter 5

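5.1. Since $v^k$ is a linear combination (with constant matrices as coefficients) of

\[
x_0, \eta_0, \gamma_0, \cdots, \gamma_k, \xi_0, \beta_0, \cdots, \beta_{k-1}
\]

which are all independent of $\beta_k$, we have

\[
\langle \beta_k, v^k\rangle = 0 .
\]

On the other hand, $\beta_k$ has zero-mean, so that by (4.6) we have

\[
L(\beta_k, v^k) = E(\beta_k) - \langle \beta_k, v^k\rangle\big[\|v^k\|^2\big]^{-1}(E(v^k) - v^k) = 0 .
\]

5.2. Using Lemma 4.2 with $v = v^{k-1}$, $v^1 = v^{k-2}$, $v^2 = v_{k-1}$ and

\[
v_{k-1}^{\#} = v_{k-1} - L(v_{k-1}, v^{k-2}) ,
\]

we have, for $x = v_{k-1}$,

\[
\begin{aligned}
L(v_{k-1}, v^{k-1})
&= L(v_{k-1}, v^{k-2}) + \langle v_{k-1}^{\#}, v_{k-1}^{\#}\rangle\big[\|v_{k-1}^{\#}\|^2\big]^{-1}v_{k-1}^{\#} \\
&= L(v_{k-1}, v^{k-2}) + v_{k-1} - L(v_{k-1}, v^{k-2}) \\
&= v_{k-1} .
\end{aligned}
\]

The equality $L(\gamma_k, v^{k-1}) = 0$ can be shown by imitating the proof in Exercise 5.1.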

5.3. It follows from Lemma 4.2 that

\[
\begin{aligned}
z_{k-1} - \hat{z}_{k-1}
&= z_{k-1} - L(z_{k-1}, v^{k-1}) \\
&= z_{k-1} - E(z_{k-1}) + \langle z_{k-1}, v^{k-1}\rangle\big[\|v^{k-1}\|^2\big]^{-1}(E(v^{k-1}) - v^{k-1}) \\
&= \begin{bmatrix} x_{k-1} \\ \xi_{k-1} \end{bmatrix} - \begin{bmatrix} E(x_{k-1}) \\ E(\xi_{k-1}) \end{bmatrix}
+ \begin{bmatrix} \langle x_{k-1}, v^{k-1}\rangle \\ \langle \xi_{k-1}, v^{k-1}\rangle \end{bmatrix}\big[\|v^{k-1}\|^2\big]^{-1}(E(v^{k-1}) - v^{k-1})
\end{aligned}
\]

whose first $n$-subvector and last $p$-subvector are, respectively, linear combinations (with constant matrices as coefficients) of

\[
x_0, \xi_0, \beta_0, \cdots, \beta_{k-2}, \eta_0, \gamma_0, \cdots, \gamma_{k-1},
\]

which are all independent of $\gamma_k$. Hence, we have

\[
B\langle z_{k-1} - \hat{z}_{k-1}, \gamma_k\rangle = 0 .
\]

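5.4. The proof is similar to that of Exercise 5.3.

5.5. For simplicity, denote

\[
B = [C_0\operatorname{Var}(x_0)C_0^{\top} + R_0]^{-1} .
\]

It follows from (5.16) that

\[
\begin{aligned}
&\operatorname{Var}(x_0 - \hat{x}_0) \\
&= \operatorname{Var}\big(x_0 - E(x_0) - [\operatorname{Var}(x_0)]C_0^{\top}[C_0\operatorname{Var}(x_0)C_0^{\top} + R_0]^{-1}(v_0 - C_0E(x_0))\big) \\
&= \operatorname{Var}\big(x_0 - E(x_0) - [\operatorname{Var}(x_0)]C_0^{\top}B(C_0(x_0 - E(x_0)) + \eta_0)\big) \\
&= \operatorname{Var}\big((I - [\operatorname{Var}(x_0)]C_0^{\top}BC_0)(x_0 - E(x_0)) - [\operatorname{Var}(x_0)]C_0^{\top}B\eta_0\big) \\
&= (I - [\operatorname{Var}(x_0)]C_0^{\top}BC_0)\operatorname{Var}(x_0)(I - C_0^{\top}BC_0[\operatorname{Var}(x_0)]) + [\operatorname{Var}(x_0)]C_0^{\top}BR_0BC_0[\operatorname{Var}(x_0)] \\
&= \operatorname{Var}(x_0) - [\operatorname{Var}(x_0)]C_0^{\top}BC_0[\operatorname{Var}(x_0)] - [\operatorname{Var}(x_0)]C_0^{\top}BC_0[\operatorname{Var}(x_0)] \\
&\quad + [\operatorname{Var}(x_0)]C_0^{\top}BC_0[\operatorname{Var}(x_0)]C_0^{\top}BC_0[\operatorname{Var}(x_0)] + [\operatorname{Var}(x_0)]C_0^{\top}BR_0BC_0[\operatorname{Var}(x_0)] \\
&= \operatorname{Var}(x_0) - [\operatorname{Var}(x_0)]C_0^{\top}BC_0[\operatorname{Var}(x_0)] - [\operatorname{Var}(x_0)]C_0^{\top}BC_0[\operatorname{Var}(x_0)] + [\operatorname{Var}(x_0)]C_0^{\top}BC_0[\operatorname{Var}(x_0)] \\
&= \operatorname{Var}(x_0) - [\operatorname{Var}(x_0)]C_0^{\top}BC_0[\operatorname{Var}(x_0)] .
\end{aligned}
\]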

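5.6. From $\hat{\xi}_0 = 0$, we have

\[
\hat{x}_1 = A_0\hat{x}_0 + G_1(v_1 - C_1A_0\hat{x}_0)
\]

and $\hat{\xi}_1 = 0$, so that

\[
\hat{x}_2 = A_1\hat{x}_1 + G_2(v_2 - C_2A_1\hat{x}_1) ,
\]

etc. In general, we have

\[
\begin{aligned}
\hat{x}_k &= A_{k-1}\hat{x}_{k-1} + G_k(v_k - C_kA_{k-1}\hat{x}_{k-1}) \\
&= \hat{x}_{k|k-1} + G_k(v_k - C_k\hat{x}_{k|k-1}) .
\end{aligned}
\]

Denote

\[
P_{0,0} = \big[[\operatorname{Var}(x_0)]^{-1} + C_0^{\top}R_0^{-1}C_0\big]^{-1}
\]

and

\[
P_{k,k-1} = A_{k-1}P_{k-1,k-1}A_{k-1}^{\top} + \Gamma_{k-1}Q_{k-1}\Gamma_{k-1}^{\top} .
\]

Then

\[
\begin{aligned}
G_1 &= \begin{bmatrix} A_0 & \Gamma_0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} P_{0,0} & 0 \\ 0 & Q_0 \end{bmatrix}
\begin{bmatrix} A_0^{\top}C_1^{\top} \\ \Gamma_0^{\top}C_1^{\top} \end{bmatrix} \\
&\quad \cdot \left( [\,C_1A_0 \;\; C_1\Gamma_0\,]
\begin{bmatrix} P_{0,0} & 0 \\ 0 & Q_0 \end{bmatrix}
\begin{bmatrix} A_0^{\top}C_1^{\top} \\ \Gamma_0^{\top}C_1^{\top} \end{bmatrix} + R_1 \right)^{-1} \\
&= \begin{bmatrix} P_{1,0}C_1^{\top}(C_1P_{1,0}C_1^{\top} + R_1)^{-1} \\ 0 \end{bmatrix},
\end{aligned}
\]

\[
\begin{aligned}
P_1 &= \left( \begin{bmatrix} A_0 & \Gamma_0 \\ 0 & 0 \end{bmatrix} - G_1[\,C_1A_0 \;\; C_1\Gamma_0\,] \right)
\begin{bmatrix} P_{0,0} & 0 \\ 0 & Q_0 \end{bmatrix}
\begin{bmatrix} A_0^{\top} & 0 \\ \Gamma_0^{\top} & 0 \end{bmatrix}
+ \begin{bmatrix} 0 & 0 \\ 0 & Q_1 \end{bmatrix} \\
&= \begin{bmatrix} [\,I_n - P_{1,0}C_1^{\top}(C_1P_{1,0}C_1^{\top} + R_1)^{-1}C_1\,]P_{1,0} & 0 \\ 0 & Q_1 \end{bmatrix},
\end{aligned}
\]

and, in general,

\[
G_k = \begin{bmatrix} P_{k,k-1}C_k^{\top}(C_kP_{k,k-1}C_k^{\top} + R_k)^{-1} \\ 0 \end{bmatrix},
\]

\[
P_k = \begin{bmatrix} [\,I_n - P_{k,k-1}C_k^{\top}(C_kP_{k,k-1}C_k^{\top} + R_k)^{-1}C_k\,]P_{k,k-1} & 0 \\ 0 & Q_k \end{bmatrix} .
\]

Finally, if we use the unbiased estimate $\hat{x}_0 = E(x_0)$ of $x_0$ instead of the somewhat superior initial state estimate

\[
\hat{x}_0 = E(x_0) - [\operatorname{Var}(x_0)]C_0^{\top}[C_0\operatorname{Var}(x_0)C_0^{\top} + R_0]^{-1}[C_0E(x_0) - v_0] ,
\]

and consequently set

\[
\begin{aligned}
P_0 &= E\left( \begin{bmatrix} x_0 \\ \xi_0 \end{bmatrix} - \begin{bmatrix} E(x_0) \\ E(\xi_0) \end{bmatrix} \right)
\left( \begin{bmatrix} x_0 \\ \xi_0 \end{bmatrix} - \begin{bmatrix} E(x_0) \\ E(\xi_0) \end{bmatrix} \right)^{\top} \\
&= \begin{bmatrix} \operatorname{Var}(x_0) & 0 \\ 0 & Q_0 \end{bmatrix},
\end{aligned}
\]

then we obtain the Kalman filtering algorithm derived in Chaps. 2 and 3.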


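5.7. Let

\[
\bar{P}_0 = \big[[\operatorname{Var}(x_0)]^{-1} + C_0^{\top}R_0^{-1}C_0\big]^{-1}
\]

and

\[
\bar{H}_{k-1} = [\,C_kA_{k-1} - N_{k-1}C_{k-1}\,] .
\]

Starting with (5.17b), namely:

\[
P_0 = \begin{bmatrix} ([\operatorname{Var}(x_0)]^{-1} + C_0^{\top}R_0^{-1}C_0)^{-1} & 0 \\ 0 & Q_0 \end{bmatrix}
= \begin{bmatrix} \bar{P}_0 & 0 \\ 0 & Q_0 \end{bmatrix},
\]

we have

\[
\begin{aligned}
G_1 &= \begin{bmatrix} A_0 & \Gamma_0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} \bar{P}_0 & 0 \\ 0 & Q_0 \end{bmatrix}
\begin{bmatrix} \bar{H}_0^{\top} \\ \Gamma_0^{\top}C_1^{\top} \end{bmatrix}
\cdot \left( [\,\bar{H}_0 \;\; C_1\Gamma_0\,]
\begin{bmatrix} \bar{P}_0 & 0 \\ 0 & Q_0 \end{bmatrix}
\begin{bmatrix} \bar{H}_0^{\top} \\ \Gamma_0^{\top}C_1^{\top} \end{bmatrix} + R_1 \right)^{-1} \\
&= \begin{bmatrix} (A_0\bar{P}_0\bar{H}_0^{\top} + \Gamma_0Q_0\Gamma_0^{\top}C_1^{\top})(\bar{H}_0\bar{P}_0\bar{H}_0^{\top} + C_1\Gamma_0Q_0\Gamma_0^{\top}C_1^{\top} + R_1)^{-1} \\ 0 \end{bmatrix}
:= \begin{bmatrix} \bar{G}_1 \\ 0 \end{bmatrix}
\end{aligned}
\]

and

\[
\begin{aligned}
P_1 &= \left( \begin{bmatrix} A_0 & \Gamma_0 \\ 0 & 0 \end{bmatrix} - \begin{bmatrix} \bar{G}_1 \\ 0 \end{bmatrix}[\,\bar{H}_0 \;\; C_1\Gamma_0\,] \right)
\begin{bmatrix} \bar{P}_0 & 0 \\ 0 & Q_0 \end{bmatrix}
\begin{bmatrix} A_0^{\top} & 0 \\ \Gamma_0^{\top} & 0 \end{bmatrix}
+ \begin{bmatrix} 0 & 0 \\ 0 & Q_1 \end{bmatrix} \\
&= \begin{bmatrix} (A_0 - \bar{G}_1\bar{H}_0)\bar{P}_0A_0^{\top} + (I - \bar{G}_1C_1)\Gamma_0Q_0\Gamma_0^{\top} & 0 \\ 0 & Q_1 \end{bmatrix}
:= \begin{bmatrix} \bar{P}_1 & 0 \\ 0 & Q_1 \end{bmatrix} .
\end{aligned}
\]

In general, we obtain

\[
\begin{cases}
\hat{x}_k = A_{k-1}\hat{x}_{k-1} + \bar{G}_k(v_k - N_{k-1}v_{k-1} - \bar{H}_{k-1}\hat{x}_{k-1}) \\
\hat{x}_0 = E(x_0) - [\operatorname{Var}(x_0)]C_0^{\top}[C_0\operatorname{Var}(x_0)C_0^{\top} + R_0]^{-1}[C_0E(x_0) - v_0] \\
\bar{H}_{k-1} = [\,C_kA_{k-1} - N_{k-1}C_{k-1}\,] \\
\bar{P}_k = (A_{k-1} - \bar{G}_k\bar{H}_{k-1})\bar{P}_{k-1}A_{k-1}^{\top} + (I - \bar{G}_kC_k)\Gamma_{k-1}Q_{k-1}\Gamma_{k-1}^{\top} \\
\bar{G}_k = (A_{k-1}\bar{P}_{k-1}\bar{H}_{k-1}^{\top} + \Gamma_{k-1}Q_{k-1}\Gamma_{k-1}^{\top}C_k^{\top}) \\
\qquad\quad \cdot\,(\bar{H}_{k-1}\bar{P}_{k-1}\bar{H}_{k-1}^{\top} + C_k\Gamma_{k-1}Q_{k-1}\Gamma_{k-1}^{\top}C_k^{\top} + R_{k-1})^{-1} \\
\bar{P}_0 = \big[[\operatorname{Var}(x_0)]^{-1} + C_0^{\top}R_0^{-1}C_0\big]^{-1} \\
k = 1, 2, \cdots .
\end{cases}
\]

By omitting the "bar" on $\bar{H}_k$, $\bar{G}_k$, and $\bar{P}_k$, we have (5.21).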

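5.8. (a)

\[
\begin{cases}
X_{k+1} = A_cX_k + \zeta_k \\
v_k = C_cX_k .
\end{cases}
\]

(b)

\[
\begin{aligned}
P_{0,0} &= \begin{bmatrix} \operatorname{Var}(x_0) & 0 & 0 \\ 0 & \operatorname{Var}(\xi_0) & 0 \\ 0 & 0 & \operatorname{Var}(\eta_0) \end{bmatrix}, \\
P_{k,k-1} &= A_cP_{k-1,k-1}A_c^{\top} + \begin{bmatrix} 0 & & \\ & Q_{k-1} & \\ & & r_{k-1} \end{bmatrix}, \\
G_k &= P_{k,k-1}C_c^{\top}\big(C_c^{\top}P_{k,k-1}C_c\big)^{-1}, \\
P_{k,k} &= (I - G_kC_c)P_{k,k-1} , \\
\hat{X}_0 &= \begin{bmatrix} E(x_0) \\ 0 \\ 0 \end{bmatrix}, \\
\hat{X}_k &= A_c\hat{X}_{k-1} + G_k(v_k - C_cA_c\hat{X}_{k-1}) .
\end{aligned}
\]

(c) The matrix $C_c^{\top}P_{k,k-1}C_c$ may not be invertible, and the extra estimates $\hat{\xi}_k$ and $\hat{\eta}_k$ in $\hat{X}_k$ are needed.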

Chapter 6

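6.1. Since

\[
x_{k-1} = Ax_{k-2} + \Gamma\xi_{k-2} = \cdots = A^nx_{k-n-1} + \text{noise}
\]

and

\[
\begin{aligned}
\tilde{x}_{k-1} &= A^n[N_{CA}^{\top}N_{CA}]^{-1}(C^{\top}v_{k-n-1} + A^{\top}C^{\top}v_{k-n} + \cdots + (A^{\top})^{n-1}C^{\top}v_{k-2}) \\
&= A^n[N_{CA}^{\top}N_{CA}]^{-1}(C^{\top}Cx_{k-n-1} + A^{\top}C^{\top}CAx_{k-n-1} \\
&\qquad + \cdots + (A^{\top})^{n-1}C^{\top}CA^{n-1}x_{k-n-1} + \text{noise}) \\
&= A^n[N_{CA}^{\top}N_{CA}]^{-1}[N_{CA}^{\top}N_{CA}]x_{k-n-1} + \text{noise} \\
&= A^nx_{k-n-1} + \text{noise} ,
\end{aligned}
\]

we have $E(\tilde{x}_{k-1}) = E(A^nx_{k-n-1}) = E(x_{k-1})$.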


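6.2. Since

\[
\frac{d}{ds}\big[A^{-1}(s)A(s)\big] = \frac{d}{ds}I = 0 ,
\]

we have

\[
A^{-1}(s)\Big[\frac{d}{ds}A(s)\Big] + \Big[\frac{d}{ds}A^{-1}(s)\Big]A(s) = 0 .
\]

Hence,

\[
\frac{d}{ds}A^{-1}(s) = -A^{-1}(s)\Big[\frac{d}{ds}A(s)\Big]A^{-1}(s) .
\]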

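6.3. Let $P = U\operatorname{diag}[\lambda_1, \cdots, \lambda_n]U^{-1}$. Then

\[
P - \lambda_{\min}I = U\operatorname{diag}[\lambda_1 - \lambda_{\min}, \cdots, \lambda_n - \lambda_{\min}]U^{-1} \ge 0 .
\]

6.4. Let $\lambda_1, \cdots, \lambda_n$ be the eigenvalues of $F$ and $J$ be its Jordan canonical form. Then there exists a nonsingular matrix $U$ such that

\[
U^{-1}FU = J = \begin{bmatrix}
\lambda_1 & * & & \\
& \lambda_2 & \ddots & \\
& & \ddots & * \\
& & & \lambda_n
\end{bmatrix}
\]

with each $*$ being 1 or 0. Hence,

\[
F^k = UJ^kU^{-1} = U\begin{bmatrix}
\lambda_1^k & * & \cdots & * \\
& \lambda_2^k & \ddots & \vdots \\
& & \ddots & * \\
& & & \lambda_n^k
\end{bmatrix}U^{-1},
\]

where each $*$ denotes a term whose magnitude is bounded by

\[
p(k)|\lambda_{\max}|^k
\]

with $p(k)$ being a polynomial of $k$ and $|\lambda_{\max}| = \max(|\lambda_1|, \cdots, |\lambda_n|)$. Since $|\lambda_{\max}| < 1$, $F^k \to 0$ as $k \to \infty$.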
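To see the polynomial factor $p(k)$ from 6.4 at work, the sketch below takes a single $2 \times 2$ Jordan block with eigenvalue 0.9 (an illustrative choice) and computes its powers by repeated multiplication. The off-diagonal entry equals $k\lambda^{k-1}$, which grows at first and then decays.

```python
def matmul(X, Y):
    """Product of two 2 x 2 matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def power(F, k):
    """F^k by repeated multiplication (k >= 1)."""
    result = F
    for _ in range(k - 1):
        result = matmul(result, F)
    return result

lam = 0.9
F = [[lam, 1.0], [0.0, lam]]   # Jordan block with |lambda| < 1

F5 = power(F, 5)       # transient: off-diagonal entry 5 * 0.9**4 exceeds 1
F200 = power(F, 200)   # long run: every entry is tiny

print(F5[0][1], F200[0][1])
```

The entry $k\lambda^{k-1}$ is bounded by $p(k)|\lambda_{\max}|^k$ with $p(k) = k/\lambda$, so the geometric decay eventually wins.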


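6.5. Since

\[
0 \le (A - B)(A - B)^{\top} = AA^{\top} - AB^{\top} - BA^{\top} + BB^{\top} ,
\]

we have

\[
AB^{\top} + BA^{\top} \le AA^{\top} + BB^{\top} .
\]

Hence,

\[
(A + B)(A + B)^{\top} = AA^{\top} + AB^{\top} + BA^{\top} + BB^{\top} \le 2(AA^{\top} + BB^{\top}) .
\]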

6.6. Since $x_{k-1} = Ax_{k-2} + \Gamma\xi_{k-2}$ is a linear combination (with constant matrices as coefficients) of $x_0, \xi_0, \cdots, \xi_{k-2}$ and

\[
\begin{aligned}
\vec{x}_{k-1} &= A\vec{x}_{k-2} + G(v_{k-1} - CA\vec{x}_{k-2}) \\
&= A\vec{x}_{k-2} + G(CAx_{k-2} + C\Gamma\xi_{k-2} + \eta_{k-1}) - GCA\vec{x}_{k-2}
\end{aligned}
\]

is an analogous linear combination of $x_0, \xi_0, \cdots, \xi_{k-2}$ and $\eta_{k-1}$, which are uncorrelated with $\xi_{k-1}$ and $\eta_k$, the two identities follow immediately.

6.7. Since

\[
\begin{aligned}
&P_{k,k-1}C_k^{\top}G_k^{\top} - G_kC_kP_{k,k-1}C_k^{\top}G_k^{\top} \\
&= G_kC_kP_{k,k-1}C_k^{\top}G_k^{\top} + G_kR_kG_k^{\top} - G_kC_kP_{k,k-1}C_k^{\top}G_k^{\top} \\
&= G_kR_kG_k^{\top} ,
\end{aligned}
\]

we have

\[
-(I - G_kC)P_{k,k-1}C^{\top}G_k^{\top} + G_kRG_k^{\top} = 0 .
\]

Hence,

\[
\begin{aligned}
P_{k,k} &= (I - G_kC)P_{k,k-1} \\
&= (I - G_kC)P_{k,k-1}(I - G_kC)^{\top} + G_kRG_k^{\top} \\
&= (I - G_kC)(AP_{k-1,k-1}A^{\top} + \Gamma Q\Gamma^{\top})(I - G_kC)^{\top} + G_kRG_k^{\top} \\
&= (I - G_kC)AP_{k-1,k-1}A^{\top}(I - G_kC)^{\top} + (I - G_kC)\Gamma Q\Gamma^{\top}(I - G_kC)^{\top} + G_kRG_k^{\top} .
\end{aligned}
\]

6.8. Imitating the proof of Lemma 6.8 and assuming that $|\lambda| \ge 1$, where $\lambda$ is an eigenvalue of $(I - GC)A$, we arrive at a contradiction to the controllability condition.

6.9. The proof is similar to that of Exercise 6.6.


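6.10. From

\[
0 \le \langle \varepsilon_j - \delta_j, \varepsilon_j - \delta_j\rangle
= \langle \varepsilon_j, \varepsilon_j\rangle - \langle \varepsilon_j, \delta_j\rangle - \langle \delta_j, \varepsilon_j\rangle + \langle \delta_j, \delta_j\rangle
\]

and Theorem 6.2, we have

\[
\begin{aligned}
\langle \varepsilon_j, \delta_j\rangle + \langle \delta_j, \varepsilon_j\rangle
&\le \langle \varepsilon_j, \varepsilon_j\rangle + \langle \delta_j, \delta_j\rangle \\
&= \langle \hat{x}_j - x_j + x_j - \vec{x}_j,\; \hat{x}_j - x_j + x_j - \vec{x}_j\rangle + \|x_j - \vec{x}_j\|_n^2 \\
&= \|x_j - \hat{x}_j\|_n^2 + \langle x_j - \vec{x}_j, \hat{x}_j - x_j\rangle + \langle \hat{x}_j - x_j, x_j - \vec{x}_j\rangle + 2\|x_j - \vec{x}_j\|_n^2 \\
&\le 2\|x_j - \hat{x}_j\|_n^2 + 3\|x_j - \vec{x}_j\|_n^2 \\
&\to 5(P^{-1} + C^{\top}R^{-1}C)^{-1}
\end{aligned}
\]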

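as $j \to \infty$. Hence, $B_j = \langle \varepsilon_j, \delta_j\rangle A^{\top}C^{\top}$ are componentwise uniformly bounded.

6.11. Using Lemmas 1.4, 1.6, 1.7 and 1.10 and Theorem 6.1, and applying Exercise 6.10, we have

\[
\begin{aligned}
&\operatorname{tr}\big[FB_{k-1-i}(G_{k-i} - G)^{\top} + (G_{k-i} - G)B_{k-1-i}^{\top}F^{\top}\big] \\
&\le \big(n\operatorname{tr} FB_{k-1-i}(G_{k-i} - G)^{\top}(G_{k-i} - G)B_{k-1-i}^{\top}F^{\top}\big)^{1/2} \\
&\quad + \big(n\operatorname{tr}(G_{k-i} - G)B_{k-1-i}^{\top}F^{\top}FB_{k-1-i}(G_{k-i} - G)^{\top}\big)^{1/2} \\
&\le \big(n\operatorname{tr} FF^{\top}\cdot\operatorname{tr} B_{k-1-i}B_{k-1-i}^{\top}\cdot\operatorname{tr}(G_{k-i} - G)^{\top}(G_{k-i} - G)\big)^{1/2} \\
&\quad + \big(n\operatorname{tr}(G_{k-i} - G)(G_{k-i} - G)^{\top}\cdot\operatorname{tr} B_{k-1-i}^{\top}B_{k-1-i}\cdot\operatorname{tr} F^{\top}F\big)^{1/2} \\
&= 2\big(n\operatorname{tr}(G_{k-i} - G)(G_{k-i} - G)^{\top}\cdot\operatorname{tr} B_{k-1-i}^{\top}B_{k-1-i}\cdot\operatorname{tr} F^{\top}F\big)^{1/2} \\
&\le C_1r_1^{k+1-i}
\end{aligned}
\]

for some real number $r_1$, $0 < r_1 < 1$, and some positive constant $C_1$ independent of $i$ and $k$.

6.12. First, solving the Riccati equation (6.6); that is,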

\[
c^2p^2 + [(1 - a^2)r - c^2\gamma^2q]p - rq\gamma^2 = 0 ,
\]

we obtain

\[
p = \frac{1}{2c^2}\Big\{c^2\gamma^2q + (a^2 - 1)r + \sqrt{[(1 - a^2)r - c^2\gamma^2q]^2 + 4c^2\gamma^2qr}\Big\} .
\]

Then, the Kalman gain is given by

\[
g = pc/(c^2p + r) .
\]
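The closed-form root in 6.12 can be compared with the fixed point of the scalar Riccati recursion. The sketch below assumes the recursion is the one-step prediction update $p \leftarrow a^2pr/(c^2p + r) + \gamma^2q$ (whose fixed points satisfy the quadratic above); the parameter values and function names are illustrative only.

```python
import math

def steady_state(a, c, gamma, q, r):
    """Positive root of c^2 p^2 + [(1 - a^2) r - c^2 gamma^2 q] p - r q gamma^2 = 0, and the gain."""
    b = (1 - a * a) * r - c * c * gamma * gamma * q
    p = (-b + math.sqrt(b * b + 4 * c * c * gamma * gamma * q * r)) / (2 * c * c)
    g = p * c / (c * c * p + r)
    return p, g

def iterate(a, c, gamma, q, r, p0=1.0, steps=500):
    """Scalar Riccati recursion (assumed form), iterated from an arbitrary start p0."""
    p = p0
    for _ in range(steps):
        p = a * a * p * r / (c * c * p + r) + gamma * gamma * q
    return p

# Hypothetical scalar system parameters.
a, c, gamma, q, r = 0.9, 1.0, 1.0, 0.5, 1.0
p_closed, g_closed = steady_state(a, c, gamma, q, r)
p_iter = iterate(a, c, gamma, q, r)
residual = c * c * p_closed ** 2 + ((1 - a * a) * r - c * c * gamma * gamma * q) * p_closed - r * q * gamma * gamma

print(p_closed, p_iter, g_closed)
```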



Chapter 7

7.1. The proof of Lemma 7.1 is constructive. Let $A = [a_{ij}]_{n\times n}$ and $A^c = [\ell_{ij}]_{n\times n}$. It follows from $A = A^c(A^c)^\top$ that

$$a_{ii} = \sum_{k=1}^{i} \ell_{ik}^2 , \quad i = 1, 2, \cdots, n,$$

and

$$a_{ij} = \sum_{k=1}^{j} \ell_{ik}\ell_{jk} , \quad j \ne i ;\ i, j = 1, 2, \cdots, n.$$

Hence, it can be easily verified that

$$\ell_{ii} = \Bigl(a_{ii} - \sum_{k=1}^{i-1} \ell_{ik}^2\Bigr)^{1/2} , \quad i = 1, 2, \cdots, n,$$

$$\ell_{ij} = \Bigl(a_{ij} - \sum_{k=1}^{j-1} \ell_{ik}\ell_{jk}\Bigr)\Big/\ell_{jj} , \quad j = 1, 2, \cdots, i - 1;\ i = 2, 3, \cdots, n,$$

and

$$\ell_{ij} = 0 , \quad j = i + 1, i + 2, \cdots, n;\ i = 1, 2, \cdots, n.$$

This gives the lower triangular matrix $A^c$. This algorithm is called the Cholesky decomposition. For the general case, we can use a (standard) singular value decomposition (SVD) algorithm to find an orthogonal matrix $U$ such that

$$U \operatorname{diag}[s_1, \cdots, s_r, 0, \cdots, 0]\, U^\top = AA^\top ,$$

where $1 \le r \le n$, $s_1, \cdots, s_r$ are singular values (which are positive numbers) of the non-negative definite and symmetric matrix $AA^\top$, and then set

$$A = U \operatorname{diag}[\sqrt{s_1}, \cdots, \sqrt{s_r}, 0, \cdots, 0] .$$
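The Cholesky recursion of 7.1 can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not code from the book: the function name `cholesky_lower` is made up here, matrices are plain lists of lists, and the test matrix is an assumption chosen as $LL^\top$ for the factor listed under 7.2 (a).

```python
import math


def cholesky_lower(a):
    """Return the lower triangular factor l with a = l l^T,
    computed row by row from the formulas for l_ij and l_ii."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Off-diagonal entries of row i: j = 0, ..., i-1.
        for j in range(i):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - s) / l[j][j]
        # Diagonal entry of row i.
        d = a[i][i] - sum(l[i][k] ** 2 for k in range(i))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[i][i] = math.sqrt(d)
    return l


# Assumed test matrix: the product L L^T for L = [[1,0,0],[2,2,0],[3,-2,1]].
A = [[1.0, 2.0, 3.0],
     [2.0, 8.0, 2.0],
     [3.0, 2.0, 14.0]]
L = cholesky_lower(A)
print(L)
```

The sketch raises an error when a pivot is not positive, so it covers only the positive definite case; the SVD route described above is what handles the general non-negative definite case.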

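7.2.

$$
\text{(a)}\quad L = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 2 & 0 \\ 3 & -2 & 1 \end{bmatrix} . \qquad
\text{(b)}\quad L = \begin{bmatrix} \sqrt{2} & 0 & 0 \\ \sqrt{2}/2 & \sqrt{2.5} & 0 \\ \sqrt{2}/2 & 1.5/\sqrt{2.5} & \sqrt{2.6} \end{bmatrix} .
$$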

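7.3. (a)

$$
L^{-1} = \begin{bmatrix}
1/\ell_{11} & 0 & 0 \\
-\ell_{21}/\ell_{11}\ell_{22} & 1/\ell_{22} & 0 \\
-\ell_{31}/\ell_{11}\ell_{33} + \ell_{32}\ell_{21}/\ell_{11}\ell_{22}\ell_{33} & -\ell_{32}/\ell_{22}\ell_{33} & 1/\ell_{33}
\end{bmatrix} .
$$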


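(b)

$$
L^{-1} = \begin{bmatrix}
b_{11} & 0 & 0 & \cdots & 0 \\
b_{21} & b_{22} & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & & 0 \\
b_{n1} & b_{n2} & b_{n3} & \cdots & b_{nn}
\end{bmatrix} ,
$$

where

$$
\begin{cases}
b_{ii} = \ell_{ii}^{-1} , & i = 1, 2, \cdots, n; \\[4pt]
b_{ij} = -\ell_{jj}^{-1} \displaystyle\sum_{k=j+1}^{i} b_{ik}\ell_{kj} , & j = i - 1, i - 2, \cdots, 1;\ i = 2, 3, \cdots, n.
\end{cases}
$$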
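The recursion in 7.3 (b) fills each row of $L^{-1}$ from the diagonal leftwards. A minimal Python sketch follows; the function name `inverse_lower_triangular` and the sample matrix are assumptions made for illustration only.

```python
def inverse_lower_triangular(l):
    """Invert a nonsingular lower triangular matrix using
    b_ii = 1 / l_ii and b_ij = -(1 / l_jj) * sum_{k=j+1}^{i} b_ik l_kj."""
    n = len(l)
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        b[i][i] = 1.0 / l[i][i]
        # Work leftwards from the diagonal: j = i-1, i-2, ..., 0.
        for j in range(i - 1, -1, -1):
            s = sum(b[i][k] * l[k][j] for k in range(j + 1, i + 1))
            b[i][j] = -s / l[j][j]
    return b


# Assumed sample matrix (the factor from 7.2 (a)).
L = [[1.0, 0.0, 0.0],
     [2.0, 2.0, 0.0],
     [3.0, -2.0, 1.0]]
B = inverse_lower_triangular(L)

# Product B L, which should be the identity matrix.
BL = [[sum(B[i][k] * L[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
print(B)
```

For this sample the entry `B[2][0]` also agrees with the explicit $3 \times 3$ formula in part (a), since $-\ell_{31}/\ell_{11}\ell_{33} + \ell_{32}\ell_{21}/\ell_{11}\ell_{22}\ell_{33} = -3 - 2 = -5$.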

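7.4. In the standard Kalman filtering process,

$$P_{k,k} \approx \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} ,$$

which is a singular matrix. However, its “square-root” is

$$P_{k,k}^{1/2} = \begin{bmatrix} \epsilon/\sqrt{1 - \epsilon^2} & 0 \\ 0 & 1 \end{bmatrix} \approx \begin{bmatrix} \epsilon & 0 \\ 0 & 1 \end{bmatrix} ,$$

which is a nonsingular matrix.

7.5. Analogous to Exercise 7.1, let $A = [a_{ij}]_{n\times n}$ and $A^u = [\ell_{ij}]_{n\times n}$. It follows from $A = A^u(A^u)^\top$ that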

$$a_{ii} = \sum_{k=i}^{n} \ell_{ik}^2 , \quad i = 1, 2, \cdots, n,$$

and

$$a_{ij} = \sum_{k=j}^{n} \ell_{ik}\ell_{jk} , \quad j \ne i ;\ i, j = 1, 2, \cdots, n.$$

Hence, it can be easily verified that

$$\ell_{ii} = \Bigl(a_{ii} - \sum_{k=i+1}^{n} \ell_{ik}^2\Bigr)^{1/2} , \quad i = 1, 2, \cdots, n,$$

$$\ell_{ij} = \Bigl(a_{ij} - \sum_{k=j+1}^{n} \ell_{ik}\ell_{jk}\Bigr)\Big/\ell_{jj} , \quad j = i + 1, \cdots, n;\ i = 1, 2, \cdots, n,$$

and

$$\ell_{ij} = 0 , \quad j = 1, 2, \cdots, i - 1;\ i = 2, 3, \cdots, n.$$

This gives the upper-triangular matrix $A^u$.
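The upper-triangular variant in 7.5 differs from 7.1 mainly in the order of evaluation: the entries must be filled starting from the last column. The sketch below makes that order explicit. It is an illustration under assumptions, with a hypothetical function name `cholesky_upper` and a test matrix constructed here as $UU^\top$ for a hand-picked $U$.

```python
import math


def cholesky_upper(a):
    """Return the upper triangular factor u with a = u u^T.
    Columns are processed from last to first, so every u[i][k] and
    u[j][k] with k > j is already available when column j is filled."""
    n = len(a)
    u = [[0.0] * n for _ in range(n)]
    for j in range(n - 1, -1, -1):
        d = a[j][j] - sum(u[j][k] ** 2 for k in range(j + 1, n))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        u[j][j] = math.sqrt(d)
        for i in range(j):
            s = sum(u[i][k] * u[j][k] for k in range(j + 1, n))
            u[i][j] = (a[i][j] - s) / u[j][j]
    return u


# Assumed test matrix: U U^T for U = [[1,2,3],[0,2,-2],[0,0,1]].
A = [[14.0, -2.0, 3.0],
     [-2.0, 8.0, -2.0],
     [3.0, -2.0, 1.0]]
U = cholesky_upper(A)
print(U)
```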


7.6. The new formulation is the same as that studied in this chapter except that every lower triangular matrix with superscript $c$ must be replaced by the corresponding upper triangular matrix with superscript $u$.

7.7. The new formulation is the same as that given in Sect. 7.3 except that every lower triangular matrix with superscript $c$ must be replaced by the corresponding upper triangular matrix with superscript $u$.

Chapter 8

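8.1. (a) Since $r^2 = x^2 + y^2$, we have

$$\dot{r} = \frac{x}{r}\dot{x} + \frac{y}{r}\dot{y} ,$$

so that $\dot{r} = v\sin\theta$ and

$$\ddot{r} = \dot{v}\sin\theta + v\dot{\theta}\cos\theta .$$

On the other hand, since $\tan\theta = y/x$, we have $\dot{\theta}\sec^2\theta = (x\dot{y} - \dot{x}y)/x^2$ or

$$\dot{\theta} = \frac{x\dot{y} - \dot{x}y}{x^2\sec^2\theta} = \frac{x\dot{y} - \dot{x}y}{r^2} = \frac{v}{r}\cos\theta ,$$

so that

$$\ddot{r} = a\sin\theta + \frac{v^2}{r}\cos^2\theta$$

and

$$
\begin{aligned}
\ddot{\theta} &= \Bigl(\frac{\dot{v}r - v\dot{r}}{r^2}\Bigr)\cos\theta - \frac{v}{r}\dot{\theta}\sin\theta \\
&= \Bigl(\frac{ar - v^2\sin\theta}{r^2}\Bigr)\cos\theta - \frac{v^2}{r^2}\sin\theta\cos\theta .
\end{aligned}
$$

(b)

$$
\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}) := \begin{bmatrix}
v\sin\theta \\
a\sin\theta + \dfrac{v^2}{r}\cos^2\theta \\
(ar - v^2\sin\theta)\cos\theta/r^2 - v^2\sin\theta\cos\theta/r^2
\end{bmatrix} .
$$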

(c)

$$
\mathbf{x}_{k+1} = \begin{bmatrix}
x_k[1] + hv\sin(x_k[3]) \\
x_k[2] + ha\sin(x_k[3]) + v^2\cos^2(x_k[3])/x_k[1] \\
v\cos(x_k[3])/x_k[1] \\
(ax_k[1] - v^2\sin(x_k[3]))\cos(x_k[3])/x_k[1]^2 - v^2\sin(x_k[3])\cos(x_k[3])/x_k[1]^2
\end{bmatrix} + \xi_k
$$


and

$$v_k = [\,1\ 0\ 0\ 0\,]\,\mathbf{x}_k + \eta_k ,$$

where $\mathbf{x}_k := [\,x_k[1]\ x_k[2]\ x_k[3]\ x_k[4]\,]^\top$.

(d) Use the formulas in (8.8).

8.2. The proof is straightforward.

8.3. The proof is straightforward. It can be verified that

$$\hat{x}_{k|k-1} = A_{k-1}\hat{x}_{k-1} + B_{k-1}u_{k-1} = f_{k-1}(\hat{x}_{k-1}) .$$

8.4. Taking the variances of both sides of the modified “observation equation”

$$v_0 - C_0(\theta)E(x_0) = C_0(\theta)x_0 - C_0(\theta)E(x_0) + \eta_0 ,$$

and using the estimate $(v_0 - C_0(\theta)E(x_0))(v_0 - C_0(\theta)E(x_0))^\top$ for $\operatorname{Var}(v_0 - C_0(\theta)E(x_0))$ on the left-hand side, we have

$$(v_0 - C_0(\theta)E(x_0))(v_0 - C_0(\theta)E(x_0))^\top = C_0(\theta)\operatorname{Var}(x_0)C_0(\theta)^\top + R_0 .$$

Hence, (8.13) follows immediately.

8.5. Since

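$$E(v_1) = C_1(\theta)A_0(\theta)E(x_0) ,$$

taking the variances of both sides of the modified “observation equation”

$$v_1 - C_1(\theta)A_0(\theta)E(x_0) = C_1(\theta)\bigl(A_0(\theta)x_0 - A_0(\theta)E(x_0) + \Gamma_0(\theta)\xi_0\bigr) + \eta_1 ,$$

and using the estimate $(v_1 - C_1(\theta)A_0(\theta)E(x_0))(v_1 - C_1(\theta)A_0(\theta)E(x_0))^\top$ for the variance $\operatorname{Var}(v_1 - C_1(\theta)A_0(\theta)E(x_0))$ on the left-hand side, we have

$$
\begin{aligned}
&(v_1 - C_1(\theta)A_0(\theta)E(x_0))(v_1 - C_1(\theta)A_0(\theta)E(x_0))^\top \\
&\quad= C_1(\theta)A_0(\theta)\operatorname{Var}(x_0)A_0^\top(\theta)C_1^\top(\theta) + C_1(\theta)\Gamma_0(\theta)Q_0\Gamma_0^\top(\theta)C_1^\top(\theta) + R_1 .
\end{aligned}
$$

Then (8.14) follows immediately.

8.6. Use the formulas in (8.8) directly.

8.7. Since $\theta$ is a constant vector, we have $S_k := \operatorname{Var}(\theta) = 0$, so that

$$P_{0,0} = \operatorname{Var}\begin{bmatrix} x_0 \\ \theta \end{bmatrix} = \begin{bmatrix} \operatorname{Var}(x_0) & 0 \\ 0 & 0 \end{bmatrix} .$$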


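It follows from simple algebra that

$$P_{k,k-1} = \begin{bmatrix} * & 0 \\ 0 & 0 \end{bmatrix} \quad\text{and}\quad G_k = \begin{bmatrix} * \\ 0 \end{bmatrix} ,$$

where $*$ indicates a constant block in the matrix. Hence, the last equation of (8.15) yields $\hat{\theta}_{k|k} \equiv \hat{\theta}_{k-1|k-1}$.

8.8.

$$
\begin{cases}
\begin{bmatrix} \hat{x}_0 \\ \hat{c}_0 \end{bmatrix} = \begin{bmatrix} x^0 \\ c^0 \end{bmatrix} , \quad P_{0,0} = \begin{bmatrix} p_0 & 0 \\ 0 & s_0 \end{bmatrix} \\[10pt]
\text{For } k = 1, 2, \cdots, \\[4pt]
P_{k,k-1} = P_{k-1,k-1} + \begin{bmatrix} q_{k-1} & 0 \\ 0 & s_{k-1} \end{bmatrix} \\[10pt]
G_k = P_{k,k-1}\begin{bmatrix} \hat{c}_{k-1} \\ 0 \end{bmatrix}\left[\,[\hat{c}_{k-1}\ 0]\,P_{k,k-1}\begin{bmatrix} \hat{c}_{k-1} \\ 0 \end{bmatrix} + r_k\right]^{-1} \\[10pt]
P_{k,k} = \bigl[I - G_k[\hat{c}_{k-1}\ 0]\bigr]P_{k,k-1} \\[6pt]
\begin{bmatrix} \hat{x}_k \\ \hat{c}_k \end{bmatrix} = \begin{bmatrix} \hat{x}_{k-1} \\ \hat{c}_{k-1} \end{bmatrix} + G_k(v_k - \hat{c}_{k-1}\hat{x}_{k-1}) ,
\end{cases}
$$

where $c^0$ is an estimate of $c_0$ given by (8.13); that is,

$$v_0^2 - 2v_0x^0c^0 + [(x^0)^2 - p_0](c^0)^2 - r_0 = 0 .$$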

Chapter 9

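9.1. (a) Let $\mathbf{x}_k = [\,x_k\ \dot{x}_k\,]^\top$. Then

$$
\begin{cases}
x_k = -(\alpha + \beta - 2)x_{k-1} - (1 - \alpha)x_{k-2} + \alpha v_k + (-\alpha + \beta)v_{k-1} \\[4pt]
\dot{x}_k = -(\alpha + \beta - 2)\dot{x}_{k-1} - (1 - \alpha)\dot{x}_{k-2} + \dfrac{\beta}{h}v_k - \dfrac{\beta}{h}v_{k-1} .
\end{cases}
$$

(b) $0 < \alpha < 1$ and $0 < \beta < \dfrac{\alpha^2}{1 - \alpha}$.

9.2. System (9.11) follows from direct algebraic manipulation.

9.3. (a)

$$
\Phi = \begin{bmatrix}
1 - \alpha & (1 - \alpha)h & (1 - \alpha)h^2/2 & -s\alpha \\
-\beta/h & 1 - \beta & h - \beta h/2 & -s\beta/h \\
-\gamma/h^2 & 1 - \gamma/h & 1 - \gamma/2 & -s\gamma/h^2 \\
-\theta & -\theta/h & -\theta h^2/2 & s(1 - \theta)
\end{bmatrix}
$$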


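(b)

$$
\begin{aligned}
\det[zI - \Phi] = {}& z^4 + [(\alpha - 3) + \beta + \gamma/2 - (\theta - 1)s]z^3 \\
&+ [(3 - 2\alpha) - \beta + \gamma/2 + (3 - \alpha - \beta - \gamma/2 - 3\theta)s]z^2 \\
&+ [(\alpha - 1) - (3 - 2\alpha - \beta + \gamma/2 - 3\theta)s]z + (1 - \alpha - \theta)s .
\end{aligned}
$$

$$X_1 = \frac{zV(z - s)}{\det[zI - \Phi]}\{\alpha z^2 + (\gamma/2 + \beta - 2\alpha)z + (\gamma/2 - \beta + \alpha)\} ,$$

$$X_2 = \frac{zV(z - 1)(z - s)}{\det[zI - \Phi]}\{\beta z - \beta + \gamma\}/h ,$$

$$X_3 = \frac{zV(z - 1)^2(z - s)}{\det[zI - \Phi]}\gamma/h^2 ,$$

and

$$W = \frac{zV(z - 1)^3}{\det[zI - \Phi]}\theta .$$

(c) Let $X_k = [\,x_k\ \dot{x}_k\ \ddot{x}_k\ w_k\,]^\top$. Then

$$
\begin{aligned}
x_k = {}& a_1x_{k-1} + a_2x_{k-2} + a_3x_{k-3} + a_4x_{k-4} + \alpha v_k \\
&+ (-2\alpha - s\alpha + \beta + \gamma/2)v_{k-1} + [\alpha - \beta + \gamma/2 + (2\alpha - \beta - \gamma/2)s]v_{k-2} \\
&- (\alpha - \beta + \gamma/2)sv_{k-3} , \\[4pt]
\dot{x}_k = {}& a_1\dot{x}_{k-1} + a_2\dot{x}_{k-2} + a_3\dot{x}_{k-3} + a_4\dot{x}_{k-4} + (\beta/h)v_k \\
&- [(2 + s)\beta/h - \gamma/h]v_{k-1} + [\beta/h - \gamma/h + (2\beta - \gamma)s/h]v_{k-2} \\
&- [(\beta - \gamma)s/h]v_{k-3} , \\[4pt]
\ddot{x}_k = {}& a_1\ddot{x}_{k-1} + a_2\ddot{x}_{k-2} + a_3\ddot{x}_{k-3} + a_4\ddot{x}_{k-4} + (\gamma/h)v_k \\
&- [(2 + \gamma)\gamma/h^2]v_{k-1} + (1 + 2s)v_{k-2} - sv_{k-3} , \\[4pt]
w_k = {}& a_1w_{k-1} + a_2w_{k-2} + a_3w_{k-3} + a_4w_{k-4} \\
&+ (\gamma/h^2)(v_k - 3v_{k-1} + 3v_{k-2} - v_{k-3}) ,
\end{aligned}
$$

with the initial conditions $x_{-1} = \dot{x}_{-1} = \ddot{x}_{-1} = w_0 = 0$, where

$$a_1 = -\alpha - \beta - \gamma/2 + (\theta - 1)s + 3 ,$$

$$a_2 = 2\alpha + \beta - \gamma/2 + (\alpha + \beta h + \gamma/2 + 3\theta - 3)s - 3 ,$$

$$a_3 = -\alpha + (-2\alpha - \beta + \gamma/2 - 3\theta + 3)s + 1 ,$$

and

$$a_4 = (\alpha + \theta - 1)s .$$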


(d) The verification is straightforward.

9.4. The verifications are tedious but elementary.

9.5. Study (9.19) and (9.20). We must have $\sigma_p, \sigma_v, \sigma_a \ge 0$, $\sigma_m > 0$, and $P > 0$.

9.6. The equations can be obtained by elementary algebraic manipulation.

9.7. Only algebraic manipulation is required.

Chapter 10

10.1. For (1) and (4), let ∗ ∈ {+,−, · , /}. Then

$$
\begin{aligned}
X * Y &= \{x * y \mid x \in X,\ y \in Y\} \\
&= \{y * x \mid y \in Y,\ x \in X\} \\
&= Y * X .
\end{aligned}
$$
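As a concrete illustration of the set identity above, the sketch below implements interval addition and multiplication on closed intervals and checks that swapping the operands leaves the result unchanged. It is restricted to $+$ and $\cdot$, the two commutative operations; the tuple representation `(lower, upper)`, the helper names `iadd` and `imul`, and the sample intervals are assumptions made for this example.

```python
def iadd(x, y):
    """Interval sum: {a + b | a in x, b in y} for closed intervals (lo, hi)."""
    return (x[0] + y[0], x[1] + y[1])


def imul(x, y):
    """Interval product: the range of a * b is spanned by the four
    endpoint products."""
    products = [x[0] * y[0], x[0] * y[1], x[1] * y[0], x[1] * y[1]]
    return (min(products), max(products))


# Hypothetical sample intervals X = [-1, 2] and Y = [3, 5].
X = (-1.0, 2.0)
Y = (3.0, 5.0)

sum_xy, sum_yx = iadd(X, Y), iadd(Y, X)
prod_xy, prod_yx = imul(X, Y), imul(Y, X)
print(sum_xy, prod_xy)
```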

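The others can be verified in a similar manner. As to part (c) of (7), without loss of generality, we may only consider the situation where both $\underline{x} \ge 0$ and $\underline{y} \ge 0$ in $X = [\underline{x}, \overline{x}]$ and $Y = [\underline{y}, \overline{y}]$, and then discuss different cases of $\underline{z} \ge 0$, $\overline{z} \le 0$, and $\underline{z}\,\overline{z} < 0$.

10.2. It is straightforward to verify all the formulas by definition. For instance, for part (j.1), we have

$$
\begin{aligned}
A^I(BC) &= \Biggl[\,\sum_{j=1}^{n} A^I(i,j)\Bigl[\sum_{\ell=1}^{n} B_{j\ell}C_{\ell k}\Bigr]\Biggr] \\
&\subseteq \Biggl[\,\sum_{j=1}^{n}\sum_{\ell=1}^{n} A^I(i,j)B_{j\ell}C_{\ell k}\Biggr] \\
&= \Biggl[\,\sum_{\ell=1}^{n}\Bigl[\sum_{j=1}^{n} A^I(i,j)B_{j\ell}\Bigr]C_{\ell k}\Biggr] \\
&= (A^IB)C .
\end{aligned}
$$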

10.3. See: Alefeld, G. and Herzberger, J. (1983).

10.4. Similar to Exercise 1.10.

10.5. Observe that the filtering results for a boundary system and any of its neighboring systems will be inter-crossing from time to time.

10.6. See: Siouris, G., Chen, G. and Wang, J. (1997).


Chapter 11

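11.1.

$$
\phi_2(t) = \begin{cases}
\frac{1}{2}t^2 & 0 \le t < 1 \\[2pt]
-t^2 + 3t - \frac{3}{2} & 1 \le t < 2 \\[2pt]
\frac{1}{2}t^2 - 3t + \frac{9}{2} & 2 \le t < 3 \\[2pt]
0 & \text{otherwise.}
\end{cases}
$$

$$
\phi_3(t) = \begin{cases}
\frac{1}{6}t^3 & 0 \le t < 1 \\[2pt]
-\frac{1}{2}t^3 + 2t^2 - 2t + \frac{2}{3} & 1 \le t < 2 \\[2pt]
\frac{1}{2}t^3 - 4t^2 + 10t - \frac{22}{3} & 2 \le t < 3 \\[2pt]
-\frac{1}{6}t^3 + 2t^2 - 8t + \frac{32}{3} & 3 \le t < 4 \\[2pt]
0 & \text{otherwise.}
\end{cases}
$$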
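The piecewise polynomials in 11.1 can be cross-checked against the standard recurrence for cardinal B-splines. The sketch below assumes that $\phi_n$ denotes the cardinal B-spline of degree $n$ supported on $[0, n+1]$, with $\phi_0$ the indicator of $[0, 1)$; that reading, the recurrence used, and all function names are assumptions of this example rather than statements from the text.

```python
def bspline(n, t):
    """Cardinal B-spline of degree n on [0, n+1) via the recurrence
    phi_n(t) = (t * phi_{n-1}(t) + (n + 1 - t) * phi_{n-1}(t - 1)) / n."""
    if n == 0:
        return 1.0 if 0.0 <= t < 1.0 else 0.0
    return (t * bspline(n - 1, t) + (n + 1 - t) * bspline(n - 1, t - 1.0)) / n


def phi2(t):
    """Closed form of the quadratic spline listed in 11.1."""
    if 0 <= t < 1:
        return 0.5 * t**2
    if 1 <= t < 2:
        return -(t**2) + 3 * t - 1.5
    if 2 <= t < 3:
        return 0.5 * t**2 - 3 * t + 4.5
    return 0.0


def phi3(t):
    """Closed form of the cubic spline listed in 11.1."""
    if 0 <= t < 1:
        return t**3 / 6
    if 1 <= t < 2:
        return -(t**3) / 2 + 2 * t**2 - 2 * t + 2 / 3
    if 2 <= t < 3:
        return t**3 / 2 - 4 * t**2 + 10 * t - 22 / 3
    if 3 <= t < 4:
        return -(t**3) / 6 + 2 * t**2 - 8 * t + 32 / 3
    return 0.0


# Compare both forms on a grid covering the supports and a margin outside.
grid = [i / 20 for i in range(-20, 101)]
max_gap2 = max(abs(bspline(2, t) - phi2(t)) for t in grid)
max_gap3 = max(abs(bspline(3, t) - phi3(t)) for t in grid)
print(max_gap2, max_gap3)
```

Both gaps stay at rounding-error level on this grid, which also confirms that adjacent pieces agree at the integer knots.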

11.2.

$$\hat{\phi}_n(\omega) = \Bigl(\frac{1 - e^{-i\omega}}{i\omega}\Bigr)^n = e^{-in\omega/2}\Bigl(\frac{\sin(\omega/2)}{\omega/2}\Bigr)^n .$$

11.3. Simple graphs.

11.4. Straightforward algebraic operations.

11.5. Straightforward algebraic operations.

Chapter 12

12.1. Apply matrix analysis (see Cattivelli and Sayed 2010).

12.2. Straightforward algebraic operations.


Index

AAdaptive Kalman Filtering, 202

noise-adaptive filter, 202Adaptive System Identification, 120, 122Affine model, 51Algorithm for real-time application, 110α − β tracker, 148α − β − γ tracker, 144, 145, 147α − β − γ − θ tracker, 148, 149Angular displacement, 118, 135ARMA (autoregressive moving-average)

process, 30ARMAX (auto-regressive moving-average

model with exogeneous inputs), 68Attracting point, 118Augmented matrix, 207Augmented system, 79Azimuthal angular error, 47

BBayes formula, 12Bound (lower, upper), 167

CCholesky factorization, 107Colored noise (sequence or process), 69,

78, 148Conditional probability, 11Controllability matrix, 89Controllable linear system, 89

Correlated system and measurement noiseprocesses, 51

Covariance, 13Cramer’s rule, 141

DDecoupling formulas, 139Decoupling of filtering equation, 139Descartes rule of signs, 147Determinant preliminaries, 1Deterministic input sequence, 19Digital filtering process, 22Digital prediction process, 22Digital smoothing estimate, 197Digital smoothing process, 22

EElevational angular error, 47Estimate, 16

distributed state estimate, 185least-squares optimal estimate, 17linear estimate, 17minimum trace variance estimate, 54minimum variance estimate, 17, 37, 52optimal estimate, 17operator, 53

unbiased estimate, 17, 52Event (simple), 8Expectation, 9

conditional expectation, 14Extended Kalman filter, 115, 118, 120



FFIR system, 204

GGaussian white noise sequence, 15, 121Geometric convergence, 92

IIIR system, 204Independent random variables, 13Innovations sequence, 35Inverse z-transform, 141

JJoint probability distribution (function), 10Jordan canonical form, 5, 7

KKalman–Bucy filter, 204Kalman filter, 19, 26, 33

extended Kalman filter, 115, 118, 120interval Kalman filter, 161limiting Kalman filter, 81, 82modified extended Kalman filter, 125steady-state Kalman filter, 82, 139, 189wavelet Kalman filter, 171

Kalman filtering equation (algorithm, orprocess), 26, 29, 38, 42, 57, 66,73–75, 79, 113

Kalman gain matrix, 24Kalman smoother, 197

LLeast-squares preliminaries, 15Limiting (or steady-state) Kalman filter, 81Limiting Kalman gain matrix, 82Linear deterministic/stochastic system, 19,

43, 65, 198, 204Linear regulator problem, 205Linear state-space (stochastic) system, 21,

33, 69, 81, 202, 205LU decomposition, 207

MMarginal probability density function, 10Matrix inversion lemma, 3Matrix Riccati equation, 83, 99, 139, 142

Matrix Schwarz inequality, 2, 17Minimum variance estimate, 17, 37, 52Modified extended Kalman filter, 125Moment, 10

NNonlinear model (system), 115Non-negative definite matrix, 1Normal distribution, 9Normal white noise sequence, 15

OObservability matrix, 83Observable system, 88Optimal estimate, 17

asymptotically optimal estimate, 94least-squares optimal estimate, 17optimal estimate operator, 55

Optimal prediction, 22, 202Optimal weight matrix, 17Optimality criterion, 20Outcome, 8

PParallel processing, 207Parameter identification, 123

adaptive parameter identificationalgorithm, 123

Positive definite matrix, 1Positive square root matrix, 16Prediction-correction, 22, 24, 29, 40, 82Probability density function, 9

conditional probability density function,12

Gaussian (or normal) probability densityfunction, 9, 11

joint probability density function, 12Probability distribution, 8, 10

function, 8joint probability distribution (function),

10Probability preliminaries, 8

RRadar tracking model (or system), 47, 63,

194Random sequence, 15Random signal, 177

Random variable, 8independent random variables, 13uncorrelated random variables, 13

Random vector, 10Range, 47, 116Real-time application, 63, 76, 98, 110Real-time estimation/decomposition, 177Real-time tracking, 43, 76, 99, 142, 147

SSample space, 8Satellite orbit estimation, 118Schur complement technique, 207Schwarz inequality, 2

matrix Schwarz inequality, 2, 17vector Schwarz inequality, 2

Separation principle, 206Sequential algorithm, 101Square-root algorithm, matrix, 16, 101, 107Stabilizable system, 190Steady-state (or limiting) Kalman filter, 82Stochastic optimal control, 205Suboptimal filter, 144Systolic array, 206

implementation, 206

TTaylor approximation, 47, 126Trace, 5

UUncorrelated random variables, 13

VVariance, 10

conditional variance, 14

WWavelets, 171Weight matrix, 15

optimal weight matrix, 16White noise sequence (process), 15, 20, 69

Gaussian (or normal) white noisesequence, 15, 69

zero-mean Gaussian white noisesequence, 15

Wiener filter, 203Wireless sensor network (WSN), 185

Zz-transform, inverse, 140, 141
