Adaptive Filters – Part 4 (RLS)
ECE 5/639: Statistical Signal Processing II – Linear Estimation
Eric Wan, Ph.D., Portland State University, Fall 2015
Recursive Least Squares (RLS)

• RLS was discovered by Gauss in 1821, but the work lay unused until 1950, when Plackett rediscovered it.
• Recall standard Least Squares:

$$ \min_c \sum_{n=1}^{N} e(n)^2, \qquad \hat{y}(n) = c^T x(n), \qquad e(n) = y(n) - \hat{y}(n) $$

$$ c_{LS} = \left( X^T X \right)^{-1} X^T y = R(N)^{-1} d(N) $$

$$ X = \begin{bmatrix} x(1)^T \\ x(2)^T \\ \vdots \\ x(N)^T \end{bmatrix}, \qquad y = \begin{bmatrix} y(1) \\ y(2) \\ \vdots \\ y(N) \end{bmatrix} $$
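(Not from the slides.) As a concrete illustration of the batch solution, here is a minimal NumPy sketch; the synthetic data and variable names are my own:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: N samples of y(n) = c0^T x(n) + noise
N, M = 200, 4
c0 = rng.standard_normal(M)           # "true" weights (illustration only)
X = rng.standard_normal((N, M))       # rows are x(n)^T, as on the slide
y = X @ c0 + 0.1 * rng.standard_normal(N)

# Batch least squares via the normal equations: c_LS = (X^T X)^{-1} X^T y
R_N = X.T @ X                         # R(N)
d_N = X.T @ y                         # d(N)
c_ls = np.linalg.solve(R_N, d_N)      # solve R(N) c = d(N) rather than inverting

print(np.allclose(c_ls, np.linalg.lstsq(X, y, rcond=None)[0]))  # True
```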
Recursive Least Squares (RLS)

• RLS – re-solve at every time step:

$$ R(n) = X^T X = \sum_{k=1}^{n} x(k) x(k)^T = \sum_{k=1}^{n-1} x(k) x(k)^T + x(n) x(n)^T $$

so

$$ R(n) = R(n-1) + x(n) x(n)^T $$

similarly

$$ d(n) = X^T y = \sum_{k=1}^{n} x(k) y(k) = \sum_{k=1}^{n-1} x(k) y(k) + x(n) y(n) $$

$$ d(n) = d(n-1) + x(n) y(n) $$

RLS.0:
$$ R(n) = R(n-1) + x(n) x(n)^T $$
$$ d(n) = d(n-1) + x(n) y(n) $$
$$ c(n) = R(n)^{-1} d(n) $$
O(M^3) operations – we will fix this soon.

$$ \hat{y}(n) = c(n-1)^T x(n), \qquad e(n) = y(n) - c(n-1)^T x(n) \quad \text{(a priori error)} $$
$$ \varepsilon(n) = y(n) - c(n)^T x(n) \quad \text{(a posteriori error)} $$
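A direct transcription of RLS.0 might look like the sketch below (synthetic data and names are my own); note the linear solve at every step, which is the O(M^3) cost flagged above:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 200, 4
c0 = rng.standard_normal(M)
X = rng.standard_normal((N, M))
y = X @ c0 + 0.1 * rng.standard_normal(N)

R = np.zeros((M, M))              # R(0) = 0; invertibility is fixed later via R(0) = delta*I
d = np.zeros(M)
c = np.zeros(M)

for n in range(N):
    x = X[n]
    e_prior = y[n] - c @ x        # a priori error, uses c(n-1)
    R += np.outer(x, x)           # R(n) = R(n-1) + x(n) x(n)^T
    d += x * y[n]                 # d(n) = d(n-1) + x(n) y(n)
    if n >= M - 1:                # wait until R is (generically) invertible
        c = np.linalg.solve(R, d) # O(M^3) solve at every time step
        e_post = y[n] - c @ x     # a posteriori error, uses c(n)

print(np.allclose(c, np.linalg.lstsq(X, y, rcond=None)[0]))  # matches batch LS: True
```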
Recursive Least Squares (RLS)

• LMS – stochastic gradient descent
  – Many noisy steps to the bottom of the bowl
• RLS – at each time step we solve the LS problem exactly
  – One-step convergence to the current estimate of the bowl
  – Re-estimate the new bowl at the next time step, then jump to its bottom
  – A Newton-like method
  – Convergence depends on the convergence of R(n) and d(n)
Recursive Least Squares (RLS)

• First modification – Weighted Least Squares
  – Exponentially decaying window
  – Provides a "forgetting" factor for tracking non-stationary environments
  – Also provides better numerical stability

$$ \min_c \sum_{k=1}^{n} \lambda^{n-k} e(k)^2 = e^T L e, \qquad L = \begin{bmatrix} \lambda^{n-1} & & & 0 \\ & \lambda^{n-2} & & \\ & & \ddots & \\ 0 & & & 1 \end{bmatrix} $$

$$ c(n) = \left( X^T L X \right)^{-1} X^T L y $$
Recursive Least Squares (RLS)

• First modification – Weighted Least Squares

$$ R(n) = X^T L X = \sum_{k=1}^{n} \lambda^{n-k} x(k) x(k)^T = \lambda \sum_{k=1}^{n-1} \lambda^{n-1-k} x(k) x(k)^T + x(n) x(n)^T = \lambda R(n-1) + x(n) x(n)^T $$

RLS.1:
$$ R(n) = \lambda R(n-1) + x(n) x(n)^T $$
$$ d(n) = \lambda d(n-1) + x(n) y(n) $$
$$ c(n) = R(n)^{-1} d(n) $$

• Note that

$$ \sum_{k=1}^{n} \lambda^{n-k} = \frac{1 - \lambda^n}{1 - \lambda}, \qquad E\left[ R(n) \right] = \frac{1 - \lambda^n}{1 - \lambda} R $$

a different "scale factor", which just cancels out (the same factor appears in d(n)).
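As a sanity check (my own synthetic setup), the RLS.1 recursions reproduce the batch weighted solution exactly:

```python
import numpy as np

rng = np.random.default_rng(2)
n, M, lam = 150, 3, 0.98
X = rng.standard_normal((n, M))
y = X @ rng.standard_normal(M) + 0.1 * rng.standard_normal(n)

# Batch: c(n) = (X^T L X)^{-1} X^T L y with L = diag(lam^{n-1}, ..., lam, 1)
L = np.diag(lam ** np.arange(n - 1, -1, -1))
c_batch = np.linalg.solve(X.T @ L @ X, X.T @ L @ y)

# Recursive (RLS.1)
R = np.zeros((M, M))
d = np.zeros(M)
for k in range(n):
    R = lam * R + np.outer(X[k], X[k])   # R(k) = lam R(k-1) + x(k) x(k)^T
    d = lam * d + X[k] * y[k]            # d(k) = lam d(k-1) + x(k) y(k)
c_rec = np.linalg.solve(R, d)

print(np.allclose(c_batch, c_rec))  # True
```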
Recursive Least Squares (RLS)

• Now propagate R^{-1}(n) directly instead of R(n)

• Matrix Inversion Lemma (Woodbury, 1950)
  – For A, B positive definite M x M matrices,

$$ A = B^{-1} + C D^{-1} C^T \;\Rightarrow\; A^{-1} = B - B C \left( D + C^T B C \right)^{-1} C^T B $$

• RLS substitution – with R(n) = \lambda R(n-1) + x(n) x(n)^T, take

$$ A = R(n), \qquad B = \lambda^{-1} R^{-1}(n-1), \qquad C = x(n), \qquad D = 1 $$

which gives

$$ R^{-1}(n) = \lambda^{-1} R^{-1}(n-1) - \frac{\lambda^{-2} R^{-1}(n-1)\, x(n)\, x(n)^T R^{-1}(n-1)}{1 + \lambda^{-1} x(n)^T R^{-1}(n-1)\, x(n)} $$
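A quick numerical check of the lemma under the RLS substitution (random positive definite R(n-1) and random x(n); a sketch, not part of the slides):

```python
import numpy as np

rng = np.random.default_rng(3)
M, lam = 4, 0.98
A = rng.standard_normal((M, M))
R_prev = A @ A.T + M * np.eye(M)        # a positive definite R(n-1)
x = rng.standard_normal(M)

R_new = lam * R_prev + np.outer(x, x)   # R(n) = lam R(n-1) + x(n) x(n)^T

# Woodbury with B = lam^{-1} R^{-1}(n-1), C = x(n), D = 1:
P = np.linalg.inv(R_prev)               # R^{-1}(n-1)
num = (P @ np.outer(x, x) @ P) / lam**2
den = 1.0 + (x @ P @ x) / lam
P_new = P / lam - num / den             # R^{-1}(n) without inverting R(n)

print(np.allclose(P_new, np.linalg.inv(R_new)))  # True
```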
Recursive Least Squares (RLS)

• Now define the "gain" vector

$$ g(n) = \frac{\lambda^{-1} R^{-1}(n-1)\, x(n)}{1 + \lambda^{-1} x(n)^T R^{-1}(n-1)\, x(n)} $$

Substituting g(n) into the update for R^{-1}(n) yields the RLS Riccati equation

$$ R^{-1}(n) = \lambda^{-1} \left( R^{-1}(n-1) - g(n)\, x(n)^T R^{-1}(n-1) \right) $$

• We can also write g(n) as g(n) = R^{-1}(n) x(n):

$$ g(n) \left( 1 + \lambda^{-1} x(n)^T R^{-1}(n-1)\, x(n) \right) = \lambda^{-1} R^{-1}(n-1)\, x(n) $$
$$ g(n) = \lambda^{-1} R^{-1}(n-1)\, x(n) - g(n)\, \lambda^{-1} x(n)^T R^{-1}(n-1)\, x(n) = \lambda^{-1} \left( R^{-1}(n-1) - g(n)\, x(n)^T R^{-1}(n-1) \right) x(n) = R^{-1}(n)\, x(n) $$
Recursive Least Squares (RLS)

• Now we can re-write the weight update

$$ c(n) = R^{-1}(n)\, d(n) = \left( \lambda^{-1} R^{-1}(n-1) - \lambda^{-1} g(n)\, x(n)^T R^{-1}(n-1) \right) \left( \lambda\, d(n-1) + x(n)\, y(n) \right) $$

Expanding, the first product gives c(n-1), the cross terms give -g(n) x(n)^T c(n-1), and R^{-1}(n) x(n) y(n) = g(n) y(n), which gives us

$$ c(n) = c(n-1) + g(n) \left( y(n) - x(n)^T c(n-1) \right) $$

RLS:
$$ g(n) = \frac{R^{-1}(n-1)\, x(n)}{\lambda + x(n)^T R^{-1}(n-1)\, x(n)} $$
$$ c(n) = c(n-1) + g(n) \left( y(n) - x(n)^T c(n-1) \right) $$
$$ R^{-1}(n) = \lambda^{-1} \left( R^{-1}(n-1) - g(n)\, x(n)^T R^{-1}(n-1) \right) $$
O(M^2) operations!

• Note that g(n) e(n) = R^{-1}(n) x(n) e(n) \approx H^{-1} \nabla – a Newton-like algorithm.
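Putting the three boxed equations together, a minimal implementation might look like the sketch below (the function name and constants are my own; the delta-initialization anticipates the initialization slide that follows):

```python
import numpy as np

def rls(X, y, lam=0.98, delta=0.01):
    """Exponentially weighted RLS; propagates P(n) = R^{-1}(n) directly.

    Each update is O(M^2): matrix-vector products and one outer product.
    """
    N, M = X.shape
    c = np.zeros(M)
    P = np.eye(M) / delta                   # R(0) = delta*I  =>  P(0) = (1/delta) I
    for n in range(N):
        x = X[n]
        g = (P @ x) / (lam + x @ P @ x)     # gain g(n)
        e = y[n] - c @ x                    # a priori error e(n)
        c = c + g * e                       # c(n) = c(n-1) + g(n) e(n)
        P = (P - np.outer(g, x @ P)) / lam  # Riccati: lam^{-1} (P - g x^T P)
    return c

# Usage: with lam = 1 (and small delta) RLS converges to the batch LS solution.
rng = np.random.default_rng(4)
X = rng.standard_normal((500, 4))
y = X @ np.array([1.0, -0.5, 0.25, 2.0]) + 0.05 * rng.standard_normal(500)
print(rls(X, y, lam=1.0))  # close to [1, -0.5, 0.25, 2]
```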
Recursive Least Squares (RLS)

• One more simplification – define

$$ \tilde{g}(n) = R^{-1}(n-1)\, x(n) $$

Since R^{-1}(n-1) is symmetric, x(n)^T R^{-1}(n-1) = \tilde{g}(n)^T, and

RLS:
$$ \tilde{g}(n) = R^{-1}(n-1)\, x(n) $$
$$ g(n) = \frac{\tilde{g}(n)}{\lambda + x(n)^T \tilde{g}(n)} $$
$$ c(n) = c(n-1) + g(n) \left( y(n) - x(n)^T c(n-1) \right) $$
$$ R^{-1}(n) = \lambda^{-1} \left( R^{-1}(n-1) - g(n)\, \tilde{g}(n)^T \right) $$

• The book does this – it may lead to numerical issues.
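The same loop written in this form, computing g̃(n) = R^{-1}(n-1) x(n) once and reusing it (a sketch; whether this is safe in practice depends on keeping P symmetric, which is the numerical issue the slide cautions about):

```python
import numpy as np

def rls_gtilde(X, y, lam=0.98, delta=0.01):
    """RLS with the g-tilde simplification: one P @ x product per update."""
    N, M = X.shape
    c = np.zeros(M)
    P = np.eye(M) / delta
    for n in range(N):
        x = X[n]
        gt = P @ x                         # \tilde{g}(n) = R^{-1}(n-1) x(n)
        g = gt / (lam + x @ gt)            # gain g(n)
        c = c + g * (y[n] - c @ x)         # weight update
        P = (P - np.outer(g, gt)) / lam    # uses x^T R^{-1}(n-1) = \tilde{g}^T (P symmetric)
    return c

rng = np.random.default_rng(4)
X = rng.standard_normal((500, 4))
y = X @ np.array([1.0, -0.5, 0.25, 2.0]) + 0.05 * rng.standard_normal(500)
print(rls_gtilde(X, y))  # close to [1, -0.5, 0.25, 2]
```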
Recursive Least Squares (RLS)

• Initialization

$$ R(1) = x(1)\, x(1)^T $$

is rank 1 – not invertible. Instead, initialize

$$ R(0) = \delta I $$

Then the recursion R(n) = \lambda R(n-1) + x(n) x(n)^T gives

$$ R(n) = \sum_{k=1}^{n} \lambda^{n-k} x(k)\, x(k)^T + \lambda^n \delta I $$

which corresponds to minimizing the regularized cost

$$ \sum_{k=1}^{n} \lambda^{n-k} e(k)^2 + \lambda^n \delta \left\| c \right\|^2 $$
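A small check (synthetic, my own constants) that iterating the recursion from R(0) = δI reproduces the regularized sum:

```python
import numpy as np

rng = np.random.default_rng(5)
M, n, lam, delta = 3, 50, 0.95, 0.1
xs = rng.standard_normal((n, M))          # x(1), ..., x(n)

R = delta * np.eye(M)                     # R(0) = delta * I
for k in range(n):
    R = lam * R + np.outer(xs[k], xs[k])  # R(k) = lam R(k-1) + x(k) x(k)^T

# Explicit form: sum_{k=1}^{n} lam^{n-k} x(k) x(k)^T + lam^n delta I
R_explicit = sum(lam ** (n - 1 - k) * np.outer(xs[k], xs[k]) for k in range(n))
R_explicit += lam ** n * delta * np.eye(M)

print(np.allclose(R, R_explicit))  # True
```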
Recursive Least Squares (RLS) – Analysis

• Computations (per update)
  – LMS: 2M + 1
  – RLS: 2M^2 + 4M

• Convergence Analysis
  – Follows the analysis for convergence of Weighted Least Squares:

$$ c(n) = \left( X^T L X \right)^{-1} X^T L y $$

  – Assume a model $ y = X c_0 + v $
  – Then $ E[c(n)] = c_0 $ (unbiased)
  – Including initial conditions: $ R(0) = \delta I $
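A Monte Carlo sketch of the unbiasedness claim under the model y = X c_0 + v (all parameters are my own choices):

```python
import numpy as np

rng = np.random.default_rng(6)
M, N, lam, trials = 3, 300, 0.98, 500
c0 = np.array([1.0, -2.0, 0.5])
w = lam ** np.arange(N - 1, -1, -1)          # diagonal of L

est = np.zeros((trials, M))
for t in range(trials):
    X = rng.standard_normal((N, M))
    y = X @ c0 + rng.standard_normal(N)      # y = X c0 + v
    XtL = X.T * w                            # X^T L (L is diagonal)
    est[t] = np.linalg.solve(XtL @ X, XtL @ y)

print(est.mean(axis=0))  # approximately c0 = [1, -2, 0.5]
```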
Recursive Least Squares (RLS) – Analysis

• Mean Squared Convergence

For \lambda = 1:
$$ \operatorname{cov}[\tilde{c}(n)] = \frac{P_o}{n - M} R^{-1}, \qquad P_{ex}(n) = \frac{M}{n - M - 1} P_o $$
  – Converges with no error as 1/n
  – No misadjustment
  – Covariance affected by the condition of R^{-1}

For 0 < \lambda < 1:
$$ \operatorname{cov}[\tilde{c}(n)] \approx \lambda^2 \operatorname{cov}[\tilde{c}(n-1)] + (1 - \lambda)^2 P_o R^{-1} $$
$$ \operatorname{cov}[\tilde{c}(\infty)] \approx \frac{1 - \lambda}{1 + \lambda} P_o R^{-1}, \qquad P_{ex}(\infty) \approx \frac{1 - \lambda}{1 + \lambda} M P_o $$
  – Misadjustment depends on the choice of \lambda

• Generally much faster convergence than LMS, with less sensitivity to eigenvalue spread.
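An empirical spot-check (sketch, my own setup) of P_ex(∞) ≈ ((1-λ)/(1+λ)) M P_o for RLS on stationary data; the agreement is only approximate, since the formula rests on independence assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)
M, N, lam, Po = 4, 20000, 0.98, 0.1       # Po = variance of the measurement noise
c0 = rng.standard_normal(M)
X = rng.standard_normal((N, M))
y = X @ c0 + np.sqrt(Po) * rng.standard_normal(N)

c, P = np.zeros(M), np.eye(M) * 100.0     # P(0) = R^{-1}(0) with small delta
excess = []
for n in range(N):
    x = X[n]
    if n > N // 2:                        # steady state: a priori excess error
        excess.append((x @ (c - c0)) ** 2)
    gt = P @ x
    g = gt / (lam + x @ gt)
    c = c + g * (y[n] - c @ x)
    P = (P - np.outer(g, gt)) / lam

print(np.mean(excess))                    # empirical P_ex(inf)
print((1 - lam) / (1 + lam) * M * Po)     # theory: ~0.004
```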
Recursive Least Squares (RLS) – Example

• Simple noise reduction

[Figure: adaptive noise-cancellation block diagram. The primary input y(n) contains the signal s(n) plus noise; white noise v_1(n) (\sigma_{v_1}^2 = 1) filtered by 1/(1 - \beta z^{-1}) provides the reference x(n), which the M-tap filter c(n) maps to a noise estimate that is subtracted from y(n) to form e(n); v_2(n) is additional noise with \sigma_{v_2}^2 = 0.5]

• Increasing \beta increases the eigenvalue spread \chi(R)
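The slides give no code for this example; the sketch below is one plausible reconstruction of the setup (where v_2 enters, the choice of s(n), the filter order, and all constants are my assumptions), just to show the mechanics of the canceller:

```python
import numpy as np

rng = np.random.default_rng(8)
N, M, beta, lam = 800, 8, 0.9, 0.98               # all assumed values

# Reference: white v1 (variance 1) through 1/(1 - beta z^-1) -> AR(1) process x(n)
v1 = rng.standard_normal(N)
x = np.zeros(N)
for k in range(1, N):
    x[k] = beta * x[k - 1] + v1[k]                # larger beta => larger chi(R)

s = np.sin(2 * np.pi * 0.02 * np.arange(N))      # assumed "signal"
v2 = np.sqrt(0.5) * rng.standard_normal(N)       # sigma_{v2}^2 = 0.5
y = s + x + v2                                    # assumed wiring: signal + correlated + white noise

# RLS canceller: predict the correlated noise from the reference; e(n) ~ s(n) + v2(n)
c = np.zeros(M)
P = np.eye(M) * 100.0
e = np.zeros(N)
for n in range(M - 1, N):
    u = x[n - M + 1 : n + 1][::-1]                # tapped delay line of the reference
    gt = P @ u
    g = gt / (lam + u @ gt)
    e[n] = y[n] - c @ u                           # a priori error = signal estimate
    c = c + g * e[n]
    P = (P - np.outer(g, gt)) / lam

print(np.mean((e[N // 2:] - s[N // 2:]) ** 2))    # ~ sigma_{v2}^2: v2 cannot be cancelled
```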
Recursive Least Squares (RLS) – Example

$$ \mu_{LMS} = \frac{1}{20 M \sigma_x^2}, \qquad \mu_{NLMS} = \tfrac{1}{2}, \qquad \lambda_{RLS} = 0.98, \qquad \chi(R) = 1.57 $$

[Figure: learning curves e^2(n) (top) and average e^2(n) (bottom), n = 0 to 800, for LMS, NLMS, and RLS]
Recursive Least Squares (RLS) – Example

$$ \mu_{LMS} = \frac{1}{20 M \sigma_x^2}, \qquad \mu_{NLMS} = \tfrac{1}{2}, \qquad \lambda_{RLS} = 0.98, \qquad \chi(R) = 12.9 $$

[Figure: learning curves e^2(n) (top) and average e^2(n) (bottom), n = 0 to 800, for LMS, NLMS, and RLS]
Recursive Least Squares (RLS) – Example

$$ \mu_{LMS} = \frac{1}{20 M \sigma_x^2}, \qquad \mu_{NLMS} = \tfrac{1}{2}, \qquad \lambda_{RLS} = 0.98, \qquad \chi(R) = 45.16 $$

[Figure: learning curves e^2(n) (top) and average e^2(n) (bottom), n = 0 to 800, for LMS, NLMS, and RLS]
Recursive Least Squares (RLS) – Example

$$ \mu_{LMS} = \frac{1}{20 M \sigma_x^2}, \qquad \mu_{NLMS} = \tfrac{1}{2}, \qquad \lambda_{RLS} = 0.98, \qquad \chi(R) = 128.6 $$

[Figure: learning curves e^2(n) (top) and average e^2(n) (bottom), n = 0 to 800, for LMS, NLMS, and RLS]
Recursive Least Squares (RLS) – Example

$$ \mu_{LMS} = \frac{1}{20 M \sigma_x^2}, \qquad \mu_{NLMS} = \tfrac{1}{2}, \qquad \lambda_{RLS} = 0.98, \qquad \chi(R) = 409.45 $$

[Figure: learning curves e^2(n) (top) and average e^2(n) (bottom), n = 0 to 800, for LMS, NLMS, and RLS]
Recursive Least Squares (RLS) – Example

$$ \mu_{LMS} = \frac{1}{20 M \sigma_x^2}, \qquad \mu_{NLMS} = \tfrac{1}{2}, \qquad \lambda_{RLS} = 0.98, \qquad \chi(R) = 892.49 $$

[Figure: learning curves e^2(n) (top) and average e^2(n) (bottom), n = 0 to 800, for LMS, NLMS, and RLS]
RLS – Other Aspects

• Fast RLS Algorithms
  – Exploit the shift structure of tapped-delay-line inputs to reduce the cost per update from O(M^2) to O(M)
  – Rather complicated to implement in practice
RLS – Other Aspects

• Tracking performance for non-stationary data
• Convergence is not the same as tracking
• While RLS generally converges faster and with less excess MSE than LMS, this does not imply better tracking performance
  – LMS tracking – follows the bottom of a moving bowl
  – RLS tracking – re-estimates a changing bowl
• Analysis is very complicated. Usually simple assumptions are made on the non-stationary characteristics (e.g., a Markov process)
• No simple answer on whether LMS or RLS performs better – it depends.