
Adaptive Noise Cancellation using

LMS and RLS Algorithms

Project work submitted

by

V. A. Amarnath

B.Tech. Electronics and Communication Engineering

NIT Calicut


Adaptive Noise Cancellation using LMS and RLS Algorithms by V. A. Amarnath is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 India License.


Acknowledgement

I thank God Almighty for showering his blessings and guiding me in accomplishing this project.

I am extremely grateful to Shri. S. Anantha Narayanan, Director, NPOL, for giving me this golden opportunity to do my internship in this prestigious organization.

I express my heartfelt gratitude to Dr. A. Unnikrishnan, Scientist `G', Chairman, HRD Council, and Mrs. Lasitha Ranjit, Scientist `F', HRD Coordinator, for facilitating the project work, and Mr. K. V. Surendran, Technical Officer, for his support.

It would not have been possible to carry out this work without the guidance and encouragement of my guide, Mr. Sarath Gopi, Scientist `D', Naval Physical and Oceanographic Laboratory (NPOL), Kochi. I express my sincere gratitude towards him.

I also thank one and all who have been directly or indirectly involved in the successful completion of this project.

V. A. Amarnath


Contents

1 Wiener Filters
  1.1 Introduction
  1.2 Principle of Orthogonality
  1.3 Minimum Mean Square Error
  1.4 Wiener-Hopf Equations

2 Least Mean Squares Algorithm
  2.1 Introduction
  2.2 Method of Steepest Descent
  2.3 LMS Algorithm
  2.4 Statistical LMS Theory and Small Step Analysis
  2.5 LMS algorithm Example - AR process
  2.6 LMS algorithm Example - FIR filter
  2.7 Appendix

3 Recursive Least Squares Algorithm
  3.1 Introduction
  3.2 RLS Algorithm

4 Applications of LMS and RLS Algorithms
  4.1 Adaptive Noise Cancellation
    4.1.1 LMS Noise Cancellation
    4.1.2 RLS Noise Cancellation
  4.2 Adaptive Equalization
  4.3 Appendix


1 Wiener Filters

1.1 Introduction

The Wiener filter is a class of optimum discrete-time filters. Wiener filters are used to solve the signal estimation problem for stationary signals. Typically, Wiener filters are used to filter out noise that has corrupted a signal.

Wiener filters are designed on the basis of knowledge of the spectral properties of the original signal and the noise. The design yields a linear time-invariant filter whose output tends to the original signal. The Wiener filter can be designed only if the signal and the noise are stationary linear stochastic processes. The Wiener filter works by minimizing the mean square error; hence, it is also called a minimum mean square error filter.

Figure 1: Block Diagram of a Wiener Filter

In Fig. (1), u(n) represents the input to the filter at discrete time n and y(n) represents the corresponding filter output. w(n) represents the impulse response of the filter. The output y(n) is used to estimate a desired response, say d(n). The estimation error e(n) is then defined as the difference between the desired response d(n) and the filter output y(n). The requirement is to minimize the estimation error e(n).

Wiener filters considered in this project are all FIR filters, as they are inherently stable. FIR filters have only forward paths, whereas IIR filters have both feedforward and feedback paths. Therefore, the design of IIR filters needs to be done carefully so that the filter does not become unstable.

1.2 Principle of Orthogonality

From Figure 1, the filter output y(n) can be defined by linear convolution as

y(n) = ∑_{i=0}^{∞} w_i^* u(n−i),   n = 0, 1, 2, …   (1)


The aim of this filter is to produce an estimate of the desired response d(n). So, it is required to minimize the estimation error e(n) given by

e(n) = d(n)− y(n) (2)

To optimize the filter design, the mean square value of e(n) needs to be minimized. The cost function is therefore the mean square error given by

C = E[e(n) e^*(n)] = E[|e(n)|^2]   (3)

When u(n) is complex, the filter coefficients are generally also complex. The i-th filter coefficient w_i can be expressed as

w_i = a_i + j b_i,   i = 0, 1, 2, …   (4)

To minimize the cost function C, the gradient operator is used. C attains its minimum value when all elements of ∇C are simultaneously equal to zero:

∇_i C = ∂C/∂a_i + j ∂C/∂b_i   (5)
      = 0   (6)

Under these conditions, the filter is said to be optimum in the mean square error sense. Now, substituting Eq. (3) in Eq. (5),

∇_i C = E[ (∂e(n)/∂a_i) e^*(n) + (∂e^*(n)/∂a_i) e(n) + j (∂e(n)/∂b_i) e^*(n) + j (∂e^*(n)/∂b_i) e(n) ]   (7)

Using Eqs. (2) and (4)

∂e(n)/∂a_i = −u(n−i),      ∂e(n)/∂b_i = j u(n−i)
∂e^*(n)/∂a_i = −u^*(n−i),   ∂e^*(n)/∂b_i = −j u^*(n−i)   (8)


Using Eqs. (8) in Eq. (7) and simplifying gives

∇_i C = −2 E[u(n−i) e^*(n)]   (9)

For optimum operating conditions of the filter, from Eq. (6),

E[u(n−i) e_o^*(n)] = 0   (10)

where e_o denotes the optimum estimation error. In words, Eq. (10) states:

The necessary and sufficient condition for the cost function C to attain its minimum value is for the corresponding value of the estimation error e_o(n) to be orthogonal to each input sample that enters into the estimation of the desired response at time n.

Corollary:

When the filter operates in its optimum condition, the estimate of the desired response defined by the filter output y_o(n) and the corresponding estimation error e_o(n) are orthogonal to each other.

1.3 Minimum Mean Square Error

When the filter is operating under the optimum conditions,

e_o(n) = d(n) − y_o(n)
⇒ d(n) = d̂(n|Γ_n) + e_o(n)   (11)

where d̂(n|Γ_n) denotes the estimate of the desired response, optimized in the MSE sense, given that the input signal spans the space Γ_n.

Evaluating the mean square value of Eq. (11),

σ_d^2 = σ_d̂^2 + ε_min   (12)

where σ_d^2 is the variance of the desired response, σ_d̂^2 is the variance of the estimate d̂(n|Γ_n), and ε_min denotes the minimum mean square error. Here, it is assumed that both random variables have zero mean. Defining the normalized mean square error ε = ε_min/σ_d^2, from Eq. (12),

ε_min = σ_d^2 − σ_d̂^2
ε_min/σ_d^2 = 1 − σ_d̂^2/σ_d^2
ε = 1 − σ_d̂^2/σ_d^2   (13)


From Eq. (13) it can be noted that,

0 ≤ ε ≤ 1

1.4 Wiener-Hopf Equations

In Section 1.2, the necessary and sufficient condition for optimum operation of the filter was specified. Using Eqs. (1) and (2) in Eq. (10) gives

E[ u(n−i) ( d^*(n) − ∑_{k=0}^{∞} w_{ok} u^*(n−k) ) ] = 0   (14)

where w_{ok} is the k-th coefficient in the impulse response of the optimum filter. From Eq. (14),

∑_{k=0}^{∞} w_{ok} E[u(n−i) u^*(n−k)] = E[u(n−i) d^*(n)]   (15)

The first expectation term, E[u(n−i) u^*(n−k)], in Eq. (15) is equal to the autocorrelation function of the filter input for a lag of k−i, i.e.,

Ruu(k − i) = E [u(n− i)u∗(n− k)] (16)

and the second expectation term, E[u(n−i) d^*(n)], in Eq. (15) is equal to the cross correlation between the filter input u(n−i) and the desired response d(n) for a lag of −i, i.e.,

rud(−i) = E [u(n− i)d∗(n)] (17)

Using the definitions from Eqs. (16) and (17) in Eq. (15),

∑_{k=0}^{∞} w_{ok} R_uu(k−i) = r_ud(−i),   i = 0, 1, 2, …   (18)

The system of equations (18) defines the optimum filter coefficients in terms of the autocorrelation function of the input and the cross correlation between the filter input and the desired response. These equations are called the Wiener-Hopf equations.


Matrix Formulation of Wiener-Hopf Equations

Let R_uu denote the M × M correlation matrix of the filter input u(n), R_uu = E[u(n) u^H(n)], where u(n) = [u(n), u(n−1), …, u(n−M+1)]^T is the M × 1 input vector. The expanded form of R_uu is

R_uu = [ R_uu(0)        R_uu(1)        …   R_uu(M−1)
         R_uu^*(1)      R_uu(0)        …   R_uu(M−2)
         ⋮              ⋮              ⋱   ⋮
         R_uu^*(M−1)    R_uu^*(M−2)    …   R_uu(0)   ]

The M × 1 cross correlation vector is given by r_ud = E[u(n) d^*(n)], which in expanded form is

r_ud = [r_ud(0), r_ud(−1), …, r_ud(1−M)]^T

Thus, the Wiener-Hopf equation given in Eq. (18) can be expressed as,

R_uu w_o = r_ud   (19)

where w_o denotes the M × 1 optimum filter weight vector given as

w_o = [w_o0, w_o1, …, w_o(M−1)]^T

Hence, the Wiener-Hopf equation can be solved to obtain the filter coefficients if the correlation matrix of the input vector u(n) and the cross correlation vector between the input vector u(n) and the desired response d(n) are known:

w_o = R_uu^{−1} r_ud   (20)

where it is assumed that R_uu is non-singular.
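As an illustration, a minimal Matlab sketch of this computation is given below. It estimates R_uu and r_ud from sample data and solves Eq. (20) directly; the signal names, filter order and the example desired response are assumptions chosen only for the sketch, not part of the project code.

% Minimal sketch: Wiener solution from sample estimates (illustrative only)
M = 4;                           % assumed filter order
N = 5000;                        % number of samples
u = randn(1, N);                 % example input signal (assumed)
d = filter([1 0.5 0.2], 1, u);   % example desired response (assumed)
U = zeros(M, N);                 % tap-input vectors u(n) = [u(n) ... u(n-M+1)]^T
for k = 1:M
    U(k, k:N) = u(1:N-k+1);
end
Ruu = (U * U') / N;              % sample estimate of E[u(n) u^H(n)]
rud = (U * d') / N;              % sample estimate of E[u(n) d*(n)]
wo  = Ruu \ rud;                 % Wiener solution, Eq. (20)

For complex-valued data, the transposes above would become conjugate transposes, matching the definitions of R_uu and r_ud given earlier.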

Need for adaptive filters: From the previous derivations, it is clear that the Wiener solution is the optimum solution for a given problem. But once the solution is obtained, the filter coefficients remain fixed, whereas in most practical cases the input signal keeps varying due to several reasons such as changes in channel characteristics, external disturbances, etc. Therefore, the need for an adaptive filter arises. The LMS and the RLS are two basic adaptive filters which try to reach the optimum solution at every instant.


2 Least Mean Squares Algorithm

2.1 Introduction

The least mean square algorithm is one of the stochastic gradient algorithms. The LMS algorithm is significant because of its simplicity: it does not require measurements of the correlation functions, nor does it require any matrix inversion.

The LMS algorithm is a linear adaptive filtering algorithm consisting of two processes:

1. a filtering process, which computes the output of a linear filter in response to an input signal and generates an estimation error by comparing it with a desired response.

2. an adaptive process, which involves the automatic adjustment of the parameters of the filter in accordance with the estimation error.

Figure 2: Block Diagram of an adaptive filter

During the filtering process, the desired response d(n) is given along with the input vector u(n). The filter generates the output d̂(n|Γ_n), which is an estimate of the desired response d(n). The estimation error is computed as the difference between the desired response and the filter output. The estimation error e(n) and the input u(n) are applied to the adaptive weight-control mechanism (see Fig. 2).

In the adaptive process, the estimation error e(n) is first scaled by a factor µ and then the scalar inner product of the estimation error e(n) and the input u(n−i) is computed for i = 0, 1, 2, …, M−1. The result so obtained defines the correction δw_i(n) applied to the weight w_i(n) at iteration n+1. The scaling factor µ is a positive quantity called the step size parameter.

The LMS algorithm uses u(n−i) e^*(n) as the estimate of element i of the gradient vector, which results in gradient noise due to the recursive computation of the weights.


2.2 Method of Steepest Descent

The method of steepest descent is an algorithm used to find the nearest local minimum of a function whose gradient can be computed. The algorithm starts at a point, say P_0, and moves as many times as needed from P_i to P_{i+1} by minimizing along the line extending from P_i in the direction of the local downhill gradient, −∇(P_i).

Figure 3: Method of Steepest Descent

For a one-dimensional cost function J(w), the method can be expressed as

w_i = w_{i−1} − η ∇J(w_{i−1})

The algorithm iterates, starting from w_0, for some η > 0 until a fixed point is reached.
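As a small illustration, the following Matlab sketch applies this iteration to the quadratic J(w) = (w − 2)^2; the function and the step size are assumptions chosen only for the example, and the iteration converges to the fixed point w = 2.

% Minimal sketch: steepest descent on J(w) = (w - 2)^2 (illustrative only)
eta = 0.1;              % step size, eta > 0 (assumed)
w   = 0;                % starting point w0
for i = 1:100
    grad = 2*(w - 2);   % gradient of J at the current point
    w = w - eta*grad;   % steepest descent update
end
disp(w);                % approaches the minimizer w = 2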

2.3 LMS Algorithm

The LMS algorithm makes use of steepest descent to compute the filter weights w(n) which minimize the cost function. Here, the cost function is chosen as the mean square error, see Eq. (3):

C(n) = E[|e(n)|^2]   (21)
     = E[e(n) e^*(n)]   (22)

This cost function needs to be minimized to get the least mean square error. Therefore, the partial derivative of the cost function with respect to the individual weights of the filter coefficient vector is computed.


∇C(n) = ∇{E[e(n) e^*(n)]}
       = 2 E[∇(e(n)) e^*(n)]
       = 2 E[∇(d(n) − w^H u(n)) e^*(n)]
∴ ∇C(n) = −2 E[u(n) e^*(n)]   (23)

The gradient vector points in the direction of steepest ascent of the cost function. To find the minimum of the cost function, it is required to move in the direction opposite to ∇C(n). Mathematically,

w(n+1) = w(n) − (µ/2) ∇C(n)   (24)
⇒ w(n+1) = w(n) + µ E[u(n) e^*(n)]   (25)

The update equation for the weight w(n+1) is as given in Eq. (25). But to compute it, knowledge of E[u(n) e^*(n)] is required. LMS does not use this expectation to update the weights, but computes an instantaneous estimate of it. An unbiased estimator of E[u(n) e^*(n)] is

Ê[u(n) e^*(n)] = (1/N) ∑_{i=0}^{N−1} u(n−i) e^*(n−i)   (26)

where N is the number of samples used for the estimate. Taking the simplest case, N = 1,

Ê[u(n) e^*(n)] = u(n) e^*(n)

Hence, Eq. (25) simplifies to

w(n+ 1) = w(n) + µu(n)e∗(n) (27)

LMS Algorithm

Parameters:

M → filter order

µ → step size parameter

Initialization:

w(0) = 0


Data:

u(n) → M × 1 input vector at time n

d(n) → desired response at time n

Computation:

For n = 0, 1, 2, . . .

e(n) = d(n) − w^H(n) u(n)

w(n+ 1) = w(n) + µu(n)e∗(n)
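The boxed procedure can also be written as a compact Matlab function. The sketch below is a minimal illustration of these update equations; the function name and its arguments are assumptions, and it is separate from the application code in the appendices.

function [w, e] = lms_filter(u, d, M, mu)
% Minimal sketch of the LMS algorithm above (illustrative only).
% u: input signal, d: desired response, M: filter order, mu: step size.
N = length(u);
w = zeros(M, 1);                  % initialization w(0) = 0
e = zeros(1, N);
x = zeros(M, 1);                  % tap-input vector u(n)
for n = 1:N
    x = [u(n); x(1:M-1)];         % shift in the newest input sample
    y = w' * x;                   % filter output w^H(n) u(n)
    e(n) = d(n) - y;              % estimation error
    w = w + mu * x * conj(e(n));  % weight update, Eq. (27)
end
end

For real-valued signals the conj(·) has no effect, and w' reduces to an ordinary transpose.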

Effect of the step size parameter and the order of the filter

The step size parameter and the order of the filter are important ingredients in the design of the LMS filter. Proper selection of these gives a minimum error and an optimum result close to the Wiener solution. How the step size parameter and the order affect the filter is shown in Figs. 4 and 5.

Figure 4: Learning curve for M = 3

It can be observed that changing µ affects the learning curve. A small µ causes the curve to take more iterations to converge to the final value, whereas a larger µ speeds up the convergence. Also, it can be noted that increasing the order speeds up the convergence of the curve, but the steady state error also increases.


Figure 5: Learning curve for µ = 0.01

2.4 Statistical LMS Theory and Small Step Analysis

Previously, the LMS filter was referred to as a linear adaptive filter. In reality, LMS is a nonlinear filter which violates the superposition and homogeneity conditions satisfied by a linear filter. Consider an LMS filter with initial condition w(0) = 0; then

w(n) = µ ∑_{i=−∞}^{n−1} e^*(i) u(i)   (28)

The output y(n) of the LMS filter is given by

y(n) = w^H(n) u(n)   (29)
⇒ y(n) = µ ∑_{i=−∞}^{n−1} e(i) u^H(i) u(n)   (30)

From Eq. (30), it can be noted that the output y(n) of the LMS filter is a nonlinear function of the input u(n). The LMS filter can be analyzed using the weight-error vector given by

w̃(n) = w_o − w(n)   (31)


From Eq. (27),

w_o − w(n+1) = w_o − {w(n) + µ u(n) e^*(n)}   (32)
⇒ w̃(n+1) = w̃(n) − µ u(n) [d(n) − w^H(n) u(n)]^*
          = w̃(n) − µ u(n) [d(n) − (w_o − w̃(n))^H u(n)]^*
          = w̃(n) − µ u(n) [{d(n) − w_o^H u(n)} + w̃^H(n) u(n)]^*
∴ w̃(n+1) = w̃(n) − µ u(n) [e_o^*(n) + u^H(n) w̃(n)]   (33)

Now, applying the expectation operator to Eq. (33),

E[w̃(n+1)] = E[w̃(n)] − E[µ u(n) e_o^*(n)] − E[µ u(n) u^H(n) w̃(n)]   (34)

Computing E[µ u(n) e_o^*(n)] first,

E[µ u(n) e_o^*(n)] = E[µ u(n) {d(n) − w_o^H u(n)}^*]
                   = µ {E[u(n) d^*(n)] − E[u(n) u^H(n)] w_o}
                   = µ {r_ud − R_uu w_o}
                   = µ {r_ud − R_uu R_uu^{−1} r_ud}
∴ E[µ u(n) e_o^*(n)] = 0   (35)

Using the result obtained in Eq. (35) in Eq. (34), and assuming u and w̃ are independent,

E[w̃(n+1)] = E[w̃(n)] − µ E[u(n) u^H(n)] E[w̃(n)]
           = {I − µ R_uu} E[w̃(n)]

By eigendecomposition of R_uu = U^{−1} Λ U,

E[w̃(n+1)] = {I − µ U^{−1} Λ U} E[w̃(n)]
           = U^{−1} {I − µ Λ} U E[w̃(n)]
U E[w̃(n+1)] = {I − µ Λ} U E[w̃(n)]
E[U w̃(n+1)] = {I − µ Λ} E[U w̃(n)]   (36)

Changing variables, U w̃(n) → W̃(n),

E[W̃(n+1)] = {I − µ Λ} E[W̃(n)]   (37)


Using Eq. (37), for n = 0, 1, 2, . . .

E[W̃(1)] = {I − µ Λ} E[W̃(0)]
E[W̃(2)] = {I − µ Λ}^2 E[W̃(0)]
   ⋮
E[W̃(n)] = {I − µ Λ}^n E[W̃(0)]   (38)

For the filter to be stable, Eq. (38) should converge. Thus,

|1 − µλ| < 1
∴ µλ > 0 and µλ < 2
or, µ > 0 and µ < 2/λ_max   (39)

where λ represents the eigenvalues of the autocorrelation matrix R_uu. Eq. (39) dictates the selection of the step size parameter µ in the LMS algorithm.
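As a quick illustration, the bound in Eq. (39) can be checked numerically; the Matlab sketch below estimates the autocorrelation matrix of an example signal and computes 2/λ_max. The signal, filter order and the use of xcorr/toeplitz are assumptions made only for this sketch.

% Minimal sketch: numerical check of the step-size bound in Eq. (39)
M = 4;                          % assumed filter order
u = randn(1, 5000);             % example input signal (assumed)
r = xcorr(u, M-1, 'biased');    % autocorrelation estimates for lags -(M-1)..(M-1)
Ruu = toeplitz(r(M:end));       % M x M autocorrelation matrix (lags 0..M-1)
muMax = 2 / max(eig(Ruu));      % upper bound on the step size, Eq. (39)

Any step size in the range 0 < µ < muMax keeps the mean weight-error recursion of Eq. (38) stable.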

2.5 LMS algorithm Example - AR process

The difference equation of the AR process of order one is given as

u(n) = −a u(n−1) + v(n)

where a is the parameter of the process and v(n) is zero-mean white noise with variance σ_v^2. To estimate the parameter a, an adaptive predictor using the LMS algorithm is used.

The algorithm can be written as

w(n+ 1) = w(n) + µu(n− 1)f(n)

f(n) = u(n)− w(n)u(n− 1)

The algorithm is implemented in Matlab and the code is given in the appendix, Section 2.7. The observations are given below.


Figure 6: Transient behavior of weight w(n) for µ = 0.05

The blue curve shows how the weight varies with the number of iterations. The curve appears very noisy; therefore, it is difficult to determine the value to which the curve is actually tending. The experiment is therefore repeated a number of times, here 100 times. Then, the ensemble-averaged weight is calculated and plotted against the number of iterations; the red curve shows this. Here, it can be observed that the weight tends to a particular value, close to the desired one.

It is also observed that changing µ affects the plot. Decreasing µ causes the curve to take more iterations to reach the final value, whereas increasing µ speeds up the convergence.

Figure 7: Transient behavior of squared prediction error - learning curve for µ = 0.05


2.6 LMS algorithm Example - FIR �lter

Here, the LMS algorithm is used to estimate the coefficients of an FIR filter. The desired response is found using the given transfer function of the filter,

F(z) = 1 + (1/2) z^{−1} + (1/5) z^{−2}

The Matlab code is included in the appendix, Section 2.7.

Figure 8: Squared prediction error - Learning curve


Figure 9: Plot of weights w(1), w(2), w(3) vs. no. of iterations (panels (a)-(c))


2.7 Appendix

Code - AR process estimation

clear all;
clc;

EW = zeros(1, 1000);                   %ensemble-averaged weight
EF = zeros(1, 999);                    %ensemble-averaged prediction error
a  = 0.99;                             %AR parameter (assumed value; not stated in the report)
mu = 0.05;                             %step size

for j = 1:100
    v = sqrt(0.02)*randn(1, 1000);     %zero-mean white noise v(n) (assumed variance)
    u = filter(1, [1 a], v);           %AR(1) process u(n) = -a*u(n-1) + v(n)
    w = zeros(1, 1000);                %weight initialization, w(1) = 0
    f = zeros(1, 999);                 %prediction error
    f(1) = u(1);                       %causal system
    for i = 2:999
        f(i)   = u(i) - w(i)*u(i-1);   %prediction error
        w(i+1) = w(i) + mu*u(i-1)*f(i);%LMS weight update
    end
    EW = (EW*(j-1) + w)/j;             %running ensemble average of the weight
    EF = (EF*(j-1) + f)/j;             %running ensemble average of the error
end

figure(1);
clf;
hold on;
plot(w);                               %single-run weight (noisy)
plot(EW, 'r');                         %ensemble-averaged weight
figure(2);
clf;
hold on;
plot(EF.^2, 'r');                      %squared ensemble-averaged prediction error

Code - FIR filter coefficient estimation

mu  = 0.01;                          %step size
num = [1, .5, .2];                   %numerator of F(z), the unknown FIR filter
den = 1;                             %denominator (FIR)
ord = 10;                            %adaptive filter order
N   = 500;                           %number of samples

EE = zeros(1, N);                    %ensemble-averaged squared error
EW = zeros(ord, N);                  %ensemble-averaged weights

for j = 1:100
    x = rand(1, N);                  %input
    y = filter(num, den, x);         %desired output
    w = zeros(ord, N);               %weight history, one column per iteration
    z = zeros(1, N);                 %adaptive filter output
    e = zeros(1, N);                 %estimation error
    X = zeros(ord, 1);               %tap-input vector
    X(1) = x(1);
    z(2) = w(:,1)'*X;
    e(2) = y(2) - z(2);              %causal
    w(:,2) = w(:,1) + mu*X*conj(e(2));
    for i = 2:N-1
        X = circshift(X, [1, 0]);    %shift the tap-input vector
        X(1) = x(i);                 %newest input sample
        z(i+1) = w(:,i)'*X;          %filter output
        e(i+1) = y(i+1) - z(i+1);    %estimation error
        w(:,i+1) = w(:,i) + mu.*X.*conj(e(i+1));   %LMS weight update
    end
    EE = (EE*(j-1) + e.^2)/j;        %running ensemble average of squared error
    EW = (EW.*(j-1) + w)./j;         %running ensemble average of the weights
end

figure(1);
plot(EE.^2, 'b');                    %learning curve

% for j = 1:ord
%     figure(j+1);
%     hold on;
%     plot(EW(j,:), 'b');            %ensemble-averaged weight
%     plot(w(j,:), 'r');             %single-run weight
% end


3 Recursive Least Squares Algorithm

3.1 Introduction

The Recursive Least Squares algorithm is a method of least squares used to develop an adaptive filter. It is a recursive algorithm that estimates the weight vector at iteration n using the least squares estimate of the vector at iteration n−1.

The most significant feature of the RLS algorithm is that it converges much faster than the LMS algorithm. RLS succeeds in doing this by using the inverse correlation matrix of the input data, assumed to have zero mean. For the same reason, the RLS algorithm is much more computationally intensive.

3.2 RLS Algorithm

Beginning from Eqs. (1) and (2),

y(n) = w^T(n) u(n)   (40)
e(n) = d(n) − w^T(n) u(n)   (41)

The cost function C for RLS is defined as

C = ∑_{i=0}^{n} λ^{n−i} e^2(i)   (42)

where λ is called the "forgetting factor", which gives exponentially smaller weights to older error samples. The cost function C depends on the filter coefficients w_n, and it is minimized by taking the partial derivative with respect to the filter coefficients w_n:

∂C(w_n)/∂w_n(k) = 2 ∑_{i=0}^{n} λ^{n−i} e(i) ∂e(i)/∂w_n(k)   (43)
                = −2 ∑_{i=0}^{n} λ^{n−i} e(i) u(i−k)

For the minimum value of the cost function C, Eq. (43) is set equal to zero:

⇒ ∑_{i=0}^{n} λ^{n−i} e(i) u(i−k) = 0

∑_{i=0}^{n} λ^{n−i} [ d(i) − ∑_{j=0}^{M} w_n(j) u(i−j) ] u(i−k) = 0


∑_{j=0}^{M} w_n(j) { ∑_{i=0}^{n} λ^{n−i} u(i−j) u(i−k) } = ∑_{i=0}^{n} λ^{n−i} d(i) u(i−k)   (44)

Eq. (44) in matrix form is expressed as,

R_uu(n) w_n = r_ud(n)
∴ w_n = R_uu^{−1}(n) r_ud(n)   (45)

Now, the terms in Eq. (45) can be expressed recursively as,

r_ud(n) = ∑_{i=0}^{n} λ^{n−i} d(i) u(i)
        = ∑_{i=0}^{n−1} λ^{n−i} d(i) u(i) + d(n) u(n)
∴ r_ud(n) = λ r_ud(n−1) + d(n) u(n)   (46)

Also,

R_uu(n) = ∑_{i=0}^{n} λ^{n−i} u(i) u^T(i)
        = ∑_{i=0}^{n−1} λ^{n−i} u(i) u^T(i) + u(n) u^T(n)
∴ R_uu(n) = λ R_uu(n−1) + u(n) u^T(n)   (47)

The matrix inversion lemma is used to evaluate R_uu^{−1}(n) from Eq. (47). The lemma states

(A + UCV)^{−1} = A^{−1} − A^{−1} U (C^{−1} + V A^{−1} U)^{−1} V A^{−1}   (48)

where A is n × n, U is n × k, C is k × k and V is k × n. In this case, A → λ R_uu(n−1), U → u(n), C → I and V → u^T(n). On substitution in Eq. (48),

R_uu^{−1}(n) = {λ R_uu(n−1) + u(n) u^T(n)}^{−1}
             = λ^{−1} R_uu^{−1}(n−1) − λ^{−1} R_uu^{−1}(n−1) u(n) {1 + u^T(n) λ^{−1} R_uu^{−1}(n−1) u(n)}^{−1} u^T(n) λ^{−1} R_uu^{−1}(n−1)

Let P(n) = R_uu^{−1}(n).


⇒ P(n) = λ^{−1} P(n−1) − g(n) u^T(n) λ^{−1} P(n−1)   (49)

where the gain vector g(n) is given by

g(n) = λ^{−1} P(n−1) u(n) {1 + u^T(n) λ^{−1} P(n−1) u(n)}^{−1}   (50)

on simplifying,

g(n) = λ^{−1} [P(n−1) − g(n) u^T(n) P(n−1)] u(n)
∴ g(n) = P(n) u(n)   (51)

Hence, using Eqs. (46) and (49) in Eq. (45),

w_n = P(n) r_ud(n)
    = λ P(n) r_ud(n−1) + d(n) P(n) u(n)
    = λ [λ^{−1} P(n−1) − g(n) u^T(n) λ^{−1} P(n−1)] r_ud(n−1) + d(n) g(n)
∴ w_n = w_{n−1} + g(n) [d(n) − u^T(n) w_{n−1}]   (52)
  w_n = w_{n−1} + g(n) α(n)   (53)

Hence, Eq. (52) gives the update equation for the RLS algorithm. The correction applied to the weights at time n is g(n) α(n), where α(n) = d(n) − u^T(n) w_{n−1} is the a priori error, calculated before the filter is updated.

RLS Algorithm

Parameters:

M → filter order

λ → forgetting factor

δ → regularization parameter to initialize P(0)

δ = { small positive constant for high SNR
    { large positive constant for low SNR

Initialization:

w(0) = 0

P(0) = δ^{−1} I, where I is an (M+1) × (M+1) identity matrix

Data:

u(n) → M × 1 input vector at time n

d(n) → desired response at time n


Computation:

For n = 0, 1, 2, . . .

α(n) = d(n) − w^T(n−1) x(n)
g(n) = P(n−1) x(n) {λ + x^T(n) P(n−1) x(n)}^{−1}
P(n) = λ^{−1} P(n−1) − g(n) x^T(n) λ^{−1} P(n−1)
w(n) = w(n−1) + α(n) g(n)
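As with the LMS case, the boxed recursion can be written as a compact Matlab function. The sketch below is a minimal illustration of it; the function name and arguments are assumptions, separate from the noise-cancellation script in the appendix.

function [w, alpha] = rls_filter(u, d, M, lambda, delta)
% Minimal sketch of the RLS algorithm above (illustrative only).
% u: input signal, d: desired response, M: filter order,
% lambda: forgetting factor, delta: regularization parameter.
N     = length(u);
w     = zeros(M, 1);                       % initialization w(0) = 0
P     = eye(M) / delta;                    % P(0) = delta^-1 I
alpha = zeros(1, N);
x     = zeros(M, 1);                       % tap-input vector
for n = 1:N
    x = [u(n); x(1:M-1)];                  % shift in the newest input sample
    alpha(n) = d(n) - w' * x;              % a priori error
    g = P * x / (lambda + x' * P * x);     % gain vector, Eq. (50)
    P = (P - g * x' * P) / lambda;         % inverse correlation matrix update, Eq. (49)
    w = w + g * alpha(n);                  % weight update, Eq. (53)
end
end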

Selection of the forgetting factor and the regularization parameter

A smaller value of the forgetting factor λ implies a smaller contribution from older samples; that is, the filter is more sensitive to recent samples. λ can be considered a measure of the memory of the filter, with λ = 1 corresponding to infinite memory.

A high SNR implies a low noise level in the corrupted signal; in this case the RLS algorithm shows exceptionally fast convergence if the regularization parameter δ is given a small value. For low SNR cases, when the noise level is exceptionally high, the regularization parameter δ is given a large value. This improves the performance of the filter.

Comparison between LMS and RLS algorithms

When compared to the LMS algorithm, the RLS algorithm offers faster convergence and lower error at steady state. But the RLS algorithm is much more computationally complex, and if proper design procedures are not followed, it may diverge, resulting in instability.

Fig. (10) shows the squared error plots of the LMS and the RLS algorithms. It can be noted that RLS converges much faster, with lower steady state error.

Figure 10: Learning curves of LMS and RLS algorithms


4 Applications of LMS and RLS Algorithms

4.1 Adaptive Noise Cancellation

The traditional method for noise cancellation makes use of notch filters. A notch filter removes the particular noise frequency for which it is designed. These filters are designed with prior knowledge of the signal and the noise, and they have fixed weights. In practice, when the noise characteristics are not known, adaptive noise cancellers can help. Adaptive filters have the ability to adjust the filter coefficients by themselves and do not require prior knowledge of the signal or noise characteristics.

An adaptive noise canceller is a dual-input system. The primary input consists of the information signal and a noise that are uncorrelated with each other. The secondary input is called the reference input, and it supplies a correlated version of the noise. Noise cancellation works by filtering the reference input and then subtracting it from the primary input to give the information signal. This eliminates or attenuates the noise content in the primary input.

Figure 11: Block diagram of adaptive noise canceller

Concept: Fig. (11) shows the typical noise cancellation system. s is the signal transmitted from the transmitter and n_0 is the noise added by the channel; s and n_0 are uncorrelated. n_1 is another input which is uncorrelated with s but correlated with n_0. The noise n_1 is filtered to produce an output y which is an estimate of n_0. The output y is subtracted from the primary input s + n_0 to give the system output z = s + n_0 − y. The noise n_1 is processed by an adaptive filter which continuously tries to minimize the error signal. Here, the objective is to produce a system output z that best fits the transmitted signal s. This is accomplished by feeding the system output z back as the error signal e to the adaptive filter. The adaptive algorithm actually tries to minimize the total system output power.
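In equation form (a standard argument, stated here for clarity): since s is uncorrelated with n_0 and n_1, and hence with y,

E[z^2] = E[s^2] + E[(n_0 − y)^2] + 2 E[s (n_0 − y)]
       = E[s^2] + E[(n_0 − y)^2]

Minimizing the total output power E[z^2] therefore minimizes E[(n_0 − y)^2], i.e. it makes y the best estimate of n_0 and z the best estimate of s.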

Problem: A music file corrupted with drum beats is given. The objective is to cancel out the drum beats and generate a file which is noise free. Also given is a file containing drum beats which is uncorrelated with the music but correlated with the drum beats in the first file.

Here, the music file is taken as the primary input and the second file with drum beats is taken as the secondary input. Then, the adaptive noise canceller is implemented using the LMS and the RLS algorithms, as explained in the next sections.

4.1.1 LMS Noise Cancellation

The adaptive filter may use the LMS algorithm for estimating the filter coefficients. The LMS algorithm explained in Section 2.3 is implemented for noise cancellation and is used to process a noise-corrupted audio file and cancel out the noise present in it. This is simulated using Matlab; the code is given in the appendix, Section 4.3.

The plots in Fig. (12) below show the power spectral density of the noise-corrupted input signal and the noise-cancelled output signal.

(a) PSD of input noise corrupted signal (b) PSD of the output signal

Figure 12: Noise cancellation using LMS algorithm

4.1.2 RLS Noise Cancellation

For adaptive noise cancellation, the RLS algorithm given in Section 3.2 can also be used. When the RLS algorithm is used to estimate the filter coefficients, it is observed that the noise cancellation occurs much faster and the steady-state error is also smaller. From a qualitative analysis of the output signal, it can be concluded that the performance of the RLS filter is better compared to the LMS filter.

The Matlab code for the RLS noise cancellation is given in the appendix, Section 4.3. The plots in Fig. (13) below show the power spectral density of the noise-corrupted input signal and the noise-cancelled output signal.

(a) PSD of input noise corrupted signal (b) PSD of the output signal

Figure 13: Noise cancellation using RLS algorithm

4.2 Adaptive Equalization

An equalizer is a system which tries to recover the message or information signal transmitted over an ISI channel. Adaptive equalizers are a class of equalizers which try to overcome the effects of multipath propagation and Doppler spreading on phase shift modulation schemes.

Intersymbol interference (ISI) is a form of signal distortion caused when one symbol interferes with subsequent symbols. This has an effect similar to noise interference and makes the communication unreliable. Transmitters and receivers need to incorporate design features that minimize the effect of ISI. Adaptive equalization and error correcting codes are most commonly used to take care of ISI effects.

Causes:

Bandlimited channels: In most practical cases the channels are bandlimited. Therefore, signal frequencies below and above the cutoff frequencies are attenuated by the channel. This affects the shape of the received pulse. As the higher frequencies are attenuated in the frequency domain, the pulse spreads in the time domain. This causes a symbol to interfere with neighbouring symbols.

Multipath propagation: This is yet another cause of ISI. When the signal is transmitted over a wireless channel, it may propagate through various paths owing to reflection, refraction and ionospheric reflection. This results in the arrival of different versions of the signal at different times. This delay may cause a symbol to interfere with other symbols, resulting in ISI.

Adaptive Equalizer:

At the receiver, the adaptive equalizer tries to undo the effects of the channel on the signal before decoding the information from the received signal.

The transmitter sends a sequence of data, called the training sequence, before the actual message is transmitted. This sequence is already known to the receiver. Therefore, the receiver tries to estimate the transfer function of the channel from the received sequence. The adaptive filter plays the role of estimating the training sequence: the known sequence is taken as the primary input and the received sequence is chosen as the secondary input. The error signal is the difference between the known sequence and the filtered sequence, and the adaptive algorithm tries to minimize it.

Once the channel transfer function is estimated, linear convolution with the received sequence gives the estimate of the transmitted sequence. The delay due to the convolution operation has to be taken care of.

Figure 14: BER vs 1/σ_n^2

Here, the adaptive equalizer is implemented using the LMS algorithm given in Section 2.3, for a channel with transfer function h(n).


The channel transfer function is given as

h(n) = { (1/2) {1 + cos((2π/3)(n−2))},   n = 1, 2, 3
       { 0,                               otherwise

The Matlab code for the same is given in the appendix, Section 4.3. Fig. (14) shows the variation of the bit error rate (BER) with the inverse of the noise variance (1/σ_n^2).

4.3 Appendix

Noise Cancellation using LMS algorithm

%lms code
clear all; clc;

[y, fs, nbit] = wavread('file4_i.wav');   %primary input: noise corrupted file
N = length(y);
[u, fs, nbit] = wavread('file4_n.wav');   %reference input: correlated noise

ord = 11;                                 %filter order
mu  = 0.1;                                %step size
w   = zeros(ord, 1);                      %weight vector
X   = zeros(ord, 1);                      %tap-input vector of the reference
X(1) = u(1);
z = w'*X;
e(2) = y(2) - z;                          %causal: first output at n = 2
w = w + mu*X*conj(e(2));

for i = 2:N-1
    X = circshift(X, [1, 0]);             %shift the tap-input vector
    X(1) = u(i);                          %newest reference sample
    e(i+1) = y(i+1) - w'*X;               %system output = error signal
    w = w + mu.*X.*conj(e(i+1));          %LMS weight update
end

periodogram(y, [], 'onesided', 512, fs);  %PSD of the corrupted input
figure(2);
periodogram(e, [], 'onesided', 512, fs);  %PSD of the noise cancelled output
wavwrite(e', fs, nbit, 'file4_f.wav');    %write the noise cancelled file

Noise Cancellation using RLS algorithm

%rls code
clear all; clc;

[y, fs, nbit] = wavread('file4_i.wav');   %primary input: noise corrupted file
% y = y(1:100000);
N = length(y);
[u, fs, nbit] = wavread('file4_n.wav');   %reference input: correlated noise

ord = 16;                                 %filter order
del = 0.005;                              %regularization parameter
lam = 1;                                  %forgetting factor

w(:,1) = zeros(ord, 1);                   %weight vector
X = zeros(ord, 1);                        %tap-input vector of the reference
X(1) = u(1);
P = eye(ord)./del;                        %P(0) = inv(delta)*I

pi = P*X;
k(:,2) = pi./(lam + X'*pi);               %gain vector
e(2) = y(2) - w(:,1)'*X;                  %a priori error
w(:,1) = w(:,1) + k(:,2)*conj(e(2));      %weight update
P = (P - k(:,2)*X'*P)./lam;               %inverse correlation matrix update

for i = 2:N-1
    X = circshift(X, [1, 0]);             %shift the tap-input vector
    X(1) = u(i);                          %newest reference sample
    pi = P*X;
    k(:,i+1) = pi./(lam + X'*pi);         %gain vector
    e(i+1) = y(i+1) - w(:,1)'*X;          %a priori error = system output
    w(:,1) = w(:,1) + k(:,i+1)*conj(e(i+1));   %weight update
    P = (P - k(:,i+1)*X'*P)./lam;         %inverse correlation matrix update
end

wavwrite(e', fs, nbit, 'rls4.wav');       %write the noise cancelled file
periodogram(y, [], 'onesided', 512, fs);  %PSD of the corrupted input
figure(2);
periodogram(e, [], 'onesided', 512, fs);  %PSD of the noise cancelled output

Adaptive Equalization using LMS algorithm

clear all; clc;

N   = 40000;    %number of transmitted symbols
F   = 3;        %parameter of the channel impulse response
ord = 7;        %adaptive filter (equalizer) order
mu  = 0.05;     %step size
var = 0.01;     %not used below

for cc = 1:10                            %ensemble of independent runs
    for i = 1:N
        s(i) = round(rand);              %random bit sequence
        x(i) = (-1)^s(i);                %antipodal mapping: 0 -> +1, 1 -> -1
    end
    sig_pow = sum(abs(s).^2)/length(x);  %power of the bit sequence s

    for i = 1:3
        h(i) = .5*(1 + (cos(2*pi*(i-2)/F)));   %channel impulse response h(n)
    end
    y_free = conv(x, h);                 %noise-free received sequence

    SNR = 5;
    k = 1;
    while SNR < 25
        y = awgn(y_free, SNR);           %add white Gaussian noise

        %LMS training over the first N/4 symbols (training sequence)
        w = zeros(ord, 1);
        X = zeros(ord, 1);               %tap-input vector of the known sequence
        X(1) = x(1);
        z = w'*X;
        e(2) = y(2) - z;                 %causal
        w = w + mu*X*conj(e(2));
        for i = 2:N/4
            X = circshift(X, [1, 0]);
            X(1) = x(i);
            e(i+1) = y(i+1) - w'*X;
            w = w + mu.*X.*conj(e(i+1));
        end

        dec = conv(w, y(N/4:N));         %apply the estimated response to the remaining received sequence
        for i = 1:length(dec)            %hard decision on each sample
            if dec(i) < 0
                dec(i) = -1;
            else
                dec(i) = 1;
            end
        end
        res = dec(3:.75*N+2);            %discard the convolution delay

        c(1,:) = x((N/4)+1:N);           %transmitted symbols
        c(2,:) = res;                    %decided symbols
        count = 0;
        for i = 1:.75*N
            if c(1,i) ~= c(2,i)
                count = count + 1;       %count symbol errors
            end
        end
        BER = count/(.75*N);             %bit error rate

        v(k) = (10^(SNR/10))/sig_pow;    %1/sigma_n^2 for the plot in Fig. 14
        b(k) = BER;
        SNR = SNR + .3;
        k = k + 1;
        if cc == 1
            B = b;
        end
    end
    B = (B*(cc-1) + b)/cc;               %running ensemble average of the BER curve
end

plot(v, B);

