Lecture 3 BME452 Biomedical Signal Processing 2013 (copyright Ali Işın, 2013) 1
BME452 Biomedical Signal Processing
Lecture 3
Signal conditioning
Lecture 3 Outline
In this lecture, we'll study the following signal conditioning methods (specifically for noise reduction):
- Ensemble averaging
- Median filtering
- Moving average filtering
- Principal component analysis
- Independent component analysis (in brief)
Before we study these, an introduction to some mathematics will be given
Mean
The arithmetic mean is the "standard" average, often simply called the "mean":
$$\bar{x} = \frac{1}{N}\sum_{n=1}^{N} x[n]$$
where N denotes the data size (length). In MATLAB, n = 1, ..., N, but sometimes we use n = 0, 1, ..., N-1:
$$\bar{x} = \frac{1}{N}\sum_{n=0}^{N-1} x[n]$$
Example: an experiment yields the following data: 34, 27, 45, 55, 22, 34. To get the arithmetic mean:
- How many items? There are 6, so N = 6
- What is the sum of all items? 217
- Divide the sum by N: 217/6 = 36.1667
Expectation: what is the expected value of X, E[X]? Simply put, it refers to the sum divided by the quantity, i.e. the mean of the value in the square brackets
E.g.:
$$E[x^2] = \frac{1}{N}\sum_{n=0}^{N-1} x[n]^2$$
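As a quick check of the numbers above, the following minimal MATLAB sketch computes the mean and E[x^2] for the example data directly from the definitions. The variable names (m, Ex2) are only illustrative.

```matlab
x = [34 27 45 55 22 34];   % data from the example above
N = length(x);             % number of items (6)

m   = sum(x) / N;          % arithmetic mean: sum divided by N
Ex2 = sum(x.^2) / N;       % E[x^2]: mean of the squared values

fprintf('mean = %.4f, E[x^2] = %.4f\n', m, Ex2);
```

The built-in mean(x) gives the same value as m.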
Mean removal for signals
Very often, we set the mean to zero before performing any signal analysis. This removes the dc (0 Hz) component:
xm=x-mean(x)
[Figure: a signal before mean removal (values around 13 to 22) and after mean removal (values around -4 to 5), 300 samples]
Mean removal across channels/recordings
Sometimes noise corrupts all the channels of a multi-channel signal, or all the recordings of a single-channel signal. Since this noise is common to all the channels/recordings, the simplest way of removing it is to remove the mean across channels/recordings
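A minimal sketch of mean removal across channels, assuming the data are stored as a matrix with one channel per row. The three test signals and the common interference term are made up for illustration.

```matlab
n = 0:99;
common = 0.5 * sin(2*pi*n/10);          % hypothetical noise shared by all channels

% Three hypothetical channels (rows), each with the same additive noise
X = [sin(2*pi*n/50); cos(2*pi*n/25); zeros(1, 100)] + repmat(common, 3, 1);

% At every sample point, subtract the mean taken across the channels
chanMean = mean(X, 1);                  % 1 x 100, one value per sample point
Xc = X - repmat(chanMean, size(X, 1), 1);
```

Note that the across-channel mean also contains the average of the true signals, so this works best when the common noise dominates that average.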
Standard deviation (σ)
Measures how spread out the values in a data set are
Suppose we are given a signal x_1, ..., x_N of real-valued numbers (all recorded signals are real-valued). The arithmetic mean of this population is defined as
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$
The standard deviation of this population is defined as
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i-\bar{x})^2}$$
Given only a sample of values x_1, ..., x_N from some larger population, many authors define the sample (or estimated) standard deviation by
$$s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i-\bar{x})^2}$$
Its square, s^2, is an unbiased estimator of the actual (population) variance
Standard deviation example
Interpreting standard deviation
A large standard deviation indicates that the data points are far from the mean, and a small standard deviation indicates that they are clustered closely around the mean
For example, each of the three samples (0, 0, 14, 14), (0, 6, 8, 14), and (6, 6, 8, 8) has an average of 7.
Their population standard deviations (dividing by N) are 7, 5 and 1, respectively.
The third set has a much smaller standard deviation than the other two because its values are all close to 7.
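The three values can be reproduced with MATLAB's std. This is a small sketch: the second argument 1 selects the population form (division by N), while the default is the sample form (division by N-1).

```matlab
a = [0 0 14 14];
b = [0 6 8 14];
c = [6 6 8 8];

% Population standard deviation (divide by N)
popStd = [std(a, 1), std(b, 1), std(c, 1)];

% Sample standard deviation (divide by N-1), slightly larger
smpStd = [std(a), std(b), std(c)];

disp(popStd);
disp(smpStd);
```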
Normalisation
Sometimes, we may wish to normalise a signal to mean=0 and set the standard deviation to 1
For example, if we record the same signal using different instruments with different amplification factors, it will be difficult to analyse the signals together
In this case, we normalise the signals using
$$x_{nor}[n] = \frac{x[n]-\bar{x}}{\mathrm{std}(x)}$$
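A short sketch of why normalisation helps: the same signal recorded with a different gain and offset (both hypothetical here) gives the same normalised signal.

```matlab
x1 = [34 27 45 55 22 34];   % signal from instrument 1
x2 = 10 * x1 + 3;           % hypothetical instrument 2: different gain and offset

% Normalise each recording to mean 0 and standard deviation 1
n1 = (x1 - mean(x1)) / std(x1);
n2 = (x2 - mean(x2)) / std(x2);

disp(max(abs(n1 - n2)));    % difference between the two normalised signals
```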
Variance
Variance is simply the square of standard deviation:
$$\mathrm{var} = \frac{1}{N-1}\sum_{i=1}^{N}(x_i-\bar{x})^2$$
Uncertainty measure
- Variance may be thought of as a measure of uncertainty
- When deciding whether measurements agree with a theoretical prediction, variance can be used
- If the variance (computed using the predicted value as the mean) is high, then the measurements contradict the prediction
Example: say we have predicted that x[1]=7, x[2]=6, x[3]=5
x is measured 3 times: (7.2 6.7 5.6); (4.2 6.8 5.2); (11.2 6.3 5.9)
Compute the variance using the predicted value as mean
var[1]=12.76, var[2]=0.610, var[3]=0.605
So, we know that the x[1] measurements contradict the prediction, while the x[2] and x[3] measurements probably do not
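The three variances in the example can be checked with a few lines of MATLAB. This sketch stores one measurement run per row and uses the predicted values in place of the mean.

```matlab
meas = [ 7.2 6.7 5.6;      % measurement run 1: x[1], x[2], x[3]
         4.2 6.8 5.2;      % measurement run 2
        11.2 6.3 5.9];     % measurement run 3
pred = [7 6 5];            % predicted values of x[1], x[2], x[3]

N = size(meas, 1);
dev = meas - repmat(pred, N, 1);        % deviation from the prediction
v = sum(dev.^2, 1) / (N - 1);           % variance about the predicted value

disp(v);
```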
Covariance
If we have multi-channel/multi-trial recorded signals, we can have cross variance, or simply covariance
Covariance measures the joint variation of two different signals (from different channels/recordings)
Covariance between two signals, X and Y, with respective means μ and ν:
$$\mathrm{cov}(X,Y) = E[(X-\mu)(Y-\nu)]$$
The covariance is sometimes used as a measure of "linear dependence" between the two signals, but correlation is a better measure
Correlation
Correlation between two signals, X and Y, is
$$\rho_{X,Y} = \frac{\mathrm{cov}(X,Y)}{\sigma_X\,\sigma_Y}$$
It is simply normalised covariance
It measures linear dependence between X and Y
The correlation is 1 in the case of an increasing linear relationship, −1 in the case of a decreasing linear relationship, and some value in between in all other cases, indicating the degree of linear dependence between the variables
The closer the coefficient is to either −1 or 1, the stronger the correlation between the variables
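A minimal sketch of both definitions on made-up data: y rises linearly with x and z falls linearly with x, so the correlations should be +1 and −1. The expectation is taken here as a mean over samples.

```matlab
x = [1 2 3 4 5];
y = 2*x + 1;     % increasing linear relationship with x
z = -x;          % decreasing linear relationship with x

% Covariance: mean of the product of the deviations from the means
covxy = mean((x - mean(x)) .* (y - mean(y)));
covxz = mean((x - mean(x)) .* (z - mean(z)));

% Correlation: covariance normalised by the two standard deviations
rxy = covxy / (std(x, 1) * std(y, 1));
rxz = covxz / (std(x, 1) * std(z, 1));

R = corrcoef(x, y);   % built-in; R(1,2) is the correlation of x and y
```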
Application of correlation (example)
The diagram shows how an unknown signal can be identified:
- A copy of a known reference signal is correlated with the unknown signal
- The correlation will be high if the reference is similar to the unknown signal
- The unknown signal is correlated with a number of known reference functions
- The value of the correlation shows the degree of similarity to each reference
- The reference with the largest correlation is the most likely match
Application of correlation (another example)
Application to heart disease detection using ECG signals
- Cross correlation is one way in which different types of heart disease can be identified using ECG signals
- Each condition has a characteristic ECG waveform; some examples of ECG signals for different conditions are shown below
- The system has a library of pre-recorded ECG signals (known as templates)
- An unknown ECG signal is correlated with all the ECG templates in this library
- The template with the largest correlation is the most likely match
[Figure: four example ECG signals: Sinus bradycardia, Normal Sinus Rhythm, Right Bundle Branch Block, Accelerated Junctional Rhythm. Horizontal axis: sampling points (0 to 700); vertical axis: amplitude (arbitrary units)]
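The template matching idea can be sketched as follows. The three templates here are synthetic waveforms standing in for pre-recorded ECG templates (they are not real ECG data), and the unknown signal is template 2 plus a made-up interference term.

```matlab
n = 0:199;

% Hypothetical template library, one template per row
templates = [sin(2*pi*n/50);
             sin(2*pi*n/20);
             exp(-((n - 100).^2) / 200)];

% Unknown signal: template 2 corrupted by interference
unknown = templates(2, :) + 0.3 * cos(2*pi*n/7);

% Correlate the unknown signal with every template
r = zeros(1, size(templates, 1));
for k = 1:size(templates, 1)
    R = corrcoef(unknown, templates(k, :));
    r(k) = R(1, 2);
end

[rmax, best] = max(r);   % largest correlation is the most likely match
fprintf('Best match: template %d (correlation %.2f)\n', best, rmax);
```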
Signal-to-noise ratio (SNR)
Before we move on to the noise reduction methods, we need a measure of noise in the signals. This is important to gauge the performance of the noise reduction techniques
For this purpose, we use SNR:
$$\mathrm{SNR} = 10\log_{10}\frac{\text{signal energy}}{\text{noise energy}}$$
$$\mathrm{Energy} = \sum_{n=1}^{N} x[n]^2$$
The original noise: x(noise) = x(original signal) – x(noisy signal)
After using some noise reduction method: x(noise) = x(original signal) – x(noise reduced signal)
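A minimal sketch of the SNR computation, written as two anonymous functions. The names energy and snr_db and the test data are only illustrative.

```matlab
energy = @(x) sum(x.^2);                                  % sum of squared samples
snr_db = @(clean, est) 10*log10(energy(clean) / energy(clean - est));

clean = [1 -1 1 -1];            % hypothetical original signal
noisy = [1.1 -0.9 0.9 -1.1];    % the same signal with noise of amplitude 0.1

% Signal energy 4, noise energy 0.04, so the SNR is 10*log10(100) = 20 dB
s = snr_db(clean, noisy);
fprintf('SNR = %.2f dB\n', s);
```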
Ensemble averaging
If we have many recordings, we can use ensemble averaging to reduce noise that is not correlated between the recordings
Ensemble averaging to reduce noise from Evoked Potential (EP) EEG:
- Repeated recordings are known as trials
- EP EEG signals from one trial to another are about the same (high correlation)
- But the noise differs from one trial to another (low correlation)
- Hence, ensemble averaging can be used to reduce the noise
[Figure: EP EEG; EP EEG+noise (trial 1), EP EEG+noise (trial 2), ..., EP EEG+noise (trial 20); EP EEG after ensemble averaging. Horizontal axis: 300 samples]
Worked example 1 - Ensemble averaging
Assume we have 3 signals corrupted with noise. Assume we have the original also (for SNR computation).

n                 0      1      2      3
Noisy signal 1    2.9    4.9    5.3    7.3
Noisy signal 2    2.9    5.1    5.1    6.9
Noisy signal 3    3.1    4.8    5.1    7
Original          3      5      5      7

1. Set the mean to zero first

n                 0      1      2      3
Noisy signal 1   -2.2   -0.2    0.2    2.2
Noisy signal 2   -2.1    0.1    0.1    1.9
Noisy signal 3   -1.9   -0.2    0.1    2.0
Original         -2.0    0.0    0.0    2.0

2. The ensemble average is (the average is done for each sample point n, rounded here to one decimal place)

n                 0      1      2      3
Ensemble average -2.1   -0.1    0.1    2.0

3. The noises in the signals are (original signal – noise corrupted signal)

n                        0      1      2      3
Signal 1 noise           0.2    0.2   -0.2   -0.2
Signal 2 noise           0.1   -0.1   -0.1    0.1
Signal 3 noise          -0.1    0.2   -0.1    0.0
Ensemble average noise   0.1    0.1   -0.1    0.0

4. SNR=10log10(e(signal)/e(noise)), with original signal energy = 8

                signal 1   signal 2   signal 3   ensemble average
Noise energy    0.16       0.04       0.06       0.03
SNR (dB)        16.99      23.01      21.25      24.26
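The worked example can be reproduced with the sketch below (variable names are illustrative). It keeps full precision for the ensemble average, so its SNR comes out at about 23.80 dB; the 24.26 dB in the table follows from rounding the average to one decimal place first.

```matlab
X = [2.9 4.9 5.3 7.3;      % noisy signal 1
     2.9 5.1 5.1 6.9;      % noisy signal 2
     3.1 4.8 5.1 7.0];     % noisy signal 3
orig = [3 5 5 7];          % original signal

% Step 1: set the mean of every signal to zero
Xm    = X - repmat(mean(X, 2), 1, size(X, 2));
origm = orig - mean(orig);

% Step 2: ensemble average, taken at each sample point n
ens = mean(Xm, 1);

% Steps 3 and 4: noise = original - estimate, then SNR in dB
energy = @(x) sum(x.^2);
snr_db = @(clean, est) 10*log10(energy(clean) / energy(clean - est));

snr_noisy = [snr_db(origm, Xm(1,:)), snr_db(origm, Xm(2,:)), snr_db(origm, Xm(3,:))];
snr_ens   = snr_db(origm, ens);

fprintf('Noisy signals: %.2f %.2f %.2f dB\n', snr_noisy);
fprintf('Ensemble average: %.2f dB\n', snr_ens);
```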
Median filtering
Similar to ensemble averaging, if we have many recordings, we can use median filtering to reduce noise that is not correlated between the recordings
What is median filtering?
If we have x[1] as [3 2 1 0 6 7 9 3 2] from 9 trials, we sort the numbers from small to big, then take the centre value (i.e. the 5th) as the median
Sorted x[1] is [0 1 2 2 3 3 6 7 9], so median x[1]= 3
Median filtering is advantageous compared to ensemble averaging if there is one trial containing a lot of noise AND the number of trials/recordings is small
This is because the one heavily noise-corrupted signal will distort the ensemble average values but is less likely to affect the median values
[Figure: M trials (rows 1, 2, 3, ..., M) by data length (columns n=1, 2, ..., N); the median value is obtained across the trials at each sample point n]
Worked example 2 – median filtering
Assume we have 3 signals corrupted with noise, one heavily corrupted (assume the mean has been set to zero)

n                 0       1       2       3
Noisy signal 1   -2.15    0.95   -0.25    1.45
Noisy signal 2   -9       0       5       4
Noisy signal 3   -2.25   -0.45    0.85    1.85
Original         -2.0     0.0     0.0     2.0

1. The ensemble average and median filtered signals

n                 0       1       2       3
Ensemble average -4.47    0.17    1.87    2.43
Median filter    -2.25    0       0.85    1.85

2. The noises in the signals are (original signal – noise corrupted signal)

n                              0       1       2       3
Noise in signal 1              0.15   -0.95    0.25    0.55
Noise in signal 2              7       0      -5      -2
Noise in signal 3              0.25    0.45   -0.85    0.15
Noise in ensemble averaging    2.47   -0.17   -1.87   -0.43
Noise in median filter signal  0.25    0      -0.85    0.15

3. SNR=10log10(e(signal)/e(noise)), with original signal energy = 8

                signal 1   signal 2   signal 3   ensemble average   median filter
Noise energy    1.29       78         1.01       9.81               0.81
SNR (dB)        7.93       -9.89      8.99       -0.89              9.95

Which technique gave better noise reduction using SNR – ensemble averaging or median filtering?
Why?
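A sketch that recomputes this example (names are illustrative). Because the noise energy of the ensemble average is larger than the signal energy, its SNR is negative.

```matlab
X = [-2.15  0.95 -0.25 1.45;    % noisy signal 1
     -9     0     5    4;       % noisy signal 2 (heavily corrupted)
     -2.25 -0.45  0.85 1.85];   % noisy signal 3
orig = [-2 0 0 2];              % original signal, mean already zero

ens = mean(X, 1);               % ensemble average at each sample point
med = median(X, 1);             % median across the trials at each sample point

energy = @(x) sum(x.^2);
snr_db = @(clean, est) 10*log10(energy(clean) / energy(clean - est));

snr_ens = snr_db(orig, ens);
snr_med = snr_db(orig, med);
fprintf('Ensemble average: %.2f dB, median filter: %.2f dB\n', snr_ens, snr_med);
```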
Moving average filtering
How do we reduce noise if we have only one signal from one recording/trial?
We can't use ensemble averaging or median filtering
Normally, in any signal, the few points before and after a certain point n are correlated (i.e. related)
But generally the noise is not correlated
So, we can use moving average (MA) filtering
It is defined as
$$y[n] = \frac{1}{S}\sum_{i=n}^{n+S-1} x[i]$$
where S is the filter order
Example, for S=3, y[5]=(x[5]+x[6]+x[7])/3
For signals x and y to remain of the same sample length, we have to pad (S-1) zeros to the signal x to get the last (S-1) points of the signal y
[Figure: a window from n to n+S-1; within the window the signal correlation is high and the noise correlation is low]
Moving average filtering – zero padding
If zero padding is NOT allowed:
- x[n], N=256 (x[1] to x[256])
- y[n], N=254 if S=3
- The length of y is S-1 less than that of x, because the last full average is y[254]=(x[254]+x[255]+x[256])/3
If zero padding is allowed:
- x[n], N=256 (x[1] to x[256])
- y[n], N=256, no matter what the value of S
- The length of y is the same as that of x, because y[254]=(x[254]+x[255]+x[256])/3, y[255]=(x[255]+x[256]+0)/2, y[256]=(x[256]+0+0)/1 (the padded zeros are not counted in the divisor)
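A minimal sketch of the moving average for any filter order S, following the convention above in which the last S-1 outputs are averaged over the samples that remain. The input vector is made up.

```matlab
x = [1 2 3 4 5 6];     % hypothetical input signal
S = 3;                 % filter order
N = length(x);

y = zeros(1, N);
for n = 1:N
    last = min(n + S - 1, N);                 % do not read past the end of x
    y(n) = sum(x(n:last)) / (last - n + 1);   % divide by the number of real samples
end

disp(y);   % y(1) = (1+2+3)/3, ..., y(5) = (5+6)/2, y(6) = 6/1
```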
Example - moving average filtering
Assume we have an EEG signal corrupted with noise
Set the mean to zero
Apply a moving average filter to the noisy signal (use filter order=3 and 5)
The higher filter order will remove more noise, but it will also distort the signal more (i.e. remove the signal parts also)
So, a compromise has to be found for the value of S (normally by trial and error)

load eeg;                       % loads the vector eeg (N=256 in this example)
N=length(eeg);
% filter order S=3
for i=1:N-2,
    eegMA1(i)=(eeg(i)+eeg(i+1)+eeg(i+2))/3;
end
eegMA1(N-1)=(eeg(N-1)+eeg(N))/2;   % last S-1 points: average the samples that remain
eegMA1(N)=eeg(N)/1;
% filter order S=5
for i=1:N-4,
    eegMA2(i)=(eeg(i)+eeg(i+1)+eeg(i+2)+eeg(i+3)+eeg(i+4))/5;
end
eegMA2(N-3)=(eeg(N-3)+eeg(N-2)+eeg(N-1)+eeg(N))/4;
eegMA2(N-2)=(eeg(N-2)+eeg(N-1)+eeg(N))/3;
eegMA2(N-1)=(eeg(N-1)+eeg(N))/2;
eegMA2(N)=eeg(N)/1;
subplot(3,1,1), plot(eeg,'g');     % noisy EEG
subplot(3,1,2), plot(eegMA1,'r');  % after MA filtering, S=3
subplot(3,1,3), plot(eegMA2,'b');  % after MA filtering, S=5

[Figure: three panels: noisy EEG (top), MA filtered with S=3 (middle), MA filtered with S=5 (bottom)]
Median filter for noisy images
Consider applying median filtering to some noisy images
In a computer, these grayscale images are stored as 2D arrays x(i,j), where i and j are the coordinates and x is the grayscale value (in general from 0 (black) to 255 (white))
[Figure: noisy images before and after applying the median filter]
A mean (averaging) filter could be applied in a similar manner, though for images the median filter normally gives better results
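A minimal sketch of a 3 x 3 median filter on a tiny made-up image, written with plain loops so that no toolbox is needed. The window size and the choice to leave border pixels unchanged are assumptions of this sketch.

```matlab
img = 10 * ones(5, 5);   % hypothetical uniform grey image
img(3, 3) = 255;         % one isolated white ("salt") noise pixel

out = img;               % border pixels are left unchanged in this sketch
for i = 2:size(img, 1) - 1
    for j = 2:size(img, 2) - 1
        win = img(i-1:i+1, j-1:j+1);   % 3 x 3 neighbourhood around (i,j)
        out(i, j) = median(win(:));    % replace the pixel by the median
    end
end

disp(out);
```

A 3 x 3 mean filter would instead spread the outlier over its neighbours (each affected pixel would become (8*10+255)/9, about 37.2), which is why the median usually copes better with isolated noise pixels.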
Principal component analysis
PCA can be used to reduce noise from signals provided we have repeated recordings or signals from a number of trials or multi-channel signals
Principal components (PCs) are obtained from PCA, which are orthogonal signals, i.e. signals that are uncorrelated to each other
Since noise is less correlated between the trials as compared to the signals, the first few PCs will account for the signals while the last few PCs will account for the noise
By discarding the last few PCs before reconstruction, we’ll get the signals without noise/with less noise
Principal component analysis – algorithm
PCA algorithm:
1. Organise the data, X, in an M x N matrix
2. Set the mean to zero
3. Compute CX = covariance matrix of X
4. Compute the eigenvalues and eigenvectors of CX
5. Sort the eigenvectors (i.e. principal components) in descending order of eigenvalue
6. Compute the Zscores
7. Decide how many PCs to keep using some criterion
8. Reconstruct the noise reduced signals using the first few PCs and Zscores
Eigenvector, eigenvalue – a brief review
The steps of setting the mean to zero and computing covariance have been covered earlier, so let us move to the step of computing eigenvectors and eigenvalues
Let us assume that A=cov(X), where X is the mean zero data
In MATLAB, [V,D] = eig(A) produces matrices of eigenvalues (D) and eigenvectors (V) of matrix A
They satisfy A*V = V*D (matrix products, not element-wise products). Note: A has to be a square matrix
E.g.:
$$\begin{pmatrix}2 & 3\\ 2 & 1\end{pmatrix}\begin{pmatrix}3\\ 2\end{pmatrix} = \begin{pmatrix}12\\ 8\end{pmatrix} = 4\begin{pmatrix}3\\ 2\end{pmatrix}$$
So $(3, 2)^T$ is the eigenvector and 4 is the eigenvalue
$(3, 2)^T$ can be taken as the vector direction
And eigenvalue=4 is the weight of this vector
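The example can be checked numerically. This sketch verifies A*v = 4*v for the vector above and also confirms the relation A*V = V*D for the output of eig.

```matlab
A = [2 3; 2 1];
v = [3; 2];

lhs = A * v;             % gives [12; 8]
rhs = 4 * v;             % eigenvalue times eigenvector, also [12; 8]

[V, D] = eig(A);         % columns of V: eigenvectors, diagonal of D: eigenvalues
lambda = sort(diag(D));  % the eigenvalues of this matrix are -1 and 4
residual = norm(A*V - V*D);

disp(lambda');
```

eig returns eigenvectors scaled to unit length, so the column of V that belongs to eigenvalue 4 is proportional to (3, 2), possibly with the opposite sign.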
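Eigenvector, eigenvalue (cont.)
Finding the eigenvalues and eigenvectors by hand for matrices bigger than 3 x 3 is very tedious, so we will skip the algorithms and just use the MATLAB function eig
Example, for the following square matrix:
$$A = \begin{pmatrix}3 & 0 & 1\\ 4 & 1 & 2\\ 6 & 0 & 2\end{pmatrix}$$
Decide which, if any, of the following vectors are eigenvectors of that matrix and give the corresponding eigenvalue
$$\begin{pmatrix}2\\ 2\\ 1\end{pmatrix},\ \begin{pmatrix}1\\ 0\\ 2\end{pmatrix},\ \begin{pmatrix}1\\ 1\\ 3\end{pmatrix},\ \begin{pmatrix}0\\ 1\\ 0\end{pmatrix},\ \begin{pmatrix}3\\ 2\\ 1\end{pmatrix}$$
Answer: The eigenvector is $(0, 1, 0)^T$ because
$$\begin{pmatrix}3 & 0 & 1\\ 4 & 1 & 2\\ 6 & 0 & 2\end{pmatrix}\begin{pmatrix}0\\ 1\\ 0\end{pmatrix} = \begin{pmatrix}0\\ 1\\ 0\end{pmatrix} = 1\cdot\begin{pmatrix}0\\ 1\\ 0\end{pmatrix}$$
The eigenvalue is 1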
Sort the eigenvectors
Sort the eigenvectors from big to small using the eigenvalues
Let's use the example we saw earlier for ensemble averaging
X=[2.9 4.9 5.3 7.3; 2.9 5.1 5.1 6.9; 3.1 4.8 5.1 7]
Xm=[-2.2 -0.2 0.2 2.2; -2.1 0.1 0.1 1.9; -1.9 -0.2 0.1 2.0]
A=cov(Xm')

A =
  3.2533  2.9333  2.8800
  2.9333  2.6800  2.5933
  2.8800  2.5933  2.5533

The eigenvectors and eigenvalues are obtained with [V,D]=eig(A)

V =
  0.7103  0.3335  0.6198
 -0.1039 -0.8213  0.5610
 -0.6962  0.4629  0.5487

D =
  0.0017  0       0
  0       0.0272  0
  0       0       8.4578

The corresponding eigenvalues are 0.0017, 0.0272, 8.4578
So now sort the eigenvectors in the order of eigenvalues: 8.4578, 0.0272, 0.0017
So the sorted eigenvectors (the columns of V in reverse order) are

Vsort =
  0.6198  0.3335  0.7103
  0.5610 -0.8213 -0.1039
  0.5487  0.4629 -0.6962
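These steps can be sketched in MATLAB as follows. The explicit sort is used because the order returned by eig should not be relied on; the variable names d, idx and Vsort are illustrative.

```matlab
X = [2.9 4.9 5.3 7.3;
     2.9 5.1 5.1 6.9;
     3.1 4.8 5.1 7.0];

% Set the mean of each signal (row) to zero
Xm = X - repmat(mean(X, 2), 1, size(X, 2));

% Covariance between the 3 signals (cov expects one variable per column)
A = cov(Xm');

% Eigen-decomposition, then sort by descending eigenvalue
[V, D] = eig(A);
[d, idx] = sort(diag(D), 'descend');
Vsort = V(:, idx);

disp(d');
```

The signs of the eigenvectors returned by eig are arbitrary, so individual columns of Vsort may differ from the values shown above by a factor of −1.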
Zscores
Zscores=Vsort'*Xm, where Vsort is the matrix of sorted eigenvectors and Xm is the mean zero data matrix
In the previous example, A is 3 x 3
So, we will have 3 Zscores
Zscores will have the same dimensions as Xm

Zscore 1   -3.5843  -0.1776   0.2349   3.5270
Zscore 2    0.1115  -0.2414   0.0309   0.0991
Zscore 3   -0.0218  -0.0132   0.0621  -0.0270
How to select the number of PCs to keep
The PCs with higher eigenvalues represent the signals while the PCs with lower eigenvalues represent the noise
So we keep the first few PCs and discard the rest
But how many PCs do we keep?
One criterion is to retain a certain percentage of the variance, normally 95% or 99%
Eigenvalues represent the weight of the PCs, i.e. a variance (power) measure of the PCs
So, we can use sum(D1:Dq)/sum(D1:Dlast)>0.99, where D represents the eigenvalues [D1,D2,D3,…Dlast]
In our example, say we wish to retain 99% of the variance; the eigenvalues are 8.4578, 0.0272, 0.0017
- sum(D1:Dlast)=8.4867
- sum(D1:D1)=8.4578; sum(D1:D1)/sum(D1:Dlast)=0.9966
- sum(D1:D2)=8.4850; sum(D1:D2)/sum(D1:Dlast)=0.9998
- sum(D1:Dlast)=8.4867; sum(D1:Dlast)/sum(D1:Dlast)=1.0
Since the first eigenvalue accounts for 99.66% of the variance (which is more than 99%), we can discard the second and third PCs
If we wish to retain 99.97%, how many PCs do we retain? Answer=2
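The selection rule can be sketched with a cumulative sum over the sorted eigenvalues. The names ratio, q99 and q9997 are illustrative.

```matlab
d = [8.4578 0.0272 0.0017];       % eigenvalues, sorted in descending order

ratio = cumsum(d) / sum(d);       % fraction of variance kept by the first q PCs

q99   = find(ratio >= 0.99, 1);   % smallest q retaining at least 99%
q9997 = find(ratio >= 0.9997, 1); % smallest q retaining at least 99.97%

fprintf('Cumulative ratios: %.4f %.4f %.4f\n', ratio);
fprintf('PCs to keep: %d (99%%), %d (99.97%%)\n', q99, q9997);
```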
Reconstruct using the selected PCs
To get back the original signals without noise, we need to reconstruct using the selected PCs
Xnonoise=Vselected*Zscoreselected
In our example, only 1 PC was selected, so the first eigenvector and the first Zscore will be used to get back the 3 noise reduced signals
Xnonoise=Vsort(:,1)*Zscore(1,:)

Xnonoise =
 -2.2217 -0.1101  0.1456  2.1861
 -2.0107 -0.0996  0.1318  1.9786
 -1.9668 -0.0975  0.1289  1.9353

The component removed by PCA is noise=Xm-Xnonoise

noise =
  0.0217 -0.0899  0.0544  0.0139
 -0.0893  0.1996 -0.0318 -0.0786
  0.0668 -0.1025 -0.0289  0.0647

Energy (removed component) = 0.0117, 0.0550, 0.0200
Original signal, x=[-2 0 0 2]; this is the actual original mean removed signal from the earlier slide
Energy (original signal)=8
10log10(8/energy of removed component) = 28.3483, 21.6265, 26.0222
Note that these values measure the component that PCA removed. With the SNR definition given earlier (noise = original signal – noise reduced signal, i.e. x – each row of Xnonoise), the residual noise energies are 0.1171, 0.0279, 0.0314, and SNR = 18.35, 24.58, 24.06 dB, compared with 16.99, 23.01, 21.25 dB before noise reduction
In this example PCA raises the SNR of every signal, and we get 3 signal outputs, unlike the single signal output from ensemble averaging or median filtering
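The whole PCA chain for this example, from mean removal to SNR, can be sketched as below. The SNR is computed against the known original signal, following the definition from the SNR slide; variable names are illustrative.

```matlab
X = [2.9 4.9 5.3 7.3;
     2.9 5.1 5.1 6.9;
     3.1 4.8 5.1 7.0];
x = [-2 0 0 2];                          % original signal, mean removed

Xm = X - repmat(mean(X, 2), 1, size(X, 2));
[V, D] = eig(cov(Xm'));
[d, idx] = sort(diag(D), 'descend');
Vsort = V(:, idx);

Z = Vsort' * Xm;                         % Zscores, same size as Xm
Xnonoise = Vsort(:, 1) * Z(1, :);        % reconstruct from the first PC only

energy = @(s) sum(s.^2);
removed = Xm - Xnonoise;                 % component discarded by PCA
snr_before = zeros(1, 3);
snr_after  = zeros(1, 3);
removed_db = zeros(1, 3);
for k = 1:3
    snr_before(k) = 10*log10(energy(x) / energy(x - Xm(k, :)));
    snr_after(k)  = 10*log10(energy(x) / energy(x - Xnonoise(k, :)));
    removed_db(k) = 10*log10(energy(x) / energy(removed(k, :)));
end

fprintf('SNR before PCA: %.2f %.2f %.2f dB\n', snr_before);
fprintf('SNR after PCA:  %.2f %.2f %.2f dB\n', snr_after);
```

The product Vsort(:,1)*Z(1,:) does not depend on the sign that eig happens to choose for the first eigenvector, so the reconstruction is reproducible.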
Principal component analysis – an example of application
Consider the following 3 noise corrupted signals
Obtain the principal components (in descending order of eigenvalue magnitude)
Obtain the Zscores
Decide how many PCs to retain; assume that we retain only the first PC
By retaining only the first PC for reconstruction, we will have 3 noise reduced EP signals
[Figure: Noisy EP signal (trial 1), Noisy EP signal (trial 2), Noisy EP signal (trial 3); after reconstruction using only one PC: EP signal (trial 1), EP signal (trial 2), EP signal (trial 3). Horizontal axis: 300 samples]
Independent component analysis – a brief study
- ICA is a relatively recent method that can be used to separate noise from signal
- Sometimes known as blind source separation
- Requires more than one signal recording (like PCA)
- ICA separates the recordings into independent signals (signals and noises: we keep the signals and discard the noises)
Example: assume we have 3 observed (i.e. recorded) signals x1[n], x2[n] and x3[n] from 3 original signal sources s1[n], s2[n] and s3[n]
x1[n]=a11.s1[n]+a12.s2[n]+a13.s3[n]
x2[n]=a21.s1[n]+a22.s2[n]+a23.s3[n]
x3[n]=a31.s1[n]+a32.s2[n]+a33.s3[n]
The matrix
$$A = \begin{pmatrix}a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33}\end{pmatrix}$$
is known as the mixing matrix
ICA can be used to obtain the original signals by obtaining the unmixing matrix W
$$W = A^{-1} = \begin{pmatrix}w_{11} & w_{12} & w_{13}\\ w_{21} & w_{22} & w_{23}\\ w_{31} & w_{32} & w_{33}\end{pmatrix}$$
The original signals can be obtained by using
s1[n]=w11.x1[n]+w12.x2[n]+w13.x3[n]
s2[n]=w21.x1[n]+w22.x2[n]+w23.x3[n]
s3[n]=w31.x1[n]+w32.x2[n]+w33.x3[n]
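The algebra of mixing and unmixing can be illustrated with a small sketch. The sources and the mixing matrix are made up, and the sketch assumes A is known, which is exactly what is NOT available in practice: ICA has to estimate W from the recordings alone.

```matlab
n = 0:199;

% Three hypothetical source signals, one per row
S = [sin(2*pi*n/40);
     sign(sin(2*pi*n/15));
     mod(n, 25) / 25];

% Hypothetical mixing matrix (unknown in a real recording)
A = [1.0 0.5 0.3;
     0.4 1.0 0.2;
     0.3 0.6 1.0];

X = A * S;        % observed (recorded) signals x1, x2, x3

W = inv(A);       % unmixing matrix, W = A^-1
Srec = W * X;     % recovered sources s1, s2, s3

disp(max(abs(Srec(:) - S(:))));   % reconstruction error
```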
Independent component analysis – a pictorial example
Figures from Independent Component Analysis, Hyvarinen, Karhunen and Oja
Maximising non-gaussianity using kurtosis
How does ICA work?
The central limit theorem says that sums of independent non-gaussian random variables are closer to gaussian than the original ones
=> the independent signals are less gaussian than the combined signals
So by maximising non-gaussian behaviour, we get closer to the original signals
Kurtosis can be used to measure gaussian behaviour
BUT what is gaussian? See next slide
[Figure: source (original signals), less gaussian; mixed (combined signals), more gaussian]
Gaussian and probability distributions
Gaussian (or normal) probability distribution is
$$pdf(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
BUT what is a probability distribution?
The probability distribution for discrete-time signals is simply the number of occurrences vs value
E.g.: if x has values from 1 to 10

x = [4 1 2 3 9 8 6 5 7 3 4 2 2 6 9 5 6 7];
count(1:10)=0;
for i=1:10,
    y=find(x==i);
    count(i)=length(y);   % number of occurrences of the value i
end
plot(count);

[Figure: Probability distribution of x (number of occurrences, 0 to 3, vs value, 0 to 10)]
[Figure: Gaussian distribution]
[Figure: Super-gaussian distribution. The data close to the mean have higher occurrences]
[Figure: Sub-gaussian distribution. Most of the data have a similar number of occurrences]
Kurtosis
Non-gaussianity can be measured using kurtosis
- Gaussian signals have kurtosis=3
- Sub-gaussian signals have lower kurtosis values
- Super-gaussian signals have higher kurtosis values
Examples

x = [4 1 2 3 9 8 6 5 7 3 4 2 2 6 9 5 6 7];
y=kurtosis(x,0); %unbiased kurtosis using MATLAB
y=1.9509

x = randn(1,100000); % gaussian signal with mean=0, std=1
plot(x);
y=kurtosis(x,0) %unbiased kurtosis using MATLAB
y≈3.00 (the exact value varies slightly from run to run)

[Figure: Gaussian distribution signal]
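For readers without the toolbox function kurtosis, the sketch below computes kurtosis from its definition, the fourth central moment divided by the squared variance. This is the plain (biased) estimator, so on short signals it differs slightly from the bias-corrected kurtosis(x,0) used above; the three test signals are made up.

```matlab
% Kurtosis from its definition: E[(x-mean)^4] / (E[(x-mean)^2])^2
kurt = @(x) mean((x - mean(x)).^4) / mean((x - mean(x)).^2)^2;

k_two   = kurt([-1 1 -1 1]);              % two-valued signal: strongly sub-gaussian
k_unif  = kurt(linspace(-1, 1, 100001));  % evenly spread values: sub-gaussian
k_spiky = kurt([zeros(1, 98) 5 -5]);      % mostly zero with rare spikes: super-gaussian

fprintf('kurtosis: %.2f (two-valued), %.2f (uniform), %.2f (spiky)\n', ...
        k_two, k_unif, k_spiky);
```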
Example – Kurtosis for EP and noise
[Figure: Original signals: EP signal, kurtosis=3.32; noise, kurtosis=2.81. Recorded signals: X1=EP+noise, kurtosis=2.79; X2=EP+noise, kurtosis=2.61. Horizontal axis: 300 samples]
Can you see that the kurtosis is lower for the combined signals, i.e. the actual independent signals (the sources) have higher kurtosis?
Simple ICA algorithm – an example using EP and noise
ICA tries to obtain EP and noise by estimating the unmixing matrix
The solution is
$$\begin{pmatrix}EP[n]\\ noise[n]\end{pmatrix} = \begin{pmatrix}w_{11} & w_{12}\\ w_{21} & w_{22}\end{pmatrix}\begin{pmatrix}X_1[n]\\ X_2[n]\end{pmatrix}$$
In the beginning, we don't know the unmixing matrix!
A simple ICA method is to randomly generate values in [0,1] for the unmixing matrix, for example
$$\begin{pmatrix}EP[n]\\ noise[n]\end{pmatrix} = \begin{pmatrix}0.8 & 0.9\\ 0.9 & 0.5\end{pmatrix}\begin{pmatrix}X_1[n]\\ X_2[n]\end{pmatrix}$$
Now, EP[n]=w11.X1[n]+w12.X2[n] and noise[n]=w21.X1[n]+w22.X2[n]
Kurtosis values are computed for these estimated EP and noise
Repeat with other random values for the unmixing matrix (say a thousand times)
The unmixing matrix that gives the highest kurtosis values is taken to give the actual EP and noise
Practical ICA algorithms use more sophisticated iterative optimisation (including neural-network-style learning rules), so we'll skip them
It suffices to know that by using certain measures like kurtosis (representing non-gaussianity), we can separate the signals into independent components
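The random search described above can be sketched as follows. Everything about the data is an assumption made for illustration: the EP is a synthetic pulse, the noise is gaussian, and the mixing matrix is invented. The score is the sum of the two kurtosis values, computed from the definition rather than with the toolbox function.

```matlab
n = 1:300;
ep    = exp(-((n - 100).^2) / 50);   % hypothetical spiky EP-like source
noise = randn(1, 300);               % gaussian noise source
Amix  = [0.6 0.4; 0.3 0.7];          % hypothetical mixing (unknown in practice)
X = Amix * [ep; noise];              % recorded signals X1 and X2

kurt = @(x) mean((x - mean(x)).^4) / mean((x - mean(x)).^2)^2;

bestScore = -Inf;
bestW = [];
for trial = 1:1000
    W = rand(2, 2);                  % random unmixing matrix, values in [0,1]
    Y = W * X;                       % candidate EP (row 1) and noise (row 2)
    score = kurt(Y(1, :)) + kurt(Y(2, :));
    if trial == 1
        firstScore = score;          % kept only for comparison
    end
    if score > bestScore
        bestScore = score;
        bestW = W;
    end
end

disp(bestW);
```

Two limitations of this naive search are worth keeping in mind: weights restricted to [0,1] can never subtract one recording from another, so the mixing cannot be undone exactly, and nothing forces the two rows of W to pick out different sources.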
Study guide (Lecture 3)
From this week’s lecture, you should know
Basic mathematics– mean, standard deviation, variance, covariance, correlation, autocorrelation, SNR, etc.
Uses of these basic maths in signal analysis
Noise reduction methods like ensemble averaging, median filtering, moving average filtering, principal component analysis and basics of independent component analysis
End of lecture 3