
National Aerospace University of Ukraine

ANALYSIS OF MERIDIAN ESTIMATOR PERFORMANCE FOR NON-GAUSSIAN PDF DATA SAMPLES

Dmitriy Kurkin, Alexey Roenko, Vladimir Lukin

National Aerospace University, Dept. of Transmitters, Receivers and Signal Processing, 17 Chkalova St, 61070 Kharkov, Ukraine

e-mails: [email protected], [email protected], [email protected]

Igor Djurović

University of Montenegro, Electrical Engineering Department, Cetinsky Put bb, 81000, Podgorica, Montenegro

e-mail [email protected]

Presentation outline

1. Considered practical situations

2. Definition of a sample meridian

3. Main properties of sample meridian

4. Considered distributions

5. Analysis of statistical characteristics of sample meridian estimator

6. Dependence upon sample size

7. Comparison of estimators and practical examples

8. Conclusions and future work


CONSIDERED PRACTICAL SITUATIONS

Noise (interference, a disturbing process) is often assumed to be Gaussian. However, there are many practical situations in which this assumption does not hold:

- atmospheric noise (well described by a symmetric α-stable distribution with α of about 1.5);
- acoustic noise in indoor environments;
- clutter from the sea surface and vegetation in maritime and micro-Doppler radar surveillance systems;
- noise in the bispectrum domain and in the Wigner distribution domain at low input SNR;
- noise in the spectral components of the DFT in the robust DFT framework;
- mixed noise in images, where impulse noise arises due to coding/decoding errors, etc.

In all these cases, the noise distributions have heavier tails than the Gaussian.

Even if the input noise is Gaussian, it may acquire heavier tails after nonlinear transformations. K. Barner and T. Aysal have shown that the product of two random variables has a heavier-tailed distribution than either of the multiplied random variables.

Thus, one needs robust estimators to cope with non-Gaussian noise.


PRACTICAL EXAMPLES

Fig. 1 - Micro-Doppler reflections from vegetation (bushes) in windy weather (a), PCK characterizing Gaussianity (b), and MAD characterizing interference intensity (c)

Fig. 2 - Image formed by maritime polarization radar (VH) (a) and histograms in clutter regions for middle (b) and large (c) distances


DEFINITION OF A SAMPLE MERIDIAN

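Analytically, the sample meridian estimator of location is defined as

$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{N} \ln\left[\delta + |x_i - \beta|\right] = \mathrm{meridian}\{x_i,\ i = 1, \dots, N;\ \delta\} \qquad (1)$$

where N denotes the sample size, xi is the i-th element of the data sample, and δ is called the medianity parameter.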

Its cost function is

$$\varphi(\beta) = \sum_{i=1}^{N} \ln\left[\delta + |x_i - \beta|\right] \qquad (2)$$

The function φ(β) monotonically decreases for β<xMIN and monotonically increases for β>xMAX, where xMIN and xMAX are the minimal and maximal elements of the data sample, respectively.

Thus, all minima of the cost function lie between xMIN and xMAX. Their number is finite and grows as δ decreases. Moreover, a minimum can occur only at a β that coincides with one of the elements of the original sample.

These properties lead to a very simple algorithm for finding the meridian estimate: calculate φ(xi), i=1,…,N, and find the i for which φ(xi) is the smallest. No sorting is needed, only N evaluations of (2) and N-1 comparisons.
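The search over sample elements described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; the function names `meridian_cost` and `sample_meridian` are hypothetical and chosen here for readability.

```python
import math


def meridian_cost(beta, samples, delta):
    """Cost function (2): sum of ln(delta + |x_i - beta|) over the sample."""
    return sum(math.log(delta + abs(x - beta)) for x in samples)


def sample_meridian(samples, delta):
    """Return the sample element that minimizes the cost function (2)."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    if not samples:
        raise ValueError("samples must be non-empty")
    # N cost evaluations and N-1 comparisons; no sorting is needed.
    return min(samples, key=lambda x: meridian_cost(x, samples, delta))


data = [0.0, 1.0, 1.01, 1.02, 5.0, 9.0]
# With a small delta the estimate is drawn to the densest cluster of values.
print(sample_meridian(data, delta=1e-3))
```

Each evaluation of (2) itself touches all N elements, so this direct search costs on the order of N^2 elementary operations, which is acceptable for the small sliding-window sizes considered here.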


MAIN PROPERTIES OF SAMPLE MERIDIAN ESTIMATOR

1) the meridian estimator is invariant to translation of the distribution center (the mean, if it exists). This allows the study of meridian estimator properties to be restricted to distributions with zero mean (center);

2) we concentrate on distributions that are symmetric with respect to their means; for such distributions the meridian estimator produces unbiased estimates for arbitrary δ;

3) for a rather large δ, the sample meridian coincides with the sample median and, thus, the statistical characteristics of the corresponding estimators are practically the same; however, it is not clear what “rather large δ” means in practice;

4) for a rather small δ, the sample meridian estimator can serve as a distribution mode finder; but it is not clear what “rather small δ” means in practice.

There are also other questions:

- Can the meridian estimator be more accurate than other robust estimators of location, and under what conditions?
- Does the meridian estimator accuracy depend on the sample size?
- What are practical recommendations for setting δ?
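Properties 1) and 3) can be checked numerically on a small data set. The sketch below assumes the direct search over sample elements as the way of computing the meridian; the data values, the shift, and the δ values are arbitrary illustrative choices.

```python
import math
import statistics


def sample_meridian(samples, delta):
    """Sample element minimizing sum(ln(delta + |x_i - beta|))."""
    def cost(beta):
        return sum(math.log(delta + abs(x - beta)) for x in samples)
    return min(samples, key=cost)


data = [1.0, 4.0, 4.5, 5.0, 20.0]
shift = 10.0
shifted = [x + shift for x in data]

# Property 1: shifting the data shifts the estimate by the same amount.
invariant = sample_meridian(shifted, 0.5) == sample_meridian(data, 0.5) + shift

# Property 3: for a large delta the meridian coincides with the median.
matches_median = sample_meridian(data, 1e4) == statistics.median(data)

print(invariant, matches_median)
```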


CONSIDERED DISTRIBUTIONS

Special attention in our analysis is paid to non-Gaussian heavy-tailed distributions.

The first PDF is Gaussian with variance σG^2, and the second is the Cauchy PDF, which is a particular case of symmetric α-stable distributions: f1(γ,x)=γ/(π(γ^2+x^2)), where γ is the parameter characterizing the PDF scale. Both PDFs have bell-shaped maxima and continuous derivatives.

Let us also analyze PDFs f2(x), f3(x), f4(x) of the following three random variables:

1) Y1=X1X2 (denoted as dgauss);

2) Y2=X1X2X3 (denoted as tgauss);

3) Y3=(X1)^3 (denoted as gauss3).

Here X1, X2, and X3 are independent zero-mean Gaussian variables with standard deviations σX1, σX2, σX3, respectively.

All three PDFs f2(x), f3(x), f4(x) have heavier tails than the Gaussian. If σX1=σX2=σX3=1, the variances of the PDFs f2(x) and f3(x) are equal to 1, and the variance of the PDF f4(x) equals 15.

One more peculiarity of f2(x), f3(x), f4(x) is that they all have peaky (sharp, not bell-shaped) maxima.
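Samples from the five considered PDFs are easy to generate. The sketch below is one possible way to do it, assuming that all Gaussian factors share the same standard deviation `scale` and that the Cauchy case is drawn with the inverse-CDF method; the function name `draw_sample` and its parameters are hypothetical.

```python
import math
import random


def draw_sample(pdf_name, n, rng, scale=1.0):
    """Draw n values from one of the five considered zero-centred PDFs."""
    def g():
        return rng.gauss(0.0, scale)

    if pdf_name == "gauss":
        return [g() for _ in range(n)]
    if pdf_name == "cauchy":
        # Inverse-CDF method; here `scale` plays the role of gamma.
        return [scale * math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]
    if pdf_name == "dgauss":
        return [g() * g() for _ in range(n)]        # Y1 = X1 * X2
    if pdf_name == "tgauss":
        return [g() * g() * g() for _ in range(n)]  # Y2 = X1 * X2 * X3
    if pdf_name == "gauss3":
        return [g() ** 3 for _ in range(n)]         # Y3 = X1 ** 3
    raise ValueError("unknown PDF name: " + pdf_name)


rng = random.Random(0)
sample = draw_sample("gauss3", 64, rng)
print(len(sample))
```

For unit standard deviations, the sample variances of large dgauss and gauss3 samples should be close to the values 1 and 15 stated above.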


For heavy-tailed PDFs, it is reasonable to characterize their scale by a robust estimate of scale. The median of absolute deviations (MAD) can serve as such a characteristic:

- for PDF f2(x), MAD=0.545σX1σX2;
- for f3(x), MAD=0.292σX1σX2σX3;
- for f4(x), MAD=0.462σX1^3.

For the Gaussian and Cauchy PDFs, MAD=σG/1.483 and MAD=1.5γ, respectively.

Tail weight can be characterized by the percentile coefficient of kurtosis (PCK). For the Gaussian PDF it is equal to 0.265; for heavier-tailed distributions the PCK values are smaller: 0.16 for the Cauchy PDF, and 0.178, 0.132, and 0.076 for PDFs f2(x), f3(x), and f4(x), respectively.
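The slides do not spell out the PCK formula. The sketch below assumes the common textbook definition, the semi-interquartile range divided by the distance between the 10th and 90th percentiles, and evaluates it from the theoretical quantile functions of the Gaussian and Cauchy PDFs. Under that assumption the results come out close to the Gaussian and Cauchy values quoted above.

```python
import math
from statistics import NormalDist


def pck(quantile):
    """Percentile coefficient of kurtosis from a quantile function.

    Assumed definition: semi-interquartile range divided by the
    10th-to-90th percentile range.
    """
    semi_iqr = 0.5 * (quantile(0.75) - quantile(0.25))
    return semi_iqr / (quantile(0.90) - quantile(0.10))


def cauchy_quantile(p, gamma=1.0):
    """Quantile function of a zero-centred Cauchy PDF with scale gamma."""
    return gamma * math.tan(math.pi * (p - 0.5))


pck_gauss = pck(NormalDist(0.0, 1.0).inv_cdf)
pck_cauchy = pck(cauchy_quantile)
print(round(pck_gauss, 3), round(pck_cauchy, 3))
```

Because PCK is a ratio of two scale measures, it does not depend on σG or γ, which is what makes it usable as a pure tail-weight indicator.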

Fig. 3 – Gaussian distribution and PDFs dgauss, tgauss, gauss3, full appearance (upper plot) and for the range of argument values (bottom plot).


ANALYSIS OF STATISTICAL CHARACTERISTICS OF SAMPLE MERIDIAN ESTIMATOR

Consider the conditions under which the sample meridian differs from the median of the same data sample. Let us determine the probability P that meridian(xi; i=1,…,N; δ)=median(xi).

For N=3 the probability is equal to unity for all five considered PDFs and for any δ. So, let us study larger N, namely N=5, 9, 15, 25, which correspond to typical situations in sliding-window signal and image processing.

Fig. 4 - Dependences of P on δ for N=5 (left), 9 (right)

All curves behave similarly: they saturate at unity for rather large δ and have a flat region with P≠1 for rather small δ. The transition zones between these two regions are observed for δ from 10^-2 to 10^2 or, more generally, for δ/MAD from 10^-2 to 10^2.

For heavy-tailed distributions it is more correct to analyze the obtained dependences with respect to the normalized values δ/MAD, since the standard deviation is a non-robust characteristic of scale and theoretically it can be infinite.
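The probability P can be estimated by simulation. The following sketch does this for standard Gaussian samples only; the number of trials, the seed, and the function name `prob_meridian_equals_median` are illustrative assumptions, and N is taken odd so that the median is itself a sample element.

```python
import math
import random
import statistics


def sample_meridian(samples, delta):
    """Sample element minimizing sum(ln(delta + |x_i - beta|))."""
    def cost(beta):
        return sum(math.log(delta + abs(x - beta)) for x in samples)
    return min(samples, key=cost)


def prob_meridian_equals_median(n, delta, trials=2000, seed=0):
    """Monte Carlo estimate of P for standard Gaussian samples of odd size n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if sample_meridian(x, delta) == statistics.median(x):
            hits += 1
    return hits / trials


for delta in (1e-2, 1.0, 1e2):
    print(delta, prob_meridian_equals_median(5, delta))
```

Replacing the Gaussian generator with one of the other considered PDFs and sweeping δ on a logarithmic grid gives curves of the kind shown in Fig. 4.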


Fig. 5 - Dependences of P on δ for N=15 (upper plot) and 25 (bottom plot)

At the same time, there are some differences and other interesting observations.

First, P becomes practically equal to 1 for δ exceeding 10 standard deviations of Gaussian noise (σG). For sharp-peaked PDFs, P reaches 1 for δ/MAD of the order of 10…50.

Second, the probabilities P depend on N for small δ: the larger N is, the smaller P becomes. The smallest P is observed for the Gaussian PDF; for the other, heavy-tailed distributions the values of P are approximately the same.

It makes no sense to study the estimator characteristics for δ/MAD larger than 100, since then the sample meridian and median coincide and, thus, have the same statistical characteristics.

The properties of the sample meridian and median differ for δ/MAD<100, although this does not necessarily mean that the meridian estimator is more accurate than the median estimator in this range.


For analysis of meridian estimator accuracy, we have determined:

- the root mean square error (RMSE) σβ of the obtained estimates;

- the median of absolute deviations of the estimates,

$$\mathrm{MAD}_{\beta} = \mathrm{med}\left\{\left|\hat{\beta}_j - \mathrm{med}\{\hat{\beta}_j\}\right|,\ j = 1, \dots, N_{\mathrm{exp}}\right\}$$

where β̂j is the meridian estimate for the j-th sample of data obeying a given PDF, and Nexp is the number of experiments (analyzed data samples, Nexp=1000).

The expedience of analyzing MADβ stems from an experimental finding: the PDFs of the meridian location estimates obtained for PDFs f2(x), f3(x) and f4(x) turned out to be symmetric but non-Gaussian.

For the Gaussian and Cauchy distributions, the distribution of meridian estimates is close to Gaussian, especially if N is large enough. Similar effects have been observed for sample median estimates in the case of the Laplacian distribution.

In the case of the Gaussian PDF, σβ(δ) and MADβ(δ) for δ/MAD<0.1 are approximately twice as large as for δ/MAD>100. The curves σβ(δ) and MADβ(δ) have no minima; both are monotonically decreasing. Both σβ(δ) and MADβ(δ) are proportional to σG and approximately inversely proportional to N^0.5.
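The two accuracy measures can be computed from a list of estimates as sketched below. The RMSE is taken here about the true location, assumed to be zero as for all considered PDFs; the function names `rmse` and `mad` and the example values are hypothetical.

```python
import math
import statistics


def rmse(estimates, true_value=0.0):
    """Root mean square error of the estimates about the true location."""
    return math.sqrt(sum((b - true_value) ** 2 for b in estimates) / len(estimates))


def mad(values):
    """Median of absolute deviations from the median."""
    centre = statistics.median(values)
    return statistics.median([abs(v - centre) for v in values])


# Hypothetical estimates from five experiments, one of them far off.
estimates = [0.1, -0.2, 0.05, 3.0, -0.1]
print(rmse(estimates), mad(estimates))
```

A single outlying estimate inflates the RMSE but hardly changes the MAD, which is why a small MADβ/σβ ratio points to heavy-tailed estimate distributions.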


Let us analyze the data obtained for the Cauchy PDF, N=64.

As can be seen, all dependences are monotonically decreasing. For δ/MAD<0.1, σβ(δ) and MADβ(δ) are approximately twice as large as for δ/MADX>10, where

$$\mathrm{MAD}_X = \mathrm{med}\left\{\left|x_i - \mathrm{med}\{x_i\}\right|,\ i = 1, \dots, N\right\}$$

for the data sample at hand.

The parameters σβ(δ) and MADβ(δ) are proportional to the data scale defined by γ. To provide the best accuracy of the meridian estimator, it is enough to set δ=(10…20)MADX.

Fig. 6 - Dependences of σβ and MADβ on δ for Cauchy PDF with γ=1 and 2, N=64

Thus, a common conclusion for the Gaussian and Cauchy PDFs is that the accuracy of the meridian estimator cannot be better than that of the median estimator.
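The rule δ=(10…20)MADX can be applied directly to a data sample. The sketch below is only an illustration of that rule: the multiplier `k`, its default of 10, and the function names are assumptions made here, and the meridian is found by direct search over the sample elements.

```python
import math
import statistics


def sample_mad(x):
    """MAD_X: median of absolute deviations from the sample median."""
    med = statistics.median(x)
    return statistics.median([abs(v - med) for v in x])


def meridian_with_scaled_delta(x, k=10.0):
    """Sample meridian with delta set to k * MAD_X (k in the range 10...20)."""
    delta = k * sample_mad(x)
    if delta <= 0:
        raise ValueError("MAD_X is zero; delta cannot be derived from it")

    def cost(beta):
        return sum(math.log(delta + abs(v - beta)) for v in x)

    return min(x, key=cost), delta


estimate, delta = meridian_with_scaled_delta([1.0, 2.0, 3.0, 4.0, 100.0])
print(estimate, delta)
```

Tying δ to MADX makes the setting scale with the data, in line with the observation that σβ(δ) and MADβ(δ) are proportional to the data scale.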


Let us now study the statistical characteristics of the meridian estimator for peaky PDFs, starting from the PDF f2(x). The curves σβ(δ) and MADβ(δ) have minima at δ≈1 and δ≈0.1, respectively.

The ratio MADβ(δ)/σβ(δ)<0.5, i.e., it is considerably smaller than when the meridian estimator is applied to Gaussian data. This indirectly shows that the estimates β̂j do not obey a Gaussian distribution and are heavy-tailed. This has been confirmed by analysis of their histograms.

Fig. 7 - Dependences σβ(δ) and MADβ(δ) for PDF f2(x), N=64

Therefore, if δ is set to its optimal value, it makes sense to apply the meridian estimator.


For data samples with PDFs f3(x) and f4(x), the plots of σβ(δ) and MADβ(δ) both have obvious minima.

For f3(x), the minima are observed for δ≈0.1 and δ≈0.01, respectively. It is possible to recommend using δ/MADY2≈0.01, which clearly corresponds to the mode-finding regime of the meridian estimator.

MADβ(δ)/σβ(δ) for δ/MADY2≈0.01 becomes about 0.3. This means that the PDF of meridian estimates is non-Gaussian.

For f4(x), the minimum of σβ(δ) occurs at δ≈0.001 and the minimum of MADβ(δ) is observed at δ≈10^-5, i.e., the meridian estimator again looks for the distribution mode. Thus, δ/MADY3 should be about 10^-4. This means that the meridian estimator with properly adjusted δ/MADY3 is able to perform considerably better than the median estimator. MADβ/σβ turns out to be very small, about 0.02. This means that the distribution of meridian estimates has very heavy tails.

Fig. 8 - Dependences σβ(δ) and MADβ(δ) for PDFs f3(x) (upper plot) and f4(x) (bottom plot), N=64
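Curves such as those in Fig. 8 can be approximated by sweeping δ over a logarithmic grid and estimating the RMSE at each point. The sketch below does this for gauss3 data; the sample size, number of trials, grid, and seed are small illustrative choices made here (not the N=64, Nexp=1000 setup of the slides), so the location of the minimum it reports is only indicative.

```python
import math
import random


def sample_meridian(samples, delta):
    """Sample element minimizing sum(ln(delta + |x_i - beta|))."""
    def cost(beta):
        return sum(math.log(delta + abs(x - beta)) for x in samples)
    return min(samples, key=cost)


def rmse_vs_delta(deltas, n=16, trials=200, seed=1):
    """RMSE of the meridian estimate (true location 0) for each delta."""
    rng = random.Random(seed)
    # gauss3 data: cubes of standard Gaussian values, reused for every delta.
    samples = [[rng.gauss(0.0, 1.0) ** 3 for _ in range(n)] for _ in range(trials)]
    curve = {}
    for delta in deltas:
        squared = sum(sample_meridian(x, delta) ** 2 for x in samples)
        curve[delta] = math.sqrt(squared / trials)
    return curve


grid = [10.0 ** p for p in range(-5, 3)]
curve = rmse_vs_delta(grid)
best_delta = min(curve, key=curve.get)
print(best_delta, curve[best_delta])
```

Reusing the same simulated samples for every δ keeps the comparison between grid points free of extra sampling noise.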


DEPENDENCE UPON SAMPLE SIZE

Table 1. Accuracy of meridian estimator depending upon N for quasi-optimal δ

As can be seen, σβ and MADβ decrease faster than N^-0.5, especially for data samples with the most heavy-tailed PDF f4(x). It is also worth stressing that the quasi-optimal δq opt decreases as N becomes larger. This specific property has not been observed for the myriad estimator. It complicates the design of an adaptive algorithm for determining δq opt when a priori information on the PDF that a data sample obeys is limited.

We have also analyzed the dependence of δq opt on data scale. As expected, experiments carried out for all considered PDFs have demonstrated that δq opt should be directly proportional to the MAD of a given distribution, for which MADX can serve as an estimate.


COMPARISON OF ESTIMATORS

Table 2. Comparison of accuracy for the mean, median and meridian estimators for data samples with different N and PDFs

Thus, the meridian estimator can be useful in applications that deal with very impulsive noise environments where the noise PDF is not bell-shaped.

We also expect that the properties of the meridian and median estimators can differ considerably when processing data samples with asymmetric distributions.
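A comparison of the kind summarized in Table 2 can be set up as a small Monte Carlo experiment. The sketch below is an assumed setup, not a reproduction of the table: the generator is passed in as a function, the true location is zero, and the number of trials and the seed are arbitrary.

```python
import math
import random
import statistics


def sample_meridian(samples, delta):
    """Sample element minimizing sum(ln(delta + |x_i - beta|))."""
    def cost(beta):
        return sum(math.log(delta + abs(x - beta)) for x in samples)
    return min(samples, key=cost)


def compare_estimators(draw, n, delta, trials=400, seed=2):
    """RMSE (true location 0) of the mean, median and meridian estimators."""
    rng = random.Random(seed)
    squared = {"mean": 0.0, "median": 0.0, "meridian": 0.0}
    for _ in range(trials):
        x = [draw(rng) for _ in range(n)]
        squared["mean"] += statistics.mean(x) ** 2
        squared["median"] += statistics.median(x) ** 2
        squared["meridian"] += sample_meridian(x, delta) ** 2
    return {name: math.sqrt(s / trials) for name, s in squared.items()}


# gauss3 data with a small delta, i.e. the meridian in its mode-finding regime.
print(compare_estimators(lambda rng: rng.gauss(0.0, 1.0) ** 3, n=16, delta=1e-3))
```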


COMPARISON OF ESTIMATORS AND PRACTICAL EXAMPLES

A step edge of unity amplitude (at the 49th sample) corrupted by Gaussian noise with SD=0.1.

Fig. 9 - Median filter output bias, residual noise variance and aggregate error

Fig. 10 - Meridian filter (δ=0.03) output bias, residual noise variance and aggregate error

The meridian filter with optimally set δ is able to produce smaller bias and aggregate error in the neighborhood of a noisy step edge.
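A sliding-window meridian filter for such a test signal can be sketched as follows. The window size of 9, the signal length, the edge-replication padding, and the seed are assumptions made for this illustration; only the edge position, the noise SD of 0.1, and δ=0.03 come from the slide.

```python
import math
import random


def sample_meridian(window, delta):
    """Window element minimizing sum(ln(delta + |x_i - beta|))."""
    def cost(beta):
        return sum(math.log(delta + abs(x - beta)) for x in window)
    return min(window, key=cost)


def meridian_filter(signal, window_size, delta):
    """Apply the sample meridian in a sliding window (odd window_size)."""
    half = window_size // 2
    # Replicate the end values so the output has the same length as the input.
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [sample_meridian(padded[i:i + window_size], delta)
            for i in range(len(signal))]


rng = random.Random(3)
clean = [0.0] * 48 + [1.0] * 48            # unit step; the 49th sample is the first 1
noisy = [s + rng.gauss(0.0, 0.1) for s in clean]
filtered = meridian_filter(noisy, window_size=9, delta=0.03)
print(filtered[44:52])
```

Bias and residual noise variance around the edge can then be estimated by repeating the experiment over many noise realizations and averaging per sample position.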


CONCLUSIONS AND FUTURE WORK

The studies carried out have shown that there exist non-Gaussian distributions for which the sample meridian with δq opt is able to produce more accurate estimates of the location parameter than some of the best-known robust estimators, the sample median and myriad.

All three PDFs for which the benefits of the sample meridian have been observed are peaky (not bell-shaped), which is explained by the peculiarities of the cost function (2).

For PDFs with heavier tails one has to set a smaller δq opt. At the same time, δq opt should be proportional to the data scale.

These observations give hope that an adaptive algorithm for determining δ of the meridian estimator can be designed.