Influence function of the error rate of generalized k-means

Joint work with G. Haesbroeck

Ch. Ruwet

UNIVERSITY OF LIÈGE

30 March 2009

Introduction

Error rate

Influence function of the error rate

Bias of the error rate

Simulation study

Future research


Introduction


The k-means clustering method

Aim of clustering: group similar observations into k clusters C1, ..., Ck.

The k-means algorithm constructs the clusters so as to minimize the within-cluster sum of squared distances.

Let us focus on k = 2 groups.
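To make the criterion concrete, here is a minimal sketch of the usual alternating (Lloyd-type) iterations for k = 2 on real-valued data. The function names, the starting centers and the sample values are assumptions made for illustration. The iterations may stop at a local minimum of the within-cluster sum of squares rather than at the global one.

```python
def two_means_1d(xs, max_iter=100):
    """Lloyd-type iterations for k = 2 clusters of real numbers."""
    xs = sorted(xs)
    t1, t2 = xs[0], xs[-1]          # assumed starting centers: the two extremes
    for _ in range(max_iter):
        cut = (t1 + t2) / 2         # with t1 < t2, points below the midpoint are nearest to t1
        left = [x for x in xs if x <= cut]
        right = [x for x in xs if x > cut]
        if not left or not right:   # degenerate sample (e.g. all values equal)
            break
        new_t1 = sum(left) / len(left)
        new_t2 = sum(right) / len(right)
        if (new_t1, new_t2) == (t1, t2):
            break                   # centers, hence assignments, no longer change
        t1, t2 = new_t1, new_t2
    return t1, t2


def within_cluster_ss(xs, t1, t2):
    """Within-cluster sum of squared distances to the nearest center."""
    return sum(min((x - t1) ** 2, (x - t2) ** 2) for x in xs)


heights = [160, 162, 165, 178, 180, 183]   # made-up values, not the 1BM data
t1, t2 = two_means_1d(heights)
print(t1, t2, within_cluster_ss(heights, t1, t2))
```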


Classification based on clustering

When the data contain natural groups, clustering may be used to find these groups via a classification rule that assigns x to the cluster with the nearest center:

\[ x \in C_j \iff \|x - \bar{x}_j\| = \min_{1 \le i \le 2} \|x - \bar{x}_i\| \]


Example in 1D

[Figure: height of students in 1BM; axis: Height (in cm), from 160 to 190.]


Example in 2D

[Figure: students of 1BM; axis: Weight (in kg), from 40 to 90.]


Contaminated example in 1D

[Figure: axis: Height (in cm), from 120 to 180.]


The generalized 2-means clustering method

The cluster centers (T1, T2) are the solution of

\[ \min_{\{t_1, t_2\} \subset \mathbb{R}^2} \; \sum_{i=1}^{n} \Omega\Big( \inf_{1 \le j \le 2} \|x_i - t_j\| \Big) \]

for a suitable nondecreasing penalty function Ω.

Classical penalty functions:

Ω(x) = x² → 2-means method

Ω(x) = x → 2-medoids method
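The effect of the penalty function can be tried on a small one-dimensional sample. The sketch below assumes real-valued data and a helper `center` that minimizes the one-cluster criterion: the mean for Ω(x) = x², a median for Ω(x) = x. It scans every split of the sorted sample into a left and a right part. All names and the data are illustrative, not taken from the talk.

```python
from statistics import mean, median


def objective(xs, t1, t2, omega):
    """Sum over the sample of omega(distance to the nearest center)."""
    return sum(omega(min(abs(x - t1), abs(x - t2))) for x in xs)


def generalized_two_means_1d(xs, omega, center):
    """Scan all splits of the sorted sample; `center` must minimise the
    one-cluster criterion for the chosen penalty (assumption of this sketch)."""
    xs = sorted(xs)
    best = None
    for m in range(1, len(xs)):
        t1, t2 = center(xs[:m]), center(xs[m:])
        value = objective(xs, t1, t2, omega)
        if best is None or value < best[0]:
            best = (value, t1, t2)
    return best[1], best[2]


data = [120, 160, 162, 165, 178, 180, 183]   # made-up heights with one outlier
print(generalized_two_means_1d(data, lambda d: d * d, mean))   # 2-means
print(generalized_two_means_1d(data, lambda d: d, median))     # 2-medoids
```

On this made-up sample the squared penalty isolates the outlier in its own cluster, while the absolute penalty still separates the two bulk groups.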


Contaminated example with the 2-medoids method

[Figure: axis: Height (in cm), from 120 to 190.]


The generalized 2-means clustering method

The classification rule:

\[ x \in C_j \iff \Omega(\|x - T_j\|) = \min_{1 \le i \le 2} \Omega(\|x - T_i\|) \]

In one dimension, the estimated clusters are simply

\[ C_1 = \,]-\infty, C[ \qquad C_2 = \,]C, +\infty[ \]

where C = (T1 + T2)/2 is the cut-off point.
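In one dimension the rule therefore amounts to comparing x with the cut-off. The sketch below is an illustration with hypothetical names and made-up data. It assumes T1 < T2, assigns the boundary point x = C to C2 by convention, and counts the proportion of labelled observations that land in the wrong cluster.

```python
def cutoff(t1, t2):
    """Cut-off point C = (T1 + T2) / 2."""
    return (t1 + t2) / 2


def classify(x, t1, t2):
    """Return 1 if x lies in C1 = ]-inf, C[, else 2 (assumes t1 < t2)."""
    return 1 if x < cutoff(t1, t2) else 2


def misclassification_rate(xs, groups, t1, t2):
    """Proportion of observations whose cluster differs from their group."""
    wrong = sum(classify(x, t1, t2) != g for x, g in zip(xs, groups))
    return wrong / len(xs)


xs = [160, 165, 172, 178, 185]      # made-up heights
groups = [1, 1, 2, 1, 2]            # made-up group labels
print(misclassification_rate(xs, groups, t1=163, t2=181))
```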


Error rate


Example in 1D

[Figure: height (in cm) of girls and boys, on an axis from 160 to 190.]


Example in 1D with the 2-means

[Figure: height (in cm) of girls and boys, on an axis from 160 to 190, clustered with the 2-means.]

ER = 0.326


Contaminated example in 1D with the 2-means

[Figure: height (in cm) of girls and boys, on an axis from 120 to 180, clustered with the 2-means.]

ER = 0.609


Contaminated example in 1D with the 2-medoids

[Figure: height (in cm) of girls and boys, on an axis from 120 to 180, clustered with the 2-medoids.]

ER = 0.370


Classification setting

Suppose that

X arises from 2 groups G1 and G2, with πi(F) = P_F[X ∈ Gi];

F is a mixture of two distributions,

\[ F = \pi_1(F) F_1 + \pi_2(F) F_2 \]

with densities f1 and f2.

Additional assumption: one dimension!


The generalized 2-means as statistical functionals

The cluster centers (T1(F), T2(F)) are the solution of

\[ \min_{\{t_1, t_2\} \subset \mathbb{R}^2} \; \int \Omega\Big( \inf_{1 \le j \le 2} \|x - t_j\| \Big) \, dF(x) \]

for a suitable nondecreasing penalty function Ω.

The classification rule is

\[ R_F(x) = C_j(F) \iff \Omega(\|x - T_j(F)\|) = \min_{1 \le i \le 2} \Omega(\|x - T_i(F)\|) \]


Optimality in classification

A classification rule is optimal if the corresponding error rate is minimal.

The optimal classification rule is the Bayes rule (Anderson, 1958):

\[ x \in C_1 \iff \pi_1(F) f_1(x) > \pi_2(F) f_2(x) \]

The 2-means procedure is optimal under the model (Qiu and Tamhane, 2007)

\[ F_N = 0.5\, N(\mu_1, \sigma^2) + 0.5\, N(\mu_2, \sigma^2) \quad \text{with } \mu_1 < \mu_2 \]
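As a short check of what the Bayes rule looks like under F_N, the following derivation (a sketch added for illustration, using only the two normal densities with equal weights and equal variance) reduces the Bayes inequality to a comparison of x with the midpoint of the two means.

```latex
\begin{align*}
\pi_1 f_1(x) > \pi_2 f_2(x)
 &\iff \exp\!\Big(-\frac{(x-\mu_1)^2}{2\sigma^2}\Big) > \exp\!\Big(-\frac{(x-\mu_2)^2}{2\sigma^2}\Big)
   && (\pi_1 = \pi_2 = 0.5,\ \text{same } \sigma) \\
 &\iff (x-\mu_1)^2 < (x-\mu_2)^2 \\
 &\iff 2x(\mu_2-\mu_1) < \mu_2^2 - \mu_1^2 \\
 &\iff x < \frac{\mu_1+\mu_2}{2}
   && (\mu_1 < \mu_2).
\end{align*}
```

Under F_N the Bayes rule is thus a cut-off rule at (µ1 + µ2)/2, which has the same form as the one-dimensional 2-means rule with cut-off (T1 + T2)/2.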


Theoretical vs empirical error rate

Theoretical error rate:
• Training sample distributed according to F: estimation of the rule
• Test sample distributed according to Fm: evaluation of the rule
• In ideal circumstances: F = Fm

\[ ER(F, F_m) = \sum_{j=1}^{2} \pi_j(F_m)\, \mathbb{P}_{F_m}\big[ R_F(X) \ne C_j(F) \,\big|\, G_j \big] \]

Empirical error rate:
• Training sample distributed according to F: estimation and evaluation of the rule

\[ ER(F, F) = \sum_{j=1}^{2} \pi_j(F)\, \mathbb{P}_{F}\big[ R_F(X) \ne C_j(F) \,\big|\, G_j \big] \]


Contaminated distribution

A contaminated distribution generates an observation from F with probability 1 − ε and from G with probability ε, where G is an arbitrary distribution function.

To see the influence of one single point x, take G = ∆x, the point mass at x, leading to

\[ F_\varepsilon = (1 - \varepsilon) F + \varepsilon \Delta_x \]
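Sampling from Fε follows the two-branch description directly. The sketch below assumes F is a two-component normal mixture. The function names and default parameters are illustrative choices, not part of the talk.

```python
import random


def sample_mixture(rng, pi1=0.5, mu1=-1.0, mu2=1.0, sigma=1.0):
    """One draw from F = pi1 N(mu1, sigma^2) + (1 - pi1) N(mu2, sigma^2)."""
    mu = mu1 if rng.random() < pi1 else mu2
    return rng.gauss(mu, sigma)


def sample_contaminated(rng, n, eps, x, **model):
    """n draws from F_eps = (1 - eps) F + eps * Delta_x."""
    return [x if rng.random() < eps else sample_mixture(rng, **model)
            for _ in range(n)]


rng = random.Random(0)
draws = sample_contaminated(rng, n=10_000, eps=0.1, x=-0.5)
print(draws.count(-0.5) / len(draws))   # close to eps
```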


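Contaminated mixture

[Figure: two panels on the range −4 to 4, titled "Mixture" and "Contaminated mixture"; in the contaminated mixture the components carry the weights (1−ε)0.4 and (1−ε)0.6.]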


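Theoretical vs empirical error rate under contamination

Now, the training sample is distributed as Fε, which is a contaminated mixture.

Theoretical error rate:

\[ ER(F_\varepsilon, F_m) = \sum_{j=1}^{2} \pi_j(F_m)\, \mathbb{P}_{F_m}\big[ R_{F_\varepsilon}(X) \ne C_j(F_\varepsilon) \,\big|\, G_j \big] \]

Empirical error rate:

\[ ER(F_\varepsilon, F_\varepsilon) = \sum_{j=1}^{2} \pi_j(F_\varepsilon)\, \mathbb{P}_{F_\varepsilon}\big[ R_{F_\varepsilon}(X) \ne C_j(F_\varepsilon) \,\big|\, G_j \big] \]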


Graphically:

Fm = FN ≡ 0.5 N(−1, 1) + 0.5 N(1, 1), an optimal model

C(FN) = (−1 + 1)/2 = 0 (Qiu and Tamhane, 2007)

Fε = (1 − ε)Fm + ε∆x, with either
• ε varying and x = −0.5
• x ∈ G1 varying and ε = 0.1


Theoretical error rate (with the 2-means):

[Figure: two panels, one with ε on the horizontal axis (0 to 0.8) and one with a horizontal axis running from −2 to 2.]


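Empirical error rate (with the 2-means):

[Figure: two panels, one with ε on the horizontal axis (0 to 0.8) and one with a horizontal axis running from −2 to 2; legend: "If x is well placed", "If not".]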


Influence function of the error rate


Influence functions

Hampel et al. (1986): for any statistical functional T and any distribution F,

\[ IF(x; T, F) = \lim_{\varepsilon \to 0} \frac{T(F_\varepsilon) - T(F)}{\varepsilon} = \frac{\partial}{\partial \varepsilon} T(F_\varepsilon) \Big|_{\varepsilon = 0} \]

where

Fε = (1 − ε)F + ε∆x;

E_F[IF(X; T, F)] = 0;

T(Fε) ≈ T(F) + ε IF(x; T, F) for ε small enough (first-order von Mises expansion of T at F).
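The definition can be checked numerically on a functional whose influence function is known. The sketch below uses the mean of a discrete distribution, for which IF(x; T, F) = x − T(F), and approximates the limit by a finite difference with a small ε. The helper names and the toy distribution are assumptions made for illustration.

```python
def weighted_mean(points, weights):
    """The mean functional T(F) for a discrete F (weights sum to 1)."""
    return sum(p * w for p, w in zip(points, weights))


def contaminate(points, weights, x, eps):
    """Return F_eps = (1 - eps) F + eps * Delta_x as (points, weights)."""
    return points + [x], [(1 - eps) * w for w in weights] + [eps]


def numerical_if(functional, points, weights, x, eps=1e-6):
    """Finite-difference approximation of IF(x; T, F)."""
    base = functional(points, weights)
    pts, wts = contaminate(points, weights, x, eps)
    return (functional(pts, wts) - base) / eps


points = [1.0, 2.0, 3.0, 6.0]       # toy distribution with mean 3
weights = [0.25] * 4
print(numerical_if(weighted_mean, points, weights, x=10.0))   # close to 10 - 3
```

Averaging the approximated influence function over the support of F gives a value close to zero, in line with E_F[IF(X; T, F)] = 0.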


ER(Fε, F) ≈ ER(F, F) + ε IF(x; ER, F)

ER(Fε, Fε) ≈ ER(F, F) + ε IF(x; ER, F)

Theoretical error rate: ER(Fε, FN) ≥ ER(FN, FN) ⇒ IF(x; ER, FN) ≡ 0, since ε ↦ ER(Fε, FN) attains its minimum at ε = 0.

Empirical error rate: the IF does not vanish!

From now on, ER(F) = ER(F, F).


ER(Fε) = ?

\[ ER(F) = \sum_{j=1}^{2} \pi_j(F)\, \mathbb{P}_{F}\big[ R_F(X) \ne C_j(F) \,\big|\, G_j \big] = \pi_1(F)\,\{1 - F_1(C(F))\} + \pi_2(F)\, F_2(C(F)) \]

Under Fε = (1 − ε)F + ε∆x, one has

\[ ER(F_\varepsilon) = \pi_1(F_\varepsilon)\,\{1 - F_{1,\varepsilon}(C(F_\varepsilon))\} + \pi_2(F_\varepsilon)\, F_{2,\varepsilon}(C(F_\varepsilon)) \]
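For a mixture of two normal components the one-dimensional formula for ER(F) can be evaluated in closed form. The sketch below assumes normal F1 and F2 with a common standard deviation. The function names are illustrative, and the cut-off is passed in rather than computed from the clustering functional.

```python
from math import erf, sqrt


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def error_rate(cut, pi1, mu1, mu2, sigma=1.0):
    """pi1 * (1 - F1(cut)) + pi2 * F2(cut) for normal F1, F2 (assumed model)."""
    f1 = norm_cdf((cut - mu1) / sigma)
    f2 = norm_cdf((cut - mu2) / sigma)
    return pi1 * (1 - f1) + (1 - pi1) * f2


# Balanced mixture 0.5 N(-1, 1) + 0.5 N(1, 1) with cut-off 0
print(error_rate(0.0, 0.5, -1.0, 1.0))
# Moving the cut-off away from 0 increases the error rate in this model
print(error_rate(0.3, 0.5, -1.0, 1.0))
```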


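πi(Fε) = ? and Fi,ε = ?

\[ F_\varepsilon = (1 - \varepsilon) F + \varepsilon \Delta_x \]

\[ \pi_i(F_\varepsilon) = (1 - \varepsilon)\, \pi_i(F) + \varepsilon\, I\{x \in G_i\} \]

\[ F_{i,\varepsilon} = \Big( 1 - \frac{\varepsilon\, I\{x \in G_i\}}{\pi_i(F_\varepsilon)} \Big) F_i + \frac{\varepsilon\, I\{x \in G_i\}}{\pi_i(F_\varepsilon)}\, \Delta_x \]

\[ \Rightarrow \quad F_\varepsilon = \pi_1(F_\varepsilon)\, F_{1,\varepsilon} + \pi_2(F_\varepsilon)\, F_{2,\varepsilon} \]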
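The final implication can be verified in two lines. The sketch below multiplies each contaminated component by its weight and sums, assuming that the contaminating point x belongs to exactly one of the two groups, so that the two indicators add up to one.

```latex
\begin{align*}
\pi_i(F_\varepsilon)\, F_{i,\varepsilon}
 &= \big(\pi_i(F_\varepsilon) - \varepsilon I\{x \in G_i\}\big) F_i + \varepsilon I\{x \in G_i\}\, \Delta_x \\
 &= (1-\varepsilon)\, \pi_i(F)\, F_i + \varepsilon I\{x \in G_i\}\, \Delta_x, \\
\sum_{i=1}^{2} \pi_i(F_\varepsilon)\, F_{i,\varepsilon}
 &= (1-\varepsilon)\big(\pi_1(F) F_1 + \pi_2(F) F_2\big)
    + \varepsilon \big(I\{x \in G_1\} + I\{x \in G_2\}\big) \Delta_x \\
 &= (1-\varepsilon) F + \varepsilon \Delta_x = F_\varepsilon .
\end{align*}
```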


Influence function of the empirical error rate

Proposition

For all x ≠ C(F),

\[
\begin{aligned}
IF(x; ER, F) = {} & -ER(F) + I\{x \le C(F)\}\,\big(1 - 2\, I\{x \in G_1\}\big) + I\{x \in G_1\} \\
& + \tfrac{1}{2}\big( IF(x; T_1, F) + IF(x; T_2, F) \big)\, \big\{ \pi_2(F) f_2(C(F)) - \pi_1(F) f_1(C(F)) \big\}.
\end{aligned}
\]

Expressions for IF(x; T1, F) and IF(x; T2, F) have been computed by García-Escudero and Gordaliza (1999).
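One way to see where each term comes from is to differentiate ER(Fε) at ε = 0. The sketch below is an illustration rather than the authors' proof. It assumes that C(Fε) is differentiable at ε = 0, and uses x ≠ C(F) so that the indicator I{x ≤ C(Fε)} is constant for small ε. The symbol g is a helper introduced only here.

```latex
\begin{align*}
\pi_i(F_\varepsilon)\, F_{i,\varepsilon}(c)
 &= (1-\varepsilon)\, \pi_i(F)\, F_i(c) + \varepsilon\, I\{x \in G_i\}\, I\{x \le c\}, \\
g(c) &:= \pi_1(F)\{1 - F_1(c)\} + \pi_2(F)\, F_2(c), \qquad g(C(F)) = ER(F), \\
ER(F_\varepsilon)
 &= (1-\varepsilon)\, g\big(C(F_\varepsilon)\big)
  + \varepsilon \Big[ I\{x \in G_1\} + I\{x \le C(F_\varepsilon)\}\big(1 - 2 I\{x \in G_1\}\big) \Big], \\
\frac{\partial}{\partial \varepsilon} ER(F_\varepsilon)\Big|_{\varepsilon = 0}
 &= -ER(F) + g'\big(C(F)\big)\, IF(x; C, F)
  + I\{x \in G_1\} + I\{x \le C(F)\}\big(1 - 2 I\{x \in G_1\}\big), \\
g'(c) &= \pi_2(F) f_2(c) - \pi_1(F) f_1(c), \qquad
IF(x; C, F) = \tfrac{1}{2}\big( IF(x; T_1, F) + IF(x; T_2, F) \big).
\end{align*}
```

The last line uses C = (T1 + T2)/2 and the linearity of the influence function.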


Graphics of IF(x; ER, F)

[Figures: IF(x; ER, F) plotted for x from −4 to 4, in pairs of panels. First pair: curves for x in G1 and x in G2. Second pair: curves for π1 = 0.2, 0.4, 0.6, 0.8. Third pair: curves for ∆ = µ2 − µ1 = 1, 3, 5. Fourth pair: curves for σ = 0.5, 1, 1.5, 2.]


Bias of the error rate


Definition

In practice, F is replaced by Fn, the empirical cdf.

The bias is defined by Bn(ER) = E_F[ER(Fn) − ER(F)].

Fernholz (2001):

\[ B_n(ER) = \frac{1}{2n}\, E_F\big[ IF2(X; ER, F) \big] + o(n^{-1}) \]

\[ IF2(x; ER, F) = \frac{\partial^2}{\partial \varepsilon^2}\, ER\big( (1 - \varepsilon) F + \varepsilon \Delta_x \big) \Big|_{\varepsilon = 0} \]

This result holds if ER(·) is Hadamard or Fréchet differentiable.


Computation of the bias

Proposition

Under asymptotic normality and consistency of T1 and T2 (Pollard, 1981 and 1982):

\[
\begin{aligned}
B_n(ER) \approx {} & \frac{1}{4n}\, \big\{ \pi_2(F) f_2(C(F)) - \pi_1(F) f_1(C(F)) \big\}\, \big( E_F[IF2(X; T_1, F)] + E_F[IF2(X; T_2, F)] \big) \\
& + \frac{1}{8n}\, \big\{ \pi_2(F) f_2'(C(F)) - \pi_1(F) f_1'(C(F)) \big\}\, \big( ASV(T_1) + ASV(T_2) + 2\, ASC(T_1, T_2) \big).
\end{aligned}
\]

Expressions for IF2(X; T1, F) and IF2(X; T2, F) have been computed.


Asymptotic difference

How much difference in error rate is to be expected when a clustering rule is estimated from a finite sample?

\[ \text{A-Diff}(ER) = \lim_{n \to \infty} n\, B_n(ER) \]

⇒ Graphical comparisons of the 2-means and 2-medoids methods:

F = π1 N(−∆/2, 1) + (1 − π1) N(∆/2, 1)
• π1 varying and ∆ = 3
• ∆ varying and π1 = 0.4

FN = 0.5 N(−∆/2, 1) + 0.5 N(∆/2, 1)


Graphics of A-Diff(ER) under F

[Figure: two panels comparing the 2-means and the 2-medoids, one with a horizontal axis from 0.2 to 0.8 and one with a horizontal axis from 0 to 10.]


Graphic of A-Diff(ER) under FN

[Figure: one panel comparing the 2-means and the 2-medoids, with a horizontal axis from 0 to 10.]


Simulation study


Simulation settings

F = 0.4 N(−1.5, 1) + 0.6 N(1.5, 1)

n = 100

Fε = (1 − ε)F + ε∆x with
• ε = 0.01 and |x| = 5
• ε = 0.05 and |x| = 5
• ε = 0.01 and |x| = 50
• ε = 0.05 and |x| = 50

N = 1000
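A simulation of this kind can be sketched as follows. This is not the code behind the talk: the way contaminated points are drawn (independently with probability ε, carrying a chosen group label), the exhaustive split search and all function names are assumptions. The exact design used for the reported results may differ, so the numbers produced here need not match them.

```python
import random
from statistics import mean, median


def fit_centers(xs, omega, center):
    """Generalized 2-means in 1D by scanning all splits of the sorted data."""
    xs = sorted(xs)
    best = None
    for m in range(1, len(xs)):
        t1, t2 = center(xs[:m]), center(xs[m:])
        value = sum(omega(min(abs(x - t1), abs(x - t2))) for x in xs)
        if best is None or value < best[0]:
            best = (value, t1, t2)
    return best[1], best[2]


def draw_sample(rng, n, eps, x, x_group, pi1=0.4, mu1=-1.5, mu2=1.5):
    """n labelled draws from F_eps; contaminated points get label x_group."""
    data = []
    for _ in range(n):
        if rng.random() < eps:
            data.append((x, x_group))
        elif rng.random() < pi1:
            data.append((rng.gauss(mu1, 1.0), 1))
        else:
            data.append((rng.gauss(mu2, 1.0), 2))
    return data


def empirical_error_rate(data, omega, center):
    """Fit the rule on the sample and evaluate it on the same sample."""
    t1, t2 = fit_centers([v for v, _ in data], omega, center)
    cut = (t1 + t2) / 2
    wrong = sum((1 if v < cut else 2) != g for v, g in data)
    return wrong / len(data)


def average_error_rate(eps, x, x_group, omega, center, n=100, reps=200, seed=0):
    """Average empirical error rate over `reps` simulated samples."""
    rng = random.Random(seed)
    rates = [empirical_error_rate(draw_sample(rng, n, eps, x, x_group), omega, center)
             for _ in range(reps)]
    return mean(rates)


# 2-means and 2-medoids with 5% contamination at x = -50 labelled as group 1
print(average_error_rate(0.05, -50.0, 1, lambda d: d * d, mean, reps=50))
print(average_error_rate(0.05, -50.0, 1, lambda d: d, median, reps=50))
```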


Simulation results

Contamination                  2-means (0.071)      2-medoids (0.072)
                               in C1     in C2      in C1     in C2
from G1   ε = 0.01, |x| = 5    0.068     0.083      0.071     0.083
          ε = 0.05, |x| = 5    0.064     0.139      0.066     0.127
          ε = 0.01, |x| = 50   0.39      0.61       0.072     0.083
          ε = 0.05, |x| = 50   0.35      0.65       0.35      0.65
from G2   ε = 0.01, |x| = 5    0.078     0.072      0.081     0.073
          ε = 0.05, |x| = 5    0.119     0.083      0.116     0.072
          ε = 0.01, |x| = 50   0.41      0.59       0.081     0.073
          ε = 0.05, |x| = 50   0.45      0.55       0.45      0.55


Contaminated example with the 2-medoids

[Figure: height (in cm) of girls and boys, on an axis from 120 to 180.]


Conclusion of these simulations

A well-placed contamination in the smallest group makes the error rate decrease.

Too much contamination makes the error rate of the 2-means break down.

Unfortunately, the error rate of the 2-medoids also breaks down when the outliers are too numerous and too far away!


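Future research

Generalized trimmed 2-means: for α ∈ [0, 1], (T1(F), T2(F)) are the solution of

\[ \min_{A : F(A) = 1 - \alpha} \; \min_{\{t_1, t_2\} \subset \mathbb{R}} \; \int_A \Omega\Big( \inf_{1 \le j \le 2} \|x - t_j\| \Big) \, dF(x). \]

Theoretical error rate

\[ ER(F_\varepsilon, F_m) = \sum_{j=1}^{2} \pi_j(F_m)\, \mathbb{P}_{F_m}\big[ R_{F_\varepsilon}(X) \ne C_j(F_\varepsilon) \,\big|\, G_j \big] \]

More than 1 dimension or more than 2 groups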


Thank you for your attention!


Bibliography

CROUX C., FILZMOSER P. and JOOSSENS K. (2008), Classification efficiencies for robust linear discriminant analysis, Statistica Sinica, 18, pp. 581-599.

CROUX C., HAESBROECK G. and JOOSSENS K. (2008), Logistic discrimination using robust estimators: an influence function approach, The Canadian Journal of Statistics, 36, pp. 157-174.

FERNHOLZ L. T. (2001), On multivariate higher order von Mises expansions, Metrika, 53, pp. 123-140.

GARCÍA-ESCUDERO L. A. and GORDALIZA A. (1999), Robustness Properties of k Means and Trimmed k Means, Journal of the American Statistical Association, 94 (447), pp. 956-969.

HAMPEL F. R., RONCHETTI E. M., ROUSSEEUW P. J. and STAHEL W. A. (1986), Robust Statistics: The Approach Based on Influence Functions, John Wiley and Sons, New York.

ANDERSON T. W. (1958), An Introduction to Multivariate Statistical Analysis, Wiley, New York, pp. 126-133.

POLLARD D. (1981), Strong Consistency of k-Means Clustering, The Annals of Probability, 9 (4), pp. 919-926.

POLLARD D. (1982), A Central Limit Theorem for k-Means Clustering, The Annals of Probability, 10 (1), pp. 135-140.

QIU D. and TAMHANE A. C. (2007), A comparative study of the k-means algorithm and the normal mixture model for clustering: Univariate case, Journal of Statistical Planning and Inference, 137, pp. 3722-3740.
