
Chapter 18
Goodness-of-Fit Tests for Reliability Modeling

Alex Karagrigoriou

Abstract In this work we provide goodness-of-fit (GoF) tests which are based on measures of divergence and can be used for assessing the appropriateness of both discrete and continuous distributions. The main focus is on continuous distributions like the exponential, Weibull, Inverse Gaussian, Gamma and lognormal which frequently appear in engineering and reliability. An extensive simulation study shows that the new tests maintain good stability in level and high power across a wider range of distributions and sample sizes than other tests.

Keywords Generalized goodness-of-fit tests · Measures of divergence · Lifetime distribution · Multinomial populations

18.1 Introduction

The $\chi^2$ test has passed into common use since its introduction by Pearson (1900). It has, however, long been recognized that in certain cases it is inadequate as a test for goodness-of-fit (GoF), in particular when the deviations of the observations from the hypothesis tested are consecutively positive (or negative): because the $\chi^2$ test takes into account only the square of the difference between the observed and expected values, it cannot pay attention to the sign of this difference. It is because of this inadequacy that Neyman (1937) introduced the "smooth" test for GoF.

A. Karagrigoriou (✉)
55 Lomvardou Street, 11474 Athens, Greece
e-mail: [email protected]

A. Lisnianski and I. Frenkel (eds.), Recent Advances in System Reliability, Springer Series in Reliability Engineering, DOI: 10.1007/978-1-4471-2207-4_18, © Springer-Verlag London Limited 2012


The smooth Neyman test relies on normalized Legendre polynomials on [0, 1]. The Neyman smooth tests were constructed to be asymptotically locally uniformly most powerful, symmetric, unbiased and of specified size against specified smooth alternatives (Neyman 1937). For the order k smooth alternative, the resulting test statistic is based on the summation of the squares of the first k Legendre polynomials. Then, under the null hypothesis, the test statistic follows the central chi-square distribution with k degrees of freedom.

The smooth tests were found to provide useful diagnostic and inferential tools but, over the years, have been taken over by the so-called omnibus tests of the Cramer-von Mises family. Indeed, another way to view the GoF problem is through the Cramer-von Mises family of tests. The general Cramer-von Mises family of test statistics focuses on the squared differences given by

$$Q = \int_{-\infty}^{\infty} \left[ F_n(y) - F_0(y) \right]^2 W(y) \, dF_0(y)$$

where $F_n(y)$ is the empirical distribution function, $F_0(y)$ is the hypothesized distribution and $W(y)$ is an appropriate scaling function. For different forms of the function $W(y)$ we obtain the Cramer-von Mises test statistic (Cramer 1928; von Mises 1931), the Anderson–Darling test statistic (Anderson and Darling 1954) and, through a modification, the Watson test (Watson 1961). All these test statistics advocate the use of tables or formulas for critical values which depend on the parameters of the underlying distribution.
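As an illustration of how the integral $Q$ is evaluated in practice, the following minimal sketch computes $nQ$ for two choices of the weight, $W(y) = 1$ (Cramer-von Mises) and $W(y) = 1/[F_0(y)(1 - F_0(y))]$ (Anderson–Darling), using the usual order-statistics formulas. The exponential null, the function names and the small data set are assumptions made for the example and are not taken from the chapter; critical values are not computed.

```python
import math

def exp_cdf(y, rate=1.0):
    """CDF of the exponential distribution, used here as the hypothesized F0."""
    return 1.0 - math.exp(-rate * y) if y > 0 else 0.0

def cramer_von_mises(sample, cdf):
    """n*Q for W(y) = 1, via the order-statistics formula."""
    u = sorted(cdf(y) for y in sample)
    n = len(u)
    return 1.0 / (12 * n) + sum((u[i] - (2 * i + 1) / (2 * n)) ** 2 for i in range(n))

def anderson_darling(sample, cdf):
    """n*Q for W(y) = 1 / (F0(y) * (1 - F0(y))); needs 0 < F0(y) < 1 for all data."""
    u = sorted(cdf(y) for y in sample)
    n = len(u)
    s = sum((2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
            for i in range(n))
    return -n - s / n

# Made-up lifetimes, for illustration only.
lifetimes = [0.12, 0.35, 0.61, 0.98, 1.40, 2.25]
w2 = cramer_von_mises(lifetimes, exp_cdf)
a2 = anderson_darling(lifetimes, exp_cdf)
```

Both functions depend on the data only through the transformed values $F_0(y_{(i)})$, which is why the same code can be reused for any fully specified null distribution.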

All the above tests are related to and based on measures of disparity or divergence between the true but unknown and the hypothesized distributions. Measures of divergence between two probability distributions have a very long history. One could consider as pioneers in this field the famous mathematicians and statisticians of the twentieth century, Pearson (1900), Mahalanobis (1936), Levy (1925, p. 199–200) and Kolmogorov (1933). The most popular measures of divergence are considered to be the Kullback–Leibler (KL) measure of divergence and the measures associated with Pearson's chi-square test and the likelihood ratio test. The tests based on these measures can be used, among other areas, in reliability applications. In many reliability applications, certain classes of life distributions have been introduced to describe several aging criteria. Such examples include the residual lifetime that represents the remaining life of an aging system and the renewal lifetime of a device replaced upon failure. Among the most well-known families of life distributions are the classes of increasing failure rate (IFR), decreasing failure rate (DFR), increasing failure rate average (IFRA), new better than used (NBU), new better than used in expectation (NBUE) and harmonic new better than used in expectation (HNBUE). For the residual lifetime, if the mean residual life (i.e., the expected value of the remaining life) is a decreasing function then we encounter an important and practical class of life distributions, the decreasing mean residual lifetime distribution (DMRL). In most of these problems the null distribution is the exponential distribution and the alternative is a member of one of the above classes other than the exponential.

In this work we focus on testing problems in reliability, like the ones discussed earlier. In Sect. 18.2 we discuss various measures of divergence and in Sect. 18.3 the associated tests of fit. In Sect. 18.4 we provide simulations in order to explore the capabilities of the proposed tests for various continuous distributions.

18.2 Divergence Measures

The most famous family of measures is the $\varphi$-divergence between two functions f and g, also known as Csiszar's measure of information, which was introduced and investigated independently by Csiszar (1963) and Ali and Silvey (1966) and is given by

$$D_\varphi(f, g) = \int_X g(x) \, \varphi\!\left( \frac{f(x)}{g(x)} \right) d\mu(x), \quad \varphi \in \Phi \tag{18.1}$$

where $\Phi$ is the class of all convex functions $\varphi(x)$, $x \ge 0$, such that at $x = 1$, $\varphi(1) = 0$, at $x = 0$, $0\,\varphi(0/0) = 0$ and $0\,\varphi(u/0) = \lim_{u \to \infty} \varphi(u)/u$. For various choices of $\varphi$ the measure takes different forms. Note that the well-known KL divergence measure (Kullback and Leibler 1951) is obtained for $\varphi(x) = x \log(x) - x + 1 \in \Phi$. If $\varphi(u) = \frac{1}{2}(1 - u)^2$,

$$\varphi(u) = \Phi_{2,\lambda}(u) = \frac{1}{\lambda(\lambda + 1)} \left[ u^{\lambda + 1} - u - \lambda(u - 1) \right], \quad \lambda \ne 0, -1, \tag{18.2}$$

or $\varphi(u) = (1 - \sqrt{u})^2$, Csiszar's measure yields Pearson's chi-squared divergence, the Cressie and Read power divergence (Cressie and Read 1984) and Matusita's divergence (Matusita 1967), respectively. More examples can be found in Arndt (2001), Pardo (2006) and Vajda (1989, 1995).
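The special cases listed above can be checked numerically in the discrete setting, where the integral in (18.1) becomes a sum over cells. The sketch below is illustrative only: the function names and the two probability vectors are assumptions, and the sum skips cells with $g_i = 0$ instead of implementing the boundary conventions of the class $\Phi$.

```python
import math

def phi_kl(u):
    """phi(x) = x log(x) - x + 1, with the limiting value 1 at u = 0."""
    return u * math.log(u) - u + 1.0 if u > 0 else 1.0

def phi_pearson(u):
    """phi(u) = (1 - u)^2 / 2 (Pearson's chi-squared divergence)."""
    return 0.5 * (1.0 - u) ** 2

def phi_matusita(u):
    """phi(u) = (1 - sqrt(u))^2 (Matusita's divergence)."""
    return (1.0 - math.sqrt(u)) ** 2

def phi_cressie_read(lam):
    """Phi_{2,lambda} of (18.2); lam must differ from 0 and -1."""
    def phi(u):
        return (u ** (lam + 1) - u - lam * (u - 1.0)) / (lam * (lam + 1.0))
    return phi

def phi_divergence(f, g, phi):
    """Discrete form of (18.1): sum of g_i * phi(f_i / g_i) over cells with g_i > 0."""
    return sum(gi * phi(fi / gi) for fi, gi in zip(f, g) if gi > 0)

# Hypothetical probability vectors, chosen only for illustration.
f = [0.5, 0.3, 0.2]
g = [1 / 3, 1 / 3, 1 / 3]

kl = phi_divergence(f, g, phi_kl)
pearson = phi_divergence(f, g, phi_pearson)
cr_lambda_1 = phi_divergence(f, g, phi_cressie_read(1.0))
matusita = phi_divergence(f, g, phi_matusita)
```

For probability vectors the KL choice gives $\sum_i f_i \log(f_i / g_i)$, and the Cressie and Read function with $\lambda = 1$ coincides with the Pearson choice, since $\Phi_{2,1}(u) = (u - 1)^2 / 2$.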

A unified analysis has been provided by Read and Cressie (1988) who introduced for both the continuous and the discrete case a family of measures of divergence known as the power divergence family of statistics, which depends on a parameter $\lambda$ and is used for GoF tests for multinomial distributions. The Cressie and Read family includes among others the well-known Pearson's $\chi^2$ divergence measure and, for multinomial models, the log likelihood ratio statistic.

Csiszar's family of measures was recently generalized by Mattheou and Karagrigoriou (2010) to the $\Phi$-family of measures between two functions f and g which is given by

$$D_X^a(g, f) = \int g^{1+a}(x) \, \Phi\!\left( \frac{f(x)}{g(x)} \right) d\mu(x), \quad \Phi \in \Phi^*, \; a > 0 \tag{18.3}$$

where $\mu$ represents the Lebesgue measure and $\Phi^*$ is the class of all convex functions $\Phi$ on $[0, \infty)$ such that $\Phi(1) = \Phi'(1) = 0$ and $\Phi''(1) \ne 0$. We also assume the conventions $0\,\Phi(0/0) = 0$ and $0\,\Phi(u/0) = \lim_{u \to \infty} \Phi(u)/u$, $u > 0$.

Expression (18.3) covers not only the continuous case but also a discrete setting where the measure $\mu$ is a counting measure. Indeed, for the discrete case, the divergence in (18.3) is meaningful for probability mass functions f and g whose support is a subset of the support $S_\mu$, finite or countable, of the counting measure $\mu$ that satisfies $\mu(x) = 1$ for $x \in S_\mu$ and 0 otherwise.

The discrete version of the $\Phi$-family of divergence measures is presented in the definition below.

Definition 1 For two discrete distributions $P = (p_1, \ldots, p_m)'$ and $Q = (q_1, \ldots, q_m)'$ with sample space $X = \{x : p(x)q(x) > 0\}$, where $p(x)$ and $q(x)$ are the probability mass functions of the two distributions, the discrete version of the $\Phi$-family of divergence measures with a general function $\Phi$ as in (18.3) and $a > 0$ is given by

$$d_a = \sum_{i=1}^{m} q_i^{1+a} \, \Phi\!\left( \frac{p_i}{q_i} \right). \tag{18.4}$$

For $\Phi$ having the special form

$$\Phi_1(u) = u^{1+a} - \left( 1 + \frac{1}{a} \right) u^a + \frac{1}{a} \tag{18.5}$$

we obtain the BHHJ measure of Basu et al. (1998) which was proposed for the development of a minimum divergence estimating method for robust parameter estimation. Observe that for $\Phi$ having the special form $\Phi(u) = \varphi(u)$ and $a = 0$ we obtain Csiszar's $\varphi$-divergence family of measures, while for $a = 0$ and $\Phi(u) = \Phi_{2,\lambda}(u)$ as in (18.2) it reduces to the Cressie and Read power divergence measure. Other important special cases are the ones for which the function $\Phi(u)$ takes the form $\Phi_2(u) = (1 + \lambda)\Phi_{2,\lambda}(u)$ and

$$\Phi_{1,a}(u) = \frac{1}{1+a} \Phi_1(u) = \frac{1}{1+a} \left[ u^{1+a} - \left( 1 + \frac{1}{a} \right) u^a + \frac{1}{a} \right]. \tag{18.6}$$

It is easy to see that for $a \to 0$ the measure $\Phi_2(\cdot)$ reduces to the KL measure. Note that, from the statistical point of view, the Cressie and Read (CR) family is very important, since the power divergence statistic that emerged from it as a by-product for GoF purposes in multinomial distributions (Read and Cressie 1988) has received widespread attention.
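To make the discrete measure (18.4) with the function $\Phi_1$ of (18.5) concrete, the following sketch evaluates $d_a$ directly and compares it with the expanded sum obtained by multiplying out $q_i^{1+a}\Phi_1(p_i/q_i)$. The function names and the probability vectors are hypothetical, and the index $a = 0.05$ is simply the value used later in the simulations.

```python
import math

def phi_1(u, a):
    """Phi_1 of (18.5); requires a > 0."""
    return u ** (1 + a) - (1 + 1 / a) * u ** a + 1 / a

def d_a(p, q, a, phi=phi_1):
    """Discrete Phi-divergence (18.4): sum of q_i^(1+a) * Phi(p_i / q_i)."""
    return sum(qi ** (1 + a) * phi(pi / qi, a) for pi, qi in zip(p, q))

def d_a_expanded(p, q, a):
    """Same quantity after multiplying out q_i^(1+a) * Phi_1(p_i / q_i)."""
    return sum(pi ** (1 + a) - (1 + 1 / a) * qi * pi ** a + qi ** (1 + a) / a
               for pi, qi in zip(p, q))

# Hypothetical probability vectors, for illustration only.
p = [0.5, 0.3, 0.2]
q = [1 / 3, 1 / 3, 1 / 3]

value = d_a(p, q, 0.05)
check = d_a_expanded(p, q, 0.05)
small_a = d_a(p, q, 1e-5)
kl_type_sum = sum(qi * math.log(qi / pi) for pi, qi in zip(p, q))
```

Expanding $u^a$ around $a = 0$ gives $\Phi_1(u) \to u - 1 - \log u$, so for probability vectors the value computed with a very small index is numerically close to the KL-type sum $\sum_i q_i \log(q_i / p_i)$.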


18.3 Tests of Fit Based on Divergence Measures

It is important to point out that any type of data can be handled by the above tests of fit. To introduce the basic ideas, we discuss first the problem of testing a simple GoF hypothesis. More specifically, assume that $X_1, \ldots, X_n$ are i.i.d. random variables with common distribution function (d.f.) F. Given some specified d.f. $F_0$, the classical GoF problem is concerned with testing $H_0 : F = F_0$. The above problem is frequently treated by partitioning the range of data into m disjoint intervals and by testing the hypothesis

$$H_0 : p = p_0 \tag{18.7}$$

about the vector of parameters of a multinomial distribution.

Let $P = \{E_i\}_{i=1,\ldots,m}$ be a partition of the real line $R$ in m intervals. Let $p = (p_1, \ldots, p_m)'$ and $p_0 = (p_{10}, \ldots, p_{m0})'$ be the true and the hypothesized probabilities of the intervals $E_i$, $i = 1, \ldots, m$, respectively, in such a way that $p_i = P_F(E_i)$ and $p_{i0} = P_{F_0}(E_i) = \int_{E_i} dF_0$, $i = 1, \ldots, m$.

Let $Y_1, \ldots, Y_n$ be a random sample from F, let $n_i = \sum_{j=1}^{n} I_{E_i}(Y_j)$, where

$$I_{E_i}(Y_j) = \begin{cases} 1, & \text{if } Y_j \in E_i \\ 0, & \text{otherwise,} \end{cases}$$

and let $\hat{p} = (\hat{p}_1, \hat{p}_2, \ldots, \hat{p}_m)'$, with $\hat{p}_i = n_i/n$, $i = 1, \ldots, m$, the maximum likelihood estimator (MLE) of $p_i$, the true probability of the interval $E_i$, and $\sum_{i=1}^{m} n_i = n$. For testing the simple null hypothesis $H_0 : p = p_0$, the most commonly used test statistics are Pearson's or chi-squared test statistic, given by

$$X^2 = \sum_{i=1}^{m} \frac{(n_i - n p_{i0})^2}{n p_{i0}} \tag{18.8}$$

and the likelihood ratio test statistic given by

$$G^2 = 2 \sum_{i=1}^{m} n_i \log\!\left( \frac{n_i}{n p_{i0}} \right). \tag{18.9}$$

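Both these test statistics are particular cases of the family of power-divergence test statistics (CR test) which has been introduced by Cressie and Read (1984) and is given by

$$I_n^{\lambda}(\hat{p}, p_0) = \frac{2n}{\lambda(\lambda + 1)} \sum_{i=1}^{m} \hat{p}_i \left[ \left( \frac{\hat{p}_i}{p_{i0}} \right)^{\lambda} - 1 \right] = 2n \sum_{i=1}^{m} p_{i0} \, \Phi_{2,\lambda}\!\left( \frac{\hat{p}_i}{p_{i0}} \right), \quad \lambda \ne -1, 0 \tag{18.10}$$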


where $-\infty < \lambda < \infty$. Particular values of $\lambda$ in (18.10) correspond to well-known test statistics: the chi-squared test statistic $\chi^2$ ($\lambda = 1$), the likelihood ratio test statistic $G^2$ ($\lambda \to 0$), the Freeman–Tukey test statistic (Freeman and Tukey 1959) ($\lambda = -1/2$), the minimum discrimination information statistic (Gokhale and Kullback 1978) ($\lambda \to -1$), the modified chi-squared test statistic ($\lambda = -2$) and the Cressie–Read test statistic ($\lambda = 2/3$).
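The family (18.10) and its two limiting cases can be sketched in a few lines. The function name and the cell counts below are hypothetical; the code assumes strictly positive observed counts whenever $\lambda \le -1$, and it uses the standard limits of (18.10), namely (18.9) for $\lambda \to 0$ and $2n \sum_i p_{i0} \log(p_{i0} / \hat{p}_i)$ for $\lambda \to -1$.

```python
import math

def power_divergence(counts, p0, lam):
    """Power-divergence statistic (18.10) for observed cell counts and null probabilities p0."""
    n = sum(counts)
    if lam == 0:
        # Limit lam -> 0: likelihood ratio statistic G^2 of (18.9).
        return 2.0 * sum(ni * math.log(ni / (n * pi0))
                         for ni, pi0 in zip(counts, p0) if ni > 0)
    if lam == -1:
        # Limit lam -> -1: minimum discrimination information statistic.
        return 2.0 * n * sum(pi0 * math.log(n * pi0 / ni)
                             for ni, pi0 in zip(counts, p0))
    total = sum((ni / n) * ((ni / (n * pi0)) ** lam - 1.0)
                for ni, pi0 in zip(counts, p0) if ni > 0)
    return 2.0 * n * total / (lam * (lam + 1.0))

# Hypothetical counts for n = 20 observations in m = 3 equiprobable cells.
counts = [5, 9, 6]
p0 = [1 / 3, 1 / 3, 1 / 3]

pearson_x2 = power_divergence(counts, p0, 1.0)        # equals X^2 of (18.8)
g2 = power_divergence(counts, p0, 0.0)                # equals G^2 of (18.9)
freeman_tukey = power_divergence(counts, p0, -0.5)
cressie_read = power_divergence(counts, p0, 2.0 / 3.0)
```

For these counts the expected count per cell is $20/3$, and $\lambda = 1$ returns Pearson's $X^2 = 1.3$, which can be confirmed by evaluating (18.8) directly.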

Although the power-divergence test statistics yield an important family of GoF tests, it is possible to consider the more general Csiszar's family of $\varphi$-divergence test statistics for testing (18.7), which contains (18.10) as a particular case, is based on the discrete form of (18.1) and is defined by

$$I_n^{\varphi}(\hat{p}, p_0) = \frac{2n}{\varphi''(1)} \sum_{i=1}^{m} p_{i0} \, \varphi\!\left( \frac{\hat{p}_i}{p_{i0}} \right) \tag{18.11}$$

with $\varphi(x)$ a twice continuously differentiable function for $x > 0$ such that $\varphi''(1) \ne 0$.

As noted in Sect. 18.2, Csiszar's family of measures was recently generalized by Mattheou and Karagrigoriou (2010) to the $\Phi$-family of measures given by (18.4). The same authors proposed a general test of fit based on this $\Phi$-family of measures:

$$I_n^{\Phi}(\hat{p}, p_0) = \frac{2n \, d_a}{\Phi''(1)}, \quad d_a = \sum_{i=1}^{m} p_{i0}^{1+a} \, \Phi\!\left( \frac{\hat{p}_i}{p_{i0}} \right), \quad \Phi \in \Phi^*. \tag{18.12}$$

Cressie and Read (1984) obtained the asymptotic distribution of the power-divergence test statistics $I_n^{\Phi}(\hat{p}, p_0)$ for $\Phi(u) = \Phi_{2,a}(u)$; Zografos et al. (1990) extended the result to the family $I_n^{\Phi}(\hat{p}, p_0)$ for $a = 0$ and $\Phi = \varphi \in \Phi^*$, and Mattheou and Karagrigoriou (2010) extended the result to cover any function $\Phi \in \Phi^*$:

Theorem 1 Under the null hypothesis $H_0 : p = p_0 = (p_{10}, \ldots, p_{m0})'$, the asymptotic distribution of the $\Phi$-divergence test statistic $I_n^{\Phi}(\hat{p}, p_0)$, divided by a constant c, is chi-square with m - 1 degrees of freedom:

$$\frac{1}{c} I_n^{\Phi}(\hat{p}, p_0) \xrightarrow[n \to \infty]{L} \chi^2_{m-1},$$

where $c = 0.5 \left( \min_i p_{i0}^a + \max_i p_{i0}^a \right)$.

The following two theorems for the CR test statistic and the $\varphi$-family of test statistics are special cases of the above theorem for $c = 1$, $a = 0$ and for the appropriate forms of the function $\Phi(\cdot)$.

Theorem 2 Under the null hypothesis $H_0 : p = p_0 = (p_{10}, \ldots, p_{m0})'$, the asymptotic distribution of the divergence test statistic $I_n^{\lambda}(\hat{p}, p_0)$ with $\Phi = \Phi_{2,\lambda}$ given in (18.10), is chi-square with m - 1 degrees of freedom:

$$I_n^{\lambda}(\hat{p}, p_0) = I_n^{\Phi_{2,\lambda}}(\hat{p}, p_0) \xrightarrow[n \to \infty]{L} \chi^2_{m-1}.$$

Theorem 3 Under the null hypothesis $H_0 : p = p_0 = (p_{10}, \ldots, p_{m0})'$, the asymptotic distribution of the $\varphi$-divergence test statistic $I_n^{\varphi}(\hat{p}, p_0)$ given in (18.11), is chi-square with m - 1 degrees of freedom:

$$I_n^{\varphi}(\hat{p}, p_0) \xrightarrow[n \to \infty]{L} \chi^2_{m-1}.$$
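Theorem 1 translates into a simple testing recipe: compute $d_a$ from the observed cell proportions, scale by $2n / \Phi''(1)$, divide by c and compare with a chi-square quantile. The sketch below does this for $\Phi = \Phi_1$ of (18.5), for which differentiating twice gives $\Phi_1''(1) = 1 + a$. It is a minimal illustration under stated assumptions: the function names and counts are hypothetical, and the chi-square tail probability uses the closed form available for an even number of degrees of freedom, so it applies only when m - 1 is even (for example m = 3).

```python
import math

def phi_1(u, a):
    """Phi_1 of (18.5); requires a > 0."""
    return u ** (1 + a) - (1 + 1 / a) * u ** a + 1 / a

def phi_test_statistic(counts, p0, a):
    """(1/c) * I_n^Phi of Theorem 1 with Phi = Phi_1, so that Phi''(1) = 1 + a."""
    n = sum(counts)
    d = sum(pi0 ** (1 + a) * phi_1((ni / n) / pi0, a) for ni, pi0 in zip(counts, p0))
    stat = 2.0 * n * d / (1.0 + a)
    c = 0.5 * (min(pi0 ** a for pi0 in p0) + max(pi0 ** a for pi0 in p0))
    return stat / c

def chi2_sf_even_df(x, df):
    """P(chi-square_df > x), closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= (x / 2.0) / i
        total += term
    return math.exp(-x / 2.0) * total

# Hypothetical counts: n = 20 observations, m = 3 equiprobable cells, index a = 0.05.
counts = [5, 9, 6]
p0 = [1 / 3, 1 / 3, 1 / 3]
stat = phi_test_statistic(counts, p0, a=0.05)
p_value = chi2_sf_even_df(stat, df=len(p0) - 1)
reject_at_5_percent = p_value < 0.05
```

With two degrees of freedom the tail probability is $\exp(-x/2)$, so the critical point at level $\alpha$ is $-2 \log \alpha$, that is approximately 5.991 at the 5% level and 4.605 at the 10% level.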

Consider the hypothesis

$$H_0 : p_i = p_{i0} \quad \text{vs.} \quad H_1 : p_i = p_{ib}, \quad i = 1, \ldots, m.$$

Suppose the null hypothesis indicates that $p_i = p_{i0}$, $i = 1, 2, \ldots, m$, when in fact $p_i = p_{ib}$ for all i. As is well known, if $p_{i0}$ and $p_{ib}$ are fixed then as n tends to infinity the power of the test tends to 1. In order to examine the situation when the power is not close to 1, we must make it continually harder for the test as n increases. This can be done by letting the alternative hypothesis move steadily closer to the null hypothesis. As a result, we define a sequence of alternative hypotheses as follows

$$H_{1,n} : p_i = p_{in} = p_{i0} + d_i / \sqrt{n}, \quad \forall i, \tag{18.13}$$

which is known as Pitman transition alternative or Pitman (local) alternative or local contiguous alternative to the null hypothesis $H_0 : p_i = p_{i0}$ (Cochran 1952; Lehmann 1959). In vector notation the null hypothesis and the local contiguous alternative hypotheses take the form

$$H_0 : p = p_0 \quad \text{vs.} \quad H_{1,n} : p = p_n = p_0 + d / \sqrt{n},$$

where $p_n = (p_{1n}, \ldots, p_{mn})'$ and $d = (d_1, \ldots, d_m)'$ is a fixed vector such that $\sum_{i=1}^{m} d_i = 0$. Observe that as n tends to infinity the local contiguous alternative converges to the null hypothesis at the rate $O(n^{-1/2})$.

The following theorem by Mattheou and Karagrigoriou (2010) provides the asymptotic distribution of the $\Phi$-divergence test, under contiguous alternatives.

Theorem 4 Under the alternative hypothesis given in (18.13), the asymptotic distribution of the $\Phi$-divergence test statistic $I_n^{\Phi}(\hat{p}, p_0)$, divided by a constant c, is a non-central chi-square with m - 1 degrees of freedom:

$$\frac{1}{c} I_n^{\Phi}(\hat{p}, p_0) \xrightarrow[n \to \infty]{L} \chi^2_{m-1, \delta},$$

where $c = 0.5 \left( \min_i p_{i0}^a + \max_i p_{i0}^a \right)$ and the noncentrality parameter is $\delta = \sum_{i=1}^{m} d_i^2 / p_{i0}$.
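The limiting distribution in Theorem 4 can be used to approximate the power of the test against a given Pitman alternative. The sketch below computes the noncentrality parameter from a direction vector and evaluates the non-central chi-square tail through its standard representation as a Poisson mixture of central chi-square tails. Everything named here is an assumption made for illustration: the direction vector is hypothetical, the series is truncated at a fixed number of terms, and the central tail uses the closed form that exists only for an even number of degrees of freedom (so m - 1 must be even, as for m = 3).

```python
import math

def noncentrality(d, p0):
    """Noncentrality parameter of Theorem 4: sum of d_i^2 / p_i0."""
    if abs(sum(d)) > 1e-12:
        raise ValueError("the components of d must sum to zero")
    return sum(di ** 2 / pi0 for di, pi0 in zip(d, p0))

def chi2_sf_even_df(x, df):
    """P(chi-square_df > x), closed form valid only for even df."""
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= (x / 2.0) / i
        total += term
    return math.exp(-x / 2.0) * total

def ncx2_sf_even_df(x, df, nc, terms=200):
    """Non-central chi-square tail as a truncated Poisson(nc / 2) mixture of central tails."""
    weight = math.exp(-nc / 2.0)      # Poisson probability of j = 0
    total = 0.0
    for j in range(terms):
        total += weight * chi2_sf_even_df(x, df + 2 * j)
        weight *= (nc / 2.0) / (j + 1)
    return total

# Hypothetical direction d for m = 3 equiprobable cells.
p0 = [1 / 3, 1 / 3, 1 / 3]
d = [0.8, -1.2, 0.4]
delta = noncentrality(d, p0)
critical_point = -2.0 * math.log(0.05)     # 5% point of chi-square with 2 df
approx_power = ncx2_sf_even_df(critical_point, df=2, nc=delta)
```

When the direction vector is zero the noncentrality vanishes and the computed value reduces to the nominal level, which gives a quick sanity check of the implementation.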


An alternative testing procedure which is used exclusively for the MRL class of distributions is based on the following measure of departure (Ahmad 1992):

$$D_F = \int_0^{\infty} \left\{ 2x \bar{F}(x) - v(x) \right\} dF(x),$$

where $v(x) = \int_x^{\infty} \bar{F}(u) \, du$ and $\bar{F}(x) = 1 - F(x)$. The problem of exponentiality has also been investigated by Ebrahimi et al. (1992), who proposed a test of fit using the KL divergence and its relation to the entropy measure. Although the above tests have been used for the exponential distribution, other distributions can also be used. The problem of testing the hypothesis that the underlying distribution belongs to the family of logistic distributions has been addressed and examined by Aguirre and Nikulin (1994). Also, very recently, the problem of normality and uniformity was addressed by Vexler and Gurevich (2010) who proposed an empirical likelihood methodology that standardizes the entropy-based tests. Other related GoF tests have been investigated, among others, by Bagdonavicius and Nikulin (2002), Huber-Carol and Vonta (2004), Marsh (2006), Menendez et al. (1997) and Zhang (2002).

We close this section with the problem of composite null hypothesis. Although the simple null hypothesis discussed earlier appears frequently in practice, it is common to test the composite null hypothesis that the unknown distribution belongs to a parametric family $\{F_\theta\}_{\theta \in \Theta}$, where $\Theta$ is an open subset in $R^k$. In this case we can again consider a partition of the original sample space with the probabilities of the elements of the partition depending on the unknown k-dimensional parameter $\theta$. Then, the hypothesis can be tested by the hypotheses

$$H_0 : p_i = p_i(\theta_0) \quad \text{vs.} \quad H_1 : p_i = p_i(\theta),$$

where $\theta_0$ is the true value of the parameter under the null model. Pearson encountered this problem in the well-known chi-square test statistic and suggested the use of a consistent estimator for the unknown parameter. He further claimed that the asymptotic distribution of the resulting test statistic, under the null hypothesis, remains a chi-square random variable with m - 1 degrees of freedom. Later, Fisher (1924) established that the correct distribution does not have m - 1 degrees of freedom. In general, the following theorem holds:

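Theorem 5 Under the composite null hypothesis, the asymptotic distribution of the $\Phi$-divergence test statistic $I_n^{\Phi}(\hat{p}, p_0(\hat{\theta}))$, divided by a constant c, is chi-square with m - k - 1 degrees of freedom:

$$\frac{1}{c} I_n^{\Phi}\!\left( \hat{p}, p_0(\hat{\theta}) \right) \xrightarrow[n \to \infty]{L} \chi^2_{m-k-1},$$

where $c = 0.5 \left( \min_i p_{i0}^a(\hat{\theta}) + \max_i p_{i0}^a(\hat{\theta}) \right)$ and $\hat{\theta}$ is a consistent estimator of $\theta$.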


18.4 Applications and Simulations

It is important to point out that any type of data can be viewed as multinomial data by dividing the range of data into m categories. In that sense data related to biomedicine, engineering and reliability that usually come from continuous distributions can be transformed into multinomial data and tests of fit based on the above measures can be applied. Some of the most popular of such continuous distributions are the exponential, lognormal, Gamma, Inverse Gaussian, Weibull, Pareto and Positive Stable distributions. For instance, the family of the two-parameter inverse Gaussian distribution (IG2) is one of the basic models for describing positively skewed data which arise in a variety of fields of applied research such as cardiology, hydrology, demography, linguistics, employment service, etc. Such examples include the repair times of an airborne communication transceiver (Chhikara and Folks 1977) and quality characteristics (Sim 2003). Recently, Huberman et al. (1998) have demonstrated the appropriateness of the IG2 family for studying internet traffic and in particular the number of visited pages per user within an internet site. Most applications of IG2 are justified by the fact that the IG2 is the distribution of the first passage time in Brownian motion with positive drift. Furthermore, distributions like the Weibull, the Positive Stable and the Pareto are frequently encountered in survival modeling. The main problem of determining the appropriate distribution is extremely important for reducing the possibility of erroneous inference. In addition, the existence of censoring schemes in survival modeling makes the determination of the proper distribution an extremely challenging problem. Finally, distributions like the exponential, the Gamma, the lognormal and others are very common in lifetime problems.

In reliability theory one encounters the exponential and Weibull distributions and subsequently the IFR and DFR distributions. Such models are considered suitable in the understanding of aging (IFR) or of objects whose reliability properties improve over time (DFR). An important aspect then is to check the validity of a specific model assumption. Some research has been done in this regard. For example, Gail and Ware (1979) studied grouped censored survival data by comparing with a known survival distribution, while Akritas (1988) constructed a Pearson-type GoF measure for one-sample data that allows for random censorship. Here, in order to assess the applicability and performance of the proposed tests we focus on the comparison of the proposed method with classical GoF statistics under the condition of no censoring. More specifically we test

$$H_0 : F(t) = 1 - \exp(-t) \quad \text{vs.} \quad H_1 : F(t) = 1 - \exp(-t^{c}). \tag{18.14}$$

Observe that the alternative is the Weibull distribution with shape parameter c. Note that, equivalently, we can test the corresponding hazard or survival functions. The comparison is based on the power and the achieved size of the test. Extending the alternatives of Akritas (1988), we choose $c = 1/(1 + b/\sqrt{n})$, with b = -4(1)4 (that is, b = -4, -3, ..., 4), n = 20, 50 and 120 and m = 3 intervals/categories. Observe that for b = 0 the data come from the exponential null model so that the values appearing in the tables refer to the size of the test. For each combination of alternatives we simulate 10,000 samples. Among the members of the $\Phi$-family we choose the one associated with the function $\Phi_1$ given in (18.5) with index a = 0.05. The competing tests are the KL-test, the $\chi^2$ test ($\chi^2$-test), the Matusita test (Mat-test) and the Cressie and Read test (CR-test). For the intervals we are using a symmetric trinomial distribution with $p_1 = p_3 = 0.20$ and $p_2 = 0.60$ (Tables 18.1, 18.2, 18.3) as well as an equiprobable trinomial model with $p_i = 1/3$, $i = 1, 2, 3$ (Tables 18.4, 18.5, 18.6).
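The simulation design described above can be sketched as follows for the simplest competitor, Pearson's statistic (18.8), with m = 3 equiprobable cells under the standard exponential null. This is an illustrative outline and not the code behind the tables: the function names, the seed and the reduced number of replications are assumptions, Weibull samples are drawn by inverting $F(t) = 1 - \exp(-t^{c})$, and the critical point $-2 \log \alpha$ is the exact chi-square quantile only for two degrees of freedom.

```python
import math
import random

def equiprobable_cutpoints(m):
    """Exp(1) quantiles at 1/m, ..., (m-1)/m, so each cell has null probability 1/m."""
    return [-math.log(1.0 - i / m) for i in range(1, m)]

def weibull_sample(n, shape, rng):
    """Inverse-transform draws from F(t) = 1 - exp(-t^shape)."""
    return [(-math.log(1.0 - rng.random())) ** (1.0 / shape) for _ in range(n)]

def cell_counts(sample, cuts):
    """Number of observations falling in each of the len(cuts) + 1 intervals."""
    counts = [0] * (len(cuts) + 1)
    for y in sample:
        counts[sum(y > cut for cut in cuts)] += 1
    return counts

def pearson_x2(counts, p0):
    """Pearson's statistic (18.8)."""
    n = sum(counts)
    return sum((ni - n * pi0) ** 2 / (n * pi0) for ni, pi0 in zip(counts, p0))

def rejection_rate(n, b, reps=2000, alpha=0.05, seed=1):
    """Monte Carlo rejection frequency for m = 3 cells and shape c = 1 / (1 + b / sqrt(n))."""
    rng = random.Random(seed)
    shape = 1.0 / (1.0 + b / math.sqrt(n))
    cuts = equiprobable_cutpoints(3)
    p0 = [1 / 3, 1 / 3, 1 / 3]
    critical_point = -2.0 * math.log(alpha)   # chi-square quantile, valid for 2 df only
    rejections = sum(
        pearson_x2(cell_counts(weibull_sample(n, shape, rng), cuts), p0) > critical_point
        for _ in range(reps)
    )
    return rejections / reps

size_estimate = rejection_rate(n=20, b=0)        # b = 0 is the exponential null
power_estimate = rejection_rate(n=20, b=-4)
```

Replacing `pearson_x2` by another statistic from Sect. 18.3 and raising `reps` to 10,000 would follow the same pattern; the estimates obtained this way depend on the seed and are not expected to match the tabulated values digit for digit.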

In addition, we compare the standard exponential model with parameter equal to 1 against exponential models with parameter $c = 1 + b/\sqrt{n} \ne 1$, namely

$$H_0 : F(t) = 1 - \exp(-t) \quad \text{vs.} \quad H_1 : F(t) = 1 - \exp(-ct). \tag{18.15}$$

We pick again the same choices for the parameter b but present the results only for the equiprobable trinomial model (Tables 18.7, 18.8, 18.9). Recall that the null model is obtained for b = 0.

The results clearly show that the $\Phi$-test achieves a very good level which is always close to the nominal 5% level. In addition, the power of the test is extremely good in all cases examined. Observe that for the case of the hypothesis (18.14) the $\Phi$-test has a behavior very similar to the Matusita test for both symmetric and equiprobable splitting. At the same time the other three tests have a comparable performance although the KL-test is slightly better for the equiprobable case.

We close this section by applying the proposed test to the data set on mileages for 19 military personnel carriers that failed in service (Grubbs 1971), the sample mean of which is found to be equal to 997.95. The data have been analyzed by Ebrahimi et al. (1992) who found that the exponential distribution cannot be rejected at the 10% level. Our methodology confirms this result. Indeed, by applying Theorem 1 for the equiprobable trinomial model and with $\Phi$ as in (18.5), we see that $c^{-1} I_n^{\Phi}(\hat{p}, p_0) = 0.4585$, which is smaller than the critical point $\chi^2_{3-1;\,0.10} = 4.605$. Thus, the exponential distribution cannot be rejected. Note that the same conclusion is also reached under the symmetric trinomial model with $p_1 = p_3 = 0.2$ and $p_2 = 0.6$, although the test statistic is much larger: $c^{-1} I_n^{\Phi}(\hat{p}, p_0) = 1.9803$. It should be noted that the equiprobable model is extensively used in the literature, especially in small sample problems. In addition, note that in such problems the number of intervals should not be very large (not larger than 5).

Table 18.1 Achieved power (%) at the 5% level for contiguous Weibull alternatives with index b, m = 3 (symmetric) and n = 20

  b    KL-test   $\chi^2$-test   Mat-test   CR-test   $\Phi$-test
 -4     99.99       99.99          99.99      99.99      99.99
 -3     94.60       94.64          92.95      94.66      90.81
 -2     52.04       52.12          48.23      52.73      46.32
 -1     15.01       14.24          13.74      15.14      14.95
  0      5.98        4.45           5.51       5.29       7.78
  1     10.73        7.26          10.23       8.76      14.43
  2     22.87       15.70          22.35      19.62      28.35
  3     39.15       28.05          38.59      34.79      45.19
  4     56.15       42.92          55.75      51.17      62.46

Table 18.2 Achieved power (%) at the 5% level for contiguous Weibull alternatives with index b, m = 3 (symmetric) and n = 50

  b    KL-test   $\chi^2$-test   Mat-test   CR-test   $\Phi$-test
 -4    100.00       99.99         100.00      99.99     100.00
 -3     92.39       89.48          92.91      89.48      93.98
 -2     47.06       39.93          50.65      39.93      53.20
 -1     13.17       10.32          16.03      10.22      17.16
  0      4.86        5.25           6.25       4.83       6.25
  1     10.23       13.42          10.65      11.89      10.45
  2     24.01       29.55          24.12      27.74      23.88
  3     43.53       49.65          43.53      47.70      43.36
  4     62.05       67.28          62.05      65.99      61.96

Table 18.3 Achieved power (%) at the 5% level for contiguous Weibull alternatives with index b, m = 3 (symmetric) and n = 120

  b    KL-test   $\chi^2$-test   Mat-test   CR-test   $\Phi$-test
 -4     99.24       98.95          99.25      99.02      99.27
 -3     83.25       80.28          83.48      80.74      84.03
 -2     41.72       37.15          42.34      37.96      43.41
 -1     13.14       10.68          13.76      11.13      14.27
  0      5.19        4.94           5.33       5.12       5.23
  1     10.47       11.80          10.17      11.86       9.65
  2     26.63       29.62          25.90      29.62      24.79
  3     49.54       52.89          48.79      52.89      47.62

Table 18.4 Achieved power (%) at the 5% level for contiguous Weibull alternatives with index b, m = 3 (equiprobable) and n = 20

  b    KL-test   $\chi^2$-test   Mat-test   CR-test   $\Phi$-test
 -4    100.00      100.00         100.00     100.00     100.00
 -3     93.94       92.55          94.54      92.55      94.54
 -2     39.19       38.43          40.83      38.43      40.83
 -1     11.54       10.60          12.41      10.60      12.41
  0      5.94        5.40           6.69       5.40       6.69
  1      8.19        7.50           9.44       7.50       9.44
  2     13.98       12.66          15.36      12.66      15.36
  3     21.55       19.48          23.75      19.48      23.75
  4     28.92       26.04          31.36      26.04      31.36


Remark Although the Monte Carlo experiments conducted in this work (see (18.14) and (18.15)) refer to the classification problem, one can easily address the formal GoF test by considering general alternatives. In such a case, for the determination of proper and legitimate alternatives one may choose to consider distributions with the same mean, the same skewness or the same kurtosis as the null distribution (see for instance Ebrahimi et al. 1992 and Koutrouvelis et al. 2010).

Table 18.5 Achieved power (%) at the 5% level for contiguous Weibull alternatives with index b, m = 3 (equiprobable) and n = 50

  b    KL-test   $\chi^2$-test   Mat-test   CR-test   $\Phi$-test
 -4     98.26       98.19          98.68      98.19      98.68
 -3     69.25       69.80          72.17      69.80      72.17
 -2     27.37       28.17          30.16      28.17      30.16
 -1      8.89        9.01          10.63       9.01      10.63
  0      5.30        5.16           6.37       5.16       6.37
  1      7.20        7.22           8.64       7.22       8.64
  2     14.22       14.13          16.51      14.13      16.51
  3     23.83       23.66          27.04      23.66      27.04
  4     34.60       33.75          38.80      33.75      38.80

Table 18.6 Achieved power (%) at the 5% level for contiguous Weibull alternatives with index b, m = 3 (equiprobable) and n = 120

  b    KL-test   $\chi^2$-test   Mat-test   CR-test   $\Phi$-test
 -4     89.09       88.76          89.09      88.76      89.30
 -3     58.20       57.82          58.21      57.82      58.84
 -2     25.71       25.22          25.38      25.22      26.10
 -1      8.68        8.45           8.57       8.45       8.91
  0      5.16        4.91           5.16       4.91       5.55
  1      8.47        8.24           8.46       8.24       8.89
  2     16.04       15.62          16.17      15.62      17.16
  3     28.97       28.15          29.36      28.15      30.46
  4     43.20       42.42          43.54      42.42      44.99

Table 18.7 Achieved power (%) at the 5% level for contiguous exponential alternatives with index b, m = 3 (equiprobable) and n = 20

  b    KL-test   $\chi^2$-test   Mat-test   CR-test   $\Phi$-test
 -4     99.99      100.00          99.99     100.00      99.99
 -3     87.41       89.67          87.57      89.67      87.57
 -2     43.70       45.35          44.53      45.35      44.53
 -1     13.39       12.90          14.45      12.90      14.45
  0      5.84        5.16           6.64       5.16       6.64
  1     10.49        9.53          11.80       9.53      11.80
  2     23.53       20.83          25.63      20.83      25.63
  3     40.72       36.81          43.92      36.81      43.92
  4     60.00       55.54          63.07      55.54      63.07

18.5 Conclusions

The aim of this work is the investigation of generalized tests of fit for lifetime distributions which are based on the $\Phi$-divergence class of measures. In particular, we present various test statistics associated with the above testing problem and calculate the size and the power by simulating samples from decreasing and increasing failure rate (DFR and IFR) distributions, which often appear in engineering, aging systems and reliability theory. For comparative purposes we use well-known tests like the $\chi^2$ test, the KL test, the Matusita test and the Cressie and Read test.

The results show that the $\Phi$-test as well as the Matusita test perform better than the KL test in most cases and also have the advantage of distinguishing between null and alternative hypotheses when they are very close.

Table 18.8 Achieved power at the 5% level for contiguous exponential alternatives with index b, m = 3 (equiprobable) and n = 50

 b    KL-test   χ²-test   Mat-test   CR-test   Φ-test
-4     92.27     97.80      97.48     97.80     97.48
-3     73.83     76.20      75.71     76.20     75.71
-2     34.66     36.44      37.25     36.44     37.25
-1     10.74     11.13      12.51     11.13     12.51
 0      5.17      5.22       6.10      5.22      6.10
 1      9.79      9.86      11.67      9.86     11.67
 2     22.79     22.27      25.62     22.27     25.62
 3     44.06     42.96      48.61     42.96     48.61
 4     65.02     63.42      69.17     63.42     69.17

Table 18.9 Achieved power at the 5% level for contiguous exponential alternatives with index b, m = 3 (equiprobable) and n = 120

 b    KL-test   χ²-test   Mat-test   CR-test   Φ-test
-4     93.49     93.47      93.07     93.47     93.20
-3     69.04     68.72      68.09     68.72     68.39
-2     33.64     33.20      33.06     33.20     33.62
-1     10.94     10.69      10.60     10.69     11.22
 0      5.23      5.07       5.23      5.07      5.68
 1      9.87      9.52      10.02      9.52     10.52
 2     25.13     24.30      25.59     24.30     26.82
 3     49.38     48.37      49.99     48.37     51.57
 4     71.38     70.61      71.77     70.61     73.22


References

Aguirre N, Nikulin M (1994) Chi-squared goodness-of-fit test for the family of logistic distributions. Kybernetika 30:214–222

Ahmad IA (1992) A new test for mean residual life times. Biometrika 79:416–419

Akritas MG (1988) Pearson-type goodness-of-fit tests: the univariate case. J Amer Stat Ass 83:222–230

Ali SM, Silvey SD (1966) A general class of coefficients of divergence of one distribution from another. J R Stat Soc B 28:131–142

Anderson TW, Darling DA (1954) A test of goodness of fit. J Amer Stat Ass 49:765–769

Arndt C (2001) Information Measures. Springer, Berlin

Bagdonavicius V, Nikulin MS (2002) Goodness-of-fit tests for accelerated life models. In: Huber-Carol C, Balakrishnan N, Nikulin MS, Mesbah M (eds) Goodness-of-fit tests and model validity. Birkhauser, Boston, pp 281–297

Basu A, Harris IR, Hjort NL, Jones MC (1998) Robust and efficient estimation by minimizing a density power divergence. Biometrika 85:549–559

Chhikara RS, Folks JL (1977) The inverse Gaussian distribution as a lifetime model. Technometrics 19:461–468

Cochran WG (1952) The χ² test of goodness of fit. Ann Math Stat 23:315–345

Cramer H (1928) On the composition of elementary errors. Skand Aktuarietids 11:13–74, 141–180

Cressie N, Read TRC (1984) Multinomial goodness-of-fit tests. J R Stat Soc B 5:440–454

Csiszar I (1963) Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizitat von Markoffschen Ketten. Publ Math Inst Hung Acad Sc 8:84–108

Ebrahimi N, Habibullah M, Soofi ES (1992) Testing exponentiality based on Kullback–Leibler information. J R Stat Soc B 54:739–748

Fisher RA (1924) The conditions under which χ² measures the discrepancy between observation and hypothesis. J R Stat Soc B 87:442–450

Freeman MF, Tukey JW (1959) Transformations related to the angular and the square root. Ann Math Stat 27:601–611

Gail MH, Ware JH (1979) Comparing observed life table data with a known survival curve in the presence of random censorship. Biometrics 35:385–391

Gokhale DV, Kullback S (1978) The information in contingency tables. Marcel Dekker, New York

Grubbs FE (1971) Approximate fiducial bounds on reliability for the two-parameter negative exponential distribution. Technometrics 13:873–876

Huber-Carol C, Vonta F (2004) Frailty models for arbitrarily censored and truncated data. Lifetime Data Anal 10:369–388

Huberman BA, Pirolli PLT, Pitkow JE, Lukose RM (1998) Strong regularities in World Wide Web surfing. Science 280:95–97

Kolmogorov AN (1933) Sulla determinazione empirica di una legge di distribuzione. Giornale dell’Istituto Italiano degli Attuari 4:83–91

Koutrouvelis IA, Canavos GC, Kallioras AG (2010) Cumulant plots for assessing the Gamma distribution. Commun Stat Theory Meth 39:626–641

Kullback S, Leibler R (1951) On information and sufficiency. Ann Math Stat 22:79–86

Lehmann EL (1959) Testing statistical hypotheses. Wiley, New York

Levy P (1925) Calcul des Probabilites. Gauthier-Villars, Paris

Mahalanobis PC (1936) On the generalized distance in statistics. Proc Nation Acad Sc (India) 2:49–55

Marsh P (2006) Data driven likelihood ratio tests for goodness-of-fit with estimated parameters. Discussion Papers in Economics, Department of Economics and Related Studies, The University of York, 2006/20


Mattheou K, Karagrigoriou A (2010) A new family of divergence measures for tests of fit. Aust NZ J Stat 52:187–200

Matusita K (1967) On the notion of affinity of several distributions and some of its applications. Ann Inst Stat Math 19:181–192

Menendez ML, Pardo JA, Pardo L, Pardo MC (1997) Asymptotic approximations for the distributions of the (h, φ)-divergence goodness-of-fit statistics: applications to Renyi’s statistic. Kybernetes 26:442–452

Neyman J (1937) ‘Smooth’ test for goodness of fit. Skand Aktuarietidskr 20:150–199

Pardo L (2006) Statistical inference based on divergence measures. Chapman and Hall/CRC, Boca Raton

Pearson K (1900) On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine 50:157–172

Read TRC, Cressie N (1988) Goodness-of-fit statistics for discrete multivariate data. Springer, New York

Sim CH (2003) Inverse Gaussian control charts for monitoring process variability. Commun Stat Simul Comput 32:223–239

Vajda I (1989) Theory of statistical inference and information. Kluwer, Dordrecht

Vajda I (1995) Information-theoretic methods in Statistics. In: Research Report. Acad Sc Czech Rep Inst Inf Theory Automat, Prague

Vexler A, Gurevich G (2010) Empirical likelihood ratios applied to goodness-of-fit tests based on sample entropy. Comput Stat Data Anal 54:531–545

von Mises R (1931) Wahrscheinlichkeitsrechnung und ihre Anwendung in der Statistik und theoretischen Physik. Deuticke, Leipzig

Watson GS (1961) Goodness-of-fit tests on the circle. Biometrika 48:109–114

Zhang J (2002) Powerful goodness-of-fit tests based on likelihood ratio. J R Stat Soc B 64:281–294

Zografos K, Ferentinos K, Papaioannou T (1990) Φ-divergence statistics: Sampling properties, multinomial goodness of fit and divergence tests. Commun Stat Theory Meth 19:1785–1802
