8/2/2019 ChJS-03!01!05 Uncorrected Proofs
Chilean Journal of Statistics (ChJS) · www.soche.cl/chjs · ISSN 0718-7912 (print) / 0718-7920 (online)
Dear author:
Please find attached the proofs of your article.
You can submit your corrections via e-mail to [email protected], with Cc to [email protected], or by fax to (56)(32)2508322.
Please always indicate the line number to which the correction refers.
You can insert your corrections in the proof PDF and email us the annotated PDF.
For fax submission or scanned copies sent by email, please ensure that your corrections are clearly legible. Use a fine black pen and write the correction in the margin, not too close to the edge of the page.
Remember to indicate the article title, article number, and your name when sending your response via e-mail or fax.
Check the questions that may have arisen during copy editing and insert your answers/corrections.
Check that the text is complete and that all figures, tables and their legends are included. Also check the accuracy of special characters, equations, and electronic supplementary material, if applicable. If necessary, refer to the edited manuscript.
Please take particular care that all such details are correct.
Please do not make changes that involve only matters of style. We have followed ChJS style. Substantial changes in content, e.g., new results, corrected values, title and authorship, are not allowed without the approval of the responsible editor. In such a case, please contact the Editorial Office and return his/her consent together with the proof.
Your article will be published Online First (iFirst) approximately one week after receipt of your corrected proofs. This is the official first publication. Further changes are, therefore, not possible.
The printed version will follow in a forthcoming issue.
Please note that after online publication, subscribers (personal/institutional) to this journal will have access to the complete article using the URL: http://www.soche.cl/chjs.
Kind regards,
Editorial Office
Chilean Journal of Statistics
Journal: Chilean Journal of Statistics
Article: ChJS Vol. 03 No. 01 Art. 05
Author Query Form
Please ensure you fill out your response to the queries raised below and return this form along with your corrections.
Dear author:
During the typesetting process of your article, the following queries have arisen. Please check your typeset proof carefully against the queries listed below and mark the necessary changes either directly on the proof/online grid or in the Author's response area provided below.
Query  Details required                                                              Author's response
1      Please send the corresponding author's mailing address.
2      Line 142: Cook (1982) is not in the reference list. Please add or remove.
3      Line 444: Reference Henderson (1975) is not cited. Please cite or remove.
4      Line 484: Reference Wei et al. (1998) is not cited. Please cite or remove.
Chilean Journal of Statistics
Vol. 3, No. 1, April 2012, 1-18
UNCORRECTED PROOFS

Statistical Modeling
Research Paper

On linear mixed models and their influence diagnostics applied to an actuarial problem

Luis Gustavo Bastos Pinho, Juvêncio Santos Nobre and Sílvia Maria de Freitas

Department of Applied Mathematics and Statistics, Federal University of Ceará, Fortaleza, Brazil

(Received: 10 June 2011 · Accepted in final form: 23 September 2011)
Abstract

In this paper, we motivate the use of linear mixed models and diagnostic analysis in practical actuarial problems. Linear mixed models are an alternative to traditional credibility models. Frees et al. (1999) showed that some mixed models are equivalent to some widely used credibility models. The main advantage of linear mixed models is the use of diagnostic methods. These methods may help to improve the model choice and to identify outliers or influential subjects which deserve closer attention by the insurer. As an application example, the well known data set in Hachemeister (1975) is modeled by a linear mixed model. We conclude that this approach is superior to the traditional credibility one, since the former is more flexible and allows the use of diagnostic methods.

Keywords: Credibility models · Hachemeister model · Linear mixed models · Diagnostics · Local influence · Residual analysis.

Mathematics Subject Classification: 62J05, 62J20.
1. Introduction
One of the main concerns in actuarial science is to predict the future behavior of the aggregate amount of claims of a given contract based on its past experience. By accurately predicting the severity of the claims, the insurer is able to provide a fairer and thus more competitive premium.
Statistical analyses in actuarial science generally belong to the class of repeated measures studies, in which each subject may be observed more than once. By subject we mean each element of the observed set that we want to investigate. Workers of a company, classes of employees, and different states are possible examples of subjects in actuarial science. To model actuarial data, a large variety of statistical models can be used, but it is usually difficult to choose a model due to the data structure, in which within-subject correlation is often seen. Correlation misspecification may lead to erroneous analysis, and in some cases this error is very severe. A clear example may be seen in Demidenko (2004, pp. 2-3), and a similar artificial situation is reproduced in Figure 1, which shows the relation between the number of claims and the number of policy holders of an insurer for nine different regions within a country. In each region the two variables are measured once a year on the
Corresponding author. Email: [email protected]
ISSN: 0718-7912 (print) / ISSN: 0718-7920 (online)
© Chilean Statistical Society (Sociedad Chilena de Estadística)
http://www.soche.cl/chjs
same day for three consecutive years. In Figure 1(a) we do not consider the within-region (within-subject) correlation; the dashed line is a simple linear regression and suggests that the more policy holders there are, the fewer claims occur. In Figure 1(b) we join the observations for each region with a solid line. It is now clear that the number of claims increases with the number of policy holders.
Figure 1. Number of claims versus number of policy holders (thousands) for nine regions: (a) not considering the within-subject correlation, (b) considering the within-subject correlation.
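The reversal illustrated in Figure 1 is easy to reproduce numerically. The sketch below uses hypothetical numbers (not the data behind the figure): each of nine regions has a positive within-region slope, but the regional baselines fall as the number of policy holders grows, so the pooled regression slope comes out negative.

```python
import numpy as np

rng = np.random.default_rng(42)
x_parts, y_parts = [], []
for region in range(9):
    # three yearly measurements per region
    x = 12.0 + region * 0.8 + np.array([0.0, 0.5, 1.0])
    # positive within-region slope (+5 claims per thousand holders),
    # but the baseline drops sharply from one region to the next
    y = 700.0 - 20.0 * region + 5.0 * x + rng.normal(0.0, 0.3, size=3)
    x_parts.append(x)
    y_parts.append(y)

x_all, y_all = np.concatenate(x_parts), np.concatenate(y_parts)

# pooled slope, ignoring the within-region grouping (as in Figure 1(a))
pooled_slope = np.polyfit(x_all, y_all, 1)[0]
# average within-region slope (as in Figure 1(b))
within_slope = np.mean([np.polyfit(xp, yp, 1)[0]
                        for xp, yp in zip(x_parts, y_parts)])
print(pooled_slope, within_slope)
```

The pooled slope is negative while every within-region slope is near +5, which is exactly the correlation misspecification the text warns about.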
It is necessary to take into consideration that each region may have a particular behavior which should be modeled, but this alone is usually not enough. Techniques summarized under the name of diagnostic procedures may help to identify issues of concern, such as highly influential observations, which may distort the analysis. For linear homoskedastic models, a well known diagnostic procedure is the residual plot; for linear mixed models, better suited types of residuals are defined. Besides residual techniques, which are useful, there is a less used class of diagnostic procedures, which includes case deletion and measuring changes in the likelihood of the adjusted model under minor perturbations. Several important issues may not be noticed without the aid of these diagnostic methods.

For introductory information regarding regression models and the respective diagnostic analysis, see Cook and Weisberg (1982) or Draper and Smith (1998). For a comprehensive introduction to linear mixed models, see Verbeke and Molenberghs (2000), McCulloch and Searle (2001) and Demidenko (2004). Diagnostic analysis of linear mixed models was presented and discussed in Beckman et al. (1987), Christensen and Pearson (1992), Hilden-Minton (1995), Lesaffre and Verbeke (1998), Banerjee and Frees (1997), Tan et al. (2001), Fung et al. (2002), Demidenko (2004), Demidenko and Stukel (2005), Zewotir and Galpin (2005), Gumedze et al. (2010) and Nobre and Singer (2007, 2011).
The seminal work of Frees et al. (1999) showed some similarities and equivalences between mixed models and some well known credibility models. Applications to data sets in an actuarial context may be seen in Antonio and Beirlant (2006). Our contribution is to show how to use diagnostic methods for linear mixed models applied to actuarial science. We illustrate how to identify outliers and influential observations and subjects, and we show how to use diagnostics as a tool for model selection. These methods are very important and usually overlooked by most actuaries.
This paper is organized as follows. In Section 2 we present a motivational example using a well known data set. In Section 3 we briefly present linear mixed models. Section 4 contains a short introduction to the diagnostic methods used in the example. In Section 5 we present an application based on the motivational example. Section 6 presents some conclusions. Finally, in the Appendix, we give mathematical details of some formulas and expressions used in the text.
2. Motivational Example
For a practical example, consider the Hachemeister (1975) data on private passenger bodily injury insurance. The data were collected from five states (subjects) in the US, over twelve trimesters between July 1970 and June 1973, and show the mean claim amount and the total number of claims in each trimester. The data may be found in the actuar package (see Dutang et al., 2008) for R (R Development Core Team, 2009) and are partially shown in Table 1.
Table 1. Hachemeister's data.

Trimester  State  Mean claim amount  Number of claims
1          1      1738               7861
1          2      1364               1622
1          3      1759               1147
1          4      1223               407
1          5      1456               2902
2          1      1642               9251
...        ...    ...                ...
12         1      2517               9077
12         2      1471               1861
12         3      2059               1121
12         4      1306               342
12         5      1690               3425
In Figure 2 we plot the individual profiles for each state and the mean profile. The figure suggests that the claims behave differently along the trimesters in each state. One may notice that the claims from state 1 are greater than those from the other states for almost every observation, and that the claims from states 2 and 3 seem to grow more slowly than those from state 1. If the insurer wants to accurately predict the severity, the subjects' individual behavior must also be modeled. Traditionally this is possible with the aid of credibility models; see, e.g., Bühlmann (1967), Hachemeister (1975) and Dannenburg et al. (1996). These models assign weights, known as credibility factors, to a pair of different estimates of the severity.
Figure 2. Individual profiles (states 1 to 5) and mean profile for the Hachemeister (1975) data: average claim amount by trimester.
Credibility models may be functionally defined as

    A = ZB + (1 - Z)C,
where A represents the severity in a given state, Z is a credibility factor restricted to [0, 1], B is an a priori estimate of the expected severity for the same state and C is an a posteriori estimate, also of the expected severity. Considering a particular state, B may be equal to the sample mean of the severity of its observations and C equal to the overall sample mean of the data in the same period.
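As a concrete sketch of this blend (the factor Z and the structure parameter k below are illustrative values, not quantities from the paper):

```python
def credibility_estimate(z, subject_mean, overall_mean):
    """A = Z*B + (1 - Z)*C, with B the subject's own sample mean
    and C the overall sample mean, as described above."""
    if not 0.0 <= z <= 1.0:
        raise ValueError("credibility factor must lie in [0, 1]")
    return z * subject_mean + (1.0 - z) * overall_mean

def buhlmann_factor(n_obs, k):
    """Buhlmann-style credibility factor Z = n/(n + k): more
    observations shift weight toward the subject's own experience.
    (k is a hypothetical structure parameter.)"""
    return n_obs / (n_obs + k)

z = buhlmann_factor(n_obs=12, k=4)   # e.g. twelve trimesters observed
print(credibility_estimate(z, subject_mean=1738.0, overall_mean=1500.0))
```

With more observed trimesters, Z approaches 1 and the estimate leans on the state's own experience; with few, it shrinks toward the overall mean.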
Frees et al. (1999) showed that it is possible to find linear mixed models equivalent to some known credibility models, such as the Bühlmann (1967) and Hachemeister (1975) models. Information about linear mixed models is provided in the next section.
3. Linear Mixed Models
Linear mixed models are a popular alternative for analyzing repeated measures. Such models may be functionally expressed as

    y_i = X_i β + Z_i b_i + e_i,  i = 1, ..., k,  (1)

where y_i = (y_i1, y_i2, ..., y_in_i)' is the n_i × 1 vector of observed values of the response variable for the ith subject, X_i is an n_i × p known full rank matrix, β is a p × 1 vector of unknown parameters, also known as fixed effects, which are used to model E[y_i], Z_i is an n_i × q known full rank matrix, b_i is a q × 1 vector of latent variables, also known as random effects, used to model the within-subject correlation structure, and e_i = (e_i1, e_i2, ..., e_in_i)' is the n_i × 1 random vector of (within-subject) measurement errors. It is usually also assumed that e_i ~ind N_{n_i}(0, σ²I_{n_i}), where I_{n_i} denotes the identity matrix of order n_i, for i = 1, ..., k; that b_i ~iid N_q(0, σ²G), for i = 1, ..., k, in which G is a q × q positive definite matrix; and that e_i and b_j are independent for all i, j. Under these assumptions, this is called a homoskedastic conditional independence model. It is possible to rewrite the model given in Equation (1) more concisely as

    y = Xβ + Zb + e,  (2)

where y = (y_1', ..., y_k')', X = (X_1', ..., X_k')', Z = ⊕_{i=1}^{k} Z_i, b = (b_1', ..., b_k')' and e = (e_1', ..., e_k')', with ⊕ representing the direct sum.

It can be shown that, conditional on known covariance parameters of the model, that is, conditional on the elements of G and σ², the best linear unbiased estimator (BLUE) for β and the best linear unbiased predictor (BLUP) for b are given by

    β̂ = (X'V^{-1}X)^{-1} X'V^{-1} y,  (3)

and

    b̂ = DZ'V^{-1}(y - Xβ̂),

respectively, where D = σ²G, V = σ²(I_n + ZGZ'), with n = Σ_{i=1}^{k} n_i; see Hachemeister (1975).

Maximum likelihood (ML) and restricted maximum likelihood (RML) methods can be used to estimate the variance components of the model. The latter, proposed by Patterson and Thompson (1971), is usually chosen since it often generates less biased estimators of the variance structure. When estimates of V are used in Equation (3) to obtain β̂ and b̂, the results are called the empirical BLUE (EBLUE) and empirical BLUP (EBLUP), respectively. Usually the estimation of the parameters involves iterative methods for maximizing the likelihood function.
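With the covariance parameters treated as known, Equation (3) and the BLUP formula are direct matrix computations. A minimal numpy sketch on toy data (here G is passed as the stacked, block-diagonal covariance of b, an assumption of this sketch rather than notation from the paper):

```python
import numpy as np

def blue_blup(y, X, Z, G, sigma2):
    """BLUE of beta and BLUP of b for y = X beta + Z b + e, assuming
    known covariance parameters; G is the stacked (block-diagonal)
    covariance of b, so D = sigma2*G and V = sigma2*(I + Z G Z')."""
    n = len(y)
    D = sigma2 * G
    V = sigma2 * (np.eye(n) + Z @ G @ Z.T)
    Vinv = np.linalg.inv(V)
    beta = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)  # Equation (3)
    b = D @ Z.T @ Vinv @ (y - X @ beta)                     # BLUP
    return beta, b

# toy example: 3 subjects, 4 observations each, random intercept
rng = np.random.default_rng(1)
k, ni = 3, 4
n = k * ni
X = np.column_stack([np.ones(n), np.tile(np.arange(ni), k)])
Z = np.kron(np.eye(k), np.ones((ni, 1)))   # one intercept per subject
G = 2.0 * np.eye(k)
y = X @ np.array([1.0, 0.5]) + Z @ rng.normal(0.0, 1.4, k) \
    + rng.normal(0.0, 1.0, n)
beta_hat, b_hat = blue_blup(y, X, Z, G, sigma2=1.0)
print(beta_hat, b_hat)
```

Real software (e.g. lmer() in R) additionally estimates G and σ² iteratively before plugging them in, which is what produces the EBLUE/EBLUP.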
Linear mixed models are not the only way to deal with repeated measures studies. Other popular alternatives are generalized estimating equations (see Liang and Zeger, 1986; Diggle et al., 2002) and multivariate models, as seen in Johnson and Wichern (1982) and Vonesh and Chinchilli (1997). However, these alternatives are usually more restrictive than linear mixed models, and they model only the marginal expected value of the response variable.
4. Diagnostic Methods
Diagnostic methods comprise techniques whose purpose is to investigate the plausibility and robustness of the assumptions made when choosing a model. The techniques shown here can be divided into two classes: residual analysis, which investigates the assumptions on the distribution of the errors and the presence of outliers; and sensitivity analysis, which analyzes the sensitivity of a statistical model when it is subjected to minor perturbations. Usually, it would be far more difficult, or even impossible, to observe these aspects in a traditional credibility model.
In the context of traditional linear models (homoskedastic and independent), examples of diagnostic methods may be seen in Hoaglin and Welsch (1978), Belsley et al. (1980) and Cook (1982). Extensions and generalizations for linear mixed models are briefly discussed here and may be seen in Beckman et al. (1987), Christensen and Pearson (1992), Hilden-Minton (1995), Lesaffre and Verbeke (1998), Banerjee and Frees (1997), Tan et al. (2001), Fung et al. (2002), Demidenko (2004), Demidenko and Stukel (2005), Zewotir and Galpin (2005), Nobre and Singer (2007, 2011) and Gumedze et al. (2010).
4.1 Residual analysis
In the linear mixed models class, three different kinds of residuals may be considered: the conditional residuals, ê = y - Xβ̂ - Zb̂; the EBLUP, Zb̂; and the marginal residuals, ξ̂ = y - Xβ̂. These predict, respectively, the conditional error e = y - E[y|b] = y - Xβ - Zb, the random effects Zb = E[y|b] - E[y], and the marginal error ξ = y - E[y] = y - Xβ. Each of these residuals is useful for verifying some assumption of the model, as seen in Nobre and Singer (2007) and briefly presented next.
4.1.1 Conditional residuals
To identify observations with a possibly high influence on σ̂² in linear mixed models, Nobre and Singer (2007) suggested the standardization of the conditional residual given by

    e*_i = ê_i / √(q_ii),

where q_ii represents the ith element of the main diagonal of Q, defined as

    Q = σ²(V^{-1} - V^{-1}X(X'V^{-1}X)^{-1}X'V^{-1}).
Under normality assumptions on e, this standardization identifies outlier observations and subjects; see Nobre and Singer (2007). To do so, the same authors consider the quadratic form

    M_I = y'QU_I(U_I'QU_I)^{-1}U_I'Qy,

where U_I = (u_ij)_(n×k) = (U_{i_1}, ..., U_{i_k}), with U_i representing the ith column of the identity matrix of order n. To identify an outlier subject, let I be the index set of the subject's observations and evaluate M_I for this subset.
Table 2. Diagnostic techniques involving residuals.

Diagnostic                                   Graph
Linearity of fixed effects                   ξ̂ vs. explanatory variables (fitted values)
Presence of outliers                         e*_i vs. observation index
Homoskedasticity of the conditional errors   ê vs. fitted values
Normality of the conditional errors          QQ plot for the least confounded residuals
Presence of outlier subjects                 Mahalanobis distance vs. observation index
Normality of the random effects              weighted QQ plot for b̂_i
4.1.2 Confounded residuals
It can be shown that, under the assumptions of the model in Equation (1), we have

    ê = RQe + RQZb  and  Zb̂ = ZGZ'QZb + ZGZ'Qe,

where R = σ²I_n. These identities tell us that ê and Zb̂ each depend on both b and e, and they are thus called confounded residuals; see Hilden-Minton (1995). Verifying the normality of the conditional errors using only ê may therefore be misleading, because of the presence of b in the above formulas. Hilden-Minton (1995) defined the confounding fraction as the proportion of the variability in ê due to the presence of b, and suggested the use of a linear transformation L such that Lê has the least possible confounding fraction. The suggested transformation also generates uncorrelated homoskedastic residuals. It is more appropriate to analyze the normality assumption of the conditional errors using Lê instead of ê, as suggested by Hilden-Minton (1995) and verified by simulation in Nobre and Singer (2007).
4.1.3 EBLUP
The EBLUP is useful for identifying outlier subjects, given that it represents the distance between the population mean value and the value predicted for the ith subject. A way of using the EBLUP to search for outlier subjects is through the Mahalanobis distance (see Waternaux et al., 1989),

    ζ_i = b̂_i'(Var[b̂_i - b_i])^{-1} b̂_i.

It is also possible to use the EBLUP to verify the normality assumption of the random effects; for more information, see Nobre and Singer (2007). In Table 2 we summarize the diagnostic techniques involving residuals discussed in Nobre and Singer (2007).
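A sketch of ζ_i: with known covariance parameters, b̂ = DZ'Q₀y and Var[b̂ - b] = D - DZ'Q₀ZD, where Q₀ is the matrix Q of Section 4.1.1 without the σ² factor (these identities are standard BLUP algebra, assumed here rather than quoted from the paper).

```python
import numpy as np

def eblup_mahalanobis(y, X, Z, G, sigma2, subject_blocks):
    """Sketch of zeta_i = bhat_i' (Var[bhat_i - b_i])^-1 bhat_i.
    G is the stacked covariance of b; subject_blocks[j] lists the
    positions of subject j's effects within b (hypothetical helper)."""
    n = len(y)
    D = sigma2 * G
    V = sigma2 * (np.eye(n) + Z @ G @ Z.T)
    Vinv = np.linalg.inv(V)
    Q0 = Vinv - Vinv @ X @ np.linalg.inv(X.T @ Vinv @ X) @ X.T @ Vinv
    b_hat = D @ Z.T @ Q0 @ y                 # EBLUP (known covariances)
    var_diff = D - D @ Z.T @ Q0 @ Z @ D      # Var[bhat - b]
    return np.array([float(b_hat[idx] @ np.linalg.solve(
        var_diff[np.ix_(idx, idx)], b_hat[idx])) for idx in subject_blocks])

# random-intercept toy data: one random effect per subject
rng = np.random.default_rng(3)
k, ni = 3, 4
n = k * ni
X = np.column_stack([np.ones(n), np.tile(np.arange(ni), k)])
Z = np.kron(np.eye(k), np.ones((ni, 1)))
G = 2.0 * np.eye(k)
y = X @ np.array([1.0, 0.5]) + Z @ rng.normal(0.0, 1.4, k) \
    + rng.normal(0.0, 1.0, n)
zeta = eblup_mahalanobis(y, X, Z, G, 1.0, [[0], [1], [2]])
print(zeta)
```

Subjects with unusually large ζ_i relative to the rest are candidate outlier subjects.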
4.2 Sensitivity analysis

Influence diagnostic techniques are used to detect observations that may exert excessive influence on the parameter estimates. There are two main approaches to such techniques: global influence, which is usually based on case deletion; and local influence, which introduces small perturbations into different components of the model.

In normal homoskedastic linear regression, examples of sensitivity measures are the Cook distance, DFFITS and COVRATIO; see Cook (1977), Belsley et al. (1980) and Chatterjee and Hadi (1986, 1988).
4.2.1 Global influence
A simple way to verify the influence of a group of observations on the parameter estimates is to remove the group and observe the changes in the estimates. The group of observations is influential if the changes are considerably large. However, in linear mixed models it may not be practical to re-estimate the parameters every time a set of observations is removed. To avoid doing so, Hilden-Minton (1995) presented an update formula for the
BLUE and BLUP. Let I = {i_1, ..., i_k} be the index set of the removed observations and U_I = (U_{i_1}, ..., U_{i_k}). Hilden-Minton (1995) showed that

    β̂ - β̂_(I) = (X'MX)^{-1}X'MU_I ε̂_(I)  and  b̂ - b̂_(I) = DZ'QU_I ε̂_(I),

where the subscript (I) indicates that the estimates were obtained without the observations indexed by I, and ε̂_(I) = (U_I'QU_I)^{-1}U_I'Qy.

A suggestion for measuring the influence on the parameter estimates in linear mixed models is to use the Cook distance (see Cook, 1977), given by

    D_I = (β̂ - β̂_(I))'(X'V^{-1}X)(β̂ - β̂_(I))/c = (ŷ - ŷ_(I))'V^{-1}(ŷ - ŷ_(I))/c,

as seen in Christensen and Pearson (1992) and Banerjee and Frees (1997), where c is a scale factor. However, Tan et al. (2001) pointed out that D_I is not always able to properly measure the influence on the estimates in the mixed models class. The same authors suggest a measure similar to the Cook distance, but conditional on the BLUP (b̂). The conditional Cook distance is defined for the ith observation as
    D_i^cond = (1/((n - 1)k + p)) Σ_{j=1}^{k} P_j(i)' Var[y|b]^{-1} P_j(i),  i = 1, ..., n,

where P_j(i) = ŷ_j - ŷ_j(i) = (X_jβ̂ + Z_jb̂_j) - (X_jβ̂_(i) + Z_jb̂_j(i)). The same authors decomposed D_i^cond = D_i1^cond + D_i2^cond + D_i3^cond and commented on the interpretation of each part of the decomposition: D_i1^cond is related to the influence on the fixed effects, D_i2^cond to the influence on the predicted values, and D_i3^cond to the covariance of the BLUE and the BLUP, which should be close to zero if the model is valid.
When all the observations from a subject are deleted, it is not possible to obtain the BLUP for the random effects of that subject, making it impossible to obtain D_I^cond as stated above. For this purpose, Nobre (2004) suggested using D̄_I^cond = n_i^{-1} Σ_{j∈I} D_j^cond, where I indexes the observations from a subject, as a way to measure the influence of a subject on the parameter estimates when its observations are deleted.
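For small data sets, the global-influence idea can also be checked by brute force instead of the update formula: delete a subject's rows, recompute the BLUE, and evaluate a Cook-type distance. A sketch (the scale factor c is taken as p here, one of several common choices):

```python
import numpy as np

def cook_distance(y, X, Z, G, sigma2, rows):
    """Cook-type distance for deleting the observations in `rows`,
    computed by refitting the BLUE on the remaining rows (brute
    force, not the Hilden-Minton update formula)."""
    n, p = X.shape
    V = sigma2 * (np.eye(n) + Z @ G @ Z.T)
    Vinv = np.linalg.inv(V)
    beta = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
    keep = np.setdiff1d(np.arange(n), rows)
    Vk_inv = np.linalg.inv(V[np.ix_(keep, keep)])
    Xk, yk = X[keep], y[keep]
    beta_I = np.linalg.solve(Xk.T @ Vk_inv @ Xk, Xk.T @ Vk_inv @ yk)
    d = beta - beta_I
    return float(d @ (X.T @ Vinv @ X) @ d) / p

# toy data with a gross outlier planted in subject 0
rng = np.random.default_rng(5)
k, ni = 3, 4
n = k * ni
X = np.column_stack([np.ones(n), np.tile(np.arange(ni), k)])
Z = np.kron(np.eye(k), np.ones((ni, 1)))
G = 2.0 * np.eye(k)
y = X @ np.array([1.0, 0.5]) + Z @ rng.normal(0.0, 1.4, k) \
    + rng.normal(0.0, 1.0, n)
y[0] += 50.0                                  # contaminate subject 0
dists = [cook_distance(y, X, Z, G, 1.0, list(range(j * ni, (j + 1) * ni)))
         for j in range(k)]
print(dists)
```

Deleting the contaminated subject moves the BLUE the most, so its distance dominates the others.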
There are natural extensions of leverage measures for linear mixed models; these can be seen in Banerjee and Frees (1997), Fung et al. (2002), Demidenko (2004) and Nobre (2004). However, they only provide information about leverage with respect to the fitted marginal values. This has two main limitations, as commented in Nobre and Singer (2011). First, we may be interested in detecting high-leverage within-subject observations. Second, in some cases the presence of high-leverage within-subject observations does not imply that the subject itself is detected as a high-leverage subject. Suggestions on how to evaluate the within-subject leverage may be seen in Demidenko and Stukel (2005) and Nobre and Singer (2011).

4.2.2 Local influence
The concept of local influence was proposed by Cook (1986) and consists in analyzing the sensitivity of a statistical model when it is subjected to small perturbations. Cook (1986) suggested the use of an influence measure called the likelihood displacement. Considering the model described in Equation (2), the log-likelihood function may be written as

    L(θ) = Σ_{i=1}^{k} L_i(θ) = -(1/2) Σ_{i=1}^{k} [ln|V_i| + (y_i - X_iβ)'V_i^{-1}(y_i - X_iβ)].
The likelihood displacement is defined as LD(ω) = 2{L(θ̂) - L(θ̂_ω)}, where ω is an l × 1 perturbation vector in an open set Ω ⊂ R^l; θ is the parameter vector of the model, including the covariance parameters; θ̂ is the ML estimate of θ and θ̂_ω is the ML estimate under the perturbed model. It is necessary to assume that there exists ω_0 ∈ Ω such that L(θ) = L(θ|ω_0), and that LD has first and second derivatives in a neighborhood of (θ̂', ω_0')'. Cook (1986) considered the surface in R^{l+1} formed by the influence function α(ω) = (ω', LD(ω))' and the normal curvature in the vicinity of ω_0 in the direction of a vector d, denoted by C_d. In this case, the normal curvature is given by

    C_d = 2|d'H'L̈^{-1}Hd|,

where L̈ = ∂²L(θ)/∂θ∂θ' and H = ∂²L(θ|ω)/∂θ∂ω', both evaluated at θ = θ̂ and ω = ω_0; see Cook (1986). It can be shown that C_d always lies between the minimum and maximum eigenvalues of the matrix F = H'L̈^{-1}H, so d_max, the eigenvector associated with the largest eigenvalue, gives information about the direction that exhibits the greatest sensitivity of LD(ω) in a neighborhood of ω_0. Beckman et al. (1987) made some comments on the effectiveness of the local influence approach. Lesaffre and Verbeke (1998) and Nobre (2004) showed some examples of perturbation schemes in the linear mixed models context.
Perturbation scheme for the covariance matrix of the conditional errors. To verify the sensitivity of the model to the conditional homoskedasticity assumption, perturbations are inserted in the covariance matrix of the conditional errors. This can be done by considering Var[e] = σ²Λ(ω), where Λ(ω) = diag(ω), with ω = (ω_1, ..., ω_n)' the perturbation vector. For this case we have ω_0 = 1_n. The log-likelihood function is then given by

    L(θ|ω) = -(1/2)[ln|V(ω)| + (y - Xβ)'V(ω)^{-1}(y - Xβ)],

where V(ω) = ZDZ' + σ²Λ(ω).
Perturbation scheme for the response. For the local influence approach, Beckman et al. (1987) proposed the perturbation scheme

    y(ω) = y + sω,

where s represents a scale factor and ω is an n × 1 perturbation vector. For this scheme we have ω_0 = 0_n, the n × 1 null vector. In this case, the perturbed log-likelihood function is proportional to

    L(θ|ω) = -(1/2)(y + sω - Xβ)'V^{-1}(y + sω - Xβ).
Perturbation scheme for the random effects covariance matrix. It is possible to assess the sensitivity of the model with respect to the random effects homoskedasticity assumption by perturbing the matrix G. Nobre (2004) suggested the use of Var[b_i] = ω_iG as a perturbation scheme. In this case ω is a k × 1 vector and ω_0 = 1_k. The perturbed log-likelihood function is proportional to

    L(θ|ω) = -(1/2) Σ_{i=1}^{k} [ln|V_i(ω)| + (y_i - X_iβ)'V_i(ω)^{-1}(y_i - X_iβ)].
Case-weight perturbation scheme. Verbeke (1995) and Lesaffre and Verbeke (1998) suggested perturbing the log-likelihood function as

    L(θ|ω) = Σ_{i=1}^{k} ω_i L_i(θ).

Such a perturbation scheme is appropriate for measuring the influence of the ith subject through the normal curvature in its direction, given by

    C_i = 2|d_i'H'L̈^{-1}Hd_i|,

where d_i is the vector whose ith entry is 1 and whose other entries are all zero. Verbeke (1995) showed that if C_i has a high value, then the ith subject has great influence on the value of θ̂. A threshold of twice the mean value of all the C_j helps to decide whether or not the observation is influential.
Lesaffre and Verbeke (1998) extracted from C_i some interpretable measures. They especially propose using ‖X̃_iX̃_i'‖², ‖R̃_i‖²‖Z̃_iZ̃_i'‖², ‖I_{n_i} - R̃_iR̃_i'‖² and ‖V_i^{-1}‖², where X̃_i = V_i^{-1/2}X_i, Z̃_i = V_i^{-1/2}Z_i and R̃_i = V_i^{-1/2}ê_i, to evaluate the influence of the ith subject on the model parameter estimates. The actual interpretation of each of these terms can be seen in the original paper.
4.2.3 Conform local influence

The C_d measure proposed by Cook (1986) is not invariant under scale reparametrization. To obtain a similar standardized measure and make it more comparable, Poon and Poon (1999) used the conform normal curvature instead of the normal curvature, given by

    B_d(θ) = 2|d'H'L̈^{-1}Hd| / (2‖H'L̈^{-1}H‖).

It can be shown that 0 ≤ B_d(θ) ≤ 1 for every direction d and that B_d is invariant under conform scale reparametrization. A reparametrization is said to be conform if its Jacobian J is such that JJ' = tI_s, for some real t and integer s. Poon and Poon (1999) showed that if λ_1, ..., λ_l are the eigenvalues of the matrix F = H'L̈^{-1}H, with v_1, ..., v_l the respective normalized eigenvectors, then the value of the conform normal curvature in the direction v_i equals λ_i/√(Σ_{j=1}^{l} λ_j²), and that Σ_{i=1}^{l} B²_{v_i}(θ) = 1. If every eigenvector had the same conform normal curvature, its value would be 1/√l; Poon and Poon (1999) proposed using this value as a reference for measuring the intensity of the local influence of an eigenvector. It can also be shown that the conform normal curvature attains its maximum when d has the direction of d_max. In this sense, the normal curvature and the conform normal curvature are equivalent methods.
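Given the matrix F = H'L̈^{-1}H, both d_max and the conform normal curvatures follow from a single eigendecomposition. A sketch with an arbitrary symmetric F (the numbers are illustrative only, not from any model in the paper):

```python
import numpy as np

def local_influence_summary(F):
    """From a symmetric F = H' Lddot^{-1} H, return the direction
    d_max of largest normal curvature and the conform normal
    curvatures B_{v_i} = |lambda_i| / sqrt(sum_j lambda_j^2)."""
    lam, vec = np.linalg.eigh(F)
    order = np.argsort(np.abs(lam))[::-1]   # largest |eigenvalue| first
    lam, vec = lam[order], vec[:, order]
    d_max = vec[:, 0]
    B = np.abs(lam) / np.sqrt(np.sum(lam ** 2))
    return d_max, B

F = np.array([[4.0, 1.0, 0.0],
              [1.0, 2.0, 0.5],
              [0.0, 0.5, 1.0]])
d_max, B = local_influence_summary(F)
print(d_max, B)
```

By construction the squared conform curvatures over the eigenvector directions sum to one, matching the Poon and Poon (1999) normalization.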
5. Application

According to Frees et al. (1999), random coefficient models are equivalent to the Hachemeister linear regression model, which is used for the example data in Hachemeister (1975). The random coefficient model for the data in Table 1 may be described as

    y_ij = α_i + β_i j + e_ij,  i = 1, ..., 5,  j = 1, ..., 12,
where y_ij represents the average claim amount for state i in the jth trimester, α_i = α + a_i and β_i = β + b_i, with α and β fixed, and (a_i, b_i)' ~ N_2(0, D), in which D is a 2 × 2 covariance matrix. Before adjusting the model to the data, we used R to apply the asymptotic likelihood ratio test described in Giampaoli and Singer (2009) to compare the suggested random coefficients model with a random intercept model. The p-value obtained from the test was 0.0514, indicating that it may be enough to consider the random effect for the intercept only. This decision is also supported by the Bayesian information criterion (BIC), which equals 808.3 for the single random effect model and 811.6 for the model with two random effects. Another set of tests, involving bootstrap, Monte Carlo and permutational methods, could also be used to investigate whether the random intercept model should be preferred; these tests may be seen in Crainiceanu and Ruppert (2004), Greven et al. (2008) and Fitzmaurice et al. (2007). However, this is beyond our goals and is not discussed here. For the sake of simplicity, and based on the reasons presented, we shall use the random intercept model, which differs a little from the model proposed by Frees et al. (1999). Thus, the model to be adjusted for the data in this example is
    y_ij = α_i + βj + ε_ij,  i = 1, ..., 5,  j = 1, ..., 12,  (4)

where α_i = α + a_i, with α and β as defined before. Assume also that Var[ε_ij] = σ² and Var[a_i] = σ²_a.
The model parameter estimates were obtained by the RML method using the lmer() function from the lme4 package in R. The standard errors were obtained from SAS (SAS Institute Inc., 2004) using proc MIXED. The estimates are shown in Table 3.
Table 3. Model parameter estimates.

Parameter   α        β      σ²        σ²_a
Estimate    1460.32  32.41  32981.53  73398.25
SE          131.07   6.79   6347.17   24088.00
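The RML criterion that lmer() maximizes can be written down directly for the random-intercept model in Equation (4). The sketch below evaluates the RML log-likelihood on a crude grid over σ²_a for toy simulated data (not the Hachemeister fit; real software maximizes this function iteratively over all the variance components):

```python
import numpy as np

def reml_loglik(y, X, Z, sigma2, sigma2_a):
    """RML log-likelihood (up to an additive constant) for
    y = X beta + Z a + e with Var[a] = sigma2_a I, Var[e] = sigma2 I."""
    n = len(y)
    V = sigma2 * np.eye(n) + sigma2_a * (Z @ Z.T)
    Vinv = np.linalg.inv(V)
    XtVX = X.T @ Vinv @ X
    P = Vinv - Vinv @ X @ np.linalg.inv(XtVX) @ X.T @ Vinv
    return -0.5 * (np.linalg.slogdet(V)[1]
                   + np.linalg.slogdet(XtVX)[1]
                   + float(y @ P @ y))

# toy data: 5 subjects, 12 occasions, random intercept as in Equation (4)
rng = np.random.default_rng(11)
k, ni = 5, 12
n = k * ni
X = np.column_stack([np.ones(n), np.tile(np.arange(1, ni + 1), k)])
Z = np.kron(np.eye(k), np.ones((ni, 1)))
a = rng.normal(0.0, 3.0, k)                  # true sigma2_a = 9
y = X @ np.array([10.0, 1.0]) + Z @ a + rng.normal(0.0, 1.0, n)

grid = [0.5, 1.0, 2.0, 4.0, 9.0, 16.0, 36.0]
lls = [reml_loglik(y, X, Z, 1.0, s2a) for s2a in grid]
best = grid[int(np.argmax(lls))]
print(best, max(lls))
```

The grid search is only illustrative; lme4 and proc MIXED optimize the same kind of criterion with derivative-based iterations and profile out the fixed effects.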
Figure 3 shows the five conditional regression lines obtained from the linear mixed model given in Equation (4). The adjusted model clearly suggests that the claim amount is higher in state 1, and it suggests a similarity in the claim amounts from states 2 and 4. Besides that, we can expect a smaller risk from policies in state 5, since they are much closer to the respective adjusted conditional line. Further information is explored in the diagnostic analysis commented on next.
Figure 3. Conditional regression lines: aggregate claim amount by observation, states 1 to 5.
5.1 Diagnostic analysis

The standardized residuals proposed by Nobre and Singer (2007) suggest that observation 4.7 (obtained from state 4 in the seventh trimester) may be considered an outlier, as shown in Figure 4(a). According to the QQ plot in Figure 4(b), it is reasonable to assume that the conditional errors are normally distributed. The Mahalanobis distance in Figure 4(c) was normalized to fit the interval [0, 1] and suggests that the first state may be an outlier. The measure M_I proposed by Nobre and Singer (2007), shown in Figure 4(d) and also normalized, suggests that none of the states have outlier observations. The Mahalanobis distance should not be confounded with M_I: the former is based on the EBLUP and the latter on the conditional residuals, and thus they have different meanings. For both analyses, an observation is highlighted if it is greater than twice the mean of the measures.
[Figure: four panels by state — (a) standardized conditional residuals, with observation 4.7 highlighted; (b) normal QQ plot of the standardized least confounded residuals; (c) Mahalanobis distance, normalized to [0, 1]; (d) values for MI, normalized.]

Figure 4. Residual analysis: (a) standardized residuals, (b) least confounded residuals, (c) EBLUP, (d) values for MI.
The conditional Cook distance is shown in Figure 5. The distances were normalized for comparison. Figure 5(a) suggests that observation 4.7 is influential in the model estimates. The first term of the distance decomposition suggests that no observations were influential in the estimate of β, as shown in Figure 5(b). The second term of the decomposition suggests that observation 4.7 is potentially influential in the prediction of b, as seen in Figure 5(c). The last term, D_{i3}, is close to zero, as expected, and is omitted.
12 L.G. Bastos Pinho, J.S. Nobre and S.M. Freitas
[Figure: three panels by state — (a) conditional Cook distance, with observation 4.7 highlighted; (b) D_{i1}; (c) D_{i2}, with observation 4.7 highlighted.]

Figure 5. (a) Conditional Cook distance, (b) D_{i1}, (c) D_{i2}.
Figure 6 shows the local influence analysis using three different perturbation schemes. The first, in Figure 6(a), is related to the conditional errors covariance matrix, as suggested in Beckman et al. (1987), and indicates that the observations from the fourth state, especially 4.7, are possibly influential on the homoskedasticity and independence assumptions for the conditional errors. Notice that it is possible to explain the influence of observation 4.7 by analyzing Figure 2: this observation has a value considerably higher than the others from the same state. Figure 6(b) shows the perturbation scheme for the covariance matrix associated with the random effects, as in Nobre (2004); alternative perturbation schemes for this case can be seen in Beckman et al. (1987). This scheme suggests that all states are equally influential on the random effects covariance matrix estimate. Finally, there is evidence that the observations in the fourth state may not be well predicted by the model; see Figure 6(c).
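The d_max directions plotted in Figure 6(a)-(b) come from Cook's local-influence machinery: d_max is the eigenvector associated with the largest absolute eigenvalue of H'(-L̈)^{-1}H, where H is the perturbation matrix of the Appendix and L̈ the Hessian of the log-likelihood. A minimal numerical sketch, with hypothetical stand-in matrices rather than the paper's actual fit:

```python
import numpy as np

rng = np.random.default_rng(1)
n_par, n_obs = 3, 6
Delta = rng.normal(size=(n_par, n_obs))   # stand-in for the matrix H
neg_hessian = np.eye(n_par)               # stand-in for -L_ddot (pos. definite)

# Curvature matrix; its dominant eigenvector is d_max.
B = Delta.T @ np.linalg.inv(neg_hessian) @ Delta
eigvals, eigvecs = np.linalg.eigh(B)
d_max = np.abs(eigvecs[:, np.argmax(np.abs(eigvals))])
print(d_max)  # one component per observation; large values flag influence
```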
After the diagnostics, we proceed to a confirmatory analysis by removing the observations from states 1 and 4, one at a time and then both at the same time. The new estimates are shown in Table 4. For each parameter we calculate the relative change in the estimated values, defined for a parameter θ as
\[
\mathrm{RC}(\theta) = \left| \frac{\hat{\theta} - \hat{\theta}_{(i)}}{\hat{\theta}} \right| \times 100\%,
\]
where \(\hat{\theta}_{(i)}\) denotes the estimate obtained without the ith state.
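As a quick numerical check of this definition (values copied from Table 4, σ²_a column for the fit without state 1 and σ² column for the fit without state 4):

```python
# Relative change as defined in the text.
def relative_change(full, reduced):
    """RC in percent: |(full - reduced) / full| * 100."""
    return abs((full - reduced) / full) * 100.0

print(round(relative_change(73398.25, 34335.64), 2))  # → 53.22
print(round(relative_change(32981.53, 24940.12), 2))  # → 24.38
```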
[Figure: three panels by state — (a) absolute values of d_max components for the conditional covariance matrix scheme, with observations 4.7, 4.1, 4.11, 4.12 and 1.2 highlighted; (b) absolute values of d_max components for the random effects covariance matrix scheme; (c) values of ||I_{n_i} − R_iR_i'||², largest for state 4.]

Figure 6. Perturbation schemes: (a) conditional covariance matrix, (b) random effects covariance matrix, (c) values for ||I_{n_i} − R_iR_i'||².
Table 4. Estimates and relative changes (in %, in parentheses) for the parameters of the model given in Equation (4), with and without states 1 and 4.

Situation                Intercept        Slope           σ²                σ²_a
Complete data            1460.32          32.41           32981.53          73398.25
Without State 1          1408.63 (3.67)   25.26 (22.06)   34666.31 (5.11)   34335.64 (53.22)
Without State 4          1530.94 (4.61)   33.50 (3.36)    24940.12 (24.38)  59214.50 (19.32)
Without States 1 and 4   1485.56 (1.70)   24.32 (24.96)   24497.48 (25.72)  23707.07 (67.70)
If all five states were equally influential, we would expect the value of RC to lie around 1/5 = 20% after removing a state. If RC(θ) exceeds twice this value, that is, 40%, for some parameter θ, we consider the state potentially influential. It is possible to conclude that the observations from state 1 were influential in the random effects variance estimate. From Figure 2, one can explain this influence by noticing that all the observations from state 1 had higher values compared to the other states. Notice that such influence was not detected in Figure 5(b), but was pointed out by the Mahalanobis distance in Figure 4(c). Removing state 1 from our analysis and running every diagnostic procedure again, we detect no excessive influence; the only remaining issue is observation 4.7, which is still an outlier. From this result the model is validated and assumed to be robust and ready for use.
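The 40% rule of thumb applied to the "Without State 1" row of Table 4 (the parameter labels below are ours):

```python
# RC values copied from Table 4 for the fit without State 1.
rc_without_state1 = {"intercept": 3.67, "slope": 22.06,
                     "sigma2": 5.11, "sigma2_a": 53.22}
threshold = 2 * (1 / 5) * 100           # five states -> 40%
influential = [p for p, rc in rc_without_state1.items() if rc > threshold]
print(influential)  # → ['sigma2_a']
```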
6. Conclusions
The use of linear mixed models in actuarial science should be encouraged, given their capability to model the within-subject correlation, their flexibility, and the availability of diagnostic tools. Insurers should not use a model without validating it first. For the specific example seen here, the decision makers may consider a different approach for state 1: after removing the observations from state 1, there was a relative change of more than 50% in the random effect variance estimate, which significantly affects the premium estimate. Such an analysis would not be possible in the traditional credibility models approach. This illustrates how the model can be used to identify different sources of risk and in portfolio management. Linear mixed models are also usually easier to understand and to present, when compared to standard actuarial methods such as credibility models and the Bayesian approach for determining the fair premium. The natural extension of this work is to repeat the estimation and diagnostic procedures, adapting what is necessary, for generalized linear mixed models, which are also useful in actuarial science. Some work has already been done in this area; see, e.g., Antonio and Beirlant (2006). It would also be interesting to continue the analysis of the example in Hachemeister (1975), applying the diagnostic procedures again when weights are introduced into the covariance matrix of the conditional residuals in the random coefficient models, and to evaluate the robustness of the linear mixed models equivalent to the other classic credibility models. Again, this care is justified because the fairest premium is the most competitive in the market.
Appendix
We present here expressions for the matrix H and the derivatives appearing in the different perturbation schemes presented in Section 4.2.2. These calculations are taken from Nobre (2004) and are presented here to make the text more self-contained.
Appendix A. Perturbation Scheme for the Covariance Matrix of the Conditional Errors
Let H(k) denote the kth column of H and let f be the number of distinct components of the matrix D. Then
\[
H(k) = \left(\frac{\partial^2 L(\theta)}{\partial \omega_k\,\partial \beta'},\;
\frac{\partial^2 L(\theta)}{\partial \omega_k\,\partial \sigma^2},\;
\frac{\partial^2 L(\theta)}{\partial \omega_k\,\partial \sigma^2_1},\;\ldots,\;
\frac{\partial^2 L(\theta)}{\partial \omega_k\,\partial \sigma^2_f}\right)',
\]
where
\[
\left.\frac{\partial^2 L(\theta)}{\partial \omega_k\,\partial \beta}\right|_{\omega=\omega_0} = X'\widetilde{D}_k e,
\]
\[
\left.\frac{\partial^2 L(\theta)}{\partial \omega_k\,\partial \sigma^2}\right|_{\omega=\omega_0}
= -\frac{1}{2}\left[\sigma^2\,\mathrm{tr}\!\left(\widetilde{D}_k Z D Z'\right) - 2\,e'\widetilde{D}_k V^{-1} e + 2\,e'\widetilde{D}_k e\right],
\]
\[
\left.\frac{\partial^2 L(\theta)}{\partial \omega_k\,\partial \sigma^2_i}\right|_{\omega=\omega_0}
= -\frac{1}{2}\left[\mathrm{tr}\!\left(\widetilde{D}_k Z D_i Z'\right) - 2\,e'\widetilde{D}_k Z D_i Z' e\right],
\]
with
\[
\widetilde{D}_k = \left.\frac{\partial V^{-1}(\omega)}{\partial \omega_k}\right|_{\omega=\omega_0} = -2\,V^{(k)}\big(V^{(k)}\big)'
\quad\text{and}\quad
D_i = \left.\frac{\partial D}{\partial \sigma^2_i}\right|_{\omega=\omega_0},
\]
and V^{(k)} representing the kth column of V^{-1}.
Appendix B. Perturbation Scheme for the Response
It can be shown that
\[
\left.\frac{\partial^2 L(\theta)}{\partial \omega\,\partial \beta'}\right|_{\omega=\omega_0} = s\,V^{-1}X,
\qquad
\left.\frac{\partial^2 L(\theta)}{\partial \omega\,\partial \sigma^2}\right|_{\omega=\omega_0} = s\,V^{-1}V^{-1}e,
\qquad
\left.\frac{\partial^2 L(\theta)}{\partial \omega\,\partial \sigma^2_i}\right|_{\omega=\omega_0} = s\,V^{-1}ZD_iZ'V^{-1}e,
\]
implying
\[
H = s\left[V^{-1}X,\; V^{-1}V^{-1}e,\; V^{-1}ZD_1Z'V^{-1}e,\;\ldots,\; V^{-1}ZD_fZ'V^{-1}e\right].
\]
Appendix C. Perturbation Scheme for the Random Effects Covariance Matrix
For this scheme we have
\[
H(k) = \left(\frac{\partial^2 L(\theta)}{\partial \omega_k\,\partial \beta'},\;
\frac{\partial^2 L(\theta)}{\partial \omega_k\,\partial \sigma^2},\;
\frac{\partial^2 L(\theta)}{\partial \omega_k\,\partial \sigma^2_1},\;\ldots,\;
\frac{\partial^2 L(\theta)}{\partial \omega_k\,\partial \sigma^2_f}\right)'.
\]
It can be shown that
\[
\left.\frac{\partial^2 L(\theta)}{\partial \omega_k\,\partial \beta}\right|_{\omega=\omega_0}
= X_k' V_k^{-1} Z_k G Z_k' V_k^{-1} e_k,
\]
\[
\left.\frac{\partial^2 L(\theta)}{\partial \omega_k\,\partial \sigma^2}\right|_{\omega=\omega_0}
= \mathrm{tr}\!\left(V_k^{-1} Z_k G Z_k' V_k^{-1}\right) - 2\,e_k' V_k^{-1} Z_k G Z_k' V_k^{-1} V_k^{-1} e_k,
\]
\[
\left.\frac{\partial^2 L(\theta)}{\partial \omega_k\,\partial \sigma^2_i}\right|_{\omega=\omega_0}
= \mathrm{tr}\!\left(V_k^{-1} Z_k G Z_k' V_k^{-1} Z_k G_i Z_k'\right) - e_k' V_k^{-1} Z_k G Z_k' V_k^{-1} Z_k G_i Z_k' V_k^{-1} e_k.
\]
Acknowledgements
We are grateful to Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq, project #564334/2008-1) and Fundação Cearense de Apoio ao Desenvolvimento Científico e Tecnológico (FUNCAP), Brazil, for partial financial support. We also thank an anonymous referee and the executive editor for their careful and constructive reviews.
References
Antonio, K., Beirlant, J., 2006. Actuarial statistics with generalized linear mixed models. Insurance: Mathematics and Economics, 75, 643-676.

Banerjee, M., Frees, E.W., 1997. Influence diagnostics for linear longitudinal models. Journal of the American Statistical Association, 92, 999-1005.

Beckman, R.J., Nachtsheim, C.J., Cook, R.D., 1987. Diagnostics for mixed-model analysis of variance. Technometrics, 29, 413-426.

Belsley, D.A., Kuh, E., Welsch, R.E., 1980. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. John Wiley & Sons, New York.

Bühlmann, H., 1967. Experience rating and credibility. ASTIN Bulletin, 4, 199-207.

Chatterjee, S., Hadi, A.S., 1986. Influential observations, high leverage points, and outliers in linear regression (with discussion). Statistical Science, 1, 379-393.

Chatterjee, S., Hadi, A.S., 1988. Sensitivity Analysis in Linear Regression. John Wiley & Sons, New York.

Christensen, R., Pearson, L.M., 1992. Case-deletion diagnostics for mixed models. Technometrics, 34, 38-45.

Cook, R.D., 1977. Detection of influential observation in linear regression. Technometrics, 19, 15-28.

Cook, R.D., 1986. Assessment of local influence (with discussion). Journal of the Royal Statistical Society Series B - Statistical Methodology, 48, 117-131.

Cook, R.D., Weisberg, S., 1982. Residuals and Influence in Regression. Chapman and Hall, London.

Crainiceanu, C.M., Ruppert, D., 2004. Likelihood ratio tests in linear mixed models with one variance component. Journal of the Royal Statistical Society Series B - Statistical Methodology, 66, 165-185.

Dannenburg, D.R., Kaas, R., Goovaerts, M.J., 1996. Practical Actuarial Credibility Models. Institute of Actuarial Science and Economics, University of Amsterdam, Amsterdam.

Demidenko, E., 2004. Mixed Models - Theory and Applications. Wiley, New York.

Demidenko, E., Stukel, T.A., 2005. Influence analysis for linear mixed-effects models. Statistics in Medicine, 24, 893-909.

Diggle, P.J., Heagerty, P., Liang, K.Y., Zeger, S.L., 2002. Analysis of Longitudinal Data. Oxford Statistical Science Series.

Draper, N.R., Smith, H., 1998. Applied Regression Analysis. Wiley, New York.

Dutang, C., Goulet, V., Pigeon, M., 2008. actuar: an R package for actuarial science. Journal of Statistical Software, 25, 1-37.

Fitzmaurice, G.M., Lipsitz, S.R., Ibrahim, J.G., 2007. A note on permutation tests for variance components in multilevel generalized linear mixed models. Biometrics, 63, 942-946.

Frees, E.W., Young, V.R., Luo, Y., 1999. A longitudinal data analysis interpretation of credibility models. Insurance: Mathematics and Economics, 24, 229-247.

Fung, W.K., Zhu, Z.Y., Wei, B.C., He, X., 2002. Influence diagnostics and outlier tests for semiparametric mixed models. Journal of the Royal Statistical Society Series B - Statistical Methodology, 64, 565-579.

Giampaoli, V., Singer, J., 2009. Restricted likelihood ratio testing for zero variance components in linear mixed models. Journal of Statistical Planning and Inference, 139, 1435-1448.

Greven, S., Crainiceanu, C.M., Küchenhoff, H., Peters, A., 2008. Likelihood ratio tests for variance components in linear mixed models. Journal of Computational and Graphical Statistics, 17, 870-891.

Gumedze, F.N., Welham, S.J., Gogel, B.J., Thompson, R., 2010. A variance shift model for detection of outliers in the linear mixed model. Computational Statistics and Data Analysis, 54, 2128-2144.

Hachemeister, C.A., 1975. Credibility for regression models with application to trend. Proceedings of the Berkeley Actuarial Research Conference on Credibility, pp. 129-163.

Henderson, C.R., 1975. Best linear unbiased estimation and prediction under a selection model. Biometrics, 31, 423-447.

Hilden-Minton, J.A., 1995. Multilevel diagnostics for mixed and hierarchical linear models. Ph.D. Thesis. University of California, Los Angeles.

Hoaglin, D.C., Welsch, R.E., 1978. The hat matrix in regression and ANOVA. The American Statistician, 32, 17-22.

Johnson, R.A., Wichern, D.W., 1982. Applied Multivariate Statistical Analysis. Sixth edition. Prentice Hall, pp. 273-332.

Lesaffre, E., Verbeke, G., 1998. Local influence in linear mixed models. Biometrics, 54, 570-582.

Liang, K.Y., Zeger, S.L., 1986. Longitudinal data analysis using generalized linear models. Biometrika, 73, 13-22.

McCulloch, C.E., Searle, S.R., 2001. Generalized, Linear, and Mixed Models. Wiley, New York.

Nobre, J.S., 2004. Métodos de diagnóstico para modelos lineares mistos. Unpublished Master's Thesis (in Portuguese). IME/USP, São Paulo.

Nobre, J.S., Singer, J.M., 2007. Residual analysis for linear mixed models. Biometrical Journal, 49, 863-875.

Nobre, J.S., Singer, J.M., 2011. Leverage analysis for linear mixed models. Journal of Applied Statistics, 38, 1063-1072.

Patterson, H.D., Thompson, R., 1971. Recovery of inter-block information when block sizes are unequal. Biometrika, 58, 545-554.

Poon, W.Y., Poon, Y.S., 1999. Conformal normal curvature and assessment of local influence. Journal of the Royal Statistical Society Series B - Statistical Methodology, 61, 51-61.

R Development Core Team, 2009. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.

SAS Institute Inc., 2004. SAS 9.1.3 Help and Documentation. SAS Institute Inc., Cary, North Carolina.

Tan, F.E.S., Ouwens, M.J.N., Berger, M.P.F., 2001. Detection of influential observations in longitudinal mixed effects regression models. The Statistician, 50, 271-284.

Verbeke, G., 1995. The Linear Mixed Model: A Critical Investigation in the Context of Longitudinal Data Analysis. Ph.D. Thesis. Catholic University of Leuven, Faculty of Science, Department of Mathematics, Leuven, Belgium.

Verbeke, G., Molenberghs, G., 2000. Linear Mixed Models for Longitudinal Data. Springer, New York.

Vonesh, E.F., Chinchilli, V.M., 1997. Linear and Nonlinear Models for the Analysis of Repeated Measurements. Marcel Dekker, New York.

Waternaux, C., Laird, N.M., Ware, J.H., 1989. Methods for analysis of longitudinal data: blood-lead concentrations and cognitive development. Journal of the American Statistical Association, 84, 33-41.

Wei, B.C., Hu, Y.Q., Fung, W.K., 1998. Generalized leverage and its applications. Scandinavian Journal of Statistics, 25, 25-37.

Zewotir, T., Galpin, J.S., 2005. Influence diagnostics for linear mixed models. Journal of Data Science, 3, 153-177.