Characterisation and Some Statistical

Aspects of Univariate and Multivariate

Generalised Pareto Distributions

Nader Tajvidi

Department of Mathematics

Göteborg


Preface

There are occasions in life when there is reason to look back and try to remember the things which have happened and also evaluate the time which has gone by. Usually I feel this on New Year's Eve, but now, when I am about to finish my thesis, I feel the same way.

I have been studying at Chalmers University of Technology for about seven and a half years now. During this period I have had the opportunity to meet many interesting people who have helped and inspired me in one way or another. Without them, I would not have been able to write this thesis.

First, I wholeheartedly thank my supervisor Holger Rootzén for introducing me to the broad field of extreme value theory and its diverse areas of application, for giving me the opportunity to combine my own interests in computer-intensive methods with interesting and meaningful research subjects in extreme value theory, and finally for all the inspiring discussions and appropriate criticisms of my research on Tuesday and Thursday evenings at the department.

I am also grateful to Jacques de Maré for helping and encouraging me to start my graduate studies. My first contact with Jacques was when I was studying Industrial Engineering at Chalmers and he has provided great support all these years.

Thanks also to Håkan Pramsten from Länsförsäkringsbolagens AB for his enthusiasm, continuous interest, and stimulating discussions in different stages of the projects we have been involved in.

Further thanks to the people at the administration of the Department of Mathematics, especially Inga-Lill Sandman and Lotta Fernström, who have always been very kind and supportive.

As always, my sincere thanks to my brother Reza, who has been my best friend forever, and my sister Lida, who always has time for me, and also for teaching me to cook Iranian food.

And finally, a huge personal thanks to my mother and father, who have helped me every step of the way. I have felt your support even from a long distance and it has been invaluable for me. Thanks.

Nader Tajvidi

October � �


Contents

List of Papers

1 Overview of the Papers

2 What Is Left to Do?

References

Paper A
Confidence Intervals and Accuracy Estimation for Heavytailed Generalised Pareto Distribution

1 Introduction
2 Parameter Estimation in the Generalised Pareto Distribution
  2.1 Bootstrap Confidence Intervals
  2.2 Likelihood-based Confidence Intervals
  2.3 Accuracy Measures for ML Estimators
3 Simulation Results
  3.1 Simulation Results of Bootstrap Confidence Intervals
  3.2 Simulation Results of Likelihood-based Confidence Intervals
  3.3 Estimates of Some Accuracy Measures of ML Estimators
4 Discussion
References

Paper B
Multivariate Generalised Pareto Distributions

1 Introduction
2 Univariate Extreme Value Distributions
3 Generalised Pareto Distributions
4 Multivariate Generalised Pareto Distributions
5 Characterisation of Multivariate Max Stable Distributions
  5.1 Two Properties of Multivariate Generalised Pareto Distributions
6 Bivariate Pareto Distributions
7 Modelling the Dependence Function
  7.1 The Generalised Symmetric Logistic Model
  7.2 The Generalised Symmetric Mixed Model
8 Statistics of Bivariate Generalised Pareto Distributions
  8.1 Likelihood Inference
9 Application to Wind Data
References

Paper C
Design and Implementation of Statistical Computations for Generalised Pareto Distributions

1 Introduction
2 Design and Implementation: General Ideas
3 Optimising a Likelihood Function
  3.1 Design and Implementation
  3.2 Discussion
4 Bootstrap Simulations
  4.1 Design and Implementation
  4.2 Discussion
5 Data Analysis
  5.1 To Prepare the Data for Analysis
  5.2 To Analyse the Data
  5.3 To Examine and Present the Results
6 Other Implementation Details
7 Summary
References


List of Papers

This thesis is composed of the following papers, which are referred to in the text as Paper A, B and C.

Paper A: Confidence Intervals and Accuracy Estimation for Heavytailed Generalised Pareto Distribution (??).

Paper B: Multivariate Generalised Pareto Distributions (??).

Paper C: Design and Implementation of Statistical Computations for Generalised Pareto Distributions (??).

In addition, the complete simulation results obtained in Paper A are available as a separate appendix.


1 Overview of the Papers

The theory of extremes and its broad spectrum of application areas, such as civil engineering, materials science, insurance and environmental science, have recently been the subject of much theoretical and practical work. Extreme value theory has been used to predict the occurrence of extreme events or values based on less extreme sampling data. While most statistical procedures are in some way concerned with describing the norm, extreme value models seek to describe the unusual. In some engineering situations and in many design problems for environmental engineers, extremes have overriding importance. This thesis is an effort to address some theoretical and applied statistical issues of univariate and multivariate extreme value modelling. The main interest has been in the practicality of the methods, so when a new method has been developed, its performance has been studied with the help of both real-life data and simulations.

In [?] the theory has been used successfully for modelling storm damage in Sweden. The problem is illustrated in Figure 1, which shows the relative sizes of the accumulated loss in the most severe storms encountered by the Swedish insurance group Länsförsäkringar in a ??-year period (??).

[Figure 1: Windstorm losses. Storms labelled in the plot: Feb 92, Dec 88, Jan 84, Jan 83, Jan 93.]

This consists of ?? storm events, with a total claimed amount of ?? million Swedish crowns (MSEK). It can be seen that the most costly storm contributes about ??% of the total amount for the period, that it is ?? times bigger than the second worst storm, and that four storms together make up about half of the claims. Further, the second and third largest storms occurred in ?? and ??, and then there is a big gap until ??, when the biggest one came.

The questions for the insurance companies then are:

• How can one predict the size of the next very severe storm?

• How much reserves are needed to protect the company against it?

Extreme value theory provides a framework for modelling extreme losses and, as a result, answers to questions such as the ones above.

The classical approach to extreme value theory starts from limit distributions of sample maxima. Let $X_1, X_2, \dots, X_n$ be a sequence of mutually independent random variables with common distribution function $F(x)$. Define

$$M_n = \max(X_1, X_2, \dots, X_n), \qquad n \in \mathbb{N}.$$

Suppose there exist sequences $a_n > 0$ and $b_n \in \mathbb{R}$ such that

$$\lim_{n\to\infty} P\left(\frac{M_n - b_n}{a_n} \le x\right) = \lim_{n\to\infty} F^n(a_n x + b_n) = G(x) \tag{1.1}$$

and $G(x)$ is a non-degenerate distribution function.

If (1.1) holds we say that $F(x)$ belongs to the domain of attraction of $G(x)$ and write $F \in D(G)$. It can be shown that $G(x)$ must be of one of the following types (or families of types), see e.g. [?].

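Type I:

$$\Phi_\alpha(x) = \begin{cases} 0, & x \le 0 \\ \exp(-x^{-\alpha}), & x > 0 \end{cases} \qquad \text{for some } \alpha > 0.$$

Type II:

$$\Psi_\alpha(x) = \begin{cases} \exp(-(-x)^{\alpha}), & x \le 0 \\ 1, & x > 0 \end{cases} \qquad \text{for some } \alpha > 0.$$

Type III:

$$\Lambda(x) = \exp(-e^{-x}), \qquad x \in \mathbb{R}.$$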

This is a basic result from classical extreme value theory and is sometimes called the Extremal Types Theorem. All the possible limit distributions above can be unified as the following 3-parameter family, which is called the Generalised Extreme Value Distribution (GEVD). A distribution function (d.f.) $G$ is called a GEVD if it has the form

$$G(x; \mu, \sigma, k) = \exp\left\{-\left(1 - k\,\frac{x - \mu}{\sigma}\right)_+^{1/k}\right\} \tag{1.2}$$

where $\sigma > 0$, $\mu$ and $k$ are real parameters, and $x_+ = \max(x, 0)$. The support of the distribution is $x < \mu + \sigma/k$ for $k > 0$ and $x > \mu + \sigma/k$ for $k < 0$. For $k = 0$, we interpret $G(x)$ to be the limit $G(x) = \exp(-e^{-(x-\mu)/\sigma})$.
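As a rough illustration of how the three types sit inside one family, the following Python sketch evaluates a GEVD distribution function. It assumes the form $\exp\{-(1 - k(x-\mu)/\sigma)_+^{1/k}\}$, with $k = 0$ read as the Gumbel limit; the function name `gev_cdf` and its argument names are hypothetical and not taken from the thesis software.

```python
import math


def gev_cdf(x, mu=0.0, sigma=1.0, k=0.0):
    """GEVD d.f., assumed form exp{-(1 - k (x - mu) / sigma)_+ ** (1 / k)}."""
    z = (x - mu) / sigma
    if k == 0.0:
        # Gumbel limit exp(-exp(-z)); the cap avoids overflow for very negative z.
        return math.exp(-math.exp(min(-z, 700.0)))
    t = 1.0 - k * z
    if t <= 0.0:
        # Outside the support: above the upper endpoint (k > 0)
        # or below the lower endpoint (k < 0).
        return 1.0 if k > 0 else 0.0
    try:
        return math.exp(-(t ** (1.0 / k)))
    except OverflowError:
        # t ** (1 / k) is astronomically large, so the d.f. is effectively 0.
        return 0.0


if __name__ == "__main__":
    for shape in (-0.5, 0.0, 0.5):
        print(shape, round(gev_cdf(1.0, k=shape), 4))
```

Under this sign convention, negative $k$ gives the heavy-tailed (Type I) case, positive $k$ the bounded (Type II) case, and $k = 0$ the Type III case.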

The origins of extreme value theory go back to the pioneering work of Fisher and Tippett [?], who in 1928 derived the three limit distributions above for maxima and essentially proved the Extremal Types Theorem. Gnedenko (1943) gave a full proof of the theorem and also provided a much more detailed analysis of the domain of attraction conditions. The representation (1.2) for the GEVD is due to von Mises (19??). Since then there have been significant advances in the probabilistic and statistical theory of univariate and multivariate extreme values, and a fairly large amount of literature is now available on different aspects of the subject, cf. [?] and [?].

A rather recent approach for modelling extreme events is based on so-called peak over threshold (POT) methods ([?], [?] and [?]). The basic model uses the generalised Pareto distribution (GPD) for modelling exceedances of a random variable over a high threshold and can also handle covariates. The GPD is defined as

$$H(x; \sigma, k) = 1 - \left(1 - \frac{kx}{\sigma}\right)^{1/k}.$$

Here $\sigma > 0$ and $k$ are real parameters and the support of the distribution is $x > 0$ for $k \le 0$ and $0 < x < \sigma/k$ for $k > 0$. For $k = 0$ we interpret $H$ to be the exponential distribution $H(x) = 1 - e^{-x/\sigma}$. Pickands [?] showed that the GPD arises as a limiting distribution for the excess over thresholds if and only if the parent distribution belongs to the domain of attraction of an extreme value distribution. Specifically, let $F_u(x)$ be the conditional distribution of the excess over the threshold $u$, i.e.

$$F_u(x) = P(X \le u + x \mid X > u) = \frac{F(u + x) - F(u)}{1 - F(u)}.$$

Then

$$\lim_{u \to x_F}\ \inf_{0 < \sigma < \infty}\ \sup_{0 \le x < \infty} \left| F_u(x) - H(x; \sigma, k) \right| = 0$$

for some $k$ if and only if $F \in D(G)$ for some GEVD $G$. Here

$$x_F = \sup\{x : F(x) < 1\}$$

is the right-hand endpoint of $F$.
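The threshold model can be sketched in a few lines of Python. The example assumes the GPD form $H(x) = 1 - (1 - kx/\sigma)^{1/k}$, with $k = 0$ read as the exponential case; all function names (`gpd_cdf`, `gpd_quantile`, `gpd_sample`, `excesses`) are hypothetical helpers for illustration, not the thesis software.

```python
import math
import random


def gpd_cdf(x, sigma=1.0, k=0.0):
    """GPD d.f., assumed form 1 - (1 - k x / sigma) ** (1 / k)."""
    if x <= 0.0:
        return 0.0
    if k == 0.0:
        return 1.0 - math.exp(-x / sigma)
    t = 1.0 - k * x / sigma
    if t <= 0.0:
        return 1.0  # beyond the upper endpoint sigma / k (only when k > 0)
    return 1.0 - t ** (1.0 / k)


def gpd_quantile(p, sigma=1.0, k=0.0):
    """Inverse of gpd_cdf for 0 <= p < 1."""
    if k == 0.0:
        return -sigma * math.log(1.0 - p)
    return sigma * (1.0 - (1.0 - p) ** k) / k


def gpd_sample(n, sigma=1.0, k=0.0, seed=None):
    """Draw n GPD values by inverse-transform sampling."""
    rng = random.Random(seed)
    return [gpd_quantile(rng.random(), sigma, k) for _ in range(n)]


def excesses(data, u):
    """Excesses x - u of the observations that exceed the threshold u."""
    return [x - u for x in data if x > u]


if __name__ == "__main__":
    sample = gpd_sample(1000, sigma=1.0, k=-0.7, seed=1)
    print(len(excesses(sample, 5.0)), "exceedances of the threshold 5.0")
```

In a POT analysis the values returned by `excesses` would be the data to which a GPD is fitted.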

This model has proven to be one of the most efficient ways to apply extreme value theory in practice. It has applications in many fields including hydrology, environmental extreme events, reliability and insurance [?]. Different estimation methods for the parameters of the GPD have previously been considered in the literature [?], but the main focus has been on the case with $k > -1/2$. In applications like [?] the region $k < -1/2$ is the principal interest, and this region is the main area of our study in Paper A.

Usually only a limited amount of extreme value data is available, and it is hence important to develop methods which use the data efficiently and have acceptable performance for small to moderate sample sizes. Hosking and Wallis [?] used the asymptotic distribution of the estimators and studied the empirical coverage probability of approximate confidence intervals for the parameters and quantiles of the GPD. They conclude that confidence intervals for $k$ and the $p$-quantile, $Q(p)$, with $p \ge$ ??, require very large sample sizes, ?? or more, before acceptable accuracy is obtained.

Paper A compares the performance of some other methods for constructing confidence intervals for the parameters of the GPD. Our main emphasis is on small to moderate sample sizes and the heavytailed region of the GPD. Simulations were performed for sample sizes $n$ = ?? and with $k$ taking the values ??. Maximum likelihood estimators are invariant under scale changes, so without loss of generality $\sigma$ was set to 1 in all simulations. For each combination of $k$ and $n$, ?? random samples from the GPD were generated. For each sample $x$, we calculated ML estimates of the parameters and the ??-quantile of the GPD. For each sample, ?? independent bootstrap samples $x^{*1}, x^{*2}, \dots$ were generated, each consisting of $n$ data values drawn with replacement from $x$. For each bootstrap sample, we calculated ML estimates of the parameters, $\hat\sigma^*$ and $\hat k^*$, and of the quantile, $\hat Q^*(p)$.

We considered two bootstrap methods for constructing confidence intervals [?], viz.

• percentile bootstrap intervals,

• bias-corrected and accelerated intervals (BCa).

The simulations showed that the BCa method is uniformly more accurate than the percentile bootstrap. In many applications, a lower-limited confidence interval for the shape parameter of the GPD is of main interest. The simulation results indicated that there is a need for improvement in the empirical coverage probability of the BCa method for such intervals.
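The resampling step behind the percentile interval can be sketched as follows. This is a minimal illustration only: the statistic is passed in as a generic function (the sample median in the usage example), whereas Paper A uses ML estimates of the GPD parameters and quantiles, and the BCa adjustment is not shown. The name `percentile_interval` and its default arguments are assumptions.

```python
import random
import statistics


def percentile_interval(data, statistic, n_boot=1000, alpha=0.10, seed=0):
    """Two-sided percentile bootstrap interval with nominal coverage 1 - alpha."""
    rng = random.Random(seed)
    n = len(data)
    # Each bootstrap sample consists of n values drawn with replacement from data.
    replicates = sorted(
        statistic([rng.choice(data) for _ in range(n)]) for _ in range(n_boot)
    )
    lower_index = int((alpha / 2) * n_boot)
    upper_index = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return replicates[lower_index], replicates[upper_index]


if __name__ == "__main__":
    observations = [0.3, 1.9, 0.7, 4.2, 0.1, 2.5, 0.9, 1.4, 6.8, 0.5]
    print(percentile_interval(observations, statistics.median))
```

Repeating this over many simulated samples and counting how often the interval covers the true value gives the empirical coverage probability studied in the paper.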

Confidence intervals for a parameter may be obtained directly from the likelihood function by inverting the likelihood ratio (LR) test. E.g. if $H_0$ is a simple hypothesis specifying the value for just one parameter, $\theta = \theta_0$, then the hypothesis is rejected at the level $\alpha$ if

$$l(\theta_0) < l(\hat\theta) - \tfrac{1}{2}\chi^2_{1,1-\alpha}$$

where $l(\hat\theta)$ denotes the unrestricted maximum of the log-likelihood function, $l(\theta_0)$ is the restricted maximum, and $\chi^2_{1,1-\alpha}$ is the $(1-\alpha)$th quantile of the $\chi^2$-distribution with one degree of freedom. Thus $\theta = \theta_0$ is not rejected if the log-likelihood evaluated at $\theta_0$ is not more than $\tfrac{1}{2}\chi^2_{1,1-\alpha}$ units less than the maximum of the log-likelihood at $\hat\theta$. The values of $\theta_0$ satisfying this requirement determine a $100(1-\alpha)\%$ likelihood confidence interval for $\theta$.
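A small sketch of inverting the LR test is given below. To keep it free of nuisance parameters it uses the one-parameter exponential model (the GPD with shape zero), so no profiling is needed; for the two-parameter GPD the log-likelihood would first have to be maximised over the other parameter. The function names are hypothetical, and the $\chi^2_1$ quantile is obtained as the square of a standard normal quantile.

```python
import math
from statistics import NormalDist


def exp_loglik(sigma, data):
    """Log-likelihood of an exponential sample with scale sigma."""
    return -len(data) * math.log(sigma) - sum(data) / sigma


def likelihood_interval(data, alpha=0.05):
    """Likelihood-based interval for sigma obtained by inverting the LR test."""
    sigma_hat = sum(data) / len(data)  # ML estimate of the scale
    # Half the (1 - alpha) quantile of chi-square(1), i.e. z_{1 - alpha/2}^2 / 2.
    cutoff = 0.5 * NormalDist().inv_cdf(1 - alpha / 2) ** 2
    target = exp_loglik(sigma_hat, data) - cutoff

    def solve(inside, outside):
        # Bisection between a point inside the interval and one outside it.
        for _ in range(200):
            mid = 0.5 * (inside + outside)
            if exp_loglik(mid, data) >= target:
                inside = mid
            else:
                outside = mid
        return inside

    lower = solve(sigma_hat, sigma_hat * 1e-6)
    upper = solve(sigma_hat, sigma_hat * 1e6)
    return lower, sigma_hat, upper


if __name__ == "__main__":
    print(likelihood_interval([0.5, 1.2, 0.3, 2.4, 1.1, 0.9, 3.0, 0.7]))
```

The resulting interval is asymmetric around the ML estimate, which is typical of likelihood-based intervals for scale parameters.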

The asymptotic $\chi^2_r$ approximation does not always perform well for small samples. Lawley [?] suggests a general method for improving the approximation of the distribution of the likelihood ratios. The main idea is to obtain a corrected statistic, say $LR^*$, which is distributed as $\chi^2_r$ when terms of order $O(n^{-2})$ and smaller are neglected. Lawley assumes that the likelihood function depends upon $p + q$ population parameters and considers testing the composite hypothesis $H_0$ that $\theta_{p+1}, \theta_{p+2}, \dots, \theta_{p+q}$ have specified values while the other parameters are unspecified and unknown. He shows that the LR criterion for testing $H_0$ has mean $q + \epsilon_{p+q} - \epsilon_p + O(n^{-2})$, where the $\epsilon_k$ are of order $O(n^{-1})$ and are defined in equation (??) of the article. He also shows that

$$LR^* = \left\{1 - \frac{1}{q}\,(\epsilon_{p+q} - \epsilon_p)\right\}\left(-2\,\{l(\theta_0) - l(\hat\theta)\}\right)$$

has the same moments as $\chi^2$ with $q$ degrees of freedom, neglecting quantities of order $O(n^{-2})$.

In the GPD there are only 2 parameters, $\sigma$ and $k$. Thus we only need to calculate $\epsilon_1$ and $\epsilon_2$, where $\epsilon_1$ must be calculated separately for $\sigma$ and $k$. This involves calculating the expected values of the first four derivatives of the log-likelihood function. We also have to calculate the inverse of the expected value of the Hessian of the log-likelihood function. The design and implementation of these calculations are discussed in Paper C.

We obtained the following correction factors for $\sigma$ and $k$:

�� ���� � ��� � � ��� �� � ��� ��

� ��� � � ��� ��� � � �� n

�� ���� � �� � � ��� �� � �� �� � �� ��

� ��� � � ��� ��� � � �� n�

In some applications, confidence intervals for upper quantiles of the GPD are of major interest. It is possible to reparametrise the GPD with the upper $p$-quantile as a parameter in the distribution and then use the methods discussed above. The corresponding correction factor for the $p$-quantile is rather complicated. Paper A gives an Internet address where the programs (in Mathematica, Fortran and C) for calculating the correction factor for the $p$-quantile can be obtained.

In Paper A we compare the performance of likelihood-based intervals with and without correction factors. Simulations showed that for sample sizes larger than ??, lower-limited intervals for $k$ can best be constructed by using correction factors in the likelihood-based confidence intervals.

The paper also investigates the performance of some bootstrap methods for estimating several accuracy measures of maximum likelihood estimators of the parameters and the ??-quantile of the GPD. These measures include

• jackknife and bootstrap estimates of bias,

• jackknife and bootstrap estimates of standard error,

• asymptotic estimates of standard error,

• coefficient of variation, and

• root mean square error.
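Two of the measures in the list above, the jackknife estimates of bias and standard error, can be sketched generically. The statistic is again passed in as a function; the names `jackknife` and `mean` are illustrative assumptions rather than the routines used in Paper A.

```python
import math


def jackknife(data, statistic):
    """Jackknife estimates of the bias and standard error of statistic(data)."""
    n = len(data)
    theta_hat = statistic(data)
    # Recompute the statistic with each observation left out in turn.
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    loo_mean = sum(leave_one_out) / n
    bias = (n - 1) * (loo_mean - theta_hat)
    std_error = math.sqrt(
        (n - 1) / n * sum((t - loo_mean) ** 2 for t in leave_one_out)
    )
    return bias, std_error


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    print(jackknife([1.0, 2.0, 3.0, 4.0, 5.0], mean))
```

For the sample mean the jackknife bias is zero and the standard error equals the usual $s/\sqrt{n}$, which makes it a convenient check before applying the routine to a more complicated estimator.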

Many problems which arise in practice are multivariate. During the past few years, the characterisation and probabilistic theory of multivariate extremes have been extensively developed, see e.g. [?].

Multivariate extreme value theory is concerned with extremes of several dependent populations. The traditional approach to the definition of multivariate extreme values is through componentwise ordering (for a discussion about multivariate ordering see [?]). Ordering or ranking here takes place within one or more of the marginal samples. Thus the maximum of a set of vectors $\{X_j,\ j = 1, \dots, n\} = \{(X_j^{(1)}, \dots, X_j^{(d)}),\ j = 1, \dots, n\}$ is defined by taking componentwise maxima, i.e. the maximum, $M_n$, is defined by

$$M_n = (M_n^{(1)}, \dots, M_n^{(d)}) = \left(\bigvee_{j=1}^{n} X_j^{(1)}, \dots, \bigvee_{j=1}^{n} X_j^{(d)}\right)$$

where $\bigvee$ denotes maximum.
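A tiny Python sketch makes the componentwise definition concrete; the function name `componentwise_max` is hypothetical.

```python
def componentwise_max(vectors):
    """Componentwise maximum of a collection of equal-length vectors."""
    return tuple(max(column) for column in zip(*vectors))


if __name__ == "__main__":
    sample = [(1.0, 5.0), (3.0, 2.0), (2.0, 4.0)]
    print(componentwise_max(sample))
```

In this usage example the maximum combines the first coordinate of the second vector with the second coordinate of the first, so the componentwise maximum need not be one of the observed vectors.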

Now, let $\{X_n,\ n \ge 1\}$ be independent and identically distributed vectors with distribution function $F(x)$. The limiting joint distributions of componentwise maxima, subject to location and scale renormalisation, are multivariate extreme value distributions (MEVDs). The probabilistic theory of these distributions has been discussed extensively in the books by Galambos [?] and Resnick [?]. The family is infinite dimensional, but there are several restrictions on the structure of dependence between the marginals.

Assume that there exist normalising constants $\sigma_n^{(i)} > 0$, $u_n^{(i)} \in \mathbb{R}$, $1 \le i \le d$, $n \ge 1$, such that as $n \to \infty$

$$P\left(\frac{M_n^{(i)} - u_n^{(i)}}{\sigma_n^{(i)}} \le x^{(i)},\ 1 \le i \le d\right) = F^n\left(\sigma_n^{(1)} x^{(1)} + u_n^{(1)}, \dots, \sigma_n^{(d)} x^{(d)} + u_n^{(d)}\right) \to G(x) \tag{1.3}$$

with the limit distribution $G$ such that each marginal $G_i$, $i = 1, \dots, d$, is non-degenerate. If (1.3) holds, $F$ is said to be in the domain of attraction of $G$, as before denoted by $F \in D(G)$, and $G$ is said to be a multivariate extreme value distribution.

A multivariate convergence of types argument shows that the class of limit d.f.s for (1.3) is the class of max-stable distributions, which is defined as follows.

Definition 1.1. A d.f. $G$ in $\mathbb{R}^d$ is called max-stable if for every $t > 0$ there exist functions $\alpha^{(i)}(t) > 0$, $\beta^{(i)}(t)$ such that

$$G^t(x) = G\left(\alpha^{(1)}(t)\,x^{(1)} + \beta^{(1)}(t), \dots, \alpha^{(d)}(t)\,x^{(d)} + \beta^{(d)}(t)\right).$$

If $G$ is a multivariate extreme value distribution, then each of its marginal distributions is a univariate extreme value distribution. The univariate extremal distributions can all be obtained from one another by means of simple functional transformations, see e.g. [?], page ??, and [?]. If the random vector $X$ has a multivariate extreme value distribution, then so does $Y$ if $Y$ has marginal components which are derived from the corresponding marginal components of $X$ by these transformations. It follows that a particular marginal distribution may be chosen and that much of the interest is in the so-called dependence function. Different authors have assumed different marginal distributions. Tiago de Oliveira ([?], [?] and [?]) used the standard Gumbel distribution, $\exp(-e^{-x})$, whereas Pickands (see [?] and [?], chapter ??) considered min-stable distributions, assuming the marginal distributions to be exponential. De Haan and Resnick chose the unit Fréchet distribution $\Phi_1(x) = e^{-1/x}$ for the margins [?].

Max-stable distributions form a subclass of the max-infinitely divisible (max-id) d.f.s, which is defined as follows.

Definition 1.2. A d.f. $G$ in $\mathbb{R}^d$ is called max-infinitely divisible (max-id) if $G^t(x_1, \dots, x_d)$ is a d.f. for every $t > 0$.

One characterisation of max-id distributions is presented in Resnick [?], page ??, and briefly discussed in Paper B. To characterise the max-stable distributions with non-degenerate marginals, we first assume that each margin of $G$ has the unit Fréchet extreme value distribution, i.e.

$$G(\infty, \dots, \infty, x_i, \infty, \dots, \infty) = \Phi_1(x_i) = \exp(-x_i^{-1}), \qquad x_i > 0.$$

A max-stable distribution with unit Fréchet marginals will be denoted by $G_*$. The characterisation in the general case is discussed in Resnick [?]. In the bivariate case it gives

$$G_*(x, y) = e^{-\mu_*([0,(x,y)]^c)}$$

where $\mu_*$ is called the exponent measure and is given by

$$\mu_*([0,(x,y)]^c) = \left(\frac{1}{x} + \frac{1}{y}\right) A\left(\frac{x}{x+y}\right)$$

and

$$A(q) = \int_0^1 \max\{w(1-q),\ (1-w)q\}\, S(dw).$$

$A(q)$ is called the dependence function, see e.g. [?]. Here $S$ is a finite positive measure on the interval $[0, 1]$. To get Fréchet marginals we need

$$\int_0^1 w\, S(dw) = \int_0^1 (1-w)\, S(dw) = 1$$

and $A(q)$ must satisfy the following conditions:

• $A(0) = A(1) = 1$,

• $\max\{q, 1-q\} \le A(q) \le 1$, and

• $A(q)$ is convex for $q \in [0, 1]$.

For statistical purposes, there have been two main approaches in the literature, namely modelling the dependence function with subfamilies indexed by a finite number of parameters [?], and nonparametric estimation of the dependence function, cf. [?]. The parametric models can also be divided into two classes: differentiable and non-differentiable models. The differentiable models have densities and can be symmetric or asymmetric. The non-differentiable models give distributions which are singular. Tawn [?] reviews the parametric methods and also introduces two new asymmetric differentiable models. A survey of existing non-differentiable models can be found in Tiago de Oliveira [?] and [?].

In Paper B we introduce two new parametric differentiable models. Among the symmetric differentiable models, the logistic model (originally due to Gumbel) has been shown to be the most useful, see e.g. [?]. Our first model is a generalisation of the logistic model to a 2-parameter family. It is called the generalised symmetric logistic model and has exponent measure

$$\mu_*([0,(x,y)]^c) = \left(\frac{1}{x^p} + \frac{1}{y^p} + \frac{k}{(xy)^{p/2}}\right)^{1/p}, \qquad ?? \le k \le ??\,(p - ??), \quad p \ge ??.$$

The dependence function for the model is

$$A(w) = \left((1-w)^p + w^p + k\,\big((1-w)\,w\big)^{p/2}\right)^{1/p}.$$

In this model

• $k = 0$ gives the symmetric logistic model,

• independence corresponds to $p$ = ?? and $k$ = ??, and

• complete dependence is obtained for $k$ = ?? and $p = \infty$.

We call the second model the generalised symmetric mixed model. Its exponent measure is

$$\mu_*([0,(x,y)]^c) = \frac{1}{x} + \frac{1}{y} - \frac{k}{(x^p + y^p)^{1/p}}, \qquad 0 \le k \le 1, \quad p \ge 0,$$

which corresponds to the following dependence function:

$$A(w) = 1 - \frac{k}{\left((1-w)^{-p} + w^{-p}\right)^{1/p}}.$$

In this model

• independence corresponds to $k = 0$ or $p = 0$, and

• complete dependence can be obtained with $k = 1$ and $p = \infty$ (which is not possible in the symmetric or asymmetric mixed model discussed in [?]).
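The two dependence functions can be checked numerically against the conditions on $A$. The sketch below assumes the forms $A(w) = ((1-w)^p + w^p + k((1-w)w)^{p/2})^{1/p}$ for the generalised symmetric logistic model and $A(w) = 1 - k((1-w)^{-p} + w^{-p})^{-1/p}$ for the generalised symmetric mixed model, with $p > 0$; the function names and the parameter values in the usage example are illustrative assumptions.

```python
def a_logistic(w, p, k=0.0):
    """Dependence function assumed for the generalised symmetric logistic model."""
    return ((1 - w) ** p + w ** p + k * ((1 - w) * w) ** (p / 2)) ** (1 / p)


def a_mixed(w, p, k):
    """Dependence function assumed for the generalised symmetric mixed model (p > 0)."""
    if w <= 0.0 or w >= 1.0 or k == 0.0:
        return 1.0  # boundary values A(0) = A(1) = 1; k = 0 gives independence
    return 1.0 - k * ((1 - w) ** (-p) + w ** (-p)) ** (-1 / p)


def satisfies_bounds(a, grid_size=100, tol=1e-12):
    """Check max(w, 1 - w) <= A(w) <= 1 on an equally spaced grid."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return all(max(w, 1 - w) - tol <= a(w) <= 1 + tol for w in grid)


if __name__ == "__main__":
    print(satisfies_bounds(lambda w: a_logistic(w, p=3.0, k=1.0)))
    print(satisfies_bounds(lambda w: a_mixed(w, p=2.0, k=0.5)))
    print(round(a_mixed(0.3, p=50.0, k=1.0), 6))
```

With $k = 1$ and a large $p$, the mixed model's $A(w)$ is numerically close to $\max\{w, 1-w\}$, the dependence function of complete dependence. The grid check covers only the bounds on $A$, not convexity.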

To develop methods which use more of the available data, and as a result contribute to better estimation of the parameters in the model, it is natural to generalise the threshold method to higher dimensions [?]. In Paper B we work directly with the exceedances over a high threshold and define and motivate multivariate GPDs. We also consider estimation of parameters in some specific bivariate generalised Pareto distributions (BGPDs). In the latter case a model is fitted to the joint distribution of a bivariate observation subject to the condition that at least one component exceeds a high threshold. This allows us to study dependent extremes and, for example, to estimate a bivariate upper quantile curve or calculate the probability that the thresholds are simultaneously exceeded by two variables.

We prove a theorem which motivates two definitions of multivariate generalised Pareto distributions (MGPDs). The following is an outline of the theorem.

For $\bar F(x) = 1 - F(x) = P(X \nleq x)$ and $x_F = \sup\{x : F(x) < 1\}$ we consider the conditional distribution

$$P(X \nleq u + \sigma(u)x \mid X \nleq u) = \frac{\bar F(u + \sigma(u)x)}{\bar F(u)}$$

and show that if

• $G$ is a multivariate extreme value distribution in $\mathbb{R}^d$ and

• $F \in D(G)$

then for every $x_0 \in \mathrm{Support}(G)$ there exist a curve $l$ in $\mathbb{R}^d$ and a function $\sigma(u) = (\sigma_1(u_1), \dots, \sigma_d(u_d))$ such that for $u$ on $l$

$$\frac{\bar F(u + \sigma(u)x)}{\bar F(u)} \to \bar H(x), \qquad u \to x_F,$$

where

$$\bar H(x) = \frac{-\log G(x + x_0)}{-\log G(x_0)},$$

and that a suitably formulated converse statement also holds.

Hence there is a close connection between MEVDs and MGPDs, and a characterisation of MEVDs gives a corresponding characterisation of MGPDs. We discuss some properties of MGPDs and give a characterisation of bivariate Pareto distributions by using a polar transformation of the marginals. We also consider some specific examples of these distributions. The discussion in the rest of the paper is restricted to the bivariate case. We consider bivariate extreme value distributions which correspond to our new models. Maximum likelihood estimation of the parameters in the corresponding BGPDs is also discussed, and the procedure is illustrated with an application to a bivariate series of wind data.

The data come from a project for modelling wind storm damage in the south of Sweden and consist of the maximum wind speeds in ?? storms in the period ??. The Swedish Meteorological and Hydrological Institute calculates a grid of wind speeds for each storm in Sweden. These calculations are based on pressure measurements at different meteorological stations which cover a specific area. From ?? grid points in the province of Skåne in the south of Sweden we chose the two points with the least correlation. At the first stage we performed a univariate Pareto analysis on each margin. The main purpose of these calculations was to find an appropriate threshold ($u$) for each margin. Besides ML estimates, we also compared parameter estimates from the Method of Moments (MoM) and Probability Weighted Moments (PWM). Simultaneous estimation of the parameters in a model with arbitrary marginals is also illustrated. We use the generalised symmetric logistic model and give ML estimates of the parameters with different thresholds. Different bivariate upper quantile curves for the wind data are also presented. Our main aim here is to illustrate how BGPDs can be used for modelling extreme events.

The last section of the paper is devoted to a small simulation study which compares the sensitivity of the maximum likelihood estimates of the parameters to a change in the threshold on each margin.

For the application of multivariate extreme value theory, quite a number of parametric and non-parametric models have been proposed [?] but, compared to univariate extremes, inference and modelling of multivariate extremes are in their early years, and the usefulness and performance of these methods have not yet been completely explored. A successful application of multivariate theory in general requires the availability of high-quality software. Unfortunately, to the best of our knowledge, there is at present no complete software available for the application of multivariate extreme value theory.

Since extreme events are by definition rare, there is usually only a little data available, and hence efficient use of the data is very important. This makes high demands on the development of statistical methods and also on the quality of the software used for applying the methods. Despite the diversity of available tools, there is no single tool which can be used in all stages of such an application.

In Paper C we discuss the nature and complexity of the problems involved both in the application of existing theory and in statistical research. We use Papers A, B and [?] as examples to explain the difficulties which might arise and also present the methods which have been used to solve them. A common theme in the papers is the application of univariate and multivariate generalised Pareto distributions. However, we believe that many of the problems discussed are of a rather general nature and can be of interest to researchers in other areas of applied statistics. Comparing the designs and implementations discussed in Paper C shows that there is no "unique" solution in this kind of application, that problems should be analysed independently, and that algorithms for solving them should be designed separately.

In the paper we also present a general framework for the design and implementation of statistical computations. We argue that this approach can create the flexibility which is needed in this type of application. Some recent approaches in computer software development are also discussed and some pointers for further study in the subject are provided [?]. Although we believe that the problems involved in research cannot be handled by a single tool, it is still possible to suggest some tools which are likely to be useful in some stages of almost every research project. In the "Discussion" sections of Paper C we present and discuss some of these tools, cf. [?].

Many of the results presented in this thesis have involved the development of software for carrying out the calculations and also for designing computer experiments. We have made all of the developed programs and the complete simulation results available at http://www.math.chalmers.se/~nader/software.html. Furthermore, in Paper C we give a rather detailed discussion of the design and implementation of the calculations. There are two reasons for this:

(i) It provides an easy way for others to inspect the programs and replicate the results.

(ii) Many of the programs can be used directly in similar applications. This gives a possibility to save the time which is needed to develop software for the same purposes.

Finally, considering the amount of computation which is usually involved in methodological research, this seems to be the only feasible way to provide sufficient details to the readers.

� What Is Left to Do

The short answer to the above question is "a lot!"; however, here is a brief list.

• Paper A:

  – We have increased the number of random samples from ��� to ����. The results presented in this paper must be recalculated with the new set of simulations.

  – In the paper we presented the correction factor for the p-quantile after reparametrisation of the GPD. Optimisation routines for calculating the profile likelihood function in the new model should be developed. This would make it possible to compare the performance of likelihood-based confidence intervals with the bootstrap methods.

• Paper B:

  – In this paper we only considered an example with the generalised symmetric logistic model. The optimisation routine for this model should be improved to contain the special case of exponential margins.

  – For other parametric families of dependence functions a similar optimisation routine for the likelihood function should be developed.

  – More detailed simulation studies are needed to understand the behaviour of these models under different circumstances. The results of the simulations can be used to understand the trade-off between different parameters in each model. A relevant question is whether the parametrisation is unique or not.

  – In addition to the likelihood method, other methods of parameter estimation must be studied for these models. In particular, our experience shows that the optimisation routines are very sensitive to the choice of initial point. Developing other estimation methods can yield a better choice of initial point in the likelihood method.

  – A natural question concerns the consistency and accuracy of the estimates. How large should a sample be? What is the asymptotic distribution of the estimators?

  – Tests for independence of marginal distributions should be developed.

  – The standard likelihood ratio tests can be used for testing between models of the same family. How can we decide between different families?

  – Is it possible to improve existing symmetric models to non-symmetric models? In the models discussed in this paper we obtain non-symmetry by transforming the margins. The obvious possibility is to improve the existing symmetric models by introducing new parameters which stand for non-symmetry.

  – In this paper we have given two definitions of MGPD's. Are there other alternative definitions?

  – For each model, optimisation routines should be improved so that covariates can be incorporated into the analysis.

• Paper C:

  – The software should be made more "user-friendly". Appropriate documentation should be written to explain what each program does and how it can be used.

  – There are similar difficulties as those discussed in Paper A in calculating confidence intervals for the GEVD. It would be interesting to see if a similar scaling factor as in the GPD can be calculated for the GEVD.

  – To study the new possibilities which can be produced by combining object-oriented software like S-plus and Java. A specific piece of research would be to develop applets for interactive statistical graphics. For example, one generates a 3D data set in S-plus and attaches a file containing those data (via a parameter) to a Java spin applet. Distributed WWW software and specialised user-interface design are major topics for the next few years in applied statistics.

References

[1] Balkema, A. and Resnick, S. (1977) Max-infinite divisibility. J. Appl. Probab. 14, 309–319.

[2] Barnett, V. (1976) The ordering of multivariate data (with discussion). J. R. Statist. Soc. A 139, ���–���.

[3] Bates, D.M. (199�) Data Manipulation in Perl. Computing Science and Statistics, Proceedings of the �th Symposium on the Interface, ��, ���–���.

[4] Becker, R.A., Chambers, J.M. and Wilks, A.R. (1988) The New S Language: A Programming Environment for Data Analysis and Graphics. Wadsworth & Brooks/Cole Computer Science Series.

[5] Berliner, B. (1990) Parallelizing Software Development. Conference Proceedings of the USENIX Association's Winter 1990 Conference, Washington, DC.

[6] Budd, T. (1991) An Introduction to Object-Oriented Programming. Addison-Wesley Publishing Company, Inc.

[7] Chambers, J.M. and Hastie, T.J. (1992) Statistical Models in S. Wadsworth & Brooks/Cole Computer Science Series.

[8] Coles, S.G. and Tawn, J.A. (1991) Modelling multivariate extreme events. J. R. Statist. Soc. B 53, 377–392.

[9] Davis, R.A. and Resnick, S.I. (1984) Tail estimates motivated by extreme value theory. Ann. Statist. 12, 1467–1487.

[10] Davison, A.C. and Smith, R.L. (1990) Models for exceedances over high thresholds. J. R. Statist. Soc. B 52, 393–442.

[11] Deheuvels, P. and Tiago de Oliveira, J. (1989) On the non-parametric estimation of the bivariate extreme value distributions. Statist. Probab. Lett. 8, 315–323.

[12] Efron, B. and Tibshirani, R.J. (1993) An Introduction to the Bootstrap. Chapman & Hall, New York.

[13] Fisher, R.A. and Tippett, L.H.C. (1928) Limiting forms of the frequency distribution of the largest or smallest member of a sample. Proc. Cambridge Phil. Soc. 24, 180–190.

[14] Galambos, J. (1987) The Asymptotic Theory of Extreme Order Statistics, 2nd ed. Melbourne: Krieger.

[15] Galambos, J. and Kotz, S. (1978) Characterizations of Probability Distributions. Lecture Notes in Mathematics. New York: Springer-Verlag.

[16] Gumbel, E.J. (1958) Statistics of Extremes. New York: Columbia University Press.

[17] Gumbel, E.J. (1960) Bivariate exponential distributions. J. Amer. Statist. Assoc. 55, 698–707.

[18] Haan, L. de and Resnick, S.I. (1977) Limit theory for multivariate sample extremes. Z. Wahrscheinlichkeitstheorie v. Geb. 40, 317–337.

[19] Hall, P. (1992) The Bootstrap and Edgeworth Expansion. Springer-Verlag, New York.

[20] Hosking, J.R.M. and Wallis, J.R. (1987) Parameter and quantile estimation for the generalised Pareto distribution. Technometrics 29, 339–349.

[21] Joe, H., Smith, R.L. and Weissman, I. (1992) Bivariate threshold methods for extremes. J. R. Statist. Soc. B 54, 171–183.

[22] Lawley, D.N. (1956) A general method for approximating to the distribution of likelihood ratio criteria. Biometrika 43, 295–303.

[23] Leadbetter, M.R., Lindgren, G. and Rootzén, H. (1983) Extremes and Related Properties of Random Sequences and Processes. Berlin: Springer-Verlag.

[24] Lemay, L. and Perkins, C.L. (1996) Teach Yourself Java in 21 Days. Sams Publishing.

[25] Lewis, B., Laliberte, D., Stallman, R. and the GNU Manual Group (199�) GNU Emacs Lisp Reference Manual. Free Software Foundation, Inc.

[26] Marshall, A.W. and Olkin, I. (1967) A generalised bivariate exponential distribution. J. Appl. Probab. 4, 291–302.

[27] Pickands, J. III (1975) Statistical inference using extreme order statistics. Ann. Statist. 3, 119–131.

[28] Pickands, J. (1981) Multivariate extreme value distributions. Proc. 43rd Session I.S.I., 859–878.

[29] Resnick, S.I. (1987) Extreme Values, Regular Variation and Point Processes. Berlin: Springer-Verlag.

[30] Rootzén, H. and Tajvidi, N. (199�) Extreme value statistics and wind storm losses: a case study. To appear in Scandinavian Actuarial Journal.

[31] Rosbjerg, D., Madsen, H. and Rasmussen, P.F. (1992) Prediction in partial duration series with generalised Pareto-distributed exceedances. Water Resources Research 28, 3001–3010.

[32] Sibuya, M. (1960) Bivariate extremal statistics. Ann. Inst. Statist. Math. XI, 195–210.

[33] Smith, R.L. (1985) Statistics of extreme values. Proc. 45th Session I.S.I., Paper 26.1, Amsterdam.

[34] Smith, R.L. (1985) Maximum likelihood estimation in a class of nonregular cases. Biometrika 72, 67–90.

[35] Smith, R.L. (1987) Estimating tails of probability distributions. Ann. Statist. 15, 1174–1207.

[36] Stallman, R. (199�) GNU Emacs Manual, Eleventh Edition, Updated for Emacs Version 19.��. Free Software Foundation, Inc.

[37] Smith, R.L., Tawn, J.A. and Yuen, H.K. (1990) Statistics of multivariate extremes. Int. Statist. Rev. 58, 47–58.

[38] Tawn, J.A. (1988) Bivariate extreme value theory: models and estimation. Biometrika 75, 397–415.

[39] Tiago de Oliveira, J. (1974) Regression in the non-differentiable bivariate extreme models. J. Amer. Statist. Assoc. 69, 816–818.

[40] Tiago de Oliveira, J. (1975) Bivariate and multivariate extremes distribution. Statistical Distributions in Scientific Work 1, 355–361.

[41] Tiago de Oliveira, J. (1980) Bivariate extremes: foundations and statistics. Proc. 5th Int. Symp. Mult. Anal., North Holland, New York.

[42] Tiago de Oliveira, J. (1984) Bivariate models for extremes; statistical decision. Statistical Extremes and Applications, Reidel, Dordrecht, 131–153.

[43] Tiago de Oliveira, J. (1989) Intrinsic estimation of the dependence structure for bivariate extremes. Statist. Probab. Lett. 8, ���–���.

[44] Wall, L. and Schwartz, R.L. (1991) Programming Perl. O'Reilly & Associates, Inc.

Confidence Intervals and Accuracy Estimation

for Heavy-tailed Generalised Pareto Distribution

Nader Tajvidi

June ����

Abstract

A rather recent approach for modelling extreme events is the so-called peak over threshold (POT) method. The generalised Pareto distribution (GPD) is a two-parameter family of distributions which can be used to model exceedances over a threshold. We compare the empirical coverage of standard bootstrap and likelihood-based confidence intervals for these parameters. Simulation results indicate that none of the methods gives satisfactory intervals for small sample sizes. By applying a general method of D. N. Lawley, small-sample correction factors for the likelihood ratio statistics of the parameters and quantiles of the GPD have been calculated. In many applications, lower-limited confidence intervals for the shape parameter and upper-limited confidence intervals for the scale parameter of the GPD are of main interest. Simulations show that for sample sizes larger than � such intervals can best be constructed by incorporating the computed correction factors into the likelihood-based confidence intervals. While the corrected likelihood method has better empirical coverage probability, the mean length of the intervals produced is not longer than that of the corresponding bootstrap confidence intervals. This article also investigates the performance of some bootstrap methods for estimating accuracy measures of the maximum likelihood estimators of the parameters and quantiles of the GPD.

Keywords: Generalised Pareto distribution, Maximum likelihood, Small sample properties, Bartlett's correction, Bootstrap, Profile likelihood, Confidence intervals, Quantiles.

AMS ���� subject classi�cation� Primary ��F � Secondary ��F�����E��� �G��

*Research supported by Stiftelsen Länsförsäkringsbolagens forskningsfond.

1 Introduction

The generalised Pareto distribution (GPD) is widely used for modelling exceedances of a random variable over a high threshold and has proven to be one of the best ways to apply extreme value theory in practice ([�], [�] and [�]). In [��] the theory has been successfully used to model wind storm damage in Sweden. In this kind of application, confidence regions for the parameters are of special interest. In this article we investigate the performance of several bootstrap and likelihood-based methods for constructing confidence intervals for these parameters ������. Maximum likelihood estimation of the parameters of the GPD has previously been studied by Hosking and Wallis [�]. However, they only considered the region $\gamma > -0.5$ ($\gamma$ is the shape parameter in the GPD). In some applications (e.g. wind storm damage) the heavy-tailed region $\gamma < -0.5$ is of principal interest, and this region will be the main area of our study in this article. However, to compare our results with the results reported in the above article, we also consider some values of $\gamma > -0.5$. It should also be mentioned that this only applies to point estimation of the parameters in the GPD. Bootstrap and likelihood-based confidence intervals were not considered in the article, and the performance of these methods for the whole region of $\gamma$ is of interest. It turns out that none of the "standard" methods produces a satisfactory confidence level. Using a general method of D. N. Lawley [9] for approximating the distribution of the likelihood ratio (LR), correction or scaling factors for the likelihood ratio statistics of the parameters and the p-quantile of the GPD have been calculated. Simulation studies have shown that for small samples the only method with acceptable performance is confidence intervals based on the profile likelihood function with the scaling factor incorporated.

By computer simulations, we also compare the performance of several bootstrap methods for estimating the bias, standard error and some other accuracy measures of the maximum likelihood (ML) estimators of the parameters and quantiles of the GPD.

In Section 2 we discuss ML estimation of the parameters of the GPD. Section 2.1 is devoted to bootstrap confidence intervals. In Section 2.2 we discuss confidence intervals based on the profile likelihood function. We also present the calculations giving the scaling factors for $\gamma$, $\sigma$ (the scale parameter in the GPD) and quantiles of the GPD. Section 2.3 briefly discusses some accuracy measures for point estimators of the parameters. Section 3 is devoted to simulation results. Section 3.1 contains simulation results for bootstrap confidence intervals and we continue to likelihood-based intervals in Section 3.2. At the end of this section, the results of the simulations calculated by incorporating the scaling factors into the likelihood-based confidence intervals are presented. In Section 3.3 we discuss results concerning point estimators of the parameters and the ���quantile of the GPD. In order not to make the article too long, we only present some "typical" simulation results in this section. For reference purposes, the complete simulation results are given in an appendix to this report. Section 4 contains a discussion of some aspects of the simulation results concerning confidence intervals for the parameters of the GPD.

2 Parameter Estimation in the Generalised Pareto Distribution

In this section we discuss several methods for constructing confidence intervals for the parameters of the GPD. We will also study some accuracy measures of point estimators of the parameters from a general point of view. We begin with two bootstrap methods. Then we briefly discuss the theory of likelihood-based confidence intervals. As mentioned above, we are mainly interested in the performance of these methods for small samples, so we will consider a method for improving the performance of likelihood-based intervals for small to moderate samples. This can be achieved by incorporating a scaling factor into the likelihood ratio function. The last section is devoted to accuracy measures of point estimators of the parameters.

A random variable X has a GPD, with shape parameter $\gamma$ and scale parameter $\sigma$, if it has the cumulative distribution function

\[
F(x; \gamma, \sigma) = 1 - \left(1 - \frac{\gamma x}{\sigma}\right)^{1/\gamma}. \tag{2.1}
\]

Here $\sigma > 0$ and $\gamma$ are real parameters, and the support of the distribution is $x > 0$ for $\gamma \le 0$ and $0 < x < \sigma/\gamma$ for $\gamma > 0$ (the opposite parametrisation, with $\gamma$ replaced by $-\gamma$, is also used by some authors). For $\gamma = 0$ we interpret $F$ to be the exponential distribution $F(x) = 1 - e^{-x/\sigma}$. The quantile function of the GPD is given by

\[
Q(p) =
\begin{cases}
\dfrac{\sigma}{\gamma}\left(1 - (1 - p)^{\gamma}\right), & \gamma \neq 0, \\[1ex]
-\sigma \ln(1 - p), & \gamma = 0.
\end{cases} \tag{2.2}
\]

Hosking and Wallis [�] studied different methods of parameter estimation in the GPD. They considered the case $\gamma > -0.5$ and, using simulations, showed that for small to moderate sample sizes the method of moments or the probability weighted moments performed better than the ML estimators. Maximum likelihood estimation of $\gamma$ and $\sigma$ was also considered by Smith [��], who showed that for $\gamma < 0.5$, under certain regularity conditions, the ML estimators are asymptotically normal and asymptotically efficient. A maximum likelihood quantile estimator is obtained by substituting the ML estimates $\hat\gamma$ and $\hat\sigma$ into (2.2).
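As a minimal sketch (not the programs used for this paper), the distribution function and the quantile function defined above can be written in Python as follows. The function names `gpd_cdf` and `gpd_quantile` are our own choice for illustration, and the parametrisation assumed is the one above, in which negative $\gamma$ gives the heavy-tailed case.

```python
import math


def gpd_cdf(x, gamma, sigma):
    """F(x) = 1 - (1 - gamma*x/sigma)**(1/gamma); exponential when gamma == 0."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if x <= 0:
        return 0.0
    if gamma == 0:
        return 1.0 - math.exp(-x / sigma)
    base = 1.0 - gamma * x / sigma
    if base <= 0:
        # Only possible for gamma > 0, beyond the upper endpoint sigma/gamma.
        return 1.0
    return 1.0 - base ** (1.0 / gamma)


def gpd_quantile(p, gamma, sigma):
    """Inverse of gpd_cdf for 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    if gamma == 0:
        return -sigma * math.log(1.0 - p)
    return sigma / gamma * (1.0 - (1.0 - p) ** gamma)
```

A plug-in quantile estimate is then simply `gpd_quantile(p, gamma_hat, sigma_hat)` for whatever estimates of the two parameters are available.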

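For a sample $\mathbf{x} = \{x_1, \ldots, x_n\}$, the log-likelihood function is given by

\[
L(\gamma, \sigma; \mathbf{x}) =
\begin{cases}
-n \ln\sigma - \left(1 - \dfrac{1}{\gamma}\right) \displaystyle\sum_{i=1}^{n} \ln\left(1 - \gamma x_i/\sigma\right), & \gamma \neq 0, \\[2ex]
-n \ln\sigma - \dfrac{1}{\sigma} \displaystyle\sum_{i=1}^{n} x_i, & \gamma = 0.
\end{cases} \tag{2.3}
\]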

With the largest observation denoted by $x_{(n)}$, the range for $\sigma$ is $\sigma > 0$ for $\gamma \le 0$ and $\sigma > \gamma x_{(n)}$ for $\gamma > 0$. The ML estimates must be determined numerically because the minimal sufficient statistics for the GPD are the order statistics and there is no obvious simplification of the nonlinear likelihood function in (2.3). However, it is possible to reduce the two-dimensional numerical search for the zeros of the log-likelihood gradient vector to a one-dimensional search, see [�]. Considering the availability of high-quality optimisation software and the complexity of the algorithm in [�], it does not seem that this approach can reduce the programming burden of two-dimensional parameter estimation. Furthermore, it is not clear how this approach can be used to construct likelihood-based confidence intervals for the parameters.
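The log-likelihood and its parameter constraint can be sketched in Python as below. This is only an illustration under our own assumptions: `gpd_loglik` returns minus infinity outside the admissible range of $\sigma$, and `gpd_ml_grid` is a deliberately crude grid search standing in for a proper two-dimensional optimiser; neither is the routine used for the results in this paper.

```python
import math


def gpd_loglik(gamma, sigma, xs):
    """Log-likelihood of the GPD for a sample xs; -inf outside the parameter range."""
    n = len(xs)
    if sigma <= 0:
        return -math.inf
    if gamma == 0:
        return -n * math.log(sigma) - sum(xs) / sigma
    total = 0.0
    for x in xs:
        t = gamma * x / sigma
        if t >= 1.0:
            # Violates sigma > gamma * max(xs), which is needed when gamma > 0.
            return -math.inf
        total += math.log1p(-t)
    return -n * math.log(sigma) - (1.0 - 1.0 / gamma) * total


def gpd_ml_grid(xs, gamma_grid, sigma_grid):
    """Return (gamma, sigma, loglik) maximising the log-likelihood over a grid."""
    best = None
    for g in gamma_grid:
        for s in sigma_grid:
            ll = gpd_loglik(g, s, xs)
            if best is None or ll > best[2]:
                best = (g, s, ll)
    return best
```

In practice the grid maximiser would only serve as a starting point for a numerical optimiser.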

2.1 Bootstrap Confidence Intervals

There are two main bootstrap techniques for constructing a confidence interval for a parameter of a distribution, viz. percentile bootstrap intervals and the bias-corrected and accelerated (BCa) intervals. Both of these methods are discussed in detail in [�], and therefore we give only a very short presentation here. First, some words about notation. Lower case bold letters such as $\mathbf{x}$ denote vectors. A superscript "*" indicates a bootstrap random variable; for example, $\mathbf{x}^{*}$ indicates a bootstrap data set generated from a data set $\mathbf{x}$. Parameters are denoted by Greek letters such as $\gamma$ and $\sigma$. We also use this notation to indicate fixed values of parameters in the simulations. A hat on a letter indicates an estimate, such as $\hat\gamma$. The notation is adopted from [�] and seems to be rather standard in the bootstrap context.

For a confidence level $1 - 2\alpha$, percentile bootstrap intervals are defined by the $\alpha$ and $1 - \alpha$ percentiles of the cumulative distribution function of the bootstrap replications of the statistic of interest. E.g. for $\hat\gamma$, this gives the interval

\[
[\hat\gamma_{\mathrm{lo}}, \hat\gamma_{\mathrm{up}}] = [\hat\gamma^{*(\alpha)}, \hat\gamma^{*(1-\alpha)}], \tag{2.4}
\]

where $\hat\gamma^{*(\alpha)}$ is the $100\alpha$th percentile of the bootstrap distribution of $\hat\gamma^{*}$. If the bootstrap distribution of the statistic is roughly normal, then the standard normal and percentile intervals will nearly agree. Besides non-normality, there are other ways these intervals can fail. Efron and Tibshirani ([�], Chapters �� and ��) suggest a further extension of percentile intervals called bias-corrected and accelerated (BCa) confidence intervals. The BCa interval endpoints are also given by percentiles of the bootstrap distribution, but not necessarily the same ones as in (2.4). The percentiles used depend on two numbers $\hat a$ and $\hat z_0$, called the acceleration and bias-correction. We do not give the formulas here and instead refer the interested reader to [�] for the details.

Another method for constructing a confidence interval for an unknown quantity, say Q�����, is to assume that the relationship between Q����� and its ML estimate �Q����� is identical to the relationship between �Q����� and its bootstrap estimates ��Q�����. That is, Q������Q����� � �Q�������Q�����, so that a confidence interval for an unknown Q����� can be constructed by using the bootstrap replications ��Q�����. This argument is sometimes called the "Russian doll principle", see e.g. [��].

We also study two measures related to confidence intervals, viz. length and symmetry. The length of a confidence interval for $\gamma$ is obviously $\hat\gamma_{\mathrm{up}} - \hat\gamma_{\mathrm{lo}}$. The symmetry is defined as

\[
\text{symmetry} = \frac{\hat\gamma_{\mathrm{up}} - \hat\gamma}{\hat\gamma_{\mathrm{up}} - \hat\gamma_{\mathrm{lo}}},
\]

where $\hat\gamma$ is the point estimate of the parameter. Standard intervals are symmetrical by definition and have symmetry = 0.5.
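The percentile bootstrap interval and the two measures just defined can be sketched as follows. This is a hedged illustration only: the names `percentile_interval` and `length_and_symmetry`, the default number of bootstrap replications and the rank convention for picking percentiles are our assumptions, and the BCa adjustment is not implemented.

```python
import random


def percentile_interval(data, estimator, alpha=0.05, n_boot=1000, seed=0):
    """Percentile bootstrap interval with nominal level 1 - 2*alpha."""
    rng = random.Random(seed)
    n = len(data)
    # Bootstrap replications: re-estimate on n values drawn with replacement.
    reps = sorted(
        estimator([rng.choice(data) for _ in range(n)]) for _ in range(n_boot)
    )
    # 1-based ranks of the alpha and 1 - alpha percentiles among the replications.
    k_lo = max(1, round(alpha * n_boot))
    k_up = min(n_boot, round((1 - alpha) * n_boot))
    return reps[k_lo - 1], reps[k_up - 1]


def length_and_symmetry(lower, upper, point_estimate):
    """Length of an interval and the share of it lying above the point estimate."""
    length = upper - lower
    return length, (upper - point_estimate) / length
```

Any estimator that maps a list of observations to a number can be passed in, for example a function returning the ML estimate of the shape parameter.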

2.2 Likelihood-based Confidence Intervals

The likelihood ratio (LR) test is defined as follows. For a parameter space $\Theta$, a hypothesis $H_0$ places constraints on some of the parameters, i.e. $\theta \in \Theta_0$ ($\Theta_0$ is a subspace of $\Theta$). Given an observation $\mathbf{x} = (x_1, \ldots, x_n)$ of n independent and identically distributed (i.i.d.) random variables with distribution function $F$ and density $f(x; \theta) = \frac{d}{dx} F(x; \theta)$, the likelihood function is

\[
L(\theta) = \prod_{i=1}^{n} f(x_i \mid \theta).
\]

If the maximum of $L$ over all of $\Theta$ is $L(\hat\theta)$ and in the subspace $\Theta_0$ is $L_0(\hat\theta)$, then the likelihood ratio is

\[
\lambda = \frac{L_0(\hat\theta)}{L(\hat\theta)}, \qquad 0 \le \lambda \le 1.
\]

Standard large-sample theory gives that, under suitable regularity conditions, $-2 \log\lambda$ is asymptotically distributed as $\chi^2_r$, where r is the number of restrictions imposed by $H_0$. It should be emphasised that this approximation is known to hold for large sample sizes, but it sometimes gives an inaccurate approximation for small to moderate sample sizes.

Confidence intervals for a parameter may be obtained directly from the likelihood function by inverting the likelihood ratio test. E.g. if $H_0$ is a simple hypothesis specifying the value of just one parameter, say $\theta = \theta_0$, then the hypothesis is rejected at the level $\alpha$ if

\[
l(\theta_0) < l(\hat\theta) - \tfrac{1}{2}\chi^2_{1,\alpha},
\]

where $l(\hat\theta)$ denotes the unrestricted maximum of the log-likelihood function, $l(\theta_0)$ is the restricted maximum, and $\chi^2_{1,\alpha}$ is the upper $\alpha$ quantile of the $\chi^2$ distribution with one degree of freedom. Thus $\theta = \theta_0$ is not rejected if the log-likelihood evaluated at $\theta_0$ is not more than $\tfrac{1}{2}\chi^2_{1,\alpha}$ units less than the maximum of the log-likelihood at $\hat\theta$. The values of $\theta_0$ satisfying this requirement determine a $100(1 - \alpha)\%$ likelihood confidence interval for $\theta$.
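To make the inversion concrete, here is a small sketch for the one-parameter special case of the GPD with shape parameter equal to zero, i.e. the exponential distribution with scale $\sigma$. The restriction to this case, the bisection root finder and all function names are our assumptions for illustration; the two-parameter GPD would additionally require maximising over the nuisance parameter at each candidate value.

```python
import math
from statistics import NormalDist


def chi2_1_quantile(prob):
    """Quantile of the chi-square distribution with 1 df (square of a normal quantile)."""
    z = NormalDist().inv_cdf(0.5 * (1.0 + prob))
    return z * z


def exp_loglik(sigma, xs):
    """Log-likelihood of the exponential distribution with scale sigma."""
    return -len(xs) * math.log(sigma) - sum(xs) / sigma


def _root(f, a, b, iters=200):
    """Bisection for a sign change of f between a and b."""
    sign_a = f(a) > 0
    for _ in range(iters):
        mid = 0.5 * (a + b)
        if (f(mid) > 0) == sign_a:
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)


def lr_interval_sigma(xs, alpha=0.05):
    """Likelihood-ratio interval for sigma at level 1 - alpha (positive data assumed)."""
    sigma_hat = sum(xs) / len(xs)
    cutoff = exp_loglik(sigma_hat, xs) - 0.5 * chi2_1_quantile(1.0 - alpha)

    def g(s):
        return exp_loglik(s, xs) - cutoff

    lo, up = sigma_hat, sigma_hat
    while g(lo) > 0:  # walk outwards until the log-likelihood drops below the cutoff
        lo *= 0.5
    while g(up) > 0:
        up *= 2.0
    return _root(g, lo, sigma_hat), _root(g, sigma_hat, up)
```

Unlike a symmetric interval based on the information matrix, the interval returned here is typically longer on the upper side of the estimate.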

It is worth mentioning that if the log-likelihood function is quadratic in the parameter, then the confidence intervals based on the information matrix are identical to the likelihood-based confidence intervals. This means that the $\delta$-method for constructing confidence intervals is expected to perform best when the log-likelihood is quadratic about the ML estimate. However, a more accurate picture of the uncertainty in the parameter estimates can often be obtained by examining the likelihood function directly. Although this may be difficult in higher dimensions, it is often feasible to look at one-dimensional projections, or profiles, of the likelihood function. To do this, we choose a parameter, say $\theta$, and fix it to a value different from its ML estimate. Then we maximise the likelihood function with respect to the remaining parameters. By repeating this procedure, we can get a plot which shows the profile of the likelihood function.

It is common to convert the profile likelihood function to the "signed square-root" scale. The main reason for this is that it is easier to evaluate deviations from straight-line behaviour than deviations from quadratic behaviour. If we denote the value of the profile log-likelihood function by $l_p$ and the chosen parameter by $\theta$, the converted profile function is

\[
\tau = \tau(\theta) = \operatorname{sign}(\theta - \hat\theta)\,\sqrt{l(\hat\theta) - l_p(\theta)}.
\]

As mentioned above, the asymptotic $\chi^2_r$ distribution does not always perform well for small samples. Lawley [9] suggests a general method for approximating the distribution of the LR. The main idea is to obtain a corrected statistic, say LR*, which is distributed as $\chi^2_r$ when terms of order $O(n^{-2})$ and smaller are neglected. In the general case, Lawley assumes that the likelihood function depends upon $p + q$ population parameters and considers testing the composite hypothesis $H_0$ that $\theta_{p+1}, \theta_{p+2}, \ldots, \theta_{p+q}$ have specified values while the other parameters are unspecified and unknown. He shows that the LR criterion for testing $H_0$ has mean $q + \varepsilon_{p+q} - \varepsilon_p + O(n^{-2})$, where the $\varepsilon_k$ are of order $O(n^{-1})$ and are defined in equation (�) of the article. He also shows that

\[
\mathrm{LR}^{*} = \left\{1 - \frac{1}{q}\left(\varepsilon_{p+q} - \varepsilon_p\right)\right\}(-2 \log\lambda)
\]

has the same moments as $\chi^2$ with q degrees of freedom, neglecting quantities of order $O(n^{-2})$.
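Applying the correction is a one-line rescaling of the usual statistic. The helper below is a sketch under the assumption that the correction terms have already been computed elsewhere; its name and argument names are hypothetical, and the numerical values in the usage note are made up purely to show the arithmetic.

```python
def lawley_corrected_lr(lr_statistic, eps_p_plus_q, eps_p, q=1):
    """Rescale -2 log(lambda) so that its mean matches chi-square with q df to O(1/n)."""
    return (1.0 - (eps_p_plus_q - eps_p) / q) * lr_statistic
```

For instance, with a difference of 0.2 between the two correction terms and q = 1, an observed statistic of 4.0 is scaled down to about 3.2 before being compared with the chi-square quantile.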

In the GPD there are only 2 parameters, $\gamma$ and $\sigma$. Thus we only need to calculate $\varepsilon_1$ and $\varepsilon_2$, where $\varepsilon_1$ must be calculated separately for $\gamma$ and $\sigma$. This involves calculating the expected values of the first four derivatives of the log-likelihood function (2.3). We also have to calculate the inverse of the expected value of the Hessian of the log-likelihood function. The procedure for computing all of the terms in equation (�) of Lawley's article is lengthy and complicated, and here we just present the results of the calculations. For more details we refer to [��], where we discuss the design and some details of the calculations. At the end of this section we provide a pointer to where the programs can be obtained.

It should be mentioned that there is a misprint in equation (�) of Lawley's article. There is only one quadratic term in that equation. However, it seems that the power 2 of this term should be removed. There are some intuitive reasons for this. For example, it is easy to see that simplifying the equation with this quadratic term results in a constant and a term of $O(n^{-1})$. Obviously the $\varepsilon_k$ are of $O(n^{-1})$, which implies that all the terms, after simplification, must be of $O(n^{-1})$. This can also be seen by comparing equations (�) and (�) of the paper. We have not checked the details of the calculations which lead to equation (�), but all the following results have been calculated with the revised version of the equation.

We obtained the following results by direct evaluation of equation ���

�� ����� ��� � � ��� �� � �� �� � � ��

� ��� � �� ��� � � ��� n

���� ��� � �� � � �� �� � �� �� � �� ��

� ��� � � ��� ��� � � �� n

���� ��� � �� � � �� �� � � ��

��� � � ������ � � �� n

With �� � �� � ���� and �� � �� � ���� � we hence get

�� ���� � ��� � � ��� �� � ��� ��

� ��� � � ��� ��� � � �� n�����

�� ���� � �� � � ��� �� � �� �� � �� ��

� ��� � � ��� ��� � � �� n� �����

Figure 1: Plot of $\varepsilon_\gamma$. [Surface plot; axes: gamma (from -1 to 0), n (from 20 to 100) and epsgamma.]

Figure 2: Plot of $\varepsilon_\sigma$. [Surface plot; axes: gamma (from -1 to 0), n (from 20 to 100) and epssigma.]

Figures 1 and 2 show plots of $\varepsilon_\gamma$ and $\varepsilon_\sigma$.

In some applications, confidence intervals for upper quantiles of the GPD are of major interest. It is possible to reparametrise the GPD with the upper p-quantile as a parameter and then use the methods discussed above. E.g. such a reparametrisation is

\[
F(x; \gamma, x_p) = 1 - \left(1 - (1 - p^{\gamma})\,\frac{x}{x_p}\right)^{1/\gamma}.
\]

In this equation, for each p, $x_p$ gives the upper p-quantile of the GPD. An ML estimate of the upper p-quantile can now be calculated directly. Using this reparametrisation we calculated the scaling factor needed for constructing likelihood-based confidence intervals for the upper p-quantile. The resulting factor $\varepsilon_{x_p}$ is too long to be presented here, but mathematically it can simply be used in the same way as the scaling factors (2.5) and (2.6) to improve the confidence interval for the upper p-quantile. Figure 3 shows $\varepsilon_{x_p}$ for p = ���. As mentioned earlier, some details of these calculations are discussed in [��]. Programs for calculating the scaling factor $\varepsilon_{x_p}$ as well as $\varepsilon_\gamma$ and $\varepsilon_\sigma$ (in Mathematica, Fortran and C) are available at http://www.math.chalmers.se/~nader/software.html.
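The reparametrisation can be checked numerically with a short sketch. The functions below are illustrative only (the names `gpd_cdf_xp` and `sigma_from_xp` are ours); they assume that the upper p-quantile is the point exceeded with probability p, and they treat a zero shape parameter as the exponential limit.

```python
import math


def gpd_cdf_xp(x, gamma, x_p, p):
    """GPD distribution function parametrised by the shape and the upper p-quantile x_p."""
    if x <= 0:
        return 0.0
    if gamma == 0:
        return 1.0 - p ** (x / x_p)  # exponential limit
    base = 1.0 - (1.0 - p ** gamma) * x / x_p
    if base <= 0:
        return 1.0  # beyond the upper endpoint, possible only for gamma > 0
    return 1.0 - base ** (1.0 / gamma)


def sigma_from_xp(gamma, x_p, p):
    """Scale parameter implied by the shape and the upper p-quantile."""
    if gamma == 0:
        return -x_p / math.log(p)
    return gamma * x_p / (1.0 - p ** gamma)
```

By construction `gpd_cdf_xp(x_p, gamma, x_p, p)` equals 1 - p, which gives a quick sanity check on any implementation of the reparametrised likelihood.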

2.3 Accuracy Measures for ML Estimators

In this section we discuss some accuracy measures of the ML estimators of the parameters and quantiles of the GPD. These estimates are usually obtained by repeated calculation of a parameter estimate based on resampling from the original sample. Generally, for a parameter $\theta$, the bootstrap procedure is to take m samples with replacement from the original sample $\mathbf{x}$ and assess the variability of an estimator $\hat\theta$ about the true $\theta$ by the variability of $\hat\theta^{*}(j)$, $j = 1, \ldots, m$, about $\hat\theta$, where $\hat\theta^{*}(j)$ is the value of the estimator in the jth bootstrap sample. With this notation, the bootstrap estimate of the bias of $\hat\theta$ is simply

\[
\widehat{\mathrm{bias}}(\hat\theta)_{\mathrm{boot}} = \hat\theta^{*}(\cdot) - \hat\theta, \tag{2.7}
\]

where $\hat\theta^{*}(\cdot) = \sum_{j=1}^{m} \hat\theta^{*}(j)/m$.

Figure 3: Plot of $\varepsilon_{x_p}$. [Surface plot; axes: gamma (from -1 to 0), n (from 20 to 100) and epsxp.]

The jackknife estimate of the bias is defined as follows. Given a data set $\mathbf{x}$, the ith jackknife sample $\mathbf{x}_{(i)}$ is defined to be $\mathbf{x}$ with the ith data point removed,

\[
\mathbf{x}_{(i)} = (x_1, x_2, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n), \qquad i = 1, 2, \ldots, n.
\]

The ith jackknife replication $\hat\theta_{(i)}$ of the statistic $\hat\theta$ is evaluated for each $\mathbf{x}_{(i)}$. The jackknife estimate of bias is then

\[
\widehat{\mathrm{bias}}(\hat\theta)_{\mathrm{jack}} = (n - 1)\left(\hat\theta_{(\cdot)} - \hat\theta\right), \tag{2.8}
\]

where $\hat\theta_{(\cdot)} = \sum_{i=1}^{n} \hat\theta_{(i)}/n$.

One can also use simulated samples to estimate the real bias of an estimator. In this case we generate l independent random samples from the distribution function with a fixed value of $\theta$ and from each sample estimate $\hat\theta(k)$, $k = 1, \ldots, l$. This gives the following estimate of bias

\[
\widehat{\mathrm{bias}}(\hat\theta) = \hat\theta(\cdot) - \theta, \tag{2.9}
\]

where $\hat\theta(\cdot) = \sum_{k=1}^{l} \hat\theta(k)/l$.

The bootstrap estimate of the standard error is simply the standard deviation of the bootstrap replications, i.e.

\[
\widehat{\mathrm{se}}(\hat\theta)_{\mathrm{boot}} = \sqrt{\sum_{j=1}^{m} \left(\hat\theta^{*}(j) - \hat\theta^{*}(\cdot)\right)^2 / (m - 1)}. \tag{2.10}
\]

The jackknife estimate of the standard error is similarly defined as

\[
\widehat{\mathrm{se}}(\hat\theta)_{\mathrm{jack}} = \sqrt{\frac{n - 1}{n} \sum_{i=1}^{n} \left(\hat\theta_{(i)} - \hat\theta_{(\cdot)}\right)^2}, \tag{2.11}
\]

where n is the sample size in the simulations.

In a similar way, simulated samples can be used to obtain an estimate of the real standard error, $\widehat{\mathrm{se}}(\hat\theta)$.

For sufficiently large sample sizes, the asymptotic variance of the estimators can be used. For the GPD, Smith [��] showed that the ML estimators of the parameters are asymptotically normal with

\[
n \operatorname{var}\begin{pmatrix} \hat\gamma \\ \hat\sigma \end{pmatrix} \sim
\begin{pmatrix}
(1 - \gamma)^2 & \sigma(1 - \gamma) \\
\sigma(1 - \gamma) & 2\sigma^2(1 - \gamma)
\end{pmatrix} \tag{2.12}
\]

as $n \to \infty$.

To obtain an aggregate measure of performance for point estimators, the bias and standard error of the estimators may be combined in a single criterion. The coefficient of variation (cv) is the ratio between bias and standard error. Small values of cv indicate that the bias of the estimators can be ignored. This is of practical importance for confidence intervals and will be discussed in the next section. Another combined measure of accuracy is the root mean square error (rmse). For example, for $\hat\gamma$ it is defined to be

\[
\sqrt{\mathrm{se}(\hat\gamma)^2 + \mathrm{bias}(\hat\gamma)^2}. \tag{2.13}
\]
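The bootstrap and jackknife estimates of bias and standard error described in this section can be sketched in a few lines of Python. The function names, the default number of bootstrap replications and the use of Python's own random number generator are our assumptions; the estimator is passed in as any function mapping a list of observations to a number.

```python
import math
import random


def bootstrap_bias_se(data, estimator, m=500, seed=0):
    """Bootstrap estimates of bias and standard error from m resamples."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = estimator(data)
    reps = [estimator([rng.choice(data) for _ in range(n)]) for _ in range(m)]
    mean_rep = sum(reps) / m
    se = math.sqrt(sum((r - mean_rep) ** 2 for r in reps) / (m - 1))
    return mean_rep - theta_hat, se


def jackknife_bias_se(data, estimator):
    """Jackknife estimates of bias and standard error (data must be a list)."""
    n = len(data)
    theta_hat = estimator(data)
    # Leave-one-out replications of the estimator.
    reps = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    mean_rep = sum(reps) / n
    bias = (n - 1) * (mean_rep - theta_hat)
    se = math.sqrt((n - 1) / n * sum((r - mean_rep) ** 2 for r in reps))
    return bias, se


def rmse(bias, se):
    """Root mean square error combining bias and standard error."""
    return math.sqrt(se ** 2 + bias ** 2)
```

As a sanity check, the jackknife applied to the sample mean gives zero bias and the familiar standard error, the sample standard deviation divided by the square root of n.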

3 Simulation Results

As discussed in the introduction, our main emphasis is on small to moderate sample sizes and the heavy-tailed region with negative $\gamma$. Simulations were performed for sample sizes $n = 20, 30, 40, 50, 75, 100, 150, 200$ and with $\gamma$ taking the values � � ������������������ ���. Maximum likelihood estimators are invariant under scale changes, so without loss of generality $\sigma$ was set to 1 in all simulations. For each combination of $\gamma$ and $n$, ��� random samples from the GPD were generated. For each sample $x$, we calculated ML estimates of the parameters and of the ����quantile of the GPD. These estimates are denoted by $\hat{\gamma}$, $\hat{\sigma}$ and $\hat{Q}$�����. For each sample, ���� independent bootstrap samples $x^{*1}, x^{*2}, \ldots$ were generated, each consisting of $n$ (the sample size in the simulations) data values drawn with replacement from $x$. For each bootstrap sample, we calculated ML estimates of the parameters and the quantile, $\hat{\gamma}^{*}$, $\hat{\sigma}^{*}$ and $\hat{Q}^{*}$�����.
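The sampling scheme can be sketched as follows. This is an illustration under stated assumptions rather than the original S-plus code: the GPD is assumed to be parameterised as $F(x) = 1 - (1 - \gamma x/\sigma)^{1/\gamma}$, samples are drawn by inverting this distribution function, and the function names, the seed and the number of bootstrap samples are hypothetical. The ML fitting step is omitted.

```python
import math
import random

def gpd_quantile(p, gamma, sigma):
    """Quantile function, assuming F(x) = 1 - (1 - gamma*x/sigma)**(1/gamma)."""
    if gamma == 0.0:
        return -sigma * math.log(1.0 - p)  # exponential limit
    return sigma * (1.0 - (1.0 - p) ** gamma) / gamma

def gpd_sample(n, gamma, sigma, rng):
    # rng.random() lies in [0, 1), so 1 - p stays strictly positive.
    return [gpd_quantile(rng.random(), gamma, sigma) for _ in range(n)]

def bootstrap_samples(x, m, rng):
    # Each bootstrap sample has the same size as x and is drawn with replacement.
    return [rng.choices(x, k=len(x)) for _ in range(m)]

rng = random.Random(1)
x = gpd_sample(20, gamma=-1.0, sigma=1.0, rng=rng)
boot = bootstrap_samples(x, m=200, rng=rng)
print(len(x), len(boot), len(boot[0]))
```

An estimator applied to every element of `boot` yields the bootstrap replicates used in the rest of the section.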

In the following sections we will present and discuss the simulation results. The tables which contain complete results of simulations are presented in an appendix to this report. Hence, when we mention a table we actually refer to the tables in the Appendix. A Postscript version of the Appendix is also available at http://www.math.chalmers.se/~nader/software/appendix.ps.

It should be mentioned that the main part of these simulations has been done in S-plus [1], [2]. In ���� we discuss some details of the calculations. A complete source of all of the programs can be obtained from http://www.math.chalmers.se/~nader/software.html. In this kind of application, it is important to understand the internals of the pseudo-random number generator. The current generator in S-plus is an implementation of George Marsaglia's "Super-Duper" generator; see [8], [12] for technical details. In S-plus the random seed is stored as a vector of 12 integers. For most starting seeds the period of the generator is $2^{30}(2^{32}-1) \approx 4.6 \times 10^{18}$, which is quite sufficient for our purposes.

��� Simulation Results of Bootstrap Confidence Intervals

In this section we present the results concerning bootstrap confidence intervals. We discuss the case $\gamma = -1$ in detail and refer to the tables in the Appendix for the other $\gamma$-values. Figure 4 compares the percentage of misses to the right, i.e. the percentage of samples where the interval did not cover the real value, for the percentile and BCa confidence intervals for $\gamma$. In the figure, the confidence level is 95% and the dashed line shows the nominal percentage of misses, i.e. 5%. The $\gamma$-value in the simulations is $-1$. Note that these are based on ��� confidence intervals and that for the confidence level 95%, the standard deviation of the percentage of misses due to sampling is about �%. As mentioned earlier, the percentile interval works well when the underlying distribution of the statistic is roughly normal. Figure 5 shows a histogram of ���� bootstrap estimates of $\gamma$ with the simulated value of $\gamma = -1$.
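The sampling variability of an observed percentage of misses follows from the binomial distribution. A minimal sketch, in which the number of simulated intervals is left as a parameter because the count is not legible in this copy (the values tried below are hypothetical):

```python
import math

def miss_percentage_sd(nominal_miss, n_intervals):
    """Standard deviation, in percentage points, of an observed miss rate."""
    return 100.0 * math.sqrt(nominal_miss * (1.0 - nominal_miss) / n_intervals)

for n_intervals in (100, 500, 1000):  # hypothetical numbers of simulated intervals
    print(n_intervals, round(miss_percentage_sd(0.05, n_intervals), 2))
```

Observed miss percentages that differ from the nominal value by less than about two such standard deviations are compatible with correct coverage.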

It is clear from the figures that, due to skewness in the underlying distribution of $\hat{\gamma}$, percentile intervals are unable to catch the true value of $\gamma$. The BCa interval achieves better balance between the left and right sides, but like the percentile interval it still undercovers overall (especially for small sample sizes). Figure 6 shows the percentage of misses of two-sided confidence intervals for $\gamma$ in all sample sizes. Figure 7 presents the mean length of ��� confidence intervals based on percentile and BCa methods for $\gamma = -1$. The BCa intervals are slightly longer than percentile intervals. Figure 8 gives the mean symmetry of the confidence intervals for $\gamma$ with confidence level ��. As we see, BCa intervals are more symmetric than percentile intervals but generally they are longer on the left side.

Figure 4: The percentage of misses of a one-sided 95% upper-limited bootstrap confidence interval for $\gamma$ in ��� simulated samples. The number on the top of each bar shows the sample size in the simulations.
[Bar chart for gamma = −1 comparing the bias corrected and percentile intervals at sample sizes 20, 30, 40, 50, 75, 100, 150 and 200.]

Figure 5: Histogram of ���� bootstrap estimates of $\gamma$ from a sample with $\gamma = -1$.
[Histogram, panel title "n=20, gamma=−1".]
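For orientation, the two bootstrap intervals compared here can be sketched as below. The percentile interval reads off empirical quantiles of the bootstrap replicates, while the BCa interval first adjusts the quantile levels by a bias correction `z0` and an acceleration `a`, following the standard construction in Efron and Tibshirani [4]. This is an illustrative sketch only: the function names, the nearest-rank quantile convention and the example data are assumptions, and it is not the code used for the simulations.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def empirical_quantile(sorted_values, q):
    # Nearest-rank style lookup; other quantile conventions are equally defensible.
    index = math.floor(q * (len(sorted_values) - 1) + 0.5)
    return sorted_values[min(len(sorted_values) - 1, max(0, index))]

def percentile_interval(boot_estimates, alpha):
    s = sorted(boot_estimates)
    return empirical_quantile(s, alpha / 2), empirical_quantile(s, 1 - alpha / 2)

def bca_adjusted_levels(z0, a, alpha):
    """Quantile levels used by BCa in place of alpha/2 and 1 - alpha/2."""
    def adjust(level):
        z = STD_NORMAL.inv_cdf(level)
        return STD_NORMAL.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))
    return adjust(alpha / 2), adjust(1 - alpha / 2)

def bca_interval(boot_estimates, theta_hat, jackknife_estimates, alpha):
    s = sorted(boot_estimates)
    below = sum(1 for t in s if t < theta_hat) / len(s)
    if not 0 < below < 1:
        raise ValueError("bias correction undefined: estimate lies outside the replicates")
    z0 = STD_NORMAL.inv_cdf(below)  # bias correction
    centre = sum(jackknife_estimates) / len(jackknife_estimates)
    d = [centre - t for t in jackknife_estimates]
    # Acceleration from the leave-one-out estimates (these must not all be equal).
    a = sum(v ** 3 for v in d) / (6 * sum(v ** 2 for v in d) ** 1.5)
    lo, hi = bca_adjusted_levels(z0, a, alpha)
    return empirical_quantile(s, lo), empirical_quantile(s, hi)

replicates = list(range(101))  # hypothetical bootstrap replicates
print(percentile_interval(replicates, 0.1))
print(bca_interval(replicates, 50, [1.0, 2.0, 3.0], 0.1))
```

With `z0 = 0` and `a = 0` the adjusted levels reduce to `alpha/2` and `1 - alpha/2`, so the BCa interval coincides with the percentile interval when the bootstrap distribution is unbiased and symmetric.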

Figure 6: Barplot of percentage of misses of two-sided confidence intervals for $\gamma$. The dashed lines show the nominal percentage of misses for the groups with confidence levels 90% and 95%.
[Bar chart for gamma = −1 with bars for n = 20, 30, 40, 50, 75, 100, 150 and 200, grouped by method (bias corrected, percentile) and confidence level (90%, 95%).]

Figure 7: Mean length of bootstrap confidence intervals for $\gamma$ in ��� simulations. The number on the top of each bar shows the sample size in the simulations.
[Bar chart for gamma = −1 comparing the bias corrected and percentile intervals.]

For $\sigma$ the pattern is almost the same. Figure 9 compares the performance of both bootstrap methods for two-sided confidence intervals for $\sigma$. It is clear that BCa outperforms the percentile confidence interval. Figure 10 shows that the mean length of BCa intervals is shorter than that of the corresponding percentile intervals. For each bootstrap estimate of $\hat{\gamma}$ and $\hat{\sigma}$, equation (���) gives an estimate of $Q$�����.

Figure 8: Mean symmetry of confidence intervals for $\gamma$ in ��� simulations of a sample with simulated value of $\gamma = -1$. The dashed line shows complete symmetry (���). The number on the top of each bar shows the sample size in the simulations.
[Bar chart for gamma = −1 comparing the bias corrected and percentile intervals.]

Figure 11 shows the percentage of misses in two-sided confidence intervals for $Q$�����. Again, BCa outperforms the percentile confidence interval, but neither is particularly good. It is interesting to note that, as shown in Figure 12, the mean length of BCa intervals for $Q$����� is longer than that of the corresponding percentile intervals.

Table 1 gives the percentage of misses in ��� simulated confidence intervals for all values of $\gamma$. Table 2 shows the percentage of misses on the left side of intervals. As seen in the table, the percentile bootstrap interval overcovers to the right and undercovers to the left. As mentioned above, BCa intervals undercover overall. This is seen in Table 3. Table 4 gives the mean length of ��� confidence intervals based on percentile and BCa methods. Table 5 shows the mean symmetry of the confidence intervals in the simulations.

For $\sigma$ the pattern is almost the same. Tables 6 to 10 give the corresponding simulation results for $\hat{\sigma}$. Table � shows that the BCa intervals have better coverage for $\sigma$ than for $\gamma$. It is clear in Table 9 that, for all sample sizes, BCa intervals for $\sigma$ are shorter than percentile intervals.

Tables 11 to 15 give the corresponding simulation results for $\hat{Q}$�����.

Figure 9: Barplot of percentage of misses of bootstrap confidence intervals for $\sigma$. The dashed lines show the nominal percentage of misses for the groups with confidence levels 90% and 95%.
[Bar chart for gamma = −1 with bars for n = 20, 30, 40, 50, 75, 100, 150 and 200, grouped by method (bias corrected, percentile) and confidence level (90%, 95%).]

Figure 10: Mean length of bootstrap confidence intervals for $\sigma$ in ��� simulations of a sample with $\gamma = -1$. The number on the top of each bar shows the sample size in the simulations.
[Bar chart for gamma = −1 comparing the bias corrected and percentile intervals.]

Figure 11: Barplot of percentage of misses of bootstrap confidence intervals for $Q$�����. The dashed lines show the nominal percentage of misses for the groups with confidence levels 90% and 95%.
[Bar chart for gamma = −1 with bars for n = 20, 30, 40, 50, 75, 100, 150 and 200, grouped by method (bias corrected, percentile) and confidence level (90%, 95%).]

Figure 12: Mean length of bootstrap confidence intervals for $Q$����� in ��� simulations of a sample with $\gamma = -1$. The number on the top of each bar shows the sample size in the simulations.
[Bar chart for gamma = −1 comparing the bias corrected and percentile intervals.]

��� Simulation Results of Likelihood-based Confidence Intervals

In this section we study likelihood-based confidence intervals. For the same simulated data as in the previous section, we constructed confidence intervals based on the profile likelihood function.
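To make the notion of a profile likelihood concrete, the sketch below evaluates the GPD log-likelihood and profiles out the scale parameter over a grid. It assumes the parameterisation $F(x) = 1 - (1 - \gamma x/\sigma)^{1/\gamma}$; the grid search is a deliberately crude stand-in for a numerical optimiser, and the function names and data are hypothetical.

```python
import math

def gpd_loglik(gamma, sigma, xs):
    """Log-likelihood, assuming density (1/sigma) * (1 - gamma*x/sigma)**(1/gamma - 1)."""
    if sigma <= 0:
        return -math.inf
    n = len(xs)
    if gamma == 0.0:
        return -n * math.log(sigma) - sum(xs) / sigma  # exponential limit
    total = -n * math.log(sigma)
    for x in xs:
        y = 1.0 - gamma * x / sigma
        if y <= 0:
            return -math.inf  # observation outside the support
        total += (1.0 / gamma - 1.0) * math.log(y)
    return total

def profile_loglik_gamma(gamma, xs, sigma_grid):
    # Profile log-likelihood for gamma: maximise over sigma on the supplied grid.
    return max(gpd_loglik(gamma, s, xs) for s in sigma_grid)

xs = [0.5, 1.2, 3.0]                         # hypothetical observations
grid = [0.1 * k for k in range(1, 101)]      # sigma from 0.1 to 10.0
for g in (-1.0, -0.5, -0.2):
    print(g, round(profile_loglik_gamma(g, xs, grid), 4))
```

A likelihood-based interval for $\gamma$ then collects the values of $\gamma$ whose profile log-likelihood lies within a fixed distance of its maximum.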

Before presenting the results, it should be mentioned that for some samples it is not possible to construct likelihood-based confidence intervals for the parameters in the GPD. The range for $\sigma$ is $\sigma > 0$ for $\gamma \le 0$ and $\sigma > \gamma x_{(n)}$ ($x_{(n)}$ the largest observation) for $\gamma > 0$. In profiling $\gamma$, when we decrease $\gamma$ the ML estimate of $\sigma$ also decreases. This happens because we try to fit a long-tailed distribution to a dataset which actually comes from a distribution with a shorter tail than the fitted one. As we mentioned above, for $\gamma > 0$, $\sigma$ has the lower bound $\gamma x_{(n)}$. This means that $\sigma$ cannot get arbitrarily small and when, for a particular sample, it reaches its lower limit, $\gamma$ cannot decrease any more, so that a confidence interval for $\gamma$ cannot be constructed for that sample. In profiling $\sigma$ the opposite happens. In this case, increasing $\sigma$ results in an increase in $\gamma$, which is restricted from above by $\sigma / x_{(n)}$. It is also clear from (���) that ML estimates do not exist for $\gamma > 1$, since in this case the likelihood function can get arbitrarily large by letting $\sigma / \gamma \to x_{(n)}$. A third limitation appears when we use the scaling factors calculated in Section ���. As discussed there, equations (���) and (���) involve calculating the expected value of the fourth derivative of the log-likelihood function. These integrals are convergent only for $\gamma < 0.25$. This implies that in the process of profiling the likelihood function for constructing confidence intervals for $\gamma$, if we need to go further than $\gamma = 0.25$ to the right, we will not be able to find an upper limit for $\gamma$. These outcomes have been called "not-availables" in the simulation results.
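Why a convergence bound of this kind arises can be seen from a short calculation. The sketch assumes the parameterisation $F(x) = 1 - (1 - \gamma x/\sigma)^{1/\gamma}$, which is consistent with the parameter ranges quoted above, and it only tracks the highest negative power that a fourth derivative of the log-likelihood can produce. If $U$ is uniform on $(0,1)$, then $X = \sigma(1 - U^{\gamma})/\gamma$ has this distribution, so that $1 - \gamma X/\sigma = U^{\gamma}$ and

\[
E\left[ \left( 1 - \frac{\gamma X}{\sigma} \right)^{-r} \right] = \int_{0}^{1} u^{-r\gamma} \, du = \frac{1}{1 - r\gamma}, \qquad r\gamma < 1,
\]

while the integral diverges for $r\gamma \ge 1$. Terms of order $r = 4$ therefore have finite expectation only when $\gamma < 1/4$, and terms of order $r = 2$, which enter the asymptotic variance, only when $\gamma < 1/2$.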

Figure 13 shows the profile likelihoods and the resulting confidence intervals in a simulated sample with $\gamma = -0.6$ and $\sigma = 1$. The confidence interval fails to cover the true value of $\gamma$. Figure 14 shows the percentage of misses in one-sided profile likelihood confidence intervals. As seen, the coverage of the intervals is not satisfactory for small sample sizes. This is the case for sample size �� for all values of $\gamma$. The reason obviously is the asymptotic approximation of �. It is thus important to consider the corrections discussed in Section ���, which can be applied to improve the finite-sample performance.

Using the scaling factors obtained in Section ���, we repeated the calculations for likelihood-based confidence intervals. Figure 15 shows the corrected likelihood-based confidence interval for the same sample as in Figure 13. Note that now the true value of $\gamma$ is included in the interval. Figure 16 compares the results of likelihood-based intervals for the case $\gamma = -1$. Figure 16(a) shows that, for sample sizes �� and ��, one-sided lower-limited 97.5% corrected profile likelihood intervals perform slightly better than ordinary profile intervals, but for other sample sizes the correction has no effect.

Figure 13: Profile likelihood functions and confidence intervals for $\gamma$ and $\sigma$ in a simulated sample with $\gamma = -0.6$ and $\sigma = 1$.
[Two panels, each titled "Two sided confidence level = 0.95", plotting zeta against gamma and against sigma. The marked interval endpoints are −0.595 and −0.14 for gamma, and 0.945 and 1.694 for sigma.]

Figure 16(b) compares the results for upper-limited 97.5% intervals. As seen, except for sample size ��, corrected profile likelihood intervals perform better than ordinary profile confidence intervals, but there does not seem to be any significant difference between the methods. Figure 16(c) shows again that for two-sided 95% intervals, except for sample size ��, corrected profile confidence intervals perform a little better than the uncorrected likelihood-based intervals. Figure 16(d) compares the mean length of the confidence intervals for $\gamma$.

Due to the reasons discussed above, the corrected profile likelihood method performs best for lower-limited confidence intervals. Although the difference in Figure ��(a) is not large, the improvement can best be seen by comparing Figure ��(a) with Figure ��(a). As seen there, the corrected profile likelihood method somewhat improves the empirical coverage of the intervals for all sample sizes. Again, for the same reasons, the corrected profile likelihood intervals fail much more often than the nominal value for upper-limited intervals, see Figure ��(b). This has a direct effect on the empirical coverage of two-sided intervals, as shown in Figure ��(c).

Figure 14: Percentage of misses of profile likelihood confidence intervals for $\gamma$ in ��� samples with $\gamma = -0.2$. The dashed lines show the nominal percentage of misses.
[Bar chart for gamma = −0.2 comparing lower-limited and upper-limited intervals at sample sizes 20, 30, 40, 50, 75, 100, 150 and 200.]

Figure 18 shows the corresponding results for $\sigma$. Again, there is some improvement in the coverage for lower-limited intervals. Figure 18(c) shows that the percentages of misses of the intervals are around their expected values for both methods. Figure 18(d) shows that, for small samples, the mean lengths of the corrected profile likelihood confidence intervals for $\sigma$ are slightly longer than those of ordinary profile likelihood intervals.

It is interesting to see if higher values of $\gamma$ have the same effect on the empirical coverage probability of intervals for $\sigma$. Figure 19(a) shows the results for $\gamma$ = ����. As seen, there is still an improvement in coverage of lower-limited confidence intervals for small sample sizes. Figure 19(b) shows that there is the opposite behaviour for upper-limited intervals, where the corrected intervals are worse than ordinary profile likelihood intervals. The same pattern can be seen in Figure 19(c), which shows the simulation results for two-sided intervals. Figure 19(d) does not give any indication of a significant difference in the mean length of the intervals.

As discussed in Section ���, the empirical coverage probability of the bootstrap confidence intervals for the ���quantile is less than the nominal coverage probability, and therefore other methods should be considered. One solution is to use Bonferroni's method: construct a simultaneous interval for $\gamma$ and $\sigma$ and transform the confidence region to a confidence interval for $Q$�����. We applied this method to our simulations but it resulted in very long intervals with too high empirical coverage probabilities. We found these intervals of no practical use and therefore we do not present the simulation results here.

Figure 15: Corrected profile likelihood functions and confidence intervals for $\gamma$ and $\sigma$ in a simulated sample with $\gamma = -0.6$ and $\sigma = 1$.
[Two panels, each titled "Two sided confidence level = 0.95", plotting zeta against gamma and against sigma. The marked interval endpoints are −0.602 and −0.134 for gamma, and 0.941 and 1.703 for sigma.]
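The Bonferroni construction can be sketched as follows, again assuming the parameterisation $F(x) = 1 - (1 - \gamma x/\sigma)^{1/\gamma}$, under which the quantile is $Q(p) = \sigma (1 - (1-p)^{\gamma})/\gamma$. If each marginal interval has level $1 - \alpha/2$, the rectangle they span has joint coverage of at least $1 - \alpha$, and because $Q(p)$ is monotone in each parameter its extremes over the rectangle are attained at the corners. The function names and the numerical intervals below are hypothetical.

```python
import math

def gpd_quantile(p, gamma, sigma):
    """Quantile function, assuming F(x) = 1 - (1 - gamma*x/sigma)**(1/gamma)."""
    if gamma == 0.0:
        return -sigma * math.log(1.0 - p)
    return sigma * (1.0 - (1.0 - p) ** gamma) / gamma

def bonferroni_quantile_interval(p, gamma_interval, sigma_interval):
    # Q(p) increases in sigma and decreases in gamma, so the corners suffice.
    corners = [gpd_quantile(p, g, s) for g in gamma_interval for s in sigma_interval]
    return min(corners), max(corners)

# Hypothetical marginal intervals for gamma and sigma.
print(bonferroni_quantile_interval(0.9, (-1.0, -0.5), (1.0, 2.0)))
```

Even for these moderate marginal intervals the resulting quantile interval spans roughly a factor of four, which illustrates why the method tends to produce very long intervals.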

Table 16 gives the percentage of misses or "not-availables" at each side of a confidence interval for $\gamma$ with confidence level ��. Table 17 gives the total percentage of times a two-sided confidence interval fails to cover the true value of the parameter (it can be a miss or a "not-available"). The aggregate percentage of misses and "not-availables" in ��� simulations of two-sided intervals is also presented in this table. Obviously, for small samples the coverage is quite low. Tables 18 and 19 give the corresponding results for $\sigma$. Table 20 gives the mean length of confidence intervals for $\gamma$ and $\sigma$. These values are calculated for those intervals which covered the true value of the parameters. Table 21 shows the percentage of misses or "not-availables" at each side of a �� confidence interval for $\gamma$ with the correction factor incorporated into the likelihood function. Table 22 gives the percentage of times a corrected confidence interval fails to cover the true value of the parameter (it can be a miss or a "not-available"; compare with Table 17). The last column gives the percentage of misses of two-sided intervals in ��� simulations. Tables 23 and 24 give the corresponding results for $\sigma$. Table 25 gives the mean length of confidence intervals for $\gamma$ and $\sigma$. These values are calculated for those intervals which covered the true value of the parameters.

Figure 16: Confidence intervals for $\gamma$ with two-sided confidence level 95% for $\gamma = -1$. (a) Percentage of misses to the left. (b) Percentage of misses to the right side. (c) Percentage of misses in two-sided intervals. (d) Mean length of confidence intervals. The dashed lines show the nominal percentage of misses.
[Four bar charts comparing profile and corrected profile intervals at sample sizes 20, 30, 40, 50, 75, 100, 150 and 200.]

Figure 17: Confidence intervals for $\gamma$ with two-sided confidence level 95% for $\gamma$ = ����. (a) Percentage of misses to the left side. (b) Percentage of misses to the right side. (c) Percentage of misses in two-sided intervals. (d) Mean length of confidence intervals. The dashed lines show the nominal percentage of misses.
[Four bar charts comparing profile and corrected profile intervals at sample sizes 20, 30, 40, 50, 75, 100, 150 and 200.]

Figure 18: Confidence intervals for $\sigma$ with two-sided confidence level 95% for $\gamma = -1$. (a) Percentage of misses to the left side. (b) Percentage of misses to the right side. (c) Percentage of misses in two-sided intervals. (d) Mean length of confidence intervals. The dashed lines show the nominal percentage of misses.
[Four bar charts comparing profile and corrected profile intervals at sample sizes 20, 30, 40, 50, 75, 100, 150 and 200.]

Figure 19: Confidence intervals for $\sigma$ with two-sided confidence level 95% for $\gamma$ = ����. (a) Percentage of misses to the left side. (b) Percentage of misses to the right side. (c) Percentage of misses in two-sided intervals. (d) Mean length of confidence intervals. The dashed lines show the nominal percentage of misses.
[Four bar charts comparing profile and corrected profile intervals at sample sizes 20, 30, 40, 50, 75, 100, 150 and 200.]

As discussed in Section ���, one can use the "Russian doll principle" to construct a confidence interval for an unknown quantity. Table 26 shows the percentage of misses in ��� upper-limited confidence intervals for $Q$�����. Figure 20 shows that the empirical coverage probability of such intervals for small samples is less than the nominal value.

Table 27 gives the results for lower-limited confidence intervals and Table 28 shows the total percentage of misses of two-sided intervals in ��� simulations. Finally, Table 29 gives the mean length of such intervals.

Figure 20: Percentage of misses of bootstrap confidence intervals for $Q$����� according to the "Russian doll principle". The number on the top of each bar shows the sample size in the simulations.
[Bar chart comparing percentile intervals at the 90% and 95% levels for sample sizes 20, 30, 40, 50, 75, 100, 150 and 200.]

��� Estimates of Some Accuracy Measures of ML Estimators

In Section ��� we discussed estimation of bias and other accuracy measures of point estimators. In this section we present simulation results for the parameters and the ���quantile of the GPD. As mentioned earlier, in our study we used ��� random samples and for each sample we generated $m$ = ���� bootstrap samples. For each sample a bootstrap estimate of bias was obtained from (���) and a jackknife estimate of bias from (���). Finally, equation (���) gives the estimate of bias obtained from simulations. Tables 30, 31 and 32 in the Appendix contain these estimates of bias for $\gamma$, $\sigma$ and $Q$����� in all combinations of $\gamma$ and $n$. As mentioned earlier, Hosking and Wallis [7] also studied the bias of different estimators of the parameters and the ���quantile of the GPD for $\gamma$ � ����. Comparing the relevant parts of our results with their Tables � and � shows that they generally agree, as they of course should (for example, compare the results for $n$ = ��� ���� ��� and $\gamma$ = �������� of Table � of their article with the equivalent results in Tables �� and �� of the Appendix). Figure 21 shows dotcharts of the estimates of bias for $\hat{\gamma}$ with true value $\gamma = -1$. The bias estimates are generally positive and they are not severe for sample sizes greater than ��. In most cases the jackknife estimates of bias are larger than the bootstrap estimates. The bootstrap estimates are more stable than those of the other two methods for all values of $\gamma$, and it seems that the true value of $\gamma$ does not affect the bias of the ML estimators.

Figure 21: Bias estimates of $\hat{\gamma}$ for $\gamma = -1$.
[Dot chart for gamma = −1 showing the jackknife, bootstrap and simulation estimates for n = 20, 30, 40, 50, 75, 100, 150 and 200, on a horizontal axis from 0.0 to 0.14.]
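The bootstrap and jackknife bias estimates have the standard forms "mean of the bootstrap replicates minus the original estimate" and "$(n-1)$ times the mean of the leave-one-out estimates minus the original estimate". A minimal sketch, using the plug-in variance as a stand-in estimator because its bias is known exactly; the function names, seed and data are hypothetical.

```python
import random
from statistics import mean, pvariance

def bootstrap_bias(data, statistic, m, rng):
    """Bootstrap bias estimate from m resamples drawn with replacement."""
    theta_hat = statistic(data)
    replicates = [statistic(rng.choices(data, k=len(data))) for _ in range(m)]
    return mean(replicates) - theta_hat

def jackknife_bias(data, statistic):
    """Jackknife bias estimate from the n leave-one-out estimates."""
    n = len(data)
    theta_hat = statistic(data)
    loo = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    return (n - 1) * (mean(loo) - theta_hat)

data = [1.0, 2.0, 3.0, 4.0]
rng = random.Random(7)
# The plug-in variance underestimates the variance, so both estimates are negative.
print(bootstrap_bias(data, pvariance, 2000, rng), jackknife_bias(data, pvariance))
```

For this estimator the jackknife recovers the exact finite-sample bias correction, while the bootstrap estimate fluctuates around a nearby value with the number of resamples.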

Bootstrap and jackknife estimates of standard errors are obtained from (���) and (���). For $\hat{\gamma}$, these estimates are given in the first two columns of Table 33. There are also two possibilities for calculating the asymptotic variance using the formulas in (���). One can either use the estimated values of the parameters, $(\hat{\gamma}, \hat{\sigma})$, or the real $(\gamma, \sigma)$-values in the simulations. The third and fourth columns in Table 33 give these estimates for $\hat{\gamma}$. The last column in the table gives the parametric bootstrap estimate of the standard error of $\hat{\gamma}$. Figure 22 shows the mean of these estimates for $\hat{\gamma}$ in the simulations with $\gamma = -1$. As we see in the figure, even for sample size ��, the estimates agree very closely. Note also that even for sample size ���, there is still a sizeable amount of variation in the ML estimators. Tables 33, 34 and 35 in the Appendix give the complete results of simulations for all combinations of $\gamma$ and $n$.

Figure 22: Standard error estimates of $\hat{\gamma}$ for $\gamma = -1$.
[Dot chart for gamma = −1 showing the jackknife, bootstrap and simulation estimates for n = 20, 30, 40, 50, 75, 100, 150 and 200, on a horizontal axis from 0.2 to 0.5.]

Table 36 gives three different estimates of the correlation between $\hat{\gamma}$ and $\hat{\sigma}$. The first column is the mean of the bootstrap sample correlations between $\hat{\gamma}^{*}$ and $\hat{\sigma}^{*}$. The second column gives the asymptotic correlation obtained from (���) by using the real values of $\gamma$ and $\sigma$. The last column is the mean of the estimated asymptotic correlation, which is calculated by using $\hat{\gamma}$ and $\hat{\sigma}$ instead of $\gamma$ and $\sigma$ in (���).
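As a worked illustration, and assuming the asymptotic covariance matrix has the Hosking and Wallis form for the parameterisation $F(x) = 1 - (1 - \gamma x/\sigma)^{1/\gamma}$ (asymptotic variances $2\sigma^{2}(1-\gamma)/n$ and $(1-\gamma)^{2}/n$, covariance $\sigma(1-\gamma)/n$), the asymptotic correlation simplifies to

\[
\rho(\hat{\gamma}, \hat{\sigma}) = \frac{\sigma(1-\gamma)}{\sqrt{2\sigma^{2}(1-\gamma) \cdot (1-\gamma)^{2}}} = \frac{1}{\sqrt{2(1-\gamma)}}.
\]

Under this assumption the correlation does not depend on $\sigma$ or on $n$; for example, $\gamma = -1$ gives $\rho = 1/2$.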

The ratio of bias to standard error (coefficient of variation) gives some indication of how serious the bias of an estimator is. A rule of thumb says that a bias of less than 0.25 standard errors can be ignored. Tables 37, 38 and 39 show estimates of the coefficient of variation for $\hat{\gamma}$, $\hat{\sigma}$ and $\hat{Q}$�����. As we see in the tables, for sample sizes greater than ��, the estimates of the cv for $\hat{\gamma}$ and $\hat{\sigma}$ are less than this 0.25. Note also that because of the positive bias of $\hat{\gamma}$ and $\hat{\sigma}$, the bias of $\hat{Q}$����� is very small and as a result the estimates of cv for $\hat{Q}$����� are much smaller than the corresponding estimates for $\hat{\gamma}$ and $\hat{\sigma}$.

The root mean square error is another measure of accuracy of point estimators which takes into account both bias and standard error (defined in (���) on page ��). Estimates of root mean square error (rmse) for the parameters are given in Tables 40, 41 and 42.

Discussion

In this section we discuss the simulation results for confidence intervals for $\gamma$, $\sigma$ and $Q$�����. As seen in Section �, the BCa method had better empirical coverage probability than the percentile bootstrap method. Further, likelihood-based confidence intervals with Lawley's correction performed better than ordinary profile likelihood intervals. We hence compare bias corrected bootstrap intervals to corrected profile likelihood intervals.

Figure 23 shows the results of simulations for $\gamma = -1$. For lower-limited intervals the corrected profile likelihood intervals seem to perform somewhat better than the bootstrap intervals for all sample sizes. The figure shows the results for $\gamma = -1$, but the conclusion also holds for other values of $\gamma$, see Tables � and ��. This is a nice feature of these intervals which is worth emphasising, because in many applications lower-limited confidence intervals for $\gamma$ are of principal interest. For example, in predicting the worst storm damage in Sweden the lower limit of a confidence interval for $\gamma$ should be used in the prediction, see [10]. Also for upper-limited intervals, except for sample size ��, corrected profile intervals perform better than bias corrected confidence intervals. It should further be mentioned that the corrected likelihood-based confidence intervals generally perform better for low values of $\gamma$. The reasons for this are discussed in Section ���, but briefly: to calculate the integrals which lead to the scaling factors (���) and (���), we must assume that $\gamma < 0.25$. This means that for a desired confidence level, if we have to go further than $\gamma = 0.25$ to the right, we will not be able to find an upper limit for $\gamma$. These outcomes are denoted by "not-availables" in our tables. They are more likely to occur if the true value of $\gamma$ is close to �. This pattern is very clear in Tables �� and ��. For two-sided confidence intervals, again except for sample size ��, corrected profile confidence intervals perform better than bootstrap intervals. The mean lengths of both types of confidence intervals for $\gamma$ are very similar.

Figure 24 shows the corresponding results for $\sigma$. For lower-limited intervals there is no clear pattern and both methods perform acceptably well. As we mentioned earlier, in some applications upper-limited intervals for $\sigma$ are of principal interest. Figure 24(b) shows that, except for sample size ��, profile likelihood intervals perform better than bias corrected intervals. The coverage probabilities of two-sided intervals are similar for both methods, as are the mean lengths of the intervals.

The discussion above was based on the simulation results for $\gamma = -1$. In each application, depending on the value of $\gamma$, the sample size and the type of desired confidence interval, it is possible to go to the tables in the Appendix and take a closer look at the simulation results close to the case in hand.

Figure 23: Confidence intervals for $\gamma$ with two-sided confidence level 95%. (a) Percentage of misses to the left side. (b) Percentage of misses to the right side. (c) Percentage of misses in two-sided intervals. (d) Mean length of confidence intervals. The dashed lines show the nominal percentage of misses.
[Four bar charts comparing bias corrected and corrected profile intervals at sample sizes 20, 30, 40, 50, 75, 100, 150 and 200.]

Figure 24: Confidence intervals for $\sigma$ with two-sided confidence level 95%. (a) Percentage of misses to the left side. (b) Percentage of misses to the right side. (c) Percentage of misses in two-sided intervals. (d) Mean length of confidence intervals. The dashed lines show the nominal percentage of misses.
[Four bar charts comparing bias corrected and corrected profile intervals at sample sizes 20, 30, 40, 50, 75, 100, 150 and 200.]

Figure 25: Confidence intervals for $\gamma$ with two-sided confidence level 95% for $\gamma$ = ����. (a) Percentage of misses to the left side. (b) Percentage of misses to the right side. (c) Percentage of misses in two-sided intervals. (d) Mean length of confidence intervals. The dashed lines show the nominal percentage of misses.
[Four bar charts comparing bias corrected and corrected profile intervals at sample sizes 20, 30, 40, 50, 75, 100, 150 and 200.]

Figure 26: Confidence intervals for $\sigma$ with two-sided confidence level 95% for $\gamma$ = ����. (a) Percentage of misses to the left side. (b) Percentage of misses to the right side. (c) Percentage of misses in two-sided intervals. (d) Mean length of confidence intervals. The dashed lines show the nominal percentage of misses.
[Four bar charts comparing bias corrected and corrected profile intervals at sample sizes 20, 30, 40, 50, 75, 100, 150 and 200.]

Hosking and Wallis [7] used the asymptotic distribution of the estimators and studied the empirical coverage probability of approximate confidence intervals for the parameters and quantiles of the GPD. In Table � of their article the empirical coverage probabilities of confidence intervals for $\gamma$ = ���� and ��� are presented. They conclude that confidence intervals for � and $Q(p)$ with $p$ � �� require very large sample sizes, ��� or more, before acceptable accuracy is obtained. Figure 25 shows the results of corrected profile likelihood and bias corrected confidence intervals for $\gamma$ with the true value of $\gamma$ = ����. The corrected profile lower-limited intervals have acceptable accuracy for all sample sizes. For upper-limited confidence intervals, bias corrected bootstrap intervals perform better than the likelihood method. For two-sided intervals, Figure 25(c) shows that both methods have acceptable performance for sample sizes larger than ��, and Figure 25(d) indicates no significant difference in the length of these intervals. Figure 26 shows the results for $\sigma$ with $\gamma$ = ����. In this case the bootstrap method generally performs better than the corrected profile likelihood method, although for small samples the mean length of the produced intervals is longer than that of the corresponding likelihood-based intervals.

For the cases with $\gamma$ � ����, and for one-sided lower-limited confidence intervals, the corrected likelihood-based method performs better than the bias corrected and accelerated bootstrap method and is recommended in this kind of application.

As the discussion above shows, neither the bias corrected and accelerated method nor the corrected profile likelihood method is uniformly more accurate than the other, and their empirical coverage probability depends on the type of confidence interval, the sample size and the value of the shape parameter. Generally these methods have acceptable performance even for samples with �� observations, and for small samples of the heavy-tailed GPD these methods should be tried first.

Acknowledgement

My thanks are due to my supervisor Holger Rootzén for his support and many helpful suggestions during the work. The financial support of Stiftelsen Länsförsäkringsbolagens Forskningsfond is also gratefully acknowledged.

References

[1] Becker, R.A., Chambers, J.M. and Wilks, A.R. (1988). The New S Language: A Programming Environment for Data Analysis and Graphics. Wadsworth & Brooks/Cole Computer Science Series.


[2] Chambers, J.M. and Hastie, T.J. (1992). Statistical Models in S. Wadsworth & Brooks/Cole Computer Science Series.

[3] Davison, A.C. and Smith, R.L. (1990). Models for exceedances over high thresholds. J. R. Statist. Soc. B 52, 393-442.

[4] Efron, B. and Tibshirani, R.J. (1993). An Introduction to the Bootstrap. Chapman & Hall, New York.

[5] Grimshaw, S.D. (1993). Computing maximum likelihood estimates for the generalised Pareto distribution. Technometrics 35, No. 2.

[6] Hall, P. (1992). The Bootstrap and Edgeworth Expansion. Springer-Verlag, New York.

[7] Hosking, J.R.M. and Wallis, J.R. (1987). Parameter and quantile estimation for the generalised Pareto distribution. Technometrics 29, 339-349.

[8] Kinderman, A.J. and Monahan, J.F. (1977). Computer generation of random variables using the ratio of uniform deviates. ACM Transactions on Mathematical Software 3, 257-260.

[9] Lawley, D.N. (1956). A general method for approximating to the distribution of likelihood ratio criteria. Biometrika 43, 295-303.

[10] Rootzén, H. and Tajvidi, N. (199�). Extreme value statistics and wind storm losses: a case study. To appear in Scandinavian Actuarial Journal.

[11] Pickands, J. III (1975). Statistical inference using extreme order statistics. Ann. Statist. 3, 119-131.

[12] Ripley, B.D. (1987). Stochastic Simulation. New York: John Wiley and Sons.

[13] Smith, R.L. (1985). Maximum likelihood estimation in a class of nonregular cases. Biometrika 72, 67-90.

[14] Smith, R.L. (1985). Statistics of extreme values. Proc. 45th Session I.S.I., Paper 26.1, Amsterdam.

[15] Smith, R.L. (1989). Extreme value analysis of environmental time series: an application to trend detection in ground-level ozone. Stat. Sci. 4, 367-393.

[16] Tajvidi, N. (199�). Design and Implementation of Statistical Computations for Generalised Pareto Distributions. Technical Report, Mathematical Statistics, Chalmers University of Technology.


Multivariate Generalised Pareto Distributions

Nader Tajvidi

September ����

Abstract

A more recent approach for modelling extreme events is based on so called peak over threshold (POT) methods. The generalised Pareto distribution (GPD) is widely used for modelling exceedances of a random variable over a high threshold and it has been shown to be one of the best ways to apply extreme value theory in practice. In this paper we give a multivariate analogue of the GPD and consider estimation of parameters in some specific bivariate generalised Pareto distributions (BGPDs). We generalise two existing bivariate extreme value distributions and study maximum likelihood estimation of parameters in the corresponding BGPDs. The procedure is illustrated with an application to a bivariate series of wind data. The behaviour of maximum likelihood estimators (MLEs) of parameters is also studied in a small simulation.

Keywords: Generalised Pareto distribution, Multivariate extreme value theory, Multivariate Pareto distribution, Non-homogeneous Poisson process, Maximum likelihood, Threshold method, Extreme wind speeds.

AMS ���� subject classification: ��E�� �F �� ��E � �G����H ��

1 Introduction

Modelling of univariate extreme values has developed extensively during the past few years. Mainly there have been two new approaches. One approach is based on the asymptotic joint distribution of extreme order statistics, see e.g. ����. An alternative approach is the so called threshold methods. The basic idea behind these methods is to model all the exceedances over a high threshold using a generalised Pareto distribution (GPD), see ����, ���� and ����. These methods have been used in place of the traditional methods which are based on the limiting extreme value distributions, �� � and ����. By including more relevant data these new approaches result in greater estimation precision than the classical approach.


Multivariate extreme value distributions arise in connection with extremes of a random sample from a multivariate distribution. Sibuya ���� was one of the first to extend the extremal concept to two dimensions and to study asymptotic properties of the joint distribution of componentwise maxima of bivariate random vectors. Theoretical development in the general multivariate case is extensively discussed in the books by Galambos ��� and Resnick �� �. Several recent papers (����, ����, ����, ����) have explored their statistical application.

There are several possibilities for ordering multivariate data, see e.g. the review by Barnett ���. For multivariate extreme values the most widely used method is the so called marginal or M-ordering. As the name suggests, ordering or ranking here takes place within one or more of the marginal samples. Thus the maximum of a set of vectors is defined by taking componentwise maxima, i.e., for a series of independent and identically distributed vectors $\{X_n, n \ge 1\} = \{(X_n^{(1)}, \ldots, X_n^{(d)}), n \ge 1\}$, the maximum, $M_n$, is defined by

$$M_n = (M_n^{(1)}, \ldots, M_n^{(d)}) = \Big(\bigvee_{j=1}^{n} X_j^{(1)}, \ldots, \bigvee_{j=1}^{n} X_j^{(d)}\Big).$$

Here and hereafter $\bigvee$ denotes maximum.

Under rather general conditions, the distribution of the suitably normalised $M_n$ converges to a multivariate extreme value distribution. In applications $M_n$ is often the vector of annual maxima. As in the univariate case we are interested in developing methods which utilise more of the available data and as a result contribute to better estimation of parameters in the model. In this paper we give a multivariate analogue of the GPD and consider estimation of parameters in some specific bivariate generalised Pareto distributions (BGPD's). In the latter case a model is fitted to the joint distribution of a bivariate observation subject to the condition that at least one component exceeds a high threshold. This allows us to study dependent extremes and, for example, to estimate a bivariate upper quantile curve or calculate the probability that the thresholds are simultaneously exceeded by two variables. This approach doesn't require multivariate ordering. It permits simultaneous estimation of marginal and dependence parameters.

We begin, in Sections 2 and 3, with a brief summary of univariate extreme value distributions and the corresponding generalised Pareto distributions. In Section 4 we present the main theorem which motivates our definition of the multivariate generalised Pareto distribution (MGPD). The family of multivariate extreme value distributions is infinite dimensional but there exist several characterisations of these distributions. In Section 5 we present a summary of different characterisations. We develop multivariate Pareto distributions which correspond to a characterisation by Resnick and De Haan (��� � and �����). Usually we call these distributions "bivariate Pareto distributions" (BPD's) or "multivariate Pareto distributions" (MPD's). As we will see later these correspond to multivariate generalised Pareto distributions with so called standard Pareto marginals. We use "bivariate generalised Pareto distribution" or "multivariate generalised Pareto distribution" to refer to distributions with generalised Pareto marginals. In Section 6 we consider the general form of the BPD and give some parametric subfamilies of this distribution. Section 7 is devoted to a summary of the most common dependence functions and a generalisation of two existing models. In Section 8 we consider estimation of parameters in the bivariate case. We begin with some numerical examples of the BPD which correspond to our new extreme value distribution and then discuss their extension to the BGPD. We also discuss maximum likelihood estimation of parameters in this case and give the general form of the likelihood function which can be used for simultaneous estimation of marginal and dependence parameters. Section 9 is devoted to illustrating the methods by applying them to a wind data set.

2 Univariate Extreme Value Distributions

The classical approach to extreme value theory starts from limit distributions of sample maxima. Let $X_1, X_2, \ldots, X_n$ be a sequence of mutually independent random variables with common distribution function $F(x)$. Define

$$M_n = \max(X_1, X_2, \ldots, X_n), \quad n \in \mathbb{N}.$$

Suppose there exists a pair of sequences $a_n > 0$ and $b_n \in \mathbb{R}$ such that

$$\lim_{n\to\infty} P\Big(\frac{M_n - b_n}{a_n} \le x\Big) = \lim_{n\to\infty} F^n(a_n x + b_n) = G(x) \tag{3.1}$$

and $G(x)$ is a non-degenerate distribution function.

A basic result from classical extreme value theory, sometimes called the Extremal Types Theorem, states that $G(x)$ must be a Generalised Extreme Value Distribution. If (3.1) holds we say that $F(x)$ belongs to the domain of attraction of $G(x)$ and write $F \in D(G)$. All the possible limit distributions $G$ in (3.1), i.e. the Generalised Extreme Value Distributions, can be unified as the following 3-parameter family.

A d.f. $G$ is called a Generalised Extreme Value Distribution if it has the form

$$G(x; \mu, \sigma, \gamma) = \exp\Big\{-\Big(1 + \gamma\,\frac{x - \mu}{\sigma}\Big)_+^{-1/\gamma}\Big\} \tag{3.2}$$

where $\sigma > 0$, $\mu$ and $\gamma$ are real parameters, and $x_+ = \max(x, 0)$. The support of the distribution is $x > \mu - \sigma/\gamma$ for $\gamma > 0$ and $x < \mu - \sigma/\gamma$ for $\gamma < 0$. For $\gamma = 0$, we interpret $G(x)$ to be the limit $G(x) = \exp(-e^{-(x-\mu)/\sigma})$. Note that $G$ for $\gamma > 0$ is a reparametrisation of the Fréchet distribution $\Phi_\alpha(x) = \exp(-x^{-\alpha})$ with $\alpha = 1/\gamma$, for $\gamma < 0$ it is a reparametrisation of the Weibull distribution $\Psi_\alpha(x) = \exp(-(-x)^{\alpha})$ with $\alpha = -1/\gamma$, and $G$ is the Gumbel distribution $\Lambda(x) = \exp(-e^{-x})$ for $\gamma = 0$.
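The three-type unification can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the parameter names `mu`, `sigma`, `gamma` (location, scale, shape) are assumptions made for illustration. It evaluates the Generalised Extreme Value d.f. and compares it with the Fréchet, Weibull and Gumbel forms.

```python
import math

def gev_cdf(x, mu=0.0, sigma=1.0, gamma=0.0):
    """GEV distribution function with location mu, scale sigma > 0, shape gamma."""
    z = (x - mu) / sigma
    if abs(gamma) < 1e-12:
        # Gumbel case, interpreted as the limit gamma -> 0
        return math.exp(-math.exp(-z))
    t = 1.0 + gamma * z
    if t <= 0.0:
        # Outside the support: 0 below the lower endpoint (gamma > 0),
        # 1 above the upper endpoint (gamma < 0)
        return 0.0 if gamma > 0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))

def frechet_cdf(x, alpha):
    """Frechet d.f. exp(-x^(-alpha)) for x > 0."""
    return math.exp(-x ** (-alpha)) if x > 0 else 0.0

def weibull_cdf(x, alpha):
    """(Reversed) Weibull d.f. exp(-(-x)^alpha) for x < 0."""
    return math.exp(-(-x) ** alpha) if x < 0 else 1.0
```

With `mu = 1, sigma = gamma` the argument `1 + gamma*(x - mu)/sigma` reduces to `x`, so `gev_cdf(x, 1.0, g, g)` should agree with `frechet_cdf(x, 1/g)` for `g > 0`; the analogous choice `mu = -1, sigma = -gamma` gives the Weibull form for `gamma < 0`.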


The proof of the Extremal Types Theorem uses the concept of max-stability. A non-degenerate d.f. $G(x)$ is max-stable if for $n = 2, 3, \ldots$ there are constants $a_n > 0$ and $b_n \in \mathbb{R}$ such that $G^n(a_n x + b_n) = G(x)$. A convergence of types argument quickly shows that the class of max-stable distributions coincides with the class of all possible non-degenerate limits $G$ in (3.1). The proof is then concluded by showing that a max-stable d.f. $G(x)$ necessarily must have the form (3.2), see e.g. ����. The convergence of types theorem can be used to show that if $G$ is max-stable, there exist real functions $\alpha(t) > 0$ and $\beta(t)$ defined for $t > 0$ such that $G^t(\alpha(t)x + \beta(t)) = G(x)$ for all real $x$.

3 Generalised Pareto Distributions

As discussed in the introduction, a rather recent approach to modelling extreme events is the so called peak over threshold (POT) method. This method is based on fitting a probability model to the observations exceeding a high threshold. There is a close connection with the GPD,

$$H(x; \sigma, \gamma) = 1 - \Big(1 + \gamma\,\frac{x}{\sigma}\Big)_+^{-1/\gamma}. \tag{3.3}$$

Here $\sigma > 0$ and $\gamma$ are real parameters and the support of the distribution is $x > 0$ for $\gamma \ge 0$ and $0 < x < -\sigma/\gamma$ for $\gamma < 0$. For $\gamma = 0$ we interpret $H$ to be the exponential distribution $H(x) = 1 - e^{-x/\sigma}$. Pickands ���� showed that the GPD arises as a limiting distribution for the excess over thresholds if and only if the parent distribution belongs to the domain of attraction of an extreme value distribution. Specifically, if we let $F_u(x)$ be the conditional distribution of the excess over the threshold $u$ then

$$\lim_{u \to x_F}\ \inf_{0 < \sigma < \infty}\ \sup_{0 < x < \infty} |F_u(x) - H(x; \sigma, \gamma)| = 0$$

for some $\gamma$ if and only if $F \in D(G)$ for some Generalised Extreme Value Distribution $G$. Here $x_F = \sup\{x : F(x) < 1\}$ is the right-hand endpoint of $F$.
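For some parent distributions the GPD approximation of the excess distribution is exact, which gives a simple numerical check. The sketch below is an illustration under stated assumptions: the names `gpd_sf` and `pareto_excess_sf`, and the argument order (scale `sigma`, then shape `gamma`), are chosen here and are not taken from the original. It uses a parent with tail $P(X > x) = x^{-\alpha}$, $x \ge 1$, for which the excess over any threshold $u \ge 1$ is GPD with scale $u/\alpha$ and shape $1/\alpha$.

```python
import math

def gpd_sf(x, sigma, gamma):
    """Survival function 1 - H(x) of the GPD with scale sigma and shape gamma."""
    if x <= 0:
        return 1.0
    if abs(gamma) < 1e-12:
        # exponential case, interpreted as the limit gamma -> 0
        return math.exp(-x / sigma)
    t = 1.0 + gamma * x / sigma
    return t ** (-1.0 / gamma) if t > 0 else 0.0

def pareto_excess_sf(x, u, alpha):
    """P(X - u > x | X > u) when P(X > x) = x**(-alpha) for x >= 1 and u >= 1."""
    return (u + x) ** (-alpha) / u ** (-alpha)
```

The scale of the fitted GPD grows linearly with the threshold here, while the shape stays fixed at `1/alpha`.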

It should be mentioned that there are other parametrisations of the Pareto distribution. Several families of Pareto distributions have been discussed extensively in the book by Arnold ���. He begins with the classical Pareto distribution with the survival function

$$\bar H(x) = \Big(\frac{x}{\sigma}\Big)^{-\alpha}, \quad x > \sigma.$$

Here $\sigma > 0$ is the scale parameter and $\alpha > 0$ is called Pareto's index of inequality. The distribution has finite mean for $\alpha > 1$. By introducing new parameters which relate to location, scale and shape, he progresses to more complicated families of distributions. Arnold also gives a comprehensive study of inference for different families of Pareto distributions. The relation between Extreme Value Distributions and the GPD is not discussed in this book.


It turns out that, for the purpose of characterisation of multivariate Pareto distributions, it is convenient to consider the special case of (3.3) with $\gamma = 1$ and $\sigma = 1$. We follow the notation introduced by Arnold (see ���, page ��) and call this distribution the standard Pareto distribution. Thus a random variable will be said to have the standard Pareto distribution if its survival function is of the form

$$\bar H(x) = (1 + x)^{-1}, \quad x > 0. \tag{3.4}$$

Note that if $X$ has the standard Pareto distribution, then $X^{-1}$ also has the same distribution.
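The invariance under $x \mapsto 1/x$ follows from $P(1/X > x) = P(X < 1/x) = 1 - (1 + 1/x)^{-1} = (1 + x)^{-1}$. A small sketch (function names are illustrative only) makes the identity concrete:

```python
def std_pareto_sf(x):
    """Survival function (1 + x)^(-1) of the standard Pareto distribution."""
    return 1.0 / (1.0 + x) if x > 0 else 1.0

def reciprocal_sf(x):
    """P(1/X > x) = P(X < 1/x) for X standard Pareto and x > 0."""
    return 1.0 - std_pareto_sf(1.0 / x)
```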

4 Multivariate Generalised Pareto Distributions

Suppose $\{X_n, n \ge 1\} = \{(X_n^{(1)}, \ldots, X_n^{(d)}), n \ge 1\}$ are independent, identically distributed random vectors with $d$-dimensional distribution function $F$. As before, let $M_n$ be the vector of componentwise maxima,

$$M_n = (M_n^{(1)}, \ldots, M_n^{(d)}) = \Big(\bigvee_{j=1}^{n} X_j^{(1)}, \ldots, \bigvee_{j=1}^{n} X_j^{(d)}\Big).$$

Assume that there exist normalising constants $\sigma_n^{(i)} > 0$, $u_n^{(i)} \in \mathbb{R}$, $1 \le i \le d$, $n \ge 1$, such that as $n \to \infty$

$$P\big((M_n^{(i)} - u_n^{(i)})/\sigma_n^{(i)} \le x^{(i)},\ 1 \le i \le d\big) = F^n(\sigma_n^{(1)} x^{(1)} + u_n^{(1)}, \ldots, \sigma_n^{(d)} x^{(d)} + u_n^{(d)}) \to G(x) \tag{3.5}$$

with the limit distribution $G$ such that each marginal $G_i$, $i = 1, \ldots, d$, is non-degenerate. If (3.5) holds, $F$ is said to be in the domain of attraction of $G$, as before denoted by $F \in D(G)$, and $G$ is said to be a multivariate extreme value distribution.

As in the univariate case, a multivariate convergence of types argument shows that the class of limit d.f.'s for (3.5) is the class of max-stable distributions, where a d.f. $G$ in $\mathbb{R}^d$ is max-stable if for $i = 1, \ldots, d$ and every $t > 0$ there exist functions $\alpha^{(i)}(t) > 0$, $\beta^{(i)}(t)$ such that

$$G^t(x) = G(\alpha^{(1)}(t) x^{(1)} + \beta^{(1)}(t), \ldots, \alpha^{(d)}(t) x^{(d)} + \beta^{(d)}(t)).$$

It is clear that each marginal of $G$ must be one of the three extreme value distributions defined in Section 2.

It is convenient to have a convention to handle vectors occurring in the same expression but not all of the same length. We use the convention that the value of the expression is a vector with the same length as that of the longest vector occurring in the expression. Shorter vectors are recycled as often as need be, perhaps fractionally, until they match the length of the longest vector. In particular a single number is repeated the appropriate number of times. All operations on vectors are performed element by element. For example if $X$ and $Y$ are bivariate vectors and $\alpha$ is a scalar, we have

$$\alpha X = (\alpha X_1, \alpha X_2), \qquad \alpha + X = (\alpha + X_1, \alpha + X_2)$$

and

$$XY = (X_1 Y_1, X_2 Y_2), \qquad X + Y = (X_1 + Y_1, X_2 + Y_2).$$

Note that this rule applies also when we take supremum or infimum of a set.
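The recycling convention can be mimicked in a few lines. The sketch below is only an illustration of the rule as stated above; the helper names `recycle` and `elementwise` are hypothetical and do not come from the original text.

```python
from itertools import cycle, islice
import operator

def recycle(v, n):
    """Repeat the entries of v cyclically (possibly fractionally) up to length n."""
    return list(islice(cycle(v), n))

def elementwise(op, *args):
    """Apply op element by element, recycling shorter arguments and scalars."""
    # Treat a bare number as a vector of length one
    vecs = [list(a) if isinstance(a, (list, tuple)) else [a] for a in args]
    n = max(len(v) for v in vecs)
    return [op(*vals) for vals in zip(*(recycle(v, n) for v in vecs))]
```

For example, `elementwise(operator.mul, 2, (1, 3))` multiplies each component by the scalar, and `elementwise(operator.add, [1, 2, 3], [10, 20])` recycles the shorter vector fractionally to `[10, 20, 10]` before adding.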

For a multivariate distribution $F(x)$ we denote the tail of the distribution by $\bar F(x)$,

$$\bar F(x) := 1 - F(x) = P(X \not\le x),$$

and use the notation $x_F$ for the right end point of a d.f. $F$, i.e.,

$$x_F = \sup\{x : F(x) < 1\}.$$

The following result will motivate a definition of the multivariate generalised Pareto distribution.

Theorem 3.1. Suppose $G$ is a multivariate extreme value distribution in $\mathbb{R}^d$. Then $F \in D(G)$ if and only if for every $x_0 \in \mathrm{Support}(G)$ there exist a curve $l$ in $\mathbb{R}^d$ and a function $\sigma(u) = (\sigma_1(u_1), \ldots, \sigma_d(u_d))$ such that for $u$ on $l$

$$\frac{\bar F(u + \sigma(u)x)}{\bar F(u)} \to \frac{-\log G(x + x_0)}{-\log G(x_0)} \tag{3.6}$$

as $u \to x_F$ for $x \in \mathbb{R}^d$.

Proof. For brevity write $S$ for the support of $G$, so that $S = \{x : G(x) \in (0, 1)\}$. By definition, $F \in D(G)$ if and only if there exist $\sigma_n > 0$ and $u_n \in \mathbb{R}^d$ such that

$$F^n(\sigma_n x + u_n) \to G(x). \tag{3.7}$$

By taking logarithms it is seen that (3.7) is equivalent to

$$n \bar F(\sigma_n x + u_n) \to -\log G(x). \tag{3.8}$$

Hence, if $F \in D(G)$ then for $x_0 \in S$ and $x + x_0 \in S$

$$\frac{n \bar F(u_n + \sigma_n(x + x_0))}{n \bar F(u_n + \sigma_n x_0)} \to \frac{-\log G(x + x_0)}{-\log G(x_0)} =: \bar H_{x_0}(x), \quad n \to \infty.$$

Now let $u'_n = u_n + \sigma_n x_0$. Then this can be written as

$$\frac{\bar F(u'_n + \sigma_n x)}{\bar F(u'_n)} \to \frac{-\log G(x + x_0)}{-\log G(x_0)}, \quad n \to \infty. \tag{3.9}$$

From the characterisation of normalising constants in the univariate case it can be shown that we can always choose $u_n$ and $\sigma_n$ such that $u'_n$ is nondecreasing in $n$ and that $u'_n \to x_F$. Given $u'_n$ we define $l$ to be a curve which goes through the points $u'_n$ in $\mathbb{R}^d$, with each segment of $l$ between $u'_n$ and $u'_{n+1}$ chosen such that

$$l_{[u'_n, u'_{n+1}]} \subset \{x \in \mathbb{R}^d : u'_n \le x \le u'_{n+1}\}.$$

The following figure illustrates $l$ in $\mathbb{R}^2$.

Figure 1: Illustration of $l$ in $\mathbb{R}^2$ (axes $U_1$ and $U_2$, with the points $u'$ marked on the curve).

Now for $u$ on $l$ let

$$n(u) = \inf\{n : u'_{n+1} > u\},$$

and for $u'_{n(u)} \le u < u'_{n(u)+1}$ define $\sigma(u) = \sigma_{n(u)}$. By monotonicity, for $x \ge 0$,

$$\frac{\bar F(u'_{n(u)+1} + \sigma_{n(u)} x)}{\bar F(u'_{n(u)})} \le \frac{\bar F(u + \sigma(u)x)}{\bar F(u)} \le \frac{\bar F(u'_{n(u)} + \sigma_{n(u)} x)}{\bar F(u'_{n(u)+1})}.$$

This can be written in the following form:

$$\frac{n(u)}{n(u)+1} \cdot \frac{(n(u)+1)\,\bar F\big(u'_{n(u)+1} + \sigma_{n(u)+1}\,\tfrac{\sigma_{n(u)}}{\sigma_{n(u)+1}}\,x\big)}{n(u)\,\bar F(u'_{n(u)})} \le \frac{\bar F(u + \sigma(u)x)}{\bar F(u)} \le \frac{n(u)+1}{n(u)} \cdot \frac{n(u)\,\bar F(u'_{n(u)} + \sigma_{n(u)} x)}{(n(u)+1)\,\bar F(u'_{n(u)+1})}.$$

Hence, letting $u \to x_F$, and using that then $n(u) \to \infty$, (3.8) and the definition of $u'_n$, we get (3.6).

Conversely suppose that (3.6) holds. We want to show that $F \in D(G)$. Choose $b_n$ on $l$ such that

$$b_n = \inf\Big\{x : 1 - F(x) \le \frac{1}{n}\Big\}$$

and

$$a_n = \sigma(b_n).$$

Now $b_n \to x_F$ when $n \to \infty$ and hence

$$\lim_{n\to\infty} n(1 - F(a_n x + b_n)) = \lim_{n\to\infty} n(1 - F(b_n))\,\frac{1 - F(a_n x + b_n)}{1 - F(b_n)} = \bar H(x) \lim_{n\to\infty} n(1 - F(b_n)). \tag{3.10}$$

Consequently, if we show that, as $n \to \infty$,

$$\lim_{n\to\infty} n(1 - F(b_n)) = 1 \tag{3.11}$$

then the theorem will be proved.

To prove (3.11) we first observe that in view of the definition of $b_n$, and observing that $a_n > 0$, for any $\varepsilon > 0$ we have

$$1 - F(b_n + a_n \varepsilon) \le \frac{1}{n} \le 1 - F(b_n)$$

or equivalently, for $\bar H(x) = \bar H_{x_0}(x)$,

$$1 \le n(1 - F(b_n)) \le \frac{1 - F(b_n)}{1 - F(b_n + a_n \varepsilon)} \to \frac{1}{\bar H(\varepsilon)}.$$

We know that $\bar H(x)$ is continuous and $\bar H(0) = 1$, so letting $\varepsilon \to 0$ and using (3.10) we obtain

$$F^n(a_n x + b_n) \to e^{-\bar H(x)}$$

which is equivalent to

$$F^n(a_n x + b_n) \to G(x + x_0)^{-1/\log G(x_0)}.$$

Since $G$ is max-stable, this proves that $F \in D(G)$. □

Our first definition is as follows.

Definition 3.1. We say that $H$ is a multivariate generalised Pareto distribution (MGPD) with positive support if

$$\bar H(x) = \frac{-\log G(x + x_0)}{-\log G(x_0)}$$

for some extreme value distribution $G$, for $x \ge 0$ and with $x_0$ in the support of $G$, and $\bar H(x) = 1$ otherwise.

Further understanding of the meaning of this definition is obtained from the following consideration. It suffices to consider the bivariate case. Suppose that $(X, Y) \sim F(x, y)$ and define

$$\bar H(x, y) := \frac{1 - F(x + a, y + b)}{1 - F(a, b)} = \frac{P((X, Y) \not\le (x + a, y + b))}{P((X, Y) \not\le (a, b))} = P\big((X, Y) \not\le (x + a, y + b) \mid (X, Y) \not\le (a, b)\big) \quad \text{for } (x, y) \ge (0, 0),$$

$$H(x, y) = \begin{cases} 1 - \bar H(x, y), & (x, y) \ge (0, 0), \\ 0, & \text{otherwise} \end{cases}$$

(note $H(0) = 1 - \bar H(0) = 1 - 1 = 0$ also). Then $H$ is the d.f. of

$$\big((X - a)_+, (Y - b)_+\big) \mid \big((X - a)_+, (Y - b)_+\big) \ne (0, 0),$$

i.e.,

$$P\big((X - a)_+ \le x,\ (Y - b)_+ \le y \mid ((X - a)_+, (Y - b)_+) \ne (0, 0)\big) = H(x, y)$$

for $(x, y) \in \mathbb{R}^2$.

Figure 2 illustrates the BGPD with positive support and the mass on the axes.

It should be mentioned that with a slight change in the range of $x$ an alternative definition of MGPD can be obtained. For convenience we give also the definition for this case.

Definition 3.2. We say that $H$ is a multivariate generalised Pareto distribution (MGPD) with full support if

$$\bar H(x) = \frac{-\log G(x + x_0)}{-\log G(x_0)}$$

for some extreme value distribution $G$ with $x \not\le 0$ and $x_0$ in the support of $G$.

Figure 2: Bivariate generalised Pareto distribution with positive support. The arrows illustrate that the mass in each shaded area is moved to the axes. The support of the distribution is $(x, y) \ge (0, 0)$. Compared to Definition 3.1 we have replaced $x_0$ by $(a, b)$ and assumed that $G$ is defined in $\mathbb{R}^2$. (Axes: $X$ with marks $a$ and $a + x$; $Y$ with marks $b$ and $b + y$.)

Figure 3 illustrates the BGPD with full support. It is interesting to compare these two definitions of BGPD's (Figures 2 and 3). Note that in both cases we only consider observations which are "extreme" in at least one margin (here extreme means that the marginal observation is over a high threshold in that margin). The difference is that in the first definition the total mass is concentrated in $(x, y) \ge (0, 0)$. This consists of a bivariate measure on $\mathbb{R}^2_+$ and two univariate measures on the axes. In the second definition there is only one bivariate measure on $(x, y) \not\le (0, 0)$. In statistical applications this difference implies that some components of the observations from a MGPD with full support can be negative. For further discussion of these two definitions see Sections � and � (see also Figure � on page ��).

Figure 3: Bivariate generalised Pareto distribution with full support. The shaded area illustrates the bivariate measure in the support of the distribution, $(x, y) \not\le (0, 0)$. Compared to Definition 3.2 we have replaced $x_0$ by $(a, b)$ and assumed that $G$ is defined in $\mathbb{R}^2$. (Axes: $X$ with marks $a$ and $a + x$; $Y$ with marks $b$ and $b + y$.)

Theorem 3.1 shows that there is a close connection between multivariate generalised Pareto distributions and multivariate extreme value distributions. If $G$ is a multivariate extreme value distribution, then each of its marginal distributions is a univariate extreme value distribution. The univariate extremal distributions can all be obtained from one another by means of simple functional transformations, see e.g. ��� �, page ���, and ����. If the random vector $X$ has a multivariate extreme value distribution, then so does $Y$ if $Y$ has marginal components which are derived from the corresponding marginal components of $X$ by these transformations. It follows that a particular marginal distribution may be chosen and that much of the interest is in the so called dependence function (see Sections � and � for further discussion of dependence functions). Different authors have assumed different marginal distributions. Tiago de Oliveira (���, ���� and ����) used the standard Gumbel distribution, $\exp(-e^{-x})$, whereas Pickands (see ���� and ���, chapter �) considered min-stable distributions assuming the margins to be exponential. The random vector $X$ or its distribution is called exponential in the sense of Pickands if its survival function $G(x)$ satisfies the relation

$$-t \log G(x/t) = -\log G(x)$$

for any vector $x$ and any scalar $t$, $0 < t < \infty$. The survival function $G(x)$ is defined as

$$G(x) = P\Big(\bigcap_{i=1}^{d} \{X_i > x_i\}\Big).$$

This is a class of multivariate exponential distributions which generalises the min-stability property of the univariate exponential distribution to higher dimensions. Multivariate exponential distributions have also been studied by Gumbel, Marshall and Olkin, see e.g. ���� and ����. De Haan and Resnick chose the unit Fréchet distribution $\Phi_1(x) = e^{-1/x}$ for margins.

In order to characterise MGPD's, we will use a characterisation of max-stable distributions. In the next section we present a characterisation of max-stable distributions due to Resnick and de Haan (�� ��), and then give the corresponding representation of MPD. Finally we will discuss some properties of MGPD's.

5 Characterisation of Multivariate Max-Stable Distributions

Max-stable distributions form a subclass of the max-infinitely divisible (max-id) d.f.'s, which is the class of d.f.'s $F(x_1, \ldots, x_d)$ for which $F^t(x_1, \ldots, x_d)$ is a d.f. for every $t > 0$. Balkema and Resnick ��� have studied the necessary and sufficient conditions for a distribution function to be max-id. If $G$ is a max-stable d.f., then clearly $G^t$ is also a d.f. for every $t > 0$ and hence every max-stable d.f. is max-id.

One characterisation of max-id distributions is presented in the following proposition from Resnick (�� �, page ���).

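Proposition 3.1. The following are equivalent:

(i) $F$ is max-id.

(ii) For some $k \in [-\infty, \infty)^d$ there exists a Radon measure $\mu$ on $E := [k, \infty] \setminus \{k\}$ such that

$$F(y) = \begin{cases} \exp\{-\mu([-\infty, y]^c)\}, & y \ge k, \\ 0, & \text{otherwise} \end{cases} \tag{3.12}$$

and such that

$$\mu(E \setminus [-\infty, \infty)^d) = \mu\Big(\bigcup_{i=1}^{d} \{y \in E : y^{(i)} = \infty\}\Big) = 0$$

and either $k > -\infty$, or $x \ge k$ and $x^{(i)} = -\infty$ for some $i \le d$ implies $\mu([-\infty, x]^c) = \infty$.

The measure $\mu$ in (3.12) is called an exponent measure.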

To characterise the max-stable distributions with non-degenerate marginals, we first assume that each margin of $G$ has the unit Fréchet extreme value distribution, i.e.,

$$G(\infty, \ldots, x_i, \ldots, \infty) = \Phi_1(x_i) = \exp(-x_i^{-1}), \quad x_i > 0.$$

Each max-stable distribution with unit Fréchet marginals will be denoted by $G_*$. It can be shown that this standardisation doesn't introduce any difficulties. Specifically let $G$ be a multivariate distribution with non-degenerate marginals and denote its $i$-th marginal distribution by $G_i$. Define

$$U_i = \frac{-1}{\log G_i}, \quad 1 \le i \le d.$$

The distribution of $(U_1(X_1), \ldots, U_d(X_d))$ has unit Fréchet marginals and we denote it, as mentioned above, by $G_*$, i.e.

$$G_*(x) = G\big(G_1^{\leftarrow}(e^{-1/x_1}), \ldots, G_d^{\leftarrow}(e^{-1/x_d})\big), \quad x > 0,$$

with $G^{\leftarrow}(y) = \inf\{x : G(x) \ge y\}$.

In order to check (3.5), we need also to standardise the marginal distributions of $F$,

$$V_i = \frac{1}{1 - F_i}, \quad 1 \le i \le d,$$

and define

$$F_*(x) = F\Big(F_1^{\leftarrow}\Big(1 - \frac{1}{x_1}\Big), \ldots, F_d^{\leftarrow}\Big(1 - \frac{1}{x_d}\Big)\Big), \quad x > 1.$$

Note that $F_*$ is the distribution function of $(V_1(X_1), \ldots, V_d(X_d))$. Now it can be shown that if $F \in D(G)$ then $F_* \in D(G_*)$, and if $F_* \in D(G_*)$ and (3.1) holds for each margin then $F \in D(G)$ (see �� �, page ���).

Considering properties of $\Phi_1(x)$, max-stability for $G_*$ can be written as

$$G_*^t(t x_1, \ldots, t x_d) = G_*(x_1, \ldots, x_d), \quad t > 0,$$

and for the exponent measure, say $\mu_*$, of $G_*$, it is translated into a homogeneity property:

$$\mu_*([0, x]^c) = t \mu_*([0, tx]^c) = t \mu_*(t[0, x]^c).$$

For fixed $t$ the measure $t \mu_*(t\,\cdot)$ agrees with $\mu_*$ on a generating class closed under intersections and hence

$$\mu_*(B) = t \mu_*(tB) \tag{3.13}$$

for all $B \in \mathcal{B}([0, \infty)^d)$. The next proposition characterises the max-stable distributions with $\Phi_1$ marginals. Here $\|\cdot\|$ is an arbitrary norm in $\mathbb{R}^d$.

Proposition 3.2. The following statements are equivalent:

(i) $G_*(x)$ is a multivariate extreme value distribution with $\Phi_1$ marginals.

(ii) There exists a finite measure $S$ on

$$\aleph = \{y \in E : \|y\| = 1\} \tag{3.14}$$

satisfying

$$\int_{\aleph} a^{(i)} S(da) = 1, \quad 1 \le i \le d, \tag{3.15}$$

such that for $x \in \mathbb{R}^d$

$$G_*(x) = \exp\Big\{-\int_{\aleph} \bigvee_{i=1}^{d} \frac{a^{(i)}}{x^{(i)}}\, S(da)\Big\}.$$

The only constraints on $S$ are given by equation (3.15). This means that no finite parametrisation exists for this family of distributions.

A useful interpretation of the above proposition is based on the limiting point process of normalised random variables. Suppose that $X_1, X_2, \ldots$ is a sequence of IID random variables on $\mathbb{R}^d_+$ whose distribution function $F$ is in the domain of attraction of a multivariate extreme value distribution $G_*$ with unit Fréchet marginals $\Phi_1(x) = \exp(-x^{-1})$, $x > 0$. Consider the point process $P_n = \{X_j/n,\ j = 1, 2, \ldots, n\}$. Then $P_n$ converges to a non-homogeneous Poisson process on $\mathbb{R}^d_+ \setminus \{0\}$, see e.g. ��� and ����. The limiting intensity measure $\mu_*$ of this process satisfies (3.13), i.e., it is homogeneous of order $-1$. This means that for all measurable sets $B$ which are bounded away from $0$ we have

$$\mu_*\{Bt\} = t^{-1} \mu_*\{B\}. \tag{3.16}$$

Now for an arbitrary norm $\|\cdot\|$ define the transformation

$$T : X \mapsto (\|X\|, \|X\|^{-1} X). \tag{3.17}$$

Thus $T(x) = (r, w)$ is a kind of polar transformation. It will give us another interpretation of (3.16). Fix a Borel set $B \subset W$ and define $D(r, B) = \{(l, w) : l > r,\ w \in B\}$. It is easy to show that for $M(r) = \mu_*(T^{-1}(D(r, B)))$ the relation (3.16) in the new coordinate system becomes $M(r) = t M(rt)$. Setting $t = r^{-1}$ and $M(1) = S(B)$ gives

$$M(r) = r^{-1} S(B) \tag{3.18}$$

where $S$ is a finite measure on $W$. This equation gives

$$\mu_*(T^{-1}(dr \times dw)) = r^{-2}\, dr\, dS(w). \tag{3.19}$$

The representation of the limiting distribution of componentwise maxima in Proposition 3.2 is a consequence of the limiting Poisson process with intensity (3.19). This is easily seen by observing that for $A = [0, x]^c$

$$P(n^{-1} M_n \le x) = P(n^{-1} X_j \notin A,\ j = 1, 2, \ldots, n)$$

and the latter probability converges to $\exp\{-\mu_*(A)\}$, where

$$\mu_*([0, x]^c) = \int_A r^{-2}\, dr\, dS(w) \tag{3.20}$$

$$= \int_{\aleph} dS(w) \int_{r > \bigwedge_{i=1}^{d} (x_i / w_i)} r^{-2}\, dr \tag{3.21}$$

and the last equation is the same as the exponent measure in Proposition 3.2.

Now we define $r = \|X\| = \sum_{i=1}^{d} X_i$. In this case $S$ is a positive finite measure on the $(d-1)$-dimensional unit simplex

$$\aleph = \Big\{w : \sum_{i=1}^{d} w_i = 1,\ w_i \ge 0,\ i = 1, \ldots, d\Big\}$$

and we have the following representation for the intensity of the limiting Poisson process,

$$\mu_*([0, x]^c) = \int_{\aleph} \bigvee_{i=1}^{d} \Big(\frac{w_i}{x_i}\Big)\, dS(w) \tag{3.22}$$

for some measure $S$ satisfying condition (3.15).
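When the measure $S$ is discrete, the integral over the simplex becomes a finite sum and the homogeneity and max-stability properties can be checked directly. The following sketch assumes a hypothetical three-dimensional measure with four atoms (stored in `ATOMS`); the atoms, masses and function names are chosen for illustration only and do not come from the original.

```python
import math

def exponent_measure(x, atoms):
    """mu([0, x]^c) for a discrete measure S given as (w, mass) atoms on the simplex."""
    return sum(m * max(wi / xi for wi, xi in zip(w, x)) for w, m in atoms)

def G_star(x, atoms):
    """Max-stable d.f. with unit Frechet margins built from the exponent measure."""
    return math.exp(-exponent_measure(x, atoms))

def first_moments(atoms, d):
    """Integrals of each coordinate w_i against S; each should equal 1."""
    return [sum(m * w[i] for w, m in atoms) for i in range(d)]

# Hypothetical example: mass 0.5 at each vertex and mass 1.5 at the centre
ATOMS = [((1.0, 0.0, 0.0), 0.5), ((0.0, 1.0, 0.0), 0.5), ((0.0, 0.0, 1.0), 0.5),
         ((1 / 3, 1 / 3, 1 / 3), 1.5)]
```

With these atoms each first moment equals one, so the margins are unit Fréchet; for instance `G_star((2.0, math.inf, math.inf), ATOMS)` should equal `exp(-1/2)`, and scaling the argument by `t` divides the exponent measure by `t`.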

An equivalent representation for min-stable distributions has been obtained by Pickands ����. He showed that the dependence function (see Sections � and � for further discussion of dependence functions) for any min-stable distribution with exponential margins can be represented in the above form.

With this characterisation of max-stable distributions and considering Proposition ���, we obtain the following representation for MPD with positive support (see Definition 3.1).

Proposition 3.3. $H(x)$ is a MPD with positive support if there exists a finite measure $S$ satisfying (3.15) such that for $x_0 \in \mathbb{R}^d_+$ and $x \ge 0$

$$\bar H_{x_0}(x) = \frac{\int_{\aleph} \bigvee_{i=1}^{d} \frac{a^{(i)}}{(x + x_0)^{(i)}}\, S(da)}{\int_{\aleph} \bigvee_{i=1}^{d} \frac{a^{(i)}}{x_0^{(i)}}\, S(da)} = \frac{\mu_*([0, x + x_0]^c)}{\mu_*([0, x_0]^c)}. \tag{3.23}$$

It is straightforward to give an analogous characterisation of MPD with full support (see Definition 3.2).


5.1 Two Properties of Multivariate Generalised Pareto Distributions

In the univariate case the family of GPD's is closed with respect to changes of the threshold. More precisely, with the notation of (3.3),

$$\frac{\bar H(x_1; \sigma, \gamma)}{\bar H(x_0; \sigma, \gamma)} = \bar H(x_1 - x_0; \sigma + \gamma x_0, \gamma)$$

for any $x_1 > x_0$.
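The univariate threshold-stability relation is easy to verify numerically: raising the threshold from $x_0$ to $x_1$ keeps the shape and replaces the scale by the old scale plus the shape times $x_0$. The sketch below assumes the usual scale/shape parametrisation; the names `gpd_sf` and `threshold_gap` and the argument names `sigma`, `gamma` are illustrative assumptions.

```python
import math

def gpd_sf(x, sigma, gamma):
    """Survival function of the GPD with scale sigma and shape gamma."""
    if x <= 0:
        return 1.0
    if abs(gamma) < 1e-12:
        return math.exp(-x / sigma)
    t = 1.0 + gamma * x / sigma
    return t ** (-1.0 / gamma) if t > 0 else 0.0

def threshold_gap(x0, x1, sigma, gamma):
    """Difference between the two sides of the threshold-stability relation."""
    lhs = gpd_sf(x1, sigma, gamma) / gpd_sf(x0, sigma, gamma)
    rhs = gpd_sf(x1 - x0, sigma + gamma * x0, gamma)
    return lhs - rhs
```

The gap should vanish up to rounding for heavy-tailed, exponential and bounded cases alike, provided `x0` lies inside the support.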

A similar relation holds in the multivariate case. We show that the family of MGPD is closed with respect to change of $x_0$. More precisely, we show that for $x_1 \in S$ and $x_1 > x_0$,

$$\bar H_{x_0}(x) = \bar H_{x_1}(\alpha x + m)$$

for some $\alpha > 0$ and $m \in \mathbb{R}^d$. In fact, since $G$ is max-stable, and writing $t = \log G(x_1)/\log G(x_0)$,

$$\bar H_{x_0}(x) = \frac{-\log G(x + x_0)}{-\log G(x_0)} = \frac{-\log G(x + x_0)}{-\log G(x_1)} \cdot \frac{\log G(x_1)}{\log G(x_0)} = \frac{-\log G(\alpha(t)(x + x_0) + \beta(t))}{-\log G(x_1)}$$

$$= \frac{-\log G(x_1 + \alpha(t)x + \alpha(t)x_0 + \beta(t) - x_1)}{-\log G(x_1)} = \bar H_{x_1}(\alpha x + m)$$

for $\alpha = \alpha(t)$ and $m = \alpha(t)x_0 + \beta(t) - x_1$.

Note also that in the case with unit Fréchet marginals, if we suppose that the components of $u$ are equal, we obtain the following representation of MPD which shows that $\bar H_u(ax) = \bar H_1(ax/u)$ for $t > 0$ and $a > 0$. We have

$$\frac{-\log G_*(u + ax)}{-\log G_*(u)} = \frac{\mu_*([0, u + ax]^c)}{\mu_*([0, u]^c)} = \mu_*\big([0,\ u\,\mu_*([0, u]^c) + ax\,\mu_*([0, u]^c)]^c\big)$$

$$= \mu_*\big([0,\ \mu_*([0, 1]^c) + ax\,\mu_*([0, u]^c)]^c\big) = \mu_*\Big(\Big[0,\ \Big(1 + \frac{ax}{u}\Big)\mu_*([0, 1]^c)\Big]^c\Big) = \frac{\mu_*([0, 1 + ax/u]^c)}{\mu_*([0, 1]^c)}.$$

A further simpli�cation is possible if we assume x � tu� The last expressionthen becomes

������� at�c

�������c�� logG���� at�

� logG����� t � �


which shows that $\bar H_u(ax) = \bar H_1(at)$. These relations become of particular importance in estimation of parameters in MPD's. If one assumes that the thresholds are equal then they can simply be interpreted as scale parameters in the model and a major simplification of the likelihood function is possible. See Sections � and � for further discussion of estimation of parameters in BPD's and BGPD's.

� Bivariate Pareto Distributions

We now consider the bivariate case and give a simple representation of bivariate Pareto distributions. We also consider some specific examples of BPD's. When $d = 2$, the domain of integration is one-dimensional and Relation (�.��) becomes

\[ \mu_*([0,(x,y)]^c) = \int_0^1 \max\left\{\frac{w}{x}, \frac{1-w}{y}\right\} S(dw) = \left(\frac{1}{x} + \frac{1}{y}\right) A\left(\frac{x}{x+y}\right), \tag{�.��} \]

where

\[ A(q) = \int_0^1 \max\{w(1-q),\, (1-w)q\}\, S(dw). \tag{�.��} \]

The function $A(q)$ is called the dependence function, see e.g. [��]. Here $S$ is a finite positive measure on the interval $[0,1]$. In order that (�.��) is satisfied we need

\[ \int_0^1 w\, S(dw) = \int_0^1 (1-w)\, S(dw) = 1. \tag{�.��} \]

For every $w \in [0,1]$, $\max\{w(1-q), (1-w)q\}$ is a convex function and hence $A(q)$ is convex since convexity is preserved through maximisation and summation. By (�.��) we have that $A(0) = A(1) = 1$. In addition from (�.��) we see that the function is bounded above by 1 and below by $\max\{q, 1-q\}$. The first boundary corresponds to independent extremes. In this case the corresponding measure $S$ puts mass one at each of the endpoints 0 and 1. The other boundary corresponds to completely dependent extremes, that is $P(X = Y) = 1$. The measure $S$ then puts mass 2 at $\frac12$. Equation (�.��) gives that $S(dw)/2$ is a probability measure with mean $\frac12$. As we mentioned earlier there is no finite-dimensional parametric family for the dependence function. In Sections � and � we will consider different parametric subfamilies of dependence functions and will study estimation of parameters in the corresponding BPD. We now give a representation of BPD in terms of dependence functions.
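The two boundary cases can be reproduced with a few lines of code. This is a minimal sketch for a discrete measure $S$ given as a list of (location, mass) pairs; the function names and the mixture example are illustrative and not part of the original text.

```python
def dependence_function(q, atoms):
    """A(q) = sum_j m_j * max(w_j*(1-q), (1-w_j)*q) for S = sum_j m_j * delta_{w_j}."""
    return sum(m * max(w * (1.0 - q), (1.0 - w) * q) for w, m in atoms)

def first_moments(atoms):
    """Return (int w S(dw), int (1-w) S(dw)); both must equal 1."""
    return (sum(m * w for w, m in atoms),
            sum(m * (1.0 - w) for w, m in atoms))

independent = [(0.0, 1.0), (1.0, 1.0)]            # mass one at each endpoint
dependent = [(0.5, 2.0)]                          # mass two at 1/2
mixture = [(0.0, 0.5), (1.0, 0.5), (0.5, 1.0)]    # equal-weight mixture of the two

for q in (0.0, 0.25, 0.5, 1.0):
    print(q,
          dependence_function(q, independent),    # constant 1
          dependence_function(q, dependent),      # max(q, 1-q)
          dependence_function(q, mixture))        # lies between the two bounds
```

The mixture illustrates that a convex combination of admissible measures is again admissible, so its dependence function stays between the two boundaries.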


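Proposition �.� $H(x,y)$ is a BPD with positive support if there exists a convex function $A$ on $[0,1]$ satisfying $A(0) = A(1) = 1$ and $\max\{w, 1-w\} \le A(w) \le 1$ such that for $(x_0, y_0) \in \mathbb{R}^2_+$ and $(x,y) > (0,0)$

\[ \bar H(x,y) = \frac{\left(\frac{1}{x+x_0} + \frac{1}{y+y_0}\right) A\left(\frac{x+x_0}{x+x_0+y+y_0}\right)}{\left(\frac{1}{x_0} + \frac{1}{y_0}\right) A\left(\frac{x_0}{x_0+y_0}\right)}. \]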

A similar characterisation can be given for BPD's with full support.

Sometimes it is easier to work with the exponent measure directly instead of the dependence function. The advantage is that many algebraic expressions take on a particularly simple form. For example if in (�.��) we transform to usual polar coordinates, we get

T � R� � R� � ��� ���

with r� � x� � y� and sin� � x�

x��y� �

Now by using (�.��) it can be shown that (see [��] and Figure �) $G_*(x,y)$ is max-stable if and only if the exponent measure $\mu_*$ can be represented as

\[ \mu_*([0,(x,y)]^c) = x^{-1} \int_{[0,\, \arctan(y/x)]} \cos\theta\, S(d\theta) + y^{-1} \int_{(\arctan(y/x),\, \pi/2]} \sin\theta\, S(d\theta), \tag{�.��} \]

where $S$ is a finite measure on $[0, \pi/2]$ such that

\[ \int_0^{\pi/2} \cos\theta\, S(d\theta) = \int_0^{\pi/2} \sin\theta\, S(d\theta) = 1. \]

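This gives the following representation of BPD

\[ \bar H(x,y) = \frac{\mu_*([0,(x+x_0, y+y_0)]^c)}{\mu_*([0,(x_0,y_0)]^c)} = \frac{(x+x_0)^{-1} \int_0^{\arctan\frac{y+y_0}{x+x_0}} \cos\theta\, S(d\theta) + (y+y_0)^{-1} \int_{\arctan\frac{y+y_0}{x+x_0}}^{\pi/2} \sin\theta\, S(d\theta)}{x_0^{-1} \int_0^{\arctan(y_0/x_0)} \cos\theta\, S(d\theta) + y_0^{-1} \int_{\arctan(y_0/x_0)}^{\pi/2} \sin\theta\, S(d\theta)}. \tag{�.��} \]

From the representation of BPD in (�.��) we see that in particular

\[ 1 - \bar H(\infty, 0) = 1 - \frac{y_0^{-1}}{x_0^{-1} \int_0^{\arctan(y_0/x_0)} \cos\theta\, S(d\theta) + y_0^{-1} \int_{\arctan(y_0/x_0)}^{\pi/2} \sin\theta\, S(d\theta)}, \]

which gives the total mass on the X-axis for BPD with positive support.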

Let's look at some examples of bivariate Pareto distributions which can be obtained from the characterisation (�.��).

Figure �: Bivariate Pareto distribution in a polar coordinate system (axes x and y; the angle $\arctan(y/x)$ is marked).

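Example � (see [�], page ���). If $S(\theta) = \theta$ for $0 \le \theta \le \pi/2$ then

\[ \int_0^{\arctan(y/x)} \cos\theta\, S(d\theta) = \sin(\arctan(y/x)) = \frac{y}{\sqrt{x^2+y^2}} \]

and similarly $\int_{\arctan(y/x)}^{\pi/2} \sin\theta\, S(d\theta) = \frac{x}{\sqrt{x^2+y^2}}$. This gives that

\[ \mu_*([0,(x,y)]^c) = \frac{y}{x\sqrt{x^2+y^2}} + \frac{x}{y\sqrt{x^2+y^2}} = (x^{-2} + y^{-2})^{1/2} \]

and hence for $x_0, y_0 > 0$,

\[ \bar H(x,y) = \frac{\{(x+x_0)^{-2} + (y+y_0)^{-2}\}^{1/2}}{(x_0^{-2} + y_0^{-2})^{1/2}} \]

for $x, y > 0$.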

Example � (see [��]). Take $S(\theta) = 3\int_0^{\theta} \cos t \sin t\, dt$, $0 \le \theta \le \pi/2$, so that $S(d\theta) = 3\cos\theta \sin\theta\, d\theta$. Then

\[ \int_0^{\pi/2} 3\cos^2\theta \sin\theta\, d\theta = 1 = \int_0^{\pi/2} 3\sin^2\theta \cos\theta\, d\theta \]

and

\[ \int_0^{\arctan(y/x)} 3\cos^2\theta \sin\theta\, d\theta = 1 - \frac{x^3}{(x^2+y^2)^{3/2}}, \qquad \int_{\arctan(y/x)}^{\pi/2} 3\cos\theta \sin^2\theta\, d\theta = 1 - \frac{y^3}{(x^2+y^2)^{3/2}}, \]

and this results in the following BPD

\[ \bar H(x,y) = \frac{(x+x_0)^{-1} + (y+y_0)^{-1} - \{(x+x_0)^2 + (y+y_0)^2\}^{-1/2}}{x_0^{-1} + y_0^{-1} - (x_0^2 + y_0^2)^{-1/2}} \]

for $x, y > 0$ and $x_0, y_0 > 0$.

Example � (see [��]). For $0 \le k \le 1$, let

\[ S\{0\} = S\{\pi/2\} = 1 - k, \qquad S(\theta) = \int_0^{\theta} 2k(\cos y + \sin y)^{-3}\, dy. \]

This gives the exponent measure

\[ \mu_*([0,(x,y)]^c) = x^{-1} + y^{-1} - k(x+y)^{-1} \]

and the corresponding BPD is

\[ \bar H(x,y) = \frac{(x+x_0)^{-1} + (y+y_0)^{-1} - k(x+x_0+y+y_0)^{-1}}{x_0^{-1} + y_0^{-1} - k(x_0+y_0)^{-1}} \]

for $x, y > 0$ and $x_0, y_0 > 0$.

Note that for $k = 0$, $S\{0\} = S\{\pi/2\} = 1$ and $S$ places no mass elsewhere. This means that all the mass is concentrated on the axes and hence corresponds to an independent extreme value distribution. It is also interesting to note that if we take $S\{\pi/4\} = \sqrt{2}$ in (�.��), we get $\mu_*([0,(x,y)]^c) = x^{-1} \vee y^{-1}$, so that the mass will be concentrated on the 45° line in the positive quadrant.
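The closed forms in the three examples can be checked against the polar representation by numerical integration. The sketch below uses a plain midpoint rule; the function name, the test point and the value of $k$ are illustrative choices, and the densities are those stated in the examples.

```python
import math

def exponent_measure(x, y, density, atoms=(), n=2000):
    """(1/x) * int_[0,a] cos(t) S(dt) + (1/y) * int_(a,pi/2] sin(t) S(dt), a = arctan(y/x).

    `density` is the density of the absolutely continuous part of S,
    `atoms` is a list of (angle, mass) pairs for its point masses.
    """
    a = math.atan2(y, x)                      # equals arctan(y/x) for x, y > 0

    def midpoint(f, lo, hi):
        h = (hi - lo) / n
        return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

    cos_part = midpoint(lambda t: math.cos(t) * density(t), 0.0, a)
    sin_part = midpoint(lambda t: math.sin(t) * density(t), a, math.pi / 2.0)
    for t, mass in atoms:
        if t <= a:
            cos_part += mass * math.cos(t)
        else:
            sin_part += mass * math.sin(t)
    return cos_part / x + sin_part / y

x, y, k = 1.5, 0.7, 0.4
ex1 = exponent_measure(x, y, lambda t: 1.0)
ex2 = exponent_measure(x, y, lambda t: 3.0 * math.cos(t) * math.sin(t))
ex3 = exponent_measure(x, y, lambda t: 2.0 * k * (math.cos(t) + math.sin(t)) ** -3,
                       atoms=[(0.0, 1.0 - k), (math.pi / 2.0, 1.0 - k)])
print(ex1, (x ** -2 + y ** -2) ** 0.5)
print(ex2, 1 / x + 1 / y - (x * x + y * y) ** -0.5)
print(ex3, 1 / x + 1 / y - k / (x + y))
```

Each printed pair should agree to several decimals; dividing two such values at $(x + x_0, y + y_0)$ and $(x_0, y_0)$ gives the corresponding BPD tail.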

� Modelling the Dependence Function

A notable feature of bivariate extreme value theory is that no natural parametric family exists for the dependence between the marginal distributions. Basically there have been two approaches to modelling dependence functions: nonparametric and parametric methods. The parametric models can also be divided into two classes: differentiable and non-differentiable models. The differentiable models have densities and can be symmetric or asymmetric. The non-differentiable models give distributions which are singular. Tawn [��] reviews the parametric methods and also introduces two new asymmetric differentiable models. A survey of existing non-differentiable models can be found in Tiago de Oliveira [��] and [��].

In this section we consider a few existing parametric differentiable models and generalise two of them to a 2-parameter family. As we mentioned in the previous section the class of dependence functions is a convex set, so other dependence functions can be obtained by mixing.

(i) The mixed model with unit Fréchet margins has the exponent measure

\[ \mu_*([0,(x,y)]^c) = \frac{1}{x} + \frac{1}{y} - \frac{\theta}{x+y}, \qquad 0 \le \theta \le 1. \]

Its dependence function is

\[ A(w) = \theta w^2 - \theta w + 1, \qquad 0 \le \theta \le 1. \]

For $\theta = 0$ we have the independent case but complete dependence is not possible in this model.

(ii) The logistic model has exponent measure

\[ \mu_*([0,(x,y)]^c) = (x^{-r} + y^{-r})^{1/r}, \qquad r \ge 1. \]

The dependence function is

\[ A(w) = \{(1-w)^r + w^r\}^{1/r}, \qquad r \ge 1. \]

The independent case corresponds to $r = 1$ and for $r = \infty$ we get the complete dependence, which is the only situation without density.

Extensions of these classes to asymmetric families have been developed by Tawn [��].

(iii) The asymmetric mixed model has dependence function

\[ A(w) = \phi w^3 + \theta w^2 - (\theta + \phi) w + 1, \qquad \theta \ge 0,\ \theta + \phi \le 1,\ \theta + 2\phi \le 1,\ \theta + 3\phi \ge 0. \]

The corresponding exponent measure is

\[ \mu_*([0,(x,y)]^c) = \frac{x^3 + 3x^2y - 2\phi x^2 y - \theta x^2 y + 3xy^2 - \phi x y^2 - \theta x y^2 + y^3}{xy(x+y)^2}. \]

We obtain the symmetric mixed model for $\phi = 0$ and the independent case for $\theta = \phi = 0$. Complete dependence is not possible in this family either. In this family the parameter $\phi$ stands for non-symmetry in the model.


(iv) The asymmetric logistic model has dependence function

\[ A(w) = \{\theta^r (1-w)^r + \phi^r w^r\}^{1/r} + (\theta - \phi) w + 1 - \theta, \qquad 0 \le \theta \le 1,\ 0 \le \phi \le 1,\ r \ge 1, \]

with exponent measure

\[ \mu_*([0,(x,y)]^c) = \frac{(1-\phi)x + (1-\theta)y + x\left(\frac{\theta^r y^r + \phi^r x^r}{(x+y)^r}\right)^{1/r} + y\left(\frac{\theta^r y^r + \phi^r x^r}{(x+y)^r}\right)^{1/r}}{xy}. \]

For $\theta = \phi = 1$ this model reduces to the corresponding symmetric logistic model, which gives the diagonal case for $r = \infty$. Independence is obtained for $\theta = 0$ and for $\phi = 0$ or $r = 1$. This model also contains some other existing models.

For some other parametric families of dependence function see Coles and Tawn [�].
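The four families above are easy to evaluate, and the constraints on a dependence function (endpoint values, bounds, convexity) can be checked on a grid. This is only a numerical sketch with illustrative function names and parameter values; a grid check is not a proof of convexity.

```python
def a_mixed(w, theta):
    return theta * w * w - theta * w + 1.0

def a_logistic(w, r):
    return ((1.0 - w) ** r + w ** r) ** (1.0 / r)

def a_asym_mixed(w, theta, phi):
    return phi * w ** 3 + theta * w * w - (theta + phi) * w + 1.0

def a_asym_logistic(w, theta, phi, r):
    core = ((theta * (1.0 - w)) ** r + (phi * w) ** r) ** (1.0 / r)
    return core + (theta - phi) * w + 1.0 - theta

def is_valid_dependence_function(a, n=400, tol=1e-9):
    """Grid check of A(0)=A(1)=1, max(w,1-w) <= A(w) <= 1 and convexity."""
    grid = [i / n for i in range(n + 1)]
    vals = [a(w) for w in grid]
    ends = abs(vals[0] - 1.0) < tol and abs(vals[-1] - 1.0) < tol
    bounds = all(max(w, 1.0 - w) - tol <= v <= 1.0 + tol
                 for w, v in zip(grid, vals))
    # non-negative second differences are a discrete version of A'' >= 0
    convex = all(vals[i - 1] - 2.0 * vals[i] + vals[i + 1] >= -tol
                 for i in range(1, n))
    return ends and bounds and convex

print(is_valid_dependence_function(lambda w: a_mixed(w, 0.7)))                   # inside 0 <= theta <= 1
print(is_valid_dependence_function(lambda w: a_mixed(w, 1.5)))                   # outside the range
print(is_valid_dependence_function(lambda w: a_asym_logistic(w, 0.8, 0.5, 3.0)))
```

Setting $\phi = 0$ in the asymmetric mixed model, or $\theta = \phi = 1$ in the asymmetric logistic model, recovers the symmetric versions.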

�� The Generalised Symmetric Logistic Model

Among the symmetric differentiable models, the logistic model (originally due to Gumbel) has been shown to be the most useful model, see e.g. [��]. Now we present a generalisation of this model to a 2-parameter family. We first note that for differentiable parametric models, the constraints on the dependence function, mentioned in Section �, translate to

\[ A(0) = A(1) = 1, \tag{�.�} \]

\[ -1 \le A'(0) \le 0 \quad\text{and}\quad 0 \le A'(1) \le 1, \tag{�.�} \]

\[ A''(w) \ge 0, \qquad 0 \le w \le 1. \tag{�.�} \]

The generalised symmetric logistic model has exponent measure

\[ \mu_*([0,(x,y)]^c) = \left(\frac{1}{x^p} + \frac{1}{y^p} + \frac{k}{(xy)^{p/2}}\right)^{1/p}, \qquad 0 \le k \le 2(p-1),\ p \ge 2, \tag{�.�} \]

and the dependence function for this model becomes

\[ A(w) = \left[(1-w)^p + w^p + k\{(1-w)w\}^{p/2}\right]^{1/p}. \]

In order to find the domain of the parameters we check the conditions (�.�)-(�.�). Obviously (�.�) is satisfied. The first derivative of the dependence function, $A'(w)$, is

\[ \frac{-(1-w)^{p-1} + w^{p-1} + \frac{k}{2}(1-2w)\{(1-w)w\}^{p/2-1}}{\left[(1-w)^p + w^p + k\{(1-w)w\}^{p/2}\right]^{1-1/p}} \]

and we have

\[ A'(0) = \begin{cases} k/2 - 1, & p = 2, \\ -1, & p > 2. \end{cases} \]

This gives $0 \le k \le 2$ for $p = 2$. Considering the second derivative of $A(w)$ leads to a second degree equation. Straightforward but lengthy calculations show that (�.�) is satisfied if $0 \le k \le 2(p-1)$. Figure � shows the parameter region for this model.

Figure �: The parameter region for the generalised symmetric logistic model (horizontal axis p, vertical axis k; the plot marks the logistic model and the equivalent models).

Note that for k � � this model reduces to logistic model with p � �� Thismeans that for k � � and p � c with c � �� the same model can be obtainedby choosing k � � and p � �c �see Figure ��� In order to obtain a uniqueparametrisation� k � � should be removed from the parameters region� In thismodel independence corresponds to p � � and k � �� We get the completedependence for k � � and p � ���
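The parameter region can be explored numerically. The sketch below evaluates the dependence function of the generalised symmetric logistic model on a grid and checks the derivative and convexity constraints by finite differences; it is a numerical illustration only, with illustrative function names and tolerances. It also uses the identity $(1-w)^p + w^p + 2\{(1-w)w\}^{p/2} = \{(1-w)^{p/2} + w^{p/2}\}^2$, which shows directly that $k = 2$ with exponent $p$ is a logistic model with exponent $p/2$.

```python
def a_gsl(w, p, k):
    """Dependence function of the generalised symmetric logistic model."""
    return ((1.0 - w) ** p + w ** p + k * ((1.0 - w) * w) ** (p / 2.0)) ** (1.0 / p)

def satisfies_constraints(p, k, n=400, tol=1e-12):
    """Finite-difference check of -1 <= A'(0) <= 0, 0 <= A'(1) <= 1 and A'' >= 0."""
    vals = [a_gsl(i / n, p, k) for i in range(n + 1)]
    slope0 = (vals[1] - vals[0]) * n          # one-sided slope at 0
    slope1 = (vals[-1] - vals[-2]) * n        # one-sided slope at 1
    convex = all(vals[i - 1] - 2.0 * vals[i] + vals[i + 1] >= -tol
                 for i in range(1, n))
    return (-1.0 - 1e-6 <= slope0 <= 1e-6
            and -1e-6 <= slope1 <= 1.0 + 1e-6
            and convex)

print(satisfies_constraints(4.0, 3.0))   # k below 2(p-1) = 6
print(satisfies_constraints(4.0, 7.0))   # k above 2(p-1): convexity fails near w = 1/2
print(satisfies_constraints(2.0, 3.0))   # p = 2 with k > 2: A'(0) > 0
print(a_gsl(0.3, 3.0, 0.0), a_gsl(0.3, 6.0, 2.0))   # the same logistic model twice
```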


�� The Generalised Symmetric Mixed Model

It is also possible to generalise the mixed model to a 2-parameter differentiable model. We call the following exponent measure the generalised symmetric mixed model

\[ \mu_*([0,(x,y)]^c) = \frac{1}{x} + \frac{1}{y} - \frac{k}{(x^p + y^p)^{1/p}}, \qquad (0 \le k \le 1,\ p \ge 0). \tag{�.�} \]

The dependence function of this model is

\[ A(w) = 1 - k\left\{(1-w)^{-p} + w^{-p}\right\}^{-1/p}. \]

The first condition (�.�) is satisfied for $p > 0$. The first derivative $A'(w)$ is

\[ \frac{k\left\{(1-w)^{-1-p} - w^{-1-p}\right\}}{\left\{(1-w)^{-p} + w^{-p}\right\}^{1+1/p}}. \]

It is easy to show that

\[ \lim_{w \to 0} A'(w) = -k \quad \text{for } p > 0 \]

and hence (�.�) gives the restriction $0 \le k \le 1$.

The second derivative $A''(w)$ is

\[ \frac{k(1+p)(1-w)^p w^{p-2}}{(1-w)^2 \left\{(1-w)^{-p} + w^{-p}\right\}^{1/p} \left\{(1-w)^p + w^p\right\}^2}, \]

which obviously is non-negative for $0 \le k \le 1$ and $p > 0$. In this model we have independence for $k = 0$ or $p = 0$. Complete dependence can be obtained with $k = 1$ and $p = \infty$. Note that complete dependence is not possible in any of the existing mixed models.
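A short numerical check of these statements, under the formulas as given above: for $p = 1$ the model is the ordinary mixed model, for large $p$ with $k = 1$ it approaches $\max\{w, 1-w\}$, and the closed-form second derivative agrees with a finite-difference approximation. Function names and test values are illustrative.

```python
def a_gsm(w, p, k):
    """Dependence function of the generalised symmetric mixed model (p > 0)."""
    if w <= 0.0 or w >= 1.0:
        return 1.0                       # limiting value at the endpoints
    return 1.0 - k * ((1.0 - w) ** -p + w ** -p) ** (-1.0 / p)

def a_gsm_second_derivative(w, p, k):
    """Closed-form A''(w) for 0 < w < 1."""
    num = k * (1.0 + p) * (1.0 - w) ** p * w ** (p - 2.0)
    den = ((1.0 - w) ** 2
           * ((1.0 - w) ** -p + w ** -p) ** (1.0 / p)
           * ((1.0 - w) ** p + w ** p) ** 2)
    return num / den

w, k = 0.3, 0.6
print(a_gsm(w, 1.0, k), 1.0 - k * w * (1.0 - w))      # p = 1: mixed model with theta = k
print(a_gsm(w, 200.0, 1.0), max(w, 1.0 - w))          # large p, k = 1: near complete dependence

h = 1e-4
fd = (a_gsm(0.4 + h, 2.5, 0.8) - 2.0 * a_gsm(0.4, 2.5, 0.8) + a_gsm(0.4 - h, 2.5, 0.8)) / (h * h)
print(fd, a_gsm_second_derivative(0.4, 2.5, 0.8))
```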

� Statistics of Bivariate Generalised Pareto Distributions

In this section we study estimation of parameters in BGPD's. Recall that we have given two definitions of MGPD's. As discussed in Section �, the main difference is in treating observations which are not "extreme" in all margins. The decision about which model should be used depends mainly on the kind of available data in each case. For example if the data consists of two sequences of observations with different lengths, the first definition will be the obvious choice. Another example is when observations in one sequence have been made conditionally on the second sequence. This is the case when, for example, in a bivariate series of wind speeds and wave heights, the wave height has been measured when the wind speed has been greater than a specific level. In this section we consider some examples of BGPD's and also discuss maximum likelihood estimation of parameters according to the first definition, i.e. BGPD's with positive support. In order to illustrate some properties of these distributions we begin with the BPD which corresponds to the generalised symmetric logistic model and give some numerical examples. With a change in notation we denote the different thresholds by $a$ and $b$ instead of by $x_0$ and $y_0$. In the model these are unknown parameters and must be estimated. As before we denote the tail of the distribution by $\bar H(x,y)$, i.e.,

\[ \bar H(x,y) := 1 - H(x,y) = P((X,Y) \nleq (x,y)). \]

From the generalised symmetric logistic model in (�.�) and considering (�.��), we obtain the following BPD

\[ \bar H(x,y) = \frac{\left[(x+a)^{-p} + (y+b)^{-p} + k\{(x+a)(y+b)\}^{-p/2}\right]^{1/p}}{\left[a^{-p} + b^{-p} + k(ab)^{-p/2}\right]^{1/p}} \tag{�.�} \]

for $x > 0$ and $y > 0$.

The corresponding BPD for the generalised symmetric mixed model (�.�) is

\[ \bar H(x,y) = \frac{\frac{1}{x+a} + \frac{1}{y+b} - \frac{k}{\{(x+a)^p + (y+b)^p\}^{1/p}}}{\frac{1}{a} + \frac{1}{b} - \frac{k}{(a^p + b^p)^{1/p}}} \tag{�.�} \]

with $x > 0$ and $y > 0$.

As discussed in Section �, there will be some probability mass on the axes in these distributions (see Figure �, page ��). We denote the density of the distribution in $\mathbb{R}^2_+$ by $h(x,y)$. The distribution functions of the mass on the axes will be denoted by $F_1(x)$ and $F_2(y)$ and the corresponding densities will be indicated by lower case letters. With this notation we have

\[ h(x,y) = -\frac{d^2 \bar H(x,y)}{dx\,dy}, \]

\[ F_1(x) = 1 - \bar H(x, 0), \qquad F_2(y) = 1 - \bar H(0, y) \]

and

\[ f_1(x) = -\frac{d \bar H(x,0)}{dx}, \qquad f_2(y) = -\frac{d \bar H(0,y)}{dy}. \]

Note that the marginal sub-distribution function corresponding to $h$ (defined in region A of Figure �, page ��) is

\[ \bar H(x, 0) + \bar H(0, \infty) - 1 - \bar H(x, \infty) \]

and the corresponding bivariate sub-distribution function is

\[ \bar H(x, 0) + \bar H(0, y) - \bar H(x, y) - 1. \tag{�.�} \]

It should be mentioned that the total mass in this region is not equal to 1 and "sub-distribution" has been used to emphasise that.

The upper p-quantile curve is defined to be the set of points $(x,y)$ for which

\[ P(X > x, Y > y) = p. \]

This can be calculated by

\[ \{(x,y) : \bar H(x, \infty) + \bar H(\infty, y) - \bar H(x,y) = p\}. \]

The p-quantile curve, which is defined to be all the points $(x,y)$ such that

\[ P((X,Y) \nleq (x,y)) = p, \]

is obtained as

\[ \{(x,y) : \bar H(x,y) = p\}. \]
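As an illustration, points on an upper quantile curve can be found by bisection. The sketch below uses the BPD of the generalised symmetric logistic model with thresholds $a$ and $b$; the function names, the bisection bracket `y_max` and the parameter values are assumptions made for the example.

```python
import math

def mu_gsl(x, y, p, k):
    """Exponent measure of the generalised symmetric logistic model.
    An infinite argument gives the marginal value 1/y or 1/x."""
    return (x ** -p + y ** -p + k * (x * y) ** (-p / 2.0)) ** (1.0 / p)

def tail(x, y, p, k, a, b):
    """BPD tail: exponent measure at (x+a, y+b) divided by its value at (a, b)."""
    return mu_gsl(x + a, y + b, p, k) / mu_gsl(a, b, p, k)

def joint_exceedance(x, y, p, k, a, b):
    """P(X > x, Y > y) = tail(x, inf) + tail(inf, y) - tail(x, y)."""
    inf = math.inf
    return tail(x, inf, p, k, a, b) + tail(inf, y, p, k, a, b) - tail(x, y, p, k, a, b)

def upper_quantile_y(x, prob, p, k, a, b, y_max=1e9, iters=200):
    """Solve P(X > x, Y > y) = prob for y by bisection; None if there is no y >= 0."""
    if joint_exceedance(x, 0.0, p, k, a, b) < prob:
        return None
    lo, hi = 0.0, y_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if joint_exceedance(x, mid, p, k, a, b) > prob:
            lo = mid          # probability still too large: move outwards
        else:
            hi = mid
    return 0.5 * (lo + hi)

params = dict(p=2.0, k=1.0, a=2.0, b=1.0)
for x in (0.0, 0.5, 1.0):
    print(x, upper_quantile_y(x, 0.05, **params))
```

Because part of the mass sits on the axes, the curve only exists for probabilities below the total mass in the open quadrant, which is why the function can return `None`.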

In the generalised symmetric logistic model

\[ F_1(x) = 1 - \frac{\left[b^{-p} + (a+x)^{-p} + k\{b(a+x)\}^{-p/2}\right]^{1/p}}{\left[a^{-p} + b^{-p} + k(ab)^{-p/2}\right]^{1/p}} \]

and the total mass on the X axis is

\[ 1 - \bar H(\infty, 0) = 1 - \frac{b^{-1}}{\left[a^{-p} + b^{-p} + k(ab)^{-p/2}\right]^{1/p}}. \]

Note also that

\[ P(X > x) = \bar H(x, \infty) = \frac{(x+a)^{-1}}{\left[a^{-p} + b^{-p} + k(ab)^{-p/2}\right]^{1/p}}, \]

which is less than 1 for $x = 0$. Of course the total mass sums up to one if we consider the mass on the Y axis too.

In (�.�) for $p = 2$ and $k = 2$, all the mass is concentrated on the x and y axes. This can be easily seen from the distribution function

\[ \bar H(x,y) = \frac{\frac{1}{a+x} + \frac{1}{b+y}}{\frac{1}{a} + \frac{1}{b}}. \]

Another way to see this is to calculate (�.�), which is equal to 0 for $p = 2$ and $k = 2$.

For larger $p$ we get less mass on the axes. For example for p = �, k = � and a = b = � we get

\[ \bar H(\infty, 0) + \bar H(0, \infty) - 1 = \int_0^\infty \int_0^\infty h(x,y)\,dx\,dy = \text{�.����}. \tag{�.�} \]

This value becomes �.��� for p = � and �.��� for p = ��. We see that when $p$ increases the density of mass on the axes decreases. Now let's take $p = 2$, $k = 1$, $a = 2$ and $b = 1$. Recall that for $p = 2$ the definition range for $k$ is $0 \le k \le 2$ and for $k = 2$ we have the independent case with no mass on $\mathbb{R}^2_+$. In this case

\[ F_1(x) = 1 - \frac{2\sqrt{1 + (2+x)^{-2} + \frac{1}{2+x}}}{\sqrt{7}} \]

and

\[ \int_0^\infty f_1(x)\,dx = \lim_{x \to \infty} F_1(x) = 1 - \frac{2}{\sqrt{7}}. \]

The distribution of mass on the Y axis is

\[ F_2(y) = 1 - \frac{2\sqrt{\frac14 + (1+y)^{-2} + \frac{1}{2(1+y)}}}{\sqrt{7}} \]

and

\[ \int_0^\infty f_2(y)\,dy = \lim_{y \to \infty} F_2(y) = 1 - \frac{1}{\sqrt{7}}, \]

and this means that the mass on $\mathbb{R}^2_+$ is $\frac{3}{\sqrt{7}} - 1$. Figure � shows the corresponding densities of this example.
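The masses in this example can be reproduced directly from the formulas for $F_1$, $F_2$ and the normalising constant. A minimal sketch with illustrative function names:

```python
import math

def norm_const(p, k, a, b):
    """Denominator [a^-p + b^-p + k (ab)^(-p/2)]^(1/p) of the BPD."""
    return (a ** -p + b ** -p + k * (a * b) ** (-p / 2.0)) ** (1.0 / p)

def F1(x, p, k, a, b):
    """Distribution function of the mass on the X axis, 1 - tail(x, 0)."""
    num = (b ** -p + (a + x) ** -p + k * (b * (a + x)) ** (-p / 2.0)) ** (1.0 / p)
    return 1.0 - num / norm_const(p, k, a, b)

def F2(y, p, k, a, b):
    """Distribution function of the mass on the Y axis, 1 - tail(0, y)."""
    num = (a ** -p + (b + y) ** -p + k * (a * (b + y)) ** (-p / 2.0)) ** (1.0 / p)
    return 1.0 - num / norm_const(p, k, a, b)

def axis_masses(p, k, a, b):
    """Return (mass on X axis, mass on Y axis, mass in the open quadrant)."""
    d = norm_const(p, k, a, b)
    mass_x = 1.0 - (1.0 / b) / d
    mass_y = 1.0 - (1.0 / a) / d
    return mass_x, mass_y, 1.0 - mass_x - mass_y

print(axis_masses(2.0, 1.0, 2.0, 1.0))   # 1 - 2/sqrt(7), 1 - 1/sqrt(7), 3/sqrt(7) - 1
print(axis_masses(2.0, 2.0, 2.0, 1.0))   # independent case: no mass in the quadrant
```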

Figure �: $h(x,y)$, $H(x,y)$, $F_1(x)$ and $F_2(y)$ for $p = 2$, $k = 1$, $a = 2$ and $b = 1$.

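It is interesting to note that if we increase $k$ to 2 (independent case) we obtain

\[ F_1(x) = 1 - \frac{2\sqrt{1 + (2+x)^{-2} + \frac{2}{2+x}}}{3} \]

and

\[ \int_0^\infty f_1(x)\,dx = \lim_{x \to \infty} F_1(x) = \frac13 \]

and

\[ F_2(y) = 1 - \frac{2\sqrt{\frac14 + (1+y)^{-2} + \frac{1}{1+y}}}{3} \]

which gives

\[ \int_0^\infty f_2(y)\,dy = \lim_{y \to \infty} F_2(y) = \frac23. \]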

Note that for the case that the marginal distributions of a multivariate ex�treme value distribution G�x� are all arbitrary Extreme Value Distributions ofthe form de�ned in ������ we merely need to observe that

G�x� � G�

��� �

� ��� x�

����

and the corresponding form of MGPD can be obtained by this transformation�For example for the generalised symmetric logistic model� ������ and ������ give

�H�x� y� � ���������� �� �a�x�

��

� p

�� � k��� �� �a�x�

��

� p

� ����� �� �b�y�

��

� p

� �� ���� �� �b�y�

��

� p

��

� �p

���� a��

��

� p��

� k��� a ��

��

� p� ����� b ��

��

� p� ��

���� b ��

��

� p��

� �p

We have reparametrised the distribution by choosing a � x� � �x and b �y� � �y� Here a���� � �� b���� � �� x� y � � � ��� �� � � and ��� �� are realparameters�

If marginal distributions are all equal to

\[ \Lambda(x) = \exp\{-e^{-x}\}, \qquad x \in \mathbb{R}, \]

i.e. for $\gamma_1 = \gamma_2 = 0$, we interpret the above distribution as

\[ \bar H(x,y) = \frac{\left[e^{-p(a+x)/\sigma_1} + e^{-p(b+y)/\sigma_2} + k\, e^{-\frac{p}{2}\left(\frac{a+x}{\sigma_1} + \frac{b+y}{\sigma_2}\right)}\right]^{1/p}}{\left[e^{-ap/\sigma_1} + e^{-bp/\sigma_2} + k\, e^{-\frac{p}{2}\left(\frac{a}{\sigma_1} + \frac{b}{\sigma_2}\right)}\right]^{1/p}}. \tag{�.�} \]

See Figure � for an example of (�.�).

�.� Likelihood Inference

One of the main reasons for studying multivariate extreme value distributions and the corresponding Pareto distributions is that it allows us to answer questions about the joint behaviour of dependent variables in the tail. Of particular interest are upper quantile curves and the degree of dependence of variables in the extremes. Estimates of parameters in each model and their covariance matrix can be used to answer these questions. In this section we consider maximum likelihood estimation of parameters in the bivariate generalised Pareto distribution. The form of BPD which we have presented is with the assumption that each margin $X_i$, $i = 1, 2$, satisfies the condition $P(X_i \le x) = \Phi_1(x)$, $x > 0$, $i = 1, 2$, that is, each margin is unit Fréchet. Initially we derive the likelihood function under this assumption.

Note that what we use as observations in BPD's are actually exceedances of original observations over a high threshold in each margin. So if we denote the original observations by $(Z_1, Z_2)$ and the corresponding threshold on each margin by $u_1$ and $u_2$, we have in the BPD $X = Z_1 - u_1$ and $Y = Z_2 - u_2$ (see Figure �). According to the first definition of MGPD, regions B and C contribute with univariate observations which are exceedances of the margins which are above their thresholds (see also Figure � on page ��).

By the results of Section �, if $F \in D(G_*)$, the distribution of exceedances over a region $S = \{(-\infty, u_1] \times (-\infty, u_2]\}$ converges to a bivariate Pareto distribution. Now, as shown in Figure �, we partition $S^c$ into three regions. We use the same notation as in the previous section for the density of the BGPD in different regions and we denote by $\theta$ the parameter vector of the dependence function. The likelihood for $\theta$ is

\[ L_S(\theta; x, y) = \prod_{i=1}^{n_A} h(x_i, y_i) \prod_{j=1}^{n_B} f_1(x_j) \prod_{k=1}^{n_C} f_2(y_k). \tag{�.�} \]

In the above equation, $n_A$, $n_B$ and $n_C$ denote the number of observations in each region of $S^c$ (see Figure �).

Figure �: $h(x,y)$, $H(x,y)$, $F_1(x)$ and $F_2(y)$ for $p = 3$, $k = 1$, $a = -3$, $b = 1$, $\gamma_1 = -0.5$, $\sigma_1 = 3$, $\gamma_2 = -0.3$ and $\sigma_2 = 5$.

Figure �: Illustration of the threshold model for the bivariate Pareto distribution with positive support (axes $Z_1$ and $Z_2$, thresholds $u_1$ and $u_2$, with $X = Z_1 - u_1$ and $Y = Z_2 - u_2$). Each observation in region A contributes with a bivariate vector $(x, y)$ to the likelihood function. Observations in regions B and C have only one component which exceeds the corresponding threshold. The contribution of these observations to the likelihood function is the exceedance of that component over the corresponding threshold.

As mentioned above this likelihood function corresponds to the case where each marginal component has approximately a unit Fréchet distribution. In the general case we assume that the upper tails of each margin have a GPD with unknown parameters. Pickands [��] showed that this is equivalent to the assumption that each margin is in the domain of attraction of an extreme value distribution (this is a special case of Theorem �.� with $d = 1$). Thus for $Z_1 > u_1$ and $X = Z_1 - u_1$

P �X � x� � �� ��� ��x

���

����

where �� � � � �� is any real number and x� � max�x� ���

Incorporating the transformations U � �������X��� � and V � ��

�����Y ����

into the likelihood function ������ gives the likelihood function

L� � ��� ��� ��� ��u�v� � ������


LS� ��� ��u

������� � ��� ��

v

������� � �

nAYi�

������� ��

ui��

���������� ��

vi��

������� �

nBYj�

����� ��

uj��

������� �

nCYk�

����� ��

vk��

��������

An equivalent approach is to use the general form of MGPD for each parametric subfamily. For the generalised symmetric logistic model this is shown in (�.�). The likelihood function in general form with arbitrary marginal distributions can then be obtained from (�.�).
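To make the structure of the likelihood $L_S$ concrete, the following sketch evaluates its negative logarithm for the BPD of the generalised symmetric logistic model in the unit Fréchet setting. It is not the implementation used for the results reported here: the densities $h$, $f_1$ and $f_2$ are approximated by central finite differences of the exponent measure rather than by closed forms, and the function names, the step `eps` and the toy observations are assumptions made for the example.

```python
import math

def mu_gsl(x, y, p, k):
    """Exponent measure of the generalised symmetric logistic model."""
    return (x ** -p + y ** -p + k * (x * y) ** (-p / 2.0)) ** (1.0 / p)

def bpd_densities(p, k, a, b, eps=1e-4):
    """Finite-difference approximations of h, f1, f2 (requires a, b > eps)."""
    norm = mu_gsl(a, b, p, k)

    def h(x, y):                       # h = -d^2 tail / dx dy in the open quadrant
        X, Y = x + a, y + b
        mixed = (mu_gsl(X + eps, Y + eps, p, k) - mu_gsl(X + eps, Y - eps, p, k)
                 - mu_gsl(X - eps, Y + eps, p, k) + mu_gsl(X - eps, Y - eps, p, k))
        return -mixed / (4.0 * eps * eps) / norm

    def f1(x):                         # f1 = -d tail(x, 0) / dx, mass on the X axis
        X = x + a
        return -(mu_gsl(X + eps, b, p, k) - mu_gsl(X - eps, b, p, k)) / (2.0 * eps) / norm

    def f2(y):                         # f2 = -d tail(0, y) / dy, mass on the Y axis
        Y = y + b
        return -(mu_gsl(a, Y + eps, p, k) - mu_gsl(a, Y - eps, p, k)) / (2.0 * eps) / norm

    return h, f1, f2

def neg_log_likelihood(params, obs_a, obs_b, obs_c):
    """-log L_S: obs_a are (x, y) pairs, obs_b are x values, obs_c are y values."""
    p, k, a, b = params
    h, f1, f2 = bpd_densities(p, k, a, b)
    total = 0.0
    for x, y in obs_a:
        total -= math.log(h(x, y))
    for x in obs_b:
        total -= math.log(f1(x))
    for y in obs_c:
        total -= math.log(f2(y))
    return total

# toy observations, purely illustrative
print(neg_log_likelihood((2.0, 1.0, 2.0, 1.0), [(1.0, 1.0), (0.4, 2.0)], [0.5], [0.3]))
```

A function of this shape can be passed to a general-purpose optimiser; in practice closed-form derivatives would be preferable to finite differences for both speed and accuracy.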

Maximum likelihood estimators of parameters in univariate GPD has beenstudied by Smith����� He states that when � � �

� the MLE�s are consistent�asymptotically normal and asymptotically e�cient� For a comparison of di�er�ent methods of parameter estimation in GPD see ���� and �����

In order to obtain the maximum likelihood estimate of the full parameter vector (the dependence parameters together with the marginal GPD parameters of both margins), the logarithm of the likelihood (�.�) can be maximised by using a quasi-Newton routine. Initial values for the parameters in (�.�) can be obtained by first estimating the parameters for each margin and then maximising the likelihood (�.�) with transformed data. The transformations are derived as follows (for details see [��]). As we mentioned above, the likelihood function in (�.�) has been obtained with the assumption that each margin has approximately a unit Fréchet distribution. In order to satisfy this condition, a simple approach is to assume that the upper tails of each margin have a GPD and the remainder of the distribution is arbitrary but known. This means that if we denote the thresholds on each margin by $u_i$, $i = 1, 2$, we have (we show the calculations for one margin)

P �X � xjX � u�� � ��� ��x� u���

����� �

It is natural to estimate the probability of an exceedance of the threshold bythe proportion of points exceeding u�� i�e�

�� � P �X � u�� �nXi�

I�Xi � u��

n�

Thus for x � u�

�G��x� � P �X � x� � ���� � ��x� u���

����� �

For observations below the threshold we apply the probability integral transformto the ranks of observations� Thus we have

G��x� �

�R�xj��n� �� xj � u�

�� ����� ��xj�u���

����� xj � u��

������


where $R(x_i)$ denotes the rank of $x_i$. It is easy to see that applying the transformation $-(\log(G_i(x_i)))^{-1}$ for $i = 1, 2$ will result in an approximately unit Fréchet distribution in each margin. Thus in order to obtain an initial point for (�.�) we first need to perform a GPD analysis in each margin. Then we apply the above transformation to each margin and by maximising (�.�) with the transformed data we obtain an initial point for (�.�). However, our experience from different simulation studies indicates that unfortunately this procedure does not always produce a "good" initial point and sometimes optimising (�.�) without transforming the data gives a better (in terms of convergence of the optimisation routine) initial point.
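The marginal transformation can be sketched as follows. Below the threshold the distribution function is estimated from ranks, above it from a fitted GPD tail scaled by the empirical exceedance proportion, and the result is mapped through $-1/\log G$. The GPD tail is written in the form $(1 + \gamma z/\sigma)^{-1/\gamma}$, which is an assumption about the sign convention; the function name and the toy data are also illustrative, and ties are not treated specially.

```python
import math

def to_unit_frechet(data, u, sigma, gamma):
    """Map one margin to an approximately unit Frechet scale.

    ASSUMES the GPD tail (1 + gamma*z/sigma)^(-1/gamma) above the threshold u
    and that every observation lies inside the support of the fitted GPD.
    """
    n = len(data)
    order = sorted(range(n), key=lambda i: data[i])
    rank = [0] * n
    for r, i in enumerate(order, start=1):
        rank[i] = r
    lam = sum(1 for x in data if x > u) / n       # empirical P(X > u)
    out = []
    for i, x in enumerate(data):
        if x <= u:
            g = rank[i] / (n + 1.0)               # rank-based estimate below u
        else:
            z = (x - u) / sigma
            tail = math.exp(-z) if gamma == 0.0 else (1.0 + gamma * z) ** (-1.0 / gamma)
            g = 1.0 - lam * tail                  # GPD-based estimate above u
        out.append(-1.0 / math.log(g))
    return out

data = [1.0, 2.5, 0.3, 4.0, 3.2, 0.9, 5.5, 2.0]
frechet = to_unit_frechet(data, u=3.0, sigma=1.0, gamma=0.1)
print(frechet)
```

The transformation is increasing, so the ordering of the observations is preserved.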

Simultaneous estimation of parameters with their covariance matrix can be used to make inferences on the dependence of the variables in the extremes and on upper quantile curves and also their standard errors. It is also possible to choose different thresholds for the margins to test the sensitivity of the estimates to the choice of $u_1$ and $u_2$.

� Application to Wind Data

In this section we present an application of our methods to a wind data set. The data comes from a project for modelling of wind storm damages in the south of Sweden and consists of the maximum wind speeds in ��� storms in the period of � ���� ��. Our main aim here is to illustrate how BGPD's can be used for modelling of extreme events. A more detailed study of the relationship between wind speed and losses is presented in another report [��]. The data has been provided by the Swedish Meteorological and Hydrological Institute (SMHI). The SMHI calculates a grid of wind speeds for each storm in Sweden. These calculations are based on pressure measurements in different meteorological stations which cover a specific area. From �� grid points in Skåne we chose two points with the least correlation. Figure � shows the data and a two-dimensional Gaussian kernel estimate of the probability density (see [��] for details and formulas about two-dimensional Gaussian kernels). At the first stage we perform a univariate Pareto analysis on each margin. The main purpose of these calculations is to find an appropriate threshold ($u$) in each margin. It should be mentioned that selecting the threshold is a practical problem in using these methods; however, a common technique is to use the mean residual life plot (see e.g. [��]). It is easy to show that if $X$ has a GPD with distribution function (�.�) then the expected value of exceedances over the threshold $u$ is a linear function of the threshold. Specifically

E�X � ujX � u� �� � �u

� � ��

Therefore, a plot of the mean residual life against $u$ should be approximately a straight line. The common practice is to choose the smallest value of $u$ in the region in which the plot is a straight line. Figure �� shows the mean residual plots for our data. The numbers on the plot show the empirical probability that an observation is less than the threshold ($u$). The irregular behaviour of the plot for each grid point suggests that the marginal distributions could be exponential ($\gamma = 0$). We will see that estimation of parameters in each margin confirms this.

Figure �: Wind speeds (m/s) in grid points gp0707 and gp0903 in Skåne, with contours of the kernel density estimate.

Figure ��: Mean residual plots for the wind speed data (panels gp0707 and gp0903; horizontal axis $u$, vertical axis mean residual).
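The quantities shown in a mean residual life plot are simple to compute. The sketch below returns, for each candidate threshold, the mean excess over the threshold and the empirical probability that an observation is below it; the function name and the toy data are illustrative.

```python
def mean_residual_life(data, thresholds):
    """For each u with at least one exceedance return (u, mean excess, empirical P(X <= u))."""
    result = []
    n = len(data)
    for u in thresholds:
        excesses = [x - u for x in data if x > u]
        if excesses:
            result.append((u, sum(excesses) / len(excesses), 1.0 - len(excesses) / n))
    return result

# toy data, purely illustrative
wind = [1.0, 2.0, 3.0, 4.0, 5.0]
for u, mean_excess, prob_below in mean_residual_life(wind, [2.0, 4.0, 6.0]):
    print(u, mean_excess, prob_below)
```

Plotting the second column against the first and looking for the start of an approximately linear stretch corresponds to the threshold choice described above; a roughly flat stretch is what one would expect for an exponential tail.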

Table � summarises the results of the univariate Pareto analysis in each grid point. Besides maximum likelihood estimates (MLE) we also give estimates of the parameters with the Method of Moments (MoM) and Probability Weighted Moments (PWM) (see [��] and [��] for details). The estimated asymptotic standard errors of the MLE's (denoted by se(�) and se(�)) and the negative value of the logarithm of the likelihood function (nllh) are also presented in this table.

Table �� Results of a univariate Pareto analysis in the two grid points

Grid u MoM PWM MLEpoint � � � � � � se��� se��� nllhgp���� ���� ���� ��� ���� ��� ��� ���� ���� ���� �����gp���� ����� ���� ���� ���� ���� ���� ��� ��� ���� ������gp���� ���� ���� ���� ���� ��� ��� ��� ���� ���� �����gp���� ���� ���� ���� ���� ���� ���� ���� ���� ���� ����gp���� ��� ���� ���� ���� ���� ���� ��� ���� ���� �����gp���� ��� ���� ���� ���� ���� ���� ���� ���� ���� ������gp���� �� ���� ���� ��� ��� ���� ���� ���� ��� �����gp���� ���� ��� ��� ��� ���� ���� ���� ��� ���� �����gp���� ����� ���� ��� ���� ���� ���� ��� ��� ���� ������gp���� ���� ���� ���� ���� ���� ���� ���� ���� ���� �����gp���� ���� ���� �� ���� ���� ��� ��� ���� ���� �����gp���� ��� ���� ���� ���� ��� ��� ����� ���� ���� �����gp���� ��� ���� ���� ���� �� ���� �� ���� ���� �����gp���� �� ���� ��� ���� ���� ��� ���� ���� ���� ���

These estimates can also be used to check the model in the bivariate case. As discussed in Section �, with a polar transformation the exponent measure factorises into a known function of the radial component and a measure $S$ of the angular component (see equation (�.��), page ��). In order to check the independence of $r$ and $w$ different graphical methods have been suggested. For the bivariate case it has been suggested to plot $w$ against $\log(r)$ (see [��] and [��] for some examples of this). For checking whether the sample size is sufficiently large for the asymptotics to be a good approximation, Joe and Smith suggest using a correlation test for all the points above a cut-off point $r_0$. For our data we use equations (�.�) and transform the margins to unit Fréchet marginals. Using Spearman's rank correlation we obtained p-value = �.�����, i.e. there is no evidence against independence in the data. Figure �� shows the transformed data and a plot of $\log(r)$ versus $w$.
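A rank-correlation diagnostic of this kind can be sketched as follows. The pseudo-polar coordinates $r = x + y$ and $w = x/(x+y)$ are an assumption made for the example and may differ from the polar transformation used in the text; the function names and toy values are illustrative, and only the correlation coefficient is computed, not its p-value.

```python
import math

def pseudo_polar(x, y):
    """Radial and angular components, ASSUMING r = x + y and w = x / (x + y)."""
    r = x + y
    return r, x / r

def ranks(values):
    """Ranks starting at 1, with tied values receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for t in range(i, j + 1):
            result[order[t]] = average
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)

# toy unit-Frechet-scale pairs, purely illustrative
pairs = [(0.5, 2.0), (3.0, 1.0), (10.0, 0.7), (1.2, 6.0), (0.9, 0.4)]
polar = [pseudo_polar(x, y) for x, y in pairs]
print(spearman([math.log(r) for r, _ in polar], [w for _, w in polar]))
```

Restricting the pairs to those with $r$ above a cut-off before computing the correlation gives the version of the check applied to the largest observations only.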

For an speci�c dependence function simultaneous estimation of parameterscan be obtained by maximising the likelihood function ������� The generalisedsymmetric logistic model with arbitrary marginal distributions has been givenin ������� Table � summarises estimation of parameters for this model withdi�erent thresholds� The threshold for each margin has been chosen to be equal

��

Page 99: Characterisation and Some Statistical Aspects of Univariate and

� APPLICATION TO WIND DATA

[Figure: left panel, transformed y-values plotted against transformed x-values; right panel, w plotted against log r.]

Figure ��: Transformed wind speeds and diagnostic plot.

to the p-quantile of the data for each margin, and the first column of the table shows the value of p at each run.

Table �: Simultaneous estimates of parameters for the wind speed data

[Table: one row per value of p, with columns p-qu, u�, u�, a, b, p, k, ��, ��, ��, �� and nllh. The numerical entries are not recoverable from this copy.]

Figure �� shows the distribution function of the bivariate generalised Pareto model with the estimated parameters, for the p-quantile equal to ���. The corresponding contour plot is also shown.

As we mentioned earlier, one of the main reasons for fitting a bivariate Pareto distribution to data is to estimate bivariate quantile curves. For our data, Figure �� shows three quantile curves. The calculations are based on the parameter estimates with u� = ��� and u� = ����. For these values of the


[Figure: fitted surface over the axes gp0707 and gp0903, and the corresponding contour plot.]

Figure ��: Distribution function and contour plot of the bivariate generalised Pareto distribution fitted to the wind speed data.


thresholds, the number of points in each region (see Figure �, page ��) is nA = ���, nB = �� and nC = ��.
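Choosing the thresholds as marginal p-quantiles and counting the points in the three regions can be sketched as follows. This is a Python illustration rather than the S-plus code of the study; the interpolation rule for the empirical quantile and the assignment of the labels B and C to "only the first" and "only the second" component exceeding are assumptions.

```python
def empirical_quantile(values, p):
    """Empirical p-quantile by linear interpolation between order statistics."""
    ordered = sorted(values)
    position = p * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


def count_regions(xs, ys, u1, u2):
    """Count observations by which components exceed their thresholds.

    A: both components exceed; B: only the first exceeds (assumed label);
    C: only the second exceeds (assumed label); below: neither exceeds.
    """
    counts = {"A": 0, "B": 0, "C": 0, "below": 0}
    for x, y in zip(xs, ys):
        if x > u1 and y > u2:
            counts["A"] += 1
        elif x > u1:
            counts["B"] += 1
        elif y > u2:
            counts["C"] += 1
        else:
            counts["below"] += 1
    return counts
```

With u1 = empirical_quantile(xs, p) and u2 = empirical_quantile(ys, p), the returned counts play the role of nA, nB and nC above.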

[Figure: scatter plot of gp0903 against gp0707 with three curves labelled 0.03, 0.05 and 0.07.]

Figure ��: Three quantile curves for the wind speed data.

In Section � we saw that for k = 0 this model reduces to the symmetric logistic model. Using a standard likelihood ratio test, we can test between these two models. Table � gives the MLEs for k = 0. As we see, there is no evidence against choosing k = 0.
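The likelihood ratio comparison can be sketched in Python from the two negative log-likelihood (nllh) values. This is a minimal sketch under the usual chi-square approximation with one degree of freedom; because the restricted value of k lies on the boundary of its range, that reference distribution should be read as approximate. The function name is hypothetical.

```python
import math


def likelihood_ratio_test(nllh_restricted, nllh_full):
    """Likelihood ratio test for one restricted parameter.

    Both arguments are minimised negative log-likelihoods; the restricted
    model (k fixed) can never fit better than the full model (k free).
    Returns the test statistic and its chi-square(1) tail probability.
    """
    statistic = max(2.0 * (nllh_restricted - nllh_full), 0.0)
    # For one degree of freedom, P(chi2 > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value
```

A small statistic (large p-value) means that freeing k does not improve the fit appreciably, which is the situation described in the text.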

Table �: Simultaneous estimates of parameters with k = 0

[Table: one row per value of p, with columns p-qu, u�, u�, a, b, p, k, ��, ��, ��, �� and nllh; the k column is 0 throughout. The other numerical entries are not recoverable from this copy.]

It is interesting to see how changes of the thresholds affect the parameter estimates. In a small simulation study we compared the results of four simulations. In each case we simulated ��� observations from the unit Fréchet distribution and found maximum likelihood estimates of the parameters by maximising the likelihood function (���). In the first and second simulations we have independent margins. The difference between them is that in the first simulation the thresholds are the ���-quantile in each margin, but in the second simulation we increased the thresholds to the ���-quantile. In the third and fourth simulations we used correlated data. To generate the correlated data, we first simulated data from a bivariate normal distribution with correlation ��� and then, by using the probability integral transform, generated correlated data with Fréchet marginal distributions. The thresholds in this case are also the ���-quantile and the ���-quantile of each margin. Each simulation has been repeated ��� times. Figure �� summarises the result.
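The construction of correlated data with unit Fréchet margins can be sketched in Python. The sketch assumes the unit Fréchet distribution function F(x) = exp(-1/x); the function names and the arguments n, rho and seed are hypothetical, and the correlation used in the study is not reproduced here.

```python
import math
import random
from statistics import NormalDist


def frechet_quantile(u):
    """Quantile function of the unit Frechet distribution F(x) = exp(-1/x)."""
    return -1.0 / math.log(u)


def correlated_frechet(n, rho, seed=None):
    """Simulate n pairs with unit Frechet margins and normal dependence.

    Step 1: draw a bivariate normal pair with correlation rho.
    Step 2: map each coordinate to (0, 1) with the normal distribution
            function (probability integral transform).
    Step 3: apply the unit Frechet quantile function to each coordinate.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        y1 = z1
        y2 = rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        uniforms = []
        for y in (y1, y2):
            u = std_normal.cdf(y)
            # Guard against 0 or 1 in the far tails before taking logs.
            uniforms.append(min(max(u, 1e-300), 1.0 - 1e-16))
        pairs.append((frechet_quantile(uniforms[0]), frechet_quantile(uniforms[1])))
    return pairs
```

Setting rho = 0 gives independent margins, as in the first two simulations, so the same routine can serve all four settings.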

[Figure: four panels of boxplots, one for each of the parameters a, b, p and k, each comparing simulations 1 to 4.]

Figure ��: Boxplots of the results of four simulations with unit Fréchet marginals.

Comparing the results of simulations 1 and 2 shows that increasing the thresholds changes the estimates of a and b but does not have any major effect on the estimates of k. The estimates of p seem to decrease in the simulations with higher thresholds. A similar pattern can be seen in simulations 3 and 4. Comparing simulations 1 and 3 gives the opposite result. Recall that we have ���-quantiles as the thresholds in both simulations. As we see in Figure ��, the boxplots of the estimates of a and b are very similar, but both p and k are larger in the third simulation. Note that in these simulations the estimates of a and b are statistically equal because the marginal distributions and the thresholds are the same. Again, the same pattern as in simulations 1 and 3 can be seen by comparing simulations 2 and 4. It should be mentioned that, even if these results agree with the intuitive interpretation of the different parameters in the model, more detailed simulations are needed for a proper understanding of these models. An interesting question is whether the same thing happens if we fix the thresholds in each margin. Note that in the simulations we compared relative changes in the parameter estimates due to changes in the thresholds. We considered two cases, with correlated and uncorrelated data. Simulations can also be used to study statistical properties of the MLEs of different parameters in BGPDs. Such studies should be based on simulated observations from different BGPDs with known parameters. There is a need to develop direct methods for generating data from these distributions.

Acknowledgement

I would like to thank my supervisor, Professor Holger Rootzén, for suggesting the problem and for his great support and guidance during the work.

References

[1] Arnold, B.C. (19��) Pareto Distributions. International Co-operative Publishing House.

[2] Balkema, A. and Resnick, S. (19��) Max-infinite divisibility. J. Appl. Probab. ��� �� ��� �

[3] Barnett, V. (19��) The ordering of multivariate data (with discussion). J. R. Statist. Soc. A ��� ��������

[4] Coles, S.G. and Tawn, J.A. (199�) Modelling multivariate extreme events. J. R. Statist. Soc. B ��� ����� ��

[5] Davison, A.C. and Smith, R.L. (199�) Models for exceedances over high thresholds. J. R. Statist. Soc. B ��� � ������

[6] Deheuvels, P. and Tiago de Oliveira, J. (19�9) On the non-parametric estimation of the bivariate extreme value distributions. Statist. Probab. Lett. �� ��������

[7] Galambos, J. (19��) The Asymptotic Theory of Extreme Order Statistics, 2nd ed. Melbourne: Krieger.

[8] Galambos, J. and Kotz, S. (19��) Characterizations of Probability Distributions. Lecture Notes in Mathematics. New York: Springer-Verlag.


[9] Gumbel, E.J. (19��) Statistics of Extremes. New York: Columbia University Press.

[10] Gumbel, E.J. (19��) Bivariate exponential distributions. J. Amer. Statist. Assoc. ��� � ������

[11] Haan, L. de and Resnick, S.I. (19��) Limit theory for multivariate sample extremes. Z. Wahrscheinlichkeitstheorie v. Geb. ��� ��������

[12] Hosking, J.R.M., Wallis, J.R. and Wood, E.F. (19��) Estimation of the Generalised Extreme Value distributions by the method of probability-weighted moments. Technometrics ��� ��������

[13] Hosking, J.R.M. and Wallis, J.R. (19��) Parameter and quantile estimation for the generalised Pareto distribution. Technometrics �� �� ��� �

[14] Joe, H., Smith, R.L. and Weissman, I. (199�) Bivariate threshold methods for extremes. J. R. Statist. Soc. B ��� ��������

[15] Leadbetter, M.R., Lindgren, G. and Rootzén, H. (19��) Extremes and Related Properties of Random Sequences and Processes. Berlin: Springer-Verlag.

[16] Marshall, A.W. and Olkin, I. (19��) A generalised bivariate exponential distribution. J. Appl. Probab. �� � ������

[17] Pickands, J. III (19��) Statistical inference using extreme order statistics. Ann. Statist. �� �� �����

[18] Pickands, J. (19��) Multivariate extreme value distributions. Proc. �rd Session I.S.I., �� �����

[19] Resnick, S.I. (19��) Extreme Values, Regular Variation and Point Processes. Berlin: Springer-Verlag.

[20] Rootzén, H. and Tajvidi, N. (199�) Extreme value statistics and wind storm losses: a case study. To appear in Scandinavian Actuarial Journal.

[21] Rosbjerg, D., Madsen, H. and Rasmussen, P.F. (199�) Prediction in partial duration series with generalised Pareto-distributed exceedances. Water Resources Research ��� ����������

[22] Sibuya, M. (19��) Bivariate extremal statistics. Ann. Inst. Statist. Math. XI, � ������

[23] Silverman, B.W. (19��) Density Estimation for Statistics and Data Analysis. London: Chapman and Hall.

[24] Smith, R.L. (19��) Statistics of extreme values. Proc. �th Session I.S.I., Paper ����, Amsterdam.


[25] Smith, R.L. (19��) Maximum likelihood estimation in a class of nonregular cases. Biometrika ��� ��� ��

[26] Smith, R.L. (19��) Estimating tails of probability distributions. Ann. Statist. ��� ����������

[27] Smith, R.L., Tawn, J.A. and Yuen, H.K. (199�) Statistics of multivariate extremes. Int. Statist. Inst. Rev. ��� ������

[28] Tawn, J.A. (19��) Bivariate extreme value theory: models and estimation. Biometrika ��� � ������

[29] Tiago de Oliveira, J. (19��) Regression in the non-differentiable bivariate extreme models. J. Amer. Statist. Assoc. � ��������

[30] Tiago de Oliveira, J. (19��) Bivariate and multivariate extremes distribution. Statistical Distributions in Scientific Work �� ��������

[31] Tiago de Oliveira, J. (19��) Bivariate extremes: foundations and statistics. Proc. �th Int. Symp. Mult. Anal., North Holland, New York.

[32] Tiago de Oliveira, J. (19��) Bivariate models for extremes; statistical decision. Statistical Extremes and Applications, Reidel, Dordrecht, ��������

[33] Tiago de Oliveira, J. (19�9) Intrinsic estimation of the dependence structure for bivariate extremes. Statist. Probab. Lett. �� ��������

[34] Weissman, I. (19��) Estimation of parameters and large quantiles based on the k largest observations. J. Amer. Statist. Assoc. ��� ��������


Design and Implementation of Statistical Computations for Generalised Pareto Distributions

Nader Tajvidi

April ����

Abstract

In this paper we discuss a general approach to the design and implementation of statistical computations. We use three previous articles as examples to illustrate difficulties which arise in this kind of application and methods which may be used to solve them. A common theme in these articles is univariate and multivariate generalised Pareto distributions. However, the problems discussed are of a rather general nature and demonstrate some typical tasks in applied statistical research. The first one concerns maximum likelihood estimation of parameters in a rather complicated bivariate generalised Pareto distribution. In this application, the main emphasis is on decomposing the optimisation problem into a few stages and using different tools at each stage of the problem. Object-oriented programming is a rather recent approach to program development, which we found quite useful. We next describe and comment on using S-plus as the main environment for a simulation study concerning confidence intervals and accuracy estimation for the generalised Pareto distribution. We design a number of objects and explain their responsibilities in the simulation. The implementation of the proposed design in S-plus is also presented. The last part of the paper is devoted to a general discussion of the different stages in data analysis. We suggest some tools which can be used at each stage of the analysis.

Keywords: Statistical computations, Generalised Pareto distributions, Statistical software, Maximum likelihood, Simulation.

AMS ���� subject classification: Primary ��U��; Secondary ��N��, ��U��.

Research supported by stiftelsen Länsförsäkringsbolagens forskningsfond.


1 Introduction

The availability of advanced computers and recent progress in computing technology have opened new opportunities for applying statistics and probability theory to important technological problems such as structural reliability, wind modelling and insurance; see e.g. [��]. Many statistical and data manipulation tools are available on different platforms. Some of these tools are very specialised and are designed with a specific application in mind. Others are intended to deal with rather general problems.

Although the diversity of available tools is fortunate for researchers, it also makes it difficult to decide which particular tool should be used for a specific problem. Many factors should be considered in this decision. Even if we restrict ourselves to the tools already available in the system, and thereby exclude the economic aspects of the decision, there are often still many options. It is obviously impossible to become familiar with all of the different programs available in each system. As a result, what usually happens in practice is that one ends up learning just one or two languages/packages and tries to solve every single problem with those tools. Obviously this is not necessarily a good decision, but there are many reasons for it. The main one is perhaps the large amount of time which is usually needed to learn a new language/package. Another reason seems to be the lack of information about the available tools and their areas of application. Even if a researcher is willing to invest time in learning a new language/package, he or she might not know about the capabilities, or even the existence, of appropriate tools available in the system. This is a general problem in information systems today. Technical progress has made it possible to store a huge amount of information very effectively, but little has been done to develop tools for helping people to find a specific piece of information.

However, trying to solve all problems by means of just one or two packages may sometimes be compared to using a hammer to drive a screw. So, what should one do? On the one hand it is obviously impossible to learn about all available tools, and on the other hand there does not exist one "super-package" which can be used for every single problem.

In this paper we describe how different problems in three previous articles ([��], [�], [��]) have been solved with the aid of computers. A common theme in these articles is the application of univariate and multivariate generalised Pareto distributions. However, we believe that many of the problems discussed are of a rather general nature and can be of interest to researchers in applied statistics.

In Section 2 we present a general framework for the design and implementation of statistical computations. We argue that this approach can create the flexibility which is needed in this type of application.

In each of Sections 3 and 4 we first briefly describe a problem and then discuss the design and implementation of the solution in our computer system. At the end of each section we discuss advantages/disadvantages of the proposed design and tools. We hope that this can give ideas about how one can approach and solve similar problems.

In Section 3 we present a procedure for maximising a rather complicated likelihood function with � parameters ([��]). We discuss how the problem has been decomposed and which tools have been used at each stage.

Section 4 is devoted to the computations involved in the simulations reported in [��]. The main idea in this section is to design different objects which carry out the simulations and also to determine the information flow between them. This approach is inspired by the principles of object-oriented programming. We do not discuss these principles here, but there are many books available on the subject; see e.g. [��], [�, Chapter �] and [�, Appendix A]. In the last part of the section we give some details of the programs used to calculate correction factors for improving likelihood-based confidence intervals for parameters and quantiles of the generalised Pareto distribution.

Readers who are not interested in the details of the calculations might want to read only the "Discussion" sections (��� on page ��� and ��� on page ���) and then continue to Section �, which is devoted to a general discussion of the different stages in data analysis. In that section we discuss some tools which can be used at each stage of data analysis. In Section � we briefly discuss a few additional tools which can be used in some stages of almost all applications. We have postponed the discussion to this section because we believe that the proposed designs in the previous sections can be implemented in a computing environment without necessarily using these tools. Section � summarises the proposed tools and outlines the conclusions.

2 Design and Implementation: General Ideas

The following circumstances seem to be quite common. On the one hand, there are several packages available in the system which can handle "standard" problems (for example, estimating the parameters in a generalised linear model). On the other hand, it lies in the nature of research that it concerns problems which are not "standard" in some sense. Thus it is not likely that a researcher can find a package which solves his/her problem by just clicking on some options of a menu, and there is a need for more flexible environments. A key thesis of this paper is that it might be possible to achieve the desired flexibility by using the following approach.

• Design:

  - decompose the problem into several stages;

  - determine the information flow between the stages.

• Implementation:

  - consider the task and information flow at each stage and select an "appropriate" tool for the calculations;

  - perform the calculations at each stage in the proper order.

It should be mentioned that there is an interaction between the above steps, and the whole process of design and implementation does not go in one direction.
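The idea of stages linked by an explicit information flow can be illustrated with a small Python sketch. It is not taken from the paper: the stage names, the toy exponential likelihood and the grid search that stands in for a numerical optimiser are all hypothetical. Each stage declares which pieces of information it needs and returns the new pieces it produces, so a stage can be replaced by another tool without touching the rest.

```python
import math


def run_stages(stages, inputs):
    """Run the stages in order, passing named results from one to the next."""
    state = dict(inputs)
    for name, needs, func in stages:
        missing = [key for key in needs if key not in state]
        if missing:
            raise KeyError(f"stage {name!r} is missing inputs: {missing}")
        state.update(func(**{key: state[key] for key in needs}))
    return state


def build_nllh(data):
    """Stage 1: negative log-likelihood of a toy exponential model."""
    def nllh(rate):
        return -(len(data) * math.log(rate) - rate * sum(data))
    return {"nllh": nllh}


def optimise(nllh):
    """Stage 2: crude grid search standing in for a numerical optimiser."""
    grid = [0.01 * i for i in range(1, 1001)]
    return {"estimate": min(grid, key=nllh)}


def summarise(estimate):
    """Stage 3: present the result."""
    return {"summary": f"rate = {estimate:.2f}"}


STAGES = [
    ("likelihood", ["data"], build_nllh),
    ("optimise", ["nllh"], optimise),
    ("summary", ["estimate"], summarise),
]
```

Calling run_stages(STAGES, {"data": observations}) performs the calculations in the proper order, and a missing input is reported before the stage that needs it is run.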

In the following sections we show how the approach above can be used in research and data analysis. In each section we explain the design of the computations and present the tools which, in our opinion, are "appropriate" to use at each particular stage.

It should be emphasised that we are not trying to discuss the merits/demerits of various packages/languages which are principally designed for similar tasks. We have used some of the packages which have been available in our computer system. The reason for mentioning these tools is simply that we have found them fairly effective and straightforward to use. As mentioned above, there is an interaction between the design and implementation of computations, and both steps are strongly affected by the background and experience of the researchers. It should be emphasised that the learning curves for some of the programs/packages we mention are rather flat and that it may take a considerable amount of time before one can take full advantage of them. But we still recommend them because we believe that they provide a consistent, flexible and productive environment which can be used at some stage of almost every problem. Learning these tools may be a big investment in time and effort, but the payoff will also be large. Successful use of computers in applied work requires, to some extent, skill in the art of selecting and combining proper tools.

3 Optimising a Likelihood Function

In this section we discuss the calculations which were used to optimise two likelihood functions, and also to perform some simulations, presented in a paper ([��]) on multivariate generalised Pareto distributions. We do not discuss the theoretical background here but refer the interested reader to the article. However, for reference purposes, we reproduce the distribution functions. They are so-called bivariate generalised Pareto distributions and are given in a standardised form by

$$\bar{H}(x, y) = \frac{\left[(x+a)^{-p} + (y+b)^{-p} + k\,\{(x+a)(y+b)\}^{-p/2}\right]^{1/p}}{\left[a^{-p} + b^{-p} + k\,(ab)^{-p/2}\right]^{1/p}} \qquad (�.�)$$


for x > 0 and y > 0, and generally by

�H�x� y� � ��������� �� �a�x�

��

� p��

� k��� �� �a�x�

��

� p� ����� �� �b�y�

��

� p� ��

���� �� �b�y�

��

� p��

� �p

���� a ��

��

� p

�� � k��� a��

��

� p

� ����� b ��

��

� p

� �� ���� b ��

��

� p

��

� �p

for a���� � �� b���� � �� x� y � � � ��� �� � ��

In the following we call these distributions the bivariate Pareto distribution (BPD) and the bivariate generalised Pareto distribution (BGPD), respectively. For both distributions the main steps in the computations are

(i) to calculate the density function,

(ii) to calculate the likelihood function,

(iii) to optimise the likelihood function and to find the maximum likelihood (ML) estimates of the parameters, and

(iv) to generate ��� random samples from the distributions, and to calculate and summarise the estimates of the parameters.

3.1 Design and Implementation

As we see below, the first two steps involve some symbolic mathematical calculations. The third step is a purely numerical constrained optimisation problem. The last step involves repeated use of the optimisation procedure to obtain the parameter estimates for each simulated sample, and summarising them in an appropriate plot. The question is how these steps should be carried out. For the first step, it is easy to calculate symbolic derivatives of functions in Mathematica. In step 3 we need a numerical optimisation routine. As we do not calculate the first or second derivatives of the likelihood function symbolically, they must be calculated numerically. We found the subroutine E04JAF from NAG's Fortran Library suitable for our needs. To use this subroutine we needed to supply it with the likelihood function as a Fortran subroutine. Fortunately, we could use Mathematica in this step as well. In the last stage we use the optimisation algorithm repeatedly and present the results of the simulations in a number of plots. Our choice for this step was S-plus; see [�]. It is possible to call Fortran routines from S-plus. This allows us to write a wrapper in S-plus which calls the optimisation routine and saves the results in an appropriate form for further analysis. Figure � shows the design and implementation for solving the problem.


[Figure: block diagram with four boxes: "Calculate the density functions and Fortran subroutines in Mathematica"; "Write a program in Fortran for maximising the likelihood function calling NAG routine"; "NAG routine E04JAF"; "Write a wrapper in S-plus which calls the Fortran program".]

Figure �: Block diagram of the design and its implementation for optimising the likelihood function.

In the following we give some details of the calculations. The complete source of the programs can be obtained from http://www.math.chalmers.se/~nader/software.html and we refer the interested reader to this address for further details.

The following commands in Mathematica show how one can produce the desired results of the first two steps for the BPD. In the first command we define the exponent measure

$$\mu_*\big([0,(x,y)]^c\big) = \left(\frac{1}{x^p} + \frac{1}{y^p} + \frac{k}{(xy)^{p/2}}\right)^{1/p}, \qquad 0 \le k \le 2(p-1),\ p \ge 2,\ x, y > 0.$$

    expmeas[x_, y_, p_, k_] := (x^(-p) + y^(-p) + k (x y)^(-p/2))^(1/p)


The BPD given in (�.�) is defined as follows:

$$\bar{H}(x, y) = \frac{\mu_*\big([0,(x + x_0,\, y + y_0)]^c\big)}{\mu_*\big([0,(x_0, y_0)]^c\big)}, \qquad x, y > 0,\ x_0, y_0 > 0,$$

and this is how one can define it in Mathematica:

    bivpar[x_, a_, y_, b_, p_, k_] := expmeas[x + a, y + b, p, k]/expmeas[a, b, p, k]

[Figure: the plane divided by the two thresholds into region A, where both components exceed their thresholds, and regions B and C, where only one component does.]

Figure �: Illustration of the threshold model for the bivariate Pareto distribution with positive support. Each observation in region A contributes a bivariate vector (x, y) to the likelihood function. Observations in regions B and C have only one component which exceeds the corresponding threshold. The contribution of these observations to the likelihood function is the exceedance of that component over the corresponding threshold.

The next step is to calculate the density functions, i.e. the derivatives in the three regions shown in Figure �. For example, the function in region A of the figure can be calculated by the command -D[bivpar[x, a, y, b, p, k], x, y]. It is interesting to note that the command InputForm generates the definition of the function in Mathematica's own format. In the same way, FortranForm gives the definition of the density function in Fortran, so that it can be used directly in a Fortran subroutine. Another nice by-product is that by using the TeXForm command in Mathematica, one can directly produce the function in TeX format. We do not reproduce the result here as it occupies more than one page.
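The definitions of expmeas and bivpar can also be checked numerically. The following Python sketch is an illustration and not the Mathematica code of the paper: it mirrors the two functions, read here as the ratio of exponent measures, and approximates the region A density by a central finite difference instead of a symbolic derivative. The helper name density_region_a and the step size h are assumptions.

```python
def expmeas(x, y, p, k):
    """Exponent measure of the generalised symmetric logistic model."""
    return (x ** (-p) + y ** (-p) + k * (x * y) ** (-p / 2.0)) ** (1.0 / p)


def bivpar(x, a, y, b, p, k):
    """Ratio of exponent measures at (x + a, y + b) and (a, b)."""
    return expmeas(x + a, y + b, p, k) / expmeas(a, b, p, k)


def density_region_a(x, a, y, b, p, k, h=1e-4):
    """Minus the mixed partial derivative of bivpar in x and y.

    A central finite difference replaces the symbolic derivative, so the
    result is an approximation with error of order h squared.
    """
    def f(s, t):
        return bivpar(s, a, t, b, p, k)

    mixed = (f(x + h, y + h) - f(x + h, y - h)
             - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h * h)
    return -mixed
```

For k = 0 and p = 2 the mixed derivative has a simple closed form, which gives a quick check of the numerical value before the function is used inside a likelihood.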

So far we have calculated the density function for the BPD. As discussed in [��], the BGPD can be obtained from the BPD by a simple transformation. It is quite straightforward to incorporate the desired transformation into the above commands and obtain the density function of the BGPD in the different regions. Having calculated the density functions in all regions, the only thing left is to write a subroutine in Fortran which calculates the likelihood function as given in equations (���) and (���) of [��]. This subroutine, which calculates the value of the likelihood function at each iteration, must be supplied to the optimisation subroutine, E04JAF. The last step is to write a wrapper in S-plus which calls the optimisation routine in Fortran and obtains the estimates of the parameters. The optimisation routine runs rather fast. For the simulation study we wrote another program in S-plus which generates the samples, calls the optimisation routine for each sample and plots the results.

The preceding examples are straightforward but, as we will see in Section ���, Mathematica can also perform much more complicated calculations almost as readily as in the simple examples discussed in this section.

3.2 Discussion

Estimation of parameters by the maximum likelihood method is quite common in applied statistics. The design and implementation of the calculations in Section 3.1 can be used directly in this type of application.

We used Mathematica to produce complex algebraic expressions in different formats. This shows the importance of choosing appropriate tools at each stage. Mathematica is a powerful symbolic mathematics package which can be used for calculations of great complexity, e.g. manipulating expressions, taking derivatives and integrals, and solving equations; see Section ��� below for an example.

The computations in this section also involved numerical optimisation. There are many factors which should be considered in selecting an optimisation routine. As in any other numerical problem, the precision, speed and robustness of the optimisation algorithms are the most important factors. Fortran is a very powerful language for numerical problems. Numerical and optimisation routines for various problems are available in the NAG Fortran library. In order to use these optimisation routines, it is usually necessary for the user to supply a subroutine in Fortran which calculates the value of the function at each point.


Hence some knowledge of the language is necessary for using these libraries. A similar library is also available in C. As we will discuss in Section ���, to increase the speed of programs in high-level languages like S, it may be necessary to write part of the programs in compiled languages like Fortran and C, and familiarity with such languages is hence important.

For simulations and presentation of results, we used S-plus. In our opinion, S-plus is one of the most powerful statistical systems available, and it provides a very flexible environment for statistical calculations. It contains more than ���� built-in functions for statistics and data analysis, and it is quite easy for the user to write new functions in S. S-plus also provides powerful, high-resolution graphical facilities with possibilities for interactive work. It contains interfaces to Fortran, C and the UNIX shell, and hence the abilities of these systems can easily be incorporated into S-plus. In Section ��� we briefly discuss object-oriented features of the language.

4 Bootstrap Simulations

The aim of [��] was to compare different methods for constructing confidence intervals for the parameters and the ��-quantile of the generalised Pareto distribution (GPD). The main emphasis was on small to moderate sample sizes. The GPD has two parameters, γ and σ, where γ is a shape parameter and σ is a scale parameter. We chose � values for γ: ����������, ..., ���. As σ is a scale parameter it was set to � in all simulations. The sample sizes were n = ��� ��� ��� ���������� �������. This gives �� different combinations of n and γ. The main interest was in comparing confidence intervals for γ, σ and the ��-quantile of the GPD constructed by the following methods:

(i) percentile bootstrap confidence intervals,

(ii) bias-corrected and accelerated bootstrap confidence intervals,

(iii) likelihood-based confidence intervals, and

(iv) corrected likelihood-based confidence intervals.

We were also interested in comparing accuracy measures for the ML estimates of the parameters, such as bias and standard error.
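Random samples and quantiles of the GPD can be obtained by inverting the distribution function. The Python sketch below is an illustration only and rests on an assumed parameterisation, F(x) = 1 - (1 + gamma x / sigma)^(-1/gamma) with the exponential distribution as the limit gamma = 0; the sign convention of the original study may differ, and the function names are hypothetical.

```python
import math
import random


def gpd_cdf(x, gamma, sigma=1.0):
    """Assumed GPD distribution function for x >= 0."""
    if gamma == 0.0:
        return 1.0 - math.exp(-x / sigma)
    base = max(1.0 + gamma * x / sigma, 0.0)  # zero beyond the upper end point
    if base == 0.0:
        return 1.0
    return 1.0 - base ** (-1.0 / gamma)


def gpd_quantile(q, gamma, sigma=1.0):
    """Inverse of gpd_cdf for 0 <= q < 1."""
    if gamma == 0.0:
        return -sigma * math.log(1.0 - q)
    return sigma / gamma * ((1.0 - q) ** (-gamma) - 1.0)


def gpd_sample(n, gamma, sigma=1.0, seed=None):
    """Draw n observations by applying the quantile function to uniforms."""
    rng = random.Random(seed)
    return [gpd_quantile(rng.random(), gamma, sigma) for _ in range(n)]
```

Because the scale only multiplies the observations, fixing sigma in a simulation study loses no generality, which matches the choice made above.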

4.1 Design and Implementation

As mentioned above, there are �� different combinations of n and γ in our simulation study. For each combination we generate ��� random samples from the GPD. For each random sample, ���� bootstrap samples are generated. Maximum likelihood estimates of the parameters and the ��-quantile are calculated for all samples. Furthermore, we calculate several accuracy measures for the ML estimators. Confidence intervals for the parameters and the ��-quantile of the GPD are also constructed according to the different methods. The main interest is to compare the empirical coverage and length of the confidence intervals. Obviously this involves a substantial amount of calculation.
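The inner loop of such a study, resampling one data set and summarising the bootstrap replicates, can be sketched in Python. This is a simplified illustration and not the S-plus implementation: any estimator function can be plugged in (the ML estimator of the GPD is not implemented here), only the percentile interval is shown, and the function names are hypothetical.

```python
import random
import statistics


def bootstrap_replicates(data, estimator, n_boot, rng):
    """Apply the estimator to n_boot resamples drawn with replacement."""
    n = len(data)
    return [
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    ]


def percentile_interval(replicates, level):
    """Percentile bootstrap interval with nominal coverage `level`."""
    ordered = sorted(replicates)
    alpha = (1.0 - level) / 2.0
    last = len(ordered) - 1
    return ordered[round(alpha * last)], ordered[round((1.0 - alpha) * last)]


def bias_and_se(replicates, estimate):
    """Bootstrap estimates of the bias and standard error of an estimator."""
    bias = statistics.fmean(replicates) - estimate
    return bias, statistics.stdev(replicates)
```

Repeating these three calls for every simulated sample, and recording whether each interval covers the true value, gives the empirical coverage and length that the study compares.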

While the main emphasis in Section 3 was on combining different tools, here we concentrate on designing different objects which carry out the whole simulation study. Some of the objects will be responsible for generating a specific kind of data and others will just contain the data in a pre-specified structure. By decomposing the whole simulation task and assigning each task to a specific object, it is possible to obtain more control over writing the scripts and debugging the programs. The crucial point is the communication between the objects and the delegation of responsibilities between them. The whole design was inspired by object-oriented programming.

The following is the name of objects� their responsibilities and the name ofobjects they create�

boot.bcanon This function generates m samples from a GPD with specific values of γ and n. For each sample, ���� bootstrap samples are generated. For each bootstrap sample, ML estimates of the parameters are first calculated. Then, for each random sample, bootstrap and jackknife estimates of the bias and standard error of the ML estimators are calculated. Using two bootstrap methods, viz. the percentile bootstrap and the bias corrected and accelerated bootstrap, confidence intervals for the parameters are constructed (for different nominal confidence levels). The results are saved in an object, e.g. with the name simres.gamma.2.n20 for γ = ���� and n = ��. More details about the different methods used in these functions are given in ����.

boot.quantile This function calculates accuracy measures and confidence intervals for the ���quantile of the GPD using different methods. It generates objects which have the same structure as those generated by boot.bcanon. E.g., for γ = ���� and n = ��, the results are saved in an object with the name simres.gamma.2.n20.quantile.

boot.summary This function generates a summary for each combination of n and γ. E.g., for γ = ���� and n = ��, the results are saved in an object with the name simres.gamma.2.n20.summary.

boot.quan.russ.doll This function uses the so-called "Russian doll principle"������ and does the same calculations as boot.quantile for the ���quantile of the GPD. E.g., for γ = ���� and n = ��, the results are saved in an object with the name simres.gamma.2.n20.quan.russ.doll.

boot.estpar.profile This function calculates the data necessary for plotting the profile likelihood function for γ and σ. The results are saved in an object, e.g. with the name simres.gamma.2.n20.profile for γ = ���� and n = ��. Other versions of this function incorporate the correction factors into the profile likelihood confidence intervals.

boot.profile.conf.int.all This function generates confidence intervals for γ and σ using the objects created by boot.estpar.profile.

boot.profile.conf.int.quan���bonferoni This function generates a confidence interval for the ���quantile of the GPD using Bonferroni's method.
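To make the role of the profile likelihood objects concrete, the following Python sketch computes a profile log-likelihood for the shape parameter on a grid and reads off a likelihood-based interval. It is only an illustration under stated assumptions: the GPD density is taken as (1/σ)(1 + γx/σ)^(-1/γ-1), the maximisation over σ is a crude grid search rather than the optimisation routines used in the thesis, the cutoff 3.841 is the 0.95 quantile of the chi-square distribution with one degree of freedom, and all function names are hypothetical.

```python
import math


def gpd_loglik(gamma, sigma, xs):
    """GPD log-likelihood; returns -inf outside the parameter space."""
    if sigma <= 0.0:
        return -math.inf
    total = 0.0
    for x in xs:
        z = gamma * x / sigma
        if z <= -1.0:
            return -math.inf
        if gamma == 0.0:
            # Exponential limit of the GPD.
            total += -math.log(sigma) - x / sigma
        else:
            total += -math.log(sigma) - (1.0 / gamma + 1.0) * math.log1p(z)
    return total


def profile_gamma(gamma, xs, sigma_grid):
    """Profile log-likelihood of gamma: maximise over sigma on a grid."""
    return max(gpd_loglik(gamma, s, xs) for s in sigma_grid)


def likelihood_interval(xs, gamma_grid, sigma_grid, cutoff=3.841):
    """Grid points whose likelihood ratio statistic is below the cutoff.

    Returns the smallest and largest such point, which assumes that the
    accepted region is an interval.
    """
    prof = [(g, profile_gamma(g, xs, sigma_grid)) for g in gamma_grid]
    lmax = max(value for _, value in prof)
    inside = [g for g, value in prof if 2.0 * (lmax - value) <= cutoff]
    return min(inside), max(inside)
```

If an end point of the returned interval coincides with the edge of the grid, the interval is not determined on that side and the grid has to be widened.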

A block diagram of the relationships between these functions is presented in Figure �. As discussed in the previous section, S-plus contains many built-in functions for simulations. For reasons discussed in Section ��� below, we chose S-plus as the main environment for the calculations.

[Figure �: Block diagram of the design for bootstrap simulation objects. The legend distinguishes data objects, auxiliary functions and functions. Nodes: boot.bcanon, boot.quantile, boot.quan.russ.doll, boot.estpar.profile, profile.pareto, boot.summary, boot.summary.all, boot.profile.conf.int, plot.profile, boot.create.table, boot.create.latex.table, simres.gamma.2.n20, simres.gamma.2.n20.quantile, simres.gamma.2.n20.quan.russ.doll, simres.gamma.2.n20.summary, simres.gamma.2.n20.profile, table.of.bias.mean, table.of.bias.mean.tex. The label "Similar objects for creating tables" appears twice.]

To construct corrected likelihood-based confidence intervals we need to calculate a correction factor for the likelihood ratio criterion of the GPD. This generates a corrected statistic LR�, which is distributed as ��� when terms of order O(n��) and smaller are neglected. For this purpose we had to calculate the expected values of the first four derivatives of the log-likelihood function of the GPD. We also needed the inverse of the expected value of the Hessian of the log-likelihood function. The following is part of the Mathematica program which performs these calculations.

G[F_, vars__][args__] :=
    G[F, vars][Sequence @@ Sort[{args}]] = D[F[vars], args];

dmatrix[f_Symbol, vars__Symbol, maxord_Integer] :=
    Outer[G[f, vars], Sequence @@ Table[{vars}, {maxord}]];

EG[F_, vars__][args__] :=
    EG[F, vars][Sequence @@ Sort[{args}]] =
        G[F, vars][Sequence @@ Sort[{args}]] /. rule2 /. rule1;

Edmatrix[f_Symbol, vars__Symbol, maxord_Integer] :=
    Outer[EG[f, vars], Sequence @@ Table[{vars}, {maxord}]];

The main idea in this calculation was to compute all the terms in equation ��� of ��� as a matrix (or array) with proper dimensions. The entries in the matrix are expected values of derivatives of different orders of the log-likelihood function. Given the log-likelihood function of the GPD, l (the logarithm of the likelihood function L), with arguments gamma and sigma, and maxord, the maximum order of derivative needed in the matrix, the function dmatrix creates a matrix (or array) of derivatives of the proper order. The function G calculates derivatives of a function with an arbitrary number of variables. Note that G stores its values for unique argument lists: the arguments are sorted before a derivative is computed, so mixed partial derivatives which differ only in the order of differentiation are computed once. For example, for a function of two variables an array of fourth derivatives has 16 components. However, in the present application, G will not have to calculate more than the 5 distinct functions needed. Especially in higher dimensions, this saves a considerable amount of memory. By defining some appropriate transformation rules (rule1 and rule2 above), the expected values of the derivatives of the log-likelihood function (EG) are obtained and the matrix of expected values of dmatrix, Edmatrix, is calculated. We do not present further details of the calculations here, but the source of these programs is available at http://www.math.chalmers.se/~nader/software/boot/epsilonk.
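The caching idea behind G can be sketched in Python. This is not a translation of the Mathematica program: polynomials in two variables, stored as dictionaries mapping exponent pairs to coefficients, stand in for the log-likelihood so that differentiation stays elementary, and the class and function names are hypothetical. The point illustrated is that sorting the argument list before the cache lookup means that each distinct mixed partial derivative is computed once.

```python
from itertools import product


def diff_poly(poly, var):
    """Differentiate a polynomial {(i, j): coeff} with respect to variable var."""
    out = {}
    for powers, coeff in poly.items():
        k = powers[var]
        if k > 0:
            new = list(powers)
            new[var] -= 1
            key = tuple(new)
            out[key] = out.get(key, 0) + coeff * k
    return out


class Derivatives:
    def __init__(self, poly):
        self.poly = poly
        self.cache = {}

    def G(self, *args):
        """Partial derivative with respect to the variables listed in args."""
        key = tuple(sorted(args))  # the order of differentiation is irrelevant
        if key not in self.cache:
            result = self.poly
            for var in key:
                result = diff_poly(result, var)
            self.cache[key] = result
        return self.cache[key]

    def dmatrix(self, nvars, order):
        """All derivatives of the given order, indexed by variable tuples."""
        return {idx: self.G(*idx) for idx in product(range(nvars), repeat=order)}
```

For two variables and order four, `dmatrix` fills 16 entries while the cache holds only 5 derivatives, one for each way of splitting four differentiations between the two variables.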

Considering the generality and the amount of calculation involved, the above programs look remarkably compact. This is because most of the work is done by built-in Mathematica functions like Outer.


As there are many different objects in the design, naming becomes important. By choosing homogeneous names for the different objects we could write functions which operate on all �� objects at the same time. For example, we have used the pattern below repeatedly in our simulations. This is a function written in S-plus which allows the user to choose any subset of the �� different data objects by passing appropriate values of γ and n to the function. The function also provides a simple mechanism for generating the names of the data objects. The aim of this kind of function is two-fold:

• to eliminate the time which is necessary to type the (usually long) names of the data objects, and

• to make it possible to handle similar objects simultaneously.

function(n = c� �� ��� �� ��� ��� ���� ���� ���� which.gammas =
    c� �� �� �� ���
{
    simres <- paste("simres.gamma.", rep(which.gammas, rep(
        length(n), length(which.gammas))), ".n", n, sep =
        "")
    for(i in simres) {
        # process the object and save the results in an
        # appropriate form
    }
    NULL
}

Note that the default values generate all of the �� data objects�
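The naming mechanism can be sketched in Python as follows. The default values used here are placeholders chosen for illustration, not the defaults of the S-plus function, and the name `simres_names` and the `suffix` argument are hypothetical. As in the call to paste, the γ index varies slowest and n varies fastest.

```python
def simres_names(n=(20, 50), which_gammas=(1, 2), suffix=""):
    """Names of the data objects for the chosen sample sizes and gamma indices."""
    return [f"simres.gamma.{g}.n{k}{suffix}"
            for g in which_gammas
            for k in n]


def process_all(objects, n=(20, 50), which_gammas=(1, 2)):
    """Apply the same processing step to every selected data object.

    `objects` is a dictionary from object name to data; taking the length
    stands in for whatever summary is actually needed.
    """
    return {name: len(objects[name])
            for name in simres_names(n, which_gammas)
            if name in objects}
```

A call such as `simres_names(n=(20,), which_gammas=(2,), suffix=".quantile")` yields the single name `simres.gamma.2.n20.quantile`, so a subset of the objects can be selected without typing any name by hand.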

As seen in Figure �, in different stages of the calculations we use similar functions for creating data frames ����� of different simulation results and for converting them to LaTeX tables (represented by dashed boxes in the figure). We used the print.display software (available from statlib), developed by Richard M. Heiberger and Frank E. Harrell, Jr., to create the desired LaTeX tables, see ����.

Of course there are many other details in the implementation which we have not mentioned yet. E.g. for each combination of n and γ, ML estimates of the parameters are calculated. This requires a considerable amount of memory. In the beginning we decided to run �� random samples in each simulation. It turned out that for sample sizes up to �� our computers could handle the problem (we ran these programs on a Sun Ultra Station � machine, and each simulation needed more than ��� MB of RAM to finish), but for larger sample sizes we needed to decrease the number of generated samples. For n = ���� ��� we could only simulate �� samples at a time and for n = ��� just �� samples were possible. As a result, we needed a number of auxiliary functions which took the results from each series of simulations and put them together in one complete unit. In Figure � we have only presented the main part of the design. The auxiliary functions and some other functions for generating summary plots are not shown. A complete list of the functions can be obtained at http://www.math.chalmers.se/~nader/software.html. Also in this case, to avoid the very time-consuming task of doing these operations by hand, it is important to choose the names of the data objects so that they can easily be regenerated inside a function.
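An auxiliary function of the kind described, which puts the results of several smaller series together into one unit, might look like the following Python sketch. The data layout (one dictionary per series, mapping the name of a quantity to a list with one value per simulated sample) and the function name `combine_batches` are assumptions made for illustration.

```python
def combine_batches(batches):
    """Concatenate per-sample results from several simulation series."""
    combined = {}
    for batch in batches:
        for name, values in batch.items():
            combined.setdefault(name, []).extend(values)
    # Every quantity should have been recorded for the same number of samples.
    sizes = {len(values) for values in combined.values()}
    if len(sizes) > 1:
        raise ValueError("series do not contain the same quantities per sample")
    return combined
```

The final check guards against a series that finished only partly, which is easy to overlook when the pieces are collected by hand.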

It was also important that we could improve the efficiency of the programs by writing the optimisation routines in a low-level language (Fortran); see the next section for further discussion. It might be of interest to note that each series of simulations (m = ��) took between three days and one week to finish, depending on the load on the computers. It may also be mentioned that we created a script which ran all the combinations for one series as a batch job with the lowest execution priority.

��� Discussion

The design in Figure � was inspired by the principles of object-oriented programming. Briefly, the philosophy underlying object-oriented programming is that objects should be characterised in terms of their behaviour rather than their implementation. Ideally, a programmer attempting to understand a piece of code written in an object-oriented style should require only descriptions of the meaning associated with each operation an object performs, and should not need the actual code used to implement the behaviour of the object. As discussed in Sections ��� and ��, we chose S-plus as the main environment for the calculations. However, we believe that a similar design could be implemented in other object-oriented languages. Chapter � of ��� gives a good overview of the concepts of object-oriented programming in general. Appendix A of ��� discusses the object-oriented features of S.

As mentioned above, we used the print.display ��� software to create the desired LaTeX tables. It is fairly straightforward to define a generic function for creating data frames and to redefine the above functions as different methods for appropriate classes of objects. Exactly the same thing can be done for creating LaTeX tables, see e.g. ���. This kind of abstraction is in line with the inheritance principle in object-oriented programming. It also shows that the process of design and its implementation in a specific language does not go in one direction only. After some progress in the implementation one often realises that it is desirable to create a new class of objects or to combine existing classes in a "superclass".

The main part of the simulations in this section was implemented in pure S-plus. It could have been more efficient to write some parts of the code in a compiled language like Fortran or C. However, the gain in efficiency does not come for free. It usually takes much more time to develop programs in low-level languages like Fortran than in high-level languages like S. In practice, it is in addition difficult to predict how execution time will be distributed over the different parts of a program. It is far better to develop a working program and run it to discover in which sections the execution time is being used, and to improve those sections, than to spend an inordinate amount of time worrying about efficiency early in a project. The following quote from an article by Wulf ���� page ���� provides an interesting remark on this matter: "More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason, including blind stupidity." We have found list structures and multivariate arrays in S-plus very useful and handy in applications. In our opinion, the small reduction in program efficiency due to the use of these features is well worth tolerating.

� Data Analysis

In the previous two sections we discussed some details of the calculations in ���� and ����. We presented the design of the calculations and also focussed on the tools which were used in the implementation. In this section we consider data analysis from a general point of view.

A data analysis task can be divided into three stages:

(i) preparing the data for analysis,

(ii) analysing the data, and

(iii) examining and presenting the results.

We consider the implementation of each stage separately in the following sections.

��� To Prepare the Data for Analysis

Usually the first step in data analysis involves transforming the original data into a form that can be used by a particular package. For example, in ���� the original data were generated by a database program. In order to read the data into S-plus we had to make sure that all of the fields had an equal number of records. This is a typical data manipulation task. Other tasks at this stage might include changing field delimiters in each record, subsetting the data, splitting fields, creating new variables, taking samples and sorting the data.
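A check of this kind is easy to automate. The following Python sketch reports the records whose number of fields differs from that of the first record, which is one way of verifying that a delimited export is rectangular before it is read into a statistics package. The function name `ragged_records` and the comma delimiter are assumptions; the format of the original database export is not known.

```python
import csv


def ragged_records(lines, delimiter=","):
    """Return (record number, field count) for records of unexpected width."""
    rows = list(csv.reader(lines, delimiter=delimiter))
    if not rows:
        return []
    expected = len(rows[0])
    return [(number, len(row))
            for number, row in enumerate(rows, start=1)
            if len(row) != expected]
```

An empty result means that every record has the same number of fields as the first one.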

The text editor Emacs can be used for many of these tasks, see Section �. There are also other possibilities for manipulating data under UNIX. Traditionally, experienced UNIX users use a pattern scanning and processing language like awk, or other tools like sed, grep, tr, C and shell scripts, for this purpose. A rather new possibility is the Perl language which, in our opinion, is very powerful, consistent and easy to use for this purpose. The reason is that Perl combines the best features of the above-mentioned tools into a single language, in addition to some extra features which are not available in any of them.

There are hundreds of built-in functions in Perl and it is easy to create scripts for special data manipulation tasks. We give a very simple example. As mentioned in Section �, we used the print.display software ��� with S-plus for converting data frames to LaTeX tables. One problem with the tables was that they were slightly longer than one page. However, we could fit them into one page by just adding the line \vspace{� cm} after the line containing \begin{table} in each file. Of course this can be done with a text editor, but as there were �� files to edit, doing this by hand would be tedious. The following one-line command in Perl does the same task in less than � seconds. This simple program also generates a backup of all files.

perl -wpi� -e 's/(\\begin{table}.*\n)/$1\\vspace{� cm}\n/' *.tex
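For readers who do not use Perl, the same edit can be sketched in Python. This is an illustrative equivalent, not part of the original work: the function name `add_vspace`, the backup suffix `~` and the spacing value `-1cm` are placeholders, since the values used in the original command are not legible here.

```python
import re
import shutil
from pathlib import Path


def add_vspace(path, spacing="-1cm"):
    """Insert a \\vspace line after each \\begin{table} line, keeping a backup."""
    path = Path(path)
    # Keep a copy of the untouched file next to the original.
    shutil.copy2(path, path.with_name(path.name + "~"))
    text = path.read_text()
    # '.' does not match a newline, so '.*\n' consumes the rest of the line.
    new_text = re.sub(
        r"(\\begin\{table\}.*\n)",
        lambda match: match.group(1) + "\\vspace{" + spacing + "}\n",
        text,
    )
    path.write_text(new_text)
```

Applied to every file matching `*.tex` in a directory, for instance via `Path(".").glob("*.tex")`, this reproduces the effect of the one-line command.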

This is a trivial example which does not really show the power and flexibility of this language. The article by Bates ��� gives several examples of how useful Perl is for different tasks. A more complete reference is the book "Programming Perl" by Wall and Schwartz ����. This book does not cover the new features of Perl 5 (including its object-oriented features), but still gives a very good introduction. Manipulating text is near to the heart of Perl but there are many other application areas. For example, we used Perl to develop CGI scripts in WWW applications. An updated reference to the new features of Perl is the manual of version �����. It is freely available from many sites, e.g. http://www.catt.ncsu.edu/users/bex/www/tutor/perl.html.

��� To Analyse the Data

There are many tools available for data analysis. Some packages, like Minitab, can be learnt very quickly and are, e.g., suitable for teaching statistics to applied researchers.

For more advanced and developmental work we found S-plus very useful. There are many reasons for this, such as its object-oriented features, vector-oriented programming, list structures and high-dimensional arrays. As discussed in the previous sections, it is sometimes necessary to combine programming in S with some low-level language like Fortran or C to decrease the execution time of the program. There is no general rule about when this is a good decision.


��� To Examine and Present the Results

This is the last stage in data analysis. It is important to have this stage in mind when one chooses the tools for the first two stages. In Section ��� we gave an example of how one can use Mathematica to obtain the results in different formats. Sometimes it is more effective to use specially developed software like print.display ��� to generate the results in the desired format. If no specialised software exists, one can always write a script in a language like Perl or C to perform the desired task. One of the main messages of this paper is that one should consider all three stages of data analysis as a whole: before starting to write any program, the analyst should have a preliminary plan in mind for how each stage will be carried out.

� Other Implementation Details

The key thesis of this paper is that a professional data analyst or researcher should be familiar with several general tools and that the key to success lies in the ability to combine these tools. So far we have mentioned tools which can be used in a very broad spectrum of applications. These include tools for text manipulation, simulation, symbolic calculation and general optimisation routines. Researchers will benefit from the possibility of developing their programs and running different tools from one common environment. We believe that the full Emacs ���� implementation that is available under UNIX provides the necessary tools for this purpose. Emacs is a rational, extensible environment that provides the user with a consistent interface for performing a large variety of tasks, without forcing the user to learn an idiosyncratic set of commands for each application. We do not know of any better computing environment for the kind of 'experimental programming' which is necessary for efficient and effective research and data analysis.

Learning to use the more advanced features of Emacs requires some investment of time. However, with the built-in tutorial and the intelligent command completion features, the effort is repaid in time saved across applications. The following is just a list of a few things we did in Emacs:

(i) Text editing inside a host of other environments, and general file management.

(ii) Using UNIX shells, and hence awk, sed, make, . . . , under shell-mode.

(iii) Writing Perl programs under perl-mode.

(iv) Editing, compiling, debugging and printing LaTeX documents under AucTex-mode.

(v) Writing, compiling, testing and running Fortran programs under fortran-mode.

(vi) Reading, writing, filing and sending e-mails and news.

(vii) Creating and previewing WWW documents in html-mode.

(viii) Running S-plus under S-mode.

(ix) Running and documenting Mathematica sessions under latex-mathematica-mode.

(x) Using the concurrent versions system (CVS) for general software revision and release control under vc-mode (see ��� for an introduction to CVS).

Successful use of these tools requires that one grasps the abstract concept of a "mode" in Emacs. Each major mode customises Emacs for editing text of a particular sort. It should be mentioned that Emacs, S-mode and the other packages mentioned above are complex systems in and of themselves, and, for example, the problem of learning S-plus should not be compounded by those of learning Emacs and S-mode. In addition, to understand why the occasional function that runs without any problem under a UNIX shell does not parse under S-mode, it is necessary to distinguish among the three. The main point is that Emacs provides very powerful tools for running just about any major program under UNIX, and someone who intends to run, say, S-plus under UNIX would be well compensated for the time and energy spent learning it.

Most of the packages under Emacs are written in Emacs Lisp � � and one can use this language to customise or extend existing packages or to create new ones. Several WWW sites on the Internet are devoted to archiving these extensions, and before starting to develop a new package it is a good idea to check these sites. Another good source of information is the various discussion groups on USENET. The messages in many of these groups can also be received by subscribing to an appropriate mailing list.

� Summary

In the previous sections we discussed and proposed different tools for performing various tasks in research and data analysis. Some of the languages/packages were:

• the low-level languages Fortran and C,

• the languages for writing scripts and manipulating text, Perl and Emacs Lisp,

• a package for symbolic calculations, Mathematica,

• user-callable subroutines for mathematical and statistical computation, available in the NAG Fortran Library,

• a system for statistical computations and data analysis, S-plus,

• the concurrent versions system (CVS) for general software revision and release control,

• UNIX and a few of the utilities it provides, e.g. sed, grep, tr and awk,

• a common general environment for calculations, Emacs, and many specialised tools which are available under it.

Learning these tools requires a big investment of time and effort, but the payoff is also large. The data analyst who has many tools in his/her toolbox will be at an advantage over one who has fewer. As discussed in some detail above, there is an interaction between the design and the implementation of statistical computations. Knowledge of these tools and their capabilities will result in a better and more effective design of calculations. We have found these tools extremely effective and powerful, and recommend learning them.

However, there is no claim that the packages/languages are "optimal" or "best" in any sense. Comparing the designs and implementations discussed in Sections � and � shows that there is no "unique" solution in this kind of application, and that each problem should be analysed independently and an algorithm for solving it designed separately. Furthermore, the design and implementation of solutions are highly affected by the experience and background of the researchers. Let us quote a remark by Budd ���� page ����: "Although it is true that there are many benefits to using object-oriented programming techniques, it is also true that programming a computer is still one of the most difficult tasks ever undertaken by humankind; becoming proficient in programming requires talent, creativity, intelligence, logic, the ability to build and use abstractions, and experience - even when the best of tools are available".

Acknowledgement

I would like to thank my supervisor Holger Rootzén for supporting the idea of writing this paper and also for many stimulating and helpful discussions and comments on the article. Thanks also to Mary Sheeran for comments on an early version of the manuscript.

References

[1] Bates, D.M. �� �� Data Manipulation in Perl. Computing Science and Statistics: Proceedings of the �th Symposium on the Interface, ��� ��������

[2] Becker, R.A., Chambers, J.M. and Wilks, A.R. �� ��� The New S Language: A Programming Environment for Data Analysis and Graphics. Wadsworth & Brooks/Cole Computer Science Series.

[3] Berliner, B. �� �� Parallelizing Software Development. Conference Proceedings of the USENIX Association's Winter � � Conference, Washington, DC.

[4] Budd, T. �� �� An Introduction to Object-Oriented Programming. Addison-Wesley Publishing Company, Inc.

[5] Chambers, J.M. and Hastie, T.J. �� �� Statistical Models in S. Wadsworth & Brooks/Cole Computer Science Series.

[6] Heiberger, R.M. and Harrell, F.E. �� �� Design of Object-Oriented Functions in S for Screen Display, Interface and Control of Other Programs (SAS and LaTeX), and S Programming. Proceedings of the ��th Symposium on the Interface.

[7] Lawley, D.N. �� ��� A General Method for Approximating to the Distribution of Likelihood Ratio Criteria. Biometrika ��� � ������

[8] Lemay, L. and Perkins, C.L. �� �� Teach Yourself Java in �� Days. Sams Publishing.

[9] Lewis, B., Laliberte, D., Stallman, R. and the GNU Manual Group �� �� GNU Emacs Lisp Reference Manual. Free Software Foundation, Inc.

[10] Rootzén, H. and Tajvidi, N. �� �� Extreme value statistics and wind storm losses: a case study. To appear in Scandinavian Actuarial Journal.

[11] Stallman, R. �� �� GNU Emacs Manual, Eleventh Edition, Updated for Emacs Version � ����. Free Software Foundation, Inc.

[12] Tajvidi, N. �� �� Multivariate generalised Pareto distributions. Report, Mathematical Statistics, Chalmers University of Technology.

[13] Tajvidi, N. �� �� Confidence Intervals and Accuracy Estimation for Heavytailed Generalised Pareto Distribution. Report, Mathematical Statistics, Chalmers University of Technology.

[14] Wall, L. and Schwartz, R.L. �� �� Programming Perl. O'Reilly & Associates, Inc.


Appendix to

Characterisation and Some Statistical Aspects of Univariate and Multivariate Generalised Pareto Distributions

Nader Tajvidi

Department of Mathematics
Göteborg ����


List of Tables

1 Percentage of misses in one-sided upper-limited confidence intervals for �. �bc� �� and �boot� �� stand for bias corrected and percentile bootstrap �� confidence intervals, respectively.

2 Percentage of misses in one-sided lower-limited confidence intervals for �. �bc����� stands for bias corrected confidence intervals with confidence level � � ���� � ����; �boot����� stands for percentile bootstrap confidence intervals.

3 Percentage of misses in two-sided confidence intervals for �

4 Mean length of confidence intervals for �

5 Mean symmetry of confidence intervals for �

6 Percentage of misses in one-sided upper-limited confidence intervals for �. �bc� �� and �boot� �� stand for bias corrected and percentile bootstrap �� confidence intervals, respectively.

7 Percentage of misses in one-sided lower-limited confidence intervals for �. �bc����� stands for bias corrected confidence intervals with confidence level � � ���� � ����; �boot����� stands for percentile bootstrap confidence intervals.

8 Percentage of misses in two-sided confidence intervals for �

9 Mean length of confidence intervals for �

10 Mean symmetry of confidence intervals for �

11 Percentage of misses in one-sided upper-limited confidence intervals for Q�����. �bc� �� and �boot� �� stand for bias corrected and percentile bootstrap �� confidence intervals, respectively.

12 Percentage of misses in one-sided lower-limited confidence intervals for Q�����. �bc����� stands for bias corrected confidence intervals with confidence level �� ���� � ����; �boot����� stands for percentile bootstrap confidence intervals.

13 Percentage of misses in two-sided confidence intervals for Q�����

14 Mean length of confidence intervals for Q�����

15 Mean symmetry of confidence intervals for Q�����

16 Percentage of misses of profile likelihood confidence intervals for � with confidence level ���. "prof����" and "prof�����nas" give respectively the percentage of misses or "not-availables" in the left side.

17 Percentage of times profile likelihood confidence intervals with confidence level �� failed to cover the true value of �. "prof����" gives the percentage of misses or "not-availables" in the left side.

18 Percentage of misses of profile likelihood confidence intervals for � with confidence level ���. "prof����" and "prof�����nas" give respectively the percentage of misses or "not-availables" in the left side.

19 Percentage of times profile likelihood confidence intervals with confidence level �� failed to cover the true value of �. "prof����" gives the percentage of misses or "not-availables" in the left side.

20 Mean length of profile likelihood �� confidence intervals for � and �

21 Percentage of misses of corrected profile likelihood confidence intervals for � with confidence level ���. "prof����" and "prof�����nas" give respectively the percentage of misses or "not-availables" in the left side.

22 Percentage of times corrected profile likelihood confidence intervals with confidence level �� failed to cover the true value of �. "prof����" gives the percentage of misses or "not-availables" in the left side.

23 Percentage of misses of corrected profile likelihood confidence intervals for � with confidence level ���. "prof����" and "prof�����nas" give respectively the percentage of misses or "not-availables" in the left side.

24 Percentage of times corrected profile likelihood confidence intervals with confidence level �� failed to cover the true value of �. "prof����" gives the percentage of misses or "not-availables" in the left side.

25 Mean length of corrected profile likelihood �� confidence intervals for � and �

26 Percentage of misses of one-sided upper-limited confidence intervals for Q����� using the "Russian doll principle". �boot� �� stands for percentile bootstrap �� confidence intervals.

27 Percentage of misses of one-sided lower-limited confidence intervals for Q����� using the "Russian doll principle". �boot����� stands for percentile bootstrap confidence intervals with confidence level �� ���� � ����.

28 Percentage of misses of two-sided confidence intervals for Q����� using the "Russian doll principle"

29 Mean length of confidence intervals for Q����� using the "Russian doll principle"

30 Jackknife, bootstrap and simulation estimates of bias of ��

31 Jackknife, bootstrap and simulation estimates of bias of ��

32 Jackknife, bootstrap and simulation estimates of bias of dQ�����

33 Estimates of standard error of b�

34 Estimates of standard error of b�

35 Estimates of standard error of dQ�����

36 Mean of bootstrap sample correlation, asymptotic correlation and sample mean of asymptotic correlation of b� and b�

37 Jackknife, bootstrap and simulation estimates of coefficient of variation of b�

38 Jackknife, bootstrap and simulation estimates of coefficient of variation of b�

39 Jackknife, bootstrap and simulation estimates of coefficient of variation of dQ�����

40 Jackknife, bootstrap and simulation estimates of rmse of b�

41 Jackknife, bootstrap and simulation estimates of rmse of b�

42 Jackknife, bootstrap and simulation estimates of rmse of dQ�����

Table �� Percentage of misses in one sided upper limited confidence intervals for �� �bc� �� and �boot� �� stand for bias corrected and percentile bootstrap �� confidence intervals respectively.

simulation bc�� bc��� bc���� boot�� boot��� boot��������

n�� ���� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ���� ����n��� ����� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ���� ����n� � ����� ���� ���� ���� ���� ����n�� ����� ���� ���� ��� ���� ����

����

n�� ��� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ���� ����n�� ��� ���� ���� ���� ���� ����n��� ��� ���� ���� ���� ���� ����n� ��� ���� ���� ���� ���� ����n� � ��� ���� ���� ���� ���� ����n�� ����� ���� ���� ���� ���� ���������

n�� ���� ���� ���� ���� ���� ����n�� ��� ���� ���� ���� ���� ����n�� ����� ���� ���� ���� ���� ����n�� ��� ���� ���� ���� ���� ����n��� ����� ���� ���� ���� ���� ����n� ��� ���� ���� ���� ���� ����n� � ����� ���� ���� ���� ���� ����n�� ��� ���� ���� ���� ���� ���������

n�� ��� ���� ���� ���� ���� ����n�� ����� ���� ���� ���� ���� ����n�� ����� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ���� ����n��� ����� ���� ���� ���� ���� ����n� ����� ���� ���� ���� ���� ����n� � ����� ���� ���� ���� ���� ����n�� ����� ���� ���� ��� ���� ����

����

n�� ��� ���� ���� ���� ���� ����n�� ����� ���� ���� ���� ���� ����n�� ��� ���� ���� ���� ���� ����n�� ��� ���� ���� ���� ���� ����n��� ����� ���� ���� ���� ���� ����n� ����� ���� ���� ���� ���� ����n� � ��� ���� ���� ���� ���� ����n�� ����� ���� ���� ���� ���� ����
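The "boot" columns in the table above report how often percentile bootstrap limits miss the true parameter value. As an illustration only, the following Python sketch shows one way such a miss percentage could be computed for a one-sided upper limit. The estimator (a sample mean), the exponential data, and all function names and default settings are assumptions made for this example; they are not the estimators of the generalised Pareto parameters or the simulation settings used for the tables.

```python
import math
import random


def percentile_bootstrap_upper_limit(sample, estimator, level=0.95, n_boot=500, rng=None):
    """One-sided upper percentile bootstrap limit for estimator(sample)."""
    if rng is None:
        rng = random.Random()
    n = len(sample)
    # Re-estimate on n_boot resamples drawn with replacement from the data.
    replicates = sorted(
        estimator([rng.choice(sample) for _ in range(n)]) for _ in range(n_boot)
    )
    # The limit is the empirical `level` quantile of the bootstrap replicates.
    k = min(n_boot, max(1, math.ceil(level * n_boot)))
    return replicates[k - 1]


def miss_percentage(true_value, draw_sample, estimator,
                    n_sim=200, level=0.95, n_boot=200, seed=0):
    """Percentage of simulated samples whose upper limit lies below true_value."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(n_sim):
        sample = draw_sample(rng)
        upper = percentile_bootstrap_upper_limit(sample, estimator, level, n_boot, rng)
        if upper < true_value:
            misses += 1
    return 100.0 * misses / n_sim


def sample_mean(xs):
    # Hypothetical estimator used only for this illustration.
    return sum(xs) / len(xs)


if __name__ == "__main__":
    # Hypothetical setting: exponential data with true mean 1 and sample size 50.
    pct = miss_percentage(
        1.0, lambda rng: [rng.expovariate(1.0) for _ in range(50)], sample_mean
    )
    print(f"misses: {pct:.1f}% (nominal 5%)")
```

A miss percentage close to the nominal level (5% for a 95% one-sided limit) indicates good coverage; values well above it indicate that the limits are too low too often.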

Table �� Percentage of misses in one sided lower limited confidence intervals for �� �bc����� stands for bias corrected confidence intervals with confidence level �� ���� � ����� �boot����� stands for percentile bootstrap confidence intervals.

simulation bc��� bc�� bc� boot��� boot�� boot� ����

n�� ��� ����� ����� ����� ����� �����n�� ���� ����� ����� ����� ����� �����n�� ���� ����� ����� ����� ����� �����n�� ���� ��� ����� ����� ����� �����n��� ���� ��� ����� ����� ����� �����n� ���� ���� ����� ��� ����� �����n� � ���� ���� ����� ���� ����� �����n�� ���� ���� ����� ���� ����� �����

����

n�� ���� ��� ����� ����� ����� �����n�� ���� ����� ����� ����� ����� ����n�� ��� ����� ���� ����� ����� ����n�� ���� ����� ����� ����� ���� �����n��� ���� ���� ����� ��� ����� �����n� ���� ��� ����� ��� ����� �����n� � ���� ���� ����� ���� ����� ����n�� ���� ���� ����� ���� ����� ����������

n�� ����� ����� ���� ����� ����� �����n�� ��� ����� ����� ����� ����� �����n�� ���� ��� ����� ����� ���� �����n�� ���� ��� ����� ����� ����� ����n��� ���� ���� ����� ��� ����� ����n� ���� ���� ����� ��� ����� �����n� � ���� ���� ����� ���� ��� �����n�� ���� ���� ����� ���� ��� ����������

n�� ��� ����� ���� ���� ����� �����n�� ���� ��� ����� ����� ����� �����n�� ���� ��� ����� ����� ����� �����n�� ���� ����� ����� ����� ����� �����n��� ���� ����� ����� ����� ����� �����n� ���� ���� ����� ���� ����� ����n� � ���� ��� ����� ��� ����� ����n�� ���� ���� ��� ���� ��� �����

����

n�� ���� ��� ����� ����� ����� ����n�� ���� ��� ����� ����� ���� �����n�� ���� ��� ����� ����� ����� �����n�� ���� ���� ����� ����� ����� �����n��� ���� ���� ����� ���� ����� �����n� ���� ���� ����� ���� ����� �����n� � ���� ���� ����� ���� ����� �����n�� ���� ���� ��� ���� ���� �����

Table �� Percentage of misses in two sided confidence intervals for �

simulation bc�� bc��� boot�� boot�������

n�� ����� ����� ����� �����n�� ����� ��� ����� �����n�� ����� ��� ����� �����n�� ����� ���� ���� �����n��� ����� ���� ����� �����n� ����� ���� ����� ���n� � ����� ���� ����� ����n�� ����� ���� ����� ���

����

n�� ����� ��� ����� �����n�� ����� ���� ����� �����n�� ����� ����� ����� ����n�� ����� ���� ���� �����n��� ����� ���� ����� �����n� ����� ��� ����� ���n� � ����� ���� ����� ���n�� ��� ���� ����� ���������

n�� ���� ����� ����� �����n�� ����� ����� ����� �����n�� ����� ���� ����� �����n�� ����� ���� ����� �����n��� ����� ���� ����� ���n� ����� ���� ����� ���n� � ����� ���� ����� ����n�� ����� ���� ����� ���������

n�� ����� ����� ����� ����n�� ����� ���� ����� �����n�� ����� ���� ���� �����n�� ����� ���� ����� �����n��� ����� ���� ����� �����n� ����� ���� ����� ���n� � ����� ���� ����� �����n�� ����� ���� ����� ��������

n�� ����� ����� ����� �����n�� ����� ��� ����� �����n�� ����� ���� ���� �����n�� ����� ���� ����� �����n��� ����� ���� ����� ����n� ����� ���� ����� ����n� � ����� ���� ����� ���n�� ��� ���� ��� ����

Table �� Mean length of confidence intervals for �

simulation bc�� bc��� boot�� boot�������

n�� ����� ����� ��� �����n�� ���� ����� ���� �����n�� ���� ���� ����� ���n�� ����� ����� ����� �����n��� ���� ���� ���� ����n� ����� ���� ����� ����n� � ���� ���� ����� ����n�� ���� ����� ���� �����

����

n�� ����� ���� ���� �����n�� ���� ����� �� �����n�� ���� ��� ���� ����n�� ��� ���� ���� ����n��� ���� ����� ���� �����n� ����� ����� ����� �����n� � ���� ����� ����� �����n�� ����� ��� ����� ���������

n�� ���� ���� ���� �����n�� ����� ���� ����� �����n�� ��� ����� ���� �����n�� ���� ���� ���� ����n��� ����� ����� ���� �����n� ����� ����� ���� �����n� � ����� ����� ����� �����n�� ����� ����� ���� ���������

n�� ����� ���� ����� ����n�� ����� ���� ����� ����n�� ���� ���� ���� ����n�� ���� ����� ���� ���n��� ���� ���� ����� ����n� ���� ����� ����� ���n� � ���� ���� ����� �����n�� ����� ����� ����� ��������

n�� ���� ����� ����� �����n�� ����� ���� ����� �����n�� ����� ���� ����� �����n�� ���� ����� ���� ����n��� ���� ���� ���� ����n� ����� ���� ����� �����n� � ����� ����� ����� ����n�� ����� ����� ���� �����

Table �� Mean of confidence intervals symmetry for �

simulation bc�� bc��� boot�� boot�������

n�� ����� ���� ��� ����n�� ����� ����� ����� �����n�� ����� ����� ����� �����n�� ����� ���� ����� ����n��� ����� ���� ����� �����n� ����� ����� ����� ����n� � ��� ���� ����� ����n�� ���� ����� ��� �����

����

n�� ����� ����� ���� ����n�� ����� ����� ����� �����n�� ����� ����� ����� �����n�� ���� ��� ����� �����n��� ����� ���� ���� �����n� ����� ��� ����� �����n� � ����� ���� ����� ����n�� ���� ���� ���� ����������

n�� ����� ���� ���� �����n�� ���� ����� ����� �����n�� ����� ����� ���� �����n�� ����� ���� ����� ����n��� ���� ����� ����� ����n� ���� ���� ����� �����n� � ���� ����� ���� �����n�� ���� ����� ����� ����������

n�� ����� ����� ����� �����n�� ���� ����� ����� ����n�� ���� ���� ���� �����n�� ����� ���� ����� �����n��� ���� ���� ����� �����n� ����� ���� ���� ����n� � ��� ����� ��� ����n�� ����� ���� ���� ���������

n�� ����� ���� ���� ����n�� ����� ���� ����� �����n�� ����� ���� ����� �����n�� ���� ���� ����� �����n��� ����� ����� ����� �����n� ���� ����� ���� �����n� � ����� ���� ����� ����n�� ���� ���� ����� �����

Table � Percentage of misses in one sided upper limited confidence intervals for �� �bc� �� and �boot� �� stand for bias corrected and percentile bootstrap �� confidence intervals respectively.

simulation bc�� bc��� bc���� boot�� boot��� boot��������

n�� ��� ���� ���� ���� ���� ����n�� ��� ���� ���� ���� ���� ����n�� ��� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ���� ����n��� ��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ���� ����n� � ��� ���� ���� ���� ���� ����n�� ����� ���� ���� ��� ���� ����

����

n�� ��� ���� ���� ���� ���� ����n�� ��� ���� ���� ���� ���� ����n�� ��� ���� ���� ���� ���� ����n�� ��� ���� ���� ���� ���� ����n��� ��� ���� ���� ���� ���� ����n� ��� ���� ���� ��� ���� ����n� � ����� ���� ���� ��� ���� ����n�� ����� ���� ���� ��� ���� ���������

n�� ��� ���� ���� ���� ���� ����n�� ��� ���� ���� ���� ���� ����n�� ��� ���� ���� ���� ���� ����n�� ��� ���� ���� ���� ���� ����n��� ����� ���� ���� ��� ���� ����n� ����� ���� ���� ��� ���� ����n� � ��� ���� ���� ���� ���� ����n�� ����� ���� ���� ����� ���� ���������

n�� ����� ���� ���� ���� ���� ����n�� ����� ���� ���� ��� ���� ����n�� ��� ���� ���� ���� ���� ����n�� ��� ���� ���� ���� ���� ����n��� ��� ���� ���� ���� ���� ����n� ��� ���� ���� ���� ���� ����n� � ����� ���� ���� ��� ���� ����n�� ����� ���� ���� ��� ���� ����

����

n�� ��� ���� ���� ���� ���� ����n�� ��� ���� ���� ���� ���� ����n�� ����� ���� ���� ���� ���� ����n�� ����� ���� ���� ���� ���� ����n��� ����� ���� ���� ����� ���� ����n� ����� ���� ���� ��� ���� ����n� � ���� ���� ���� ���� ���� ����n�� ����� ���� ���� ��� ���� ����

Table �� Percentage of misses in one sided lower limited confidence intervals for �� �bc����� stands for bias corrected confidence interval with confidence level �� ���� � ����� �boot����� stands for percentile bootstrap confidence intervals.

simulation bc��� bc�� bc� boot��� boot�� boot� ����

n�� ���� ���� ����� ���� ����� �����n�� ���� ���� ����� ���� ����� ����n�� ���� ���� ����� ���� ���� �����n�� ���� ���� ����� ���� ��� �����n��� ���� ���� ��� ���� ��� �����n� ���� ���� ����� ���� ��� �����n� � ���� ���� ��� ���� ���� �����n�� ���� ���� ��� ���� ���� �����

����

n�� ���� ���� ��� ���� ��� �����n�� ���� ���� ����� ���� ��� ����n�� ���� ���� ����� ���� ��� �����n�� ���� ���� ����� ���� ���� �����n��� ���� ���� ����� ���� ��� �����n� ���� ���� ����� ���� ��� �����n� � ���� ���� ����� ���� ���� �����n�� ���� ���� ��� ���� ���� ����������

n�� ���� ���� ����� ���� ����� �����n�� ���� ���� ����� ���� ��� �����n�� ���� ���� ����� ���� ���� �����n�� ���� ���� ��� ���� ���� �����n��� ���� ���� ���� ���� ���� �����n� ���� ���� ��� ���� ���� �����n� � ���� ���� ����� ���� ���� �����n�� ���� ���� ����� ���� ��� ����������

n�� ���� ���� ����� ���� ��� �����n�� ���� ���� ����� ���� ����� �����n�� ���� ���� ����� ���� ��� �����n�� ���� ���� ����� ���� ��� �����n��� ���� ���� ��� ���� ���� �����n� ���� ���� ����� ���� ���� �����n� � ���� ���� ��� ���� ���� �����n�� ���� ���� ��� ���� ���� �����

����

n�� ���� ���� ��� ���� ��� �����n�� ���� ���� ����� ���� ��� ����n�� ���� ���� ����� ���� ����� �����n�� ���� ���� ����� ���� ��� �����n��� ���� ���� ��� ���� ���� �����n� ���� ���� ����� ���� ���� �����n� � ���� ���� ����� ���� ��� �����n�� ���� ���� ��� ���� ���� ���

Table �� Percentage of misses in two sided confidence intervals for �

simulation bc�� bc��� boot�� boot�������

n�� ����� ���� ����� ����n�� ����� ���� ����� ����n�� ��� ���� ��� ����n�� ����� ���� ����� ����n��� ��� ���� ����� ����n� ����� ���� ����� ����n� � ���� ���� ��� ����n�� ����� ���� ��� ����

����

n�� ��� ���� ����� ����n�� ��� ���� ����� ����n�� ��� ���� ����� ����n�� ��� ���� ����� ����n��� ����� ���� ����� ����n� ����� ���� ����� ����n� � ����� ���� ����� ����n�� ���� ���� ���� ���������

n�� ����� ���� ����� ����n�� ��� ���� ����� ����n�� ��� ���� ����� ����n�� ��� ���� ��� ����n��� ��� ���� ��� ����n� ��� ���� ��� ����n� � ����� ���� ����� ����n�� ����� ���� ����� ��������

n�� ����� ��� ����� ����n�� ����� ���� ����� ����n�� ����� ���� ����� ���n�� ��� ���� ����� ����n��� ��� ���� ��� ����n� ��� ���� ����� ����n� � ��� ���� ��� ����n�� ��� ���� ��� ��������

n�� ��� ���� ����� ����n�� ����� ���� ����� ����n�� ����� ���� ����� ����n�� ����� ���� ����� ����n��� ����� ���� ����� ����n� ����� ���� ����� ����n� � ����� ���� ����� ����n�� ��� ���� ��� ����

Table � Mean length of confidence intervals for �

simulation bc�� bc��� boot�� boot�������

n�� ����� ���� ��� �����n�� ���� ����� ����� ����n�� ���� ��� ����� ����n�� ���� ����� ��� ����n��� ����� ����� ����� ����n� ����� ����� ���� �����n� � ����� ����� ���� �����n�� ���� ����� ����� �����

����

n�� ���� ����� ���� �����n�� ����� ��� ���� ����n�� ����� ����� ����� �����n�� ��� ����� ���� �����n��� ����� ���� ���� ����n� ���� ����� ��� �����n� � ����� ����� ����� �����n�� ��� ����� ����� ���������

n�� ����� ����� ����� ����n�� ����� ����� ���� ����n�� ����� ��� ����� ����n�� ��� ����� ���� ����n��� ����� ���� ����� ���n� ���� ���� ����� �����n� � ���� ���� ���� ����n�� ����� ����� ���� ����������

n�� ����� ����� ����� �����n�� ����� ����� ����� ����n�� ����� ����� ���� �����n�� ��� ����� ����� �����n��� ����� ��� ���� ���n� ����� ����� ���� ����n� � ���� ���� ����� ����n�� ����� ����� ���� ���������

n�� ����� ���� ����� �����n�� ����� ���� ���� �����n�� ����� ���� ����� �����n�� ����� ����� ����� ����n��� ����� ���� ���� ����n� ���� ���� ���� ����n� � ����� ����� ����� ����n�� ����� ����� ����� ����

Table ��� Mean of confidence intervals symmetry for �

simulation bc�� bc��� boot�� boot�������

n�� ����� ���� ����� �����n�� ���� ���� ����� �����n�� ��� ����� ����� ����n�� ����� ����� ����� ����n��� ���� ��� ����� �����n� ����� ����� ���� �����n� � ����� ����� ���� �����n�� ����� ���� ���� ����

����

n�� ���� ���� ����� �����n�� ����� ����� ����� ����n�� ����� ����� ����� ����n�� ���� ����� ����� �����n��� ����� ����� ����� ����n� ���� ��� ����� �����n� � ����� ����� ���� ����n�� ����� ����� ���� ���������

n�� ����� ���� ����� �����n�� ����� ����� ���� �����n�� ���� ����� ����� �����n�� ���� ���� ����� �����n��� ���� ���� ���� ����n� ����� ����� ���� ����n� � ����� ����� ����� �����n�� ���� ����� ����� ����������

n�� ����� ����� ����� �����n�� ���� ����� ���� ����n�� ����� ����� ����� �����n�� ����� ���� ����� �����n��� ����� ���� ����� �����n� ���� ����� ���� ���n� � ����� ����� ����� �����n�� ����� ����� ���� ���������

n�� ����� ���� ����� ����n�� ����� ����� ���� �����n�� ����� ����� ���� ����n�� ���� ����� ���� �����n��� ����� ����� ����� ����n� ���� ����� ���� ����n� � ����� ����� ����� �����n�� ����� ����� ����� ����

Table ��� Percentage of misses in one sided upper limited confidence intervals for Q������ �bc� �� and �boot� �� stand for bias corrected and percentile bootstrap �� confidence intervals respectively.

simulation bc�� bc��� bc���� boot�� boot��� boot��������

n�� ����� ����� ���� ���� ����� �����n�� ����� ����� ����� ����� ���� �����n�� ����� ����� ����� ����� ���� �����n�� ����� ��� ���� ����� ����� ���n��� ����� ���� ���� ����� ����� ����n� ��� ���� ���� ����� ��� ����n� � ����� ���� ���� ����� ��� ����n�� ����� ���� ���� ����� ��� ����

����

n�� ����� ����� ����� ���� ����� �����n�� ���� ����� ����� ����� ���� �����n�� ���� ����� ����� ����� ����� �����n�� ����� ��� ���� ���� ����� ���n��� ����� ���� ���� ����� ��� ����n� ����� ���� ���� ����� ����� ����n� � ��� ���� ���� ����� ���� ����n�� ����� ���� ���� ����� ��� ���������

n�� ����� ���� ����� ���� ����� �����n�� ����� ���� ����� ���� ����� ����n�� ����� ����� ��� ����� ����� �����n�� ����� ��� ���� ���� ����� ���n��� ����� ��� ���� ���� ����� �����n� ����� ���� ���� ���� ����� ����n� � ����� ���� ���� ����� ��� ����n�� ��� ���� ���� ����� ���� ���������

n�� ����� ���� ����� ���� ����� �����n�� ����� ����� ��� ����� ����� �����n�� ����� ����� ���� ����� ����� �����n�� ����� ��� ���� ���� ����� �����n��� ����� ��� ���� ���� ����� �����n� ����� ���� ���� ����� ����� ����n� � ����� ��� ���� ���� ����� ���n�� ����� ���� ���� ����� ���� ����

����

n�� ����� ����� ����� ����� ����� ����n�� ����� ����� ��� ����� ����� �����n�� ����� ��� ���� ����� ����� ���n�� ����� ��� ���� ����� ����� ���n��� ����� ��� ���� ����� ����� ����n� ����� ���� ���� ����� ����� ����n� � ����� ���� ���� ����� ��� ����n�� ����� ���� ���� ����� ��� ����

Table ��� Percentage of misses in one sided lower limited confidence intervals for Q������ �bc����� stands for bias corrected confidence intervals with confidence level � � ���� � ����� �boot����� stands for percentile bootstrap confidence intervals.

simulation bc��� bc�� bc� boot��� boot�� boot� ����

n�� ���� ���� ��� ���� ���� ����n�� ���� ���� ����� ���� ���� ����n�� ���� ���� ���� ���� ���� ����n�� ���� ���� ����� ���� ���� ����n��� ���� ���� ����� ���� ���� ����n� ���� ���� ����� ���� ���� ���n� � ���� ���� ��� ���� ���� ����n�� ���� ���� ����� ���� ���� �������

n�� ���� ���� ��� ���� ���� ����n�� ���� ���� ��� ���� ���� ����n�� ���� ���� ���� ���� ���� ����n�� ���� ���� ����� ���� ���� ����n��� ���� ���� ����� ���� ���� ����n� ���� ���� ����� ���� ���� ����n� � ���� ���� ����� ���� ���� ����n�� ���� ���� ����� ���� ���� ��������

n�� ���� ���� ��� ���� ���� ����n�� ���� ���� ��� ���� ���� ����n�� ���� ���� ��� ���� ���� ����n�� ���� ���� ����� ���� ���� ����n��� ���� ���� ����� ���� ���� ����n� ���� ���� ��� ���� ���� ����n� � ���� ���� ����� ���� ���� ����n�� ���� ���� ����� ���� ���� ����

�����

n�� ���� ���� ����� ���� ���� ����n�� ���� ���� ����� ���� ���� ����n�� ���� ���� ����� ���� ���� ����n�� ���� ���� ����� ���� ���� ����n��� ���� ���� ����� ���� ���� ����n� ���� ���� ����� ���� ���� ����n� � ���� ���� ��� ���� ���� ����n�� ���� ���� ����� ���� ���� �������

n�� ���� ���� ����� ���� ���� ����n�� ���� ���� ����� ���� ���� ����n�� ���� ���� ��� ���� ���� ����n�� ���� ���� ��� ���� ���� ����n��� ���� ���� ����� ���� ���� ����n� ���� ���� ���� ���� ���� ����n� � ���� ���� ����� ���� ���� ���n�� ���� ���� ����� ���� ���� ����

Table ��� Percentage of misses in two sided confidence intervals for Q�����

simulation bc�� bc��� boot�� boot�������

n�� ����� ����� ����� �����n�� ����� ����� ����� �����n�� ���� ����� ����� �����n�� ����� ��� ����� �����n��� ����� ���� ����� ���n� ����� ���� ����� ����n� � ��� ���� ����� ����n�� ����� ��� ����� �������

n�� ����� ����� ����� �����n�� ����� ����� ����� �����n�� ����� ����� ���� �����n�� ����� ���� ����� �����n��� ����� ���� ����� ����n� ����� ��� ����� ���n� � ����� ���� ����� ����n�� ����� ���� ����� ���

�����

n�� ����� ����� ����� �����n�� ����� ����� ����� ����n�� ����� ����� ���� �����n�� ����� ��� ����� �����n��� ����� ����� ����� �����n� ����� ���� ����� ���n� � ����� ���� ����� ����n�� ����� ���� ����� ���������

n�� ����� ���� ����� �����n�� ����� ����� ����� �����n�� ����� ��� ����� �����n�� ����� ����� ����� �����n��� ����� ���� ����� �����n� ����� ���� ����� ����n� � ����� ���� ����� ���n�� ����� ���� ����� ����

����

n�� ����� ����� ����� ����n�� ���� ����� ���� �����n�� ����� ����� ����� �����n�� ����� ��� ����� ���n��� ����� ���� ����� ���n� ����� ���� ����� ����n� � ����� ���� ����� ����n�� ����� ���� ����� ���

Table ��� Mean length of confidence intervals for Q�����

simulation bc�� bc��� boot�� boot�������

n�� ����� ����� ���� ����n�� ���� ����� ����� ����n�� ���� ���� ����� �����n�� ����� ����� ����� ����n��� ����� ����� ��� �����n� ����� ���� ����� ����n� � ����� ����� ��� �����n�� ��� ����� ���� ���������

n�� ��� ����� ����� ����n�� ��� ���� ��� ����n�� ����� ����� ���� �����n�� ����� ���� ����� �����n��� ���� ���� ����� ����n� ����� ���� ��� ����n� � ����� ����� ��� ���n�� ����� ���� ���� �����

�����

n�� ���� ����� ��� ����n�� ����� ���� ��� ���n�� ���� ����� ���� ���n�� ����� ����� ����� ����n��� ���� ���� ����� ����n� ���� ����� ����� ����n� � ���� ����� ���� ����n�� ����� ���� ����� ���������

n�� ������ ������ ������ ����n�� ������ ������ ����� �����n�� ����� ������ ��� ������n�� ���� ������ ����� ���n��� ����� ���� ���� �����n� ����� ���� ����� ����n� � ����� ����� ����� �����n�� ��� ����� ���� �����

����

n�� ������ ���� ����� �����n�� ����� ������ ������ ������n�� ������ ����� ������ �����n�� ������ ������ ����� �����n��� ������ ��� ������ ����n� ��� ������ ���� �����n� � ����� �� ����� ����n�� ���� ����� ���� �����

Table ��� Mean of confidence intervals symmetry for Q�����

simulation bc�� bc��� boot�� boot�������

n�� ����� ����� ����� �����n�� ����� ����� ����� �����n�� ����� ����� ���� �����n�� ����� ����� ���� ���n��� ���� ���� ����� �����n� ���� ����� ����� �����n� � ����� ����� ����� �����n�� ����� ����� ����� ���������

n�� ����� ����� ����� �����n�� ����� ���� ����� ����n�� ���� ���� ����� �����n�� ����� ����� ���� ����n��� ���� ����� ����� �����n� ���� ����� ����� ����n� � ���� ��� ���� �����n�� ��� ���� ����� ����

�����

n�� ���� ���� ����� �����n�� ����� ����� ����� �����n�� ���� ����� ���� �����n�� ����� ����� ����� ����n��� ���� ����� ����� �����n� ��� ����� ����� ����n� � ���� ����� ���� ����n�� ���� ���� ����� ����������

n�� ��� ��� ����� �����n�� ����� ����� ����� �����n�� ����� ���� ���� ���n�� ����� ���� ����� �����n��� ����� ����� ���� �����n� ���� ����� ���� ����n� � ���� ���� ����� �����n�� ��� ����� ����� �����

����

n�� ���� ���� ���� �����n�� ���� ����� ���� ����n�� ����� ���� ���� ����n�� ���� ���� ���� ����n��� ���� ����� ���� ����n� ����� ����� ���� ����n� � ����� ����� ���� ����n�� ��� ����� ����� �����

Table �� Percentage of misses of profile likelihood confidence intervals for � with confidence level ��� "prof����" and "prof�����nas" give respectively the percentage of misses or "not-availables" in the left side.

simulation prof��� prof���� prof����nas prof�����nas prof�twos����nas����

n�� ��� ���� ���� ����� �����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����

����

n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ���������

n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ���������

n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����

����

n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����

Table ��� Percentage of times profile likelihood confidence intervals with confidence level �� failed to cover the true value of �� "prof����" gives percentage of misses or "not-availables" in the left side.

simulation prof��� prof���� prof�twos�������

n�� ��� ����� �����n�� ���� ���� �����n�� ���� ���� ����n�� ���� ���� ���n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ����

����

n�� ���� ���� ���n�� ���� ���� ����n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ���������

n�� ���� ���� ���n�� ���� ���� ����n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ���������

n�� ���� ���� ���n�� ���� ���� ����n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ����

����

n�� ���� ���� ����n�� ���� ���� ����n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ����

Table ��� Percentage of misses of profile likelihood confidence intervals for � with confidence level ��� "prof����" and "prof�����nas" give respectively the percentage of misses or "not-availables" in the left side.

simulation prof��� prof���� prof����nas prof�����nas prof�twos����nas����

n�� ���� ���� ���� ����� �����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����

����

n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ���������

n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ���������

n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����

����

n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����

Table �� Percentage of times profile likelihood confidence intervals with confidence level �� failed to cover the true value of �� "prof����" gives percentage of misses or "not-availables" in the left side.

simulation prof��� prof���� prof�twos�������

n�� ���� ����� �����n�� ���� ���� ���n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ����

����

n�� ���� ���� ���n�� ���� ���� ����n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ���������

n�� ���� ���� �����n�� ���� ���� ����n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ���������

n�� ���� ���� �����n�� ���� ���� ����n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ����

����

n�� ���� ���� ����n�� ���� ���� ���n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ����

Table ��� Mean length of profile likelihood �� confidence intervals for � and �

simulation gamma sigma����

n�� ����� �����n�� ��� �����n�� ���� �����n�� ����� ����n��� ����� �����n� ��� �����n� � ���� �����n�� ����� ����

����

n�� ����� ����n�� ��� ���n�� ��� �����n�� ���� ����n��� ���� ����n� ����� ����n� � ����� �����n�� ���� ����������

n�� ���� ��n�� ����� ����n�� ���� �����n�� ���� �����n��� ���� ����n� ����� �����n� � ����� ����n�� ���� ����������

n�� ���� �����n�� ���� �����n�� ����� �����n�� ����� �����n��� ���� ����n� ����� �����n� � ���� �����n�� ���� ���������

n�� ���� ����n�� ����� �����n�� ���� ����n�� ���� �����n��� ���� ����n� ��� ����n� � ����� �����n�� ����� �����

Table ��� Percentage of misses of corrected profile likelihood confidence intervals for � with confidence level ��� "prof����" and "prof�����nas" give respectively the percentage of misses or "not-availables" in the left side.

simulation prof��� prof���� prof����nas prof�����nas prof�twos����nas����

n�� ���� ���� ���� ����� �����n�� ���� ���� ���� ����� �����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����

����

n�� ���� ���� ���� ����� �����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ���������

n�� ���� ���� ���� ����� �����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ���������

n�� ���� ���� ���� ����� �����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����

����

n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����

Table ��� Percentage of times corrected profile likelihood confidence intervals with confidence level �� failed to cover the true value of �� "prof����" gives percentage of misses or "not-availables" in the left side.

simulation prof��� prof���� prof�twos�������

n�� ���� ����� �����n�� ���� ����� �����n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ����

����

n�� ���� ����� �����n�� ���� ���� ����n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ���������

n�� ���� ����� �����n�� ���� ���� ����n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ���������

n�� ���� ����� �����n�� ���� ���� ����n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ����

����

n�� ���� ��� �����n�� ���� ���� ����n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ����

Table ��� Percentage of misses of corrected profile likelihood confidence intervals for � with confidence level ��� "prof����" and "prof�����nas" give respectively the percentage of misses or "not-availables" in the left side.

simulation prof��� prof���� prof����nas prof�����nas prof�twos����nas����

n�� ���� ���� ���� ����� �����n�� ���� ���� ���� ����� �����n�� ���� ���� ���� ��� ���n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����

����

n�� ���� ���� ���� ����� �����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ���������

n�� ���� ���� ���� ��� ���n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ���������

n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����

����

n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����n��� ���� ���� ���� ���� ����n� ���� ���� ���� ���� ����n� � ���� ���� ���� ���� ����n�� ���� ���� ���� ���� ����

Table ��� Percentage of times corrected profile likelihood confidence intervals with confidence level �� failed to cover the true value of �� "prof����" gives percentage of misses or "not-availables" in the left side.

simulation prof��� prof���� prof�twos�������

n�� ���� ����� �����n�� ���� ���� �����n�� ���� ����� �����n�� ���� ���� ���n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ����

����

n�� ���� ����� �����n�� ���� ���� ���n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ���������

n�� ���� ����� �����n�� ���� ���� ����n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ���������

n�� ���� ���� �����n�� ���� ���� ����n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ����

����

n�� ���� ���� ����n�� ���� ���� ����n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ����n�� ���� ���� ����

Table ��� Mean length of corrected profile likelihood �� confidence intervals for γ and σ

simulation gamma sigma

����

n�� ���� ���n�� ��� �����n�� ���� �����n�� ����� ����n��� ���� �����n� ����� �����n� � ����� ����n�� ���� ���������

n�� ����� ����n�� ���� ����n�� ���� ����n�� ���� ����n��� ���� ���n� ����� �����n� � ����� �����n�� ��� ���������

n�� ����� �����n�� ���� ����n�� ����� �����n�� ���� �����n��� ����� ����n� ����� �����n� � ����� ����n�� ����� ���������

n�� ���� �����n�� ����� ����n�� ����� �����n�� ���� �����n��� ���� ����n� ���� ����n� � ���� �����n�� ����� ���������

n�� ���� �����n�� ����� ���n�� ����� �����n�� ����� �����n��� ��� ����n� ���� ����n� � ����� ����n�� ����� �����

Table �� Percentage of misses of one sided upper limited confidence intervals for Q����� using "Russian doll principle". �boot� �� stands for percentile bootstrap �� confidence intervals.

simulation boot���� boot��� boot������

n�� ��� ����� ����n�� ��� ����� �����n�� ���� ����� ����n�� ���� ���� �����n��� ���� ���� �����n� ���� ���� ���n� � ���� ���� �����n�� ���� ���� �����

����

n�� ��� ����� �����n�� ���� ����� �����n�� ���� ����� �����n�� ���� ��� �����n��� ���� ���� �����n� ���� ��� �����n� � ���� ���� �����n�� ���� ��� ����������

n�� ����� ����� �����n�� ����� ����� �����n�� ��� ����� �����n�� ���� ��� �����n��� ��� ����� �����n� ���� ��� �����n� � ���� ��� �����n�� ���� ���� ����������

n�� ����� ����� �����n�� ���� ����� �����n�� ��� ����� ����n�� ���� ����� �����n��� ��� ����� �����n� ���� ��� �����n� � ���� ����� �����n�� ���� ���� �����

����

n�� ����� ����� �����n�� ��� ����� �����n�� ���� ����� �����n�� ��� ����� �����n��� ���� ����� �����n� ���� ��� �����n� � ���� ���� �����n�� ���� ��� �����

Table ��� Percentage of misses of one sided lower limited confidence intervals for Q����� using "Russian doll principle". �boot����� stands for percentile bootstrap confidence intervals with confidence level �� ���� � �����

simulation boot��� boot�� boot� ����

n�� ���� ���� ���n�� ���� ���� �����n�� ���� ���� ����n�� ���� ���� �����n��� ���� ���� ���n� ���� ���� �����n� � ���� ���� ���n�� ���� ���� �����

����

n�� ���� ���� ���n�� ���� ���� ���n�� ���� ���� ����n�� ���� ���� ���n��� ���� ���� ���n� ���� ���� �����n� � ���� ���� ���n�� ���� ���� ����������

n�� ���� ���� ���n�� ���� ���� ���n�� ���� ���� ����n�� ���� ���� ���n��� ���� ���� ���n� ���� ���� ����n� � ���� ���� ���n�� ���� ���� ��������

n�� ���� ���� ���n�� ���� ���� ���n�� ���� ���� ���n�� ���� ���� ���n��� ���� ���� ����n� ���� ���� ����n� � ���� ���� ���n�� ���� ���� �����

����

n�� ���� ���� ���n�� ���� ���� ���n�� ���� ���� ����n�� ���� ���� ����n��� ���� ���� ���n� ���� ���� ����n� � ���� ���� ���n�� ���� ���� ���

Table ��� Percentage of misses of two sided confidence intervals for Q����� using "Russian doll principle"

simulation boot�� boot���

����

n�� ���� �����n�� ����� �����n�� ����� �����n�� ����� ����n��� ����� ����n� ����� ����n� � ����� ����n�� ����� �������

n�� ����� �����n�� ����� ���n�� ����� ���n�� ����� ����n��� ����� ����n� ����� ���n� � ����� ����n�� ����� ���������

n�� ���� �����n�� ����� �����n�� ����� ���n�� ����� ����n��� ����� �����n� ����� ����n� � ����� ����n�� ����� ���������

n�� ����� �����n�� ����� ���n�� ����� ���n�� ����� ���n��� ����� ���n� ����� ����n� � ����� ����n�� ����� ��������

n�� ���� �����n�� ����� ���n�� ����� ���n�� ����� ���n��� ����� ����n� ����� ����n� � ����� ����n�� ����� ����

Table �� Mean length of confidence intervals for Q����� using "Russian doll principle"

simulation boot�� boot���

����

n�� ���� ����n�� ����� ���n�� ���� �����n�� ����� �����n��� ����� �����n� ����� ����n� � �� ����n�� ��� ���������

n�� ���� ���n�� ����� �����n�� ����� ����n�� ���� �����n��� ���� ����n� ��� �����n� � ����� ���n�� ���� ����������

n�� ���� ���n�� ���� �����n�� ����� ����n�� ����� �����n��� ���� �����n� ���� ���n� � ����� ����n�� ����� ����������

n�� ���� ����n�� ����� �����n�� ���� ������n�� ���� ����n��� ���� �����n� ����� �����n� � ���� �����n�� ����� ���������

n�� ������ �����n�� ������ ������n�� ����� ����n�� ������ ������n��� ���� ����n� ���� �����n� � ���� ���n�� ��� �����

Table ��� Jackknife, bootstrap and simulation estimates of bias of ��

simulation dbias��� jack dbias��� boot dbias��� ����

n�� ����� ����� �����n�� ����� ���� �����n�� ���� ���� ����n�� ���� ����� �����n��� ����� ���� ����n� ����� ����� �����n� � ����� ����� �����n�� ����� ����� �����

����

n�� ���� ����� �����n�� ���� ���� ����n�� ����� ����� ����n�� ����� ����� �����n��� ����� ����� �����n� ����� ����� ����n� � ����� ����� �����n�� ����� ����� ����������

n�� ���� ����� �����n�� ���� ����� �����n�� ����� ����� ����n�� ���� ���� ����n��� ����� ����� �����n� ����� ����� �����n� � ����� ����� �����n�� ����� ���� �����

�����

n�� ����� ����� �����n�� ���� ����� �����n�� ����� ����� �����n�� ����� ���� ����n��� ���� ����� �����n� ����� ���� �����n� � ����� ����� �����n�� ����� ����� ����������

n�� ����� ����� ����n�� ���� ���� �����n�� ����� ����� �����n�� ���� ����� �����n��� ���� ����� ������n� ����� ����� �����n� � ����� ����� �����n�� ����� ����� �����

Table ��� Jackknife, bootstrap and simulation estimates of bias of ��

simulation dbias��� jack dbias��� boot dbias��� ����

n�� ����� ����� �����n�� ����� ����� ����n�� ��� ���� �����n�� ����� ����� �����n��� ����� ����� �����n� ����� ����� ����n� � ���� ����� �����n�� ����� ����� �����

����

n�� ����� ����� �����n�� ����� ����� �����n�� ���� ���� ���n�� ����� ���� ����n��� ����� ����� �����n� ����� ����� �����n� � ���� ����� �����n�� ����� ����� ���������

n�� ���� ����� ����n�� ���� ����� �����n�� ��� ���� �����n�� ����� ����� �����n��� ����� ����� ����n� ����� ����� �����n� � ���� ����� �����n�� ����� ����� �����

�����

n�� ����� ����� �����n�� ����� ����� ����n�� ���� ���� ����n�� ����� ����� ���n��� ����� ����� ����n� ����� ���� �����n� � ����� ���� �����n�� ����� ����� ����������

n�� ����� ���� �����n�� ����� ����� �����n�� ���� ���� ����n�� ����� ���� �����n��� ����� ����� �����n� ����� ����� �����n� � ����� ����� ����n�� ����� ����� �����

Table ��� Jackknife, bootstrap and simulation estimates of bias of �Q�����

simulation dbias��Q��� jack dbias��Q��� boot dbias��Q���

����

n�� ������ ������ ������n�� ������ ������ �����n�� ������ ������ ����n�� ������ ����� ������n��� ������ ������ �����n� ������ ����� ������n� � ������ ������ ������n�� ������ ������ ����������

n�� ������ ����� ������n�� ����� ���� ������n�� ������ ����� �����n�� ������ ���� �����n��� ������ ����� �����n� ������ ����� ����n� � ����� ����� �����n�� ������ ����� ����������

n�� ����� ��� �����n�� ����� ���� ������n�� ����� ���� ������n�� ����� ����� �����n��� ����� ����� �����n� ����� ����� ������n� � ����� ����� �����n�� ����� ���� ����������

n�� ����� ���� ����n�� ���� ����� ����n�� ����� ���� �����n�� ����� ���� �����n��� ����� ����� �����n� ����� ���� �����n� � ����� ����� �����n�� ����� ����� ��������

n�� ��� ����� ���n�� ���� ���� ����n�� ����� ���� �����n�� ����� ��� �����n��� ����� ���� ����n� ����� ����� ����n� � ����� ����� �����n�� ����� ����� �����

Table ��� Estimates of standard error of b�

simulation bse�b� jack bse�b� boot bse�b�

asymp�b�� bse�b� asymp��� bse�b�

����

n�� ��� ���� ����� ���� �����n�� ��� ���� ����� ���� �����n�� ����� ����� ����� ���� ����n�� ����� ���� ����� ����� ����n��� ���� ���� ����� ���� �����n� ���� ����� ����� ����� �����n� � ����� ��� ���� ��� ���n�� ��� ���� ���� ���� ��������

n�� ����� ����� ���� ����� �����n�� ����� ����� ���� ����� �����n�� ����� ����� ���� ����� �����n�� ����� ����� ���� ��� ����n��� ����� ����� ���� ����� �����n� ����� ���� ����� ����� �����n� � ���� ����� ����� ����� �����n�� ����� ��� ��� ��� ����

�����

n�� ����� ����� ����� ���� �����n�� ����� ����� ����� ���� �����n�� ����� ����� ����� ����� �����n�� ����� ����� ����� ����� �����n��� ���� ���� ���� ���� ����n� ����� ���� ����� ����� �����n� � ����� ���� ����� ����� ����n�� ����� ����� ����� ����� ����������

n�� ����� ���� ����� ����� ����n�� ����� ����� ����� ���� ����n�� ����� ��� ����� ���� ����n�� ����� ����� ���� ����� ����n��� ����� ����� ����� ���� �����n� ���� ����� ���� ���� ����n� � ���� ����� ����� ����� �����n�� ���� ����� ���� ����� ��������

n�� ��� ����� ����� ����� ���n�� ��� ���� ����� ����� ����n�� ����� ���� ���� ����� ����n�� ���� ���� ����� ���� �����n��� ���� ����� ����� ����� �����n� ����� ���� ��� ����� �����n� � ����� ���� ����� ����� �����n�� ����� ����� ����� ����� ����

Table ��� Estimates of standard error of b�

simulation bse�b� jack bse�b� boot bse�b�

asymp�b��b�� bse�b� asymp����� bse�b�

����

n�� ����� ����� ����� ����� ����n�� ����� ���� ���� ���� �����n�� ���� ����� ���� ����� ����n�� ����� ���� ����� ���� �����n��� ��� ����� ���� ���� ����n� ���� ����� ���� ����� �����n� � ����� ����� ����� ����� �����n�� ����� ����� ���� ����� ���������

n�� ����� ����� ���� ����� ����n�� ����� ����� ���� ����� �����n�� ���� ���� ����� ����� �����n�� ����� ���� ����� ����� ����n��� ����� ����� ���� ���� �����n� ���� ���� ����� ����� ����n� � ����� ����� ���� ����� �����n�� ����� ����� ����� ���� �����

�����

n�� ����� ���� ����� ����� �����n�� ���� ���� ����� ����� �����n�� ���� ����� ��� ���� �����n�� ���� ����� ����� ����� ���n��� ����� ����� ���� ����� ����n� ���� ���� ���� ���� ����n� � ����� ����� ����� ����� �����n�� ����� ����� ���� ����� ����������

n�� ���� ���� ����� ����� �����n�� ��� ��� ���� ����� �����n�� ����� ����� ����� ����� �����n�� ����� ���� ���� ���� ����n��� ����� ����� ����� ���� ����n� ���� ����� ���� ���� ���n� � ���� ����� ����� ����� �����n�� ����� ����� ����� ����� ���������

n�� ����� ���� ����� ����� �����n�� ����� ����� ��� ����� �����n�� ����� ����� ����� ����� �����n�� ���� ����� ���� ���� �����n��� ����� ����� ����� ����� �����n� ����� ����� ����� ����� �����n� � ����� ����� ���� ����� �����n�� ����� ����� ����� ����� ����

Table ��� Estimates of standard error of �Q�����

simulation bse��Q��� jack bse��Q��� boot bse��Q���

����

n�� ����� ��� ���n�� ����� ����� ����n�� ����� ���� �����n�� ����� ����� �����n��� ����� ����� �����n� ����� ���� �����n� � ����� ��� ����n�� ����� ����� ��������

n�� ����� ���� �����n�� ����� ����� �����n�� ���� �� ����n�� ���� ��� ����n��� ���� ���� ���n� ���� ���� ����n� � ���� ���� �����n�� ����� ����� ����������

n�� ���� ���� �����n�� ���� ���� ��n�� ����� ����� �����n�� ���� ��� ���n��� ���� ����� �����n� ���� ���� ����n� � ����� ���� ���n�� ���� ���� ����������

n�� ����� ���� �����n�� ���� ��� ����n�� ����� ���� �����n�� ����� ���� ����n��� ����� ���� �����n� ����� ����� �����n� � ����� ���� �����n�� ����� ����� ���������

n�� ����� ����� ����n�� ���� ���� ����n�� ����� ����� ����n�� ���� ����� ����n��� ��� ����� �����n� ����� ����� ����n� � ��� ��� �����n�� ����� ���� ����

Table �� Mean of bootstrap sample correlation, asymptotic correlation and sample mean of asymptotic correlation of b� and b�

simulation cor�b��� b�� corasymp����� corasymp�b��b��

����

n�� ���� ����� ����n�� ���� ����� ����n�� ����� ����� ����n�� ����� ����� �����n��� ���� ����� ����n� ���� ����� �����n� � ���� ����� �����n�� ����� ����� ����

����

n�� ����� ��� ����n�� ����� ��� �����n�� ��� ��� �����n�� ����� ��� �����n��� ����� ��� �����n� ����� ��� �����n� � ����� ��� �����n�� ����� ��� ����������

n�� ����� ���� �����n�� ����� ���� ����n�� ���� ���� �����n�� ���� ���� �����n��� ���� ���� �����n� ����� ���� �����n� � ����� ���� �����n�� ����� ���� ����������

n�� ����� ����� �����n�� ���� ����� �����n�� ����� ����� �����n�� ���� ����� ����n��� ����� ����� �����n� ����� ����� �����n� � ����� ����� �����n�� ����� ����� ���������

n�� ���� ����� �����n�� ����� ����� �����n�� ����� ����� �����n�� ����� ����� �����n��� ���� ����� �����n� ����� ����� �����n� � ����� ����� �����n�� ����� ����� �����

Table ��� Jackknife, bootstrap and simulation estimates of coefficient of variation of b�

simulation bcv�b� jack bcv�b� boot bcv�b� ����

n�� ����� ���� �����n�� ���� ���� ����n�� ����� ���� ����n�� ���� ����� �����n��� ���� ���� �����n� ���� ����� ����n� � ����� ����� ����n�� ���� ����� ���������

n�� ����� ����� ����n�� ����� ����� �����n�� ����� ����� �����n�� ����� ���� ����n��� ���� ����� �����n� ����� ���� ����n� � ���� ����� �����n�� ���� ����� �����

�����

n�� ����� ����� ����n�� ����� ����� ����n�� ����� ��� �����n�� ���� ���� �����n��� ����� ���� ����n� ����� ����� �����n� � ����� ���� ����n�� ���� ���� ����������

n�� ����� ����� ����n�� ����� ���� �����n�� ���� ����� ����n�� ����� ����� ����n��� ����� ����� �����n� ����� ����� �����n� � ���� ��� �����n�� ���� ����� ������

����

n�� ����� ����� ����n�� ����� ���� �����n�� ����� ���� �����n�� ����� ����� �����n��� ���� ����� ������n� ����� ����� �����n� � ���� ���� �����n�� ����� ����� �����

Table ��� Jackknife, bootstrap and simulation estimates of coefficient of variation of b�

simulation bcv�b� jack bcv�b� boot bcv�b� ����

n�� ����� ���� �����n�� ���� ���� �����n�� ����� ����� �����n�� ����� ���� ����n��� ����� ����� �����n� ���� ����� ����n� � ���� ����� �����n�� ����� ����� ��������

n�� ����� ����� ���n�� ���� ����� �����n�� ��� ����� ���n�� ����� ��� ����n��� ���� ����� ����n� ����� ����� ����n� � ����� ����� �����n�� ����� ����� ����������

n�� ����� ����� �����n�� ���� ����� ���n�� ����� ����� ����n�� ����� ���� ����n��� ���� ����� ���n� ���� ����� �����n� � ����� ����� ����n�� ���� ����� �����

�����

n�� ����� ����� �����n�� ����� ���� ����n�� ����� ����� �����n�� ����� ��� �����n��� ���� ����� ����n� ����� ���� �����n� � ����� ����� �����n�� ����� ����� ���������

n�� ��� ����� �����n�� ����� ����� �����n�� ����� ����� �����n�� ����� ����� �����n��� ���� ����� �����n� ����� ���� �����n� � ����� ����� ����n�� ����� ����� �����

Table �� Jackknife, bootstrap and simulation estimates of coefficient of variation of �Q�����

simulation bcv��Q��� jack bcv��Q��� boot bcv��Q���

����

n�� ������ ���� ������n�� ����� ������ ������n�� ������ ����� ������n�� ����� ������ �����n��� ������ ������ �����n� ������ ����� �����n� � ������ ������ ������n�� ����� ����� ����������

n�� ������ ������ ������n�� ������ ������ ������n�� ������ ������ ������n�� ������ ������ �����n��� ������ ������ �����n� ������ ������ �����n� � ����� ������ �����n�� ������ ����� ����������

n�� ����� ����� �����n�� ���� ����� �����n�� ����� ���� �����n�� ����� ����� �����n��� ����� ���� �����n� ����� ����� ������n� � ����� ����� �����n�� ����� ����� �����

�����

n�� ����� ����� �����n�� ���� ����� �����n�� ����� ����� ����n�� ���� ����� �����n��� ����� ����� �����n� ���� ����� �����n� � ����� ����� �����n�� ����� ����� ���������

n�� ����� ����� ����n�� ���� ���� ����n�� ���� ����� ����n�� ����� ���� �����n��� ���� ���� ����n� ���� ����� �����n� � ����� ����� �����n�� ����� ����� ����

Table ��� Jackknife, bootstrap and simulation estimates of rmse of b�

simulation �rmse�b� jack �rmse�b� boot �rmse�b�

����

n�� ����� ����� ����n�� ����� ����� ����n�� ����� ����� �����n�� ���� ����� �����n��� ���� ����� ����n� ����� ����� �����n� � ����� ����� �����n�� ���� ���� ��������

n�� ���� ����� ����n�� ����� ����� ���n�� ����� ����� �����n�� ���� ���� �����n��� ���� ����� �����n� ���� ����� ����n� � ����� ����� �����n�� ����� ��� ����

�����

n�� ���� ����� �����n�� ����� ����� �����n�� ���� ����� ����n�� ����� ����� �����n��� ���� ��� ���n� ����� ����� �����n� � ����� ����� ����n�� ����� ����� ����������

n�� ���� ����� �����n�� ����� ����� �����n�� ���� ���� ���n�� ���� ����� �����n��� ����� ����� �����n� ���� ����� ����n� � ���� ����� �����n�� ���� ����� ��������

n�� ����� ����� �����n�� ��� ���� ����n�� ����� ����� �����n�� ���� ���� ����n��� ����� ����� �����n� ����� ���� �����n� � ����� ���� �����n�� ����� ����� ����

Table ��� Jackknife, bootstrap and simulation estimates of rmse of b�

simulation �rmse�b� jack �rmse�b� boot �rmse�b�

����

n�� ����� ���� �����n�� ��� ���� ����n�� ����� ����� ���n�� ���� ��� ����n��� ����� ����� �����n� ����� ����� ����n� � ����� ����� �����n�� ����� ����� ���������

n�� ���� ����� �����n�� ����� ���� ����n�� ����� ����� �����n�� ���� ��� �����n��� ����� ���� ����n� ���� ���� ���n� � ����� ����� �����n�� ����� ����� ����

�����

n�� ����� ���� ����n�� ����� ����� ����n�� ����� ����� �����n�� ��� ����� ����n��� ����� ����� ����n� ��� ���� ���n� � ����� ����� �����n�� ����� ����� ���������

n�� ����� ���� �����n�� ����� ����� ����n�� ����� ���� �����n�� ����� ����� �����n��� ����� ����� �����n� ��� ����� ����n� � ���� ����� �����n�� ����� ���� ���������

n�� ���� ��� �����n�� ����� ���� ����n�� ���� ����� ����n�� ���� ����� �����n��� ���� ���� �����n� ����� ����� �����n� � ����� ����� �����n�� ����� ����� ����

Table ��� Jackknife, bootstrap and simulation estimates of rmse of �Q�����

simulation �rmse��Q��� jack �rmse��Q��� boot �rmse��Q���

����

n�� ���� ���� ���n�� ����� ���� ����n�� ����� ����� �����n�� ����� ����� �����n��� ����� ����� �����n� ��� ���� �����n� � ����� ����� ����n�� ���� ����� ��������

n�� ����� ���� �����n�� ���� ����� �����n�� ��� �� ����n�� ���� ���� ����n��� ����� ���� ���n� ����� ���� �����n� � ����� ���� �����n�� ����� ����� ����������

n�� ���� ����� �����n�� ����� ���� ��n�� ����� ���� �����n�� ����� ����� ���n��� ����� ����� ����n� ���� ��� ����n� � ���� ���� ���n�� ���� ���� ����������

n�� ����� ����� �����n�� ���� ����� �����n�� ����� ����� ����n�� ���� ����� �����n��� ���� ���� �����n� ����� ����� �����n� � ����� ��� �����n�� ���� ���� ���������

n�� ���� ����� ���n�� ��� ����� ����n�� ���� ���� ����n�� ����� ����� �����n��� ��� ����� �����n� ���� ���� �����n� � ����� ����� �����n�� ��� ���� ����