8
INTERNATIONAL JOURNAL OF CLIMATOLOGY, VOL. 16, 105-1 12 (1996) SHORTER CONTRIBUTION ESTIMATION OF EXTREME INDIAN MONSOON RAINFALL D. E. REEVE Sir William Halcmw & Partners Ltd, Swindon SN4 OQD, UK Received 2 August 1993 Accepted 8 March I995 ABSTRACT The bootstrap method is applied to a time series of Indian summer monsoon rainfall to determine the best fit to specific families of univariate distribution functions. The definition of an error norm provides a criterion by which a model may be selected. On this basis both the Weibull and general extreme value distributions are selected to derive return periods of extreme monsoon rainfall. KEY WORDS: Indian summer monsoon rainfall; extreme rainfall; extreme distribution models; return periods INTRODUCTION The importance of the Indian summer monsoon on both regional and global scales has been widely recognized since the pioneering work of Sir Gilbert Walker (1910). The purpose of this paper is to apply the resampling technique known as the bootstrap method (Efron, 1982), to recently published Indian summer monsoon rainfall data, and to examine the suitability of the normal distribution as a model for this data. The rainfall data are rainfall totals during a monsoon season averaged over the Indian subcontinent, referred to as the All India monsoon rainfall. Best-fit parameters of a number of specific probability distributions are determined. An error norm, or test statistic, which is a variant of the Kolmogorov-Smimov test, is used to determine the distribution that provides the best fit to the data. From this, return periods of extreme events may be estimated. Furthermore, questions germane to flood prediction and drought management may be addressed. For example: (i) What is the probability that the All India monsoon rainfall in a particular year will be less (more) than r mm? (ii) In the next n years, what is the probability that the lowest (highest) All India monsoon rainfall in the n years will be less (more) than r mm? (iii) What is the 1 in N year All India monsoon rainfall (for planning and designing flood or drought alleviation measures)? Under the assumption that the All India monsoon rainfall data may be regarded as realizations of an independently and identically distributed stationary random process the above questions may be answered when the cumulative distribution function, F(r), is known. In this case the answer to (i) is simply F(r) itself. For question (ii) the distribution function of the smallest of n observations is given by F, = 1 - (1 - F(r))" (1) The corresponding distribution for the largest of n variables is F"(r). To answer the last question, the N-year return value, rN, is calculated from the equation 1 - 1/N for maxima for minima F(rN) = { 1" CCC 0899-84 18/96/0 1010548 0 1996 by the Royal Meteorological Society

ESTIMATION OF EXTREME INDIAN MONSOON RAINFALL

  • Upload
    d-e

  • View
    215

  • Download
    2

Embed Size (px)

Citation preview

Page 1: ESTIMATION OF EXTREME INDIAN MONSOON RAINFALL

INTERNATIONAL JOURNAL OF CLIMATOLOGY, VOL. 16, 105-1 12 (1996)

SHORTER CONTRIBUTION

ESTIMATION OF EXTREME INDIAN MONSOON RAINFALL

D. E. REEVE Sir William Halcmw & Partners Ltd, Swindon SN4 OQD, UK

Received 2 August 1993 Accepted 8 March I995

ABSTRACT

The bootstrap method is applied to a time series of Indian summer monsoon rainfall to determine the best fit to specific families of univariate distribution functions. The definition of an error norm provides a criterion by which a model may be selected. On this basis both the Weibull and general extreme value distributions are selected to derive return periods of extreme monsoon rainfall.

KEY WORDS: Indian summer monsoon rainfall; extreme rainfall; extreme distribution models; return periods

INTRODUCTION

The importance of the Indian summer monsoon on both regional and global scales has been widely recognized since the pioneering work of Sir Gilbert Walker (1910). The purpose of this paper is to apply the resampling technique known as the bootstrap method (Efron, 1982), to recently published Indian summer monsoon rainfall data, and to examine the suitability of the normal distribution as a model for this data. The rainfall data are rainfall totals during a monsoon season averaged over the Indian subcontinent, referred to as the All India monsoon rainfall. Best-fit parameters of a number of specific probability distributions are determined. An error norm, or test statistic, which is a variant of the Kolmogorov-Smimov test, is used to determine the distribution that provides the best fit to the data.

From this, return periods of extreme events may be estimated. Furthermore, questions germane to flood prediction and drought management may be addressed. For example:

(i) What is the probability that the All India monsoon rainfall in a particular year will be less (more) than r mm? (ii) In the next n years, what is the probability that the lowest (highest) All India monsoon rainfall in the n years

will be less (more) than r mm? (iii) What is the 1 in N year All India monsoon rainfall (for planning and designing flood or drought alleviation

measures)?

Under the assumption that the All India monsoon rainfall data may be regarded as realizations of an independently and identically distributed stationary random process the above questions may be answered when the cumulative distribution function, F(r), is known. In this case the answer to (i) is simply F(r) itself. For question (ii) the distribution function of the smallest of n observations is given by

F, = 1 - (1 - F(r))" (1)

The corresponding distribution for the largest of n variables is F"(r). To answer the last question, the N-year return value, rN, is calculated from the equation

1 - 1/N for maxima for minima F ( r N ) = { 1"

CCC 0899-84 18/96/0 10 10548 0 1996 by the Royal Meteorological Society

Page 2: ESTIMATION OF EXTREME INDIAN MONSOON RAINFALL

106 D. E. REEVE

MODEL SELECTION

The method of selecting a statistical model adopted here is based on that described in Linhart and Zucchini (1986). It is one generalization of the Kolmogorov-Smirnov test that allows the goodness-of-fit to be weighted towards particular sections of the distribution. For the data set under consideration this has the advantage of allowing the goodness-of-fit at the tails of the distributions to be assessed, and of providing an indication of the suitability of any distribution for estimating extreme rainfall. There is no reason a priori that the distribution F(r) should belong to any of the families specified, but in what follows it is assumed that in each family there is a model that is in some sense 'closest' to F(r).

To measure the closeness of a model to F(r), an error norm, A, is defined by

A(Q) = maxrlF(~)h - G Q ( ~ ' I (3 1 where GQ(r) is the distribution function of an approximating model with parameter set Q, and h > 0. The emphasis on the fit of the model at various portions of the distribution may be changed by altering h. The Kolmogorov- Smirnov statistic is recovered for h = 1.0.

The selection criterion is computed by estimating the expectation

where rl, r2, . . . , r, are the observed rainfalls ordered in increasing magnitude, Q is the estimator of the parameters in the approximating family, and E (.) is the expectation operator.

No general analytical expressions for the expectation in equation (4) are available. However, it is possible to obtain a direct estimate by using bootstrap methods (e.g. Efron, 1982). To summarize the method: from n original observations select a random sample of size n, with replacement, to obtain a bootstrap sample; compute the maximum likelihood estimate of the parameter set using this sample; compute the error norm for the sample; repeat the previous steps a large number of times, and estimate the expectation in equation (4) as the average of the computed error norms.

The procedure is performed for each of the following families of distributions: normal, log-normal, gamma, exponential, Weibull, Gumbel, and general extreme value (GEV). Details of each of these distributions and their associated parameters are provided in the Appendix.

The GEV distribution was first used for meteorological data by Jenkinson (1955). The distribution encompasses the three types of limiting behaviour identified by Fisher and Tippett (1928). The GEV, Gumbel, and Weibull distributions have been proposed as extreme value distributions. The Gumbel and exponential distributions are special cases of the GEV and gamma distributions respectively. The selected families of distributions have two parameters with the exception of the exponential and GEV distributions, which have one and three parameters respectively. Thus, intuitively, because of the greater flexibility to make the distribution fit the sample points it might be expected that the GEV distribution would give the 'best fit' and the exponential would give the worst fit. However, this greater flexibility does not necessarily translate into an improved ability to extrapolate to low- frequency occurrences.

APPLICATION TO RAINFALL DATA

Indian summer monsoon rainfall time series for the period 187 1-1 99 1 analysed in this paper has been presented by Parthasarathy et al. (1992), and is from a large number of measuring stations giving good spatial coverage of contiguous India. A brief description of the data is given below and the interested reader is referred to Parthasmthy et al. (1992) for further details. Readings from a fixed number of 306 well-distributed rain-gauge stations in the plains of India are used. These are shown in Figure 1 together with the meteorological subdivisions of India. The mountainous regions consisting of four subdivisions in the Himalayas (hatched areas in Figure 1) have been excluded, as have the islands in the Bay of Bengal and the Arabian Sea (subdivisions 1 and 35). The contiguous area shown constitutes approximately 90 per cent of the total area of India. The subdivisional monsoon rainfall (from June to September inclusive) for each subdivision is calculated by assigning the district area as the

Page 3: ESTIMATION OF EXTREME INDIAN MONSOON RAINFALL

EXTREME INDIAN MONSOON ESTIMATION 107

2 Arunachef Pradesh 3 NorthAssam 4 South Assarn 5 SubSPmalayan West Bengal 6 Gangetic West Bengal 7 Orissa 8 BAarPlateau 9 B i P h h s

10 East Uttar M e s h 11 West Uttar Pradesh Plains 12 West Uttar Radesh Hpls

13 Harayana 14 Ptmjab 15 HLnschal Pradesh 16 Jammu and Kashmir 17 WestRajasthan 18 EastRajasthan

20 EastMarhyaPmdesh

22 Saurastma & Kutch 23 Konkan & Goa

19 WestMadhyaPtadesh

21 Gujarat

24 MadhyaMaharashtra 25 Marathwada 26 Mdarbha 27 Coastal AndhraPradesh 28 Telenoana 29 Rayalseema 30 TamPNedu 31 CoastalKarnataka 32 NorthKamataka 33 SouthKarnataka 34 Kerala

Figure 1. Meteorological subdivisions of India with network and locations of rain-gauges considered. Shaded hilly subdivisions are excluded from the All India monsoon rainfall (adapted from Parthasarathy el al., 1992)

Page 4: ESTIMATION OF EXTREME INDIAN MONSOON RAINFALL

108 D. E. REEVE

weight for each rain-gauge station. The All India monsoon rainfall is determined from the weighted average of the subdivisional monsoon rainfall, with the subdivisional areas as the weights. Previous studies (e.g. Mooley and Parthasarathy, 1984; Parthasarathy et al., 1987) have indicated that the rainfall data of all the individual stations, the subdivisional monsoon rainfall and the All India monsoon rainfall may be described by a normal distribution.

A regression analysis performed on the data confirmed that any linear trend was negligible. The mean and standard deviation of the monsoon rainfall series are 850.7 mm and 84.1 mm respectively. The maximum and minimum rainfall was 1017 mm (in 1961) and 604 mm (in 1877), giving a range of 413 mm. The series has a 1 year lag correlation coefficient of -0.13, the 95 per cent confidence intervals being f0-178. This is consistent with the assumption that the data are independent and stationary.

Maximum likelihood estimates of the model parameters are given in Table I. The bootstrap estimates of the selection criterion (equation 4), based on 500 realizations, are presented in Table 11. Results for each of the seven distributions is given for values of the exponent h between 0.25 and 1.50. The lowest value of the criterion for each value of h is in italics. As the GEV distribution is used for describing the behaviour of the maximal extremes of a distribution, results are presented for both maxima and minima. The latter is found by multiplying the data by - 1. For minima this results in lower values of the error norm for values of h less than or equal to 1.0 (i.e. at the lower tail of the distribution), and these are shown in bold in Table 11.

The results show that the one parameter exponential distribution is unable to provide a good fit to the data, whereas the Weibull and GEV distributions are consistently closer to the data than the other distributions. Parthasarathy et al. (1987) indicate that both the subdivisional and All India monsoon rainfall data may be described by the normal distribution. The results of this study suggest that an improved description is provided by the Weibull or GEV distributions, particularly for values at the tails of the distribution. The Weibull distribution provides a better fit to the upper tail whereas the GEV provides a closer fit to the lower tail. The extra parameter in the GEV distribution does not appear to provide it with a clear advantage over the Weibull distribution in this case.

Figure 2 shows the rainfall data and the best-fit curves for the normal, Gumbel and GEV distributions on a probability plot (sometimes known as Q-Q plots). These are constructed as follows. Let the rainfall series be

Table I. Best-fit distribution parameters

Model

Normal Log-normal Gamma Exponential Weibull Gumbel GEV

GEV (negative data)

p = 850.7 v = 6.741

rZ = 850.7 p = 12.16 0 = 806.7 = 827.7

a = 97.30

= -886.4

Q = 84.13 z = 0.1029 /I = 8.743

6 = 887.4 = 91.77

x = 89.98 K = 04469 x = 73.63

NIA

K = 0.1085

Table 11. Computed error norms

h = 0.25 h = 0.50 h = 1.50 h = 1.25 h = 1.00 h = 0.75

Normal 0.1010 0.0994 0.0950 0.0876 0.0976 0.1340 Log-normal 0.1 179 0.1166 0.1 119 0.1025 0.1 129 0.1744 Gamma 0.1 130 0.1115 0.1068 0.0977 0.1050 0.1480 Weibull 0.0690 0.0658 0.0638 0.0640 0.07 16 0.0760 Gumbel 0.1585 0.1432 0.1342 0.1249 0.1219 0.2112 Exponential 0.4166 0.4678 0.52 15 0.5825 0.6269 0.5461 GEV (+data) 0.0732 0.0693 0.0662 0.0640 0,0698 0.0820 GEV (-data) 0,0734 0.0679 0.0626 0.0586 0.0607 0.0665

Page 5: ESTIMATION OF EXTREME INDIAN MONSOON RAINFALL

EXTREME INDIAN MONSOON ESTIMATION 109

Rainfall (mm) (Thousands) 1.36 1

I . 0.66 I I I

-2 0 2 4 Reduced variate

6

’ Data - Qumbel QEV - - - - - Normal Figure 2. Gumbel eQ plot showing monsoon rainfall data and best-fit Gumbel, normal and GEV curves (full, dashed, and dotted lines

respectively)

rlr rz, . . . , TN and the ordered series be rN,l < rN,2 < . . . < rNfl Then plot rN,i against xN,i (1 < i < N) ( x is known as the reduced variate), where xN,; = -In(-ln(pN,;)) and pN,; is the plotting position. Here, Hazen’s formula is used (e.g. Chambers et al., 1983) giving p ~ , = (i - 0.5)/N.

If the data obey a Gumbel distribution then the points should lie along a line with intercept 8 and slope 5 (see Appendix for definition of parameters). The GEV curve evidently provides a better fit to the data than the normal distribution. The error norm for the Gumbel distribution is consistently about twice that for the GEV and Weibull distributions (see Table I) and this is reflected in Figure 1.

Figure 3 shows the Q-Q plot for the best-fit normal, GEV and Weibull dlstributions. In this case In(rN,;) is plotted against the reduced variate xN,,{1 < i < N) where xN,; = In(-ln(l - P ~ , ~ ) ) and pN,i is the plotting position determined by Hazen’s formula. If the data obey a Weibull distribution then the points should lie along a line with slope l l p and intercept ln(6). Again, this demonstrates that the normal distribution does not describe the data as well as the GEV or Weibull distributions, especially for values at the tail of the distribution.

DIAGNOSTICS

Rainfall levels corresponding to N-year return periods may be estimated from equation (2) using the maximum likelihood parameters in Table I. Confidence intervals about such specific percentiles are often required. The bootstrap method has been shown to provide good estimates in comparison with analytically derived results for log-normal and Gumbel distributions (Stedinger, 1983; Zucchini and Adamson, 1988). Here, the bootstrap techmque is used to estimate confidence intervals for the extreme values determined by the Weibull and GEV distributions, for which no analytical expressions for the confidence intervals are available.

Suppose the N-year return value has been calculated as R N . Confidence intervals for RN are estimated by taking M bootstrap samples of size n (with replacement) and performing a maximum likelihood fit on each sample. The M

Page 6: ESTIMATION OF EXTREME INDIAN MONSOON RAINFALL

110 D. E. REEVE

6.9

6.8

6.7

6.6

6.6

6.4

LOGe(Rainfall (mm)) 7

-

-

-

-

-

-

-6 -4 -2 0 2 Reduced variate

Data - Weibull QEV - - - - - Normal

Figure 3. Weibull Q-Q plot showing monsoon rainfall data and best-fit Weibull, normal and GEV curves (MI, dashed, and dotted lines respectively)

estimates of R N thus generated are ordered in increasing magnitude. As M increases so the sample percentage points of the estimated N-year return values converge to estimates of the corresponding percentage points of R N . In other words, the estimates of the 90 per cent confidence interval of R N based on 500 bootstrap samples is the interval between the 25th and 475th largest values.

Table I11 summarizes extreme maximum rainfall values as determined from the ‘best-fit’ Weibull, GEV and normal distributions. Also shown are the 90 per cent confidence intervals as derived using 500 bootstrap samples.

Table 111. Return values for maximal extremes’

Weibull (mm)

GEV (-1

2 5

10 20 50

100 200 500

1000 2000

851 (838, 863) 922 (909, 932) 959 (945, 971) 989 (974, 1000)

1050 (1030, 1060) 1070 (1050, 1080) 1090 (1070, 1100) 1110 (1090, 1130) 1130 (1100, 1150)

1020 (1000, 1040)

861 (845, 875) 922 (908, 934) 950 (933, 963) 971 (953, 985) 992 (973, 1010)

1010 (986, 1020) 1020 (996, 1040) 1030 (1010, 1050) 1040 (1020, 1060) I050 (1020, 1070)

858 (844, 871) 926 (913, 937) 955 (941, 964) 976 (960, 984) 994 (975, 1000)

1000 (982, 1010) 1010 (987, 1020) 1020 (991, 1030) 1020 (993, 1040) 1020 (994, 1040)

~~ ~~

a Figures in parentheses refer to the 90 per cent confidence limits.

Page 7: ESTIMATION OF EXTREME INDIAN MONSOON RAINFALL

EXTREME INDIAN MONSOON ESTIMATION 111

Table IV Return values for minimal extremes’

Weibull (mm)

GEV (mm)

2 5

10 20 50

100 200 500

1000 2000

851 (838, 863) 781 (765, 799) 746 (728, 767) 720 (700,743) 694 (671, 719) 679 (656, 707) 668 (645,697) 658 (634, 687) 653 (628, 682) 649 (624, 679)

861 (845, 875) 784 (763, 806) 737 (712, 765) 695 (667, 726) 644 (613, 679) 608 (575, 645) 574 (539, 613) 532 (496, 575) 503 (465, 547) 475 (437, 521)

858 (841, 874) 780 (761, 803) 737 (715, 762) 700 (675, 731) 659 (625, 697) 631 (590, 674) 605 (557, 653) 574 (518, 628) 552 (487, 610) 530 (460, 594)

a Figures in parentheses refer to the 90 per cent confidence limits.

Table IV shows the analogous quantities for extreme minimum rainfall values. (In this case the GEV parameters derived by analysing the negative of the rainfall data have been used to estimate extreme return values.)

DISCUSSION

The method of maximum likelihood has been used to fit specific families of univariate distribution functions to time series of All India monsoon rainfall. A measure of ‘goodness-of-fit’ has been defined by an error norm, or test statistic, which is estimated using the bootstrap method.

The results suggest that, for the goodness of fit criterion used, some distributions traditionally used to describe extreme values provide a good fit throughout the range of the data. In particular, both the GEV and Weibull distributions provide a better description of the data than the normal distribution, particularly at the tails of the distribution, where the normal distribution gives an overestimate of the rainfall.

Rainfall values for specific return periods have been computed using both GEV and Weibull distributions. The bootstrap method has been used to provide estimates of the 90 per cent confidence intervals.

The All India monsoon rainfall is a broad measure of the total monsoon rainfall, and cannot quantify the spatial extent of excessive and deficient rainfall. Further work, to extend that of, for example, Parthasarathy et al. (1987), is required to provide a better predictive capability of anomalous monsoon rainfall patterns.

The techniques discussed are not computationally expensive and represent an efficient way of estimating the likelihood of occurrence of extreme events and associated confidence limits.

APPENDIX

The families of distributions used in this paper are given below. The notationsf(x) and F(x) have been used for the density function and cumulative distribution respectively.

Normal

Mean = p

Page 8: ESTIMATION OF EXTREME INDIAN MONSOON RAINFALL

112

Log-normal

D. E. REEVE

Mean = exp(v + .r2/2)

Gamma

Mean = afl

Exponential

Mean = rl

Weibull

Mean = 6r((l + p)/p)

Gumbel

Mean = 0 + y<

General extreme value

I/=

~ ( x ) = e(l-K(fT)) Mean = + I { 1 + r(l + K)}/K In the above equations y is Euler’s constant = 0.5772.. . , and r is the gamma fhction.

REFERENCES

Chambers, J. M., Cleveland, W. S., Kleiner, B. and Uey , I-! A. 1983. Gmphical Methodsfor Data Analysis, Duxbury, Boston, MA. Efron, B. 1982. ‘The jackknife, the bootstrap and other resampling plans’, CBMS-NSF Regional Conference Series in Applied Mathematics,

Fisher, R. A. and Tippett, L. H. C. 1928. ‘Limiting forms of the frequency distributions of the largest or smallest member of a sample’, Proc.

Jenkmson. A. F. 1955. ‘The frequency distribution of the annual maximum (or minimum) values of meteorological elements’, Q. 1 R.

Lhart , H. and Zucchini, W. 1986. Model Selection, Wiley, New York. Mooley, D. A. and Parthasarathy, B. 1984. ‘Fluctuations in All Indian summer monsoon rainfall during 1871-1978’, Clim. Change, 6, 287-

Parthasarathy, B., Rupa Kumar, K. and Kothawale, D. R. 1992. ‘Indian summer monsoon rainfall indices: 1871-1900’, Meteorol. Mag., 121,

Parthasarathy, B., Sontakke, N. A., Munot, A. A. and Kothawale, D. R. 1987. ‘Droughtdfloods in summer monsoon season over different

Stedinger, J. 1983. ‘Confidence intervals for design events’, 1 Hydmul. Eng. ASCE, 109, 13-27. Walker, G. T. 1910. ‘On the meteorological evidence for supposed changes of climate in India’, Mem. India Mereorol. Depf, 21, 1-21. Zucchini, W. and Adamson, P. T. 1989. ‘Bootstrap confidence intervals for design storms from exceedance levels’, Hydrol. Sci. 1,34,4148.

Monograph 38, Society for Industrial and Applied Mathematics, Philadelphia.

Camb. Philos. Soc., 24, 180-190.

Meteorol. Soc., 87, 158-164.

301.

174-186.

meteorological subdivisions of India for the period 1871-1984’, 1 Climatol., 7, 57-70.