41
ON THE THEORY OF ORDER STATISTICS By ALFRt~D RI~NYI (Budapest), corresponding member of the Academy Dedicated to A. N. KOLMOCORO'r on the occasion of his 50th birthday Introduction Since the beginning of the century many authors, e. g. K. PEARSON [1]~ L. v. BORTKIEWlCZ [2], E. L. DODD [3], L. H. C. TIPPET [4], and M. FRE- CHET [5] have dealt with particular problems which may be classified as belonging to the theory of order statistics. A. N. KOLMOGOROV [6], V. I. GLI- VENKO [7], N. V. SMIRNOV[8], B. V. GNEDENKO [9], and other mathematicians having recognized the great theoretical and practical importance of this set of problems, developed this subject into a systematical theory. In the last three years a particularly great number of papers dealt with such problems; of these we mention those of B. V. GNEDEN~O and V. S. KOROLUK [10], B. V. ONEDENKO and E. L. RVA~.EVA [11], B. V. GNE-- DENKO and V. S. MIHALEVI~. [12], V. S. NIHALEW~ [13], J. D. KVIT [14], G. lVi. MANIA [15], I. I. GIHMAN [16], W. FELLER [17], J. L. DOOB [18], F. J. MASSEY [19], M. D. DONSKER [20], T. W. ANDERSON and D. A. DARLING [21]. A bibliography up to 1947 is to be found in the paper of S. S. WILKS [30] enumerating 90 papers. The purpose of the present paper is to give a new method by means of which many important results of the theory of order statistics can be obtained with surprising simplicity; the method also enables us to prove several new theorems. The essential novelty of this method is that it reduces the problems connected with order statistics to the study of sums of mutually independent random variables. w 1 contains the review of the method, w 2 is devoted to the proof of some known theorems by means of this method, and w 3 contains the formulation of some new results obtained by this method,. concerning the comparison of the sample distribution function to that of the population. These results are connected with the fundamental results of A. N. KOL/VIOGOROV and N. V. SN.IRNOV. Let F,~(x) denote the distribution function of a sample of size n drawn from a population having the continuous distribution function F(x), in other-

On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS

By ALFRt~D RI~NYI (Budapest), corresponding member of the Academy

Dedicated to A. N. KOLMOCORO'r on the occasion of his 50th birthday

Introduction

Since the beginning of the century many authors, e. g. K. PEARSON [1]~ L. v. BORTKIEWlCZ [2], E. L. DODD [3], L. H. C. TIPPET [4], and M. FRE- CHET [5] have dealt with particular problems which may be classified as belonging to the theory of order statistics. A. N. KOLMOGOROV [6], V. I. GLI- VENKO [7], N. V. SMIRNOV [8], B. V. GNEDENKO [9], and other mathematicians having recognized the great theoretical and practical importance of this set of problems, developed this subject into a systematical theory.

In the last three years a particularly great number of papers dealt with such problems; of these we mention those of B. V. GNEDEN~O and V. S. KOROLUK [10], B. V. ONEDENKO and E. L. RVA~.EVA [11], B. V. GNE-- DENKO and V. S. MIHALEVI~. [12], V. S. NIHALEW~ [13], J. D. KVIT [14], G. lVi. MANIA [15], I. I. GIHMAN [16], W. FELLER [17], J. L. DOOB [18], F. J. MASSEY [19], M. D. DONSKER [20], T. W. ANDERSON and D. A. DARLING [21]. A bibliography up to 1947 is to be found in the paper of S. S. WILKS [30] enumerating 90 papers.

The purpose of the present paper is to give a new method by means of which many important results of the theory of order statistics can be obtained with surprising simplicity; the method also enables us to prove several new theorems. The essential novelty of this method is that it reduces the problems connected with order statistics to the study of sums of mutually independent random variables. w 1 contains the review of the method, w 2 is devoted to the proof of some known theorems by means of this method, and w 3 contains the formulation of some new results obtained by this method,. concerning the comparison of the sample distribution function to that of the population. These results are connected with the fundamental results o f A. N. KOL/VIOGOROV and N. V. SN.IRNOV.

Let F,~(x) denote the distribution function of a sample of size n drawn from a population having the continuous distribution function F(x), in other-

Page 2: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

192 A. RENYI

words, F,~(x) denotes the frequency ratio of sample values not exceeding x. KOLMOQO~OV determined the limiting distribution of the supremum of ]Fn(x)--F(x)[, SMIRNOV did the same for F~(x)--F(x); in w 4 we shall determine the limiting distribution of the supremum* of the relative deviations

r,~ (x) - - F, (x) - - F(x) F ( x ( (x)_ i and F(x) , respectively. To do this, besides our method,

the lemmas in w 4, generalizing some results of P. ERD6S and M. KAC [22], are needed. These lemmas are of certain interest in themselves. w 5 contains the proof of the results formulated in w 3, while w 6 contains some remarks on the numerical computation of the values of the limiting distribution functions occurring in our theorems; tables for one of these are also given. I have found the method formulated in w 1 by analysing a theorem of S. MALMQUIST [23]; w 1 contains among others a new and simple proof for this theorem of MALMQUIST. Together with G. HAJOS, we have found another simple proof of this theorem which will be published in a joint paper of ours [24]. All these investigations have for their origin in the discussions in a seminary of the Departments of Probability Theory and of Mathematical Statistics of the In- stitute for Applied Mathematics of the Hungarian Academy of Sciences. I lectured on the part of the results contained in this paper in January 1953 on the congress of the Humboldt University in Berlinland in September 1953 on the VIIlth Polish Mathematical Congress in Warsaw. On this last occasion A. N. KOLMOGOROV has made certain valuable remarks for which I express him my most sincere thanks. Further, I express my thanks to T. LIPTAK who participated in the preparation of this paper by elaboration of some particular calculations, a s well as to Miss I. PALASTI and Mrs. P. VARNAI for the numerical computations.

w 1. A new method in the theory of order statistics

Let us start with the following special case: let a sample of size n be given concerning the value of a random variable ~ of exponential distribution, i. e. the results of n independent observations for its value, denoted by ~ ~2, ~," in other words, ~, ~2, ~,, are mutually independent random ~ I , �9 " " , ~ " " " ,

variables with the same distribution function of exponential type. We need the following well-known property of the exponential distribution: if ~ is an exponentially distributed random variable, then

(1.1) P(~ < x-]-yl~ >= y) = P(~ < x),

* (in an interval 0 < a ~ F(x) ~ b ~ 1). This lecture will be published in the communications of the Congress under the

~ollowing title: "Eine neue Methode in der Theorie der geordneten Stichproben".

Page 3: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 193

if X > 0 and y-->_ 0. 2 This property characterizes uniquely the exponential distribution. Indeed, let F(x) be the distribution function of g, then

P(C < x + yl~ > y ) - F(x-]- y ) - - F ( y ) - = 1 - - F ( y )

and it follows that (1.1) is equivalent to the following relation:

(1.2) q)(x + y ) ----- q,(y)

where ~ ( x ) = 1--F(x) . It is known however that, of all ftmctions satisfying the condition 0 - - < -- ~ (x ) N 1, except for the trivial cases q)(x)~_0 and ~ ( x ) ~ 1, the functions CP(x )=e -xx (2 > 0) and only these satisfy the func- tional equation (1.2).

The meaning of (1.1) becomes especially clear, if the random variable is interpreted as the duration of a happening having random duration. In

this interpretation the proposition (1.1) can be formulated as follows: in case of a happening of exponentially distributed random duration, being in pro- gress at the moment y, the further duration of the happening does not depend on y, i .e. on its duration until the given moment.

Let us arrange the numbers ~1, .~2,..., ~, in order of magnitude and use the notation

.(1.3) ~ = ,~,~(~, ~2, �9 .., ~,) ( k = 1, 2 , . . . , n)

where the function Rk(xl, x2 . . . . , x , ) of the n variables xl, x2 , . . . , x~, denotes the k-th of the values x~,x2 . . . . , x , in order of magnitude ( k = 1,2 . . . . , n); thus e .g . ~ : rain gk and ~*~ = max ~ . Then the individual and joint dis-

l ~ k ~ n l < : k ~ n

tributions of the values of the order statistics ST <-_ ~,~ <= . . . <= ~,*~ can be most ,easily determined. For that purpose we interpret the variables ~k as random durations of mutually independent happenings; then ~ denotes the duration of the happening finished as k-th of the n happenings. Let us determine, first of all, the distributions of the differences ~+~- -~ . If U _k = y, then

:g $ p $ (1.4) P(~k+l--~k > X[~ = y) = (~k+l > X ' @ y[ ~ y)

where on the right side there stands the probability of the event that none of the n - - k happenings, being in progress at the moment y, finishes until the moment X@y. By virtue of (1.1), the value of this probability is

(P(r > x))'~-k = e-(,~-,0x~

and thus the conditional distribution function of r with respect to the condition ~, ~ y is

(1 .5) 1 ~ ( ~ + 1 - - ~ . < x I g; = y) = 1--e-('~-~0 x~.

2 P(A) denotes the probability of the event A, and P(A[B) denotes the conditional probability of the event A with respect to the event B.

Page 4: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

194 A. RENYI

As the conditional distribution function (1.5) does not depend on y, (1.5) gives also the non-conditional distribution function of ~7,§ indeed, by virtue of the theorem on total probability,

Qo

( | . 6) P ( ~ + I - - ~ < x) = f P ( ~ + I - - ~ ; < x l~ , = y ) d P ( ~ < y) = 1 - - e -('~-kJ~. 0

Therefore the differences ~ + 1 - - ~ are themselves exponentially distributed 1

with the mean value and thus the variables (n-k)z (1 .7 ) 4+1 = ( n - - k ) ( k = O, 1 . . . . , n--l

1 (In the above rela- are also exponentially distributed with the mean v a l u e ~ .

tion by definition ~3 ~ 0.) It follows from the abovesaid that the variables d~, d~, . . . , d, are mutu-

ally independent random variables. It is namely easy to see that the proba- bility

does not depend on the variables y~, Y2 . . . . . yk; this is evident, as the above conditions mean that ~ - = y ~ , ~ - ~ y l § ~ . = y ~ + y 2 @ . . . +yk; i . e . they give the moments of the finishing of k happenings, which finish first of the n happenings which started simultaneously at the moment t = 0 . These: conditions imply that at the moment t = y ~ + y 2 + . . . @ y k there are still n - - k happenings in progress and the probability of the finishing of at least: one of them before the moment t @ x is equal to 1 - - e ~(~-k~x. Thus the probability of the left hand side of (1.8) equals 1--e-('~-~% i. e. it does not: depend on the variables y~, y~, . . . , y~, and this is equivalent to the fact that the variables ~+~--~?, (and also the variables dk) are mutually independent~ Thus the variables ~ can be expressed in the form

d~ dk (k = 1, 2, n), 0 . 9 ) ~ = -h - § + § n - - k § " "

i. e. as linear forms of mutually independent random variables having the same distribution. (1.9) can be also expressed by saying that the variables ~:. form an (additive) Markov chain. By virtue of (1.9) the distribution of any ~,, further, the joint distribution of any number of the variables ~ can be: determined in explicit form.

Consider now, how the abovesaid can be applied in general to the study of order statistics. Let ~ be any random variable having a continuous and steadily increasing ~ distribution function F(x), let (~,g.~, . . . , ~ ) b e a

By saying that F(x) is steadi!yincreasing, we mean that F(x) is a strictly increas- ing monotone function in the least interval (a, b) where F(a)= 0 and F(b)---1; it may be happen that a = - - ~ or b : - § ~ .

Page 5: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 195

sample of size n consisting of n independent observations of the value of ~, that is to say, let ~1, ~2, . . . . ~,~ be mutually independent random variables with the same (continuous) distribution function F(x ) . Let us arrange the sample values ~; in order of magnitude, that is to say, let us form the new variables ~ R~(~I, ~2, . . . . ~).

The main problem of order statistics is to study the variables ~.; this, however, can be reduced to the special case when the variables ~k are expo- nentially distributed (and tlierefore - - by virtue of (1.9) - - to the study of sums of mutually independent random variables), as follows: let us put

1 (1.10) '2~ = F(~k) and ~k ~ l o g ~ ( k = 1,2 . . . . . n)

and let us denote by ~ = (~) the k-th of the variables ~1, ~/2,..., ~/~ in order of magnitude, i. e. let us put ~2~ : Rk(~, ~, .-- ,~b~); further, let us put

, 1 ( k = 1 , 2 , . . n). (1.11) ~k : log ~.+~_,~ .,

1 As log~- is a steadily decreasing function, we obtain:

(1.12) ~ = Rk(~i, g2 . . . . . _~,) (k~---- 1, 2 , . . . , n),

whence ~* . . . , _k is the k-th of the variables ~, ~2, ~,~ in order of magnitude. As we have assumed the variables ~k to be mutually independent, it follows that the variables ~ are also mutually independent.

Let us investigate now the distribution of the single variable ~ . F ( x ) being a strictly increasing function, the inverse function of x ~ F(y), denoted

by y = F-~(x) , is uniquely defined in the interval 0 --< x --< 1, and thus

( 1 ) _ F ( F _ I ( e = ~ ) ) = P(r < x) = P log T((~; < x = P(~k > F-~(e-X)) = 1 1 - - e -~,

if 0 ~ x =< 1. Therefore the variables gl, g-~, . . . , g,, are mutually independent and of exponential distribution with the mean value 1. In this way, the ran- dom variables ~* ~,.~ themselves can be expressed in the form

( , -1 ~* (1.13) ~,~ : F (e--~+~ ~ ) - - F-i~e - ~ ~ J (k~ ... . 1, 2 , . . . , n);

where the variables O~, 6~ . . . . ,6,~ are mutually independent and of exponen- tial distribution with the same distribution function 1--e -.~ (x > 0). It also follows from this result that the quotients

6 n + l - k

(1. 14) ~-+1, - - e

are mutually independent random variables (here, by definition, ~ /*+~1) , since the variables 6,,,+~_~ are, as we have seen, mutually independent. Another consequence of (1.13) is that the variables ~ ...... ~, . . ~ f o r m a

13 Acta Mathematica

Page 6: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

196 A. RI~NYt

Markov chain (and thus the variables ~ , ~_~,..., ~* form also a Markov chain); this follows from

, 6k+1 (I , 15) -~*-1': = F-1 (e l~ ~;-/~" )

which is obtained from (1.13) and from the fact that the variables _~+1-1< and d'k+~ are independent, since

(1 (o , o, )]) _~*+1_t~--F -1 exp - - n + -I--- '-t n--k-t-1 "

A. N. KOLMOGOROV [6a] was the first who remarked that the variables _~',~, ...,~,'~, i.e. the sequence of order statistics, form a Markov chain. The new method contained in the present paper starts from this fundamental observation, but the possibilities implied by it could be developed only after having transformed the Markov chain {~} into an additive Markov chain by

1 means of the transformation ~;=-log F(~_~+I)" In this connection it is in-

teresting to consider the following general problem: for which Markov processes {~t} can such a family of functions Gr be found that the variables ~7r form an additive Markov process? A necessary condition of this is that the distribution function F(x, s,y, t ) = P(~ < x I-~*'-Y) of the Markov process should satisfy the following differential equation:

OFOF(OF O 7 O F]_ o2F tO FfOFT_ F(OF] i ~x Oy IOy dx-Oy ~ 0 ~ ~-) Ox~y f Ox '2 ~.Oy] ~ kOxJ }"

We hope to return to the discussion of this problem on another occasion. The variables ~ are obviously uniformly distributed in the interval

(0, 1), because, if 0 < x < 1, then

(1.16) P(~I~ < x ) - - P(~I~ < F-~(x))= F(F-~(x))~- x,

and therefore the variables ro~ form an ordered sample of size n drawn from a population of uniform distribution in the interval (0, 1).

It follows from (1.14) that

and as P(e 4''+z k < x) --- P 6,~+~ i. > log = e- '~ --= x, therefore the va-

riables ~-~-< ) are mutually independent and have the same distribution, namely,

they are uniformly distributed in the interval (0, 1). This is the theorem of S. MALMQUrST mentioned in the introduction.

Page 7: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE T H E O R Y OF ORDER STATISTICS 197

w 2. The theory of order statistics built up by means of the method of w 1

On the basis of what has b e e n said above it is easy to obtain the results concerning the limiting distribution of order statistics. In order to show this, we shall prove the following theorems?

THEOREM 1. If k >= 1 is a fixed positive integer, then

lim P(n_~ < x ) = [" tk-~e-~ ( k - 1) ! at (x > o), 0

i.e. n ~ has in the limit a F-distribution of order k.

PRooF. As we have seen,

61 62 Ok

where 0~, 0,~,..., 0k are mutually independent and exponentially distributed xandom variables with the common distribution function 1 - - e -~ ( x > 0 ) .

6j TherefOre, the variable n- t - i - - j has the probability densi ty function

{ n ~ - l - - j ) e o,+1-5)~ ( x > 0 ) a n d thus it follows by simple calculation that the probability density function of ~* is

g,~(t)=(nk)ke'"*(d--1)'~';

hence n ~ has the frequency function

[n(e~-- l )]k- 'e-~=ll l - - ( : ) ( ) 1 n - - 1 -t ; _ _ = e (e 1) '~-1- (2. 1) -n-g k--1

tk-le-~ As ~,-~lim n (e '~ - - 1) ~--- t, thus the density function of n ~ converges to (k - - 1) .t

as n--~ ~ , i.e. to the density function of the F-distribution of order k. This result might have been expected by the following consideration.

Obviously ]r .

,(2.2) n ~ = 01 -}- 02 -}- �9 �9 �9 @ 0k ~- Y ( j - 1)0j. -- j='~.; n -}- l ~ j '

the density function of all+ 6.~ @. . . -}- 0k is, however, (k - - 1) [ ' on the other

hand, the variable . ~ ] ( j - - 1 ) 0 j tends stochastically to 0, as n - - ~ . j=2 n + 1 - - j

4 We use the notations introduced in w I.

13"

Page 8: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

198 A. RENYI

To develop this consideration into a precise proof, we need the following lemma, due to H. CRAM~R ([25], p. 254).

LEMMA 1. Let us put 7',~ = e,~ + ~,~ where a,~ and ,#,~ are random variables and let F,~(x) denote the distribution function of a,~ and further let us assume thaf'

lira M~',~-- 0 and lira D#,~ ~ 0.

Furthermore, let us suppose that there exists the limit F(x) of the distribution functions F,(x) as n ~ , i. e.

lim F,~(x) = F(x )

holds for all points x of continuity of the distribution function F(x). Further, let us denote the distribution function of 7,~ by G,~(x). Then

lim G,~(x) = F(x).

PROOF. 6 Without loss of generality we may suppose that M s Then, by virtue of Tchebyshev's inequality, we have

P ( l~ . l > ~) < -

therefore, given any, arbitrarily small e > 0 and 6 > 0, there exists a positive integer no(d) such that

P(I~',~I > ~) < O if n > no(O).

But then

(2. 3) o . (x) = p (7. < x) <= p < x + + P < -

In fact, if et~+#~ < x, then either a,~ < x + e or c~,~ ~ x +~ , but in the latter case at the same time ~,~ < - - e holds and we obtain (2.3) by means of the theorem on total probability. Similarly,

(2.4) G,~(x)=P(7,~<x)>=p(~,~ < x - - e ) - - P ( # , ~ > , ) ;

in fact, if tz,,< x - - , , then either e,~+s < x , or ct,~+s >_ x, but in the latter case at the same time #,~ > e. Consequently, we have

F , , (x - - *)- - d --< O, (x) ~ F,,(x + ,) + d.

Passing to the limit n--* ~ and considering that e and d can be chosen arbitrarily small, it follows that

F(x--O) <= lira Gn(x) <_-- lim G,,(x) ~ F(x-}-0),

5 Here and in what follows we shall denote then mean value of the random variable by Mr and its standard deviation by D~.

0 We give here the proof of this lemma of CRAu~R because a similar method of proof is needed in the proof of Lemma 2.

Page 9: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 199

i .e . , that in all points of continuity of F(x) lira G,~(x) ~- F(x).

This proves Lemma 1. Theorem 1 follows immediately from this lemma, as

J - ' ( , + , M n + ~ ) - - .=. n 4- 1 - - j and =2 n q- l - - j ) .i=.~ j),-,

and thus the conditions of Lemma 1 are satisfied. By means of Theorem 1 we can easily determine also the limiting

distribution of ~-. We have /]**+~-7, ~-~-e --~ and therefore

P(n(1--/]:+>,,)<x) = P ( ~ < log 1 ) . l__ x.3_ '

n

since 1

l o g - - - - x

1 - - - - n

lim - - 1

n

and the F-distribution is continuous, it follows that n(l--yb*,+>l.)has in limit

also a F-distribution of order k with the density function ( k - - l ) ! (t > 0 ) .

Now, the random variables /]k are mutually independent and uniformly dis- tributed in the interval (0, 1). Because of the symmetry of the uniform distri- bution, the same holds also for the variables l - - / ]k(k~-- -1 ,2 , . . . , n ) a n d thus the variables

/]~ --/;?~-(/]1,/]2 . . . . ,/],0 and 1 --/],]+l-k = R k ( 1 --/]1, 1--/]2 . . . . . 1 --/],,)

have the same distribution. Hence it follows the following

TueoueM 2. The distribution of the variables n~7~ and n(1--/],*,+~_ 0 in case of any fixed k >= 1, tends to the F-distribution of order k with the den-

tl~ 1 e sity function ( k - - l ) ! ( t > 0 ) ?

By means of Theorem 2 we can determine also the limiting distributions

of ~* * " .,c and ~;~+>k, these, howe.ver, contrary to the limiting distributions of the variables ~,*~ and g~ - - will depend on the distribution function F(x) (see [81).

7 The distribution of the variables ~ can be also determined exactly for finite n and after this passing to the limit Theorem 2 can be proved also in this manner by means of some simple calculations (see H. C~AM~R [25]). We proved this theorem here by means of our method to show its application at first in a simple case.

Page 10: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

200 h. RENu

Now we shall prove the following

THEOREM 3. 8 The variables ~l~ and ~,%1-j are independent in the limit i f n ~ ~ and at the same time k and j are fixed, namely

x y

. . . . . n ' d u d r ( x > O , y > O ) .

O 0

PROOF. First of all we prove a lemma:

LEMMA 2. Let us pu t 7,~ = e,~ q- 8,~ where c~,~ and [L. are random variables., and let a random variable Z,r be given which is independent o f cr Let us denote the distribution functions of c~,~ and Z,~ by F,~(x) and H,~(x) respecti- vely. Let us assume that the limit functions F(x) ~ lira F,~(x) and H(x) ~ lira H~(x)

exist and, further, that

lira M f i , = 0 and l imD# ,~ - -0 .

In case all these conditions are satisfied, we have

lira P(7~ < x; Z , < Y) ~- F(x) H(y) ,

i.e., the variables 7,, and Z,~ become independent in the limit.

PROOF. Let us choose (as in the proof of Lemma 1) the value of the integer no so large that

P(i~,~l > e) < O if n > no

Similarly to the arguments applied in the proof of Lemma 1, if may be proved that, if n > no, where no depends on the choice of the positive numbers ~ and d, then

(2.5) P (r162 < x - - e, g,~ < Y) - - d N P (7,~ < x, Z,~ < Y) <= P (a,~ < x -}- ~', )b,. < Y) -5 d,

and as, by our assumption, c~,, and Z,~ are independent, therefore

p < x • < y) =- t% (x • H,, (y)

and similarly to the proof of Lemma 1, we obtain that in all points of con- tinuity of F(x )

lim P(7,, < x, Z , < Y) = F(x) H ( y )

holds. This completes the proof of Lemma 2.

Now 6j+1 d~.+l-k

~,*+1-~-- log n . q - . . . q- log n + ~ n - - j k

8 See CRAM~R [25], p. 371. We discuss this well-known theorem here, because our method throws more light on the real ground of the fact expressed in the theorem, than its known proof.

Page 11: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 201

where

and therefore

d~+~ As the sum - - . + . . . +

n - - j

further, we know that

0 i O~ + n + l - - j '

lira M , ~ = l im D~* - 0 .

(~n+l-k k logn is independent of ~}' and

lira n log 1 - - y ,

1 Y n

we have

(2.6) lira P(~i~ \

where

g

f t j_l l i m P ( n g j < y ) - - ( j - - l ) !

o

e -~ dt,

- - ~/,~+l-j < = < ) - , 1 * = lira P ,,+~__~--~ > log ---~ Fj(y)

!/

f p-1 FAy) = ( j _ e" dt,

o

On the other hand, by Lemma 1

(2. 7) P ,~+1~i:--~ > log ~- l im P * n+l-k

and by virtue of Theorem 2

(2. 8)

.9,'

x ] _ _ ( t~qe -t < 3-) - - ) at.

o

From the relations (2.6), (2. 7) and (2.8), Theorem 3 follows.

By means of Theorem 3 we can determine the limiting distribution of the difference ~b*~--t/{. This is important, because ~/,'[--~/I, the range of the sample Ql,,r/~ . . . . ,r/?0, can be used to estimate the standard deviation of the population. As, for large n, the variable ~{ is near to 1 and 2fi is near to 0 with a probability near to 1, we obviously have to consider the variable n [ 1 - - ( ~ - - ~ l l ) ] and as, by virtue of Theorem 3, n ,~ and n(1--~/~) are independent in the limit, the limiting distribution of their sum equals the composition of their limiting distributions. As e -~ (x > O) is . the densi ty func-

i , * , tion of the limiting distribution of both n~{ a n d n ( - ' ~ b , ) , the density

Page 12: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

202 A. RI~NYI

function of the limiting distribution of n[1--(r/,~--~{)] is .,,e

j'e-(:~-y)e -~t d y = x e .... (x > 0); 0

therefore n[1--(~,]--ri~)] is in the limit a random variable having a / '-dis- tribution of order 2.

By means of the limiting distributions of the random variables *i~, the limiting distributions o f the random variables ~* _k can also be determined.

Hitherto we have considered the limiting distributions of the order statistics ~, (resp. those of their transformed values ~ and ~) under the condition that the index k (resp. j - - n + l - - k ) is fixed and at the same time n - - ~ ; this set of problems is called the study of the "extreme values" of the sample. We now turn to the study of the limiting distributions of the variables ~$ (resp. of zff and ~) under the condition that together with n k tends also to infinity, namely so that I k - - n q [ = o ( V n ) , where q is a con- stant (0 < q < 1). The variable ~*+~ (where k = [nq]) 9 satisfies this condition; this variable is called the q-quantile of the sample. In the special case n = 2 m + I where m is an integer, the variable ~" ~,,+1 is the median of the sample: obviously, the q-quantile of the sample is nothing else but the q-quantile of the sample distribution function and thus the median of the sample is nothing else but the median of the sample distribution function. Consequently, if n is an even

integer, i.e. n~-2m, then 2(~*~+~,~+0 is called the median of the sample.

We shall now prove the following theorem containing the proposition that the q-quantiles of the sample in the limit are normally distributed, if the distribution function F(x) of the population satisfies certain simple Conditions.

THEOREM 4. ~~ Let us suppose that the density function f (x) - - -F ' (x) of the common distribution function F(x) of the mutually independent random variables ~, ~2 . . . . . ~,, exists and that f(x) is continuous and positive in the interval a < x < b; then, if 0 < F(a) < q < F (b )< 1 and further, if [k,,~nql

-~-o(Vn)(and thus, afortiori, lira k,~, q), then ~* is, in the limit, nor-

really distributed with the mean value Q=F-~(q), which is the q-quantile of the distribution function F(x), i.e.

x

lim Pl_ 1 . ] / ~ - - q ) <X : [ / ~ e-

f (Q) V n

9 [x] is the largest integer for which [x] ~ x. 1~ This theorem is contained in a general theorem of N. V. S~amNov [8g].

Page 13: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 203

PROOF�9 First consider the limiting distribution of ~+l-l"n (Jj

(2.9) - - n + 1 - j ,i"=1

where the variables da- are mutually independent and exponentially distributed with the distribution function 1--e -x (x > 0), that is to say, Md~-= D d j - = 1 .and

oe 1 <~

M(Id' i - - l l a) -~- f ]x- - l l 3 e -.~' dx < [e -.~ dx + i x a e-Xdx <= 5. 0 0 0

Then, however, n+l-gn 1

*~+l-lc~ l

(2. 10) S~ = D ~ r247 ,,,, = .~' (n + 1 - - j ) " ' j = l

[ n§ n I r g n+l-/c n | K,?= . ~ M ~ 5 ~ ' (n +- l - - j ) : "

3-1 n -]- 1 - - j j=l

and thus

K,,, 5 (2.11) s,-( =< k,-7

Therefore, if n--* o% then ~- - - - ,0 ; this means that the central limit theorem

in LIAPUNOV'S form can be applied to the sequence of sums ( 2 . 9 ) a n d thus

(2. 12) ,,§ P L S;7 - < x. - - g 2 ~ _' e - ~- dt.

Now, it is known that

21 k=l ~ - - - l o g m + C + A , .

where C is the Euler constant, and an A > 0 constant can be found

that !A,,, [ < A m

(2. 13)

Thus we have

(2. 14)

such

�9 By means of some simple calculations it can be verified that

1 1 1 & 1,.=,, kr - - h H + h ( h - - 1 ) (0 < ,9 < 1).

n 8' M,, == log ~ + ....

1- l (2. 15) & k,, h +e": '

Page 14: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

204 A. RI~NYI

A B where /g+l < ~ and l+,71 < ~ , A and B being constants not depending on n.

Therefore n

~,~+1-~:,, - - M, , g,+,+l J~. - - l o g

(2. ~6) s , U ~ - g ~ + 7 + + +':'

I / [ n k , +

n where IgL'l<L k , ( n - - k , ) and L is a constant

k, if n---. ++, then -----~ q and thus ~;"--+ 0. Hence

n

not depending on n. But,

(2. 18) lira P q~ < x - - e - ~ dt. ( / ~ - q /

nq 1

As ~+*++1,:,~--- log F(~*+,) ' if follows from (2.18) that

( ' 2 " (2. 19) ,+lim++ P ~,* > F -+~ q e-~[/W = - ~ _ e - V d t .

But, in view of the mean value theorem of differential calculus, we have

(2, 20) F-' (qe " +") = Q 4c q f ( O & + )

it follows that

For brevity, let us introduce the notation q+-- k,+, then, by virtue of (2.17) n

and taking into account that owing to #, . --nql = o (Vn), we have

log q~-- log-ql- : o ( ~ ) ,

where lim & + = 1.

It follows that

(2.21) lim P (. ~ " - - Q

. . . . f~Q) V q C l - q ) t. n

which was to be proved.

i: t"- _ 1 e - ~ dt, < x V ~

/g'~+~-k, - - log ~ < _ . e - ~- dt. (2. 17) lim P 1 + +++ ,+.+ + ] / n - k ~ 7 ~

~. n k++ -~'+

Page 15: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 205,

The statement of our theorem can also be characterized by that the q-quantile of a sample of size n in case of large n is approximately normally distributed around Q, i.e. around the q-quantile of the population, with the

standard deviation 1 ] / q ( 1 - - q ) f ( Q ) [/ n "

In this way, by means of the sample median, an interval containing the median of the population with probability arbitrarily near to 1, can be given. In the special case of symmetrical, for example, normal distribution, the median of the distribution coincides with its mean and in this way we can estimate the mean value of the population.

The theory of order statistics has a widespread applicability in t he statistical quality control of mass production, ~1 which is an important field of application of probability theory. Let us assume that a certain measurement of some engine parts produced on an automatic machine displays some small random fluctuations from specimen to specimen, therefore its value can be considered as a random variable. Let us suppose that, under standard. manufacturing circumstances, the distribution function of this measurement is the (continuous) function F ( x ) ; to control the process of production, at regular time intervals we draw a sample of size n - - e.g. of size 5.

We take the considered measurement of the values of the sample and we mark them on a perpendicular straight line drawn across the abscissa corres- ponding to the point of time of sampling on the ,,control chart" and we mark their places with dots; the values of the sample will be placed automatically in order of magnitude. In order to detect any irregularity in the process of production (e. g. the displacement of the adjustment of the automatic macIfine or the attrition Of certain parts of the producing machine etc.), we draw 5 bands determined by parallel straight lines, giving intervals containing the least, the second, the third, the fourth, and the largest value, respectively, of the sample of size 5 at the same time with a given probability - - e . g. 95 % - - u n d e r standard manufacturing circumstances. The determination of these intervals is very easy by what has been said above. In fact, if ~ denotes the k-th sample value in order of magnitude (k = 1, 2, 3,4, 5), then, as we have seen, we can exactly determine the individual and joint distributions of the variables

1 ~'7~ = log F(~_~:) (k = 1,2, 3, 4, 5).

The practical application of this method in quality control is dealt with by the Department of Mathematical Statistics of the Institute for Applied

n Cf. the work of L. I. BRA(ItNSKY [26] ; by means of the theory of order statistics, the calculations of BRAGINSKY which are not quite exact can be put in a precise form; in the practical application it is suitable to carry out the control charts on the basis of these precise calculations.

Page 16: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

2 0 6 A. R~NYI

Mathematics of the Hungarian Academy of Sciences; tables needed for the use of the method are also prepared.

We shall not continue the enumeration of theorems obtainable by means of this method, we only emphasize, that our method consists in deducing a l l these theorems, by means of (1.9), from the theory of limiting distribution of sums of mutually independent random variables.

w 3. Formulation of some new theorems concerning the comparison of the distribution function of a population and

t ha t of a sample drawn f rom it

Hitherto we have only shown how our method makes it possible to prove certain well-known results of order statistics. Now we shall show the new results which can be obtained by means of the same method.

A. N. KOLMOGOROV [6b] proved a fundamental theorem giving a test for the hypothesis that a sample has been drawn from a population having a given distribution. By means of this test we can infer to the unknown distri- bution of the population from the distribution of sample values. 12 Let us define

0 if x<~, - -

(3. 1) F~(x) : k if ~ < x ~ ~J%1, /7

1 if ~*<x, i. e. F,~(x) is the distribution function of the sample, in other words, the frequency ratio of the values less than x in the sample.

KOLMOGOROV'S theorem is as follows:

I 2 (--1)~ e-2;~'~ if Y > 0 0 . 2 ) lirn P ( V n sup lFn(x) - -F(x) l<y)= ~.:_~

~r -~<x<+~ 0 if y ~ 0 .

KOLMOGOROV'S theorem therefore gives the limiting distribution of the sup- remum of absolute value of the difference between the distribution function of the sample and that of the population. This limiting distribution does not ~depend on the distribution function F(x) of the population which is assumed, for the validity of theorem, to be continuous. KOLMOQOROV'S theorem con- siders the difference [F,(x)--F(x)] with the same weight, regardless to the value of F (x ) ; so e.g. the difference [F,,~(x)--F(x)]=-O.Ol has the same weight in a point x with F ( x ) = 0 . 5 (where this difference is 2% of the value of F(x)) as a point x with F(x) = 0.01 (where this difference is 100%! of the value of F(x)). We can avoid this by considering the quotient

IF~(x)--F(x)[ instead of ]F,~(x)--F(x)], that is to say, by considering the F(x)

x~ I. e. we can give confidence limits for the unknown distribution function.

Page 17: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 207

relative error of F,~(x). In this way, the idea arises, naturally, to consider the

limiting distribution of the supremum of the quotient [E"(x)--F(x)l which F(x) characterizes the relative deviation of distribution function of the population and that of the sample.

A theorem similar to that of KOLMOGOROV's was proved by N. V. SMIRNOV concerning the one-sided deviation of the sample and population distribution functions. SMmNOV'S theorem is as follows:

(3. 3) lim P(Vn sup (F,~(x)--F(x)) < y) = i

4

if Y 0, . . . . . . <.~<+~ ,0 if y=<0 .

We shall consider also the analogous problem for relative deviations. All these problems can be successfully solved by means of the above

method. In the course of solving these problems a natural limitation is :to be adopted: as E(x) takes on arbitrarily small values, it is not suitable to con-

F,~(x)--F(x) F , ( x ) - - f ( x ) [ sider the supremum of F(x) or respectively F(x) taken

in the whole interval - - ~ < x < + ~ , but to restrict ourselves to an interval x~ N x < + o,, where the abscissa x~ is defined by the relation F(x,,) = a > 0; the value of a, however, can be an arbitrarily small positive value. In {} 5 we shall prove the following results:

THEOREM 5.

(3.4) lira P (Vn sup ) F(x) < y =

' ~ l - a , t~

= e ~dt o

0 if y ~ 0 .

(3. 5)

THEOREM 6.

( ) lim P l/rn sup F(x) < y --

if y > O,

4 ~ '~ e 8 a y'~ = ~- k=o~'~ (--1)~ 2kq-1 if y > 0 ,

0 if y ~ O . We may consider the limiting distribution of the supremum o f

F,~(x)--F(x) and of its absolute value taken in the interval x~ < x <= xb, F(x)

respectively, where the abscissae x,~ and xb are defined by the relations F(x , , )=a>O and F ( x O = b < l ( 0 < a < b < l ) . We then arrive at the following theorems.

Page 18: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

208 A. RENY[

(3.6)

(3.7)

where

and

T H E O R E M 7.

1 lim P n sup < y = ,,-~ ~ ~ ~ F(~) <= ~ F(x)

1 y e ~ dt du - - e -

J~7 - ct~

THEOREM 8.

lira P ( Vn sup a ~ F (~c) ~ b

( - - ~ < y < + ~ ) .

F,~(x) - - F ( x ) ) F(x) < y -=

_ (9-k+l) ~- n 2 (I - a )

_ _ - - 1 )t~ e 8 ~ .r

- - " ~ 1,-_o 2 k + 1

0 if y ~ 0 ;

El,- if y > O,

E k ~ 1 - - - - -

~ ?r

" d u + q ~ ,

b y~ (2k+O T 2 e - 2 ( l - b ) ? (1-b)~t2

) e ~'bY~ s i n u d t / .

Y i~da y o

These theorems provide tests for verifying the hypothesis that the sample (-~, ~2, . . . , ~,~) has been drawn from a population of the distribution function F(x). The character of these tests consists in that they give a band around F(x) in which, if the hypothesis is true, the sample distribution function F,(x) have to lie with a certain probability and the width of this band in all points x being proportional to F(x) This band, however, is not a symmetrical one. To overcome this difficulty, we apply the test twice, first to the sample (.~1,-~2,..., ~,~) having the population distribution function F(x) and then to the sample ( - - ~ , - - ~ 2 . . . . ,--~,,) having the popula t ion distribution function G ( x ) = - l - - F ( - - x ) . In order to illustrate this, let us denote the distribution function of the sample (--.~1,--_~:, . . . . ,--~,,) by G,~(x) and let A be the event that

sup / ~,, <= ~,, (~), F ( x ) < y '

.and B the event that

- <-F( G,(x)--G(X)G(x) I g '~ (x ' ) -F (x ' ) Vn sup) < y , i.e., 1/n sup < y ; ' ~'(z')<=l-~ 1 - -F(x ' )

Page 19: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 2 0 9

finally, let us denote the simultaneous occurrence of A and B by C. Taking into account that

P (C) ---~ e (A B ) : P (A) + P (B)-- P (A q- B)

and in case of occurrence A + B we have obviously at the same time

]/n sup iF,,(x)--F(x)l < y ;

and this last event, by KOLMOGOROV'S theorem, has in the limit the probability

{3.8) K(y) = 2 ( - - ! ) ke-~k2~,

Therefore P(A-I -B)<=K(y) in the limit and in the same case

P(C) ~ P ( A ) + P ( B ) K(y). T h e probabilities P(A) and P(B) are equal and their common value is given by (3.5). Thus the probability of the event that the sample distribution function F,,~(x)lies in the intersection of bands defined by the above two conditions corresponding to the sample (~,, ~2 ... . ,~,,) and (--~1,--_~2 . . . . . --~,~),

respectively, is not less than 2 L ( Y l / j - ~ ) = - K ( y ) i n the limit, where

(2k+l)~n ~

L(z)= 2 (__I),~ e s~ ,~=o 2(k4- 1) (z > 0) and K(y) is the function defined by (3.8).

Let us point out a most surprising corollary of Theorem 7. From the theorem (3.3) of SMIRNOV, we get

(3 .9) lira P ( sup (F~(x)--F(x)) < 0 ) = 0 , n-~cc - a c ~ x < + ~ o

i. e. the probability of the event that the sample distribution function does not exceed the population distribution function all along the interval - - ~ < x < + ~ , tends to 0 as n-- , ~ . From Theorem 5 it follows that

(3.10) lira P ( sup (F~(x)--F(x)) < 0)~0,

i. e. the same is true for the interval x~ =< X < - [ - ~ . On the other hand, by Theorem 7,

l I n ( l - b ) b - a

1 T e-V-dtdu > O, {3.11a) l imP( sup (F~,(x)--F(x))<O)---~ e- n->.aa a Z a ~ X ~ x b

0 0

i. e. the probability of the event that the sample distribution function does not exceed the population distribution function all along the interval in which the value of F(x) lies between arbitrarily fixed values a and b (0 < a < b < 1), remains positive also in the limit. This result, obviously, is important also from the point of view of statistical practice.

Page 20: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

210 A. RI~NYI

The result that the limit on the left side of (3. l l a ) i s positive, was also proved by fireMAN [16]; moreover, he obtained that

(3.11b) l i m P ( sup ( f**(x)--F(x))<O)-~ 1 V a ( 1 - - b ) t

- - arc sin ,~-.~o ~ =<~.~,.~ :w b(1--a)

GmMnn mentioned further that the result (3. l lb) has been already known tcr GNEDENKO. The terms on the right sides of (3.11a) and (3 .11b)are , of course, identical. This follows from the following consideration: the right side of (3. l la) is nothing else than two times the probability of the event that a random point normally distributed in the plane (x,y) and having the pro-

1 1 (x2+y2) bability density function ~ - e - - , lies in the infinite sector 0 < x < q- ~ ,

I f a (1- -b) and is to O < y < x b - - a ' this probability equal

.

2 a r c t g u ~ a 1 arc s i n V a(1 -o ) (3. 12) 2~- - - ~ _ b(1--a) o

Indeed, because of the circular symmetry of the normal distribution having 1

l - "T (~'+"r the density function~-~-e the probability corresponding to the infi-

nite sector of angle ~ is ~ 2 i f "

Theorems 5--8 will be proved in w 5. First in w 4 we shall prove some auxiliary theorems which are of interest i n themselves too.

w 4. S o m e n e w l imit ing d i s tr ibut ion t h e o r e m s

Let a sequence b e given ~ consisting of the sets of random variables

~,,,1,~,,,2 . . . . ,~,~,N, (n = 1, 2, . . . ) . Let us assume that the random variables ~,~, k have the expectation 0 and a finite variance, further, that the random variables having the same first index n ( n = 1 , 2 , . . . ) are mutually independent and satisfy LINDEBERG'S condition, that is to say, introducing the notations

F~,,k (x) = P (~,,, ~: < x); S,,,, k ~ , ~,, ,~; B~ D" S,,, N,,, -~- ~ , D 2 ,},,, 7p==-1 k = l

we suppose ~o

= f xdF,,, =0,

and

(4.1) lim 1.,-~s j~ x2dF,,j~(x)=O if ~ > 0 . . . . . BT, k=l I~I:>~B,~

Page 21: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 211

Concerning these sequences the following theorems.

THEOREM 9.

satisfying the above conditions we shall prove

�9 - - X ~-2

(4.2) l i m P ( max S,,,~<xB~)~-tV2-j e-~dt if x>O,

. . . . ~-<k-<x,~ [ 0 if x ~ O .

(4. 3)

(4. 4)

r h e l l

(4 5)

where

THEOREM 10.

o~!~m" P ( 1--<k-~N,,max IS,~.~,I < xB,O-- ~ (--1) ,~ , ~Yrk=o 2 k + l - if x > O ,

kO if x ~ O

THEOREM 1 1.

lim P(--yBn <-- min &, ~, ~ max S,,, k < xB~)~-

= ~(~+# s i n ( 2 k + l ) Y r x if x > O and y-->O, 2 k + l

either x ~ 0 or y < 0 .

REMARK. In case y=-x, Theorem 11 reduces to Theorem 10.

THEOREM 12. Let A~D2S,,,~G with 1 ~M,~<N.,~ and . A~, xm ~ - - ~ . ( O < X < 1).

l i m P ( max I&,,,~[<yB,,)= n-~c~ Mn < IC~r~

( - g k + l ) ? n - ~ ~

i 4 ~ s~'~( 2 f '~ ) y

0 y ~ O

if y > 0 ,

r / y2 (2k+l) ~-

2Le- o~--~ f ~'"_2 Y2-~zy e 2y2 sin u du. o

REMARK. In the special case of M n - - 1 (i. e. for 2 ~ 0), Theorem 12 is identical with Theorem 10.

For the special case in which all lhe considered random variables ~,,~ have the same distribution, Theorems 9 and 10 were proved by P. ERDOs

14 Acta Mathemafica

Page 22: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

212 A. R E N Y I

(4.6)

and

(4. 7)

and M. KAC [22]Y In the proofs of the above more general theorems, we modify their proofs inasmuch as we apply an ordinary (one-dimensional) limiting distribution theorem instead of the multi-dimensional limit theorem used by them; this enables us to generalize their results. We shall use Theorems 9- -12 in w 5 to determine the limiting distribution of the random variables

F. (x ) - -F(x ) F, ,(x)--F(x) sup and sup

~<-u(x)<-b F(x) F(x) _ _ a , ~ . v ( a . ) ~ b

where F,~(x) denotes again the distribution function of a sample of n mutually independent observations concerning the random variable ~ having a conti- nuous distribution function F(x) and further 0 < a < b > 1.

Let us turn to the proof of Theorem 9. Let us put

P , ~ ( x ) = P ( max S,,I~<xB, O. l ~ k ~ N u

Let ~1, g,_ . . . . , ~i . . . . . . be mutually independent random variables which are normally distributed with mean 0 and variance 1, and let us introduce the random variables

k

Ck-- ~.7~2,~ (k~-~- 1, 2, . . . ) .

First of all we shall prove that for any ~ > 0 and for any positive integer k we have

1 lira P,,(x) >= P(max (r g2 , . . . , Ck) < (x--~) Vk-)- d k

lira Pn(x) <= P(max (~1, ~ . . . . , ~k) < x Vk).

For, let mj be the least positive integer satisfying m j

( j ~ 1, 2, . . ., k). Obviously, 1 <= ml = < m~ <= . . . ==< m~ ~- N~. Let us define now the following variables:

( 4 . 8 ) ,~n, 1 ---- 8,% m., ," Afn, j = S , b mj - - Sn, ruff_ 1 (] = 2, 3, . . . , k).

We can see easily that for any fixed j ( j = 1, 2 . . . . , k) the Lindeberg con- dition holds for the sequence

(4.9) ~,,, W_l+l , ~n, mj_l+2 . . . . ' g'" ~"0' (n = 1, 2, 3 , . . . ) .

1~ Their method was generalized by M. D. DOSSIER [20b]. See further the papers by A. WALl) [27], [28], and K. L. CaUN~ [29]. They consider the limiting distributions of the supremum of the first n partial sums under the conditions that the variables have the same distribution and that the variables have finite third moments, respectively. C~UNQ gives for this latter special case an estimate for the remainder term also. ERD6S and KAC remarked that their theorems can be proved under more general conditions.

Page 23: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 213

lndeed, introducing the notation

B 2 �9 = D"d~,a' and using the relation

1 l i m ~ s u p D2~,.~,k~---0

which trivially follows from (4. 1), we obtain that, for any d > O,

,(4. 10) 1 - - d 2 B 2 1 4-6 k B~<--_ ,~,3"=<--k B;, if n>no(d)

;holds. Thus for any s > 0

l ' ~J >fBn X2 k l ~5' [ B" a & , (x) <= y _ x ,,,j . . . . *J-l+l Ix] ,a' 1 - - d B,'2,,. l l~l>rB,~

V1 Lindeberg condition is therefore actually if n>no(d) , and s ' = s - ~ . The

satisfied by the sequence (4. 9). Therefore, by the central limit theorem, ,'v

1 f ~" (4. 1.1) lim P(&,j<xB,,,~) ] [~ e-gdt ( j = 1,2, . . . , k).

But the random variables &, , , & , u , . . . , &,a, are mutually independent and

lim B,,, ~ 1 ,~§ B~ Vk ( j= l , 2 , . . . , k ) ;

hence

(4. 12) ~-,| ( A~,J__B,, < v ~Xi " ) . . . . , k - :

i . e .

(4. 13)

-- 1 xl x2 Xke-y(j)lt) ) ( V ~ ) k ; ; ' ' ' ; dtl' ' 'd'lr

-0o -0o

l imP(An'14-J~ '24- '"4-A"J<x; j~l 2, k )=

1 {" -o-! "~ ~j" - - ] e " ~y=l ) dt~ ...

(T~) where the integration is to be extended over the domain Tx defined by

Tz: {--oc <t14-t24-...+tj<xVtc; j ~ 1 ,2 , . . . ,k} . Hence we obtain

(4. 14) lim P( max &, W < xB,O = P( max gy < x~k).

14.

Page 24: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

214 A. RENYI

Let us put Q,,, ,,(x) = P( max S,,, ,,,; < xB~),

l ~ j ~ k

and let lln,,.(x) denote the probability of the event that &,r is the first of the sums &,5 ( J ~ 1 , 2 , . . . , r ) which is ~ x B,~, i. e. let

H , , r ( x ) = P ( & , ~ > = x B ~ , ; max S , , j < x B , O. l ~ j ~ r - 1

Obviously,

Let us suppose mj_l < r

H % ( x ) = P (S ....

and H (~) (x) = P (S .....

we ge t evidently

H,,,,. (x) = a - - Pn (x) <-- 1.

my; introducing the notations

~ x B , , ; max & , j < x B ~ ; ]&,,,g--S,~,ri>=sB,0 , 1-<2"~r-1

>=xB,~; max S,, , j<xB~,; [&, , ,~- -&, ,oI<sB,O. 1 ~ j ~ r - 1

Let us apply TCHF~BYSHEV'S inequality: 2

(1) n,,,,.(x) =H,~ r ( x ) P ( s , ~ , w - g .... I > 8B~,)< n,~ r(x) B,~,j 8 B;

and consider the relation (4. 10); thus we obtain that

(1) H,~,, (x) < H,~ r (x) 1 + d ~ ~o_L. ,

therefore

(4. 15)

On the other hand,

(4. 16)

as from the relations

it follows that

Thus we have

1--P,.~(x) = ~ H,~, ,. (x) ~ --sHe q- ,,.=, " ~ H~.(x) .

H (2),~, ,. ,rx~, =< 1 - - Q,, , ~ (x - - 8)

S,, ,. >= xB,~ and IS,,, , , j--S,, , ,.] < 8B,~

&, ~,j > (x - - 0 B,,.

1--P~(x) ~ 1 -}-d 8 2 k + 1 - - Q,~, ~. ( x - - 8).

Further, on account of the trivial inequalities P,~(x)<= Q,~,k(x) (k = 1, 2 , . . . ) , we obtain

l q - d (4. 17) Q, ,k(x--8) 8~ k <= P,~(x) <= Q,,,k(x).

Page 25: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON T H E THEORY OF ORDER STATISTICS 2 1 5

Comparing the above relation (4.17) with (4. 14), we have just the desired inequalities (4.6) and (4. 7).

Let us consider now the special case in which the variables ~,~ assume only the values 4-1 arid --1, and

1 P ( ~ , k = 4. 1) ~ P(~, ~--- - - 1) ~----~- ( k = l, 2, . . . ,N,~=n; n = l , 2 . . . . ).

Then

: 11v) + [xpn + 1] ~ rnod 2 . .

,(except if x f n is an integer) and thus from the Moivre--Laplace theorem we conclude that

x

(4. 19) lim P,(x) = e ~dt (x > 0). n ---~ oc

o

Therefore, as it follows from (4. 6) that

(4. 20) lira P(max (g~, ~ , . . . , ;k) < xl/'k) <= lim P,,(x+~) k-->- oe n - ~ - oo

and from (4. 7) that

{4. 21) l im P (max (~, ~.~ . . . . , ~ ) < x Vk) => l im P,, (x),

we have

(4. 22) x

lim P(max (~,, .~2 . . . . , ~k) < x ~'k) : e-Ydt (x > 0), k -->- ~

o

and, applying again the relations (4. 6) and (4. 7), we obtain (4. 2). The basic idea of this proof can be summed up as follows: we have

pointed out that in case of a special choice of the variables ~,,, k, (4. 2) holds; from this, by ( 4 . 6 ) a n d (4. 7), we have concluded that (4. 2) is true also in case ~,,k=~i,~ where the variables z/, are normally distributed; hence, again by (4. 6) and (4. 7), it followed that (4. 2) holds also for any variables ~,,~ satisfying the conditions of the theorem.

The proof of Theorems 10 and 11 is based on the same idea and, with suitable modifications, agrees step by step with the proof of Theorem 9.

It is sufficient to prove only Theorem 11, because this, as we have seen, includes Theorem 10 as a special case. It is unnecessary to detail the first part of the proof, therefore we shall deal only with the second.

Let us have again 1

P(~,~.k-~-- 4 .1 )~- - -P(~ ,~ ,k=- - l ) - - 2 ( k ~ - - l , 2 . . . . , N , , = n ; n = 1 , 2 , . . . )

and let us suppose that the variables ~,~, (k = 1, 2 . . . . , N~ = n) are mutually

Page 26: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

216 h. RENYI

independent. Then it follows by simple arguments, well-known in the theory of the random walk in the plane, that, if A = [x~/n] + 1 and B = [yVn] + 1,. then

(4.23) P ( - - y l / n < &,,k< xVn; k = 1 ,2 , . . . , n) =

i - B ~ k . ~ A ~ v = l

where

n k -~- if k ~ n (mod2) V k =

~0 in all other cases.

As, by the Moivre--Laplace theorem,

l i r a ( ~ ( n ) 1~ t - -

by simple calculation we obtain

(4. 24) lim P ( - - y g n < S,~,,~ < x / n ; k (21:+1)~n2

a t2

1 f e - g d t '

- b

~-~-1,2 . . . . , n ) =

X 4 ~ e ~.(x+,~)2 s i n ( 2 k @ l ) . w - -

x + y (x>O, y > O ) . I

k=0"~ 2 k + 1

Similarly to the case of Theorem 9, it follows that the limit of the proba- bility on the left side of (4. 4) is the same also in the general case.

Theorem 12 can be derived from Theorem 11 as follows. Owing to the: independence of S~,j• and Sn, k--&~,M~ (k>M~), by the relation

P ( max [S,,,kI<yB,O=P(--yB,~<(S.,~+(S,~,~--S~,~s Mn < !~ <= 5" n

we obtain, by virtue of the theorem on total probability, that

(4. 25) P ( max ]S,~,k I < Y B,,) :

As, in accordance with our restrictions, Lindeberg's condition is satisfied

k=~ - - Bn -~2, as n -+ ~ , there-

fore (uniformly in x in all finite intervals) we have x

(4. 26) l imp <x --- e - d u - - e gdt. . . . . t B,, l/2, Z

Page 27: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 217

Furthermore, by Theorem 11, considering the relation

D(&,G--S,~,ar,~) = VB,--A;, it follows that

(4.27) limP(--(xq-y)B~<S~,k--S,~,,~,~<(y--x)Bn; k = 1 , 2 , . . . , n ) ~--- 'g --~ Go

4 2 (2k+1)~,=o-,) e sy= sin(2kq- 1);r y - x

k=o 2kq- 1 2y '

if y ~ O or y > 0 but Ix ]>y .

Therefore, finally, we obtain

(4. 28)

if y > O and [x[<=y,

l i m P ( max l&,k l<yB,0~- -

(2k4-1)2 a= (1-~2) +y x~ 4 2 e sy~ f e "'__ rW f

- - yrk=o 2kq-1 V2rr~ sin (2k- l -1)yr dx. - y

Hence, by simple calculations, we obtain Theorem 12. In fact, +y x~

V~--~[ sin (2k q- 1)yr clx ~- g

) { 2, I (t (2k+1)i;~

=- (-- lyre V-~ ] [~ dt .

t~

Now, since the integral of e 2 on a > 0 and O is real, then

+a (t- ib)~ a - i b t ~ +a t~

; e - ~ f e 2 f e ~ d t - a - a - i b .-a

a t~ a2 b

- - f ~2 ~_ [ e "~ dtq-2e '~ e g s i n a v d v . 2 v2, o

Consequently,

any closed curve vanishes, therefore, if

a t2 a - i b t2

-=-~--~ dt + t ]/ 2 :r

~a b a

+y x 2

f e ~a~---sin(2kq-1)yrY~dy = V2n*

- g

( ; (2k+1)2 ~ ~ h2 2 t2

~ _ ( _ l ) , e s~ 1 ] f ~ e-~dtq t/~_~y k

This completes the proof of Theorem 12.

y2 (2k+1) @ ) i ~ v2 2e 2;.~ 2 ._,,.,~

e sin v dv . 0

Page 28: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

2 1 8 A. RENYI

w 5. Proof of the tests analogous to those of Kolmogorov and Smirnov

Let ~ be a random variable having the continuous distribution function F(x) and let ~1, ~2,...,~n denote the results of n independent observations for the value of ~, i.e. let ~1, ~ , . . . , ~ , be mutually independent random variables having the same continuous distribution function F(x). Let us denote the distribution function of this sample by Fn(x).

We shall prove the theorems formulated in w 3 by means of the method exposed in w 1, using the theorems of w 4.

1 * F * Let us put ~k~F(~k) and ~ . = l o g - - ~ S , further, ~ k = (~k) and

~ = R k ( ~ , ~2,.-., gn). In this case the variables r~k are uniformly distributed in the interval (0, 1) and their sample distribution function is

Gn (x) == F,~ (F -~ (x)) where y = U l ( x ) is the inverse function of x = F(y). But it is easily seen that

( F~(x)--F(x) ) G~(x)--x (5. 1) sup - - : sup ,

a<=F@) F ( X ) a ~ x ~ l X

therefore, instead of the variable on the left side of (5. 1), we may consider

the variable sup G.(x)--x identical with it. The variables ? ~ , - as we a ~ z ~ l X

have seen ' - - form a Markov chain. Further, we have seen that the variables 0 * * k+l = (n - -k ) (gk+l--~) are mutually independent and exponentially distributed with mean 1, i. e.

P (Ok < x) = 1 - - e -x (x > 0).

1 We have also seen how the variables ~ = l o g ~ may be decomposed in-

to sums of mutually independent random variables by means of the 6j's:

k d1 j:l n+l - - j "

Let us turn to the proof of Theorem 8. First of all, it is easy to see that instead of the relation (3.4) it is enough to prove

(5.2) ~}imP(Vnn sup G'(x)--x< ) 1 / ~ f -~ ~<=G~(~:) ~ y -= e ~ dt ( y > 0 ) . o

For, if [G,,(x)--x] <=i~, then from G,~(x) >= a+~ it follows that x>= G,~(x)--~>=a and thus

G~(x)--x G,~(x)--x sup => sup a ~ x X 6'~ (x) ~ a+~ X

Page 29: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 219

i .e. from sup G,~(x)--x< y it follows that sup G,~(x)--x< y . But if a ~ x X ] /1 l G n ( x ) ~ a + t X v11

A, A' and B are any events and ABc A'B, then

P (d ) = P(A B) + P(A B) ~ P(B) + P(A'B) <= P(B) + P(A').

Applying this inequality to the case when A is the event sup Gn(x)--x<'- Y o~x x Vn'

B is the event ] On (x) - - x l =< e, and, finally, A' denotes the event

G,~(x)--x< y we obtain s u p %~ (~) ~ a+e X V ~

P :nsupG"(x)--X<y <:P(lG,~(x)--xl>~)-+-P n sup G,~(x)--X<y.

It can be similarly shown that

P ( ~ n sup G~(x)--X<y)~ p ( [ G , ~ ( x ) _ x , > ~ ) + p O / n s u p a - ~ G n ( x ) X a ~ x

As lira P(I Gn(x)--x! > e) = 0

:it follows that if (5.2) is satisfied then /[ a+~

Y [ 1-a-e

l imP supG'~(x)--X<y ~ e-gdt n-+ o~ a ~ x X

0

and

Y |' 1- a+e

lira P sup G,~(x)--x < Y ~ e Y dr. o

Since ~ can be chosen arbitrarily small and the integral is a continuous -function of its upper limit, it follows that

P(VnsupG,~(x)--x ) ~ f - ~ l i m - - < y = e a ~ x X

o

Therefore, (3. 4) actually follows from (5.2) and thus, to prove Theorem 8, it is enough to show that (5.2) holds.

Further, we shall need the following relation:

{5.3) Vn sup G,~(x)--x_ Vn max - 1

This follows from the fact that G,~(x) is a constant lying between ~ and

~ , ~ ( x ) - x ) " <Y).

(~ > o),

Page 30: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

220 a. RENYI

~k+a, and so in any interval ~2k < < ~k+~ the supremum of G~,(x)--x G~(x) X x

is equal to k

G,,(~,:+O) 1 - - n 1.

Now, let us apply Theorem 9 to the sequence consisting of the following sequence of random variables:

dj.--1 n-l- 1 - - j ( j = 1,2 . . . . , [n(1--a)]+l); .

this sequence satisfies LINDEBERQ'S condition (and, moreover, even LIAPUN0V'S~ condition) if 0 < a < 1. Then, 'by (1.11), for any z > 0 , we obtain

z

(5.4) l im P max log - - < z g = e dt.

As in case of k>=an and 0 < a < l ,

~ T = l o g ~ - + O and ~.~ k~ a ~ k ~ n

from (5.4) one concludes

(5.5) lim P max log ~ < z =

z

2 dt;

0

] f l - - a therefore, finally, introducing the notation y = z - - we have a '

V T ( ) ) (5.6) l i m p Vn max log - -1 < y ~--- e -Td t . 0

In view of (5.2) and (5.3), Theorem 5 follows from (5.6). Let us now turn to the proof of Theorem 7. We may obtain the

random variable

(5.7) ~ = 7 n max log - - ~ . ] = V n max ~.~ n § a n ~ k ~ b ~ , l , . ~ k "F 1r a n ~ k ~ b n j = l

as the sum of to independent random variables *~ and T~ where

(5. 8) *~ = l/~ ~ 4.- i 1 ~ _ j ~ n + l - b ~ rt - ~ 1 - - j

and k

(5.9) ~ = Vn max .~ ' g - 1 ~_-<~,+~ ~_<b,y=l n-t- l - - j "

Page 31: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 221

It is evident that in the limit ~ is a normally distributed

with the standard deviation V l o b ; further, from the proof of

we can see that

lim P(% ]fb < z) = e- v. dt o

(5. lo)

Considering further that % and -v2 are and (5.10) that

random variable

Theorem 5,

independent, it follows from

(z > 0).

(5.8)

(5.13)

k For, if Tj~+I <-~, then

however, * k ~i~+1 ~ n ' then in case k ~ an, ~+~ >= a and thus

I k k n n

- 7 - - - - 1 ~ 1 - - - < 1 i ~2~+1 ~/~+1 - -

C o n s e q u e n t l y , in either case

k n ~+~ -- I

k + l n 1

n ~ k + l

k + l n

~k+l

1

=<1 ~f+; ~ +Y5

1 1 O,~(x)--x [/n max - ~ - - ~ V n sup x =<

a n - - ~ ; ~ n /~k a ~ G n(x)

~ V n max k - - 1

k k n n

~lk+l ~k+l

1 . 2 f _ _ _

V~ k + l

n - - - - 1 <

~'~k+l =

k + l n

~k+l

+- - -

1 ; if,~

1 a n

(k >= an)

from which it follows

Y _ / a b (Y ~0]/~-77 ~

(5. l l ) l i m P ( - c < y ) - ~ - ~ ~ e 2(1-o e 2 dtdu. -ce 0

This completes the proof of Theorem 7. In the same way we can also prove Theorem 6. Here the relation (5.3}

is replaced by

(5.12) Vn sup G~(X-)x--X =][n max ~-~-- n _ 1 a ~ On(a: ) an ~ ~ ~ n

Page 32: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

2 2 2 A. RENY!

k-t-]

and since max a n ~ - k ~ n /~k+l

3. 3. max - - - - 1 -w-- - - 1

The limiting distribution of the variable

1 N max ~--kn, - a n ~ k ~ n ] ~k

]In max

k

}/n. sup a ~ Z.'(x)~b

is identical with that of the variable

1 , w e get

k 3- max -~

1 - 1 + - all

eccurring here is identical with that of the variable

Vn max l o g - - 1 = l / n max n + l - j ' an ~ - k ~ n 2~k v=k -~ - a n ~ k ~ n j ~ l

which can be determined by means of Theorem 10.

To prove Theorem 8, those steps have to be applied simultaneously which have been used in the proof of Theorems 6 and 7; in this proof we shall use Theorem 12 instead of Theorem 10.

The basic idea of the proof is as follows. The limiting distribution of the variable

F,,(x)--F(x) F(x)

k 3.

z~- - ] fn max - -1 an--<k--< b, r/~:

.and therefore it is identical also with that of the variable

k 3.

]/n max '~2 ~ ' ' ' r

z~ = ]/n max log-~ =

Thus Theorem ]2 is applicable, namely, since the values of the constants &, and B,, occurring in it, are as follows:

1 - - a A , c ~ } / ~ + o ( 1 ) and B,~--~+o(1), therefore

A,, ] (~O--b)

Page 33: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 223,

and thus introducing the notation &,,r =__~@ d 3 - I n + 1--j' we have

) = l i m P max [Sn,, . l<y ~ B , ~ =

n+oo kn+l-bn~r~n+l-an _ (2k+1)2~ ~ 1-a c o

- - : r ( - - 2 k - t - 1 1 e d u + o~

where / ~ bye" (2 k+l) ~--~

2 e- ~ 2 ~i2(l_b )

el~ = ] / ~ y f e "~:f sin u du. 0

This completes the proof of Theorem 8.

w 6. Remarks on the limiting distribution functions occurring in Theorems 5--8

The values of the limiting distribution function occurring in Theorem 5 �9 may be read from the tables of the normal distribution function. The values

of the limiting distribution function occurring in Theorem 6 can be computed

7 by substituting the values z = y ~ into the function

(6.-1) L(z)__ 2L (_1)~ e x p 4 ~ - ~ 8z'2 ) (z > 0). ~ k=o 2 k + 1

(Vg At the end of this paper, we give the table of the function L y ~ forcer-

tain valuesofa. The curve of the distribution function L y ~ can for

certain values of a be seen on Fig. 1. The values of the limiting distribution function occurring in Theorem 7

can be approximatively computed in the following manner:

(6.2) F(y,a,b) i f - ~ - ~ f "~ f f - - - - e 2 e T d v d u = 1 e - '~+v~ - ~ dudv -~ 0 T

Page 34: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

"224 A. RI~NYI

where T is an infinite triangular domain in the plane (u, v) defined by the following inequalites :

i 1/o (6.3) T: - - ~ < u < y l - - b ; -- -- 1 - - b "

Introducing the polar coordinates r - - l [ ~ + v ~, cf = arc tg v - - we obtain // 7g-f%

1 [1 - -e 2(1-a)sin~(ep+cx)) d~ (6.4) F(y, a, b) ~ - ~

o

z ~ t

2;;,

IJT-~ ~ o, oa

11"~ 4o~ / ~ . . . 7 " a l /

~ ts 2o 2s 3o,~404g..V 8'0 m ,r gb fog

Fig. 1.

y 7g. a ( 1 - - b ) and O< c~< ~ - , therefore where t g c ~ = - b - - a

7~

1 1 - 2 0-~)=i~0) (6.5) F(y , a, b) = ~ - - e dfl.

As

(6.6) ~ - - exp - - 2(1 - - a ) sin~fl) J + o o

we have finally the following approximative expression:

Y 1//~_-~ arc ltg V a(1 - -b )

(6.7) F(y , a, b) = e: =2 d u - - .7/7 0

(~ - R )

Page 35: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 225

where t~

(6.8) R = exp 2 ( 1 - - a ) sin ~ dfl and a==arctg a(lb_a--b) o

If 1 - - b - - s is small, then R is, in most eases, negligible, except for extre- mely small values of y, since

(6.9) R ~ exp(--2(1 aYi2sin2c/)=exp (2(~Y_~_b}).

The values of the limiting distribution function occurring in Theorem 8 can be approximatively computed in the following manner: it follows from the second mean value theorem of the integral calculus that

(2k+1)

(6. lO) I~)kl= e 2b~ s inu du 0

bY~ Y 2e- 2(1-o 1 - - b < O ((2k-I-1)2:u~(1--'O)) = 2 ~ y exp t 8by 2 "

In this way, using the notation (6. 1) the limiting distribution function ~occurring in Theorem 8 can be expressed in the following form

where, as is seen by way of a simple calculation.

22 e- 2(1 b) 1 -t- exp 8 ~ ) (6. 12) A < log

:u~(b--a)] ' ~ 2]/f~y 1--exp ~ )

whence it is readily seen that if b is very near to l, A is negligible. Observe that the first factor of the main term depends only on a, the second only on b; this fact simplifies the computation to a great extent; namely, because we can obtain the first factor from the table of L(z), the second from the table of the normal distribution function.

INSTITUTE FOI~ APPLIED MATHEMATICS

,OF THE HUNGARIAN ACADEMY OF SCIENCES.

(Received 30 October 1953)

Page 36: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

226 A. RENYI

Bibliography [1] K. PEARSON, Note on Francis Galton's problem, Biometrika, 1 (1902), pp. 390--399. [2] L. v. BoRxalEWlCZ, Variationsbreite und mittlere Fehler, Sitzungsber. Berl. Math. Ges.~

21 (1922), pp. 3--11. [3] E. L. Dooo, The greatest and the least variate under general laws of error, Trans.

' Amer. Math. Soc., 25 (1923), pp. 525--539. [4] L. H. C. TIPPET, On the extreme individuals and the range of samples taken from a

normal population, Biometrika, 17 (1925), pp. 264--387. [5] M. FR~CHET, Sur la loi de probabilit~ de l'6cart maximum, Ann. Soc. Polon. Math.,.

6 (1928), pp. 92--116. [6] a) A. N. KOLMOOO~OV, Sulla determinazione empirico di una legge di distribuzione,.

Giornale dell'lstituto Italiano d. Attuari, 4 (1933), pp. 83--91. b) A. N. KOLMO~O~OV, Confidence limits for an unknown distribution function, Ann.

Math. Stat., 12 (1941), pp. 461--463. [7] V. I. GUVENKO, Sulla determinazione empirica delle leggi di probabilith, Giorn. Ist. Ital..

Attuari, 4 (1933), pp. 92--99. [8] a) N.V. S~mNOV, l]ber die Verteilung des allgemeinen Gliedes in derVariationsreihe, Metron,~

12 (1935), pp. 59--81. b) H. B. C M U p , O B, Sur la distribution de ~o~, Comptes Rendus Ac. Sci., Paris, 202

(1936), pp. 449--452. c) H. B. C M . p n o B, O pacnpe~eaeHHH oJ-%KpHTepn~ M.3eca, M a T e M. C 6 o p . . K,

2 (44) (1937), pp. 973--994. d) H. B. C M u p n o u, O6 y~aoneun~x 9MnHpH~ecKo~i I~pI4BO~i pacnpexenenHn, M a T e M.

C 6 o p a H K, 6 (44) (1939), pp. 3--26. e) H. B. C M u p n o B, OL~em<a pacxon<~eni~ Men<~y ~MnI~pn~ecKHMn Kp~BL~MH pacnpe-

~enenn~ B ;~Byx HeaaBHC~M~IX B~16opKax, B to an . M 0 c K. Y U n B., 2 (1939), pp. 3--14.

f) H. B. C ~ n p u o u, l-lp~6a~n<eHne aai<oHoB pacnpe~enen~ cnyuafi~lx uen~nH no 3Mnupl4qeCl<uM xaHnI~I/~, Y C n e x ~i M a T. H a y K, 10 (1944), pp. 176--206.

g) H. B. C ~ n p H O ~, FIpe~enbnbie ~a~on~I pacnpe~eaenn~ Xn~ ~aeHou Bapna~noHHoro p~/{a, T p y ~ , ~ M a T . HHCT. C T e ~ n o B a , 25 (1949), pp. 1--60.

[9] B.V. ONEDENKO, Limit theorems for sums of independent random variables, Transactions Amer. Math. Soc., 45 (1951), pp.

[10] B. B. F n e ~ e n e o .n B. C. I<opon~ot< , O Ma~<caMan~noM pacxon<Aeni~e /~ByX: a~nnpn~ecKnx pacnpe~enennfi, ~ o K ~. A ~ a ~. H a y K C C C P, 80 (1951), pp. 525--528.

[11] B. B. F H e g e a ~< O ~ E. J1. P B a ~ e ~ a, 06 o~nofi ~aua~e cpasHenn~ ~ y x ~ n n p n - . ~ecenx pacnpe~eaennfi, ~ o ~< n. A i< a A. H a y ~< C C C P, 82 (1952), pp. 513--516..

[12] B. B. F n e ~ e n ~ < o n B. C. M ~ a x a a e B n ~ , ~se Teope~si o noBe~ennfi ~ n n p ~ - ~ecKnx 0pyn~xmfi pacnpe~enennn, 1~ o K a. A i< a X. H a y K C C C P, 82 (1952), pp. 841--843 and pp. 25--27.

[13] B. C. M n x a n e B n q, O BaaHMnOM pacnonon<ennfi ~syx ~Mnnpn~ec~nx qbyHi<tI!4fi pac- npexenenrm, ~ o K n . Ai<a~. H a y ~ C C C P , 85 (1952), pp. 485--488.

[14] I4. ~. I( Bn T, O Teope~ae H. B. CMnpHoBa oxuocnxen~no cpaunenn~ ~ByX Bbl6OpO~, ~ O K n . A K a n . H a y K C C C P , 71 (1950), pp. 229--231.

[15] F. M. M a H n ~, O6o6menne KpnTepn~ A. H. goaMoropoBa ~a~ ot{en~n uax<ona pacnpe- ~eaenn~ no oMnnpn~ec~<n~ ~aHn~i~, ~oI<n . A~<a~. H a y K C C C P , 69 (1949), pp. 495--497.

[16] I/I. I/I. F n x ~ a H, O6 o~nnpnqec~ofi ~yn~uf i pacnpe~eaeun~ B cny~ae r p y n n n p o B ~ ~anns~x, ] l o ~ n . A ~ a ~ . H a y ~ C C C P , 82 (1952), pp. 837--840.

Page 37: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 227

[171 W. FELLER, Off the Kolmogorov--Smirnov limit theorems for empirical distributions, Ann. Math. Statistics, 19 (1948), pp. 177--180.

[18] J. L. DooB, Heuristic approach to the Kolmogorov--Smirnov theorems, Ann. Math. Statistics, 20 (1949), pp. 393--403.

[19] a) F. J. MASSEY, A note on the estimation of a distribution function by confidence limits~ Ann. Math. Statistics, 21 (1950), pp. il6--119.

b) F. J. MASSEY, A note on tlle,~9~gr of a non-garametric test, mAnn. Math. Statistics, 2 1 (1950), pp. 440--443.

C) F. J. MASSEY, Distribution table for the deviation between two sample cumulatives. Ann. Math. Statistics, 23 (1952), pp. 435--441.

[20] a) M. D. DONSKEIr Justification and extension of Doob's heuristic approach to the Kolmogorov~Smirnov theorems, Ann. Math. Statistics, 23 (1952), pp, 277--281,

b) M. D. Dor~saER, An invariance principle for certain probability limit theorems, Memoirs Amer. Math. Soc., 6 (1951), pp. 1--12.

[21] T. W. ANDERSON--D. A. DARLINg, Asymptotic theory of certain ,,goodness of fit" criteria based on stochastic processes, Ann. Math. Statistics, 23 (1952), pp. t93--212

[22] P. ERI)6s--M. KAC, On certain limit theorems of 1he theory of probability, Bull. Amer Math. Soc., 52 (1946), pp. 292--302.

[23] S. MALMQU1ST, On a property of order statistics from a rectangular distribution, Skand. Akluerietidsskrift, 33 (1950), pp. 214--222.

[24] G. HAj6s and A. R~Nw, Elementary proofs of some basic facts concerning order sta- tistics, Acta Math. Acad. Sci. Hung., 5 (1954) (under press).

[25] H. CRAM~R, Mathematical methods of statistics (Princeton, 1946). [26] JI. H. B p a r H n c K u, OnepaTn~nb~fi CTaTnCTnqecKnii KOnTpO/Ib r~aqecTBa B ~aamnnoc-

Tpoennn (MoCX<Ba, 1951), Maturn3. [27] A. WALD, Limit distribution of the maximum and minimum of successive cumulative

sums of random variables, Bull. Amer. Math. Soc., 53 (1947), pp. 142--153. [28] A. WALD, On the distribution of the maximum of successive cumulative sums of inde-

pendently, but not indentically distributed chance variables, Bull. Amer. Math. Soc., 54 (1948), pp. 422--430.

[29] K. L. CHVNO, Asymptotic distribution of themaximum cumulative sum of independent random variables, Bull. Amer. Math. Soc., 54 (1~,8), pp. 1162--1170.

[30] S. S. W~L~S, Order statistics, Bull. Amer. Math. Soe., 54 (1948)," pp. 6--50.

15 Acta Mamematlca

Page 38: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

228 A. RI~NYI

$ c~

~ " ~ b , , . . , ~ " ~ a ~ r '',- ~ " ~ ' ~ C ~ ' ~ ' ~ O

I

0 ~ d d d ~ d d d d d d d d ~ d d d d d ~ d d d d d o d ~ d d

Page 39: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 2 2 9

r

�9

.r

e- ,

r >.

Page 40: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

230 A. RENYI

K T E O P H H B A P H A I ~ H O H H b l X P~]],OB

A. PEHBH (ByaanemT)

lira P lVn sup n-~.-m \ O ~ a ~ F ( ~ ) ~ l

T e o p e M a 6.

( P e a m ~ a e )

l leab HacTomReii CTaTbrl- Haao~euue HOBOFO MeT0~a, C noMOtttbm ROTOpOFO MO)KHO' npOCTb~M y CncTeMaTnqec~nM 06pa3oM ~ocTpOnTb Teopnm Bapnattnonnblx p~XOB H /Io~a3aTb P~B; HOBblX TeopeM. Cym, HOCTb MeToBia COCTOHT B TOM, qTO HccneaoBanne npeaeabm,~:('pac- n p e R e a e g n ~ : B e a n q n H , aasncamnx OT qaeHOB B a p t i a R H o g H o r O w a a CBO,~IITC$I 1~ nccaeaoBanm~ pacnpeaenengn (pyn~Ltni~ OT cyMM HeaaBncnm,~x CayqafiHblX Re~nqtm. OTOT MeTOR nCXO~HT Ua qbaKTa, ~OTOpb~fi nepB~[M aa~eTnn A. H. 1{ o a M o r o p o s n csoeii pa60Te [6a],i qTO qaeHh~ sapnatmouuoro pnaa o6paaymT Lteg~ M a p K o aa. Boaee T0rO, Car aXo~(haan S. Malmquist, ecan #~ (k ~ 1, 2 , . . . , n)--pacno:1ox<eunbm s soapac~amtae~ nopuaxKe qaeHH aae~enToa ~ ( k ~ 1,2 . . . . , n) Bb~6op~n o6"~e~a n na CTaTgCTnqecKofi CO~ot(ynnocTn c uenpep~iBHoi~ ~0yn~<tmefi pacnpeaeaenne F(x) n @ ~ F(#~) (k ~ 1, 2 . . . . . n), TO ~e.rmqnn~,l

(~)'~(k--" = 1,2 . . . . . n) ~IB01~IIOTC~t snoal-le neaaancn.~,Mn n B nuTep~aae (0, 1)pasnoMepuo

pacnpe/~eaeHHblMn cnyqafinbtMn Benuqunamn, n rIOOTOMy BenHqHHH ~ z ~ l o g ~*~ o6paaymT a;lanTuBHym i~en~ M a p u o B a. IIpocToe aouaaaTea~cTBo aToro r aaHo B w 1. w 2 c o ~ e p ~ n T H3~IO~,I(eHHe n p u M e H e H n ~ DTOFO ~ a ~ T a ~ NpOCTOFO ~ 0 K a 3 a T e a b C T B a HeI(OTOpbIX

u3BecTnblX Teope~l Teopu~i ~apnaLmonn~ix pUaOB. B w 3 cq~op~ynupoBan~ caeg~ymmne uosue peaynbTaTbI, HoayqeHnbie c no~ou~bro HO~OrO MeTo~a.

HyCTb Fn (x) o3naqaeT aMnnpnqecwro r pacnpeaeaennu B~6opun, T. e. noao~H~ k , ,

F ~ ( x ) = - ~ /Iau 2k<:x<~zc+~ ( k - - l , 2 . . . . . n--1), P;~(x)=O aau x < ~ u F,~(x) I

aau #;~ ~ x. Toraa nMee~

T e o p e ~ a 5.

I O ~a~ y < O.

(2 k+l)2 a~ (1-t~) "~ 8 y2 C~

IV" sup ]F~dx)--F(x), < Y ) 1 4 k ~ = O = ~e t - n • (--1) - 2k- t - 1 lira P ,~-.~ ~ o < ~ _ F ( ~ , ) ~ l ] F(x)

LO ;ln~ y ~ O .

T e o p e M a 7.

~a~ y > O,

~ 1-b ~ z b - a

lim P n sup F~,(x)--F(x) < y __ 1 e--~

0

Page 41: On the theory of order statisticsblog.thegrandlocus.com/static/misc/Renyi_1953.pdfON THE THEORY OF ORDER STATISTICS 193 if X > 0 and y-->_ 0. 2 This property characterizes uniquely

ON THE THEORY OF ORDER STATISTICS 231

T e o p e M a 8.

v<:t F,,.(x)--F(x) lira P } n sup . . . . o o <-a -< F(,:) --< F(x)

(2/;+1) -~ a - (t - a )

) 2, 4 ~ k e E~- < Y = 7 . = - -1) 2 k + 1

~a~ y ~ 0,

b y2 (2k+1)

r g t e E z : 1 2 f ,, ~ f _ (1-b)u'-' = - - - - 2du-[- V ) e sinudu. 1'2~ e- 2e 2(1-b) 2~,y2

2 a ( " y

~TH TeopeMb~, KOTOpble aHa.rlOFHqHbl HgBeCTHblM TeoperaaM H. B. C M n p n o B a n A. H. I( o a M o r o p o ~ a, aamT KpnTepnn ~aa rnnoTea OTnOCnTeabno F(x), COOTK ,~at0T ;~0Bepn.Teat, nI~le rpatmt~bi~as~ nenaBeCTnO~i ~yni~tmi~ F(X).

~0KUaaTe.at,r xeopeM Co/~ep~i,~TC;I:' B' w 5 , n onupae;rc;~,, KpoMe ynoM~nyTOrO MeTO~a, .Ha HeKOTOpr~XX HOBBIX npe~eab~.sIx TeopeM, u~.no~r B, w O~HOI3I/1TeaBHO ~a~cnMy~aa qaCTHbIX CyMM IIOC.rle,/~OBaTedIbHOCTeH HeSaBHCI, IMI-,IX cay~a~inux Beawmn. w co/lep~nT ne~oTOpb~e aaMe,mnn~ OTHOCHTedlBHO BBIqHCaeHH~I Flpe~edlbHblX qbyH~Rn~ pac- npe~eaenn~,~nvypnpymmne n xe0pe~ax 5--8. B ~OHRe'~CraTrl aa.Ha Ta6antta ananennfi

L ( z ) : 4 Z ( - - 1 ) " e 8~= rae z = y ] / a I;=o 1 - - a

~a~ pa3aHqHbiX 3HaqeHH~ OT y H O.