
On the Distribution of the Cauchy Maximum-Likelihood Estimator
Author(s): Peter McCullagh
Source: Proceedings: Mathematical and Physical Sciences, Vol. 440, No. 1909 (Feb. 8, 1993), pp. 475-479
Published by: The Royal Society
Stable URL: http://www.jstor.org/stable/52247


On the distribution of the Cauchy maximum-likelihood estimator†

BY PETER MCCULLAGH

Department of Statistics, University of Chicago, 5734 University Avenue, Chicago, Illinois 60637, U.S.A.

The two-parameter Cauchy maximum-likelihood estimator $T(y) = (T_1(y), T_2(y))$ is known to be unique for samples of size $n \ge 3$ (J. Copas, Biometrika 62, 701-704 (1975)). In this paper we exploit equivariance under the real fractional linear group to show that the joint density of $T$ has the form $p_n(\chi)/(4\pi t_2^2)$, where $\chi = |t - \theta|^2/(4 t_2 \theta_2)$. Explicit expressions are given for $p_3(\chi)$ and $p_4(\chi)$ and the asymptotic large-sample limit. All such densities are shown to have the remarkable property that $E(u(T)) = u(\theta)$ if $u(\cdot)$ is harmonic and the expectation is finite. In particular, both components of the maximum-likelihood estimator are unbiased for $n \ge 3$, $E\log(T_1^2 + T_2^2) = \log(\theta_1^2 + \theta_2^2)$, $E(T_1^2 - T_2^2) = \theta_1^2 - \theta_2^2$, $E(T_1 T_2) = \theta_1 \theta_2$ for $n \ge 4$, and so on.

1. Equivariance and the Möbius group

Let $Y_1, \ldots, Y_n$ be independent and identically distributed Cauchy random variables with density
$$f(y; \theta) = \frac{\theta_2}{\pi |y - \theta|^2}. \qquad (1)$$

In this expression $y$ is a real number, and $\theta = \theta_1 + i\theta_2$ is the Cauchy parameter represented as a complex number. The parameter space is here taken to be the complex plane in which complex conjugate pairs of points are identified. This set, which is isomorphic to the upper half plane, is denoted by $\Theta$ or $U$. For brevity, we write $Y \sim C(\theta)$ meaning that $Y$ has the Cauchy distribution (1) with median $\theta_1$ and probable error $|\theta_2|$.
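As a numerical aside (not part of the paper), this parametrization can be checked by simulation: a $C(\theta)$ variable is obtained by inverting the Cauchy distribution function, and the sample median and semi-interquartile range then estimate $\theta_1$ and $\theta_2$. The parameter values and sample size below are arbitrary illustrative choices.

```python
import math
import random

random.seed(1)
theta1, theta2 = 0.3, 1.2   # arbitrary Cauchy parameter (illustrative choice)

# Invert the Cauchy cdf F(y) = 1/2 + arctan((y - theta1)/theta2)/pi:
# if U is uniform on (0, 1), theta1 + theta2*tan(pi*(U - 1/2)) ~ C(theta).
n = 200_000
ys = sorted(theta1 + theta2 * math.tan(math.pi * (random.random() - 0.5))
            for _ in range(n))

median = ys[n // 2]                                   # estimates theta1
probable_error = (ys[3 * n // 4] - ys[n // 4]) / 2    # estimates theta2
```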

It is a property of the Cauchy family that if $Y \sim C(\theta)$, then
$$\frac{aY + b}{cY + d} \sim C\!\left(\frac{a\theta + b}{c\theta + d}\right) \qquad (2)$$
for all real numbers $a, b, c, d$ with $ad - bc \ne 0$. In other words, the Cauchy family is closed under the action of the real Möbius group, also known as the real fractional linear group, and the parameter, when represented as a complex number, is equivariant. The group (2) is henceforth denoted by $G$.

Let $T = T_1 + iT_2$ be the maximum-likelihood estimator of $\theta$ based on observations $y$. It is an automatic consequence of (2) and the properties of maximum-likelihood estimators that
$$T\!\left(\frac{ay + b}{cy + d}\right) = \frac{aT(y) + b}{cT(y) + d}.$$
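This equivariance can be illustrated numerically. The sketch below is mine, not the paper's: it computes the Cauchy maximum-likelihood estimate by a standard reweighting fixed point (whose fixed points are exactly the roots of the likelihood equations); the sample, the group element, and the iteration count are ad hoc choices.

```python
import random

def cauchy_mle(ys, iters=2000):
    """Cauchy MLE t = t1 + i*t2 via the reweighting fixed point:
    t1 = sum(w*y)/sum(w), t2^2 = n/(2*sum(w)), w = 1/((y-t1)^2 + t2^2)."""
    ys = sorted(ys)
    t1 = ys[len(ys) // 2]                                  # start: sample median
    t2 = max(1e-6, (ys[3 * len(ys) // 4] - ys[len(ys) // 4]) / 2)
    for _ in range(iters):
        w = [1.0 / ((y - t1) ** 2 + t2 ** 2) for y in ys]
        sw = sum(w)
        t1 = sum(wi * y for wi, y in zip(w, ys)) / sw
        t2 = (len(ys) / (2 * sw)) ** 0.5
    return complex(t1, t2)

random.seed(2)
y = [random.gauss(0, 3) for _ in range(7)]   # any real sample will do
a, b, c, d = 1.0, 2.0, 1.0, 5.0              # an element of G, ad - bc = 3

t = cauchy_mle(y)                            # T(y)
t_g = cauchy_mle([(a * yi + b) / (c * yi + d) for yi in y])   # T(gy)
g_t = (a * t + b) / (c * t + d)              # g T(y); should equal T(gy)
```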

† This paper was accepted as a rapid communication.

Proc. R. Soc. Lond. A (1993) 440, 475-479. © 1993 The Royal Society. Printed in Great Britain.


In other words, if the components of $y$ are transformed by $g \in G$, the maximum-likelihood estimate obtained from $gy$ is the same as $g$ applied to the estimate based on $y$. We write $T(gy) = gT(y)$ and say that $T$ is equivariant under $G$. Apart from the maximum-likelihood estimator, most other estimators, based on the sample median and sample probable error or semi-interquartile range, are equivariant under the location-scale group ($c = 0$), but not equivariant under the larger fractional linear group.

We now sketch a proof of the claim that the density of any equivariant estimator $T$ with respect to Lebesgue measure on $U$ must have the form
$$p_n(\chi)/(4\pi t_2^2) \qquad (3)$$
for some function $p_n(\cdot)$ on the positive real line. It is enough to show that, under the action of $G$ on $U \times \Theta$,
$$gt = t' = \frac{at + b}{ct + d}, \qquad g\theta = \theta' = \frac{a\theta + b}{c\theta + d}, \qquad (4)$$
the following properties are satisfied.
(i) $dt_1\,dt_2/t_2^2$ is an invariant measure on $U$.
(ii) $\chi = |t - \theta|^2/(4 t_2 \theta_2)$ is a maximal invariant.
To prove (i) it suffices to show that

$$\int_S dt_1\,dt_2/t_2^2 = \int_{gS} dt_1'\,dt_2'/t_2'^2$$
for measurable sets $S \subset U$. This follows from the fact that the jacobian $|\partial(t')/\partial(t)|$ is equal to $(t_2'/t_2)^2$.

To prove (ii) we first note that $\chi$ is invariant by virtue of the fact that it is the cross-ratio of the points $(t, \theta, \bar t, \bar\theta)$. One way to show that it is a maximal invariant is to show that there exists a $g \in G$ such that an arbitrary pair $(t, \theta)$ can be transformed to $(\lambda i, i)$ with $\lambda \ge 1$. This set of points forms a cross-section of the $G$-orbits, so that all invariants must be a function of $\lambda$. The required result then follows from the fact that $\chi = (\lambda - 1)^2/(4\lambda)$ is a 1-1 function of $\lambda$.
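The invariance half of (ii) is easy to check numerically. The following sketch (not from the paper) applies random elements of $G$ to a fixed pair $(t, \theta)$, mapping the images back into the upper half plane when $ad - bc < 0$, and confirms that $\chi$ does not change:

```python
import random

def chi(t, theta):
    return abs(t - theta) ** 2 / (4 * t.imag * theta.imag)

random.seed(3)
t, theta = complex(1.7, 0.8), complex(-0.4, 2.1)   # arbitrary pair in U x U
vals = [chi(t, theta)]
for _ in range(200):
    a, b, c, d = (random.uniform(-2, 2) for _ in range(4))
    if abs(a * d - b * c) < 1e-3 or abs(c * t + d) < 0.1 or abs(c * theta + d) < 0.1:
        continue                       # skip near-singular cases
    gt = (a * t + b) / (c * t + d)
    gtheta = (a * theta + b) / (c * theta + d)
    if gt.imag < 0:                    # ad - bc < 0 sends both points to the
        gt, gtheta = gt.conjugate(), gtheta.conjugate()   # lower half plane
    vals.append(chi(gt, gtheta))

spread = max(vals) - min(vals)         # should be numerically zero
```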

2. Explicit form for the marginal density of T

One way to calculate the exact marginal distribution of $T$ is to begin with the joint density of $(Y_1, \ldots, Y_n)$ and to make a non-singular transformation to $(T, A)$, where $A$ is a suitable $(n-2)$-dimensional complementary statistic. If we take $a_j = (y_j - t_1)/t_2$, the so-called configuration statistic, it is possible to express any two components in terms of the remaining $n-2$ components. The marginal density of $T$ is then obtained by integrating out the complementary variable $a$. This method works in principle for any location-scale family but it is rarely feasible to implement. However, we have been unable to find a better method.

If $Y$ has density $\theta_2^{-1} f((y - \theta_1)/\theta_2)$ then the joint density of $(T, A)$ is
$$\frac{t_2^{\,n-2}}{\theta_2^{\,n}} \prod_{j=1}^{n} f\!\left(\frac{t_1 + t_2 a_j - \theta_1}{\theta_2}\right) J(a),$$
where $J(a)$ is a jacobian that depends on the expression for $(a_{n-1}, a_n)$ in terms of


$(a_1, \ldots, a_{n-2})$. In the case of the Cauchy distribution, the configuration statistic has a particularly simple form when transformed on to the unit circle via
$$z_j = \frac{1 + i a_j}{1 - i a_j}.$$

The likelihood equation is then $\sum z_j = 0$ (McCullagh 1992a). Given values $z_1, \ldots, z_{n-2}$ satisfying $|\bar z| < 1$, where
$$\bar z = (z_1 + \cdots + z_{n-2})/2,$$
the remaining two points are $-\bar z\,(1 \pm i(|\bar z|^{-2} - 1)^{1/2})$ provided that $\bar z \ne 0$. In particular, for $n = 3$ the configuration vector is necessarily of the form

$$(z_1, \omega z_1, \omega^2 z_1) \quad \text{or} \quad (z_1, \omega^2 z_1, \omega z_1),$$
where $\omega = \exp(2\pi i/3)$. On the original scale, the configuration is

$$\left(\frac{a + \sqrt 3}{1 - a\sqrt 3},\; \frac{a - \sqrt 3}{1 + a\sqrt 3},\; a\right) \quad \text{or} \quad \left(\frac{a - \sqrt 3}{1 + a\sqrt 3},\; \frac{a + \sqrt 3}{1 - a\sqrt 3},\; a\right).$$

Taking the first of these, the jacobian of the transformation from $(y_1, y_2, y_3)$ to $(t_1, t_2, a)$ is
$$t_2 J(a) = 6\sqrt 3\,(a^2 + 1)^2\, t_2/(1 - 3a^2)^2.$$

For $\theta = (0, 1)$, the joint density of $T$ is thus
$$\int \frac{2\, t_2 J(a)\, da}{\pi^3\{1 + (t_1 + t_2 a_1)^2\}\{1 + (t_1 + t_2 a_2)^2\}\{1 + (t_1 + t_2 a_3)^2\}}, \qquad (5)$$
the additional factor of 2 coming from the two distinct configurations. The integral runs from $-\infty$ to $\infty$, but with a factor of three can be made to run from $-1/\sqrt 3$ to $1/\sqrt 3$. It is by no means obvious that the density thus obtained has the form (3). However, after some considerable simplification the joint density is obtained in the form
$$\frac{1}{4\pi t_2^2} \cdot \frac{3\sqrt 3}{\pi(1 + 3\chi + 3\chi^2)},$$

so that $p_3(\chi)$ is given by the second factor. For $n = 4$ the configuration is necessarily a permutation of $(z_1, z_2, -z_1, -z_2)$. On the original scale this translates to $(a_1, a_2, -1/a_1, -1/a_2)$. From this we obtain a double integral, analogous to (5), with $J(a) = 2|(a_1 - a_2)(1 + a_1 a_2)|/(a_1^2 a_2^2)$. Remarkably, this integral can be evaluated explicitly, giving
$$p_4(\chi) = \frac{12 \log(1 + 2\chi)}{\pi^2 (\chi + 1)(2\chi + 1)}.$$
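As a sanity check (mine, not the paper's), both densities integrate to one. The quadrature below uses the substitution $u = \chi/(1+\chi)$ to map $(0, \infty)$ onto $(0, 1)$:

```python
import math

def p3(x):
    return 3 * math.sqrt(3) / (math.pi * (1 + 3 * x + 3 * x * x))

def p4(x):
    return 12 * math.log(1 + 2 * x) / (math.pi ** 2 * (1 + x) * (1 + 2 * x))

def total_mass(p, m=100_000):
    # midpoint rule in u = chi/(1 + chi), so d(chi) = du/(1 - u)^2
    return sum(p(u / (1 - u)) / ((1 - u) ** 2 * m)
               for u in ((k + 0.5) / m for k in range(m)))

i3 = total_mass(p3)   # should be close to 1
i4 = total_mass(p4)   # should be close to 1
```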

One strength of the preceding derivation is that an explicit closed form expression for $T$ is not required. Somewhat surprisingly, closed form expressions are available for $n = 3$ and $n = 4$ (Ferguson 1978), but these are of no help in obtaining the density. The principal weakness in our derivation is that, although the answer is known to have the form (3), intermediate calculations such as (5) make little use of this fact. It is only at the final step that the density miraculously turns out to depend only on $\chi$. I have not yet found a way of exploiting the known form of the marginal density to simplify intermediate calculations in a substantive way. Nor have I succeeded in finding $p_n(\chi)$ explicitly for values of $n$ greater than 4.
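As a check on the configuration geometry used above (an illustration of mine, not from the paper), the recovery of the two remaining unit-circle points from $z_1, \ldots, z_{n-2}$ can be verified directly; the first $n-2$ points below are arbitrary unit-modulus choices.

```python
import cmath

zs = [cmath.exp(1j * phi) for phi in (0.7, -1.2, 2.9)]   # z1..z_{n-2}, n = 5
zbar = sum(zs) / 2                                       # here 0 < |zbar| < 1

s = (abs(zbar) ** -2 - 1) ** 0.5                         # real and nonnegative
last_two = [-zbar * (1 + 1j * s), -zbar * (1 - 1j * s)]  # the recovered points

moduli = [abs(z) for z in last_two]      # both should equal 1
residual = abs(sum(zs) + sum(last_two))  # likelihood equation: sum of all z_j = 0
```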


For large $n$ it is known that
$$-(1/n) \log p_n(\chi) \to \log(1 + \chi)$$
(McCullagh 1992b). Simulation results indicate that
$$p_n(\chi) \approx (n - 2)(1 + \chi)^{-n+1}$$
is reasonably accurate for moderate values of $n$, suggesting that $p_n(\cdot)$ possesses moments up to, but not including, order $n - 2$.
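For the approximating density $q_n(\chi) = (n-2)(1+\chi)^{-n+1}$ the moment claim can be made exact: $\int_0^\infty \chi^k (1+\chi)^{-(n-1)}\,d\chi$ is the Beta integral $B(k+1, n-2-k)$, finite precisely when $k < n-2$. A small sketch (my illustration, not the paper's):

```python
import math

def q_moment(n, k):
    """k-th moment of q_n(chi) = (n-2)(1+chi)^(-n+1); finite iff k < n - 2."""
    if k >= n - 2:
        return math.inf
    return (n - 2) * math.gamma(k + 1) * math.gamma(n - 2 - k) / math.gamma(n - 1)

mass = q_moment(6, 0)    # total mass of q_6: exactly 1
m1 = q_moment(6, 1)      # first moment of q_6: 4*B(2, 3) = 1/3
```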

3. Moments and other expectations

Let $u(t)$ be any function that is harmonic on $U$. In other words, for $t$ in the upper half plane
$$\frac{\partial^2 u}{\partial t_1^2} + \frac{\partial^2 u}{\partial t_2^2} = 0.$$

Consider now the expected value of $u(T)$, where $T$ has a density of the form (3). The expected value of $u(T)$ is given by
$$E(u(T)) = \int u(t)\, \frac{p(\chi)}{4\pi t_2^2}\, dt_1\, dt_2.$$

The curve $\chi = \text{const.}$ is a complete circle in $U$ with centre $\omega_0 = (\theta_1, \theta_2(1 + 2\chi))$ and radius $\rho = 2\theta_2 (\chi(1 + \chi))^{1/2}$. As a consequence it is convenient to make a change of variables from $(t_1, t_2)$ to $(\chi, \phi)$ as follows:
$$t_1 = \theta_1 + 2\theta_2 (\chi(1 + \chi))^{1/2} \cos\phi,$$
$$t_2 = \theta_2(1 + 2\chi) + 2\theta_2 (\chi(1 + \chi))^{1/2} \sin\phi,$$
with jacobian
$$\frac{\partial(t_1, t_2)}{\partial(\chi, \phi)} = 2\theta_2 t_2.$$
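Both the parametrization and the jacobian can be checked numerically. The sketch below (not part of the paper) verifies that the parametrized point has the stated value of $\chi$, and compares a central-difference jacobian with $2\theta_2 t_2$; the values of $\theta$, $\chi$ and $\phi$ are arbitrary.

```python
import math

theta1, theta2 = 0.4, 1.3          # arbitrary parameter (illustrative)

def point(x, phi):
    rho = 2 * theta2 * math.sqrt(x * (1 + x))
    return (theta1 + rho * math.cos(phi),
            theta2 * (1 + 2 * x) + rho * math.sin(phi))

x0, phi0 = 0.9, 0.7
t1, t2 = point(x0, phi0)
chi_check = ((t1 - theta1) ** 2 + (t2 - theta2) ** 2) / (4 * t2 * theta2)

# Central-difference jacobian of (chi, phi) -> (t1, t2); should be 2*theta2*t2.
h = 1e-6
(a1, a2), (b1, b2) = point(x0 + h, phi0), point(x0 - h, phi0)
(c1, c2), (d1, d2) = point(x0, phi0 + h), point(x0, phi0 - h)
jac = ((a1 - b1) * (c2 - d2) - (a2 - b2) * (c1 - d1)) / (4 * h * h)
```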

This change of variables gives
$$E(u(T)) = \int_0^\infty p(\chi)\, d\chi \int_{-\pi}^{\pi} \frac{\theta_2\, u(\omega_0 + \rho e^{i\phi})}{2\pi(\omega_{02} + \rho \sin\phi)}\, d\phi, \qquad (6)$$
where $\omega_{02} = \theta_2(1 + 2\chi)$ denotes the imaginary part of $\omega_0$.
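As a numerical check (mine, not the paper's), the inner $\phi$-integral in (6) can be evaluated by quadrature for a particular harmonic function and shown to equal $u(\theta)$ for each fixed $\chi$, anticipating the conclusion (7) below. The choices of $u$, $\theta$ and $\chi$ are arbitrary.

```python
import math

theta1, theta2, x = 0.4, 1.3, 0.8            # arbitrary theta and chi
w02 = theta2 * (1 + 2 * x)                   # imaginary part of omega_0
rho = 2 * theta2 * math.sqrt(x * (1 + x))    # radius of the circle chi = const.

def u(t1, t2):
    return t1 * t1 - t2 * t2                 # harmonic: Re((t1 + i*t2)^2)

# Midpoint rule for the inner integral of (6); note t2 = w02 + rho*sin(phi).
m = 20_000
inner = 0.0
for k in range(m):
    phi = -math.pi + 2 * math.pi * (k + 0.5) / m
    t1 = theta1 + rho * math.cos(phi)
    t2 = w02 + rho * math.sin(phi)
    inner += theta2 * u(t1, t2) / (2 * math.pi * t2) * (2 * math.pi / m)

expected = u(theta1, theta2)                 # u evaluated at theta
```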

The standard form of the Poisson integral formula (Rudin 1987, §5.22)
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{(1 - r^2)\, h(e^{i\phi})\, d\phi}{1 + r^2 - 2r\cos(\phi - \alpha)} = h(re^{i\alpha}),$$
with $\alpha = -\pi/2$ and $r = (\chi/(1 + \chi))^{1/2}$, gives
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{h(e^{i\phi})\, d\phi}{1 + 2\chi + 2(\chi(1 + \chi))^{1/2}\sin\phi} = h\!\left(-i\left(\frac{\chi}{1 + \chi}\right)^{1/2}\right)$$
if $h(\cdot)$ is harmonic on the open unit disc and continuous on the closed disc. Application of the formula in this form to (6) gives

$$E(u(T)) = \int_0^\infty p(\chi)\, u(\theta)\, d\chi = u(\theta), \qquad (7)$$
provided that the integral exists and $u(\cdot)$ is harmonic on the upper half plane.

Note in particular that the real and imaginary parts of an analytic function are harmonic. Consequently, if $g(\cdot)$ is analytic on $U$ with $E|g(T)| < \infty$ it follows


that $E(g(T)) = g(\theta)$. In particular, if $g(t) = t^k$ the integral (6) converges provided that $\int p(\chi)\, \chi^{k-1}\, d\chi < \infty$. In the case of the Cauchy maximum-likelihood estimator $E(|T|^k) < \infty$ for $k < n - 1$ if our conjecture regarding the moments of $p_n(\chi)$ is correct.
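The unbiasedness claims can be probed by simulation. The sketch below (not from the paper) draws samples of size $n = 5$ from $C(\theta)$ with $\theta = 0.5 + i$, computes each maximum-likelihood estimate by a reweighting fixed point, and averages both components; the replication count, iteration count and tolerances are ad hoc choices.

```python
import math
import random

def cauchy_mle(ys, iters=100):
    """Cauchy MLE by reweighting; fixed points solve the likelihood equations."""
    ys = sorted(ys)
    t1 = ys[len(ys) // 2]
    t2 = max(1e-6, (ys[3 * len(ys) // 4] - ys[len(ys) // 4]) / 2)
    for _ in range(iters):
        w = [1.0 / ((y - t1) ** 2 + t2 ** 2) for y in ys]
        sw = sum(w)
        t1 = sum(wi * y for wi, y in zip(w, ys)) / sw
        t2 = (len(ys) / (2 * sw)) ** 0.5
    return t1, t2

random.seed(4)
theta1, theta2, n, reps = 0.5, 1.0, 5, 10_000
sum1 = sum2 = 0.0
for _ in range(reps):
    ys = [theta1 + theta2 * math.tan(math.pi * (random.random() - 0.5))
          for _ in range(n)]
    t1, t2 = cauchy_mle(ys)
    sum1 += t1
    sum2 += t2

mean1, mean2 = sum1 / reps, sum2 / reps   # should approximate theta1, theta2
```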

4. Further properties

The calculations of §3 show that $p_n(\chi)$ is the marginal density with respect to Lebesgue measure on the positive real line of the random variable $|T - \theta|^2/(4 T_2 \theta_2)$. We now construct a complementary statistic that is independent of $\chi$, although not invariant under (4).

The conditional distribution of $T$ given $\chi$ is circular Cauchy with parameter $\theta$, and concentrated on the circle $C(\omega_0, \rho)$ with centre $\omega_0$ and radius $\rho$. This conditional distribution is the exit distribution for Brownian motion starting at $\theta$. The circular Cauchy distributions are closed under fractional linear transformation. Consequently, if we make such a transformation such that $C(\omega_0, \rho) \to C(0, 1)$ and $\theta \to 0$, the transformed variable is uniformly distributed on the unit circle for each $\chi$. This argument leads to the conclusion that
$$\left(\frac{1 + \chi}{\chi}\right)^{1/2} \frac{T - \theta}{T - \bar\theta}$$
is uniformly distributed on the unit circle and independent of $\chi$. To say the same thing in another way, the complex-valued statistic
$$\frac{T - \theta}{T - \bar\theta}$$
takes values in the unit disc and is invariant under location-scale transformation. The argument of this statistic is uniformly distributed independently of the modulus. The modulus, which is equal to $(\chi/(1 + \chi))^{1/2}$, is invariant under fractional linear transformation, but the argument is invariant only under location-scale transformation.
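The modulus claim rests on the one-line identity $|T - \bar\theta|^2 = |T - \theta|^2 + 4T_2\theta_2$, and can be confirmed numerically. The sketch below (not in the paper) uses random points of the upper half plane:

```python
import random

random.seed(5)
max_err = 0.0
for _ in range(1000):
    t = complex(random.uniform(-5, 5), random.uniform(0.1, 5))
    theta = complex(random.uniform(-5, 5), random.uniform(0.1, 5))
    x = abs(t - theta) ** 2 / (4 * t.imag * theta.imag)       # chi
    modulus = abs((t - theta) / (t - theta.conjugate()))
    max_err = max(max_err, abs(modulus - (x / (1 + x)) ** 0.5))
```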

I am grateful to R. A. Wijsman for helpful discussions on this topic.

References

Ferguson, T. 1978 Maximum likelihood estimates of the parameters of the Cauchy distribution for samples of size 3 and 4. J. Am. Statist. Assoc. 73, 211-213.

McCullagh, P. 1992a Conditional inference and Cauchy models. Biometrika 79, 247-259.

McCullagh, P. 1992b On the choice of ancillary in the Cauchy location-scale problem. Technical Report No 311, Department of Statistics, University of Chicago, U.S.A.

Rudin, W. 1987 Real and complex analysis. New York: McGraw-Hill.

Received 16 October 1992; accepted 24 November 1992
