
Chernoff’s density is log-concave

Fadoua Balabdaoui

Centre de Recherche en Mathematiques de la Decision, Universite Paris-Dauphine, Paris, France,

e-mail: [email protected]

Jon A. Wellner∗

Department of Statistics, University of Washington, Seattle, WA 98195-4322, e-mail: [email protected]

Abstract: We show that the density of $Z = \operatorname{argmax}\{W(t) - t^2\}$, sometimes known as Chernoff’s density, is log-concave. We conjecture that Chernoff’s density is strongly log-concave or “super-Gaussian”, and provide evidence in support of the conjecture. We also show that the standard normal density can be written in the same structural form as Chernoff’s density, make connections with L. Bondesson’s class of hyperbolically completely monotone densities, and identify a large sub-class thereof having log-transforms to $\mathbb{R}$ which are strongly log-concave.

AMS 2000 subject classifications: Primary 60E05, 62E10; secondary 60E10, 60J65.

Keywords and phrases: Airy function, Brownian motion, correlation inequalities, hyperbolically monotone, log-concave, monotone function estimation, Prekopa-Leindler theorem, Polya frequency function, Schoenberg’s theorem, slope process, strongly log-concave.

1. Introduction: two limit theorems

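We begin by comparing two limit theorems. First the usual central limit theorem: suppose that $X_1, \ldots, X_n$ are i.i.d. with $EX_1 = \mu$, $E(X^2) < \infty$, and $\sigma^2 = \mathrm{Var}(X)$. Then the classical Central Limit Theorem says that

\[ \sqrt{n}(\overline{X}_n - \mu) \to_d N(0, \sigma^2). \]

The Gaussian limit has density

\[ \phi_\sigma(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{x^2}{2\sigma^2} \right) = e^{-V(x)}, \]

\[ V(x) = -\log \phi_\sigma(x) = \frac{x^2}{2\sigma^2} + \log(\sqrt{2\pi}\,\sigma), \]

\[ V''(x) = (-\log \phi_\sigma)''(x) = \frac{1}{\sigma^2} > 0. \]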

∗Supported in part by NSF Grants DMS-0804587 and DMS-1104832, by NI-AID grant 2R01 AI291968-04, and by the Alexander von Humboldt Foundation


imsart-generic ver. 2007/02/20 file: Chernoff-Is-LogConcave-v3.tex date: March 2, 2012


Thus $\log \phi_\sigma$ is concave, and hence $\phi_\sigma$ is a log-concave density. As is well-known, the normal distribution arises as a natural limit in a wide range of settings connected with sums of independent and weakly dependent random variables; see e.g. Le Cam (1986) and Dehling and Philipp (2002).

Now for a much less well-known limit theorem in the setting of monotone regression. Suppose that the real-valued function $r(x)$ is monotone increasing for $x \in [0, 1]$. For $i \in \{1, \ldots, n\}$, suppose that $x_i = i/(n+1)$, that the $\epsilon_i$ are i.i.d. with $E(\epsilon_i) = 0$ and $\sigma^2 = E(\epsilon_i^2) < \infty$, and suppose that we observe $(x_i, Y_i)$, $i = 1, \ldots, n$, where

\[ Y_i = r(x_i) + \epsilon_i \equiv \mu_i + \epsilon_i, \qquad i \in \{1, \ldots, n\}. \]

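The isotonic estimator $\hat\mu$ of $\mu = (\mu_1, \ldots, \mu_n)$ is given by

\[ \hat\mu_j = \max_{i \le j} \min_{k \ge j} \left\{ \frac{\sum_{l=i}^{k} Y_l}{k - i + 1} \right\}, \]

\[ \hat\mu = (\hat\mu_1, \ldots, \hat\mu_n) \equiv TY = \text{least squares projection of } Y \text{ onto } K_n, \]

\[ K_n = \{ y \in \mathbb{R}^n : y_1 \le \cdots \le y_n \}. \]

For fixed $x_0 \in (0, 1)$ with $x_j \le x_0 < x_{j+1}$ we set $\hat r_n(x_0) \equiv \hat r_n(x_j) = \hat\mu_j$.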
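The max-min formula above can be compared numerically with the pool-adjacent-violators algorithm (PAVA), a standard way of computing the same least squares projection onto $K_n$. The following is only a minimal sketch in Python; the function names `isotonic_maxmin` and `pava` are illustrative choices and do not come from the paper.

```python
def isotonic_maxmin(y):
    """Isotonic estimate via the max-min formula:
    mu_j = max_{i <= j} min_{k >= j} mean(y[i..k])  (0-based indices here)."""
    n = len(y)
    prefix = [0.0]
    for value in y:
        prefix.append(prefix[-1] + value)

    def block_mean(i, k):
        # mean of y[i], ..., y[k] inclusive
        return (prefix[k + 1] - prefix[i]) / (k - i + 1)

    return [
        max(min(block_mean(i, k) for k in range(j, n)) for i in range(j + 1))
        for j in range(n)
    ]


def pava(y):
    """Pool-adjacent-violators: merge neighbouring blocks whose means
    violate monotonicity, then expand block means back to full length."""
    blocks = []  # each block is [sum, count]
    for value in y:
        blocks.append([value, 1])
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted
```

The max-min version costs on the order of $n^3$ operations and is meant only as a transparent transcription of the formula; PAVA is linear in $n$ and is what one would normally use in practice.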

Brunk (1970) showed that if $r'(x_0) > 0$ and if $r'$ is continuous in a neighborhood of $x_0$, then

\[ n^{1/3}(\hat r_n(x_0) - r(x_0)) \to_d (\sigma^2 r'(x_0)/2)^{1/3}\,(2Z_1), \]

where, with $\{W(t) : t \in \mathbb{R}\}$ denoting a two-sided standard Brownian motion process started at 0,

\begin{align}
2Z_1 &= \text{slope at zero of the greatest convex minorant of } W(t) + t^2 \tag{1.1} \\
&\stackrel{d}{=} \text{slope at zero of the least concave majorant of } W(t) - t^2 \nonumber \\
&\stackrel{d}{=} 2\operatorname{argmin}\{W(t) + t^2\}. \nonumber
\end{align}

The density $f$ of $Z_1$ is called Chernoff’s density. Chernoff’s density appears in a number of nonparametric problems involving estimation of a monotone function:

• Estimation of a monotone regression function $r$: see e.g. Ayer et al. (1955), van Eeden (1957), Brunk (1970), and Leurgans (1982).
• Estimation of a monotone decreasing density: see Grenander (1956a), Prakasa Rao (1969), and Groeneboom (1985).
• Estimation of a monotone hazard function: Grenander (1956b), Prakasa Rao (1970), Huang and Zhang (1994), Huang and Wellner (1995).
• Estimation of a distribution function with interval censoring: Groeneboom and Wellner (1992), Groeneboom (1996).



In each case:

• There is a monotone function $m$ to be estimated.
• There is a natural nonparametric estimator $\hat m_n$.
• If $m'(x_0) \ne 0$ and $m'$ is continuous at $x_0$, then

\[ n^{1/3}(\hat m_n(x_0) - m(x_0)) \to_d C(m, x_0)\, 2Z_1 \]

where $2Z_1$ is as in (1.1).

See Kim and Pollard (1990) for a unified approach to these types of problems.

The first appearance of $Z_1$ was in Chernoff (1964). Chernoff (1964) considered estimation of the mode of a (unimodal) density $f$ via the following simple estimator: if $X_1, \ldots, X_n$ are i.i.d. with density $h$ and distribution function $H$, then for each fixed $a > 0$ let

\[ \hat x_a \equiv \text{center of the interval of length } 2a \text{ containing the most observations.} \]

Let $x_a$ be the center of the interval of length $2a$ maximizing $H(x+a) - H(x-a) = P(X \in (x-a, x+a])$. Then Chernoff shows:

\[ n^{1/3}(\hat x_a - x_a) \to_d \left( \frac{h(x_a + a)}{c} \right)^{1/3} 2Z_1 \]

where $c \equiv h'(x_a - a) - h'(x_a + a)$. Chernoff also showed that the density $f_{Z_1} = f$ of $Z_1$ has the form

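\[ f(z) \equiv f_{Z_1}(z) = \frac{1}{2}\, g(z)\, g(-z) \tag{1.2} \]

where

\[ g(t) \equiv \lim_{x \nearrow t^2} \frac{\partial}{\partial x} u(t, x), \]

where, with $W$ standard Brownian motion,

\[ u(t, x) \equiv P^{(t,x)}\left( W(z) > z^2 \text{ for some } z \ge t \right) \]

is a solution to the backward heat equation

\[ \frac{\partial}{\partial t} u(t, x) = -\frac{1}{2} \frac{\partial^2}{\partial x^2} u(t, x) \]

under the boundary conditions

\[ u(t, t^2) = \lim_{x \nearrow t^2} u(t, x) = 1, \qquad \lim_{x \to -\infty} u(t, x) = 0. \]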

Again let $W(t)$ be standard two-sided Brownian motion starting from zero, and let $c > 0$. We now define

\[ Z_c \equiv \sup\{ t \in \mathbb{R} : W(t) - ct^2 \text{ is maximal} \}. \tag{1.3} \]



As noted above, $Z_c$ with $c = 1$ arises naturally in the limit theory for nonparametric estimation of monotone (decreasing) functions. Groeneboom (1989) (see also Daniels and Skyrme (1985)) showed that for all $c > 0$ the random variable $Z_c$ has density

\[ f_{Z_c}(t) = \frac{1}{2}\, g_c(t)\, g_c(-t) \]

where $g_c$ has Fourier transform given by

\[ \hat g_c(\lambda) = \int_{-\infty}^{\infty} e^{i\lambda s} g_c(s)\, ds = \frac{2^{1/3} c^{-1/3}}{\mathrm{Ai}(i(2c^2)^{-1/3}\lambda)}. \tag{1.4} \]

Groeneboom and Wellner (2001) gave numerical computations of the density $f_{Z_1}$, distribution function, quantiles, and moments.

Recent work on the distribution of the supremum $M_c \equiv \sup_{t \in \mathbb{R}}(W(t) - ct^2)$ is given in Janson et al. (2010) and Groeneboom (2010). Groeneboom (2011) studies the number of vertices of the greatest convex minorant of $W(t) + t^2$ in intervals $[a, b]$ with $b - a \to \infty$; the function $g_c$ with $c = 1$ also plays a key role there.

Our goal in this paper is to show that the density $f_{Z_c}$ is log-concave. We also present evidence in support of the conjecture that $f_{Z_c}$ is strongly log-concave: i.e. $(-\log f_{Z_c})''(t) \ge$ some $c > 0$ for all $t \in \mathbb{R}$.

The organization of the rest of the paper is as follows: log-concavity of $f_{Z_c}$ is proved in Section 2 where we also give graphical support for this property and present several corollaries and related results. In Section 3 we give some partial results and further graphical evidence for strong log-concavity of $f \equiv f_{Z_1}$: that is

\[ (-\log f)''(t) \ge (-\log f)''(0) = 3.4052\ldots = 1/(.541912\ldots)^2 \equiv 1/\sigma_0^2 \]

for all $t \in \mathbb{R}$. As will be shown in Section 3, this is equivalent to $f(t) = \rho(t)\phi_{\sigma_0}(t)$ with $\rho$ log-concave. In Section 5 we briefly outline some corollaries and consequences of log-concavity and strong log-concavity of $f$.

2. Chernoff’s density is log-concave

Recall that a function $h$ is a Polya frequency function of order $m$ (and we write $h \in PF_m$) if $K(x, y) \equiv h(x - y)$ is totally positive of order $m$: i.e. $\det(H_m(x, y)) \ge 0$ for all choices of $x_1 \le \cdots \le x_m$ and $y_1 \le \cdots \le y_m$ where $H_m \equiv H_m(x, y) = (h(x_i - y_j))_{i,j=1}^m$. It is well-known and easily proved that a density $f$ is $PF_2$ if and only if it is log-concave. Furthermore, $h$ is a Polya frequency function (and we write $h \in PF_\infty$) if $K(x, y) \equiv h(x - y)$ is totally positive of all orders $m$; see e.g. Schoenberg (1951), Karlin (1968), and Marshall et al. (2011). Following Karlin (1968) we say that $h$ is strictly $PF_\infty$ if all the determinants $\det(H_m)$ are strictly positive.
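To make the $PF_2$ condition concrete, the $2 \times 2$ determinant $h(x_1 - y_1)h(x_2 - y_2) - h(x_1 - y_2)h(x_2 - y_1)$ can be evaluated directly. The sketch below (Python, with illustrative names that are not from the paper) does this for the log-concave standard normal density and, as a contrasting assumption-labelled example, for the Cauchy density, which is not log-concave and fails the condition at the chosen points.

```python
import math
import random


def tp2_determinant(h, x1, x2, y1, y2):
    """2x2 determinant of (h(x_i - y_j)) for x1 <= x2 and y1 <= y2."""
    return h(x1 - y1) * h(x2 - y2) - h(x1 - y2) * h(x2 - y1)


def gaussian(x):
    """Standard normal density (log-concave)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cauchy(x):
    """Standard Cauchy density (not log-concave); used only as a counterexample."""
    return 1.0 / (math.pi * (1.0 + x * x))


def min_determinant(h, n_trials=10000, seed=0):
    """Smallest 2x2 determinant found over random ordered pairs of points."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(n_trials):
        x1, x2 = sorted(rng.uniform(-5.0, 5.0) for _ in range(2))
        y1, y2 = sorted(rng.uniform(-5.0, 5.0) for _ in range(2))
        worst = min(worst, tp2_determinant(h, x1, x2, y1, y2))
    return worst
```

For the Gaussian the determinant is non-negative because the difference of the two exponents equals $(x_2 - x_1)(y_2 - y_1) \ge 0$; for the Cauchy density the points $x = (10, 12)$, $y = (0, 2)$ give a negative determinant.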



Theorem 2.1. For each $c > 0$ the density $f_{Z_c}(x) = (1/2)\, g_c(x)\, g_c(-x)$ is $PF_2$; i.e. log-concave.

The Fourier transform in (1.4) implies that $g_c$ has bilateral Laplace transform (with a slight abuse of notation)

\[ \hat g_c(z) = \int e^{zs} g_c(s)\, ds = \frac{2^{1/3} c^{-1/3}}{\mathrm{Ai}((2c^2)^{-1/3} z)} \tag{2.1} \]

for all $z$ such that $\mathrm{Re}(z) > -a_1/(2c^2)^{-1/3}$ where $-a_1$ is the largest zero of $\mathrm{Ai}(z)$ in $(-\infty, 0)$.

To prove Theorem 2.1 we first show that $g_c$ is $PF_\infty$ by application of the following two results:

Theorem 2.2. (Schoenberg, 1951) A necessary and sufficient condition for a (density) function $g(x)$, $-\infty < x < \infty$, to be a $PF_\infty$ (density) function is that the reciprocal of its bilateral Laplace transform (i.e. Fourier) be an entire function of the form

\[ \psi(s) \equiv \frac{1}{\hat g(s)} = C e^{-\gamma s^2 + \delta s} s^k \prod_{j=1}^{\infty} (1 + b_j s) \exp(-b_j s) \tag{2.2} \]

where $C > 0$, $\gamma \ge 0$, $\delta \in \mathbb{R}$, $k \in \{0, 1, 2, \ldots\}$, $b_j \in \mathbb{R}$, $\sum_{j=1}^{\infty} |b_j|^2 < \infty$. (For the subclass of densities, the if and only if statement holds for $1/\hat g$ of this form with $\psi(0) = C = 1$ and $k = 0$.)

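Proposition 2.1. (Merkes and Salmassi) Let $\{-a_k\}$ be the zeros of the Airy function $\mathrm{Ai}$ (so that $a_k > 0$ for each $k$). The Hadamard representation of $\mathrm{Ai}$ is given by

\[ \mathrm{Ai}(z) = \mathrm{Ai}(0)\, e^{-\nu z} \prod_{k=1}^{\infty} (1 + z/a_k) \exp(-z/a_k) \]

where

\[ \mathrm{Ai}(0) = \frac{1}{3^{2/3}\Gamma(2/3)} = \frac{\Gamma(1/3)}{3^{1/6}\, 2\pi} \approx 0.35503, \]

\[ \mathrm{Ai}'(0) = -\frac{1}{3^{1/3}\Gamma(1/3)} = -\frac{3^{1/6}\Gamma(2/3)}{2\pi} \approx -0.25882, \quad \text{and} \]

\[ \nu = -\mathrm{Ai}'(0)/\mathrm{Ai}(0) = \frac{3^{1/3}\Gamma(2/3)}{\Gamma(1/3)} = \frac{2\pi}{3^{1/6}\Gamma(1/3)^2} \approx .729011\ldots. \]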

Proposition 2.1 is given by Merkes and Salmassi (1997); see their Lemma 1, page 211. This is also Lemma 1 of Salmassi (1999). Our statement of Proposition 2.1 corrects the constants $c_1$ and $c_2$ given by Merkes and Salmassi (1997). Figure 1 shows $\mathrm{Ai}(z)$ (black) and $m$ term approximations to $\mathrm{Ai}(z)$ based on Proposition 2.1 with $m = 25$ (green), 125 (magenta), and 500 (blue).



Fig 1. Product approximations of Ai(x)
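The closed forms for the constants in Proposition 2.1 can be cross-checked with nothing more than the gamma function; each pair of expressions agrees because of the reflection formula $\Gamma(1/3)\Gamma(2/3) = 2\pi/\sqrt{3}$. The following short Python sketch (variable names are illustrative only) evaluates both forms of each constant.

```python
import math

G13 = math.gamma(1.0 / 3.0)
G23 = math.gamma(2.0 / 3.0)

# Ai(0): two equivalent closed forms
ai0_a = 1.0 / (3.0 ** (2.0 / 3.0) * G23)
ai0_b = G13 / (3.0 ** (1.0 / 6.0) * 2.0 * math.pi)

# Ai'(0): two equivalent closed forms
aip0_a = -1.0 / (3.0 ** (1.0 / 3.0) * G13)
aip0_b = -(3.0 ** (1.0 / 6.0)) * G23 / (2.0 * math.pi)

# nu = -Ai'(0)/Ai(0): two equivalent closed forms
nu_a = 3.0 ** (1.0 / 3.0) * G23 / G13
nu_b = 2.0 * math.pi / (3.0 ** (1.0 / 6.0) * G13 ** 2)

# Reflection formula used to pass between the two forms
reflection_gap = G13 * G23 - 2.0 * math.pi / math.sqrt(3.0)
```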

Proposition 2.2. The functions $t \mapsto g_c(t)$ are in $PF_\infty \subset PF_2$ for every $c > 0$. Thus they are log-concave. In fact, $t \mapsto g_c(t)$ is strictly $PF_\infty$ for every $c > 0$.

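Proof. By Proposition 2.1,

\begin{align*}
\mathrm{Ai}((2c^2)^{-1/3} z) &= \mathrm{Ai}(0)\, e^{-\nu (2c^2)^{-1/3} z} \prod_{j=1}^{\infty} \left( 1 + \frac{z}{(2c^2)^{1/3} a_j} \right) \exp\left( -\frac{z}{(2c^2)^{1/3} a_j} \right) \\
&= \mathrm{Ai}(0)\, e^{\delta z} \prod_{j=1}^{\infty} (1 + b_j z) \exp(-b_j z)
\end{align*}

which is of the form (2.2) required in Schoenberg’s theorem with $k = 0$,

\[ \delta = -(2c^2)^{-1/3} \nu = -\frac{(3/2)^{1/3}\, \Gamma(2/3)}{c^{2/3}\, \Gamma(1/3)}, \tag{2.3} \]

\[ C = \mathrm{Ai}(0) = 1/(3^{2/3}\Gamma(2/3)), \quad \text{and} \tag{2.4} \]

\[ b_j = \frac{1}{(2c^2)^{1/3} a_j}, \qquad j \ge 1 \tag{2.5} \]

where $\{-a_j\}$ are the zeros of the Airy function $\mathrm{Ai}$. Thus we conclude from Schoenberg’s theorem that $g_c$ is $PF_\infty$ for each $c > 0$.

The strict $PF_\infty$ property follows from Karlin (1968), Theorem 6.1(a), page 357: note that in the notation of Karlin (1968), $\gamma = 0$ and Karlin’s $a_i$ is our $1/a_k$ with $\sum_k (1/a_k) = \infty$ in view of the fact that $a_k \sim ((3/8)\pi(4k - 1))^{2/3}$ via 9.9.6 and 9.9.18, page 18, Olver et al. (2010).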

Now we are in position to prove Theorem 2.1:

Proof. This follows from Proposition 2.2: note that

\[ -\log f_{Z_c}(x) = \log 2 - \log g_c(x) - \log g_c(-x), \]

so

\[ w(x) \equiv (-\log f_{Z_c})''(x) = (-\log g_c)''(x) + (-\log g_c)''(-x) \equiv v(x) + v(-x) \ge 0 \]

since $g_c \in PF_\infty \subset PF_2$.

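Some Scaling Relations: From the Fourier transform of $g_c$ given above, it follows that

\begin{align*}
g_c(x) &= \frac{(2/c)^{1/3}}{2\pi} \int_{-\infty}^{\infty} \frac{e^{-iux}}{\mathrm{Ai}(i(2c^2)^{-1/3} u)}\, du \\
&= \frac{(2/c)^{1/3} (2c^2)^{1/3}}{2\pi} \int_{-\infty}^{\infty} \frac{e^{-iv(2c^2)^{1/3} x}}{\mathrm{Ai}(iv)}\, dv \\
&\equiv 2^{1/6} c^{1/3}\, g_{2^{-1/2}}((2c^2)^{1/3} x).
\end{align*}

Thus it follows that

\[ (\log g_c)''(x) = (2c^2)^{2/3} \cdot (\log g_{2^{-1/2}})''((2c^2)^{1/3} x), \]

and, in particular,

\[ (\log g_c(x))''\Big|_{x=0} = (2c^2)^{2/3} \cdot (\log g_{2^{-1/2}})''(x)\Big|_{x=0}. \]

When $c = 1$, the conversion factor is $2^{2/3}$. Furthermore we compute

\begin{align*}
f_{Z_c}(t) &= \frac{1}{2}\, g_c(t)\, g_c(-t) = \frac{1}{2}\, 2^{1/3} c^{2/3}\, g_{2^{-1/2}}((2c^2)^{1/3} t)\, g_{2^{-1/2}}(-(2c^2)^{1/3} t) \\
&\equiv c^{2/3} f_1(c^{2/3} t)
\end{align*}

where

\begin{align*}
f_1(t) \equiv f_{Z_1}(t) &= \frac{1}{2}\, g_1(t)\, g_1(-t) \\
&= \frac{1}{2}\, 2^{1/3}\, g_{2^{-1/2}}(2^{1/3} t)\, g_{2^{-1/2}}(-2^{1/3} t).
\end{align*}

Thus we see that

\[ Z_c \stackrel{d}{=} c^{-2/3} Z_1 \]

for all $c > 0$.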
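The scaling relation $Z_c \stackrel{d}{=} c^{-2/3} Z_1$ can be illustrated by a crude Monte Carlo experiment. The sketch below is an assumption-laden approximation, not a method from the paper: it replaces two-sided Brownian motion by a Gaussian random walk on a grid of mesh `dt`, truncates time to $[-t_{\max}, t_{\max}]$, and takes the grid point maximizing $W(t) - ct^2$. The names `simulate_zc` and `sample_sd` and all default parameters are hypothetical choices.

```python
import math
import random


def simulate_zc(c, n_rep=1000, t_max=2.5, dt=0.01, seed=0):
    """Approximate draws of Z_c = argmax_t {W(t) - c t^2} on a grid.

    The right and left branches of two-sided Brownian motion are built
    from independent Gaussian increments, both started at W(0) = 0.
    """
    rng = random.Random(seed)
    n_steps = int(round(t_max / dt))
    sd = math.sqrt(dt)
    draws = []
    for _ in range(n_rep):
        best_value, best_t = 0.0, 0.0  # value of W(t) - c t^2 at t = 0
        for sign in (1.0, -1.0):
            w = 0.0
            for k in range(1, n_steps + 1):
                w += rng.gauss(0.0, sd)
                t = sign * k * dt
                value = w - c * t * t
                if value > best_value:
                    best_value, best_t = value, t
        draws.append(best_t)
    return draws


def sample_sd(xs):
    """Sample standard deviation."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (len(xs) - 1))
```

Under the scaling relation, the ratio `sample_sd(simulate_zc(2.0)) / sample_sd(simulate_zc(1.0))` should be close to $2^{-2/3} \approx 0.63$, up to Monte Carlo and discretization error; a finer grid and more replications tighten the agreement at the cost of run time.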

Figure 2 gives a plot of $f_Z$; Figure 3 gives a plot of $-\log f_Z$; and Figure 4 gives a plot of $(-\log f_Z)''$.

If we use the inverse Fourier transform to represent $g$ via (1.4), and then calculate directly, some interesting correlation type inequalities involving the Airy kernel emerge. Here is one of them.

Let $h(u) \equiv 1/|\mathrm{Ai}(iu)| \sim 2\sqrt{\pi}\, u^{1/4} \exp(-(\sqrt{2}/3) u^{3/2})$ as $u \to \infty$ by Groeneboom (1989), page 95. We also define $\varphi(u, x/2) = \mathrm{Re}(e^{iux/2} \mathrm{Ai}(iu))\, h(u)$ and $\psi(u, x/2) = \mathrm{Im}(e^{iux/2} \mathrm{Ai}(iu))\, h(u)$.



Fig 2. The density $f_Z$

Fig 3. $-\log f_Z$

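Corollary 2.1. With the above notation,

\[ \int_0^{\infty} \sin^2(uy)\, \varphi(u, x)\, h(u)\, du \cdot \int_0^{\infty} \cos^2(uy)\, \varphi(u, x)\, h(u)\, du + \int_0^{\infty} \sin(uy) \cos(uy)\, \psi(u, x)\, h(u)\, du \ge 0 \quad \text{for all } x, y \in \mathbb{R}. \]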

3. Is Chernoff’s density strongly log-concave?

From Rockafellar and Wets (1998) page 565, $h : \mathbb{R}^d \to \mathbb{R}$ is strongly convex if there exists a constant $c > 0$ such that

\[ h(\theta x + (1 - \theta) y) \le \theta h(x) + (1 - \theta) h(y) - \frac{1}{2} c\, \theta (1 - \theta) \|x - y\|^2 \]

for all $x, y \in \mathbb{R}^d$, $\theta \in (0, 1)$. It is not hard to show that this is equivalent to convexity of

\[ h(x) - \frac{1}{2} c \|x\|^2 \]

for some $c > 0$. This leads (by replacing $h$ by $-\log f$) to the following definition of strong log-concavity of a (density) function: $f : \mathbb{R}^d \to \mathbb{R}$ is strongly log-concave if and only if

\[ -\log f(x) - \frac{1}{2} c \|x\|^2 \]

is convex for some $c > 0$. Defining $-\log g(x) \equiv -\log f(x) - (1/2) c \|x\|^2$, it is easily seen that $f$ is strongly log-concave if and only if

\[ f(x) = g(x) \exp(-(1/2) c \|x\|^2) \]

for some $c > 0$ and log-concave function $g$. Thus if $f \in C^2(\mathbb{R}^d)$, a sufficient condition for strong log-concavity is: $\mathrm{Hess}(-\log f)(x) \ge c I_d$ for all $x \in \mathbb{R}^d$ and some $c > 0$ where $I_d$ is the $d \times d$ identity matrix.

Fig 4. $(-\log f_Z)''$
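The equivalence between the strong convexity inequality and convexity of $h(x) - \frac{1}{2} c \|x\|^2$ rests on a single quadratic identity. As a worked step (the symbols $q$ and $k$ are notation introduced only for this remark), expanding the squared norm gives

```latex
\begin{align*}
\theta \|x\|^2 + (1-\theta)\|y\|^2 - \|\theta x + (1-\theta) y\|^2
  &= \theta(1-\theta)\|x\|^2 + \theta(1-\theta)\|y\|^2
     - 2\theta(1-\theta)\langle x, y\rangle \\
  &= \theta(1-\theta)\|x - y\|^2 .
\end{align*}
% Hence, with q(x) = (c/2)\|x\|^2 and k = h - q,
\begin{align*}
\theta k(x) + (1-\theta) k(y) - k(\theta x + (1-\theta) y)
  &= \bigl[\theta h(x) + (1-\theta) h(y) - h(\theta x + (1-\theta) y)\bigr]
     - \tfrac{1}{2} c\, \theta(1-\theta)\|x - y\|^2 .
\end{align*}
```

so the left side is non-negative for all $x, y, \theta$ (that is, $k$ is convex) exactly when $h$ satisfies the strong convexity inequality with the same constant $c$.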

Figure 4 provides compelling evidence for the following conjecture concerning strong log-concavity of Chernoff’s density.

Theorem 3.1. (Conjectured). Let $Z_1$ again be a “standard” Chernoff random variable. Then for $\sigma \ge \sigma_0 \approx 0.541912\ldots = \left( -(\log f_{Z_1}(z))''\big|_{z=0} \right)^{-1/2}$ the density $f_{Z_1}$ can be written as

\[ f_{Z_1}(x) = \rho(x)\, \frac{1}{\sigma} \varphi(x/\sigma) \]

where $\varphi(x) = (2\pi)^{-1/2} \exp(-x^2/2)$ is the standard normal density and $\rho$ is log-concave. Equivalently, if $c \ge \sigma_0^{3/2} \approx 0.398927\ldots$, then

\[ f_{Z_c}(x) = \rho(x)\, \varphi(x) \]

where $\rho$ is log-concave.

Proof. (Partial) Let $w \equiv (-\log f_{Z_c})''$ and $v \equiv (-\log g_c)''$. Then

\[ w(t) = v(t) + v(-t) \ge 2v(0) = w(0) > 0 \]

is implied by convexity of $v$ and strict positivity of $w(0)$. Thus we want to show that $v^{(2)} = (-\log g_c)^{(4)} \ge 0$.

To prove this we investigate the normalized version of $g_c$ given by $\tilde g_c(x) = g_c(x)\, \mathrm{Ai}(0)/(2/c)^{1/3} = g_c(x)/\int g_c(y)\, dy$ so that $\int \tilde g_c(x)\, dx = 1$. Suppose that $b_i$ is given in (2.5), and let $X_i \sim \mathrm{Exp}(1/b_i)$ be independent exponential random variables for $i = 1, 2, \ldots$. Since $\sum_{i=1}^{\infty} b_i^2 < \infty$, the random variable $Y_0 = \sum_{i=1}^{\infty} (X_i - b_i)$ is finite almost surely (see e.g. Shorack (2000), Theorem 9.2, page 241) and the Laplace transform of $-(\delta + Y_0)$ is given by

\[ \varphi(s) \equiv e^{-\delta s} E e^{-s Y_0} = \exp(-\delta s) \cdot \frac{1}{\prod_{i=1}^{\infty} (1 + b_i s) e^{-b_i s}} = \frac{1}{e^{\delta s} \cdot \prod_{i=1}^{\infty} (1 + b_i s) e^{-b_i s}}, \]

exactly the form of the Laplace transform in Schoenberg’s theorem, but without the Gaussian term. Thus we conclude that $\tilde g_c$ is the density of $Y \equiv -\delta - Y_0 = -\delta - \sum_{j=1}^{\infty} (X_j - b_j)$.

Now let $\lambda_i = 1/b_i$ for $i \ge 1$. Thus $X_i \sim \mathrm{Exp}(\lambda_i)$. A closed form expression for the density of $Y_m \equiv \sum_{i=1}^{m} X_i$ has been given by Harrison (1990). From Harrison’s Theorem 1, $Y_m$ has density

\[ f_m(t) = \sum_{j=1}^{m} \lambda_j \exp(-\lambda_j t) \prod_{i \ne j} \frac{\lambda_i}{\lambda_i - \lambda_j}. \tag{3.1} \]

If we could show that $v_m(t) \equiv (-\log f_m)''(t)$ is convex, then we would be done! Direct calculation shows that this holds for $m = 2$, but our attempts at a proof for general $m$ have not (yet) been successful. On the other hand, we know that for $t \ge 0$,

\[ w(t) = v(t) + v(-t) \ge v(t) \ge v(0) > 0 \]

if $v$ satisfies $v(t) \ge v(0)$ for all $t \ge 0$, so we would have strong log-concavity with the constant $v(0)$.
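The $m = 2$ case mentioned above can be explored numerically. The sketch below (Python, illustrative names) implements formula (3.1) for distinct rates and compares a finite-difference estimate of $(-\log f_2)''$ with the closed form $d^2 / (4 \sinh^2(dt/2))$, $d = |\lambda_2 - \lambda_1|$. That closed form is our own short calculation from $f_2(t) \propto e^{-\lambda_1 t} - e^{-\lambda_2 t}$, not a formula quoted from the paper, and the rates used in the check are arbitrary.

```python
import math


def hypoexp_density(t, rates):
    """Formula (3.1): density at t > 0 of a sum of independent
    exponentials with the given distinct rates."""
    total = 0.0
    for j, rate_j in enumerate(rates):
        weight = 1.0
        for i, rate_i in enumerate(rates):
            if i != j:
                weight *= rate_i / (rate_i - rate_j)
        total += rate_j * math.exp(-rate_j * t) * weight
    return total


def neg_log_second_derivative(f, t, h=1e-3):
    """Central finite-difference estimate of (-log f)''(t)."""
    return -(math.log(f(t + h)) - 2.0 * math.log(f(t)) + math.log(f(t - h))) / (h * h)


def v2_closed_form(t, rate1, rate2):
    """(-log f_2)''(t) for two distinct rates (derived here, see lead-in)."""
    d = abs(rate2 - rate1)
    return d * d / (4.0 * math.sinh(0.5 * d * t) ** 2)
```

Since $1/\sinh^2$ has second derivative $2(3\cosh^2 - \sinh^2)/\sinh^4 > 0$, the closed form is convex on $(0, \infty)$, consistent with the statement that convexity of $v_m$ holds for $m = 2$.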

4. A representation of the Gaussian density as a (symmetric) product of log-concave densities

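It is easily seen that the standard Gaussian density $\phi$ can be represented by a product of the same form as Chernoff’s density (1.2) where the function $g$ is Gaussian (and hence $PF_\infty$):

\[ \phi(x) = \frac{1}{2}\, g(x)\, g(-x) \quad \text{where} \quad g(x) = (8\pi)^{1/4} \phi(x/\sqrt{2}), \]

since $\frac{1}{2}(8\pi)^{1/2}\, \phi(x/\sqrt{2})\, \phi(-x/\sqrt{2}) = (2\pi)^{-1/2} \exp(-x^2/2)$.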



But can $g$ be taken to be asymmetric, log-concave, and in some other interesting class?

Our goals in this section are: (a) to show that the standard Gaussian density can be written in the same product form (1.2) as Chernoff’s density, but in terms of a function $g$ that is log-concave and, in fact, is in the class of densities on $\mathbb{R}$ given by the “log-transform” of (random variables) with densities in the class of hyperbolically completely monotone ($HM_\infty$) densities described by Bondesson (1992, 1997); and (b) to prove that certain symmetric sub-classes of the “log-transforms” of Bondesson’s hyperbolically completely monotone classes are, in fact, strongly log-concave.

The following proposition is a consequence of Bondesson (1992), Example 5.2.1.

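Proposition 4.1. The standard normal density $\phi(z) = (2\pi)^{-1/2} \exp(-z^2/2)$ can be written as

\begin{align}
\phi(z) &= \frac{1}{\sqrt{2\pi}} \exp\left( \int_0^{\infty} \left\{ \log\left( \frac{e^s + 1}{e^s + e^z} \right) + \log\left( \frac{e^s + 1}{e^s + e^{-z}} \right) \right\} ds \right) \tag{4.1} \\
&= \frac{1}{2}\, g(z)\, g(-z) \tag{4.2}
\end{align}

where

\begin{align}
g(z) &\equiv (2/\pi)^{1/4} \exp(z) \exp\left( \int_0^{\infty} \log\left( \frac{e^s + 1}{e^s + e^z} \right) ds \right) \nonumber \\
&= (2/\pi)^{1/4} \exp\left( \frac{\pi^2}{12} + z + \int_{-e^z}^{0} \frac{\log(1 - t)}{t}\, dt \right) \tag{4.3}
\end{align}

is log-concave, integrable, and $g \in \log(HM_\infty)$.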

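Proof. Note that

\begin{align*}
\frac{1}{y} \left\{ \frac{1}{x + y} - \frac{1}{x^2} \frac{1}{y + x^{-1}} \right\}
&= \frac{1}{y} \left\{ \frac{x^2 (y + x^{-1}) - (y + x)}{x^2 (y + x)(y + x^{-1})} \right\}
= \frac{x^2 - 1}{x^2 (y + x)(y + x^{-1})} \\
&= \frac{x^2 - 1}{x^2 (1 + xy)(1 + y/x)}
= \frac{1}{(1 + xy)(1 + y/x)} - \frac{1/x^2}{(1 + xy)(1 + y/x)} \\
&= \frac{1}{x} \left\{ \frac{1}{y + x^{-1}} - \frac{1}{y + x} \right\}.
\end{align*}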



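Thus it follows that

\begin{align}
\int_1^{\infty} \left\{ \frac{1}{x + y} - \frac{1}{x^2} \frac{1}{y + x^{-1}} \right\} \frac{1}{y}\, dy
&= \frac{1}{x} \int_1^{\infty} \left\{ \frac{1}{y + x^{-1}} - \frac{1}{y + x} \right\} dy \tag{4.4} \\
&= \frac{1}{x} \left\{ \log(y + 1/x) - \log(y + x) \right\} \Big|_1^{\infty}
= \frac{1}{x} \log\left( \frac{y + 1/x}{y + x} \right) \Big|_1^{\infty} \nonumber \\
&= \frac{1}{x} \left\{ 0 - \log\left( \frac{1 + x^{-1}}{1 + x} \right) \right\}
= \frac{1}{x} \log x. \tag{4.5}
\end{align}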

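Also note that the integral on the right side in (4.4) can be rewritten as

\begin{align*}
\int_1^{\infty} \left\{ \frac{1}{y + x^{-1}} - \frac{1}{y + x} \right\} dy
&= \int_1^{\infty} \frac{y + x - (y + x^{-1})}{(y + x^{-1})(y + x)}\, dy \\
&= (x - x^{-1}) \int_1^{\infty} \frac{1}{y^2 + (x + x^{-1}) y + 1}\, dy
= (x - x^{-1})\, \frac{x \log x}{x^2 - 1} = \log x;
\end{align*}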

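this rewrite makes the integrability completely clear. Integrating the resulting identity

\[ \frac{\log x}{x} = \int_1^{\infty} \left\{ \frac{1}{x + y} - \frac{1}{x^2} \frac{1}{y + x^{-1}} \right\} \frac{1}{y}\, dy \]

with respect to $x$ on both sides over $[1, z]$ and using Fubini’s theorem (permitted because the integrand is non-negative) yields

\begin{align*}
\frac{1}{2} (\log z)^2
&= \int_1^{z} \left\{ \int_1^{\infty} \left\{ \frac{1}{x + y} - \frac{1}{x^2} \frac{1}{y + x^{-1}} \right\} \frac{1}{y}\, dy \right\} dx \\
&= \int_1^{\infty} \left\{ \int_1^{z} \left\{ \frac{1}{x + y} - \frac{1}{x^2} \frac{1}{y + x^{-1}} \right\} dx \right\} \frac{1}{y}\, dy \\
&= \int_1^{\infty} \left\{ \log(y + x) + \log(y + x^{-1}) \right\} \Big|_{x=1}^{x=z}\, \frac{1}{y}\, dy \\
&= -\int_1^{\infty} \left\{ \log\left( \frac{y + 1}{y + z} \right) + \log\left( \frac{y + 1}{y + z^{-1}} \right) \right\} \frac{1}{y}\, dy.
\end{align*}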

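Making the change of variable of integration $y = e^s$ gives

\[ \frac{1}{2} (\log z)^2 = -\int_0^{\infty} \left\{ \log\left( \frac{e^s + 1}{e^s + z} \right) + \log\left( \frac{e^s + 1}{e^s + z^{-1}} \right) \right\} ds. \]

Changing the variable $z$ to $e^x$ then yields the claim:

\[ -\frac{1}{2} x^2 = \int_0^{\infty} \left\{ \log\left( \frac{e^s + 1}{e^s + e^x} \right) + \log\left( \frac{e^s + 1}{e^s + e^{-x}} \right) \right\} ds. \]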



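The claimed identity involving $g$ follows by direct substitution. To see the second form of $g$, note that

\begin{align*}
\int_1^{\infty} \log\left( \frac{y + 1}{y + z} \right) \frac{1}{y}\, dy
&= \int_1^{\infty} \log\left( \frac{1 + 1/y}{1 + z/y} \right) \frac{1}{y}\, dy \\
&= \int_0^{1} \log\left( \frac{1 + t}{1 + tz} \right) \frac{1}{t}\, dt
&& \text{by the change of variables } t = 1/y \\
&= \int_0^{1} \log(1 + t)\, \frac{1}{t}\, dt - \int_0^{1} \log(1 + tz)\, \frac{1}{t}\, dt \\
&= \frac{\pi^2}{12} + \int_{-z}^{0} \log(1 - s)\, \frac{1}{s}\, ds
&& \text{by the change of variable } -s = tz.
\end{align*}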

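To see that $g$ is log-concave, note that

\[ -\log g(z) = -\frac{1}{4} \log(2/\pi) - \frac{\pi^2}{12} - z - \int_{-e^z}^{0} \frac{\log(1 - t)}{t}\, dt, \]

and hence

\[ (-\log g)'(z) = -1 + \frac{\log(1 + e^z)}{-e^z} \cdot (-e^z) = -1 + \log(1 + e^z), \]

\[ (-\log g)''(z) = \frac{e^z}{1 + e^z} \ge 0. \]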

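Integrability of $g$ follows from

\[ -\log g(z) \sim \begin{cases} -z & \text{as } z \to -\infty, \\ -z + z^2/2 & \text{as } z \to \infty, \end{cases} \]

where we used

\[ \int_{-y}^{0} \frac{\log(1 - t)}{t}\, dt \sim \begin{cases} 0 & \text{as } y \searrow 0, \\ -(1/2)(\log y)^2 & \text{as } y \to \infty. \end{cases} \]

Alternatively, note that $g \in \log(HM_\infty) \subset \log(HM_1) = PF_2 =$ log-concave via Bondesson (1992), (5.2.3) page 73 and page 102.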
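The product representation in Proposition 4.1 lends itself to a direct numerical check. The following Python sketch (illustrative names; the truncation point `upper` and the Simpson grid size `n` are assumptions chosen for convenience) evaluates the integral in the first form of (4.3) by Simpson's rule and compares $\frac{1}{2} g(z) g(-z)$ with the standard normal density.

```python
import math


def log_ratio_integral(z, upper=40.0, n=4000):
    """Simpson approximation of int_0^upper log((e^s + 1)/(e^s + e^z)) ds.

    The integrand is rewritten as log1p(e^{-s}) - log1p(e^{z-s}), which is
    numerically stable for large s; n must be even.
    """
    def integrand(s):
        return math.log1p(math.exp(-s)) - math.log1p(math.exp(z - s))

    h = upper / n
    total = integrand(0.0) + integrand(upper)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0


def g(z):
    """First form of g in (4.3)."""
    return (2.0 / math.pi) ** 0.25 * math.exp(z + log_ratio_integral(z))


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
```

For example, `0.5 * g(1.0) * g(-1.0)` and `phi(1.0)` should agree up to quadrature error. A second difference of `log_ratio_integral` in $z$ also gives a quick numerical view of $(-\log g)''(z) = e^z/(1 + e^z)$.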

Figures 5-8 give plots of $g$, $h \equiv -\log g$, and the second and third derivatives of $h$, namely $h^{(2)}$ and $h^{(3)}$. In fact, easy computation shows that

\[ h^{(2)}(x) = (-\log g)''(x) = \frac{e^x}{1 + e^x}, \qquad h^{(3)}(x) = (-\log g)^{(3)}(x) = \frac{e^x}{(1 + e^x)^2}, \]

the logistic distribution function and density respectively.

Fig 5. The function $g$

Fig 6. The function $-\log g$

Fig 7. The function $(-\log g)''$

Fig 8. The function $(-\log g)^{(3)}$

Now we examine an interesting sub-class of Bondesson’s class $HM_\infty$ which provides a rich class of strongly log-concave densities when log-transformed to $\mathbb{R}$. From Bondesson (1992), page 73, $HM_\infty$ contains all (density) functions of the form

\[ f_Y(y) = C y^{\beta - 1} h_1(y)\, h_2(1/y), \qquad y > 0 \]

where

\[ h_j(y) = \exp\left( -b_j y + \int_1^{\infty} \log\left( \frac{v + 1}{v + y} \right) d\Gamma_j(v) \right) \]

for some $b_j \ge 0$ and non-negative measures $\Gamma_j$, $j = 1, 2$, on $[1, \infty)$. Thus $X = \log Y$ has density $f_X$ given by

\begin{align}
f_X(x) &= f_Y(e^x)\, e^x = C e^{(\beta - 1)x} h_1(e^x)\, h_2(e^{-x})\, e^x \nonumber \\
&= C e^{\beta x} h_1(e^x)\, h_2(e^{-x}) = C e^{\beta x} g_1(x)\, g_2(-x) \tag{4.6} \\
&= C\, \tilde g_1(x)\, \tilde g_2(-x) \tag{4.7}
\end{align}

where

\[ g_j(x) \equiv h_j(e^x) = \exp\left( -b_j e^x + \int_1^{\infty} \log\left( \frac{v + 1}{v + e^x} \right) d\Gamma_j(v) \right), \tag{4.8} \]

\[ \tilde g_j(x) \equiv e^{\beta_j x} g_j(x), \tag{4.9} \]

$\beta_1 - \beta_2 = \beta$, and $\Gamma_j$ satisfies $\int_1^{\infty} (1 + v)^{-1}\, d\Gamma_j(v) < \infty$. Note that

\begin{align*}
-\log g_j(x) &= b_j e^x - \int_1^{\infty} \log\left( \frac{v + 1}{v + e^x} \right) d\Gamma_j(v), \\
(-\log g_j)'(x) &= b_j e^x + \int_1^{\infty} \frac{e^x}{v + e^x}\, d\Gamma_j(v) = b_j e^x + \int_1^{\infty} \left( 1 - \frac{v}{v + e^x} \right) d\Gamma_j(v), \\
(-\log g_j)''(x) &= b_j e^x + \int_1^{\infty} \frac{v e^x}{(v + e^x)^2}\, d\Gamma_j(v) \equiv v_j(x),
\end{align*}

and from this last expression we see that $v_j(x) \ge 0$; i.e. $g_1, g_2 \in PF_2$. (This is also easy and known since $\log(HM_\infty) \subset \log(HM_1) = PF_2$; see e.g. Bondesson (1992), p. 102.)

To show that $f_X$ in (4.7) is strongly log-concave, we want to show that for some $c$

\[ v_1(x) + v_2(-x) \ge c > 0 \quad \text{for all } x \ge 0 \]

under some conditions on $b_1, b_2$ and $\Gamma_1, \Gamma_2$. If $v_1 = v_2 \equiv v$, this is clearly implied by convexity of $v$ with $c = 2v(0)$. On the other hand, since $v_j(x) \ge 0$ from log-concavity of $g_j$, to prove that strong log-concavity holds (perhaps with a sub-optimal constant) it suffices to show that

\[ v_j(x) \ge v_j(0) > 0 \quad \text{for all } x \ge 0 \text{ for either } j = 1 \text{ or } j = 2. \]

The following proposition isolates simple sufficient conditions under which strong log-concavity of $f_X$ given by (4.7) - (4.9) holds.

Proposition 4.2. Suppose that $f_X$ is given by (4.7) - (4.9).
A. If $b_1 > 0$ and $b_2 > 0$, then $f_X$ is strongly log-concave for any (all) measures $\Gamma_1$ and $\Gamma_2$.
B. Suppose $b_1 = 0$ or $b_2 = 0$ and $d\Gamma_j(v) = v^{-1} r_j(v)\, dv$ for $j = 1, 2$ where (at least one of) $r_1$ and $r_2$ satisfy: (i) $r_j(y) \ge 0$ for all $y \in [1, \infty)$ with strict inequality for some $y > 0$; (ii) $r_j$ is non-decreasing. Then $v_j(x) \equiv (-\log g_j)''(x) \ge v_j(0) > 0$ for all $x \ge 0$ and hence $f_X$ is strongly log-concave with $(-\log f_X)''(x) \ge \max\{v_1(0), v_2(0)\} > 0$.

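Proof. For part A, note that

\begin{align*}
(-\log f_X)''(x) &= b_1 e^x + b_2 e^{-x} + \int_1^{\infty} \frac{v e^x}{(v + e^x)^2}\, d\Gamma_1(v) + \int_1^{\infty} \frac{v e^{-x}}{(v + e^{-x})^2}\, d\Gamma_2(v) \\
&\ge b_1 e^x + b_2 e^{-x} \ge 2\sqrt{b_1 b_2} > 0 \quad \text{for all } x.
\end{align*}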



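For part B, since (at least one) $\Gamma_j$ has density $\gamma_j(v) = v^{-1} r_j(v)$,

\begin{align*}
(-\log g_j)''(x) &= \int_0^{\infty} \frac{e^{w - x}}{(1 + e^{w - x})^2}\, e^w \gamma_j(e^w)\, dw \\
&= \int_0^{\infty} \frac{e^{w - x}}{(1 + e^{w - x})^2}\, r_j(e^w)\, dw \qquad \text{since } \gamma_j(y) = y^{-1} r_j(y) \\
&= \int_{-x}^{\infty} \frac{e^z}{(1 + e^z)^2}\, r_j(e^{z + x})\, dz \\
&= E\, r_j(e^{Z + x})\, 1\{Z \ge -x\} \equiv v_j(x)
\end{align*}

where $Z \sim$ standard logistic with density $e^z/(1 + e^z)^2$. Then it follows that

\begin{align*}
v_j(x) &= E\, r_j(e^{Z + x})\, 1\{Z + x \ge 0\} = E\, r_j(e^{Z + x})\, 1_{[-x, 0]}(Z) + E\, r_j(e^{Z + x})\, 1_{(0, \infty)}(Z) \\
&\ge E\, r_j(e^{Z + x})\, 1_{(0, \infty)}(Z) \ge E\, r_j(e^{Z})\, 1_{(0, \infty)}(Z) = v_j(0) > 0.
\end{align*}

Thus $(-\log f_X)''(x) = v_1(x) + v_2(-x) \ge v_1(0) + 0 = v_1(0)$ if the hypothesis holds for $j = 1$, while $(-\log f_X)''(x) = v_1(x) + v_2(-x) \ge 0 + v_2(0) = v_2(0)$ if the hypothesis holds for $j = 2$. Together these imply the claimed inequality.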

Example. Note that $r(y) = (\log y)^c$ with $c > 0$ satisfies the hypotheses of the proposition. Furthermore, this choice with $c = 0$ yields the standard normal density. Note that in this case we have

\[ v(x) = (-\log g)''(x) = E (Z + x)^c\, 1\{Z \ge -x\} \sim x^c \quad \text{as } x \to \infty. \]
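The quantity $v(x) = E (Z + x)^c 1\{Z \ge -x\}$ in the example is easy to evaluate by quadrature. The sketch below (Python; the function names, the truncation `tail`, and the grid size `n` are illustrative assumptions) uses the substitution $u = z + x$ and Simpson's rule; it is intended for integer $c \ge 0$, where the integrand is smooth at $u = 0$.

```python
import math


def logistic_density(z):
    """Standard logistic density e^z/(1+e^z)^2, in a symmetric overflow-safe form."""
    e = math.exp(-abs(z))
    return e / (1.0 + e) ** 2


def v(x, c, tail=40.0, n=8000):
    """Simpson approximation of E[(Z + x)^c 1{Z >= -x}], Z standard logistic,
    written as int_0^upper u^c * density(u - x) du with u = z + x; n must be even."""
    upper = max(x, 0.0) + tail
    h = upper / n

    def integrand(u):
        return (u ** c) * logistic_density(u - x)

    total = integrand(0.0) + integrand(upper)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0
```

With $c = 0$ this reduces to $P(Z \ge -x) = e^x/(1 + e^x)$, matching $h^{(2)}$ for the Gaussian case; with $c = 1$ one gets $v(0) = \log 2$, and $v(x)/x \to 1$ for large $x$, in line with $v(x) \sim x^c$.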

5. Consequences of log-concavity and strong log-concavity of f

Log-concavity of Chernoff’s density implies that the peakedness results of Proschan (1965) and Olkin and Tong (1988) apply. See also Marshall and Olkin (1979), page 373, and Marshall et al. (2011).

Note that the conclusion of the conjectured Theorem 3.1 is exactly the form of the hypothesis of the inequality of Harge (2004) and of Theorem 11, page 559, of Caffarelli (2000); see also Barthe (2006), Theorem 2.4, page 1532. Another implication is that a theorem of Caffarelli (2000) applies: the transportation map $T = \nabla\varphi$ is a contraction. In our particular one-dimensional special case the transportation map $T$ satisfying $T(X) \stackrel{d}{=} Z$ for $X \sim N(0, 1)$ is just the solution of $\Phi(z) = F_Z(T(z))$, or equivalently $T(z) = F_Z^{-1}(\Phi(z))$. This function is apparently connected to another question concerning convex ordering of $F_Z$ and $\Phi(\cdot)$ in the sense of van Zwet (1964b); see also van Zwet (1964a): is $T^{-1}(w) = \Phi^{-1}(F_Z(w))$ convex for $w > 0$?

6. Problems remaining

The structure of the standard normal density $\phi$ given in (4.2) and (4.3) is exactly the same as that of Chernoff’s density (1.2) where $g$ has Fourier transform given in (1.4). In this case we know from Section 2 that $g \in PF_\infty$. Two natural questions are: (a) Does the function $g$ in (1.2) satisfy $g \in \log(HM_\infty)$? (b) Does the function $g$ in (4.3) satisfy $g \in PF_\infty$?

A further question remaining from Section 3: Is Chernoff’s density strongly log-concave? A whole class of further problems involves replacing the (ordered) convex cone $K_n$ in Section 1 by the convex cone $\tilde K_n$ corresponding to a convexity restriction as in section 2 of Groeneboom et al. (2001b). In this latter case the limiting distribution has only been described in terms of two-sided Brownian motion; see Groeneboom et al. (2001a,b).

Acknowledgements: We owe thanks to Guenther Walther for pointing us to Karlin (1968) and Schoenberg’s theorem. We also thank Tilmann Gneiting for several helpful discussions.

References

Ayer, M., Brunk, H. D., Ewing, G. M., Reid, W. T. and Silverman, E. (1955). An empirical distribution function for sampling with incomplete information. Ann. Math. Statist. 26 641–647.

Barthe, F. (2006). The Brunn-Minkowski theorem and related geometric and functional inequalities. In International Congress of Mathematicians. Vol. II. Eur. Math. Soc., Zurich, 1529–1546.

Bondesson, L. (1992). Generalized gamma convolutions and related classes of distributions and densities, vol. 76 of Lecture Notes in Statistics. Springer-Verlag, New York.

Bondesson, L. (1997). On hyperbolically monotone densities. In Advances in the theory and practice of statistics. Wiley Ser. Probab. Statist. Appl. Probab. Statist., Wiley, New York, 299–313.

Brunk, H. D. (1970). Estimation of isotonic regression. In Nonparametric Techniques in Statistical Inference (Proc. Sympos., Indiana Univ., Bloomington, Ind., 1969). Cambridge Univ. Press, London, 177–197.

Caffarelli, L. A. (2000). Monotonicity properties of optimal transportation and the FKG and related inequalities. Comm. Math. Phys. 214 547–563.

Chernoff, H. (1964). Estimation of the mode. Ann. Inst. Statist. Math. 16 31–41.

Daniels, H. E. and Skyrme, T. H. R. (1985). The maximum of a random walk whose mean path has a maximum. Adv. in Appl. Probab. 17 85–99.

Dehling, H. and Philipp, W. (2002). Empirical process techniques for dependent data. In Empirical process techniques for dependent data. Birkhauser Boston, Boston, MA, 3–113.

Grenander, U. (1956a). On the theory of mortality measurement. I. Skand. Aktuarietidskr. 39 70–96.

Grenander, U. (1956b). On the theory of mortality measurement. II. Skand. Aktuarietidskr. 39 125–153 (1957).

Groeneboom, P. (1985). Estimating a monotone density. In Proceedings of the Berkeley conference in honor of Jerzy Neyman and Jack Kiefer, Vol. II (Berkeley, Calif., 1983). Wadsworth Statist./Probab. Ser., Wadsworth, Belmont, CA.

Groeneboom, P. (1989). Brownian motion with a parabolic drift and Airy functions. Probab. Theory Related Fields 81 79–109.

Groeneboom, P. (1996). Lectures on inverse problems. In Lectures on probability theory and statistics (Saint-Flour, 1994), vol. 1648 of Lecture Notes in Math. Springer, Berlin, 67–164.

Groeneboom, P. (2010). The maximum of Brownian motion minus a parabola. Electronic Journal of Probability 15 1930–1937.

Groeneboom, P. (2011). Vertices of the least concave majorant of Brownian motion with parabolic drift. Electronic Journal of Probability 16 2234–2258.

Groeneboom, P., Jongbloed, G. and Wellner, J. A. (2001a). A canonical process for estimation of convex functions: the “invelope” of integrated Brownian motion $+t^4$. Ann. Statist. 29 1620–1652.

Groeneboom, P., Jongbloed, G. and Wellner, J. A. (2001b). Estimation of a convex function: characterizations and asymptotic theory. Ann. Statist. 29 1653–1698.

Groeneboom, P. and Wellner, J. A. (1992). Information bounds and nonparametric maximum likelihood estimation, vol. 19 of DMV Seminar. Birkhauser Verlag, Basel.

Groeneboom, P. and Wellner, J. A. (2001). Computing Chernoff’s distribution. J. Comput. Graph. Statist. 10 388–400.

Harge, G. (2004). A convex/log-concave correlation inequality for Gaussian measure and an application to abstract Wiener spaces. Probab. Theory Related Fields 130 415–440.

Harrison, P. G. (1990). Laplace transform inversion and passage-time distributions in Markov processes. J. Appl. Probab. 27 74–87.

Huang, J. and Wellner, J. A. (1995). Estimation of a monotone density or monotone hazard under random censoring. Scand. J. Statist. 22 3–33.

Huang, Y. and Zhang, C.-H. (1994). Estimating a monotone density from censored observations. Ann. Statist. 22 1256–1274.

Janson, S., Louchard, G. and Martin-Lof, A. (2010). The maximum of Brownian motion with parabolic drift. Electronic Journal of Probability 15 1893–1929.

Karlin, S. (1968). Total positivity. Vol. I. Stanford University Press, Stanford, Calif.

Kim, J. and Pollard, D. (1990). Cube root asymptotics. Ann. Statist. 18 191–219.

Le Cam, L. (1986). The central limit theorem around 1935. Statist. Sci. 1 78–96. With comments, and a rejoinder by the author.

Leurgans, S. (1982). Asymptotic distributions of slope-of-greatest-convex-minorant estimators. Ann. Statist. 10 287–296.

Marshall, A. W. and Olkin, I. (1979). Inequalities: theory of majorization and its applications, vol. 143 of Mathematics in Science and Engineering. Academic Press Inc. [Harcourt Brace Jovanovich Publishers], New York.

Marshall, A. W., Olkin, I. and Arnold, B. C. (2011). Inequalities: theory of majorization and its applications. 2nd ed. Springer Series in Statistics, Springer, New York.

Merkes, E. P. and Salmassi, M. (1997). On univalence of certain infinite products. Complex Variables Theory Appl. 33 207–215.

Olkin, I. and Tong, Y. L. (1988). Peakedness in multivariate distributions. In Statistical decision theory and related topics, IV, Vol. 2 (West Lafayette, Ind., 1986). Springer, New York, 373–383.

Olver, F. W. J., Lozier, D. W., Boisvert, R. and Clark, C. W. (2010). NIST Handbook of Mathematical Functions. Cambridge University Press.

Prakasa Rao, B. L. S. (1969). Estimation of a unimodal density. Sankhya Ser. A 31 23–36.

Prakasa Rao, B. L. S. (1970). Estimation for distributions with monotone failure rate. Ann. Math. Statist. 41 507–519.

Proschan, F. (1965). Peakedness of distributions of convex combinations. Ann. Math. Statist. 36 1703–1706.

Rockafellar, R. T. and Wets, R. J.-B. (1998). Variational analysis, vol. 317 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin.

Salmassi, M. (1999). Inequalities satisfied by the Airy functions. J. Math. Anal. Appl. 240 574–582.

Schoenberg, I. J. (1951). On Polya frequency functions. I. The totally positive functions and their Laplace transforms. J. Analyse Math. 1 331–374.

Shorack, G. R. (2000). Probability for statisticians. Springer Texts in Statistics, Springer-Verlag, New York.

van Eeden, C. (1957). Maximum likelihood estimation of partially or completely ordered parameters. I. Nederl. Akad. Wetensch. Proc. Ser. A. 60 = Indag. Math. 19 128–136.

van Zwet, W. R. (1964a). Convex transformations: A new approach to skewness and kurtosis. Statistica Neerlandica 18 433–441.

van Zwet, W. R. (1964b). Convex transformations of random variables, vol. 7 of Mathematical Centre Tracts. Mathematisch Centrum, Amsterdam.

