MULTI-STAGE KERNEL-BASED CONDITIONAL QUANTILE PREDICTION IN TIME SERIES



Jan G. De Gooijer (a), Ali Gannoun (b) & Dawit Zerom (c)

(a) Department of Economic Statistics, University of Amsterdam, Roetersstraat 11, Amsterdam, 1018 WB, The Netherlands

(b) Université Montpellier II, Laboratoire de Probabilités et Statistique, Place Eugène Bataillon, 34095, Montpellier Cédex 5, France

(c) Department of Economic Statistics, Tinbergen Institute, University of Amsterdam, Roetersstraat 11, Amsterdam, 1018 WB, The Netherlands

Published online: 15 Feb 2007.

To cite this article: Jan G. De Gooijer, Ali Gannoun & Dawit Zerom (2001) MULTI-STAGE KERNEL-BASED CONDITIONAL QUANTILE PREDICTION IN TIME SERIES, Communications in Statistics - Theory and Methods, 30:12, 2499-2515, DOI: 10.1081/STA-100108445



ABSTRACT

We present a multi-stage conditional quantile predictor for time series of Markovian structure. It is proved that at any quantile level, $p \in (0, 1)$, the asymptotic mean squared error (MSE) of the new predictor is smaller than that of the single-stage conditional quantile predictor. A simulation study confirms this result in a small sample situation. Because the improvement by the proposed predictor increases for quantiles at the tails of the conditional distribution function, the multi-stage predictor can be used to compute better predictive intervals with smaller variability. Applying this predictor to the changes in the U.S. short-term interest rate, rather smooth out-of-sample predictive intervals are obtained.

Key Words: Conditional quantile; Kernel; Markovian; Mean squared error; Multi-stage predictor; Single-stage predictor; Time series

Copyright © 2001 by Marcel Dekker, Inc. COMMUN. STATIST. - THEORY METH., 30(12), 2499–2515 (2001)

1. INTRODUCTION

Over the past two decades there has been a growing interest in prediction methods that can accommodate a broad class of time series and apply when the usual assumptions of linearity and Gaussianity no longer hold. In this respect, nonparametric methods are of interest because they do not rely heavily on a priori time series assumptions and instead base statistical inference mainly on data; see, e.g., Härdle et al. (1) for a review. While nonparametric prediction using the conditional mean has dominated the literature (see, e.g., Eubank (2) and Härdle (3)), attention has recently also focussed on nonparametric estimation of other features of the conditional distribution. For example, Matzner-Løber et al. (4) and De Gooijer and Zerom (5) applied kernel estimates of the conditional median and the conditional mode to real time series.

In this paper the problem of quantile based multi-step prediction is considered for Markovian type time series processes. One unattractive feature of most nonparametric quantile prediction methods is that, when making more than one-step ahead predictions, not all the information contained in the past is used. Thus a substantial loss in prediction accuracy is likely to occur. To deal with this shortcoming, we directly exploit the Markovian property of the time series. It turns out that one or more of the unused data can be easily incorporated in a recursive manner while improving prediction efficiency. Motivated by this recursion idea, we propose a multi-stage kernel smoother for conditional quantiles. We also show theoretically that the asymptotic performance of the new predictor is superior to the corresponding single-stage conditional quantile estimator in terms of mean-squared error (MSE).

The remainder of the paper is structured as follows. In Section 2 we clarify the difference in information content used by the estimators of the single-stage and the multi-stage conditional quantiles. Section 3 contains the main result stating that the estimator of the two-stage conditional quantile has a smaller asymptotic MSE than the estimator of the single-stage conditional quantile. Empirical comparison of the single-stage and multi-stage predictors of the conditional quantile via a simulation study is carried out in Section 4. In Section 5 we evaluate the two prediction approaches in an application to the changes in the U.S. monthly interest rate series. Conditions and proofs are collected in the Appendix.

2. SINGLE-STAGE VERSUS MULTI-STAGE PREDICTION

Let $\{W_t;\ t \ge 1\}$ be a strictly stationary real-valued Markovian process of order $m$, i.e. $\mathcal{L}(W_t \mid W_{t-1}, \ldots, W_1) = \mathcal{L}(W_t \mid W_{t-1}, \ldots, W_{t-m})$, where $\mathcal{L}$ denotes the law. From the set of observations $W_1, \ldots, W_N$, we are interested in making predictions of $W_{N+H}$, where $H$ $(1 \le H \le N - m)$ denotes the prediction horizon. For that purpose, we construct the associated strictly stationary $\mathbb{R}^m \times \mathbb{R}$-valued process $(X_t, Z_t)$ defined by

$$X_t = (W_t, W_{t+1}, \ldots, W_{t+m-1}), \qquad Z_t = W_{t+H+m-1}, \qquad t \in \mathbb{N}. \tag{2.1}$$

Let $\{(X_t, Z_t);\ t \ge 1\}$ be a sequence of $\mathbb{R}^m \times \mathbb{R}$-valued strictly stationary random variables with common probability density function with respect to the Lebesgue measure $\lambda_{m+1}$ on $\mathbb{R}^{m+1}$. Further, we suppose that the conditional distribution function of $Z_t$ given $X_t = x$, $F(\cdot \mid x)$, has a unique quantile of order $p \in (0, 1)$ at a point $q(x)$, defined by

$$F(q(x) \mid x) = p. \tag{2.2}$$

Now, given the observations $(X_1, Z_1), \ldots, (X_n, Z_n)$, where $n = N - H - m + 1$, an estimator $q_n(x)$ of $q(x)$ can be defined as the root of the equation $F_n(z \mid x) = p$, where $F_n(\cdot \mid x)$ is an estimator of $F(\cdot \mid x)$. Thus a predictor of the $p$th conditional quantile of $W_{N+H}$ is given by $q_n(X_{N-m+1})$. Of course, in practice a nonparametric estimate of the conditional distribution function is needed. It is known that estimating $F(z \mid x)$, with $z \in \mathbb{R}$, can be regarded as a nonparametric regression problem, i.e. $F(z \mid x) = E(1\{Z_t \le z\} \mid X_t = x)$. Accordingly, Collomb (6) defined the following empirical kernel-based estimate of $F(z \mid x)$,

$$\tilde{F}_n(z \mid x) = \frac{\sum_{t=1}^{n} K\{(x - X_t)/h_n\}\, 1\{Z_t \le z\}}{\sum_{t=1}^{n} K\{(x - X_t)/h_n\}}, \tag{2.3}$$

where $1\{A\}$ denotes the indicator function for set $A$, $K(\cdot)$ is a nonnegative density function (kernel), and $h_n$ is a smoothing parameter called the bandwidth. Other kernel smoothers which are always a distribution function like (2.3) but with better bias characteristics have also been proposed in the literature.
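To make the kernel estimate (2.3) concrete, the following is a minimal Python sketch for a scalar covariate ($m = 1$). It is not the authors' MATLAB code. The function names, the Gaussian kernel, and the convention of taking the "root" of the step function as the smallest observed $Z_t$ at which the estimate reaches $p$ are assumptions made here for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian density, used here as the kernel K (an illustrative choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def single_stage_cdf(z, x, xs, zs, h):
    """Kernel estimate of F(z | x) as in (2.3), for scalar X_t (m = 1)."""
    weights = [gaussian_kernel((x - xt) / h) for xt in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights are zero; increase the bandwidth h")
    # Numerator of (2.3): kernel weights of the observations with Z_t <= z.
    hits = sum(w for w, zt in zip(weights, zs) if zt <= z)
    return hits / total


def single_stage_quantile(p, x, xs, zs, h):
    """Smallest observed Z_t at which the estimated CDF reaches level p.

    The estimate (2.3) is a step function in z, so the root of (2.4) is
    taken as a generalized inverse (an assumption of this sketch).
    """
    for z in sorted(zs):
        if single_stage_cdf(z, x, xs, zs, h) >= p:
            return z
    return max(zs)
```

A call such as `single_stage_quantile(0.9, x, xs, zs, h)` then gives a sketch of the 90th conditional quantile at `x`; the scan over sorted values is kept deliberately simple rather than fast.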


We shall refer to the solution of the equation

$$\tilde{F}_n(z \mid x) = p \tag{2.4}$$

as the single-stage conditional quantile predictor and denote this by $\tilde{q}_n(x)$. Note that the conditional quantile predictor in (2.4) uses only the information in the pairs $(Z_t, X_t)$ $(t = 1, \ldots, n)$ and ignores the information contained in

$$Y_t^{(1)} = X_{t+1}, \quad Y_t^{(2)} = X_{t+2}, \quad \ldots, \quad Y_t^{(H-1)} = X_{t+(H-1)}. \tag{2.5}$$
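The bookkeeping behind (2.1) and (2.5) can be sketched in a few lines of Python. The helper below is hypothetical (its name and return layout are not from the paper); it builds $X_t$, the intermediate blocks $Y_t^{(j)}$ and the responses $Z_t$ from an observed series, using $n = N - H - m + 1$ as in the text.

```python
def build_lagged_sample(w, m, H):
    """Return (xs, ys, zs) built from the series w = [W_1, ..., W_N].

    xs[t-1]      = X_t     = (W_t, ..., W_{t+m-1})
    ys[j-1][t-1] = Y_t^(j) = X_{t+j},  j = 1, ..., H-1
    zs[t-1]      = Z_t     = W_{t+H+m-1}
    for t = 1, ..., n with n = N - H - m + 1.
    """
    N = len(w)
    n = N - H - m + 1
    if n < 1:
        raise ValueError("series too short for the requested m and H")

    def block(t):
        # t is 1-based, as in the text; Python lists are 0-based.
        return tuple(w[t - 1 : t - 1 + m])

    xs = [block(t) for t in range(1, n + 1)]
    ys = [[block(t + j) for t in range(1, n + 1)] for j in range(1, H)]
    zs = [w[t + H + m - 2] for t in range(1, n + 1)]
    return xs, ys, zs
```

With $H = 1$ the list `ys` is empty, which reflects the fact that no information is ignored in one-step prediction.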

Below we illustrate the impact of the data contained in (2.5) on multi-step prediction accuracy. Let $g_1(y) = E(1\{Z_t \le z\} \mid Y_t^{(H-1)} = y)$. For $j = 2, \ldots, H-1$, also define $g_j(y) = E(g_{j-1}(Y_t^{(H-(j-1))}) \mid Y_t^{(H-j)} = y)$. It is well known that for a pair of random variables $(B, C)$, $\mathrm{Var}(C) = E[\mathrm{Var}(C \mid B)] + \mathrm{Var}[E(C \mid B)]$. Hence, $\mathrm{Var}[g_j(Y_t^{(H-j)})] = \mathrm{Var}[E(g_j(Y_t^{(H-j)}) \mid Y_t^{(H-j-1)})] + E[\mathrm{Var}(g_j(Y_t^{(H-j)}) \mid Y_t^{(H-j-1)})]$. But, for $j = 1, \ldots, H-2$, we have $g_{j+1}(Y_t^{(H-j-1)}) = E(g_j(Y_t^{(H-j)}) \mid Y_t^{(H-j-1)})$. Thus

$$\mathrm{Var}\left[g_{j+1}\left(Y_t^{(H-j-1)}\right)\right] \le \mathrm{Var}\left[g_j\left(Y_t^{(H-j)}\right)\right]. \tag{2.6}$$

Similarly, it is also easy to see that

$$\mathrm{Var}\left[g_1\left(Y_t^{(H-1)}\right) \mid X_t = x\right] \le \mathrm{Var}\left[1\{Z_t \le z\} \mid X_t = x\right]. \tag{2.7}$$

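Now, directly exploiting the Markovian property of $W_t$, we can rewrite $E(1\{Z_t \le z\} \mid X_t = x)$ in such a way that the information in (2.5) is incorporated, i.e.

$$\begin{aligned}
E\left(1\{Z_t \le z\} \mid X_t = x\right) &= E\left\{g_1\left(Y_t^{(H-1)}\right) \mid X_t = x\right\}, \\
&= E\left\{g_2\left(Y_t^{(H-2)}\right) \mid X_t = x\right\}, \\
&\;\;\vdots \\
&= E\left\{g_{H-1}\left(Y_t^{(1)}\right) \mid X_t = x\right\}.
\end{aligned} \tag{2.8}$$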

Observe that as we go down each line in (2.8) more and more information is utilized. Recalling the two previous inequalities, (2.6) and (2.7), we can see that as more information is used, the prediction variance gets smaller and hence prediction accuracy, in terms of MSE, improves. Thus, at least in theory, it pays off to use all the ignored information.
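The first equality in (2.8) and the inequality (2.7) both rest on one conditioning step that is not written out above. The block below is a sketch of that step. It assumes, as the Markov property of order $m$ implies, that $Z_t$ depends on the past only through the $m$ values collected in $Y_t^{(H-1)}$.

```latex
\begin{aligned}
E\bigl(1\{Z_t \le z\} \mid X_t = x\bigr)
  &= E\bigl[\,E\bigl(1\{Z_t \le z\} \mid Y_t^{(H-1)}, X_t\bigr) \bigm| X_t = x\bigr]
     && \text{(tower property)} \\
  &= E\bigl[\,E\bigl(1\{Z_t \le z\} \mid Y_t^{(H-1)}\bigr) \bigm| X_t = x\bigr]
     && \text{(Markov property)} \\
  &= E\bigl[\,g_1\bigl(Y_t^{(H-1)}\bigr) \bigm| X_t = x\bigr], \\[4pt]
\mathrm{Var}\bigl(1\{Z_t \le z\} \mid X_t = x\bigr)
  &= \mathrm{Var}\bigl[\,g_1\bigl(Y_t^{(H-1)}\bigr) \bigm| X_t = x\bigr]
   + E\bigl[\,\mathrm{Var}\bigl(1\{Z_t \le z\} \mid Y_t^{(H-1)}\bigr) \bigm| X_t = x\bigr] \\
  &\ge \mathrm{Var}\bigl[\,g_1\bigl(Y_t^{(H-1)}\bigr) \bigm| X_t = x\bigr].
\end{aligned}
```

The second display is the law of total variance applied conditionally on $X_t = x$. The non-negative term that is dropped is exactly the variance removed by smoothing on the intermediate observations first.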

Based on the above recursive set-up, we now introduce a kernel-based estimator of $F(z \mid x)$. First the estimators of $g_1(y)$ and $g_j(y)$ $(j = 2, \ldots, H-1)$ are defined, respectively, as follows.

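Stage 1:

$$\hat{g}_1(y) = \frac{\sum_{i=1}^{n} K\left\{\left(y - Y_i^{(H-1)}\right)/h_{1,n}\right\} 1\{Z_i \le z\}}{\sum_{i=1}^{n} K\left\{\left(y - Y_i^{(H-1)}\right)/h_{1,n}\right\}},$$

Stage $j$:

$$\hat{g}_j(y) = \frac{\sum_{s=1}^{n} K\left\{\left(y - Y_s^{(H-j)}\right)/h_{j,n}\right\} \hat{g}_{j-1}\left(Y_s^{(H-(j-1))}\right)}{\sum_{s=1}^{n} K\left\{\left(y - Y_s^{(H-j)}\right)/h_{j,n}\right\}}.$$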

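Then, using $\hat{g}_{H-1}(y)$, we compute $\hat{F}_n(z \mid x)$ by

Stage $H$:

$$\hat{F}_n(z \mid x) = \frac{\sum_{k=1}^{n} K\{(x - X_k)/h_{H,n}\}\, \hat{g}_{H-1}\left(Y_k^{(1)}\right)}{\sum_{k=1}^{n} K\{(x - X_k)/h_{H,n}\}}. \tag{2.9}$$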

We shall refer to the root of the equation $\hat{F}_n(z \mid x) = p$ as the multi-stage $p$-conditional quantile predictor $\hat{q}_n(x)$. The above procedure is easy to implement. A MATLAB code is available upon request from the authors.
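The recursion of Stages 1 to $H$ can be sketched in Python as follows. This is not the authors' MATLAB code: the function names, the Gaussian kernel, the restriction to a scalar series ($m = 1$) and the bisection over observed responses are all assumptions of this illustration. The sketch relies on the fact that Stage $j+1$ only needs $\hat{g}_j$ at the sample points $Y_s^{(H-j)}$, so each stage stores fitted values rather than functions.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian density, used here as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nw_smooth(y, points, values, h):
    """Kernel-weighted average of `values`, with weights centred at y."""
    weights = [gaussian_kernel((y - pt) / h) for pt in points]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights are zero; increase the bandwidth")
    return sum(wt * v for wt, v in zip(weights, values)) / total


def multi_stage_cdf(z, x, w, H, bandwidths):
    """H-stage estimate of F(z | x) for a scalar series w = [W_1, ..., W_N].

    bandwidths = [h_1, ..., h_H]; with H = 1 this is the single-stage estimate.
    """
    if len(bandwidths) != H:
        raise ValueError("need exactly one bandwidth per stage")
    n = len(w) - H  # n = N - H - m + 1 with m = 1
    if n < 1:
        raise ValueError("series too short for horizon H")
    # Responses entering Stage 1: indicators 1{Z_i <= z}, Z_i = W_{i+H}.
    responses = [1.0 if w[i + H] <= z else 0.0 for i in range(n)]
    # Stages 1, ..., H-1: smooth on Y^(H-j) and keep the fitted values
    # g_j(Y_i^(H-j)), which are the responses needed by the next stage.
    for j in range(1, H):
        points = [w[i + H - j] for i in range(n)]  # Y_i^(H-j) = W_{i+H-j}
        responses = [nw_smooth(pt, points, responses, bandwidths[j - 1])
                     for pt in points]
    # Stage H: smooth the last fitted values on X_i = W_i, evaluated at x.
    return nw_smooth(x, w[:n], responses, bandwidths[H - 1])


def multi_stage_quantile(p, x, w, H, bandwidths):
    """Smallest observed Z_i at which the multi-stage CDF reaches level p."""
    n = len(w) - H
    candidates = sorted(set(w[i + H] for i in range(n)))
    lo, hi = 0, len(candidates) - 1
    while lo < hi:  # bisection is valid because the estimate is monotone in z
        mid = (lo + hi) // 2
        if multi_stage_cdf(candidates[mid], x, w, H, bandwidths) >= p:
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]
```

Each stage costs on the order of $n^2$ kernel evaluations, which is unproblematic for sample sizes like those used in Section 4 but would need vectorising for long series.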

3. ASYMPTOTIC MSEs

Before comparing the asymptotic MSE of the multi-stage conditional quantile smoother with the asymptotic MSE of the single-stage conditional quantile smoother, we first present some results on the asymptotic properties of conditional quantiles. To this end, we assume for simplicity of notation that $H = 2$, $m = 1$. From the process $(W_i)$, let us construct the associated process $(X_i, Y_i, Z_i)$ defined by

$$X_i = W_i, \qquad Y_i = Y_i^{(1)} = W_{i+1}, \qquad Z_i = W_{i+2}.$$

Assume that the $\mathbb{R} \times \mathbb{R} \times \mathbb{R}$-valued $(X_i, Y_i, Z_i)$ is a sequence of independent and strictly stationary random vectors with the same distribution as a vector $(X, Y, Z)$ defined on a probability space $(\Omega, \mathcal{F}, P)$.

We suppose that the conditional distribution function $F(\cdot \mid x)$ of $Z$ given $X = x$, where $x \in \mathbb{R}$, admits a unique conditional quantile (of order $p \in (0, 1)$) at a point $q(x)$. We also suppose that the random vector $(X, Y)$ (respectively $(Y, Z)$) has a joint density $p_{X,Y}(\cdot, \cdot)$ (respectively $p_{Y,Z}(\cdot, \cdot)$). Let $p_X(\cdot)$, $p_Y(\cdot)$ and $p_Z(\cdot)$ be the marginal densities of $X$, $Y$, and $Z$, and $p_{Z|X}(\cdot \mid x) = p_{X,Z}(x, \cdot)/p_X(x)$ be the conditional density function. Furthermore, we define

$$F^{(i,j)}(t \mid s) = \frac{\partial^{i+j} F(t \mid s)}{\partial s^i\, \partial t^j}, \qquad \text{and} \qquad p_X^{(1)}(x) = \frac{d p_X(x)}{dx}.$$



Given the above set-up, and using the assumptions given in the Appendix, the following Lemma and Theorems can be stated.

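Lemma (Collomb (6); Hall et al. (7)): Let $(X, Z)$ be $\mathbb{R}^2$-valued random variables. For $z \in \mathbb{R}$, define $\sigma^2(z, x) = \mathrm{Var}(1\{Z \le z\} \mid X = x)$. Assume that Assumptions (A.1)–(A.3) given in the Appendix are satisfied. If $nh_n \to \infty$, then

$$E\{\tilde{F}_n(z \mid x) - F(z \mid x)\}^2 = \frac{1}{nh_n} D_2(z, x) + \frac{h_n^4 D_1(z, x)}{4} + o\left(h_n^4 + \frac{1}{nh_n}\right) \tag{3.1}$$

where

$$D_1(z, x) = k_1^2 \left\{ F^{(2,0)}(z \mid x) + \frac{2 F^{(1,0)}(z \mid x)\, p_X^{(1)}(x)}{p_X(x)} \right\}^2, \qquad D_2(z, x) = \frac{k_2\, \sigma^2(z, x)}{p_X(x)},$$

and where $k_1$ and $k_2$ are constants defined respectively as $k_1 = \int_{\mathbb{R}} u^2 K(u)\, du$ and $k_2 = \int_{\mathbb{R}} K^2(u)\, du$.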

Theorem 1: Assume that Assumptions (A.1)–(A.5) given in the Appendix are satisfied. If $nh_n \to \infty$ as $n \to \infty$, then for all $x \in \mathbb{R}$ where $p_{Z|X}(q(x) \mid x) \ne 0$, the asymptotic pointwise MSE of $\tilde{q}_n(x)$ is given by

$$E\{\tilde{q}_n(x) - q(x)\}^2 = \frac{1}{p_{Z|X}^2(q(x) \mid x)} \left( \frac{h_n^4 D_1(q(x), x)}{4} + \frac{1}{nh_n} D_2(q(x), x) \right) + o\left(h_n^4 + \frac{1}{nh_n}\right). \tag{3.2}$$

Furthermore, under the same assumptions mentioned above and if $D_1(q(x), x) \ne 0$, the asymptotically optimal value $h_n^*$, say, of $h_n$ minimizing (3.2) is given by

$$h_n^* = \left\{ \frac{D_2(q(x), x)}{D_1(q(x), x)} \right\}^{1/5} n^{-1/5},$$

and the corresponding best possible MSE of $\tilde{q}_n(x)$ is given by

$$\mathrm{MSE}^*\{\tilde{q}_n(x)\} \simeq \frac{5 n^{-4/5}}{4 p_{Z|X}^2(q(x) \mid x)}\, D_2^{4/5}(q(x), x)\, D_1^{1/5}(q(x), x). \tag{3.3}$$
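The optimal bandwidth and (3.3) follow from a one-line minimisation of the leading term of (3.2). The sketch below uses the shorthand $A = D_1(q(x), x)$, $B = D_2(q(x), x)$ and $c = p_{Z|X}^{-2}(q(x) \mid x)$; these abbreviations are introduced here only for readability.

```latex
\begin{aligned}
M(h) &= c\left(\frac{A h^4}{4} + \frac{B}{n h}\right), \qquad
M'(h) = c\left(A h^3 - \frac{B}{n h^2}\right) = 0
\;\Longrightarrow\; h^5 = \frac{B}{A n}
\;\Longrightarrow\; h^* = \left(\frac{B}{A}\right)^{1/5} n^{-1/5}, \\[4pt]
M(h^*) &= c\left(\tfrac{1}{4} A^{1/5} B^{4/5} n^{-4/5} + A^{1/5} B^{4/5} n^{-4/5}\right)
        = \frac{5}{4}\, c\, A^{1/5} B^{4/5}\, n^{-4/5}.
\end{aligned}
```

Since $M''(h) > 0$ for $h > 0$ when $A, B > 0$, the stationary point is the minimiser. The same calculation with $D_3$ in place of $D_2$ gives the corresponding expressions for the two-stage estimator.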

Remark 1: Theorem 1 is a reformulation of Theorem 5.1 of Berlinet et al. (8), established for the double kernel smoothing estimator. A similar result can be found in Jones and Hall (9). Details of the proof are left to the reader.



The main result of the paper is stated as follows.

Theorem 2: Assume that Assumptions (A.1)–(A.5) given in the Appendix are satisfied, and that $nh_{1,n} \to \infty$ as $n \to \infty$ and $h_{1,n} = o(h_{2,n})$. For $z \in \mathbb{R}$, let $v_1(z, x) = \mathrm{Var}(g_1(Y) \mid x)$ ($g_1(Y)$ is as defined in Section 2). Then for all $x \in \mathbb{R}$ such that $p_{Z|X}(q(x) \mid x) \ne 0$, the asymptotic MSE of the two-stage estimator $\hat{q}_n(x)$ is given by

$$E\{\hat{q}_n(x) - q(x)\}^2 = \frac{1}{p_{Z|X}^2(q(x) \mid x)} \left( \frac{h_{2,n}^4 D_1(q(x), x)}{4} + \frac{1}{nh_{2,n}} D_3(q(x), x) \right) + o\left(h_{2,n}^4 + \frac{1}{nh_{2,n}}\right),$$

where

$$D_3(q(x), x) = k_2\, \frac{v_1(q(x), x)}{p_X(x)}.$$

Furthermore, it follows that the asymptotically optimal value $h_{2,n}^*$, say, of $h_{2,n}$ is given by

$$h_{2,n}^* = \left\{ \frac{D_3(q(x), x)}{D_1(q(x), x)} \right\}^{1/5} n^{-1/5},$$

and the corresponding best possible MSE is

$$\mathrm{MSE}^*\{\hat{q}_n(x)\} \simeq \frac{5 n^{-4/5}}{4 p_{Z|X}^2(q(x) \mid x)}\, D_3^{4/5}(q(x), x)\, D_1^{1/5}(q(x), x).$$

Corollary: Let $v_2(z, x) = E[\mathrm{Var}(1\{Z \le z\} \mid Y) \mid x]$. Then under the assumptions of Theorems 1 and 2, the ratio of the asymptotic best possible MSEs of the single-stage estimator $\tilde{q}_n(x)$ and the two-stage estimator $\hat{q}_n(x)$ is given by

$$r(q(x), x) = \left\{ 1 + \frac{v_2(q(x), x)}{v_1(q(x), x)} \right\}^{4/5} \ge 1.$$

Remark 2: Note that the asymptotic results are insensitive to the choice of the bandwidth $h_{1,n}$, provided $nh_{1,n} \to \infty$ and $h_{1,n} = o(h_{2,n})$.

Remark 3: It can be noticed from the proof of the Corollary that the asymptotic MSE of the two-stage smoother is smaller because $v_1(q(x), x) \le \sigma^2(q(x), x)$; see also the discussion in Section 2.

Remark 4: It can easily be verified that $\sigma^2(q(x), x) = p(1 - p)$. Thus we express the asymptotic ratio $r(q(x), x)$ as a function of $p$: $r(q(x), x) = \{p(1 - p)/(p(1 - p) - v_2(q(x), x))\}^{4/5}$. Note that $v_2(q(x), x) \le p(1 - p)$. Figure 1 shows a plot of $r$ versus $p$ $(0.1 \le p \le 0.9)$ for, say, $v_2 = 0.05$. Clearly $r$ increases sharply as we go to the edge of the conditional distribution. This illustrates theoretically that the improvement achieved by the multi-stage conditional estimator is more pronounced for quantiles in the tail of the conditional distribution.
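The shape of the curve in Figure 1 can be reproduced numerically from the formula in Remark 4. The short Python sketch below uses the same illustrative value $v_2 = 0.05$; the function name is hypothetical.

```python
def mse_ratio(p, v2):
    """Ratio r = {p(1-p) / (p(1-p) - v2)}^(4/5) from Remark 4."""
    sigma2 = p * (1.0 - p)
    if not 0.0 <= v2 < sigma2:
        raise ValueError("need 0 <= v2 < p(1-p)")
    return (sigma2 / (sigma2 - v2)) ** 0.8


# Tabulate r over a few quantile levels for v2 = 0.05.
for p in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(f"p = {p:.2f}  r = {mse_ratio(p, 0.05):.3f}")
```

For this value of $v_2$ the ratio is about 1.20 at the median and about 1.91 at $p = 0.1$ and $p = 0.9$, which is the tail effect described in the remark.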

Remark 5: The asymptotic results in Theorem 2 and the Corollary can be shown to hold when the observations are dependent. Proofs can be obtained under some assumptions on mixing coefficients and by replacing classical inequalities by Bernstein inequalities for mixing processes; see also De Gooijer et al. (10).

4. PRACTICAL PERFORMANCE

We have shown that the multi-stage conditional quantile estimator $\hat{q}_n(x)$ has a better prediction performance than the single-stage conditional quantile estimator $\tilde{q}_n(x)$ in terms of asymptotic MSE. In this section a simulated example is used to illustrate the finite sample performance of the new predictor. Note from Section 3 that the optimal bandwidth for both predictors depends on $p$. Thus the amount of smoothing required to estimate different parts of $F(\cdot \mid x)$ may differ from what is optimal to estimate the whole conditional distribution function. This is particularly the case for the tails of $F(\cdot \mid x)$. Therefore, a separate bandwidth is chosen for the computation of each $p$-conditional quantile. To this end, the following practical approach is employed. First a primary bandwidth, suitable for the conditional mean estimation, is selected. Then it is adjusted according to the following rule-of-thumb: $h_n = h_{\mathrm{mean}} [\{p(1 - p)\}/\{\phi(\Phi^{-1}(p))^2\}]^{1/(m+4)}$, where $h_{\mathrm{mean}}$ is the optimal bandwidth for the conditional mean, and $\phi$ and $\Phi$ are the standard normal density and distribution functions, respectively.

Figure 1. Ratio of asymptotic best possible MSEs ($r$) versus the quantile level $p$.
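The rule-of-thumb adjustment is simple to compute. A minimal Python sketch using only the standard library follows; the function name is hypothetical, and `h_mean` is assumed to have been chosen beforehand by whatever method is used for the conditional mean.

```python
from statistics import NormalDist


def quantile_bandwidth(h_mean, p, m=1):
    """Adjust a conditional-mean bandwidth to the quantile level p.

    Implements h_n = h_mean * [p(1-p) / phi(Phi^{-1}(p))^2]^(1/(m+4)).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    density_at_quantile = std_normal.pdf(std_normal.inv_cdf(p))
    factor = p * (1.0 - p) / density_at_quantile ** 2
    return h_mean * factor ** (1.0 / (m + 4))
```

At $p = 0.5$ and $m = 1$ the factor equals $(\pi/2)^{1/5}$, roughly 1.09, and it grows towards the tails, so that more smoothing is applied where data are sparse.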

The above approach is appropriate for the single-stage predictor $\tilde{q}_n(x)$. However, for the multi-stage predictor several values of the bandwidth need to be selected. For simplicity, we fix the bandwidth at the last prediction stage, say $h_{H,n}$, at the optimal value of the single-stage estimator $h_n$. The bandwidths in the intermediate stages are scaled downward arbitrarily vis-à-vis $h_{H,n}$. This is in accordance with the theory of Section 3. Different options such as $h_{H,n}$, $h_{H,n}/5$, $h_{H,n}/10$, and $h_{H,n}/20$ were tried and the last three seem to give more or less similar results. Hence only results for $h_{H,n}/5$ are reported. The standard Gaussian kernel is used throughout all computations.

Consider the simple, Markovian type, nonlinear autoregressive model of order 1

$$Z_t = 0.23 Z_{t-1}(16 - Z_{t-1}) + 0.4 \varepsilon_t, \tag{4.1}$$

where $\{\varepsilon_t\}$ is a sequence of iid random variables each with the standard normal distribution truncated in the interval $[-12, 12]$. The objective is to estimate 2- and 5-steps ahead $p$-conditional quantiles using both $\hat{q}_n(x)$ and $\tilde{q}_n(x)$ and compare their prediction accuracy. Predictions will be made at $p = 0.25$, $p = 0.50$, and $p = 0.75$. The conditional density of $Z_{t+H}$ given $Z_t = x$ will be examined at $x = 6$, $x = 8$, and $x = 10$. Clearly a proper evaluation of the accuracy of $\hat{q}_n(x)$ and $\tilde{q}_n(x)$ requires knowledge about the "true" conditional quantile $q(x)$. This information is obtained by generating 10,000 independent realizations of $(Z_{t+H} \mid Z_t = x)$ ($H = 2$ and 5), iterating the process (4.1) and computing the appropriate quantiles from the empirical conditional distribution function of these generated observations.
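The Monte Carlo approximation of the "true" conditional quantile can be sketched as follows. The function names, the rejection sampler for the truncated normal, the seed handling and the inverse-CDF convention for the empirical quantile are assumptions of this illustration rather than details given in the paper.

```python
import math
import random


def truncated_std_normal(rng, bound=12.0):
    """Draw from N(0, 1) truncated to [-bound, bound] by rejection sampling."""
    while True:
        e = rng.gauss(0.0, 1.0)
        if -bound <= e <= bound:
            return e


def step(z_prev, rng):
    """One transition of model (4.1)."""
    return 0.23 * z_prev * (16.0 - z_prev) + 0.4 * truncated_std_normal(rng)


def true_conditional_quantile(p, x, H, n_draws=10000, seed=0):
    """Empirical p-quantile of Z_{t+H} given Z_t = x from simulated paths."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        z = x
        for _ in range(H):  # iterate (4.1) H steps ahead from Z_t = x
            z = step(z, rng)
        draws.append(z)
    draws.sort()
    # Inverse of the empirical CDF: smallest draw with CDF value >= p.
    index = min(n_draws - 1, max(0, math.ceil(p * n_draws) - 1))
    return draws[index]
```

For example, `true_conditional_quantile(0.25, 8.0, 2)` approximates the lower quartile of the two-step-ahead conditional distribution at $x = 8$.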

From (4.1), 150 samples of sample size $n = 150$ were generated. Each replication had a unique seed. To compare the accuracy of the predictors $\hat{q}_n(x)$ and $\tilde{q}_n(x)$ with $q(x)$, the following error measures are computed for each replication $j$ $(j = 1, \ldots, 150)$:

$$e^j_{\tilde{q}_n}(x) = \frac{\{\tilde{q}^{\,j}_n(x) - q(x)\}^2}{q(x)^2} \qquad \text{and} \qquad e^j_{\hat{q}_n}(x) = \frac{\{\hat{q}^{\,j}_n(x) - q(x)\}^2}{q(x)^2}.$$

Then percentile values are computed from the empirical distributions of the 150 replication samples, i.e. from $e^j_{\tilde{q}_n}(x)$ and $e^j_{\hat{q}_n}(x)$.
The graphs a)–c) in Figure 2 show that the percentiles of the squared errors from the 2-stage and 5-stage predictions (solid line) lie overall below the corresponding percentiles of the squared errors from the single-stage predictions (medium dashed line). This implies that the conditional quantile predictions made by $\hat{q}_n(x)$ are more accurate than those made by $\tilde{q}_n(x)$. Now consider the box-plots d)–f) in Figure 2. It is clear from these plots that the multi-stage predictor has a much smaller variability while its bias is nearly the same as that of the single-stage estimator. This confirms the theoretical result in Section 3. Similar box-plots were also obtained for other combinations of $H$, $p$, and $x$.

Figure 2. a)–c) Percentile plots of the empirical distribution of the squared errors for model (4.1) for the single-stage predictor $\tilde{q}_n(x)$ (medium dashed line) and the multi(=two)-stage predictor $\hat{q}_n(x)$ (solid line); d)–f) Box-plots corresponding to the percentile plots a)–c), respectively; $n = 150$, 150 replications.

5. APPLICATION

Here we apply the multi-stage and single-stage conditional quantile predictors to obtain 6-step ahead out-of-sample prediction intervals for the monthly U.S. short-term interest rate, i.e. the yield on U.S. Treasury Bills with three months to maturity. The time series contains 348 monthly observations from January 1966 to December 1994. The data were obtained from the Internet at the website: www.bog.fed.us/releases/h15/data.htm. The first difference of the original series (after taking logarithms), denoted by $W_t$, will be used in our analysis with a total of 347 observations. Our motivation for using this series has grown out of recent work on predicting weekly U.S. T-bill rates; see De Gooijer and Zerom (5). Using the notation of Section 2, $X_t = W_t$ and $Z_t = W_{t+6}$ where $t = 1, \ldots, N - 6$, and $N$ is the index of the prediction base. In this example, a nominal coverage probability of 0.80 is considered, i.e. the interval $[q_n(x, 0.1), q_n(x, 0.9)]$ where $x = X_N$. Note that $q_n(x, 0.1)$ and $q_n(x, 0.9)$ are respectively the 6-step ahead 10th- and 90th-conditional quantiles.

As in Section 4, we choose $h_{6,n}$ such that $h_{6,n} = h_n$ where $h_n$ is the optimal value of the single-stage estimator. In the intermediate stages, the theory requires the bandwidths to be smaller than $h_{6,n}$. In order to have some idea of how much smaller they should be, we compute in-sample 6-step ahead predictions at various levels of undersmoothing (i.e., $h_{6,n}/5$, $h_{6,n}/10$, $h_{6,n}/15$, ..., $h_{6,n}/70$). By in-sample, it is meant that $x$ is contained in $X_t$. Among the various bandwidths considered, the choice $h_{1,n} = h_{6,n}/45$ and $h_{2,n} = h_{3,n} = h_{4,n} = h_{5,n} = h_{6,n}/35$ seems to yield multi-stage quantile estimates (see Figure 3) which are roughly the same as those of the single-stage predictor while being less noisy.

Now using the above set of bandwidths, we compute 6-step ahead out-of-sample predictions standing on the last 42 observations, i.e. $W_{300}, \ldots, W_{341}$. For example, at $W_{300}$ we predict the 10th and 90th quantiles of $W_{306}$ conditional on $x = X_{300}$. The respective average lengths of the intervals for the single-stage and multi-stage predictors are 0.148 and 0.152 which, respectively, are 24% and 25% of the range of the data. Thus both estimators perform comparably well in the sense of having not too wide intervals. Only about 19% of the actual observations lie outside the intervals. The average values of the lower and upper limits of the predictive intervals for the single-stage and multi-stage predictors are [−0.0719 (0.0125); 0.0759 (0.0175)] and [−0.0755 (0.0038); 0.0765 (0.0095)], respectively, where the numbers in parentheses are the standard deviations. This indicates that the single-stage based confidence intervals are more erratic than those of the multi-stage approach. We can also observe this from Figure 4 which displays the 80% confidence intervals.

Figure 3. In-sample 10th- and 90th-conditional quantile estimates of the single-stage (medium dashed line) and the multi-stage (solid line) predictors.

Figure 4. Out-of-sample 10th- and 90th-conditional quantile estimates of the single-stage (medium dashed line) and the multi-stage (solid line) predictors.
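Summary figures of this kind (average interval length and the share of realised values falling outside the intervals) can be computed with a small helper such as the one below. It is a hypothetical sketch that operates on three equally long lists of lower limits, upper limits and realised values; none of the paper's data are reproduced here.

```python
def interval_summary(lower, upper, actual):
    """Average interval length and share of realised values outside the intervals."""
    if not (len(lower) == len(upper) == len(actual)) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    lengths = [u - l for l, u in zip(lower, upper)]
    # Count realised values below the lower or above the upper limit.
    outside = sum(1 for l, u, a in zip(lower, upper, actual) if a < l or a > u)
    return {
        "average_length": sum(lengths) / len(lengths),
        "share_outside": outside / len(actual),
    }
```

As a purely hypothetical check of scale, 8 misses out of 42 intervals would give a share of about 19%, which is close to the 20% expected under a nominal coverage of 0.80.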

In the foregoing analysis we have used quantiles to construct the confidence intervals. But in situations where the predictive densities are asymmetric or multi-modal, the quantile based intervals tend to give wider intervals. Therefore, De Gooijer and Gannoun (11) suggested the use of more efficient predictive intervals which are based directly on the conditional distribution function (CDF). Fortunately, the multi-stage approach introduced in this paper is still useful. We just have to employ the multi-stage CDF $\hat{F}_n(\cdot \mid x)$ instead of $\tilde{F}_n(\cdot \mid x)$. Figure 5 presents the 42 out-of-sample single- and multi-stage CDFs which correspond to the quantile values in Figure 4. While the general pattern of the CDFs from both estimators is the same, the multi-stage based CDFs are noticeably smoother.

Figure 5. a) 42 out-of-sample single-stage CDFs; b) 42 out-of-sample multi-stage CDFs.

6. APPENDIX: ASSUMPTIONS AND PROOFS

The asymptotic results will be derived on a set of assumptions gathered below for ease of reference.

(A.1) The kernel $K$ is a bounded, even, and strictly positive Hölderian function satisfying $\lim_{u \to \infty} uK(u) = 0$ and $\int_{-\infty}^{\infty} u^2 K(u)\, du < \infty$.

(A.2) The marginal density $p_X(x)$ of $X$ is bounded away from 0. Its first and second derivatives exist, and are bounded and integrable.

(A.3) The conditional density function $p_{Z|X}(\cdot \mid x)$ is continuous.

(A.4) The joint density $p_{X,Y}(x, y)$ is Hölder continuous in both $x$ and $y$.

(A.5) The functions $F(z \mid x) = E(1\{Z \le z\} \mid X = x)$ and $F(z \mid y) = E(1\{Z \le z\} \mid Y = y)$ are twice differentiable (with respect to $x$ and $y$) and the second derivatives are Hölder continuous such that

$$|F^{(0,2)}(z \mid x_1) - F^{(0,2)}(z \mid x_2)| \le c_2 |x_1 - x_2|^{\beta_2},$$

and

$$|F^{(0,2)}(z \mid y_1) - F^{(0,2)}(z \mid y_2)| \le c_1 |y_1 - y_2|^{\beta_1}.$$

In addition $F(z \mid y)$ is Hölder continuous such that

$$|F(z \mid y_1) - F(z \mid y_2)| \le c_3 |y_1 - y_2|^{\beta_3},$$

where $c_1, c_2, c_3$ and $\beta_1$, $\beta_2$, and $\beta_3$ are positive constants.

Some comments on the above assumptions are in order. Assumption (A.1) is quite usual in kernel estimation. A symmetric density with compact support satisfies this assumption. Assumption (A.2) is needed to prove the convergence of the multi-stage kernel smoother estimator of the conditional quantile. Assumption (A.5) is used to ensure that the variance of $\hat{F}_n(z \mid x)$ exists and is finite.

Before we give the proof of Theorem 2, it is helpful to state the following general result. Let $\psi$ be a continuously differentiable real function with a value $p$ at a point $\theta$. Suppose that $\psi$ and $\theta$ are unknown and that there exists an estimate $\psi_n$ of $\psi$ based on $n$ observations. If, at a point $\theta_n$, $\psi_n(\theta_n) = p$, then it is natural to estimate $\theta$ by $\theta_n$. If we consider a sequence $\{\psi_n\}$ of differentiable estimates for which asymptotic results are known, it is possible to obtain for the sequence $\{\theta_n\}$ asymptotic results (convergence, rate of convergence, asymptotic distribution) by using a Taylor series expansion of $\psi(\theta)$:

$$\psi(\theta) = \psi_n(\theta_n) = p = \psi(\theta_n) + (\theta - \theta_n)\, \psi^{(1)}(\theta^*), \tag{6.1}$$

where $\theta^*$ is between $\theta$ and $\theta_n$ and the superscript (1) denotes the first derivative. Note that (6.1) contains as special cases quantile and conditional quantile estimation problems.
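To see how the expansion is used, it may help to solve it for the estimation error. Writing $\psi$ for the unknown function, $\theta$ for the point at which it takes the value $p$, and $\psi_n$, $\theta_n$ for their estimates (symbols chosen here for illustration), the following sketch shows the rearrangement and its conditional quantile special case.

```latex
\begin{aligned}
p = \psi(\theta_n) + (\theta - \theta_n)\,\psi^{(1)}(\theta^*)
\quad &\Longrightarrow \quad
\theta_n - \theta = \frac{\psi(\theta_n) - p}{\psi^{(1)}(\theta^*)}
                  = \frac{\psi(\theta_n) - \psi_n(\theta_n)}{\psi^{(1)}(\theta^*)}, \\[4pt]
\psi = F(\cdot \mid x),\ \ \psi_n = \hat{F}_n(\cdot \mid x),\ \ \theta = q(x),\ \ \theta_n = \hat{q}_n(x)
\quad &\Longrightarrow \quad
\hat{q}_n(x) - q(x) = \frac{F(\hat{q}_n(x) \mid x) - \hat{F}_n(\hat{q}_n(x) \mid x)}{p_{Z|X}(q^* \mid x)}.
\end{aligned}
```

The error of the quantile estimate is therefore the error of the distribution function estimate divided by a conditional density value, which is why the factor $1/p_{Z|X}^2(q(x) \mid x)$ appears in front of the MSE expressions of Section 3.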

For ease of notation we replace $X = x$ by $x$ in the following proofs.

Proof of Theorem 2. We make use of the property

$$F(q(x) \mid x) = p = \hat{F}_n(\hat{q}_n(x) \mid x).$$

A Taylor expansion of $\hat{F}_n(\hat{q}_n(x) \mid x)$ about $q(x)$ and various approximations (see, e.g., Lemma D on p. 97 of Serfling (12) or Lemma 4 of Cai (13)) give

$$F(q(x) \mid x) = p = \hat{F}_n(\hat{q}_n(x) \mid x) \approx \hat{F}_n(q(x) \mid x) + (\hat{q}_n(x) - q(x))\, p_{Z|X}(q^* \mid x),$$

where $q^*$ is some random point between $q(x)$ and $\hat{q}_n(x)$.

By the Lemma and a slight modification of Theorem 1 of Chen (14) (replacing $Z$ by $1\{Z \le z\}$), if $h_{1,n} = o(h_{2,n})$ the asymptotic bias of $\hat{F}_n(q(x) \mid x)$ is given by

$$F(q(x) \mid x) - E(\hat{F}_n(q(x) \mid x)) = h_{2,n}^2\, d_1(q(x), x) + o(h_{2,n}^2) + O\left(\frac{1}{nh_{2,n}}\right), \tag{6.2}$$

where $d_1^2(q(x), x) = D_1(q(x), x)/4$. Further, the asymptotic variance of $\hat{F}_n(q(x) \mid x)$ is given by

$$\begin{aligned}
\mathrm{Var}(\hat{F}_n(q(x) \mid x)) &= \frac{1}{nh_{2,n}} \frac{\mathrm{Var}(F(q(x) \mid Y) \mid x)}{p_X(x)}\, k_2 + o\left(\frac{1}{nh_{2,n}}\right) \\
&= \frac{1}{nh_{2,n}} D_3(q(x), x) + o\left(\frac{1}{nh_{2,n}}\right).
\end{aligned} \tag{6.3}$$

Note that

$$D_3(q(x), x) = k_2\, \frac{\mathrm{Var}(F(q(x) \mid Y) \mid x)}{p_X(x)} = k_2\, \frac{v_1(q(x), x)}{p_X(x)}.$$

Thus, from (6.2) and (6.3) it follows directly that the asymptotic MSE of $\hat{F}_n(q(x) \mid x)$ is given by

$$E\{\hat{F}_n(q(x) \mid x) - F(q(x) \mid x)\}^2 = \frac{1}{nh_{2,n}} D_3(q(x), x) + \frac{h_{2,n}^4 D_1(q(x), x)}{4} + o\left(h_{2,n}^4 + \frac{1}{nh_{2,n}}\right).$$



Now, by Assumptions (A.1), (A.2) and (A.3) and by uniqueness of $q(x)$, we get that $\hat{q}_n(x)$ converges to $q(x)$ in probability. Then, by continuity of $p_{Z|X}(\cdot \mid x)$ and because $q^*(x)$ is between $q(x)$ and $\hat{q}_n(x)$, we have that $p_{Z|X}(q^*(x) \mid x) = p_{Z|X}(q(x) \mid x) + o_p(1)$; see for the proof, e.g., Theorem 1 of Yakowitz (15) and Theorem 1 of Samanta and Thavaneswaran (16).

Then we deal with the ratio of random variables in a standard way, and obtain

$$E\{\hat{q}_n(x) - q(x)\}^2 = \frac{1}{p_{Z|X}^2(q(x) \mid x)} \left( \frac{h_{2,n}^4 D_1(q(x), x)}{4} + \frac{1}{nh_{2,n}} D_3(q(x), x) \right) + o\left(h_{2,n}^4 + \frac{1}{nh_{2,n}}\right). \tag{6.4}$$

The value of $h_{2,n}$ minimizing (6.4) is given by

$$h_{2,n}^* = \left\{ \frac{D_3(q(x), x)}{D_1(q(x), x)} \right\}^{1/5} n^{-1/5}.$$

The corresponding best possible MSE of $\hat{q}_n(x)$ is given by

$$\mathrm{MSE}^*\{\hat{q}_n(x)\} \simeq \frac{5 n^{-4/5}}{4 p_{Z|X}^2(q(x) \mid x)} \left\{ D_3^{4/5}(q(x), x)\, D_1^{1/5}(q(x), x) \right\}.$$

Proof of the Corollary: By Theorem 1, we have that the minimum value of the asymptotic MSE of $\tilde{q}_n(x)$ is given by

$$\mathrm{MSE}^*\{\tilde{q}_n(x)\} \simeq \frac{5 n^{-4/5}}{4 p_{Z|X}^2(q(x) \mid x)} \left\{ D_2^{4/5}(q(x), x)\, D_1^{1/5}(q(x), x) \right\}.$$

It is easy to see that $\sigma^2(q(x), x) = v_1(q(x), x) + v_2(q(x), x)$. Therefore, the ratio of the minimum asymptotic MSEs of the estimators $\tilde{q}_n(x)$ and $\hat{q}_n(x)$ is given by

$$r(q(x), x) = \left\{ 1 + \frac{v_2(q(x), x)}{v_1(q(x), x)} \right\}^{4/5} \ge 1.$$

REFERENCES

1. Härdle, W.; Lütkepohl, H.; Chen, R. A review of nonparametric time series analysis. International Statistical Review, 1997, 65, 49–72.

2. Eubank, R.L. Spline Smoothing and Nonparametric Regression, Marcel Dekker: New York, 1988.



3. Härdle, W. Applied Nonparametric Regression, Cambridge University Press: Cambridge, 1990.

4. Matzner-Løber, E.; Gannoun, A.; De Gooijer, J.G. Nonparametric forecasting: a comparison of three kernel-based methods. Communications in Statistics - Theory and Methods, 1998, 27, 1593–1617.

5. De Gooijer, J.G.; Zerom, D. Kernel-based multistep-ahead predictions of the U.S. short-term interest rate. J. of Forecasting, 2000, 19, 335–353.

6. Collomb, G. Propriétés de convergence presque complète du prédicteur à noyau. Z. Wahrscheinlichkeitstheorie verw. Gebiete, 1984, 66, 441–460.

7. Hall, P.; Wolff, R.C.L.; Yao, Q. Methods for estimating a conditional distribution function. J. Amer. Statist. Assoc., 1999, 94, 154–163.

8. Berlinet, A.; Gannoun, A.; Matzner-Løber, E. Asymptotic normality of convergent estimates of conditional quantiles. Statistics, 2001, 35, 139–169.

9. Jones, M.C.; Hall, P. Mean squared error properties of kernel estimates of regression quantiles. Statistics & Probability Letters, 1990, 10, 283–289.

10. De Gooijer, J.G.; Gannoun, A.; Zerom, D. Mean squared error properties of the kernel-based multi-stage median predictor for time series. Statistics & Probability Letters, 2001 (forthcoming).

11. De Gooijer, J.G.; Gannoun, A. Nonparametric conditional predictive regions for time series. Computational Statistics & Data Analysis, 2000, 33, 259–275.

12. Serfling, R.J. Approximation Theorems of Mathematical Statistics, Wiley: New York, 1980.

13. Cai, Z. Regression quantiles for time series. Econometric Theory, 2001 (forthcoming).

14. Chen, R. A nonparametric multi-step prediction estimator in Markovian structures. Statistica Sinica, 1996, 6, 603–615.

15. Yakowitz, S.J. Nonparametric density estimation, prediction, and regression for Markov sequences. J. Amer. Statist. Assoc., 1985, 80, 215–221.

16. Samanta, M.; Thavaneswaran, A. Non-parametric estimation of the conditional mode. Communications in Statistics - Theory and Methods, 1990, 19, 4515–4524.

Received August 2000
Revised August 2001

