On the estimation of fractionally integrated processes

By Frank S. Nielsen

A dissertation submitted to

the Faculty of Social Sciences, University of Aarhus

in partial fulfilment of the requirements of

the PhD degree in

Economics and Management


To My Family


Table of Contents

Preface

Summary

Dansk Resume (Danish Summary)

Chapter 1: A vector autoregressive model for electricity prices subject to long memory and regime switching

Chapter 2: Local polynomial Whittle estimation covering non-stationary fractional processes

Chapter 3: Local polynomial Whittle estimation of perturbed fractional processes

Chapter 4: Long-run dependencies in return volatility and trading volume


Preface

This thesis was written in the period from February 2006 to January 2009 while I was a PhD student

at the School of Economics and Management, Aarhus University and during my visit at Cornell

University, New York, USA. I am grateful to the school, to the Danish Social Sciences Research

Council (grant no. FSE275-05-0199) for generous financial support in connection with courses,

conferences, and the time abroad, and to the Center for Research in Econometric Analysis of Time Series

(CREATES), funded by the Danish National Research Foundation, for providing excellent research

facilities.

A number of people have contributed to the making of this thesis. First and foremost I thank my

advisor, Niels Haldrup, for invaluable encouragement, and for always being there with competent and

constructive comments and suggestions. I have benefited greatly from our discussions. In addition, I

am thankful to Morten Ørregaard Nielsen. Without his advice, comments, suggestions, and friendship

this PhD thesis wouldn't be the same.

From August 2007 to February 2008 I visited the Department of Economics, Cornell University, New

York, USA. I would like to thank the department for their hospitality. Additionally, thanks go to

Morten Ørregaard Nielsen for making the stay extremely pleasant, especially for the non-academic

discussions and all the hours in the gym, where he didn't beat me once on the treadmill.

At Aarhus University I would like to thank faculty and fellow students. Special thanks go to the

table football (foosball) team, i.e. Allan, Christian, Claus, Jonas, Niels, Rune, and Torben. I would

also like to send a special thanks to my office mate and good friend Niels Skipper, with whom I have

shared lots of academic and not so academic discussions and, most importantly, many visits to the

Friday Bar.

Finally, a very special thanks to my closest family for putting up with me through the years and

their unconditional love.

Frank S. Nielsen, Aarhus, January 2009

Updated preface

The pre-defence took place on March 9, 2009 in Aarhus. I am grateful to the members of the assessment

committee, Allan Würtz, Tom Engsted (sitting in for Allan Würtz at the pre-defence), Javier Hualde,

and Jörg Breitung, for their careful reading of the thesis and their many useful and constructive

comments and suggestions. Some of the suggestions have been incorporated in the present version of

the thesis while others remain for future work on the chapters.

Frank S. Nielsen, Aarhus, March 2009



Summary

This thesis is concerned with time series modeling where the unifying theme is the treatment of long

memory and fractional integration.1 The aim of the thesis is to further develop methods of inference

for long memory models. In particular, the focus is on estimation as well as applications of the derived

theory on economic data.

Empirical evidence of fractional integration, and more generally long memory, has been around

for a long time in various fields, such as astronomy, chemistry, agriculture, and geophysics, but it

was not until the seminal works of Granger (1980), Granger & Joyeux (1980), and Hosking (1981)

that long memory and fractional integration were introduced in economics. The past decades have

witnessed an increasing interest in fractionally integrated models as a convenient and parsimonious way

to capture the long memory properties of many time series. Long memory and fractionally integrated

processes are regarded as halfway representations between models where the correlations decay at an

exponential rate, i.e. short-memory models (e.g. autoregressive moving average models) and unit

root models which exhibit no mean reversion. The autocorrelation of a long memory or fractionally

integrated process decays at a slower hyperbolic rate. This permits parsimonious representation of

series which exhibit non-zero autocorrelation at high lags. Empirically, there is now a broad range of

applications showing the relevance of long memory and fractional integration, for instance in finance

(e.g. Baillie, Bollerslev & Mikkelsen (1996), Breidt, Crato & de Lima (1998), Andersen, Bollerslev,

Diebold & Ebens (2001), Andersen, Bollerslev, Diebold & Labys (2001, 2003), and Bandi & Perron

(2004)), and in macroeconomics (e.g. Diebold & Rudebusch (1989, 1991), Sowell (1992), and Gil-Alana

& Robinson (1997)). See Robinson (1994), Baillie (1996), and Henry & Zaffaroni (2003) for

three excellent surveys.
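
The contrast between exponential and hyperbolic decay described above can be made concrete with a small numerical sketch. It is not part of the thesis: it assumes a stationary ARFIMA(0, d, 0) process, whose autocorrelations satisfy the recursion implied by the closed form in Hosking (1981), and compares them with an AR(1) process. The function names and the values d = 0.3 and 0.8 are illustrative assumptions.

```python
import numpy as np


def arfima_acf(d: float, max_lag: int) -> np.ndarray:
    """Autocorrelations rho(0..max_lag) of a stationary ARFIMA(0, d, 0) process.

    Uses rho(k) = rho(k-1) * (k - 1 + d) / (k - d) with rho(0) = 1,
    which is valid for -1/2 < d < 1/2.
    """
    rho = np.ones(max_lag + 1)
    for k in range(1, max_lag + 1):
        rho[k] = rho[k - 1] * (k - 1 + d) / (k - d)
    return rho


def ar1_acf(phi: float, max_lag: int) -> np.ndarray:
    """Autocorrelations rho(k) = phi**k of a stationary AR(1) process."""
    return phi ** np.arange(max_lag + 1)


if __name__ == "__main__":
    long_memory = arfima_acf(d=0.3, max_lag=1000)   # hyperbolic decay, ~ k**(2d - 1)
    short_memory = ar1_acf(phi=0.8, max_lag=1000)   # exponential decay
    for k in (1, 10, 100, 1000):
        print(f"lag {k:4d}: ARFIMA {long_memory[k]:.4f}   AR(1) {short_memory[k]:.3e}")
```

At high lags the fractionally integrated autocorrelations remain visibly non-zero while the AR(1) autocorrelations are numerically negligible, which is the sense in which a single parameter d gives a parsimonious description of persistent series.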

The thesis consists of four self-contained chapters, two single-authored and two written with

co-authors. In chapter 1, written together with Niels Haldrup and Morten Ørregaard Nielsen, we consider a

regime-dependent vector autoregressive model, where we allow for state-dependent fractional integration as

well as the possibility of state-dependent fractional cointegration. The proposed model is relevant

in describing the price dynamics of electricity prices when the transmission of power is subject to

occasional congestion periods. For a system of bilateral prices non-congestion means that electricity

prices are identical whereas congestion makes prices depart. Hence, the joint price dynamics implies

switching between essentially a univariate price process under non-congestion and a bivariate price

process under congestion. At the same time it is an empirical regularity that electricity prices tend

to show a high degree of fractional integration, and thus that prices may be fractionally cointegrated.

We apply the model to an analysis of the area prices in Nord Pool. The analysis extends the approach

used in Haldrup & Nielsen (2006) as we model multiple price series jointly. Their analysis is limited

to individual price series and the relative price series are analyzed separately as univariate models.

When the focus of analysis is the potential (fractional) cointegration amongst multiple series, a system

approach is more natural, but also more complex in the present context given the particular features

the model should allow for. Given our approach, we find that the behavior of electricity prices in geographical

price regions is different across states. The analysis shows that it is important to condition on

congestion/non-congestion as non-switching models can generate misleading conclusions with regard

to the fractional integration orders and potential fractional cointegration. Three leading types of

misclassification of the model dynamics may arise. First, non-switching models may indicate that

the price series are fractionally cointegrated although, when conditioning on states, this is only the

case in the non-congestion state (which is cointegrated by definition). Secondly, the non-switching

model may indicate that there is no fractional cointegration when in fact there is cointegration in the

non-congestion state. Finally, there is the possibility of fractional cointegration in both regimes, but not

in the non-switching model. Conditioning on states is also important when looking at the adjustment

coefficients, as the non-switching models can lead to wrong conclusions about the convergence of

geographical price regions towards equilibrium.

1 In the following we use the terms “long memory process” and “fractionally integrated process” synonymously,

although strictly speaking, a fractional process is just a particular form of a long memory process.

Chapter 2 is concerned with semiparametric estimation of the fractional integration parameter in

the empirically relevant scenario of short-run contamination of the process of interest and potential

non-stationarity. We propose an estimator which extends the local polynomial Whittle estimator

of Andrews & Sun (2004) to fractionally integrated processes covering both stationary and non-

stationary regions. We utilize the extended discrete Fourier transform and periodogram to extend

the local polynomial Whittle estimator to the non-stationary region. By approximating the short-run

component of the spectrum, say $\varphi(\lambda)$, by a polynomial $P_r(d, \lambda)$ (even and finite) instead of a constant

in a shrinking neighborhood of frequency zero, we alleviate some of the bias that we usually see in

the classical local Whittle setting. The bias reduction comes at a cost as the variance is inflated by

a multiplicative constant. Additionally, given that the generating process is linear, the same central

limit theorem argument as in the stationary case $|d| < 1/2$ derived by Robinson (1995) holds, although

not for $d_0 = 1/2, 3/2, \ldots$. We establish consistency and asymptotic normality for $d_0 \in (-1/2, \infty)$.

Furthermore, if $\varphi(\lambda)$ is infinitely smooth near frequency zero, the rate of convergence can become

arbitrarily close to the parametric rate. The simulations reveal that our proposed estimator is superior

when considering possible short-run contamination and non-stationary values of $d$. Finally, an analysis

of credit spreads demonstrates the usefulness of the estimator. We cannot reject that the log of yields

of Aaa, Baa and Treasury bonds contain a unit root. However, the results are more mixed when

looking at spreads, depending on the semiparametric estimator and bandwidth choice. Therefore, as

in Ratta & Urga (2005) we can, for our given data, reject the reduced-form modeling of Das & Tufano

(1996), Jarrow, Lando & Turnbull (1997) and Duffie & Singleton (1999), which explicitly implies that the

data generating process of the risk-free process, and hence also credit spreads, follows a short-memory

process.
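
As a point of reference for the estimators discussed above, the sketch below implements the classical local Whittle estimator of Robinson (1995), i.e. the benchmark in which the short-run component of the spectrum is treated as a constant near frequency zero. It is not the local polynomial estimator proposed in chapter 2 and it does not use the extended periodogram. The simulation helper, the grid search over d, and the bandwidth m = n^0.65 are assumptions chosen purely for illustration.

```python
import numpy as np


def simulate_fi(n: int, d: float, rng: np.random.Generator) -> np.ndarray:
    """Simulate x_t = sum_{k=0}^{t} psi_k * eps_{t-k}, the truncated (1 - L)^(-d) filter.

    The weights follow psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k.
    """
    k = np.arange(1, n)
    psi = np.concatenate(([1.0], np.cumprod((k - 1 + d) / k)))
    eps = rng.standard_normal(n)
    return np.convolve(eps, psi)[:n]


def periodogram(x: np.ndarray, m: int):
    """Fourier frequencies lambda_j = 2*pi*j/n and periodogram I(lambda_j), j = 1..m."""
    n = len(x)
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n
    dft = np.fft.fft(x)[1 : m + 1]
    return lam, np.abs(dft) ** 2 / (2.0 * np.pi * n)


def local_whittle(x: np.ndarray, m: int) -> float:
    """Minimise R(d) = log(mean(lam^(2d) * I)) - 2d * mean(log lam) over a grid of d."""
    lam, pgram = periodogram(x, m)
    grid = np.arange(-0.49, 1.0, 0.005)  # assumed search range for d
    scaled = lam[None, :] ** (2.0 * grid[:, None]) * pgram[None, :]
    objective = np.log(scaled.mean(axis=1)) - 2.0 * grid * np.log(lam).mean()
    return float(grid[np.argmin(objective)])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d_true = 4096, 0.3
    x = simulate_fi(n, d_true, rng)
    m = int(n ** 0.65)  # number of low frequencies used
    print(f"true d = {d_true}, local Whittle estimate = {local_whittle(x, m):.3f}")
```

Only the m lowest Fourier frequencies enter the objective, which is what makes the estimator semiparametric: the short-run dynamics are left unspecified, at the price of a bias that grows with the bandwidth when those dynamics are not flat near zero.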

In chapter 3, co-authored with Per Frederiksen and Morten Ørregaard Nielsen, we propose a semiparametric

local polynomial Whittle with noise (LPWN) estimator of the memory parameter in long

memory time series perturbed by a noise term which may be serially correlated. The proposed estimator

allows both the spectrum of the perturbation and the spectrum of the short-memory component of

the signal, i.e. $\phi_w(\lambda)$ and $\phi_y(\lambda)$, to be approximated by polynomials $h_w(\theta_w, \lambda)$ and $h_y(\theta_y, \lambda)$ of (finite

and even) orders $2R_w$ and $2R_y$ near the zero frequency, instead of constants, thereby obtaining a bias

reduction depending on the smoothness of $\phi_w(\lambda)$ and $\phi_y(\lambda)$ near the origin. The approach taken here

in modeling the short-run dynamics by a polynomial was introduced by Andrews & Sun (2004) for

non-perturbed processes, but is novel in the context of perturbed fractional processes. Our results

show that introducing polynomials $h_y(\theta_y, \lambda)$ and $h_w(\theta_w, \lambda)$ inflates the asymptotic variance of the long

memory estimator, $\hat{d}$, by a multiplicative constant which depends on the true long memory parameter,

$d$. However, the inflation decreases when $d$ increases, and we obtain a reduction in the order of

magnitude of the bias if $\phi(\lambda)$ is sufficiently smooth near frequency zero. We show that the estimator

is consistent for $d \in (0, 1)$, asymptotically normal for $d \in (0, 3/4)$, and if $\phi(\lambda)$ is infinitely smooth

near frequency zero, the rate of convergence can become arbitrarily close to the parametric rate. The

Monte Carlo study shows the usefulness of the proposed LPWN estimator. Compared to standard

estimators, such as the local Whittle with noise estimator of Hurvich & Ray (2003), the LPWN estimator

can produce considerable bias reductions in practice, especially in cases with short-run dynamics in

both the signal and noise components. We also include an empirical application to the 30 DJIA stocks

where the LPWN estimator indicates stronger persistence in volatility than the standard estimators,

and for most of the stocks produces estimates of d in the nonstationary region.

In chapter 4, we are interested in characterizing the long-run joint volatility-volume relationship in

the context of the Mixture of Distributions Hypothesis (MDH), put forward by Clark (1973), Epps &

Epps (1976), and Tauchen & Pitts (1983). MDH asserts that returns and trading volume are jointly

dependent on the same underlying latent information arrival process. By using the log-periodogram

estimator of Geweke & Porter-Hudak (1983), Bollerslev & Jubinski (1999) show that volatility and

trading volume of the S&P100 common stocks have a similar degree of fractional integration. This

evidence of pairwise correspondence between estimates of long memory across the volatility-volume

series supports a long-run view of the MDH; i.e. both processes are driven by a slowly mean-reverting

fractionally integrated latent information process. Instead of using the log-periodogram estimator, which

is downward biased in the context of perturbed fractional processes, we use a semiparametric estimator

that is robust to time series perturbed by a noise term which may be serially correlated. More so

than other studies, our results show evidence of volatility and trading volume being more persistent in

terms of memory. We see this when introducing semiparametric estimators that are robust to potential

short-run contamination in both the signal and noise. Additionally, volatility displays a higher degree

of long memory than trading volume for the S&P100 common stocks, and shows evidence of being

governed by a perturbed fractional process, whereas this is not generally the case for trading volume.

Furthermore, we find weak evidence of there being a cointegrating relation between volatility and

trading volume. This is in line with other studies showing that although volatility and volume might

share a common fractional integration order, they do not move together over time.
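
The downward bias of the log-periodogram estimator for perturbed processes, mentioned above, can be illustrated with a short simulation. The sketch is not the chapter's estimator or data: it assumes a fractionally integrated signal with d = 0.4 observed with additive white noise, and applies the log-periodogram regression of Geweke & Porter-Hudak (1983) to both the clean and the perturbed series. The helper names, the noise variance of 10, and the bandwidth m = n^0.7 are illustrative assumptions.

```python
import numpy as np


def simulate_fi(n: int, d: float, rng: np.random.Generator) -> np.ndarray:
    """Simulate x_t = sum_{k=0}^{t} psi_k * eps_{t-k}, the truncated (1 - L)^(-d) filter."""
    k = np.arange(1, n)
    psi = np.concatenate(([1.0], np.cumprod((k - 1 + d) / k)))
    eps = rng.standard_normal(n)
    return np.convolve(eps, psi)[:n]


def gph(x: np.ndarray, m: int) -> float:
    """Log-periodogram regression estimate of d using the first m Fourier frequencies.

    Regresses log I(lambda_j) on -log(4 sin^2(lambda_j / 2)); the OLS slope estimates d.
    """
    n = len(x)
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n
    pgram = np.abs(np.fft.fft(x)[1 : m + 1]) ** 2 / (2.0 * np.pi * n)
    regressor = -np.log(4.0 * np.sin(lam / 2.0) ** 2)
    regressor_c = regressor - regressor.mean()
    log_pgram = np.log(pgram)
    return float(regressor_c @ (log_pgram - log_pgram.mean()) / (regressor_c @ regressor_c))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, d_true = 4096, 0.4
    signal = simulate_fi(n, d_true, rng)
    noise = np.sqrt(10.0) * rng.standard_normal(n)  # assumed perturbation
    m = int(n ** 0.7)
    print(f"true d            = {d_true}")
    print(f"GPH, signal only  = {gph(signal, m):.3f}")
    print(f"GPH, signal+noise = {gph(signal + noise, m):.3f}")
```

The noise adds a flat component to the spectrum that dominates away from frequency zero, so the fitted slope flattens and the estimate of d is pulled towards zero; estimators that model the noise term explicitly are designed to remove this effect.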


Dansk resume (Danish summary)

This thesis is concerned with time series modeling, where the overall theme is the treatment

of long memory and fractional integration. The aim of the thesis is to further develop methods

of inference for long memory models. In particular, the focus is on estimation and on the implementation of

theory on economic data.

Empirical evidence of long memory processes has existed for many years in various fields,

such as astronomy, chemistry, agriculture, and geophysics, but it was not until the seminal works of Granger

(1980), Granger & Joyeux (1980), and Hosking (1981) that the concepts of long memory and fractional

integration were introduced in the economic literature. Recent decades have seen an increasing

interest in fractionally integrated time series as a convenient and parsimonious way to capture the long

memory properties that many time series contain. Long memory and fractionally integrated

processes are regarded as halfway representations between models where the correlation between

observations over time decays at an exponential rate, i.e. short-memory models (e.g.

autoregressive moving average models), and the unit root case, which exhibits no reversion

towards its mean. A long memory or fractionally integrated process, by contrast, decays at

a hyperbolic rate. This permits a parsimonious and flexible representation of time series that

exhibit non-zero autocorrelation between observations far apart in time. Recent empirical

research shows the relevance of long memory and fractional integration. Examples include

finance (e.g. Baillie et al. (1996), Breidt et al. (1998), Andersen, Bollerslev, Diebold

& Ebens (2001), Andersen, Bollerslev, Diebold & Labys (2001, 2003), and Bandi & Perron (2004)), and

macroeconomics (e.g. Diebold & Rudebusch (1989, 1991), Sowell (1992), and Gil-Alana & Robinson

(1997)). Three excellent surveys are Robinson (1994), Baillie (1996), and Henry & Zaffaroni

(2003).

The thesis contains four independent articles, two single-authored and two written with

co-authors. In chapter 1, written together with Niels Haldrup and Morten Ørregaard Nielsen, we consider a

regime-dependent vector autoregressive model, where we allow for regime-dependent fractional integration

as well as the possibility of regime-dependent fractional cointegration. The proposed model is

relevant in describing the price dynamics of electricity prices where the transmission of power is

subject to occasional capacity constraints. For a system of bilateral prices, the absence of

capacity constraints means that electricity prices are identical, whereas capacity constraints cause prices

to diverge. Hence, the joint price dynamics implies switching between a univariate

price process when there is no capacity constraint and a bivariate price process when there is a

capacity constraint. At the same time it is an empirical fact that electricity prices show a high degree of fractional integration,

which opens up the possibility that prices are fractionally cointegrated. We apply the proposed

model to examine electricity prices across the geographical regions of the Nord Pool cooperation. The analysis

extends the approach of Haldrup & Nielsen (2006), as we model several time series at the same time. Their

analysis is limited in the sense that the individual price series and the relative price series are analyzed

separately as univariate models. When the focus is on potential (fractional) cointegration between multiple

price series, a system approach is more natural, but also more complex given the features

that a model must be able to handle. In general, we find that the price dynamics of the

geographical regions differ across regimes. The analysis shows that it is important to condition

on whether or not there is a capacity constraint. Three leading types of misclassification of the price dynamics

may arise. First, models that do not allow for switching between regimes may indicate that

the price series are fractionally cointegrated, whereas conditioning on regime switches shows that

this is only the case in the regime without capacity constraints (which is cointegrated by definition, as

the two price series are identical). Second, a model without regime switching may indicate that there is no

fractional cointegration, whereas there is in the regime without capacity constraints.

Finally, there is the case with fractional cointegration in both regimes but not in the model without

regime switching. It is also important to condition on whether or not there is a capacity constraint when

looking at the adjustment coefficients in a vector autoregressive model, as one can misinterpret the

convergence of the geographical price areas towards equilibrium by focusing exclusively on the model

without regime switching.

Chapter 2 deals with semiparametric estimation of fractional integration in the empirically relevant

case of potential short-run dynamics and non-stationarity. We propose an estimator which

extends the local polynomial Whittle estimator of Andrews & Sun (2004) to fractional processes that

may be non-stationary. We use the extended Fourier transform and

periodogram to modify the local polynomial Whittle estimator. By approximating the short-run

component of the spectrum, $\varphi(\lambda)$, by a polynomial $P_r(d, \lambda)$ (even and finite) instead of a

constant in the neighborhood of the zero frequency, we reduce the bias that is present in the

classical local Whittle estimator. The bias reduction comes at the cost of inflating the variance

by a multiplicative constant. Furthermore, given that the generating process is linear, the central

limit theorem argument derived by Robinson (1995) holds as in the stationary case $|d| < 1/2$,

although not for $d_0 = 1/2, 3/2, \ldots$. We establish consistency and asymptotic normality for $d_0 \in (-1/2, \infty)$.

Moreover, if $\varphi(\lambda)$ is infinitely smooth near the zero frequency, the rate of convergence

comes arbitrarily close to the parametric rate. Simulations show how well our estimator performs

in the presence of short-run dynamics and non-stationarity. Finally, we apply the

proposed estimator in an analysis of credit spreads. In general, we cannot rule out that the logarithm

of the yields on Aaa, Baa, and Treasury bonds contains a unit root. However, the results are rather mixed

if we look at the spreads instead. As in Ratta & Urga (2005), this leads us, given our

specific setup, to reject the reduced-form modeling (Das & Tufano (1996), Jarrow

et al. (1997), and Duffie & Singleton (1999)), which explicitly implies that the data generating process of

the risk-free process, and thereby the credit spreads, follows a short-memory process.

In chapter 3, together with Per Frederiksen and Morten Ørregaard Nielsen, we propose a semiparametric

local polynomial Whittle with noise (LPWN) estimator of the memory parameter

in long memory time series with added noise that may be serially correlated. The

proposed estimator allows both the spectrum of the added noise term and the spectrum of the signal,

i.e. $\phi_w(\lambda)$ and $\phi_y(\lambda)$, to be approximated by $h_w(\theta_w, \lambda)$ and $h_y(\theta_y, \lambda)$ of (even and finite) orders $2R_w$ and $2R_y$

near the zero frequency, instead of constants, thereby obtaining a bias reduction that depends

on the smoothness of $\phi_w(\lambda)$ and $\phi_y(\lambda)$. This approach to modeling the short-run dynamics is the same as that

used by Andrews & Sun (2004) for fractional processes without added noise, but it is new in the present

context. Our results show that introducing polynomials inflates the asymptotic

variance of the long memory estimator, $\hat{d}$, by a multiplicative constant which depends on the

true long memory parameter, $d$. However, this inflation falls with increasing $d$, and we obtain

a reduction in the order of the bias term if $\phi(\lambda)$ is sufficiently smooth near the zero frequency.

We show that the estimator is consistent for $d \in (0, 1)$ and asymptotically normal for $d \in (0, 3/4)$, and if

$\phi(\lambda)$ is infinitely smooth near the zero frequency, the rate of convergence comes arbitrarily

close to the parametric rate. The Monte Carlo study shows the usefulness of the proposed LPWN

estimator. Compared to standard estimators, such as the local Whittle with noise estimator

of Hurvich & Ray (2003), the LPWN estimator is able to achieve considerable bias reductions,

especially in cases with short-run dynamics in both the signal and the added noise term. We also

include an empirical study of the DJIA 30 stocks, where the LPWN estimator indicates stronger

persistence in volatility than standard estimators do. For most of the stocks, d is estimated in the

non-stationary region.

In chapter 4, we are interested in characterizing the long-run relationship between volatility and volume in

the context of the Mixture of Distributions Hypothesis (MDH), put forward by Clark (1973), Epps & Epps (1976),

and Tauchen & Pitts (1983). The MDH asserts that returns and trading volume are governed by the same latent

information arrival process. Using the log-periodogram estimator of Geweke & Porter-Hudak

(1983), Bollerslev & Jubinski (1999) showed that volatility and trading volume for the S&P100 have a common

degree of fractional integration. This evidence of pairwise correspondence between long memory estimates

across the volatility-volume series supports a long-run interpretation of the MDH, i.e. that both

processes are driven by a latent information process which exhibits long memory. Instead of using

the log-periodogram estimator, which is known to be negatively biased for fractionally integrated processes with added noise,

we use a semiparametric estimator that is robust to time series with added noise which

may be serially correlated. Using these semiparametric estimators, there is

evidence that volatility and volume are more persistent than shown elsewhere in the literature. Furthermore,

volatility exhibits a higher degree of long memory than volume for the S&P100 stocks. In addition,

volatility shows signs of being a fractionally integrated process with added noise, whereas this

does not appear to be the case for trading volume. Moreover, we find weak signs of a

cointegrating relation between volatility and trading volume. This is in line with other studies, which

show that although volatility and trading volume exhibit the same degree of fractional integration, they do not

move together over time.

References

Andersen, T. G., Bollerslev, T., Diebold, F. X. & Ebens, H. (2001), ‘The distribution of realized stock return volatility’, Journal of Financial Economics 61, 43–76.

Andersen, T. G., Bollerslev, T., Diebold, F. X. & Labys, P. (2001), ‘The distribution of realized exchange rate volatility’, Journal of the American Statistical Association 96, 42–55.

Andersen, T. G., Bollerslev, T., Diebold, F. X. & Labys, P. (2003), ‘Modelling and forecasting realized volatility’, Econometrica 71, 579–625.

Andrews, D. W. K. & Sun, Y. (2004), ‘Adaptive local polynomial Whittle estimation of long-range dependence’, Econometrica 72, 569–614.

Baillie, R. T. (1996), ‘Long memory processes and fractional integration in econometrics’, Journal of Econometrics 73, 5–59.

Baillie, R. T., Bollerslev, T. & Mikkelsen, H. O. (1996), ‘Fractionally integrated generalized autoregressive conditional heteroscedasticity’, Journal of Econometrics 74, 3–30.

Bandi, F. M. & Perron, B. (2004), ‘Long memory and the relation between implied and realized volatility’, Manuscript, Université de Montréal.

Bollerslev, T. & Jubinski, D. (1999), ‘Equity trading volume and volatility: Latent information arrivals and common long-run dependencies’, Journal of Business and Economic Statistics 17, 9–21.

Breidt, F. J., Crato, N. & de Lima, P. (1998), ‘The detection and estimation of long memory in stochastic volatility’, Journal of Econometrics 83, 325–348.

Clark, P. K. (1973), ‘A subordinated stochastic process model with finite variance for speculative prices’, Econometrica 41, 135–155.

Das, S. & Tufano, P. (1996), ‘Pricing credit-sensitive debt when interest rates, credit ratings and credit spreads are stochastic’, Journal of Financial Engineering 5, 161–198.

Diebold, F. X. & Rudebusch, G. D. (1989), ‘Long memory and persistence in aggregate output’, Journal of Monetary Economics 24, 189–209.

Diebold, F. X. & Rudebusch, G. D. (1991), ‘Is consumption too smooth’, Review of Economics and Statistics 73, 1–9.

Duffie, D. & Singleton, K. J. (1999), ‘Modeling term structures of defaultable bonds’, Review of Financial Studies 12, 687–720.

Epps, T. & Epps, M. (1976), ‘The stochastic dependence in security price changes and transaction volumes: implications for the mixture-of-distribution hypothesis’, Econometrica 44, 305–321.

Geweke, J. & Porter-Hudak, S. (1983), ‘The estimation and application of long-memory time series models’, Journal of Time Series Analysis 4, 221–238.

Gil-Alana, L. A. & Robinson, P. M. (1997), ‘Testing of unit root and other non-stationary hypotheses in macroeconomic time series’, Journal of Econometrics 80, 241–268.

Granger, C. W. J. (1980), ‘Long memory relationships and the aggregation of dynamic models’, Journal of Econometrics 14, 227–238.

Granger, C. W. J. & Joyeux, R. (1980), ‘An introduction to long-memory time series models and fractional differencing’, Journal of Time Series Analysis 1, 15–29.

Haldrup, N. & Nielsen, M. Ø. (2006), ‘A regime switching long memory model for electricity prices’, Journal of Econometrics 135, 349–376.

Henry, M. & Zaffaroni, P. (2003), The long range paradigm for macroeconomics and finance, in P. Doukhan, G. Oppenheim & M. S. Taqqu, eds, ‘Theory and Applications of Long-Range Dependence’, Birkhäuser, Boston, pp. 417–438.

Hosking, J. R. M. (1981), ‘Fractional differencing’, Biometrika 68, 165–176.

Hurvich, C. M. & Ray, B. K. (2003), ‘The local Whittle estimator of long-memory stochastic volatility’, Journal of Financial Econometrics 1, 445–470.

Jarrow, R. A., Lando, D. & Turnbull, S. M. (1997), ‘A Markov model for the term structure of credit risk spreads’, Review of Financial Studies 10, 481–523.

Ratta, L. D. & Urga, G. (2005), Modelling credit spread: A fractional integration approach, Working Paper CEA-07-2005, Centre for Econometric Analysis, Cass Business School.

Robinson, P. M. (1994), Time series with strong dependence, in C. Sims, ed., ‘Advances in Econometrics’, Cambridge University Press, Cambridge, pp. 47–95.

Robinson, P. M. (1995), ‘Gaussian semiparametric estimation of long range dependence’, The Annals of Statistics 23, 1630–1661.

Sowell, F. B. (1992), ‘Modeling long-run behavior with the fractional ARIMA model’, Journal of Monetary Economics 29, 277–302.

Tauchen, G. E. & Pitts, M. (1983), ‘The price variability-volume relationship on speculative markets’, Econometrica 51, 485–505.

Chapter 1



A vector autoregressive model for electricity prices subject to long memory and regime switching*

Niels Haldrup†

Aarhus University and CREATES

Frank S. Nielsen‡

Aarhus University and CREATES

Morten Ørregaard Nielsen§

Queen's University and CREATES

March 20, 2009

Abstract

A regime dependent VAR model is suggested that allows long memory (fractional integration) in

each of the observed regime states as well as the possibility of fractional cointegration. The model

is relevant in describing the price dynamics of electricity prices where the transmission of power

is subject to occasional congestion periods. For a system of bilateral prices non-congestion means

that electricity prices are identical whereas congestion makes prices depart. Hence, the joint price

dynamics implies switching between essentially a univariate price process under non-congestion

and a bivariate price process under congestion. At the same time it is an empirical regularity that

electricity prices tend to show a high degree of fractional integration, and thus that prices may be

fractionally cointegrated. An empirical analysis using Nord Pool data shows that even though the

prices strongly co-move under non-congestion, the prices are not, in general, fractionally cointegrated

in the congestion state.

Keywords: Cointegration, electricity prices, fractional integration, long memory, Markov switching.

JEL Classification: C32.

*We are grateful to Javier Hualde for valuable suggestions and comments. Previous versions of this paper have been presented at the DGPE workshop in Sandbjerg, October 2006, the CREATES long memory symposium in Aarhus, July 2007, the ESRC econometrics study group meeting in Bristol, July 2007, the NBER-NSF time series conference in Iowa, September 2007, and seminars at the University of Aarhus. We are grateful to the participants for comments and suggestions. This work was partly done while F. S. Nielsen was visiting Cornell University; their hospitality is gratefully acknowledged. We are also grateful for financial support from the Danish Social Sciences Research Council (grant no. FSE 275-05-220) and from CREATES, funded by the Danish National Research Foundation.

†CREATES, School of Economics and Management, Aarhus University, Building 1322, DK-8000 Aarhus C, Denmark; email: [email protected].

‡CREATES, School of Economics and Management, Aarhus University, Building 1322, DK-8000 Aarhus C, Denmark; email: [email protected].

§Department of Economics, Dunning Hall Room 307, 94 University Avenue, Queen's University, Kingston, Ontario K7L 3N6, Canada; email: [email protected]


1 Introduction

Over the past decade or so electricity markets have been strongly liberalized throughout the world.

In particular, the Nordic power market consisting of Norway, Sweden, Finland, and Denmark has

developed remarkably towards liberalization and the establishment of competitive market conditions,

and today this market serves as a model for the restructuring of other power markets. The Nordic

power market is characterized by a grid of physical exchanges of power across geographical regions

where the actual exchange is constrained by the flow capacity. Naturally, this has implications for

the way prices are formed: When there are no bilateral capacity restrictions then there is a free flow

of power, and prices will be identical. On the other hand, when there is congestion prices tend to

depart to meet the supply and demand conditions subject to restricted access to power from other

regions. In order to model electricity prices it is thus natural to consider regime dependent price

processes reflecting the presence or absence of flow congestion. This particular feature of the market

has been addressed in recent work by Haldrup & Nielsen (2006a, 2006b). Another important property

of electricity prices modeled in these works is the presence of long memory. Statistical tests strongly

reject price series to be I(0) and I(1), whereas I(d) processes with d being fractional (see Granger

(1980), Granger & Joyeux (1980) and Hosking (1981)) provide a nice characterization of the data.

The combination of fractional integration and regime switching gives rise to some challenges. Ding

& Granger (1996), Diebold & Inoue (2001), and Granger & Hyung (2004) argue that under certain

conditions time series variables can spuriously have long memory when measured in terms of their

fractional order of integration, when in fact the series exhibit non-linear features such as regime

switching. In the model framework of Haldrup & Nielsen (2006a, 2006b) separate long memory price

dynamics is allowed in adjacent power regions depending upon whether the power exchange is subject

to congestion or non-congestion. The model is of the Markov switching type originally defined by

Hamilton (1989). However, because the defining property of e.g. a non-congestion state is that prices

are identical, the state variable is observable as opposed to being a latent variable. An important

feature of the model is that the price processes in the different regimes can have different degrees of

long memory, which gives rise to a number of interesting possibilities. For instance, consider the state

with non-congestion and assume that the associated bivariate prices are fractionally integrated of a

given order. It follows that prices are fractionally cointegrated in this case, i.e. extending the notion of

Granger (1981, 1986) and Engle & Granger (1987), in the sense that individual prices are fractionally

integrated but price differences are identically zero. Thus, an extreme form of cointegration occurs

in this situation because the prices are identical and hence are governed by exactly the same price

shocks. The price behavior in the congestion state can (and typically will) be very different. That is,

the bivariate prices can be fractionally cointegrated in a more conventional way or the prices can appear

not to cointegrate. Hence, the model can potentially exhibit state dependent fractional cointegration.

By not appropriately conditioning on the congestion state, i.e. when having a model with no regime

switching, the full sample estimates are likely to be a convex combination of the behavior in the

individual states and hence misleading inference is likely to result. In fact, this is one of the major

empirical findings in Haldrup & Nielsen (2006b).

The modeling approach used in Haldrup & Nielsen (2006b) is limited in the sense that the individual


price series and the relative price series are analyzed separately as univariate models. When the focus

of analysis is the potential (fractional) cointegration amongst multiple series a system approach is

more natural, but clearly also more complex in the present context given the particular features the

model should allow. In principle, the full set of price series should be modeled jointly, and, depending

upon the market conditions, should shrink to a limited number of price series reflecting periods with

non-congestion at some grid points.

We distinguish between price areas and geographical regions. Each geographical region corresponds

to a physical exchange (e.g., West Denmark, South Norway, etc.) and is therefore constant over time.

On the other hand, a price area is defined simply as an area with the same price and may therefore

clearly change over time. Thus, West Denmark and South Norway always constitute two geographical

regions, but in the case of non-congestion the same price prevails in both geographical regions and

they hence constitute just one price area.

In this paper we model multiple price series jointly in a vector autoregression (VAR), which allows

for fractionally integrated time series that potentially cointegrate in the congestion state. In the non-

congestion state, prices are identical by definition and hence a univariate model for the price process is

applied in this particular regime. Thus, our VAR model for fractionally cointegrated processes allows

for the possibility of regime switching, and in particular differs from other specifications offered in the

literature in the sense that our VAR model collapses to a pseudo-univariate model when a specific

state arises.

There are different reasons why the identification of separate price dynamics is important. The

operation of electricity markets is similar to the operation of financial markets with electricity power

derivatives being priced and traded in highly competitive markets and hence appropriate modeling

of both means and variances is crucial. Furthermore, the price dynamics is of interest with respect

to competition analysis of electricity markets where market delineation is a central issue, see e.g.

Sherman (1989) and Motta (2004). Even though most power markets are highly liberalized there

is still scope for regulating authorities to closely follow the market behavior, see also Fabra & Toro

(2002). Under non-congestion there is obviously a single price existing in the market and the relevant

geographical market consists of the regions with identical prices. However, when there is congestion

it is of interest to follow the price dynamics closely because suppliers can have a dominating position.

The geographical market delineation thus becomes less straightforward in this case. If the price

dynamics appears to be very different there is scope for further examination of the market conditions

by regulatory authorities.

In our empirical analysis we find that generally the behavior of electricity prices in geographical

price regions differs across states. The analysis shows that it is important to condition on

congestion/non-congestion as non-switching models can generate misleading conclusions with regard

to the fractional integration orders and potential fractional cointegration. Three leading types of

misclassification of the model dynamics may arise. First, non-switching models may indicate that the

price series are fractionally cointegrated, whereas when conditioning on states this is only the case

in the non-congestion state (which is cointegrated by definition). Secondly, the non-switching model

could indicate that there is no fractional cointegration when in fact there is cointegration in the non-

congestion state. Finally there is the possibility of fractional cointegration in both regimes, but not


in the non-switching model. Conditioning on states is also important when looking at the adjustment

coefficients, as the non-switching models can lead to wrong conclusions about the convergence of

geographical price regions towards equilibrium.

The remainder of the paper is structured as follows: We next offer a brief description of the

structure of the Nordic electricity market. Section 3 introduces the data and argues for the importance

of allowing for long memory, regime switching and seasonality when building a model to describe the

geographical region price processes. In section 4 the VAR modeling framework with long memory and

regime switching is presented. In section 5 the empirical results are discussed and section 6 concludes.

2 The operation of the Nordic power market

Within the Nordic countries (Denmark, Finland, Norway, and Sweden), major electricity reforms

were implemented during the 1990s. The deregulation process started in Norway in 1991, continued

in Sweden 1996, in Finland 1998, and was finally completed in Denmark in 2000. As part of the

liberalization the national electricity markets were opened up for cross-border trade by establishment

of a common power exchange, Nord Pool. Today all member countries of the Nordic power market

have adapted to the new competitive environment and the Nordic exchange serves as a model for the

restructuring of other power markets throughout the world.1

The per capita consumption of electricity is very high in Norway and Sweden, slightly lower in

Finland and at EU average in Denmark. The relatively high consumption level in the Nordic countries

is caused by a relatively electricity intensive industrial production, a cold climate, and extensive use of

electric heating in homes and offices, especially in Norway and Sweden. The sources of electricity power

production are rather mixed in the Nordic area as a whole. The major energy source is hydropower

supplying approximately 65% of total electricity in years with normal precipitation. On the national

level the power generation systems differ significantly and are generally dominated by one or two

technologies. In Norway the share of hydropower is close to 100%, in Sweden it is close to 50%,

in Finland around 15% and in Denmark 0%. With respect to nuclear power the share is 50% in

Sweden, 30% in Finland, and 0% in Denmark and Norway. Power generation from fossil fuels is of

major significance in Denmark and Finland, minor in Sweden, and close to non-existent in Norway.

In Denmark 15-20% of the power supply originates from wind power turbines.2

Because hydropower production is mainly found in the northern parts of the Nordic power web

and thermal power plants are located in the south, the relatively cheap hydropower generation is

transmitted to the heavily populated southern region which of course requires a well established

power grid transmission capacity to facilitate the flow. When the reservoir levels are adequate, the

less costly hydropower production causes low spot prices. In these cases national and cross-border

transmission systems will be used to their capacity in order to level out price discrepancies across

regions. On the other hand, when reservoir levels are low there will be a net flow from south to north,

and the market will see relatively high prices for thermally generated electricity.

1 For a detailed description of the Nordic power market, see NordPool (2003a) or Amundsen & Bergman (2007).

2 Increasing the relative production of electricity by renewable energy sources has considerable political focus in Denmark. According to official energy plans 50% of the Danish electricity production will come from wind power in 2030.


From an institutional point of view there is a common Nordic market for electricity; however, even

though key market institutions are common this does not mean that the Nordic electricity market is an

integrated market in the sense that "the law of one price" applies. The reason is that the transmission

of power is subject to possible capacity constraints. The Nordic electricity market constitutes a

number of distinct geographical regions different from the countries themselves and several price areas

may coexist. Whenever the relevant interconnector capacity is insufficient, the Nord Pool area is

divided into two or more price areas. The separate power regions consist of Sweden (SWE), Finland

(FIN), West Denmark (WDK), East Denmark (EDK), North Norway (NNO), Mid Norway (MNO),

and South Norway (SNO). Thus Denmark and Norway are each divided into multiple geographical

regions in Nord Pool.3 This division reflects the grid of physical exchanges of power and the bidding

areas with respect to the pricing of electricity as we shall explain shortly. Figure 1 displays the actual

electricity exchange points.

Figure 1 about here

The power spot market4 operated by Nord Pool Spot A/S is an exchange where market participants

trade power contracts for physical delivery the next day. This is referred to as a day-ahead market.

The spot market is based on an auction with bids for purchase and sale of power contracts of one

hour duration covering the 24 hours of the following day. At the deadline for the collection of all buy

and sell orders the information is gathered into aggregate supply and demand curves for each power

delivery hour. From these supply and demand curves the equilibrium spot prices - referred to as the

system prices - are calculated.5 Therefore, the system price is determined under the assumption that

no transmission constraint is binding, and thus in a situation where no grid congestions exist across

neighboring interconnectors there will be a single identical price across the areas with no congestions.

The actual trade is not necessarily carried out at the system price. When there is insufficient

transmission capacity in a sector of the grid, a grid congestion will arise and the market system will

establish different price areas across the geographical division of the Nord Pool area. The Nordic

market is then partitioned into separate bidding areas which therefore become separate price areas

when the contractual flow between bidding areas exceeds the capacity allocated by the transmission

system operators for spot contracts. Within each price area the buyers pay, and the generators are

paid, the corresponding area price. The difference between the area prices in two adjacent price

areas determines the congestion charge. Because separate prices may coexist depending upon regional

supply and demand conditions, the relevant market definition will vary with time. In practice, several

price area combinations will occur. Some hours there will only be a single price area (given by the

system price), other hours there will be two or more price areas.

3 For the purpose of analysis of the Norwegian regions, only the SNO link is considered in the present paper.

4 Since only the spot market will be relevant for the present study, only this market will be described here, see also NordPool (2003b). NordPool (2003c) describes the futures and forward markets of the Nordic power exchange which are used for price hedging and risk management.

5 The system price is the reference price in the financial power contracts like futures, forwards, and options traded at Nord Pool.


3 Data

The data used in this paper are (log transformed) hourly electricity spot prices for the Nord Pool area;

West Denmark (WDK), East Denmark (EDK), South Norway (SNO), Sweden (SWE) and Finland

(FIN).6 The data set is the same as that analyzed in Haldrup & Nielsen (2006a, 2006b) and covers the

period 3 January 2000 to 25 October 2003, including weekends and holidays. This yields a total of

33,404 observations. For EDK the sample period starts 1 October 2000 and thus covers 26,880 sample

points. The data series are displayed in Figure 2. Some stylized facts about the data are reported in

Haldrup & Nielsen (2006b).

Figure 2 about here

A pronounced characteristic of electricity markets is the abrupt and generally unanticipated ex-

treme changes in spot electricity prices. These jumps or spikes generally occur within a very short

period of time, implying that the general level of the different series tends to be highly persistent,

possibly with mean reversion, see Escribano, Peña & Villaplana (2002), Haldrup & Nielsen (2006a, 2006b)

and Koopman, Ooms & Carnero (2007). In Haldrup & Nielsen (2006b) a range of tests document that

prices are neither I(0) nor I(1). Estimating the memory parameter for fractionally integrated, FI(d),

processes shows that the series generally exhibit long memory with d in the range 0.31-0.52 with the

SNO area being most persistent and in fact being nonstationary. The remaining areas have estimates

of d in the stationary region. It should be noted, however, that these estimates do not allow for regime

dependence.

Another important aspect of electricity prices is the very strong seasonal behaviour characterizing

the series. Seasonality is mainly driven from the demand side and appears as seasonal variation within

the day, within the week, and over the year. However, the supply side also contributes to seasonal

variation as electricity production is highly dependent upon weather conditions. In particular, the

seasonal variation in precipitation affects water reservoir levels in the generation of hydropower, and

seasonal variation in wind conditions also plays an increasing role due to the growing number of wind

turbines, especially in West Denmark.

Figure 3 about here

In Figure 3 scatter plots of log prices for adjacent Nord Pool areas are shown. When there

are no capacity constraints across neighboring regions the prices will be identical, whereas congestion

makes prices differ. Observations on the 45° line therefore represent non-congestion hours, whereas

observations off the 45° line represent congestion hours. It is especially this marked difference in

observations that motivates the present analysis.
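
Because the regime is defined by whether the two area prices coincide, the state series can be read directly off the data. The following is a minimal Python sketch of that classification, assuming two aligned arrays of log prices; the function name `observed_regime` and the tolerance argument `tol` are illustrative choices and are not taken from the paper.

```python
import numpy as np

def observed_regime(log_p1, log_p2, tol=0.0):
    """Label each hour 'nc' (prices coincide) or 'c' (prices differ).

    tol is a hypothetical tolerance for rounding in the price data;
    tol=0.0 requires exactly identical prices.
    """
    log_p1 = np.asarray(log_p1, dtype=float)
    log_p2 = np.asarray(log_p2, dtype=float)
    same_price = np.abs(log_p1 - log_p2) <= tol
    return np.where(same_price, "nc", "c")

# Made-up prices for four hours: only the third hour is congested.
p_a = np.log([20.0, 21.5, 21.5, 30.0])
p_b = np.log([20.0, 21.5, 19.0, 30.0])
states = observed_regime(p_a, p_b)
```

In a scatter plot of the two series, the hours labelled `nc` are exactly the points on the 45° line.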

6 Mid and North Norway are also member areas of Nord Pool, but are left out from the present analysis because these areas coincide with South Norway for most of the year.


4 Modeling of regime dependent long memory

4.1 A univariate model

We here briefly discuss the univariate model setup used in Haldrup & Nielsen (2006b). The main

features that the estimation model should allow include seasonality, long memory, and regime switching

of the type described above. Assume that individual electricity prices across adjacent regions are

fractionally integrated in the non-congestion state. This means that an extreme form of fractional

cointegration will exist in this state because the prices are identical across the two areas and thus

price differences will be identically zero. On the other hand, the behavior of the two individual price

series in the congestion state can be very different. If prices are compared without considering the

different regime possibilities it is unclear what to expect from the data. However, the mixing of the

two processes is likely to produce price series with a behavior that is a convex combination of the two

state processes.

Consider the following model specification, which we denote a regime switching multiplicative

RS-SARFIMA7 model:

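$$A_{s_t}(L)\left(1 - a_{s_t} L^{24}\right)\Delta^{d_{s_t}}\left(y_t - \mu_{s_t}\right) = \varepsilon_{s_t,t}, \qquad \varepsilon_{s_t,t} \sim nid\left(0, \sigma^2_{s_t}\right). \tag{1}$$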

Here, $\Delta^{d_{s_t}}$ is the fractional difference operator, $A_{s_t}(L)$ is a lag polynomial, and $s_t \in \{c, nc\}$ denotes the regime ($c$: congestion, $nc$: non-congestion), determined by a Markov chain with transition probabilities

$$P = \begin{bmatrix} p_{11} & 1 - p_{11} \\ 1 - p_{22} & p_{22} \end{bmatrix}. \tag{2}$$

Thus, for example, $p_{11}$ denotes the probability that a congestion state will follow a congestion state,

i.e. $\Pr(s_t = c \mid s_{t-1} = c)$. Note that because identical prices mean that we are in a non-congestion state, all regimes are observable, which contrasts with the standard regime switching model of Hamilton

(1989) where the regimes follow a latent Markov process.

The (univariate) series $y_t$ may denote one of two individual log price series or the associated log

relative price. The series $y_t$ has been corrected for deterministic seasonality prior to the estimation

whilst allowing interaction with the two observable regimes, that is, the coefficients on the dummy

variables are allowed to differ across states. When $y_t$ denotes a log relative price, all parameters are

set to zero when $s_t = nc$, including $\sigma^2_{nc}$. Estimation of the above model is by conditional maximum

likelihood and is discussed in detail in Haldrup & Nielsen (2006b).
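
To make the fractional difference operator in (1) concrete, the following Python sketch expands $(1-L)^d$ as a binomial series and applies the truncated filter to a finite sample. The function names, and the choice to treat pre-sample values as zero, are assumptions made for illustration; the estimation code behind the paper is not shown in the text.

```python
import numpy as np

def frac_diff_weights(d, n):
    """First n coefficients w_j of (1 - L)^d = sum_j w_j L^j."""
    w = np.empty(n)
    w[0] = 1.0
    for j in range(1, n):
        # Standard recursion for binomial coefficients: w_j = w_{j-1} * (j - 1 - d) / j
        w[j] = w[j - 1] * (j - 1 - d) / j
    return w

def frac_diff(x, d):
    """Apply the truncated filter (1 - L)^d to x (pre-sample values set to zero)."""
    x = np.asarray(x, dtype=float)
    w = frac_diff_weights(d, len(x))
    # Element t is sum_{j=0}^{t} w_j * x_{t-j}
    return np.array([w[: t + 1] @ x[t::-1] for t in range(len(x))])

# d = 1 gives ordinary first differences; 0 < d < 1 gives slowly decaying weights.
example = frac_diff([1.0, 3.0, 6.0, 10.0], 1.0)
```

For fractional $d$ the weights decay hyperbolically rather than cutting off after one lag, which is what produces the long memory behaviour described above.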

4.2 A bivariate model

A disadvantage of the model described above is that parameters are estimated separately when in

fact the price series to a large extent are governed by the same price shocks. We therefore consider

the following fractional error correction model specification for a bivariate regime switching vector

7 RS-SARFIMA: Regime Switching Seasonal Autoregressive Fractionally Integrated Moving Average.


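stochastic process subject to being in the congestion state:

$$\begin{pmatrix} \Delta^{d_1} & 0 \\ 0 & \Delta^{d_2} \end{pmatrix} \begin{pmatrix} p_{1t} \\ p_{2t} \end{pmatrix} = \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} \Delta^{\gamma} \left(p_{1,t-1} - p_{2,t-1}\right) + \sum_{i=1}^{k} \Gamma_{c,i} \tilde{\Delta}_{s_{t-i}} \begin{pmatrix} p_{1,t-i} \\ p_{2,t-i} \end{pmatrix} + \varepsilon_{c,t}, \tag{3}$$

where $\varepsilon_{c,t} \sim N(0, \Omega)$ and

$$\Gamma_{c,i} = \begin{bmatrix} \Gamma^{c}_{11,i} & \Gamma^{c}_{12,i} \\ \Gamma^{c}_{21,i} & \Gamma^{c}_{22,i} \end{bmatrix}, \qquad \tilde{\Delta}_{s_{t-i}} = \begin{cases} \operatorname{diag}\left(\Delta^{d_c}, \Delta^{d_c}\right) & \text{if } s_{t-i} = c, \\ \Delta^{d_{nc}} & \text{if } s_{t-i} = nc, \end{cases}$$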

such that the lagged fractional differences reflect whether a particular observation is associated with a

congestion or non-congestion state. Thus, $d_{nc}$ is the common fractional integration order in the non-

congestion state, whereas $d_1$ and $d_2$ are the integration orders of the two price areas in the congestion

state (cointegration requires that the two price areas' integration orders be identical, i.e. $d_1 = d_2 = d_c$).

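In the non-congestion state bilateral prices are identical, $p_{1t} = p_{2t} = p_t$, and hence the bivariate

setup collapses to a pseudo-univariate model, i.e.

$$\Delta^{d_{nc}} p_t = \sum_{i=1}^{k} \Gamma_{nc,i} \tilde{\Delta}_{s_{t-i}} \begin{pmatrix} p_{1,t-i} \\ p_{2,t-i} \end{pmatrix} + \varepsilon_{nc,t}, \tag{4}$$

where $\varepsilon_{nc,t} \sim N(0, \sigma^2)$ and $\Gamma_{nc,i} = \left(\Gamma^{nc}_{11,i}, \Gamma^{nc}_{12,i}\right)$.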

Essentially, the price process switches between being generated from (3) or (4) where switching takes

place in accordance with the transition probabilities (2).

We limit our study to the bivariate setup and disregard potential spill-overs from the other areas.

From a theoretical point of view, it is conceptually easy to extend the present bivariate model to the

multivariate case, and thereby model spill-overs using more advanced dynamics. However, from a

computational point of view this appears infeasible as the number of regimes, and thereby the number

of parameters, grows very fast. Indeed, in a multivariate setup with $M$ geographical regions, there are

$2^{M-1}$ different regimes.
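
One way to see how quickly the regime count grows is to enumerate the regimes explicitly. The sketch below makes a simplifying assumption that is ours and not stated in the paper: the $M$ regions are joined by $M-1$ interconnectors, each of which is either congested or not, so that a regime is one congestion pattern across the links. The function name `enumerate_regimes` is illustrative.

```python
from itertools import product

def enumerate_regimes(n_links):
    """All congestion patterns over n_links interconnectors (assumed layout)."""
    return list(product(("nc", "c"), repeat=n_links))

# Five regions, as in the data used here, joined by four links (an assumption).
M = 5
regimes = enumerate_regimes(M - 1)
n_regimes = len(regimes)  # equals 2 ** (M - 1)
```

With separate dynamics in every regime, each additional region doubles the number of parameter sets to be estimated.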

A number of remarks are in order. Consider first the non-congestion state. In this regime the two

price series are forced to be governed by the same process (4) and hence any conditional forecast for

this regime will remain identical for both price series. This feature is not captured in the univariate

model of Haldrup & Nielsen (2006b) and indeed requires our multivariate setup. Thus, in particular,

forecasts of each price series in the non-congestion state may appear different when based on (1),

whereas forecasts based on (4) will be identical for the two price series in the non-congestion state.

Note that in the non-congestion state the prices are fractionally integrated of order $d_{nc}$ and fractionally

cointegrated in the sense that the series perfectly co-move. This notion of (fractional) cointegration


is somewhat different from that originally suggested by Granger (1986) and Engle & Granger (1987).

Next, consider the congestion regime. We will discriminate between two situations, i.e. when $p_{1t}$ and $p_{2t}$ cointegrate or do not cointegrate. (i) Assume first the situation with fractional cointegration.

In this case it must hold that $d_1 = d_2 = d_c$, i.e. the price series have to be of the same order of

fractional integration. Notice that whilst the single price series are FI($d$), the log relative prices are

FI($\gamma$) where $\gamma < d$. At the same time we require that $(\alpha_1, \alpha_2)' \neq (0, 0)'$ with $\alpha_1 < 0$ and/or

$\alpha_2 > 0$ such that the model is truly error correcting. (ii) When prices do not cointegrate in the

congestion regime nothing guarantees that $d_1 = d_2 = d$. Most importantly, there is no error correction

towards equilibrium in this case and the usual interpretation of the parameters $(\alpha_1, \alpha_2)'$ and $\gamma$ is

invalid.

The adjustment coefficients, $(\alpha_1, \alpha_2)'$, may give an indication of whether the specific price areas

adjust towards equilibrium, which we expect them to do under cointegration. Specifically, if $\alpha_1 > 0$

then $p_{1t}$ is moving away from equilibrium (non-congestion), whereas if $\alpha_2 > 0$ then $p_{2t}$ is moving

towards equilibrium. Note that the full stability of the model requires that the entire system dynamics

is included in the calculation, but in any case the values of $\alpha_1$ and $\alpha_2$ give a rough idea of the

system dynamics under a ceteris paribus assumption. An alternative interpretation of the adjustment

coefficients follows from the market setup and varying costs of electricity production in different

geographical regions. For example, if there is no congestion between SNO and WDK prices are

identical and electricity flows from the cheaper area (usually SNO because of the hydropower) to the

more expensive area (WDK). However, if there is congestion, prices in WDK will be higher reflecting

the higher costs of electricity production. This increase in price in WDK corresponds to $\alpha_1 > 0$ in

the WDK-SNO bivariate model, i.e. a move away from equilibrium. Importantly, this is not due to

system instability but rather to electricity being more expensive to produce in WDK compared to

SNO.

The model analyzed in this paper is unique in the literature on regime switching and/or (frac-

tionally) cointegrated models since it collapses to a pseudo-univariate model in one of the regimes.

The error correction model specification (3)-(4) reflects the particular structure and features of the

market design. For discussions of representation theory in the context of (non-switching) fractional

cointegration, see Granger (1986), Davidson (2002), Robinson & Yajima (2002), Davidson, Peel &

Byers (2005), and Johansen (2008).

4.3 Estimation

In our case congestion/non-congestion is an observed state such that regimes are observable, and the

maximum likelihood estimates of the transition probabilities are

$$\hat{p}_{11} = \frac{n_{c,c}}{n_{c,c} + n_{c,nc}}, \tag{5}$$

$$\hat{p}_{22} = \frac{n_{nc,nc}}{n_{nc,c} + n_{nc,nc}}, \tag{6}$$

where $n_{ij}$ is the number of times we observe regime $i$ followed by regime $j$ for $i, j \in \{c, nc\}$.

Estimation of the remaining parameters of the two states is done by (quasi) conditional maximum


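likelihood. The regime-specific log-likelihood functions, omitting the constant, are

$$l_c(d_c, \theta_c) = -\frac{\sum_t 1\{s_t = c\}}{2} \log|\Omega| - \frac{1}{2} \sum_t \operatorname{trace}\left(\Omega^{-1} \varepsilon_{s_t,t} 1\{s_t = c\} \varepsilon'_{s_t,t}\right),$$

$$l_{nc}(d_{nc}, \theta_{nc}) = -\frac{\sum_t 1\{s_t = nc\}}{2} \log \sigma^2 - \frac{1}{2} \sum_t \left(\sigma^{-2} \varepsilon_{s_t,t} 1\{s_t = nc\} \varepsilon'_{s_t,t}\right),$$

where $1\{A\}$ is the indicator function of the event $A$. The full-sample log-likelihood function is given by (assuming independence between $\varepsilon_{c,t}$ and $\varepsilon_{nc,t}$)

$$l(d_c, d_{nc}, \theta) = -\frac{T}{2} \log(2\pi) + l_c(d_c, \theta_c) + l_{nc}(d_{nc}, \theta_{nc}). \tag{7}$$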
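
As a rough illustration of how the full-sample log-likelihood is assembled from its two regime-specific parts, the Python sketch below evaluates it for given residuals and covariance parameters. It assumes the residuals have already been computed from the model equations and split by observed regime; the function name `full_sample_loglik` and its argument layout are hypothetical.

```python
import numpy as np

def full_sample_loglik(resid_c, resid_nc, omega, sigma2):
    """Evaluate the full-sample Gaussian log-likelihood from regime residuals.

    resid_c  : (n_c, 2) array, bivariate residuals for congestion hours
    resid_nc : (n_nc,) array, scalar residuals for non-congestion hours
    omega    : (2, 2) error covariance matrix in the congestion state
    sigma2   : error variance in the non-congestion state
    """
    resid_c = np.asarray(resid_c, dtype=float).reshape(-1, 2)
    resid_nc = np.asarray(resid_nc, dtype=float).ravel()
    omega = np.asarray(omega, dtype=float)
    n_c, n_nc = len(resid_c), len(resid_nc)

    # Congestion part: sum_t trace(Omega^{-1} e e') = trace(Omega^{-1} sum_t e e')
    _, logdet = np.linalg.slogdet(omega)
    quad_c = np.trace(np.linalg.solve(omega, resid_c.T @ resid_c))
    l_c = -0.5 * n_c * logdet - 0.5 * quad_c

    # Non-congestion part: scalar Gaussian kernel
    l_nc = -0.5 * n_nc * np.log(sigma2) - 0.5 * (resid_nc @ resid_nc) / sigma2

    T = n_c + n_nc
    return -0.5 * T * np.log(2.0 * np.pi) + l_c + l_nc

# Tiny made-up example: two congestion hours and one non-congestion hour.
value = full_sample_loglik([[1.0, 0.0], [0.0, 1.0]], [2.0], np.eye(2), 4.0)
```

In an actual estimation the residuals themselves depend on the fractional orders and autoregressive parameters, so a function of this kind would sit inside the objective passed to a numerical optimizer.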

When using a numerical optimization algorithm to maximize the log-likelihood function, concern

must be given to the selection of starting values. The reason for concern is that the log-likelihood

function is not globally concave and hence the results of the selected numerical optimization algorithm

may depend on the choice of starting values. In our case we have used the fractional integration

estimates from Haldrup & Nielsen (2006b) as our starting values. For the remaining parameters, i.e.

autoregressive and variance-covariance terms etc., we find starting values by letting the fractional

integration parameters be fixed at their initial values and maximizing the log-likelihood with respect

to the remaining parameters.

Finally, we remark that our model framework assumes that states are observable and that the

cointegrating vector in the congestion state, $\beta = (1, -1)$, is given. Therefore, asymptotic distribution theory for the remaining parameters will be standard under suitable regularity conditions on the errors

$\varepsilon_{s_t,t}$, such as serial independence and moment conditions. In particular, Gaussianity of the errors is not

a necessary condition for the asymptotic distribution theory, but is used only to derive the likelihood

function.

5 Empirical results

Prior to estimation, each log price series had deterministic seasonality removed by regression on a

constant, a time trend, dummy variables for hour-of-day, day-of-week, month-of-year, and a holiday

dummy. The parameter estimates for the constant, trend, and dummy variables are allowed to di¤er

across states. For computational reasons we have selected to set k = 4 to capture the within-the-

day e¤ects and also include a 24th lag, to capture the daily stochastic seasonality. The gain from

introducing more lags and/or e.g. a weekly lag instead of a daily, was not signi�cant enough in terms

of whiteness of the residuals to compensate for the considerable estimation time.
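
The deseasonalization step can be sketched as an ordinary least squares regression on calendar dummies, run separately within each observed state so that all coefficients may differ across states. The Python code below is an illustrative sketch only: the function names are hypothetical, the holiday dummy is omitted because it needs a country-specific calendar, and the example data are made up.

```python
import numpy as np
from datetime import datetime, timedelta

def seasonal_design(timestamps):
    """Constant, linear trend, and hour/weekday/month dummies (first level dropped)."""
    n = len(timestamps)
    hours = np.array([ts.hour for ts in timestamps])
    wdays = np.array([ts.weekday() for ts in timestamps])
    months = np.array([ts.month for ts in timestamps])
    cols = [np.ones(n), np.arange(n, dtype=float)]
    cols += [(hours == h).astype(float) for h in range(1, 24)]
    cols += [(wdays == w).astype(float) for w in range(1, 7)]
    cols += [(months == m).astype(float) for m in range(2, 13)]
    return np.column_stack(cols)

def deseasonalize(y, timestamps, states):
    """Residuals from state-specific OLS regressions on the seasonal design."""
    y = np.asarray(y, dtype=float)
    states = np.asarray(states)
    X = seasonal_design(timestamps)
    resid = np.empty_like(y)
    for s in np.unique(states):
        idx = states == s
        # lstsq returns a least squares solution even if some dummies are all zero
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        resid[idx] = y[idx] - X[idx] @ beta
    return resid

# Made-up example: 60 days of hourly data with a purely deterministic pattern.
start = datetime(2000, 1, 3)
stamps = [start + timedelta(hours=h) for h in range(60 * 24)]
hour = np.array([ts.hour for ts in stamps])
weekend = np.array([ts.weekday() >= 5 for ts in stamps], dtype=float)
y_demo = 3.0 + 0.5 * np.sin(2 * np.pi * hour / 24) + 0.2 * weekend
s_demo = np.where(np.arange(len(stamps)) % 48 < 24, "c", "nc")
resid_demo = deseasonalize(y_demo, stamps, s_demo)
```

Because the example series is a pure function of the hour and the weekday, the residuals are zero up to rounding error; with real prices the residuals would carry the stochastic dynamics that the model is then fitted to.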

5.1 Estimation of transition dynamics

Since the states are observable, as discussed earlier, estimates of the transition probabilities for each

state are easily calculated and are reported in Table 1. It is clear that some grid points are more

subject to congestion than others. This fact may be explained by demand and supply fluctuations,

but there is also the possibility that congestion may be caused by exploitation of market power.


Table 1: Estimated transition probabilities (mean duration of states)

| Link | Current state | Next state: congestion | Next state: non-congestion |
|---|---|---|---|
| EDK-SWE | congestion | 0.7848 (4.65) | 0.2152 |
| EDK-SWE | non-congestion | 0.0131 | 0.9869 (76.57) |
| WDK-SWE | congestion | 0.8216 (5.60) | 0.1784 |
| WDK-SWE | non-congestion | 0.1259 | 0.8740 (7.94) |
| WDK-SNO | congestion | 0.9247 (13.28) | 0.0753 |
| WDK-SNO | non-congestion | 0.1221 | 0.8779 (8.19) |
| SNO-SWE | congestion | 0.9478 (19.16) | 0.0523 |
| SNO-SWE | non-congestion | 0.0462 | 0.9538 (21.64) |
| SWE-FIN | congestion | 0.8505 (6.51) | 0.1495 |
| SWE-FIN | non-congestion | 0.0210 | 0.9790 (48.78) |

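Table 2: Estimates for the EDK-SWE link

| Model | No switching: $d_1$ | No switching: $d_2$ | No switching: $\gamma$ | Switching, non-congestion: $d_1^{nc}$ | Switching, non-congestion: $d_2^{nc}$ | Switching, non-congestion: $\gamma^{nc}$ | Switching, congestion: $d_1^{c}$ | Switching, congestion: $d_2^{c}$ | Switching, congestion: $\gamma^{c}$ |
|---|---|---|---|---|---|---|---|---|---|
| Univariate | 0.43 (0.012) | 0.43 (0.012) | 0.05 (0.018) | 0.46 (0.012) | 0.46 (0.011) | 0 | 0.03 (0.013) | 0.03 (0.012) | -0.26 (0.077) |
| VAR estimates | 0.45 (0.011) | 0.49 (0.018) | 0.21 (0.019) | 0.32 (0.011) | 0.32 (0.011) | 0 | 0.09 (0.021) | 0.10 (0.038) | 0.04 (0.04) |
| VAR estimates, restricted $d_1 = d_2$ | 0.49 (0.009) | 0.49 (0.009) | 0.21 (0.047) | 0.32 (0.013) | 0.32 (0.013) | 0 | 0.09 (0.040) | 0.09 (0.040) | 0.00 (0.049) |

Notes: Subscripts denote the geographical region and superscripts denote the state. Standard errors are given in parentheses.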

The estimated transition probabilities indicate a high degree of persistence in the states. The

probability of staying in the congestion regime, $p_{11}$, is highest for the grid point SNO-SWE, 0.9478,

whereas it is lowest for the EDK-SWE link, 0.7848. This corresponds to a mean duration of 19.16 and

4.65 hours, respectively. In general, the probability of staying in the non-congestion regime, $p_{22}$, is

higher, estimated at 0.8740-0.9870, corresponding to mean duration of 7.94-76.57 hours.
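
Since the states are observed, the transition probabilities are simple ratios of transition counts, and the mean duration of a state with staying probability $p$ is $1/(1-p)$, the mean of a geometric distribution, which matches the durations reported in Table 1. A minimal Python sketch, with illustrative function names and a made-up state sequence:

```python
from collections import Counter

def transition_mle(states):
    """Estimate (p11, p22) from an observed sequence of 'c' / 'nc' labels.

    Assumes both states occur at least once before the final observation,
    so that neither denominator is zero.
    """
    counts = Counter(zip(states[:-1], states[1:]))
    p11 = counts[("c", "c")] / (counts[("c", "c")] + counts[("c", "nc")])
    p22 = counts[("nc", "nc")] / (counts[("nc", "c")] + counts[("nc", "nc")])
    return p11, p22

def mean_duration(p_stay):
    """Expected number of consecutive hours spent in a state."""
    return 1.0 / (1.0 - p_stay)

# Made-up sequence of nine hourly states.
demo_states = ["c", "c", "nc", "nc", "nc", "c", "c", "c", "nc"]
p11_hat, p22_hat = transition_mle(demo_states)

# Durations implied by two of the congestion probabilities in Table 1.
dur_edk_swe = round(mean_duration(0.7848), 2)
dur_sno_swe = round(mean_duration(0.9478), 2)
```

Applied to the two congestion probabilities quoted in the text, the formula reproduces the reported mean durations of 4.65 and 19.16 hours.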

5.2 Estimation of fractional integration and cointegration parameters

In Tables 2-6 we present the estimates of the fractional integration order $d$ for a number of different cases.

The models estimated under the heading "No switching" use pooled data, i.e. there is no separation

of data connected with congestion and non-congestion periods. The estimates of $d_1$, $d_2$ refer to the

fractional orders estimated for the first and second region, respectively, whereas the estimate of $\gamma$ is

the fractional integration order of the log relative price. The results presented under the heading

"Switching" refer to similar estimates when data is partitioned into congestion and non-congestion

periods, where we use superscripts $c$ or $nc$ to denote estimates under the congestion and non-congestion

regimes, respectively. Note that by definition $\gamma^{nc} = 0$ in the non-congestion state because the single


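Table 3: Estimates for the WDK-SWE link

                  No switching               Switching
                                             Non-congestion             Congestion
Model             d1       d2       rel.     d1^nc    d2^nc    rel.^nc  d1^c     d2^c     rel.^c
Univariate                 0.42     0.27     0.38     0.33     0        0.28     0.46     0.37
                           (0.011)  (0.017)  (0.024)  (0.013)           (0.021)  (0.014)  (0.015)
VAR estimates              0.54     0.51     0.19     0.19     0        0.12     0.39     0.23
                           (0.020)  (0.025)  (0.021)  (0.021)           (0.020)  (0.012)  (0.012)
VAR estimates,    0.56     0.56     0.53     0.25     0.25     0        0.33     0.33     0.11
restricted d1=d2  (0.011)  (0.011)  (0.072)  (0.018)  (0.018)           (0.033)  (0.033)  (0.029)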


price series are identical and hence the series are fractionally cointegrated in an extreme form. Results are reported using three different models. For comparison, "Univariate" reproduces the estimates reported in Haldrup & Nielsen (2006b), i.e. estimates using the model (1) for both the regime switching and non-regime switching cases. The row named "VAR estimates" displays estimates based on the model (3)-(4). Note that, as opposed to the univariate estimates, $d_1^{nc} = d_2^{nc}$ by construction, since the price series follow the same process in these cases. Finally, VAR estimates are reported where we restrict $d_1 = d_2$ in the non-switching case and $d_1^c = d_2^c$ in the congestion state under the regime switching case.

Consider first the East Denmark-Sweden connection exhibited in Table 2, and consider initially the pooled data set without regime switching. The estimates of d for the two regions are rather similar regardless of the underlying model being estimated, i.e. estimates are in the range 0.43-0.49 and hence on the borderline of the stationary region. The estimates for the relative price are somewhat lower: 0.05 when the univariate model is used for estimation and 0.21 when the VAR model is used. These results indicate that when the data are not classified according to regimes, there is evidence of fractional cointegration amongst the series. Now, the question is whether this result is caused by the non-congestion state dominating the sample or whether both regimes contribute to the cointegration finding. In the regime switching case, the non-congestion estimates clearly indicate cointegration (as expected) with estimates of d in the range 0.32-0.46. In the congestion case, the memory parameters of the two price series in the VAR model are similar but somewhat lower, i.e. 0.09-0.10. Also, there is indication of a weak form of fractional cointegration in the congestion state since the relative price is FI(0.04). When we restrict $d_1 = d_2$ over the different scenarios, the results tell the same story as in the unrestricted case.

Next, we turn to the West Denmark-Sweden link in Table 3. For the model without regime switching, both the restricted and unrestricted parameter estimates using the VAR model indicate no presence of fractional cointegration, which is similar to what is found in the univariate case. Under regime switching there is clearly cointegration in the non-congestion state; however, for the model with unrestricted integration orders there is no cointegration in the congestion state. The results from the no switching models are thus some combination of their regime switching counterparts, and it is clear that by not taking regime switching into account we falsely conclude that there is no sign of fractional cointegration, whereas it is evident that fractional cointegration is present in the non-congestion state. When we restrict $d_1 = d_2$, the VAR model in fact shows cointegration also in the congestion state.

The West Denmark-South Norway link with estimates in Table 4 is an interesting case where


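Table 4: Estimates for the WDK-SNO link

                  No switching               Switching
                                             Non-congestion             Congestion
Model             d1       d2       rel.     d1^nc    d2^nc    rel.^nc  d1^c     d2^c     rel.^c
Univariate                 0.44     0.28     0.30     0.16     0        0.31     0.63     0.37
                           (0.011)  (0.016)  (0.026)  (0.008)           (0.017)  (0.017)  (0.015)
VAR estimates              0.57     0.91     0.43     0.43     0        0.20     0.22     -0.07
                           (0.018)  (0.019)  (0.016)  (0.016)           (0.017)  (0.013)  (0.015)
VAR estimates,    0.92     0.92     1.13     0.37     0.37     0        0.34     0.34     0.10
restricted d1=d2  (0.012)  (0.012)  (0.018)  (0.021)  (0.021)           (0.047)  (0.047)  (0.032)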


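Table 5: Estimates for the SNO-SWE link

                  No switching               Switching
                                             Non-congestion             Congestion
Model             d1       d2       rel.     d1^nc    d2^nc    rel.^nc  d1^c     d2^c     rel.^c
Univariate                 0.41     0.31     0.38     0.41     0        0.32     0.21     0.39
                           (0.012)  (0.016)  (0.008)  (0.012)           (0.013)  (0.013)  (0.018)
VAR estimates              0.59     0.06     0.49     0.49     0        0.32     0.18     0.31
                           (0.017)  (0.010)  (0.018)  (0.018)           (0.007)  (0.007)  (0.013)
VAR estimates,    0.46     0.46     0.21     0.39     0.39     0        0.28     0.28     0.25
restricted d1=d2  (0.007)  (0.007)  (0.010)  (0.012)  (0.012)           (0.007)  (0.007)  (0.015)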


there seems to be no fractional cointegration in the non-switching models. However, looking at the VAR models where we condition on congestion/non-congestion we see that there is in fact fractional cointegration in both states. That is, an extreme form in the non-congestion state by definition and in the congestion state because $d_1^c \approx d_2^c$ (or $d_1^c = d_2^c$) and we have a reduction of fractional order for the relative price series. In the univariate model there is no sign of fractional cointegration.

As seen from Table 5, the link between South Norway and Sweden indicates fractional cointegration in the model without regime switching. However, when conditioning on states, it is seen that cointegration takes place only in the non-congestion state. In the congestion state the fractional orders of the single price series and the relative price series are almost identical for all three models.

Finally, for the Sweden-Finland link in Table 6 there is some evidence of fractional cointegration

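Table 6: Estimates for the SWE-FIN link

                  No switching               Switching
                                             Non-congestion             Congestion
Model             d1       d2       rel.     d1^nc    d2^nc    rel.^nc  d1^c     d2^c     rel.^c
Univariate                 0.38     0.24     0.42     0.43     0        -0.02    -0.02    0.48
                           (0.012)  (0.017)  (0.011)  (0.012)           (0.012)  (0.005)  (0.022)
VAR estimates              0.60     0.34     0.31     0.31     0        0.02     0.02     0.01
                           (0.015)  (0.017)  (0.014)  (0.014)           (0.010)  (0.009)  (0.012)
VAR estimates,    0.49     0.49     -0.07    0.31     0.31     0        0.02     0.02     0.01
restricted d1=d2  (0.009)  (0.009)  (0.037)  (0.013)  (0.013)           (0.011)  (0.011)  (0.013)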



Table 7: Estimated adjustment coefficients

            No switching                                Switching
            d1 ≠ d2               d1 = d2               d1 ≠ d2               d1 = d2
Series      α1         α2         α1         α2         α1         α2         α1         α2
EDK-SWE     0.1775**   0.3526**   0.3495**   0.0263     0.1592**   0.3130**   0.1987**   0.2587**
            (0.0169)   (0.0172)   (0.1054)   (0.0218)   (0.0331)   (0.0443)   (0.0397)   (0.0538)
WDK-SWE     0.0328     0.0576     -0.1239    -0.0647**  0.4798**   0.0284     0.0033     0.0059
            (0.0943)   (0.0678)   (0.1244)   (0.0321)   (0.0374)   (0.025)    (0.0891)   (0.0503)
WDK-SNO     0.9253**   0.0220     0.0173     -0.0200    0.0429**   0.0048     0.1247**   0.0784**
            (0.0219)   (0.0243)   (0.0271)   (0.0212)   (0.0161)   (0.0143)   (0.0246)   (0.0171)
SNO-SWE     -0.0433    -0.0017    0.8682**   0.0376     -1.039**   0.3130**   0.9871**   0.5623**
            (0.0263)   (0.0448)   (0.0522)   (0.0224)   (0.1231)   (0.1043)   (0.1126)   (0.1526)
SWE-FIN     0.3106**   0.9949**   0.0154*    0.0862**   0.7152**   0.2694**   0.6991**   0.3479**
            (0.0527)   (0.0842)   (0.0090)   (0.0177)   (0.0210)   (0.0313)   (0.0304)   (0.0154)

Notes: Subscripts denote the geographical region. Numbers in bold face refer to situations with indication of fractional cointegration based on the d1, d2 and relative price estimates reported in Tables 2-6. Standard errors are given in parentheses. One and two asterisks denote significance at the 10% and 5% levels, respectively.

in the non-switching models. For the univariate model, the regime switching results do not make much sense because the estimated order of the relative price in the congestion state exceeds $\max\{d_1^c, d_2^c\}$. The two regime switching VAR models (with and without the restriction $d_1 = d_2$) give identical results. There is cointegration in the non-congestion state whereas all series seem to be I(0) in the congestion state. Hence, the non-congestion state seems to dominate the data when there is no conditioning on state.

5.3 Estimation of adjustment coefficients

By modeling the data using the multivariate switching VAR model (3)-(4) we obtain estimates of the adjustment coefficients in the congestion state, which is not possible when estimating univariate models. The adjustment coefficients indicate (ceteris paribus) whether a specific geographical price region is moving towards or away from equilibrium in response to a particular price gap. An alternative interpretation of the adjustment coefficients follows from the market setup and the varying costs of electricity production in different geographical regions: if an inexpensive electricity supply from another geographical region is suddenly stopped due to congestion, prices are expected to be higher until non-congestion is restored, which may result in adjustment parameters indicating a move away from equilibrium. Parameter interpretation is of course an issue here, because we force the cointegrating vector to be (1, -1) and the parameter estimates, including $\alpha_1$ and $\alpha_2$, do not have the usual interpretation in the congestion state if in fact there is no cointegration present in that state (due to lack of identification). Therefore, if cointegration is not present the interpretation of $(\alpha_1, \alpha_2)'$ should be made with caution.

In Table 7 the adjustment coefficients $(\alpha_1, \alpha_2)$ associated with the VAR models are reported, both with restricted and unrestricted d parameters and for the switching and non-switching cases. Numbers in boldface font indicate situations where, based upon the $d_1$, $d_2$ and relative price estimates, some degree of fractional cointegration is likely to take place. In the regime switching models, boldface indicates situations where there appears to be cointegration in the congestion state.

Consider first the East Denmark-Sweden connection. When we do not condition on regime switching and $d_1 \neq d_2$, neither East Denmark nor Sweden appears to correct towards equilibrium. On the other hand, when $d_1 = d_2$ is enforced, East Denmark moves towards equilibrium whereas Sweden's adjustment coefficient is insignificant. When we condition on regime switching, East Denmark moves away from equilibrium, whereas Sweden now moves towards equilibrium. This would appear to contradict error correction adjustment. However, there may be other reasons for these seemingly contradictory results. For example, if there is no congestion between EDK and SWE, prices are identical and electricity flows from the cheaper area (usually SWE because of the hydropower and nuclear electricity production) to the more expensive area (usually EDK because the majority of its electricity production stems from thermal plants), see Table 1 in Haldrup & Nielsen (2006a). Therefore, when congestion occurs, prices in East Denmark will usually be higher, reflecting the higher marginal cost of electricity production in East Denmark compared to Sweden. This increase in price in East Denmark corresponds to $\alpha_1 > 0$ in the EDK-SWE bivariate model, i.e. a move away from equilibrium. Importantly, this is not due to system instability but rather due to electricity being more expensive to produce in East Denmark than in Sweden.

Next, we look at the West Denmark-Sweden link. Only the case with $d_1 = d_2$ for the switching model makes sense here, as this is the only situation where some degree of cointegration was found. However, since both adjustment parameters are small and insignificant, the power of the error correction mechanism should be questioned in this case.

Looking at the West Denmark-South Norway connection we found no immediate sign of cointegration in the non-switching model, see Table 4. When we condition on regimes there is cointegration, and we see that both areas appear to move away from equilibrium. An explanation for this feature follows again from the setup of the market and the marginal costs of electricity production. When congestion occurs, prices in West Denmark will be higher, reflecting the higher costs of electricity production. If demand continues to increase in West Denmark during the congestion, more expensive generators will be taken into use, thus increasing the marginal cost of production even further. This increase in price in West Denmark corresponds to $\alpha_1 > 0$ in the WDK-SNO bivariate model, i.e. a move away from equilibrium. Again, this is not due to system instability but rather due to electricity being more expensive to produce in West Denmark than in South Norway.

The South Norway-Sweden and Sweden-Finland cases are similar in the sense that no cointegration was found in the congestion state. However, in the non-switching model fractional cointegration was suggested by the data. Enforcing $d_1 = d_2$ seems to affect the adjustment mechanisms rather radically, which we attribute to the lack of conditioning on states.

To sum up, appropriate modeling of the regime switching feature is seen to have a major impact on the dynamic price adjustment mechanism. In addition to giving estimates of the adjustment process specific to the particular state, conditioning on congestion/non-congestion allows interpretation of the adjustment coefficients in terms of the prices in each geographic region under the congestion regime and not necessarily in terms of the stability of the system.


6 Conclusion

In this paper we have proposed a multivariate extension of the univariate framework of Haldrup & Nielsen (2006b). This extension enables us to describe the dynamic structure of congestion and non-congestion of electricity prices within the Nord Pool area. The notions of congestion and non-congestion are motivated by the organization of the Nord Pool market, which is characterized by physical exchanges of power across geographical regions. When the actual transmission of electricity is constrained by the flow capacity, congestion occurs. Therefore, the presence or absence of transmission bottlenecks may have implications for the way prices are formed. Our multivariate modeling framework allows us to explicitly take into account the fact that, in non-congestion periods, prices are the same across geographical regions and are therefore also subject to the same price shocks. This, in particular, is not possible in the univariate frameworks of previous studies.

From our empirical analysis it is clear that conditioning on states, i.e. congestion vs. non-congestion, has a major impact on the implications for the dynamics of the electricity prices. That is, when not conditioning on the specific states, misleading conclusions with regard to potential fractional cointegration and the adjustment to equilibrium may be drawn.

There are three possible types of misclassification of the model dynamics in the empirical analysis: (1) non-switching models may indicate that the price series are fractionally cointegrated, whereas when conditioning on states this is only the case in the non-congestion state (which is cointegrated by definition); (2) the non-switching model could indicate that there is no fractional cointegration when in fact there is cointegration in the non-congestion state; and (3) there is the possibility of fractional cointegration in both regimes, but not in the non-switching model. A feature of our model that is particular to its multivariate nature is that we are able to estimate adjustment coefficients in the error correction representation. Again, it is important to condition on congestion/non-congestion, since we may otherwise draw false conclusions about the adjustment to equilibrium (non-congestion).

Some geographical regions are indirectly connected, e.g. West Denmark and East Denmark are indirectly connected through Sweden, so there are regimes where West Denmark and East Denmark constitute the same price area. The effects of these indirect links between geographical regions and how they potentially affect fractional cointegration and the adjustment in the system are therefore of major interest. A detailed analysis which includes indirect links is conceptually straightforward using a higher-dimensional model, but computationally infeasible and therefore left for future research.

References

Amundsen, E. S. & Bergman, L. (2007), "Integration of multiple national markets for electricity: The case of Norway and Sweden", Energy Policy 35, 3383-3394.

Davidson, J. (2002), "A model of fractional cointegration, and tests for cointegration using the bootstrap", Journal of Econometrics 110, 187-212.

Davidson, J., Peel, D. & Byers, D. (2005), The long memory model of political support: some further results, Working Papers 003057, Lancaster University Management School, Economics Department.

Diebold, F. X. & Inoue, A. (2001), "Long memory and regime switching", Journal of Econometrics 105, 131-159.

Ding, Z. & Granger, C. W. J. (1996), "Modeling volatility persistence of speculative returns: a new approach", Journal of Econometrics 73, 185-215.

Engle, R. F. & Granger, C. W. J. (1987), "Modeling the persistence of conditional variances", Econometric Reviews 5, 1-50.

Escribano, Á., Peña, J. I. & Villaplana, P. (2002), Modeling electricity prices: International evidence, Economics Working Papers we022708, Universidad Carlos III, Departamento de Economía.

Fabra, N. & Toro, J. (2002), Price wars and collusion in the Spanish electricity market, Industrial Organization 0212001, EconWPA.

Granger, C. W. J. (1980), "Long memory relationships and the aggregation of dynamic models", Journal of Econometrics 14, 227-238.

Granger, C. W. J. (1981), "Some properties of time series data and their use in econometric model specification", Journal of Econometrics 16, 121-130.

Granger, C. W. J. (1986), "Developments in the study of cointegrated economic variables", Oxford Bulletin of Economics and Statistics 48, 213-228.

Granger, C. W. J. & Hyung, N. (2004), "Occasional structural breaks and long memory with an application to the S&P 500 absolute stock returns", Journal of Empirical Finance 11, 399-421.

Granger, C. W. J. & Joyeux, R. (1980), "An introduction to long-memory time series models and fractional differencing", Journal of Time Series Analysis 1, 15-29.

Haldrup, N. & Nielsen, M. Ø. (2006a), "Directional congestion and regime switching in a long memory model for electricity prices", Studies in Nonlinear Dynamics & Econometrics 10, 1-24.

Haldrup, N. & Nielsen, M. Ø. (2006b), "A regime switching long memory model for electricity prices",

Hamilton, J. D. (1989), "A new approach to the economic analysis of nonstationary time series and the business cycle", Econometrica 57, 357-384.

Hosking, J. R. M. (1981), "Fractional differencing", Biometrika 68, 165-176.

Johansen, S. (2008), "A representation theory for a class of vector autoregressive models for fractional processes", Econometric Theory 24, 651-676.

Koopman, S. J., Ooms, M. & Carnero, M. A. (2007), "Periodic seasonal Reg-ARFIMA-GARCH models for daily electricity spot prices", Journal of the American Statistical Association 102, 16-27.

Motta, M. (2004), Competition policy, theory and practice, Cambridge University Press, Cambridge.

NordPool (2003a), "Derivatives trade at Nord Pool's financial market", Working Paper, www.nordpool.no.

NordPool (2003b), "The Nordic power market, electricity power exchange across national borders", Working Paper, www.nordpool.no.

NordPool (2003c), "The Nordic spot market, the world's first international spot power exchange", Working Paper, www.nordpool.no.

Robinson, P. M. & Yajima, Y. (2002), "Determination of cointegrating rank in fractional systems",

Sherman, P. (1989), The regulation of monopoly, Cambridge University Press, Cambridge.

7 Appendix: Figures

Figure 1: Map of the Nord Pool area.

Figure 2: Hourly log spot electricity prices for the Nord Pool area covering the period 3 January 2000 to 25 October 2003. Panels: East Denmark, West Denmark, South Norway, Sweden, Finland.

Figure 3: Scatter plots of hourly log prices across Nord Pool regions. Panels: East Denmark and Sweden, West Denmark and Sweden, West Denmark and South Norway, South Norway and Sweden, Sweden and Finland.



Chapter 2

Local polynomial Whittle estimation covering non-stationary fractional processes*

Frank S. Nielsen†


March 18, 2009

Abstract

This paper extends the local polynomial Whittle estimator of Andrews & Sun (2004) to fractionally integrated processes covering both stationary and non-stationary regions. We utilize the notion of the extended discrete Fourier transform and periodogram to extend the local polynomial Whittle estimator to the non-stationary region. We further approximate the short-run component of the spectrum by a polynomial instead of a constant in a shrinking neighborhood of zero, and thereby alleviate some of the bias that the local Whittle estimator is prone to. This bias reduction comes at a cost, as the variance is inflated by a multiplicative constant. We show consistency and asymptotic normality for $d \in (-1/2, \infty)$, and if the spectral density of the short-run component is infinitely smooth near frequency zero, we obtain a rate of convergence arbitrarily close to the parametric rate. A simulation study illustrates the performance of the proposed estimator compared to the classical local Whittle estimator and the local polynomial Whittle estimator. The empirical justification of the proposed estimator is shown through an analysis of credit spreads.

Keywords: Bias reduction, fractional integration, local polynomial, local Whittle estimation, long memory.

JEL Classification: C22

*I am grateful to Niels Haldrup, Javier Hualde, and Morten Ørregaard Nielsen for valuable suggestions and comments. This work was partly done while I was visiting Cornell University; their hospitality is gratefully acknowledged. I would also like to thank Thomas Stephansen for proofreading the manuscript. I gratefully acknowledge financial support from the Danish Social Sciences Research Council (grant no. FSE275-05-0199) and the Center for Research in Econometric Analysis of Time Series (CREATES), funded by the Danish National Research Foundation.

†CREATES, School of Economics and Management, Aarhus University, Building 1322, DK-8000 Aarhus C, Denmark. email: [email protected].

1 Introduction

We are interested in semiparametric frequency-domain estimation based on the local approximation

$$f(\lambda) \sim \varphi(\lambda)\lambda^{-2d} \quad \text{as } \lambda \to 0^+, \tag{1}$$

where $\varphi(0) \in (0, \infty)$ and the symbol "$\sim$" means that the ratio of the left and right hand sides tends to one in the limit. $\varphi(\lambda)$ is an even, positive, continuous function on $[-\pi, \pi)$ which can be thought of as the spectral density of the short-memory component of the series of interest. Semiparametric estimators have been popular for a long time, as it is believed that the loss of efficiency relative to parametric estimators entailed by the local specification may be offset by greater robustness. This robustness stems from avoiding the inconsistency in estimating the long-run dynamics that may be caused by a misspecification of the short-run dynamics.

Under stationarity and modeling $\varphi(\lambda)$ in (1) by a constant $G \in (0, \infty)$, a common semiparametric estimator is the local Whittle (LW) estimator proposed by Künsch (1987). Robinson (1995a) shows its consistency and asymptotic normality for $d \in (-1/2, 1/2)$. Velasco (1999a) extended Robinson's (1995a) results to show that the estimator is consistent for $d \in (-1/2, 1)$ and asymptotically normally distributed for $d \in (-1/2, 3/4)$, given that the fractional process is of Type I, see Marinucci & Robinson (1999) and Robinson (2005). Phillips & Shimotsu (2004) show that the LW estimator is consistent for $d \in (1/2, 1]$ and has a nonnormal limit distribution for $d \in (3/4, 1)$, and a mixed normal limit distribution for $d = 1$. When $d > 1$ the LW estimator converges to unity in probability and is therefore inconsistent, given that the fractional process is of Type II, Phillips & Shimotsu (2004). This convergence in probability to unity when $d > 1$ also holds for log periodogram estimators, as shown in simulation studies by Hurvich & Ray (1995) and Velasco (1999b), and theoretically by Kim & Phillips (2006). That is, in general the LW (or log periodogram) estimator is not a good general purpose estimator when d takes on values in the non-stationary region beyond 3/4: the asymptotic theory is discontinuous at $d \in \{3/4, 1\}$ and the estimator is not consistent for $d > 1$. Several methods are available to avoid the problems when entering the non-stationary region. A simple one is to first difference the series before using the semiparametric estimator and then add one to the estimate. This method runs into problems if the series of interest is trend stationary, Shimotsu & Phillips (2005) and Shimotsu (2006). Tapering the data is another method often implemented and suggested, see Velasco (1999a) and Hurvich & Chen (2000).

Shimotsu & Phillips (2005) introduce what they call an exact local Whittle estimator^1, which is consistent and has the same N(0, 1/4) limit distribution for all values of d if the I(d) series is generated by a linear sequence and the range of the estimator is not wider than 9/2.^2 Instead of using fractional differencing of the data, Abadir, Distaso & Giraitis (2007) use a different approach first noted by Phillips (1999). They extend the discrete Fourier transform to the non-stationary case and use this in whitening of the periodogram. Abadir et al. (2007) show that when the I(d) series is generated by a linear sequence the extended discrete Fourier transform and periodogram have the same asymptotic behavior for $d \in (-3/2, \infty)$.

1 Shimotsu (2006) extends this to a feasible exact local Whittle estimator when introducing an unknown mean and trend.
2 The assumption concerning the width of the admissible parameter space is needed to ensure that the difference in the criterion function is uniformly bounded away from zero, see Shimotsu & Phillips (2005).

Our main interest in this paper is to analyze a general purpose estimator whose limiting distribution holds in the non-stationary case and when there is short-run contamination. To achieve bias reduction when there is contamination by short-run dynamics, we follow Andrews & Sun (2004) and model the spectral density of the short-memory component $\varphi(\lambda)$ as a finite and even polynomial instead of a constant near frequency zero. In extending the local polynomial Whittle (LPW) estimator of Andrews & Sun (2004) to the non-stationary region, we use the notion of the extended discrete Fourier transform and periodogram as in Abadir et al. (2007). We call the new estimator the extended local polynomial Whittle (ExtLPW) estimator. In establishing consistency and asymptotic normality for the estimator $\hat{d}$ we follow the method set out by Andrews & Sun (2004). Given that the generating process is linear, the same central limit theorem argument as in the stationary case $|d| < 1/2$ derived by Robinson (1995a) holds, although not for $d_0 = 1/2, 3/2, \ldots$. We establish consistency and asymptotic normality for $d_0 \in (-1/2, \infty)$. Furthermore, if $\varphi(\lambda)$ is infinitely smooth near frequency zero, the rate of convergence can become arbitrarily close to the parametric rate. The simulations reveal that our proposed estimator is superior when considering possible short-run contamination and non-stationary values of d. We also include an analysis of credit spreads that demonstrates the usefulness of the estimator.

The remainder of the paper is structured as follows: Section 2 gives a short introduction to the LPW estimator of Andrews & Sun (2004). Section 3 expands the usual stationary framework to the non-stationary framework, thereby defining the ExtLPW estimator. Section 4 states the assumptions needed for showing consistency and asymptotic normality. Section 5 introduces the theorem for consistency and asymptotic normality. Section 6 presents the results from a small simulation study. Section 7 provides an empirical investigation of potential long memory properties of treasury yields and yields on corporate bonds, spreads over treasury and spreads between corporate yields. Section 8 concludes. Lemmas and the proofs of Theorem 1 and Lemmas 1-2 are placed in the Appendix.

2 The local polynomial Whittle approach

Define at the jth frequency $\lambda_j = 2\pi j/n$, for $1 \le j \le m$, the discrete Fourier transform (DFT) and periodogram of $X_t$ as

$$w(\lambda_j) = \frac{1}{\sqrt{2\pi n}} \sum_{t=1}^{n} X_t \exp(it\lambda_j), \tag{2}$$

$$I(\lambda_j) = |w(\lambda_j)|^2. \tag{3}$$
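The definitions in (2) and (3) can be sketched numerically as follows. This is an illustrative sketch only: the function name `dft_and_periodogram` and the direct matrix evaluation are choices made here, not part of the source, and the direct evaluation favours transparency over speed.

```python
import numpy as np


def dft_and_periodogram(x, m):
    """DFT w(lambda_j) and periodogram I(lambda_j) at lambda_j = 2*pi*j/n,
    j = 1, ..., m, following (2)-(3) with t running from 1 to n."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= n")
    t = np.arange(1, n + 1)
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n
    # (m x n) matrix of exp(i * t * lambda_j) applied to the data vector
    w = np.exp(1j * np.outer(lam, t)) @ x / np.sqrt(2.0 * np.pi * n)
    return lam, w, np.abs(w) ** 2
```

At these frequencies the periodogram coincides with the squared modulus of a standard FFT divided by $2\pi n$, because the sign of the exponent and the shift in the time index change only the phase of $w(\lambda_j)$.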

Following Andrews & Sun (2004) the (negative) local polynomial Whittle log-likelihood is

$$U_n(d, G, \theta) = \frac{1}{m} \sum_{j=1}^{m} \left[ \log\left(G \lambda_j^{-2d} \exp(-P_r(\lambda_j, \theta))\right) + \frac{I(\lambda_j)}{G \lambda_j^{-2d} \exp(-P_r(\lambda_j, \theta))} \right], \tag{4}$$

where $P_r(\lambda_j, \theta) = \sum_{k=1}^{r} \theta_k \lambda_j^{2k}$, the closed interval of admissible estimates is $D = [\Delta_1, \Delta_2] \subset [-1/2, 1/2]$, and $m = o(n)$ is the bandwidth choice, i.e. the number of periodogram ordinates to be used in the estimation. Concentrating $U_n(d, G, \theta)$ with respect to G, we can write the likelihood function as

$$L_n(d, \theta) = \log \hat{G}(d, \theta) - \frac{1}{m} \sum_{j=1}^{m} P_r(\lambda_j, \theta) - \frac{2d}{m} \sum_{j=1}^{m} \log \lambda_j + 1, \tag{5}$$

$$\hat{G}(d, \theta) = \frac{1}{m} \sum_{j=1}^{m} I(\lambda_j) \exp(P_r(\lambda_j, \theta)) \lambda_j^{2d}. \tag{6}$$

Thus, Andrews & Sun (2004) propose to minimize (5) over the admissible set $(d, \theta) \in D \times \Theta$,

$$(\hat{d}_{AS}, \hat{\theta}_{AS}) = \arg\min_{(d, \theta) \in D \times \Theta} L_n(d, \theta), \tag{7}$$

where $\Theta$ is a compact and convex set in $\mathbb{R}^r$. As shown by Andrews & Sun (2004) the asymptotic variance of $\hat{d}_{AS}$ is inflated by a multiplicative constant.
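A minimal numerical sketch of the concentrated objective may help fix ideas. It assumes the polynomial $P_r(\lambda_j, \theta) = \sum_{k=1}^{r} \theta_k \lambda_j^{2k}$ and the concentrated form obtained by minimizing the local polynomial Whittle log-likelihood over G. The function name `lpw_objective`, the white-noise example and the grid search for the case r = 0 are illustrative assumptions; in practice a numerical optimizer over (d, θ) would replace the grid.

```python
import numpy as np


def lpw_objective(d, theta, lam, I):
    """Concentrated local polynomial Whittle objective L_n(d, theta).

    lam, I : Fourier frequencies lambda_j and periodogram I(lambda_j), j = 1..m
    theta  : polynomial coefficients (theta_1, ..., theta_r); empty for r = 0
    """
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    k = np.arange(1, theta.size + 1)
    # P_r(lambda_j, theta) = sum_k theta_k * lambda_j^(2k); zero when r = 0
    P = (lam[:, None] ** (2 * k)) @ theta if theta.size else np.zeros_like(lam)
    G_hat = np.mean(I * np.exp(P) * lam ** (2.0 * d))
    return np.log(G_hat) - P.mean() - 2.0 * d * np.log(lam).mean() + 1.0


# Illustration on simulated white noise (true d = 0), with r = 0
rng = np.random.default_rng(0)
n, m = 2048, 128
x = rng.standard_normal(n)
lam = 2.0 * np.pi * np.arange(1, m + 1) / n
# |FFT|^2 / (2*pi*n) equals the periodogram at the Fourier frequencies
I = np.abs(np.fft.fft(x)[1:m + 1]) ** 2 / (2.0 * np.pi * n)
grid = np.linspace(-0.49, 0.49, 197)
d_hat = grid[np.argmin([lpw_objective(d, (), lam, I) for d in grid])]
```

With r = 0 the polynomial term drops out and the objective reduces to the local Whittle criterion, so `d_hat` here is a grid-based local Whittle estimate.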

It should be noted that it is not necessary to correct for an unknown mean of $\{X_t\}$, as we only compute the DFT at the frequencies $\lambda_j = 2\pi j/n$ for $j = 1, \ldots, m$, where $m = o(n)$, rendering the log-likelihood local to frequency zero. This general result only holds for stationary values of d. Assuming an unknown mean of the generating process when we are in the non-stationary region is the same as saying that the data generating process is free of linear trends, in the usual setup of e.g. Robinson (1995a) and Andrews & Sun (2004).

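The difference between the objective functions defined in Robinson (1995a) and Andrews & Sun (2004) lies in how $\log \varphi(\lambda)$ is approximated as $\lambda \to 0$: Robinson (1995a) uses the constant $\log G$, whereas Andrews & Sun (2004) use $\log G - P_r(\lambda, \theta)$, where the polynomial term $P_r(\lambda, \theta)$ vanishes for $\lambda = 0$.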

Given Assumptions 1-4 in Andrews & Sun (2004) and utilizing their Lemma 1 and Lemma 2, the estimates $(\hat{d}_{AS}, \hat{\theta}_{AS})$ are equal to the solution to the first-order conditions with probability that goes to one as $n \to \infty$. This solution is consistent and asymptotically normal, Andrews & Sun (2004, p. 572). The asymptotic bias is of order $O(\lambda^{\min\{s, 2+2r\}})$ for the LPW estimator, where s is a measure of the smoothness of the spectral density near frequency zero, and $O(\lambda^2)$ for the LW estimator. That the asymptotic bias of the LPW estimator is of order $O(\lambda^{\min\{s, 2+2r\}})$ follows from Assumption 4, and it is clearly seen that if $r = 0$, and by Assumption 2 that $s > 2r$, the asymptotic bias reduces to that of the classical LW estimator, i.e. $O(\lambda^2)$. In this paper, we only consider long memory processes with potential short-run contamination, but if $\{X_t\}$ is a perturbed fractional process, the orders will be smaller and dependent on d. Nonetheless, the LPW estimator will still be consistent at the expense of a lower convergence rate and higher asymptotic bias, see Arteche (2004) for the LW case. In the perturbed case, the asymptotic bias from not modeling the spectral density appropriately will be of order $O(\lambda^{2d})$, Hurvich & Ray (2003), Arteche (2004) and Hurvich, Moulines & Soulier (2005).

3 The extended local polynomial Whittle estimator

We define a fractionally integrated process as one that is stationary or exhibits some weak dependence after the application of the fractional filter, $(1-L)^d$. We often distinguish between two ways of expressing a fractional process as a function of weakly dependent innovations, i.e. Type I and Type II processes, see Marinucci & Robinson (1999). As we want to stay in the framework of Abadir et al. (2007) and Andrews & Sun (2004), we work in the setting of defining the fractional process as a Type I process. Because we are not only interested in the stationary region, it is not enough just to expand the filter $(1-L)^d$ and express it as an infinite order moving average of the innovations, which results in a stationary process when $d < 1/2$. When we move into the non-stationary region, i.e. $d \ge 1/2$, this procedure breaks down because the infinite order moving average of the innovations does not converge. This is circumvented by modeling the process as the partial sum of the component $I(d-p)$ process for some $p \in \mathbb{Z}$ and expanding $(1-L)^{p-d}$ in terms of the innovations. This results in a stationary integer differenced series. The disadvantage is that it introduces discontinuities at $d = 1/2, 3/2, \ldots, p - 1/2$, where $p \in \mathbb{Z}$. The Type II process of fractional integration is designed to cover a wider range of d and thereby circumvent some of the problems concerning the Type I process, see Robinson (1994), Phillips (1999), Tanaka (1999), Marinucci & Robinson (1999), and Robinson (2005). For the derivation of the ExtLPW estimator, we define the fractional process as a Type I process. More specifically, we define the I(d) process as in Abadir & Taylor (1999).

Definition 1 For $d = p + d_u$, where $p \in \mathbb{Z}$ and $d_u \in (-1/2, 1/2)$, we say that $\{X_t\}$ is an $I(d)$ process, i.e. $X_t \sim I(d)$, if

$$(1-L)^{p} X_t = u_t, \quad t = 1-p, 2-p, \ldots, \tag{8}$$

where $\{u_t\}$ is a second order stationary sequence with spectral density

$$f_u(\lambda) = G_0 |\lambda|^{-2d_u} + o\left(|\lambda|^{-2d_u}\right), \tag{9}$$

as $\lambda \to 0$, where $G_0 \in (0, \infty)$.

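Define the extended DFT and the extended periodogram of a time series $\{X_t\}$ evaluated at the Fourier frequencies $\lambda_j = 2\pi j / n$, where $j = 1, \ldots, n$, by

$$w(\lambda_j; d) = w_x(\lambda_j) + c(\lambda_j; d), \tag{10}$$

$$I(\lambda_j; d) = |w(\lambda_j; d)|^{2}, \tag{11}$$

where $w_x(\lambda_j)$ is the usual DFT defined as

$$w_x(\lambda_j) = \frac{1}{\sqrt{2\pi n}} \sum_{t=1}^{n} X_t \exp(i t \lambda_j), \tag{12}$$

and the correction term $c(\lambda_j; d)$ takes on constant values on the intervals $d \in D_p := [p - 1/2, p + 1/2)$, $p \in \mathbb{N}$, and is defined by

$$c(\lambda_j; d) = \begin{cases} 0 & \text{if } d \in D_0 = [-1/2, 1/2), \\ \exp(i\lambda_j) \sum_{\ell=1}^{p} (1 - \exp(i\lambda_j))^{-\ell} Z_\ell & \text{if } d \in D_p \text{ for } p = 1, 2, \ldots, \end{cases} \tag{13}$$

where

$$Z_0 = w_x(0) = \frac{1}{\sqrt{2\pi n}} \sum_{t=1}^{n} X_t, \tag{14}$$

$$Z_\ell = \frac{1}{\sqrt{2\pi n}} \left\{ (1-L)^{\ell-1} X_n - (1-L)^{\ell-1} X_0 \right\}, \quad \ell = 1, 2, \ldots, p. \tag{15}$$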

In the computation of the step function $c(\lambda_j; d)$, we have to enumerate the data depending on which subspace of $D = [d_1, d_2]$ we are interested in. This is apparent from (15), for example when $p = 2$, where presample observations enter through $(1-L)^{\ell-1} X_0$. That is, we use $X_{-i+1}, X_{-i+2}, \ldots, X_n$, where $i = (0 \vee \lfloor d_2 - 1/2 \rfloor)$. The usual DFT (12) is always computed using the enumeration $\{X_t\}_{t=1}^{n}$.
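The definitions (10)-(15) can be sketched in code. The following is a minimal illustration, not the implementation used in the paper: the function names `dft` and `extended_dft`, the argument layout (an array holding $X_{1-p}, \ldots, X_0, X_1, \ldots, X_n$), and the restriction to the first $m < n$ Fourier frequencies (to avoid $\lambda_n = 2\pi$, where $1 - \exp(i\lambda)$ vanishes) are assumptions made for the example.

```python
import numpy as np

def dft(x, lam):
    """Usual DFT (12): (2*pi*n)^(-1/2) * sum_{t=1}^n x_t * exp(i*t*lam)."""
    n = len(x)
    t = np.arange(1, n + 1)
    return np.exp(1j * np.outer(lam, t)) @ x / np.sqrt(2 * np.pi * n)

def extended_dft(x_ext, n, p, m):
    """Extended DFT (10) at lambda_j, j = 1..m, for d in D_p.

    x_ext is assumed to hold X_{1-p}, ..., X_0, X_1, ..., X_n (length n + p).
    """
    x_ext = np.asarray(x_ext, dtype=float)
    lam = 2 * np.pi * np.arange(1, m + 1) / n
    w = dft(x_ext[p:], lam)          # the usual DFT uses X_1, ..., X_n only
    e = np.exp(1j * lam)
    diff = x_ext.copy()              # holds (1-L)^(l-1) X_t; last entry is t = n
    for l in range(1, p + 1):
        # Z_l in (15): difference between the values at t = n and t = 0
        z = (diff[-1] - diff[-1 - n]) / np.sqrt(2 * np.pi * n)
        w = w + e * (1 - e) ** (-l) * z   # one term of the correction (13)
        diff = np.diff(diff)
    return lam, w

# Usage: a twice-integrated noise series (p = 2) with two presample values.
rng = np.random.default_rng(1)
n, p, m = 64, 2, 10
x_ext = np.cumsum(np.cumsum(rng.standard_normal(n + p)))
lam, w = extended_dft(x_ext, n, p, m)
```

With this layout the extended DFT coincides with $(1 - \exp(i\lambda_j))^{-p}$ times the DFT of the $p$-times differenced series, which gives a direct numerical check of the construction.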

This notion of the extended DFT allows us to compute the usual LPW estimator in the context of non-stationary values of $d$ by minimizing the criterion function defined in (5) over the admissible parameter space. The extension of the DFT to the non-stationary case is based on the work of Phillips (1999), Lahiri (2003), Dalla, Giraitis & Hidalgo (2006) and Abadir et al. (2007). Define the pseudo spectral density of the sequence $\{X_t\} \sim I(d_0)$, where $d_0 = p_0 + d_u$ and $d_u \in (-1/2, 1/2)$, as

$$f(\lambda) = |1 - \exp(i\lambda)|^{-2p_0} f_u(\lambda), \quad |\lambda| \le \pi. \tag{16}$$

From this definition it is clear that

$$f(\lambda) \sim G_0 |\lambda|^{-2d_0} \quad \text{as } \lambda \to 0^{+}. \tag{17}$$

Then following Abadir et al. (2007, Lemma 4.4), Definition 1, and (10), the extended DFT has the property that

$$w(\lambda_j; d_0) = (1 - \exp(i\lambda_j))^{-p_0} w_u(\lambda_j), \quad j = 1, \ldots, n, \tag{18}$$

where $w_u(\lambda_j)$ is the DFT of the stationary sequence $\{u_t\}$. From Abadir et al. (2007, Lemma 4.4(i)), it follows that

$$w_x(\lambda_j) = (1 - \exp(i\lambda_j))^{-p_0} w_{\Delta^{p_0} x}(\lambda_j) - \exp(i\lambda_j) \sum_{r=1}^{p_0} (1 - \exp(i\lambda_j))^{-r} w_{\Delta^{r} x}(0) \tag{19}$$

$$= (1 - \exp(i\lambda_j))^{-p_0} w_u(\lambda_j) - \exp(i\lambda_j) \sum_{r=1}^{p_0} (1 - \exp(i\lambda_j))^{-r} w_{\Delta^{r} x}(0), \tag{20}$$

where the second equality follows from Definition 1. Then the definition in (10) follows trivially.

Denote the rescaled extended DFT by

$$v_j = v(\lambda_j; d_0) = \frac{w(\lambda_j; d_0)}{\varphi(\lambda_j)^{1/2} \lambda_j^{-d_0}}, \quad 1 \le j \le m. \tag{21}$$

Given that the generating process is linear, equation (18) and Lemma 2 show that the asymptotic behavior of the rescaled extended DFT and periodogram is the same for all $d_0 \in (-1/2, \infty)$. Furthermore, given consistency, $\hat{d} \overset{p}{\to} d_0$, and the definition of the extended DFT, we get

$$w(\lambda_j; \hat{d}) \overset{p}{\to} w(\lambda_j; d_0). \tag{22}$$

This follows because $c(\lambda_j; d)$ is a step function and therefore constant on the intervals $d \in (p - 1/2, p + 1/2)$ for $p \in \mathbb{N}$. This considerably eases the estimation as we are left with the same estimation procedure as in the stationary case.

If the process is stationary, the ExtLPW estimator is identical to the LPW estimator of Andrews & Sun (2004). Like the estimators in Robinson (1995a), Andrews & Sun (2004), and Abadir et al. (2007), this estimator is based on the whitening principle of the periodogram. That is, as in the stationary case, the ExtLPW estimator is based on the behavior of the random variables

$$\eta_j = \eta(\lambda_j) = \frac{I_u(\lambda_j)}{f_u(\lambda_j)}, \quad 1 \le j \le m. \tag{23}$$

Then, given the spectral density (9) of the second order stationary sequence $\{u_t\}$, the first moment is given by

$$E[\eta_j] = 1 + o(1) + O(j^{-1} \log j) \quad \forall\, 1 \le j \le m \text{ as } n \to \infty. \tag{24}$$

Additionally, under regularity assumptions, see Lahiri (2003) and Abadir et al. (2007), the random variables also satisfy

$$\operatorname{var}(\eta_j) \le C \quad \forall\, 1 \le j \le m, \tag{25}$$

where $C$ is a positive finite constant, and

$$\operatorname{cov}(\eta_j, \eta_s) \to 0 \quad \text{for } j, s \to \infty \text{ and } j \ne s. \tag{26}$$

These results are established in the proof of Lemma 4.6 in Abadir et al. (2007).

Then, given equations (24), (25) and (26), the random variables $\eta_j$ satisfy a weak law of large numbers (WLLN), i.e.

$$\frac{1}{m} \sum_{j=1}^{m} \eta_j \overset{p}{\to} 1, \quad \text{as } n \to \infty. \tag{27}$$

Given additional assumptions, this result is sufficient to ensure consistency of the estimator $\hat{d}$; this is discussed further below. The WLLN for the random variables $\eta_j$ is equivalent to a WLLN for the random variables $|v_j|^{2}$, i.e.

$$\frac{1}{m} \sum_{j=1}^{m} |v_j|^{2} \overset{p}{\to} 1, \quad \text{as } n \to \infty. \tag{28}$$

Then, given the nature of the spectral density (9) and (18),

$$|v_j|^{2} = \eta_j (1 + o(1)) \quad \forall\, 1 \le j \le m \text{ as } n \to \infty. \tag{29}$$

Furthermore, given equation (24),

$$E\left[|v_j|^{2}\right] \le C \quad \forall\, 1 \le j \le m. \tag{30}$$

For a more thorough walkthrough of the extended DFT, see Phillips (1999), Lahiri (2003), Dalla et al. (2006), and Abadir et al. (2007). We further note that the variables $v_j$ and $\eta_j$ are invariant with respect to the mean of $\{u_t\}$.
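The whitening idea behind the normalized periodogram and its WLLN can be illustrated numerically. The following is a minimal sketch under the simplifying assumption that $\{u_t\}$ is Gaussian white noise with unit variance, so that its spectral density is the constant $1/(2\pi)$; the function name `normalized_periodogram`, the sample size and the bandwidth are illustrative choices only.

```python
import numpy as np

def normalized_periodogram(u, m, f_u):
    """Periodogram of u at lambda_j, j = 1..m, divided by the spectral density f_u."""
    n = len(u)
    lam = 2 * np.pi * np.arange(1, m + 1) / n
    # |DFT|^2 / (2*pi*n); the modulus does not depend on the sign convention of the DFT
    periodogram = np.abs(np.fft.fft(u)[1:m + 1]) ** 2 / (2 * np.pi * n)
    return periodogram / f_u(lam)

def white_noise_density(lam):
    """Spectral density of unit-variance white noise (assumed DGP for this sketch)."""
    return np.full_like(lam, 1 / (2 * np.pi))

rng = np.random.default_rng(0)
u = rng.standard_normal(4096)
eta = normalized_periodogram(u, m=256, f_u=white_noise_density)
print(eta.mean())  # sample average of the normalized periodogram ordinates

# Adding a constant to u leaves the ordinates at nonzero Fourier frequencies unchanged.
eta_shifted = normalized_periodogram(u + 5.0, m=256, f_u=white_noise_density)
```

The sample average should be close to one, and the shifted series should give the same ordinates up to rounding error, in line with the mean invariance noted above.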

4 Assumptions

In this section, we introduce the assumptions needed to establish consistency and asymptotic normality

of the proposed estimator.

Assumption 1 $D \times \Theta$ is a compact and convex subset of $\mathbb{R}^{r+1}$, and $d_0$ and $\theta_0$ lie in the interior of $D = [d_1, d_2] \subset [-1/2, \infty]$, where $d_0 \ne p_0 - 1/2$, $p_0 \in \mathbb{N}$, and $\Theta$, respectively.

Assumption 1 is a combination of similar assumptions given in Andrews & Sun (2004) and Abadir et al. (2007). Yet, our lower bound is more restrictive than in Abadir et al. (2007), because we would need to impose $E[X_t] = 0$ to allow $d_1 = -3/2$. Therefore, we only consider invertible processes, i.e. $d > -1/2$. Furthermore, the assumption restricts the parameters of interest to be in the interior of a compact and convex set. If $d$ lies on the boundary of the parameter space, we conjecture that the estimator will be consistent (see footnote 3), but it may not be asymptotically normal. As noted by Newey & McFadden (1986, p. 2144), it is sufficient that the estimator is in the relative interior of the parameter space, allowing for equality restrictions to be imposed on the parameters of interest.

Assumption 2 The spectral density of the stationary sequence $\{u_t\}$ is

$$f_u(\lambda) = \varphi(\lambda) |\lambda|^{-2d_u} + o\left(|\lambda|^{-2d_u}\right) \quad \text{as } \lambda \to 0^{+}, \tag{31}$$

where $\varphi(\lambda)$ is continuous at $\lambda = 0$, $\varphi(0) \in (0, \infty)$, and $d_u \in (-1/2, 1/2)$.

Assumption 2 is a result of using the basic semiparametric setup from Definition 1.

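Assumption 3 Let $\varphi(\lambda)$ be smooth of order $s$ at $\lambda = 0$, where $s > 2r$, $r \in \mathbb{N} \setminus \{0\}$ and $s \ge 1$. That is, in a neighborhood of $\lambda = 0$, $\varphi(\lambda)$ is $\lfloor s \rfloor$ times continuously differentiable with $\lfloor s \rfloor$-th derivative, $\varphi^{(\lfloor s \rfloor)}$, satisfying a Hölder condition of order $s - \lfloor s \rfloor$ at zero, i.e. $\left| \varphi^{(\lfloor s \rfloor)}(\lambda) - \varphi^{(\lfloor s \rfloor)}(0) \right| \le C |\lambda|^{s - \lfloor s \rfloor}$ for a constant $C < \infty$.

The assumption imposes a regularity condition on the function $\varphi(\lambda)$ that characterizes the semiparametric setup, Andrews & Sun (2004), i.e. $\varphi(\lambda)$ has a Taylor expansion around $\lambda = 0$

$$\varphi(\lambda) = \sum_{k=0}^{\lfloor s/2 \rfloor} \theta_k \lambda^{2k} + O(\lambda^{s}) \tag{32}$$

$$= P_r(\lambda, \theta) + O(\lambda^{s}), \quad \text{as } \lambda \to 0^{+}, \tag{33}$$

where $\theta_0 = \varphi(0)$ and

$$\theta_k = -\frac{1}{(2k)!} \left. \frac{d^{k}}{d\lambda^{k}} \varphi(\lambda) \right|_{\lambda = 0}. \tag{34}$$

Assumption 3 holds for general ARFIMA processes for all finite $s$.

As noted by Andrews & Sun (2004), if $r = 0$ and Assumption 3 holds with $s = 2$, then Assumption A1′ of Robinson (1995a) holds with $\beta = 2$.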

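Assumption 4 (a) $\{X_t\}$ is generated by the linear sequence $\{u_t\}$

$$u_t = A(L)\varepsilon_t = \sum_{j=0}^{\infty} a_j \varepsilon_{t-j}, \quad \sum_{j=0}^{\infty} a_j^{2} < \infty, \tag{35}$$

where

$$E[\varepsilon_t \mid \mathcal{F}_{t-1}] = 0, \quad E[\varepsilon_t^{2} \mid \mathcal{F}_{t-1}] = 1 \text{ a.s.}, \tag{36}$$

$$E[\varepsilon_t^{3} \mid \mathcal{F}_{t-1}] = \mu_3 \text{ a.s.}, \tag{37}$$

$$E[\varepsilon_t^{4} \mid \mathcal{F}_{t-1}] = \mu_4 \text{ a.s.} \quad \forall\, t = 0, \pm 1, \pm 2, \ldots, \tag{38}$$

and $\mathcal{F}_{t-1}$ is the $\sigma$-field generated by $\{\varepsilon_s : s < t\}$. (b) There exists a random variable $\varepsilon$ with $E\varepsilon^{2} < \infty$ such that for all $\eta > 0$ and some generic constant $K > 0$, $\Pr(|\varepsilon_t| > \eta) < K \Pr(|\varepsilon| > \eta)$. (c) In some neighborhood $(0, \delta)$ of the origin, $\alpha(\lambda)$ is differentiable and

$$\frac{d}{d\lambda} \alpha(\lambda) = O\left(|\alpha(\lambda)| / \lambda\right) \quad \text{as } \lambda \to 0^{+}, \tag{39}$$

where $\alpha(\lambda) = \sum_{j=0}^{\infty} a_j \exp(ij\lambda)$.

Assumption 4 says that $\{u_t\}$ is a linear sequence with martingale difference innovations. That is, $\{\varepsilon_t\}$ is adapted to the filtration $\{\mathcal{F}_t\}$. Furthermore, Assumption 4 does not rule out non-Gaussian processes. It should be possible to relax the linearity assumption, see the consideration regarding non-linearity of $\{u_t\}$ in Abadir et al. (2007).

3 See e.g. the proof of Hurvich et al. (2005) Theorem 3.1, and their discussion of bounding $d$ away from zero. It is not a trivial question, as we in some sense need $d$ to be bounded away from the boundary because the convergence of the log-likelihood is not uniform on $D \times \mathbb{R}^{r}$.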

Assumption 5 $\dfrac{m^{2r+1/2}}{n^{2r}} \to \infty$ and $\dfrac{m^{\phi+1/2}}{n^{\phi}} \to 0$ as $n \to \infty$, where $\phi = \min\{s, 2 + 2r\}$.

Assumption 5 corresponds to Assumption 4 in Andrews & Sun (2004). The two conditions are needed to show simultaneous consistency of $(\hat{d}, \hat{\theta})$ and asymptotic normality. The first condition imposes a lower bound on the growth of $m$. It ensures simultaneous consistency of $\hat{d}$ and $\hat{\theta}$ by guaranteeing that the scaling matrix used to normalize the score and Hessian satisfies a regularity condition that is necessary for consistency of $(\hat{d}, \hat{\theta})$, which will be clarified later on. The second condition ensures that the normalized score converges in distribution to a zero mean Gaussian limit, which is required to show asymptotic normality of the estimators $(\hat{d}, \hat{\theta})$. Andrews & Sun (2004) instead work with $\lim_{n \to \infty} m^{\phi+1/2} / n^{\phi} = A \in (0, \infty)$, where $\phi = \min\{s, 2 + 2r\}$. They set the divergence rate of $m$ such that they can derive the asymptotic bias and asymptotic mean squared error of $\hat{d}$. We choose a bandwidth $m$ that diverges at a slower rate. Note that the two conditions never exclude each other, as $s > 2r$, which follows from Assumption 3.
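For a bandwidth of the form $m = n^{a}$, the two rate conditions in Assumption 5 translate into an interval for the exponent $a$: the first requires $a(2r + 1/2) > 2r$ and the second requires $a(\phi + 1/2) < \phi$. The sketch below computes that interval; the function name `bandwidth_exponent_range` and the power-law form of the bandwidth are assumptions made for illustration.

```python
def bandwidth_exponent_range(r, s):
    """Open interval (lo, hi) of exponents a for which m = n**a satisfies both rate conditions."""
    phi = min(s, 2 + 2 * r)
    lo = 4 * r / (4 * r + 1)        # m^(2r+1/2) / n^(2r) -> infinity  iff  a > 4r / (4r + 1)
    hi = 2 * phi / (2 * phi + 1)    # m^(phi+1/2) / n^phi -> 0         iff  a < 2*phi / (2*phi + 1)
    return lo, hi

# Usage: classical LW (r = 0, s = 2) and a first-order polynomial (r = 1) with s = 4.
print(bandwidth_exponent_range(0, 2))
print(bandwidth_exponent_range(1, 4))
```

Because $x \mapsto x/(x+1)$ is increasing and $\phi > 2r$ whenever $s > 2r$, the lower end always lies below the upper end, which is the sense in which the two conditions never exclude each other.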

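Assumption 6 For $m = o(n)$ the renormalized periodogram, $\eta'_j$ $\forall\, 1 \le j \le m$, satisfies a WLLN

$$\frac{1}{m} \sum_{j=1}^{m} \eta'_j \overset{p}{\to} 1, \quad \text{as } m, n \to \infty, \tag{40}$$

where $\eta'_j = \dfrac{I_u(\lambda_j)}{\varphi(\lambda_j) \lambda_j^{-2d_u}}$ $\forall\, 1 \le j \le m$.

Assumption 6 is equivalent to Assumption B in Abadir et al. (2007) and states that if Assumptions 2 and 3 and equation (18) hold, then

$$\eta'_j = \eta_j (1 + o(1)) \quad \forall\, 1 \le j \le m \text{ as } n \to \infty. \tag{41}$$

Furthermore, (24) implies that

$$E[\eta'_j] \le C \quad \forall\, 1 \le j \le m \text{ as } n \to \infty. \tag{42}$$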

5 Consistency and asymptotic normality

Theorem 1 states consistency and asymptotic normality of the proposed ExtLPW estimator.

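Theorem 1 Let $\{X_t\}$ be generated by (8) and assume that Assumptions 1 through 6 hold. Then, as $n \to \infty$, $\hat{d}$ and $\hat{\theta}$ are both consistent and

$$B_n \begin{pmatrix} \hat{d} - d_0 \\ \hat{\theta} - \theta_0 \end{pmatrix} \overset{d}{\to} N\left(0, \Omega_r^{-1}\right), \tag{43}$$

where $B_n$ is the $(r+1) \times (r+1)$ diagonal matrix with $j$th diagonal element defined as

$$[B_n]_{11} = m^{1/2}, \tag{44}$$

$$[B_n]_{jj} = \left(\frac{2\pi m}{n}\right)^{2j-2} m^{1/2} \quad \text{for } j = 2, 3, \ldots, r+1, \tag{45}$$

and $\Omega_r$ is the $(r+1) \times (r+1)$ covariance matrix defined as

$$\Omega_r = \begin{pmatrix} 4 & 2\mu_r' \\ 2\mu_r & \Gamma_r \end{pmatrix}. \tag{46}$$

Here $\mu_r$ is an $r \times 1$ vector with $k$th element $\mu_{r,k}$ and $\Gamma_r$ is an $r \times r$ matrix with $(i,k)$th element $[\Gamma_r]_{i,k}$,

$$\mu_{r,k} = \frac{2k}{(2k+1)^{2}} \quad \text{for } k = 1, \ldots, r, \tag{47}$$

$$[\Gamma_r]_{i,k} = \frac{4ik}{(2i + 2k + 1)(2i + 1)(2k + 1)} \quad \text{for } i, k = 1, \ldots, r. \tag{48}$$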

We could have shown $\log^{5} m$-consistency (irrespective of $\theta \in \Theta$) using the proof of the $\log^{3} m$-consistency of $\hat{d}$ ($r = 0$) in Robinson (1995a, pp. 1642-1643) and adjusting it to account for our weaker Assumption 5 compared to his Assumption A4′. But as this would mainly be a theoretical addition, it is left out. Theorem 1 utilizes the first order condition (F.O.C.) approach (because of the multidimensionality of the parameter space) as in Andrews & Sun (2004). Therefore, Theorem 1 jointly delivers consistency and asymptotic normality of $(\hat{d}, \hat{\theta})$. More specifically, since the ExtLPW likelihood ((5), where the periodogram is defined by its extended DFT, i.e. (18)) is a continuous function on a compact set, the ExtLPW estimator exists. From Lemma 1 (in the Appendix) and Lemma 1 of Andrews & Sun (2004), we know that there exists a solution to the first order conditions with probability tending to one, and that the solution satisfies the convergence result in Theorem 1, see also Lemmas 1 and 2 of Andrews & Sun (2004). If the (negative) likelihood function is strictly convex and twice differentiable, then the solution to the first order conditions is unique and minimizes the (negative) log-likelihood, and hence equals the ExtLPW estimator.

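By the formula for a partitioned inverse, Theorem 1 implies that

$$\Omega_r^{-1} = \begin{pmatrix} \left(4 - 2\mu_r' \Gamma_r^{-1} 2\mu_r\right)^{-1} & -4^{-1}\, 2\mu_r' \left(\Gamma_r - 2\mu_r 4^{-1} 2\mu_r'\right)^{-1} \\ -\Gamma_r^{-1} 2\mu_r \left(4 - 2\mu_r' \Gamma_r^{-1} 2\mu_r\right)^{-1} & \left(\Gamma_r - 2\mu_r 4^{-1} 2\mu_r'\right)^{-1} \end{pmatrix} \tag{49}$$

$$= \begin{pmatrix} c_r / 4 & -\frac{c_r}{2} \mu_r' \Gamma_r^{-1} \\ -\frac{c_r}{2} \Gamma_r^{-1} \mu_r & \Gamma_r^{-1} + c_r \Gamma_r^{-1} \mu_r \mu_r' \Gamma_r^{-1} \end{pmatrix}, \tag{50}$$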

where $c_r = \left(1 - \mu_r' \Gamma_r^{-1} \mu_r\right)^{-1}$ for $r > 0$ and $c_0 = 1$.

A few remarks are in order. First of all, the asymptotic variance of $\sqrt{m}(\hat{d} - d_0)$ is free of nuisance parameters and equal to $c_r / 4$. Secondly, in light of Assumption 5, the ExtLPW estimator for $r > 0$ allows one to choose a bandwidth $m$ much larger than in the classical LW approach, resulting in an estimator that is asymptotically normal with a faster rate of convergence as a function of the sample size $n$. The cost of introducing a polynomial is inflation of the asymptotic variance by a multiplicative constant, i.e. $c_0 = 1$, $c_1 = 9/4$, $c_2 = 3.52, \ldots$, see Andrews & Sun (2004).
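The variance inflation constants can be reproduced directly from the definitions of $\mu_r$ and $\Gamma_r$ in (47) and (48). The following is a small numerical sketch; the function names `omega` and `c_r` are chosen for this example and are not taken from the paper.

```python
import numpy as np

def omega(r):
    """Return (Omega_r, mu_r, Gamma_r) for r >= 1, following (46)-(48)."""
    k = np.arange(1, r + 1)
    mu = 2 * k / (2 * k + 1) ** 2
    i, j = np.meshgrid(k, k, indexing="ij")
    gamma = 4 * i * j / ((2 * i + 2 * j + 1) * (2 * i + 1) * (2 * j + 1))
    top = np.concatenate(([4.0], 2 * mu))          # first row: (4, 2*mu')
    bottom = np.column_stack((2 * mu, gamma))      # remaining rows: (2*mu, Gamma)
    return np.vstack((top, bottom)), mu, gamma

def c_r(r):
    """Variance inflation constant c_r = (1 - mu' Gamma^{-1} mu)^{-1}, with c_0 = 1."""
    if r == 0:
        return 1.0
    _, mu, gamma = omega(r)
    return 1.0 / (1.0 - mu @ np.linalg.solve(gamma, mu))

for r in (0, 1, 2):
    print(r, c_r(r))
```

The upper-left element of the numerically inverted $\Omega_r$ can be compared with $c_r / 4$ as a check on the partitioned-inverse formula.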

Consistency of $\hat{d}$ provides no information about $\theta_0$, as the concentrated log-likelihood becomes flat as a function of $\theta$ as $n \to \infty$. The rate at which it becomes flat furthermore differs for each element of $\theta$.

As discussed earlier, our model setup does not consider volatility processes, e.g. in the sense of long memory signal-plus-noise models as in Hurvich et al. (2005) and Frederiksen, Nielsen & Nielsen (2008), among others. Introducing a perturbation in our framework would indeed bias our estimator in the same manner as the classical fractional integration estimators (LW and log-periodogram), i.e. the leading bias term is of order $O(\lambda^{2d})$, implying a slower convergence rate compared to the leading bias term of the pure long memory process of $O(\lambda^{2})$, Hurvich & Ray (2003), Arteche (2004), Hurvich et al. (2005), and Frederiksen et al. (2008).

6 Simulation study

6.1 Setup

This section concerns the finite sample performance of the proposed estimator. We generate $I(d)$ processes according to Definition 1 by using the circulant embedding method as described in Davies & Harte (1987), i.e. as a stationary Type I fractionally integrated process in the terminology of Marinucci & Robinson (1999), see also Beran (1994, pp. 215-217). Non-stationary processes are then defined as $[d]$-fold partial sums of stationary $I(d - [d])$ processes, where $[d]$ is defined as the integer closest to $d$. Furthermore, when $d - [d] = 1/2$, $[d]$ is equal to $d + 1/2$. $\{u_t\}$ is contaminated by autoregressive (AR) and moving average (MA) roots. That is, we consider the following data generating process (DGP)

$$(1 - \rho L)(1 - L)^{d} X_t = (1 + \psi L) u_t,$$

where $u_t \overset{i.i.d.}{\sim} N(0, 1)$, the AR parameter is $\rho \in \{-0.8, -0.5, 0, 0.5, 0.8\}$ and the MA parameter is $\psi \in \{-0.8, -0.5, 0, 0.5, 0.8\}$. We set the fractional parameter of interest equal to $d \in \{-0.3, 0, 0.3, 0.7, 1, 1.3, 1.7, 2\}$. The sample size is set equal to $n \in \{128, 512, 1024\}$ and the bandwidth to $m = \lfloor n^{a} \rfloor$, where $a \in \{0.5, 0.65, 0.8\}$. The bias and root mean squared error (RMSE) were computed using 1000 replications. Simulations were done in Matlab v7.2. The optimization was implemented using the unconstrained minimization procedure in Matlab with the BFGS algorithm. We tried different procedures to find the optimum, among others evaluating the first-order conditions and finding the corresponding roots. All the different approaches yielded similar results, and we therefore elected to use the BFGS algorithm as it is easy to implement and computationally fairly fast.
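One replication of such an experiment can be sketched as follows. This is an illustration under several simplifying assumptions, not the Matlab code behind the reported tables: the stationary component is generated by a truncated MA($\infty$) expansion of $(1-L)^{-d}$ rather than by circulant embedding, there is no AR or MA contamination, and the classical LW objective ($r = 0$) is minimized by a grid search instead of BFGS. The function names `frac_noise`, `simulate_I_d` and `local_whittle` are hypothetical.

```python
import numpy as np

def frac_noise(n, d, rng, burn=1000):
    """Stationary ARFIMA(0, d, 0) via a truncated MA expansion of (1 - L)^(-d)."""
    k = np.arange(1, n + burn)
    # MA weights: psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k
    psi = np.concatenate(([1.0], np.cumprod((k - 1 + d) / k)))
    eps = rng.standard_normal(n + burn)
    return np.convolve(eps, psi)[burn:burn + n]

def simulate_I_d(n, d, rng):
    """I(d) series as [d]-fold partial sums of a stationary I(d - [d]) process."""
    p = int(np.floor(d + 0.5))   # integer closest to d, so that d - p lies in [-1/2, 1/2)
    x = frac_noise(n, d - p, rng)
    for _ in range(p):
        x = np.cumsum(x)
    return x

def local_whittle(x, m, grid=np.linspace(-0.49, 0.99, 297)):
    """Classical LW estimate of d by grid search over the concentrated objective."""
    n = len(x)
    lam = 2 * np.pi * np.arange(1, m + 1) / n
    periodogram = np.abs(np.fft.fft(x)[1:m + 1]) ** 2 / (2 * np.pi * n)
    obj = [np.log(np.mean(periodogram * lam ** (2 * d))) - 2 * d * np.mean(np.log(lam))
           for d in grid]
    return grid[int(np.argmin(obj))]

rng = np.random.default_rng(0)
n = 2048
x = simulate_I_d(n, 0.3, rng)
d_hat = local_whittle(x, m=int(n ** 0.65))
print(d_hat)
```

Repeating the last four lines over many seeds and averaging `d_hat - d` and its square would give Monte Carlo estimates of bias and RMSE in the spirit of the study described above.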

We compare our derived estimators to the local Whittle (LW) estimator of Robinson (1995a), the local polynomial Whittle (LPW) estimator of Andrews & Sun (2004), and the extended local Whittle (ExtLW) estimator of Abadir et al. (2007). Regarding the parameterization of the polynomial, we set $r \in \{1, 2\}$. As initial values we set the memory parameter equal to the log-periodogram estimate of Geweke & Porter-Hudak (1983), and the polynomial terms were all set equal to 1.

Tables 1-5 present results from the small simulation study. We only display a subset of the results. Attention is restricted to the cases with no short-run dynamics with $n = 512$ (Table 1), with moving-average short-run dynamics, MA parameter $\psi \in \{-0.8, -0.5\}$ with $n = 512$ (Tables 2-3), and finally with autoregressive short-run contamination, AR parameter $\rho \in \{0.5, 0.8\}$ with $n = 512$ (Tables 4-5).

6.2 Simulation results

When the DGP is not contaminated by any short-run dynamics, it is preferable to use a larger bandwidth. If there is short-run contamination, the opposite is the case, excluding of course the bias reducing methods, i.e. the LPW estimators. Furthermore, bias decreases as a function of sample size.

In Table 1, results without short-run contamination are presented. For the stationary region all the estimators are seemingly unbiased, and we clearly see that the extended estimators are in a statistical sense equal to their non-extended counterparts. The RMSE shows that the fractional parameter is estimated quite accurately, and we notice that the estimators using a polynomial to reduce potential bias have a larger RMSE than the estimators using a constant in a shrinking neighborhood of zero. Moving on to the non-stationary region, i.e. $d \ge 1/2$, we see that the bias of LW and LPW increases quite considerably, especially when $d$ is larger than 1. This is to be expected, as the LW estimator is not consistent for $d > 1$ and is biased towards unity, thereby confirming the results of Phillips & Shimotsu (2004). This result for the LW estimator also seems to hold for the LPW estimator. Clearly, in the case where $d > 1$, the extended estimators are the best, with the ExtLW estimator being the best in terms of bias, as there is no short-run contamination. For the extended estimators, regardless of which region we are in, the RMSE indicates that the fractional parameter of interest is estimated accurately. Additionally, the RMSE does not vary much in the given range of $d$.

Looking at the case where we introduce short-run contamination, Tables 2-5, we generally find that the estimators are biased and that the bias increases as the contamination of the signal increases. This is expected, as the low frequencies (long-run in the time domain) are contaminated by the higher frequencies (short-run in the time domain) of the spectral density. The bias is highest when introducing positive AR noise and negative MA noise. When the AR parameter is $\rho = 0.8$ and the MA parameter is $\psi = -0.8$, we clearly see the advantage of using an estimator that approximates this short-run contamination. Furthermore, when looking at more moderate negative MA noise and positive AR noise, $\psi = -0.5$ and $\rho = 0.5$, respectively, it is not preferable to use a lower bandwidth for the LPW (only in the stationary case) and ExtLPW estimators, as it is for the other estimators. That is, the LPW and ExtLPW estimators are very robust to MA and AR contamination because of the way they approximate the spectral density of the short-run noise by a polynomial. Hence, it is possible to choose a higher bandwidth without increasing the bias, which is an important result especially when looking at shorter time series.

To sum up, it is important to approximate the short-run component of the local approximation by a polynomial function instead of merely a constant, especially when there is a high degree of persistence, since the polynomial estimators are clearly less biased than the LW and ExtLW estimators. This is especially important in shorter time series, as the bias can be extreme when there is short-run contamination. As shown by Andrews & Sun (2004) and in this paper, the improved bias comes at the cost of increasing the variance by a multiplicative constant. When looking at the non-stationary region, it is important to use the extended versions, especially when $d \ge 1$, as there are considerable bias gains from using these extensions.

For the ExtLW estimator with AR parameter $\rho \in \{-0.5, 0, 0.5\}$, Abadir et al. (2007) arrive at similar results. Furthermore, the simulation results for the ExtLW estimator from Abadir et al. (2007) when $n = 500$ and $m = \lfloor n^{0.65} \rfloor$ are similar to the results obtained in Shimotsu & Phillips (2005) for their exact local Whittle (ELW) estimator. Shimotsu & Phillips (2005) compare their ELW estimator to two different types of tapered estimators (the tapering proposed in Velasco (1999a) and Hurvich & Chen (2000)), and conclude that their estimator is the best general purpose estimator when compared to the tapered versions of the LW estimator. Therefore, we conclude that our proposed estimator also outperforms the tapered LW estimators, especially in the presence of short-run contamination.

7 Application to credit spreads

In this section, we investigate potential long memory properties of treasury yields and yields on corporate bonds, spreads over treasury and spreads between corporate yields, as previously examined by Ratta & Urga (2005). Both in structural models (Merton (1974), Black & Cox (1976), Das (1995), Longstaff & Schwartz (1995), Hull & White (1995), Leland & Toft (1996), among others) and reduced-form models (Ramaswamy & Sundaresan (1986), Jarrow & Turnbull (1995), Das & Tufano (1995), Duffie & Huang (1996), Jarrow, Lando & Turnbull (1997), Madan & Unal (1996), Duffie & Singleton (1999), among others), credit spreads play an integral role in the pricing of risky debt and credit derivatives. Neither approach considers that the data generating process might be poorly approximated by the classical $I(0)/I(1)$ setup, as discussed in Ratta & Urga (2005), see references therein. The objective of Ratta & Urga (2005) was to fill a gap in the credit spread literature, i.e. to investigate if credit spreads exhibit potential fractional integration and if there are some long-run relations that can be explained through fractional cointegration. They use the log-periodogram estimator of Geweke & Porter-Hudak (1983) and the LW estimator analyzed by Robinson (1995a). As both of these estimators are severely biased in the presence of short-run contamination, see Nielsen & Frederiksen (2005) for a simulation study, and there is no asymptotic theory for $d \ge 3/4$, we suggest using more up-to-date semiparametric estimators that potentially mitigate the bias introduced by short-run contamination and for which the distributional theory holds for $d \ge 3/4$. The usual way to reduce bias for the log-periodogram and the LW estimator is to select a smaller bandwidth, thereby sacrificing efficiency in the form of a larger variance.

The data considered here consist of daily observations of the 30-year historical US Treasury Constant Maturity Yields and Moody's Aaa and Baa series (see footnote 4). For a more thorough description of the data and our reason for using rating-specific indices, see Ratta & Urga (2005). Our data cover the period 2nd of January 1986 through 15th of February 2002 for a total of 4,034 observations. The 30-year Treasury constant maturity series was discontinued on February 18th, 2002, and reintroduced on February 9th, 2006. We could have used the 20-year Treasury constant maturity series and a correction factor delivered by the U.S. Treasury, but we choose to focus on the shorter sample period.

4 Ratta & Urga (2005) look at two other ratings besides the two that we consider, i.e., Aa and A. The reason we only look at Aaa and Baa is that these data series are downloadable from the Federal Reserve at http://www.federalreserve.gov/releases/h15/data.htm

As opposed to Ratta & Urga (2005), we opt to log transform the series before further analysis (see footnote 5). Therefore, spreads, i.e. spreads over treasury (sAaaTreas, sBaaTreas) and spreads between corporate yields (sBaaAaa), are defined as the difference between the logs of the respective series. Time series plots of the individual series and the spreads are shown in Figure 1. There are signs of heteroskedasticity, volatility clustering and potential structural breaks. Granger & Terasvirta (1999) show that the number of regime switches affects the long memory parameter. Diebold & Inoue (2001), Granger & Hyung (2004) and Haldrup & Nielsen (2007) discuss that if series display breaks, particularly in their deterministic components, these processes will give the impression of persistence. That is, we can mistakenly conclude that a process displays long memory, when in fact the apparent persistence is due to a structural break in the series. Therefore, we split the full sample into four even subperiods. The results from looking at subperiods were comparable to those for the whole sample period, and are therefore omitted. Additionally, we implemented a test for spurious long memory where we temporally aggregated the data and compared the long memory estimates through a Wald type test for identical memory across aggregation, see Ohanissian, Russell & Tsay (2008) and Frederiksen & Nielsen (2008). We could not reject that the memory parameters are identical. Hence, we conjecture that the estimated long memory is not spurious in the sense that it is generated by structural breaks, e.g. a DGP with a non-stationary level shift in the mean. The first differences of the respective series seem stationary (judging from the autocorrelation diagrams, which are omitted). In particular, the spread series look as if they have been overdifferenced, as indicated by moving average behavior in the autocorrelation diagram.

Insert Figure 1 about here

Figures 2-7 display the semiparametric results for the LW, LPW, ExtLW and ExtLPW estimators, for different bandwidth ranges.

Insert Figures 2-7 about here

Generally, results for the fractional integration estimates show that the estimates from estimators that do not model the short-run components by a polynomial have a tendency to decrease as a function of the bandwidth, at least for sufficiently large bandwidth. This is of course reasonable considering the theoretical properties of these estimators.

The logs of Aaa, Baa and Treasury yields are in the non-stationary area with the long memory parameter estimated in the proximity of a unit root. As the asymptotic theory does not hold for the LPW estimator when $d \ge 1/2$ and for the LW estimator when $d \ge 3/4$, we primarily rely on the extended estimators. In general, we cannot reject that the log yields contain a unit root.

Looking at the spreads over treasury (sAaaTreas, sBaaTreas) and spreads between corporate yields (sBaaAaa), the estimated long memory is clearly in the non-stationary region regardless of the chosen bandwidth and estimator. The LW and ExtLW estimates are, for larger bandwidth choices, significantly different from $d = 1$, whereas we cannot reject the presence of a unit root for the polynomial estimators.

5 Log transforming the data is also preferred in the sense that it better captures the non-linear relationship between yields and ratings, Manzoni (2002).

Like Ratta & Urga (2005), we have also applied a parametric ARFIMA-GARCH model (see footnote 6). The results confirm the findings obtained from the semiparametric analysis, so they are omitted. If the true generating process indeed has GARCH innovations, this does not affect the asymptotic theory of the semiparametric estimates, as indicated in a simulation study by Nielsen & Frederiksen (2005), so, in that respect, it is not unreasonable that the conclusions are the same.

Overall, it cannot be rejected that the log yields of Aaa, Baa and Treasury bonds contain a unit root. However, the results are more mixed when looking at spreads, depending on the estimator and bandwidth choice. Therefore, as in Ratta & Urga (2005), we can reject the reduced-form modeling of Das & Tufano (1995), Jarrow et al. (1997) and Duffie & Singleton (1999). These models explicitly imply that the data generating process of the risk-free process, and hence also of credit spreads, follows a short-memory process, i.e. $I(0)$. The relevance of modeling yields in a more flexible fractional cointegration setup should be considered and is at least a relevant alternative to the classical $I(1)/I(0)$ terminology.

8 Concluding remarks

In this paper, we propose a semiparametric estimator that circumvents the relatively slow convergence and finite sample bias of the classical local Whittle estimator when there is short-run contamination (e.g. autoregressive and/or moving average roots) and non-stationarity. The bias reduction is obtained by approximating the spectrum of the short memory component by a polynomial instead of a constant in a shrinking neighborhood of frequency zero. In addition, the notion of the extended DFT and periodogram is used to extend the estimator to cover non-stationary values of the fractional integration parameter $d$. We show consistency and asymptotic normality of the estimator. A simulation study confirms the asymptotic results. The adequacy of the estimator is shown through an empirical analysis of credit spreads.

As a final note, we could also have opted to expand the work of Andrews & Sun (2004) by utilizing the work of Shimotsu & Phillips (2005). We conjecture that such an estimator would in fact be consistent and asymptotically normal in the same manner as the exact local Whittle estimator of Shimotsu & Phillips (2005). Robinson (2005) showed that the expected squared deviation between the DFT of the Type I and the Type II model is of order $O(n^{-1})$. Therefore, we conjecture that the derived results also hold for Type II fractional processes.

9 Appendix of proofs and lemmas

The proof of Theorem 1 relies heavily on Abadir et al. (2007) and Andrews & Sun (2004).

The Appendix is structured as follows: In the first section the proof of Theorem 1 is given. Section 2 presents technical lemmas adapted from Andrews & Sun (2004) and Abadir et al. (2007).

6 We also estimated other GARCH specifications, e.g., IGARCH and FIGARCH. These other specifications seem indeed to be justified in the sense that $\alpha + \beta \approx 1$ in the GARCH(1,1) specification. A further analysis is beyond the scope of this paper.

9.1 Proof of Theorem 1

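Proof. Set

$$D_m(\epsilon) = \left\{ d \in [d_1, d_2] : \left(\log^{5} m\right) |d - d_0| < \epsilon \right\} \quad \text{for } \epsilon > 0, \tag{51}$$

$$g_j = \lambda_j^{-2d} G \exp(-P_r(\lambda_j, \theta)). \tag{52}$$

As in Andrews & Sun (2004), denote the score and the Hessian of the scaled objective function as $S_n(d, \theta) = m \nabla L_n(d, \theta)$ and $H_n(d, \theta) = m \nabla^{2} L_n(d, \theta)$, respectively:

$$S_n(d, \theta) = \hat{G}^{-1}(d, \theta) \sum_{j=1}^{m} \left( I_j(d) \exp(P_r(\lambda_j, \theta)) \lambda_j^{2d} - m^{-1} \sum_{k=1}^{m} I_k(d) \exp(P_r(\lambda_k, \theta)) \lambda_k^{2d} \right) \left(2 \log j, \lambda_j^{2}, \ldots, \lambda_j^{2r}\right)' \tag{53}$$

$$= \hat{G}^{-1}(d, \theta) \sum_{j=1}^{m} \left( \frac{G I_j(d)}{g_j(d, \theta)} - m^{-1} \sum_{k=1}^{m} \frac{G I_k(d)}{g_k(d, \theta)} \right) X_j, \tag{54}$$

$$H_n(d, \theta) = \hat{G}^{-2}(d, \theta) \left( \hat{G}(d, \theta) \sum_{j=1}^{m} I_j(d) \exp(P_r(\lambda_j, \theta)) \lambda_j^{2d} X_j X_j' - m \left( m^{-1} \sum_{j=1}^{m} I_j(d) \exp(P_r(\lambda_j, \theta)) \lambda_j^{2d} X_j \right) \left( m^{-1} \sum_{j=1}^{m} I_j(d) \exp(P_r(\lambda_j, \theta)) \lambda_j^{2d} X_j \right)' \right)$$

$$= \hat{G}^{-2}(d, \theta) \left( \hat{G}(d, \theta) \sum_{j=1}^{m} \frac{G I_j(d)}{g_j(d, \theta)} X_j X_j' - m \left( m^{-1} \sum_{j=1}^{m} \frac{G I_j(d)}{g_j(d, \theta)} X_j \right) \left( m^{-1} \sum_{j=1}^{m} \frac{G I_j(d)}{g_j(d, \theta)} X_j \right)' \right), \tag{55}$$

where $X_j = \left(2 \log j, \lambda_j^{2}, \ldots, \lambda_j^{2r}\right)'$ and $\hat{G}(d, \theta) = m^{-1} \sum_{j=1}^{m} I_j(d) \exp(P_r(\lambda_j, \theta)) \lambda_j^{2d}$. Define the deterministic scaling matrix $B_n$ as the $(r+1) \times (r+1)$ diagonal matrix with $j$th diagonal element defined as

$$[B_n]_{11} = m^{1/2}, \tag{56}$$

$$[B_n]_{jj} = \left(\frac{2\pi m}{n}\right)^{2j-2} m^{1/2} \quad \text{for } j = 2, 3, \ldots, r+1. \tag{57}$$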

Since $L_n(d,\theta)$ is a continuous function defined on a compact set, the estimator exists. Strict convexity of the (negative) log-likelihood, $L_n(\cdot)$, implies that the estimator is unique. Moreover, strict convexity and twice continuous differentiability of $L_n(\cdot)$ imply that if a solution to the first-order conditions exists with probability tending to one, which essentially follows from Andrews & Sun (2004, Lemma 1), then it is unique and minimizes the objective function. We can now use Lemma 1 to verify that the conditions in Lemma 1 of Andrews & Sun (2004) hold. Andrews & Sun (2004, Lemma 1(i)) holds by Assumption 5. Andrews & Sun (2004, Lemma 1(ii)) holds by Lemma 1(e) and the second condition in Assumption 5. Andrews & Sun (2004, Lemma 1(iii)) holds by Lemma 1(a) and Lemma 1(b) and the positive definiteness of $\Omega_r$. Andrews & Sun (2004, Lemma 1(iv)) holds by Lemma 1(c) and Lemma 1(d), as these ensure that $C_n \to \infty$ holds for some sequence $\varepsilon_n$ that goes sufficiently slowly to zero (see footnote 7). Thus, what remains is to show strict convexity. If all leading principal minors satisfy $D_l(d,\theta) > 0$, $l = 1, 2, \ldots, 1 + r + 1$, for all $(d,\theta) \in D \times \Theta \subset [d_1, d_2] \times \mathbb{R}^{r+1}$, then the (negative) $L_n(d,\theta)$ is strictly convex on $D \times \Theta \subset [d_1, d_2] \times \mathbb{R}^{r+1}$. Andrews & Sun (2004) prove this by noticing that for any $c \in \mathbb{R}^{r+1} \setminus \{0\}$

$$c' H_n(d,\theta) c\, G^{2}(d,\theta)\, m^{-1} = G(d,\theta)\, m^{-1} \sum_{j=1}^{m} \frac{G I_j(d)}{g_j(d,\theta)} \left(c'X_j\right)^{2} - \left( m^{-1} \sum_{j=1}^{m} \frac{G I_j(d)}{g_j(d,\theta)}\, c'X_j \right)^{2} = a'a \cdot b'b - (a'b)^{2} > 0, \tag{58}$$

where $a$ and $b$ are vectors of order $m$ with $a_j = \left( m^{-1} \frac{G I_j(d)}{g_j(d,\theta)} \right)^{1/2}$ and $b_j = \left( m^{-1} \frac{G I_j(d)}{g_j(d,\theta)} \right)^{1/2} c'X_j$ for $j = 1, \ldots, m$, and the inequality holds by the Cauchy-Schwarz inequality.

Footnote 7: Andrews & Sun (2004) give an example of such a sequence, namely $\varepsilon_n = \log^{-1} m$.
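The Cauchy-Schwarz step in (58) can be checked numerically. The following is a minimal Python sketch, not part of the original proof: the positive weights are random stand-ins for $G I_j(d)/g_j(d,\theta)$, and the function name `convexity_gap` as well as the values of `n`, `m`, and `r` are illustrative assumptions.

```python
import math
import random


def convexity_gap(weights, x_rows, c):
    """Return a'a * b'b - (a'b)^2 with a_j = sqrt(w_j / m) and b_j = a_j * c'X_j."""
    m = len(weights)
    a = [math.sqrt(w / m) for w in weights]
    b = [a_j * sum(ci * xi for ci, xi in zip(c, row))
         for a_j, row in zip(a, x_rows)]
    aa = sum(v * v for v in a)
    bb = sum(v * v for v in b)
    ab = sum(u * v for u, v in zip(a, b))
    return aa * bb - ab * ab


rng = random.Random(1)
n, m, r = 512, 40, 2  # illustrative sizes, not taken from the text
lam = [2.0 * math.pi * j / n for j in range(1, m + 1)]
# X_j = (2 log j, lambda_j^2, ..., lambda_j^(2r))'
x_rows = [[2.0 * math.log(j)] + [lam[j - 1] ** (2 * k) for k in range(1, r + 1)]
          for j in range(1, m + 1)]
# Hypothetical positive weights standing in for G * I_j(d) / g_j(d, theta).
weights = [rng.expovariate(1.0) for _ in range(m)]
c = [rng.uniform(-1.0, 1.0) for _ in range(r + 1)]

gap = convexity_gap(weights, x_rows, c)
print(gap > 0.0)
```

The gap is strictly positive whenever $c'X_j$ is not constant in $j$, which is the case here because $\log j$ and the powers of $\lambda_j$ vary with $j$.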

9.2 Lemmas

Lemma 1 is the same as Lemma 2 in Andrews & Sun (2004). Part (e) lacks the bias term because we impose a weaker form of divergence of the bandwidth $m$ in Assumption 5 than Andrews & Sun (2004) do. Otherwise, the proof follows from Andrews & Sun (2004) with modifications to allow for $d \geq 1/2$. These modifications follow Abadir et al. (2007) and their notion of the extended DFT and periodogram. Lemma 2 is adapted from Lemma 4.6 in Abadir et al. (2007) and deals with the asymptotic properties of the renormalized DFTs; it hence generalizes Theorem 2 of Robinson (1995b), as done in Abadir et al. (2007), to suit the non-stationary case. Furthermore, we will use Lemma 4.2 and Lemma 4.4 of Abadir et al. (2007) extensively. In short, Lemma 4.2 gives relevant bounds for proving consistency of the estimator $\hat{d}$. Lemma 4.4 gives the algebraic relation between the DFT of the series $\{X_t\}$ and that of the differenced series $\{\Delta^{p} X_t\}$.

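Lemma 1 Under Assumptions 1-6, as $n \to \infty$, we have

(a) $B_n^{-1} J_n B_n^{-1} \xrightarrow{p} \Omega_r$;

(b) $B_n^{-1} \left( H_n(d_0,\theta_0) - J_n \right) B_n^{-1} = o_p(1)$;

(c) $\sup_{\theta \in \Theta} B_n^{-1} \left( H_n(d_0,\theta) - H_n(d_0,\theta_0) \right) B_n^{-1} = o_p(1)$;

(d) $\sup_{(d,\theta) \in D_m(\varepsilon_n) \times \Theta} B_n^{-1} \left( H_n(d,\theta) - H_n(d_0,\theta) \right) B_n^{-1} = o_p(1)$ for all sequences of constants $\{\varepsilon_n\}_{n \geq 1}$ for which $\varepsilon_n = o(1)$;

(e) $B_n^{-1} S_n(d_0,\theta_0) \xrightarrow{d} N(0, \Omega_r)$.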

Proof of (a). This follows by approximating sums by integrals; see Andrews & Sun (2004, p. 597), who refer to Andrews & Guggenberger (2003) and Lemma 2(a), (h), and (i).
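As a rough illustration of the sums-by-integrals argument, the sketch below evaluates the $(1,1)$ element of $B_n^{-1} J_n B_n^{-1}$ for growing $m$, using the normalization $G_0 = 1$ (an assumption made only for this example) so that the element reduces to $m^{-1}\sum_j (2\log j)^2 - (m^{-1}\sum_j 2\log j)^2$. The function name `j11_element` is hypothetical. The individual sums grow like $\log^2 m$, but their difference settles near 4, the leading entry of the limit matrix.

```python
import math


def j11_element(m):
    """(1,1) element J_{2,0} - J_{1,0}^2 when G_0 = 1, so that J_{0,0} = 1."""
    two_log = [2.0 * math.log(j) for j in range(1, m + 1)]
    j10 = sum(two_log) / m
    j20 = sum(v * v for v in two_log) / m
    return j20 - j10 * j10


for m in (100, 1000, 10000, 100000):
    print(m, round(j11_element(m), 4))
```

The slow approach to the limit for small $m$ reflects the logarithmic singularity of the integrand at zero.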

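Proof of (b). As in Andrews & Sun (2004), write $G_{a,b}$ and $J_{a,b}$ as in (59) and (60). The only difference is that our extended periodogram of a time series $\{X_t\}$ depends not only on the Fourier frequencies but also on the value of $d$, i.e., $I_j(d) = I(\lambda_j; d) = |\omega(\lambda_j; d)|^{2}$:

$$G_{a,b}(d,\theta) = m^{-1} \sum_{j=1}^{m} I_j(d) \exp\left(P_r(\lambda_j,\theta)\right) \lambda_j^{2d} (2\log j)^{a} \left(\frac{j}{m}\right)^{2b}, \tag{59}$$

$$J_{a,b} = G_0\, m^{-1} \sum_{j=1}^{m} (2\log j)^{a} \left(\frac{j}{m}\right)^{2b}, \tag{60}$$

for $a = 0, 1, 2$ and $b = 0, \ldots, r$, where in defining $J_{a,b}$ we have substituted $G_0$ for $I_j(d) \exp\left(P_r(\lambda_j,\theta)\right) \lambda_j^{2d}$. As in (A.7) in Andrews & Sun (2004), the $(1,1)$, $(1,k)$ and $(k,i)$ elements of $B_n^{-1} H_n B_n^{-1}$ and $B_n^{-1} J_n B_n^{-1}$ for $k, i = 2, \ldots, r+1$ are then given by

$$G_{0,0}^{-2} \left( G_{0,0} G_{2,0} - G_{1,0}^{2} \right), \tag{61}$$

$$G_{0,0}^{-2} \left( G_{0,0} G_{1,k-1} - G_{1,0} G_{0,k-1} \right),$$

$$G_{0,0}^{-2} \left( G_{0,0} G_{0,k+i-2} - G_{0,k-1} G_{0,i-1} \right),$$

where for $B_n^{-1} J_n B_n^{-1}$ one substitutes $J_{a,b}$ for $G_{a,b}(d,\theta)$. To prove Lemma 1(b), it then suffices to show that

$$\Delta_{a,b} = \left| \frac{G_{a,b}(d_0,\theta_0)}{G_0} - \frac{J_{a,b}}{G_0} \right| = o_p\left(\log^{-2} m\right) \tag{62}$$

for all $a = 0, 1, 2$ and $b = 0, 1, \ldots, r$. Write

$$\Delta_{a,b} = \left| \frac{m^{-1} \sum_{j=1}^{m} I_j(d_0) \exp\left(P_r(\lambda_j,\theta_0)\right) \lambda_j^{2d_0} (2\log j)^{a} \left(\frac{j}{m}\right)^{2b}}{G_0} - \frac{G_0\, m^{-1} \sum_{j=1}^{m} (2\log j)^{a} \left(\frac{j}{m}\right)^{2b}}{G_0} \right|$$

$$= \left| m^{-1} \sum_{j=1}^{m} \frac{I_j(d_0) \exp\left(P_r(\lambda_j,\theta_0)\right) \lambda_j^{2d_0} (2\log j)^{a} \left(\frac{j}{m}\right)^{2b}}{g_j \exp\left(P_r(\lambda_j,\theta_0)\right) \lambda_j^{2d_0}} - m^{-1} \sum_{j=1}^{m} (2\log j)^{a} \left(\frac{j}{m}\right)^{2b} \right|$$

$$= \left| m^{-1} \sum_{j=1}^{m} \frac{I_j(d_0)}{g_j} (2\log j)^{a} \left(\frac{j}{m}\right)^{2b} - m^{-1} \sum_{j=1}^{m} (2\log j)^{a} \left(\frac{j}{m}\right)^{2b} \right|$$

$$= \left| m^{-1} \sum_{j=1}^{m} \left( \frac{I_j(d_0)}{g_j} - 1 \right) (2\log j)^{a} \left(\frac{j}{m}\right)^{2b} \right|$$

$$\leq \left| m^{-1} \sum_{k=1}^{m-1} \left[ (2\log k)^{a} \left(\frac{k}{m}\right)^{2b} - (2\log(k+1))^{a} \left(\frac{k+1}{m}\right)^{2b} \right] \sum_{j=1}^{k} \left( \frac{I_j(d_0)}{g_j} - 1 \right) \right| + \left| (2\log m)^{a}\, m^{-1} \sum_{j=1}^{m} \left( \frac{I_j(d_0)}{g_j} - 1 \right) \right|$$

$$:= \Delta_{1,a,m} + \Delta_{2,a,m},$$

where the inequality follows from summation by parts. Furthermore, Assumption 1 (i.e., $d_0 \neq p_0 - 1/2$ for $p_0 \in \mathbb{Z}$), the definition of the extended DFT, the assumption of linearity of the generating process (Assumption 4(a)), Abadir et al. (2007, Lemma 4.4), and Lemma 2 together imply that the behavior of the extended DFT and periodogram is the same for all $d \in (-1/2, \infty)$. Therefore, the results from Andrews & Sun (2004) also hold in our case. That is, the proof of (62) follows by collecting the terms (A.11)-(A.13) in Andrews & Sun (2004, pp. 598-599) and using Assumption 5:

$$\Delta_{1,a,m} + \Delta_{2,a,m} = O_p\left( (\log^{a} m)\, m^{-1/2} + (\log^{a} m)\, m^{\phi} n^{-\phi} \right) = o_p\left(\log^{-2} m\right).$$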


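Proof of (c). By (62) and $J_{a,b} = O(\log^{a} m)$, we get that

$$G_{a,b}(d_0,\theta_0) = O_p\left(\log^{a} m\right) \tag{63}$$

for $a = 0, 1, 2$ and $b = 0, \ldots, r$, and

$$G_{0,0}(d_0,\theta_0) = G_0 + o_p\left(\log^{-2} m\right), \tag{64}$$

where $G_0 > 0$. Then, given that we can write the elements of $B_n^{-1} H_n B_n^{-1}$ as in (61) and that the above results hold, it suffices to show that

$$\sup_{\theta \in \Theta} \left| G_{a,b}(d_0,\theta) - G_{a,b}(d_0,\theta_0) \right| = o_p\left(\log^{-2} m\right) \tag{65}$$

for all $a = 0, 1, 2$ and $b = 0, \ldots, r$. Write the left-hand side of (65) as

$$\sup_{\theta \in \Theta} \left| m^{-1} \sum_{j=1}^{m} I_j(d_0) \exp\left(P_r(\lambda_j,\theta)\right) \lambda_j^{2d_0} (2\log j)^{a} \left(\frac{j}{m}\right)^{2b} - m^{-1} \sum_{j=1}^{m} I_j(d_0) \exp\left(P_r(\lambda_j,\theta_0)\right) \lambda_j^{2d_0} (2\log j)^{a} \left(\frac{j}{m}\right)^{2b} \right|$$

$$= \sup_{\theta \in \Theta} \left| m^{-1} \sum_{j=1}^{m} I_j(d_0) \left[ \exp\left(P_r(\lambda_j,\theta)\right) - \exp\left(P_r(\lambda_j,\theta_0)\right) \right] \lambda_j^{2d_0} (2\log j)^{a} \left(\frac{j}{m}\right)^{2b} \right|$$

$$\leq \sup_{\theta \in \Theta} \sup_{k=1,\ldots,m} \left| \frac{\exp\left(P_r(\lambda_k,\theta)\right)}{\exp\left(P_r(\lambda_k,\theta_0)\right)} - 1 \right| \times m^{-1} \sum_{j=1}^{m} I_j(d_0) \exp\left(P_r(\lambda_j,\theta_0)\right) \lambda_j^{2d_0} (2\log j)^{a}$$

$$= O\left(\lambda_m^{2}\right) G_{a,0}(d_0,\theta_0)$$

$$= O_p\left( (m/n)^{2} \log^{a} m \right) = o_p\left(\log^{-2} m\right). \tag{66}$$

The second equality holds by a mean-value expansion using the compactness of $\Theta$, and the third equality holds by (62) and $J_{a,b} = O(\log^{a} m)$. The last equality holds by Assumption 5.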
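The mean-value step in the proof of (c) says that the relative change in $\exp(P_r(\lambda_k,\theta))$ over the first $m$ frequencies is of order $\lambda_m^2$ when $\theta$ ranges over a compact set. A small Python sketch of this, under the assumption that $P_r(\lambda,\theta) = \sum_{k=1}^{r} \theta_k \lambda^{2k}$ (the form given in the notes to Table 1) and with hypothetical coefficient values and function names:

```python
import math


def poly(lam, theta):
    """P_r(lambda, theta) = sum_k theta_k * lambda^(2k), k = 1..r."""
    return sum(t * lam ** (2 * (k + 1)) for k, t in enumerate(theta))


def sup_ratio_gap(n, m, theta, theta0):
    """sup over k = 1..m of |exp(P_r(lam_k, theta)) / exp(P_r(lam_k, theta0)) - 1|."""
    lams = [2.0 * math.pi * k / n for k in range(1, m + 1)]
    return max(abs(math.exp(poly(l, theta) - poly(l, theta0)) - 1.0) for l in lams)


theta0 = [0.5, -0.2]  # hypothetical "true" polynomial coefficients
theta = [-1.0, 1.0]   # hypothetical point in a compact parameter set
ratios = []
for n in (512, 4096, 32768):
    m = int(n ** 0.65)
    lam_m = 2.0 * math.pi * m / n
    ratios.append(sup_ratio_gap(n, m, theta, theta0) / lam_m ** 2)
    print(n, m, ratios[-1])
```

The printed ratio stays bounded as $m/n$ shrinks, which is what the $O(\lambda_m^2)$ factor in (66) expresses.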

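Proof of (d). By the same arguments as in Andrews & Sun (2004), we note that (i) by equations (62) and (65), $G_{a,b}(d_0,\theta) = J_{a,b} + o_p\left(\log^{-2} m\right)$; (ii) $J_{a,b} = O(\log^{a} m)$; (iii) $J_{0,0} J_{2,0} - J_{1,0}^{2} = O(1)$, by replacing sums by integrals and noting that the part of $J_{0,0} J_{2,0}$ that is $O(\log^{2} m)$ cancels with an identical term in $J_{1,0}^{2}$; (iv) $J_{0,0} J_{1,k-1} - J_{1,0} J_{0,k-1} = O(1)$ by the same argument as in (iii); and (v) $J_{0,0} = G_0 > 0$. Then from (i)-(v) and equation (61) it suffices to show

$$\sup_{(d,\theta) \in D_m(\varepsilon_n) \times \Theta} \left| G_{a,b}(d,\theta) - G_{a,b}(d_0,\theta) \right| = o_p\left(\log^{-2} m\right). \tag{67}$$

Replacing $\lambda_j^{2d}$ with $j^{2d}$ in $G_{a,b}(d,\theta)$, and thereby defining $E_{a,b}(d,\theta)$, equation (61) also holds with $G_{a,b}(d,\theta)$ replaced by $E_{a,b}(d,\theta)$. Hence, to prove Lemma 1(d) it suffices that

$$Z_{a,b} = \sup_{(d,\theta) \in D_m(\varepsilon_n) \times \Theta} \left| E_{a,b}(d,\theta) - E_{a,b}(d_0,\theta) \right| = o_p\left( n^{2d_0} \log^{-2} m \right) \tag{68}$$

for all $a = 0, 1, 2$ and $b = 0, \ldots, r$. Then, from Andrews & Sun (2004, p. 600), there exists $C < \infty$ such that for $(d,\theta) \in D_m(\varepsilon_n) \times \Theta$

$$Z_{a,b} = \sup_{(d,\theta) \in D_m(\varepsilon_n) \times \Theta} \left| m^{-1} \sum_{j=1}^{m} I_j(d_0) \exp\left(P_r(\lambda_j,\theta)\right) (2\log j)^{a} \left(\frac{j}{m}\right)^{2b} j^{2d_0} \left( j^{2(d-d_0)} - 1 \right) \right|$$

$$\leq C \sup_{d \in D_m(\varepsilon_n)} m^{-1} \sum_{j=1}^{m} I_j(d_0) (\log j)^{a} j^{2d_0} \left| j^{2(d-d_0)} - 1 \right| + o_p(1)$$

$$\leq 2C \exp\left( 2\varepsilon_n \log^{-4} m \right) \sup_{d \in D_m(\varepsilon_n)} m^{-1} \sum_{j=1}^{m} I_j(d_0) (\log j)^{a+1} j^{2d_0} |d - d_0|$$

$$\leq \varepsilon_n \left(\log^{-2} m\right) 2C \exp\left( 2\varepsilon_n \log^{-4} m \right) m^{-1} \sum_{j=1}^{m} I_j(d_0) \lambda_j^{2d_0} \left(\frac{2\pi}{n}\right)^{-2d_0}.$$

The first inequality follows from $\sup_{\theta \in \Theta} \sup_{j=1,\ldots,m} \exp\left(P_r(\lambda_j,\theta)\right) < \infty$, which holds because $\Theta$ is compact. The second inequality stems from noting that

$$\frac{\left| j^{2(d-d_0)} - 1 \right|}{|d - d_0|} \leq 2 m^{2|d-d_0|} \log j \leq 2 m^{2\varepsilon_n \log^{-5} m} \log j = 2 \exp\left( 2\varepsilon_n \log^{-4} m \right) \log j$$

for $d \in D_m(\varepsilon_n)$ by a mean-value expansion, where we use that $m^{\log^{-1} m} = e$. The third inequality uses $d \in D_m(\varepsilon_n)$. Then from equations (62) and (65) we have $m^{-1} \sum_{j=1}^{m} I_j(d_0) \lambda_j^{2d_0} = G_{0,0}(d_0, 0) = G_0 + o_p\left(\log^{-2} m\right)$. Hence, (68) follows.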
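The mean-value bound used for the second inequality, $|j^{2x} - 1| \leq 2|x|\, m^{2|x|} \log j$ with $x = d - d_0$ and $1 \leq j \leq m$, is easy to verify numerically. The sketch below is only an illustration; the function name `min_slack`, the bandwidth, and the value chosen for the sequence are assumptions.

```python
import math


def min_slack(m, x):
    """Smallest value over j = 1..m of 2|x| m^(2|x|) log j - |j^(2x) - 1|."""
    bound_scale = 2.0 * abs(x) * m ** (2.0 * abs(x))
    return min(bound_scale * math.log(j) - abs(j ** (2.0 * x) - 1.0)
               for j in range(1, m + 1))


m = 57
eps_n = 0.1                        # hypothetical value of the sequence eps_n
radius = eps_n / math.log(m) ** 5  # half-width of D_m(eps_n) around d_0
for x in (-radius, radius, -0.2, 0.2):
    print(x, min_slack(m, x) >= -1e-12)

# m^(1 / log m) = e, which turns m^(2 eps_n / log^5 m) into exp(2 eps_n / log^4 m).
print(abs(m ** (1.0 / math.log(m)) - math.e) < 1e-12)
```

The slack is zero at $j = 1$ and positive elsewhere, so the bound holds with equality only at the first frequency.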

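Proof of (e). By using (62) and setting $a = b = 0$, we get $G(d_0,\theta_0) = G_0 \left( 1 + o_p\left(\log^{-2} m\right) \right)$, and therefore the normalized score can be written as

$$B_n^{-1} S_n(d_0,\theta_0) = G^{-1}(d_0,\theta_0)\, m^{-1/2} \sum_{j=1}^{m} \left( \frac{I_j(d_0)}{g_j(d_0,\theta_0)} - m^{-1} \sum_{k=1}^{m} \frac{I_k(d_0)}{g_k(d_0,\theta_0)} \right) \tilde{X}_j$$

$$= \left(1 + o_p(1)\right) m^{-1/2} \sum_{j=1}^{m} \left( \frac{I_j(d_0)}{g_j(d_0,\theta_0)} - 1 \right) \left( \tilde{X}_j - m^{-1} \sum_{k=1}^{m} \tilde{X}_k \right), \tag{69}$$

where

$$\tilde{X}_j = \left( \log j, (j/m)^{2}, \ldots, (j/m)^{2r} \right)'. \tag{70}$$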


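Therefore, omitting the small-order terms, write the right-hand side of (69), as in Andrews & Sun (2004, p. 601), as $T_{1,n} + T_{2,n} + T_{3,n} + T_{4,n}$, where

$$T_{1,n} = m^{-1/2} \sum_{j=1}^{m} \left( \frac{I_j(d_0)}{g_j(d_0,\theta_0)} - 2\pi I_{\varepsilon}(\lambda_j) - E\left[ \frac{I_j(d_0)}{g_j(d_0,\theta_0)} - 2\pi I_{\varepsilon}(\lambda_j) \right] \right) \left( \tilde{X}_j - m^{-1} \sum_{k=1}^{m} \tilde{X}_k \right), \tag{71}$$

$$T_{2,n} = m^{-1/2} \sum_{j=1}^{m} \left( \frac{E[I_j(d_0)]}{f_j(d_0)} - 1 \right) \frac{f_j(d_0)}{g_j(d_0,\theta_0)} \left( \tilde{X}_j - m^{-1} \sum_{k=1}^{m} \tilde{X}_k \right), \tag{72}$$

$$T_{3,n} = m^{-1/2} \sum_{j=1}^{m} \left( 2\pi I_{\varepsilon}(\lambda_j) - 1 \right) \left( \tilde{X}_j - m^{-1} \sum_{k=1}^{m} \tilde{X}_k \right), \tag{73}$$

$$T_{4,n} = m^{-1/2} \sum_{j=1}^{m} \left( \frac{f_j(d_0)}{g_j(d_0,\theta_0)} - 1 \right) \left( \tilde{X}_j - m^{-1} \sum_{k=1}^{m} \tilde{X}_k \right), \tag{74}$$

using that $E\left(2\pi I_{\varepsilon}(\lambda_j)\right) = 1$. Next, we need to show that $T_{1,n}$ and $T_{4,n}$ are $o_p(1)$, that $T_{2,n} = o(1)$, and that $T_{3,n} \xrightarrow{d} N(0, \Omega_r)$. To show $T_{1,n} = o_p(1)$, use summation by parts: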

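$$T_{1,n} = m^{-1/2} \sum_{k=1}^{m-1} \left( \tilde{X}_k - \tilde{X}_{k+1} \right) \sum_{j=1}^{k} \left( \frac{I_j(d_0)}{g_j(d_0,\theta_0)} - 2\pi I_{\varepsilon}(\lambda_j) - E\left[ \frac{I_j(d_0)}{g_j(d_0,\theta_0)} - 2\pi I_{\varepsilon}(\lambda_j) \right] \right)$$

$$\quad + \left( \tilde{X}_m - m^{-1} \sum_{k=1}^{m} \tilde{X}_k \right) m^{-1/2} \sum_{j=1}^{m} \left( \frac{I_j(d_0)}{g_j(d_0,\theta_0)} - 2\pi I_{\varepsilon}(\lambda_j) - E\left[ \frac{I_j(d_0)}{g_j(d_0,\theta_0)} - 2\pi I_{\varepsilon}(\lambda_j) \right] \right)$$

$$= m^{-1/2} \sum_{k=1}^{m-1} O\left(k^{-1}\right) O_p\left( k^{1/3} \log^{2/3} k + k^{\phi + 1/2} n^{-\phi} + k^{1/2} n^{-1/4} \right) + O(1)\, m^{-1/2}\, O_p\left( m^{1/3} \log^{2/3} m + m^{\phi + 1/2} n^{-\phi} + m^{1/2} n^{-1/4} \right)$$

$$= O_p\left( m^{-1/6} \log^{2/3} m + (m/n)^{\phi} + n^{-1/4} \right) = o_p(1), \tag{75}$$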

which follows from noting that $\tilde{X}_k - \tilde{X}_{k+1} = O\left(k^{-1}\right)$ uniformly over $k = 1, \ldots, m$ and that $\tilde{X}_m - m^{-1} \sum_{k=1}^{m} \tilde{X}_k = O(1)$, the latter by approximating sums by integrals; see Andrews & Sun (2004, p. 602). A remark is in order. Recall that the assumption of linearity of the generating process, Assumption 4(a), together with Abadir et al. (2007, Lemma 4.4) and Lemma 2(ii), implies that the behavior is the same for all $d \in (-1/2, \infty)$. Therefore, the results from Andrews & Sun (2004) also hold in our case. Since $d_0$ belongs to the interior of the admissible parameter space, $T_{1,n} = o_p(1)$. To prove that $T_{2,n} = o(1)$, we again utilize Assumption 4(a) together with Abadir et al. (2007, Lemma 4.4) and Lemma 2(ii), which enable us to use the result that

$$E\left[ \frac{I_j(d_0)}{f_j(d_0)} \right] = 1 + O\left( j^{-1} \log j \right), \tag{76}$$

uniformly over $1 \leq j \leq m$ as $n \to \infty$. Then, using (76), $T_{2,n}$ is bounded by

$$T_{2,n} = m^{-1/2} \sum_{j=1}^{m} O\left( j^{-1} \log j \right) O(1) \left( \tilde{X}_j - m^{-1} \sum_{k=1}^{m} \tilde{X}_k \right) \tag{77}$$

$$= O\left( m^{-1/2} \log m \sum_{j=1}^{m} j^{-1} \log j \right) = O\left( m^{-1/2} \log^{3} m \right),$$

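where we have used that $\tilde{X}_j - m^{-1} \sum_{k=1}^{m} \tilde{X}_k = O(\log m)$ uniformly in $1 \leq j \leq m$. Therefore, $T_{2,n} = o(1)$. Next, we need to show that $\xi' T_{3,n} \xrightarrow{d} N(0, \xi' \Omega_r \xi)$ for all $\xi \neq 0$. That is, we need to verify that, as $n \to \infty$,

$$m^{-1} \sum_{j=1}^{m} \varsigma_j^{2} \to \xi' \Omega_r \xi, \tag{78}$$

where $\varsigma_j = \xi' \left( \tilde{X}_j - m^{-1} \sum_{k=1}^{m} \tilde{X}_k \right)$ and

$$\Omega_r = \begin{pmatrix} 4 & 2\mu_r' \\ 2\mu_r & \Gamma_r \end{pmatrix},$$

which follows from Lemma 1(a), Lemma 1(d) and, finally, noting that $|\varsigma_j - \varsigma_{j+1}| \leq \|\xi\| \left\| \tilde{X}_j - \tilde{X}_{j+1} \right\| \leq C j^{-1}$ for some constant $C > 0$ independent of $j$. Finally, we need to show that $T_{4,n} = o_p(1)$. This follows from summation by parts and from $\frac{f_j(d_0)}{g_j(d_0,\theta_0)} - 1 = O\left( (j/n)^{\phi} \right)$ uniformly on $1 \leq j \leq m$; see Frederiksen et al. (2008). This implies

$$T_{4,n} = m^{-1/2} \sum_{k=1}^{m-1} \left( \tilde{X}_k - \tilde{X}_{k+1} \right) \sum_{j=1}^{k} \left( \frac{f_j(d_0)}{g_j(d_0,\theta_0)} - 1 \right) + \left( \tilde{X}_m - m^{-1} \sum_{k=1}^{m} \tilde{X}_k \right) m^{-1/2} \sum_{j=1}^{m} \left( \frac{f_j(d_0)}{g_j(d_0,\theta_0)} - 1 \right)$$

$$= m^{-1/2} \sum_{k=1}^{m-1} O\left(k^{-1}\right) \sum_{j=1}^{k} O\left( (j/n)^{\phi} \right) + O(1)\, m^{-1/2} \sum_{j=1}^{m} O\left( (j/n)^{\phi} \right)$$

$$= O\left( m^{1/2 + \phi} n^{-\phi} \right) = o_p(1), \tag{79}$$

where the last equality holds by Assumption 5.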
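Summation by parts is used repeatedly in the proofs of (b) and (e). As a reminder of the identity behind those steps, here is a short Python sketch. It is illustrative only: `by_parts` is a hypothetical helper, the centred terms `u` are random stand-ins for quantities such as $f_j/g_j - 1$, and the weights `v` play the role of one coordinate of $\tilde{X}_j$.

```python
import math
import random


def by_parts(u, v):
    """Abel summation: sum_j u_j v_j = sum_{k<m} (v_k - v_{k+1}) U_k + v_m U_m."""
    partial = 0.0  # running partial sum U_k = u_1 + ... + u_k
    first = 0.0
    for k in range(len(u) - 1):
        partial += u[k]
        first += (v[k] - v[k + 1]) * partial
    partial += u[-1]
    second = v[-1] * partial
    return first, second


rng = random.Random(0)
m = 50
# Hypothetical stand-ins for the centred terms and for the weights.
u = [rng.expovariate(1.0) - 1.0 for _ in range(m)]
v = [math.log(j) for j in range(1, m + 1)]

direct = sum(uj * vj for uj, vj in zip(u, v))
first, second = by_parts(u, v)
print(abs(direct - (first + second)) < 1e-9)
print(abs(direct) <= abs(first) + abs(second) + 1e-12)
```

The two pieces correspond to the two terms bounded separately in the text: increments of the weights times partial sums, plus the last weight times the full sum.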

Lemma 2 Assume that the sequence $\{v_j\}$ is given as in (21). The following holds uniformly in $1 \leq k < j \leq m = o(n)$, as $n \to \infty$. (i) If $f_u$ satisfies Assumption 2, then

$$E\left[ |w_u(\lambda_j)|^{2} / f_u(\lambda_j) \right] = 1 + o(1) + O\left( j^{-1} \log j \right), \tag{80}$$

where $o(1) \to 0$ uniformly in $1 \leq j \leq m$, as $n \to \infty$, and

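$$\left| E\left[ v_j \bar{v}_k \right] \right| + \left| E\left[ v_j v_k \right] \right| = O\left( \frac{\log j}{j - k} \right) + O\left( \frac{\log j}{k^{|d|} j^{1-|d|}} \right), \tag{81}$$

$$\left| E\left[ v_j v_j \right] \right| = O\left( \frac{\log j}{j} \right). \tag{82}$$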


(ii) If $f_u$ satisfies Assumptions 2 and 4(c), then

$$E\left[ |w_u(\lambda_j)|^{2} / f_u(\lambda_j) \right] = 1 + O\left( j^{-1} \log j \right), \tag{83}$$

and

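$$\left| E\left[ v_j \bar{v}_k \right] \right| + \left| E\left[ v_j v_k \right] \right| = O\left( \frac{\log j}{k^{|d|} j^{1-|d|}} \right), \tag{84}$$

$$\left| E\left[ v_j v_j \right] \right| = O\left( \frac{\log j}{j} \right). \tag{85}$$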

Proof. This follows from Abadir et al. (2007) and their proof of Lemma 4.6, given Assumption 3 and replacing $b_0$ by $G_0 \exp\left(-P_r(\lambda_j,\theta)\right)$. Let $d_0 = p_0 + d_u$. Then for $d_0 = d_u$, equations (80)-(82) and (83)-(85) follow from Robinson (1995b) and his proof of Theorem 2, p. 1060. For $p_0 \in \mathbb{N} \setminus \{0\}$, by the properties of the extended DFT and the rescaled extended DFT, (18) and (21), respectively, it follows that

$$v_j = \left( \frac{1 - \exp(i\lambda_j)}{\lambda_j} \right)^{-p_0} \frac{w_u(\lambda_j)}{\varphi(\theta)^{1/2} \lambda_j^{-d_u}}. \tag{86}$$

As $\left| \frac{1 - \exp(i\lambda_j)}{\lambda_j} \right|^{-p_0} \leq C$ uniformly in $1 \leq j \leq m$, (81)-(82) and (84)-(85) also hold for $p_0 \in \mathbb{N} \setminus \{0\}$.
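The uniform bound on the rescaling factor in (86) rests on the elementary fact that $|1 - \exp(i\lambda)| / \lambda = 2\sin(\lambda/2)/\lambda$ lies between $2/\pi$ and 1 for $\lambda \in (0, \pi]$, so the factor raised to the power $-p_0$ is at most $(\pi/2)^{p_0}$. A quick numerical check in Python, with illustrative values of `n`, `m`, and `p0` and a hypothetical function name:

```python
import cmath
import math


def rescaling_factor(lam, p0):
    """|(1 - exp(i*lam)) / lam| ** (-p0) for a frequency lam in (0, pi]."""
    return abs((1.0 - cmath.exp(1j * lam)) / lam) ** (-p0)


n, m, p0 = 512, 57, 2  # illustrative sizes; p0 is a positive integer
values = [rescaling_factor(2.0 * math.pi * j / n, p0) for j in range(1, m + 1)]
bound = (math.pi / 2.0) ** p0  # since 2 sin(x/2) / x >= 2/pi on (0, pi]
print(max(values) <= bound)
```

Because $m = o(n)$, the frequencies involved shrink towards zero, where the factor is close to 1, so the constant $C$ can in fact be taken near 1 for large $n$.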

References

Abadir, K. M., Distaso, W. & Giraitis, L. (2007), 'Nonstationarity-extended local Whittle estimation',

Abadir, K. M. & Taylor, R. (1999), 'On the definitions of (co-)integration', Journal of Time Series Analysis 20, 129-13.

Andrews, D. W. K. & Guggenberger, P. (2003), 'A bias-reduced log-periodogram regression estimator for the long memory parameter', Econometrica 71, 675-712.

Arteche, J. (2004), 'Gaussian semiparametric estimation in long memory in stochastic volatility and signal plus noise models', Journal of Econometrics 119, 131-154.

Beran, J. (1994), Statistics for Long-Memory Processes, Chapman-Hall, New York.

Black, F. & Cox, J. C. (1976), 'Valuing corporate securities: Some effects of bond indenture provisions', Journal of Finance 31, 351-67.

Dalla, V., Giraitis, L. & Hidalgo, J. (2006), 'Consistent estimation of the memory parameter for nonlinear time series', Journal of Time Series Analysis 27, 211-251.

Das, S. (1995), 'Credit risk derivatives', The Journal of Derivatives 2, 7-23.


Das, S. & Tufano, P. (1995), 'Pricing credit-sensitive debt when interest rates, credit ratings and credit spreads are stochastic', Journal of Financial Engineering 5, 161-198.

Davies, R. B. & Harte, D. S. (1987), 'Tests for Hurst effects', Biometrika 74, 95-102.

105, 131-159.

Duffie, D. & Huang, M. (1996), 'Swap rates and credit quality', Journal of Finance 51, 921-49.

Duffie, D. & Singleton, K. J. (1999), 'Modeling term structures of defaultable bonds', Review of Financial Studies 12, 687-720.

Frederiksen, P. H. & Nielsen, F. S. (2008), 'Testing for spurious long memory in potentially nonstationary perturbed fractional processes', CREATES RP 2008-59, University of Aarhus, and Working Paper, Nordea Markets.

Frederiksen, P. H., Nielsen, F. S. & Nielsen, M. Ø. (2008), 'Local polynomial Whittle estimation of perturbed fractional processes', CREATES RP 2008-29, University of Aarhus, and Working Paper, Nordea Markets and Cornell University.

Granger, C. W. J. & Terasvirta, T. (1999), 'A simple nonlinear time series model with misleading linear properties', Economics Letters 62, 161-165.

Haldrup, N. & Nielsen, M. Ø. (2007), 'Estimation of fractional integration in the presence of data noise', Computational Statistics and Data Analysis 51, 3100-3114.

Hull, J. & White, A. (1995), 'The impact of default risk on the prices of options and other derivative securities', Journal of Banking & Finance 19, 299-322.

Hurvich, C. M. & Chen, W. W. (2000), 'An efficient taper for potentially overdifferenced long-memory time series', Journal of Time Series Analysis 21, 155-180.

Hurvich, C. M., Moulines, E. & Soulier, P. (2005), 'Estimating long memory in volatility', Econometrica 73, 1283-1328.

Hurvich, C. M. & Ray, B. K. (1995), 'Estimation of the memory parameter for nonstationary or noninvertible fractionally integrated processes', Journal of Time Series Analysis 16, 17-41.


Jarrow, R. A., Lando, D. & Turnbull, S. M. (1997), 'A Markov model for the term structure of credit risk spreads', Review of Financial Studies 10, 481-523.

Jarrow, R. A. & Turnbull, S. M. (1995), 'Pricing derivatives on financial securities subject to credit risk', Journal of Finance 50, 53-85.

Kim, C. S. & Phillips, P. C. B. (2006), Log periodogram regression: The nonstationary case, Cowles Foundation Discussion Papers 1587, Cowles Foundation, Yale University.

Künsch, H. R. (1987), Statistical aspects of self-similar processes, in Y. Prokhorov & V. V. Sazanov, eds, 'Proceedings of the First World Congress of the Bernoulli Society', VNU Science Press, Utrecht, pp. 67-74.

Lahiri, S. N. (2003), 'A necessary and sufficient condition for asymptotic independence of discrete Fourier transforms under short- and long-range dependence', Annals of Statistics 31, 613-641.

Leland, H. E. & Toft, K. B. (1996), 'Optimal capital structure, endogenous bankruptcy, and the term structure of credit spreads', Journal of Finance 51, 987-1019.

Longstaff, F. A. & Schwartz, E. (1995), 'Valuing credit derivatives', Journal of Fixed Income 5, 6-12.

Madan, D. & Unal, H. (1996), Pricing the risks of default, Center for Financial Institutions Working Papers 94-16, Wharton School Center for Financial Institutions, University of Pennsylvania.

Manzoni, K. (2002), 'Modeling credit spreads: An application to the sterling eurobond market', International Review of Financial Analysis 11, 183-218.

Marinucci, D. & Robinson, P. M. (1999), 'Alternative forms of fractional Brownian motion', Journal of Statistical Planning and Inference 80, 111-122.

Merton, R. C. (1974), 'On the pricing of corporate debt: The risk structure of interest rates', Journal of Finance 29, 449-70.

Newey, W. K. & McFadden, D. (1986), Large sample estimation and hypothesis testing, in R. F. Engle & D. McFadden, eds, 'Handbook of Econometrics'.

Nielsen, M. Ø. & Frederiksen, P. H. (2005), 'Finite sample comparison of parametric, semiparametric, and wavelet estimators of fractional integration', Econometric Reviews 24, 405-443.

Ohanissian, A., Russell, J. R. & Tsay, R. S. (2008), 'True or spurious long memory? A new test', Journal of Business & Economic Statistics 26(2), 161-175.

Phillips, P. C. B. (1999), 'Discrete Fourier transformation of fractional processes', Discussion Paper 1243, Yale University (Cowles Foundation).

Phillips, P. C. B. & Shimotsu, K. (2004), 'Local Whittle estimation in nonstationary and unit root cases', The Annals of Statistics 32, 656-692.


Ramaswamy, K. & Sundaresan, S. M. (1986), 'The valuation of floating-rate instruments: Theory and evidence', Journal of Financial Economics 17, 251-272.

Ratta, L. C. & Urga, G. (2005), Modeling credit spreads: A fractional integration approach, Working Paper CEA-07-2005, Cass Business School, City University London.

Robinson, P. M. (1994), 'Semiparametric analysis of long-memory time series', The Annals of Statistics 22, 515-539.

Robinson, P. M. (1995a), 'Gaussian semiparametric estimation of long range dependence', The Annals

Robinson, P. M. (1995b), 'Log-periodogram regression of time series with long range dependence', The Annals of Statistics 23, 1048-1072.

Robinson, P. M. (2005), 'The distance between nonstationary fractional processes', Journal of Econometrics 128, 195-236.

Shimotsu, K. (2006), 'Exact local Whittle estimation of fractional integration with unknown mean and time trend', Working Paper, Department of Economics, Queen's University, Canada (1061).

Shimotsu, K. & Phillips, P. (2005), 'Exact local Whittle estimation of fractional integration', The

Tanaka, K. (1999), 'The nonstationary fractional unit root', Econometric Theory 15, 549-582.

Velasco, C. (1999a), 'Gaussian semiparametric estimation of non-stationary time series', Journal of Time Series Analysis 20, 87-127.

Velasco, C. (1999b), 'Non-stationary log-periodogram regression', Journal of Econometrics 91, 325-371.


Table 1: Simulation results for ARFIMA(0,d,0) with n = 512.

      LW                LPW (r=1)         LPW (r=2)         ExtLW             ExtLPW (r=1)      ExtLPW (r=2)
   d     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE

Panel A: $m = \lfloor n^{0.5} \rfloor$
-0.3  -0.0087   0.1450  -0.0490   0.2708  -0.0546   0.3984  -0.0059   0.1385  -0.0309   0.2459  -0.0219   0.3769
   0  -0.0123   0.1385  -0.0548   0.2659  -0.0616   0.3958  -0.0119   0.1387  -0.0500   0.2549  -0.0287   0.3724
 0.3  -0.0162   0.1413  -0.0476   0.2631  -0.0670   0.3972  -0.0161   0.1423  -0.0510   0.2595  -0.0634   0.3893
 0.7   0.0117   0.1480  -0.0432   0.2729  -0.0359   0.3975  -0.0020   0.1462  -0.0539   0.2667  -0.0464   0.4003
   1  -0.0208   0.1273  -0.0542   0.2370  -0.0570   0.3352  -0.0221   0.1427  -0.0642   0.2609  -0.0663   0.3750
 1.3  -0.1937   0.2314  -0.1935   0.2769  -0.1902   0.3459  -0.0138   0.1395  -0.0415   0.2568  -0.0423   0.3483
 1.7  -0.5869   0.6103  -0.5540   0.5927  -0.5408   0.5979  -0.0053   0.1393  -0.0184   0.2256  -0.0153   0.3272
   2  -0.9094   0.9256  -0.8703   0.8969  -0.8478   0.8851  -0.0146   0.1383  -0.0492   0.2547  -0.0352   0.3298

Panel B: $m = \lfloor n^{0.65} \rfloor$
-0.3   0.0008   0.0771  -0.0151   0.1283  -0.0176   0.1807   0.0007   0.0777  -0.0125   0.1223  -0.0091   0.1706
   0  -0.0017   0.0763  -0.0168   0.1279  -0.0203   0.1740  -0.0017   0.0763  -0.0167   0.1275  -0.0198   0.1723
 0.3  -0.0075   0.0783  -0.0230   0.1309  -0.0232   0.1783  -0.0071   0.0793  -0.0226   0.1319  -0.0192   0.1862
 0.7   0.0100   0.0806   0.0031   0.1359   0.0019   0.1849  -0.0037   0.0781  -0.0097   0.1329  -0.0133   0.1821
   1  -0.0140   0.0691  -0.0205   0.1165  -0.0287   0.1598  -0.0160   0.0772  -0.0278   0.1309  -0.0315   0.1759
 1.3  -0.2141   0.2344  -0.1961   0.2293  -0.1904   0.2405  -0.0128   0.0797  -0.0169   0.1266  -0.0136   0.1810
 1.7  -0.6229   0.6382  -0.5882   0.6103  -0.5709   0.5986  -0.0158   0.0770  -0.0100   0.1224  -0.0043   0.1625
   2  -0.9506   0.9581  -0.9177   0.9311  -0.8990   0.9165  -0.0186   0.0785  -0.0210   0.1282  -0.0206   0.1789

Panel C: $m = \lfloor n^{0.8} \rfloor$
-0.3   0.0115   0.0448  -0.0021   0.0713   0.0004   0.0949   0.0115   0.0448  -0.0018   0.0705   0.0017   0.0907
   0  -0.0040   0.0435  -0.0093   0.0709  -0.0107   0.0941  -0.0040   0.0435  -0.0093   0.0709  -0.0107   0.0941
 0.3  -0.0093   0.0446  -0.0067   0.0711  -0.0067   0.0952  -0.0093   0.0446  -0.0067   0.0711  -0.0068   0.0950
 0.7  -0.0147   0.0510   0.0079   0.0738   0.0083   0.0977  -0.0280   0.0537  -0.0075   0.0708  -0.0077   0.0917
   1  -0.0363   0.0531  -0.0065   0.0603  -0.0082   0.0817  -0.0378   0.0574  -0.0090   0.0699  -0.0115   0.0933
 1.3  -0.2528   0.2662  -0.2052   0.2261  -0.2001   0.2257  -0.0472   0.0644  -0.0085   0.0703  -0.0111   0.0944
 1.7  -0.6847   0.6918  -0.6315   0.6431  -0.6187   0.6327  -0.0595   0.0734  -0.0007   0.0690  -0.0014   0.0912
   2  -0.9950   1.0001  -0.9439   0.9523  -0.9308   0.9417  -0.0746   0.0876  -0.0129   0.0703  -0.0146   0.0954

Notes: LW, LPW, ExtLW, and ExtLPW denote the local Whittle estimator of Robinson (1995a), the local polynomial Whittle estimator of Andrews & Sun (2004), the extended local Whittle estimator of Abadir et al. (2007), and our proposed extended local polynomial Whittle estimator, respectively. $r$ denotes the degree of the polynomial, i.e., $P_r = \sum_{k=1}^{r} \theta_k \lambda_j^{2k}$.
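To make the simulation design concrete, the following Python sketch generates one ARFIMA(0,d,0) path and computes a plain local Whittle (LW) estimate in the sense of Robinson (1995a) by grid search. It is a minimal illustration under stated assumptions, not the code behind Table 1: the series is a type II fractional noise started at zero, the objective is $R(d) = \log\left(m^{-1}\sum_j \lambda_j^{2d} I_j\right) - 2d\, m^{-1}\sum_j \log \lambda_j$, and all function names, the seed, and the grid are hypothetical choices.

```python
import cmath
import math
import random


def simulate_arfima_0d0(n, d, rng):
    """Type II fractional noise: x_t = sum_{k=0}^{t} psi_k * eps_{t-k}."""
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    psi = [1.0]
    for k in range(1, n):
        psi.append(psi[-1] * (k - 1 + d) / k)  # coefficients of (1 - L)^(-d)
    return [sum(psi[k] * eps[t - k] for k in range(t + 1)) for t in range(n)]


def periodogram(x, m):
    """I(lambda_j) at the Fourier frequencies lambda_j = 2*pi*j/n, j = 1..m."""
    n = len(x)
    out = []
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / n
        w = sum(x[t] * cmath.exp(1j * lam * (t + 1)) for t in range(n))
        out.append(abs(w) ** 2 / (2.0 * math.pi * n))
    return out


def local_whittle(x, m, grid):
    """Grid-search minimizer of the local Whittle objective R(d)."""
    n = len(x)
    lam = [2.0 * math.pi * j / n for j in range(1, m + 1)]
    pgram = periodogram(x, m)
    mean_log_lam = sum(math.log(l) for l in lam) / m

    def objective(d):
        g_hat = sum(l ** (2.0 * d) * i for l, i in zip(lam, pgram)) / m
        return math.log(g_hat) - 2.0 * d * mean_log_lam

    return min(grid, key=objective)


n, d_true = 512, 0.3
m = int(n ** 0.65)  # bandwidth as in Panel B
x = simulate_arfima_0d0(n, d_true, random.Random(0))
grid = [i / 100.0 for i in range(-49, 100)]
d_hat = local_whittle(x, m, grid)
print(m, d_hat)
```

Repeating this over many replications and averaging $\hat{d} - d$ and its square would give bias and RMSE figures of the kind reported in the LW columns. The extended and polynomial variants require the extended periodogram and the additional parameters $\theta$, which this sketch does not implement.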


Table 2: Simulation results for ARFIMA(0,d,1) with MA coefficient equal to -0.8 and n = 512.

      LW                LPW (r=1)         LPW (r=2)         ExtLW             ExtLPW (r=1)      ExtLPW (r=2)
   d     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE

Panel A: $m = \lfloor n^{0.5} \rfloor$
-0.3  -0.1159   0.1940  -0.0132   0.2611   0.0077   0.3638  -0.1184   0.1950  -0.0054   0.2493   0.0192   0.3509
   0  -0.1669   0.2168  -0.0859   0.2771  -0.0816   0.3852  -0.1666   0.2167  -0.0801   0.2630  -0.0694   0.3417
 0.3  -0.1606   0.2151  -0.0842   0.2659  -0.0609   0.3838  -0.1604   0.2156  -0.0944   0.2531  -0.0842   0.3558
 0.7  -0.1444   0.2101  -0.0565   0.2659  -0.0412   0.3699  -0.0826   0.1932   0.0003   0.2822   0.0053   0.3946
   1  -0.1275   0.1866  -0.0739   0.2553  -0.0611   0.3594  -0.1654   0.2164  -0.0879   0.2674  -0.0636   0.3742
 1.3  -0.2494   0.2746  -0.2099   0.2973  -0.2036   0.3608  -0.1727   0.2248  -0.1027   0.2585  -0.0998   0.3398
 1.7  -0.6037   0.6207  -0.5666   0.6007  -0.5428   0.6014  -0.0851   0.1889  -0.0074   0.2669   0.0142   0.3400
   2  -0.9138   0.9276  -0.8721   0.8978  -0.8456   0.8858  -0.1677   0.2188  -0.0775   0.2573  -0.0470   0.3369

Panel B: $m = \lfloor n^{0.65} \rfloor$
-0.3  -0.3216   0.3365  -0.1284   0.1915  -0.0467   0.1859  -0.3378   0.3505  -0.1342   0.1972  -0.0463   0.1846
   0  -0.3603   0.3699  -0.1773   0.2215  -0.1025   0.2091  -0.3580   0.3682  -0.1769   0.2210  -0.0999   0.2039
 0.3  -0.3697   0.3794  -0.1875   0.2274  -0.1039   0.2030  -0.3697   0.3794  -0.1875   0.2274  -0.1041   0.2033
 0.7  -0.3467   0.3586  -0.1609   0.2115  -0.0774   0.1976  -0.3388   0.3541  -0.1262   0.2098  -0.0325   0.2156
   1  -0.3056   0.3225  -0.1326   0.1854  -0.0645   0.1754  -0.3653   0.3741  -0.1816   0.2215  -0.0984   0.1991
 1.3  -0.3311   0.3372  -0.2439   0.2644  -0.2084   0.2545  -0.3748   0.3844  -0.1850   0.2271  -0.1040   0.2032
 1.7  -0.6457   0.6523  -0.6045   0.6209  -0.5822   0.6075  -0.3493   0.3623  -0.1951   0.2299  -0.1501   0.2253
   2  -0.9516   0.9581  -0.9168   0.9301  -0.8965   0.9153  -0.3674   0.3756  -0.1834   0.2245  -0.1026   0.2009

Panel C: $m = \lfloor n^{0.8} \rfloor$
-0.3  -0.4946   0.4996  -0.3523   0.3650  -0.2386   0.2636  -0.5094   0.5133  -0.3693   0.3797  -0.2290   0.2549
   0  -0.5333   0.5362  -0.3930   0.4006  -0.2866   0.3028  -0.4455   0.4739  -0.3889   0.3958  -0.2859   0.3015
 0.3  -0.5469   0.5500  -0.3992   0.4073  -0.2960   0.3128  -0.5469   0.5500  -0.3992   0.4073  -0.2960   0.3128
 0.7  -0.5359   0.5397  -0.3712   0.3805  -0.2614   0.2808  -0.5359   0.5397  -0.3696   0.3798  -0.2183   0.2556
   1  -0.4980   0.5046  -0.3223   0.3371  -0.2216   0.2462  -0.5145   0.5167  -0.3855   0.3920  -0.2836   0.2974
 1.3  -0.4657   0.4730  -0.3425   0.3493  -0.2900   0.2988  -0.5739   0.5768  -0.3986   0.4065  -0.2928   0.3081
 1.7  -0.7071   0.7086  -0.6516   0.6563  -0.6341   0.6427  -0.5625   0.5665  -0.3740   0.3834  -0.2601   0.2825
   2  -1.0010   1.0035  -0.9502   0.9555  -0.9364   0.9444  -0.5301   0.5336  -0.3890   0.3962  -0.3080   0.3309

Notes: See the notes to Table 1.


Table 3: Simulation results for ARFIMA(0,d,1) with MA coefficient equal to -0.5 and n = 512.

      LW                LPW (r=1)         LPW (r=2)         ExtLW             ExtLPW (r=1)      ExtLPW (r=2)
   d     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE

Panel A: $m = \lfloor n^{0.5} \rfloor$
-0.3  -0.0293   0.1431  -0.0478   0.2630  -0.0522   0.3787  -0.0291   0.1429  -0.0364   0.2503  -0.0240   0.3537
   0  -0.0365   0.1509  -0.0553   0.2649  -0.0646   0.3864  -0.0364   0.1505  -0.0496   0.2591  -0.0437   0.3666
 0.3  -0.0384   0.1527  -0.0485   0.2676  -0.0506   0.3782  -0.0402   0.1505  -0.0578   0.2564  -0.0580   0.3645
 0.7  -0.0154   0.1448  -0.0278   0.2758  -0.0470   0.3862  -0.0191   0.1450  -0.0355   0.2747  -0.0500   0.3926
   1  -0.0329   0.1306  -0.0441   0.2481  -0.0453   0.3532  -0.0406   0.1476  -0.0509   0.2647  -0.0428   0.3865
 1.3  -0.2058   0.2402  -0.2073   0.2842  -0.2119   0.3359  -0.0422   0.1496  -0.0615   0.2542  -0.0711   0.3341
 1.7  -0.5915   0.6136  -0.5554   0.5931  -0.5365   0.5968  -0.0186   0.1394  -0.0027   0.2247   0.0105   0.3361
   2  -0.9134   0.9269  -0.8732   0.8968  -0.8480   0.8828  -0.0383   0.1435  -0.0383   0.2535  -0.0236   0.3273

Panel B: $m = \lfloor n^{0.65} \rfloor$
-0.3  -0.0956   0.1225  -0.0233   0.1318  -0.0114   0.1763  -0.0962   0.1237  -0.0223   0.1291  -0.0074   0.1695
   0  -0.1028   0.1288  -0.0365   0.1352  -0.0320   0.1733  -0.1028   0.1288  -0.0365   0.1349  -0.0313   0.1710
 0.3  -0.1014   0.1269  -0.0334   0.1326  -0.0249   0.1802  -0.1014   0.1269  -0.0335   0.1329  -0.0248   0.1818
 0.7  -0.0901   0.1241  -0.0170   0.1363  -0.0087   0.1782  -0.0956   0.1294  -0.0215   0.1383  -0.0135   0.1802
   1  -0.0827   0.1112  -0.0247   0.1236  -0.0207   0.1654  -0.1082   0.1317  -0.0294   0.1280  -0.0216   0.1737
 1.3  -0.2358   0.2479  -0.2047   0.2371  -0.1989   0.2462  -0.1135   0.1372  -0.0371   0.1390  -0.0351   0.1815
 1.7  -0.6342   0.6452  -0.5990   0.6177  -0.5806   0.6057  -0.0969   0.1270  -0.0147   0.1290  -0.0031   0.1680
   2  -0.9528   0.9590  -0.9192   0.9315  -0.8992   0.9165  -0.1168   0.1402  -0.0360   0.1377  -0.0272   0.1751

Panel C: $m = \lfloor n^{0.8} \rfloor$
-0.3  -0.2392   0.2441  -0.1047   0.1262  -0.0386   0.0975  -0.2278   0.2328  -0.1029   0.1232  -0.0374   0.0947
   0  -0.2582   0.2629  -0.1142   0.1348  -0.0565   0.1099  -0.2582   0.2629  -0.1142   0.1348  -0.0565   0.1099
 0.3  -0.2655   0.2696  -0.1117   0.1318  -0.0493   0.1037  -0.2655   0.2696  -0.1117   0.1318  -0.0493   0.1037
 0.7  -0.2621   0.2679  -0.0966   0.1223  -0.0314   0.1042  -0.2583   0.2644  -0.1023   0.1254  -0.0356   0.1029
   1  -0.2287   0.2390  -0.0781   0.1084  -0.0312   0.0916  -0.2852   0.2895  -0.1118   0.1324  -0.0486   0.1037
 1.3  -0.3183   0.3200  -0.2354   0.2468  -0.2146   0.2359  -0.2966   0.3006  -0.1146   0.1369  -0.0530   0.1101
 1.7  -0.6865   0.6913  -0.6289   0.6394  -0.6128   0.6278  -0.2818   0.2884  -0.1125   0.1388  -0.0490   0.1148
   2  -0.9998   1.0036  -0.9493   0.9567  -0.9370   0.9468  -0.3191   0.3229  -0.1169   0.1359  -0.0543   0.1058

Notes: See the notes to Table 1.


Table 4: Simulation results for ARFIMA(1,d,0) with AR coefficient equal to 0.8 and n = 512.

      LW                LPW (r=1)         LPW (r=2)         ExtLW             ExtLPW (r=1)      ExtLPW (r=2)
   d     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE

Panel A: $m = \lfloor n^{0.5} \rfloor$
-0.3   0.1530   0.2114  -0.0133   0.2535  -0.0323   0.3805   0.1530   0.2114  -0.0012   0.2475   0.0080   0.3891
   0   0.1404   0.2015  -0.0290   0.2553  -0.0572   0.3771   0.1413   0.2034  -0.0248   0.2552  -0.0260   0.3647
 0.3   0.1436   0.2030  -0.0266   0.2505  -0.0578   0.3806   0.1477   0.2088  -0.0237   0.2511  -0.0479   0.3733
 0.7   0.1518   0.2117   0.0021   0.2495  -0.0041   0.3575   0.1421   0.2039  -0.0129   0.2413  -0.0126   0.3671
   1   0.0751   0.1567  -0.0265   0.2323  -0.0582   0.3499   0.1385   0.2036  -0.0201   0.2608  -0.0431   0.3671
 1.3  -0.1604   0.2210  -0.1813   0.2621  -0.1960   0.3381   0.1511   0.2108   0.0029   0.2683  -0.0195   0.3720
 1.7  -0.5879   0.6133  -0.5554   0.5946  -0.5425   0.6058   0.1383   0.1978   0.0115   0.2225   0.0040   0.3320
   2  -0.9022   0.9199  -0.8565   0.8878  -0.8322   0.8761   0.1407   0.2023   0.0088   0.2670   0.0192   0.3579

Panel B: $m = \lfloor n^{0.65} \rfloor$
-0.3   0.4058   0.4144   0.1595   0.2037   0.0584   0.1855   0.4074   0.4171   0.1597   0.2036   0.0624   0.1791
   0   0.4053   0.4138   0.1634   0.2103   0.0602   0.1916   0.4163   0.4276   0.1637   0.2112   0.0609   0.1937
 0.3   0.4026   0.4106   0.1597   0.2078   0.0658   0.1873   0.4114   0.4195   0.1692   0.2216   0.0722   0.1983
 0.7   0.3729   0.3816   0.1671   0.2121   0.0766   0.1898   0.3962   0.4049   0.1565   0.2064   0.0609   0.1853
   1   0.1873   0.2329   0.0985   0.1590   0.0338   0.1653   0.4045   0.4154   0.1732   0.2295   0.0707   0.2103
 1.3  -0.1653   0.2412  -0.1598   0.2219  -0.1681   0.2314   0.4058   0.4139   0.1935   0.2335   0.1281   0.2201
 1.7  -0.6165   0.6385  -0.5792   0.6080  -0.5606   0.5938   0.3895   0.3988   0.1600   0.2082   0.0842   0.2217
   2  -0.9361   0.9489  -0.8980   0.9193  -0.8765   0.9034   0.3864   0.3940   0.2500   0.3172   0.2115   0.3369

Panel C: $m = \lfloor n^{0.8} \rfloor$
-0.3   0.6635   0.6655   0.4595   0.4665   0.3122   0.3279   0.6687   0.6713   0.4595   0.4665   0.3122   0.3279
   0   0.6490   0.6510   0.4583   0.4648   0.3058   0.3202   0.6603   0.6623   0.4641   0.4721   0.3061   0.3209
 0.3   0.6367   0.6386   0.4640   0.4701   0.3132   0.3278   0.6406   0.6425   0.4731   0.4791   0.3226   0.3384
 0.7   0.5339   0.5415   0.4318   0.4393   0.3026   0.3175   0.6209   0.6232   0.4649   0.4713   0.3088   0.3244
   1   0.1922   0.2685   0.2135   0.2625   0.1693   0.2101   0.6120   0.6143   0.4876   0.4958   0.3438   0.3641
 1.3  -0.2298   0.2829  -0.1696   0.2415  -0.1630   0.2294   0.5840   0.5863   0.4694   0.4757   0.3290   0.3434
 1.7  -0.6783   0.6883  -0.6187   0.6384  -0.6017   0.6267   0.5514   0.5547   0.4572   0.4637   0.3158   0.3378
   2  -1.0016   1.0052  -0.9513   0.9584  -0.9386   0.9486   0.4845   0.4854   0.4534   0.4568   0.4315   0.4466

Notes: See the notes to Table 1.


Table 5: Simulation results for ARFIMA(1,d,0) with AR coefficient equal to 0.5 and n = 512.

      LW                LPW (r=1)         LPW (r=2)         ExtLW             ExtLPW (r=1)      ExtLPW (r=2)
   d     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE

Panel A: $m = \lfloor n^{0.5} \rfloor$
-0.3   0.0108   0.1404  -0.0426   0.2469  -0.0382   0.3678   0.0123   0.1369  -0.0305   0.2400  -0.0107   0.3774
   0   0.0088   0.1441  -0.0598   0.2682  -0.0818   0.3961   0.0086   0.1448  -0.0534   0.2642  -0.0529   0.3668
 0.3   0.0167   0.1470  -0.0621   0.2660  -0.0754   0.4006   0.0174   0.1501  -0.0616   0.2690  -0.0690   0.3854
 0.7   0.0295   0.1451  -0.0210   0.2485  -0.0165   0.3702   0.0131   0.1389  -0.0350   0.2460  -0.0219   0.3864
   1   0.0104   0.1251  -0.0399   0.2309  -0.0464   0.3604   0.0159   0.1391  -0.0450   0.2528  -0.0415   0.4000
 1.3  -0.1915   0.2349  -0.2109   0.2887  -0.2113   0.3579   0.0043   0.1407  -0.0547   0.2594  -0.0395   0.3600
 1.7  -0.5861   0.6113  -0.5580   0.5953  -0.5428   0.6002   0.0121   0.1422  -0.0217   0.2276  -0.0136   0.3212
   2  -0.9062   0.9227  -0.8612   0.8891  -0.8321   0.8728   0.0094   0.1363  -0.0286   0.2457  -0.0047   0.3275

Panel B: $m = \lfloor n^{0.65} \rfloor$
-0.3   0.0950   0.1221  -0.0070   0.1296  -0.0261   0.1750   0.0950   0.1221  -0.0052   0.1263  -0.0185   0.1624
   0   0.0916   0.1194  -0.0064   0.1260  -0.0234   0.1734   0.0916   0.1194  -0.0064   0.1259  -0.0226   0.1707
 0.3   0.0921   0.1191  -0.0060   0.1256  -0.0191   0.1741   0.0946   0.1240  -0.0049   0.1285  -0.0174   0.1776
 0.7   0.1027   0.1300   0.0168   0.1320   0.0027   0.1756   0.0913   0.1208  -0.0014   0.1262  -0.0146   0.1712
   1   0.0476   0.0872  -0.0110   0.1205  -0.0237   0.1710   0.0827   0.1130  -0.0104   0.1317  -0.0216   0.1833
 1.3  -0.1954   0.2285  -0.1931   0.2309  -0.1906   0.2428   0.0859   0.1181   0.0065   0.1407  -0.0026   0.1892
 1.7  -0.6226   0.6379  -0.5876   0.6102  -0.5697   0.5974   0.0808   0.1137  -0.0012   0.1286  -0.0093   0.1665
   2  -0.9521   0.9602  -0.9200   0.9340  -0.9012   0.9197   0.0823   0.1135  -0.0068   0.1350  -0.0134   0.1877

Panel C: $m = \lfloor n^{0.8} \rfloor$
-0.3   0.3043   0.3081   0.1112   0.1331   0.0402   0.1017   0.3043   0.3081   0.1112   0.1331   0.0404   0.1012
   0   0.2897   0.2937   0.1054   0.1274   0.0330   0.1019   0.2897   0.2937   0.1054   0.1274   0.0330   0.1019
 0.3   0.2804   0.2844   0.1064   0.1273   0.0364   0.0988   0.2892   0.2935   0.1072   0.1293   0.0364   0.0989
 0.7   0.2588   0.2632   0.1173   0.1385   0.0484   0.1105   0.2631   0.2673   0.1081   0.1292   0.0360   0.1002
   1   0.1107   0.1465   0.0676   0.0966   0.0255   0.0864   0.2497   0.2542   0.1088   0.1323   0.0366   0.1014
 1.3  -0.2351   0.2676  -0.1952   0.2263  -0.1958   0.2240   0.2460   0.2512   0.1322   0.1574   0.0644   0.1258
 1.7  -0.6751   0.6867  -0.6195   0.6366  -0.6049   0.6256   0.2178   0.2232   0.1103   0.1317   0.0378   0.1010
   2  -0.9966   1.0010  -0.9449   0.9528  -0.9317   0.9419   0.1978   0.2041   0.1068   0.1293   0.0332   0.1030

Notes: See the notes to Table 1.


[Figure 1 appears here. Panel A plots the Aaa, Baa, and Treas series; Panel B plots the sBaaAaa, sBaaTreas, and sAaaTreas series.]

Figure 1: Time series plot of log yields (Panel A) and their respective spreads (Panel B).

58

[Figure 2 plot omitted. Panel A: LW and LPW estimates with +/- 2 s.e. bands. Panel B: ExtLW and ExtLPW estimates with +/- 2 s.e. bands. Horizontal axis in both panels: bandwidth, 50 to 1950.]

Figure 2: Estimated long memory of log Aaa yield for bandwidth equal to 50 through 2000.

[Figure 3 plot omitted. Panels A and B. Horizontal axis in both panels: bandwidth, 50 to 1950.]

Figure 3: Estimated long memory of log Baa yield for bandwidth equal to 50 through 2000.

[Figure 4 plot omitted. Panels A and B. Horizontal axis in both panels: bandwidth, 50 to 1950.]

Figure 4: Estimated long memory of log Treasury yield for bandwidth equal to 50 through 2000.

[Figure 5 plot omitted. Panels A and B. Horizontal axis in both panels: bandwidth, 50 to 1950.]

Figure 5: Estimated long memory of Aaa spread over Treasury yield for bandwidth equal to 50 through 2000.

[Figure 6 plot omitted. Panels A and B. Horizontal axis in both panels: bandwidth, 50 to 1950.]

Figure 6: Estimated long memory of Baa spread over Treasury yield for bandwidth equal to 50 through 2000.

[Figure 7 plot omitted. Panels A and B. Horizontal axis in both panels: bandwidth, 50 to 1950.]

Figure 7: Estimated long memory of Baa spread over Aaa yield for bandwidth equal to 50 through 2000.


Local polynomial Whittle estimation of perturbed fractional processes*

Per Frederiksen
Equity Trading & Derivatives, Nordea Markets

Frank S. Nielsen†

Morten Ørregaard Nielsen
Queen's University and CREATES

March 20, 2009

Abstract

We propose a semiparametric local polynomial Whittle with noise (LPWN) estimator of the memory parameter in long memory time series perturbed by a noise term which may be serially correlated. The estimator approximates the spectrum of the perturbation as well as that of the short-memory component of the signal by two separate polynomials. Including these polynomials we obtain a reduction in the order of magnitude of the bias, but also inflate the asymptotic variance of the long memory estimate by a multiplicative constant. We show that the estimator is consistent for $d \in (0,1)$, asymptotically normal for $d \in (0,3/4)$, and if the spectral density is infinitely smooth near frequency zero, the rate of convergence can become arbitrarily close to the parametric rate, $\sqrt{n}$. A Monte Carlo study reveals that the LPWN estimator performs well in the presence of a serially correlated perturbation term. Furthermore, an empirical investigation of the 30 DJIA stocks shows that this estimator indicates stronger persistence in volatility than the standard local Whittle estimator.

JEL Classifications: C22.

Keywords: Bias reduction, local Whittle, long memory, perturbed fractional process, semiparametric estimation, stochastic volatility.

*We are grateful to Torben G. Andersen, Jörg Breitung, Niels Haldrup, Esben Høg, Asger Lunde, and the participants at SETA 2008 for valuable suggestions and comments. This work was partly done while P. Frederiksen was visiting Northwestern University, F. S. Nielsen was visiting Cornell University, and M. Ø. Nielsen was visiting Queen's University and the University of Aarhus; their hospitality is gratefully acknowledged. We are grateful for financial support from the Danish Social Sciences Research Council (grant no. FSE 275-05-0220) and the Center for Research in Econometric Analysis of Time Series (CREATES, funded by the Danish National Research Foundation).

†Please address correspondence to: Frank S. Nielsen, School of Economics and Management, Aarhus University, Universitetsparken building 1322, 8000 Aarhus, Denmark; phone: +45 8942 5419; e-mail: [email protected]

1 Introduction

We are interested in estimation of the memory parameter in a so-called perturbed fractional process,

$$z_t = y_t + w_t, \tag{1}$$

i.e. a signal-plus-noise model where the signal process $y_t$ is a long memory process with memory parameter $d$ which is perturbed by the additive noise term $w_t$. These processes have found extensive use in modeling the long memory characteristics of many observed time series. In particular, they are a version of the random walk plus noise or local level unobserved components model, e.g. Harvey (1989), except that the signal is a long memory process rather than a random walk.

Another motivation for the perturbed fractional process is the version of the long memory stochastic volatility (LMSV) model for financial returns proposed by Bollerslev & Jubinski (1999),

$$r_t = \sigma \sqrt{e^{y_t + x_t}}\, u_t, \tag{2}$$

where $r_t$ denotes the return, $y_t$ is the (long memory component of) log-volatility of the returns, $x_t$ is a short-memory process, and $y_t$, $x_t$, and $u_t$ are independent to satisfy the requirement that $E(r_t) = 0$. This generalizes the usual LMSV model introduced by Breidt, Crato & de Lima (1998) and Harvey (1998),

$$r_t = \sigma \sqrt{e^{y_t}}\, u_t. \tag{3}$$

The argument for the generalization is that allowing for different short-lived news impacts, while imposing a common long memory component, may provide a better characterization of the joint volume-volatility relationship in the context of the Mixture of Distributions Hypothesis, which asserts that stock returns and trading volumes are jointly dependent on the same underlying latent information arrival process. The formulation in (2) allows the volatility to be affected by both long and short-lived news impacts, which is also consistent with the findings of Liesenfeld (2001). It therefore seems natural that an estimator of the memory in $\log r_t^2$ should be able to incorporate both (2) and (3).

The LMSV models (2) and (3) imply that a logarithmic transformation of the squared returns series, $\log r_t^2$, becomes a long memory signal-plus-noise process (1) where the signal $y_t$ corresponds to (the long memory component of) the log-volatility of the original returns series and $w_t$ is an additive noise term. In the context of the LMSV model (3), $w_t$ is usually assumed to be i.i.d., but to allow for short-memory persistence in $w_t$ as implied by (2) we will not make that restriction here. In general, when $w_t$ is not assumed to be i.i.d., $z_t$ is referred to as a perturbed fractional process.¹ For reviews of fractionally integrated processes and some applications, see Baillie (1996), Henry & Zaffaroni (2003), or Robinson (2003). In particular, long memory in volatility has received considerable interest recently.²
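The mapping from the return equation (2) to the signal-plus-noise form (1) can be checked numerically. The sketch below is illustrative only: the function name `lmsv_log_squared_returns`, the value of `sigma`, and the placeholder series used for $y_t$ and $x_t$ are assumptions made for this example (in particular, the signal here is a scaled random walk, not a fractionally integrated process).

```python
import numpy as np

def lmsv_log_squared_returns(y, x, u, sigma=1.0):
    """Return (r, z) with r_t = sigma*sqrt(exp(y_t + x_t))*u_t and z_t = log(r_t^2)."""
    r = sigma * np.sqrt(np.exp(y + x)) * u
    z = np.log(r ** 2)
    return r, z

rng = np.random.default_rng(0)
n = 500
sigma = 0.02                                  # hypothetical scale parameter
y = 0.1 * np.cumsum(rng.standard_normal(n))   # placeholder persistent signal (assumption)
x = 0.5 * rng.standard_normal(n)              # placeholder short-memory component (assumption)
u = rng.standard_normal(n)

r, z = lmsv_log_squared_returns(y, x, u, sigma=sigma)

# log r_t^2 = log sigma^2 + y_t + (x_t + log u_t^2): a constant, the signal, and the noise
decomposed = np.log(sigma ** 2) + y + x + np.log(u ** 2)
print(np.allclose(z, decomposed))
```

Setting `x` to zero gives the corresponding decomposition for model (3), where the noise reduces to $\log u_t^2$.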

If we assume that the log-volatility process $\{y_t\}$ and the noise process $\{w_t\}$ are independent, the spectral density of $z_t$ can be written as

$$f_z(\lambda) = \lambda^{-2d}\phi_y(\lambda) + \phi_w(\lambda) = \lambda^{-2d} G \left( \frac{\phi_y(\lambda)}{\phi_y(0)} + \lambda^{2d}\frac{\phi_w(\lambda)}{\phi_y(0)} \right), \tag{4}$$

where $f_y(\lambda) = \lambda^{-2d}\phi_y(\lambda)$ is the spectrum of the signal $y_t$, $\phi_w(\lambda)$ is the spectrum of the noise term $w_t$, $G = \phi_y(0)$, and $d$ is the degree of long memory in $y_t$ (or equivalently in $z_t$).

¹ In the following we use the terms "long memory process" and "fractionally integrated process" or just "fractional process" synonymously, although strictly speaking a fractional process is just a particular form of a long memory process.

² See, e.g., Ding, Granger & Engle (1993), Baillie, Bollerslev & Mikkelsen (1996), Comte & Renault (1998), Ray & Tsay (2000), Andersen, Bollerslev, Diebold & Ebens (2001), Andersen, Bollerslev, Diebold & Labys (2001, 2003), Wright (2002), Hurvich & Ray (2003), and Arteche (2004), among others.

The assumption of independence between the processes $\{y_t\}$ and $\{w_t\}$ rules out the so-called leverage effect. This assumption is common in the so-called random walk plus noise unobserved components models, and has also been imposed by Breidt et al. (1998), Deo & Hurvich (2001), and Arteche (2004), among others, in the LMSV model. To accommodate the leverage effect, we could allow contemporaneous correlation, while the return process remains a martingale difference sequence, by replacing $y_t$ with $y_{t-1}$ in (2). An additional assumption of distributional symmetry around $(0,0)$ would imply that the spectral density decomposition in (4) holds, see Hurvich, Moulines & Soulier (2005). Alternatively, the model could be modified along the lines of model (P2) of Hurvich et al. (2005).

In semiparametric spectral estimation of long memory models, the spectrum (4) is typically approximated using the periodogram of the data near the zero frequency, i.e. for frequencies up to $\lambda_m = 2\pi m/n$ only, where $n$ is the sample size and $m$ is a user-chosen bandwidth number which tends to infinity more slowly than $n$, such that $\lambda_m \to 0$; see sections 2 and 3 below. Although the popular log-periodogram regression (LPR) estimator of Geweke & Porter-Hudak (1983) and Robinson (1995b) and the local Whittle (LW) estimator of Künsch (1987) and Robinson (1995a) both preserve consistency and asymptotic normality when applied to perturbed fractional processes, as shown recently by Deo & Hurvich (2001) and Arteche (2004), these estimators can be severely biased since they do not take the perturbation into account. Indeed, for non-perturbed processes (where $\phi_w(\lambda) = 0$) the bias of the standard semiparametric frequency domain estimators is of order $O(\lambda_m^{2})$, whereas the leading bias term when $\phi_w(\lambda) \neq 0$ is of order $O(\lambda_m^{2d})$. As shown in Deo & Hurvich (2001) and Arteche (2004), this bias is typically negative and can be very large (note that $d < 1$). Therefore, estimating long memory in perturbed time series can be a challenging task, and calls for an estimator which explicitly accounts for the perturbation.

Sun & Phillips (2003), Hurvich & Ray (2003), and Hurvich et al. (2005) have proposed such estimators with $\phi_y(\lambda)$ and $\phi_w(\lambda)$ approximated by constants as $\lambda \to 0$, see section 2 below. In contrast, we propose an estimator where both the spectrum of the perturbation and the spectrum of the short-memory component of the signal, i.e. $\phi_w(\lambda)$ and $\phi_y(\lambda)$, are approximated by polynomials $h_w(\theta_w, \lambda)$ and $h_y(\theta_y, \lambda)$ of (finite and even) orders $2R_w$ and $2R_y$ near the zero frequency, instead of constants, thereby obtaining a bias reduction depending on the smoothness of $\phi_w(\lambda)$ and $\phi_y(\lambda)$ near the origin. The approach taken here in modeling the short-run dynamics by a polynomial was introduced by Andrews & Sun (2004) for non-perturbed processes, but is novel in the context of perturbed fractional processes. To maintain generality, $\phi_w(\lambda)$ and $\phi_y(\lambda)$ are only characterized by regularity conditions near frequency zero instead of imposing specific functional forms.

The LMSV model (3) often assumes that the noise term is i.i.d., in which case $\phi_w(\lambda) = \sigma_w^2/(2\pi)$ is a constant. This case is of independent interest and is considered in simulations and in an empirical study in Frederiksen & Nielsen (2008). In that paper $\phi_y(\lambda)$ is approximated by a polynomial and $\phi_w(\lambda)$ by a constant as $\lambda \to 0$, thus focusing on exactly the LMSV model (3). However, the theory for their estimator is developed in the present paper.

Thus, to allow serial dependence in the noise as in (2) above we include both polynomials, $h_y(\theta_y, \lambda)$ and $h_w(\theta_w, \lambda)$. Furthermore, empirical studies have typically found that the noise term has much higher (long-run) variance than the short-memory component of the signal. Indeed, Breidt et al. (1998) and Hurvich & Ray (2003) find that the noise term may be as much as 10 or 20 times as variable as the short-memory component of the signal. Thus, careful modeling of the noise term is important, and this consideration has led us to approximate the spectrum of the noise term by a polynomial instead of a constant as $\lambda \to 0$.

Our results show that introducing $h_y(\theta_y, \lambda)$ and $h_w(\theta_w, \lambda)$ inflates the asymptotic variance of the long memory estimator, $\hat d$, by a multiplicative constant which depends on the true long memory parameter, $d$. However, the inflation decreases when $d$ increases, and we obtain a reduction in the order of magnitude of the bias if $\phi(\lambda)$ is sufficiently smooth near frequency zero. We show that the estimator is consistent for $d \in (0,1)$, asymptotically normal for $d \in (0,3/4)$, and if $\phi(\lambda)$ is infinitely smooth near frequency zero, the rate of convergence can become arbitrarily close to the parametric rate, $n^{1/2}$. This constitutes a rate of convergence improvement relative to Sun & Phillips (2003), Hurvich & Ray (2003), and Hurvich et al. (2005), who are only able to obtain a semiparametric rate of convergence $m^{1/2}$, which is much slower than the parametric rate due to the minimal requirement that $m/n \to 0$.

We present the results of a Monte Carlo study which shows the usefulness of the proposed LPWN estimator. Compared to standard estimators, such as Hurvich & Ray's (2003) local Whittle with noise (LWN) estimator, the LPWN estimator is able to achieve considerable bias reductions in practice, especially in cases with short-run dynamics in both the signal and noise components. We also include an empirical application to the 30 DJIA stocks where the LPWN estimator indicates stronger persistence in volatility than the standard estimators, and for most of the stocks produces estimates of $d$ in the nonstationary region.

The remainder of the paper is organized as follows. In the next section we discuss semiparametric spectral estimation of long memory for perturbed processes and formally define the proposed local Whittle estimator. In section 3 we establish consistency and asymptotic normality of the estimator. Section 4 investigates the finite sample performance in simulations, and section 5 presents an empirical study of daily log-squared returns series of the 30 DJIA stocks. Section 6 concludes. The proofs of our theorems are gathered in the appendix.

2 Local Whittle estimation of perturbed fractional processes

Semiparametric frequency-domain estimators are essentially based on the local approximation

$$f_z(\lambda) \sim G\lambda^{-2d} \quad \text{as } \lambda \to 0, \tag{5}$$

where $G$ is a constant and the symbol "$\sim$" means that the ratio of the left and right hand sides tends to one in the limit. Thus, the estimators enjoy robustness to short-run dynamics, since they use only information from periodogram ordinates in the vicinity of the origin.

The local Whittle (LW) estimation method by Künsch (1987) and Robinson (1995a) has become popular because of its likelihood interpretation, nice asymptotic properties, and mild assumptions. It is defined as the minimizer of the (negative) local Whittle likelihood function

$$Q(G, d) = \frac{1}{m}\sum_{j=1}^{m}\left[\log\left(G\lambda_j^{-2d}\right) + \frac{I_z(\lambda_j)}{G\lambda_j^{-2d}}\right], \tag{6}$$

where $m = m(n)$ is a bandwidth number which tends to infinity as $n \to \infty$ but at a slower rate than $n$, $\lambda_j = 2\pi j/n$ are the Fourier frequencies, and $I_z(\lambda) = (2\pi n)^{-1}\left|\sum_{t=1}^{n} z_t e^{it\lambda}\right|^2$ is the periodogram of $z_t$. Note that the estimator is invariant to a non-zero mean since $j = 0$ is left out of the minimization. Concentrating (6) with respect to $G$, the estimator of $d$ is

$$\hat d_{LW} = \arg\min_{d}\left[\log \hat G(d) - 2d\,\frac{1}{m}\sum_{j=1}^{m}\log\lambda_j\right], \qquad \hat G(d) = \frac{1}{m}\sum_{j=1}^{m}\lambda_j^{2d} I_z(\lambda_j).$$

It was shown by Robinson (1995a) that

$$\sqrt{m}\,(\hat d_{LW} - d) \xrightarrow{d} N(0, 1/4), \tag{7}$$

and later by Velasco (1999) that the range of consistency is $d \in (-1/2, 1]$ and the range of asymptotic normality is $d \in (-1/2, 3/4)$.
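The concentrated local Whittle objective can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names, the grid search, and the grid limits are choices made here, and a numerical optimizer could be substituted for the grid.

```python
import numpy as np

def periodogram(z, m):
    """Fourier frequencies lambda_j = 2*pi*j/n and I_z(lambda_j), j = 1..m."""
    z = np.asarray(z, dtype=float)
    n = z.size
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n
    # The FFT uses the opposite sign convention and t = 0..n-1; the modulus is unaffected.
    dft = np.fft.fft(z)[1:m + 1]
    return lam, np.abs(dft) ** 2 / (2.0 * np.pi * n)

def lw_objective(d, lam, I):
    """log G_hat(d) - 2 d mean(log lambda_j), with G_hat(d) = mean(lambda_j^{2d} I_j)."""
    g_hat = np.mean(lam ** (2.0 * d) * I)
    return np.log(g_hat) - 2.0 * d * np.mean(np.log(lam))

def lw_estimate(lam, I, grid=None):
    """Grid-search minimizer of the concentrated objective (grid is an assumption)."""
    if grid is None:
        grid = np.linspace(-0.49, 1.0, 1491)   # step 0.001
    values = np.array([lw_objective(d, lam, I) for d in grid])
    return float(grid[np.argmin(values)])

# Sanity check on an exact power law I_j = G * lambda_j^{-2 d}: the objective
# is minimized at the true d (here 0.3), by Jensen's inequality.
n, m = 1024, 64
lam = 2.0 * np.pi * np.arange(1, m + 1) / n
I_exact = 3.0 * lam ** (-2.0 * 0.3)
d_hat = lw_estimate(lam, I_exact)
print(d_hat)
```

With real data one would call `lam, I = periodogram(z, m)` first. Because $j = 0$ is skipped, adding a constant to `z` leaves the ordinates unchanged up to rounding.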

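To reduce the asymptotic bias of the standard LW estimator, Andrews & Sun (2004) have suggested replacing the constant, $\log G$, in (6) by the polynomial $\log G - \sum_{r=1}^{R}\theta_r\lambda_j^{2r}$, that is, modeling the logarithm of the spectral density of the short-run component by a polynomial instead of a constant in the vicinity of the origin. This leads to the following (negative) likelihood function,

$$Q(G, d, \theta) = \frac{1}{m}\sum_{j=1}^{m}\left[\log\left(\lambda_j^{-2d} G \exp\left(-\sum_{r=1}^{R}\theta_r\lambda_j^{2r}\right)\right) + \frac{I_z(\lambda_j)}{\lambda_j^{-2d} G \exp\left(-\sum_{r=1}^{R}\theta_r\lambda_j^{2r}\right)}\right],$$

such that

$$(\hat d_{LPW}, \hat\theta) = \arg\min_{d \in (-1/2, 1/2),\, \theta \in \Theta}\left[\log \hat G(d, \theta) - 2d\,\frac{1}{m}\sum_{j=1}^{m}\log\lambda_j - \frac{1}{m}\sum_{j=1}^{m}\sum_{r=1}^{R}\theta_r\lambda_j^{2r}\right],$$

$$\hat G(d, \theta) = \frac{1}{m}\sum_{j=1}^{m}\lambda_j^{2d}\exp\left(\sum_{r=1}^{R}\theta_r\lambda_j^{2r}\right) I_z(\lambda_j),$$

where $\Theta$ is a compact and convex set in $\mathbb{R}^{R}$. As shown by Andrews & Sun (2004), this method does, however, increase the asymptotic variance of $\hat d$ in (7) by a multiplicative constant.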

For non-perturbed fractional processes, the asymptotic bias of $\hat d_{LW}$ and $\hat d_{LPW}$ is of order $O(\lambda_m^{2})$ and $O(\lambda_m^{\min\{s, 2+2R\}})$, respectively, where $s$ is a measure of the smoothness of the spectral density near frequency zero, see below. However, for perturbed fractional processes the bias is of order $O(\lambda_m^{2d})$ and, as shown by e.g. Hurvich & Ray (2003) and Arteche (2004), this bias is typically negative and can be very severe.

For perturbed fractional processes we have the spectral representation (4) rather than (5). There are two main consequences: first, the extra additive term in (4) needs to be taken into account to avoid serious asymptotic bias as mentioned above, and second, the rate of convergence of the estimators is reduced if the extra term is not modeled. The latter follows because the choice of bandwidth parameter is severely constrained for perturbed fractional processes when the perturbation term in (4) is not modeled. Thus, for non-perturbed processes the bandwidth requirement is typically $m = o(n^{4/5})$, whereas for perturbed processes it is $m = o(n^{4d/(1+4d)})$ (apart from logarithmic terms), in line with the condition $m^{4d+1}n^{-4d} \to 0$ used in section 3. Since $d \le 1$ and the estimator is $\sqrt{m}$-consistent this is a serious constraint.

To allow for (moderate) nonstationarity in volatility we generalize (1) as

$$z_t = \begin{cases} y_t + w_t & \text{if } d \in (0, 1/2), \\ \sum_{s=1}^{t} x_s + w_t & \text{if } d \in [1/2, 1), \end{cases} \tag{8}$$

where, if $d \in [1/2, 1)$, $x_t$ has spectrum of the form $f_x(\lambda) = \lambda^{-2d_x}\phi_x(\lambda)$ with memory parameter $d_x = d - 1$. Defining $y_t = \sum_{s=1}^{t} x_s$ if $d \in [1/2, 1)$, this approach allows $z_t = y_t + w_t$ to possibly be nonstationary with memory parameter $d \in (0, 1)$. Velasco (1999), Hurvich & Ray (2003), and Hurvich et al. (2005) also assume this type of process. Since $\{\sum_{s=1}^{t} x_s\}$ is nonstationary,³ $z_t$ does not have a spectral density if $d \in [1/2, 1)$ but it has a pseudo spectral density, see e.g. Hurvich & Ray (1995) and Velasco (1999). Thus, we may define

$$f_z(\lambda) = \begin{cases} f_y(\lambda) + f_w(\lambda) & \text{if } d \in (0, 1/2), \\ \left|1 - e^{i\lambda}\right|^{-2} f_x(\lambda) + f_w(\lambda) & \text{if } d \in [1/2, 1), \end{cases} \;=\; \lambda^{-2d} G \left( \frac{\phi_y(\lambda)}{\phi_y(0)} + \lambda^{2d}\frac{\phi_w(\lambda)}{\phi_y(0)} \right), \tag{9}$$

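where we maintain the assumption of independence between $\{y_t\}$ and $\{w_t\}$.

Taking (9) into account we propose to approximate (4) locally near the zero frequency by⁴

$$g(\lambda) = \lambda^{-2d} G \left(1 + h_y(\theta_y, \lambda) + \lambda^{2d} h_w(\theta_w, \lambda)\right), \tag{10}$$

where $h_y(\theta_y, \lambda) = \sum_{r=1}^{R_y}\theta_{y,r}\lambda^{2r}$ and $h_w(\theta_w, \lambda) = \sum_{r=0}^{R_w}\theta_{w,r}\lambda^{2r}$. If $R_y = 0$ we set $h_y(\theta_y, \lambda) = 0$. Defining also the polynomial $h(d, \theta, \lambda) = h_y(\theta_y, \lambda) + \lambda^{2d} h_w(\theta_w, \lambda)$ with $\theta = (\theta_y', \theta_w')'$, this yields the (concentrated) likelihood

$$Q(d, \theta) = \log \hat G(d, \theta) + \frac{1}{m}\sum_{j=1}^{m}\log\left(\lambda_j^{-2d}\left(1 + h(d, \theta, \lambda_j)\right)\right), \tag{11}$$

$$\hat G(d, \theta) = \frac{1}{m}\sum_{j=1}^{m}\frac{\lambda_j^{2d} I_z(\lambda_j)}{1 + h(d, \theta, \lambda_j)}. \tag{12}$$

Thus, we propose to minimize (11) over the admissible set $D \times \Theta$,

$$(\hat d, \hat\theta) = \arg\min_{(d, \theta) \in D \times \Theta} Q(d, \theta),$$

where $\Theta$ is a compact and convex set in $\mathbb{R}^{R+1}$, $R = R_y + R_w$, and $D = [d_1, d_2]$ with $0 < d_1 < d_2 < 1$. We call this estimator the local polynomial Whittle with noise (LPWN) estimator.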
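The concentrated objective in (11) and (12) can be written down directly. The following is a hedged sketch: the function names and argument layout are assumptions of this example, the minimization over $D \times \Theta$ is left to whatever numerical optimizer the reader prefers, and returning infinity when $1 + h \le 0$ is a guard added here, not something stated in the text.

```python
import numpy as np

def h_poly(d, theta_y, theta_w, lam):
    """h = sum_{r=1}^{Ry} theta_y[r-1] lam^{2r} + lam^{2d} sum_{r=0}^{Rw} theta_w[r] lam^{2r}."""
    hy = sum(t * lam ** (2 * r) for r, t in enumerate(theta_y, start=1))
    hw = sum(t * lam ** (2 * r) for r, t in enumerate(theta_w, start=0))
    return hy + lam ** (2.0 * d) * hw

def lpwn_objective(d, theta_y, theta_w, lam, I):
    """Q(d, theta) = log G_hat + mean(log(lam^{-2d} (1 + h))), cf. (11)-(12)."""
    one_plus_h = 1.0 + h_poly(d, theta_y, theta_w, lam)
    if np.any(one_plus_h <= 0.0):
        return np.inf                      # guard added in this sketch
    g_hat = np.mean(lam ** (2.0 * d) * I / one_plus_h)
    return np.log(g_hat) + np.mean(np.log(lam ** (-2.0 * d) * one_plus_h))

# Illustration with an exact "periodogram" equal to g(lambda) in (10);
# all numerical values below are made up for the example.
n, m = 4096, 256
lam = 2.0 * np.pi * np.arange(1, m + 1) / n
G0, d0, th_y0, th_w0 = 1.5, 0.4, [-0.5], [2.0, 1.0]
I_exact = G0 * lam ** (-2.0 * d0) * (1.0 + h_poly(d0, th_y0, th_w0, lam))

q_true = lpwn_objective(d0, th_y0, th_w0, lam, I_exact)
q_lw = lpwn_objective(d0, [], [0.0], lam, I_exact)      # h = 0: plain LW objective
q_lwn = lpwn_objective(d0, [], [2.0], lam, I_exact)     # Ry = Rw = 0
print(q_true <= q_lw, q_true <= q_lwn)
```

Passing an empty `theta_y` corresponds to $R_y = 0$, and `theta_w` must always contain at least the constant term $\theta_{w,0}$.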

³ In the nonstationary case, $\{\sum_{s=1}^{t} x_s\}$ is a type I fractional process in the terminology of Marinucci & Robinson (1999).

⁴ Note that $\phi_y(\lambda)$ and $\phi_w(\lambda)$ are symmetric around $\lambda = 0$ and are therefore approximated by even polynomials.

Note that $h(d, \theta, \lambda) = 0$ is the standard local Whittle specification in (6), which does not explicitly account for the perturbation. For $R_y = R_w = 0$ we get $h(d, \theta, \lambda) = \lambda^{2d}\theta_{w,0}$, where $\phi_y(\lambda)$ and $\phi_w(\lambda)$ in (4) are both modeled locally by constants. This is the local Whittle with noise (LWN) estimator of Hurvich & Ray (2003) and Hurvich et al. (2005) (parameterization (P1)). Thus, our model parameterization includes the standard LW estimator and the LWN estimator as special cases. Furthermore, the model with $R_w = 0$, where the noise is modeled by a constant near the zero frequency, is analyzed empirically and in simulations by Frederiksen & Nielsen (2008), using the asymptotic theory provided in this paper.

3 Asymptotic properties

In this section we first introduce the assumptions needed to establish consistency and asymptotic normality of the proposed estimator for the perturbed fractional process, and subsequently we present the main results in two theorems. In the following, true values of the parameters are denoted by subscript zero and $\lfloor x \rfloor$ denotes the integer part of a real number $x$. We also define a function $\phi(\lambda)$ to be smooth of order $s$ at $\lambda = 0$ if, in a neighborhood of $\lambda = 0$, $\phi(\lambda)$ is $\lfloor s \rfloor$ times continuously differentiable with $\lfloor s \rfloor$-th derivative, $\phi^{(\lfloor s \rfloor)}$, satisfying $|\phi^{(\lfloor s \rfloor)}(\lambda) - \phi^{(\lfloor s \rfloor)}(0)| \le C|\lambda|^{s - \lfloor s \rfloor}$ for some constant $C < \infty$. To simplify the presentation, we list only one set of assumptions even though these could be relaxed somewhat for the consistency proof, see e.g. Hurvich et al. (2005).

A1 The noise process $\{w_t\}$ is independent of the signal process $\{y_t\}$.

A2 The spectral density of $z_t$ is $f_z(\lambda) = \lambda^{-2d_0} G_0 \frac{\phi_y(\lambda)}{\phi_y(0)} + \phi_w(\lambda)$, where $\phi_y(\lambda)$ and $\phi_w(\lambda)$ are real, even, positive, continuous functions on $[-\pi, \pi)$ and $d_0 \in D = [d_1, d_2]$ with $0 < d_1 < d_2 < 1$.

A3 The functions $\phi_y(\lambda)$ and $\phi_w(\lambda)$ are smooth of orders $s_y$ and $s_w$ at $\lambda = 0$, where $s_y > 2R_y$, $s_w > 2R_w$, and $s_y, s_w \ge 1$.

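Assumption A1 is the independence assumption used above to write the spectral density of $z_t$ as the sum of the (pseudo) spectral densities of $y_t$ and $w_t$. Assumption A3 is a smoothness condition on the functions $\phi_y(\lambda)$ and $\phi_w(\lambda)$ similar to that applied by Andrews & Sun (2004). Note that Assumption A3 holds for all $s_y < \infty$ when, e.g., $y_t$ is a finite order ARFIMA process, and for all $s_w < \infty$ when, e.g., $w_t$ is a finite order ARMA process. Under Assumption A3 we establish the following Taylor series expansions of $\phi_y(\lambda)$ and $\phi_w(\lambda)$ around $\lambda = 0$ (recall that odd order derivatives of even functions are zero at frequency zero),

$$\frac{\phi_y(\lambda)}{\phi_y(0)} = 1 + \sum_{r=1}^{\lfloor s_y/2 \rfloor}\theta_{y,r}\lambda^{2r} + O(\lambda^{s_y}) = 1 + h_y(\theta_y, \lambda) + O(\lambda^{\min\{s_y, 2+2R_y\}}) \quad \text{as } \lambda \to 0,$$

and

$$\frac{\phi_w(\lambda)}{\phi_y(0)} = \frac{\phi_w(0)}{\phi_y(0)} + \sum_{r=1}^{\lfloor s_w/2 \rfloor}\theta_{w,r}\lambda^{2r} + O(\lambda^{s_w}) = h_w(\theta_w, \lambda) + O(\lambda^{\min\{s_w, 2+2R_w\}}) \quad \text{as } \lambda \to 0,$$

where $\theta_{y,r} = \frac{1}{(2r)!\,\phi_y(0)}\left.\frac{\partial^{2r}}{\partial\lambda^{2r}}\phi_y(\lambda)\right|_{\lambda=0}$ and $\theta_{w,r} = \frac{1}{(2r)!\,\phi_y(0)}\left.\frac{\partial^{2r}}{\partial\lambda^{2r}}\phi_w(\lambda)\right|_{\lambda=0}$. Hence, the approximation (10) to (9) satisfies

$$\log\left(\frac{f_z(\lambda)}{g(\lambda)}\right) = \log\left(\frac{\phi_y(\lambda)}{\phi_y(0)} + \lambda^{2d}\frac{\phi_w(\lambda)}{\phi_y(0)}\right) - \log\left(1 + h(d, \theta, \lambda)\right) = \log\left(1 + \frac{O(\lambda^{\min\{s_y, 2+2R_y\}}) + \lambda^{2d} O(\lambda^{\min\{s_w, 2+2R_w\}})}{1 + h(d, \theta, \lambda)}\right) \quad \text{as } \lambda \to 0,$$

$$\frac{f_z(\lambda)}{g(\lambda)} = 1 + O(\lambda^{\min\{s_y, 2+2R_y\}}) + \lambda^{2d} O(\lambda^{\min\{s_w, 2+2R_w\}}) \quad \text{as } \lambda \to 0, \tag{13}$$

and the true values of $G$ and $\theta$ are $G_0 = \phi_y(0)$ and $\theta_0 = (\theta_{0,1}, \ldots, \theta_{0,R+1})'$, where

$$\theta_{0,r} = \frac{1}{(2r)!\,\phi_y(0)}\left.\frac{\partial^{2r}}{\partial\lambda^{2r}}\phi_y(\lambda)\right|_{\lambda=0}, \quad r = 1, \ldots, R_y,$$

$$\theta_{0,R_y+r+1} = \frac{1}{(2r)!\,\phi_y(0)}\left.\frac{\partial^{2r}}{\partial\lambda^{2r}}\phi_w(\lambda)\right|_{\lambda=0}, \quad r = 0, \ldots, R_w.$$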
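As a hedged worked illustration of these coefficient formulas, suppose (purely as an assumption for this example) that the short-memory component of the signal behaves like an AR(1) with coefficient $\rho$ and innovation variance $\sigma^2$, and that the noise is white with variance $\sigma_w^2$. Neither choice is imposed by the text, and the symbols $\rho$, $\sigma^2$ are introduced only here. The first polynomial coefficients then follow by differentiating twice at the origin:

```latex
\begin{aligned}
\phi_y(\lambda) &= \frac{\sigma^2}{2\pi}\,\frac{1}{1 - 2\rho\cos\lambda + \rho^2},
\qquad
\phi_y(0) = \frac{\sigma^2}{2\pi(1-\rho)^2}, \\
\theta_{y,1} &= \frac{1}{2!\,\phi_y(0)}\,
\left.\frac{\partial^2}{\partial\lambda^2}\phi_y(\lambda)\right|_{\lambda=0}
= -\frac{\rho}{(1-\rho)^2}, \\
\theta_{w,0} &= \frac{\phi_w(0)}{\phi_y(0)} = \frac{\sigma_w^2\,(1-\rho)^2}{\sigma^2},
\qquad
\theta_{w,r} = 0 \quad (r \ge 1).
\end{aligned}
```

For instance, $\rho = 0.8$ gives $\theta_{y,1} = -20$, which suggests why a constant approximation of $\phi_y(\lambda)$ can be poor unless $\lambda_m$ is very small.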

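A4 (a) The signal $y_t$ has zero mean and admits an infinite order moving average representation $y_t = \sum_{j=0}^{\infty}\alpha_j\varepsilon_{t-j}$ (stationary case) or $\Delta y_t = x_t = \sum_{j=0}^{\infty}\alpha_j\varepsilon_{t-j}$ (nonstationary case), where $\sum_{j=0}^{\infty}\alpha_j^2 < \infty$ and $\varepsilon_t$ satisfies, for all $t$, $E(\varepsilon_t | \mathcal{F}_{t-1}) = 0$, $E(\varepsilon_t^2 | \mathcal{F}_{t-1}) = 1$, $E(\varepsilon_t^3 | \mathcal{F}_{t-1}) = \mu_3 < \infty$, and $E(\varepsilon_t^4 | \mathcal{F}_{t-1}) = \mu_4 < \infty$ almost surely, where $\mathcal{F}_{t-1}$ is the $\sigma$-field generated by $\{\varepsilon_s, s < t\}$.

(b) There exists a random variable $\varepsilon$ with $E(\varepsilon^2) < \infty$ such that for all $\eta > 0$ and some $K > 0$, $P(|\varepsilon_t| > \eta) < K P(|\varepsilon| > \eta)$.

(c) In a neighborhood of the origin, $\frac{\partial}{\partial\lambda}\alpha(\lambda) = O(|\alpha(\lambda)|/\lambda)$ as $\lambda \to 0$, where $\alpha(\lambda) = \sum_{k=0}^{\infty}\alpha_k e^{ik\lambda}$.

A5 (a) The noise $w_t$ has zero mean and admits an infinite order moving average representation $w_t = \sum_{j=0}^{\infty}\beta_j\xi_{t-j}$, where $\sum_{j=0}^{\infty}\beta_j^2 < \infty$ and $\xi_t$ satisfies, for all $t$, $E(\xi_t | \mathcal{F}_{t-1}) = 0$, $E(\xi_t^2 | \mathcal{F}_{t-1}) = 1$, $E(\xi_t^3 | \mathcal{F}_{t-1}) = \mu_3 < \infty$, and $E(\xi_t^4 | \mathcal{F}_{t-1}) = \mu_4 < \infty$ almost surely, where $\mathcal{F}_{t-1}$ is the $\sigma$-field generated by $\{\xi_s, s < t\}$.

(b) There exists a random variable $\varepsilon$ with $E(\varepsilon^2) < \infty$ such that for all $\eta > 0$ and some $K > 0$, $P(|\xi_t| > \eta) < K P(|\varepsilon| > \eta)$.

(c) In a neighborhood of the origin, $\frac{\partial}{\partial\lambda}\beta(\lambda) = O(|\beta(\lambda)|/\lambda)$ as $\lambda \to 0$, where $\beta(\lambda) = \sum_{k=0}^{\infty}\beta_k e^{ik\lambda}$.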

Since our estimator is a function of the periodogram at nonzero frequencies only, we assume without loss of generality⁵ that the signal process $y_t$ has zero mean. Importantly, Assumptions A4 and A5 allow for non-Gaussian processes. Note that Assumptions A1-A4 plus the assumption that $w_t$ is white noise with finite fourth moment imply the assumptions needed on $y_t$ and $w_t$ to prove consistency and asymptotic normality (if, in addition, $d_2 < 3/4$) of the LWN estimator of Hurvich & Ray (2003). It follows from Theorems 1 and 2 below that their results for the LWN estimator are also valid for our more general assumptions on $w_t$ in Assumption A5.

⁵ In the nonstationary case the zero mean assumption implies that $z_t$ is free of linear trends, which does entail a loss of generality in that case.

A6 $\Theta$ is a compact and convex subset of $\mathbb{R}^{R+1}$ and $\theta_0$ lies in the interior of $\Theta$.

We are now ready to prove consistency of our estimator. As mentioned above, some of our assumptions could be relaxed somewhat to prove this theorem, but we have preferred to list only one set of assumptions which will be used also for the proof of asymptotic normality below. The proofs of both theorems are given in the appendix.

Theorem 1 If Assumptions A1-A6 hold and the bandwidth $m = m(n)$ is such that

$$\frac{1}{m} + \frac{m}{n} \to 0, \tag{14}$$

then $\hat d - d_0 = o_P((\log n)^{-5})$.

Note that the theorem proves consistency only for the estimator of the memory parameter (at logarithmic rate). There is no proof of consistency for the estimators of the polynomial parameters in $\theta$. The strategy of proof in Hurvich et al. (2005) would next require a separate proof of consistency of the polynomial parameters; however, we follow instead the method of proof in Andrews & Sun (2004), which does not require an intermediate result on the consistency of $\hat\theta$. Thus, we give next the joint asymptotic normality of $\hat d$ and $\hat\theta$.

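Theorem 2 Let Assumptions A1-A6 hold with $d_0$ in the interior of $D = [d_1, d_2]$, $0 < d_1 < d_2 < 3/4$, and suppose the bandwidth $m = m(n)$ is such that

$$\frac{m^{1+4R_y}}{n^{4R_y}} + \frac{m^{1+4(d_0+R_w)}}{n^{4(d_0+R_w)}} \to \infty \quad \text{and} \quad \frac{m^{2\varphi_y+1}}{n^{2\varphi_y}} + \frac{m^{2\varphi_w+4d_0+1}}{n^{2\varphi_w+4d_0}} \to 0, \tag{15}$$

where $\varphi_a = \min\{s_a, 2+2R_a\}$, $a = y, w$. Then $\hat d$ and $\hat\theta$ are both consistent and

$$B_n\begin{pmatrix}\hat d - d_0 \\ \hat\theta - \theta_0\end{pmatrix} \xrightarrow{d} N\left(0, \Omega_{R_y,R_w}^{-1}\right), \qquad \Omega_{R_y,R_w} = \begin{pmatrix} 4 & \mu_{R_y}' & \mu_{R_w}' \\ \mu_{R_y} & \Gamma_{R_y} & \Psi_{R_w,R_y}' \\ \mu_{R_w} & \Psi_{R_w,R_y} & \Psi_{R_w} \end{pmatrix},$$

where $B_n = B_n(d_0)$ is the $(R+2) \times (R+2)$ deterministic diagonal matrix with diagonal elements

$$(B_n)_{11} = \sqrt{m}, \qquad (B_n)_{k+1,k+1} = \sqrt{m}\,\lambda_m^{2k} \ \text{ for } k = 1, \ldots, R_y,$$

$$\text{and} \quad (B_n)_{k+R_y+2,\,k+R_y+2} = \sqrt{m}\,\lambda_m^{2d_0+2k} \ \text{ for } k = 0, \ldots, R_w;$$

$\mu_{R_y}$ and $\mu_{R_w} = \mu_{R_w}(d_0)$ are the vectors

$$(\mu_{R_y})_k = \frac{-4k}{(1+2k)^2} \ \text{ for } k = 1, \ldots, R_y \quad \text{and} \quad (\mu_{R_w})_{k+1} = \frac{-4(d_0+k)}{(1+2d_0+2k)^2} \ \text{ for } k = 0, \ldots, R_w;$$

$\Gamma_{R_y}$ and $\Psi_{R_w} = \Psi_{R_w}(d_0)$ are the $R_y \times R_y$ and $(R_w+1) \times (R_w+1)$ matrices

$$(\Gamma_{R_y})_{ik} = \frac{4ik}{(1+2i+2k)(1+2i)(1+2k)} \ \text{ for } i, k = 1, \ldots, R_y,$$

$$(\Psi_{R_w})_{i+1,k+1} = \frac{4(d_0+i)(d_0+k)}{(1+2i+2k+4d_0)(1+2i+2d_0)(1+2k+2d_0)} \ \text{ for } i, k = 0, \ldots, R_w;$$

and $\Psi_{R_w,R_y} = \Psi_{R_w,R_y}(d_0)$ is the $(R_w+1) \times R_y$ matrix

$$(\Psi_{R_w,R_y})_{i+1,k} = \frac{4k(d_0+i)}{(1+2d_0+2k+2i)(1+2d_0+2i)(1+2k)} \ \text{ for } i = 0, \ldots, R_w; \ k = 1, \ldots, R_y.$$

If $R_y = R_w = 0$ define $\Omega_{0,0} = \begin{pmatrix} 4 & \mu_0' \\ \mu_0 & \Psi_0 \end{pmatrix}$.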
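The entries of the limiting matrix in Theorem 2 appear to share one pattern: writing $a$ for the exponent attached to each polynomial term ($a = 2k$ for the signal terms and $a = 2d_0 + 2k$ for the noise terms), the first row and column are $-2a/(1+a)^2$ and the remaining block is $a_i a_k / ((1+a_i+a_k)(1+a_i)(1+a_k))$. The sketch below builds the matrix on that basis and reads off the asymptotic variance of $\sqrt{m}(\hat d - d_0)$ as the top-left element of the inverse. The function names are illustrative, and the closed form used as a check for $R_y = R_w = 0$ is derived here from the theorem's formulas rather than quoted from the text.

```python
import numpy as np

def omega_matrix(d0, Ry, Rw):
    """Limiting matrix of Theorem 2 for given d0 and polynomial orders (Ry, Rw)."""
    exps = [2.0 * k for k in range(1, Ry + 1)]                 # signal terms
    exps += [2.0 * d0 + 2.0 * k for k in range(0, Rw + 1)]     # noise terms
    a = np.array(exps)
    mu = -2.0 * a / (1.0 + a) ** 2
    block = np.outer(a, a) / ((1.0 + a[:, None] + a[None, :]) * np.outer(1.0 + a, 1.0 + a))
    omega = np.empty((a.size + 1, a.size + 1))
    omega[0, 0] = 4.0
    omega[0, 1:] = mu
    omega[1:, 0] = mu
    omega[1:, 1:] = block
    return omega

def avar_d(d0, Ry, Rw):
    """Asymptotic variance of sqrt(m) * (d_hat - d0): top-left element of the inverse."""
    return float(np.linalg.inv(omega_matrix(d0, Ry, Rw))[0, 0])

d0 = 0.4
v_lwn = avar_d(d0, 0, 0)     # constants only (Ry = Rw = 0)
v_lpwn = avar_d(d0, 1, 1)    # one extra term in each polynomial
print(v_lwn, v_lpwn, v_lpwn / v_lwn)
```

For $R_y = R_w = 0$ the partitioned-inverse formula reduces to $(1+2d_0)^2/(16 d_0^2)$, which decreases in $d_0$ and can be compared with the value $1/4$ in (7).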

First of all, we note that by setting $R_y = R_w = 0$ we obtain as a special case the results for the LWN estimator of Hurvich & Ray (2003). Secondly, the leading $(R_y+1) \times (R_y+1)$ submatrix of $\Omega_{R_y,R_w}$ is the same as that obtained by Andrews & Sun (2004). Third, we note that the asymptotic variance of $\sqrt{m}(\hat d - d_0)$ is free of the polynomial parameters $\theta_0$, but it depends on $d_0$. Moreover, the use of the polynomials $h_y(\theta_y, \lambda)$ and $h_w(\theta_w, \lambda)$ increases the asymptotic variance of $\hat d$ by a multiplicative constant compared to the LWN estimator of Hurvich & Ray (2003) (easily seen by use of the formula for the inverse of a partitioned matrix). Andrews & Sun (2004) obtain a similar result for their local polynomial Whittle (LPW) estimator in a non-volatility model.

The first condition in (15) guarantees that all the elements of the scaling matrix $B_n$ diverge as $n \to \infty$, which is a minimal condition for consistency. The second condition restricts the expansion rate of the bandwidth to control bias and ensure that the estimator uses only relevant information from periodogram ordinates sufficiently near the zero frequency. Alternatively, we can view the bandwidth conditions in (15) separately for the signal process and the noise process. In this way we would write the conditions as

$$\frac{m^{1+4R_y}}{n^{4R_y}} \to \infty, \quad \frac{m^{2\varphi_y+1}}{n^{2\varphi_y}} \to 0 \quad \text{and} \quad \frac{m^{1+4(d_0+R_w)}}{n^{4(d_0+R_w)}} \to \infty, \quad \frac{m^{2\varphi_w+4d_0+1}}{n^{2\varphi_w+4d_0}} \to 0.$$

It is now easy to see that the bandwidth conditions for both the signal process and the noise process are always compatible because $s_y > 2R_y$ and $s_w > 2R_w$, respectively, by Assumption A3.
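To see the compatibility concretely, take a polynomial bandwidth $m = n^a$ (an assumption of this sketch, which ignores logarithmic factors). A condition of the form $m^{1+x}/n^{x} \to \infty$ then holds when $a > x/(1+x)$, and $m^{1+x}/n^{x} \to 0$ holds when $a < x/(1+x)$, so each process yields an open interval of admissible exponents. The function names below are illustrative.

```python
def exponent_interval(lower_x, upper_x):
    """Open interval (x_lo/(1+x_lo), x_hi/(1+x_hi)) of admissible a for m = n^a."""
    return lower_x / (1.0 + lower_x), upper_x / (1.0 + upper_x)

def bandwidth_exponents(d0, Ry, Rw, sy, sw):
    """Intervals for the signal and noise conditions, viewed separately."""
    phi_y = min(sy, 2 + 2 * Ry)      # varphi_y = min{s_y, 2 + 2 R_y}
    phi_w = min(sw, 2 + 2 * Rw)      # varphi_w = min{s_w, 2 + 2 R_w}
    signal = exponent_interval(4.0 * Ry, 2.0 * phi_y)
    noise = exponent_interval(4.0 * (d0 + Rw), 2.0 * phi_w + 4.0 * d0)
    return signal, noise

# Example values (chosen for illustration): d0 = 0.4, Ry = Rw = 1, very smooth spectra.
signal, noise = bandwidth_exponents(0.4, 1, 1, sy=float("inf"), sw=float("inf"))
print(signal, noise)
```

Each interval is non-empty because $x \mapsto x/(1+x)$ is increasing and Assumption A3 gives $2\varphi_y > 4R_y$ and $2\varphi_w + 4d_0 > 4(d_0 + R_w)$.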

Note that the second condition in (15) implies that if $\phi_y(\lambda)$ and $\phi_w(\lambda)$ are infinitely smooth near frequency zero then any $(R_y, R_w)$ can be chosen and the estimator is $n^{1/2-\delta}$-consistent for all $\delta > 0$. Hence, in that case, the rate of convergence is arbitrarily close to the parametric rate. Thus, the condition (15) allows the bandwidth $m$ to be much larger than for the LWN estimator and the standard LW estimator, which require that (assuming $s_y \ge 2$, $s_w \ge 2$) $m^{5}n^{-4} \to 0$ and $m^{4d_0+1}n^{-4d_0} \to 0$, respectively, see Hurvich & Ray (2003) and Arteche (2004). Therefore, Theorem 2 provides an improvement in the rate of convergence relative to existing estimators of the memory parameter for perturbed fractional processes. This comes at the cost of an increase in the asymptotic variance by a multiplicative constant, but this is clearly more than offset by the faster rate of convergence, at least asymptotically. For example, in the empirically relevant case of $d_0 = 0.4$, which is a typical value of $d_0$ for financial volatility series, the LW estimator is at most $n^{0.31}$-consistent and the LWN estimator is at most $n^{0.4}$-consistent, whereas our estimator can be arbitrarily close to $n^{0.5}$-consistent if the spectral density is sufficiently smooth near the zero frequency.
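The rates quoted for $d_0 = 0.4$ can be reproduced with a short calculation. Assuming a polynomial bandwidth $m = n^a$ (logarithmic factors ignored), a restriction $m^{x+1}n^{-x} \to 0$ caps $a$ at $x/(1+x)$, so the $\sqrt{m}$ rate is capped at $n^{x/(2(1+x))}$. The helper names below are illustrative, and taking $R_y = R_w = R$ with infinitely smooth spectra is a choice made for this example.

```python
def rate_exponent(x):
    """Supremum of b with sqrt(m) = n^b when m^{x+1} n^{-x} -> 0 and m = n^a."""
    return 0.5 * x / (1.0 + x)

def lpwn_rate_exponent(d0, Ry, Rw, sy=float("inf"), sw=float("inf")):
    """Rate cap implied by the second condition in (15), using the tighter term."""
    phi_y = min(sy, 2 + 2 * Ry)
    phi_w = min(sw, 2 + 2 * Rw)
    return rate_exponent(min(2.0 * phi_y, 2.0 * phi_w + 4.0 * d0))

d0 = 0.4
lw = rate_exponent(4.0 * d0)     # from m^{4 d0 + 1} n^{-4 d0} -> 0
lwn = rate_exponent(4.0)         # from m^5 n^{-4} -> 0
lpwn = [lpwn_rate_exponent(d0, R, R) for R in (1, 2, 5, 20)]
print(round(lw, 2), round(lwn, 2), [round(b, 3) for b in lpwn])
```

The LPWN caps increase with the polynomial order and stay below $1/2$, consistent with a rate arbitrarily close to, but not equal to, the parametric one.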

Finally, as in Andrews & Sun (2004) we could calculate the asymptotic bias, which is of order $O((m/n)^{\varphi_y} + (m/n)^{2d_0+\varphi_w})$, where $\varphi_a = \min\{s_a, 2+2R_a\}$, $a = y, w$, see the proof of Lemma 3(e) in the appendix. This is in contrast to the orders $O((m/n)^{2})$ and $O((m/n)^{2d_0})$ for the LWN and LW estimators in Hurvich & Ray (2003) and Arteche (2004). Thus, as in Andrews & Sun (2004) for the pure long memory case, the order of magnitude of the asymptotic bias is smaller when modeling the (smooth) spectral density of the short-memory component locally by a polynomial instead of a constant.

4 Finite sample comparison

In this section we present simulation results to examine the finite sample bias and root mean squared error (RMSE) performance of our LPWN estimator. In particular, we want to examine the accuracy with realistic sample sizes and short-run contamination in both signal and noise.

Our LPWN estimator is implemented with $(R_y, R_w)$ equal to $(1, 0)$, $(0, 1)$, and $(1, 1)$, denoted LPWN($R_y$, $R_w$), and is compared with the LW, LPW, and LWN estimators. From Hurvich & Ray (2003) we know that the LWN estimator is superior to the LW estimator in terms of bias and RMSE in the context of the standard LMSV model. Furthermore, Hurvich et al. (2005) show that the polynomial log-periodogram regression estimator of Andrews & Guggenberger (2003) suffers from severe bias in the case of perturbed fractional processes, and the LPW estimator is expected to perform similarly. Therefore, to conserve space we only compare the LWN and LPWN estimators in the Monte Carlo setup. The results for the LW and LPW estimators are available from the authors upon request.

4.1 Monte Carlo setup

We simulate model (2), i.e.

$$z_t = y_t + x_t + w_t, \tag{16}$$

where $\{y_t\}$ is the signal process and $\{x_t + w_t\}$ is the perturbation process. We model $\{x_t\}$ as an ARMA process and $\{w_t\}$ as

$$w_t = \log u_t^2, \quad u_t \sim NID(0,1). \tag{17}$$

Note that the variance of $w_t$ is $\sigma_w^2 = \pi^2/2$ regardless of the variance of $u_t$. The signal process $\{y_t\}$ and the ARMA part $\{x_t\}$ of the perturbation process follow different DGPs. For brevity, we consider five different DGPs for the signal and ARMA perturbation processes. The general setup for $\{y_t\}$ and $\{x_t\}$ is

$$(1 - \phi_y L)(1 - L)^d y_t = (1 + \theta_y L)\,\eta_t, \quad \eta_t \sim NID(0, \sigma_\eta^2), \tag{18}$$

$$(1 - \phi_x L)\,x_t = (1 + \theta_x L)\,\varepsilon_t, \quad \varepsilon_t \sim NID(0,1), \tag{19}$$

with parameter configurations

Model I : $\phi_y = \theta_y = \phi_x = \theta_x = 0$;

Model II : $\phi_y = \theta_y = \theta_x = 0$; $\phi_x \in \{-0.8, 0.5\}$;

Model III : $\phi_y = \theta_y = \phi_x = 0$; $\theta_x \in \{-0.8, 0.8\}$;

Model IV : $\theta_y = \theta_x = 0$; $(\phi_y, \phi_x) \in \{(-0.8, 0.5), (-0.8, 0.8)\}$;

Model V : $\phi_y = \phi_x = 0$; $(\theta_y, \theta_x) \in \{(-0.8, -0.8), (-0.8, 0.8)\}$.
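The statement that the variance of $w_t = \log u_t^2$ equals $\pi^2/2$ whatever the variance of $u_t$ can be checked by simulation. The following is a minimal sketch using only the Python standard library; the function name `log_chi2_variance`, the number of draws, and the seed are illustrative assumptions and are not part of the original study.

```python
import math
import random

def log_chi2_variance(scale, n_draws=200_000, seed=0):
    """Sample variance of w = log(u^2) for u ~ N(0, scale^2) (hypothetical helper)."""
    rng = random.Random(seed)
    w = [math.log(rng.gauss(0.0, scale) ** 2) for _ in range(n_draws)]
    mean = sum(w) / n_draws
    return sum((x - mean) ** 2 for x in w) / (n_draws - 1)

theory = math.pi ** 2 / 2            # variance of log chi-square(1)
v_unit = log_chi2_variance(1.0)      # u_t with unit variance
v_scaled = log_chi2_variance(3.0)    # u_t with variance 9
print(theory, v_unit, v_scaled)
```

Rescaling $u_t$ only adds the constant $\log(\text{scale}^2)$ to $w_t$, which is why both simulated variances should be close to the same theoretical value.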


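We remark that in all the models the noise-to-signal ratio is given as

$$\mathrm{nsr} = \frac{f_x(0) + f_w(0)}{f_{(1-L)^d y_t}(0)} = \frac{\dfrac{(1+\theta_x)^2}{(1-\phi_x)^2} + \dfrac{\pi^2}{2}}{\sigma_\eta^2\,\dfrac{(1+\theta_y)^2}{(1-\phi_y)^2}}. \tag{20}$$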
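As an illustration of how (20) can be used to calibrate the signal innovation variance to a target noise-to-signal ratio, the sketch below evaluates the ratio and inverts it. It assumes that the first coefficient of each pair is the AR parameter and the second the MA parameter; the function names `long_run_nsr` and `sigma2_eta_for_nsr` are hypothetical.

```python
import math

def long_run_nsr(phi_x, theta_x, phi_y, theta_y, sigma2_eta):
    """Noise-to-signal ratio at frequency zero, following eq. (20)."""
    noise = (1 + theta_x) ** 2 / (1 - phi_x) ** 2 + math.pi ** 2 / 2
    signal = sigma2_eta * (1 + theta_y) ** 2 / (1 - phi_y) ** 2
    return noise / signal

def sigma2_eta_for_nsr(target_nsr, phi_x, theta_x, phi_y, theta_y):
    """Innovation variance of the signal that yields the target nsr.

    The nsr is inversely proportional to sigma2_eta, so the nsr at
    sigma2_eta = 1 divided by the target gives the required variance.
    """
    return long_run_nsr(phi_x, theta_x, phi_y, theta_y, 1.0) / target_nsr

# Model I (no short-run dynamics) with a target nsr of 10
s2 = sigma2_eta_for_nsr(10.0, 0.0, 0.0, 0.0, 0.0)
check = long_run_nsr(0.0, 0.0, 0.0, 0.0, s2)
print(s2, check)
```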

For each Monte Carlo DGP we generated 1000 artificial time series with sample sizes of 1024, 2048, 4096, and 8192.$^{6}$ For all estimators we set the bandwidth as $m = \lfloor n^a \rfloor$, where $a \in \{0.6, 0.7, 0.8\}$. The parameter of interest, $d$, is set equal to either 0.4 or 0.6. For the noise-to-signal ratio, we choose $\mathrm{nsr} \in \{5, 10, 20\}$, and the variance $\sigma_\eta^2$ is set as a function of $\phi_y$, $\phi_x$, $\theta_y$, and $\theta_x$ such that the nsr has the desired value. The values of $d$, nsr, $(\phi_y, \theta_y)$, $(\phi_x, \theta_x)$, and the sample sizes are chosen to reflect empirical findings on long memory in volatility (see the references in the introduction for some examples). The chosen parameter values for the short-run contamination in the signal and the noise are also inspired by the results from the empirical (parametric) analysis of the DJIA stocks in section 5 below.

The signal $\{y_t\}$ is generated by the circulant embedding method as described in Davies & Harte (1987), i.e. it is the stationary type I fractionally integrated process in the terminology of Marinucci & Robinson (1999); see also Beran (1994, pp. 215-217). To generate nonstationary series with $d \geq 1/2$, we simulate the ARFIMA process with integration order $d - 1$ and cumulate the resulting series. Numerical optimization was carried out in Matlab v7.2 using the BFGS and DFP optimization routines and selecting the one with the best log-likelihood value. The initial values were set as follows. For the LWN estimator we used the LW estimate, $\hat d_{LW}$, if it was in the interior of the admissible space of $d$, i.e. $[0.01, 0.99]$, cf. Assumption A2. Otherwise, $d$ was set equal to 0.1. As starting value for the LPWN estimators we used the LWN estimate if it was in the admissible interval; otherwise $d$ was set equal to 0.1.$^{7}$ As initial values for the polynomial parameters we used 1 for all estimators.
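The "simulate with order $d - 1$ and cumulate" step can be illustrated with a short sketch. Note that this is not the circulant embedding method used in the study: for brevity it swaps in a truncated moving-average (type II) fractional filter, and the names `frac_filter_weights` and `simulate_fi` are hypothetical. Only the cumulation logic is meant to mirror the text.

```python
import random
from itertools import accumulate

def frac_filter_weights(d, n):
    """First n coefficients of (1 - L)^(-d): psi_0 = 1, psi_k = psi_{k-1} (k - 1 + d) / k."""
    psi = [1.0]
    for k in range(1, n):
        psi.append(psi[-1] * (k - 1 + d) / k)
    return psi

def simulate_fi(d, n, seed=0):
    """Fractionally integrated noise of order d via a truncated MA filter.

    For d >= 1/2 the series of order d - 1 is simulated and cumulated,
    as described in the text. Assumes d < 3/2.
    """
    if d >= 0.5:
        return list(accumulate(simulate_fi(d - 1.0, n, seed)))
    rng = random.Random(seed)
    psi = frac_filter_weights(d, n)
    eta = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(psi[k] * eta[t - k] for k in range(t + 1)) for t in range(n)]

y_nonstat = simulate_fi(0.6, 200)    # nonstationary, d = 0.6
x_base = simulate_fi(-0.4, 200)      # underlying series of order d - 1 = -0.4
```

Differencing `y_nonstat` recovers `x_base`, which is the defining property of the cumulation step.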

To conserve space we present only a subset of the results. The left-out results ($d = 0.6$, $n = 1024$, and $m = \lfloor n^{0.6} \rfloor$) are qualitatively very similar to the ones presented, and are available upon request.

4.2 Monte Carlo results

Tables 1-9 display the results of the simulation study and show how the two different sources of bias, i.e. the additive noise term and the contamination from the short-memory dynamics in both the signal and the noise, affect the estimators.

[Table 1 about here]

In the case where there is no contamination by short-run dynamics in the signal or noise, i.e. Model I with results displayed in Table 1, the bias is small for all estimators. The theoretical inflation of the variances from $h(\lambda, d, \theta)$ is also noticeable in the RMSEs. Additionally, the RMSE decreases as either the sample size or the bandwidth increases. The only case with any noticeable bias is for the LPWN(1,1) estimator with $\mathrm{nsr} = 20$, the smallest sample size, and the highest bandwidth.

$^{6}$The number of observations is chosen as a power of two in order to use the fast Fourier transform in calculating the periodogram. This speeds up the estimation considerably compared to evaluating the discrete Fourier transform directly.

$^{7}$We tried different starting values for $d$ in these cases and the results were indistinguishable.
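To make the remark on power-of-two sample sizes concrete, the sketch below computes the periodogram at the first $m$ Fourier frequencies with a textbook recursive radix-2 FFT and compares it with direct evaluation of the discrete Fourier transform. It assumes the normalization $I_z(\lambda_j) = (2\pi n)^{-1} |\sum_t z_t e^{i t \lambda_j}|^2$ with $\lambda_j = 2\pi j/n$; the function names are hypothetical, and the study itself used Matlab rather than this code.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def periodogram_fft(z, m):
    """Periodogram at lambda_1, ..., lambda_m using the FFT, O(n log n)."""
    n = len(z)
    w = fft(z)
    return [abs(w[j]) ** 2 / (2 * math.pi * n) for j in range(1, m + 1)]

def periodogram_direct(z, m):
    """Same quantity by direct summation, O(n m)."""
    n = len(z)
    out = []
    for j in range(1, m + 1):
        lam = 2 * math.pi * j / n
        s = sum(z[t] * cmath.exp(1j * lam * t) for t in range(n))
        out.append(abs(s) ** 2 / (2 * math.pi * n))
    return out

z = [math.sin(0.3 * t) + 0.1 * t for t in range(64)]
I_fast = periodogram_fft(z, 8)
I_slow = periodogram_direct(z, 8)
```

For real-valued data the sign convention in the exponent does not affect the modulus, so both routines return the same ordinates up to rounding error.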


[Tables 2 and 3 about here]

In Tables 2 and 3 we consider Model II, i.e. the signal is an ARFIMA$(0, d, 0)$ process and the noise is an ARMA process with coefficients $(\phi_x, \theta_x) = (0.5, 0)$ and $(\phi_x, \theta_x) = (-0.8, 0)$, respectively. Here we would presume that the LPWN(0,1) estimator is the better choice. We clearly see that we are able to obtain a considerable reduction in bias relative to the LWN estimator, especially for the positive AR root case in Table 2. In that case we find that all three LPWN estimators outperform the LWN estimator in terms of bias, and for the highest bandwidth choice, the LPWN(0,1) estimator is often also superior in terms of RMSE. In the model with a negative AR root in Table 3 the results are very similar to those in Table 1.


We consider next Model III, i.e. where there is MA contamination in the noise, with results

presented in Tables 4 and 5. The results for this model are similar to those in Tables 1 and 3. That

is, for this model there is only little bias in the LWN estimator and no bias in the LPWN estimators.

For the highest bandwidth choice, LWN and LPWN have similar RMSE.


Tables 6 and 7 contain results for Model IV, where $(\phi_y, \theta_y) = (-0.8, 0)$, $(\phi_x, \theta_x) = (0.5, 0)$ and $(\phi_y, \theta_y) = (-0.8, 0)$, $(\phi_x, \theta_x) = (-0.8, 0)$, respectively. In the case of Table 6 the LWN estimator suffers from very high bias and the LPWN estimators are able to reduce this bias considerably. In particular, the LPWN(1,1) estimator is nearly unbiased in most cases. For the high bandwidth the RMSEs are similar for all estimators. In Table 7, where the contamination is by a negative root, the performance of the LWN estimator is similar to that of the LPWN estimators.


Results for Model V, where $(\phi_y, \theta_y) = (0, -0.8)$, $(\phi_x, \theta_x) = (0, 0.8)$ and $(\phi_y, \theta_y) = (0, -0.8)$, $(\phi_x, \theta_x) = (0, -0.8)$, are shown in Tables 8 and 9, respectively.$^{8}$ The LWN estimator suffers from very severe bias in Model V, and consequently its RMSE is also higher than for the previous models. On the other hand, the LPWN estimators have relatively low biases, and in particular the LPWN(1,1) estimator appears essentially unbiased. When compared in terms of RMSE the LPWN estimators are superior in both tables as well. Thus, we have a considerable reduction in bias for all LPWN estimators compared to the LWN estimator, and we also have quite a remarkable reduction in RMSE.

To sum up, the Monte Carlo study shows the usefulness of estimators that explicitly take the short-run dynamics in the perturbation into account, i.e. the LPWN estimators where $(R_y, R_w) = (0, 1)$ and $(R_y, R_w) = (1, 1)$, although the LPWN estimator with $(R_y, R_w) = (1, 0)$ also performs well. All three estimators generally have much smaller biases than the LWN estimator and are fairly insensitive to the persistence in the perturbation and to the contamination from short-memory dynamics in the signal.

$^{8}$In a few cases for the LWN estimator (marked with asterisks) we had convergence problems due to boundary issues resulting in a markedly bimodal finite-sample distribution. In these cases we set the initial value for the polynomial parameter to 10, which resolved the issue.


5 Long memory in DJIA stock volatility

This section analyzes the long memory in the daily log-squared return series of the 30 DJIA stocks, corrected for the effects of stock splits and dividends, from January 1, 1990 to March 31, 2008, for a sample of $n = 4753$. To avoid the problem of taking the logarithm of zero we base the analysis on adjusted log-squared returns using the method of Fuller (1996, pp. 495-496), i.e. we analyze

$$\log \tilde r_t^2 = \log\left(r_t^2 + \tau\right) - \frac{\tau}{r_t^2 + \tau},$$

where $\tau = \frac{0.02}{n} \sum_{t=1}^{n} r_t^2$. We estimate the long memory in $\log \tilde r_t^2$ using the proposed LPWN estimator. We implement the estimator with $(R_y, R_w)$ equal to $(1, 0)$, $(0, 1)$, and $(1, 1)$, and with starting values etc. as in the Monte Carlo study above. For comparison we also report the standard LW, LPW, and LWN estimates. For all estimators we set the bandwidth as $m = \lfloor n^a \rfloor$, where $a \in \{0.6, 0.7, 0.8\}$.
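The adjustment of the log-squared returns described above is simple to implement. The sketch below is a minimal illustration, assuming the offset is 0.02 times the sample mean of the squared returns; the function name `adjusted_log_squared` and the toy return series are hypothetical.

```python
import math

def adjusted_log_squared(returns, c=0.02):
    """Adjusted log-squared returns: log(r^2 + tau) - tau / (r^2 + tau),
    with tau = c times the sample mean of r^2 (c = 0.02 as in the text)."""
    n = len(returns)
    tau = c * sum(r * r for r in returns) / n
    return [math.log(r * r + tau) - tau / (r * r + tau) for r in returns]

# Toy example with an exact zero return, which plain log(r^2) cannot handle
toy_returns = [0.0, 0.01, -0.02]
adj = adjusted_log_squared(toy_returns)
print(adj)
```

For a zero return the transformed value is $\log \tau - 1$, which is finite, while for returns that are large relative to the offset the adjustment is negligible.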


Table 10 presents the results for the LW, LPW, and LWN estimators. As expected from theory, the LW and LPW estimates appear downward biased and decrease as the bandwidth increases. For the LWN estimator the memory estimates of some of the stocks are in the stationary region, but for the most part they are in the nonstationary region.


In Table 11 we present the results for the three variants of the LPWN estimator, i.e. for $(R_y, R_w)$ equal to $(1, 0)$, $(0, 1)$, and $(1, 1)$. First, as expected from theory and the simulations above, it is clear that this estimator does not suffer from the downward bias present in the LW and LPW estimators. Second, we note that the three different implementations of the estimator agree with each other for most of the stocks and bandwidth choices. Third, the LPWN estimates are of the same order of magnitude as the LWN estimates, although a little higher on average.

To emphasize the importance of the polynomial approximation of the signal process $\{y_t\}$ and the perturbation process $\{x_t + w_t\}$, we also fitted an extended parametric LMSV-ARFIMA$(1, d, 1)$ model, where the extension is that the noise is modeled by an ARMA process. That is, we model the periodogram of $\log \tilde r_t^2$ using the Whittle likelihood framework of Fox & Taqqu (1986) and Breidt et al. (1998), where the fitted model has spectral density

$$f_z(\lambda) = \frac{\sigma_\eta^2}{2\pi} \left(2 \sin\frac{\lambda}{2}\right)^{-2d} \frac{1 + 2\theta_y \cos\lambda + \theta_y^2}{1 - 2\phi_y \cos\lambda + \phi_y^2} + \frac{\sigma_\varepsilon^2}{2\pi}\, \frac{1 + 2\theta_x \cos\lambda + \theta_x^2}{1 - 2\phi_x \cos\lambda + \phi_x^2}. \tag{21}$$
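The parametric spectral density in (21) is the sum of an ARFIMA signal spectrum and an ARMA noise spectrum, and can be evaluated as in the following sketch. The function and argument names are hypothetical, and the convention that the first coefficient of each pair is the AR parameter and the second the MA parameter is an assumption about the garbled notation.

```python
import math

def arma_factor(lam, phi, theta):
    """|1 + theta e^{i lam}|^2 / |1 - phi e^{i lam}|^2 for an ARMA(1,1) filter."""
    num = 1 + 2 * theta * math.cos(lam) + theta ** 2
    den = 1 - 2 * phi * math.cos(lam) + phi ** 2
    return num / den

def f_z(lam, d, sigma2_eta, phi_y, theta_y, sigma2_eps, phi_x, theta_x):
    """Spectral density of signal plus noise as in eq. (21), for 0 < lam <= pi."""
    signal = (sigma2_eta / (2 * math.pi)
              * (2 * math.sin(lam / 2)) ** (-2 * d)
              * arma_factor(lam, phi_y, theta_y))
    noise = sigma2_eps / (2 * math.pi) * arma_factor(lam, phi_x, theta_x)
    return signal + noise

# With d = 0 and no ARMA dynamics the density is flat
flat = f_z(1.0, 0.0, 2.0, 0.0, 0.0, 3.0, 0.0, 0.0)
# Long memory makes the density grow without bound as lam -> 0
near_zero = f_z(0.001, 0.4, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0)
at_pi = f_z(math.pi, 0.4, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0)
```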

In Table 12 the resulting estimates are reported, where we have removed insignificant ARMA terms from both the signal and the noise.

[Insert Table 12 about here]

The estimated values of $d$ from the parametric results are in line with those from the LWN and LPWN estimators in Tables 10 and 11. Furthermore, there is significant (at the 10% level) short-run dynamics in the signal (19 out of 30 cases), in the noise (16 out of 30 cases), and in both the signal and noise (13 out of 30 cases). The estimated (long-run) nsr's can be calculated from the parameter estimates as in (20), and are for most of the stocks in the vicinity of 10 to 30, although there are cases where the nsr is very high because $\sigma_\eta^2$ is very small and insignificant. Taking the high nsr's and significant short-run dynamics in both the signal and the noise into consideration stresses the importance of the LPWN estimators.


In this paper we have proposed a semiparametric local polynomial Whittle with noise estimator of the degree of long memory, $d$, in financial volatility time series perturbed by dynamic short-run noise. The estimator allows the spectrum of the perturbation and that of the short-memory component of the signal to be modeled as finite even polynomials, instead of constants, near the zero frequency. This is shown to yield a bias reduction depending on the smoothness of the spectra. However, including the polynomials inflates the asymptotic variance of $\hat d$ by a multiplicative constant which depends on the true long memory parameter, $d$.

We have shown that the estimator is consistent for $d \in (0, 1)$ and asymptotically normal for $d \in (0, 3/4)$, and that if the spectral density is sufficiently smooth near frequency zero the rate of convergence becomes arbitrarily close to the parametric rate, $\sqrt{n}$.

A Monte Carlo study revealed that the proposed local polynomial Whittle with noise estimator is able to achieve considerable bias reductions in practice compared to standard (e.g., local Whittle with noise) estimators, especially in cases with short-run dynamics in both the signal and noise components. In an empirical investigation of the 30 DJIA stocks the local polynomial Whittle with noise estimator indicated stronger persistence in volatility than standard estimators, and for most of the stocks produced estimates of $d$ in the nonstationary region.

Appendix A: Proof of Theorem 1

This proof follows the proofs of Theorem 3.1 and Lemma C.2 of Hurvich et al. (2005). As in the proofs of Theorem 1 of Robinson (1995a) and Theorem 3.1 of Hurvich et al. (2005), to show consistency of $\hat d$ we need to separately prove that $\lim_{n\to\infty} P(\hat d \in D_1) = 0$ and that $(\hat d - d_0) 1(\hat d \in D_2) \xrightarrow{P} 0$, where $1(A)$ is the indicator function of the set $A$, $D_1 = (-\infty, d_0 - 1/2 + \epsilon) \cap D$, $D_2 = [d_0 - 1/2 + \epsilon, +\infty) \cap D$, and $\epsilon < 1/4$ is a positive real number to be set later.

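Let $\alpha_k(d,\theta) = \frac{1 + h_k(d_0,\theta_0)}{1 + h_k(d,\theta)}$. Then the proof that $(\hat d - d_0) 1(\hat d \in D_2) \xrightarrow{P} 0$ follows as in Hurvich et al. (2005, pp. 1303-1305) by showing that

$$Z_m = \sum_{k=1}^{m} \frac{k^{2(d-d_0)} \alpha_k(d,\theta)}{\sum_{j=1}^{m} j^{2(d-d_0)} \alpha_j(d,\theta)} \left( \frac{I_z(\lambda_k)}{f_z(\lambda_k)} - 1 \right) = o_P(1) \tag{22}$$

uniformly on $(d,\theta) \in D_2 \times \Theta$ and that

$$R_m(d,\theta) = \log\left( 1 + \frac{\sum_{k=1}^{m} k^{2(d-d_0)} (\alpha_k(d,\theta) - 1)}{\sum_{j=1}^{m} j^{2(d-d_0)}} \right) - \frac{1}{m} \sum_{k=1}^{m} \log\left(1 + (\alpha_k(d,\theta) - 1)\right) = o(1) \tag{23}$$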


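uniformly on $(d,\theta) \in D \times \Theta$. Note that there exists a constant $C > 0$ such that

$$\sup_{(d,\theta) \in D \times \Theta}\ \sup_{k=1,\dots,m} |\alpha_k(d,\theta) - 1| = \sup_{(d,\theta) \in D \times \Theta}\ \sup_{k=1,\dots,m} \left| \frac{h_k(d_0,\theta_0) - h_k(d,\theta)}{1 + h_k(d,\theta)} \right| \leq C (m/n)^{2d_1},$$

since $\Theta$ is compact and $d \geq d_1 > 0$, see Lemma 4. Now we use that $\log(1 + x) = x + O(x^2)$ as $x \to 0$ to obtain

$$\sup_{(d,\theta) \in D \times \Theta} |R_m(d,\theta)| \leq C \sup_{(d,\theta) \in D \times \Theta}\ \sup_{k=1,\dots,m} |\alpha_k(d,\theta) - 1| \leq C (m/n)^{2d_1} = o(1).$$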

To show (22) we apply Proposition A.1 of Hurvich et al. (2005), which holds here since our Assumptions A1-A6 imply their Assumptions (H1)-(H3) with the exception that we allow serially correlated perturbation terms. It is, however, easily shown that their Proposition A.1 still holds when their Assumption (H2) is replaced with our Assumption A5. The only other change is that the term $(k/n)^{\min(\beta, d_0)}$ in their eq. (F.15) should be replaced by $(k/n)^{\varphi_y} + (k/n)^{\varphi_w}$ due to the more accurate approximation of $f_z(\lambda)$ offered by our function $g(\lambda)$ in (10) due to the included polynomials, see also Lemma 5 below. Thus, according to their Proposition A.1, letting

$$c_k = \frac{k^{2(d-d_0)} \alpha_k(d,\theta)}{\sum_{j=1}^{m} j^{2(d-d_0)} \alpha_j(d,\theta)},$$

then for $\epsilon \in (0, 1)$, $K \in (0, \infty)$, and all $k \in \{1, \dots, m-1\}$, we need to show that

$$|c_k - c_{k+1}| \leq K m^{-\epsilon} k^{\epsilon - 2}, \quad |c_m| \leq K m^{-1}$$

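uniformly on $(d,\theta) \in D_2 \times \Theta$, which implies (22).

Note that, uniformly on $(d,\theta) \in D_2 \times \Theta$, we have that $\sum_{j=1}^{m} j^{2(d-d_0)} \alpha_j(d,\theta) \geq C m^{2(d-d_0)+1}$ and

$$\begin{aligned}
|k^{2(d-d_0)} \alpha_k(d,\theta) - (k+1)^{2(d-d_0)} \alpha_{k+1}(d,\theta)|
&\leq |k^{2(d-d_0)} - (k+1)^{2(d-d_0)}|\, \alpha_k(d,\theta) + (k+1)^{2(d-d_0)} |\alpha_k(d,\theta) - \alpha_{k+1}(d,\theta)| \\
&\leq (k+a)^{2(d-d_0)-1} C + (k+1)^{2(d-d_0)} C (\lambda_{k+1} - \lambda_k) \lambda_{k+a}^{2d-1}, \quad a \in [0,1], \\
&\leq C k^{2(d-d_0)-1},
\end{aligned}$$

where the first inequality is the triangle inequality and the second follows from the mean value theorem and Lemma 4. It follows that

$$\sup_{(d,\theta) \in D_2 \times \Theta} \left| \frac{k^{2(d-d_0)} \alpha_k(d,\theta) - (k+1)^{2(d-d_0)} \alpha_{k+1}(d,\theta)}{\sum_{j=1}^{m} j^{2(d-d_0)} \alpha_j(d,\theta)} \right| \leq \sup_{(d,\theta) \in D_2 \times \Theta} C \left| \frac{k^{2(d-d_0)-1}}{m^{2(d-d_0)+1}} \right| \leq C k^{2\epsilon - 2} m^{-2\epsilon},$$

$$\sup_{(d,\theta) \in D_2 \times \Theta} \left| \frac{m^{2(d-d_0)} \alpha_m(d,\theta)}{\sum_{j=1}^{m} j^{2(d-d_0)} \alpha_j(d,\theta)} \right| \leq C m^{-1},$$

which proves (22).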

The proof that $\lim_{n\to\infty} P(\hat d \in D_1) = 0$ follows exactly as in Hurvich et al. (2005, pp. 1305-1306) since their Proposition A.1 holds in our case as well. Thus we have shown that $\hat d \xrightarrow{P} d_0$. To strengthen this result to $\hat d - d_0 = o_P((\log n)^{-5})$ we use the proof of Lemma C.2 of Hurvich et al. (2005) without change.


Appendix B: Proof of Theorem 2

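For the proof of Theorem 2 we need the score and Hessian (both multiplied by $m$) of (11):

$$S_n(d,\theta) = \hat G(d,\theta)^{-1} \sum_{j=1}^{m} \left( \frac{G I_z(\lambda_j)}{g_j(d,\theta)} - \frac{1}{m} \sum_{k=1}^{m} \frac{G I_z(\lambda_k)}{g_k(d,\theta)} \right) X_j,$$

$$H_n(d,\theta) = H_{1n}(d,\theta) + H_{2n}(d,\theta),$$

$$H_{1n}(d,\theta) = \hat G(d,\theta)^{-2} \left( \hat G(d,\theta) \sum_{j=1}^{m} \frac{G I_z(\lambda_j)}{g_j(d,\theta)} X_j X_j' - m \left( \frac{1}{m} \sum_{j=1}^{m} \frac{G I_z(\lambda_j)}{g_j(d,\theta)} X_j \right) \left( \frac{1}{m} \sum_{j=1}^{m} \frac{G I_z(\lambda_j)}{g_j(d,\theta)} X_j \right)' \right),$$

$$H_{2n}(d,\theta) = \hat G(d,\theta)^{-1} \sum_{j=1}^{m} \left( \frac{G I_z(\lambda_j)}{g_j(d,\theta)} - \frac{1}{m} \sum_{k=1}^{m} \frac{G I_z(\lambda_k)}{g_k(d,\theta)} \right) \frac{\partial X_j}{\partial (d, \theta')},$$

where

$$X_j = (X_{1j}, X_{2j}', X_{3j}')',$$

$$X_{1j} = 2 \log j - \frac{2 h_w(\theta_w, \lambda_j) \lambda_j^{2d} \log \lambda_j}{1 + h_j(d,\theta)},$$

$$X_{2j} = \left( \frac{-\lambda_j^{2}}{1 + h_j(d,\theta)}, \dots, \frac{-\lambda_j^{2R_y}}{1 + h_j(d,\theta)} \right)',$$

$$X_{3j} = \left( \frac{-\lambda_j^{2d}}{1 + h_j(d,\theta)}, \dots, \frac{-\lambda_j^{2d + 2R_w}}{1 + h_j(d,\theta)} \right)',$$

$h_j(d,\theta) = h(d, \theta, \lambda_j)$, $g_j(d,\theta) = \lambda_j^{-2d} G (1 + h_j(d,\theta))$, and $D_m(\epsilon) = \{d \in D : (\log m)^5 |d - d_0| < \epsilon\}$ for $\epsilon > 0$. Note that $X_j$ is the vector of partial derivatives of $-\log g_j(d,\theta)$. The matrix $H_{2n}(d,\theta)$ is symmetric and has $(i,l)$'th and $(l,i)$'th elements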

G (d;�)�1mXj=1

GIz (�j)

gj (d;�)� 1

m

mXk=1

GIz (�k)

gk (d;�)

!(Xj)i (Xj)l ; i; l = 2; : : : ; R+ 2;

�G (d;�)�1mXj=1

GIz (�j)

gj (d;�)� 1

m

mXk=1

GIz (�k)

gk (d;�)

!(Xj)i

2hw(�w; �j)�2dj log �j

(1 + hj(d;�)); i = 2; : : : ; Ry + 1; l = 1;

G (d;�)�1mXj=1

GIz (�j)

gj (d;�)� 1

m

mXk=1

GIz (�k)

gk (d;�)

!(Xj)i 2 log �j

1�

hw(�w; �j)�2dj

(1 + hj(d;�))

!; i = Ry + 2; : : : ; R+ 2; l = 1;

G (d;�)�1mXj=1

GIz (�j)

gj (d;�)� 1

m

mXk=1

GIz (�k)

gk (d;�)

!(Xj)Ry+2 4hw(�w; �j) (log �j)

2

1�


(1 + hj(d;�))

!; i = l = 1:

We also define the matrix

$$J_n = \sum_{j=1}^{m} \left( X_j - \frac{1}{m} \sum_{k=1}^{m} X_k \right) \left( X_j - \frac{1}{m} \sum_{k=1}^{m} X_k \right)'.$$

We next state a lemma adapted from Andrews & Sun (2004), henceforth abbreviated AS. The

proof is given in the next section.


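Lemma 3 Under the assumptions of Theorem 2 we have, as $n \to \infty$,

(a) $B_n^{-1} J_n B_n^{-1} \to \Omega_{R_y,R_w}$;

(b) $B_n^{-1} (H_{1n}(d_0,\theta_0) - J_n) B_n^{-1} = o_P(1)$ and $B_n^{-1} H_{2n}(d_0,\theta_0) B_n^{-1} = o_P(1)$;

(c) $\sup_{\theta \in \Theta} B_n^{-1} (H_{kn}(d_0,\theta) - H_{kn}(d_0,\theta_0)) B_n^{-1} = o_P(1)$, $k = 1, 2$;

(d) $\sup_{d \in D_m(\epsilon_n),\, \theta \in \Theta} B_n^{-1} (H_{kn}(d,\theta) - H_{kn}(d_0,\theta)) B_n^{-1} = o_P(1)$, $k = 1, 2$, for all sequences of constants $\{\epsilon_n\}_{n \geq 1}$ for which $\epsilon_n = o(1)$;

(e) $B_n^{-1} S_n(d_0,\theta_0) \xrightarrow{d} N(0, \Omega_{R_y,R_w})$.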

Since the LPWN likelihood (11) is a continuous function on a compact set the LPWN estimator exists. From Lemma 3 we know by Lemma 1 of AS that there exists a solution to the first order conditions with probability tending to one, and that the solution satisfies the convergence result in Theorem 2, see also Lemmas 1 and 2 of AS. If the (negative) likelihood function is strictly convex and twice differentiable then the solution to the first order conditions is unique and minimizes (11) and hence equals the LPWN estimator.

Thus, all that remains is to show that the Hessian is positive definite, which proves convexity. The positive definiteness of $H_{1n}$ follows as in eq. (5.1) of AS. Compared to AS we have the additional term $H_{2n}$. For $H_{2n}$ we know that $B_n^{-1} H_{2n}(d,\theta) B_n^{-1} = o_P(1)$ uniformly on $(d,\theta) \in D_m(\epsilon_n) \times \Theta$ by Lemma 3(b)-(d) and the triangle inequality. Since $\hat d \in D_m(\epsilon_n)$ with probability tending to one by Theorem 1, this shows that $H_n$ is positive definite with probability tending to one, which concludes the proof.

Appendix C: Proof of Lemma 3

We now turn to the proof of Lemma 3, which follows the method of proof for Lemma 2 of AS, with modifications to allow $d \geq 1/2$ (following Velasco (1999)) and to accommodate the additive noise term in the spectral density (see Lemma 5), and with an additional proof for each of (b), (c), and (d) of negligibility of the term $H_{2n}(d,\theta)$.

C.1 Proof of (a)

Part (a) of the lemma follows by approximating sums by integrals, see, e.g., Lemma 2 of Andrews &

Guggenberger (2003).
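The kind of sum-by-integral approximation used for part (a) can be illustrated numerically. The sketch below checks two standard limits of this type: $m^{-1} \sum_{j=1}^{m} (j/m)^{2b} \to \int_0^1 x^{2b}\,dx = 1/(2b+1)$, and the centered second moment of $2 \log j$, which tends to $4$ (the variance of $2 \log U$ for $U$ uniform on $(0,1)$), in line with the leading entry $4$ of the limit matrix. The function names are hypothetical and the choice of $m$ is arbitrary.

```python
import math

def scaled_power_mean(m, b):
    """(1/m) * sum_{j=1}^m (j/m)^(2b), a Riemann sum for the integral of x^(2b) on (0, 1)."""
    return sum((j / m) ** (2 * b) for j in range(1, m + 1)) / m

def centered_log_second_moment(m):
    """(1/m) * sum_j (2 log j - mean_k 2 log k)^2, which tends to 4 as m grows."""
    logs = [2.0 * math.log(j) for j in range(1, m + 1)]
    mean = sum(logs) / m
    return sum((x - mean) ** 2 for x in logs) / m

power_int = scaled_power_mean(10_000, 1.0)        # close to 1/3
power_frac = scaled_power_mean(10_000, 0.4)       # close to 1/1.8
log_var = centered_log_second_moment(100_000)     # close to 4
print(power_int, power_frac, log_var)
```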


C.2 Proof of (b), first statement

The proof roughly follows that of Lemma 2(b) in AS, except now b can be non-integer (equal to d or

2d) in their eq. (A.6), which we write a little di¤erently as

~Ga;b;c(d;�) = m�1mXj=1

�2dj Iz(�j)

(1 + hj(d;�))c+1

2 log j �


(1 + hj(d;�))

!a�j

m

�2b;

Ga;b(d;�) = m�1mXj=1

�2dj Iz(�j) (2 log j)a

�j

m

�2b;

Ja;b = Gm�1mXj=1

(2 log j)a�j

m

�2b;

for a; c = 0; 1; 2 and b = 0; 1; : : : ; 2Ry; d; d+1; : : : ; d+Rw+Ry; 2d; 2d+1; : : : ; 2d+2Rw. The elements

of B�1n H1n (d;�)B�1n are (omitting the argument for brevity)

(1; 1) : ~G�20;0;0

�~G0;0;0 ~G2;0;0 � ~G21;0;0

�;

(1; 1 + k) : ~G�20;0;0

�~G0;0;0 ~G1;k;1 � ~G1;0;0 ~G0;k;1

�for k = 1; : : : ; Ry;

(1; 2 +Ry + k) : ~G�20;0;0

�~G0;0;0 ~G1;k+d;1 � ~G1;0;0 ~G0;k+d;1

�for k = 0; : : : ; Rw;

(1 + i; 1 + k) : ~G�20;0;0

�~G0;0;0 ~G0;i+k;2 � ~G0;i;1 ~G0;k;1

�for i; k = 1; : : : ; Ry;

(1 + i; 2 +Ry + k) : ~G�20;0;0

�~G0;0;0 ~G0;k+i+d;2 � ~G0;i;1 ~G0;k+d;1

�for i = 1; : : : ; Ry; k = 0; : : : ; Rw;

(2 +Ry + i; 2 +Ry + k) : ~G�20;0;0

�~G0;0;0 ~G0;k+i+2d;2 � ~G0;i+d;1 ~G0;k+d;1

�for i; k = 0; : : : ; Rw;

and the corresponding elements of B�1n Jn (d;�)B�1n are given by the same expressions with ~Ga;b;c

replaced by Ja;b. To prove the �rst statement of Lemma 3(b) it su¢ ces to show that (since b can take

values including d, we distinguish between b and b0)

�a;b0 =��Ga;b0(d0;�0)� Ja;b0�� = oP ((logm)�2); (24)

~�a;b0;c =�� ~Ga;b0;c(d0;�0)� Ga;b0(d0;�0)�� = oP ((logm)�2): (25)

In view of Lemma 5 below, the proof of (A.9) in AS pp. 598-599 works also for our eq. (24) where

we �nd that (�k;n(d) is de�ned in Lemma 5)

�a;b0 = OP

�(logm)am�1�m;n(d0) + (logm)

am'yn�'y + (logm)amd0+'wn�d0�'w

+ (logm)am2d0n�2d0 + (logm)a+1m2d0�1n�d0 + (logm)am�1=2�;

which is

OP

�(logm)a+2=3m�2=3 + (logm)am�1=2n�1=4 + (logm)a(m=n)min('y ;d0+'w;2d0)

+(logm)a+1m2d0�1n�d0 + (logm)am�1=2�


in the stationary case and

OP

�(logm)a+2=(5�4d0)m1=(5�4d0)�1 + (logm)a+1m2d0�2 + (logm)am(d0�1)=2n�1=2(log n)5=4

+(logm)a+1=2n�1=4md0�1 + (logm)a(m=n)min('y ;d0+'w;2d0) + (logm)a+1m2d0�1n�d0 + (logm)am�1=2�

in the nonstationary case. Since d0 < d2 < 3=4 and by (15), clearly �a;b0 = oP ((logm)�2) in both

cases.

To prove (25) we write ~Ga;b0;c(d0;�0)� Ga;b0(d0;�0) as

m�1mXj=1

�2d0j Iz(�j)

"1

(1 + hj(d0;�0))c+1

2 log j �

2hw(�w;0; �j)�2d0j log �j

(1 + hj(d0;�0))

!a� (2 log j)a

#�j

m

�2b0= m�1

mXj=1

�2d0j Iz(�j)

�1

1 +O((j=n)2d0)

�2 log j � O((j=n)

2d0 log n)

1 +O((j=n)2d0)

�a� (2 log j)a

��j

m

�2b0by Lemma 4(i). This proves (25) for a = 0 since

~G0;b0;c(d0;�0)� G0;b0(d0;�0) = m�1mXj=1

�2d0j Iz(�j)

�1

1 +O((j=n)2d0)� 1��

j

m

�2b0= OP

�(m=n)2d0G0;b0(d0;�0)

�= OP

�(m=n)2d0

�= oP ((logm)

�2)

because d0 belongs to the interior of the parameter space and is therefore bounded away from zero.

When a � 1 we apply the mean value theorem, i.e. xa = ya + (y � x)a�xa�1 for x � �x � y, such that 2 log j �


(1 + hj(d0;�0))

!a� (2 log j)a = a


(1 + hj(d0;�0))O((log j)a�1)

uniformly in $j = 1, \dots, m$. This implies that (25) is

m�1mXj=1

�2d0j Iz(�j)haO((j=n)2d0 log n)O((log j)a�1)

i� jm

�2b0= OP ((m=n)

2d0(log n)(logm)a�1G0;b0(d0;�0))

= OP

�(m=n)2d0(log n)a

�= oP ((logm)

�2):

C.3 Proof of (e)

We now prove part (e) since it will be useful in the proof of the remaining statements. By (24) and (25) with $a = b = c = 0$ we get that $\hat G(d_0,\theta_0) = G_0 (1 + o_P((\log m)^{-2}))$, so that, apart from smaller order terms,

B�1n Sn (d0;�0) = m�1=2mXj=1

Iz (�j)

gj (d0;�0)� 1

m

mXk=1

Iz (�k)

gk (d0;�0)

!~X0;j

= m�1=2mXj=1

�Iz (�j)

gj (d0;�0)� 1�

~X0;j �1

m

mXk=1

~X0;k

!; (26)


where

~Xj = (X1;j ; ~X02;j ; ~X

03;j)

0;

~X2;j =

��(j=m)2

(1 + hj(d;�)); : : : ;

�(j=m)2Ry(1 + hj(d;�))

�0;

~X3;j =

��(j=m)2d

(1 + hj(d;�)); : : : ;

�(j=m)2d+2Rw(1 + hj(d;�))

�0;

and ~X0;j is ~Xj evaluated at (d0;�0).

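As in AS p. 601 we write the right-hand side of (26) as $T_{1,n} + T_{2,n} + T_{3,n} + T_{4,n}$, where

$$T_{1,n} = m^{-1/2} \sum_{j=1}^{m} \left( \frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)} - 2\pi I_\varepsilon(\lambda_j) - E\left[ \frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)} - 2\pi I_\varepsilon(\lambda_j) \right] \right) \left( \tilde X_{0,j} - \frac{1}{m} \sum_{k=1}^{m} \tilde X_{0,k} \right),$$

$$T_{2,n} = m^{-1/2} \sum_{j=1}^{m} \left( \frac{E I_z(\lambda_j)}{f_z(\lambda_j)} - 1 \right) \frac{f_z(\lambda_j)}{g_j(d_0,\theta_0)} \left( \tilde X_{0,j} - \frac{1}{m} \sum_{k=1}^{m} \tilde X_{0,k} \right),$$

$$T_{3,n} = m^{-1/2} \sum_{j=1}^{m} \left( 2\pi I_\varepsilon(\lambda_j) - 1 \right) \left( \tilde X_{0,j} - \frac{1}{m} \sum_{k=1}^{m} \tilde X_{0,k} \right),$$

$$T_{4,n} = m^{-1/2} \sum_{j=1}^{m} \left( \frac{f_z(\lambda_j)}{g_j(d_0,\theta_0)} - 1 \right) \left( \tilde X_{0,j} - \frac{1}{m} \sum_{k=1}^{m} \tilde X_{0,k} \right).$$

Then we show that $T_{3,n} \xrightarrow{d} N(0, \Omega_{R_y,R_w})$ while $T_{i,n} = o_P(1)$ for $i = 1, 2, 4$.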

Clearly the proof for T3;n of AS works here as well. We just have to verify that

1

m

mXj=1

�2j ! �0Ry ;Rw�;

where

�j = �0(~X0;j �

1

m

mXk=1

~X0;k) and Ry ;Rw =

0B@ 4 �0Ry � 0Rw�Ry �Ry 0Ry ;Rw�Rw Rw;Ry Rw

1CA ;which follows from part (a) of the lemma.

To show the result for T1;n we use summation by parts:

T1;n = m�1=2m�1Xk=1

�~X0;k � ~X0;k+1

� kXj=1

�Iz (�j)

gj (d0;�0)� 2�I" (�j)� E

�Iz (�j)

gj (d0;�0)� 2�I" (�j)

��

+

~X0;m �

1

m

mXk=1

~X0;k

!m�1=2

mXj=1

�Iz (�j)

gj (d0;�0)� 2�I" (�j)� E

�Iz (�j)

gj (d0;�0)� 2�I" (�j)

��

= m�1=2m�1Xk=1

O(k�1)OP (�k;n(d0) + k'y+1=2n�'y + k1=2+2d0n�2d0)

+O(1)m�1=2OP (�m;n(d0) +m'y+1=2n�'y +m1=2+2d0n�2d0)

= OP (m�1=2(logm)�m;n(d0) + (m=n)

min('y ;2d0));


where �k;n(d) is de�ned in Lemma 5. The second equality above applies Lemma 5 and that ~X0;k �~X0;k+1 = O(k

�1) uniformly in k = 1; : : : ;m and ~X0;m � 1m

Pmk=1

~X0;k = O(1) (follows from approx-

imating sums by integrals, see also AS p. 602). Thus T1;n = OP ((logm)5=3m�1=6 + (logm)n�1=4 +

(m=n)min('y ;2d0)) in the stationary case and T1;n = OP ((logm)1+2=(5�4d0)m�(3�4d0)=(10�8d0)+(logm)2m2d0�3=2+

(logm)(log n)5=4n�1=2md0=2 + (logm)3=2n�1=4md0�1=2 + (m=n)min('y ;2d0)) in the nonstationary case.

Since d0 belongs to the interior of the parameter space it follows that T1;n = oP (1).
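The summation-by-parts (Abel) identity used for $T_{1,n}$, and again for $T_{4,n}$, rewrites $\sum_{j=1}^{m} a_j b_j$ as $\sum_{k=1}^{m-1} (a_k - a_{k+1}) S_k + a_m S_m$ with partial sums $S_k = \sum_{j=1}^{k} b_j$. In the proof, $a_j$ plays the role of the centered regressor and $b_j$ that of the normalized periodogram deviation. The following sketch, with the hypothetical function name `abel_sum` and arbitrary test sequences, verifies the identity numerically.

```python
import math

def abel_sum(a, b):
    """Compute sum_j a_j * b_j via summation by parts:
    sum_{k=1}^{m-1} (a_k - a_{k+1}) * S_k + a_m * S_m, where S_k = b_1 + ... + b_k."""
    m = len(a)
    partial = 0.0
    total = 0.0
    for k in range(m - 1):
        partial += b[k]
        total += (a[k] - a[k + 1]) * partial
    partial += b[m - 1]
    return total + a[m - 1] * partial

# Slowly varying weights (differences of order 1/k) and an oscillating sequence
a_seq = [math.log(j) for j in range(1, 51)]
b_seq = [math.cos(0.7 * j) for j in range(1, 51)]
direct = sum(x * y for x, y in zip(a_seq, b_seq))
by_parts = abel_sum(a_seq, b_seq)
print(direct, by_parts)
```

The identity is useful in the proof because the increments $a_k - a_{k+1}$ are of order $k^{-1}$, so bounds on the partial sums $S_k$ translate directly into bounds on the weighted sum.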

To prove the result for T2;n we use Robinson�s (1995b) Theorem 2, i.e., that EIy (�j) =fy (�j) =

1+O(j�1(log j)) uniformly in j = 1; : : : ;m in the stationary case, as well as Velasco�s (1999) Theorem

1, EIy (�j) =fy (�j) = 1 + O(j2d0�2(log j)) uniformly in j = 1; : : : ;m in the nonstationary case. Note

that, as in AS, the remainder terms are di¤erent from those of Robinson (1995b) and Velasco (1999)

because of the normalization by fy (�j) rather than by G0��2d0j . Thus, as in the proof of Lemma 5

we can write

EIz (�j)

fz (�j)� 1 =

fy (�j)� fz (�j)fz (�j)

�EIy (�j)

fy (�j)� 1�+

�EIy (�j)

fy (�j)� 1�

+2pfy (�j)

fz (�j)

E Re (Iyw(�j))pfy (�j)

+EIw(�j) + fy (�j)� fz (�j)

fz (�j):

Because EIw (�j) = fw(�j)+O(j�1(log j)) and fz (�j)�fy (�j) = fw (�j), the last term isO(j�1(log j)�2d0j ).

By the same reasoning and by independence of fytg and fwtg, the second to last term isOP (�d0j j�1(log j))in the stationary case and OP (�

d0j j

2d0�2(log j)) in the nonstationary case (see also the proof of Lemma

5 below and the second to last equation on p. 108 of Velasco (1999)). We thus obtain the bounds

EIz (�j) =fz (�j)�1 = O(j�1(log j)) for the stationary case and EIz (�j) =fz (�j)�1 = O(j2d0�2(log j))for the nonstationary case, for all j = 1; : : : ;m. We also have that fz (�j) =gj (d0;�0)�1 = O((j=n)'y+(j=n)2d0+'w) for all j = 1; : : : ;m by (13). Therefore, in the stationary case, T2;n can be bounded sim-

ilarly to (A.24) of AS,

T2;n = m�1=2mXj=1

O(j�1(log j))O(1)O(logm)

= O

0@m�1=2(logm)mXj=1

j�1(log j)

1A= O((logm)3m�1=2);

using also that j~X0;j � 1m

Pmk=1

~X0;kj = O(logm) uniformly in j = 1; : : : ;m. In the nonstationary casewe �nd in the same way that

T2;n = m�1=2mXj=1

O(j2d0�2(log j))O(1)O(logm)

= O

0@m�1=2(logm)mXj=1

j2d0�2(log j)

1A= O((logm)3m2d0�3=2):

In both the stationary and nonstationary cases, T2;n is o(1) since d0 < d2 < 3=4.


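The proof for $T_{4,n}$ follows from summation by parts and the approximation $f_z(\lambda_j)/g_j(d_0,\theta_0) - 1 = O((j/n)^{\varphi_y} + (j/n)^{2d_0+\varphi_w})$ for all $j = 1, \dots, m$, which implies that

$$\begin{aligned}
T_{4,n} &= m^{-1/2} \sum_{k=1}^{m-1} \left( \tilde X_{0,k} - \tilde X_{0,k+1} \right) \sum_{j=1}^{k} \left( \frac{f_z(\lambda_j)}{g_j(d_0,\theta_0)} - 1 \right) + \left( \tilde X_{0,m} - \frac{1}{m} \sum_{k=1}^{m} \tilde X_{0,k} \right) m^{-1/2} \sum_{j=1}^{m} \left( \frac{f_z(\lambda_j)}{g_j(d_0,\theta_0)} - 1 \right) \\
&= m^{-1/2} \sum_{k=1}^{m-1} O(k^{-1}) \sum_{j=1}^{k} O((j/n)^{\varphi_y} + (j/n)^{2d_0+\varphi_w}) + O(1)\, m^{-1/2} \sum_{j=1}^{m} O((j/n)^{\varphi_y} + (j/n)^{2d_0+\varphi_w}) \\
&= O(m^{1/2+\varphi_y} n^{-\varphi_y} + m^{1/2+2d_0+\varphi_w} n^{-2d_0-\varphi_w}).
\end{aligned}$$

Condition (15) shows that this is $o_P(1)$.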

C.4 Proof of (b), second statement

To prove the second statement of Lemma 3(b) we have to show that

1

mG (d;�)�1

mXj=1

GIz (�j)

gj (d;�)� 1

m

mXk=1

GIz (�k)

gk (d;�)

!(~Xj)i(~Xj)l; i; l = 2; : : : ; R+ 2;

� 1mG (d;�)�1

mXj=1

GIz (�j)

gj (d;�)� 1

m

mXk=1

GIz (�k)

gk (d;�)

!(~Xj)i

2hw(�w; �j)�2dj (log �j)

(1 + hj(d;�)); i = 2; : : : ; Ry + 1;

1

mG (d;�)�1

mXj=1

GIz (�j)

gj (d;�)� 1

m

mXk=1

GIz (�k)

gk (d;�)

!(~Xj)i2(log �j)

1�


(1 + hj(d;�))

!; i = Ry + 2; : : : ; R+ 2;

1

mG (d;�)�1

mXj=1

GIz (�j)

gj (d;�)� 1

m

mXk=1

GIz (�k)

gk (d;�)

!(~Xj)Ry+24hw(�w; �j) (log �j)

2

1�


(1 + hj(d;�))

!;

are all negligible when evaluated at (d0;�0). Note that it su¢ ces to prove the result for the generic

term

Vn (d;�) =1

mG (d;�)�1

mXj=1

GIz (�j)

gj (d;�)� 1

m

mXk=1

GIz (�k)

gk (d;�)

!(~Xj)Ry+2qj(d;�); (27)


where qj(d0;�0) depends on j but is at most of order O (log n) and satis�es qj+1(d0;�0)� qj(d0;�0) =O(j�1) uniformly in j = 1; : : : ;m. Summation by parts on Vn(d0;�0) yields

Vn (d0;�0) =1

mG (d0;�0)

�1 qm(d0;�0)mXj=1

GIz (�j)

gj (d0;�0)� 1

m

mXk=1

GIz (�k)

gk (d0;�0)

!(~X0;j)Ry+2

+1

mG (d0;�0)

�1m�1Xk=1

(qk(d0;�0)� qk+1(d0;�0))kXj=1

GIz (�j)

gj (d0;�0)� 1

m

mXk=1

GIz (�k)

gk (d0;�0)

!(~X0;j)Ry+2

= m�1qm(d0;�0)OP (m1=2) +m�1

m�1Xk=1

(qk(d0;�0)� qk+1(d0;�0))OP (k1=2)

= OP

�m�1=2(log n) +m�1=2

�;

where the second equality follows from part (e) of the lemma.

C.5 Proof of (c)

First we prove the result for H1n, where we need to show that

sup�2�

�� ~Ga;b0;c(d0;�)� ~Ga;b0;c(d0;�0)�� = oP ((logm)�2)

for a; c = 0; 1; 2 and b = 0; 1; : : : ; 2Ry; d; d + 1; : : : ; d + Rw + Ry; 2d; 2d + 1; : : : ; 2d + 2Rw. By the

triangle inequality it su¢ ces to show that

sup�2�

�� ~Ga;b0;c(d0;�)� Ga;b0(d0;�)��+ sup�2�

��Ga;b0(d0;�)� Ga;b0(d0;�0)��+ ~�a;b0;c = oP ((logm)�2): (28)

We showed in (25) that ~�a;b0;c = oP ((logm)�2).

Following the proof of (25), the �rst term on the left-hand side of (28) is

sup�2�

��m�1mXj=1

�2d0j Iz(�j)

"1

(1 + hj(d0;�))c+1

2 log j �

2hw(�w; �j)�2d0j log �j

(1 + hj(d0;�))

!a� (2 log j)a

#�j

m

�2b0��= m�1

mXj=1

�2d0j Iz(�j)

�1

1 +O((j=n)2d0)

�2 log j � O((j=n)

2d0 log n)

1 +O((j=n)2d0)

�a� (2 log j)a

��j

m

�2b0by Lemma 4(i), which proves the result for the �rst term of (28) by the same arguments as those

applied to (25).

For n su¢ ciently large, gj (d0;�0) > 0 for all j = 1; : : : ;m, and then the second term on the

left-hand side of (28) is

sup�2�

��m�1mXj=1

Iz(�j)

gj (d0;�0)

�gj (d0;�0)

gj (d0;�)� 1�(2 log j)a

�j

m

�2b0��= sup

�2�;j=1;:::;m

��gj (d0;�0)gj (d0;�)� 1��m�1

mXj=1

Iz(�j)

gj (d0;�0)(2 log j)a

�j

m

�2b0= Ga;b0(d0;�0) sup

�2�;j=1;:::;m

��hj (d0;�0)� hj (d0;�)1 + hj (d0;�)

�� ;90

noting that all the terms inside the summation on the right-hand side of the second equality are

positive. From Lemma 4(ii) and the fact that Ga;b0(d0;�0) = OP ((logm)a) by (24), it thus follows

that the second term on the left-hand side of (28) is OP ((logm)a(1+ o(1))�1�2d0m ), which proves (28).

Next we prove the result for H2n. Again, it su¢ ces to show the result for the generic term Vn (d;�)

de�ned in (27), i.e. we must show that sup�2� jVn(d0;�)� Vn(d0;�0)j = oP (1). By (24) and (28) wehave that

sup�2�

G(d0;�) = G(1 + oP ((logm)�2)); (29)

and sup�2� jVn(d0;�)� Vn(d0;�0)j is, apart from a term that is negligible uniformly in �,

sup�2�

�� 1mmXj=1

Iz (�j)

gj (d0;�)� 1

m

mXk=1

Iz (�k)

gk (d0;�)

!(j=m)2d0

(1 + hj(d0;�))qj(d0;�)

� 1m

mXj=1

Iz (�j)

gj (d0;�0)� 1

m

mXk=1

Iz (�k)

gk (d0;�0)

!(j=m)2d0

(1 + hj(d0;�0))qj(d0;�0)

�� sup

�2�

�� 1mmXj=1

�Iz (�j)

gj (d0;�)

(j=m)2d0

(1 + hj(d0;�))qj(d0;�)�

Iz (�j)

gj (d0;�0)

(j=m)2d0

(1 + hj(d0;�0))qj(d0;�0)

�� (30)+ sup�2�

�� 1mmXj=1

1

m

mXk=1

�Iz (�k)

gk (d0;�)

qj(d0;�)

(1 + hj(d0;�))� Iz (�k)

gk (d0;�0)

qj(d0;�0)

(1 + hj(d0;�0))

��j

m

�2d0�� :(31)By the triangle inequality, (30) is bounded by

sup�2�

�� 1mmXj=1

Iz (�j)

gj (d0;�0)

�j

m

�2d0 � qj(d0;�)

(1 + hj(d0;�))� qj(d0;�0)

(1 + hj(d0;�0))

�� (32)

+ sup�2�

�� 1mmXj=1

�j

m

�2d0 � Iz (�j)

gj (d0;�)� Iz (�j)

gj (d0;�0)

�qj(d0;�)

(1 + hj(d0;�))

�� : (33)

Note that, by inspection of the de�nition of qj(d;�) in (27) we have two prototypical expressions for

the di¤erence appearing in (32),

qj(d0;�)

(1 + hj(d0;�))� qj(d0;�0)

(1 + hj(d0;�0))=

8<: O�

(j=m)2d0

(1+hj(d0;�))� (j=m)2d0

(1+hj(d0;�0))

�;

O�4 (hw(�w; �j)� hw(�w;0; �j)) (m=n)2d0 (log �j)2

�:

(34)

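Inserting the first term of (34) into (32) we obtain, since for $n$ sufficiently large $g_j(d_0,\theta_0)>0$ for all $j=1,\dots,m$,

\begin{align*}
(32) &= O_P\Bigg(\sup_{\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}\Big(\frac{j}{m}\Big)^{4d_0}\frac{(1+h_j(d_0,\theta_0))-(1+h_j(d_0,\theta))}{(1+h_j(d_0,\theta))(1+h_j(d_0,\theta_0))}\Bigg|\Bigg) \\
&= O_P\Bigg(G_{0,2d_0}(d_0,\theta_0)\sup_{\theta\in\Theta,\,j=1,\dots,m}\Bigg|\frac{h_j(d_0,\theta_0)-h_j(d_0,\theta)}{(1+h_j(d_0,\theta))(1+h_j(d_0,\theta_0))}\Bigg|\Bigg),
\end{align*}

and by Lemma 4(ii) it follows that (32) is $O_P(\lambda_m^{2d_0})$ in this case. Inserting the second term of (34) into (32) we obtain

\begin{align*}
(32) &= O_P\Bigg(G_{0,d_0}(d_0,\theta_0)\sup_{\theta\in\Theta,\,j=1,\dots,m}\Big|\big(h_w(\theta_w,\lambda_j)-h_w(\theta_{w,0},\lambda_j)\big)(m/n)^{2d_0}(\log\lambda_j)^2\Big|\Bigg) \\
&= O_P\big((m/n)^{2d_0}(\log n)^2\big)
\end{align*}

by compactness of $\Theta$. It follows that (32) is $o_P(1)$. Applying summation by parts to (33) we get the bound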

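\begin{align*}
&\sup_{\theta\in\Theta}\Bigg|\frac{q_m(d_0,\theta)}{1+h_m(d_0,\theta)}\frac{1}{m}\sum_{j=1}^{m}\Big(\frac{j}{m}\Big)^{2d_0}\Bigg(\frac{I_z(\lambda_j)}{g_j(d_0,\theta)}-\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}\Bigg)\Bigg| \\
&\quad+\sup_{\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{k=1}^{m-1}\Bigg(\frac{q_k(d_0,\theta)}{1+h_k(d_0,\theta)}-\frac{q_{k+1}(d_0,\theta)}{1+h_{k+1}(d_0,\theta)}\Bigg)\sum_{j=1}^{k}\Big(\frac{j}{m}\Big)^{2d_0}\Bigg(\frac{I_z(\lambda_j)}{g_j(d_0,\theta)}-\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}\Bigg)\Bigg|,
\end{align*}

where the first term is

\[
\sup_{\theta\in\Theta}\Bigg|\frac{q_m(d_0,\theta)}{1+h_m(d_0,\theta)}\frac{1}{G}\big(G_{0,d_0}(d_0,\theta)-G_{0,d_0}(d_0,\theta_0)\big)\Bigg| = o_P\big((\log n)(\log m)^{-2}\big)
\]

by (24), (28), and $\sup_{\theta\in\Theta}q_m(d_0,\theta)=O(\log n)$, and the second term is

\begin{align*}
&O_P\Bigg(\sup_{\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{k=2}^{m-1}\Bigg(\frac{q_k(d_0,\theta)}{1+h_k(d_0,\theta)}-\frac{q_{k+1}(d_0,\theta)}{1+h_{k+1}(d_0,\theta)}\Bigg)\Big(\frac{k}{m}\Big)^{2d_0}k(\log k)^{-2}\Bigg|\Bigg) \\
&= O_P\Bigg(\sup_{\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{k=2}^{m-1}\frac{q_k(d_0,\theta)-q_{k+1}(d_0,\theta)}{1+h_k(d_0,\theta)}\Big(\frac{k}{m}\Big)^{2d_0}k(\log k)^{-2}\Bigg|\Bigg) \\
&\quad+O_P\Bigg(\sup_{\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{k=2}^{m-1}q_{k+1}(d_0,\theta)\Bigg(\frac{1}{1+h_k(d_0,\theta)}-\frac{1}{1+h_{k+1}(d_0,\theta)}\Bigg)\Big(\frac{k}{m}\Big)^{2d_0}k(\log k)^{-2}\Bigg|\Bigg),
\end{align*}

which, using $q_k(d_0,\theta)-q_{k+1}(d_0,\theta)=O(k^{-1})$ and $q_{k+1}(d_0,\theta)=O(\log n)$ for any $\theta$, is

\begin{align*}
&O_P\Bigg(\frac{1}{m}\sum_{k=2}^{m-1}\Big(\frac{k}{m}\Big)^{2d_0}(\log k)^{-2}+\frac{1}{m}\sum_{k=1}^{m-1}(\log n)\big(\lambda_{k+1}^{2d_0}-\lambda_k^{2d_0}\big)\Big(\frac{k}{m}\Big)^{2d_0}k(\log k)^{-2}\Bigg) \\
&= O_P\Bigg(\frac{1}{m}\sum_{k=2}^{m-1}\Big(\frac{k}{m}\Big)^{2d_0}(\log k)^{-2}+(\log n)\lambda_m^{2d_0}\frac{1}{m}\sum_{k=1}^{m-1}\Big(\frac{k}{m}\Big)^{2d_0}(\log k)^{-2}\Bigg) \\
&= O_P\big((\log m)^{-2}+(\log n)(\log m)^{-2}(m/n)^{2d_0}\big).
\end{align*}

Thus both terms of (33) are $o_P(1)$ under (15).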
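The summation-by-parts step used for (33) rests on the identity $\sum_{j=1}^m a_j b_j = a_m S_m + \sum_{k=1}^{m-1}(a_k-a_{k+1})S_k$ with $S_k=\sum_{j=1}^k b_j$. The following is only a numerical sanity check of that identity in Python; the arrays `a` and `b` are arbitrary stand-ins chosen for illustration (they are not the quantities $q_j/(1+h_j)$ or the periodogram differences from the proof).

```python
import numpy as np

def sum_by_parts(a, b):
    """Evaluate sum_j a_j * b_j as a_m * S_m + sum_{k<m} (a_k - a_{k+1}) * S_k,
    where S_k = b_1 + ... + b_k are the partial sums of b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    S = np.cumsum(b)
    return a[-1] * S[-1] + np.sum((a[:-1] - a[1:]) * S[:-1])

# Illustrative inputs only: a smooth weight sequence and a noisy sequence.
rng = np.random.default_rng(0)
m = 200
a = np.log(np.arange(1, m + 1) + 1.0)
b = rng.standard_normal(m)

direct = float(np.dot(a, b))
by_parts = float(sum_by_parts(a, b))
```

The rearrangement is useful in the proof because the increments $a_k-a_{k+1}$ of a smooth weight are small, so the bound is driven by the partial sums $S_k$, for which rates are available.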

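Along the same lines we rewrite (31) as

\begin{align*}
&\sup_{\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\Bigg(\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}-\frac{q_j(d_0,\theta_0)}{1+h_j(d_0,\theta_0)}\Bigg)\Big(\frac{j}{m}\Big)^{2d_0}\frac{1}{m}\sum_{k=1}^{m}\frac{I_z(\lambda_k)}{g_k(d_0,\theta_0)}\Bigg| \\
&\quad+\sup_{\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}\Big(\frac{j}{m}\Big)^{2d_0}\frac{1}{m}\sum_{k=1}^{m}\Bigg(\frac{I_z(\lambda_k)}{g_k(d_0,\theta)}-\frac{I_z(\lambda_k)}{g_k(d_0,\theta_0)}\Bigg)\Bigg|
\end{align*}

and, using the definition of $G_{a,b}(d,\theta)$, this is equal to

\begin{align*}
&\sup_{\theta\in\Theta}\Bigg|\frac{G_{0,0}(d_0,\theta_0)}{G}\frac{1}{m}\sum_{j=1}^{m}\Bigg(\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}-\frac{q_j(d_0,\theta_0)}{1+h_j(d_0,\theta_0)}\Bigg)\Big(\frac{j}{m}\Big)^{2d_0}\Bigg| \\
&\quad+\sup_{\theta\in\Theta}\Bigg|\frac{1}{G}\big(G_{0,0}(d_0,\theta)-G_{0,0}(d_0,\theta_0)\big)\frac{1}{m}\sum_{j=1}^{m}\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}\Big(\frac{j}{m}\Big)^{2d_0}\Bigg| \\
&= O_P\Bigg(\sup_{\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\Bigg(\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}-\frac{q_j(d_0,\theta_0)}{1+h_j(d_0,\theta_0)}\Bigg)\Big(\frac{j}{m}\Big)^{2d_0}\Bigg|\Bigg) \\
&\quad+o_P\Bigg(\sup_{\theta\in\Theta}\Bigg|(\log m)^{-2}\frac{1}{m}\sum_{j=1}^{m}\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}\Big(\frac{j}{m}\Big)^{2d_0}\Bigg|\Bigg),
\end{align*}

where the second term is easily seen to be $o_P((\log m)^{-2}(\log n)) = o_P(1)$. By (34), the first term is

\begin{align*}
&O_P\Bigg(\sup_{\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\Bigg(\frac{(j/m)^{4d_0}}{1+h_j(d_0,\theta)}-\frac{(j/m)^{4d_0}}{1+h_j(d_0,\theta_0)}\Bigg)\Bigg|\Bigg) \\
&\quad+O_P\Bigg(\sup_{\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\big(h_w(\theta_w,\lambda_j)-h_w(\theta_{w,0},\lambda_j)\big)(m/n)^{2d_0}(\log\lambda_j)^2\Big(\frac{j}{m}\Big)^{2d_0}\Bigg|\Bigg) \\
&= O_P\Bigg(\frac{1}{m}\sum_{j=1}^{m}\Big(\frac{j}{m}\Big)^{4d_0}\sup_{\theta\in\Theta}\Bigg|\frac{h_j(d_0,\theta_0)-h_j(d_0,\theta)}{(1+h_j(d_0,\theta))(1+h_j(d_0,\theta_0))}\Bigg|\Bigg)
+O_P\Bigg((m/n)^{2d_0}\frac{1}{m}\sum_{j=1}^{m}(\log\lambda_j)^2\Big(\frac{j}{m}\Big)^{2d_0}\Bigg) \\
&= O_P\Bigg(\frac{1}{m}\sum_{j=1}^{m}\Big(\frac{j}{m}\Big)^{4d_0}\lambda_j^{2d_0}+(m/n)^{2d_0}(\log n)^2\frac{1}{m}\sum_{j=1}^{m}\Big(\frac{j}{m}\Big)^{2d_0}\Bigg) \\
&= O_P\big((m/n)^{2d_0}(\log n)^2\big),
\end{align*}

using compactness of $\Theta$ and Lemma 4. This is $o_P(1)$, which proves part (c).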

C.6 Proof of (d)

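Again, we first prove the result for $H_{1n}$, which follows if

\[
\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\big|\tilde G_{a,b,c}(d,\theta)-\tilde G_{a,b_0,c}(d_0,\theta)\big| = o_P((\log m)^{-2}) \tag{35}
\]

for $a,c=0,1,2$ and $b=0,1,\dots,2R_y,\ d,d+1,\dots,d+R_w+R_y,\ 2d,2d+1,\dots,2d+2R_w$. Defining

\begin{align*}
\tilde E_{a,b,c}(d,\theta) &= \frac{1}{m}\sum_{j=1}^{m}\frac{j^{2d}I_z(\lambda_j)}{(1+h_j(d,\theta))^{c+1}}\Bigg(2\log j-\frac{2h_w(\theta_w,\lambda_j)\lambda_j^{2d}\log\lambda_j}{1+h_j(d,\theta)}\Bigg)^{a}\Big(\frac{j}{m}\Big)^{2b}, \\
E_{a,b}(d,\theta) &= \frac{1}{m}\sum_{j=1}^{m}\frac{j^{2d}I_z(\lambda_j)}{1+h_j(d,\theta)}(2\log j)^{a}\Big(\frac{j}{m}\Big)^{2b},
\end{align*}

we need to show that, for all $a,c=0,1,2$ and $b=0,1,\dots,2R_y,\ d,d+1,\dots,d+R_w+R_y,\ 2d,2d+1,\dots,2d+2R_w$,

\[
Z_{a,b,c}(\epsilon_n) := \sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\big|\tilde E_{a,b,c}(d,\theta)-\tilde E_{a,b_0,c}(d_0,\theta)\big| = o_P(n^{2d_0}(\log m)^{-2});
\]

see also AS p. 600. Note that since $b$ can take values including $d$, we distinguish between $b$ and $b_0$, which are obviously the same in case $b=0,1,\dots,2R_y$. By the triangle inequality it is sufficient to show that

\begin{align*}
&\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\big|\tilde E_{a,b,c}(d,\theta)-E_{a,b}(d,\theta)\big|+\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\big|E_{a,b}(d,\theta)-E_{a,b_0}(d_0,\theta)\big|+\sup_{\theta\in\Theta}\big|E_{a,b_0}(d_0,\theta)-\tilde E_{a,b_0,c}(d_0,\theta)\big| \\
&=: Z_{1,a,b,c}(\epsilon_n)+Z_{2,a,b,c}(\epsilon_n)+Z_{3,a,b_0,c}(\epsilon_n) = o_P(n^{2d_0}(\log m)^{-2}).
\end{align*}

The result for $Z_{3,a,b_0,c}(\epsilon_n)$ follows from part (c) of the lemma since it does not depend on $d$.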

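For $Z_{2,a,b,c}(\epsilon_n)$ we find that

\begin{align*}
Z_{2,a,b,c}(\epsilon_n)
&= \sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\Bigg[\frac{j^{2d}I_z(\lambda_j)}{1+h_j(d,\theta)}\Big(\frac{j}{m}\Big)^{2b}-\frac{j^{2d_0}I_z(\lambda_j)}{1+h_j(d_0,\theta)}\Big(\frac{j}{m}\Big)^{2b_0}\Bigg](2\log j)^{a}\Bigg| \\
&\le \sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\big(j^{2d}-j^{2d_0}\big)I_z(\lambda_j)\frac{1}{1+h_j(d,\theta)}\Big(\frac{j}{m}\Big)^{2b}(2\log j)^{a}\Bigg| \\
&\quad+\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}j^{2d_0}I_z(\lambda_j)\Bigg(\frac{1+h_j(d,\theta)}{1+h_j(d_0,\theta)}\Big(\frac{j}{m}\Big)^{2b_0-2b}-1\Bigg)\frac{1}{1+h_j(d,\theta)}\Big(\frac{j}{m}\Big)^{2b}(2\log j)^{a}\Bigg|.
\end{align*}

Since $\Theta$ is compact and $0<d_1\le d\le d_2<1$, for $n$ sufficiently large it holds that

\[
\inf_{d\in[d_1,d_2],\,\theta\in\Theta,\,j=1,\dots,m}|1+h_j(d,\theta)|\ge c>0,\qquad \sup_{b\ge 0,\,j=1,\dots,m}|j/m|^{2b}=1. \tag{36}
\]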

Thus, the first term of $Z_{2,a,b,c}(\epsilon_n)$ is bounded by

\[
\sup_{d\in D_m(\epsilon_n)}\Bigg|c^{-1}\frac{1}{m}\sum_{j=1}^{m}j^{2d_0}I_z(\lambda_j)(2\log j)^{a}\big|j^{2d-2d_0}-1\big|\Bigg|,
\]

which is $o_P(n^{2d_0}(\log m)^{-2})$ as in (A.18) of AS.

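The second term of $Z_{2,a,b,c}(\epsilon_n)$ is bounded by

\begin{align*}
&\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}j^{2d_0}I_z(\lambda_j)\Bigg(\frac{1+h_j(d,\theta)}{1+h_j(d_0,\theta)}-1\Bigg)\frac{1}{1+h_j(d,\theta)}\Big(\frac{j}{m}\Big)^{2b}(2\log j)^{a}\Bigg| \tag{37} \\
&\quad+\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}j^{2d_0}I_z(\lambda_j)\frac{1+h_j(d,\theta)}{1+h_j(d_0,\theta)}\Bigg(\Big(\frac{j}{m}\Big)^{2b_0}-\Big(\frac{j}{m}\Big)^{2b}\Bigg)\frac{1}{1+h_j(d,\theta)}(2\log j)^{a}\Bigg|, \tag{38}
\end{align*}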

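and using (36) and Lemma 4(iii) we find that (37) is

\[
o_P\Bigg(\frac{1}{m}\sum_{j=1}^{m}j^{2d_0}I_z(\lambda_j)\lambda_m^{2d_1}(\log m)^{a}\Bigg) = o_P\Bigg(\frac{1}{m}\sum_{j=1}^{m}\lambda_j^{2d_0}I_z(\lambda_j)\Big(\frac{n}{2\pi}\Big)^{2d_0}\lambda_m^{2d_1}(\log m)^{a}\Bigg).
\]

Noting that $m^{-1}\sum_{j=1}^{m}\lambda_j^{2d_0}I_z(\lambda_j) = G_{0,0}(d_0,0) = G(1+o_P((\log m)^{-2}))$, (37) is equal to $o_P(n^{2d_0}\lambda_m^{2d_1}(\log m)^{a}) = o_P(n^{2d_0}(\log m)^{-2})$. By the mean value theorem, $x^{a}=x^{b}+(a-b)x^{a^*}(\log x)$ for $a\ge a^*\ge b$, which implies that

\[
\sup_{d\in D_m(\epsilon_n)}\Bigg|\Big(\frac{j}{m}\Big)^{2b_0}-\Big(\frac{j}{m}\Big)^{2b}\Bigg| = O\Bigg(\sup_{d\in D_m(\epsilon_n)}(b_0-b)(\log m)\Bigg) = O\big((\log m)^{-5}\epsilon_n(\log m)\big).
\]

Thus, applying also (36) and Lemma 4(iii), (38) is

\begin{align*}
O_P\Bigg(\frac{1}{m}\sum_{j=1}^{m}j^{2d_0}I_z(\lambda_j)(\log m)^{a-4}\epsilon_n\Bigg)
&= O_P\big(\epsilon_n n^{2d_0}(\log m)^{a-4}G_{0,0}(d_0,0)\big) \\
&= O_P\big(\epsilon_n n^{2d_0}(\log m)^{a-4}\big) = o_P\big(n^{2d_0}(\log m)^{-2}\big)
\end{align*}

since $\epsilon_n=o(1)$ and $a\le 2$.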
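The mean value step above can be checked numerically. For $x=j/m\in(0,1]$ and nonnegative exponents, $x^{a^*}\le 1$, so $|x^{a}-x^{b}|\le|a-b|\,|\log x|\le|a-b|\log m$. The sketch below verifies this bound in Python; the values of `m`, `b0` and `b` are illustrative assumptions and are not taken from the text.

```python
import numpy as np

def power_gap(x, a, b):
    """Exact gap |x**a - x**b|."""
    return np.abs(x ** a - x ** b)

def mvt_bound(x, a, b):
    """Mean value bound |a - b| * |log x|, valid for 0 < x <= 1 and a, b >= 0
    because the intermediate power x**a_star is then at most 1."""
    return np.abs(a - b) * np.abs(np.log(x))

m = 500                              # illustrative bandwidth
x = np.arange(1, m + 1) / m          # the ratios j/m, j = 1, ..., m
b0, b = 0.40, 0.41                   # illustrative exponents (b close to b0)

gap = power_gap(x, 2 * b0, 2 * b)
bound = mvt_bound(x, 2 * b0, 2 * b)
worst_case = 2 * abs(b0 - b) * np.log(m)   # uniform bound over j = 1, ..., m
```

The uniform bound is attained at $j=1$, which is why a factor $\log m$ (rather than something smaller) appears in the rate for (38).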

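Next, $Z_{1,a,b,c}(\epsilon_n)$ is

\[
\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\frac{j^{2d_0}I_z(\lambda_j)}{1+h_j(d,\theta)}j^{2d-2d_0}\Bigg[\frac{1}{(1+h_j(d,\theta))^{c}}\Bigg(2\log j-\frac{2h_w(\theta_w,\lambda_j)\lambda_j^{2d}\log\lambda_j}{1+h_j(d,\theta)}\Bigg)^{a}-(2\log j)^{a}\Bigg]\Big(\frac{j}{m}\Big)^{2b}\Bigg|.
\]

If $a=0$ the result follows by (36), Lemma 4(iv), $\sup_{d\in D_m(\epsilon_n),\,j=1,\dots,m}j^{2d-2d_0}=O(1)$, and $\frac{1}{m}\sum_{j=1}^{m}j^{2d_0}I_z(\lambda_j)=G_{0,0}(d_0,0)(2\pi/n)^{-2d_0}=O_P(n^{2d_0})$. When $a\ge 1$ we apply the mean value theorem as in the proof of (25) such that

\[
\Bigg(2\log j-\frac{2h_w(\theta_w,\lambda_j)\lambda_j^{2d}\log\lambda_j}{1+h_j(d,\theta)}\Bigg)^{a}-(2\log j)^{a}=O\big((\log j)^{a-1}\lambda_j^{2d}(\log n)\big) \tag{39}
\]

uniformly in $\theta\in\Theta$. We then bound $Z_{1,a,b,c}(\epsilon_n)$ as

\begin{align*}
&\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\frac{j^{2d_0}I_z(\lambda_j)}{1+h_j(d,\theta)}j^{2d-2d_0}\Bigg[\Bigg(2\log j-\frac{2h_w(\theta_w,\lambda_j)\lambda_j^{2d}\log\lambda_j}{1+h_j(d,\theta)}\Bigg)^{a}-(2\log j)^{a}\Bigg]\Big(\frac{j}{m}\Big)^{2b}\Bigg| \\
&\quad+\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\frac{j^{2d_0}I_z(\lambda_j)}{1+h_j(d,\theta)}j^{2d-2d_0}\Bigg(\frac{1}{(1+h_j(d,\theta))^{c}}-1\Bigg)\Bigg(2\log j-\frac{2h_w(\theta_w,\lambda_j)\lambda_j^{2d}\log\lambda_j}{1+h_j(d,\theta)}\Bigg)^{a}\Big(\frac{j}{m}\Big)^{2b}\Bigg|,
\end{align*}

where the first term is $G_{0,0}(d_0,0)(2\pi/n)^{-2d_0}O_P((\log m)^{a-1}\lambda_m^{2d_1}(\log n)) = o_P(n^{2d_0}(\log m)^{-2})$ by (36) and (39), and the second term is $G_{0,0}(d_0,0)(2\pi/n)^{-2d_0}o_P(\lambda_m^{2d_1}(\log m)^{a}) = o_P(n^{2d_0}(\log m)^{-2})$ by (36), (39), and Lemma 4(iv).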


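We proceed to show that $\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}B_n^{-1}\|H_{2n}(d,\theta)-H_{2n}(d_0,\theta)\|B_n^{-1} = o_P(1)$ or equivalently that $\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}|V_n(d,\theta)-V_n(d_0,\theta)| = o_P(1)$. Since we have shown (35) we have that $G(d,\theta)\xrightarrow{P}G$ uniformly in $\theta\in\Theta$, $d\in D_m(\epsilon_n)$, so we need to show that the following is $o_P(1)$:

\begin{align*}
&\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\Bigg(\frac{I_z(\lambda_j)}{g_j(d,\theta)}-\frac{1}{m}\sum_{k=1}^{m}\frac{I_z(\lambda_k)}{g_k(d,\theta)}\Bigg)\frac{(j/m)^{2d}}{1+h_j(d,\theta)}\,q_j(d,\theta) \\
&\qquad-\frac{1}{m}\sum_{j=1}^{m}\Bigg(\frac{I_z(\lambda_j)}{g_j(d_0,\theta)}-\frac{1}{m}\sum_{k=1}^{m}\frac{I_z(\lambda_k)}{g_k(d_0,\theta)}\Bigg)\frac{(j/m)^{2d_0}}{1+h_j(d_0,\theta)}\,q_j(d_0,\theta)\Bigg| \\
&\le \sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\Bigg(\frac{I_z(\lambda_j)}{g_j(d,\theta)}\frac{q_j(d,\theta)}{1+h_j(d,\theta)}\Big(\frac{j}{m}\Big)^{2d}-\frac{I_z(\lambda_j)}{g_j(d_0,\theta)}\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}\Big(\frac{j}{m}\Big)^{2d_0}\Bigg)\Bigg| \tag{40} \\
&\quad+\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\frac{1}{m}\sum_{k=1}^{m}\Bigg(\frac{I_z(\lambda_k)}{g_k(d_0,\theta)}\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}\Big(\frac{j}{m}\Big)^{2d_0}-\frac{I_z(\lambda_k)}{g_k(d,\theta)}\frac{q_j(d,\theta)}{1+h_j(d,\theta)}\Big(\frac{j}{m}\Big)^{2d}\Bigg)\Bigg|. \tag{41}
\end{align*}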

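By the triangle inequality we get the bounds

\begin{align*}
(40) &\le \sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\frac{I_z(\lambda_j)}{g_j(d_0,\theta)}\Bigg(\Big(\frac{j}{m}\Big)^{2d}-\Big(\frac{j}{m}\Big)^{2d_0}\Bigg)\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}\Bigg| \tag{42} \\
&\quad+\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\frac{I_z(\lambda_j)}{g_j(d_0,\theta)}\Big(\frac{j}{m}\Big)^{2d}\Bigg(\frac{g_j(d_0,\theta)}{g_j(d,\theta)}-1\Bigg)\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}\Bigg| \tag{43} \\
&\quad+\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\frac{I_z(\lambda_j)}{g_j(d,\theta)}\Big(\frac{j}{m}\Big)^{2d}\Bigg(\frac{q_j(d,\theta)}{1+h_j(d,\theta)}-\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}\Bigg)\Bigg| \tag{44}
\end{align*}

and

\begin{align*}
(41) &\le \sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\Bigg(\Big(\frac{j}{m}\Big)^{2d_0}-\Big(\frac{j}{m}\Big)^{2d}\Bigg)\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}\frac{1}{m}\sum_{k=1}^{m}\frac{I_z(\lambda_k)}{g_k(d_0,\theta)}\Bigg| \tag{45} \\
&\quad+\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\Big(\frac{j}{m}\Big)^{2d}\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}\frac{1}{m}\sum_{k=1}^{m}\frac{I_z(\lambda_k)}{g_k(d_0,\theta)}\Bigg(1-\frac{g_k(d_0,\theta)}{g_k(d,\theta)}\Bigg)\Bigg| \tag{46} \\
&\quad+\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\Big(\frac{j}{m}\Big)^{2d}\Bigg(\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}-\frac{q_j(d,\theta)}{1+h_j(d,\theta)}\Bigg)\frac{1}{m}\sum_{k=1}^{m}\frac{I_z(\lambda_k)}{g_k(d,\theta)}\Bigg|. \tag{47}
\end{align*}

The required results for (42) and (45) follow using the mean value theorem as in (38), whereas the results for (43) and (46) follow as in (37). For (44) and (47) we note that, by inspection of the definition of $q_j(d,\theta)$ in (27), cf. (34), it is sufficient to show the result for

\begin{align*}
\frac{q_j(d,\theta)}{1+h_j(d,\theta)}-\frac{q_j(d_0,\theta)}{1+h_j(d_0,\theta)}
&= \frac{(j/m)^{2d}}{1+h_j(d,\theta)}-\frac{(j/m)^{2d_0}}{1+h_j(d_0,\theta)} \\
&= \Big(\frac{j}{m}\Big)^{2d}\Bigg(\frac{1}{1+h_j(d,\theta)}-\frac{1}{1+h_j(d_0,\theta)}\Bigg)+\frac{1}{1+h_j(d_0,\theta)}\Bigg(\Big(\frac{j}{m}\Big)^{2d}-\Big(\frac{j}{m}\Big)^{2d_0}\Bigg).
\end{align*}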


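Inserting this into (44) ((47) follows the same way) we get the bound

\begin{align*}
(44) &\le \sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\frac{I_z(\lambda_j)}{g_j(d,\theta)}\Big(\frac{j}{m}\Big)^{4d}\Bigg(\frac{1}{1+h_j(d,\theta)}-\frac{1}{1+h_j(d_0,\theta)}\Bigg)\Bigg| \\
&\quad+\sup_{d\in D_m(\epsilon_n),\,\theta\in\Theta}\Bigg|\frac{1}{m}\sum_{j=1}^{m}\frac{I_z(\lambda_j)}{g_j(d,\theta)}\Big(\frac{j}{m}\Big)^{2d}\frac{1}{1+h_j(d_0,\theta)}\Bigg(\Big(\frac{j}{m}\Big)^{2d}-\Big(\frac{j}{m}\Big)^{2d_0}\Bigg)\Bigg|,
\end{align*}

which we can handle similarly to (37) and (38), respectively.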

Appendix D: Auxiliary lemmas

We now state two useful lemmas, which are used in the proofs of the main theorems. The first is stated without proof and gathers some properties of the function $h_j(d,\theta)$, which all follow by compactness of $\Theta$.

Lemma 4 Let $h_j(d,\theta)=h(d,\theta,\lambda_j)=\sum_{r=1}^{R_y}\theta_{y,r}\lambda_j^{2r}+\lambda_j^{2d}\sum_{r=0}^{R_w}\theta_{w,r}\lambda_j^{2r}$, $0<d_1<d_2<1$, and let $\Theta$ be compact. Then, as $n\to\infty$ and for $c=0,1,2$,

(i) $\sup_{\theta\in\Theta}|(1+h_j(d_0,\theta))^{c+1}-1| = O(\sup_{\theta\in\Theta}h_j(d_0,\theta)) = O((j/n)^{2d_0})$;

(ii) $\inf_{d\in[d_1,d_2],\,\theta\in\Theta,\,j=1,\dots,m}|1+h_j(d,\theta)| = 1+o(1)$ and $\sup_{\theta\in\Theta,\,j=1,\dots,m}|h_j(d_0,\theta_0)-h_j(d_0,\theta)| = O(\lambda_m^{2d_0})$;

(iii) supd2[d1;d2];�2�j=1;:::;m

�� 1+hj(d;�)1+hj(d0;�)� 1�� = O supd2[d1;d2];�2�

j=1;:::;m

�� r+1(�2dj ��2d0j )

1+hj(d0;�)

��!= O(�2d1m );

(iv) $\sup_{d\in[d_1,d_2],\,\theta\in\Theta,\,j=1,\dots,m}|(1+h_j(d,\theta))^{c}-1| = O(\sup_{d\in[d_1,d_2],\,\theta\in\Theta,\,j=1,\dots,m}h_j(d,\theta)) = O(\lambda_m^{2d_1})$.
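To make the definition of $h_j(d,\theta)$ in Lemma 4 concrete, the sketch below evaluates it on the Fourier frequencies $\lambda_j = 2\pi j/n$ and compares $\sup|h_j(d,\theta)|$ with $\lambda_m^{2d_1}$, as in part (iv). It is an illustration in Python only: the sample size, bandwidth exponent, range $[d_1,d_2]$ and the coefficient values `theta_y`, `theta_w` are assumptions made for the example, not values from the text.

```python
import numpy as np

def h(lam, d, theta_y, theta_w):
    """h(d, theta, lam) = sum_{r=1}^{R_y} theta_y[r] * lam^(2r)
                          + lam^(2d) * sum_{r=0}^{R_w} theta_w[r] * lam^(2r).
    theta_y holds the coefficients for r = 1..R_y, theta_w those for r = 0..R_w."""
    lam = np.asarray(lam, dtype=float)
    hy = sum(c * lam ** (2 * r) for r, c in enumerate(theta_y, start=1))
    hw = sum(c * lam ** (2 * r) for r, c in enumerate(theta_w, start=0))
    return hy + lam ** (2 * d) * hw

# Illustrative settings (assumptions for this sketch).
n = 4096
m = int(np.floor(n ** 0.7))
lam = 2 * np.pi * np.arange(1, m + 1) / n     # Fourier frequencies lambda_1..lambda_m
d1, d2 = 0.1, 0.9
theta_y, theta_w = [0.5], [1.0, -0.3]         # R_y = 1, R_w = 1

# sup over a grid of d in [d1, d2] and over j = 1..m of |h_j(d, theta)|
sup_h = max(np.max(np.abs(h(lam, d, theta_y, theta_w)))
            for d in np.linspace(d1, d2, 17))
ratio = sup_h / lam[-1] ** (2 * d1)           # stays bounded if sup_h = O(lambda_m^(2 d1))
coef_bound = sum(abs(c) for c in theta_y) + sum(abs(c) for c in theta_w)
```

When $\lambda_m<1$, each power $\lambda_j^{2r}$ and $\lambda_j^{2d}$ with $d\ge d_1$ is at most $\lambda_m^{2d_1}$, so `ratio` cannot exceed the sum of the absolute coefficients; this is the elementary mechanism behind the compactness argument in the lemma.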

The next lemma provides approximations of the periodogram of $z_t$ by that of $\varepsilon_t$, following well-known results from, e.g., Robinson (1995a), Velasco (1999), AS, and Hurvich et al. (2005).

Lemma 5 Let Assumptions A1-A6 hold. Then, as n!1 and for all k = 1; : : : ;m,

kXj=1

�Iz (�j)

gj (d0;�0)� 2�I" (�j)

�= OP

��k;n(d0) + k

'y+1n�'y + kd0+'w+1n�d0�'w + k1+2d0n�2d0 + k2d0n�d0(log k)�

and

kXj=1

�Iz (�j)

gj (d0;�0)� 2�I" (�j)� E

�Iz (�j)

gj (d0;�0)� 2�I" (�j)

��= OP

��k;n(d0) + k

'y+1=2n�'y + k1=2+2d0n�2d0�;

where

�k;n(d) = k1=3(log k)2=3 + k1=2n�1=4

in the stationary case and

�k;n(d) = k1=(5�4d)(log k)2=(5�4d) + k2d�1(log k) + n�1=2k(1+d)=2(log n)5=4 + n�1=4kd(log k)1=2

in the nonstationary case.


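Proof. Note that, in the nonstationary case, Hurvich et al. (2005) examine the difference between the normalized periodograms of $z_t$ and $\Delta y_t$ (in our notation), whereas we examine the difference between the normalized periodograms of $z_t$ and $y_t$ itself in both the stationary and nonstationary cases.

Define $\tilde g_j(d,\theta)=\lambda_j^{-2d}G_0(1+h_y(\theta_y,\lambda_j))$ and write

\begin{align*}
\sum_{j=1}^{k}\Bigg(\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\Bigg)
&= \sum_{j=1}^{k}\Bigg(\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}-\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}\Bigg) \tag{48} \\
&\quad+\sum_{j=1}^{k}\Bigg(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\Bigg). \tag{49}
\end{align*}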

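In the stationary case (49) is $O_P(k^{1/3}(\log k)^{2/3}+k^{\varphi_y+1}n^{-\varphi_y}+k^{1/2}n^{-1/4})$ by (A.13)(i) of AS, and in the nonstationary case (49) is $O_P(k^{1/(5-4d_0)}(\log k)^{2/(5-4d_0)}+k^{\varphi_y+1}n^{-\varphi_y}+k^{2d_0-1}(\log k)+n^{-1/2}k^{(1+d_0)/2}(\log n)^{5/4}+n^{-1/4}k^{d_0}(\log k)^{1/2})$ by a slight modification of Lemma 1 of Velasco (1999) to account for the better approximation of $f_y(\lambda_j)$ by $\tilde g_j(d_0,\theta_0)$ due to our polynomial appearing in $\tilde g_j(d_0,\theta_0)$ (the required modification is the same as that used by AS to modify (4.8) of Robinson (1995a) to obtain their (A.13)(i)).

The term (48) is

\begin{align*}
\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}-\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}
&= \frac{\tilde g_j(d_0,\theta_0)-g_j(d_0,\theta_0)}{g_j(d_0,\theta_0)}\Bigg(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-1\Bigg) \tag{50} \\
&\quad+2\,\frac{\sqrt{h_w(\theta_w,\lambda_j)}\sqrt{\tilde g_j(d_0,\theta_0)}}{g_j(d_0,\theta_0)}\,\frac{\mathrm{Re}(I_{yw}(\lambda_j))}{\sqrt{\tilde g_j(d_0,\theta_0)}\sqrt{h_w(\theta_w,\lambda_j)}} \tag{51} \\
&\quad+\frac{I_w(\lambda_j)+\tilde g_j(d_0,\theta_0)-g_j(d_0,\theta_0)}{g_j(d_0,\theta_0)}, \tag{52}
\end{align*}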

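where $I_{ab}(\lambda)=\frac{1}{2\pi n}\sum_{t=1}^{n}\sum_{s=1}^{n}a_t b_s e^{i(s-t)\lambda}$ denotes the cross-periodogram between the two series $a_t$ and $b_t$. Using summation by parts on (50) we find that

\begin{align*}
&\sum_{j=1}^{k}\frac{\tilde g_j(d_0,\theta_0)-g_j(d_0,\theta_0)}{g_j(d_0,\theta_0)}\Bigg(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-1\Bigg) \\
&= \sum_{j=1}^{k-1}\Bigg(\frac{\tilde g_j(d_0,\theta_0)-g_j(d_0,\theta_0)}{g_j(d_0,\theta_0)}-\frac{\tilde g_{j+1}(d_0,\theta_0)-g_{j+1}(d_0,\theta_0)}{g_{j+1}(d_0,\theta_0)}\Bigg)\sum_{l=1}^{j}\Bigg(\frac{I_y(\lambda_l)}{\tilde g_l(d_0,\theta_0)}-1\Bigg) \\
&\quad+\frac{\tilde g_k(d_0,\theta_0)-g_k(d_0,\theta_0)}{g_k(d_0,\theta_0)}\sum_{j=1}^{k}\Bigg(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-1\Bigg),
\end{align*}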

which is $O_P((k/n)^{2d_0}(k^{1/3}(\log k)^{2/3}+k^{\varphi_y+1}n^{-\varphi_y}+k^{1/2}n^{-1/4}+k^{1/2}))$ in the stationary case whereas it is $O_P((k/n)^{2d_0}(k^{1/(5-4d_0)}(\log k)^{2/(5-4d_0)}+k^{\varphi_y+1}n^{-\varphi_y}+k^{2d_0-1}(\log k)+n^{-1/2}k^{(1+d_0)/2}(\log n)^{5/4}+n^{-1/4}k^{d_0}(\log k)^{1/2}+k^{1/2}))$ in the nonstationary case, by the same methods as applied previously and using also (4.9) of Robinson (1995a) and that $|\tilde g_j(d_0,\theta_0)/g_j(d_0,\theta_0)-1|\le C(j/n)^{2d_0}$. Next, (52) is easily seen to be $O_P((j/n)^{2d_0})$ because $E|I_w(\lambda_j)|=O_P(1)$ uniformly in $j=1,\dots,m$. Since $\{y_t\}$ and $\{w_t\}$ are independent, (51) is $O_P((j/n)^{d_0}(j^{-1}(\log j)+(j/n)^{\min(\varphi_y,\varphi_w)}))$ in the stationary case by Theorem 2 of Robinson (1995b), yielding a contribution to (48) of $O_P((k/n)^{d_0}((\log k)+k^{1+\min(\varphi_y,\varphi_w)}n^{-\min(\varphi_y,\varphi_w)}))$. In the nonstationary case we use Theorem 1 of Velasco (1999), which shows that $\mathrm{Re}(I_{yw}(\lambda_j))|\tilde g_j(d_0,\theta_0)|^{-1/2}|h_w(\theta_w,\lambda_j)|^{-1/2}=O_P(j^{2d_0-2}(\log j)+(j/n)^{\min(\varphi_y,\varphi_w)})$, yielding a contribution to (48) of $O_P((k/n)^{d_0}(k^{d_0}(\log k)+k^{1+\min(\varphi_y,\varphi_w)}n^{-\min(\varphi_y,\varphi_w)}))$ (Velasco's result has to be modified to accommodate multivariate time series, but the modification is simple by comparing e.g. his equation (A.1) with equation (4.3) of Robinson (1995b); see also the second to last equation on p. 108 of Velasco (1999)). The difference in the remainder terms relative to Robinson (1995b) and Velasco (1999) is due to the different remainder term in the approximation of $f_y(\lambda_j)$ by $\tilde g_j(d_0,\theta_0)$ due to our polynomial appearing in $\tilde g_j(d_0,\theta_0)$.
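The cross-periodogram $I_{ab}(\lambda)$ defined in the proof, and the decomposition $I_z = I_y + I_w + 2\,\mathrm{Re}(I_{yw})$ for $z_t=y_t+w_t$ that underlies (50)-(52), can be illustrated numerically. The Python sketch below is a minimal illustration under assumed inputs: the series `y` and `w` are simulated stand-ins (a scaled random walk plus white noise), not the processes of the paper, and the function names are hypothetical.

```python
import numpy as np

def cross_periodogram(a, b, m):
    """I_ab(lambda_j), j = 1..m, at Fourier frequencies lambda_j = 2*pi*j/n,
    computed as FFT(a) * conj(FFT(b)) / (2*pi*n); the time-index shift cancels."""
    n = len(a)
    fa = np.fft.fft(a)
    fb = np.fft.fft(b)
    return (fa * np.conj(fb))[1:m + 1] / (2 * np.pi * n)

def cross_periodogram_direct(a, b, lam):
    """Same quantity from the double sum (2*pi*n)^{-1} sum_t sum_s a_t b_s e^{i(s-t)lam},
    which factorises into a product of two single sums."""
    n = len(a)
    t = np.arange(1, n + 1)
    return (np.sum(a * np.exp(-1j * t * lam)) *
            np.sum(b * np.exp(1j * t * lam))) / (2 * np.pi * n)

# Illustrative data only: a persistent signal plus independent noise.
rng = np.random.default_rng(1)
n, m = 256, 20
y = 0.1 * np.cumsum(rng.standard_normal(n))
w = rng.standard_normal(n)
z = y + w

I_z = cross_periodogram(z, z, m).real
I_y = cross_periodogram(y, y, m).real
I_w = cross_periodogram(w, w, m).real
I_yw = cross_periodogram(y, w, m)
```

With `a = b` the function returns the ordinary periodogram, so one routine covers all four terms in the decomposition.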

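To prove the second result we write

\begin{align*}
&\sum_{j=1}^{k}\Bigg(\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)-E\Bigg(\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\Bigg)\Bigg) \\
&= \sum_{j=1}^{k}\Bigg(\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}-\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-E\Bigg(\frac{I_z(\lambda_j)}{g_j(d_0,\theta_0)}-\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}\Bigg)\Bigg) \tag{53} \\
&\quad+\sum_{j=1}^{k}\Bigg(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)-E\Bigg(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\Bigg)\Bigg). \tag{54}
\end{align*}

By (A.21) of AS, (54) is $O_P(k^{1/3}(\log k)^{2/3}+k^{\varphi_y+1/2}n^{-\varphi_y}+k^{1/2}n^{-1/4})$ in the stationary case, and by (slight modification of) Lemma 1 of Velasco (1999), (54) is $O_P(k^{1/(5-4d_0)}(\log k)^{2/(5-4d_0)}+k^{\varphi_y+1/2}n^{-\varphi_y}+k^{2d_0-1}(\log k)+n^{-1/2}k^{(1+d_0)/2}(\log n)^{5/4}+n^{-1/4}k^{d_0}(\log k)^{1/2})$ in the nonstationary case. For eq. (53) we write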

Iz (�j)

gj (d0;�0)� Iy (�j)

~gj(d0;�0)� E

�Iz (�j)

gj (d0;�0)� Iy (�j)

~gj(d0;�0)

�=

~gj(d0;�0)� gj (d0;�0)gj (d0;�0)

��Iy (�j)

~gj(d0;�0)� 2�I" (�j)

�� E

�Iy (�j)

~gj(d0;�0)� 2�I" (�j)

��(55)

+~gj(d0;�0)� gj (d0;�0)

gj (d0;�0)(2�I" (�j)� 1) (56)

+2p~gj(d0;�0)

gj (d0;�0)

Re (Iyw(�j)� EIyw(�j))p~gj(d0;�0)

(57)

+hw(�w;0; �j)

gj (d0;�0)

��Iw (�j)

hw(�w;0; �j)� 2�I�(�j)

�� E

�Iw (�j)

hw(�w;0; �j)� 2�I�(�j)

��(58)

+hw(�w;0; �j)

gj (d0;�0)(2�I�(�j)� 1) ; (59)

using also that hw(�w;0; �j) = gj (d0;�0)� ~gj(d0;�0).


Using summation by parts we �nd that (59) is

kXj=1

hw(�w;0; �j)

gj (d0;�0)(2�I�(�j)� 1) =

k�1Xj=1

�hw(�w;0; �j)

gj (d0;�0)� hw(�w;0; �j+1)

gj+1 (d0;�0)

� jXl=1

(2�I�(�l)� 1)

+hw(�w;0; �k)

gk (d0;�0)

kXj=1

(2�I�(�j)� 1)

=

k�1Xj=1

��hw(�w;0; �j)gj+1 (d0;�0)� hw(�w;0; �j+1)gj (d0;�0)gj (d0;�0) gj+1 (d0;�0)

��OP (j1=2)+hw(�w;0; �k)

gk (d0;�0)OP (k

1=2)

= OP

0@k�1Xj=1

j2d0�1=2n�2d0

1A+OP (k1=2+2d0n�2d0)= OP (k

1=2+2d0n�2d0);

using (4.9) of Robinson (1995a) for the second equality. The term (56) is handled in exactly the same

way yielding the same contribution. For the term (57) we can split it up in the same way as (58) and

(59), and the contribution is the same.

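Using summation by parts on (55) we find that, in the stationary case,

\begin{align*}
&\sum_{j=1}^{k}\frac{\tilde g_j(d_0,\theta_0)-g_j(d_0,\theta_0)}{g_j(d_0,\theta_0)}\Bigg(\Bigg(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\Bigg)-E\Bigg(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\Bigg)\Bigg) \\
&= \sum_{j=1}^{k-1}\Bigg(\frac{\tilde g_j(d_0,\theta_0)-g_j(d_0,\theta_0)}{g_j(d_0,\theta_0)}-\frac{\tilde g_{j+1}(d_0,\theta_0)-g_{j+1}(d_0,\theta_0)}{g_{j+1}(d_0,\theta_0)}\Bigg) \\
&\qquad\times\sum_{l=1}^{j}\Bigg(\Bigg(\frac{I_y(\lambda_l)}{\tilde g_l(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_l)\Bigg)-E\Bigg(\frac{I_y(\lambda_l)}{\tilde g_l(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_l)\Bigg)\Bigg) \\
&\quad+\frac{\tilde g_k(d_0,\theta_0)-g_k(d_0,\theta_0)}{g_k(d_0,\theta_0)}\sum_{j=1}^{k}\Bigg(\Bigg(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\Bigg)-E\Bigg(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\Bigg)\Bigg) \\
&= O_P\big((k/n)^{2d_0}(k^{1/3}(\log k)^{2/3}+k^{\varphi_y+1/2}n^{-\varphi_y}+k^{1/2}n^{-1/4})\big)
\end{align*}

using (A.21) of AS. In the nonstationary case we use Lemma 1 of Velasco (1999) and get

\begin{align*}
&\sum_{j=1}^{k}\frac{\tilde g_j(d_0,\theta_0)-g_j(d_0,\theta_0)}{g_j(d_0,\theta_0)}\Bigg(\Bigg(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\Bigg)-E\Bigg(\frac{I_y(\lambda_j)}{\tilde g_j(d_0,\theta_0)}-2\pi I_\varepsilon(\lambda_j)\Bigg)\Bigg) \\
&= O_P\big((k/n)^{2d_0}(k^{1/(5-4d_0)}(\log k)^{2/(5-4d_0)}+k^{\varphi_y+1/2}n^{-\varphi_y} \\
&\qquad+k^{2d_0-1}(\log k)+n^{-1/2}k^{(1+d_0)/2}(\log n)^{5/4}+n^{-1/4}k^{d_0}(\log k)^{1/2})\big).
\end{align*}

Finally the term (58) is handled in exactly the same way as the stationary case of (55), yielding the contribution $O_P((k/n)^{2d_0}(k^{1/3}(\log k)^{2/3}+k^{\varphi_w+1/2}n^{-\varphi_w}+k^{1/2}n^{-1/4}))$.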


References

Andersen, T. G., Bollerslev, T., Diebold, F. X. & Ebens, H. (2001), ‘The distribution of realized stock return volatility’, Journal of Financial Economics 61, 43–76.

Andersen, T. G., Bollerslev, T., Diebold, F. X. & Labys, P. (2001), ‘The distribution of realized exchange rate volatility’, Journal of the American Statistical Association 96, 42–55.

Andersen, T. G., Bollerslev, T., Diebold, F. X. & Labys, P. (2003), ‘Modelling and forecasting realized volatility’, Econometrica 71, 579–625.





Arteche, J. (2004), ‘Gaussian semiparametric estimation in long memory in stochastic volatility and


Baillie, R. T. (1996), ‘Long memory processes and fractional integration in econometrics’, Journal of Econometrics 73, 5–59.

Baillie, R. T., Bollerslev, T. & Mikkelsen, H. O. (1996), ‘Fractionally integrated generalized autoregressive conditional heteroscedasticity’, Journal of Econometrics 74, 3–30.

Beran, J. (1994), Statistics for Long-Memory Processes, Chapman-Hall, New York.





Comte, F. & Renault, E. (1998), ‘Long memory in continuous-time stochastic volatility models’, Mathematical Finance 8, 291–323.

Davies, R. B. & Harte, D. S. (1987), ‘Tests for Hurst effects’, Biometrika 74, 95–102.

Deo, R. S. & Hurvich, C. M. (2001), ‘On the log periodogram regression estimator of the memory parameter in long memory stochastic volatility models’, Econometric Theory 17, 686–710.

Ding, Z., Granger, C. W. J. & Engle, R. F. (1993), ‘A long memory property of stock returns and a new model’, Journal of Empirical Finance 1, 83–106.

Fox, R. & Taqqu, M. S. (1986), ‘Large-sample properties of parameter estimates for strongly dependent stationary Gaussian series’, Journal of Time Series Analysis 4, 221–238.

Frederiksen, P. H. & Nielsen, M. Ø. (2008), ‘Bias-reduced estimation of long memory stochastic volatility’, Forthcoming in Journal of Financial Econometrics.


Fuller, W. A. (1996), Introduction to statistical time series, Wiley, New York.



Harvey, A. (1989), Forecasting, structural time series models and the Kalman filter, Cambridge University Press, Cambridge.

Harvey, A. (1998), Long memory in stochastic volatility, in J. Knight & S. Satchell, eds, ‘Forecasting Volatility in Financial Markets’, Butterworth-Heinemann, London, pp. 307–320.

Henry, M. & Zaffaroni, P. (2003), The long range paradigm for macroeconomics and finance, in P. Doukhan, G. Oppenheim & M. S. Taqqu, eds, ‘Theory and Applications of Long-Range Dependence’, Birkhäuser, Boston, pp. 417–438.


rica 73, 1283–1328.

Hurvich, C. M. & Ray, B. K. (1995), ‘Estimation of the memory parameter for nonstationary or noninvertible fractionally integrated processes’, Journal of Time Series Analysis 16, 17–41.



Künsch, H. R. (1987), Statistical aspects of self-similar processes, in Y. Prokhorov & V. V. Sazanov, eds, ‘Proceedings of the First World Congress of the Bernoulli Society’, VNU Science Press, Utrecht, pp. 67–74.

Liesenfeld, R. (2001), ‘A generalized bivariate mixture model for stock price volatility and trading volume’, Journal of Econometrics 104, 141–178.

Marinucci, D. & Robinson, P. M. (1999), ‘Alternative forms of fractional Brownian motion’, Journal of Statistical Planning and Inference 80, 111–122.

Ray, B. K. & Tsay, R. (2000), ‘Long-range dependence in daily stock volatility’, Journal of Business and Economic Statistics 18, 254–262.





Robinson, P. M. (2003), Long-memory time series, in P. M. Robinson, ed., ‘Time Series With Long Memory’, Oxford University Press, Oxford, pp. 4–32.

Sun, Y. & Phillips, P. C. B. (2003), ‘Nonlinear log-periodogram regression for perturbed fractional processes’, Journal of Econometrics 115, 355–389.


Velasco, C. (1999), ‘Gaussian semiparametric estimation of non-stationary time series’, Journal of


Wright, J. H. (2002), ‘Log periodogram estimation of long memory volatility dependencies with conditionally heavy tailed returns’, Econometric Reviews 21, 397–417.


Table 1: Simulation results for Model I

                  LWN            LPWN(1,0)         LPWN(0,1)         LPWN(1,1)
nsr     n     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE
Panel A: m = ⌊n^0.7⌋
  5  2048   0.0109   0.2037   0.0184   0.2707   0.0124   0.2606   0.0135   0.2917
     4096  -0.0040   0.1280   0.0135   0.2093  -0.0029   0.1791   0.0085   0.2204
     8192   0.0061   0.0911   0.0128   0.1479   0.0026   0.1205   0.0103   0.1553
 10  2048   0.0140   0.2639   0.0166   0.3106   0.0264   0.3123   0.0207   0.3271
     4096   0.0035   0.1793   0.0146   0.2462   0.0020   0.2268   0.0212   0.2606
     8192   0.0019   0.1194   0.0032   0.1570   0.0191   0.1901   0.0161   0.1947
 20  2048   0.0004   0.3373  -0.0348   0.3391  -0.0097   0.3567  -0.0253   0.3524
     4096  -0.0005   0.2474   0.0003   0.2922  -0.0001   0.2840   0.0026   0.3023
     8192  -0.0047   0.2175  -0.0009   0.2392   0.0003   0.2380  -0.0001   0.2419
Panel B: m = ⌊n^0.8⌋
  5  2048  -0.0002   0.1567   0.0002   0.2154  -0.0081   0.1953  -0.0139   0.2279
     4096   0.0015   0.0966   0.0076   0.1506  -0.0020   0.1233   0.0054   0.1759
     8192   0.0054   0.0706   0.0075   0.1025   0.0044   0.0907   0.0082   0.1244
 10  2048   0.0057   0.2276   0.0056   0.2777   0.0094   0.2738  -0.0224   0.2735
     4096   0.0078   0.1399   0.0155   0.1930   0.0089   0.1774  -0.0095   0.1992
     8192   0.0047   0.0917   0.0125   0.1410   0.0034   0.1177   0.0002   0.1385
 20  2048  -0.0152   0.3011  -0.0549   0.3058  -0.0294   0.3212  -0.1062   0.2997
     4096   0.0002   0.2201  -0.0055   0.2531  -0.0020   0.2518  -0.0445   0.2366
     8192   0.0073   0.1361   0.0146   0.1768   0.0073   0.1629  -0.0230   0.1629

Note: The polynomial approximation used under the heading ‘LPWN(Ry, Rw)’ is (Ry, Rw).
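The Bias and RMSE columns in the simulation tables summarise estimates across Monte Carlo replications, and the panels differ in the bandwidth rule $m=\lfloor n^{0.7}\rfloor$ or $m=\lfloor n^{0.8}\rfloor$. As a hedged illustration of how such summaries are typically computed (the function names below are hypothetical and the replication design is not taken from the text), a minimal Python sketch is:

```python
import numpy as np

def bias_rmse(estimates, true_value):
    """Monte Carlo bias (mean error) and RMSE (root mean squared error)
    of a collection of estimates of a known true value."""
    err = np.asarray(estimates, dtype=float) - true_value
    return float(err.mean()), float(np.sqrt(np.mean(err ** 2)))

def bandwidth(n, exponent):
    """Bandwidth m = floor(n ** exponent), e.g. exponent 0.7 or 0.8."""
    return int(np.floor(n ** exponent))

# Toy usage with made-up estimates of a true value 0.4.
toy_bias, toy_rmse = bias_rmse([0.3, 0.5], 0.4)
m_small = bandwidth(2048, 0.7)
m_large = bandwidth(2048, 0.8)
```

Since RMSE² equals bias² plus variance, a row with small bias but large RMSE indicates a noisy estimator, which is the pattern to look for when comparing LWN with the LPWN variants.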

Table 2: Simulation results for Model II with (�y, �y) = (0, 0) and (�x, �x) = (0.5, 0)

                  LWN            LPWN(1,0)         LPWN(0,1)         LPWN(1,1)
nsr     n     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE
Panel A: m = ⌊n^0.7⌋
  5  2048  -0.0620   0.1977   0.0136   0.2678  -0.0045   0.2510   0.0042   0.2885
     4096  -0.0528   0.1346   0.0134   0.2116  -0.0131   0.1738   0.0080   0.2182
     8192  -0.0228   0.0931   0.0174   0.1523   0.0005   0.1250   0.0126   0.1595
 10  2048  -0.0937   0.2484   0.0041   0.2968   0.0044   0.3052   0.0167   0.3263
     4096  -0.0661   0.1818   0.0170   0.2449  -0.0040   0.2235   0.0235   0.2653
     8192  -0.0416   0.1191   0.0144   0.1873  -0.0048   0.1600   0.0093   0.1906
 20  2048  -0.1102   0.3165  -0.0479   0.3250  -0.0186   0.3525  -0.0271   0.3566
     4096  -0.1043   0.2482  -0.0071   0.2815  -0.0126   0.2819   0.0056   0.3004
     8192  -0.0613   0.1686   0.0102   0.2222  -0.0120   0.1966   0.0108   0.2248
Panel B: m = ⌊n^0.8⌋
  5  2048  -0.1486   0.1869  -0.0524   0.2245  -0.0896   0.1881  -0.0381   0.2089
     4096  -0.1275   0.1519  -0.0116   0.1787  -0.0612   0.1319  -0.0023   0.1715
     8192  -0.1027   0.1206  -0.0049   0.1175  -0.0371   0.0944   0.0174   0.1309
 10  2048  -0.2208   0.2570  -0.0709   0.2732  -0.1102   0.2537  -0.0693   0.2575
     4096  -0.1870   0.2138  -0.0149   0.2220  -0.0848   0.1736  -0.0260   0.1983
     8192  -0.1610   0.1806  -0.0100   0.1617  -0.0649   0.1284   0.0005   0.1525
 20  2048  -0.2748   0.3104  -0.1417   0.2999  -0.1656   0.3127  -0.1423   0.2913
     4096  -0.2640   0.2919  -0.0583   0.2633  -0.1195   0.2556  -0.0728   0.2427
     8192  -0.2349   0.2554  -0.0155   0.2028  -0.0894   0.1800  -0.0252   0.1826


Table 3: Simulation results for Model II with (�y, �y) = (0, 0) and (�x, �x) = (-0.8, 0)

                  LWN            LPWN(1,0)         LPWN(0,1)         LPWN(1,1)
nsr     n     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE
Panel A: m = ⌊n^0.7⌋
  5  2048   0.0133   0.2036   0.0205   0.2735   0.0138   0.2645   0.0126   0.2916
     4096  -0.0012   0.1285   0.0155   0.2072   0.0018   0.1825   0.0076   0.2189
     8192   0.0066   0.0904   0.0131   0.1501   0.0024   0.1214   0.0088   0.1564
 10  2048   0.0149   0.2689   0.0156   0.3092   0.0241   0.3120   0.0179   0.3313
     4096   0.0046   0.1797   0.0144   0.2445   0.0012   0.2234   0.0227   0.2597
     8192   0.0037   0.1141   0.0076   0.1837  -0.0029   0.1575   0.0047   0.1899
 20  2048  -0.0001   0.3340  -0.0329   0.3351  -0.0157   0.3509  -0.0213   0.3520
     4096   0.0032   0.2510  -0.0049   0.2858  -0.0025   0.2759   0.0048   0.2998
     8192   0.0067   0.1619   0.0095   0.2223  -0.0086   0.1973   0.0062   0.2258
Panel B: m = ⌊n^0.8⌋
  5  2048   0.0139   0.1595  -0.0034   0.2189  -0.0082   0.1978  -0.0137   0.2357
     4096   0.0114   0.0972   0.0055   0.1509  -0.0019   0.1256   0.0070   0.1782
     8192   0.0116   0.0716   0.0074   0.1028   0.0053   0.0903   0.0086   0.1257
 10  2048   0.0291   0.2328   0.0065   0.2787   0.0092   0.2746  -0.0285   0.2753
     4096   0.0210   0.1404   0.0116   0.1936   0.0092   0.1799  -0.0048   0.2012
     8192   0.0046   0.0939  -0.0093   0.1324  -0.0053   0.1152  -0.0083   0.1398
 20  2048   0.0060   0.3059  -0.0564   0.3084  -0.0331   0.3239  -0.1075   0.2998
     4096   0.0168   0.2232  -0.0132   0.2476  -0.0049   0.2478  -0.0493   0.2358
     8192   0.0119   0.1428   0.0045   0.1805   0.0005   0.1688  -0.0222   0.1704

Table 4: Simulation results for Model III with (�y, �y) = (0, 0) and (�x, �x) = (0, 0.8)

                  LWN            LPWN(1,0)         LPWN(0,1)         LPWN(1,1)
nsr     n     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE
Panel A: m = ⌊n^0.7⌋
  5  2048  -0.0035   0.1535   0.0247   0.2290   0.0059   0.2064   0.0103   0.2503
     4096  -0.0083   0.0989   0.0095   0.1718  -0.0056   0.1410   0.0012   0.1858
     8192   0.0025   0.0719   0.0106   0.1180   0.0036   0.0990   0.0068   0.1340
 10  2048  -0.0078   0.1848   0.0257   0.2653   0.0103   0.2462   0.0196   0.2828
     4096  -0.0064   0.1248   0.0114   0.1958   0.0008   0.1727   0.0097   0.2123
     8192  -0.0027   0.0867   0.0111   0.1453   0.0010   0.1200   0.0082   0.1564
 20  2048   0.0006   0.2585   0.0118   0.3011   0.0196   0.3071   0.0194   0.3277
     4096  -0.0164   0.1666   0.0132   0.2412  -0.0000   0.2184   0.0123   0.2552
     8192  -0.0094   0.1141   0.0108   0.1776  -0.0034   0.1480   0.0093   0.1835
Panel B: m = ⌊n^0.8⌋
  5  2048  -0.0396   0.1124   0.0051   0.1665  -0.0115   0.1366  -0.0084   0.1765
     4096  -0.0266   0.0758   0.0110   0.1151  -0.0017   0.0936   0.0056   0.1435
     8192  -0.0141   0.0547   0.0101   0.0808   0.0035   0.0701   0.0081   0.1042
 10  2048  -0.0670   0.1585   0.0146   0.2235  -0.0071   0.1932  -0.0064   0.2174
     4096  -0.0371   0.0967   0.0226   0.1489   0.0004   0.1118   0.0020   0.1641
     8192  -0.0246   0.0695   0.0127   0.0998   0.0016   0.0830   0.0069   0.1217
 20  2048  -0.0890   0.2183  -0.0099   0.2542  -0.0136   0.2503  -0.0406   0.2533
     4096  -0.0597   0.1479   0.0233   0.1955   0.0018   0.1701  -0.0029   0.2008
     8192  -0.0434   0.0935   0.0174   0.1341  -0.0022   0.1077   0.0032   0.1431


Table 5: Simulation results for Model III with (�y, �y) = (0, 0) and (�x, �x) = (0, -0.8)

                  LWN            LPWN(1,0)         LPWN(0,1)         LPWN(1,1)
nsr     n     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE
Panel A: m = ⌊n^0.7⌋
  5  2048   0.0144   0.1427   0.0188   0.2184   0.0045   0.1921   0.0095   0.2345
     4096   0.0043   0.0917   0.0083   0.1621  -0.0036   0.1329   0.0034   0.1820
     8192   0.0079   0.0673   0.0085   0.1068   0.0035   0.0913   0.0074   0.1306
 10  2048   0.0125   0.1713   0.0257   0.2513   0.0120   0.2291   0.0135   0.2626
     4096   0.0083   0.1159   0.0075   0.1831  -0.0031   0.1584   0.0056   0.1944
     8192   0.0065   0.0821   0.0085   0.1329   0.0010   0.1118   0.0046   0.1467
 20  2048   0.0333   0.2349   0.0220   0.2936   0.0204   0.2831   0.0199   0.3066
     4096   0.0057   0.1470   0.0128   0.2208   0.0002   0.1940   0.0116   0.2380
     8192   0.0045   0.1041   0.0018   0.1508  -0.0044   0.1293  -0.0029   0.1585
Panel B: m = ⌊n^0.8⌋
  5  2048   0.0536   0.1144  -0.0099   0.1469  -0.0062   0.1293   0.0045   0.1891
     4096   0.0409   0.0785  -0.0031   0.0971   0.0028   0.0891   0.0067   0.1415
     8192   0.0319   0.0591   0.0029   0.0719   0.0074   0.0656   0.0074   0.0982
 10  2048   0.0780   0.1555  -0.0036   0.1958   0.0005   0.1727   0.0042   0.2222
     4096   0.0602   0.1003  -0.0024   0.1208   0.0048   0.1044   0.0099   0.1582
     8192   0.0415   0.0715  -0.0044   0.0838   0.0024   0.0765   0.0064   0.1124
 20  2048   0.1203   0.2265  -0.0163   0.2287   0.0076   0.2305  -0.0094   0.2436
     4096   0.0839   0.1455  -0.0021   0.1643   0.0092   0.1522   0.0088   0.1853
     8192   0.0529   0.0913  -0.0078   0.1031   0.0013   0.0952  -0.0003   0.1331

Table 6: Simulation results for Model IV with (�y, �y) = (-0.8, 0) and (�x, �x) = (0.5, 0)

                  LWN            LPWN(1,0)         LPWN(0,1)         LPWN(1,1)
nsr     n     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE
Panel A: m = ⌊n^0.7⌋
  5  2048  -0.0579   0.1987   0.0167   0.2710  -0.0015   0.2515   0.0077   0.2867
     4096  -0.0498   0.1344   0.0150   0.2139  -0.0117   0.1766   0.0086   0.2186
     8192  -0.0207   0.0929   0.0177   0.1535   0.0007   0.1250   0.0067   0.1928
 10  2048  -0.0900   0.2488   0.0029   0.2985   0.0051   0.3061   0.0175   0.3282
     4096  -0.0635   0.1812   0.0191   0.2492  -0.0021   0.2281   0.0242   0.2647
     8192  -0.0400   0.1186   0.0144   0.1859  -0.0047   0.1605   0.0106   0.1893
 20  2048  -0.1076   0.3173  -0.0469   0.3274  -0.0186   0.3515  -0.0240   0.3569
     4096  -0.1017   0.2469  -0.0080   0.2827  -0.0108   0.2814   0.0038   0.3040
     8192  -0.0600   0.1685   0.0117   0.2239  -0.0121   0.1966  -0.0090   0.2250
Panel B: m = ⌊n^0.8⌋
  5  2048  -0.1302   0.1759  -0.0543   0.2222  -0.0871   0.1894  -0.0439   0.2122
     4096  -0.1135   0.1414  -0.0153   0.1745  -0.0592   0.1324  -0.0006   0.1724
     8192  -0.0927   0.1125  -0.0062   0.1171  -0.0353   0.0942   0.0182   0.1316
 10  2048  -0.2081   0.2491  -0.0619   0.2745  -0.1078   0.2542  -0.0682   0.2559
     4096  -0.1767   0.2052  -0.0157   0.2236  -0.0820   0.1731  -0.0259   0.1974
     8192  -0.1540   0.1747  -0.0110   0.1623  -0.0638   0.1275  -0.0001   0.1538
 20  2048  -0.2699   0.3086  -0.1392   0.3017  -0.1656   0.3142  -0.1452   0.2919
     4096  -0.2587   0.2882  -0.0602   0.2637  -0.1175   0.2540  -0.0716   0.2416
     8192  -0.2297   0.2507  -0.0188   0.2041  -0.0873   0.1820  -0.0225   0.1838


Table 7: Simulation results for Model IV with (�y, �y) = (-0.8, 0) and (�x, �x) = (-0.8, 0)

                  LWN            LPWN(1,0)         LPWN(0,1)         LPWN(1,1)
nsr     n     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE
Panel A: m = ⌊n^0.7⌋
  5  2048   0.0175   0.2051   0.0226   0.2725   0.0156   0.2640   0.0122   0.2922
     4096   0.0019   0.1289   0.0158   0.2095   0.0006   0.1838   0.0073   0.2222
     8192   0.0086   0.0911   0.0112   0.1455   0.0033   0.1206   0.0112   0.1568
 10  2048   0.0184   0.2689   0.0156   0.3090   0.0243   0.3121   0.0188   0.3292
     4096   0.0068   0.1803   0.0149   0.2459   0.0025   0.2209   0.0232   0.2617
     8192   0.0052   0.1444   0.0090   0.1866  -0.0024   0.1582   0.0068   0.1874
 20  2048   0.0021   0.3350  -0.0341   0.3342  -0.0170   0.3487  -0.0242   0.3523
     4096   0.0052   0.2520  -0.0047   0.2861   0.0016   0.2797   0.0051   0.3011
     8192   0.0077   0.1622   0.0046   0.2261  -0.0093   0.1947   0.0095   0.2292
Panel B: m = ⌊n^0.8⌋
  5  2048   0.0329   0.1646  -0.0054   0.2180  -0.0058   0.2001  -0.0116   0.2314
     4096   0.0248   0.1004   0.0032   0.1486  -0.0006   0.1259   0.0090   0.1805
     8192   0.0205   0.0740   0.0069   0.1028   0.0063   0.0904   0.0091   0.1262
 10  2048   0.0439   0.2369   0.0062   0.2805   0.0105   0.2754  -0.0237   0.2722
     4096   0.0308   0.1431   0.0098   0.1916   0.0093   0.1790  -0.0036   0.1974
     8192   0.0133   0.0946  -0.0012   0.1327  -0.0043   0.1168  -0.0063   0.1415
 20  2048   0.0116   0.3081  -0.0567   0.3087  -0.0321   0.3241  -0.1091   0.2989
     4096   0.0218   0.2246  -0.0142   0.2478  -0.0053   0.2485  -0.0472   0.2377
     8192   0.0170   0.1435   0.0031   0.1710   0.0029   0.1714  -0.0221   0.1710

Table 8: Simulation results for Model V with (�y, �y) = (0, -0.8) and (�x, �x) = (0, 0.8)

                  LWN            LPWN(1,0)         LPWN(0,1)         LPWN(1,1)
nsr     n     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE
Panel A: m = ⌊n^0.7⌋
  5  2048   0.3067   0.3790   0.0179   0.2211   0.0883   0.2694   0.0270   0.2585
     4096   0.2261   0.2608   0.0037   0.1550   0.0492   0.1735   0.0133   0.1955
     8192   0.1655   0.1864   0.0130   0.1087   0.0561   0.1255   0.0117   0.1382
 10  2048   0.2710   0.3675   0.0130   0.2572   0.0701   0.2928   0.0250   0.2833
     4096   0.2040   0.2525   0.0119   0.1846   0.0470   0.1894   0.0183   0.2168
     8192   0.1357   0.1681   0.0113   0.1307   0.0406   0.1319   0.0110   0.1542
 20  2048   0.1837   0.3620   0.0203   0.3127   0.0539   0.3298   0.0355   0.3263
     4096   0.1635   0.2571   0.0219   0.2473   0.0408   0.2378   0.0286   0.2658
     8192   0.1057   0.1636   0.0074   0.1640   0.0222   0.1541   0.0151   0.1790
Panel B: m = ⌊n^0.8⌋
  5  2048  -0.2596   0.4143  -0.0444   0.1989   0.1301   0.3058   0.0208   0.2102
     4096   0.2995*  0.4583*  0.0112   0.1083   0.1513   0.2153   0.0200   0.1476
     8192   0.3611   0.4097   0.0151   0.0686   0.1221   0.1514   0.0124   0.1089
 10  2048  -0.2123   0.4204  -0.0190   0.1988   0.1460   0.3243   0.0265   0.2330
     4096   0.2645*  0.4359*  0.0167   0.1044   0.1464   0.2112   0.0208   0.1529
     8192   0.3123   0.3751   0.0136   0.0784   0.1095   0.1473   0.0156   0.1140
 20  2048  -0.1827   0.4107  -0.0063   0.2145   0.1267   0.3235  -0.0173   0.2569
     4096   0.1994*  0.4026*  0.0168   0.1477   0.1245   0.2348   0.0164   0.1942
     8192   0.2554   0.3329   0.0175   0.1033   0.0990   0.1659   0.0215   0.1405


Table 9: Simulation results for Model V with (�y, �y) = (0, -0.8) and (�x, �x) = (0, -0.8)

                  LWN            LPWN(1,0)         LPWN(0,1)         LPWN(1,1)
nsr     n     Bias     RMSE     Bias     RMSE     Bias     RMSE     Bias     RMSE
Panel A: m = ⌊n^0.7⌋
  5  2048   0.3311   0.3899   0.0103   0.2054   0.0860   0.2616   0.0248   0.2486
     4096   0.2452   0.2754   0.0074   0.1487   0.0597   0.1721   0.0108   0.1887
     8192   0.1687   0.1876   0.0086   0.1016   0.0574   0.1213   0.0059   0.1305
 10  2048   0.2993   0.3815   0.0073   0.2410   0.0736   0.2789   0.0266   0.2744
     4096   0.2229   0.2628   0.0036   0.1714   0.0440   0.1835   0.0209   0.2062
     8192   0.1579   0.1822   0.0144   0.1186   0.0504   0.1278   0.0167   0.1410
 20  2048   0.2224   0.3741   0.0138   0.2809   0.0684   0.3174   0.0321   0.3066
     4096   0.1989   0.2618   0.0214   0.2085   0.0532   0.2115   0.0298   0.2339
     8192   0.1388   0.1764   0.0142   0.1524   0.0380   0.1461   0.0186   0.1655
Panel B: m = ⌊n^0.8⌋
  5  2048  -0.2917   0.4077  -0.0846   0.2273   0.0918   0.3189   0.0174   0.2027
     4096   0.2863*  0.4677* -0.0021   0.1165   0.1491   0.2215   0.0212   0.1431
     8192   0.3540   0.4284   0.0080   0.0645   0.1234   0.1510   0.0081   0.1032
 10  2048  -0.2717   0.4115  -0.0677   0.2208   0.1099   0.3330   0.0313   0.2311
     4096   0.2509*  0.4533* -0.0015   0.1216   0.1394   0.2185   0.0174   0.1553
     8192   0.3026   0.4056   0.0066   0.0696   0.1139   0.1471   0.0098   0.1123
 20  2048  -0.2992   0.4025  -0.0528   0.2180   0.1012   0.3235   0.0004   0.2427
     4096   0.1423*  0.4261* -0.0104   0.1397   0.1020   0.2170  -0.0166   0.1587
     8192   0.2319   0.3805   0.0014   0.0813   0.1010   0.1509   0.0136   0.1201


Table 10: Local Whittle estimation of long memory in volatility of DJIA stocks

        m = ⌊n^0.6⌋                                        m = ⌊n^0.7⌋                                        m = ⌊n^0.8⌋
Ticker  LW               LPW              LWN              LW               LPW              LWN              LW               LPW              LWN
AA      0.3002 (0.0395)  0.3718 (0.0592)  0.5292 (0.0768)  0.2019 (0.0258)  0.2956 (0.0387)  0.5916 (0.0476)  0.1379 (0.0169)  0.1977 (0.0253)  0.6063 (0.0308)
AIG     0.3696 (0.0395)  0.5017 (0.0592)  0.6793 (0.0686)  0.2938 (0.0258)  0.3941 (0.0387)  0.6202 (0.0466)  0.2042 (0.0169)  0.2990 (0.0253)  0.6471 (0.0299)
AXP     0.3691 (0.0395)  0.4860 (0.0592)  0.8225 (0.0635)  0.3260 (0.0258)  0.3928 (0.0387)  0.5552 (0.0490)  0.2115 (0.0169)  0.3088 (0.0253)  0.6514 (0.0298)
BA      0.2593 (0.0395)  0.4087 (0.0592)  0.6809 (0.0685)  0.2094 (0.0258)  0.2484 (0.0387)  0.5870 (0.0478)  0.1509 (0.0169)  0.2065 (0.0253)  0.5336 (0.0327)
C       0.3853 (0.0395)  0.4789 (0.0592)  0.7273 (0.0666)  0.2908 (0.0258)  0.3855 (0.0387)  0.6677 (0.0451)  0.2141 (0.0169)  0.2992 (0.0253)  0.6309 (0.0302)
CAT     0.2477 (0.0395)  0.3364 (0.0592)  0.6247 (0.0711)  0.1915 (0.0258)  0.3053 (0.0387)  0.5522 (0.0491)  0.1280 (0.0169)  0.2022 (0.0253)  0.5788 (0.0315)
DD      0.1425 (0.0395)  0.1738 (0.0592)  0.4238 (0.0861)  0.1008 (0.0258)  0.1292 (0.0387)  0.4195 (0.0565)  0.0810 (0.0169)  0.0956 (0.0253)  0.3366 (0.0420)
DIS     0.3033 (0.0395)  0.4448 (0.0592)  0.9074 (0.0613)  0.2361 (0.0258)  0.3155 (0.0387)  0.7582 (0.0428)  0.1744 (0.0169)  0.2134 (0.0253)  0.6824 (0.0292)
GE      0.3615 (0.0395)  0.5044 (0.0592)  0.7528 (0.0657)  0.2497 (0.0258)  0.3659 (0.0387)  0.7796 (0.0423)  0.1807 (0.0169)  0.2580 (0.0253)  0.7546 (0.0281)
GM      0.2567 (0.0483)  0.3489 (0.0592)  0.4949 (0.0794)  0.1987 (0.0258)  0.2606 (0.0387)  0.4890 (0.0522)  0.1603 (0.0169)  0.2027 (0.0253)  0.4091 (0.0375)
HD      0.3703 (0.0395)  0.4581 (0.0592)  0.6782 (0.0686)  0.2489 (0.0258)  0.3670 (0.0387)  0.7321 (0.0434)  0.1723 (0.0169)  0.2432 (0.0253)  0.7401 (0.0283)
HON     0.2614 (0.0395)  0.3354 (0.0592)  0.9898 (0.0594)  0.2323 (0.0258)  0.2447 (0.0387)  0.5859 (0.0478)  0.1787 (0.0169)  0.2253 (0.0253)  0.4242 (0.0368)
HPQ     0.3503 (0.0395)  0.4688 (0.0592)  0.8591 (0.0625)  0.2366 (0.0258)  0.3049 (0.0387)  0.9061 (0.0400)  0.1845 (0.0169)  0.2290 (0.0253)  0.7583 (0.0280)
IBM     0.3417 (0.0395)  0.4778 (0.0592)  0.7626 (0.0654)  0.2653 (0.0258)  0.3295 (0.0387)  0.6922 (0.0444)  0.1931 (0.0169)  0.2638 (0.0253)  0.6359 (0.0301)
INTC    0.3467 (0.0395)  0.4755 (0.0592)  0.7436 (0.0661)  0.2396 (0.0258)  0.3325 (0.0387)  0.7685 (0.0426)  0.1807 (0.0169)  0.2532 (0.0253)  0.6894 (0.0291)
JNJ     0.3734 (0.0395)  0.4400 (0.0592)  0.6639 (0.0692)  0.2601 (0.0258)  0.3750 (0.0387)  0.6850 (0.0446)  0.1940 (0.0169)  0.2641 (0.0253)  0.6394 (0.0301)
JPM     0.3603 (0.0395)  0.5424 (0.0592)  0.7173 (0.0670)  0.3032 (0.0258)  0.3741 (0.0387)  0.6029 (0.0472)  0.2174 (0.0169)  0.2865 (0.0253)  0.6058 (0.0308)
KO      0.3677 (0.0395)  0.5028 (0.0592)  0.8104 (0.0639)  0.2584 (0.0258)  0.3653 (0.0387)  0.8065 (0.0418)  0.1833 (0.0169)  0.2506 (0.0253)  0.7923 (0.0275)
MCD     0.2640 (0.0395)  0.4591 (0.0592)  0.6632 (0.0693)  0.1798 (0.0258)  0.2513 (0.0387)  0.6936 (0.0444)  0.1170 (0.0169)  0.1701 (0.0253)  0.7116 (0.0287)
MMM     0.2635 (0.0395)  0.3792 (0.0592)  0.9891 (0.0595)  0.2016 (0.0258)  0.2744 (0.0387)  0.9899 (0.0388)  0.1430 (0.0169)  0.1944 (0.0253)  0.8712 (0.0266)
MO      0.3041 (0.0395)  0.4106 (0.0592)  0.7409 (0.0662)  0.2531 (0.0258)  0.3152 (0.0387)  0.5484 (0.0493)  0.1879 (0.0169)  0.2505 (0.0253)  0.5163 (0.0332)
MRK     0.2612 (0.0395)  0.3504 (0.0592)  0.5687 (0.0742)  0.2063 (0.0258)  0.2535 (0.0387)  0.5034 (0.0514)  0.1540 (0.0169)  0.1930 (0.0253)  0.4599 (0.0352)
MSFT    0.3421 (0.0395)  0.4756 (0.0592)  0.8192 (0.0636)  0.2908 (0.0258)  0.3507 (0.0387)  0.6156 (0.0467)  0.2023 (0.0169)  0.2883 (0.0253)  0.6223 (0.0304)
PFE     0.3354 (0.0395)  0.3740 (0.0592)  0.6473 (0.0700)  0.2407 (0.0258)  0.3168 (0.0387)  0.6237 (0.0465)  0.1644 (0.0169)  0.2407 (0.0253)  0.6324 (0.0302)
PG      0.3262 (0.0395)  0.4378 (0.0592)  0.7656 (0.0653)  0.2274 (0.0258)  0.3433 (0.0387)  0.7514 (0.0429)  0.1944 (0.0169)  0.2435 (0.0253)  0.5525 (0.0321)
SBC     0.3411 (0.0395)  0.4017 (0.0592)  0.7310 (0.0665)  0.2692 (0.0258)  0.3410 (0.0387)  0.5545 (0.0491)  0.1866 (0.0169)  0.2581 (0.0253)  0.5784 (0.0315)
UTX     0.3435 (0.0395)  0.4700 (0.0592)  0.6515 (0.0698)  0.2413 (0.0258)  0.3426 (0.0387)  0.6751 (0.0449)  0.1650 (0.0169)  0.2417 (0.0253)  0.6892 (0.0291)
VZ      0.3317 (0.0395)  0.4262 (0.0592)  0.8357 (0.0631)  0.2578 (0.0258)  0.3458 (0.0387)  0.6661 (0.0452)  0.1866 (0.0169)  0.2642 (0.0253)  0.6138 (0.0306)
WMT     0.3728 (0.0395)  0.4847 (0.0592)  0.7582 (0.0655)  0.2570 (0.0258)  0.3477 (0.0387)  0.7860 (0.0422)  0.1668 (0.0169)  0.2573 (0.0253)  0.8247 (0.0271)
XOM     0.2498 (0.0395)  0.3889 (0.0592)  0.6534 (0.0697)  0.2271 (0.0258)  0.2390 (0.0387)  0.4419 (0.0550)  0.1507 (0.0169)  0.2254 (0.0253)  0.5013 (0.0337)

Note: Asymptotic standard errors in parentheses.
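The LW standard errors in Table 10 are constant within each bandwidth column, which is what the asymptotic result of Robinson (1995a) implies: the variance of the LW estimate is 1/(4m), so the standard error depends only on m. The sketch below checks this arithmetic. The bandwidth values 160, 375 and 875 are assumptions chosen to match the reported figures; the table itself does not state m or n.

```python
import math


def lw_asymptotic_se(m: int) -> float:
    """Asymptotic standard error of the local Whittle estimate of d: 1 / (2 * sqrt(m))."""
    return 1.0 / (2.0 * math.sqrt(m))


# Hypothetical bandwidths (not stated in the table), one per bandwidth column.
assumed_bandwidths = {"n^0.6": 160, "n^0.7": 375, "n^0.8": 875}

# Standard errors rounded to the four decimals used in the table.
lw_se = {label: round(lw_asymptotic_se(m), 4) for label, m in assumed_bandwidths.items()}
```

Under these assumed bandwidths the rounded values agree with the LW columns (0.0395, 0.0258 and 0.0169).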


Table 11: LPWN estimation of long memory in volatility of DJIA stocks

        m = ⌊n^0.6⌋                                        m = ⌊n^0.7⌋                                        m = ⌊n^0.8⌋
Ticker  (1,0)            (0,1)            (1,1)            (1,0)            (0,1)            (1,1)            (1,0)            (0,1)            (1,1)
AA      0.5427 (0.1139)  0.5525 (0.1538)  0.5448 (0.2315)  0.5196 (0.0759)  0.5394 (0.1012)  0.5247 (0.1528)  0.5869 (0.0469)  0.6027 (0.0649)  0.5115 (0.1007)
AIG     0.5872 (0.1097)  0.6785 (0.1487)  0.5887 (0.2279)  0.6156 (0.0701)  0.6068 (0.0990)  0.6107 (0.1483)  0.5833 (0.0470)  0.6006 (0.0650)  0.6209 (0.0969)
AXP     0.9538 (0.0903)  0.8216 (0.1465)  0.9612 (0.2198)  0.9719 (0.0586)  0.6755 (0.0975)  0.8239 (0.1441)  0.5877 (0.0469)  0.6057 (0.0649)  0.7991 (0.0946)
BA      0.5071 (0.1177)  0.5606 (0.1534)  0.5096 (0.2351)  0.7064 (0.0661)  0.6994 (0.0971)  0.7474 (0.1448)  0.6396 (0.0451)  0.5770 (0.0654)  0.7329 (0.0951)
C       0.8962 (0.0923)  0.8305 (0.1465)  0.8982 (0.2195)  0.7761 (0.0636)  0.7036 (0.0970)  0.7396 (0.1449)  0.6577 (0.0446)  0.6307 (0.0645)  0.8016 (0.0946)
CAT     0.6275 (0.1065)  0.6235 (0.1504)  0.6073 (0.2266)  0.3601 (0.0924)  0.3838 (0.1118)  0.6125 (0.1482)  0.4861 (0.0514)  0.5134 (0.0671)  0.4698 (0.1030)
DD      0.4049 (0.1325)  0.4667 (0.1604)  0.4548 (0.2424)  0.4507 (0.0816)  0.4513 (0.1060)  0.4495 (0.1592)  0.4479 (0.0536)  0.4397 (0.0700)  0.3991 (0.1084)
DIS     0.8803 (0.0929)  0.9059 (0.1463)  0.8788 (0.2195)  0.9092 (0.0600)  0.9898 (0.0963)  0.9131 (0.1441)  0.7966 (0.0412)  0.8451 (0.0630)  0.7870 (0.0947)
GE      0.7370 (0.0995)  0.7521 (0.1472)  0.6984 (0.2223)  0.7141 (0.0658)  0.7467 (0.0965)  0.7184 (0.1453)  0.9888 (0.0381)  0.7671 (0.0632)  0.7187 (0.0952)
GM      0.3761 (0.1381)  0.4049 (0.1677)  0.3799 (0.2574)  0.4737 (0.0796)  0.4881 (0.1037)  0.4790 (0.1564)  0.4820 (0.0516)  0.4459 (0.0697)  0.5354 (0.0997)
HD      0.7040 (0.1013)  0.6777 (0.1487)  0.7078 (0.2220)  0.6591 (0.0681)  0.6800 (0.0974)  0.6633 (0.1465)  0.9888 (0.0381)  0.7769 (0.0631)  0.7015 (0.0954)
HON     0.9893 (0.0892)  0.9898 (0.1467)  0.9890 (0.2201)  0.7424 (0.0648)  0.9854 (0.0963)  0.9841 (0.1445)  0.9418 (0.0388)  0.5515 (0.0660)  0.7446 (0.0950)
HPQ     0.8452 (0.0943)  0.8583 (0.1464)  0.9214 (0.2196)  0.9882 (0.0583)  0.8807 (0.0960)  0.9890 (0.1445)  0.8530 (0.0402)  0.9114 (0.0630)  0.9206 (0.0945)
IBM     0.7089 (0.1011)  0.7619 (0.1471)  0.7099 (0.2219)  0.7851 (0.0633)  0.7875 (0.0962)  0.8044 (0.1442)  0.7361 (0.0425)  0.6765 (0.0639)  0.7777 (0.0947)
INTC    0.7274 (0.1000)  0.7428 (0.1473)  0.7266 (0.2215)  0.7571 (0.0643)  0.7736 (0.0963)  0.7606 (0.1447)  0.9879 (0.0381)  0.7136 (0.0635)  0.7686 (0.0948)
JNJ     0.8592 (0.0937)  0.8002 (0.1467)  0.8639 (0.2195)  0.6528 (0.0683)  0.6438 (0.0981)  0.7825 (0.1444)  0.6879 (0.0437)  0.6691 (0.0639)  0.6793 (0.0958)
JPM     0.4847 (0.1204)  0.5400 (0.1546)  0.4904 (0.2374)  0.7293 (0.0652)  0.6681 (0.0976)  0.7001 (0.1456)  0.6627 (0.0444)  0.6469 (0.0642)  0.6157 (0.0971)
KO      0.9505 (0.0904)  0.8090 (0.1466)  0.9498 (0.2198)  0.8450 (0.0616)  0.8158 (0.0961)  0.8256 (0.1441)  0.9896 (0.0381)  0.8388 (0.0630)  0.7714 (0.0948)
MCD     0.3414 (0.1461)  0.4715 (0.1599)  0.3472 (0.2665)  0.6579 (0.0681)  0.6890 (0.0972)  0.6678 (0.1464)  0.9896 (0.0381)  0.7150 (0.0635)  0.6721 (0.0959)
MMM     0.9874 (0.0893)  0.9893 (0.1467)  0.9879 (0.2201)  0.9880 (0.0583)  0.9008 (0.0960)  0.9894 (0.1445)  0.9845 (0.0382)  0.9869 (0.0632)  0.9647 (0.0947)
MO      0.9526 (0.0904)  0.7394 (0.1474)  0.9335 (0.2197)  0.9565 (0.0589)  0.6506 (0.0979)  0.7847 (0.1444)  0.5946 (0.0466)  0.5469 (0.0661)  0.7824 (0.0947)
MRK     0.5558 (0.1126)  0.5643 (0.1532)  0.5546 (0.2306)  0.6002 (0.0709)  0.5832 (0.0997)  0.5917 (0.1491)  0.5732 (0.0474)  0.5396 (0.0663)  0.5713 (0.0983)
MSFT    0.8010 (0.0963)  0.8184 (0.1465)  0.8003 (0.2200)  0.9278 (0.0596)  0.7643 (0.0964)  0.8381 (0.1441)  0.6497 (0.0448)  0.6093 (0.0648)  0.7937 (0.0946)
PFE     0.8072 (0.0960)  0.9896 (0.1467)  0.8145 (0.2199)  0.7372 (0.0649)  0.6778 (0.0974)  0.7131 (0.1454)  0.6184 (0.0458)  0.6149 (0.0647)  0.7040 (0.0954)
PG      0.9260 (0.0913)  0.7646 (0.1470)  0.9320 (0.2196)  0.6881 (0.0668)  0.7028 (0.0970)  0.6932 (0.1458)  0.9086 (0.0393)  0.6644 (0.0640)  0.7960 (0.0946)
SBC     0.9711 (0.0898)  0.8535 (0.1464)  0.9734 (0.2200)  0.9480 (0.0591)  0.6398 (0.0982)  0.8381 (0.1441)  0.6196 (0.0458)  0.5861 (0.0652)  0.7954 (0.0946)
UTX     0.5862 (0.1098)  0.5983 (0.1515)  0.5865 (0.2281)  0.6373 (0.0691)  0.6438 (0.0981)  0.6395 (0.1473)  0.6574 (0.0446)  0.6684 (0.0640)  0.6621 (0.0961)
VZ      0.9228 (0.0914)  0.8350 (0.1464)  0.8787 (0.2195)  0.8688 (0.0610)  0.7483 (0.0965)  0.8166 (0.1442)  0.7044 (0.0433)  0.6157 (0.0647)  0.7801 (0.0947)
WMT     0.8249 (0.0952)  0.7575 (0.1471)  0.8242 (0.2198)  0.8152 (0.0624)  0.8149 (0.0961)  0.8149 (0.1442)  0.8994 (0.0394)  0.7736 (0.0631)  0.9865 (0.0948)
XOM     0.4872 (0.1201)  0.6519 (0.1494)  0.4900 (0.2374)  0.8676 (0.0610)  0.6669 (0.0976)  0.7009 (0.1456)  0.4411 (0.0540)  0.4415 (0.0699)  0.6900 (0.0956)

Note: The heading "(R_y, R_w)" indicates the LPWN(R_y, R_w) estimator. Asymptotic standard errors in parentheses.


Table 12: Parametric Whittle estimation of long memory in volatility of DJIA stocksTicker Symbol d �y �y �2� �x �x �2"

AA       0.5777 (0.0943)  -0.5431 (0.1864)  --                 0.1634 (0.1122)   0.0362 (0.0206)  --                 3.3266 (0.1360)
AIG      0.6377 (0.0748)  -0.7946 (0.1447)  --                 0.2771 (0.1629)  -0.8357 (0.0925)   0.9185 (0.3044)   2.8112 (0.7084)
AXP      0.5915 (0.0695)  --                -0.7416 (0.0863)   1.9245 (0.2775)   0.2959 (0.1167)  --                 1.2799 (0.3486)
BA       0.5532 (0.1019)  -0.9275 (0.2558)   0.6585 (0.2779)   0.1152 (0.0844)  --                 0.0584 (0.0215)   3.2629 (0.1125)
C        0.6201 (0.0671)  --                --                 0.0913 (0.0482)  --                --                 3.1170 (0.0866)
CAT      0.4444 (0.1982)  --                --                 0.2211 (0.4432)   0.6814 (0.1389)  -0.7236 (0.1823)   3.2299 (0.5543)
DD       0.2478 (0.0978)  -0.7859 (0.3492)   0.8867 (0.3360)   0.7682 (0.8560)  --                --                 4.1923 (0.7411)
DIS      0.7555 (0.1331)  --                --                 0.0125 (0.0161)  --                 0.0604 (0.0160)   3.3340 (0.0758)
GE       0.7509 (0.1177)   0.6409 (0.2936)  -0.8718 (0.0901)   0.1393 (0.1774)  --                --                 3.1386 (0.1740)
GM       0.5040 (0.1438)  -0.9730 (0.0440)   0.9371 (0.1043)   0.1624 (0.2038)   0.6010 (0.2992)  -0.5739 (0.2931)   3.2918 (0.2419)
HD       0.6211 (0.1200)   0.4356 (0.0676)  -0.8670 (0.0453)   1.5042 (0.5819)  --                --                 1.8557 (0.5603)
HON      0.4166 (0.0672)  -0.3490 (0.3161)  --                 0.6309 (0.4581)  -0.6202 (0.3682)   0.6475 (0.3093)   2.7333 (0.4812)
HPQ      0.9298 (0.1684)  -0.9010 (0.1339)  --                 0.0085 (0.0133)   0.6844 (0.1604)  -0.6567 (0.1633)   3.3724 (0.0762)
IBM      0.6775 (0.0978)  -0.6652 (0.1972)  --                 0.1063 (0.0743)   0.0325 (0.0196)  --                 3.1645 (0.1037)
INTC     0.7168 (0.0865)  -0.8816 (0.1020)  --                 0.0712 (0.0557)  -0.9341 (0.0319)   0.9686 (0.0836)   3.0632 (0.3243)
JNJ      0.5824 (0.1038)   0.3924 (0.0799)  -0.7923 (0.0630)   0.9809 (0.7210)  --                --                 2.4001 (0.6923)
JPM      0.5798 (0.0624)  --                --                 0.1461 (0.0688)  --                --                 3.1921 (0.1005)
KO       0.8234 (0.1214)  -0.7461 (0.2712)  --                 0.0249 (0.0267)  --                 0.0423 (0.0166)   3.2834 (0.0802)
MCD      0.6211 (0.1290)   0.5379 (0.1005)  -0.8949 (0.0414)   0.8105 (0.4203)  --                --                 2.6296 (0.4025)
MMM      0.7032 (0.1748)  --                --                 0.0128 (0.0236)  --                --                 3.5654 (0.0871)
MO       0.5410 (0.0760)   0.6217 (0.3652)  --                 0.0185 (0.0339)  --                 0.0349 (0.0194)   3.2044 (0.0876)
MRK      0.4903 (0.0764)  --                --                 0.1430 (0.0873)  --                --                 3.2380 (0.1142)
MSFT     0.5987 (0.0725)  -0.7656 (0.1451)  --                 0.2990 (0.1846)  -0.8105 (0.0887)   0.8940 (0.2491)   2.8675 (0.6008)
PFE      0.6093 (0.0850)  --                --                 0.0777 (0.0428)  --                --                 3.3014 (0.0872)
PG       0.5724 (0.0741)  --                --                 0.0901 (0.0565)  --                --                 3.1889 (0.0942)
SBC      0.5518 (0.0657)  --                --                 0.1294 (0.0681)  --                --                 3.2578 (0.1018)
UTX      0.6159 (0.1041)   0.5142 (0.0943)  -0.8598 (0.0475)   0.8469 (0.4559)  --                --                 2.5056 (0.4314)
VZ       0.6778 (0.1022)  -0.5928 (0.3156)  --                 0.1039 (0.0747)  --                 0.0714 (0.0240)   3.2410 (0.1074)
WMT      0.8328 (0.1229)  --                --                 0.0078 (0.0087)  --                 0.0363 (0.0154)   3.4027 (0.0733)
XOM      0.4962 (0.0698)  --                --                 0.1532 (0.0839)  --                --                 3.2217 (0.1109)

Note: Asymptotic standard errors (evaluated as the inverse of the negative Hessian) in parentheses.




Long-run dependencies in return volatility and trading volume*

Frank S. Nielsen†

University of Aarhus and CREATES

March 20, 2009

Abstract

By using the log-periodogram estimator of Geweke & Porter-Hudak (1983), Bollerslev & Jubinski (1999) show that volatility and trading volume of the S&P100 common stocks have a similar degree of fractional integration. This evidence of pairwise correspondence between estimates of long memory across the volatility-volume series supports a long-run view of the Mixture of Distributions Hypothesis, i.e. both processes are driven by a slowly mean-reverting fractionally integrated latent information process. Instead of using the log-periodogram estimator, which is biased in the context of perturbed fractional processes (e.g. long memory stochastic volatility models), we use a semiparametric estimator that is robust to time series perturbed by a noise term which may be serially correlated. We find evidence that volatility in general displays a higher degree of long memory than trading volume for the S&P100 common stocks. Additionally, volatility displays evidence of being governed by a long memory stochastic volatility model, whereas this is not generally the case for trading volume. Furthermore, we find little evidence of there being a cointegrating relation between volatility and trading volume.

Keywords: Fractional integration, local Whittle estimation, long memory, mixture of distributions hypothesis, perturbed fractional process, return volatility, stochastic volatility, trading volume.

JEL Classification: C22

* I am grateful to Tim Bollerslev, Jörg Breitung, Niels Haldrup, and Morten Ørregaard Nielsen for valuable suggestions and comments. I would also like to thank Birgitte Højklint Nielsen for proofreading the manuscript. I gratefully acknowledge financial support from the Danish Social Sciences Research Council (grant no. FSE275-05-0199) and the Center for Research in Econometric Analysis of Time Series (CREATES), funded by the Danish National Research Foundation.

† CREATES, School of Economics and Management, University of Aarhus, Building 1322, DK-8000 Aarhus C, Denmark, email: [email protected].


1 Introduction

In this paper, we are interested in characterizing the long-run joint volatility-volume relationship in the context of the Mixture of Distributions Hypothesis (MDH), put forward by Clark (1973), Epps & Epps (1976), and Tauchen & Pitts (1983). The MDH asserts that returns and trading volume are jointly dependent on the same underlying latent information arrival process. As we focus on the long-run relationship, we use semiparametric techniques, as these give a convenient way of avoiding the difficulties of modeling the short-run dynamics. Other specifications that have been used to analyze volatility, e.g. autoregressive conditionally heteroskedastic (ARCH) and generalized ARCH (GARCH) type models originally introduced by Engle (1982) and Bollerslev (1986), stochastic volatility (SV) models proposed e.g. by Taylor (1994), and the long memory SV (LMSV) model put forward by Breidt, Crato & de Lima (1998) and Harvey (1998), give us little or no information about the economic sources of the persistence. Structural models, which build on theoretical market microstructure models where asymmetrically informed rational agents interact strategically, can tell us about these potential sources by linking volatility and trading volume and thereby obtaining information regarding the subordinated process driving the volatility process. But in general we do not learn about the long-run relationship between volatility and trading volume. These subordination model approaches include Tauchen & Pitts (1983), who extend Clark's (1973) subordination model within the Grossman & Stiglitz (1980) noisy rational expectations framework. Further, Lamoureux & Lastrapes (1994) extend the Tauchen & Pitts (1983) bivariate mixture model by assuming a dependent directing process, and they find no evidence that volatility persistence is driven by trading volume. Andersen (1996), building upon the Glosten & Milgrom (1985) market microstructure model, defines the modified MDH with Poisson distributed trading volume that explicitly allows for non-informational trading and common information arrivals. This modified MDH is empirically shown to fit better than the classical MDH, but the results remain mixed in the sense that Liesenfeld (1998) tests Andersen's (1996) model assuming a dependent directing process, and he finds no evidence that volatility persistence is driven by trading volume. More recently, Liesenfeld (2001) and Watanabe (2000, 2003) showed that the assumption that the joint distribution of volume and volatility is a mixture of normals remains valid when incorporating more than one type of information arrival process having different implications for the volume and volatility persistence. Bollerslev & Jubinski (1999) contribute to the literature by looking at the long-run relationship using semiparametric techniques, and thereby not having to specify the short-run dynamics governing the latent informational process driving volatility and trading volume. The potential long memory feature of the directing process may be a factor in the rejection of the earlier models. Bollerslev & Jubinski (1999) find that volatility and volume have a similar degree of memory (0.41 and 0.40, respectively). Furthermore, for only 8 of the 100 firms in the S&P100 index can they reject that volatility and volume share a common fractional integration order. This leads Bollerslev & Jubinski (1999) to argue in favor of a long-run view of the MDH in which the aggregate latent information arrival process displays long memory characteristics.

We contribute to the literature in several ways. As opposed to Bollerslev & Jubinski (1999), who use the log-periodogram (LP) estimator, we adopt the local Whittle (LW) setting. Robinson (1995a) showed that the standard LW estimator dominates the LP estimator in several ways. It is asymptotically more efficient and it does not assume Gaussianity. Both the LP and LW estimators are semiparametric, using the approximation

$$ f_z(\lambda) \sim G\lambda^{-2d} \quad \text{as } \lambda \to 0^+, \tag{1} $$

where $G$ is a constant and "$\sim$" means that the ratio of the left- and right-hand sides tends to one in the limit. Thus, the estimators are robust against short-run dynamics since they only use information from the periodogram ordinates in the vicinity of the origin. But as shown by e.g. Andrews & Guggenberger (2003) and Andrews & Sun (2004), if the signal is contaminated by persistent short-run components the estimate of $d$ will be biased. We can reduce the asymptotic bias by modeling the spectral density of the short-run component by a polynomial instead of a constant in the vicinity of the origin. Additionally, if the process that we try to model is a perturbed fractional process

$$ z_t = y_t + w_t, \tag{2} $$

where the signal $y_t$ is a long memory process which is perturbed by the additive term $w_t$ (e.g. short memory measurement error), we potentially introduce bias into the estimate of the long memory parameter if we do not take the perturbation into account. The validity of the LP and LW estimators is to some extent diminished when the series of interest follows e.g. an LMSV model, as the rate of convergence can be considerably slower than in a pure long memory setting. Although the LP and LW estimators preserve consistency and asymptotic normality when applied to the LMSV model, several papers, e.g. Deo & Hurvich (2001), Arteche (2004b), and Frederiksen, Nielsen & Nielsen (2008), show both theoretically and via simulations that they are heavily biased in that case.
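The bias described above can be illustrated with a small simulation. The sketch below implements the standard LW objective of Robinson (1995a), R(d) = log((1/m) Σ λ_j^{2d} I_j) − 2d (1/m) Σ log λ_j, and minimizes it over a grid. The function names, the grid search, the truncated moving-average simulation of the fractional process, the bandwidth exponent 0.7 and the noise variance of 10 are all illustrative assumptions, not choices taken from the paper.

```python
import numpy as np


def periodogram(x, m):
    """Periodogram at the first m Fourier frequencies lambda_j = 2*pi*j/n."""
    n = len(x)
    dft = np.fft.fft(x - np.mean(x))[1:m + 1]
    freqs = 2.0 * np.pi * np.arange(1, m + 1) / n
    return freqs, np.abs(dft) ** 2 / (2.0 * np.pi * n)


def local_whittle(x, m, grid=None):
    """Grid minimizer of R(d) = log(mean(lambda^(2d) * I)) - 2d * mean(log lambda)."""
    if grid is None:
        grid = np.linspace(-0.49, 0.99, 1481)
    freqs, pgram = periodogram(np.asarray(x, dtype=float), m)
    mean_log_freq = np.mean(np.log(freqs))
    objective = [np.log(np.mean(freqs ** (2.0 * d) * pgram)) - 2.0 * d * mean_log_freq
                 for d in grid]
    return float(grid[int(np.argmin(objective))])


def simulate_fi(n, d, rng, burn=2000):
    """ARFIMA(0,d,0) via a truncated MA representation, psi_k = psi_{k-1}*(k-1+d)/k."""
    total = n + burn
    k = np.arange(1, total)
    psi = np.concatenate(([1.0], np.cumprod((k - 1 + d) / k)))
    eps = rng.standard_normal(total)
    return np.convolve(eps, psi)[:total][burn:]


rng = np.random.default_rng(12345)
n, d_true = 4096, 0.4
m = int(n ** 0.7)  # bandwidth, illustrative choice

signal = simulate_fi(n, d_true, rng)
noise = np.sqrt(10.0) * rng.standard_normal(n)  # white-noise perturbation w_t

d_signal = local_whittle(signal, m)             # LW on the pure signal y_t
d_perturbed = local_whittle(signal + noise, m)  # LW on z_t = y_t + w_t
```

In this setup the estimate from the perturbed series falls below the estimate from the unperturbed signal, which is the downward bias that motivates modeling the perturbation explicitly.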

A further motivation for the perturbed process is the version of the LMSV model for financial returns indirectly proposed by Bollerslev & Jubinski (1999),

$$ r_t = \sigma\sqrt{e^{y_t + x_t}}\,u_t, \tag{3} $$

where $r_t$ denotes the return, $y_t$ the long memory component of the log-volatility of the returns, $x_t$ is a short-memory process, and $y_t$, $x_t$, and $u_t$ are independent to satisfy the requirement that $E(r_t) = 0$. This is an extension of the LMSV model of Breidt et al. (1998) and Harvey (1998),

$$ r_t = \sigma\sqrt{e^{y_t}}\,u_t. \tag{4} $$

It is clear from the formulation in (3) that we allow for different short-lived news impacts while imposing a common long memory component. Therefore, the volatility is allowed to be affected by both long- and short-lived news impacts, which is consistent with the findings of Liesenfeld (2001). This may provide a better characterization of the joint volatility-volume relationship in the context of the MDH.

By considering the perturbed fractional process (2), i.e. the signal-plus-noise model, the LMSV models (3) and (4) imply that the log squared returns become a long memory signal-plus-noise process as given by (2), where $y_t$ is the long memory component (log-volatility part) and $w_t$ is the short-memory part (additive noise term) of the original series $z_t$. If we assume that the log-volatility process $\{y_t\}$ and the noise process $\{w_t\}$ are independent, the spectral density of $z_t$ can be written as

$$ f_z(\lambda) = \lambda^{-2d}\phi_y(\lambda) + \phi_w(\lambda) = \lambda^{-2d} G \left( \frac{\phi_y(\lambda)}{\phi_y(0)} + \lambda^{2d}\,\frac{\phi_w(\lambda)}{\phi_y(0)} \right), \tag{5} $$

where $f_y(\lambda) = \lambda^{-2d}\phi_y(\lambda)$ is the spectrum of the signal with $G = \phi_y(0)$, $\phi_w(\lambda)$ is the spectrum of the noise term $w_t$, and $d$ is the degree of long memory in $y_t$ (and $z_t$).¹
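The step from the LMSV model to the signal-plus-noise form can be written out explicitly. The derivation below reads (3) as $r_t = \sigma \exp\{(y_t + x_t)/2\}\,u_t$, which is an assumption about the intended form; the constant $\mu$ and the centering of $\log u_t^2$ are introduced here for illustration only.

```latex
\begin{align*}
\log r_t^2 &= \log\sigma^2 + y_t + x_t + \log u_t^2 \\
           &= \mu + y_t + w_t, \\
\mu &= \log\sigma^2 + E\left[\log u_t^2\right], \qquad
w_t = x_t + \left(\log u_t^2 - E\left[\log u_t^2\right]\right).
\end{align*}
```

Setting $x_t = 0$ gives the corresponding decomposition for (4). Under this reading the perturbation $w_t$ collects the short-memory process $x_t$ and the centered log squared innovation, so it has mean zero and no long memory of its own.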

By using a semiparametric estimator that allows both the spectrum of the short-memory component of the signal and the spectrum of the perturbation, i.e. $\phi_y(\lambda)$ and $\phi_w(\lambda)$, to be approximated by polynomials of finite and even orders near the zero frequency, instead of constants, we obtain a bias reduction depending on the smoothness of $\phi_y(\lambda)$ and $\phi_w(\lambda)$ near the origin. The estimator employed is the local polynomial Whittle with noise (LPWN) estimator derived by Frederiksen et al. (2008), which explicitly models both of these spectra and thereby mitigates this bias. In this paper, we measure return volatility based on daily returns. This measure is noisier than realized volatility based on intraday returns, see e.g. Fleming & Kirby (2006). But by using semiparametric methods that mitigate this noise, i.e. the LPWN estimator applied to (2) and (3), we deal with it appropriately.

In addition to estimating the univariate long-run persistence of volatility and trading volume, we also employ a test for long memory vs. spurious long memory, where we use the notion of sample splitting (and temporal aggregation) and test whether there is parameter constancy. Regarding sample splitting, we extend the results of Shimotsu (2006) to include robustness to perturbed fractional processes. Exploiting the long memory parameter's invariance to temporal aggregation, Frederiksen & Nielsen (2008) derive the joint distributional properties of different semiparametric estimators for fractionally integrated processes, potentially perturbed and non-stationary, applied to different aggregation levels of the original series, and introduce a formal test of true long memory.

We further model return volatility and trading volume in the context of fractional cointegration. The setup used is the semiparametric cointegration rank determination procedure of Robinson & Yajima (2002) and Nielsen & Shimotsu (2007). Nielsen & Shimotsu (2007) extend the work of Robinson & Yajima (2002) to accommodate both (asymptotically) stationary and nonstationary fractionally integrated processes and cointegrating errors by applying the exact local Whittle (ELW) setting of Shimotsu & Phillips (2005). In the analysis of the joint volume-volatility relationship, we examine the rank of the spectral density matrix around the origin, where the fractional integration order, $d$, is estimated by the LW estimator. Using the LW setup, we face the same problem as with other semiparametric estimators that do not model a potential perturbation by noise. That is, we are subject to size distortions in the rank test as we do not take the potential bias of the LW estimator into account.

¹ By using the independence assumption between $y_t$ and $w_t$, we exclude the so-called leverage effect. The independence assumption is common in the literature concerning LMSV, see Breidt et al. (1998), Deo & Hurvich (2001) and Arteche (2004b) among others. This independence assumption could be relaxed; see Frederiksen et al. (2008) for a discussion of the possibilities in that direction.

Our results show that, using semiparametric estimators that are robust to potential contamination by an additive noise term, there is evidence that volatility and trading volume are more persistent in terms of memory than other studies have shown. Additionally, volatility displays evidence of being governed by a perturbed fractional process, more specifically an LMSV model, whereas this is not generally the case for trading volume. In the cases where we cannot reject that the memory parameters are equal, we furthermore find only weak evidence of there being a cointegrating relation between volatility and trading volume. This is in line with other studies showing that although volatility and volume might share a common fractional integration order, they do not move together over time.

The remainder of the paper is organized as follows. Section 2 discusses volatility and trading volume in the context of the MDH and how long memory naturally relates to this subject. Section 3 introduces the data used in the analysis. In Section 4, we set up the methodology used in the analysis of volatility and trading volume, and Section 5 presents the empirical results. Section 6 concludes.

2 Volatility, trading volume, and the Mixture of Distributions Hypothesis

In this section, we give a short overview of the Mixture of Distributions Hypothesis (MDH) literature. Tauchen & Pitts (1983) analyzed a bivariate mixture model (the standard mixture model) where the movement from one equilibrium to another is motivated by the arrival of new information to the market. At the $i$th equilibrium the $j$th trader's desired position in the security is $q^*_{ij} = \alpha\left(P^*_{ij} - P_i\right)$, where $P^*_{ij}$ is the trader's reservation price, $P_i$ is the current market price, and $\alpha$ is a positive constant. In equilibrium the market clears and the market price is $(1/J)\sum_{j=1}^{J} P^*_{ij}$. Hence, the market price only changes whenever new information arrives, and the return is given by

$$ R_i \equiv dP_i = (1/J)\sum_{j=1}^{J} dP^*_{ij}, \tag{6} $$

where it is assumed that the changes in traders' reservation prices follow

$$ dP^*_{ij} = \phi_i + \psi_{ij}, \quad \phi_i \sim \text{i.i.d. } N\left(0, \sigma^2_\phi\right) \text{ and } \psi_{ij} \sim \text{i.i.d. } N\left(0, \sigma^2_\psi\right). \tag{7} $$

Here $\phi_i$ and $\psi_{ij}$ represent a common (market-wide) component and an idiosyncratic component specific to the $j$th trader, respectively. The parameters $\sigma^2_\phi$ and $\sigma^2_\psi$ measure the sensitivity of traders' reservation prices with respect to new information.

Assuming that the number of traders $J$ (assumed to be large) and the variances $\sigma^2_\phi$ and $\sigma^2_\psi$ are constant over time, the daily return, $R_t$, and volume, $V_t$, are mutually and serially independent and normally distributed conditional on the number of information arrivals,

$$ R_t \mid I_t \sim N\left(0, \sigma^2_{r_i} I_t\right), \tag{8} $$

$$ V_t \mid I_t \sim N\left(\mu_0 + \mu_{v_i} I_t, \sigma^2_{v_i} I_t\right), \tag{9} $$

where $I_t$ is the daily number of information arrivals, $R_t$ and $V_t$ are given by the sums of the intradaily returns, $R_i$, and trading volumes, $V_i$, respectively, and $\mu_0$ captures the part of trading volume generated by liquidity traders (a noise component that is not influenced by the information flow). As seen from (8) and (9), the dynamics of the volatility of returns and of trading volume depend on the time-series behavior of $I_t$. Hence, if we allow for strong serial correlation in $I_t$, the model predicts persistence and clustering in return volatility, which is empirically well founded, see e.g. Engle (1982), Engle & Bollerslev (1986), Bollerslev (1986). Since unanticipated news often tends to be followed by several explanations related to the initial news, it is sensible to assume some degree of persistence in the information arrival process, and if the news impact even lasts for a random number of subsequent days, it follows from Parke (1996) (see Bollerslev & Jubinski (1999)) that the resulting latent aggregated information arrival process will be fractionally integrated.
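The prediction that a persistent information process induces volatility clustering and a volatility-volume link can be checked in a small simulation of (8) and (9). This is a minimal sketch under stated assumptions: the log of the information process is taken to be a Gaussian AR(1) purely for convenience (the text only requires persistence), and all parameter values and variable names are illustrative rather than taken from the paper. Under the normal approximation in (9) simulated volume can occasionally be negative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Assumption: log I_t follows a Gaussian AR(1); chosen only to make I_t persistent.
rho, innov_sd = 0.95, 0.3
log_info = np.zeros(n)
for t in range(1, n):
    log_info[t] = rho * log_info[t - 1] + innov_sd * rng.standard_normal()
info = np.exp(log_info)

# Illustrative parameter values, not estimates from the paper.
sigma_r, mu_0, mu_v, sigma_v = 1.0, 1.0, 1.0, 0.5

# Eq. (8): returns are conditionally normal with variance proportional to I_t.
returns = sigma_r * np.sqrt(info) * rng.standard_normal(n)
# Eq. (9): volume is conditionally normal with mean and variance driven by I_t.
volume = mu_0 + mu_v * info + sigma_v * np.sqrt(info) * rng.standard_normal(n)


def lag1_autocorr(x):
    """Sample first-order autocorrelation."""
    x = x - x.mean()
    return float(np.dot(x[1:], x[:-1]) / np.dot(x, x))


acf_raw_returns = lag1_autocorr(returns)          # close to zero
acf_abs_returns = lag1_autocorr(np.abs(returns))  # clearly positive: clustering
corr_absret_volume = float(np.corrcoef(np.abs(returns), volume)[0, 1])
```

Returns themselves are serially uncorrelated in this setup, while absolute returns are autocorrelated and positively correlated with volume, because both inherit the dynamics of the common process $I_t$.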

Several extensions to the standard MDH have been proposed. In contrast to the Tauchen & Pitts (1983) model, where all traders act non-strategically, Andersen (1996) showed that by modeling the strategic interaction of informed investors and liquidity traders with a risk-neutral market maker, i.e. the setting of Glosten & Milgrom (1985), the daily trading volume conditional on some latent distribution process $I_t$ is approximated by the Poisson distribution, i.e.

$$ V_t \mid I_t \sim \mathrm{Po}\left(m_0 + m_1 I_t\right), \tag{10} $$

where $m_0$ captures the part of trading volume generated by liquidity traders (noise component), whereas $m_1$ is the component related to the information flow governed by the latent distribution process $I_t$. By deriving the trading volume conditional on the information flow, the nonnegativity constraint is always ensured, and by allowing for a nonzero constant $m_0$, the parameterization is more flexible and allows for a nonproportional relationship. Andersen (1996) showed that his modified mixture model is superior to the standard mixture model of Tauchen & Pitts (1983), but the bivariate system implied a significantly lower volatility persistence than empirically observed. This led to the conclusion that more than one type of information arrival process, having different implications for the volume and return volatility persistence, should be incorporated.

This problem is to some extent alleviated in the generalized mixture model of Liesenfeld (2001), where he allows for time dependence in the variances $\sigma^2_\phi$ and $\sigma^2_\psi$ in the Tauchen & Pitts (1983) model, see (7), which are directed by a common latent random process, $\omega_t$. Since $\omega_t$ is very persistent, the generalized mixture model alleviates the shortcomings of the modified model by Andersen (1996). However, as the volume specification is very close to that of the standard model, it is not surprising that the generalized mixture model does not remove the autocorrelation in the (squared) standardized volume residuals found in the standard MDH.

A similar modification is proposed by Watanabe (2000, 2003), where the volume dynamics are as in the standard MDH, setting $\mu_0 = 0$, and the return dynamics are influenced by two latent variables, $I_t$ and $J_t$,

$$ R_t \mid I_t, J_t \sim N\left(0, \sigma^2_{r_i} I_t J_t\right), \tag{11} $$

where $\log(I_t)$ and $\log(J_t)$ both follow zero-mean Gaussian AR(1) processes. Similar to Liesenfeld (2001), Watanabe (2000, 2003) finds that the latent variable that only influences the volatility is more persistent than the common latent factor but less volatile. However, the model suffers the same drawbacks as the generalized mixture model of Liesenfeld (2001) regarding the volume specification.²

Overall, and as noted by Bollerslev & Jubinski (1999), by forcing the same short-run dependence for

the two series, the long-run persistence may be underestimated. As we use semiparametric techniques

and therefore do not model the short-run dynamics, we conveniently disregard this discussion. In

this paper, we are particularly interested in the persistence and the long-run relationship between

volatility and trading volume. This is motivated by the arguments given in Andersen (1996), Andersen

& Bollerslev (1997a, 1997b), Bollerslev & Jubinski (1999), among others, i.e. a long-run proposition

for MDH where we have the following long-run link for large �

corr (jRtj ; jRt�� j) � �2d�1; (12)

corr (Vt; Vt�� ) � �2d�1;

corr (jRtj ; jRt�� j) � corr (Vt; Vt�� ) � �2d�1;

given some distributional assumptions.3
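As a numerical illustration of the hyperbolic decay in (12), the following minimal sketch compares the exact autocorrelations of a stationary ARFIMA(0, d, 0) process with the large-lag approximation $c\,\tau^{2d-1}$. The choice of ARFIMA(0, d, 0) as the example process, the value d = 0.4, and the function names are assumptions made purely for illustration; they are not part of the empirical analysis.

```python
import math


def arfima_acf(lag: int, d: float) -> float:
    """Exact lag autocorrelation of a stationary ARFIMA(0, d, 0) process, 0 < d < 1/2.

    Uses rho(k) = Gamma(1-d) Gamma(k+d) / (Gamma(d) Gamma(k+1-d)),
    evaluated on the log scale for numerical stability.
    """
    log_rho = (math.lgamma(1 - d) + math.lgamma(lag + d)
               - math.lgamma(d) - math.lgamma(lag + 1 - d))
    return math.exp(log_rho)


def hyperbolic_approx(lag: int, d: float) -> float:
    """Large-lag approximation c * lag**(2d - 1) with c = Gamma(1-d) / Gamma(d)."""
    c = math.exp(math.lgamma(1 - d) - math.lgamma(d))
    return c * lag ** (2 * d - 1)


d = 0.4  # illustrative memory parameter (assumption)
# Ratio of exact to approximate autocorrelation; it approaches 1 as the lag grows.
ratios = {lag: arfima_acf(lag, d) / hyperbolic_approx(lag, d) for lag in (10, 100, 1000)}
```

The ratio in `ratios` tends to one as the lag increases, which is the sense in which the autocorrelations decay like $\tau^{2d-1}$.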

3 Data

We consider common stock return volatility and trading volume of the S&P 100 shares included in the index on December 1, 2006. Share price, trading volume, number of outstanding shares, and adjustment coefficients to adjust share prices for dividends and splits were downloaded from the Center for Research in Security Prices (CRSP) database. The sample period runs from July 2, 1962 through December 31, 2006. Following Bollerslev & Jubinski (1999), among others, we remove the three weeks following the October 1987 crash.$^{4}$ We only consider the 45 shares for which the full sample of daily observations is available, i.e. $n = 11{,}187$.

Returns (for the $j$th common stock) are measured as daily continuously compounded rates, $R_{jt} = \log(P_{jt}/P_{j,t-1})$, and were corrected for the effects of stock splits and dividends using the correction factor in the CRSP database. To avoid the problem of taking the logarithm of zero, we base the analysis on adjusted log-squared/absolute returns using the method of Fuller (1996), i.e. we analyze

$$\tilde r_{jt}^{2} = \log \tilde R_{jt}^{2} = \log\left(R_{jt}^{2} + \kappa\right) - \frac{\kappa}{R_{jt}^{2} + \kappa},$$

2 In ongoing work, Frederiksen & Nielsen (2009) use maximum likelihood (using efficient importance sampling) to estimate a three-factor bivariate mixture model where the common information arrival process is assumed to possess long memory according to a Gaussian ARFIMA(1, d, 0) process and the distinct dynamics of the volume and volatility follow Gaussian AR(1) processes. It is necessary to apply at least two-factor bivariate models like those of Liesenfeld (2001) and Watanabe (2000, 2003), as one-factor models do not alleviate the problem that the persistence parameter of the (common) information arrival process is too small compared to univariate empirical findings, see Andersen (1996), Liesenfeld (2001) and Watanabe (2000, 2003).

3 For more on the definition of fractional processes and the assumptions needed in linking this to the hyperbolic decay of the autocorrelations, $\tau^{2d-1}$, see e.g. Beran (1994) and Robinson (2003).

4 We also carried out the analysis on the full sample and on further reduced samples, e.g. removing all observations between December 23 and January 2, as done in Andersen (1996). The results were qualitatively similar.

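where $\kappa = \frac{0.02}{n}\sum_{t=1}^{n} R_{jt}^{2}$, and

$$|\tilde r_{jt}| = \log|\tilde R_{jt}| = \log\left(|R_{jt}| + \kappa\right) - \frac{\kappa}{|R_{jt}| + \kappa},$$

where now $\kappa = \frac{0.02}{n}\sum_{t=1}^{n} |R_{jt}|$. Furthermore, following Bollerslev & Jubinski (1999), the daily trading volume for share $j$, $V_{jt}$, is measured by the turnover ratio$^{5}$

$$V_{jt} = \frac{S_{jt}}{N_{jt}},$$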

where $S_{jt}$ and $N_{jt}$ are the common stock volume and the total number of outstanding shares for common stock $j$ at time $t$, respectively. Instead of working with $V_{jt}$, we consider $v_{jt} = \log(V_{jt})$. We need to deal with the deterministic trends in the volume measure before moving on to the analysis. Firstly, trading volume is an increasing function of time, at least until mid-1997, after which it stabilizes somewhat. Secondly, trading is significantly lower between Christmas and New Year's Eve, see Andersen (1996).
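The transformations described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are hypothetical, NumPy is assumed to be available, and the inputs are taken to be plain arrays of daily returns, share volume, and shares outstanding for a single stock.

```python
import numpy as np


def fuller_log_squared(returns, scale=0.02):
    """Adjusted log-squared returns: log(R^2 + k) - k / (R^2 + k).

    The offset k is `scale` times the sample mean of R^2, so the
    transformation stays finite when a return is exactly zero.
    """
    r2 = np.asarray(returns, dtype=float) ** 2
    k = scale * r2.mean()
    return np.log(r2 + k) - k / (r2 + k)


def fuller_log_absolute(returns, scale=0.02):
    """Adjusted log-absolute returns, with the offset based on the mean of |R|."""
    a = np.abs(np.asarray(returns, dtype=float))
    k = scale * a.mean()
    return np.log(a + k) - k / (a + k)


def log_turnover(share_volume, shares_outstanding):
    """Log turnover ratio v = log(S / N)."""
    s = np.asarray(share_volume, dtype=float)
    n = np.asarray(shares_outstanding, dtype=float)
    return np.log(s / n)


# Tiny made-up input (not data from the paper), including a zero return.
example_returns = np.array([0.0, 0.01, -0.02, 0.005])
adj_sq = fuller_log_squared(example_returns)
adj_abs = fuller_log_absolute(example_returns)
```

For a zero return the adjusted value equals the logarithm of the offset minus one, so no observation is lost to the logarithm of zero.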

By log-transforming the raw series, we bring its trend closer to a straight line, and instead of employing a linear detrending procedure as in Bollerslev & Jubinski (1999), we regress the log-transformed turnover ratio, $v_{jt}$, on a constant, a time trend, a squared time trend, and a dummy equal to 1 if $t \in$ [Christmas, January 1]. A sensitivity analysis using other detrending methods, i.e. linear detrending (excluding the dummy variable) or including polynomial terms up to cubic order, did not change the conclusions in a qualitative sense.
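The detrending regression can be sketched as an ordinary least squares fit whose residuals form the detrended series. The function name, the rescaling of the time index, and the way the holiday dummy is passed in are assumptions made for this illustration; the paper does not describe its implementation.

```python
import numpy as np


def detrend_log_turnover(v, holiday_dummy):
    """Residuals from regressing v on a constant, t, t^2 and a holiday dummy.

    `holiday_dummy` is 1 for days between Christmas and January 1 and 0 otherwise.
    Time is rescaled to (0, 1]; this changes the coefficients but not the
    residuals, and it keeps the design matrix well conditioned.
    """
    v = np.asarray(v, dtype=float)
    n = v.size
    t = np.arange(1, n + 1) / n
    X = np.column_stack([np.ones(n), t, t ** 2, np.asarray(holiday_dummy, dtype=float)])
    beta, *_ = np.linalg.lstsq(X, v, rcond=None)
    return v - X @ beta


# Synthetic check (made-up numbers): a series that is exactly quadratic in time
# plus a holiday effect should leave residuals that are numerically zero.
n_obs = 200
t_scaled = np.arange(1, n_obs + 1) / n_obs
holiday = np.zeros(n_obs)
holiday[50:55] = 1.0
holiday[150:155] = 1.0
v_synth = 1.0 + 2.0 * t_scaled + 3.0 * t_scaled ** 2 - 0.5 * holiday
residuals = detrend_log_turnover(v_synth, holiday)
```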

Summary statistics of the return data and the detrended log turnover ratio for the Minnesota Mining and Manufacturing Co. (3M), International Business Machines Corporation (IBM), and Aluminum Company of America (AA) are given in Tables 1 and 2, respectively. For the return data, the mean is not statistically different from zero, the sample standard deviation is considerably larger than the mean for all three shares, and all three exhibit excess kurtosis and non-normality. Looking at the respective subperiods for the returns, the mean is again not significantly different from zero, and the sample standard deviation is considerably larger than the mean and quite stable over subperiods for all three shares. There is considerable kurtosis, especially in subperiod 5 (October 1984 through April 1990) for the shares 3M and AA. Furthermore, there is a high degree of own serial correlation for the shares 3M and AA, whereas it is less pronounced for the IBM share. Turning next to the detrended log turnover ratio and comparing it to the return series, the excess kurtosis is less pronounced, even in subperiod 5. What is apparent is the considerable amount of own serial correlation, which is much higher than for the return series.

Insert Tables 1-2 about here.

Looking at Figure 1, where we have plotted the autocorrelation function for the return, adjusted log-squared return, adjusted log-absolute return, and detrended log turnover ratio, there is clearly more persistence in the detrended log turnover ratio than in the adjusted log-squared and adjusted log-absolute returns. This is consistent with the descriptive statistics in Tables 1 and 2.

Insert Figure 1 about here.

5 There are several other measures of trading activity which might be of interest, for example the number of trades per period, share volume, value of shares traded, relative dollar volume, and dollar turnover.

If we were to plot the cross-correlation between return volatility and our measure of trading volume, we would see a significant positive relationship, which is consistent with Karpoff (1987) and Andersen (1996), among others.

4 Methodology

This section contains the methodology used in our empirical investigation. We first briefly describe local Whittle estimation of perturbed processes and the estimators proposed by Frederiksen et al. (2008), which are especially suited for this setting. Secondly, we discuss the subject of long memory versus spurious long memory, where we focus on a Wald test of parameter constancy derived by Shimotsu (2006) in a sample splitting context and by Ohanissian, Russell & Tsay (2008) and Frederiksen & Nielsen (2008) in a temporal aggregation context. Finally, we describe the semiparametric fractional cointegration methodology derived in Robinson & Yajima (2002) and Nielsen & Shimotsu (2007), which is used in analyzing the long-run relation between volatility and trading volume.

4.1 Local Whittle estimation of perturbed fractional processes

For perturbed fractional processes, we have the spectral representation (5) rather than (1). There are two main consequences: first, the extra additive term in (5) needs to be taken into account to avoid serious asymptotic bias, as mentioned in the introduction, and second, the rate of convergence of the estimators is reduced if the extra term is not modeled. The latter follows because the choice of bandwidth parameter is severely constrained for perturbed fractional processes when the perturbation term in (5) is not modeled. Thus, for non-perturbed processes the bandwidth requirement is typically $m = o(n^{4/5})$, whereas for perturbed processes it is $m = o(n^{2d/(1+2d)})$ (apart from logarithmic terms). Since $d < 1$ and the estimator is $\sqrt{m}$-consistent, this is a serious constraint, because it further restricts the number of periodogram ordinates that can be used in the estimation to yield consistent estimates of the long memory parameter.

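Frederiksen et al. (2008) propose to approximate (5) locally near the zero frequency by

$$g(\lambda) = \lambda^{-2d} G \left(1 + h_y(\theta_y, \lambda) + \lambda^{2d} h_w(\theta_w, \lambda)\right), \qquad (13)$$

where $h_y(\theta_y, \lambda) = \sum_{r=1}^{R_y} \theta_{y,r}\lambda^{2r}$ and $h_w(\theta_w, \lambda) = \sum_{r=0}^{R_w} \theta_{w,r}\lambda^{2r}$. We note that $\phi_y(\lambda)$ and $\phi_w(\lambda)$ are symmetric around $\lambda = 0$ and are therefore approximated by even polynomials. If $R_y = 0$, set $h_y(\theta_y, \lambda) = 0$. Defining also the polynomial $h(d, \theta, \lambda) = h_y(\theta_y, \lambda) + \lambda^{2d} h_w(\theta_w, \lambda)$ with $\theta = (\theta_y', \theta_w')'$, this yields the (concentrated) likelihood

$$Q(d, \theta) = \log \hat G(d, \theta) + \frac{1}{m}\sum_{j=1}^{m} \log\left(\lambda_j^{-2d}\left(1 + h(d, \theta, \lambda_j)\right)\right), \qquad (14)$$

$$\hat G(d, \theta) = \frac{1}{m}\sum_{j=1}^{m} \frac{\lambda_j^{2d} I_z(\lambda_j)}{1 + h(d, \theta, \lambda_j)}. \qquad (15)$$

Thus, we minimize (14) over the admissible set $D \times \Theta$,

$$(\hat d, \hat\theta) = \arg\min_{(d, \theta) \in D \times \Theta} Q(d, \theta), \qquad (16)$$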

where $\Theta$ is a compact and convex set in $\mathbb{R}^{R+1}$, $R = R_y + R_w$, and $D = [d_1, d_2]$ with $0 < d_1 < d_2 < 1$. This estimator is called the local polynomial Whittle with noise (LPWN) estimator, and it is shown in Frederiksen et al. (2008) that under some regularity conditions the estimator is consistent for $d \in (0, 1)$ and asymptotically normal for $d \in (0, 3/4)$.

A few remarks are in order. Note that if $h(d, \theta, \lambda) = 0$, we have the standard local Whittle specification of Robinson (1995a), which does not explicitly account for the perturbation. For $(R_y, R_w) = (0, 0)$ we get $h(d, \theta, \lambda) = \theta_{w,0}\lambda^{2d}$, where $\phi_y(\lambda)$ and $\phi_w(\lambda)$ in (5) are both modeled locally by constants. This is the local Whittle with noise (LWN) estimator of Hurvich & Ray (2003) and Hurvich, Moulines & Soulier (2005). Thus, the parameterization in (14) includes the standard LW estimator and the LWN estimator as special cases. Furthermore, the use of the polynomials $h_y(\theta_y, \lambda)$ and $h_w(\theta_w, \lambda)$ increases the asymptotic variance of $\hat d$ by a multiplicative constant compared to the LW estimator, see Frederiksen et al. (2008) for more details.
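To make the objective in (14) and (15) concrete, the following is a minimal Python sketch. It is not the implementation used in this paper (which relied on Matlab and BFGS): the function names, the periodogram normalization $1/(2\pi n)$, and the grid search are assumptions made for illustration. Passing empty coefficient lists gives the LW special case, and a single noise coefficient gives the LWN case.

```python
import numpy as np


def periodogram(z, m):
    """First m Fourier frequencies and periodogram ordinates of z.

    The normalization 1 / (2 * pi * n) is an assumption; the minimizer of the
    concentrated objective does not depend on the scale of the periodogram.
    """
    z = np.asarray(z, dtype=float)
    n = z.size
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n
    dft = np.fft.fft(z)[1:m + 1]
    return lam, np.abs(dft) ** 2 / (2.0 * np.pi * n)


def lpwn_objective(d, theta_y, theta_w, lam, I):
    """Concentrated objective Q(d, theta) with G-hat profiled out.

    theta_y holds (theta_{y,1}, ..., theta_{y,Ry}); theta_w holds
    (theta_{w,0}, ..., theta_{w,Rw}). Empty sequences switch a polynomial off.
    """
    h_y = sum(th * lam ** (2 * r) for r, th in enumerate(theta_y, start=1))
    h_w = sum(th * lam ** (2 * r) for r, th in enumerate(theta_w, start=0))
    h = h_y + lam ** (2 * d) * h_w
    g_hat = np.mean(lam ** (2 * d) * I / (1.0 + h))
    return np.log(g_hat) + np.mean(np.log(lam ** (-2 * d) * (1.0 + h)))


# Synthetic check with a noise-free pseudo-periodogram (made-up values):
# when I is exactly proportional to lam**(-2 * d_true), the LW objective
# is minimized at d_true.
n, m, d_true = 4096, 100, 0.3
lam = 2.0 * np.pi * np.arange(1, m + 1) / n
I_exact = 2.0 * lam ** (-2 * d_true)
grid = np.linspace(0.01, 0.99, 99)
q_lw = [lpwn_objective(d, [], [], lam, I_exact) for d in grid]
d_hat_lw = grid[int(np.argmin(q_lw))]
```

In practice the grid search would be replaced by a numerical optimizer over $(d, \theta)$ with the non-negativity and admissibility restrictions imposed; the grid is used here only to keep the sketch dependency-free.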

4.2 Testing for long memory versus structural breaks

Recent literature has dealt with the presence of structural breaks and regime switching in analyzing long memory in data. Granger & Terasvirta (1999) show that the number of regime switches affects the long memory parameter. Diebold & Inoue (2001), Granger & Hyung (2004) and Haldrup & Nielsen (2007) discuss that if series display breaks, particularly in their deterministic components, these processes will give the impression of persistence. That is, we can mistakenly conclude that a process displays long memory when the apparent persistence is in fact due to a structural break in the series. In particular, the empirical literature on long memory in volatility is vast and can roughly be divided into two opposing groups: one in favor of the long memory hypothesis, and the other against. To the first group belong the papers of Ding, Granger & Engle (1993), Ding & Granger (1996), Bollerslev & Mikkelsen (1996), Andersen & Bollerslev (1997a, 1997b), Breidt et al. (1998), and Lobato & Savin (1998), among others. To the second group belong the papers by Mikosch & Starica (2000) and Granger & Hyung (2004), among others.

To investigate this further in our context, we apply the method of sample splitting (and temporal aggregation) and test whether or not we have parameter constancy.

4.2.1 Sample splitting

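Following Shimotsu (2006), we are interested in testing the hypothesis of true $I(d)$ versus spurious $I(d)$, i.e. $H_0: d_0 = d_0^{(1)} = \ldots = d_0^{(K)}$, where $d_0^{(i)}$, $i = 1, \ldots, K$, is the true value of the long memory parameter $d$ in the $i$th block and where we have split the sample into $K \in \mathbb{N}_+$ blocks, so that each block consists of $n/K \in \mathbb{N}_+$ observations. Define the $(K+1)$-vector $\hat{\mathbf{d}} = (\hat d, \hat d^{(1)}, \ldots, \hat d^{(K)})'$ of long memory estimates, where $\hat d$ is the full-sample estimate and $\hat d^{(i)}$ is the estimate from the $i$th block, and the $K \times (K+1)$ matrix

$$A = \begin{pmatrix} 1 & -1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 0 & \cdots & -1 \end{pmatrix}.$$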

Then we can set up the Wald statistic for testing the null that the original series is a long memory process,

$$W_S = \left(A\hat{\mathbf{d}}\right)'\left(A \Omega A'\right)^{-1}\left(A\hat{\mathbf{d}}\right) \xrightarrow{d} \chi^2_{K-1}, \qquad (17)$$

where $\mathrm{var}(\hat{\mathbf{d}}) = \Omega$ is the finite sample covariance matrix of the estimates. This test statistic accommodates several semiparametric estimators, as we just need the estimates from the individual $K$ samples and the limiting distributional result for the semiparametric estimator of interest. Shimotsu (2006) uses a test statistic that covers non-stationary values of $d$, i.e. using the 2-step feasible exact local Whittle (ELW) estimator.$^{6}$ We, on the other hand, want to focus on estimators that are specifically applicable to potentially perturbed fractional processes, and therefore we use the LPWN framework.
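The mechanics of the Wald statistic in (17) can be sketched as follows. The function names and the numerical inputs are hypothetical, and the sketch assumes that the supplied covariance matrix makes $A \Omega A'$ invertible; the diagonal covariance used in the example is chosen only so that the arithmetic is easy to follow and is not meant to be realistic.

```python
import numpy as np


def contrast_matrix(K):
    """K x (K+1) matrix A = [1_K, -I_K]; row i contrasts the full-sample
    estimate with the estimate from block i."""
    return np.hstack([np.ones((K, 1)), -np.eye(K)])


def wald_sample_split(d_vec, omega):
    """Wald statistic (A d)' (A Omega A')^{-1} (A d).

    d_vec = (full-sample estimate, block 1 estimate, ..., block K estimate),
    omega = covariance matrix of d_vec (assumed given).
    """
    d_vec = np.asarray(d_vec, dtype=float)
    K = d_vec.size - 1
    A = contrast_matrix(K)
    Ad = A @ d_vec
    return float(Ad @ np.linalg.solve(A @ np.asarray(omega, dtype=float) @ A.T, Ad))


# Made-up example with K = 2 blocks.
d_example = [0.4, 0.5, 0.3]
omega_example = 0.01 * np.eye(3)
w_example = wald_sample_split(d_example, omega_example)
```

With these inputs the contrasts are $(-0.1, 0.1)$ and the statistic evaluates to 2, which would then be compared with a $\chi^2_{K-1}$ critical value.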

We further analyze potential spurious long memory in the context of temporal aggregation, where we exploit the long memory parameter's invariance to temporal aggregation. Ohanissian et al. (2008) derived the joint distributional properties of the log-periodogram regression (LPR) estimator of Geweke & Porter-Hudak (1983) and Robinson (1995b) applied to different aggregation levels of the original series, and introduced a formal test of true long memory. Furthermore, Ohanissian et al. (2008) extend their test to long memory signal plus added noise models (perturbed fractional processes). However, especially in small samples, it should be noted that the LPR estimator may be substantially downward biased when the noise-to-signal ratio is high, and that the bias increases as the bandwidth $m$ used in the estimator increases, see Deo & Hurvich (2001, 2002). Therefore, we use the LPWN estimation framework, where the limiting distributional results given different aggregation levels are derived in Frederiksen & Nielsen (2008).

4.3 Fractional cointegration

We analyze potential fractional cointegration in the semiparametric setting of Robinson & Yajima (2002) and Nielsen & Shimotsu (2007).$^{7}$ As the theory of modeling the perturbation and short-run dynamics by polynomials is only applicable in the univariate setting (at least using the theory put forward by Frederiksen et al. (2008)), we proceed as in Robinson & Yajima (2002). We should note that Nielsen & Shimotsu (2007) extend the results of Robinson & Yajima (2002) to cover nonstationary values of $d$ by using the ELW estimator proposed by Shimotsu & Phillips (2005). The limiting distribution will not change if we consider the LW estimator instead of the ELW estimator as in Nielsen & Shimotsu (2007) (for $d \in (-1/2, 1)$ and mean equal to zero when d � 1/2). We, however, need to modify some of the assumptions in Robinson & Yajima (2002); e.g. we need to account for non-stationary values of $d$, i.e. $d \in [1/2, 1)$ in the Type I fractional framework, see e.g. Velasco (1999). Furthermore, $G$ is estimated differently under the ELW setting.

Robinson & Yajima (2002) generalize the univariate results of Robinson (1995a) to the multivariate case, where consistency and asymptotic normality are established under both the presence and absence of cointegration.

6 The feasible ELW estimator extends the ELW estimator to the case with an unknown mean and trend. Furthermore, the 2-step feasible ELW uses a tapered estimator by Velasco (1999) in the first step, and in the second step $r$ Newton-Raphson type iterations, i.e. $\hat d_j = \hat d_{j-1} - H_n(\hat d_{j-1})^{-1} S_n(\hat d_{j-1})$ for $j = 1, \ldots, r$, where $\hat d_0$ is the estimate from step one, and $H_n(\cdot)$ and $S_n(\cdot)$ are the Hessian and score of the ELW objective function, respectively.

7 It would be of interest to consider alternative (parametric) techniques, e.g. Breitung & Hassler (2002) and Hassler & Breitung (2006), but in this analysis we restrict ourselves to semiparametric methods as discussed in the introduction.

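The consistency results for $(\hat d, \hat G(\hat d))$ are then used to establish a test for pairwise equality of the integration orders, $H_{12}: d_1 = d_2$, or, if we have $p$ series, of the hypothesis of equality of all integration orders, $H_0: d_1 = \ldots = d_p = d$:

$$\hat T_{12} = \frac{m^{1/2}\left(\hat d_1 - \hat d_2\right)}{\left(\frac{1}{2}\left(1 - \frac{\hat G_{12}^{2}}{\hat G_{11}\hat G_{22}}\right)\right)^{1/2} + h(n)}, \qquad (18)$$

$$\hat T_0 = m\left(S\hat d\right)'\left(S\,\frac{1}{4}\hat K^{-1}\left(\hat G \odot \hat G\right)\hat K^{-1} S' + h(n) I_{p-1}\right)^{-1}\left(S\hat d\right), \qquad (19)$$

where $h(n)$ satisfies an assumption that restricts the bandwidth choice, see Robinson & Yajima (2002, Assumption G) and Nielsen & Shimotsu (2007, Assumption 6), $\hat K = \mathrm{diag}(\hat G)$, $S = [I_{p-1}, -\iota]$ and $\iota$ is the $(p-1)$-vector of ones. Then, under the regularity conditions governing the LW estimator and $h(n)$, and under $H_{12}$ and $H_0$, as $n \to \infty$:

(i) If $X_{1t}$ and $X_{2t}$ are not cointegrated, $\hat T_{12} \xrightarrow{d} N(0, 1)$; (20)

(ii) If $X_{1t}$ and $X_{2t}$ are cointegrated, $\hat T_{12} \xrightarrow{p} 0$;

(iii) If $X_t$ is not cointegrated, $\hat T_0 \xrightarrow{d} \chi^2_{p-1}$;

(iv) If $X_t$ is cointegrated, $\hat T_0 \xrightarrow{p} 0$.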

Case (iii) corresponds to a situation where $r = 0$ and case (iv) to a situation where $r \geq 1$.

When we want to consider the cointegration rank of $X_t$, we need to estimate $G$ and its eigenvalues. Because of the potential problems arising from the estimation of $d$, $G$ is estimated using a new bandwidth choice $m_1$ as

$$\hat G(d) = \frac{1}{m_1}\sum_{j=1}^{m_1} \mathrm{Re}\left\{\Lambda(\lambda_j)^{-1} I(\lambda_j)\, \Lambda(\lambda_j)^{-1*}\right\}, \qquad (21)$$

where $\Lambda(\lambda_j) = \mathrm{diag}\left\{e^{i\pi d/2}\lambda_j^{-d}, \ldots, e^{i\pi d/2}\lambda_j^{-d}\right\}$, $*$ denotes the complex conjugate, and where we for simplicity have assumed $d_1 = \ldots = d_p = d$, $I(\lambda_j)$ is the periodogram of $(X_{1t}, \ldots, X_{pt})'$, and $G_i$ is the $i$th column of $G$ for $i = 1, \ldots, p$. As $d$ is unknown, we need to substitute it with an estimate, and as discussed in Robinson & Yajima (2002) and Nielsen & Shimotsu (2007), we cannot use the multivariate version of the LW estimator, as $X_t$ does not have full rank when $X_t$ is cointegrated. Furthermore, $\hat d$ needs to converge at a faster rate than $m_1^{1/2}$. Therefore, we estimate each $d_i$ for $i = 1, \ldots, p$ by (16), where $h(d, \theta, \lambda_j)$ is set to zero in (14) and (15), using $m$ periodogram ordinates, select $m_1$ such that $m_1/m \to 0$ as $n \to \infty$, and then estimate $G$ by (21). Define $\bar d = \frac{1}{p}\sum_{i=1}^{p}\hat d_i$, and let $\delta_i$ and $\hat\delta_i$, $i = 1, \ldots, p$, denote the $i$th eigenvalues of $G$ and $\hat G(\bar d)$, respectively, where the eigenvalues are ordered descendingly. From this, a model selection procedure can be set up to determine the cointegration rank, $r$. Following Robinson & Yajima (2002), estimate $r$ by

$$\hat r = \arg\min_{u = 0, \ldots, p-1} L(u), \qquad (22)$$

where

$$L(u) = v(n)(p - u) - \sum_{i=1}^{p-u}\hat\delta_i, \qquad (23)$$

for some $v(n) > 0$ which is assumed to satisfy a convergence assumption as $n \to \infty$, see Robinson & Yajima (2002, Assumption J) and Nielsen & Shimotsu (2007, Assumption 8').
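The rank selection rule in (22) and (23) amounts to a small computation on the eigenvalues of the estimated matrix. The sketch below uses a hypothetical function name and made-up $2 \times 2$ inputs; it takes the estimate of $G$ and the value of $v(n)$ as given and does not attempt to reproduce the estimation step in (21).

```python
import numpy as np


def select_rank(G_hat, v_n):
    """Estimate the cointegration rank by minimizing
    L(u) = v_n * (p - u) - (sum of the p - u largest eigenvalues of G_hat)
    over u = 0, ..., p - 1. Returns the minimizer and the values of L."""
    eig = np.sort(np.linalg.eigvalsh(np.asarray(G_hat, dtype=float)))[::-1]  # descending
    p = eig.size
    L = [v_n * (p - u) - eig[: p - u].sum() for u in range(p)]
    return int(np.argmin(L)), L


# Made-up example: a nearly singular 2 x 2 matrix (eigenvalues 1.99 and 0.01).
G_example = np.array([[1.0, 0.99],
                      [0.99, 1.0]])
rank_large_v, _ = select_rank(G_example, v_n=0.1)    # penalty above the small eigenvalue
rank_small_v, _ = select_rank(G_example, v_n=0.001)  # penalty below the small eigenvalue
```

For $p = 2$ the rule selects rank one exactly when the smaller eigenvalue falls below $v(n)$, which is why the outcome depends so directly on the choice of $v(n)$.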

In this paper, $p = 2$, as we analyze potential cointegration between volatility and trading volume. The outcome of the selection procedure is especially affected by $v(n)$. The Monte Carlo study in Nielsen & Shimotsu (2007) reveals that for a large $v(n)$ we are likely to estimate a large $r$, whereas the opposite is the case when $v(n)$ is small. Therefore, we let $v(n)$ take on different values in the empirical section. Furthermore, the test of equality of integration orders is sensitive to the choice of $h(n)$. Selecting too large an $h(n)$ leads to underrejection of $H_{12}$ ($H_0$) under noncointegration, whereas selecting too small an $h(n)$ leads to overrejection of $H_{12}$ ($H_0$) under cointegration. Nielsen & Shimotsu (2007) show that $h(n) = \log^{-1} n$ works well, but that the test statistic overrejects when $h(n) = \log^{-2} n$.
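The sensitivity of the pairwise statistic in (18) to $h(n)$ can be illustrated with a short sketch. The function name and all numerical inputs below (the two memory estimates, the entries of the estimated $G$ matrix, and the bandwidth) are made up for illustration and are not results from this paper.

```python
import math


def t12_statistic(d1, d2, G11, G22, G12, m, h_n):
    """Pairwise test statistic for equality of two integration orders:
    sqrt(m) * (d1 - d2) / ( sqrt(0.5 * (1 - G12^2 / (G11 * G22))) + h_n )."""
    scale = math.sqrt(0.5 * (1.0 - G12 ** 2 / (G11 * G22)))
    return math.sqrt(m) * (d1 - d2) / (scale + h_n)


# Hypothetical inputs: sample size as in the data section, an arbitrary bandwidth,
# and strongly correlated series (G12 close to sqrt(G11 * G22)).
n, m = 11187, 268
inputs = dict(d1=0.45, d2=0.40, G11=1.0, G22=1.0, G12=0.95, m=m)
t_log1 = t12_statistic(h_n=1.0 / math.log(n), **inputs)       # h(n) = 1 / log n
t_log2 = t12_statistic(h_n=1.0 / math.log(n) ** 2, **inputs)  # h(n) = 1 / log^2 n
```

Because $h(n)$ enters the denominator, the smaller choice $\log^{-2} n$ produces a larger statistic in absolute value, in line with the overrejection noted above.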

5 Results

This section deals with univariate estimation$^{8}$ of the fractional integration parameter of volatility and trading volume using semiparametric methods, testing for true long memory versus spurious long memory by sample splitting (and temporal aggregation), and testing whether or not the two series are fractionally cointegrated.

5.1 Results for return volatility and trading volume

The first part of this subsection looks at the long-run persistence of volatility and trading volume using the semiparametric estimators discussed in the methodology section. We look at three shares in detail, 3M, IBM, and AA, and discuss the remaining shares later in the section. We consider the following estimators: LP, LW, LPW, LWN, and LPWN implemented with $(R_y, R_w)$ equal to $(1, 0)$, $(0, 1)$, and $(1, 1)$, denoted LPWN($R_y$, $R_w$).

For all estimators, we set the bandwidth as $m = \lfloor n^{a} \rfloor$, where $a \in \{0.5, 0.6, 0.7\}$. For the LP and LW estimators, we conjecture$^{9}$ that a choice of $a = 0.5$ is the most appropriate of the three, whereas for the LPWN (including the LWN) estimators, we apply the same bandwidths for ease of comparison. However, for the estimators where we also model a polynomial term, it can be difficult to pin down the estimates of the $\theta$'s. Therefore, a larger number of periodogram ordinates is needed, which motivates the inclusion of $a \in \{0.6, 0.7\}$. Numerical optimization was carried out in Matlab v7.5 using the BFGS optimization routine. The initial values were set as follows. For all estimators, the starting values for $d$ were set equal to the Geweke & Porter-Hudak (1983) LP estimate. The admissible space of $d$ for the Whittle estimators is restricted to be $[0.01, 0.99]$, cf. Assumption A2 of Frederiksen et al. (2008). If $\hat d_{LP} \notin [0.01, 0.99]$, then the initial value is set equal to $0.4$. As initial values for the polynomial

8 We do not consider multivariate semiparametric estimation of the fractional integration parameters, as the theory of modeling the perturbation and short-run dynamics by polynomials is only applicable in the univariate setting (at least using the theory put forward by Frederiksen et al. (2008)). Furthermore, Nielsen (2008) shows in a comparative study of fractional cointegration that when data is contaminated by noise, the multivariate LW estimator of Shimotsu (2007) is more biased than the univariate LW estimator.

9 It is well known that the bias of the LP and LW estimators is increasing in the bandwidth when the long memory series is perturbed and that choosing $a = 0.5$ renders fairly unbiased results, see e.g. Sun & Phillips (2003) and Arteche (2004a, 2004b).

parameters, we used 1 for all estimators. Furthermore, the polynomial terms for the LWN and LPWN estimators are restricted to be non-negative.
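For the sample size of the data section, the three bandwidth choices translate into the following numbers of periodogram ordinates. The short computation below is only a convenience sketch; the variable names are hypothetical.

```python
import math

n = 11187  # number of daily observations per share, as stated in the data section
# Bandwidth m = floor(n ** a) for each exponent a considered.
bandwidths = {a: math.floor(n ** a) for a in (0.5, 0.6, 0.7)}
```

This gives 105, 268 and 682 ordinates for $a = 0.5$, $0.6$ and $0.7$, respectively.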

Table 3 presents the memory estimates for the adjusted log-squared returns, adjusted log-absolute returns, and detrended log turnover ratio. Firstly, looking at the volatility series for the three shares, we see that, as expected from theory,$^{10}$ the LP, LW, and LPW estimates appear downward biased and are decreasing as a function of the bandwidth. For the LWN and LPWN estimators, this is not the case: the fractional integration estimate is reasonably constant regardless of bandwidth. Furthermore, the memory estimate is in the nonstationary region for all three common stocks, except for the 3M share when the bandwidth is set equal to $m = \lfloor n^{0.5} \rfloor$, where it is borderline nonstationary. In Table 3, we have marked the cases where the polynomial coefficient is significant, and we clearly see that the noise coefficient (marked with a b) is significant in all cases for the LWN estimator, but interestingly this is not the case for the LPWN estimators. Looking at the standard errors when including a noise polynomial (the LPWN(0,1) and LPWN(1,1) cases), we observe that the standard error on the noise coefficient in some cases increases considerably, which is evidence of collinearity. Therefore, including a polynomial in the characterization of the short-run contamination in the perturbation is not needed in most of the cases. Additionally, comparing the two volatility series, we see that $d$ is estimated to be higher under the absolute measure.

Turning our attention to the detrended log turnover ratio, the estimate of $d$ again falls as a function of bandwidth for 3M and IBM for the LP, LW, and LPW estimators, whereas this is not the case for the AA share. The LWN and LPWN estimates are in the nonstationary region for the 3M and IBM shares. For the AA common stock, all estimators estimate $d$ reasonably close to each other and in the stationary region. That is, for the AA share there is no sign that it is governed by a perturbed fractional process. This is also supported by the fact that it is only when $m = \lfloor n^{0.6} \rfloor$ that the noise coefficient is significant for the LWN estimator.

Table 4 presents the mean and median of the 45 estimated fractional integration parameters. The LWN and LPWN estimates are in general considerably higher than the LP, LW, and LPW estimates, especially as the bandwidth increases, and are in the non-stationary region for the two volatility measures. Furthermore, the memory estimates for the adjusted log-absolute returns are higher than for the adjusted log-squared returns for the LWN and LPWN estimators, whereas the conclusions are reversed for the LP, LW and LPW estimators. For the detrended log turnover ratio, the LWN and LPWN estimates are borderline nonstationary in most cases, and the volatility measures are clearly more persistent in terms of memory than trading volume. As the memory estimate for the detrended log turnover ratio is reasonably similar across estimators and bandwidths, it does not look as if trading volume in general follows a perturbed fractional process. Bollerslev & Jubinski (1999) find a mean and a median of $0.404$ and $0.407$, respectively, for the squared returns, and for their measure of trading volume, they find a mean and median of $0.407$ and $0.410$, respectively. This is in line with the results we obtain for the LP estimator for $m = \lfloor n^{0.5} \rfloor$, i.e. $0.440$ and $0.423$ for the adjusted log-squared returns and $0.474$ and $0.413$ for the detrended log turnover ratio. That is, we clearly see the pattern

10 From Hurvich & Ray (2003), we know that the LWN estimator is superior to the LW estimator in terms of bias and RMSE in the context of a standard LMSV model. This is also shown (including for the LPW estimator) in Frederiksen et al. (2008).

that the estimators that explicitly model the perturbation estimate a higher degree of memory than the estimators that do not, i.e. LP, LW, and LPW. Furthermore, the fractional integration parameter is in general estimated at a higher level for the two measures of the volatility process than for trading volume for the LWN and LPWN estimators, whereas they are quite similar when looking at the LP, LW, and LPW estimators. The findings of Bollerslev & Jubinski (1999) may therefore be misleading, as they do not employ estimators that are robust to potential perturbation, even though they select a reasonably low bandwidth, i.e. $m = \lfloor n^{0.5} \rfloor$. Nonetheless, we do not know the appropriate bandwidth choice when the signal is perturbed by a noise term, and therefore using estimators that have a reasonable rate of convergence is important. The range of the individual estimates is tabulated in Table 5.

Looking at the t-statistics, the individual memory estimates for the LP, LW, and LPW estimators are in general different from zero and one at the 5% level. This is confirmed by Table 6. For the LWN and LPWN estimators, the picture is not as clear-cut. For instance, for the adjusted log-absolute returns, the LPWN(1,1) estimator only rejects the null that $d = 1$ in 6 cases when $m = \lfloor n^{0.5} \rfloor$, whereas the number of rejections rises to 33 and 42 when $m = \lfloor n^{0.6} \rfloor$ and $m = \lfloor n^{0.7} \rfloor$, respectively. This is clearly due to some boundary issue, since we get more precision in estimating the polynomial coefficients when we include more periodogram ordinates.

To give an overall picture of whether or not the respective series potentially follow a perturbed fractional process (e.g. in the context of the LMSV model, as discussed in the introduction), we have plotted in Figure 2 the estimated values of $d$ for the two volatility series and the trading volume series for the 3M common stock$^{11}$ using the LW and LWN estimators. In addition, we have plotted the approximate asymptotic confidence interval given by plus/minus two asymptotic standard errors of the respective estimates. The estimates are shown for a range of relevant values of the bandwidth parameter, $m \in [50, 2000]$. Following the assumption on the bandwidth choice of Frederiksen et al. (2008, Theorem 2) and the suggestion by Hurvich & Ray (2003), we emphasize the higher bandwidth values, where the estimates (and confidence intervals) also appear more stable. The bandwidth values corresponding to the range $n^{0.5}$ to $n^{0.7}$ investigated above are $m \in [105, 682]$. The results in Figure 2 for the volatility measures (Panel A and Panel B) show that the LW estimates are smaller than the LWN estimates for essentially all bandwidth choices, and that the LW estimate is decreasing in the bandwidth. This is expected based on theoretical properties of the LW estimator in the case of perturbed fractional processes (e.g. in the context of the LMSV model). The LWN estimate is higher and shows signs of nonstationarity for higher bandwidth values, while the LW estimate is nonstationary for low values of $m$ and stationary when $m > 100$. Looking at the results for the detrended log turnover ratio series, we see that the estimates for the LW and LWN estimators are statistically indistinguishable from each other when $m < 400$. Furthermore, the LWN estimate is borderline nonstationary for most values of $m$, while the LW estimate is stationary for $m > 430$.

Regarding the issue of potential fractional cointegration between volatility and trading volume,

11 We focus on the 3M common stock and omit the other shares, as the conclusions drawn are quite similar in a qualitative sense. Furthermore, we only plot the LW and LWN estimates. The LP and LPW estimates are similar to the LW estimates. The LPWN(1,0), LPWN(0,1), and LPWN(1,1) parameterizations are also omitted, as these are similar to the LWN estimates, although it should be noted that the estimates from the LPWN($R_y$, $R_w$) parameterizations are more volatile.

we see from Table 3 that there could be some long-run relationship for some of the shares in the S&P 100 composite index. For example, comparing the individual estimates for the IBM common stock, it cannot be ruled out that the estimated integration orders are statistically indistinguishable. Of course, the standard errors for the LPWN estimators are so large that in most cases we cannot rule out that the estimated memory is identical across volatility and trading volume. In Figure 3, we have plotted the LWN estimates for the adjusted log-squared returns, adjusted log-absolute returns, and the detrended log turnover ratio for the 3M common stock. For all choices of bandwidth, it seems as though they share a common integration order, as they are not statistically different from each other (at the 5% significance level). Looking at the rest of the estimates across common stocks, we cannot in the majority of the cases reject the null that they have the same degree of memory.

To sum up, using more appropriate semiparametric estimators, we find that volatility and trading volume are more persistent than other studies have found, see e.g. Bollerslev & Jubinski (1999) and Fleming & Kirby (2006) for the S&P 100 index, Lobato & Velasco (2000) for the DJIA30, Liesenfeld (2002) for futures contracts on the DAX index, Gurgul & Wojtowicz (2006) for the DAX index, and the references therein. Additionally, there is evidence that volatility is more persistent in terms of memory than trading volume. Furthermore, in light of our study and Bollerslev & Jubinski (1999), there is evidence that in general volatility could be governed by a perturbed fractional process (e.g. in the context of the LMSV model), whereas this is not the case for trading volume (or at least not for the majority of the common stocks, as seen from Table 4).

Before moving on to testing whether the two series share a common order of fractional integration and whether they move together over time, we test whether non-linearities induce the long memory in volatility and volume.

5.2 Long memory versus spurious long memory

Tables 7-9 display the estimates $d$ (memory estimate for the entire sample) and $\bar d$ (mean of the estimates

from the $K$ sample splits) using the LW and LWN estimators$^{12}$, and the test statistics WS for $m \in \{400, 600, 800, 1000\}$ and $K \in \{2, 4\}$ for the shares 3M, IBM and AA. We have to restrict the sample

size $n$ such that $n/K \in \mathbb{N}^{+}$. We do this by removing the first 187 observations, so that $n = 11{,}000$.

Looking at Table 7, the overall evidence of spurious long memory is not supported. For the adjusted

log-squared and adjusted log-absolute returns, $d$ and $\bar d$ decrease as $m$ increases for the LW estimator.

That the estimated fractional integration parameter decreases (for the LW estimator) is a sign that the

underlying process in fact is a perturbed fractional process as the LW estimator underestimates the

true d in this case, see discussion in the previous subsection. Looking further at the estimator which

should take the perturbation into account, the LWN estimator, we see that the fractional integration

estimate is in the nonstationary region and the memory estimate decreases with m for the detrended

log turnover ratio. But overall, for both semiparametric estimators, we cannot reject the null of equal

memory across sample splits.

12 The results for the LP and LPW are omitted as these are similar in a qualitative sense to the LW results. The

LPWN(1,0), LPWN(0,1), and LPWN(1,1) parameterizations are also omitted as these are similar to the LWN results.

We note, however, that when we split the sample into K = 4 subsamples, we in some cases have convergence problems,

especially for the LPWN(1,1). This is because we do not have enough periodogram ordinates to pin down the 4 parameters

(3 polynomial terms and the memory estimate).
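The sample-splitting step can be sketched in Python. This is a minimal illustration under stated assumptions, not the code used for Tables 7-9: it uses the standard local Whittle objective for the LW estimate, it assumes that each of the K blocks is estimated with m/K periodogram ordinates, and the names `local_whittle` and `split_estimates` are hypothetical. It returns only the full-sample estimate and the mean of the block estimates; the WS statistic itself is not computed.

```python
import numpy as np
from scipy.optimize import minimize_scalar


def local_whittle(x, m, bounds=(-0.49, 0.99)):
    """Local Whittle estimate of d from the first m Fourier frequencies."""
    x = np.asarray(x, dtype=float)
    n = x.size
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n        # Fourier frequencies
    dft = np.fft.fft(x - x.mean())[1 : m + 1]
    pgram = np.abs(dft) ** 2 / (2.0 * np.pi * n)       # periodogram ordinates

    def objective(d):
        # R(d) = log( mean(lam^{2d} I) ) - 2 d mean(log lam)
        return np.log(np.mean(lam ** (2.0 * d) * pgram)) - 2.0 * d * np.mean(np.log(lam))

    return minimize_scalar(objective, bounds=bounds, method="bounded").x


def split_estimates(x, m, K):
    """Return (full-sample estimate, mean of the K block estimates)."""
    x = np.asarray(x, dtype=float)
    if x.size % K != 0:
        raise ValueError("K must divide the sample size")
    d_full = local_whittle(x, m)
    blocks = np.split(x, K)                            # K equal-length blocks
    # Assumption: each block uses m // K periodogram ordinates.
    d_bar = float(np.mean([local_whittle(b, m // K) for b in blocks]))
    return d_full, d_bar


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    series = rng.standard_normal(11_187)[187:]         # drop the first 187 obs, n = 11,000
    for K in (2, 4):
        print(K, split_estimates(series, m=400, K=K))
```

For a simulated white-noise series, both numbers should be close to zero; with real data one would pass the adjusted volatility or turnover series instead.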

In Table 8, we display the results for the IBM share. The same pattern as in Table 7 emerges, i.e. LW

estimates in the stationary region and LWN estimates in the nonstationary region. For the adjusted

log-absolute return series, we can reject the null that the series is long memory when $(m = 800, K = 2)$

and $(m = 1000, K = 4)$, whereas for the LWN estimator we only reject when $m = 800$ and $K = 4$.

Looking at the detrended log turnover ratio, we reject the null in all but 1 case for the LW estimator,

whereas for the LWN estimator we reject the null in 1 case. To sum up, there is at best weak evidence that the

adjusted log-absolute returns and detrended log turnover ratio for the IBM share are not governed by

a long memory process.

For the last common stock, i.e. AA (Table 9), the long memory hypothesis is supported in all cases

except for the adjusted log-absolute returns with the LW estimator, $m \in \{600, 800, 1000\}$, and $K = 2$. Furthermore, we again see a decrease in the memory estimate (volatility series only) for

the LW estimator, but not for the LWN estimator. For the detrended log turnover ratio, the fractional

integration estimate is similar and stable across bandwidth choice and estimator.

In Table 10, we have tabulated in how many cases we reject the null of true long memory versus

spurious long memory for the 45 common stocks where we observe a full sample. In general, we

reject the null in more cases when the test statistic is based on the LW estimator. This also holds

when comparing volatility and trading volume. Interestingly, we reject the null in more cases for the

absolute measure than for the squared measure, although only for $K = 4$ and $m \in \{600, 800, 1000\}$ when considering the test based on the LWN estimator.

As a final note, we also implemented the test for long memory against spurious long memory

using the notion of temporal aggregation, i.e. exploiting the long memory parameter's invariance to

temporal aggregation, see Ohanissian et al. (2008) and Frederiksen & Nielsen (2008). The results were

qualitatively similar, so they are omitted (available from the author upon request).

5.3 Common long-run dependence

Here we will analyze the potential common long-run dependence between return volatility and trading

volume. Instead of estimating the cointegration vector and the memory parameter of the equilibrium

error, we focus on testing whether we can reject the hypothesis of fractional cointegration being present.

This is in line with Nielsen (2004), Hassler, Marmol & Velasco (2000), and Nielsen & Shimotsu (2007).

Tables 11 and 12 present the univariate fractional integration estimates using the LW estimator.

We set the number of periodogram ordinates equal to $m = \lfloor n^{0.5} \rfloor$ and $m = \lfloor n^{0.6} \rfloor$. Furthermore, Tables 11 and 12 contain the test statistic for the test of equal fractional integration order, the rejection

frequency of $H_{12}$ (for all 45 common stocks), i.e. $d_1 = d_2 = d$, where we have set $h(n) = h_1 = \log^{-1} n$

and $h(n) = h_2 = \log^{-2} n$, the estimated eigenvalues of $G(\bar d)$, and the correlation matrix

$P(\bar d) = K(\bar d)^{-1/2} G(\bar d) K(\bar d)^{-1/2}$.

The bandwidth choice $m_1$ is set equal to $\lfloor n^{0.45} \rfloor$ when $m = \lfloor n^{0.50} \rfloor$ and to $\lfloor n^{0.55} \rfloor$ when $m = \lfloor n^{0.60} \rfloor$. The results for the selection procedure for rank determination are displayed

in Tables 13 and 14 where, as suggested by Nielsen & Shimotsu (2007), we use the eigenvalues from

the correlation matrix $P(\bar d)$ in the selection procedure instead of the estimated eigenvalues of $G(\bar d)$.
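The tuning choices and the rescaling of $G$ into a correlation matrix can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: $K$ is taken to be the diagonal matrix holding the diagonal elements of $G$ (the usual way to form a correlation matrix), the matrix `G_example` is made up for illustration, and the function names are hypothetical.

```python
import math
import numpy as np


def bandwidth(n, exponent):
    """Integer part of n**exponent, e.g. m = floor(n^0.5) or m1 = floor(n^0.45)."""
    return math.floor(n ** exponent)


def h_tuning(n, power):
    """h(n) = (log n)^(-power); power = 1 gives h1 and power = 2 gives h2."""
    return math.log(n) ** (-power)


def correlation_matrix(G):
    """P = K^{-1/2} G K^{-1/2}, assuming K = diag(G)."""
    G = np.asarray(G, dtype=float)
    k_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(G)))
    return k_inv_sqrt @ G @ k_inv_sqrt


if __name__ == "__main__":
    n = 11_187                                    # full sample size reported in Table 1
    print(bandwidth(n, 0.50), bandwidth(n, 0.45))
    print(h_tuning(n, 1), h_tuning(n, 2))
    G_example = np.array([[4.0, 2.0], [2.0, 9.0]])   # hypothetical G estimate
    P = correlation_matrix(G_example)
    print(P, np.linalg.eigvalsh(P))               # eigenvalues of P sum to 2
```

Because $P$ has a unit diagonal, its eigenvalues sum to the dimension of the system, which makes "close to zero" easier to judge than for the eigenvalues of $G$.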

Firstly, looking at Table 11 (Panel A), we confirm the results from the section on univariate

estimation of the fractional integration parameter for the semiparametric estimators that do not

account for the potential perturbation, i.e. the memory estimate decreases when the bandwidth is increased,

especially for the volatility series. Furthermore, we cannot reject (5% level) the null of identical

memory estimates, i.e. $H_{12}: d_1 = d_2$, for the AA share for both bandwidths, and for the 3M share

when setting $m = \lfloor n^{0.50} \rfloor$. The same conclusion does not apply when looking at potential long-run

dependence between adjusted log-absolute returns and the detrended log turnover ratio, Table 12

(Panel A), as we here reject in all cases but one, i.e. for the AA share when considering the smallest

bandwidth. Overall, there could potentially be a cointegrating relation when considering the AA

share, but looking at the estimated eigenvalues of $G(\bar d)$ and the eigenvalues from the correlation

matrix $P(\bar d)$, Panels C and D, there is little evidence to suggest a cointegrating

relation, as none of the eigenvalues are close to zero (relatively). So we expect the rank to be zero, i.e.

the series do not move together over time.

Tables 13 and 14 confirm that for all choices of $m_1$ and $v(n)$ there are no cointegrating relations.

Even for the case where we could not reject equality of integration orders (which is a necessary

condition for there to be cointegration), there is no sign of a cointegrating relation. We know from

Nielsen & Shimotsu (2007) that a large v (n) leads to a non-conservative estimate of r, so the evidence

of there being a cointegrating relation is at best weak.
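To make the role of $v(n)$ concrete, the rank selection step can be sketched as follows. The loss function used here, $L(u) = v(n)(p - u) - \sum_{i=1}^{p-u} \delta_i$ with the eigenvalues $\delta_i$ sorted in descending order, is our reading of the procedure in Robinson & Yajima (2002) and Nielsen & Shimotsu (2007) and should be treated as an assumption to be checked against those papers; the function name and the example eigenvalues are hypothetical.

```python
import numpy as np


def select_rank(eigenvalues, v):
    """Return the u in {0, ..., p-1} minimizing L(u) = v*(p-u) - sum of the p-u largest eigenvalues."""
    delta = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]   # descending order
    p = delta.size
    losses = [v * (p - u) - delta[: p - u].sum() for u in range(p)]
    return int(np.argmin(losses))


if __name__ == "__main__":
    eig = [4.0 / 3.0, 2.0 / 3.0]        # hypothetical eigenvalues of a 2x2 correlation matrix
    m1 = 66                             # hypothetical bandwidth
    for power in (0.45, 0.25, 0.05):
        v = m1 ** (-power)
        print(power, round(v, 3), select_rank(eig, v))
```

In the bivariate case the rule picks rank one exactly when the smallest eigenvalue is below $v(n)$, so a larger $v(n)$ makes it easier to conclude that a cointegrating relation is present.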

In Tables 11 (Panel B), 12 (Panel B), and 15, we summarize the results across common stocks.

Tables 11 (Panel B) and 12 (Panel B) display the rejection frequency of the null of equal fractional

integration orders, i.e. $H_{12}: d_{\tilde r^2} = d_v$ and $d_{|\tilde r|} = d_v$. Firstly, we note that (as simulations in Nielsen

& Shimotsu (2007) show) selecting a smaller h (n) leads to more rejections of the null of identical

integration orders. In the previous subsection (univariate estimation of the fractional integration

parameter), there was evidence that in general the volatility process follows a perturbed fractional

process whereas trading volume did not. We would expect that if the volatility process follows a

perturbed fractional process, then we would reject H12 more often for higher bandwidth because

the estimated memory of the volatility process decreases for higher bandwidth while the estimated

memory for the trading volume stays constant as a function of bandwidth choice. It is clearly seen

that when $m = \lfloor n^{0.60} \rfloor$, we reject for more than twice as many shares as when the bandwidth

is $m = \lfloor n^{0.50} \rfloor$, when looking at the relation between adjusted log-squared returns and the detrended

log turnover ratio. For the hypothesis of identical memory between adjusted log-squared returns and

detrended log turnover ratio, we reject in 15 and 18 cases for $h_1$ and $h_2$, respectively, for the smallest

bandwidth, while this rises to 36 and 38 when the bandwidth is equal to the highest bandwidth.

Interestingly, when we instead test the hypothesis of identical memory between adjusted log-absolute

returns and detrended log turnover ratio, we reject in more cases, i.e. in 30 and 31 cases for h1 and

h2, respectively, for the smallest bandwidth, while this rises to 41 and 41 when the bandwidth is equal

to the largest bandwidth.
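The effect of $h(n)$ on the rejection frequency can be seen from the form of the test statistic. The sketch below assumes the statistic $T_0 = m (S\hat d)' \left( S \tfrac{1}{4} D^{-1} (G \odot G) D^{-1} S' + h(n)^2 I \right)^{-1} (S\hat d)$ with $D = \mathrm{diag}(G)$ and $S = [I_{p-1}, -\iota]$, which is how we recall the test of Robinson & Yajima (2002) as used in Nielsen & Shimotsu (2007); the exact form is an assumption here. The matrix `G_example` is hypothetical, while the two memory estimates are the LW values for 3M from Table 3, Panel A.

```python
import math
import numpy as np


def equality_statistic(d_hat, G, m, h):
    """Wald-type statistic for H0: all elements of d_hat are equal (assumed form, see lead-in)."""
    d_hat = np.asarray(d_hat, dtype=float)
    G = np.asarray(G, dtype=float)
    p = d_hat.size
    S = np.hstack([np.eye(p - 1), -np.ones((p - 1, 1))])      # contrasts d_a - d_p
    D_inv = np.diag(1.0 / np.diag(G))
    omega = 0.25 * S @ D_inv @ (G * G) @ D_inv @ S.T + h ** 2 * np.eye(p - 1)
    diff = S @ d_hat
    return float(m * diff @ np.linalg.solve(omega, diff))


if __name__ == "__main__":
    n, m = 11_187, 105
    d_hat = [0.447, 0.527]                          # LW estimates for 3M, Table 3, Panel A
    G_example = np.array([[1.0, 0.3], [0.3, 1.0]])  # hypothetical, not taken from the tables
    for power in (1, 2):
        h = math.log(n) ** (-power)
        print(power, equality_statistic(d_hat, G_example, m, h))
```

Since $h(n)^2$ enters the matrix being inverted, a smaller $h(n)$ gives a larger statistic for the same estimates, in line with the observation that $h_2$ leads to more rejections than $h_1$.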

Table 15 shows for how many common stocks there is a cointegrating relation between volatility

and trading volume. There is little evidence of a cointegrating relation between volatility and trading

volume, as it is only when we select the largest $v(n)$, i.e. $v(n) = m_1^{-0.05}$, that we estimate a relation for

approximately 7% to 20% of the common stocks in the composite index. We note that the cases where we find a cointegrating relation actually correspond to cases where we cannot reject equality of the

specific integration orders. To sum up, the evidence of there being a cointegrating relation between

return volatility and trading volume is very weak for the majority of the common stocks analyzed.

As a final note, we also estimated the cointegrating coefficient $\beta$ in the cases where $d_x = d_y = d$

by means of the fully modified frequency domain least squares of Nielsen & Frederiksen (2008) and the

multiple local Whittle setting of Robinson (2008). The condition that there is fractional cointegration

present is only fulfilled in a few cases, i.e. (i) $d_y = d_x = d$ and (ii) $\varepsilon_t = y_t - \beta x_t \in I(d_\varepsilon)$ where $d_\varepsilon < d$.

This is in line with other studies, e.g. Lobato & Velasco (2000) for the DJIA30 index, Liesenfeld (2002)

for futures contracts on the DAX index, Gurgul & Wojtowicz (2006) for the DAX30 index, showing

that volatility and trading volume might share a common fractional integration order but they do not

in general co-move.


Using semiparametric methods where we can model the contamination from the short-run dynamics

by a polynomial not only in the signal but also in the noise, we find that volatility and the trading

volume process of stock prices are mean reverting with a high degree of persistence, and this persistence

dissipates at a hyperbolic rate. We find evidence that the volatility process is a perturbed fractional

process, whereas this is not generally the case for trading volume. A part of this noise is an artifact

of the sampling frequency, i.e. if we used intra daily returns and aggregated this to daily returns, we

would mitigate some of the noise. This again emphasizes using methods that can model this potential

perturbation. Additionally, the long memory estimate of the volatility process is on a mean and

median metric higher than for the trading volume process. The evidence that there exists a common

order of fractional integration between volatility and trading volume is mixed, and the evidence is

weak when using estimators that are not downward biased in the presence of perturbation in the

process. Furthermore, there is little evidence of a cointegrating relation between volatility and trading

volume. Therefore, our findings are not in general consistent with a modified version of the MDH,

in which volatility and trading volume are governed by a common latent information-arrival process

exhibiting long memory behavior.
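The phrase "mean reverting with persistence that dissipates at a hyperbolic rate" can be illustrated with the moving average weights of the fractional filter $(1 - L)^{-d}$, which satisfy $\psi_0 = 1$ and $\psi_k = \psi_{k-1}(k - 1 + d)/k$ and behave like $k^{d-1}/\Gamma(d)$ for large $k$. The sketch below is purely illustrative; the value $d = 0.6$ is chosen because it lies in the range of the LWN estimates reported in Table 4, and the function name is hypothetical.

```python
import numpy as np


def fractional_ma_weights(d, n_lags):
    """Weights psi_0..psi_n of (1 - L)^(-d), via psi_k = psi_{k-1} * (k - 1 + d) / k."""
    psi = np.empty(n_lags + 1)
    psi[0] = 1.0
    for k in range(1, n_lags + 1):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    return psi


if __name__ == "__main__":
    d = 0.6
    psi = fractional_ma_weights(d, 1000)
    # Log-log slope between lags 500 and 1000; it should be close to d - 1.
    slope = np.log(psi[1000] / psi[500]) / np.log(2.0)
    print(psi[[1, 10, 100, 1000]], slope)
```

For $d < 1$ the weights tend to zero, so shocks eventually die out (mean reversion), but they do so at the slow rate $k^{d-1}$ rather than at the exponential rate of a stationary ARMA process.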

Future research on the volatility-volume relationship will use intra daily observations on return

volatility and trading volume. In this direction, we can do several things. For instance, it is of

interest to analyze intra daily behavior of volatility and trading volume. Furthermore, an extension

of the LPWN estimator to a multivariate setting is relevant as we, directly from this, could develop

a powerful test for equality of fractional integration parameters and set up a testing procedure for

the rank of the spectral density matrix along the lines of Robinson & Yajima (2002) and Nielsen &

Shimotsu (2007).

Additionally, work on estimating a three-factor bivariate mixture model, where the common information

arrival process is assumed to possess long memory according to the Gaussian ARFIMA(1, d, 0)

process and the distinct dynamics of the volume and volatility follow Gaussian AR(1) processes, looks

very promising. To our knowledge, there has not been any empirical investigation modeling the properties

of the volume-volatility system adequately, as the asset return volatility and the asset volume

processes might both exhibit long memory and volatility clustering. These empirical findings are possible

to replicate in the model of Frederiksen & Nielsen (2009). Furthermore, this model also nests the

hypothesis that the volume and the volatility can be fractionally cointegrated. However, this is not

empirically justified even though the series generally exhibit the same degree of long memory.

References

Andersen, T. G. (1996), 'Return volatility and trading volume: An information flow interpretation of stochastic volatility', Journal of Finance 51, 169-204.

Andersen, T. G. & Bollerslev, T. (1997a), 'Heterogeneous information arrivals and return volatility dynamics: Uncovering the long-run in high frequency returns', Journal of Finance 52, 975-1005.

Andersen, T. G. & Bollerslev, T. (1997b), 'Intraday periodicity and volatility persistence in financial markets', Journal of Empirical Finance 4, 115-158.

Arteche, J. (2004a), Augmented log-periodogram regression in long memory signal plus noise models, in '2004 Hawaii International Conference on Statistics, Mathematics and Related Fields', pp. 108-119.

Arteche, J. (2004b), 'Gaussian semiparametric estimation in long memory in stochastic volatility and

Bollerslev, T. (1986), 'Generalized autoregressive conditional heteroscedasticity', Journal of Econometrics 31, 307-327.

Bollerslev, T. & Mikkelsen, H. O. (1996), 'Modeling and pricing long memory in stock market volatility', Journal of Econometrics 73, 151-184.

Breitung, J. & Hassler, U. (2002), 'Inference on the cointegration rank in fractionally integrated

Clark, P. K. (1973), 'A subordinated stochastic process model with finite variance for speculative prices', Econometrica 41, 135-155.

Deo, R. S. & Hurvich, C. M. (2001), 'On the log periodogram regression estimator of the memory parameter in long memory stochastic volatility models', Econometric Theory 17, 686-710.

Deo, R. S. & Hurvich, C. M. (2002), Estimation of long memory in volatility, in P. Doukhan, G. Oppenheim & M. Taqqu, eds, 'Theory and Applications of Long-Range Dependence', Boston: Birkhäuser.

105, 131-159.

Ding, Z. & Granger, C. W. J. (1996), 'Modeling volatility persistence of speculative returns: a new approach', Journal of Econometrics 73, 185-215.

Ding, Z., Granger, C. W. J. & Engle, R. F. (1993), 'A long memory property of stock returns and a new model', Journal of Empirical Finance 1, 83-106.

Engle, R. F. (1982), 'Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation', Econometrica 50, 987-1007.

Engle, R. F. & Bollerslev, T. (1986), 'Modeling the persistence of conditional variances', Econometric Reviews 5, 1-50.

Epps, T. & Epps, M. (1976), 'The stochastic dependence in security price changes and transaction volumes: implications for the mixture-of-distribution hypothesis', Econometrica 44, 305-321.

Fleming, J. & Kirby, C. (2006), 'Long memory in volatility and trading volume', Working Paper.

Frederiksen, P. H. & Nielsen, F. S. (2008), 'Testing for spurious long memory in potentially nonstationary perturbed fractional processes', CREATES RP 2008-59, University of Aarhus, and Working Paper, Nordea Markets.

Frederiksen, P. H. & Nielsen, F. S. (2009), 'A dynamic long memory bivariate mixture model', Unpublished Working paper.

Frederiksen, P. H., Nielsen, F. S. & Nielsen, M. Ø. (2008), 'Local polynomial Whittle estimation of perturbed fractional processes', CREATES RP 2008-29, University of Aarhus, and Working Paper, Nordea Markets and Cornell University.

Fuller, W. A. (1996), Introduction to statistical time series, Wiley, New York.

Glosten, L. & Milgrom, P. (1985), 'Bid, ask, and transaction prices in a specialist market with heterogeneously informed traders', Journal of Financial Economics 14, 71-100.

Granger, C. W. J. & Terasvirta, T. (1999), 'A simple nonlinear time series model with misleading linear properties', Economics Letters 62, 161-165.

Grossman, S. J. & Stiglitz, J. E. (1980), 'On the impossibility of informationally efficient markets', The American Economic Review 70, 393-408.

Gurgul, H. & Wojtowicz, T. (2006), 'Long memory on the German stock exchange', Czech Journal of Economics and Finance 56, 447-468.

Haldrup, N. & Nielsen, M. Ø. (2007), 'Estimation of fractional integration in the presence of data noise', Computational Statistics and Data Analysis 51, 3100-3114.

Harvey, A. (1998), Long memory in stochastic volatility, in J. Knight & S. Satchell, eds, 'Forecasting Volatility in Financial Markets', Butterworth-Heinemann, London, pp. 307-320.

Hassler, U. & Breitung, J. (2006), 'A residual-based LM-type test against fractional cointegration', Econometric Theory 22, 1091-1111.

Hassler, U., Marmol, F. & Velasco, C. (2000), 'Fractional cointegrating regression in the presence of linear time trends', (138).

rica 73, 1283-1328.

Karpoff, J. (1987), 'The relation between price changes and trading volume: a survey', Journal of Financial and Quantitative Analysis 22, 109-126.

Lamoureux, C. G. & Lastrapes, W. D. (1994), 'Endogenous trading volume and momentum in stock return volatility', Journal of Business and Economic Statistics 12, 253-260.

Liesenfeld, R. (1998), 'Dynamic bivariate mixture models: Modeling the behavior of prices and trading volume', Journal of Business and Economic Statistics 16, 101-109.

Liesenfeld, R. (2001), 'A generalized bivariate mixture model for stock price volatility and trading volume', Journal of Econometrics 104, 141-178.

Liesenfeld, R. (2002), 'Identifying common long-range dependence in volume and volatility using high-frequency data', Working Paper.

Lobato, I. N. & Savin, N. E. (1998), 'Real and spurious long-memory properties of stock-market data', Journal of Business & Economic Statistics 16, 261-68.

Lobato, I. N. & Velasco, C. (2000), 'Long memory in stock-market trading volume', Journal of Business and Economic Statistics 18, 570-576.

Mikosch, T. & Starica, C. (2000), 'Limit theory for the sample autocorrelations and extremes of a GARCH(1, 1) process', The Annals of Statistics 28, 1427-1451.

Nielsen, F. S. (2008), 'Fractional cointegration and the finite sample performance in the presence of data noise', Unpublished, Working paper.

Nielsen, M. Ø. (2004), 'Local Whittle analysis of stationary fractional cointegration and the implied-realized volatility relation', Working paper, Cornell University.

Nielsen, M. Ø. & Frederiksen, P. H. (2008), Fully modified narrow-band least squares estimation of stationary fractional cointegration, Working Papers 1171, Queen's University, Department of Economics.

Nielsen, M. Ø. & Shimotsu, K. (2007), 'Determining the cointegrating rank in nonstationary fractional systems by the exact local Whittle approach', Journal of Econometrics 127, 574-596.

Ohanissian, A., Russell, J. R. & Tsay, R. S. (2008), 'True or spurious long memory? A new test', Journal of Business & Economic Statistics 26(2), 161-175.

Parke, W. R. (1996), 'What is a fractional unit root?', unpublished manuscript (University of North Carolina, Dept. of Economics).

Robinson, P. M. (2008), 'Multiple local Whittle estimation in stationary systems', Annals of Statistics 36, 2508-2530.

Robinson, P. M. & Yajima, Y. (2002), 'Determination of cointegrating rank in fractional systems',

Shimotsu, K. (2006), 'Simple (but effective) tests of long memory versus structural breaks', Working Paper, Department of Economics, Queen's University, Canada (1101).

Shimotsu, K. (2007), 'Gaussian semiparametric estimation of multivariate fractionally integrated

Shimotsu, K. & Phillips, P. (2005), 'Exact local Whittle estimation of fractional integration', The

Sun, Y. & Phillips, P. C. B. (2003), 'Nonlinear log-periodogram regression for perturbed fractional

Tauchen, G. E. & Pitts, M. (1983), 'The price variability-volume relationship on speculative markets', Econometrica 51, 485-505.

Taylor, S. J. (1994), 'Modelling stochastic volatility: a review and comparative study', Mathematical Finance 4, 183-204.

Velasco, C. (1999), 'Gaussian semiparametric estimation of non-stationary time series', Journal of

Watanabe, T. (2000), 'Bayesian analysis of dynamic bivariate mixture models: Can they explain the behavior of returns and trading volume?', Journal of Business and Economic Statistics 18, 199-210.

Watanabe, T. (2003), 'The estimation of dynamic bivariate mixture models: Reply to Liesenfeld and Richard comments', Journal of Business and Economic Statistics 21, 577-580.

[Figure 1 graphic not reproduced. Panels A to C plot the series Returns, AbsReturns, and SqrReturns for 3M, IBM, and AA, respectively; Panel D plots the detrended turnover ratio for 3M, IBM, and AA. The horizontal axis runs from 0 to 50.]

Figure 1: Autocorrelation function for continuously compounded percentage returns, adjusted log-squared returns, and adjusted log-absolute returns for the common stocks 3M (Panel A), IBM (Panel B), and AA (Panel C). In Panel D the detrended log transformed turnover ratio for the three common stocks is depicted.

[Figure 2 graphic not reproduced. Panels A to C plot d against the bandwidth (axis label "Band", 50 to 1950). Legend: LW, LW +/- 2 s.e., LWN, LWN +/- 2 s.e.]

Figure 2: Local Whittle (LW) and local Whittle with noise (LWN) estimates of the adjusted log-squared returns (Panel A), adjusted log-absolute returns (Panel B), and detrended log trading volume (Panel C) for the 3M common stock.

[Figure 3 graphic not reproduced. The plot shows d against the bandwidth (axis label "Band", 149 to 1949). Legend: LWN(trad), LWN(abs), and LWN(sqr), each with +/- 2 s.e. bands.]

Figure 3: Local Whittle with noise (LWN) estimates of the adjusted log-squared returns (LWN(sqr)), adjusted log-absolute returns (LWN(abs)), and detrended log trading volume (LWN(trad)) for the 3M common stock.

Table 1: Summary statistics for returns measured in continuously compounded rates for the common stocks 3M, IBM, and AA.

Sample               Full        1        2        3        4        5        6        7        8

Panel A: 3M
Mean                 0.03     0.04     0.04    -0.03     0.02     0.05     0.03     0.05     0.02
Std. Dev.            1.43     1.41     1.22     1.55     1.36     1.46     1.15     1.82     1.34
Skewness            -0.14     0.19     0.12     0.26     0.45    -2.03    -0.33     0.19    -0.22
Excess Kurtosis      6.45     4.76     2.13     1.51     1.54    26.08     4.65     2.41     5.12
Maximum             10.50     8.72     5.78     7.31     6.98     7.06     4.96    10.50     6.88
Minimum            -19.59    -9.58    -6.81    -5.85    -4.79   -19.59    -9.04   -10.08    -9.38
LB(10)              51.00    17.10    38.63    51.48    23.57    12.04    25.32    15.95     9.98
                   (0.00)   (0.07)   (0.00)   (0.00)   (0.01)   (0.28)   (0.01)   (0.10)   (0.44)
LB(20)              69.92    29.13    42.39    71.76    37.46    20.74    38.69    31.91    28.26
                   (0.00)   (0.09)   (0.00)   (0.00)   (0.01)   (0.41)   (0.01)   (0.04)   (0.10)

Panel B: IBM
Mean                 0.03     0.09     0.01     0.01     0.03    -0.01    -0.01     0.11    -0.01
Std. Dev.            1.60     1.17     1.28     1.46     1.45     1.31     1.68     2.47     1.58
Skewness             0.05     0.16     0.21     0.36     0.46    -1.03     0.02    -0.04     0.05
Excess Kurtosis      7.56     1.30     1.79     4.12     1.53     8.99     6.09     5.15     7.07
Maximum             12.37     1.29     6.79     9.86     8.15     4.56    11.09    12.37    10.66
Minimum            -16.89    -4.22    -5.49    -9.13    -4.74   -13.35   -11.36   -16.89   -10.67
LB(10)               7.78    10.93    34.39     7.19    23.85    10.17     8.33    13.31    22.06
                   (0.65)   (0.36)   (0.00)   (0.71)   (0.01)   (0.43)   (0.59)   (0.21)   (0.02)
LB(20)              37.69    23.26    52.90    25.54    32.05    18.51    17.51    28.51    42.59
                   (0.01)   (0.28)   (0.00)   (0.18)   (0.04)   (0.55)   (0.62)   (0.10)   (0.00)

Panel C: AA
Mean                 0.02     0.02     0.00     0.01     0.01     0.05     0.04     0.09    -0.03
Std. Dev.            1.85     1.35     1.64     1.92     1.87     1.83     1.64     2.34     2.04
Skewness            -0.18     0.09     0.22    -0.55     0.37    -2.36     0.26     0.43    -0.16
Excess Kurtosis      7.08     0.74     0.95     5.49     1.11    36.39     1.19     2.85     2.29
Maximum             13.15     4.95     6.64     7.82     7.76     7.47     8.12    13.15     8.67
Minimum            -27.29    -5.12    -6.21   -12.59    -5.49   -27.29    -5.72   -10.55   -11.66
LB(10)              78.33    31.79    30.80    32.81    27.42    33.98     9.04    20.39    14.52
                   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.53)   (0.03)   (0.15)
LB(20)              95.08    38.27    45.06    53.93    34.97    50.37    24.89    44.00    25.44
                   (0.00)   (0.01)   (0.00)   (0.00)   (0.00)   (0.00)   (0.21)   (0.00)   (0.19)

Note: Summary statistics are based on continuously compounded percentage returns, corrected for dividends and stock splits, multiplied by 100, i.e. $R_{jt} = \log(P_{jt}/P_{j,t-1})$ for j = 3M, IBM, AA. Full sample covers observations from July 2 1962 through December 31 2006 and consists of 11,187 observations. The subsamples are: 1: 1962.07-1968.01, 2: 1968.02-1973.04, 3: 1973.10-1979.03, 4: 1979.04-1984.09, 5: 1984.10-1990.04, 6: 1990.05-1995.11, 7: 1995.12-2001.05, and 8: 2001.06-2006.12. All subsamples have 1398 observations. The Ljung-Box Portmanteau statistic (LB) tests for 10th and 20th order autocorrelation in returns. p-values are provided in parentheses.
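The two quantities defined in the note to Table 1 can be computed as in the following Python sketch. It is illustrative only: the price series and the alternating test series are made up, the function names are hypothetical, and the Ljung-Box statistic is implemented in its textbook form $Q = n(n+2)\sum_{k=1}^{h} \hat\rho_k^2/(n-k)$ with a $\chi^2_h$ reference distribution.

```python
import numpy as np
from scipy.stats import chi2


def log_returns_pct(prices):
    """Continuously compounded returns in percent: 100 * log(P_t / P_{t-1})."""
    p = np.asarray(prices, dtype=float)
    return 100.0 * np.diff(np.log(p))


def ljung_box(x, h):
    """Ljung-Box Q statistic for autocorrelation up to lag h, with its chi-square p-value."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()
    denom = np.sum(xc ** 2)
    lags = np.arange(1, h + 1)
    rho = np.array([np.sum(xc[k:] * xc[:-k]) / denom for k in lags])   # sample autocorrelations
    q = n * (n + 2) * np.sum(rho ** 2 / (n - lags))
    return q, chi2.sf(q, df=h)


if __name__ == "__main__":
    prices = [100.0, 101.0, 102.01, 101.5, 103.0]     # hypothetical prices
    print(log_returns_pct(prices))
    alternating = np.tile([1.0, -1.0], 50)            # strongly autocorrelated toy series
    print(ljung_box(alternating, h=10))
```

Applied to a return series, `ljung_box(r, 10)` and `ljung_box(r, 20)` correspond to the LB(10) and LB(20) rows of the table.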


Table 2: Summary statistics for detrended log turnover ratio for the common stocks 3M, IBM, and AA.

Sample               Full        1        2        3        4        5        6        7        8

Panel A: 3M
Mean                 0.00     0.09    -0.22    -0.10     0.18     0.39    -0.37    -0.12     0.14
Std. Dev.            0.55     0.46     0.57     0.49     0.53     0.49     0.51     0.46     0.44
Skewness             0.26     0.44     0.83     0.21     0.16    -0.04     0.54     0.48     0.51
Excess Kurtosis      0.54     1.18     2.64     0.30     0.44     0.37     0.61     1.27     1.31
Maximum              3.46     2.13     3.46     1.65     2.20     2.19     1.86     2.42     2.14
Minimum             -2.40    -1.62    -1.97    -1.89    -2.40    -1.80    -1.96    -1.85    -1.52
LB(10)              21106     1080      804     1166      912     1609     1332     2486     2783
                   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)
LB(20)              35544     1229     1342     1580     1260     2563     2201     3902     4121
                   (0.00)   (0.09)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)

Panel B: IBM
Mean                 0.00    -0.03    -0.05     0.04     0.07     0.11    -0.12    -0.07     0.05
Std. Dev.            0.45     0.59     0.44     0.40     0.38     0.40     0.44     0.47     0.37
Skewness             0.26     0.58     0.56     0.38     0.27     0.06     0.45    -0.16     0.24
Excess Kurtosis      1.46     0.32     0.82     0.53    -0.01     0.00     1.07     5.29     1.49
Maximum              2.05     1.87     1.98     1.73     1.30     1.33     1.81     2.05     1.96
Minimum             -4.15    -2.01    -1.20    -1.26    -1.06    -1.17    -1.66    -4.15    -1.54
LB(10)              18366     4513     1516     2066     1234     1769     1311     1467     2301
                   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)
LB(20)              25875     7263     1876     2675     1471     2426     1608     2054     2848
                   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)

Panel C: AA
Mean                 0.00     0.01    -0.03    -0.01    -0.06     0.23    -0.07    -0.08     0.00
Std. Dev.            0.59     0.58     0.69     0.66     0.67     0.58     0.52     0.46     0.41
Skewness             0.05     0.20     0.28    -0.20    -0.02    -0.18     0.03     0.10     0.44
Excess Kurtosis      0.68     1.27     0.43     0.06     0.25     0.43    -0.01     0.67     0.97
Maximum              2.98     2.65     2.59     1.93     2.98     2.77     1.66     1.74     1.98
Minimum             -3.47    -3.47    -2.18    -2.61    -2.37    -2.01    -1.81    -2.00    -1.46
LB(10)               8548     1357      596      489     1554     1151      849     1093     2249
                   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)
LB(20)              11485     1886      807      522     2025     1480     1051     1240     3521
                   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)   (0.00)

Note: Summary statistics are based on the detrended turnover ratio for the shares 3M, IBM, and AA. Full sample covers observations from July 2 1962 through December 31 2006 and consists of 11,187 observations. The subsamples are: 1: 1962.07-1968.01, 2: 1968.02-1973.04, 3: 1973.10-1979.03, 4: 1979.04-1984.09, 5: 1984.10-1990.04, 6: 1990.05-1995.11, 7: 1995.12-2001.05, and 8: 2001.06-2006.12. All subsamples have 1398 observations. The Ljung-Box Portmanteau statistic (LB) tests for 10th and 20th order autocorrelation in returns. p-values are provided in parentheses.


Table 3: Estimated fractional integration parameter for the daily adjusted log-squared returns, adjusted log-absolute returns, and detrended turnover ratio for the common stocks 3M, IBM, and AA.

                              3M                           IBM                            AA
                  ~r^2      |~r|         v      ~r^2      |~r|         v      ~r^2      |~r|         v

Panel A: $m = \lfloor n^{0.5} \rfloor$
d_LP             0.418     0.404     0.477     0.536     0.492     0.671     0.410     0.401     0.339
               (0.062)   (0.062)   (0.062)   (0.062)   (0.062)   (0.062)   (0.062)   (0.062)   (0.062)
d_LW             0.447     0.406     0.527     0.485     0.452     0.641     0.427     0.378     0.349
               (0.048)   (0.048)   (0.048)   (0.048)   (0.048)   (0.048)   (0.048)   (0.048)   (0.048)
d_LPW           0.486a    0.481a    0.626a    0.570a    0.541a    0.583a    0.475a    0.486a    0.383a
               (0.073)   (0.073)   (0.073)   (0.073)   (0.073)   (0.073)   (0.073)   (0.073)   (0.073)
d_LWN           0.479b    0.511b    0.811b    0.616b    0.608b    0.562b    0.577b    0.659b     0.352
               (0.099)   (0.096)   (0.078)   (0.088)   (0.088)   (0.092)   (0.091)   (0.085)   (0.117)
d_LPWN(1,0)      0.479     0.510    0.808b    0.616b    0.607b    0.519a   0.693ab    0.658b     0.351
               (0.149)   (0.144)   (0.118)   (0.132)   (0.133)   (0.143)   (0.125)   (0.128)   (0.177)
d_LPWN(0,1)      0.479     0.511    0.811b    0.617b    0.609b     0.562    0.577b    0.659b    0.266c
               (0.196)   (0.192)   (0.180)   (0.185)   (0.186)   (0.188)   (0.187)   (0.183)   (0.247)
d_LPWN(1,1)      0.479     0.510    0.808b     0.616     0.608    0.413a     0.576    0.658b     0.267
               (0.294)   (0.289)   (0.270)   (0.278)   (0.279)   (0.307)   (0.281)   (0.275)   (0.370)

Panel B: $m = \lfloor n^{0.6} \rfloor$
d_LP             0.344     0.302     0.481     0.386     0.289     0.570     0.303     0.240     0.378
               (0.039)   (0.039)   (0.039)   (0.039)   (0.039)   (0.039)   (0.039)   (0.039)   (0.039)
d_LW             0.335     0.283     0.475     0.391     0.336     0.523     0.314     0.254     0.367
               (0.030)   (0.030)   (0.030)   (0.030)   (0.030)   (0.030)   (0.030)   (0.030)   (0.030)
d_LPW           0.411a    0.362a    0.528a    0.477a    0.438a    0.574a    0.412a    0.368a    0.331a
               (0.045)   (0.045)   (0.045)   (0.045)   (0.045)   (0.045)   (0.045)   (0.045)   (0.045)
d_LWN           0.564b    0.588b    0.611b    0.610b    0.635b    0.655b    0.624b    0.709b    0.306b
               (0.057)   (0.056)   (0.055)   (0.055)   (0.054)   (0.053)   (0.055)   (0.052)   (0.080)
d_LPWN(1,0)      0.561    0.583b   0.829ab     0.605    0.629b    0.688a    0.618b   0.700ab    0.391a
               (0.086)   (0.085)   (0.073)   (0.083)   (0.082)   (0.079)   (0.082)   (0.078)   (0.104)
d_LPWN(0,1)     0.564b    0.588b    0.611b    0.610b    0.635b     0.655    0.624b   0.709bc     0.371
               (0.118)   (0.117)   (0.116)   (0.116)   (0.116)   (0.115)   (0.116)   (0.114)   (0.133)
d_LPWN(1,1)      0.561     0.583  0.837abc     0.605     0.629     0.690    0.618b    0.701b     0.321
               (0.178)   (0.176)   (0.170)   (0.175)   (0.174)   (0.172)   (0.174)   (0.172)   (0.212)

Panel C: $m = \lfloor n^{0.7} \rfloor$
d_LP             0.204     0.156     0.394     0.261     0.183     0.477     0.186     0.133     0.364
               (0.024)   (0.024)   (0.024)   (0.024)   (0.024)   (0.024)   (0.024)   (0.024)   (0.024)
d_LW             0.228     0.184     0.418     0.273     0.232     0.460     0.230     0.177     0.355
               (0.019)   (0.019)   (0.019)   (0.019)   (0.019)   (0.019)   (0.019)   (0.019)   (0.019)
d_LPW           0.322a    0.261a    0.468a    0.389a    0.329a    0.512a    0.308a    0.259a    0.342a
               (0.028)   (0.028)   (0.028)   (0.028)   (0.028)   (0.028)   (0.028)   (0.028)   (0.028)
d_LWN           0.609b    0.628b    0.547b    0.650b    0.652b    0.579b    0.589b    0.671b     0.343
               (0.034)   (0.034)   (0.036)   (0.033)   (0.033)   (0.035)   (0.035)   (0.033)   (0.047)
d_LPWN(1,0)      0.548     0.544     0.633     0.572     0.592     0.660    0.615b   0.564ab     0.324
               (0.054)   (0.055)   (0.051)   (0.053)   (0.052)   (0.050)   (0.052)   (0.054)   (0.073)
d_LPWN(0,1)    0.593bc   0.606bc     0.602   0.622bc   0.638bc    0.628b    0.575b   0.646bc     0.406
               (0.073)   (0.073)   (0.073)   (0.073)   (0.072)   (0.073)   (0.074)   (0.072)   (0.081)
d_LPWN(1,1)      0.562     0.608     0.624     0.592     0.614     0.651     0.612    0.661b     0.316
               (0.111)   (0.110)   (0.109)   (0.110)   (0.110)   (0.109)   (0.110)   (0.108)   (0.134)

Notes: Asymptotic standard errors are in parentheses. a, b, and c denote significance at a 5% level for the polynomial coefficients $\theta_y$, $\theta_{w,0}$, and $\theta_{w,1}$, respectively.


Table 4: Mean and median of the 45 estimated fractional integration parameters for adjusted log-squared returns, adjusted log-absolute returns, and detrended turnover ratio for the S&P100 composite index.

                         Mean ($\bar d$)                  Median ($d$)
                  ~r^2      |~r|         v      ~r^2      |~r|         v

Panel A: $m = \lfloor n^{0.5} \rfloor$
d_LP             0.440     0.391     0.474     0.423     0.394     0.483
d_LW             0.442     0.395     0.493     0.445     0.395     0.500
d_LPW            0.502     0.492     0.511     0.507     0.508     0.528
d_LWN            0.585     0.665     0.544     0.589     0.664     0.555
d_LPWN(1,0)      0.603     0.663     0.562     0.614     0.674     0.573
d_LPWN(0,1)      0.592     0.666     0.552     0.609     0.664     0.572
d_LPWN(1,1)      0.667     0.674     0.544     0.609     0.672     0.529

Panel B: $m = \lfloor n^{0.6} \rfloor$
d_LP             0.333     0.273     0.430     0.336     0.277     0.434
d_LW             0.343     0.293     0.440     0.346     0.294     0.439
d_LPW            0.429     0.383     0.473     0.429     0.381     0.485
d_LWN            0.597     0.637     0.526     0.606     0.635     0.545
d_LPWN(1,0)      0.610     0.646     0.604     0.617     0.644     0.597
d_LPWN(0,1)      0.605     0.646     0.568     0.606     0.636     0.598
d_LPWN(1,1)      0.604     0.656     0.612     0.611     0.656     0.608

Panel C: $m = \lfloor n^{0.7} \rfloor$
d_LP             0.225     0.175     0.385     0.222     0.172     0.385
d_LW             0.249     0.205     0.397     0.246     0.201     0.401
d_LPW            0.330     0.279     0.429     0.327     0.275     0.434
d_LWN            0.591     0.618     0.484     0.579     0.619     0.496
d_LPWN(1,0)      0.607     0.631     0.552     0.610     0.618     0.556
d_LPWN(0,1)      0.591     0.617     0.533     0.602     0.627     0.541
d_LPWN(1,1)      0.609     0.636     0.539     0.604     0.622     0.553

Notes: $\bar d$ and $d$ denote the mean and median, respectively.


Table 5: Range of memory estimates for the adjusted log-squared returns, adjusted log-absolute returns, and the detrended log turnover ratio.

m = ⌊n^0.5⌋
             (r̃²_min, r̃²_max)             (|r̃|_min, |r̃|_max)           (v_min, v_max)
LP           .285 (.062), .592 (.062)      .214 (.062), .539 (.062)      .287 (.062), .671 (.062)
LW           .314 (.048), .553 (.048)      .271 (.048), .521 (.048)      .317 (.048), .641 (.048)
LPW          .326 (.073), .687 (.073)      .299 (.073), .667 (.073)      .275 (.073), .706 (.073)
LWN          .334 (.122), .828 (.078)      .401 (.110), .938 (.075)      .188 (.179), .849 (.078)
LPWN(1,0)    .366 (.173), .827 (.117)      .325 (.185), .937 (.112)      .028 (1.380), .848 (.116)
LPWN(0,1)    .341 (.220), .828 (.180)      .401 (.207), .938 (.180)      .016 (2.385), .849 (.180)
LPWN(1,1)    .313 (.343), .827 (.270)      .361 (.323), .938 (.270)      .018 (3.196), .848 (.270)

m = ⌊n^0.6⌋
             (r̃²_min, r̃²_max)             (|r̃|_min, |r̃|_max)           (v_min, v_max)
LP           .213 (.039), .423 (.039)      .189 (.039), .369 (.039)      .292 (.039), .570 (.039)
LW           .283 (.030), .422 (.030)      .199 (.030), .373 (.030)      .292 (.030), .537 (.030)
LPW          .335 (.045), .519 (.045)      .299 (.045), .483 (.045)      .270 (.045), .582 (.045)
LWN          .409 (.068), .799 (.049)      .437 (.065), .900 (.047)      .221 (.099), .693 (.052)
LPWN(1,0)    .324 (.116), .790 (.074)      .368 (.108), .895 (.071)      .296 (.123), .862 (.072)
LPWN(0,1)    .409 (.129), .799 (.113)      .437 (.126), .916 (.113)      .320 (.142), .722 (.114)
LPWN(1,1)    .377 (.199), .790 (.171)      .446 (.188), .895 (.170)      .298 (.220), .880 (.170)

m = ⌊n^0.7⌋
             (r̃²_min, r̃²_max)             (|r̃|_min, |r̃|_max)           (v_min, v_max)
LP           .176 (.024), .293 (.024)      .114 (.024), .243 (.024)      .300 (.024), .477 (.024)
LW           .189 (.019), .299 (.019)      .138 (.019), .257 (.019)      .301 (.019), .467 (.019)
LPW          .273 (.028), .397 (.028)      .198 (.028), .354 (.028)      .292 (.028), .535 (.028)
LWN          .472 (.039), .812 (.030)      .446 (.040), .906 (.029)      .296 (.051), .613 (.034)
LPWN(1,0)    .414 (.063), .914 (.044)      .456 (.060), .989 (.043)      .316 (.074), .838 (.045)
LPWN(0,1)    .432 (.079), .795 (.071)      .450 (.078), .887 (.071)      .258 (.098), .701 (.072)
LPWN(1,1)    .428 (.120), .926 (.107)      .453 (.117), .987 (.107)      .193 (.174), .826 (.107)

Notes: Table is based on the common stocks where we observe the full sample of daily observations, i.e. 45 shares. The respective standard errors for the specific estimates are in parentheses next to each minimum and maximum.


Table 6: Rejection frequency of the null that d = 0 and d = 1 for the adjusted log-squared returns, adjusted log-absolute returns, and the detrended log turnover ratio.

m = ⌊n^0.5⌋
                  r̃²              |r̃|              v
             d = 0   d = 1    d = 0   d = 1    d = 0   d = 1
LP             45      45       45      45       45      45
LW             45      45       45      45       45      45
LPW            45      45       45      45       45      45
LWN            45      45       45      44       43      43
LPWN(1,0)      45      44       45      38       41      39
LPWN(0,1)      44      38       45      25       40      36
LPWN(1,1)      39      11       41       6       29      21

m = ⌊n^0.6⌋
                  r̃²              |r̃|              v
             d = 0   d = 1    d = 0   d = 1    d = 0   d = 1
LP             45      45       45      45       45      45
LW             45      45       45      45       45      45
LPW            45      45       45      45       45      45
LWN            45      45       45      45       45      45
LPWN(1,0)      45      45       45      44       45      45
LPWN(0,1)      45      45       45      41       45      45
LPWN(1,1)      45      40       45      33       43      34

m = ⌊n^0.7⌋
                  r̃²              |r̃|              v
             d = 0   d = 1    d = 0   d = 1    d = 0   d = 1
LP             45      45       45      45       45      45
LW             45      45       45      45       45      45
LPW            45      45       45      45       45      45
LWN            45      45       45      45       45      45
LPWN(1,0)      45      45       45      44       45      45
LPWN(0,1)      45      45       45      44       45      45
LPWN(1,1)      45      44       45      42       44      44

Notes: Table is based on the common stocks where we observe the full sample of daily observations, i.e. 45 shares. Rejection of the null is based on the one-sided alternative at a 5% significance level.
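The decision rule behind these counts can be sketched as follows. This is a minimal illustration that assumes each test is a t-ratio compared with a standard normal quantile, with alternatives d > 0 and d < 1. The function name and the example inputs are illustrative, and the thesis may construct the tests differently.

```python
from statistics import NormalDist

# One-sided 5% critical value of the standard normal, roughly 1.645.
Z_95 = NormalDist().inv_cdf(0.95)


def one_sided_rejections(d_hat: float, se: float) -> tuple[bool, bool]:
    """Return (reject d = 0 against d > 0, reject d = 1 against d < 1)."""
    reject_d0 = d_hat / se > Z_95
    reject_d1 = (d_hat - 1.0) / se < -Z_95
    return reject_d0, reject_d1


if __name__ == "__main__":
    # Illustrative estimate and standard error pairs.
    for d_hat, se in [(0.228, 0.019), (0.827, 0.270), (0.018, 3.196)]:
        print(d_hat, se, one_sided_rejections(d_hat, se))
```

A large standard error makes both nulls hard to reject, which would be in line with the lower counts for the most heavily parameterised estimator at the smallest bandwidth.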


Table 7: Results for testing long memory against spurious long memory using the sample splitting methodology for the 3M common stock.

         d_LW     d̄_LW              W_S^(LW)            d_LWN    d̄_LWN             W_S^(LWN)
   m             K = 2    K = 4    K = 2     K = 4               K = 2    K = 4    K = 2     K = 4

Panel A: r̃²
 400    0.276    0.273    0.270    0.699     1.978     0.614    0.658    0.757    1.554     1.179
 600    0.243    0.242    0.239    1.026     3.215     0.595    0.631    0.735    1.416     1.339
 800    0.223    0.221    0.215    1.479     1.421     0.574    0.609    0.659    1.139     3.012
1000    0.204    0.201    0.197    0.890     0.459     0.569    0.605    0.648    1.499     3.908

Panel B: |r̃|
 400    0.225    0.220    0.216    2.520     3.395     0.655    0.693    0.744    1.299     0.615
 600    0.197    0.193    0.189    1.977     3.125     0.631    0.663    0.730    1.934     0.950
 800    0.183    0.181    0.177    2.791     1.972     0.589    0.621    0.631    2.109     4.081
1000    0.168    0.165    0.160    3.115     1.699     0.574    0.608    0.651    2.028     2.884

Panel C: v
 400    0.459    0.459    0.490    1.416     3.706     0.575    0.588    0.637    2.269     1.712
 600    0.425    0.423    0.450    0.928     1.313     0.564    0.574    0.610    2.690     6.220
 800    0.401    0.401    0.423    0.693     1.460     0.559    0.565    0.611    2.531     6.575
1000    0.387    0.386    0.405    1.528     1.166     0.543    0.548    0.599    1.580     9.071*

Notes: * denotes rejection of the null at the 5% level. W_S is χ² distributed with critical values χ²_0.95(1) = 3.84 and χ²_0.95(3) = 7.82.
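The rejection marks follow from comparing each W_S statistic with the critical value quoted in the notes. A minimal sketch, assuming that K = 2 corresponds to 1 degree of freedom and K = 4 to 3 degrees of freedom (the mapping is inferred from the two critical values given, and the function name is illustrative):

```python
# 5% critical values from the table notes, keyed by the number of subsamples K.
# The K -> degrees-of-freedom mapping (K - 1) is an assumption.
CRITICAL_VALUES = {2: 3.84, 4: 7.82}


def rejects_long_memory(w_stat: float, k: int) -> bool:
    """True if the statistic exceeds the 5% chi-squared critical value for K subsamples."""
    return w_stat > CRITICAL_VALUES[k]


if __name__ == "__main__":
    # Values from Panel C, m = 1000: the LW-based K = 4 statistic and the LWN-based one.
    print(rejects_long_memory(1.166, 4))
    print(rejects_long_memory(9.071, 4))
```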


Table 8: Results for testing long memory against spurious long memory using the sample splitting methodology for the IBM common stock.

         d_LW     d̄_LW              W_S^(LW)            d_LWN    d̄_LWN             W_S^(LWN)
   m             K = 2    K = 4    K = 2     K = 4               K = 2    K = 4    K = 2     K = 4

Panel A: r̃²
 400    0.344    0.342    0.329    0.404     2.221     0.606    0.622    0.647    0.203     3.827
 600    0.292    0.287    0.274    0.848     5.306     0.629    0.649    0.673    0.057     2.094
 800    0.264    0.257    0.248    1.055     2.876     0.627    0.649    0.660    0.033     3.760
1000    0.253    0.246    0.235    0.845     7.289     0.595    0.613    0.626    0.105     4.295

Panel B: |r̃|
 400    0.289    0.281    0.263    1.765     5.073     0.636    0.655    0.648    0.030     7.350
 600    0.245    0.237    0.219    3.045     7.820     0.644    0.662    0.675    0.006     6.283
 800    0.227    0.218    0.204    4.619*    7.029     0.617    0.634    0.637    0.174     8.619*
1000    0.215    0.207    0.193    2.884    11.617*    0.586    0.600    0.602    0.003     7.534

Panel C: v
 400    0.481    0.460    0.460    2.481    12.192*    0.634    0.654    0.581    0.572     6.744
 600    0.471    0.454    0.451    6.819*   18.927*    0.569    0.560    0.543    0.092     5.353
 800    0.463    0.445    0.449    6.185*   21.157*    0.535    0.523    0.495    1.001     7.695
1000    0.449    0.433    0.437    6.374*   22.885*    0.532    0.516    0.494    1.623     9.602*

Notes: * denotes rejection of the null at the 5% level. W_S is χ² distributed with critical values χ²_0.95(1) = 3.84 and χ²_0.95(3) = 7.82.



Table 9: Results for testing long memory against spurious long memory using the sample splitting methodology for the AA common stock.

         d_LW     d̄_LW              W_S^(LW)            d_LWN    d̄_LWN             W_S^(LWN)
   m             K = 2    K = 4    K = 2     K = 4               K = 2    K = 4    K = 2     K = 4

Panel A: r̃²
 400    0.275    0.273    0.253    1.782     3.556     0.626    0.660    0.754    0.182     1.516
 600    0.241    0.238    0.218    1.443     5.091     0.605    0.629    0.660    0.008     0.698
 800    0.210    0.207    0.190    2.096     4.378     0.619    0.642    0.649    0.051     1.789
1000    0.195    0.190    0.174    1.231     2.120     0.604    0.628    0.619    0.028     2.173

Panel B: |r̃|
 400    0.227    0.218    0.197    2.901     4.880     0.676    0.706    0.768    0.042     2.082
 600    0.188    0.180    0.161    3.983*    5.867     0.679    0.706    0.676    0.055     4.625
 800    0.163    0.156    0.139    5.999*    5.649     0.689    0.713    0.657    0.265     5.714
1000    0.157    0.149    0.133    3.850*    2.664     0.640    0.661    0.595    0.001     9.015*

Panel C: v
 400    0.342    0.349    0.352    0.356     2.410     0.386    0.398    0.413    0.109     1.910
 600    0.341    0.343    0.341    0.053     3.382     0.377    0.392    0.404    0.703     0.643
 800    0.348    0.350    0.348    0.041     1.789     0.348    0.361    0.381    0.140     1.571
1000    0.344    0.347    0.345    0.030     1.720     0.356    0.364    0.386    0.001     0.617

Notes: * denotes rejection of the null at the 5% level. W_S is χ² distributed with critical values χ²_0.95(1) = 3.84 and χ²_0.95(3) = 7.82.

Table 10: Rejection frequency of the null that the adjusted log-squared returns, adjusted log-absolute returns, and detrended log turnover ratios are long memory processes using the sample splitting methodology.

            W_S^(LW)          W_S^(LWN)
   m     K = 2   K = 4     K = 2   K = 4

Panel A: r̃²
 400       7      10         5       4
 600       7      13         4       3
 800       8      17         2       2
1000       7      18         3       2

Panel B: |r̃|
 400       5      14         5       4
 600      10      20         4       6
 800      11      22         2       7
1000      10      22         2       8

Panel C: v
 400       8       6         0       4
 600       7       6         0       7
 800       7      10         1      10
1000       6      10         3      10

Note: Table is based on the common stocks where we observe the full sample of daily observations, i.e. 45 shares.


Table 11: LW estimates of the fractional integration orders, joint test of pairwise equality, rejection frequency of the joint test of pairwise equality, and estimated eigenvalues for the 3M, IBM, and AA common stocks for the adjusted log-squared return and detrended log turnover ratio.

                        3M          IBM         AA

Panel A: LW estimates of d
m = ⌊n^0.5⌋
  r̃²                   0.447       0.485       0.427
                        (0.048)     (0.048)     (0.048)
  v                     0.527       0.641       0.349
                        (0.048)     (0.048)     (0.048)
  T12(h1)               1.55        5.37        0.63
  T12(h2)               1.84        6.39        0.75
m = ⌊n^0.6⌋
  r̃²                   0.335       0.391       0.314
                        (0.030)     (0.030)     (0.030)
  v                     0.475       0.523       0.367
                        (0.030)     (0.030)     (0.030)
  T12(h1)               8.61*       22.75*      1.42
  T12(h2)               10.26*      27.04*      1.69

Panel B: Rejection frequency of H12
                        T12(h1)     T12(h2)
  m = ⌊n^0.50⌋          15          18
  m = ⌊n^0.60⌋          36          38

Panel C: Estimated eigenvalues for 10,000 × G(d̄_*)
                        3M                IBM               AA
                        δ1      δ2        δ1      δ2        δ1      δ2
  m1 = ⌊n^0.45⌋         2583    50.67     1953    24.96     3868    313
  m1 = ⌊n^0.55⌋         4209    97.88     2787    40.91     5264    274

Panel D: Estimated eigenvalues for P(d̄_*)
                        3M                IBM               AA
                        δ1      δ2        δ1      δ2        δ1      δ2
  m1 = ⌊n^0.45⌋         1.06    0.94      1.11    0.89      1.04    0.96
  m1 = ⌊n^0.55⌋         1.14    0.85      1.08    0.91      1.11    0.88

Notes: * denotes rejection of the null at the 5% level. T12(hi) is the joint test of pairwise equality of the integration level, where hi for i = 1, 2 is a bandwidth choice equal to h1 = log^-1 n and h2 = log^-2 n, respectively. δi for i = 1, 2 is the ith eigenvalue of 10,000 × G(d̄_*) and P(d̄_*).


Table 12: LW estimates of the fractional integration orders, joint test of pairwise equality, rejection frequency of the joint test of pairwise equality, and estimated eigenvalues for the 3M, IBM, and AA common stocks for the adjusted log-absolute return and detrended log turnover ratio.

                        3M          IBM         AA

Panel A: LW estimates of d
m = ⌊n^0.5⌋
  |r̃|                  0.406       0.462       0.361
                        (0.048)     (0.048)     (0.048)
  v                     0.527       0.672       0.359
                        (0.048)     (0.048)     (0.048)
  T12(h1)               3.31*       7.64*       0.00
  T12(h2)               3.94*       9.07*       0.01
m = ⌊n^0.6⌋
  |r̃|                  0.283       0.336       0.254
                        (0.030)     (0.030)     (0.030)
  v                     0.475       0.523       0.367
                        (0.030)     (0.030)     (0.030)
  T12(h1)               16.45*      35.57*      6.27*
  T12(h2)               19.62*      42.30*      7.46*

Panel B: Rejection frequency of H12
                        T12(h1)     T12(h2)
  m = ⌊n^0.50⌋          30          31
  m = ⌊n^0.60⌋          41          41

Panel C: Estimated eigenvalues for 10,000 × G(d̄_*)
                        3M                IBM               AA
                        δ1      δ2        δ1      δ2        δ1      δ2
  m1 = ⌊n^0.45⌋         1147    50.37     772     25.16     2012    312
  m1 = ⌊n^0.55⌋         2064    97.07     1351    40.77     2793    273

Panel D: Estimated eigenvalues for P(d̄_*)
                        3M                IBM               AA
                        δ1      δ2        δ1      δ2        δ1      δ2
  m1 = ⌊n^0.45⌋         1.09    0.90      1.06    0.94      1.05    0.94
  m1 = ⌊n^0.55⌋         1.16    0.83      1.10    0.89      1.12    0.87

Notes: * denotes rejection of the null at the 5% level. T12(hi) is the joint test of pairwise equality of the integration level, where hi for i = 1, 2 is a bandwidth choice equal to h1 = log^-1 n and h2 = log^-2 n, respectively. δi for i = 1, 2 is the ith eigenvalue of 10,000 × G(d̄_*) and P(d̄_*).


Table 13: Rank estimates for the 3M, IBM, and AA common stocks for the adjusted log-squared returns and detrended log turnover ratio.

                  v(n) = m1^-0.45   v(n) = m1^-0.35   v(n) = m1^-0.25   v(n) = m1^-0.15   v(n) = m1^-0.05

Panel A: 3M
m1 = ⌊n^0.45⌋
  L(0)                -1.69             -1.54             -1.29             -0.93             -0.38
  L(1)                -0.91             -0.83             -0.71             -0.53             -0.25
  r                    0                 0                 0                 0                 0
m1 = ⌊n^0.55⌋
  L(0)                -1.80             -1.66             -1.44             -1.07             -0.45
  L(1)                -1.04             -0.97             -0.86             -0.67             -0.36
  r                    0                 0                 0                 0                 0

Panel B: IBM
m1 = ⌊n^0.45⌋
  L(0)                -1.69             -1.54             -1.29             -0.93             -0.37
  L(1)                -0.95             -0.87             -0.75             -0.57             -0.29
  r                    0                 0                 0                 0                 0
m1 = ⌊n^0.55⌋
  L(0)                -1.80             -1.66             -1.44             -1.07             -0.45
  L(1)                -0.98             -0.92             -0.81             -0.62             -0.31
  r                    0                 0                 0                 0                 0

Panel C: AA
m1 = ⌊n^0.45⌋
  L(0)                -1.69             -1.54             -1.29             -0.93             -0.38
  L(1)                -0.89             -0.81             -0.69             -0.50             -0.23
  r                    0                 0                 0                 0                 0
m1 = ⌊n^0.55⌋
  L(0)                -1.80             -1.66             -1.44             -1.07             -0.45
  L(1)                -1.01             -0.94             -0.83             -0.64             -0.33
  r                    0                 0                 0                 0                 0

Notes: L(u) denotes the value of the criterion function for u = 0, 1. r is the estimated rank of P(d̄_*).
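The rank estimate in each column is the value of u that minimises L(u). The sketch below assumes the criterion has the model-selection form L(u) = v(n)(p - u) - (sum of the p - u largest eigenvalues), which is common in the fractional cointegration literature. The exact definition should be taken from the chapter text. The function names and the bandwidth m1 = 63 are hypothetical, and the eigenvalues are of the size reported for P(d̄_*) in Table 11.

```python
def criterion(eigenvalues: list[float], v_n: float, u: int) -> float:
    """Assumed criterion L(u) = v(n) * (p - u) - sum of the (p - u) largest eigenvalues."""
    p = len(eigenvalues)
    ordered = sorted(eigenvalues, reverse=True)
    return v_n * (p - u) - sum(ordered[: p - u])


def estimate_rank(eigenvalues: list[float], v_n: float) -> tuple[int, list[float]]:
    """Return the u in {0, ..., p - 1} that minimises L(u), together with all L(u) values."""
    p = len(eigenvalues)
    values = [criterion(eigenvalues, v_n, u) for u in range(p)]
    return min(range(p), key=values.__getitem__), values


if __name__ == "__main__":
    m1 = 63                 # hypothetical bandwidth, for illustration only
    v_n = m1 ** -0.45       # tuning function v(n) = m1^-0.45
    rank, values = estimate_rank([1.06, 0.94], v_n)
    print(rank, [round(x, 2) for x in values])
```

Because v(n) penalises each retained eigenvalue, a rank of 1 is selected only when the smallest eigenvalue falls below v(n). That matches the pattern of r = 0 throughout the table, where L(0) is always below L(1).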


Table 14: Rank estimates for the 3M, IBM, and AA common stocks for the adjusted log-absolute returns and detrended log turnover ratio.

                  v(n) = m1^-0.45   v(n) = m1^-0.35   v(n) = m1^-0.25   v(n) = m1^-0.15   v(n) = m1^-0.05

Panel A: 3M
m1 = ⌊n^0.45⌋
  L(0)                -1.69             -1.53             -1.29             -0.93             -0.37
  L(1)                -0.94             -0.86             -0.74             -0.56             -0.28
  r                    0                 0                 0                 0                 0
m1 = ⌊n^0.55⌋
  L(0)                -1.80             -1.66             -1.44             -1.07             -0.45
  L(1)                -1.06             -1.00             -0.88             -0.70             -0.39
  r                    0                 0                 0                 0                 0

Panel B: IBM
m1 = ⌊n^0.45⌋
  L(0)                -1.69             -1.53             -1.29             -0.93             -0.37
  L(1)                -0.90             -0.82             -0.71             -0.52             -0.24
  r                    0                 0                 0                 0                 0
m1 = ⌊n^0.55⌋
  L(0)                -1.80             -1.66             -1.44             -1.07             -0.45
  L(1)                -1.00             -0.93             -0.82             -0.64             -0.33
  r                    0                 0                 0                 0                 0

Panel C: AA
m1 = ⌊n^0.45⌋
  L(0)                -1.69             -1.53             -1.29             -0.93             -0.37
  L(1)                -0.90             -0.82             -0.70             -0.52             -0.24
  r                    0                 0                 0                 0                 0
m1 = ⌊n^0.55⌋
  L(0)                -1.80             -1.66             -1.44             -1.07             -0.45
  L(1)                -1.02             -0.95             -0.84             -0.66             -0.35
  r                    0                 0                 0                 0                 0

Notes: L(u) denotes the value of the criterion function for u = 0, 1. r is the estimated rank of P(d̄_*).

Table 15: Frequency of how many times we estimate 1 cointegrating relation.

                      m1^-0.45   m1^-0.35   m1^-0.25   m1^-0.15   m1^-0.05

Panel A: (r̃², v)
  m1 = ⌊n^0.45⌋          0          0          0          0          5
  m1 = ⌊n^0.55⌋          0          0          0          0          3

Panel B: (|r̃|, v)
  m1 = ⌊n^0.45⌋          0          0          0          0          9
  m1 = ⌊n^0.55⌋          0          0          0          0          6

Note: r is the estimated rank of P(d̄_*).


SCHOOL OF ECONOMICS AND MANAGEMENT UNIVERSITY OF AARHUS - UNIVERSITETSPARKEN - BUILDING 1322

DK-8000 AARHUS C – TEL. +45 8942 1111 - www.econ.au.dk

PhD Theses:

1999-4 Philipp J.H. Schröder, Aspects of Transition in Central and Eastern Europe.
1999-5 Robert Rene Dogonowski, Aspects of Classical and Contemporary European Fiscal Policy Issues.
1999-6 Peter Raahauge, Dynamic Programming in Computational Economics.
1999-7 Torben Dall Schmidt, Social Insurance, Incentives and Economic Integration.
1999 Jørgen Vig Pedersen, An Asset-Based Explanation of Strategic Advantage.
1999 Bjarke Jensen, Five Essays on Contingent Claim Valuation.
1999 Ken Lamdahl Bechmann, Five Essays on Convertible Bonds and Capital Structure Theory.
1999 Birgitte Holt Andersen, Structural Analysis of the Earth Observation Industry.
2000-1 Jakob Roland Munch, Economic Integration and Industrial Location in Unionized Countries.
2000-2 Christian Møller Dahl, Essays on Nonlinear Econometric Time Series Modelling.
2000-3 Mette C. Deding, Aspects of Income Distributions in a Labour Market Perspective.
2000-4 Michael Jansson, Testing the Null Hypothesis of Cointegration.
2000-5 Svend Jespersen, Aspects of Economic Growth and the Distribution of Wealth.
2001-1 Michael Svarer, Application of Search Models.
2001-2 Morten Berg Jensen, Financial Models for Stocks, Interest Rates, and Options: Theory and Estimation.
2001-3 Niels C. Beier, Propagation of Nominal Shocks in Open Economies.
2001-4 Mette Verner, Causes and Consequences of Interruptions in the Labour Market.
2001-5 Tobias Nybo Rasmussen, Dynamic Computable General Equilibrium Models: Essays on Environmental Regulation and Economic Growth.
2001-6 Søren Vester Sørensen, Three Essays on the Propagation of Monetary Shocks in Open Economies.
2001-7 Rasmus Højbjerg Jacobsen, Essays on Endogenous Policies under Labor Union Influence and their Implications.
2001-8 Peter Ejler Storgaard, Price Rigidity in Closed and Open Economies: Causes and Effects.
2001 Charlotte Strunk-Hansen, Studies in Financial Econometrics.
2002-1 Mette Rose Skaksen, Multinational Enterprises: Interactions with the Labor Market.
2002-2 Nikolaj Malchow-Møller, Dynamic Behaviour and Agricultural Households in Nicaragua.
2002-3 Boriss Siliverstovs, Multicointegration, Nonlinearity, and Forecasting.
2002-4 Søren Tang Sørensen, Aspects of Sequential Auctions and Industrial Agglomeration.
2002-5 Peter Myhre Lildholdt, Essays on Seasonality, Long Memory, and Volatility.
2002-6 Sean Hove, Three Essays on Mobility and Income Distribution Dynamics.
2002 Hanne Kargaard Thomsen, The Learning organization from a management point of view - Theoretical perspectives and empirical findings in four Danish service organizations.
2002 Johannes Liebach Lüneborg, Technology Acquisition, Structure, and Performance in The Nordic Banking Industry.
2003-1 Carter Bloch, Aspects of Economic Policy in Emerging Markets.
2003-2 Morten Ørregaard Nielsen, Multivariate Fractional Integration and Cointegration.
2003 Michael Knie-Andersen, Customer Relationship Management in the Financial Sector.
2004-1 Lars Stentoft, Least Squares Monte-Carlo and GARCH Methods for American Options.
2004-2 Brian Krogh Graversen, Employment Effects of Active Labour Market Programmes: Do the Programmes Help Welfare Benefit Recipients to Find Jobs?
2004-3 Dmitri Koulikov, Long Memory Models for Volatility and High Frequency Financial Data Econometrics.
2004-4 René Kirkegaard, Essays on Auction Theory.
2004-5 Christian Kjær, Essays on Bargaining and the Formation of Coalitions.
2005-1 Julia Chiriaeva, Credibility of Fixed Exchange Rate Arrangements.
2005-2 Morten Spange, Fiscal Stabilization Policies and Labour Market Rigidities.
2005-3 Bjarne Brendstrup, Essays on the Empirical Analysis of Auctions.
2005-4 Lars Skipper, Essays on Estimation of Causal Relationships in the Danish Labour Market.
2005-5 Ott Toomet, Marginalisation and Discouragement: Regional Aspects and the Impact of Benefits.
2005-6 Marianne Simonsen, Essays on Motherhood and Female Labour Supply.
2005 Hesham Morten Gabr, Strategic Groups: The Ghosts of Yesterday when it comes to Understanding Firm Performance within Industries?
2005 Malene Shin-Jensen, Essays on Term Structure Models, Interest Rate Derivatives and Credit Risk.
2006-1 Peter Sandholt Jensen, Essays on Growth Empirics and Economic Development.
2006-2 Allan Sørensen, Economic Integration, Ageing and Labour Market Outcomes.
2006-3 Philipp Festerling, Essays on Competition Policy.
2006-4 Carina Sponholtz, Essays on Empirical Corporate Finance.
2006-5 Claus Thrane-Jensen, Capital Forms and the Entrepreneur – A contingency approach on new venture creation.
2006-6 Thomas Busch, Econometric Modeling of Volatility and Price Behavior in Asset and Derivative Markets.
2007-1 Jesper Bagger, Essays on Earnings Dynamics and Job Mobility.
2007-2 Niels Stender, Essays on Marketing Engineering.
2007-3 Mads Peter Pilkjær Harmsen, Three Essays in Behavioral and Experimental Economics.
2007-4 Juanna Schrøter Joensen, Determinants and Consequences of Human Capital Investments.
2007-5 Peter Tind Larsen, Essays on Capital Structure and Credit Risk.
2008-1 Toke Lilhauge Hjortshøj, Essays on Empirical Corporate Finance – Managerial Incentives, Information Disclosure, and Bond Covenants.
2008-2 Jie Zhu, Essays on Econometric Analysis of Price and Volatility Behavior in Asset Markets.
2008-3 David Glavind Skovmand, Libor Market Models - Theory and Applications.
2008-4 Martin Seneca, Aspects of Household Heterogeneity in New Keynesian Economics.
2008-5 Agne Lauzadyte, Active Labour Market Policies and Labour Market Transitions in Denmark: an Analysis of Event History Data.
2009-1 Christian Dahl Winther, Strategic timing of product introduction under heterogeneous demand.
2009-2 Martin Møller Andreasen, DSGE Models and Term Structure Models with Macroeconomic Variables.
2009-3 Frank Steen Nielsen, On the estimation of fractionally integrated processes.
